pytorch/docs/source
eqy 790763b0fe Add an option to disable reduced precision reductions for FP16 GEMM (#67946)
Summary:
https://github.com/pytorch/pytorch/issues/67578 disabled reduced precision reductions for FP16 GEMMs. After benchmarking, we've found that this has a substantial performance impact for common GEMM shapes (e.g., those found in popular instantiations of multi-headed attention) on architectures such as Volta. As these performance regressions may come as a surprise to current users, this PR adds a toggle to disable reduced precision reductions,
`torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False`,
rather than making that the default behavior.

CC ngimel ptrblck stas00

Note that the behavior after the previous PR can be replicated with
`torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False`
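
For reference, a minimal sketch of using the toggle (the tensor shapes are arbitrary, and a CUDA-enabled build is assumed):

```python
import torch

# Setting the flag to False forbids reduced-precision (FP16) reductions
# inside FP16 GEMMs, matching the stricter behavior introduced by #67578.
torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False

if torch.cuda.is_available():
    a = torch.randn(128, 256, device="cuda", dtype=torch.float16)
    b = torch.randn(256, 64, device="cuda", dtype=torch.float16)
    # This GEMM now accumulates without reduced-precision reductions,
    # trading some throughput for accuracy on architectures like Volta.
    c = a @ b

# Re-enable the faster reduced-precision reductions (the default).
torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = True
```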

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67946

Reviewed By: zou3519

Differential Revision: D32289896

Pulled By: ngimel

fbshipit-source-id: a1ea2918b77e27a7d9b391e030417802a0174abe
2021-11-09 17:27:20 -08:00
_static clarify the documentation of torch.meshgrid (#62977) 2021-08-18 04:01:22 -07:00
_templates
community Update contribution_guide.rst (#64142) 2021-08-30 19:26:59 -07:00
elastic (torchelastic) make --max_restarts explicit in the quickstart and runner docs (#65838) 2021-09-29 19:29:01 -07:00
notes Add an option to disable reduced precision reductions for FP16 GEMM (#67946) 2021-11-09 17:27:20 -08:00
rpc Support Union in TorchScript (#64234) 2021-09-03 06:12:24 -07:00
scripts [docs] Add images to some activation functions (#65415) 2021-09-22 11:05:29 -07:00
__config__.rst
amp.rst rebase for autocast updates to include device_type and dtype flags (#61002) 2021-08-10 20:03:12 -07:00
autograd.rst Update extending doc to cover forward mode AD (#66962) 2021-10-27 14:18:38 -07:00
backends.rst Add an option to disable reduced precision reductions for FP16 GEMM (#67946) 2021-11-09 17:27:20 -08:00
benchmark_utils.rst
bottleneck.rst
checkpoint.rst
complex_numbers.rst Grammatical update of tech docs (#61547) 2021-07-14 14:01:59 -07:00
conf.py [Quant] Add dynamic QAT Linear module (#67325) 2021-11-08 10:24:25 -08:00
cpp_extension.rst
cpp_index.rst
cuda.rst [CUDA graphs] Beta, not prototype (#65247) 2021-09-20 13:32:36 -07:00
cudnn_persistent_rnn.rst Remove orphan from cuDNN persistent note (#65160) 2021-09-21 11:09:47 -07:00
cudnn_rnn_determinism.rst
data.rst Add a warning about DataLoader num_workers > 0 "memory leak" (#64337) 2021-09-01 21:49:41 -07:00
ddp_comm_hooks.rst [DDP Comm Hook] Add debugging communication hooks to ddp_comm_hooks.rst (#64352) 2021-09-01 17:37:19 -07:00
distributed.algorithms.join.rst Add tutorial link (#62785) 2021-08-05 17:28:02 -07:00
distributed.elastic.rst
distributed.optim.rst [distributed][docs] Delete distributed optimizer section from RPC and add reference to namespace docs page (#68068) 2021-11-09 15:01:54 -08:00
distributed.rst Update distributed.rst to show that CUDA send/recv on GPU is supported (#65601) 2021-09-24 12:30:10 -07:00
distributions.rst
dlpack.rst
docutils.conf
fft.rst C++ API and docs for hfftn (#66127) 2021-10-07 12:48:36 -07:00
futures.rst
fx.rst fx: Update fx.rst (#68043) 2021-11-09 14:00:45 -08:00
hub.rst
index.rst Make _Join, _Joinable, _JoinHook public (#62605) 2021-08-03 12:20:11 -07:00
jit.rst Back out "D30740897 Add fusion enabled apis" (#64500) 2021-09-04 20:55:58 -07:00
jit_builtin_functions.rst
jit_language_reference.rst Document torch.jit.is_tracing() (#67326) 2021-10-28 09:56:09 -07:00
jit_language_reference_v2.rst Document torch.jit.is_tracing() (#67326) 2021-10-28 09:56:09 -07:00
jit_python_reference.rst
jit_unsupported.rst
linalg.rst Create linalg.matrix_exp (#62715) 2021-10-19 09:07:15 -07:00
math-quantizer-equation.png
mobile_optimizer.rst
model_zoo.rst
multiprocessing.rst
name_inference.rst
named_tensor.rst
nn.functional.rst
nn.init.rst
nn.rst Implements the orthogonal parametrization (#62089) 2021-08-30 13:12:07 -07:00
onnx.rst [Doc] [ONNX]Fix a broken url for ONNXRuntime custom op (#67944) 2021-11-08 15:51:02 -08:00
optim.rst To add SequentialLR to PyTorch Core Schedulers (#64037) 2021-09-09 09:36:32 -07:00
package.rst [package] add some docs describing how to debug dependencies (#65704) 2021-09-27 12:14:23 -07:00
pipeline.rst fixed comments referring to the fairscale master branch (#65531) 2021-09-23 14:37:58 -07:00
profiler.rst
quantization-support.rst [Quant] Add dynamic QAT Linear module (#67325) 2021-11-08 10:24:25 -08:00
quantization.rst pytorch quantization: document the custom module APIs (#67449) 2021-10-29 05:22:17 -07:00
random.rst
rpc.rst [distributed][docs] Delete distributed optimizer section from RPC and add reference to namespace docs page (#68068) 2021-11-09 15:01:54 -08:00
sparse.rst
special.rst [special] special alias for softmax (#62251) 2021-10-01 03:55:32 -07:00
storage.rst
tensor_attributes.rst
tensor_view.rst Add tensor.{adjoint(),H,mT,mH} methods and properties (#64179) 2021-10-13 07:44:43 -07:00
tensorboard.rst
tensors.rst [numpy] add torch.argwhere (#64257) 2021-10-30 15:26:11 -07:00
testing.rst [Doc] make_tensor to torch.testing module (#63925) 2021-08-30 12:25:40 -07:00
torch.ao.ns._numeric_suite.rst Quantization docs: add pages for Numeric Suite (Eager and FX) (#66380) 2021-10-11 18:47:58 -07:00
torch.ao.ns._numeric_suite_fx.rst Quantization docs: add pages for Numeric Suite (Eager and FX) (#66380) 2021-10-11 18:47:58 -07:00
torch.overrides.rst
torch.rst [numpy] add torch.argwhere (#64257) 2021-10-30 15:26:11 -07:00
type_info.rst clarify that torch.finfo.tiny is the smallest normal number (#63241) 2021-08-18 13:44:52 -07:00