pytorch/docs
eqy 790763b0fe Add an option to disable reduced precision reductions for FP16 GEMM (#67946)
Summary:
https://github.com/pytorch/pytorch/issues/67578 disabled reduced precision reductions for FP16 GEMMs. After benchmarking, we've found that this has substantial performance impacts for common GEMM shapes (e.g., those found in popular instantiations of multiheaded-attention) on architectures such as Volta. As these performance regressions may come as a surprise to current users, this PR adds a toggle to disable reduced precision reductions
`torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = `
rather than making it the default behavior.

CC ngimel ptrblck
stas00 Note that the behavior after the previous PR can be replicated with
`torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67946

Reviewed By: zou3519

Differential Revision: D32289896

Pulled By: ngimel

fbshipit-source-id: a1ea2918b77e27a7d9b391e030417802a0174abe
2021-11-09 17:27:20 -08:00
..
caffe2 Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default 2021-08-12 11:45:01 -07:00
cpp [ONNX] Update onnx function export with comments and clean up (#66817) (#67803) 2021-11-05 10:35:35 -07:00
source Add an option to disable reduced precision reductions for FP16 GEMM (#67946) 2021-11-09 17:27:20 -08:00
.gitignore
libtorch.rst
make.bat
Makefile
README.md
requirements.txt Add copy button to code snippets in docs (#63149) 2021-08-15 06:25:32 -07:00

Please see the Writing documentation section of CONTRIBUTING.md for details on both writing and building the docs.