pytorch/test
eqy 790763b0fe Add an option to disable reduced precision reductions for FP16 GEMM (#67946)
Summary:
https://github.com/pytorch/pytorch/issues/67578 disabled reduced precision reductions for FP16 GEMMs. After benchmarking, we found that this substantially hurts performance for common GEMM shapes (e.g., those found in popular instantiations of multi-head attention) on architectures such as Volta. Because these performance regressions may come as a surprise to current users, this PR adds a toggle to disable reduced precision reductions
`torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = `
rather than making it the default behavior.
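
To illustrate why the accumulation precision of a reduction matters, here is a minimal pure-Python sketch. It is not how cuBLAS implements GEMM; it only simulates a dot product whose running sum is rounded to FP16 or to FP32 after every step, using the `struct` module's half-precision format. The helper names (`round_to_fp16`, `round_to_fp32`, `dot`) are made up for this illustration.

```python
import struct


def round_to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def round_to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot(a, b, round_acc):
    """Dot product of FP16 inputs; the running sum is rounded by `round_acc`."""
    acc = 0.0
    for x, y in zip(a, b):
        # Inputs are FP16 in both cases; only the accumulator precision differs.
        acc = round_acc(acc + round_to_fp16(x) * round_to_fp16(y))
    return acc


ones = [1.0] * 4096
fp16_result = dot(ones, ones, round_to_fp16)  # reduced precision accumulation
fp32_result = dot(ones, ones, round_to_fp32)  # full precision accumulation
print(fp16_result, fp32_result)  # 2048.0 4096.0
```

The FP16 accumulator stalls at 2048 because the spacing between adjacent FP16 values there is 2, so adding 1 rounds back to 2048. Real GEMM kernels reduce in blocks rather than one element at a time, so the error in practice is typically much smaller than in this worst-case sketch.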

CC ngimel ptrblck
stas00: note that the behavior introduced by the previous PR can be replicated with
`torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = False`
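
As a sketch of how the flag might be used in practice, the snippet below wraps it in a small context manager that restores the previous value on exit. The helper names (`fp16_reduced_precision_reduction`, `run_with_full_precision_reductions`) are hypothetical and not part of PyTorch; only the attribute `torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction` comes from this PR.

```python
import contextlib

FLAG = "allow_fp16_reduced_precision_reduction"


@contextlib.contextmanager
def fp16_reduced_precision_reduction(backend, enabled):
    """Temporarily set the flag on `backend`, restoring the old value on exit.

    `backend` is expected to be `torch.backends.cuda.matmul`, but any object
    exposing the attribute works, which keeps the helper easy to test.
    """
    previous = getattr(backend, FLAG)
    setattr(backend, FLAG, enabled)
    try:
        yield backend
    finally:
        setattr(backend, FLAG, previous)


def run_with_full_precision_reductions():
    """Run one FP16 matmul with reduced precision reductions disabled."""
    import torch  # imported lazily so the helper above has no hard dependency

    with fp16_reduced_precision_reduction(torch.backends.cuda.matmul, False):
        if torch.cuda.is_available():
            a = torch.randn(64, 64, device="cuda", dtype=torch.float16)
            return a @ a
    return None  # no CUDA device: nothing to run
```

Scoping the change this way lets a numerically sensitive region opt out of reduced precision reductions without changing the setting for the rest of the program.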

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67946

Reviewed By: zou3519

Differential Revision: D32289896

Pulled By: ngimel

fbshipit-source-id: a1ea2918b77e27a7d9b391e030417802a0174abe
2021-11-09 17:27:20 -08:00
ao/sparsity Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
autograd
backward_compatibility [jit][shape_prop] Fix jit registration of unpack_sizes ops for prepacked (#66737) 2021-11-01 12:43:10 -07:00
benchmark_utils Add lint to ensure all test files have headers with ownership info (#66826) 2021-11-03 18:21:49 -07:00
bottleneck_test Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
cpp [LT] Merge cache.h (#67929) 2021-11-09 12:02:02 -08:00
cpp_api_parity
cpp_extensions Resubmit #67161 (#67735) 2021-11-04 09:59:30 -07:00
custom_backend [PyTorch] Adopt IValue::toTupleRef() where obvious (#65505) 2021-11-02 10:22:18 -07:00
custom_operator Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
distributed [NCCL] Patch bfloat16 support (#67843) 2021-11-09 13:46:13 -08:00
distributions
error_messages
expect
fx Enforce that test cases extend from correct TestCase (#67819) 2021-11-08 18:28:36 -08:00
fx2trt [fx2trt] fix import in oss tests (#68016) 2021-11-08 16:11:00 -08:00
fx_acc [fx2trt] Add torch.nn.function.pad support for fx2trt (#67498) 2021-11-03 10:21:08 -07:00
jit Update Freezing Logic and add new passes (#68024) 2021-11-09 13:21:52 -08:00
jit_hooks
mobile Add lint to ensure all test files have headers with ownership info (#66826) 2021-11-03 18:21:49 -07:00
onnx [ONNX] Fix reciprocal when input is not floating point (#67471) (#67808) 2021-11-08 14:37:07 -08:00
package Add lint to ensure all test files have headers with ownership info (#66826) 2021-11-03 18:21:49 -07:00
quantization [quant] Fix comparison against reference for test_qat_functional_linear (#68061) 2021-11-09 13:33:13 -08:00
scripts
test_img
typing
delete.py
HowToWriteTestsUsingFileCheck.md
linear.py
run_test.py Test functionalization pass in python (#66101) 2021-11-09 14:34:05 -08:00
simulate_nccl_errors.py
test_ao_sparsity.py Add ownership to more edge tests (#67859) 2021-11-05 11:01:16 -07:00
test_autocast.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_autograd.py torch.lobpcg.backward: do not save non-Variable types with ctx.save_for_backward. (#67994) 2021-11-08 10:02:09 -08:00
test_binary_ufuncs.py Clean up test autograd (#67413) 2021-11-03 15:26:09 -07:00
test_bundled_images.py Add ownership to more edge tests (#67859) 2021-11-05 11:01:16 -07:00
test_bundled_inputs.py Add lint to ensure all test files have headers with ownership info (#66826) 2021-11-03 18:21:49 -07:00
test_complex.py
test_cpp_api_parity.py
test_cpp_extensions_aot.py Resubmit #67161 (#67735) 2021-11-04 09:59:30 -07:00
test_cpp_extensions_jit.py
test_cuda.py Add an option to disable reduced precision reductions for FP16 GEMM (#67946) 2021-11-09 17:27:20 -08:00
test_cuda_primary_ctx.py
test_dataloader.py
test_datapipe.py [DataPipe] Fixing pickling issues with fork and demux (#67930) 2021-11-09 07:54:02 -08:00
test_deploy.py
test_determination.py
test_dispatch.py
test_foreach.py [Foreach] Implement L1&L2 norm (#62646) 2021-11-05 11:23:00 -07:00
test_function_schema.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_functional_autograd_benchmark.py [skip ci] Add test owners for a special hi-pri class of tests (#67553) 2021-10-29 12:17:21 -07:00
test_functional_optim.py
test_functionalization.py Test functionalization pass in python (#66101) 2021-11-09 14:34:05 -08:00
test_futures.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_fx.py Remove special FX OpInfo list (#67520) 2021-11-02 16:01:46 -07:00
test_fx_experimental.py Add OpInfo for torch.unique and torch.unique_consecutive (#67529) 2021-10-30 08:33:41 -07:00
test_gen_backend_stubs.py
test_import_stats.py
test_indexing.py Fixes CUDA vs CPU consistency for index_put_ when accumulating (part 2) (#67189) 2021-11-08 17:56:43 -08:00
test_jit.py Dtype Analysis for Unary and Binary ops with Metatensors (#66898) 2021-11-04 19:00:50 -07:00
test_jit_autocast.py [JIT] additional support for CallMethod with autocasting (#67925) 2021-11-08 14:37:09 -08:00
test_jit_cuda_fuser.py
test_jit_disabled.py
test_jit_fuser.py
test_jit_fuser_legacy.py
test_jit_fuser_te.py Op info for activation functions 2 (softsign, tanh, tanhshrink, threshold, celu, sigmoid, mish, hardsigmoid) (#67492) 2021-11-09 12:57:38 -08:00
test_jit_legacy.py
test_jit_profiling.py
test_jit_simple.py
test_jit_string.py
test_kernel_launch_checks.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_license.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_linalg.py Avoid prematurely casting GEMM parameters alpha, beta to scalar_t (#67633) 2021-11-03 12:01:04 -07:00
test_logging.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_metal.py
test_mkldnn.py
test_mobile_optimizer.py
test_model_dump.py Add ownership to more edge tests (#67859) 2021-11-05 11:01:16 -07:00
test_module_init.py [skip ci] Add test owners for a special hi-pri class of tests (#67553) 2021-10-29 12:17:21 -07:00
test_modules.py Revert D32063662: [pytorch][PR] TST Adds device transfer into module info tests 2021-11-05 07:07:39 -07:00
test_multiprocessing.py
test_multiprocessing_spawn.py
test_namedtensor.py
test_namedtuple_return_api.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_native_functions.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_nn.py Refactor cuDNN Convolution memory format and Conv-Bias-Relu code (#65594) 2021-11-05 11:50:55 -07:00
test_nnapi.py Add ownership to more edge tests (#67859) 2021-11-05 11:01:16 -07:00
test_numba_integration.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_numpy_interop.py
test_openmp.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_ops.py replace onlyOnCPUAndCUDA with onlyNativeDeviceTypes (#65201) 2021-11-01 09:22:34 -07:00
test_optim.py Adds a maximize flag to SGD. (#67847) 2021-11-09 00:43:07 -08:00
test_overrides.py [skip ci] Add test owners for a special hi-pri class of tests (#67553) 2021-10-29 12:17:21 -07:00
test_package.py
test_profiler.py
test_pruning_op.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_public_bindings.py
test_python_dispatch.py Test functionalization pass in python (#66101) 2021-11-09 14:34:05 -08:00
test_pytree.py [skip ci] Add test owners for a special hi-pri class of tests (#67553) 2021-10-29 12:17:21 -07:00
test_quantization.py [quant][embedding qat][bugfix] Fix and test QAT EmbeddingBag from_float error message (#66989) 2021-10-28 06:29:20 -07:00
test_reductions.py Updated searchsorted functionality (#66818) 2021-11-05 12:13:47 -07:00
test_segment_reductions.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_serialization.py
test_set_default_mobile_cpu_allocator.py Add ownership to more edge tests (#67859) 2021-11-05 11:01:16 -07:00
test_shape_ops.py replace onlyOnCPUAndCUDA with onlyNativeDeviceTypes (#65201) 2021-11-01 09:22:34 -07:00
test_show_pickle.py Add ownership to more edge tests (#67859) 2021-11-05 11:01:16 -07:00
test_sort_and_select.py replace onlyOnCPUAndCUDA with onlyNativeDeviceTypes (#65201) 2021-11-01 09:22:34 -07:00
test_sparse.py
test_sparse_csr.py Sparse CSR CPU: add addmv_out (#61536) 2021-11-09 12:34:21 -08:00
test_spectral_ops.py replace onlyOnCPUAndCUDA with onlyNativeDeviceTypes (#65201) 2021-11-01 09:22:34 -07:00
test_stateless.py [skip ci] Add test owners for a special hi-pri class of tests (#67553) 2021-10-29 12:17:21 -07:00
test_static_runtime.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_tensor_creation_ops.py Add meta support to tensor range factories (#67032) 2021-11-05 15:36:29 -07:00
test_tensorboard.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_tensorexpr.py
test_tensorexpr_pybind.py
test_testing.py replace onlyOnCPUAndCUDA with onlyNativeDeviceTypes (#65201) 2021-11-01 09:22:34 -07:00
test_throughput_benchmark.py Set test owners for tests with unknown owners (#67552) 2021-10-29 12:42:01 -07:00
test_torch.py Fix conv_transpose3d backward with non-contiguous grad_out (#67829) 2021-11-05 08:34:21 -07:00
test_type_hints.py
test_type_info.py
test_type_promotion.py replace onlyOnCPUAndCUDA with onlyNativeDeviceTypes (#65201) 2021-11-01 09:22:34 -07:00
test_typing.py
test_unary_ufuncs.py Clean up test autograd (#67413) 2021-11-03 15:26:09 -07:00
test_utils.py broaden retries on TestHub (#67779) 2021-11-03 13:48:58 -07:00
test_view_ops.py Clean up test autograd (#67413) 2021-11-03 15:26:09 -07:00
test_vmap.py Set test owner for vmap (#67582) 2021-11-01 07:22:48 -07:00
test_vulkan.py
test_xnnpack_integration.py Add ownership to more edge tests (#67859) 2021-11-05 11:01:16 -07:00