pytorch/test
Elias Ellison d881b2978c Make autocast cache and buffer stealing aware of cudagraph static output tensors (#99368)
In this stack of PRs we are adding caching of output tensors for cudagraph trees after the initial recording. On initial recording we do not cache tensor outputs because this prevents memory from being reclaimed. On subsequent executions we do cache them to avoid overhead. However, because there is an extra reference around, this caused divergent recording and execution behavior in both autocast caching and autograd gradient stealing. Divergent recording and execution would keep re-recording and eventually stabilize, but it's not what you want to see happen.

This PR makes the autocast cache and buffer stealing aware of the cudagraph static output tensors.
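
The reference-count issue described here can be illustrated with a minimal, pure-Python sketch. It is not the PyTorch implementation: `Buffer`, `adjusted_use_count`, and `can_steal` are hypothetical names, and the explicit `use_count` field only models the idea that a "sole owner" check gives different answers once an output cache holds one extra reference.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    """Hypothetical stand-in for a tensor with an explicit reference count."""
    use_count: int
    cached: bool = False  # True when an output cache also holds a reference


def adjusted_use_count(buf: Buffer) -> int:
    # Discount the single reference held by the output cache, if any.
    return buf.use_count - (1 if buf.cached else 0)


def can_steal(buf: Buffer, aware: bool) -> bool:
    # A buffer is treated as stealable only when the caller is its sole owner.
    count = adjusted_use_count(buf) if aware else buf.use_count
    return count == 1


# Initial recording: outputs are not cached, so there is one owner.
recorded = Buffer(use_count=1)
# Later executions: the cache keeps one extra reference to the same output.
replayed = Buffer(use_count=2, cached=True)

# A check that ignores the cache behaves differently on replay ...
naive = (can_steal(recorded, aware=False), can_steal(replayed, aware=False))
# ... while a cache-aware check makes the same decision in both phases.
aware = (can_steal(recorded, aware=True), can_steal(replayed, aware=True))
```

In this model the naive check diverges between recording and execution, while the cache-aware check agrees in both phases, which is the kind of consistency the change aims for.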

I will add this to the other cudagraph implementation in another PR.

Not sure if this should be in autograd or in autocast, since it affects both, or somewhere else.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99368
Approved by: https://github.com/albanD, https://github.com/ezyang
2023-04-24 20:23:12 +00:00
_nvfuser
ao/sparsity Back out "[core][pruning][be] rename BaseSparsifier to BasePruner (#98747)" (#99171) 2023-04-15 00:37:45 +00:00
autograd
backends/xeon
benchmark_utils [BE] Remove unnecessary dict comprehensions (#97116) 2023-03-20 00:56:57 +00:00
bottleneck_test
cpp [reland][BE][autograd Function] Raise an error if input is returned a… (#98051) 2023-04-11 15:42:54 +00:00
cpp_api_parity [BE] Remove unnecessary dict comprehensions (#97116) 2023-03-20 00:56:57 +00:00
cpp_extensions [Feature] storage pin memory support custom device. (#99712) 2023-04-21 18:31:01 +00:00
custom_backend
custom_operator Enable TestTorchbind on Windows (#96507) 2023-03-16 16:18:08 +00:00
distributed [spmd] Add list handling to data parallel and add foreach tests (#99373) 2023-04-22 05:39:20 +00:00
distributions
dynamo Revert "Do not assume static by default when exporting (#99554)" 2023-04-24 08:27:56 +00:00
edge
error_messages
expect Reland python ops (#99170) 2023-04-18 15:15:46 +00:00
export Revert "Delete tracing_mode argument to export (#99555)" 2023-04-24 08:21:41 +00:00
forward_backward_compatibility tweak heuristic for sdpa selection based off of *data* (and a decision tree) (#99644) 2023-04-21 23:28:44 +00:00
functorch Fix fake tracing of cross entropy with label smoothing and weight (#99830) 2023-04-24 04:07:23 +00:00
fx Revert "Delete tracing_mode argument to export (#99555)" 2023-04-24 08:21:41 +00:00
inductor Make autocast cache and buffer stealing aware of cudagraph static output tensors (#99368) 2023-04-24 20:23:12 +00:00
jit [JIT] clarify errors due to non-literal indexing into ModuleList, ModuleDict (#98606) 2023-04-18 02:53:53 +00:00
jit_hooks
lazy
mobile
nn Fix module backward pre-hooks to actually update gradient (#97983) 2023-03-30 20:33:44 +00:00
onnx [ONNX] Add additional_test_kwargs into test_fx_to_onnx_with_onnxruntime.py (#99434) 2023-04-22 04:03:50 +00:00
onnx_caffe2
package [torch package][easy] Make all the save/load tests use buffers (#98798) 2023-04-14 13:52:17 +00:00
profiler Removed hip call hipDeviceSynchronize (#97209) 2023-04-09 20:12:52 +00:00
quantization Revert "Delete tracing_mode argument to export (#99555)" 2023-04-24 08:21:41 +00:00
scripts Fix flake8 lint errors reported by ruff - take 2 (#99798) 2023-04-23 23:09:51 +00:00
test_img
typing
allowlist_for_publicAPI.json
conftest.py Fix pytest config (#98607) 2023-04-08 00:55:51 +00:00
create_dummy_torchscript_model.py
delete.py
HowToWriteTestsUsingFileCheck.md
linear.py
load_torchscript_model.py
mkl_verbose.py
mkldnn_verbose.py
run_doctests.sh
run_test.py Discover and run C++ tests with run_test.py (#99559) 2023-04-22 00:23:31 +00:00
simulate_nccl_errors.py Fix G001,G002,G003 in logs to % syntax (#97812) 2023-04-01 01:43:33 +00:00
test_ao_sparsity.py Back out "[core][pruning][be] rename BaseSparsifier to BasePruner (#98747)" (#99171) 2023-04-15 00:37:45 +00:00
test_autocast.py Make autocast cache and buffer stealing aware of cudagraph static output tensors (#99368) 2023-04-24 20:23:12 +00:00
test_autograd.py [BE] Enable flake8-comprehension rule C417 (#97880) 2023-03-30 14:34:24 +00:00
test_binary_ufuncs.py fix mul/div overflow issue on CPU float16 (#98820) 2023-04-17 07:12:53 +00:00
test_bundled_images.py
test_bundled_inputs.py
test_comparison_utils.py Add missing __main__ in two unittests (#97302) 2023-03-22 19:09:08 +00:00
test_compile_benchmark_util.py torch.compile benchmark utility (#97699) 2023-04-12 03:02:06 +00:00
test_complex.py Fix CPU vectorized eq and ne operations for complex types (#97374) 2023-04-14 02:02:16 +00:00
test_cpp_api_parity.py
test_cpp_extensions_aot.py Support large negative SymInt (#99157) 2023-04-15 22:43:51 +00:00
test_cpp_extensions_jit.py
test_cpp_extensions_open_device_registration.py [Feature] storage pin memory support custom device. (#99712) 2023-04-21 18:31:01 +00:00
test_cuda.py Remove redundant found_inf recompute from _step_supports_amp_unscaling path (#98620) 2023-04-20 19:24:09 +00:00
test_cuda_expandable_segments.py Revert "Revert "Expandable blocks in allocator (#96995)"" (#99275) 2023-04-17 23:46:08 +00:00
test_cuda_nvml_based_avail.py
test_cuda_primary_ctx.py
test_cuda_sanitizer.py
test_cuda_trace.py
test_dataloader.py Revert "Revert "Expandable blocks in allocator (#96995)"" (#99275) 2023-04-17 23:46:08 +00:00
test_datapipe.py [BE] Update flake8-comprehensions and adapt to rule C418 (#99178) 2023-04-15 15:33:42 +00:00
test_decomp.py addmv decomp #2 (#96264) 2023-03-16 23:09:45 +00:00
test_deploy.py
test_determination.py
test_dispatch.py
test_dlpack.py
test_dynamic_shapes.py suggest constraints to specify for export based on generated shape guards (#98463) 2023-04-19 21:56:36 +00:00
test_expanded_weights.py
test_fake_tensor.py Fix fake tracing of cross entropy with label smoothing and weight (#99830) 2023-04-24 04:07:23 +00:00
test_flop_counter.py tweak heuristic for sdpa selection based off of *data* (and a decision tree) (#99644) 2023-04-21 23:28:44 +00:00
test_foreach.py [test_foreach] add cases of zero size tensors (#95028) 2023-03-23 00:12:13 +00:00
test_function_schema.py
test_functional_autograd_benchmark.py
test_functional_optim.py
test_functionalization.py dont bake in defaults when tracing *_like factories (#97564) 2023-03-27 22:53:44 +00:00
test_functionalization_of_rng_ops.py Functionalization of torch.rand/rand_like ops (#97377) 2023-04-16 09:55:56 +00:00
test_futures.py
test_fx.py [fx] Add a function to allow adding more functions to the side effect function set (#97288) 2023-04-22 04:42:24 +00:00
test_fx_experimental.py fix conv+bn folding issue for mixed dtype (#99696) 2023-04-23 05:13:40 +00:00
test_fx_passes.py
test_fx_reinplace_pass.py
test_hub.py torch.hub: add safe weights_only option to load_state_dict_from_url (#98479) 2023-04-11 12:44:25 +00:00
test_import_stats.py
test_indexing.py
test_itt.py
test_jit.py Fix flake8 lint errors reported by ruff - take 2 (#99798) 2023-04-23 23:09:51 +00:00
test_jit_autocast.py
test_jit_cuda_fuser.py
test_jit_disabled.py
test_jit_fuser.py
test_jit_fuser_legacy.py
test_jit_fuser_te.py Fix flake8 lint errors reported by ruff - take 2 (#99798) 2023-04-23 23:09:51 +00:00
test_jit_legacy.py
test_jit_llga_fuser.py
test_jit_profiling.py
test_jit_simple.py
test_jit_string.py
test_jiterator.py [ROCm] add skipCUDAIfVersionLessThan to unskip test_jiterator for ROCm (#99197) 2023-04-17 16:05:16 +00:00
test_kernel_launch_checks.py
test_legacy_vmap.py
test_license.py
test_linalg.py [CUDA][CUDA 11] Remove more CUDA 11 version checks (#92934) 2023-03-30 19:49:52 +00:00
test_logging.py
test_masked.py
test_maskedtensor.py fix random mask creation in test_maskedtensor (#97017) 2023-03-24 23:55:17 +00:00
test_matmul_cuda.py [CUBLAS] Specify alignment for cuBlasLt addmm (#98975) 2023-04-18 06:19:30 +00:00
test_meta.py [primTorch] Add count_nonzero (#98995) 2023-04-13 22:08:19 +00:00
test_metal.py
test_mkl_verbose.py
test_mkldnn.py Revert "fix onednn ConvTranspose2d channels last issue when ic=1 (#99539)" 2023-04-21 08:44:28 +00:00
test_mkldnn_fusion.py
test_mkldnn_verbose.py
test_mobile_optimizer.py
test_model_dump.py
test_module_init.py
test_modules.py Added ModuleInfos for Pooling ops (#98358) 2023-04-05 19:39:07 +00:00
test_monitor.py
test_mps.py Fix flake8 lint errors reported by ruff - take 2 (#99798) 2023-04-23 23:09:51 +00:00
test_multiprocessing.py Revert "Revert "Expandable blocks in allocator (#96995)"" (#99275) 2023-04-17 23:46:08 +00:00
test_multiprocessing_spawn.py
test_namedtensor.py
test_namedtuple_return_api.py
test_native_functions.py
test_native_mha.py
test_nestedtensor.py Add NestedTensor ops: logical_not, logical_not_, masked_fill (#97934) 2023-03-30 08:14:39 +00:00
test_nn.py Update channel shuffle to return alias instead of self as-is (#99745) 2023-04-24 14:02:14 +00:00
test_nnapi.py
test_numba_integration.py
test_numpy_interop.py
test_nvfuser_dynamo.py
test_nvfuser_frontend.py
test_openmp.py
test_ops.py Turn on meta converter for complex (#98869) 2023-04-20 16:42:38 +00:00
test_ops_fwd_gradients.py
test_ops_gradients.py
test_ops_jit.py
test_optim.py Change 1D Tensor of 1 element to 0D Tensor (#96994) 2023-03-21 18:24:19 +00:00
test_overrides.py Make gen_annotated_args support kwargs (#98396) 2023-04-06 19:42:26 +00:00
test_package.py
test_per_overload_api.py
test_prims.py Functionalization of torch.rand/rand_like ops (#97377) 2023-04-16 09:55:56 +00:00
test_proxy_tensor.py Guard static shapes alongside tensors, instead of from shape_env, in dynamic_shapes=True (#99566) 2023-04-22 16:46:52 +00:00
test_pruning_op.py Add missing __main__ in two unittests (#97302) 2023-03-22 19:09:08 +00:00
test_public_bindings.py Add Symbool support in python to C++ translation (#98453) 2023-04-12 03:21:57 +00:00
test_python_dispatch.py Support record_stream in dispatch mode (#99529) 2023-04-21 07:17:19 +00:00
test_pytree.py
test_quantization.py Fix the pt2e UT path after refactor (#99402) 2023-04-18 10:48:52 +00:00
test_reductions.py
test_scatter_gather_ops.py add deterministic impl for scatter and scatter_reduction sum/mean mode (#98060) 2023-04-03 20:38:29 +00:00
test_schema_check.py
test_segment_reductions.py Allow data size equal to 0 for SegmentReduce (#99733) 2023-04-23 01:59:45 +00:00
test_serialization.py Add test for pickle_module (#98373) 2023-04-05 13:05:05 +00:00
test_set_default_mobile_cpu_allocator.py
test_shape_ops.py
test_show_pickle.py
test_sort_and_select.py
test_sparse.py Enable cadd_sparse for BFloat16 on CPU (#96767) 2023-04-14 19:50:49 +00:00
test_sparse_csr.py nn.Linear: dispatch to bsr_dense_mm for half and bfloat16 (#94825) 2023-04-15 13:38:42 +00:00
test_spectral_ops.py Revert "[cuda rng] Making offset calculation independent of device properties (#98988)" 2023-04-19 17:23:40 +00:00
test_stateless.py
test_static_runtime.py
test_subclass.py
test_sympy_utils.py
test_tensor_creation_ops.py [CI] Fix test failures at TestTensorCreationCPU.test_float_to_int_conversion_finite_cpu_uint8 (#98916) 2023-04-18 15:05:12 +00:00
test_tensorboard.py Use is_available instead of device_count to check for CUDA availability (#97043) 2023-03-18 00:39:42 +00:00
test_tensorexpr.py
test_tensorexpr_pybind.py
test_testing.py [ROCm] Enable test_filtering_env_var (#84100) 2023-04-04 21:49:53 +00:00
test_throughput_benchmark.py
test_torch.py Add itemsize and nbytes properties to Tensor (#98322) 2023-04-05 12:11:55 +00:00
test_transformers.py tweak heuristic for sdpa selection based off of *data* (and a decision tree) (#99644) 2023-04-21 23:28:44 +00:00
test_type_hints.py
test_type_info.py
test_type_promotion.py [BE] Enable flake8-comprehension rule C417 (#97880) 2023-03-30 14:34:24 +00:00
test_typing.py [BE] Enable flake8-comprehension rule C417 (#97880) 2023-03-30 14:34:24 +00:00
test_unary_ufuncs.py
test_utils.py add get_device_index for custom device (#98804) 2023-04-12 23:58:31 +00:00
test_view_ops.py Add overflow check for stride calculation (#94900) 2023-04-09 01:30:55 +00:00
test_vulkan.py
test_weak.py
test_xnnpack_integration.py