Mirror of https://github.com/saymrwulf/pytorch.git, synced 2026-05-14 20:57:59 +00:00
Summary: In et-replay, random data is used to run the operators. However, this does not work well for ops that use indices to access tensors, for example embedding ops, which use their index inputs to look up rows of an embedding table. If random data is used for these index tensors, et-replay usually runs into invalid memory accesses. To fix this, ET provides an environment variable, ENABLE_PYTORCH_EXECUTION_TRACE_INTEGRAL_TENSOR_RANGE; when it is set, ET captures the min/max values of each flattened integral tensor. During replay, et_replay then generates the random tensors within that captured range, which avoids the invalid memory accesses.

Test Plan:
buck2 run mode/opt caffe2/test:test_profiler_cuda -- profiler.test_execution_trace.TestExecutionTraceCUDA.test_execution_trace_record_integral_tensor_range_cuda

Differential Revision: D66666931

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143088
Approved by: https://github.com/sanrise
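The capture-and-replay idea above can be sketched in plain Python (the function names and metadata layout here are illustrative, not ET's actual API): at trace time, record the min/max of each flattened integral tensor; at replay time, draw random values within that range so index-based ops such as embedding lookups never read out of bounds.

```python
import random

def capture_integral_range(flat_values):
    # Trace side: when the range-capture env var is set, record the
    # min/max of the flattened integral tensor along with its length.
    return {
        "min": min(flat_values),
        "max": max(flat_values),
        "numel": len(flat_values),
    }

def make_replay_tensor(meta):
    # Replay side: instead of unconstrained random data, generate
    # random indices within the captured [min, max] range.
    lo, hi = meta["min"], meta["max"]
    return [random.randint(lo, hi) for _ in range(meta["numel"])]

# Example: indices into a 10-row embedding table seen at trace time.
captured = capture_integral_range([0, 3, 7, 9, 2])
replayed = make_replay_tensor(captured)
# Every replayed index stays inside the valid row range [0, 9].
assert all(0 <= i <= 9 for i in replayed)
```

In real et_replay the same principle applies to torch tensors (e.g. via `torch.randint(lo, hi + 1, shape)`), with the range metadata read from the execution trace.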
| Name |
|---|
| profiler_utils_mock_events.json |
| test_cpp_thread.cpp |
| test_cpp_thread.py |
| test_cpp_thread_lib.pyi |
| test_execution_trace.py |
| test_kineto.py |
| test_memory_profiler.py |
| test_profiler.py |
| test_profiler_tree.py |
| test_record_function.py |
| test_torch_tidy.py |