pytorch/torch/cuda
Nichols A. Romero a99332eb25 [ROCM] Support Multi-GPU offline tuning in TunableOp (#139673)
This PR enhances offline tuning to support multiple GPUs.

High-level description of algorithm:
- Duplicate GEMMs are first eliminated
- The remaining GEMMs are distributed across multiple GPUs for tuning
- Results are gathered into a single file with `_full` in the filename

Also adds support for GemmAndBias and ScaledGemm.
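The three steps above (dedupe, distribute, gather) can be sketched in plain Python. This is an illustrative outline of the algorithm only; the function names `dedupe_gemms`, `distribute`, and `gather_full` are hypothetical and are not the actual `torch.cuda.tunable` API:

```python
def dedupe_gemms(lines):
    """Step 1: eliminate duplicate GEMM entries, preserving first-seen order."""
    seen = set()
    out = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            out.append(line)
    return out


def distribute(gemms, num_gpus):
    """Step 2: round-robin the unique GEMMs across the available GPUs.

    (The actual scheduling policy is an assumption; the PR only states that
    GEMMs are distributed to multiple GPUs for tuning.)
    """
    shards = [[] for _ in range(num_gpus)]
    for i, gemm in enumerate(gemms):
        shards[i % num_gpus].append(gemm)
    return shards


def gather_full(per_gpu_results, base_filename):
    """Step 3: merge per-GPU results into one file with `_full` in its name."""
    root, dot, ext = base_filename.rpartition(".")
    full_name = f"{root}_full{dot}{ext}" if dot else f"{base_filename}_full"
    merged = [result for shard in per_gpu_results for result in shard]
    return full_name, merged


# Example: two duplicate GEMMs collapse to one, work splits across 2 GPUs.
gemms = dedupe_gemms(["GemmTunableOp_A", "GemmTunableOp_B", "GemmTunableOp_A"])
shards = distribute(gemms, num_gpus=2)
name, merged = gather_full([["result_A"], ["result_B"]], "tunableop_results.csv")
```

Here `name` becomes `tunableop_results_full.csv`, matching the `_full` naming convention described above.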

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139673
Approved by: https://github.com/jeffdaily, https://github.com/hongxiayang
2024-11-26 19:07:41 +00:00
amp
__init__.py [ROCm] AMDSMI memory usage unification (#139900) 2024-11-21 21:11:39 +00:00
_gpu_trace.py
_memory_viz.py
_sanitizer.py
_utils.py
comm.py
error.py
gds.py
graphs.py
jiterator.py
memory.py fix: Add type annotation to _record_memory_history (#140545) 2024-11-14 17:44:46 +00:00
nccl.py
nvtx.py
profiler.py
random.py [BE]: Apply PERF401 autofixes from ruff (#140980) 2024-11-20 17:52:07 +00:00
sparse.py
streams.py
tunable.py [ROCM] Support Multi-GPU offline tuning in TunableOp (#139673) 2024-11-26 19:07:41 +00:00