Mirror of https://github.com/saymrwulf/pytorch.git, synced 2026-05-14 20:57:59 +00:00
This PR enhances offline tuning to support multi-GPUs.

High-level description of algorithm:
- Duplicate GEMMs are first eliminated
- GEMMs are distributed to multi-GPUs for tuning
- Results are gathered into a file with `_full` in the filename

Also adding support for GemmAndBias and ScaledGemm

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139673
Approved by: https://github.com/jeffdaily, https://github.com/hongxiayang
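
The three steps in the commit message above (deduplicate, distribute, gather) can be illustrated with a small pure-Python sketch. This is not the code from the PR and does not call `torch.cuda.tunable`: the function names, the round-robin distribution, the GEMM description strings, and the exact placement of `_full` before the file extension are all assumptions made for illustration.

```python
from pathlib import Path


def dedupe(gemm_lines):
    """Drop blank and repeated GEMM descriptions, keeping first-seen order."""
    stripped = (line.strip() for line in gemm_lines)
    return list(dict.fromkeys(line for line in stripped if line))


def partition(items, num_gpus):
    """Split the unique GEMMs into num_gpus buckets (round-robin is an assumption)."""
    return [items[i::num_gpus] for i in range(num_gpus)]


def full_results_path(results_path):
    """Build an output name with `_full` before the extension (hypothetical naming)."""
    p = Path(results_path)
    return p.with_name(f"{p.stem}_full{p.suffix}")


def gather(per_gpu_results, results_path):
    """Concatenate the per-GPU result lines into the single `_full` file."""
    out = full_results_path(results_path)
    lines = [line for chunk in per_gpu_results for line in chunk]
    out.write_text("\n".join(lines) + "\n")
    return out


# Illustrative input only; these strings are not a documented file format.
untuned = ["gemm_a", "gemm_b", "gemm_a", "gemm_c", ""]
unique = dedupe(untuned)
buckets = partition(unique, num_gpus=2)
```

In a real run each bucket would be tuned on its own device before the per-GPU results are merged; here the buckets only show how the work would be divided.
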
| Name | | |
|---|---|---|
| amp | ||
| __init__.py | ||
| _gpu_trace.py | ||
| _memory_viz.py | ||
| _sanitizer.py | ||
| _utils.py | ||
| comm.py | ||
| error.py | ||
| gds.py | ||
| graphs.py | ||
| jiterator.py | ||
| memory.py | ||
| nccl.py | ||
| nvtx.py | ||
| profiler.py | ||
| random.py | ||
| sparse.py | ||
| streams.py | ||
| tunable.py | ||