pytorch/docs/source/cuda.tunable.rst
Nichols A. Romero a99332eb25 [ROCM] Support Multi-GPU offline tuning in TunableOp (#139673)
This PR enhances offline tuning to support multiple GPUs.

High-level description of the algorithm:
- Duplicate GEMMs are first eliminated
- GEMMs are distributed across multiple GPUs for tuning
- Results are gathered into a file with `_full` in the filename
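
The three steps above can be sketched in plain Python. This is only an illustration of the idea, not the code from the PR: the helper names (``dedupe``, ``distribute``, ``gather``, ``full_results_name``) are hypothetical, the round-robin assignment is an assumed distribution strategy, and the exact position of ``_full`` in the output filename is also an assumption.

```python
import os


def dedupe(gemms):
    """Drop duplicate GEMM descriptions while keeping first-seen order."""
    return list(dict.fromkeys(gemms))


def distribute(gemms, num_gpus):
    """Assign GEMMs to GPU ordinals round-robin (an assumed strategy)."""
    shards = [[] for _ in range(num_gpus)]
    for index, gemm in enumerate(gemms):
        shards[index % num_gpus].append(gemm)
    return shards


def gather(per_gpu_results):
    """Concatenate the per-GPU result lists into one list."""
    merged = []
    for results in per_gpu_results:
        merged.extend(results)
    return merged


def full_results_name(filename):
    """Hypothetical naming helper: put `_full` before the file extension."""
    root, ext = os.path.splitext(filename)
    return f"{root}_full{ext}"


if __name__ == "__main__":
    recorded = ["gemm_a", "gemm_b", "gemm_a", "gemm_c", "gemm_b", "gemm_d"]
    unique = dedupe(recorded)
    shards = distribute(unique, num_gpus=2)
    # In the real feature each shard would be tuned on its own GPU; here the
    # shard itself stands in for that GPU's tuning results.
    print(gather(shards), full_results_name("results.csv"))
```
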

Also adds support for GemmAndBias and ScaledGemm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139673
Approved by: https://github.com/jeffdaily, https://github.com/hongxiayang
2024-11-26 19:07:41 +00:00


.. currentmodule:: torch.cuda.tunable

TunableOp
=========

.. note::
    This is a prototype feature, which means it is at an early stage
    for feedback and testing, and its components are subject to change.

Overview
--------

.. automodule:: torch.cuda.tunable

API Reference
-------------

.. autofunction:: enable
.. autofunction:: is_enabled
.. autofunction:: tuning_enable
.. autofunction:: tuning_is_enabled
.. autofunction:: record_untuned_enable
.. autofunction:: record_untuned_is_enabled
.. autofunction:: set_max_tuning_duration
.. autofunction:: get_max_tuning_duration
.. autofunction:: set_max_tuning_iterations
.. autofunction:: get_max_tuning_iterations
.. autofunction:: set_filename
.. autofunction:: get_filename
.. autofunction:: get_results
.. autofunction:: get_validators
.. autofunction:: write_file_on_exit
.. autofunction:: write_file
.. autofunction:: read_file
.. autofunction:: tune_gemm_in_file
.. autofunction:: mgpu_tune_gemm_in_file
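
As a rough illustration of how several of the functions listed above might be combined, the sketch below enables TunableOp and online tuning and points the results at a file. The wrapper ``configure_online_tuning`` is a hypothetical helper, not part of PyTorch, and the default limits (30 and 100) are arbitrary example values. It takes the module as an argument so that it only relies on the listed function names; the duration is assumed to be in milliseconds.

```python
def configure_online_tuning(tunable, results_file,
                            max_duration_ms=30, max_iterations=100):
    """Enable TunableOp and tuning on a torch.cuda.tunable-like module.

    `tunable` is expected to expose the functions documented above
    (enable, tuning_enable, set_max_tuning_duration,
    set_max_tuning_iterations, set_filename, get_filename).
    """
    tunable.enable(True)           # turn TunableOp on
    tunable.tuning_enable(True)    # allow new GEMMs to be tuned
    # Bound the time and iteration count spent on each candidate solution.
    tunable.set_max_tuning_duration(max_duration_ms)
    tunable.set_max_tuning_iterations(max_iterations)
    # Choose where tuning results are written.
    tunable.set_filename(results_file)
    return tunable.get_filename()
```

With a GPU-enabled PyTorch build this could be called as ``configure_online_tuning(torch.cuda.tunable, "tunableop_results.csv")`` before running the workload to be tuned.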