.. currentmodule:: torch.cuda.tunable

TunableOp
=========

.. note::
    This is a prototype feature, which means it is at an early stage
    for feedback and testing, and its components are subject to change.

Overview
--------

.. automodule:: torch.cuda.tunable
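A minimal usage sketch of online tuning. This assumes a CUDA- or ROCm-enabled build of PyTorch; the tensor sizes are arbitrary examples, and the function names are those listed in the API reference:

```python
# Hedged sketch: enable TunableOp, run a GEMM so it gets tuned, and
# persist the selected solutions.  Guarded so it is a no-op when
# PyTorch or a GPU is unavailable.
try:
    import torch
    has_cuda = torch.cuda.is_available()
except ImportError:  # torch not installed
    has_cuda = False

if has_cuda:
    torch.cuda.tunable.enable()          # turn TunableOp on
    torch.cuda.tunable.tuning_enable()   # permit tuning of new GEMMs
    a = torch.rand(1024, 1024, device="cuda")
    b = torch.rand(1024, 1024, device="cuda")
    c = a @ b                            # first call triggers tuning
    torch.cuda.tunable.write_file()      # write results to the tuning file
```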

API Reference
-------------

.. autofunction:: enable
.. autofunction:: is_enabled
.. autofunction:: tuning_enable
.. autofunction:: tuning_is_enabled
.. autofunction:: record_untuned_enable
.. autofunction:: record_untuned_is_enabled
.. autofunction:: set_max_tuning_duration
.. autofunction:: get_max_tuning_duration
.. autofunction:: set_max_tuning_iterations
.. autofunction:: get_max_tuning_iterations
.. autofunction:: set_filename
.. autofunction:: get_filename
.. autofunction:: get_results
.. autofunction:: get_validators
.. autofunction:: write_file_on_exit
.. autofunction:: write_file
.. autofunction:: read_file
.. autofunction:: tune_gemm_in_file
.. autofunction:: mgpu_tune_gemm_in_file
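The same switches can also be set from the environment before ``torch`` is imported. A minimal sketch, assuming the environment-variable names documented for TunableOp (``results.csv`` is a hypothetical file name, and the variables only take effect in a CUDA/ROCm build of PyTorch):

```python
import os

# Assumed env-var counterparts of the API above; must be set before
# `import torch`.  "results.csv" is a made-up example file name.
os.environ["PYTORCH_TUNABLEOP_ENABLED"] = "1"   # counterpart of enable()
os.environ["PYTORCH_TUNABLEOP_TUNING"] = "1"    # counterpart of tuning_enable()
os.environ["PYTORCH_TUNABLEOP_FILENAME"] = "results.csv"

# Equivalent calls through the Python API:
#   import torch
#   torch.cuda.tunable.enable()
#   torch.cuda.tunable.tuning_enable()
#   torch.cuda.tunable.set_filename("results.csv")
```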