Mirror of https://github.com/saymrwulf/pytorch.git, synced 2026-05-15 21:00:47 +00:00
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44087

Each op taking a TensorOptions argument now has an additional overload in the C++ frontend where it takes scattered ScalarType, Layout, Device, bool instead of one TensorOptions argument.

If it is a c10-full op, then the scattered version calls into the dispatcher and the gathered version is a proxy calling into the scattered version. If it is a non-c10-full op, then the gathered version calls into the dispatcher and the scattered version is a proxy calling into the gathered version. This should minimize the amount of gathering and scattering needed.

This PR is also a prerequisite to remove the re-gathering of arguments that is currently happening in VariableKernel. Currently, VariableKernels gather arguments into a TensorOptions object to call into the C++ API. In a PR stacked on top of this, VariableKernel will just directly call into the scattered C++ API introduced here and avoid the gathering step.

ghstack-source-id: 113355689

Test Plan: waitforsandcastle

vs master: https://www.internalfb.com/intern/fblearner/details/216169815/

vs previous diff: https://www.internalfb.com/intern/fblearner/details/216169957/

Reviewed By: ezyang

Differential Revision: D23492188

fbshipit-source-id: 3e84c467545ad9371e98e09075a311bd18411c5a

| File |
|---|
| `__init__.py` |
| `cpp.py` |
| `dispatcher.py` |
| `legacy_dispatcher.py` |
| `types.py` |
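
The proxy arrangement described in the commit summary can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not the generated C++ code: `TensorOptions` here is a plain dataclass standing in for the real C++ type, and `make_overloads`, `dispatch_log`, `op_a`, and `op_b` are hypothetical names introduced for illustration. The point it shows is that each op has exactly one overload that reaches the dispatcher, and the other overload converts its arguments once and forwards.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class TensorOptions:
    """Toy stand-in for the C++ TensorOptions bundle (assumption)."""
    dtype: Optional[str] = None
    layout: Optional[str] = None
    device: Optional[str] = None
    pin_memory: Optional[bool] = None


# Records (op name, argument form) each time the toy dispatcher is reached.
dispatch_log: List[Tuple[str, str]] = []


def make_overloads(name: str, c10_full: bool) -> Tuple[Callable, Callable]:
    """Return the (gathered, scattered) overloads for one hypothetical op."""
    if c10_full:
        # c10-full: the scattered overload is the one that dispatches.
        def scattered(size, dtype=None, layout=None, device=None, pin_memory=None):
            dispatch_log.append((name, "scattered"))
            return (size, dtype, layout, device, pin_memory)

        # The gathered overload is a proxy: scatter once, then forward.
        def gathered(size, options=TensorOptions()):
            return scattered(size, options.dtype, options.layout,
                             options.device, options.pin_memory)
    else:
        # Non-c10-full: the gathered overload is the one that dispatches.
        def gathered(size, options=TensorOptions()):
            dispatch_log.append((name, "gathered"))
            return (size, options.dtype, options.layout,
                    options.device, options.pin_memory)

        # The scattered overload is a proxy: gather once, then forward.
        def scattered(size, dtype=None, layout=None, device=None, pin_memory=None):
            return gathered(size, TensorOptions(dtype, layout, device, pin_memory))

    return gathered, scattered


# Hypothetical ops: op_a is treated as c10-full, op_b as non-c10-full.
a_gathered, a_scattered = make_overloads("op_a", c10_full=True)
b_gathered, b_scattered = make_overloads("op_b", c10_full=False)

opts = TensorOptions(dtype="float", device="cpu")
results = [
    a_gathered((2, 3), opts),
    a_scattered((2, 3), dtype="float", device="cpu"),
    b_gathered((2, 3), opts),
    b_scattered((2, 3), dtype="float", device="cpu"),
]
```

Both overloads of an op return the same result, and the log shows that `op_a` always reaches the dispatcher in scattered form while `op_b` always reaches it in gathered form, so at most one conversion happens per call.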