pytorch/torch
Simon Fan d27509c384 [compiled autograd] support custom ops backed by c++ autograd::Function (#120681)
- Adds support for custom ops backed by C++ custom autograd functions, e.g. fbgemm
- Includes files more granularly to avoid namespace pollution and circular imports

Limitations:
- Requires users to audit their code and opt in their custom autograd::Function via autograd::Function::is_traceable, possibly also implementing compiled_args + apply_with_saved. This was the only way I could think of to guarantee soundness (see the sketch after this list).
- Throws if we can't hash the saved_data, i.e. for any type other than list and dict that is not implemented in at::IValue::hash (b0cfa96e82/aten/src/ATen/core/ivalue.cpp (L364)).
- Can technically fail silently, but only if the typeid hash and the typeid string name of the custom autograd::Function both collide at the same time, and an otherwise identical autograd graph containing a different custom autograd::Function with an identical implementation is called. This case seems extremely unlikely, and the only alternative to hashing I can think of is compiling with reflection.
- Tensors not saved via save_variables are not lifted; they are specialized on the TensorImpl*'s hash (treated as a memory address). If needed, we can lift them.
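
For illustration, a minimal sketch of what the opt-in might look like for a C++ custom autograd function. The is_traceable flag is the mechanism named above; the MyScale op itself and its details are hypothetical:

    #include <torch/torch.h>

    using torch::autograd::AutogradContext;
    using torch::autograd::Function;
    using torch::autograd::variable_list;

    // Hypothetical custom op: scales its input by a constant factor.
    struct MyScale : public Function<MyScale> {
      // The opt-in: asserts that backward() is safe for compiled autograd
      // to trace (no data-dependent control flow, no hidden state).
      static constexpr bool is_traceable = true;

      static torch::Tensor forward(
          AutogradContext* ctx, const torch::Tensor& input, double scale) {
        // saved_data entries must be hashable by at::IValue::hash (see above).
        ctx->saved_data["scale"] = scale;
        return input * scale;
      }

      static variable_list backward(AutogradContext* ctx, variable_list grads) {
        double scale = ctx->saved_data["scale"].toDouble();
        // One gradient per forward input; the non-tensor input gets an
        // undefined tensor.
        return {grads[0] * scale, torch::Tensor()};
      }
    };

    // Usage: torch::Tensor y = MyScale::apply(x, 2.0);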

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120681
Approved by: https://github.com/jansel
2024-03-08 20:43:29 +00:00
_awaits
_C Batch Norm Consolidation (#116092) 2024-03-08 15:07:15 +00:00
_C_flatbuffer
_custom_op
_decomp Batch Norm Consolidation (#116092) 2024-03-08 15:07:15 +00:00
_dispatch
_dynamo Improve Dynamo support for torch function and class methods in general (#121365) 2024-03-08 20:03:49 +00:00
_export fix accidental specialization with faketensor input checks (#121460) 2024-03-08 08:02:37 +00:00
_functorch [AOTDispatch] Return mutated inputs directly when keeping mutations (#120514) 2024-03-08 16:33:26 +00:00
_higher_order_ops Clean up mode handling in python dispatcher (#121083) 2024-03-08 00:30:34 +00:00
_inductor cudagraphs backend refactoring (#121017) 2024-03-08 19:47:41 +00:00
_lazy
_library Fix FallbackKernel behavior on mutable ops (#118649) 2024-02-09 19:01:54 +00:00
_logging Switch TORCH_TRACE to accept a directory by default (#121331) 2024-03-06 22:46:18 +00:00
_numpy Fix dynamo failure w/ astype (#117952) 2024-02-03 08:10:15 +00:00
_prims add decomposition for frexp (#119217) 2024-02-23 21:52:42 +00:00
_prims_common Handle transposition pattern seen in SDPA with unbacked SymInts (#121005) 2024-03-01 18:58:19 +00:00
_refs [dynamo bug burndown] update tensor creation to support sequences of tensors (#120872) 2024-03-02 02:22:59 +00:00
_subclasses Revert "[fake_impls] Fix seed/offset device for attention kernels (#120839)" (#121447) 2024-03-08 01:48:23 +00:00
_vendor
amp Remove device assert in Gradscaler (#119362) 2024-02-22 08:02:18 +00:00
ao Update Quantizable LSTM to support QAT (#121448) 2024-03-08 18:55:50 +00:00
autograd Deprecate torch.autograd.function.traceable, is_traceable (#121413) 2024-03-08 18:41:07 +00:00
backends [TorchElastic] Refactoring to support non-default logging strategy (#120691) 2024-02-29 20:59:17 +00:00
compiler [torch.export] Support is_compiling() flag for non-strict mode (#119602) 2024-02-29 05:52:51 +00:00
contrib
cpu
csrc [compiled autograd] support custom ops backed by c++ autograd::Function (#120681) 2024-03-08 20:43:29 +00:00
cuda [BE]: FURB187 Use inplace reverse on lists: faster, more readable. (#121140) 2024-03-05 01:36:17 +00:00
distributed [c10d] Deprecate torch.distributed.pipeline (#121464) 2024-03-08 19:55:02 +00:00
distributions Bugfix to MixtureSameFamily's _pad_mixture_dimension (#118947) 2024-02-06 16:24:22 +00:00
export [export] Fix nn_module_stack in retracing (#121423) 2024-03-08 00:34:11 +00:00
fft
func Let torch dynamo inline torch.func.grad (#118407) 2024-02-28 20:05:00 +00:00
futures
fx Revert "[fx] Preserve Fx graph node order in partitioner across runs (#115621)" 2024-03-08 19:50:57 +00:00
jit Batch Norm Consolidation (#116092) 2024-03-08 15:07:15 +00:00
legacy
lib Remove unneeded linking of torch_shm_manager in CMake (#119540) 2024-02-11 06:33:35 +00:00
linalg Add links to _ex variants in all linalg functions that support them (#121451) 2024-03-08 12:19:16 +00:00
masked
monitor
mps
multiprocessing
nested [NJT] support chunk on batch dim (#119713) 2024-03-05 17:57:50 +00:00
nn Add complex support to parametrizations.spectral_norm (#121452) 2024-03-08 19:17:20 +00:00
onnx Add Float8 support to onnx exporter (#121281) 2024-03-06 18:46:56 +00:00
optim Add ASGD capturable API for forloop (#121264) 2024-03-08 00:00:30 +00:00
package Release the GIL in serialization when it is safe to do so (#120818) 2024-03-01 22:37:26 +00:00
profiler [profiler] record nccl version in distributed info (#121044) 2024-03-07 15:56:02 +00:00
quantization
signal Clarifying windows cosine behaviour in the documentation (#119444) 2024-02-09 05:57:44 +00:00
sparse [sparse] semi-structured sparse refactor (#117302) 2024-02-14 01:10:40 +00:00
special
testing Disable GroupRegistry's thread isolation by default (#121457) 2024-03-08 19:31:24 +00:00
utils Clean up mode handling in python dispatcher (#121083) 2024-03-08 00:30:34 +00:00
xpu [DeviceIndex][7/N] Use DeviceIndex in XPU (#120576) 2024-02-29 05:54:23 +00:00
__config__.py
__future__.py Update nn.Module._apply to not gate on should_use_set_data when swap_tensors is set (#120659) 2024-02-28 00:59:34 +00:00
__init__.py Update _constrain_as_size docs (#120728) 2024-02-28 15:03:10 +00:00
_appdirs.py
_classes.py
_compile.py
_custom_ops.py
_deploy.py [Lint] replace [assigment] with [method-assign] for methods (#119706) 2024-02-13 02:06:04 +00:00
_guards.py Introduce EphemeralSource for symbols that should be simplified out (#120948) 2024-03-06 02:30:52 +00:00
_jit_internal.py [jit][perf] Reduce lookupInModule overhead. (#119145) 2024-02-05 18:01:00 +00:00
_linalg_utils.py
_lobpcg.py [Lint] replace [assigment] with [method-assign] for methods (#119706) 2024-02-13 02:06:04 +00:00
_lowrank.py
_meta_registrations.py add int8 packed gemm support on CPU device (#118056) 2024-03-07 08:41:43 +00:00
_namedtensor_internals.py
_ops.py Better error messages for impl_abstract_pystub (#120959) 2024-03-04 15:24:36 +00:00
_python_dispatcher.py
_sources.py
_storage_docs.py
_streambase.py
_tensor.py Add assign argument to torch.Tensor.module_load (#121158) 2024-03-06 01:32:06 +00:00
_tensor_docs.py update the tensor.scatter_ doc (#120169) 2024-02-23 02:51:55 +00:00
_tensor_str.py Add sparse compressed meta tensor support (#120707) 2024-03-01 13:28:47 +00:00
_torch_docs.py Fix the default value of side in torch.searchsorted (#120066) 2024-02-22 19:35:17 +00:00
_utils.py [torch.export] Support is_compiling() flag for non-strict mode (#119602) 2024-02-29 05:52:51 +00:00
_utils_internal.py Enable TORCH_TRACE by default in all Tupperware like environments (#120915) 2024-03-01 04:47:13 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py
abi-check.cpp
CMakeLists.txt [3/4] Intel GPU Runtime Upstreaming for Device (#116850) 2024-02-01 12:31:26 +00:00
custom_class.h
custom_class_detail.h
extension.h
functional.py Fix ouput typos (#120870) 2024-02-29 08:29:14 +00:00
hub.py Add verbose parameter to torch.hub.list (#120717) 2024-03-01 07:39:48 +00:00
library.h
library.py Better error messages for impl_abstract_pystub (#120959) 2024-03-04 15:24:36 +00:00
overrides.py Add assign argument to torch.Tensor.module_load (#121158) 2024-03-06 01:32:06 +00:00
py.typed
quasirandom.py
random.py [2/2] Intel GPU Runtime Upstreaming for Generator (#118613) 2024-02-28 05:28:11 +00:00
README.txt
return_types.py register torch.return_types in torch.fx._pytree (#120027) 2024-02-23 21:52:42 +00:00
script.h
serialization.py Release the GIL in serialization when it is safe to do so (#120818) 2024-03-01 22:37:26 +00:00
storage.py Add hpu device support in storage/resize (#119761) 2024-02-17 01:04:27 +00:00
torch_version.py
types.py
version.py.tpl

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.
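
For illustration, a sketch of the pattern this note describes, using names from the historical TH C API (shown only to make the pattern concrete; these functions no longer exist in current PyTorch):

    // OK: external clients go through the public C API in THTensor.h.
    #include <TH/THTensor.h>

    int64_t dim0_size(THFloatTensor* t) {
      // Public accessor: keeps working if the underlying struct changes.
      return THFloatTensor_size(t, 0);
    }

    // Abstraction violation: including THTensor.hpp to poke at the struct
    // definition directly. The few sites in torch/csrc that do this carry
    // a comment pointing back at this note so they can be found and fixed
    // when the guts of THTensor are refactored.
    // #include <TH/THTensor.hpp>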