pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-14 20:57:59 +00:00

Edward Z. Yang 90bed32b98 Introduce torch.sym_sum (#136429 )	2024-10-08 18:12:57 +00:00

Partially addresses https://github.com/pytorch/pytorch/issues/128150

When you have big sums of values, we end up computing long chains of binary addition in our FX graph representation. Not only is this ugly, it also is quadratic, as the sympy.Add constructor is O(N) in number of arguments. Instead, ensure that we maintain the summation as a single FX node so we can do the entire addition all in one go.

update_hint_regression benchmark, before and after:

```
update_hint_regression,compile_time_instruction_count,2648328980
update_hint_regression,compile_time_instruction_count,2563748678
```

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136429
Approved by: https://github.com/isuruf
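
The quadratic cost described in the commit message above can be illustrated with a toy model. This is only a sketch under stated assumptions: it uses neither torch nor sympy, and `FlatAdd`, `chained_sum`, `single_sum`, and `measure` are hypothetical names invented for illustration. `FlatAdd` stands in for an n-ary sum node whose constructor visits every argument, which is the property the commit message attributes to `sympy.Add`.

```python
from functools import reduce


class FlatAdd:
    """Hypothetical n-ary sum node; its constructor touches every term."""

    work = 0  # total number of terms visited across all constructor calls

    def __init__(self, *args):
        flat = []
        for a in args:
            # Flatten nested sums, as a canonicalizing constructor might.
            if isinstance(a, FlatAdd):
                flat.extend(a.args)
            else:
                flat.append(a)
        FlatAdd.work += len(flat)
        self.args = tuple(flat)


def chained_sum(xs):
    # Long chain of binary additions: every step rebuilds a node
    # over all terms accumulated so far.
    return reduce(lambda acc, x: FlatAdd(acc, x), xs)


def single_sum(xs):
    # One n-ary node built in a single constructor call.
    return FlatAdd(*xs)


def measure(fn, n):
    """Return (constructor work, number of terms) for a sum of n symbols."""
    FlatAdd.work = 0
    node = fn([f"s{i}" for i in range(n)])
    return FlatAdd.work, len(node.args)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, measure(chained_sum, n)[0], measure(single_sum, n)[0])
```

In this model the chained form does work proportional to N squared while the single-node form does work proportional to N, which is the motivation the commit message gives for keeping the summation as one FX node. The real savings depend on sympy and FX internals that this sketch does not model.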
_awaits
_C	Revert "Disallow FakeTensor.data_ptr access in eager mode (#137221 )"	2024-10-07 21:46:13 +00:00
_C_flatbuffer
_custom_op
_decomp	Preserve custom ops via run_decomps (#136882 )	2024-10-01 17:38:00 +00:00
_dispatch
_dynamo	Introduce torch.sym_sum (#136429 )	2024-10-08 18:12:57 +00:00
_export	Add original forward names to schema so that prettify pass works (#136887 )	2024-10-08 04:21:02 +00:00
_functorch	fix silly mapping issue with torch.Size (#137465 )	2024-10-08 16:53:15 +00:00
_higher_order_ops	Revert "[FlexAttention] Support training bias for eager (#136910 )"	2024-10-08 17:29:02 +00:00
_inductor	Introduce torch.sym_sum (#136429 )	2024-10-08 18:12:57 +00:00
_lazy
_library	Proper handling of arguments passed by in kwargs inside zip_schema (#137311 )	2024-10-04 21:50:31 +00:00
_logging	Don't actually import module when checking if its valid (#136548 )	2024-09-25 20:47:32 +00:00
_numpy
_prims	Fix AOT Graph capture not propagating non_blocking copy parameter to … (#136513 )	2024-10-01 00:32:47 +00:00
_prims_common	Fix six broken tests in test_ops.py (#136653 )	2024-09-30 20:32:55 +00:00
_refs	Fix typo in _normalize ref (#137079 )	2024-10-02 19:06:48 +00:00
_strobelight	[Pytorch] Cleanup Strobelight URL and shorten for readability (#136102 )	2024-09-16 18:10:33 +00:00
_subclasses	Revert "Disallow FakeTensor.data_ptr access in eager mode (#137221 )"	2024-10-07 21:46:13 +00:00
_vendor
amp	[MPS] Add support for autocast in MPS (#99272 )	2024-09-05 23:23:17 +00:00
ao	Change to export_for_training in quantize_pt2e tests (#137233 )	2024-10-04 18:33:02 +00:00
autograd	Param fixes in docstring (#136097 )	2024-09-21 18:56:34 +00:00
backends	[sparse][semi-structured] Add float8 dtype support to 24 sparsity (#136397 )	2024-09-27 21:37:34 +00:00
compiler
contrib
cpu	Revise CPU vectorization ISA support API (#135075 )	2024-09-05 12:14:56 +00:00
csrc	[Profiler] Clear Out Dangling AppendOnlyLists (#137450 )	2024-10-08 17:48:59 +00:00
cuda	raw_alloc ignores PYTORCH_NO_CUDA_MEMORY_CACHING (#131114 )	2024-10-04 15:36:29 +00:00
distributed	[FSDP2] Required `mesh_dim_names` for HSDP (#137436 )	2024-10-08 16:31:18 +00:00
distributions	[BE]: Update mypy to 1.11.2 (#133816 )	2024-09-16 19:44:11 +00:00
export	Add original forward names to schema so that prettify pass works (#136887 )	2024-10-08 04:21:02 +00:00
fft
func
futures
fx	Introduce torch.sym_sum (#136429 )	2024-10-08 18:12:57 +00:00
jit
legacy
lib
linalg	docs: clarify alias usage for `x` parameter in vector_norm function (#136921 )	2024-09-30 02:50:06 +00:00
masked	[BE]: Update mypy to 1.11.2 (#133816 )	2024-09-16 19:44:11 +00:00
monitor
mps
mtia	[MTIA] Support torch.cuda.get_device_capability equivalent API on MTIA (#135889 )	2024-09-17 17:42:56 +00:00
multiprocessing	multiprocessing.spawn: allow a grace period when shutdown (#131278 )	2024-10-07 12:37:34 +00:00
nested	Fix to() on non-contiguous NJTs (#137124 )	2024-10-08 15:11:05 +00:00
nn	Revert "[Dynamo] Move flex attention torch function mode to traceable HOP file (#137120 )"	2024-10-08 17:26:19 +00:00
onnx	[ONNX] Insert contiguous node between transpose and view before calling run_decompositions (#137340 )	2024-10-08 16:45:59 +00:00
optim	Add missing input "eps" to adam docs (#135191 )	2024-09-25 20:17:23 +00:00
package	[3.13] fix 3.13 pickle error in torch/package (#136049 )	2024-09-14 14:28:09 +00:00
profiler	[Profiler] Torch Profiler distributed info is not JSON serializable (#135548 )	2024-09-13 02:22:33 +00:00
quantization
signal
sparse	[sparse][semi-structured] Add float8 dtype support to 24 sparsity (#136397 )	2024-09-27 21:37:34 +00:00
special
testing	Fix to() on non-contiguous NJTs (#137124 )	2024-10-08 15:11:05 +00:00
utils	Introduce torch.sym_sum (#136429 )	2024-10-08 18:12:57 +00:00
xpu	Use torch.Stream&torch.Event for Dynamo capature (#134850 )	2024-10-02 14:15:33 +00:00
__config__.py
__future__.py
__init__.py	Introduce torch.sym_sum (#136429 )	2024-10-08 18:12:57 +00:00
_appdirs.py
_classes.py
_compile.py
_custom_ops.py
_deploy.py
_environment.py	Improve is_fbcode functionality (#136871 )	2024-09-27 21:19:01 +00:00
_guards.py	Turn on type-checking in torch.fx.experimental.symbolic_shapes (#136972 )	2024-10-01 13:22:10 +00:00
_jit_internal.py
_linalg_utils.py
_lobpcg.py
_lowrank.py
_meta_registrations.py	Introduce torch.sym_sum (#136429 )	2024-10-08 18:12:57 +00:00
_namedtensor_internals.py
_ops.py	Add type annotations for higher order ops/flex_attention (#137065 )	2024-10-02 04:39:25 +00:00
_python_dispatcher.py
_size_docs.py
_sources.py
_storage_docs.py
_streambase.py	Use torch.Stream&torch.Event for Dynamo capature (#134850 )	2024-10-02 14:15:33 +00:00
_tensor.py	Fix wrapper subclass serialization with custom sizes / strides (#137030 )	2024-10-02 18:55:03 +00:00
_tensor_docs.py	Revert "Add deterministic path for CUDA `cumsum` (#136224 )"	2024-09-27 12:54:47 +00:00
_tensor_str.py
_thread_safe_fork.py	[inductor] parallel compile: add import of thread_safe_fork for internal (#137155 )	2024-10-03 17:37:21 +00:00
_torch_docs.py	[Doc] Clarify that NaNs are not equal to each other (#137386 )	2024-10-05 06:19:12 +00:00
_utils.py	Add torch.serialization.skip_data context manager (#134504 )	2024-09-05 16:53:39 +00:00
_utils_internal.py	Log compile ids to pt2_remote_cache and pt2_compile_events (#137431 )	2024-10-08 18:04:48 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py
abi-check.cpp
CMakeLists.txt
custom_class.h
custom_class_detail.h
extension.h
functional.py	Revert "Add deterministic path for CUDA `cumsum` (#136224 )"	2024-09-27 12:54:47 +00:00
hub.py	torch.hub: add get_dir/set_dir type hints (#134906 )	2024-09-12 03:53:29 +00:00
library.h
library.py	noop on torch.library APIs under torch::deploy (multipy) (#136645 )	2024-09-26 02:34:34 +00:00
overrides.py	Introduce torch.sym_sum (#136429 )	2024-10-08 18:12:57 +00:00
py.typed
quasirandom.py
random.py
README.txt
return_types.py
script.h
serialization.py	[3.13] fix 3.13 pickle error in serialization.py (#136034 )	2024-09-14 00:02:40 +00:00
storage.py	Fix serialization for torch.uint16, torch.uint32, torch.uint64 (#137184 )	2024-10-03 14:56:11 +00:00
torch_version.py
types.py
version.py.tpl

README.txt

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.