pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-14 20:57:59 +00:00

Nikita Shulga dc9b77cc55 [MPS] Support includes in metal objects (#145087)	2025-01-18 05:35:22 +00:00

Useful for code reuse when building Metal shaders for both eager mode and MPSInductor, but it requires implementing a `_cpp_embed_headers` tool that, as the name suggests, preprocesses and embeds the headers for the shader to be used in dynamic compilation.

Test using:
- `TestMetalLibrary.test_metal_include`
- Moving the `i0`/`i1` implementation to `c10/util/metal_special_math.h` and calling it from the `SpecialOps.metal` shader, which now looks much more compact:

```metal
template <typename T, typename Tout = T>
void kernel i0(constant T* input, device Tout* output, uint index [[thread_position_in_grid]]) {
  output[index] = c10::i0(static_cast<Tout>(input[index]));
}
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145087
Approved by: https://github.com/dcci
ghstack dependencies: #145023
_awaits
_C	Use torch with statement in torch distributed module (#144951 )	2025-01-17 01:49:28 +00:00
_C_flatbuffer
_custom_op	Delete torch._library.register_functional_op (#145110 )	2025-01-18 00:58:25 +00:00
_decomp	Revert "Migrate from Tuple -> tuple in torch/_decomp (#144260 )"	2025-01-10 01:47:29 +00:00
_dispatch
_dynamo	[Trace Python Dispatcher] Support FuncTorchInterpreter (#144444 )	2025-01-17 02:26:37 +00:00
_export	Revert "patch for block-wise quantization + pt2e (#144492 )"	2025-01-17 14:27:53 +00:00
_functorch	Revert "Make functionalization `ViewMeta` serializable with pickle. (#143712 )"	2025-01-17 00:52:50 +00:00
_higher_order_ops	[BE] typing for decorators - library (#138969 )	2025-01-15 17:08:55 +00:00
_inductor	Added swizzle searching, disabled fp16 accum, and enabled ping-pong for cutlass (#144829 )	2025-01-18 02:39:22 +00:00
_lazy	remove allow-untyped-defs from torch/_lazy/config.py (#143603 )	2024-12-20 05:34:19 +00:00
_library	fix typo in doc and import for torch._library.triton (#144882 )	2025-01-17 17:32:12 +00:00
_logging	Implement increment and add_to_set for CompileEventLogger (#143427 )	2025-01-14 02:42:49 +00:00
_numpy	[BE] fix ruff rule E226: add missing whitespace around operator in f-strings (#144415 )	2025-01-08 21:55:00 +00:00
_prims	Fix unbind_copy and add its decomposition (#134319 )	2025-01-17 18:21:22 +00:00
_prims_common	Fix unbind_copy and add its decomposition (#134319 )	2025-01-17 18:21:22 +00:00
_refs	Fix unbind_copy and add its decomposition (#134319 )	2025-01-17 18:21:22 +00:00
_strobelight	Propagate callable parameter types using ParamSpec (#142306 ) (#143797 )	2024-12-29 23:03:14 +00:00
_subclasses	Add generator parameter to rand*_like functions (#136780 )	2025-01-15 21:16:52 +00:00
_vendor
accelerator	torch/accelerator: fix device type comparison (#143541 )	2024-12-23 10:54:53 +00:00
amp
ao	Revert "patch for block-wise quantization + pt2e (#144492 )"	2025-01-17 14:27:53 +00:00
autograd	[5/N] Apply Ruff fixes and pyupgrade to Python 3.9 (#144205 )	2025-01-15 04:00:47 +00:00
backends	Revert "[CUDA][cuBLAS] Add fp16 accumulate option to cuBLAS/cuBLASLt (#144441 )"	2025-01-16 21:12:41 +00:00
compiler	Add AOTAutogradCache support for cache hot loading APIs (#144499 )	2025-01-13 07:07:18 +00:00
contrib	[BE][Easy] enable PYFMT for `torch/[a-s]*/` (#138447 )	2024-12-23 14:04:00 +00:00
cpu
csrc	[pytorch/ncclx] Remove Alltoallv specialization for PTD all_to_all (#145045 )	2025-01-18 05:26:55 +00:00
cuda	Support with statement on torch.Stream (#140138 )	2025-01-10 02:05:19 +00:00
distributed	[Pipelining] Relax scale_grads assert (#145010 )	2025-01-17 21:33:28 +00:00
distributions	ReshapeTransform: added missing argument in docstring (#144401 )	2025-01-13 17:59:59 +00:00
export	[export] Support module inputs for non strict mode. (#143925 )	2025-01-16 17:30:36 +00:00
fft	[BE][Easy] enable PYFMT for `torch/[a-s]*/` (#138447 )	2024-12-23 14:04:00 +00:00
func	[BE][Easy] enable PYFMT for `torch/[a-s]*/` (#138447 )	2024-12-23 14:04:00 +00:00
futures	[BE][Easy] enable PYFMT for `torch/[a-s]*/` (#138447 )	2024-12-23 14:04:00 +00:00
fx	[export] Support module inputs for non strict mode. (#143925 )	2025-01-16 17:30:36 +00:00
jit	Apply Ruff fixes and pyupgrade to torch/jit (#144208 )	2025-01-16 00:28:50 +00:00
legacy
lib
linalg	[BE][Easy] enable PYFMT for `torch/[a-s]*/` (#138447 )	2024-12-23 14:04:00 +00:00
masked	Update torch.masked.mean to upcast dtype for bool tensors (#139999 )	2025-01-08 10:35:19 +00:00
monitor	[BE][Easy] enable PYFMT for `torch/[a-s]*/` (#138447 )	2024-12-23 14:04:00 +00:00
mps	[MPS] Support includes in metal objects (#145087 )	2025-01-18 05:35:22 +00:00
mtia	Revert "[MTIA] (3/n) Implement PyTorch APIs to query/reset device peak memory usage (#143347 )"	2024-12-21 04:04:16 +00:00
multiprocessing	[BE][CI] bump `ruff` to 0.8.4 (#143753 )	2024-12-24 12:24:10 +00:00
nested	Fix NJT min / max backward() for non-ragged reductions (#144583 )	2025-01-17 20:57:11 +00:00
nn	Add strict kwarg to `nn.Module.set_submodule` and fix bug for non dot delineated strings (#143455 )	2025-01-16 05:06:33 +00:00
onnx	[ONNX] Use python_dispatcher in type promotion (#144801 )	2025-01-15 23:25:19 +00:00
optim	Fix loading older state_dict into AdamW after refactor (#144972 )	2025-01-16 19:50:31 +00:00
package	Revert "Use absolute path `path.resolve()` -> `path.absolute()` (#129409 )"	2025-01-04 14:17:20 +00:00
profiler	[Profiler] Fix device setting error of other backends in torch.profiler (#144237 )	2025-01-10 10:41:11 +00:00
quantization
signal	[BE] typing for decorators (#144161 )	2025-01-04 16:40:09 +00:00
sparse
special	[BE][Easy] enable PYFMT for `torch/[a-s]*/` (#138447 )	2024-12-23 14:04:00 +00:00
testing	Make MultiProcContinuousTest timeout configurable (#145099 )	2025-01-18 04:37:12 +00:00
utils	[MPS] Support includes in metal objects (#145087 )	2025-01-18 05:35:22 +00:00
xpu	Refine torch.xpu.get_device_properties API error message (#144379 )	2025-01-10 06:27:51 +00:00
__config__.py
__future__.py
__init__.py	[inductor] Fix ignored options for torch.compile (#145131 )	2025-01-18 03:39:49 +00:00
_appdirs.py
_classes.py
_compile.py	[BE] typing for decorators (#144161 )	2025-01-04 16:40:09 +00:00
_custom_ops.py
_deploy.py
_environment.py
_guards.py	[ca] add compiled autograd to CompileId (#141907 )	2024-12-21 00:41:24 +00:00
_jit_internal.py
_linalg_utils.py
_lobpcg.py
_lowrank.py
_meta_registrations.py	[Break XPU][Inductor UT] Fix broken XPU CI introduced by community changes (#145058 )	2025-01-18 01:30:24 +00:00
_namedtensor_internals.py
_ops.py	Propagate callable parameter types using ParamSpec (#142306 ) (#144047 )	2025-01-06 16:16:18 +00:00
_python_dispatcher.py
_size_docs.py	remove allow-untyped-defs from torch/_size_docs.py (#143942 )	2024-12-29 01:00:46 +00:00
_sources.py
_storage_docs.py
_streambase.py
_tensor.py
_tensor_docs.py	Update pin memory related APIs to not pass 'device' argument (#131858 )	2025-01-15 17:23:35 +00:00
_tensor_str.py
_thread_safe_fork.py
_torch_docs.py	Add generator parameter to rand*_like functions (#136780 )	2025-01-15 21:16:52 +00:00
_utils.py
_utils_internal.py
_VF.py
_vmap_internals.py
_weights_only_unpickler.py
abi-check.cpp
CMakeLists.txt	Revert "export AOTI_TORCH_EXPORT on Windows. (#140030 )"	2025-01-06 18:15:52 +00:00
custom_class.h
custom_class_detail.h	Enable readability-redundant-declaration (#143982 )	2024-12-31 00:20:10 +00:00
extension.h
functional.py
hub.py
library.h	Enable more readability-redundant checks (#143963 )	2024-12-30 14:49:33 +00:00
library.py	[BE] typing for decorators - library (#138969 )	2025-01-15 17:08:55 +00:00
overrides.py	Add generator parameter to rand*_like functions (#136780 )	2025-01-15 21:16:52 +00:00
py.typed
quasirandom.py
random.py
README.txt
return_types.py
script.h
serialization.py	Prevent legacy_load when weights_only=True (correctly) (#145020 )	2025-01-17 20:10:22 +00:00
storage.py	Update pin memory related APIs to not pass 'device' argument (#131858 )	2025-01-15 17:23:35 +00:00
torch_version.py
types.py
version.py.tpl

README.txt

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some .hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty: they are installed with the
library, yet they are *internal implementation detail* headers, whose
contents should largely not be used by external clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.
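
The distinction above can be illustrated with a small, self-contained
sketch.  Every name in it (`DemoTensor`, `demo_tensor_new`,
`demo_tensor_size`, `demo_tensor_free`, `size_via_public_api`,
`size_via_internals`) is hypothetical and invented for this example; none
of them is part of TH/THC.  The sketch only assumes the general pattern the
note describes: a public header exposes functions over an opaque struct,
while an internal header exposes the struct layout itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// "Public" API, loosely analogous to a C-style header such as THTensor.h.
// Clients see only an opaque type and free functions.  (Hypothetical names.)
struct DemoTensor;
DemoTensor* demo_tensor_new(std::int64_t size);
std::int64_t demo_tensor_size(const DemoTensor* t);
void demo_tensor_free(DemoTensor* t);

// "Internal" definition, loosely analogous to what a .hpp header exposes.
// The layout is an implementation detail and may change in a refactor.
struct DemoTensor {
  std::vector<float> data;
};

DemoTensor* demo_tensor_new(std::int64_t size) {
  assert(size >= 0);
  return new DemoTensor{std::vector<float>(static_cast<std::size_t>(size))};
}

std::int64_t demo_tensor_size(const DemoTensor* t) {
  return static_cast<std::int64_t>(t->data.size());
}

void demo_tensor_free(DemoTensor* t) { delete t; }

// Preferred: go through the public function, so this code keeps working
// if the struct layout changes.
std::int64_t size_via_public_api(const DemoTensor* t) {
  return demo_tensor_size(t);
}

// Abstraction violation: reaches into the struct directly.
// A site like this would carry a pointer to Note [TH abstraction violation].
std::int64_t size_via_internals(const DemoTensor* t) {
  return static_cast<std::int64_t>(t->data.size());
}
```

Both functions return the same value today, but only `size_via_internals`
would need to be rewritten if `DemoTensor` stopped storing its elements in a
`std::vector`.  That is the kind of site the note asks to be marked.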