pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-14 20:57:59 +00:00

Author	SHA1	Message	Date
clr	f746bb6311	config: Don't spam warnings about reference type configs (#145800 ) Summary: https://github.com/pytorch/pytorch/issues/145755 The is_dynamic check for reference types was subtly broken, causing log spam after it was accessed Added an explicit type for is_default for reference types to make sure this behaviour is correct Pull Request resolved: https://github.com/pytorch/pytorch/pull/145800 Approved by: https://github.com/eellison	2025-01-30 18:57:16 +00:00
clr	6b41f310c2	config: Support str env variables (#145980 ) Summary: This allows us to use environment variables to set string values. We've added tests for the specific functionality implemented here. Note that we already accidentally started setting up configs to use this, so we're just adding the feature. Additionally, we're not fully validating the underlying type when we set the value (and in general, it's more difficult than we would like to do this). Let me know if people feel strongly, and we can add a PR to do this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145980 Approved by: https://github.com/yushangdi, https://github.com/oulgen	2025-01-30 00:13:02 +00:00
Colin Peppler	521588519d	re-use FloorDiv for RShift (#145898 ) I encountered this C++ compilation error. ``` 579 | int64_t var_6 = (static_cast<int64_t>(std::floor((1.0/2.0)*u0)) | static_cast<int64_t>(std::floor((1.0/4.0)*static_cast<int64_t>(std::floor((1.0/2.0)*u0))))) | std::floor((1.0/16.0)*(static_cast<int64_t>(std::floor((1.0/2.0)*u0)) | static_cast<int64_t>(std::floor((1.0/4.0)*static_cast<int64_t>(std::floor((1.0/2.0)*u0)))))); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | int64_t {aka long int} double ``` Then, I figured out where this std::floor came from with the help of Bob's guard provenance tool. It comes from RShift which is used in `triton.next_power_of_2`. --- Before, we used `std::floor` ``` int64_t var_6 = ( static_cast<int64_t>(std::floor((1.0/2.0)*u0)) | static_cast<int64_t>(std::floor((1.0/4.0)*static_cast<int64_t>(std::floor((1.0/2.0)*u0))))) | std::floor((1.0/16.0)*(static_cast<int64_t>(std::floor((1.0/2.0)*u0)) # no cast to int here. | static_cast<int64_t>(std::floor((1.0/4.0)*static_cast<int64_t>(std::floor((1.0/2.0)*u0)))))); ``` Now, we use `c10::div_floor_integer` instead ``` int64_t var_6 = ( (c10::div_floor_integer(static_cast<int64_t>(u0), static_cast<int64_t>(2L))) | (c10::div_floor_integer(static_cast<int64_t>(u0), static_cast<int64_t>(8L)))) | (c10::div_floor_integer(static_cast<int64_t>((c10::div_floor_integer(static_cast<int64_t>(u0), static_cast<int64_t>(2L))) | (c10::div_floor_integer(static_cast<int64_t>(u0), static_cast<int64_t>(8L)))), static_cast<int64_t>(16L))); ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/145898 Approved by: https://github.com/desertfire, https://github.com/bobrenjc93 ghstack dependencies: #145802	2025-01-29 22:50:22 +00:00
Aaron Orenstein	7178b827d7	PEP585: Missed conversions (#145342 ) Differential Revision: [D68785969](https://our.internmc.facebook.com/intern/diff/D68785969) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145342 Approved by: https://github.com/bobrenjc93	2025-01-29 05:24:36 +00:00
Jane Xu	515e55e692	Set -DPy_LIMITED_API flag for py_limited_api=True extensions (#145764 ) This could be BC breaking, because there was a period of time when we use py_limited_api=True but don't enforce the flag, and now that we will start enforcing the flag, people's custom extensions may fail to build. This is strictly still better behavior, as it is sketchy to claim CPython agnosticism without the flag, but calling this out as potential people yelling at us. Ways to mitigate this risk + reasons this may not be too big a deal: - People haven't known about py_limited_api for extensions much due to lack of docs from python so usage is low right now - My current tutorial is in store to make new users of py_limited_api pass this flag, so it'd be a noop for them. Test plan: * Locally i'm confident as I tried rebuilding ao with this change and it reliably failed (cuz importing torch/extension.h is a nono) * Unit test wise, the normal python_agnostic one I added should work Pull Request resolved: https://github.com/pytorch/pytorch/pull/145764 Approved by: https://github.com/ezyang, https://github.com/zou3519, https://github.com/albanD	2025-01-28 20:11:05 +00:00
Aaron Gokaslan	8e46d0f595	[BE]: Update typing of OrderedSet ancestor (#145783 ) Now that we are on python 3.9 minimum version we can properly use Generics in the superclass Pull Request resolved: https://github.com/pytorch/pytorch/pull/145783 Approved by: https://github.com/eellison	2025-01-28 04:43:49 +00:00
PyTorch MergeBot	9010649292	Revert "Add option to serialization config to reduce random reads from get_record_offset when loading with mmap=True (#143880 )" This reverts commit `db3685a35c`. Reverted https://github.com/pytorch/pytorch/pull/143880 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but either this PR or the base PR breaks distributed tests ([comment](https://github.com/pytorch/pytorch/pull/143880#issuecomment-2617743403))	2025-01-28 03:07:17 +00:00
Mikayla Gawarecki	db3685a35c	Add option to serialization config to reduce random reads from get_record_offset when loading with mmap=True (#143880 ) ## Background This PR adds `torch.utils.serialization.config.load.calculate_storage_offsets`. This option relies on the previous PR in this stack, where storage order was changed to non lexicographical. A `.format_version` entry was added to the zipfile and `calculate_storage_offsets` will only work on checkpoints with `.format_version`. When this is turned on, for `torch.load(mmap=True)`, offsets of each storage record (other than the 0th storage) will be calculated instead of relying on `miniz` APIs to determine this. The existing APIs will issue multiple random reads (reading the end of central directory record, then reading the zipfile header for the record) to determine the storage offset where the record starts. This can greatly degrade `torch.load(mmap=True)` performance for non-filesystem cases. `6aaae9d78f/caffe2/serialize/inline_container.cc (L589-L605)` ## Testing strategy The agreed upon testing strategy was as follows: - Add debug code gated by an environment flag `TORCH_SERIALIZATION_DEBUG` that will run this offset calculation logic and verify it against getRecordOffset for each storage (when mmap=False) - This flag is set throughout CI, which means that every time `torch.load` is called, the offset calculation logic is implicitly being tested. Differential Revision: [D67673026](https://our.internmc.facebook.com/intern/diff/D67673026) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143880 Approved by: https://github.com/albanD ghstack dependencies: #143879	2025-01-27 23:57:30 +00:00
Randolf Scholz	835e770bad	Use `typing.IO[bytes]` instead of `io.BytesIO` in annotations (#144994 ) Fixes #144976 Using approach ① `IO[bytes]`, but could also try with a protocol. ## Notes: - moved `torch.serialization.FILE_LIKE` to `torch.types.FileLike` - Use `FileLike` annotation where it makes sense - made sure those functions also support `os.PathLike` - Replaced `isinstance(x, io.BytesIO)` with `isinstance(x, (io.IOBase, IO))` where appropriate. - Replaced `BinaryIO` with `IO[bytes]` (the two ABCs are almost identical, the only difference is that `BinaryIO` allows `bytearray` input to `write`, whereas `IO[bytes]` only `bytes`) - needed to make `torch.serialization._opener` generic to avoid LSP violations. - skipped `torch/onnx/verification` for now (functions use `BytesIO.getvalue` which is not part of the `IO[bytes]` ABC, but it kind of seems that this is redundant, as e.g. `onnx.load` supports `str | PathLike[str] | IO[bytes]` directly... Pull Request resolved: https://github.com/pytorch/pytorch/pull/144994 Approved by: https://github.com/ezyang, https://github.com/Skylion007	2025-01-27 18:08:07 +00:00
H. Vetinari	e6c1e6e20e	simplify torch.utils.cpp_extension.include_paths; use it in cpp_builder (#145480 ) While working on conda-forge integration, I needed to look at the way the include paths are calculated, and noticed an avoidable duplication between `torch/utils/cpp_extension.py` and `torch/_inductor/cpp_builder.py`. The latter already imports the former anyway, so simply reuse the same function. Furthermore, remove long-obsolete include-paths. AFAICT, the `/TH` headers have not existed since pytorch 1.11. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145480 Approved by: https://github.com/ezyang	2025-01-27 07:19:42 +00:00
PyTorch MergeBot	09ae69a364	Revert "Fix type annotation of `Linear.bias` (#142326 )" This reverts commit `81e370fc6b`. Reverted https://github.com/pytorch/pytorch/pull/142326 on behalf of https://github.com/malfet due to This introduced a graph break and regressed inductor tests, see `73622fc5fa/1` ([comment](https://github.com/pytorch/pytorch/pull/142326#issuecomment-2614196349))	2025-01-26 03:41:00 +00:00
Fabian Keller	81e370fc6b	Fix type annotation of `Linear.bias` (#142326 ) Currently the `bias` attribute of `torch.nn.Linear` (and `Bilinear`) is typed incorrectly, because it relies on the implicit `Module.__getattr__` which types it as `Tensor | Module`. This has two issues: - It hides the fact that `bias` is optional, and can be `None`, which in turn can hide actual bugs on user side. - It blurs the type due to having `Module` in the union, which can require unnecessary `isinstance(linear.bias, Tensor)` on user side. This PR types the `bias` attribute explicitly to fix these issues. CC @ezyang @Skylion007 Pull Request resolved: https://github.com/pytorch/pytorch/pull/142326 Approved by: https://github.com/ezyang	2025-01-24 22:43:52 +00:00
Oguz Ulgen	d3989ca636	Add multi env variable support to configs (#145288 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145288 Approved by: https://github.com/c00w	2025-01-24 10:04:24 +00:00
Johnny	732c4998f3	[NVIDIA] Full Family Blackwell Support codegen (#145436 ) More references: https://github.com/NVIDIA/nccl Pull Request resolved: https://github.com/pytorch/pytorch/pull/145436 Approved by: https://github.com/ezyang, https://github.com/drisspg	2025-01-24 04:36:00 +00:00
PyTorch MergeBot	714f64329b	Revert "Add multi env variable support to configs (#145288 )" This reverts commit `a8b7cb6a2d`. Reverted https://github.com/pytorch/pytorch/pull/145288 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is failing lint from a landrace with some recent PEP585 changes ([comment](https://github.com/pytorch/pytorch/pull/145288#issuecomment-2611278428))	2025-01-24 00:20:00 +00:00
Oguz Ulgen	a8b7cb6a2d	Add multi env variable support to configs (#145288 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145288 Approved by: https://github.com/c00w	2025-01-23 23:00:23 +00:00
Irem Yuksel	66bf7da446	Enable sleef for Win Arm64 (#144876 ) Sleef module was disabled for Windows Arm64 on `b021486405` This PR enables it again since the issue is no longer valid. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144876 Approved by: https://github.com/albanD, https://github.com/malfet Co-authored-by: Ozan Aydin <148207261+ozanMSFT@users.noreply.github.com>	2025-01-23 19:22:58 +00:00
Aaron Orenstein	629840e038	Backout PEP585 use of Iterable (#145438 ) Summary: Importing Iterable from collections.abc here causes an internal product to fail MRO discovery causing a collision between Iterable and Generic. This fixes the failure on D68461304 Differential Revision: D68531443 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145438 Approved by: https://github.com/izaitsevfb	2025-01-23 11:45:37 +00:00
Johnny	a57133e3c7	[NVIDIA] Jetson Thor Blackwell Support codegen (#145395 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145395 Approved by: https://github.com/eqy, https://github.com/malfet	2025-01-22 20:13:19 +00:00
Isuru Fernando	4b77ff9784	Fix PythonMod printing for C++ (#143385 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143385 Approved by: https://github.com/leslie-fang-intel, https://github.com/anijain2305	2025-01-22 14:58:35 +00:00
johnnynunez	35f5668f7e	[NVIDIA] RTX50 Blackwell Support codegen (#145270 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145270 Approved by: https://github.com/ezyang	2025-01-21 21:10:05 +00:00
Aaron Orenstein	2f9d378f7b	PEP585 update - torch/utils (#145201 ) See #145101 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145201 Approved by: https://github.com/bobrenjc93	2025-01-21 21:04:10 +00:00
Edward Z. Yang	efa88e04e1	Don't overspecialize float when propagating cache guards to ShapeEnv (#145078 ) Fixes https://github.com/pytorch/pytorch/issues/142507 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/145078 Approved by: https://github.com/Skylion007	2025-01-21 18:05:43 +00:00
Aaron Gokaslan	cf05f6a134	[BE]: Improve typing for torch/fx/_pytree.py and torch/utils/_pytree.py (#145173 ) Improve type inference in _pytree.py utility functions Pull Request resolved: https://github.com/pytorch/pytorch/pull/145173 Approved by: https://github.com/bobrenjc93	2025-01-20 22:18:19 +00:00
Nikita Shulga	dc9b77cc55	[MPS] Support includes in metal objects (#145087 ) Useful for code reuse for Metal shader build both for eager mode and MPSInductor, but it requires one to implement a `_cpp_embed_headers` tool that, as the name suggests, preprocesses and embeds the headers for the shader to be used in dynamic compilation. Test using: - `TestMetalLibrary.test_metal_include` - Moving `i0`/`i1` implementation to `c10/util/metal_special_math.h` and call it from `SpecialOps.metal` shader, which now looks much more compact: ```metal template <typename T, typename Tout = T> void kernel i0(constant T* input, device Tout* output, uint index [[thread_position_in_grid]]) { output[index] = c10::i0(static_cast<Tout>(input[index])); } ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/145087 Approved by: https://github.com/dcci ghstack dependencies: #145023	2025-01-18 05:35:22 +00:00
Luca Wehrstedt	a0d2c09115	Add flop formula for _scaled_mm (#144973 ) This will make it work correctly with the partitioner's AutoAC Pull Request resolved: https://github.com/pytorch/pytorch/pull/144973 Approved by: https://github.com/jeffdaily	2025-01-17 09:38:30 +00:00
Zhengxu Chen	53256edff9	[export] Support module inputs for non strict mode. (#143925 ) Summary: Add experimental support for torch.nn.Module as input types. Before this change, we don't support module inputs but recently we saw some interesting use cases like gpt-fast https://github.com/pytorch-labs/gpt-fast/blob/main/generate.py#L68 where we directly pass in a module input for different variants of the same models. Since we don't really care about non-param or non-buffer states in non strict mode, we don't care about those either and pretend they are like plain constants during tracing. We treat any module input like a nested container of tensor, and each time we will automatically register a pytree handler for these module types to flatten its state dict into a group of tensors. We will just inline any module method call during tracing like we did for `self` module in export_for_training. This will make input modules' behavior very similar to the training module in typical case, except that we don't record the inputs as parameter or buffers but rather just plain user inputs. Test Plan: buck run mode/opt caffe2/test:test_export -- -r test_module_input Differential Revision: D67680827 Pull Request resolved: https://github.com/pytorch/pytorch/pull/143925 Approved by: https://github.com/tugsbayasgalan	2025-01-16 17:30:36 +00:00
PyTorch MergeBot	6559374494	Revert "Add flop formula for _scaled_mm (#144872 )" This reverts commit `f31452268b`. Reverted https://github.com/pytorch/pytorch/pull/144872 on behalf of https://github.com/lw due to Breaks ROCm jobs on main ([comment](https://github.com/pytorch/pytorch/pull/144872#issuecomment-2595994134))	2025-01-16 15:16:18 +00:00
Luca Wehrstedt	f31452268b	Add flop formula for _scaled_mm (#144872 ) This will make it work correctly with the partitioner's AutoAC Pull Request resolved: https://github.com/pytorch/pytorch/pull/144872 Approved by: https://github.com/vkuzo	2025-01-16 13:57:54 +00:00
wizzniu	c07dc64017	Update pin memory related APIs to not pass 'device' argument (#131858 ) Based on https://github.com/pytorch/pytorch/pull/126376, this PR tries to update all PT callers (e.g., `Tensor.is_pinned()`, `Tensor.pin_memory()`) to not pass `device` argument. As for `storage/untyped_storage.is_pinned()/pin_memory()`, we keep the `device` argument but passing `device` is discouraged. And if not given, the default `device` is still 'cuda' for BC. Additionally, based on device-agnostic pin_memory, `pin_memory_device` argument of `torch.utils.data.DataLoader` is discouraged now. For BC, explicitly passing this argument is still effective. If not given, the default `device` will be the current accelerator. Fixes #124908 Relates https://github.com/pytorch/pytorch/pull/126376 Pull Request resolved: https://github.com/pytorch/pytorch/pull/131858 Approved by: https://github.com/albanD Co-authored-by: albanD <desmaison.alban@gmail.com>	2025-01-15 17:23:35 +00:00
PyTorch MergeBot	d21738f24a	Revert "Fix torch.normal ignores default_device (#144070 )" This reverts commit `184549b2d7`. Reverted https://github.com/pytorch/pytorch/pull/144070 on behalf of https://github.com/ezyang due to broken a specific use case ([comment](https://github.com/pytorch/pytorch/pull/144070#issuecomment-2590681953))	2025-01-14 17:41:58 +00:00
Shangdi Yu	5c727d5679	[minifier] Fix config generator for callables (#144518 ) Summary: When config contains callables, the current configs generated cannot be run: ``` torch._dynamo.config.reorderable_logging_functions = {<built-in function print>, <function warning at 0x7f774c595630>, <function log at 0x7f774c595870>, <function error at 0x7f774c595510>, <function info at 0x7f774c595750>, <built-in function warn>, <function exception at 0x7f774c5955a0>, <function debug at 0x7f774c5957e0>, <function critical at 0x7f774c5953f0>} ``` We fix the config to generate the right string, so the config is runnable, like below ``` import logging import warnings torch._dynamo.config.reorderable_logging_functions = { warnings.warn, logging.warn, print } ``` Test Plan: ``` buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:utils -- -r test_codegen_config ``` Differential Revision: D67998703 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144518 Approved by: https://github.com/desertfire	2025-01-14 17:18:13 +00:00
PyTorch MergeBot	0aa34e9591	Revert "Collect packages with importlib in collect_env (#144616 )" This reverts commit `3541d2a2aa`. Reverted https://github.com/pytorch/pytorch/pull/144616 on behalf of https://github.com/malfet due to Somehow this change causes test_bottleneck_cuda to fail ([comment](https://github.com/pytorch/pytorch/pull/144616#issuecomment-2586095595))	2025-01-13 03:11:04 +00:00
Sv. Lockal	3541d2a2aa	Collect packages with importlib in collect_env (#144616 ) If pytorch is installed systemwide (via os package manager) or by alternative package manager like `uv`, pip is not available, causing error in `collect_env`. However it is still possible to collect exactly the same list using `importlib` API, which is always available. Fixes #144615 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144616 Approved by: https://github.com/malfet	2025-01-12 23:21:08 +00:00
Gabriel Ferns	1376116ab1	Config fuzzer (#139736 ) This tool makes it easy to search through config state-space with a minimal reproduction or test. It presents a similar interface to the config bisector by taking a test_function that should either raise an Exception or return False upon failure. It has two entry points: `fuzz_n_tuple`, which tries every combination of n configs, and `bisect`, which randomly flips configs and tries to find the minimal reproduction upon failure. `bisect` is a much more efficient way to search the space, but `fuzz_n_tuple` can give you peace of mind that a new config will compose with every other config. It's been used to find three bugs so far in the inductor config: https://github.com/pytorch/pytorch/issues/140220 https://github.com/pytorch/pytorch/issues/140219 https://github.com/pytorch/pytorch/issues/143524 This PR also adds a bunch of missing types to the inductor config to get them to play nice with the fuzzer, so it can be a good forcing function for adding types to config. Pull Request resolved: https://github.com/pytorch/pytorch/pull/139736 Approved by: https://github.com/eellison	2025-01-12 22:59:02 +00:00
zeshengzong	184549b2d7	Fix torch.normal ignores default_device (#144070 ) Fixes #122886 1. Enable `torch.normal` to work with `DeviceContext` so that it picks up the default device set via `set_default_device`. 2. Add a hint to the `set_default_device` doc suggesting the `torch.Tensor.to` method for moving tensors to the desired device explicitly. Test Result 1. Doc Preview ![image](https://github.com/user-attachments/assets/eb69c334-be2b-4dc5-bdce-567da21e1635) 2. Local Test ```python >>> import torch >>> torch.normal(0.,1., (10,10)).device device(type='cpu') >>> torch.set_default_device('cuda') >>> torch.normal(0.,1., (10,10)).device device(type='cuda', index=0) ``` ```bash pytest test/test_tensor_creation_ops.py ``` ![image](https://github.com/user-attachments/assets/8b466b55-f162-4b83-8b20-71de2c1d0914) ```bash lintrunner ``` ![image](https://github.com/user-attachments/assets/5b269c50-da57-47ed-8500-4edf2c2295e4) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144070 Approved by: https://github.com/ezyang	2025-01-10 08:19:55 +00:00
Eddie Yan	28b1960d49	[CUDA] parse arch-conditional compute-capability when building extensions (#144446 ) don't choke on arch-conditional compute capabilities e.g., `sm_90a`: #144037 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144446 Approved by: https://github.com/Skylion007, https://github.com/ezyang	2025-01-09 22:05:18 +00:00
bobrenjc93	90e81a157a	Migrate from Tuple -> tuple in torch/utils/data (#144255 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144255 Approved by: https://github.com/andrewkho	2025-01-08 04:09:45 +00:00
Shangdi Yu	72e8f34715	[AoTI Minifier] UX Improvement (#143330 ) Summary: - When a user specifies the `TORCHINDUCTOR_MAX_AUTOTUNE=1` env variable, we add `config.max_autotune=True` to the generated minifier_launcher - We should do this to other inductor configs as well in a followup Diff Currently in dynamo and aoti minifier, if a config is overwritten by an env variable, the config will not show up in the config list in the minifier_launcher.py file. As a result, when running the minifier_launcher, they need to re-apply the same env variable. This is: 1) not convenient for the users 2) if they copy-paste the minifier_launcher.py to us without including the env variable, we could be confused and not able to reproduce the error. Underlying implementation change: - Add `env_default` parameter to `codegen_config()`. If set, configs overridden by the env are not considered default. Test Plan: ``` buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:utils -- -r test_codegen_config ``` Differential Revision: D67299312 Pull Request resolved: https://github.com/pytorch/pytorch/pull/143330 Approved by: https://github.com/jansel, https://github.com/eellison	2025-01-07 20:04:19 +00:00
Henry Hu	f013cfee38	[TreeSpec] Support enum in defaultdict (#144235 ) Summary: Followup from D66269157, add support for enum in defaultdict. Test Plan: Added unit test Differential Revision: D67832100 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144235 Approved by: https://github.com/henrylhtsang, https://github.com/houseroad	2025-01-07 00:10:46 +00:00
Isuru Fernando	301b9c8a90	Fix PythonMod printing (#144078 ) Fixes #144075 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144078 Approved by: https://github.com/anijain2305	2025-01-06 22:52:34 +00:00
Xiaodong Wang	3d3a07963f	[reland][attempt2][AMD] Turn on TF32 for aten::mm (#144145 ) Summary: https://github.com/pytorch/pytorch/pull/143549 was reverted due to some internal/oss tooling issue. Relanding. hipblaslt supports TF32, so adding the support. Original PR https://github.com/pytorch/pytorch/pull/139869 Test Plan: CI Differential Revision: D67785496 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144145 Approved by: https://github.com/jianyuh	2025-01-06 00:37:01 +00:00
Michal Gallus	93633d0e80	[ROCm][Windows] Fix export macros (#144098 ) For correct import and export of functions when the dynamic linkage is used for HIP libraries on windows, the appropriate export/import macros need to be put in place. This Pull Request utilizes existing CUDA import/export macros by converting them to corresponding HIP macros during the hipification process. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144098 Approved by: https://github.com/jeffdaily	2025-01-04 17:12:46 +00:00
Aaron Orenstein	45ef3309e3	[BE] typing for decorators (#144161 ) Summary: Untyped decorators strip annotations from the decorated items. - _compile - _inductor/fx_passes/post_grad - _inductor/lowering - _library/custom_ops - _meta_registrations - _ops - _refs/nn/functional - ao/quantization/quantizer/xnnpack_quantizer_utils - distributed/_composable/contract - fx/experimental/graph_gradual_typechecker - fx/experimental/migrate_gradual_types/constraint_generator - optim/optimizer - signal/windows/windows - testing/_internal/common_device_type - torch/_inductor/decomposition - utils/flop_counter Test Plan: unit tests Differential Revision: D62302684 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144161 Approved by: https://github.com/Skylion007, https://github.com/albanD	2025-01-04 16:40:09 +00:00
PyTorch MergeBot	99f2491af9	Revert "Use absolute path `path.resolve()` -> `path.absolute()` (#129409 )" This reverts commit `45411d1fc9`. Reverted https://github.com/pytorch/pytorch/pull/129409 on behalf of https://github.com/jeanschmidt due to Breaking internal CI, @albanD please help get this PR merged ([comment](https://github.com/pytorch/pytorch/pull/129409#issuecomment-2571316444))	2025-01-04 14:17:20 +00:00
Xuehai Pan	45411d1fc9	Use absolute path `path.resolve()` -> `path.absolute()` (#129409 ) Changes: 1. Always explicit `.absolute()`: `Path(__file__)` -> `Path(__file__).absolute()` 2. Replace `path.resolve()` with `path.absolute()` if the code is resolving the PyTorch repo root directory. Pull Request resolved: https://github.com/pytorch/pytorch/pull/129409 Approved by: https://github.com/albanD	2025-01-03 20:03:40 +00:00
Michael Diggin	55dc61dd52	Dataloader distribute tasks to workers when in_order is False (#142324 ) Fixes #105203 and is a follow up PR to #141833 When `in_order` is True (the default), tasks are given out to workers in a round robin fashion. When `in_order` is False this is no longer needed, as we give up guarantees of reproducibility, and instead tasks should be given to workers that are able to perform work. In this PR I've added tracking of the number of outstanding tasks for each worker (updated when tasks are added to their queue, and when data is returned to the main thread). When finding the next queue to add a task to, if `in_order` is False it will only add the task to the worker's queue if it has fewer than `_prefetch_factor` tasks outstanding. The current default behaviour is left as is. Tests are also updated to assert on the worker IDs for each sample of data returned. I've run the following to confirm they aren't flaky ```bash for i in {1..20}; do python test/test_dataloader.py TestOutOfOrderDataLoader; done ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/142324 Approved by: https://github.com/andrewkho	2025-01-03 12:57:04 +00:00
bobrenjc93	0d6db839a7	remove allow-untyped-defs from utils/data/datapipes/iter/streamreader.py (#144088 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144088 Approved by: https://github.com/aorenste	2025-01-03 01:21:44 +00:00
bobrenjc93	bdfb40ed29	remove allow-untyped-defs from utils/_import_utils.py (#144089 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144089 Approved by: https://github.com/aorenste	2025-01-03 01:21:13 +00:00
Kasperi Apell	a7915c56f6	Propagate callable parameter types using ParamSpec (#142306 ) (#143797 ) The codebase has a few locations where callable parameter type information is lost when the unpackings `*args` and `**kwargs` are typed as Any. Refactor these instances to retain type information using typing_extensions.ParamSpec. Also, in these functions, enforce return type with TypeVar. Addresses #142306 Pull Request resolved: https://github.com/pytorch/pytorch/pull/143797 Approved by: https://github.com/Skylion007 Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com> Co-authored-by: Xuehai Pan <XuehaiPan@outlook.com>	2024-12-29 23:03:14 +00:00


2230 commits