johnnynunez
35f5668f7e
[NVIDIA] RTX50 Blackwell Support codegen ( #145270 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145270
Approved by: https://github.com/ezyang
2025-01-21 21:10:05 +00:00
PyTorch MergeBot
895659cb41
Revert "Fix RMSNorm epsilon value type for BF16 or FP16 ( #142848 )"
This reverts commit 07e23653cd .
Reverted https://github.com/pytorch/pytorch/pull/142848 on behalf of https://github.com/izaitsevfb due to breaking internal tests, see D68355212 ([comment](https://github.com/pytorch/pytorch/pull/142848#issuecomment-2605734067 ))
2025-01-21 21:04:45 +00:00
Aaron Orenstein
bac62341eb
PEP585 update - torch/_inductor ( #145198 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145198
Approved by: https://github.com/bobrenjc93
2025-01-21 21:04:33 +00:00
Aaron Orenstein
2f9d378f7b
PEP585 update - torch/utils ( #145201 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145201
Approved by: https://github.com/bobrenjc93
2025-01-21 21:04:10 +00:00
Edward Z. Yang
693d8c7e94
Output of nonzero is transposed, fix fake tensor ( #144695 )
Needs this companion executorch PR: https://github.com/pytorch/executorch/pull/7657
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144695
Approved by: https://github.com/bobrenjc93 , https://github.com/albanD
2025-01-21 20:50:09 +00:00
Edward Z. Yang
323fb4dad0
Unconditionally exclude upper bound in all size oblivious tests ( #144867 )
I was thinking about https://github.com/pytorch/pytorch/pull/144471 some more and I thought, "Hmm, why not just always exclude the constant upper bound." So here it is.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144867
Approved by: https://github.com/bobrenjc93
2025-01-21 20:44:09 +00:00
Wei Wang
df67ac4c86
[CI][CUDA][Distributed][FSDP] Remove hardcoded world size of 2 ( #145195 )
Remove the hardcoded world size, as these unit tests would fail if run on a single GPU: `skip_if_lt_x_gpu(2)` seems to treat the world size as 2 even on platforms with 1 GPU.
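A minimal sketch of the pattern the fix moves toward (hypothetical test helper; the actual change is in the FSDP unit tests):
```python
import torch

# Hypothetical sketch: size the test's process group from the available
# GPUs rather than hardcoding 2, so single-GPU platforms are skipped
# correctly by skip_if_lt_x_gpu(2).
def world_size() -> int:
    return torch.cuda.device_count()  # was: return 2
```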
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145195
Approved by: https://github.com/Skylion007 , https://github.com/atalman
2025-01-21 20:25:52 +00:00
Jason Ansel
505ade7471
[inductor] Simplify mode options, only apply CompilerBisector changes once ( #145232 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145232
Approved by: https://github.com/yanboliang
2025-01-21 19:25:46 +00:00
RanTao123
85811631d7
[Intel CPU] Fix issue #143489 . ( #145062 )
Fixes https://github.com/pytorch/pytorch/issues/143489 .
Computing `kernel_height * kernel_width` can cause a floating point exception, so we divide by each factor one at a time.
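A standalone sketch of the pattern (hypothetical helper; the real fix lives in the C++ pooling kernel, where the intermediate product can overflow):
```python
def pool_average(acc: int, kernel_height: int, kernel_width: int) -> int:
    # In the C++ kernel, kernel_height * kernel_width can overflow its
    # integer type and trap with a floating point exception on division.
    # Dividing by each factor separately never forms the product; for
    # positive divisors, acc // kh // kw == acc // (kh * kw).
    return acc // kernel_height // kernel_width
```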
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145062
Approved by: https://github.com/soulitzer
2025-01-21 18:38:33 +00:00
Joel Schlosser
128f3627b1
Implement backward for NJT matmul ( #144587 )
Part of my BE project addressing NJT bugs surfaced via OpInfo tests.
This PR implements missing backward support for NJT matmul. Notably, for dense tensors, matmul dispatches to bmm. However, due to historical reasons related to NST, NJT handles matmul directly, and thus can't rely on the CompositeImplicit impl of matmul to get the derivative formula.
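For reference, a sketch of the standard matmul derivative that the NJT backward has to spell out by hand (the formula is textbook-standard; the NJT-specific plumbing is in the PR):
```python
import torch

def matmul_backward(grad_out, a, b):
    # d(A @ B)/dA pulled back through grad: grad @ B^T; d(A @ B)/dB: A^T @ grad
    grad_a = grad_out @ b.transpose(-2, -1)
    grad_b = a.transpose(-2, -1) @ grad_out
    return grad_a, grad_b
```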
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144587
Approved by: https://github.com/soulitzer
ghstack dependencies: #144586
2025-01-21 18:27:50 +00:00
Joel Schlosser
af204135d8
Fix NJT fill.Scalar for contiguous inputs ( #144586 )
Part of my BE project addressing NJT bugs surfaced via OpInfo tests.
This PR implements the missing `fill.Scalar` support, which works fine for contiguous inputs, but there is still some AOTAutograd debugging required to handle non-contiguous transposed NJTs.
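A hedged usage sketch of what now works for contiguous jagged-layout NJTs (illustrative shapes):
```python
import torch

nt = torch.nested.nested_tensor(
    [torch.randn(2, 3), torch.randn(4, 3)], layout=torch.jagged
)
nt.fill_(0.5)  # scalar fill on a contiguous NJT, supported after this PR
```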
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144586
Approved by: https://github.com/soulitzer
2025-01-21 18:22:08 +00:00
Edward Z. Yang
efa88e04e1
Don't overspecialize float when propagating cache guards to ShapeEnv ( #145078 )
Fixes https://github.com/pytorch/pytorch/issues/142507
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145078
Approved by: https://github.com/Skylion007
2025-01-21 18:05:43 +00:00
Edward Z. Yang
b3e90c8c33
Add support for torch function on dtype arguments ( #145085 )
Along the lines of https://github.com/pytorch/pytorch/issues/119194, although it doesn't actually address the FCD case.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145085
Approved by: https://github.com/vmoens , https://github.com/Skylion007
2025-01-21 17:44:47 +00:00
Huy Do
eb553ae3cf
Fix broken gpt_fast micro benchmark after #144315 ( #145235 )
The benchmark is failing with the following error
```
File "/var/lib/jenkins/workspace/benchmarks/gpt_fast/benchmark.py", line 333, in <module>
main(output_file=args.output, only_model=args.only)
File "/var/lib/jenkins/workspace/benchmarks/gpt_fast/benchmark.py", line 308, in main
lst = func(device)
File "/var/lib/jenkins/workspace/benchmarks/gpt_fast/benchmark.py", line 66, in run_mlp_layer_norm_gelu
us_per_iter = benchmarker.benchmark(compiled_mod, (x,)) * 1000
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/torch/_inductor/runtime/benchmarking.py", line 39, in wrapper
return fn(self, *args, **kwargs)
TypeError: benchmark() missing 1 required positional argument: 'fn_kwargs'
```
An example error is https://github.com/pytorch/pytorch/actions/runs/12862761823/job/35858912555
I also assign `oncall: pt2` as the owner of this job going forward.
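A plausible call-site fix, assuming the updated `benchmark()` takes `(fn, fn_args, fn_kwargs)` as the error message suggests (sketch, not the exact patch):
```python
# benchmark.py (sketch): pass an explicit, possibly empty, fn_kwargs dict
us_per_iter = benchmarker.benchmark(compiled_mod, (x,), {}) * 1000
```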
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145235
Approved by: https://github.com/nmacchioni
2025-01-21 17:42:24 +00:00
atalman
2cffbff7da
Add 3.13t Windows and MacOS binary builds ( #141806 )
Related to: https://github.com/pytorch/pytorch/issues/130249
For conda, this uses the approach described here:
https://conda-forge.org/blog/2024/09/26/python-313/
Create a Python 3.13t conda env like so:
```
conda create -n py313 python=3.13 python-freethreading -c conda-forge
```
For the Windows executable installation, we need to pass an additional parameter to enable 3.13t:
```
Include_freethreaded=1
```
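A quick hedged check that the resulting interpreter really is free-threaded (`sys._is_gil_enabled()` is a CPython 3.13 helper):
```python
import sys

print(sys.version)            # should mention "free-threading build"
print(sys._is_gil_enabled())  # False when running with the GIL disabled
```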
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141806
Approved by: https://github.com/albanD
2025-01-21 17:16:19 +00:00
Aaron Orenstein
0afd335174
PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu ( #145175 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145175
Approved by: https://github.com/bobrenjc93
2025-01-21 16:57:27 +00:00
Shunting Zhang
803017f3cb
[inductor] fix MA on poor gpu ( #145133 )
Found this bug when debugging an MA (max-autotune) issue in CI that could not be reproduced on a devgpu.
On GPUs with fewer than 68 SMs (like the NVIDIA L4 used in CI), running torch.compile in max-autotune mode may result in the following confusing error https://gist.github.com/shunting314/370f42f547e3367a3773237942725a86 complaining about layout:
```
torch._inductor.exc.InductorError: LoweringException: AssertionError: convert FlexibleLayout to FixedLayout first
```
The reason is that even if we don't pick a Triton template, Inductor still returns a MultiTemplateBuffer for the tuned addmm. MultiTemplateBuffer.get_reads, called from Reduction.num_splits, may index into a FlexibleLayout, which results in the aforementioned error.
The issue does not appear on devgpu because we freeze the layout of the addmm inputs when rendering Triton templates.
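A hedged repro sketch (not the exact CI test): a tuned addmm feeding a reduction, compiled in max-autotune mode, is the shape of graph that hit the assertion on small GPUs:
```python
import torch

@torch.compile(mode="max-autotune")
def f(x, w, b):
    # addmm becomes a MultiTemplateBuffer during tuning; the downstream
    # reduction's num_splits then reads its (still Flexible) layout
    return torch.addmm(b, x, w).sum(dim=-1)
```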
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145133
Approved by: https://github.com/jansel
2025-01-21 09:31:34 +00:00
Aaron Orenstein
b5655d9821
PEP585 update - .ci android aten ( #145177 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145177
Approved by: https://github.com/Skylion007
2025-01-21 06:31:26 +00:00
Aaron Orenstein
00ffeca1b1
PEP585 update - torch/distributed ( #145164 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145164
Approved by: https://github.com/bobrenjc93
2025-01-21 04:23:29 +00:00
PyTorch MergeBot
c6986ca2e1
Revert "[dcp] Add ZStandard transformer ( #143360 )"
This reverts commit 7b56b039af .
Reverted https://github.com/pytorch/pytorch/pull/143360 on behalf of https://github.com/atalman due to Broke 3.13t builds please test with ciflow/binaries label attached ([comment](https://github.com/pytorch/pytorch/pull/143360#issuecomment-2603433066 ))
2025-01-21 01:10:16 +00:00
PyTorch MergeBot
5fd881a5b6
Revert "PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu ( #145175 )"
This reverts commit 54a00af2c6 .
Reverted https://github.com/pytorch/pytorch/pull/145175 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to break some trunk tests ([comment](https://github.com/pytorch/pytorch/pull/145175#issuecomment-2603418267 ))
2025-01-21 00:49:55 +00:00
Aaron Orenstein
dea7ad3371
PEP585 update - torch/testing ( #145200 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145200
Approved by: https://github.com/bobrenjc93
2025-01-20 22:42:42 +00:00
Aaron Orenstein
805c4b597a
PEP585 update - torch/_higher_order_ops torch/_subclasses torch/backends torch/compiler torch/cuda torch/masked torch/mtia torch/nested ( #145202 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145202
Approved by: https://github.com/bobrenjc93
2025-01-20 22:37:26 +00:00
Aaron Orenstein
54a00af2c6
PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu ( #145175 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145175
Approved by: https://github.com/bobrenjc93
2025-01-20 22:32:59 +00:00
Aaron Orenstein
bd97ce0b45
PEP585 update - torch/ao ( #145199 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145199
Approved by: https://github.com/bobrenjc93
2025-01-20 22:32:35 +00:00
Aaron Gokaslan
cf05f6a134
[BE]: Improve typing for torch/fx/_pytree.py and torch/utils/_pytree.py ( #145173 )
Improve type inference in _pytree.py utility functions
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145173
Approved by: https://github.com/bobrenjc93
2025-01-20 22:18:19 +00:00
Wang, Chuanqi
225a10febe
[CI] Add xpu linux build into pull workflow ( #145084 )
To mitigate the risk of XPU build failures introduced by non-XPU-specific PRs. Refer to #144967 & #143803.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145084
Approved by: https://github.com/huydhn , https://github.com/atalman
2025-01-20 19:31:48 +00:00
Zhengxu Chen
d0100050dd
[aoti] Deduplicate "V.aot_compilation" and "V.graph.aot_mode" flags. [2/n] ( #145091 )
Summary: Follow-up to D68122536, removing the configurable aot_mode for inner_compile.
Test Plan: CI
Reviewed By: desertfire
Differential Revision: D68158512
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145091
Approved by: https://github.com/ydwu4
2025-01-20 19:09:10 +00:00
Aaron Orenstein
0b2a3687b9
PEP585 update - torch/fx ( #145166 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145166
Approved by: https://github.com/bobrenjc93
2025-01-20 18:11:54 +00:00
PyTorch MergeBot
6374332d33
Revert "PEP585 update - torch/distributed ( #145164 )"
This reverts commit 6cb186e279 .
Reverted https://github.com/pytorch/pytorch/pull/145164 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is failing an inductor test ([comment](https://github.com/pytorch/pytorch/pull/145164#issuecomment-2602875679 ))
2025-01-20 16:46:46 +00:00
Dmitry Nikolaev
57b2b64acf
Fix always true scaled_mm test ( #143912 )
It looks like `out_fp8` should use matmul without scales and `out_fp8_s` should use it with scales.
Scales were optional arguments before PR https://github.com/pytorch/pytorch/pull/128683; after that, test_float8_scale started comparing two identical results and lost its meaning.
The reason for making scales required: https://github.com/pytorch/pytorch/pull/128683#issuecomment-2169146402
This PR uses scale=1.0 to compare the result with the scaled matmul.
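A hedged sketch of the corrected comparison (tensor names and shapes are illustrative; `torch._scaled_mm` is a private API whose exact signature may differ across versions):
```python
import torch

# illustrative fp8 operands; _scaled_mm expects mat2 in column-major layout
a_fp8 = torch.randn(16, 32, device="cuda").to(torch.float8_e4m3fn)
b_fp8 = torch.randn(64, 32, device="cuda").to(torch.float8_e4m3fn).t()
one = torch.tensor(1.0, device="cuda")
scale_a = torch.tensor(0.5, device="cuda")
scale_b = torch.tensor(2.0, device="cuda")

# scales of 1.0 reduce the scaled matmul to a plain one (the reference)
out_ref = torch._scaled_mm(a_fp8, b_fp8, scale_a=one, scale_b=one)
# the genuinely scaled result the test now compares against
out_scaled = torch._scaled_mm(a_fp8, b_fp8, scale_a=scale_a, scale_b=scale_b)
```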
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143912
Approved by: https://github.com/drisspg , https://github.com/malfet , https://github.com/pruthvistony
2025-01-20 16:17:46 +00:00
Aleksei Nikiforov
53e2408015
Improve cleanup of cancelled jobs on s390x for tests too ( #144968 )
Follow up to https://github.com/pytorch/pytorch/pull/144149
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144968
Approved by: https://github.com/huydhn
2025-01-20 12:56:07 +00:00
Sun, Jiayi
92b9da1fc2
fix torch.atan for torch.complex datatypes on CPU ( #144749 )
Fix https://github.com/pytorch/pytorch/issues/141487 .
This issue is caused by the lack of special handling for cases where the real or imaginary part is 0/Inf/NaN in the vectorized implementation of `atan`. For correctness, I temporarily fall back the implementation of `atan` to the scalar implementation.
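A hedged illustration of the input class this family of fixes targets (this one plus the sibling `div`, `reciprocal`, and `exp` changes in this series): complex values with 0/Inf/NaN components, where the vectorized kernels diverged from the scalar reference.
```python
import torch

# 0, Inf and NaN components are exactly the cases the vectorized
# complex kernels mishandled relative to the scalar reference
x = torch.tensor(
    [complex(0.0, 0.0), complex(float("inf"), 1.0), complex(float("nan"), 0.0)],
    dtype=torch.complex64,
)
print(torch.atan(x))  # now matches the scalar implementation
```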
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144749
Approved by: https://github.com/mingfeima , https://github.com/Skylion007
2025-01-20 08:45:03 +00:00
Sun, Jiayi
ed669a9db7
fix torch.div for torch.complex datatypes on CPU ( #140375 )
Fix https://github.com/pytorch/pytorch/issues/135428 .
Fix https://github.com/pytorch/pytorch/issues/106845 .
These two issues are caused by the lack of special handling for cases where the real or imaginary part is 0/Inf/NaN in the vectorized implementation of `div`. For correctness, I temporarily fall back the implementation of `div` to the scalar implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140375
Approved by: https://github.com/mingfeima
2025-01-20 08:34:29 +00:00
Sun, Jiayi
c922ccb7c4
fix sigmoid for torch.complex datatypes on CPU ( #140391 )
Fix https://github.com/pytorch/pytorch/issues/135777 .
This issue is caused by the lack of special handling for cases where the real or imaginary part is 0/Inf/NaN in the vectorized implementation of `reciprocal`. For correctness, I temporarily fall back the implementation of `reciprocal` to the scalar implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140391
Approved by: https://github.com/mingfeima , https://github.com/Skylion007
ghstack dependencies: #140358
2025-01-20 08:23:58 +00:00
Sun, Jiayi
507bf65c6a
fix torch.exp for torch.complex datatypes on CPU ( #140358 )
Fix https://github.com/pytorch/pytorch/issues/48010 , https://github.com/pytorch/pytorch/issues/136063 .
These two issues are caused by the lack of special handling for cases where the real or imaginary part is 0/Inf/NaN in the vectorized implementation of `exp`. For correctness, I temporarily fall back the implementation of `exp` to the scalar implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140358
Approved by: https://github.com/mingfeima , https://github.com/Skylion007
2025-01-20 08:03:17 +00:00
ankurneog
972d4a154d
Add facility to run dynamo UTs for non-cuda devices ( #140929 )
This is in line with the changes introduced in https://github.com/pytorch/pytorch/pull/130714; additional files are included to support non-CUDA devices.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140929
Approved by: https://github.com/kwen2501 , https://github.com/EikanWang , https://github.com/guangyey
2025-01-20 05:56:38 +00:00
Aaron Orenstein
2b809e58ad
PEP585 update - torch/onnx ( #145174 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145174
Approved by: https://github.com/justinchuby
2025-01-20 05:48:52 +00:00
Animesh Jain
19584b28fd
[dynamo][dicts] Consolidate dict(..) construction ( #144342 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144342
Approved by: https://github.com/StrongerXi
2025-01-20 04:42:06 +00:00
Nikita Shulga
980c75fe6e
[MPSInductor] Add TrueDiv and Round[Int|Decimal] ( #145160 )
That fixes `test_builtins_round_float_ndigits_neg` and `test_builtins_round`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145160
Approved by: https://github.com/jansel , https://github.com/dcci
2025-01-20 04:29:42 +00:00
Aaron Orenstein
6cb186e279
PEP585 update - torch/distributed ( #145164 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145164
Approved by: https://github.com/bobrenjc93
2025-01-20 00:19:01 +00:00
Aaron Orenstein
b6c5562c1f
PEP585 update - torch/export ( #145165 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145165
Approved by: https://github.com/bobrenjc93
2025-01-19 20:56:55 +00:00
Aaron Orenstein
316808e4e9
PEP585 update - torch/distributed/elastic torch/distributed/checkpoint ( #145163 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145163
Approved by: https://github.com/Skylion007
2025-01-19 20:55:59 +00:00
Aaron Orenstein
c64e657632
PEP585 update - torch/distributed/fsdp ( #145162 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145162
Approved by: https://github.com/bobrenjc93
2025-01-19 20:04:05 +00:00
Nikita Shulga
371a361db9
Enable bfloat16 testing on MacOS14+ ( #145159 )
As Metal-3.1 supports this dtype
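A hedged sanity check on Apple silicon (assumes macOS 14+ so Metal 3.1 is available):
```python
import torch

if torch.backends.mps.is_available():
    x = torch.ones(4, dtype=torch.bfloat16, device="mps")
    print((x * 2).dtype)  # torch.bfloat16
```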
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145159
Approved by: https://github.com/Skylion007 , https://github.com/jansel
ghstack dependencies: #145157
2025-01-19 19:35:31 +00:00
Aaron Orenstein
97d4d3c40a
PEP585 update - torch/_export ( #145138 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145138
Approved by: https://github.com/bobrenjc93
ghstack dependencies: #145154
2025-01-19 18:48:35 +00:00
Aaron Orenstein
cd8d0fa20c
Tweak schema_check to handle annotated builtin types ( #145154 )
As of Python 3.9, annotated lists can be written as `list[T]`, and `List[T]` has been deprecated. However, schema_check was converting `list[T]` to simply `list`. This change teaches it to handle `list[T]` the same as `List[T]`.
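A minimal sketch of the normalization described (hypothetical helper, not the schema_check code itself):
```python
import typing

def normalize(ann):
    # list[int] and typing.List[int] both report (list, (int,))
    origin = typing.get_origin(ann) or ann
    return origin, typing.get_args(ann)

assert normalize(list[int]) == normalize(typing.List[int]) == (list, (int,))
```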
A couple small drive-by changes I noticed as well:
- Path concatenation should use `os.path.join`, not `+`
- Spelling in error message
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145154
Approved by: https://github.com/bobrenjc93
2025-01-19 18:48:35 +00:00
Aaron Orenstein
9e0437a04a
PEP585 update - torch/ao/quantization ( #145140 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145140
Approved by: https://github.com/bobrenjc93
2025-01-19 10:20:00 +00:00
Aaron Orenstein
78bff1e8c1
PEP585 update - torch/_functorch ( #145139 )
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145139
Approved by: https://github.com/bobrenjc93
2025-01-19 07:06:10 +00:00
cassanof
10e4d3aebb
[DCP] Fix fsspec fsync bug on .finish() ( #144753 )
Fixes #144752
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144753
Approved by: https://github.com/Skylion007 , https://github.com/saumishr
2025-01-19 03:21:00 +00:00