Commit graph

82599 commits

Author SHA1 Message Date
Jerry Zhang
ace645a017 Add support for prototype affine quantization in pt2e flow (#141421)
Summary:
duplicated affine quantization functionality including
observer (https://github.com/pytorch/ao/blob/main/torchao/quantization/observer.py)
and some quant_primitive ops (7c3c51fd0d/torchao/quantization/quant_primitives.py (L26-L30))
to allow for per group quantization min max observer in pt2e flow

Next: We can follow up to add moving average min max observer

Test Plan:
python test/test_quantization.py -k test_channel_group_quantization

Reviewers:

Subscribers:

Tasks:

Tags:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141421
Approved by: https://github.com/cccclai
2024-12-24 04:22:18 +00:00
Jason Ansel
60a0d53c13 [dynamo] Add test for #143697 (#143764)
The issue from #143697 seems to already be fixed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143764
Approved by: https://github.com/Skylion007
2024-12-24 03:50:15 +00:00
zeshengzong
01d60bcf32 [Easy] Fix todo by enable tests for cuda (#143637)
Fix TODO in `test_tensor_creation_ops.py` file:

```python
# TODO: update to work on CUDA, too
```

**Test Result**

```bash
$ pytest test/test_tensor_creation_ops.py
```

![image](https://github.com/user-attachments/assets/ef829541-668e-446d-a9ab-b26b9d73085f)

```bash
$ lintrunner
```
![image](https://github.com/user-attachments/assets/d6a46eee-1f60-48e6-898a-a8d9620eb54a)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143637
Approved by: https://github.com/albanD
2024-12-24 03:47:43 +00:00
Eddie Yan
b90a3b7281 [cumsum][CUDA][64-bit indexing] Add 64-bit indexing path for cumsum (#143696)
For #143486

Interestingly enough changing the indexing type seems to degrade performance when a larger width is not needed, even on small sizes, so making this a template param rather than forcing all cases to 64-bit

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143696
Approved by: https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
2024-12-24 03:45:28 +00:00
Jason Ansel
dec4286b2d [inductor] Fix for extract_target with dots (#143766)
Fixes #143650

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143766
Approved by: https://github.com/yanboliang
2024-12-24 03:42:15 +00:00
cyy
1feae27ed6 [16/N] Fix extra warnings brought by clang-tidy-17 (#143714)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143714
Approved by: https://github.com/Skylion007, https://github.com/albanD
2024-12-24 03:29:38 +00:00
PyTorch MergeBot
49fdc52fd2 Revert "Add a warning when a tensor with requires_grad=True is converted to a scalar (#143261)"
This reverts commit bc78b6ea4f.

Reverted https://github.com/pytorch/pytorch/pull/143261 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is failing lint, plz help fix and reland this ([comment](https://github.com/pytorch/pytorch/pull/143261#issuecomment-2560583332))
2024-12-24 03:15:38 +00:00
cyy
d6a066ead6 Simplify host_softmax (#143251)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143251
Approved by: https://github.com/albanD
2024-12-24 02:27:51 +00:00
Nikita Shulga
da21fabf34 [BE] Only print MKL version on x86 platforms (#143763)
As it will obviously be missing on ARM/S390, etc

Test plan: run `python3 -c "import torch;print(torch.__config__.parallel_info())"` on both x86 and non-x86 system
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143763
Approved by: https://github.com/Skylion007, https://github.com/albanD
2024-12-24 02:04:26 +00:00
Animesh Jain
7d1c666139 [dynamo] Remove dead code after introducing UserDefinedDictVariable (#143699)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143699
Approved by: https://github.com/williamwen42, https://github.com/yanboliang, https://github.com/jansel
ghstack dependencies: #143722
2024-12-24 02:00:18 +00:00
Animesh Jain
fe95cbe018 [dynamo] Remove DICT_SUBCLASS_GUARD_MANAGER and use dict.keys (#143722)
In hinsight, we never needed a DICT_SUBCLASS_GUARD_MANAGER, because Dynamo would inline through the overridden keys method. In this PR, we ensure that while creating guards and constructing variable trackers, we get the `d.keys()` value by using `dict.keys(d)`. This ensures that we do not call overridden keys method. Therefore, the C++ guard can use `PyDict_Next` directly to check the guards.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143722
Approved by: https://github.com/jansel
2024-12-24 02:00:18 +00:00
zeshengzong
67355a1289 [Easy] Add torch.range, torch.arange params optional description (#143731)
Fixes #129333

**Test Result**

**Before**

![image](https://github.com/user-attachments/assets/c5873690-7de7-4a14-9423-a150d17d137e)

![image](https://github.com/user-attachments/assets/ff4ee545-f27a-403b-bf92-51f9571022a3)

**After**

![image](https://github.com/user-attachments/assets/34e2c41f-8b54-417d-bb10-7ca6f679206a)

![image](https://github.com/user-attachments/assets/b54bcebd-70e9-4a1a-8a22-1ab815e17827)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143731
Approved by: https://github.com/janeyx99
2024-12-24 01:29:24 +00:00
Jithun Nair
0ca6a47872 Update tag_regex in filter_test_configs.py for workflows such as inductor-rocm (#143768)
This helps to make `continue-through-error`/`keep-going` work as expected on `inductor-rocm` workflow jobs.

Without this, the code here doesn't enter the `if` condition: 6ccb8ed186/.github/scripts/filter_test_configs.py (L577)

Tested via [this PR](https://github.com/pytorch/pytorch/pull/140989):
Without this change: https://hud.pytorch.org/pytorch/pytorch/pull/140989?sha=8232e18957f987d99c946efc0cf6da9be9b52067: https://github.com/pytorch/pytorch/actions/runs/12164558045/job/34192442187#step:13:144

With this change: https://hud.pytorch.org/pytorch/pytorch/pull/140989?sha=763179c5e421791ee05c8e2a600379b29a1c8c33: https://github.com/pytorch/pytorch/actions/runs/12261943684/job/34213300153#step:13:145

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143768
Approved by: https://github.com/huydhn
2024-12-24 00:50:14 +00:00
Joshua Hamilton
bc78b6ea4f Add a warning when a tensor with requires_grad=True is converted to a scalar (#143261)
Fixes #143071

Operations performed on tensors with `requires_grad=True` such as
```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 3
```
and
```python
x = torch.tensor(2.0, requires_grad=True)
y = torch.pow(x,3)
```
are valid operations.

While an operation using `numpy` like
```python
import numpy as np

x = torch.tensor(2.0, requires_grad=True)
y = np.pow(x,3)
# > RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead.
```
leads to an error.

However, an operation that uses `math` like
```python
import math

x = torch.tensor(2.0, requires_grad=True)
y = math.pow(x,3)
```
does not cause an error, and `y` is no longer a tensor with a gradient!

This represents a [footgun](https://en.wiktionary.org/wiki/footgun#Noun) for some users, like myself when training small, custom, non-neural network models.

To prevent future undesired behavior, I added a warning when converting tensors with `requires_grad=True` to scalars. Now, when using `math.pow` on a `tensor`, we get a single warning with:
```python
x = torch.tensor(2.0, requires_grad=True)
y = math.pow(x,3)
# > UserWarning: Converting a tensor with requires_grad=True to a scalar may lead to unexpected behavior.
# Consider using tensor.detach() first.
```

Please let me know if you have any questions 👍
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143261
Approved by: https://github.com/albanD
2024-12-24 00:22:18 +00:00
emmettbicker
6ccb8ed186 Refactor AdamW into Adam (heavily inspired by tfsingh) (#143710)
Fixes #104899

Refactors AdamW into Adam by making AdamW a subclass of Adam. Additionally adds a test to assert that the added parameter `decoupled_weight_decay` is True in AdamW and also updates test_defaults_changed_to_foreach to account for the differences in module location for AdamW.

Heavily heavily inspired by #118857 by @tfsingh

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143710
Approved by: https://github.com/janeyx99
2024-12-23 23:27:28 +00:00
Sam Larsen
4271a95590 [logging] A few fixes/updates to record_compilation_metrics (#143332)
Summary: Mostly cosmetic, but one bug fix:
* Bug fix: Make sure compile_id is converted to a string in the compilation metrics so it's printed as, e.g., "0/1" instead of "[0, 1]"
* Sort collections in `collection_to_str`
* Print non-string elements as `"<unknown>"` instead of None (since we don't expect non-strings)
* Move the population of the legacy metrics and any pre-processing to a new factory method in CompilationMetrics

Test Plan:
```
python test/dynamo/test_structured_trace.py
python test/dynamo/test_utils.py
```
Internal testing: https://fburl.com/scuba/dynamo_compile/sandbox/l0me8auf

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143332
Approved by: https://github.com/ppanchalia
2024-12-23 23:10:11 +00:00
Natalia Gimelshein
2ab698e708 allow profiling on all threads via experimentalConfig (#143659)
In some situations we want to profile calls coming from all threads (similar to on-demand), not just the thread that started profiling and the spawned threads that would inherit KinetoThreadLocal state.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143659
Approved by: https://github.com/sraikund16
2024-12-23 20:41:27 +00:00
Aaron Gokaslan
00831f9b22 [BE]: Properly forward raise pickle exception with from (#143761)
Properly raises the pickle exception with from. Provides a more informative stack trace and forwards information about the exception that led to the current exception.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143761
Approved by: https://github.com/XuehaiPan, https://github.com/albanD
2024-12-23 20:21:30 +00:00
Jithun Nair
75e1f8a227 [ROCm] upgrade nightly wheels to rocm6.3 - 2 of 2 (binaries) (#143613)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143613
Approved by: https://github.com/jeffdaily
2024-12-23 19:47:30 +00:00
PyTorch MergeBot
0ebc6388cf Revert "Exclude py 31.3t triton package from PyTorch 3.13t wheel (#143218)"
This reverts commit 3bfdf6f063.

Reverted https://github.com/pytorch/pytorch/pull/143218 on behalf of https://github.com/atalman due to this constrain is ignored see https://github.com/pytorch/pytorch/issues/143654 ([comment](https://github.com/pytorch/pytorch/pull/143218#issuecomment-2560208992))
2024-12-23 19:37:35 +00:00
Sergii Dymchenko
727ee853b4 Apply TorchFix TOR203 fixes (#143691)
Codemodded via `torchfix . --select=TOR203 --fix`.
This is a step to unblock https://github.com/pytorch/pytorch/pull/141076
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143691
Approved by: https://github.com/malfet
2024-12-23 18:21:03 +00:00
Sergii Dymchenko
c042c8a475 Use default_collate from public API (#143616)
Codemodded via `torchfix . --select=TOR104 --fix`.
This is a step to unblock https://github.com/pytorch/pytorch/pull/141076
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143616
Approved by: https://github.com/malfet
2024-12-23 17:38:43 +00:00
zeshengzong
a70191da41 Add torch.topk indices vary description (#143736)
Fixes #133542

**Test Result**

**Before**

![image](https://github.com/user-attachments/assets/65227efb-02af-45e7-804c-35588dff360d)

**After**

![image](https://github.com/user-attachments/assets/91f1f53f-008c-4784-82fe-013404e273cb)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143736
Approved by: https://github.com/zou3519
2024-12-23 17:16:31 +00:00
PyTorch MergeBot
1519a9e30b Revert "Add FP8 support for eye (#139974)"
This reverts commit 01890526b9.

Reverted https://github.com/pytorch/pytorch/pull/139974 on behalf of https://github.com/huydhn due to Sorry for reverting your change but this seems to fail some slow tests ([comment](https://github.com/pytorch/pytorch/pull/139974#issuecomment-2560046399))
2024-12-23 17:12:39 +00:00
Nikita Shulga
12662901aa [BE] Move Mac BB test to its own step (#143513)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143513
Approved by: https://github.com/huydhn, https://github.com/atalman, https://github.com/kit1980, https://github.com/seemethere
ghstack dependencies: #143395, #143511, #143512
2024-12-23 14:05:10 +00:00
Xuehai Pan
5c4545f857 [BE][Easy] enable PYFMT for torch/[a-s]*/ (#138447)
Reproduce command:

```bash
ghstack checkout https://github.com/pytorch/pytorch/pull/138447
git checkout HEAD~1 torch/
lintrunner -a --take "PYFMT" --all-files
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138447
Approved by: https://github.com/ezyang
2024-12-23 14:04:00 +00:00
Dmitry Rogozhkin
7314cf44ae torch/accelerator: fix device type comparison (#143541)
This was failing without the fix:
```
python -c 'import torch; d=torch.device("xpu:0"); torch.accelerator.current_stream(d)'
```
with:
```
ValueError: xpu doesn't match the current accelerator xpu.
```

CC: @guangyey, @EikanWang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143541
Approved by: https://github.com/guangyey, https://github.com/albanD
2024-12-23 10:54:53 +00:00
Kai Londenberg
434e0c2104 Inductor Cutlass backend: Eliminate unused code. (#143723)
Summary: Eliminates an unused file and some smaller unused code fragments from the inductor cutlass codebase.

Test Plan: CI

Differential Revision: D67579837

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143723
Approved by: https://github.com/ColinPeppler
2024-12-23 09:35:03 +00:00
Jiang, Yanbing
01890526b9 Add FP8 support for eye (#139974)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139974
Approved by: https://github.com/jgong5, https://github.com/malfet
2024-12-23 06:47:49 +00:00
PyTorch MergeBot
448c16ac87 Revert "[reland][AMD] Turn on TF32 for aten::mm (#143549)"
This reverts commit 41cdc7f735.

Reverted https://github.com/pytorch/pytorch/pull/143549 on behalf of https://github.com/malfet due to It breaks ROCM testing, see 06b4b96b34/1 ([comment](https://github.com/pytorch/pytorch/pull/143549#issuecomment-2559016960))
2024-12-23 06:47:36 +00:00
Aaron Orenstein
06b4b96b34 dynamo tracing perf: no re in arg_ref: 33.9 -> 33.7 (#143069)
See #143056 for overall docs.

This PR: Avoid use of python re and move valid varname check in
`GuardBuilder.arg_ref()` into C++

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143069
Approved by: https://github.com/jansel
2024-12-23 05:32:09 +00:00
Yu, Guangye
07fa6e2c8b Fix torch.accelerator api abort when passing invaild device (#143550)
# Motivation
Fix https://github.com/pytorch/pytorch/issues/143543

# Solution
We should raise python exception instead of aborting...

# Additional Context
without this PR:
```python
>>> import torch
>>> torch.accelerator.current_stream(torch.accelerator.device_count())
terminate called after throwing an instance of 'c10::Error'
  what():  device is out of range, device is 2, total number of device is 2.
Exception raised from check_device_index at /home/dvrogozh/git/pytorch/pytorch/c10/xpu/XPUFunctions.h:36 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0xac (0x7f30707eb95c in /home/dvrogozh/git/pytorch/pytorch/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xf3 (0x7f307078fc57 in /home/dvrogozh/git/pytorch/pytorch/torch/lib/libc10.so)
frame #2: <unknown function> + 0x19a3e (0x7f3070c2ba3e in /home/dvrogozh/git/pytorch/pytorch/torch/lib/libc10_xpu.so)
frame #3: c10::xpu::getCurrentXPUStream(signed char) + 0x2f (0x7f3070c2c83f in /home/dvrogozh/git/pytorch/pytorch/torch/lib/libc10_xpu.so)
frame #4: <unknown function> + 0x1ca35 (0x7f3070c2ea35 in /home/dvrogozh/git/pytorch/pytorch/torch/lib/libc10_xpu.so)
frame #5: <unknown function> + 0x653f15 (0x7f3083391f15 in /home/dvrogozh/git/pytorch/pytorch/torch/lib/libtorch_python.so)
frame #6: <unknown function> + 0x39e5f2 (0x7f30830dc5f2 in /home/dvrogozh/git/pytorch/pytorch/torch/lib/libtorch_python.so)
<omitting python frames>
frame #20: <unknown function> + 0x29d90 (0x7f308b19bd90 in /lib/x86_64-linux-gnu/libc.so.6)
frame #21: __libc_start_main + 0x80 (0x7f308b19be40 in /lib/x86_64-linux-gnu/libc.so.6)

Aborted (core dumped)
```
with this PR:
```python
>>> import torch
>>> torch.accelerator.current_stream(torch.accelerator.device_count())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pt-gpu/4T-4652/guangyey/stock-pytorch/torch/accelerator/__init__.py", line 123, in current_stream
    return torch._C._accelerator_getStream(device_index)
RuntimeError: The device index is out of range. It must be in [0, 2), but got 2.
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143550
Approved by: https://github.com/EikanWang, https://github.com/dvrogozh, https://github.com/albanD
2024-12-23 03:44:22 +00:00
Jason Ansel
eebc93d41e Better fix for f-strings in set_linter for py3.12 (#143725)
#143628 didn't handle a few cases right for example:
```py
$ python3 tools/linter/adapters/set_linter.py torch/_inductor/scheduler.py
torch/_inductor/scheduler.py:261:24: Builtin `set` is deprecated
  259 |                 multiline=False,
  260 |             )
  261 |         return f"{self}{data_str}"
                               ^
  262 |
  263 |     def log_details(self) -> None:

torch/_inductor/scheduler.py:261:33: Builtin `set` is deprecated
  259 |                 multiline=False,
  260 |             )
  261 |         return f"{self}{data_str}"
                                        ^
  262 |
  263 |     def log_details(self) -> None:
```
also multi-line fstrings
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143725
Approved by: https://github.com/yanboliang
2024-12-22 22:51:27 +00:00
Xiaodong Wang
41cdc7f735 [reland][AMD] Turn on TF32 for aten::mm (#143549)
Summary:
hipblaslt supports TF32, so adding the support.

Original PR https://github.com/pytorch/pytorch/pull/139869

Test Plan: CI

Differential Revision: D67431681

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143549
Approved by: https://github.com/eqy
2024-12-22 21:05:05 +00:00
Nikita Shulga
6425f0779d [BE] Update triton repo link (#143429)
It should be https://github.com/triton-lang/triton rather than https://github.com/openai/triton shouldn't it?

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143429
Approved by: https://github.com/jansel
2024-12-22 18:38:35 +00:00
Nikita Shulga
a316a4581d Add mps to GPU_TYPES (#143634)
Because it is a GPU, but don't require a triton, as it does not need one

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143634
Approved by: https://github.com/jansel
2024-12-22 18:37:35 +00:00
cyy
09c950cc87 Remove unused <ATen/core/Array.h> inclusion (#143701)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143701
Approved by: https://github.com/albanD
2024-12-22 14:30:11 +00:00
Oguz Ulgen
dc55704b48 Rename cache limit to recompile limit in configs (#143709)
This PR renames every cache_limit to recompile_limit via sed.

Old config options are maintained via Config(alias='xyz')

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143709
Approved by: https://github.com/jansel
2024-12-22 10:03:57 +00:00
Aaron Orenstein
9bf4b1c2e9 dynamo tracing perf: c++ strip_function_call: 49.12 -> 47.77 (#143063)
See #143056 for overall docs.

This PR: Convert `strip_function_call()` into C++

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143063
Approved by: https://github.com/jansel
ghstack dependencies: #143057, #143062
2024-12-22 06:38:46 +00:00
Aaron Orenstein
3ec04d30d5 dynamo tracing perf: kill import: 50.36 -> 49.12 (#143062)
See #143056 for overall docs.

This PR: Stop importing in the body of `BuiltinVariable.call_getattr()`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143062
Approved by: https://github.com/jansel
ghstack dependencies: #143057
2024-12-22 06:38:46 +00:00
Aaron Orenstein
f2b744b9ca dynamo tracing perf: import_module: 59.92 -> 52.9 (#143057)
See #143056 for overall docs.

This PR: Using `importlib.import_module()` within the hot path of
symbolic_convert is slow. Memoize it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143057
Approved by: https://github.com/jansel
2024-12-22 06:38:38 +00:00
Tom Ritchford
f1cbf4b1b5 Enable ruff's unused variable checking everywhere in pytorch (#136965)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136965
Approved by: https://github.com/cyyever, https://github.com/albanD
2024-12-22 02:33:11 +00:00
Xuehai Pan
2293fe1024 [BE][Easy] use pathlib.Path instead of dirname / ".." / pardir (#129374)
Changes by apply order:

1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`.
2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`.
3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first.

    `.parent{...}.absolute()` -> `.absolute().parent{...}`

4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.)

    `.parent.parent.parent.parent` -> `.parents[3]`

5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~

    ~`.parents[3]` -> `.parents[4 - 1]`~

6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374
Approved by: https://github.com/justinchuby, https://github.com/malfet
2024-12-21 22:08:01 +00:00
PyTorch MergeBot
197954e14b Revert "Handle meta tensors in FX quantization (#142262)"
This reverts commit e97b97af56.

Reverted https://github.com/pytorch/pytorch/pull/142262 on behalf of https://github.com/janeyx99 due to this PR broke lint  ([comment](https://github.com/pytorch/pytorch/pull/142262#issuecomment-2558233022))
2024-12-21 20:34:09 +00:00
Yanan Cao (PyTorch)
0666347fc4 [Codemod][AddExplicitStrictExportArg] caffe2/benchmarks/dynamo (#143686)
Reviewed By: avikchaudhuri

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143686
Approved by: https://github.com/tugsbayasgalan
2024-12-21 19:56:56 +00:00
Kaustubh Vartak
e97b97af56 Handle meta tensors in FX quantization (#142262)
Summary:
If module being quantized contains a some meta tensors and some tensors with actual device, we should not fail quantization.

Quantization should also not fail if new quantized module is created on a meta device.

Differential Revision: D66895899

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142262
Approved by: https://github.com/iamzainhuda
2024-12-21 13:19:30 +00:00
cyy
daa3ffe0eb Enable more C++ warnings (#143355)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143355
Approved by: https://github.com/albanD
2024-12-21 09:19:02 +00:00
PyTorch MergeBot
e15442a9b2 Revert "export AOTI_TORCH_EXPORT on Windows. (#140030)"
This reverts commit 6733045a4a.

Reverted https://github.com/pytorch/pytorch/pull/140030 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but my first attempt to fix internal build does not fix all the cases, so let us try again ([comment](https://github.com/pytorch/pytorch/pull/140030#issuecomment-2558043056))
2024-12-21 08:06:19 +00:00
Avik Chaudhuri
51eacea8c4 graph module retracing without preserving MCS (#143676)
Retracing while preserving module call signatures used to be a problem because graph modules don't have submodules at given paths. This led to a number of failing retracebility tests. By not trying to wrap modules with export tracepoints we can pass most of these tests; the only exception is where you do module swapping on retraced programs, which is still not possible.

Differential Revision: [D67539304](https://our.internmc.facebook.com/intern/diff/D67539304/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143676
Approved by: https://github.com/zhxchen17, https://github.com/tugsbayasgalan
ghstack dependencies: #143664
2024-12-21 07:57:43 +00:00
cyy
d7e59c2f85 Fix cppcoreguidelines-pro-type-member-init (#141787)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141787
Approved by: https://github.com/albanD
2024-12-21 07:51:30 +00:00