Commit graph

57487 commits

Huy Do
44d8e6c2aa Retry CI Android emulator test (#96163)
This is not the first time I've spotted Android test flakiness such as
893aa5df3f.  From some StackOverflow results, it looks like the failure `Unknown failure: Error: Could not access the Package Manager.  Is the system running?` could be fixed by waiting a bit for the emulator to fully start: https://stackoverflow.com/questions/15524185/could-not-access-the-package-manager-is-the-system-running-while-installing-and

So, I'm adding retry capability here to give the test another chance.
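
A minimal sketch of that waiting idea, assuming `adb` is on the PATH and polling `sys.boot_completed` as the StackOverflow answer suggests; the helper name and timeout are illustrative, not the actual workflow change:

```
import subprocess
import time

def wait_for_emulator_boot(timeout_s=300):
    """Poll the emulator until Android reports boot completion (illustrative helper)."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        result = subprocess.run(
            ["adb", "shell", "getprop", "sys.boot_completed"],
            capture_output=True, text=True,
        )
        if result.stdout.strip() == "1":
            return True
        time.sleep(5)  # give the emulator a bit more time before checking again
    return False
```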
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96163
Approved by: https://github.com/ZainRizvi
2023-03-09 00:09:10 +00:00
BowenBao
df0ff34bcb [ONNX] Bump onnx submodule to release 1.13.1 from rc2 (#96325)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96325
Approved by: https://github.com/justinchuby
2023-03-09 00:00:44 +00:00
Edward Z. Yang
32ffd70644 Rewrite fallthrough to more closely match how C++ works (#96304)
Fallthrough is modeled as a mask which we use to remove keys from the
compute dispatch key set for eligibility.

It's possible this addresses https://github.com/pytorch/pytorch/issues/89037
in a better way than https://github.com/pytorch/pytorch/pull/95891 but I
cannot easily tell as the original repro no longer works and the new PR
does not have a test.
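
A rough Python analogy of the masking idea (the names below are hypothetical; the real logic lives in the C++ dispatcher):

```
# Hypothetical sketch: keys registered as fallthrough form a mask that is
# removed from the computed dispatch key set, so dispatch "falls through"
# to the next eligible key.
def pick_dispatch_key(candidate_keys, fallthrough_mask, priority_order):
    eligible = candidate_keys - fallthrough_mask  # mask out fallthrough keys
    for key in priority_order:
        if key in eligible:
            return key
    raise RuntimeError("no eligible dispatch key")
```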

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96304
Approved by: https://github.com/zou3519, https://github.com/albanD, https://github.com/zhxchen17
2023-03-08 23:00:26 +00:00
Edward Z. Yang
67c329bc9b Refactor to reduce duplicate logic in torch._ops (#96302)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96302
Approved by: https://github.com/zou3519
2023-03-08 23:00:26 +00:00
Will Constable
4662ae5b62 Add missing types to inductor IR assert (#96221)
It is unclear if there is a more efficient way to define the allowed types for IR (or if we even need this; perhaps we should just ditch the assert?). But Inductor experts can determine whether these added ops are appropriate and, if so, whether they fix the reported issue.

Fixes #96204

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96221
Approved by: https://github.com/ezyang
2023-03-08 22:55:43 +00:00
Wei Wang
038e838e7b Make setup linux action be more friendly with gcp linux runners (#96289)
Fixes issues like the following:
https://github.com/pytorch/pytorch/actions/runs/4362155257/jobs/7627059487 has a more serious core-dump failure, but the log of curl failures (a GCP Linux runner trying to get EC2-specific metadata such as the EC2 AMI ID, instance ID, and instance type) confused the HUD.
Screenshot: https://user-images.githubusercontent.com/109318740/223670567-330521ba-050a-41c3-9efb-fae6ea3398c0.png
This PR gets rid of those curl failures.

This may have contributed to the impression of "flaky GCP" in #95416

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96289
Approved by: https://github.com/huydhn, https://github.com/yanboliang
2023-03-08 22:17:36 +00:00
Gao, Xiang
78e04f8272 Update nvfuser_executor.py (#96218)
In https://github.com/csarofeen/pytorch/pull/2517 the return value of `compute_contiguity` was changed from tuple to list. This PR handles that change.
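
A hedged sketch of the kind of normalization such a change implies on the consumer side (the helper is illustrative, not the actual executor code):

```
# Illustrative only: accept either a tuple (old nvfuser) or a list (new nvfuser)
# and normalize it before use, e.g. as a hashable cache key.
def normalize_contiguity(contiguity):
    return tuple(contiguity)  # no-op for tuples, converts lists
```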

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96218
Approved by: https://github.com/jjsjann123, https://github.com/davidberard98
2023-03-08 22:07:58 +00:00
fduwjj
7863efbd76 [BE][8/N] Remove ShardedTensor from TP FSDP integration test and other tests depending on Sharded Linear (#96254)
We removed ShardedLinear in https://github.com/pytorch/pytorch/pull/95948, but that broke the TP_FSDP integration test because the test still uses ShardedTensor. Migrating to DTensor fixes the test. DTensor also shards the bias, so the test needs to change a little bit.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96254
Approved by: https://github.com/huydhn
2023-03-08 21:56:41 +00:00
Aidyn-A
f5c39b7ba2 [inductor] fix typos in test_torchinductor.py (#96233)
Fixes typos in `test_torchinductor.py::test_recompile_on_index_cuda`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96233
Approved by: https://github.com/jansel
2023-03-08 21:24:46 +00:00
BowenBao
0f4652f498 [ONNX] Merge 'initializers' into 'TorchScriptGraph' (#95676)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95676
Approved by: https://github.com/titaiwangms, https://github.com/wschin
2023-03-08 21:12:20 +00:00
Edward Z. Yang
e9e6b3b6c5 [EASY] Add complex dtypes to partitioner (#96297)
Also, delete some redundant dtype setting.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96297
Approved by: https://github.com/Chillee
2023-03-08 21:08:26 +00:00
Catherine Lee
a7fe11dec0 --subprocess for pytest (#96210)
Implements the --subprocess flag for pytest, which previously worked only with unittest.

Pretty much all the tests in the custom handler list use --subprocess.
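
The general idea behind a subprocess mode, sketched with a hypothetical helper (not the actual run_test.py implementation): each test file runs in its own interpreter so a crash or leaked global state in one file cannot take down the whole run.

```
import subprocess
import sys

def run_in_subprocess(test_file, extra_args=()):
    # Hypothetical helper: spawn a fresh interpreter per test file.
    cmd = [sys.executable, "-m", "pytest", test_file, *extra_args]
    return subprocess.run(cmd).returncode
```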
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96210
Approved by: https://github.com/huydhn
2023-03-08 21:04:50 +00:00
Catherine Lee
8921b22297 Set ref for linux_job checkout in lint (#96317)
test-infra's linux_job uses github.ref as the default value for the ref, which is the branch, so it checks out the most recent commit on the branch rather than the commit being linted.
It might be better to fix this on the test-infra side instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96317
Approved by: https://github.com/huydhn
2023-03-08 21:04:30 +00:00
albanD
c8216e558b Add basic Module serialization BC test (#96238)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96238
Approved by: https://github.com/ezyang
2023-03-08 21:01:27 +00:00
Horace He
5bbec680d7 Fix usages of contextmanager without finally (#96170)
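
The class of bug being fixed, as a minimal sketch (example names are illustrative): a generator-based context manager that restores state only after a normal exit skips its cleanup when the body raises, so the restore belongs in a `finally` block.

```
from contextlib import contextmanager

@contextmanager
def set_flag(obj, value):
    old = obj.flag
    obj.flag = value
    try:
        yield
    finally:
        # without try/finally, an exception inside the with-block would leave
        # obj.flag stuck at the temporary value
        obj.flag = old
```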
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96170
Approved by: https://github.com/ngimel, https://github.com/malfet
2023-03-08 20:59:27 +00:00
Hansong Zhang
34d18c8bee Remove unimported expecttest deps and usage (#96314)
expecttest has not been imported into the OSS BUCK build yet. Using it in the test_torchgen_executorch target breaks the build.

Remove it for now to fix the build; it will be imported and fixed in a follow-up PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96314
Approved by: https://github.com/huydhn
2023-03-08 20:54:11 +00:00
Aidyn-A
0f6d6d6124 [TorchScript] Fix torch.cuda._exchange_device (#95306)
Fixes #95305
I am not sure why these one-line changes fix TorchScript, but they work...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95306
Approved by: https://github.com/ngimel
2023-03-08 20:29:05 +00:00
Bin Bao
deaf9e5e65 [reland][inductor] Add an AOT compilation mode for Inductor CPP backend (#95985)
Summary: This is a reland of https://github.com/pytorch/pytorch/pull/94822

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95985
Approved by: https://github.com/jansel
2023-03-08 20:02:32 +00:00
Thiago Crepaldi
b9c25f186c Ignore shape inference exception from Caffe2 ATen fallback (#90408)
Fixes #87318

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90408
Approved by: https://github.com/BowenBao
2023-03-08 20:02:11 +00:00
Edward Z. Yang
c988de1040 [EASY] Update inductor training dynamic skips (#96298)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96298
Approved by: https://github.com/Chillee, https://github.com/janeyx99
2023-03-08 19:31:46 +00:00
Bin Bao
b3a079810e [CI] Add a workflow for quick perf comparison (#96166)
Summary: ciflow/inductor-perf-test-nightly now contains a full dashboard
run, which takes a very long time. Ed proposed a simplification of the
perf run there, but it is still worth having a set of fast perf tests
that includes only one configuration (--training --amp).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96166
Approved by: https://github.com/huydhn, https://github.com/weiwangmeta
2023-03-08 19:09:04 +00:00
Huy Do
4a1b971748 Move MacOS x86_64 build and test jobs to periodic (#96279)
The correlation result can be found at https://github.com/pytorch/test-infra/pull/3852.  This is the first step toward reducing the redundancy of having both x86_64 and Apple Silicon M1 jobs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96279
Approved by: https://github.com/ZainRizvi, https://github.com/seemethere, https://github.com/malfet
2023-03-08 18:52:18 +00:00
PyTorch MergeBot
9137f53ec2 Revert "Error when jit.trace/script is used with torch.compile (#91681)"
This reverts commit fa92b6a7b0.

Reverted https://github.com/pytorch/pytorch/pull/91681 on behalf of https://github.com/izaitsevfb due to Breaks internal tests, see T147501786
2023-03-08 18:47:38 +00:00
Zain Rizvi
7362e22f8b Notify on outdated lintrunner (#96241)
Let users know if they have an outdated version of lintrunner installed on their box

Sets the minimum version to one which uses master as a default mergebase (see https://github.com/pytorch/pytorch/pull/95938)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96241
Approved by: https://github.com/huydhn
2023-03-08 18:41:31 +00:00
Driss Guessous
11aab72dc9 [SDPA] Add an optional scale kwarg (#95259)
# Summary
This PR adds an optional kwarg to torch.nn.functional.scaled_dot_product_attention().
The new kwarg is a scaling factor that is applied after the q@k.T step of the computation. The efficient kernel was updated to support it, and the flash and math backends were minimally updated to support it as well.

This will reduce the complexity of #94729 and has been requested by a couple of users.

# Review Highlights
- As far as I know I did this the correct way, and it is both BC and FC compliant. However, I always seem to break internal workloads, so I would love it if someone could confirm I did this right.
- I named the optional arg 'scale'. This is probably dumb and I should name it 'scale_factor'. I will make that change, but it is annoying and will require someone deciding that we should rename it.
- 'scale' is interpreted as `Q@K.T * (scale)`
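
A usage sketch of the new kwarg; passing `scale=1/sqrt(head_dim)` here reproduces the usual default, which is assumed rather than taken from this commit message:

```
import math
import torch
import torch.nn.functional as F

q = torch.randn(2, 8, 128, 64)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)

# `scale` is the factor applied to Q @ K.T before the softmax.
out = F.scaled_dot_product_attention(q, k, v, scale=1.0 / math.sqrt(q.size(-1)))
```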

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95259
Approved by: https://github.com/cpuhrsch
2023-03-08 18:07:40 +00:00
PyTorch MergeBot
3f840cc627 Revert "Ignore shape inference exception from Caffe2 ATen fallback (#90408)"
This reverts commit 1d4e872370.

Reverted https://github.com/pytorch/pytorch/pull/90408 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, but it breaks lint check https://hud.pytorch.org/pr/90408#11855039599. Please fix the error and reland your change
2023-03-08 17:28:21 +00:00
Nikita Shulga
9c5a24b9df [BE] Delete `pre-cxx-11-abi` MacOS libtorch builds (#96301)
Those ABI flags make sense only for Linux; the libc++ binaries shipped with macOS have only one ABI flavor.

Moreover, those binaries were uploaded to the same location anyway, see
[upload job for pre-cxx-11 abi](https://github.com/pytorch/pytorch/actions/runs/4362299843/jobs/7628815268#step:7:97) and [upload job for cxx-11 abi](https://github.com/pytorch/pytorch/actions/runs/4362299812/jobs/7628879450#step:7:97)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96301
Approved by: https://github.com/atalman
2023-03-08 17:25:19 +00:00
andrewor14
e7dd9b1138 [Quant][test] Add test for mixed dtypes in the same model (#96104)
Summary: This commit adds a test for mixing multiple dtypes
for different layers in the same model. The test verifies that
FX graph mode quantization converts the dtypes correctly
between the layers.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_mixed_dtypes

Reviewers: jcaip, vkuzo, supriyar

Subscribers: jcaip, vkuzo, supriyar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96104
Approved by: https://github.com/jcaip
2023-03-08 17:08:12 +00:00
Thiago Crepaldi
1d4e872370 Ignore shape inference exception from Caffe2 ATen fallback (#90408)
Fixes #87318

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90408
Approved by: https://github.com/BowenBao
2023-03-08 16:57:48 +00:00
Brian Hirsh
98ece75043 [aot autograd] merge all outputs of functionalization analysis into single metadata (#95991)
This makes the next PR in the stack cleaner: the top-level entry point to aot autograd performs the functionalization analysis pass once and plumbs the metadata everywhere else we need it.

I put it in a separate PR because I recently learned that this function is used in fbcode, so I'll need to fix up internals when I land this PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95991
Approved by: https://github.com/ezyang
2023-03-08 16:22:54 +00:00
Brian Hirsh
29b216acd5 aot autograd: handle detach() and no_grad() mutations on input (#95980)
Fixes https://github.com/pytorch/pytorch/issues/95167

More details are in that issue. To summarize, the issue shows up when we have some code like this:

```
def f(x):
    x.detach().mul_(2) # can also happen if the mul_() happens under torch.no_grad()
    return x + 1
```

AOTAutograd will then spit out code like this:
```
def compiled_fn(x):
    x_updated = x.mul(2)
    out = x_updated + 1
    return x_updated, out

def CompiledFunction.forward(x):  # pseudocode, this is part of an autograd.Function
    x_updated, out = compiled_fn(x)
    return x_updated, out

def runtime_wrapper(x):
    x_updated, out = CompiledFunction.apply(x)
    x.copy_(x_updated)

x = torch.ones(2, requires_grad=True)
out = runtime_wrapper(x)
```

However, the call to `x.copy_(x_updated)` will fail with the error: `a leaf Variable that requires grad is being used in an in-place operation`. This is because `x` is an autograd leaf, and autograd doesn't allow you to mutate leaves.

In this case though, the data mutation should be entirely opaque to autograd - all mutations happened underneath a `.detach()` or a `torch.no_grad()`.

As Ed pointed out in the issue, we can detect this situation by checking if the mutated input is an autograd leaf. If it is, then any mutations on it must have been hidden from autograd, since otherwise the eager code would have errored. The solution I added is to detect this situation and manually run `x.detach().copy_(x_updated)` to hide the update from autograd, as sketched below.
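
A simplified sketch of the runtime wrapper under that scheme, reusing the pseudocode names from above:

```
def runtime_wrapper(x):
    x_updated, out = CompiledFunction.apply(x)
    if x.is_leaf and x.requires_grad:
        # the eager mutation happened under .detach()/no_grad(), so hide the
        # write-back from autograd in the same way
        x.detach().copy_(x_updated)
    else:
        x.copy_(x_updated)
    return out
```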

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95980
Approved by: https://github.com/ezyang
2023-03-08 16:11:06 +00:00
Nikita Karetnikov
bb650b34c4 [inductor] do not handle int in placeholder (#96230)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96230
Approved by: https://github.com/ezyang
2023-03-08 13:50:40 +00:00
Brian Hirsh
f96bd52841 aot autograd: dont allow symint outputs to get tangents in the bw graph (#96219)
Previously, if dynamic shapes were turned on and we had a forward graph that returned a symint, we would generate a backward graph that takes in a tangent input for that symint forward output. This causes problems downstream: Inductor will see an input that it expects to be a symint, but it gets a `None` from autograd.

Confirmed that this repro now passes:
```
benchmarks/dynamo/torchbench.py --devices cuda --inductor --dynamic-shapes --unspecialize-int --accuracy --training --only drq
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96219
Approved by: https://github.com/ezyang
2023-03-08 13:02:34 +00:00
PyTorch MergeBot
6bbae86253 Revert "Fix hooks handling for unpickled nnmodule (#96224)"
This reverts commit 8ca264ef36.

Reverted https://github.com/pytorch/pytorch/pull/96224 on behalf of https://github.com/ezyang due to inductor regression
2023-03-08 13:01:16 +00:00
Kwanghoon An
a1d7014c0f Hooking backward for QNNPACK (#94432)
Summary: Enabling quantized gradient.

Test Plan:
Algorithmic correctness - Dequantized matmul vs QNNPACK matmul for gradient - P616202766

```
dequantized matmul : [1.5463, -0.2917, -2.1735, 0.5689, -1.0795]
QNNPACK matmul : tensor([[ 1.5463, -0.2917, -2.1735,  0.5689, -1.0795]])
```

Differential Revision: D42593235

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94432
Approved by: https://github.com/malfet, https://github.com/kimishpatel
2023-03-08 10:21:32 +00:00
Chien-Chin Huang
92edac72aa [FSDP][optim_state_dict] Fix a memory leakage in optim_state_dict (#96263)
Summary: The original code uses a class variable to store the flat_parameter result. This could cause a memory leak.
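
A generic illustration of the failure mode (hypothetical class, not the FSDP code): caching results on the class keeps tensors alive for the lifetime of the process, while caching on the instance frees them with the instance.

```
class Cache:
    _shared = {}          # class variable: outlives every instance

    def __init__(self):
        self._local = {}  # instance variable: freed with the instance

    def remember(self, key, tensor):
        Cache._shared[key] = tensor  # pins `tensor` until the process exits
        self._local[key] = tensor    # pins `tensor` only while this Cache lives
```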

Test Plan: CI and a E2E run

Reviewed By: awgu

Differential Revision: D43893577

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96263
Approved by: https://github.com/zhaojuanmao
2023-03-08 08:43:42 +00:00
Kulin Seth
2bb022e902 [MPS] Adding xfaillist with all categories of failures. (#96176)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96176
Approved by: https://github.com/malfet
2023-03-08 08:41:21 +00:00
Max Podkorytov
b90a9c7db2 [static-runtime] fix one forwarding usage (#96271)
Summary: as titled

Test Plan: ci

Differential Revision: D43897369

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96271
Approved by: https://github.com/davidberard98
2023-03-08 07:38:21 +00:00
PyTorch MergeBot
3ce1e15cf7 Revert "[Dynamo] Support torch.{cuda/cpu}.amp.autocast (#95416)"
This reverts commit c88aa336aa.

Reverted https://github.com/pytorch/pytorch/pull/95416 on behalf of https://github.com/huydhn due to Sorry for reverting your PR. But it seems that the smoke test issue is related as it starts to fail consistently in trunk https://hud.pytorch.org/hud/pytorch/pytorch/master/1?per_page=50&name_filter=inductor_torchbench_smoketest_perf
2023-03-08 06:51:57 +00:00
Nikita Shulga
941ff109d3 dl_open_guard should restore flag even after exception (#96231)
I.e., follow the pattern outlined in https://docs.python.org/3.8/library/contextlib.html#contextlib.contextmanager

Also, return early on non-unix platforms (when `sys.getdlopenflags` is not defined)
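
A sketch of that pattern applied to the dlopen flags (a simplified illustration of those two points, not the exact torch implementation):

```
import contextlib
import os
import sys

@contextlib.contextmanager
def dl_open_guard():
    if not hasattr(sys, "getdlopenflags"):
        # non-unix platforms: nothing to save or restore
        yield
        return
    old_flags = sys.getdlopenflags()
    sys.setdlopenflags(old_flags | os.RTLD_GLOBAL)
    try:
        yield
    finally:
        # restore the flags even if the body raised
        sys.setdlopenflags(old_flags)
```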

Fixes https://github.com/pytorch/pytorch/issues/96159

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96231
Approved by: https://github.com/atalman
2023-03-08 06:01:27 +00:00
Will Constable
8ca264ef36 Fix hooks handling for unpickled nnmodule (#96224)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96224
Approved by: https://github.com/albanD
2023-03-08 05:33:15 +00:00
Jiaxu Zhu
08fb13db65 [Quant] Add lowering for pixel_unshuffle/narrow (#96160)
Summary:
## Summary
torch.nn.functional.pixel_unshuffle and torch.narrow accept both float
and quantized inputs. However, previously we would unnecessarily
dequantize quantized inputs into floats before passing them to
these functions. This commit fixes this by lowering the patterns
[dequant - pixel_unshuffle - quant] and
[dequant - narrow - quant].
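
A small illustration of why the lowering is valid, assuming a per-tensor quantized input and relying on the commit's statement that pixel_unshuffle accepts quantized tensors directly:

```
import torch

x = torch.randn(1, 4, 6, 6)
xq = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)

# The op runs on the quantized tensor itself, so the
# dequant -> pixel_unshuffle -> quant round trip can be dropped.
yq = torch.nn.functional.pixel_unshuffle(xq, downscale_factor=2)
```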

Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_pixel_unshuffle
```

```
python test/test_quantization.py TestQuantizeFxOps.test_narrow
```

Differential Revision: D43858199

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96160
Approved by: https://github.com/andrewor14
2023-03-08 05:25:03 +00:00
Han Qi
9e3f173636 [1/n] Add verifier for EXIR Aten dialect (#94783)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94783
Approved by: https://github.com/zhxchen17
2023-03-08 04:55:54 +00:00
Catherine Lee
3a4275278b Use GH cache for sccache on GH mac runners (#96142)
sccache added GH cache as a storage option, so try to use it for the GH-provided Mac runners.

My experiments with this are varied.  I tried a couple of different releases, and the first run with a cold cache took 1hr (v0.3.3), 1hr (v0.4.0 pre7), and 2hr (v0.3.3).

Afterwards it usually takes 30 minutes, sometimes longer, but no longer than 1hr.

I am using v0.4.0 pre7 because they reduced the amount of configuration/env vars you need to set, and the GH cache keys are managed by sccache.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96142
Approved by: https://github.com/huydhn, https://github.com/malfet
2023-03-08 04:18:54 +00:00
Michael Voznesensky
d7db5b05b4 Context manager to push/pop frame summaries (#96054)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96054
Approved by: https://github.com/avikchaudhuri, https://github.com/ezyang
2023-03-08 04:01:49 +00:00
PyTorch MergeBot
bb8645acda [vision hash update] update the pinned vision hash (#96243)
This PR is auto-generated nightly by [this action](https://github.com/pytorch/pytorch/blob/master/.github/workflows/_update-commit-hash.yml).
Update the pinned vision hash.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96243
Approved by: https://github.com/pytorchbot
2023-03-08 03:57:12 +00:00
Bin Bao
664381b293 [CI] Avoid calling torch.use_deterministic_algorithms for some models tests (#96245)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96245
Approved by: https://github.com/davidberard98
2023-03-08 03:35:32 +00:00
Hansong Zhang
93ff71ec37 [ET] Add RuntimeContext to ET Aten mode (#96084)
Summary:
In ATen mode, we add the RuntimeContext arg, so we have something like
```
TORCH_API inline at::Tensor & gelu_outf(torch::executor::RuntimeContext & context, const at::Tensor & self, c10::string_view approximate, at::Tensor & out) {
    return at::gelu_outf(self, approximate, out);
}
```
and users can use `<namespace like aten>::gelu_outf`; we will automatically dispatch to the registered ATen kernel using `at::gelu_outf` (dispatched by the ATen/Functions.h header).

In optimized kernel tests, we can now automatically switch between the ATen kernel and the optimized kernel.

The implication is that the test must depend on the correctness of codegen; an error in codegen can break the kernel tests.

Test Plan: CI

Differential Revision: D43777848

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96084
Approved by: https://github.com/larryliu0820
2023-03-08 02:51:47 +00:00
Yanbo Liang
c88aa336aa [Dynamo] Support torch.{cuda/cpu}.amp.autocast (#95416)
For Meta internal use cases.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95416
Approved by: https://github.com/jansel
2023-03-08 01:40:27 +00:00
Yanbo Liang
b8f7bd593c [Dynamo] Guard name should be valid Python identifier (#96174)
Fixes #96149

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96174
Approved by: https://github.com/ezyang, https://github.com/jansel
2023-03-08 01:33:29 +00:00