Commit graph

47037 commits

Nikita Shulga
40e2aadf47 Create __init__.py (#78629)
This makes `torch.utils.jit` a proper package; otherwise it will not be added to the wheel.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78629
Approved by: https://github.com/seemethere, https://github.com/xuzhao9, https://github.com/davidberard98
2022-06-03 18:14:21 +00:00
goldenxuett
eb49dde9cf Disable TracerWarnings on NNC opinfo tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78756

Approved by: https://github.com/davidberard98
2022-06-03 18:11:12 +00:00
zengk95
c5a0d8dccc Default on green (#78811)
Reopens https://github.com/pytorch/pytorch/pull/78771
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78811
Approved by: https://github.com/janeyx99
2022-06-03 18:04:31 +00:00
Catherine Lee
86c42d63c8 rebase via comment - rebase to any branch (#78772)
PR in test-infra: https://github.com/pytorch/test-infra/pull/366
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78772
Approved by: https://github.com/suo, https://github.com/janeyx99
2022-06-03 17:13:11 +00:00
Rohan Varma
79931889f7 Fix distributed_test.py flakiness (#78797)
There are several recent distributed_test / DDP flaky tests: https://github.com/pytorch/pytorch/issues?q=is%3Aopen+is%3Aissue+label%3A%22oncall%3A+distributed%22+label%3A%22module%3A+flaky-tests%22+

From local experimentation, we see segfaults such as the error in https://github.com/pytorch/pytorch/issues/78684 quite often when running with NCCL. I switched the tests to run with Gloo, and these issues appeared to be gone.

I then switched back to NCCL but turned off async_error_handling (some of the stack traces had ncclCommWatchdog + workCleanupLoop in them, so I thought it might be an issue / race between the two or the like). Turning off async_error_handling also seems to fix the tests. If this indeed works, we should probably land this PR, as we are losing a lot of CI signal, and prioritize understanding why the async error handling / comm watchdog interaction might be causing these segfaults.

Closes https://github.com/pytorch/pytorch/issues/78768 https://github.com/pytorch/pytorch/issues/78767 https://github.com/pytorch/pytorch/issues/78748 https://github.com/pytorch/pytorch/issues/78685 https://github.com/pytorch/pytorch/issues/78684 https://github.com/pytorch/pytorch/issues/78641
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78797
Approved by: https://github.com/wanchaol, https://github.com/fduwjj
2022-06-03 16:02:47 +00:00
Nikita Vedeneev
a4509f5b72 More forward-over-reverse implementations. (#78740)
Umbrella issue: https://github.com/pytorch/pytorch/issues/75432.

This one implements forward-over-reverse for:

* mse_loss
* l1_loss
* smooth_l1_loss
* softplus
* hardswish (also adds double backward support)
* prelu
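For a simple case like `mse_loss`, what forward-over-reverse computes can be worked out by hand: the reverse-mode gradient is `2*(x - t)/n`, and pushing a forward-mode tangent `v` through that gradient yields `2*v/n`. A stdlib-only sketch of this (plain Python lists stand in for tensors; this is an illustration, not the autograd implementation):

```python
# Hand-worked forward-over-reverse for mse_loss (illustration only).
# f(x) = mean((x - t)^2)
# reverse mode:             grad_f(x) = 2 * (x - t) / n
# forward-over-reverse:     JVP of grad_f along v = 2 * v / n

def mse_grad(x, t):
    # reverse-mode gradient of mean squared error
    n = len(x)
    return [2.0 * (xi - ti) / n for xi, ti in zip(x, t)]

def mse_fwd_over_rev(x, t, v):
    # analytic directional derivative of the gradient along tangent v
    n = len(x)
    return [2.0 * vi / n for vi in v]

def finite_diff_check(x, t, v, eps=1e-6):
    # numerical directional derivative of the gradient, for comparison
    g_plus = mse_grad([xi + eps * vi for xi, vi in zip(x, v)], t)
    g_minus = mse_grad([xi - eps * vi for xi, vi in zip(x, v)], t)
    return [(p - m) / (2 * eps) for p, m in zip(g_plus, g_minus)]
```

The analytic forward-over-reverse result matches a central finite-difference estimate of the gradient's directional derivative.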

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78740
Approved by: https://github.com/soulitzer
2022-06-03 15:44:06 +00:00
Michael Suo
298b9ad708 [ci] harden test stats reporting
1. Fix bug where we were uploading stats to the wrong place in s3
2. Hard error when upload_test_stats.py didn't find any stats in s3
3. Change the name of the workflow job to include the corresponding id,
to make debugging easier in the future.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78763

Approved by: https://github.com/janeyx99, https://github.com/seemethere
2022-06-03 15:21:19 +00:00
Nikita Shulga
f7ac389e71 Run MPS tests (#78723)
This adds a workflow that is executed on macOS 12.3+ machines and runs just test_mps.py.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78723
Approved by: https://github.com/albanD, https://github.com/kulinseth
2022-06-03 15:06:08 +00:00
YifanShenSZ
6ba1d05fa4 to_padded_tensor doc v0 (#78657)
Fixes #76846

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78657
Approved by: https://github.com/jbschlosser
2022-06-03 14:27:31 +00:00
Elias Ellison
26d273959c Add Caching of Conversion to Fake/Meta tensors in FakeTensorMode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78090

Approved by: https://github.com/ezyang
2022-06-03 13:56:00 +00:00
Andrew Gu
4615738a3d [FSDP] Allow different optim_input orders across ranks
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78599

Approved by: https://github.com/rohan-varma
2022-06-03 11:47:24 +00:00
Andrew Gu
d4d8aaf7cb [FSDP][Docs] Fix typo in full_optim_state_dict()
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78784

Approved by: https://github.com/rohan-varma
2022-06-03 11:41:21 +00:00
Linbin Yu
1683a2618d rename BUILD.buck to BUCK.oss (#78792)
Rename BUILD.buck to BUCK.oss to better reflect that it is the OSS version of the BUCK build, not the one shared with Bazel.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78792
Approved by: https://github.com/kit1980
2022-06-03 07:23:16 +00:00
Eddie Yan
fc66521ebd [cuDNN] [cuDNN v8 API] Support cuDNN Errata Filter (#73934)
Not originally mentioned in the tracking issue #58414, but this is a nice-to-have feature. In summary, the errata filter allows known problematic kernels, listed in a JSON file supplied at run time, to be skipped instead of irrecoverably crashing the CUDA context (e.g., via an illegal memory access). cuDNN frontend description: https://github.com/NVIDIA/cudnn-frontend#errata-filter

Sample errata filter JSON:
```
{
  "version" : 1,
  "rules" : [
    {
      "rule_id" : "avoid_bad_bwd_data",
      "operation" : "ConvBwdData",
      "engine" : 12,
      "cudnn_version_start" : 8000,
      "cudnn_version_end" : 9000
    }
  ]
}
```
CC @ngimel @zasdfgbnm @ptrblck

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73934
Approved by: https://github.com/ngimel
2022-06-03 06:25:54 +00:00
linjianma
c29df68f95 [FSDP] Return original module when fsdp wrapped model call .module (#78671)
Fixes #78607

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78671
Approved by: https://github.com/awgu, https://github.com/rohan-varma
2022-06-03 04:38:19 +00:00
Shen Li
1884d7fbe9 Avoid CPU Sync in SyncBatchNorm When Capturing CUDA Graphs
We recently updated `SyncBatchNorm` to support empty input batches.
The new code removes stats from ranks with empty inputs. However,
this change breaks CUDA graph capture as it forces CPU sync. This
commit uses `is_current_stream_capturing()` to guard the new code
path, and only runs the new code when not capturing CUDA Graphs. To
support empty inputs with CUDA graph capturing, we might need to
update CUDA kernels for `batch_norm_backward_elemt` and
`batch_norm_gather_stats_with_counts`. See #78656.

Fixes #78549

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78666

Approved by: https://github.com/albanD
2022-06-03 04:32:57 +00:00
Sergii Dymchenko
1eab34d173 Remove non-existing code_template.py glob (#78773)
Test Plan: No-op, rely on CI

Reviewed By: dagitses

Differential Revision: D36848770

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78773
Approved by: https://github.com/linbinyu, https://github.com/seemethere
2022-06-03 03:39:51 +00:00
Sergii Dymchenko
dd45e8d3cd Add linux-focal-py3.7-gcc7 to merge_rules.json (#78785)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78785
Approved by: https://github.com/malfet
2022-06-03 02:31:49 +00:00
PyTorch MergeBot
954522a485 Revert "Autogen Tags enum, and allow specifying tags while defining an op"
This reverts commit 9476a78f37.

Reverted https://github.com/pytorch/pytorch/pull/77313 on behalf of https://github.com/malfet due to Broke OSS buck builds, see 9476a78f37
2022-06-03 01:53:53 +00:00
Justin Chu
c0814bff87 [ONNX] Variable length argument support for quantized_args (#78775)
Add support for decorating functions with variable length arguments in `quantized_args`. This is needed to decorate functions like `symbolic_fn` in `_interpolate_helper` which takes `*args`.

Previously it was not possible to decorate such functions. Now we can do:

```python
@quantized_args(True)
def symbolic_fn(g, input, output_size, *args):
    ...
```

and the rest of the params are defaulted to non-quantized.
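The padding behavior described above can be pictured with a small decorator sketch — a hypothetical stand-in, not the actual `torch.onnx` implementation: a fixed tuple of per-argument quantization flags is extended with `False` to cover however many `*args` arrive.

```python
import functools

def quantized_args_sketch(*arg_q_descriptors):
    # Hypothetical sketch: pad the per-argument quantization flags with
    # False so trailing *args default to non-quantized.
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            descriptors = arg_q_descriptors + (False,) * (
                len(args) - len(arg_q_descriptors))
            # A real implementation would de/re-quantize args here based on
            # the descriptors; we just record them for illustration.
            wrapper.last_descriptors = descriptors
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@quantized_args_sketch(True)
def symbolic_fn(g, *args):
    return (g, args)
```

Calling `symbolic_fn("g", 1, 2, 3)` records descriptors `(True, False, False, False)`: only the first argument is treated as quantized.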
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78775
Approved by: https://github.com/garymm
2022-06-03 01:31:19 +00:00
Horace He
1ea4075bda Ported t decomp to become a ref (#78686)
Also added an error input for `t`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78686
Approved by: https://github.com/mruberry
2022-06-03 01:16:20 +00:00
anjali411
9476a78f37 Autogen Tags enum, and allow specifying tags while defining an op
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77313

Approved by: https://github.com/ezyang, https://github.com/albanD
2022-06-03 01:13:44 +00:00
Jerry Zhang
063c93665c [quant] follow up fixes for prepare_fx/prepare_qat_fx calls in classyvision (#105) (#78660)
Summary:
X-link: https://github.com/fairinternal/ClassyVision/pull/105

As a follow-up to https://github.com/pytorch/pytorch/pull/76496, we fix the TODOs in the quantization tests
by providing correct example_inputs in the tests.

Test Plan:
classyvision sandcastle and ossci

**Static Docs Preview: classyvision**
|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D36818665/V1/classyvision/)|


Differential Revision: D36818665

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78660
Approved by: https://github.com/vkuzo
2022-06-03 01:08:45 +00:00
eqy
ad1bff1bff [TF32] Fix typo in tf32 wrapper function (#78438)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78438
Approved by: https://github.com/ngimel
2022-06-03 01:03:43 +00:00
Eddie Yan
b740a99b9e [cuDNN][TF32] Threshold adjustments for TF32 on >=sm80 (#78437)
CC @ptrblck @mcarilli

Change to transformer multilayer test can potentially be swapped in favor of an rtol change? (see also: #75612).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78437
Approved by: https://github.com/ngimel
2022-06-03 01:02:56 +00:00
John Clow
416f581eb1 Updating torch.log example
Fixes issue #78301

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78776

Approved by: https://github.com/ngimel
2022-06-03 00:57:35 +00:00
PyTorch MergeBot
8047d2a564 Revert "Reenable assert after test update"
This reverts commit b0814b63df.

Reverted https://github.com/pytorch/pytorch/pull/78658 on behalf of https://github.com/malfet due to test_ops crashes with SIGIOT on both PR and trunk CI, see b0814b63df
2022-06-03 00:21:23 +00:00
Sergii Dymchenko
76392c67ed Migrate off Xenial gcc5.4 for trunk jobs (#78734)
Fixes #78732
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78734
Approved by: https://github.com/seemethere
2022-06-02 23:21:16 +00:00
David Berard
4ab62ecfae [JIT] enable autocasting + freezing test
Test was marked as `skip` due to a memory leak. It turns out the memory leak is expected - it can be fixed by clearing the compilation unit (with `torch.jit._state._python_cu.drop_all_functions()` at the end of the test function) or by disabling the leak detector on this test.

Fixes #77618

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78566

Approved by: https://github.com/eellison
2022-06-02 22:38:56 +00:00
swang392
88f4a12402 using new rockset commit_jobs_batch_query to print check statuses (#78750)
Relates to #76700

**Overview:** Made a new Rockset Query Lambda called `commit_jobs_batch_query` that takes in a string composed of multiple SHAs separated with commas. Only one query execution is necessary to get all of the workflow job conclusions for all SHAs, which saves time.
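The batching idea can be sketched in a few lines of stdlib Python — hypothetical helper names and field names, not the actual Rockset client code: join the SHAs into the single comma-separated parameter the query takes, then group the flat result rows back per commit.

```python
def build_shas_param(shas):
    # Hypothetical helper mirroring the batched query's input format:
    # one comma-separated string of SHAs, so a single query execution
    # returns job conclusions for every commit at once.
    return ",".join(sha.strip() for sha in shas)

def group_by_sha(rows):
    # Group the flat result rows (each tagged with its commit SHA) back
    # into per-commit lists; the field names here are illustrative only.
    grouped = {}
    for row in rows:
        grouped.setdefault(row["sha"], []).append(row["conclusion"])
    return grouped
```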

**Example output:**
![Screen Shot 2022-06-02 at 3 19 31 PM](https://user-images.githubusercontent.com/24441980/171720582-260ac000-2db4-4a98-a2eb-32d2f29915a5.png)

**Test Plan**: Compare output with HUD and verify that workflow conclusions are correct

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78750
Approved by: https://github.com/seemethere
2022-06-02 21:42:38 +00:00
Xiao Wang
ef0332e36d Allow relocatable device code linking in pytorch CUDA extensions (#78225)
Close https://github.com/pytorch/pytorch/issues/57543

Doc: check `Relocatable device code linking:` in https://docs-preview.pytorch.org/78225/cpp_extension.html#torch.utils.cpp_extension.CUDAExtension
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78225
Approved by: https://github.com/ezyang, https://github.com/malfet
2022-06-02 21:35:56 +00:00
Edward Z. Yang
9446f9678a repeat_interleaves meta function
Taken from https://github.com/albanD/subclass_zoo/blob/main/python_meta_tensor.py

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78602

Approved by: https://github.com/mruberry
2022-06-02 21:24:46 +00:00
Oliver Sellwood
cc6a51c9f3 added shape checking to WeightedRandomSampler (#78585)
Fixes #78236

An erroneously shaped weights vector will result in the following output:

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/datarwe/pytorch/torch/utils/data/sampler.py in <module>
      274 WeightedRandomSampler([1,2,3], 10)
----> 275 WeightedRandomSampler([[1,2,3], [4,5,6]], 10)

~/datarwe/pytorch/torch/utils/data/sampler.py in __init__(self, weights, num_samples, replacement, generator)
    192         weights = torch.as_tensor(weights, dtype=torch.double)
    193         if len(weights.shape) != 1:
--> 194             raise ValueError("weights should be a 1d sequence but given "
    195                              "weights have shape {}".format(tuple(weights.shape)))
    196

ValueError: weights should be a 1d sequence but given weights have shape (2, 3)
```
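The validation in the traceback above boils down to rejecting any weights input that is not one-dimensional. A stdlib-only approximation of that check (the real sampler converts to a tensor via `torch.as_tensor` and inspects `.shape`; this sketch just detects nesting):

```python
def check_weights_1d(weights):
    # Stdlib approximation of the sampler's validation: reject any weights
    # sequence whose elements are themselves sequences (i.e. not 1-d).
    if any(isinstance(w, (list, tuple)) for w in weights):
        raise ValueError("weights should be a 1d sequence but given "
                         "weights are nested")
```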
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78585
Approved by: https://github.com/NivekT, https://github.com/ejguan
2022-06-02 21:12:14 +00:00
Akshay Parashar
28f87b9cf9 [Static Runtime] Fix aten::clone out variant (#78297) (#78322)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78297

The out variant of `aten::clone` crashes when its input comes from expand/expand_as, due to the memory-overlap check in the native `copy_` method. Refer to T118519310 for more details.

Crashing test case:
a = tensor(3, 1)          // strides = (1, 1)
b = tensor(3, 2)          // strides = (2, 1)
temp = a.expand_as(b)     // creates temp with shape (3, 2) and strides (1, 0)
temp.clone()              // crashes in copy_ due to memory overlap

Fix: Disable the out variant for the expanded tensor.
- Calls native clone instead of out variant for clone dealing with expanded tensors
- Added test case for both clone variants (out and native clones)
- Increased the tensor size for memory planner test case to trigger dynamic allocation
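The reason the expanded tensor triggers the overlap check: expand broadcasts by giving the expanded dimension stride 0, so distinct indices along it alias the same memory. A stdlib sketch of that condition (a hypothetical helper, not the ATen implementation):

```python
def has_internal_overlap(sizes, strides):
    # An expanded tensor broadcasts by giving the expanded dimension
    # stride 0, so every index along it maps to the same element —
    # writing through such a view would overlap with itself.
    return any(size > 1 and stride == 0
               for size, stride in zip(sizes, strides))
```

For the test case above, `a.expand_as(b)` produces shape `(3, 2)` with strides `(1, 0)`, which this predicate flags, while a contiguous `(3, 2)` tensor with strides `(2, 1)` passes.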

Test Plan:
buck test caffe2/benchmarks/static_runtime/fb:test_fb_operators

buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest

Differential Revision: D36672180

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78322
Approved by: https://github.com/mikeiovine
2022-06-02 21:06:59 +00:00
Max Podkorytov
ebfc70f37a [static-runtime] out variant for aten::mean (#78161)
Summary: As subject

Test Plan: Added unit tests

Differential Revision: D36614633

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78161
Approved by: https://github.com/mikeiovine
2022-06-02 20:56:42 +00:00
Michael Suo
22b10873f3 Allow torchdispatch to customize dim()
This follows the template in
https://github.com/pytorch/pytorch/pull/77396

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78691

Approved by: https://github.com/ezyang
2022-06-02 20:54:13 +00:00
albanD
b30b1f3dec update mps note with more details (#78669)
Follow up to the comments in https://github.com/pytorch/pytorch/pull/77767#pullrequestreview-978807521
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78669
Approved by: https://github.com/kulinseth, https://github.com/anjali411
2022-06-02 20:53:19 +00:00
albanD
3e0f1a8a32 Add option to skip binaries when doing pip install for lintrunner (#78668)
This is a workaround for https://github.com/suo/lintrunner/issues/7
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78668
Approved by: https://github.com/suo
2022-06-02 20:50:57 +00:00
Zain Rizvi
d8093105d1 Update linter to validate file names are compatible across OSes (#78736)
Fixes https://github.com/pytorch/pytorch/issues/74341

Ensure all files have names that are valid across all operating systems, including checking for trailing whitespace.

Note: This is not a fully comprehensive check, since some OSes have various quirks ([namely Windows](https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file)) that reserve certain names and characters.

Still, it covers the file names we are most likely to run into, including one that caused 7cf9b942da to need a revert
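The kinds of rules such a linter enforces can be sketched in a few lines — a hypothetical approximation, not the actual linter: characters Windows forbids, reserved device names like `AUX` and `COM1`, and trailing whitespace or dots.

```python
import re

# Device names Windows reserves regardless of extension (sketch).
WINDOWS_RESERVED = {"CON", "PRN", "AUX", "NUL"} | {
    f"{prefix}{i}" for prefix in ("COM", "LPT") for i in range(1, 10)}

def filename_problems(name):
    # Hypothetical sketch of cross-OS filename rules.
    problems = []
    if re.search(r'[<>:"\\|?*]', name):
        problems.append("forbidden character")
    if name.split(".")[0].upper() in WINDOWS_RESERVED:
        problems.append("reserved Windows name")
    if name != name.rstrip(" ."):
        problems.append("trailing space or dot")
    return problems
```

For example, `aux.py` is invalid on Windows even though the extension differs, and `notes ` (trailing space) is the failure mode that forced the revert mentioned above.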

## Testing
Created another PR with four invalid file names and verified that the linter caught all of them:
https://github.com/pytorch/pytorch/runs/6712731938?check_suite_focus=true
<img width="1093" alt="image" src="https://user-images.githubusercontent.com/4468967/171684679-c968fecf-9bc5-480c-9e18-012def25df06.png">

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78736
Approved by: https://github.com/seemethere, https://github.com/atalman
2022-06-02 20:11:30 +00:00
Jane Xu
3354b31e9a Fix nightly docs push: don't use nonexistent 5.4 image (#78730)
Fixes #78687

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78730
Approved by: https://github.com/kit1980, https://github.com/atalman, https://github.com/malfet
2022-06-02 20:04:11 +00:00
Michael Andreas Dagitses
501d0729cb move build_variables.bzl and ufunc_defs.bzl from pytorch-root/tools/ to the root
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78542

This makes importing easier in different build systems that have
different absolute names for the pytorch-root.

Differential Revision: [D36782582](https://our.internmc.facebook.com/intern/diff/D36782582/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36782582/)!

Approved by: https://github.com/malfet
2022-06-02 19:39:27 +00:00
Zain Rizvi
5ef378a30f Fix out of date documentation & remove friction points (#78682)
Fixes various friction points with the documentation for onboarding new users and removes instructions that were no longer valid.

Changes include:
- Listing prerequisites earlier, so that devs can ensure they're met before encountering error messages
- Removing linter invocations that are no longer valid
- Modifying instructions to install mkl packages to only apply to x86 based CPUs

[skip ci]
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78682
Approved by: https://github.com/seemethere, https://github.com/janeyx99, https://github.com/malfet
2022-06-02 19:31:48 +00:00
Eli Uriegas
4220799ea7 scripts: Fix dry run for cut-release-branch.sh
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77978

Approved by: https://github.com/suo, https://github.com/atalman
2022-06-02 19:23:51 +00:00
Michael Andreas Dagitses
844368c032 remove Bazel globs that don't match any files
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78007

These are vestigial.

Differential Revision: [D36558465](https://our.internmc.facebook.com/intern/diff/D36558465/)

Approved by: https://github.com/kit1980
2022-06-02 18:39:47 +00:00
Michael Andreas Dagitses
7d12eecba1 move GENERATED_CPP_CUDA to caffe2/build.bzl
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77744

This is needed by gen_aten and its immediate downstream libraries. As
such, it can live solely in the shared build structure.

Differential Revision: [D36480812](https://our.internmc.facebook.com/intern/diff/D36480812/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36480812/)!

Approved by: https://github.com/kit1980
2022-06-02 18:38:05 +00:00
Sergii Dymchenko
f1132c2c3c Remove mentions of deleted TH and friends (#78683)
Test Plan: No-op, rely on CI.

Differential Revision: D36829491

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78683
Approved by: https://github.com/dagitses, https://github.com/ngimel
2022-06-02 18:32:51 +00:00
Matt Guo
9c8eb2cf1b Leaky relu in metal shader (#78544)
Summary:
Heavily referenced how Hardswish was implemented.

This is a great intro task to get a taste of how a torch method is implemented in a shader and tested.

Test Plan:
Compared the Metal shader version and the CPU version results in tests.

https://pxl.cl/251kT

Reviewed By: SS-JIA

Differential Revision: D36732187

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78544
Approved by: https://github.com/SS-JIA
2022-06-02 18:13:51 +00:00
Kshiteej K
849b08f14b [reland][chalf] where(cpu and cuda), pow(cuda) (#78665)
Reland: https://github.com/pytorch/pytorch/pull/77640
Ref: #74537
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78665
Approved by: https://github.com/ngimel
2022-06-02 18:04:06 +00:00
PyTorch MergeBot
d578197747 Revert "Fix embedding jvp support by making embedding_renorm ignore forward mode AD (#78560)"
This reverts commit ce7c7bb2a9.

Reverted https://github.com/pytorch/pytorch/pull/78560 on behalf of https://github.com/malfet due to broke XLA (on CI and trunk), see ce7c7bb2a9
2022-06-02 17:40:34 +00:00
Vitaly Fedyunin
883f8ef62e [DataLoader] DataLoader now automatically apply sharding to DataPipes
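The sharding this applies can be pictured as round-robin slicing per worker — a stdlib-only sketch under that assumption, not the DataPipes implementation: each worker keeps every `num_workers`-th element starting at its own id, so workers see disjoint, covering slices of the stream.

```python
from itertools import islice

def shard(iterable, num_workers, worker_id):
    # Round-robin sharding sketch: worker `worker_id` keeps elements at
    # positions worker_id, worker_id + num_workers, worker_id + 2*num_workers, ...
    return islice(iterable, worker_id, None, num_workers)
```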
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78631

Approved by: https://github.com/ejguan, https://github.com/NivekT
2022-06-02 17:40:29 +00:00