Commit graph

852 commits

Author SHA1 Message Date
jjsjann123
7b419e8513 [NVFuser] Upstream push 1026 (#87779)
Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Codegen changes include:

* codegen improvement:
    i. allow non-root trivial reductions, allow empty/no-op fusion
    ii. fixes vectorization checks and size calculation
    iii. bank conflict handle improvement
    iv. enables transpose scheduler

* misc:
    i. CI tests failure fixes
    ii. cpp tests file clean up
    iii. trivial forwarding supports added in codegen runtime
    iv. added factory methods support in codegen

Commits that's in this PR from the devel branch:

```
7117a7e37ebec372d9e802fdfb8abb7786960f4a patching nvfuser conv cudnn test numerics mismatch (#2048)
65af1a4e7013f070df1ba33701f2d524de79d096 Inserting sync for redundant parallel types is already done at the (#2023)
6ac74d181689c8f135f60bfc1ec139d88941c98c Fix sync map (#2047)
f5bca333355e2c0033523f3402de5b8aac602c00 Bank conflict checker improvements (#2032)
d2ca7e3fd203537946be3f7b435303c60fa7f51e Minor update on cp.async code generation. (#1901)
d36cf61f5570c9c992a748126287c4e7432228e0 Test file cleanup (#2040)
0b8e83f49c2ea9f04a4aad5061c1e7f4268474c6 Allow non-root trivial reductions (#2037)
a2dfe40b27cd3f5c04207596f0a1818fbd5e5439 Fix vectorize size calculation (#2035)
e040676a317fe34ea5875276270c7be88f6eaa56 Use withPredicate to replace setPredicate to maintain Exprs immutable (#2025)
197221b847ad5eb347d7ec1cf2706733aacbf97c removing ci workflow (#2034)
40e2703d00795526e7855860aa00b9ab7160755f Reduction rand like patch (#2031)
bc772661cbdb3b711d8e9854ae9b8b7052e3e4a3 Add utility for checking bank conflict of shared memory (#2029)
ddd1cf7695f3fb172a0e4bcb8e4004573617a037 Add back FusionReductionWithTrivialReduction_CUDA (#2030)
fbd97e5ef15fa0f7573800e6fbb5743463fd9e57 Revert "Cleanup trivial reduction workarounds (#2006)" (#2024)
bca20c1dfb8aa8d881fc7973e7579ce82bc6a894 Cleanup trivial reduction workarounds (#2006)
e4b65850eee1d70084105bb6e1f290651adde23e Trivial forwarding (#1995)
1a0e355b5027ed0df501989194ee8f2be3fdd37a Fix contiguity analysis of predicates to match updated contiguity. (#1991)
a4effa6a5f7066647519dc56e854f4c8a2efd2a7 Enable output allocation cache (#2010)
35440b7953ed8da164a5fb28f87d7fd760ac5e00 Patching bn inference (#2016)
0f9f0b4060dc8ca18dc65779cfd7e0776b6b38e8 Add matmul benchmark (#2007)
45045cd05ea268f510587321dbcc8d7c2977cdab Enable tests previously disabled due to an aliasing bug (#2005)
967aa77d2c8e360c7c01587522eec1c1d377c87e Contiguous indexing for View operations (#1990)
a43cb20f48943595894e345865bc1eabf58a5b48 Make inlining even more modular (#2004)
dc458358c0ac91dfaf4e6655a9b3fc206fc0c897 Test util cleanup (#2003)
3ca21ebe4d213f0070ffdfa4ae5d7f6cb0b8e870 More strict validation (#2000)
a7a7d573310c4707a9f381831d3114210461af01 Fix build problem (#1999)
fc235b064e27921fa9d6dbb9dc7055e5bae1c222 Just fixes comments (#1998)
482386c0509fee6edb2964c5ae72074791f3e43a cleanup (#1997)
4cbe0db6558a82c3097d281eec9c85ad2ea0893a Improve divisible split detection (#1970)
42ccc52bdc18bab0330f4b93ed1399164e2980c9 Minor build fix. (#1996)
fcf8c091f72d46f3055975a35afd06263324ede6 Cleanup of lower_utils.cpp: Isolate out GpuLower usage (#1989)
15f2f6dba8cbf408ec93c344767c1862c30f7ecc Move ConcretizedBroadcastDomains to shared_ptr in GpuLower. (#1988)
8f1c7f52679a3ad6acfd419d28a2f4be4a7d89e2 Minor cleanup lower_unroll.cpp (#1994)
1d9858c80319ca7f0037db7de5f04e47f540d76c Minor cleanup (#1992)
f262d9cab59f41c669f53799c6d4a6b9fc4267eb Add support for uniform RNG (#1986)
eb1dad10c73f855eb1ecb20a8b1f7b6edb0c9ea3 Remove non-const functions, remove GpuLower instance on build, pass in ca_map. (#1987)
634820c5e3586c0fe44132c51179b3155be18072 Add support for some empty fusion (#1981)
eabe8d844ad765ee4973faa4821d451ef71b83c3 Segment self mapping fusions (#1954)
e96aacfd9cf9b3c6d08f120282762489bdf540c8 Enable Transpose operation (#1882)
425dce2777420248e9f08893765b5402644f4161 Add a null scheduler that helps segmenting away no-op schedules (#1835)
306d4a68f127dd1b854b749855e48ba23444ba60 Fix canScheduleCompileTime check of transpose scheduler (#1969)
b1bd32cc1b2ae7bbd44701477bddbcfa6642a9be Minor fix (#1967)
bd93578143c1763c1e00ba613a017f8130a6b989 Enable transpose scheduler (#1927)
b7a206e93b4ac823c791c87f12859cf7af264a4c Move scheduler vectorize utilities into their own file (#1959)
d9420e4ca090489bf210e68e9912bb059b895baf View scheduling (#1928)
c668e13aea0cf21d40f95b48e0163b812712cdf2 Upstream push ci fixes (#1965)
c40202bb40ce955955bb97b12762ef3b6b612997 Fix dump effective bandwidth (#1962)
93505bcbb90a7849bd67090fe5708d867e8909e4 WAR on index mapping when exact and permissive maps differ (#1960)
45e95fd1d3c773ee9b2a21d79624c279d269da9f Allow splitting inner-most ID to create virtual innermost ID in transpose scheduler (#1930)
a3ecb339442131f87842eb56955e4f17c544e99f Improve the comments at the beginning of index_compute.h (#1946)
f7bc3417cc2923a635042cc6cc361b2f344248d6 Remove unused variables (#1955)
df3393adbb5cb0309d091f358cfa98706bd4d313 Some cleanup (#1957)
7d1d7c8724ab5a226fad0f5a80feeac04975a496 TVDomainGuard factory (#1953)
357ba224c0fb41ed3e4e8594d95599c973f4a0ca Fill allocation with nan on tests (#1956)
8eafc54685d406f5ac527bcbacc475fda4492d7a Fix detection of unmappable root domains (#1952)
90a51f282601ba8ebd4c84b9334efd7762a234bc Some indexing cleanups, Add eye support (#1940)
ddc01e4e16428aec92f9c84d698f959b6436a971 Exclude unsupported data types (#1951)
992e17c0688fe690c51b50e81a75803621b7e6aa test the groups the same order as they are merged (#1949)
208262b75d1fed0597a0329d61d57bc8bcd7ff14 Move detection of self mapping IDs to IterDomainGraph from (#1941)
ac4de38c6ee53b366e85fdfe408c3642d32b57df Merge pull request #1945 from csarofeen/master_merge_0828
631094891a96f715d8c9925fb73d41013ca7f2e3 Add full, full_like, zeros, zeros_like, ones, ones_like (#1943)
aab10bce4541204c46b91ff0f0ed9878aec1bfc4 Merge remote-tracking branch 'upstream/viable/strict' into HEAD
4c254c063bb55887b45677e3812357556a7aa80d Fix arange when step is negative (#1942)
89330aa23aa804340b2406ab58899d816e3dc3d2 Tensor factories must set the output shape as its input (#1939)
```

RUN_TORCHBENCH: nvfuser

Differential Revision: [D40869846](https://our.internmc.facebook.com/intern/diff/D40869846)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87779
Approved by: https://github.com/davidberard98
2022-11-04 20:04:34 +00:00
Nikita Shulga
e1c123d29a Add UBSAN to ASAN (#88055)
Add undefined behavior sanitizer to `USE_ASAN` option.
Added `torch._C._crash_if_vptr_ubsan()` that only fails if vptr belongs to a wrong class after typecast
Deleted all ubsan supressions, but disabled `ProtoTest::Basic` as it fails above-mentioned vptr check.

Fixes https://github.com/pytorch/pytorch/issues/88042
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88055
Approved by: https://github.com/ezyang
2022-11-01 17:59:35 +00:00
Han Qi (qihqi)
5c3666cb81 [codev] Make backport work with flatbuffer models (#88127)
Summary: By adding flatbuffer as dependency of backport.

Differential Revision: D40865452

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88127
Approved by: https://github.com/cccclai
2022-11-01 16:11:30 +00:00
Edward Z. Yang
1ff52225f1 Unify SymIntNode and SymFloatNode into SymNode (#87817)
This refactor was prompted by challenges handling mixed int/float
operations in C++.  A previous version of this patch
added overloads for each permutation of int/float and was unwieldy
https://github.com/pytorch/pytorch/pull/87722/  This PR takes a different
approach.

The general outline of the patch is to combine the C++ types SymIntNode
and SymFloatNode into a single type, SymNode.  This is type erased; we
no longer know statically at C++ if we have an int/float and have to test
it with the is_int()/is_float() virtual methods.  This has a number of
knock on effects.

- We no longer have C++ classes to bind to Python.  Instead, we take an
  entirely new approach to our Python API, where we have a SymInt/SymFloat
  class defined entirely in Python, which hold a SymNode (which corresponds
  to the C++ SymNode).  However, SymNode is not pybind11-bound; instead,
  it lives as-is in Python, and is wrapped into C++ SymNode using PythonSymNode
  when it goes into C++.  This implies a userland rename.

  In principle, it is also possible for the canonical implementation of SymNode
  to be written in C++, and then bound to Python with pybind11 (we have
  this code, although it is commented out.)  However, I did not implement
  this as we currently have no C++ implementations of SymNode.

  Because we do return SymInt/SymFloat from C++ bindings, the C++ binding
  code needs to know how to find these classes.  Currently, this is done
  just by manually importing torch and getting the attributes.

- Because SymInt/SymFloat are easy Python wrappers, __sym_dispatch__ now
  takes SymInt/SymFloat, rather than SymNode, bringing it in line with how
  __torch_dispatch__ works.

Some miscellaneous improvements:

- SymInt now has a constructor that takes SymNode.  Note that this
  constructor is ambiguous if you pass in a subclass of SymNode,
  so an explicit downcast is necessary.  This means toSymFloat/toSymInt
  are no more.  This is a mild optimization as it means rvalue reference
  works automatically.

- We uniformly use the caster for c10::SymInt/SymFloat, rather than
  going the long way via the SymIntNode/SymFloatNode.

- Removed some unnecessary toSymInt/toSymFloat calls in normalize_*
  functions, pretty sure this doesn't do anything.

- guard_int is now a free function, since to guard on an int you cannot
  assume the method exists.  A function can handle both int and SymInt
  inputs.

- We clean up the magic method definition code for SymInt/SymFloat/SymNode.
  ONLY the user classes (SymInt/SymFloat) get magic methods; SymNode gets
  plain methods; this is to help avoid confusion between the two types.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

cc @jansel @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87817
Approved by: https://github.com/albanD, https://github.com/anjali411
2022-10-27 20:56:02 +00:00
PyTorch MergeBot
0c1dec375f Revert "Back out "Revert D40198461: [pytorch][PR] Backport currently dont work with some models if:" (#87124)"
This reverts commit a42fbfa0cb.

Reverted https://github.com/pytorch/pytorch/pull/87124 on behalf of https://github.com/ZainRizvi due to This is causing periodic jobs to fail
2022-10-21 16:03:00 +00:00
Han Qi (qihqi)
a42fbfa0cb Back out "Revert D40198461: [pytorch][PR] Backport currently dont work with some models if:" (#87124)
Summary:
reland after fixing windows build failure for OVR.

Notable change:
```
#if defined(FBCODE_CAFFE2) or defined(FB_XPLAT_BUILD)
```
changed to
```#if defined(FBCODE_CAFFE2) || defined(FB_XPLAT_BUILD)
```
Appearently `-DFB_XPLAT_BUILD` wasn't getting picked up in windows if using `or `to connect

Original commit changeset: 7a31fc4b455f

Original Phabricator Diff: D40198461

Test Plan: waitforsandcastle

Reviewed By: davidberard98, cccclai

Differential Revision: D40290932

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87124
Approved by: https://github.com/gmagogsfm
2022-10-20 23:02:10 +00:00
Nirav Mehta
fb614b1871 Enable UBSAN mode for test_jit (#85735)
# Summary

Run `test_jit` executable with UBSAN flag in order to catch errors that might cause internal breakage

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85735
Approved by: https://github.com/dagitses
2022-10-17 22:15:50 +00:00
Alex Beloi
a38e43e936 [perf][1/5] Replace IValue::toString()->string() with IValue::toStringRef() (#85437)
Summary: `IValue::toString()` creates a `new c10::intrusive_ptr` (like `std::shared_ptr`) and `->string()` immediately accesses it, creating an atomic reference increment/decrement. We can skip both of these operations by calling `IValue::toStringRef()`.

Test Plan: CI

Reviewed By: jaybean-dev

Differential Revision: D39605242

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85437
Approved by: https://github.com/jfix71
2022-09-23 23:36:57 +00:00
jjsjann123
0e582fbfcc [NVFuser] Upstream push 0907 (#84626)
Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Codegen changes include:

- codegen improvement:
i. improved view support on pointwise and transpose scheduler
ii. grouped grid welford added for better outer-norm grid persistence in normalization

- misc:
i. new composite ops added: variance_mean , arange,
ii. fixes misaligned address for transpose scheduler
iii. refactor on separation of compilation API from execution API to prepare us for async compilation
iv. double type support on expression evaluator
v. PYTORCH_NVFUSER_DUMP refactor to save PTX and CUBIN

Commits that's in this PR from the devel branch:
```
89330aa23aa804340b2406ab58899d816e3dc3d2 Tensor factories must set the output shape as its input (#1939)
b2fd01ea9346712c6d6f623ca6addbc4888d008e arange support (#1933)
56c00fd3922dad7dfc57351ad7d780f0f2f8e4ed Double support on all expression evaluators (#1937)
371f28223e57fe3f6b5e50a0a45177e6a5c0785c Improve trivial reduction merge support (#1931)
1d0c26790e5647920b40d419d26815bbe310b3a6 Test `rand` in a fusion with zero tensor input (#1932)
0dab160fb2177d178eef3148c6a529e0855009e9 Fix softmax bwd sizes. (#1890)
ef98f360f6d3e3e1cc662ecb65202d88150f128d Fix a bug (#1936)
63132a0c56508c550084b07fb76a3df865102d00 Propagate permissive mapping information into indexing pass (#1929)
b4ac2c88d78078ee4d8b21c4fc51645b5710a282 Map IterationDomains through view operations. (#1919)
c0a187a7619d7cf9dc920294e15461791e8d6d4d do not use deprecated functions (#1935)
88de85e758c5e4afb7b6e746573c0d9a53b4cea7 Upstream cherry pick fixes 0811 (#1934)
b247dcf7c57dc6ac3f7a799b0a6beb7770536a74 Separate kernel compilation API from kernel execution API (#1914)
b34e3b93ee1a8030730c14af3995dd95665af07d Fix `ir_utils::hasBlockSync` + misc fixes in transpose scheduler (#1924)
14a53e6707f43bf760494c238a46386d69830822 Nullary RNGOp (#1892)
3c3c89e638f5172cafb0761f22bacd1fd695eec3 Misc fixes/tuning for transpose scheduler (#1912)
20cf109c8b44d48f61977e35bae94368985144ac Grouped grid welford (#1921)
6cf7eb024c9e53c358cbe56597e117bad56efefd Transpose scheduler small dim sizes better support (#1910)
9341ea9a5bf42f9b14ccad0c94edbc79fc5bb552 Disabled ViewPersistentShmoo sizes that results in NAN (#1922)
057237f66deeea816bb943d802a97c1b7e4414ab Fix CUDA driver error: misaligned address for transpose scheduler  (#1918)
3fb3d80339e4f794767a53eb8fdd61e64cf404a2 Add variance_mean function using Welford (#1907)
98febf6aa3b8c6fe4fdfb2864cda9e5d30089262 Remove DisableOption::UnrollWithRng (#1913)
ee8ef33a5591b534cf587d347af11e48ba7a15d4 Minor fix for the debug interface of using PTX directly (#1917)
6e8f953351f9dabfd1f991d8431cecb6c2ce684d Add PYTORCH_NVFUSER_DUMP options to save PTX and CUBIN (#1916)
5eefa9a72385f6a4b145680a9dcc52d7e8293763 dopt is only available since nvrtc 11.7 (#1915)
2ec8fc711eafc72451eebf0f5e2a98a38bf3f6ef Kill computeAtBetween (#1911)
d0d106a1d9af118d71673173674e875be35d259d Improve view support on pointwise and transpose scheduler (#1906)
e71e1ecefe67219846070590bbed54bbc7416b79 Fix name clash of RNG with shared memory (#1904)
3381793a253689abf224febc73fd3fe2a0dbc921 Fix mutator and sameAs for expanded IterDomain (#1902)
```

RUN_TORCHBENCH: nvfuser

Differential Revision: [D39324552](https://our.internmc.facebook.com/intern/diff/D39324552)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84626
Approved by: https://github.com/malfet
2022-09-23 20:29:48 +00:00
Mike Iovine
63c1f2fef9 [Static Runtime] Fold linear prepack ops (#85289)
Summary: Split `quantized_linear_unpacked_weight_v2` into `linear_prepack` and `quantized_linear` so that the prepacking operation may be eliminated by constant folding.

Test Plan:
Fixes a huge regression in an internal model:

```
Before
        89.6141 ms.    99.0923%. fb::quantized_linear_unpacked_weight_v2 (12 nodes)
After
       0.806852 ms.    53.5365%. quantized::linear (12 nodes, out variant)
(prepacking eliminated)
```

Differential Revision: D39622530

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85289
Approved by: https://github.com/davidberard98
2022-09-22 20:23:07 +00:00
Kevin Stephano
b8418e02eb Create Cache for Fusion Reuse in NVFuser in Python Frontend for Primtorch (#85045)
This PR does the following:

- Replaces the `FusionOwner` with a `FusionCache` and `FusionInterface`.  The `FusionCache` is a singleton that contains a cache of Fusions based on the `FusionDefinition`.  It replaces the TorchScript graph caching that looked up a Fusion based on a stringified and canonicalized representation of the TorchScript graph with a prefix tree of statements in the `FusionDefinition`.  The `FusionInterface` is an object that represents a Fusion in python.  It can also query the cache based on id.
- The ability to print out a mechanically derived definition, in python, for the user to use when debugging was added.
- Replaces the python `examples` directory with true python tests under `test/test_nvfuser_frontend.py`.
- Adds a set of C++ tests under the `test` directory to verify the `FusionCache`, `FusionDefinition`, and parts of the `RecordFunctor` child classes.
- Adds a README file to explain how to use the Python Frontend

While there are 3,000+ line edits, the bulk of the changes were repetitive line changes to the python bindings for each operation.

An identical PR to #83267 to avoid tooling issues.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85045
Approved by: https://github.com/davidberard98
2022-09-17 10:52:54 +00:00
PyTorch MergeBot
94b67f4cd8 Revert "Create Cache for Fusion Reuse in NVFuser in Python Frontend for Primtorch (#83267)"
This reverts commit ec916bf6af.

Reverted https://github.com/pytorch/pytorch/pull/83267 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally
2022-09-14 17:40:22 +00:00
Kevin Stephano
ec916bf6af Create Cache for Fusion Reuse in NVFuser in Python Frontend for Primtorch (#83267)
This PR does the following:

- Replaces the `FusionOwner` with a `FusionCache` and `FusionInterface`.  The `FusionCache` is a singleton that contains a cache of Fusions based on the `FusionDefinition`.  It replaces the TorchScript graph caching that looked up a Fusion based on a stringified and canonicalized representation of the TorchScript graph with a prefix tree of statements in the `FusionDefinition`.  The `FusionInterface` is an object that represents a Fusion in python.  It can also query the cache based on id.
- The ability to print out a mechanically derived definition, in python, for the user to use when debugging was added.
- Replaces the python `examples` directory with true python tests under `test/test_nvfuser_frontend.py`.
- Adds a set of C++ tests under the `test` directory to verify the `FusionCache`, `FusionDefinition`, and parts of the `RecordFunctor` child classes.
- Adds a README file to explain how to use the Python Frontend

While there are 3,000+ line edits, the bulk of the changes were repetitive line changes to the python bindings for each operation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83267
Approved by: https://github.com/jjsjann123, https://github.com/davidberard98
2022-09-13 23:28:39 +00:00
Edward Z. Yang
ad44670fa1 Back out "Revert D38984222: Don't introduce new overload for SymInt (#83628)" (#84173)
Also Back out "Revert D39075159: [acc_tensor] Use SymIntArrayRef for overloaded empty.memory_format's signature"

Original commit changeset: dab4a9dba4fa
Original commit changeset: dcaf16c037a9

Original Phabricator Diff: D38984222
Original Phabricator Diff: D39075159

Also update Metal registrations for C++ registration changes.

Also update NNPI registration to account for tightened schema checking

Differential Revision: [D39084762](https://our.internmc.facebook.com/intern/diff/D39084762/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39084762/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84173
Approved by: https://github.com/Krovatkin
2022-08-29 18:01:07 +00:00
Nikolay Korovaiko
86e134ddf7 disable c10::SymIntNode tests on mobile (#84066)
This fixes c++ tests' breaks where we were passing pointers and expected `is_symbolic` to return `true`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84066
Approved by: https://github.com/albanD
2022-08-25 17:28:23 +00:00
jjsjann123
b21a6ff639 [NVFuser] Upstream push 0811 (#83239)
Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Code changes includes:

- codegen improvements:
  1. double support in expression evaluator
- bug fixes:
  1. dropout fix - rework RNG to support broadcasted dropout (Fixes #82784)
  2. expand fix - Patch expand+reduction, expand+view, rework view analysis and guard
- scheduler:
  1. manual transpose schedule example
  2. WIP transpose scheduler

Commits that's in this PR from the devel branch:

```
b7435afcd22c917713c2f41a7237bc26e1183f14 Transpose scheduler, step 1 (#1854)
8a45dbf72034684eb8e18b1835b533e90b68f184 Add an example on how to manually schedule transpose (#1889)
83dbf56a9554b2efbd5416461d938fff477b0b27 Patch dropout fix (#1898)
69d3519a532250719b1aa8341b50e067b181b42d Expand+Reduction, Expand+View support, rework View analysis and guards (#1883)
15091c488e96343bdc49e3990acbf238a3b3da51 Rework RNG to correctly support broadcasted dropout (#1888)
aafe2d048aaac596e503596a41303423619f3954 Make ExpressionEvaluator support Double (#1885)
```

RUN_TORCHBENCH: nvfuser

Differential Revision: [D38657074](https://our.internmc.facebook.com/intern/diff/D38657074)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83239
Approved by: https://github.com/davidberard98
2022-08-25 02:23:22 +00:00
Larry Liu
a8a36c45a6 [frontend] Fix tensor list alias annotation (#84005)
For issue https://github.com/pytorch/pytorch/issues/77920 and a retry of https://github.com/pytorch/pytorch/pull/83921

The current logic checks alias info before `[]` and after. If no alias info exists after `[]`, we overwrite the alias info before. This logic failed on argument like `Tensor(a!)[]`, dropping the alias info before `[]` on the floor. This PR adds a new alias info if it's missing after `[]`. This way we can keep the alias info before `[]`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84005
Approved by: https://github.com/cccclai, https://github.com/bdhirsh
2022-08-24 19:50:19 +00:00
Nikolay Korovaiko
b842670aa5 logical ops (#83879)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83879
Approved by: https://github.com/ezyang
2022-08-24 17:49:57 +00:00
Nikolay Korovaiko
2b805e3520 add arithmetic ops (#83878)
arithmetic ops tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83878
Approved by: https://github.com/ezyang
2022-08-24 17:49:56 +00:00
Nikolay Korovaiko
fcb124406b release the current symintnode in the move c-tor (#83789)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83789
Approved by: https://github.com/ezyang
2022-08-22 14:37:06 +00:00
Han Qi (qihqi)
f9533560cc Use flatbuffer of alternate namespace (#82952)
Summary: Minimal change to make use of flatbuffer with fbsource namespace.

Test Plan: existing unit tests

Differential Revision: D38494999

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82952
Approved by: https://github.com/cccclai
2022-08-09 07:40:59 +00:00
Tugsbayasgalan Manlaibaatar
b4b60c2a2e Get rid of ENABLE_UPGRADERS macro (#77574)
Since it's been a while after we merged the upgrader design and we haven't encountered any issues, let's get rid of the macro for safe rollout
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77574
Approved by: https://github.com/gmagogsfm
2022-08-09 05:33:14 +00:00
Dave Bort
6e712823c5 Migrate remaining pytorch code to use new flatbuffer_loader.h APIs (#82620)
This is the only file in pytorch core that refers to the deprecated flatbuffer_loader.h APIs. Move to the non-deprecated functions.

Differential Revision: [D38330369](https://our.internmc.facebook.com/intern/diff/D38330369/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82620
Approved by: https://github.com/qihqi
2022-08-05 02:25:33 +00:00
Dave Bort
0810961d5f Remove flatbuffer types/headers from flatbuffer_serializer[_jit].h (#82619)
Hide the flatbuffers types and headers from the serialize APIs, and stop using the DEPRECATED functions from flatbuffer_loader.h.

This required creating the new `DetachedBuffer` type to replace/hide `flatbuffers::DetachedBuffer`, a class that owns a span of custom-allocated memory.

This is another step towards hiding the flatbuffers types and headers from the load/serialize APIs.

Differential Revision: [D38292798](https://our.internmc.facebook.com/intern/diff/D38292798/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D38292798/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82619
Approved by: https://github.com/qihqi
2022-08-05 02:23:34 +00:00
Edward Z. Yang
df69660832 Revert "Revert "Add a lint rule for torch/csrc/util/pybind.h include (#82552)"" (#82599)
This reverts commit 532b8a9e00.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82599
Approved by: https://github.com/albanD
2022-08-02 19:37:02 +00:00
PyTorch MergeBot
532b8a9e00 Revert "Add a lint rule for torch/csrc/util/pybind.h include (#82552)"
This reverts commit 9465c0e0b5.

Reverted https://github.com/pytorch/pytorch/pull/82552 on behalf of https://github.com/zengk95 due to This seems to be breaking windows binary wheels
2022-08-01 20:25:35 +00:00
Edward Z. Yang
9465c0e0b5 Add a lint rule for torch/csrc/util/pybind.h include (#82552)
We define specializations for pybind11 defined templates
(in particular, PYBIND11_DECLARE_HOLDER_TYPE) and consequently
it is important that these specializations *always* be #include'd
when making use of pybind11 templates whose behavior depends on
these specializations, otherwise we can cause an ODR violation.

The easiest way to ensure that all the specializations are always
loaded is to designate a header (in this case, torch/csrc/util/pybind.h)
that ensures the specializations are defined, and then add a lint
to ensure this header is included whenever pybind11 headers are
included.

The existing grep linter didn't have enough knobs to do this
conveniently, so I added some features.  I'm open to suggestions
for how to structure the features better.  The main changes:

- Added an --allowlist-pattern flag, which turns off the grep lint
  if some other line exists.  This is used to stop the grep
  lint from complaining about pybind11 includes if the util
  include already exists.

- Added --match-first-only flag, which lets grep only match against
  the first matching line.  This is because, even if there are multiple
  includes that are problematic, I only need to fix one of them.
  We don't /really/ need this, but when I was running lintrunner -a
  to fixup the preexisting codebase it was annoying without this,
  as the lintrunner overall driver fails if there are multiple edits
  on the same file.

I excluded any files that didn't otherwise have a dependency on
torch/ATen, this was mostly caffe2 and the valgrind wrapper compat
bindings.

Note the grep replacement is kind of crappy, but clang-tidy lint
cleaned it up in most cases.

See also https://github.com/pybind/pybind11/issues/4099

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82552
Approved by: https://github.com/albanD
2022-08-01 17:16:58 +00:00
Max Ren
727a327162 Back out "Back out "[profiling] Adding targets file for test_mobile_profiler"" (#82243)
Summary:
Originally reverted this diff D37116110 (c9aa74a37f) because

```
> /usr/local/bin/buck build //caffe2/test/cpp/lite_interpreter_runtime/...

BUILD FAILED
The rule //caffe2:backend_interface_libAndroid could not be found.
Please check the spelling and whether it is one of the 1866 targets in /data/users/batanasov/fbsource/fbcode/caffe2/TARGETS. (52107 bytes)
1 similar targets in /data/users/batanasov/fbsource/fbcode/caffe2/TARGETS are:
  //caffe2:backend_interface_lib

This error happened while trying to get dependency '//caffe2:backend_interface_libAndroid' of target '//caffe2/test/cpp/lite_interpreter_runtime:test_mobile_profilerAndroid'
    At //caffe2:backend_interface_libAndroid (ovr_config//platform/linux:x86_64-fbcode)
    At //caffe2/test/cpp/lite_interpreter_runtime:test_mobile_profilerAndroid (ovr_config//platform/linux:x86_64-fbcode)
```

The add test_mobile_profiler was not meant to be built with Android or other mobile platforms, so we are changing the test to a cpp_unittest

Test Plan:
```
buck test //caffe2/test/cpp/lite_interpreter_runtime:test_mobile_profiler
Parsing buck files: finished in 0.9 sec
Creating action graph: finished in 26.5 sec
Downloaded 2/2 artifacts, 1.30 Mbytes, 0.0% cache miss (for updated rules)
Building: finished in 16.5 sec (100%) 18451/18451 jobs, 3/18451 updated
  Total time: 44.0 sec
More details at https://www.internalfb.com/intern/buck/build/8bee82c1-66a9-4fae-805f-e4ef5505d25d
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 6904f989-5c17-4c5b-9a4f-ffb643dfcc43
Trace available for this run at /tmp/tpx-20220726-114727.001729-6904f989-5c17-4c5b-9a4f-ffb643dfcc43/trace.log
RemoteExecution session id: reSessionID-6904f989-5c17-4c5b-9a4f-ffb643dfcc43-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/844425183404951
    ✓ ListingSuccess: caffe2/test/cpp/lite_interpreter_runtime:test_mobile_profiler : 3 tests discovered (17.640)
    ✓ Pass: caffe2/test/cpp/lite_interpreter_runtime:test_mobile_profiler - MobileProfiler.Backend (0.206)
    ✓ Pass: caffe2/test/cpp/lite_interpreter_runtime:test_mobile_profiler - MobileProfiler.BackendMemoryEvents (0.271)
    ✓ Pass: caffe2/test/cpp/lite_interpreter_runtime:test_mobile_profiler - MobileProfiler.ModuleHierarchy (0.268)
Summary
  Pass: 3
  ListingSuccess: 1
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/844425183404951
```

Differential Revision: D38166171

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82243
Approved by: https://github.com/salilsdesai
2022-07-28 23:08:52 +00:00
goldenxuett
c2ccf6e625 [JIT] Add backwards compatibility test for old NonDeterminism ops list in ir.cpp (#82257)
- Added backwards compatibility test to ensure that every Op in the old Nondeterministic op list from ir.cpp has the tag nondeterministic_seeded.
**Note that the 3 ops marked "normal" were not actually real op signatures. (ie findOp with dispatcher returned a nullptr). These were changed to normal.Tensor_Tensor, normal.Tensor_float and normal.float_Tensor in the list since that is what matches the rest of their signatures**
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82257
Approved by: https://github.com/davidberard98
2022-07-27 20:19:22 +00:00
goldenxuett
8d5951e7e8 [JIT] Add is_aliasing method to FunctionSchema (#82255)
- Add is_aliasing method in function schema to be able to indicate if an argument has an alias_set attached to it. This is utilized in the integration with autograd (see next PR)
- Tested in test_schema_info
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82255
Approved by: https://github.com/davidberard98
2022-07-27 20:19:21 +00:00
goldenxuett
67c22b6c07 [JIT] Modify is_nondeterministic to utilize tags in SchemaInfo for non-mobile contexts and integrate with ir.cpp (#82253)
- Modified is_nondeterministic method in SchemaInfo class to utilize tags.
- Modified isNonDeterministic method in ir.cpp to utilize SchemaInfo when a Node is an aten op.
- Added an assert to ensure that if a node is an aten op kind, it has a schema.
- Tested through verifying that all IR.cpp tests run, and through adding 2 custom determinism checks to test for the special dropout edge case and a general bernoulli case.

Differential Revision: [D38179499](https://our.internmc.facebook.com/intern/diff/D38179499)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82253
Approved by: https://github.com/davidberard98
2022-07-27 20:19:19 +00:00
PyTorch MergeBot
e1bd244a14 Revert "[JIT] Modify is_nondeterministic to utilize tags in schemaInfo and integrate with ir.cpp (#81836)"
This reverts commit fc3555ce4d.

Reverted https://github.com/pytorch/pytorch/pull/81836 on behalf of https://github.com/osalpekar due to Internal Mobile NNPACK custom_ops tests failing with Error: tags are not saved for Mobile
2022-07-26 19:11:49 +00:00
PyTorch MergeBot
cbeef2c541 Revert "[JIT] Add is_aliasing method to FunctionSchema (#81916)"
This reverts commit eb2ea9a581.

Reverted https://github.com/pytorch/pytorch/pull/81916 on behalf of https://github.com/osalpekar due to Need to revert this to revert https://github.com/pytorch/pytorch/pull/81836 cleanly. That PR broke internal mobile custom_ops
2022-07-26 19:06:56 +00:00
PyTorch MergeBot
18dd7e55c9 Revert "[JIT] Add backwards compatibility test for old NonDeterminism ops list in ir.cpp (#82029)"
This reverts commit 7288ea4e1d.

Reverted https://github.com/pytorch/pytorch/pull/82029 on behalf of https://github.com/osalpekar due to Need to revert this to revert https://github.com/pytorch/pytorch/pull/81836 cleanly. That PR broke internal mobile custom_ops
2022-07-26 19:00:44 +00:00
goldenxuett
7288ea4e1d [JIT] Add backwards compatibility test for old NonDeterminism ops list in ir.cpp (#82029)
- Added backwards compatibility test to ensure that every Op in the old Nondeterministic op list from ir.cpp has the tag nondeterministic_seeded.
**Note that the 3 ops marked "normal" were not actually real op signatures. (ie findOp with dispatcher returned a nullptr). These were changed to normal.Tensor_Tensor, normal.Tensor_float and normal.float_Tensor in the list since that is what matches the rest of their signatures**
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82029
Approved by: https://github.com/davidberard98
2022-07-25 15:44:34 +00:00
goldenxuett
eb2ea9a581 [JIT] Add is_aliasing method to FunctionSchema (#81916)
- Add is_aliasing method in function schema to be able to indicate if an argument has an alias_set attached to it. This is utilized in the integration with autograd (see next PR)
- Tested in test_schema_info
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81916
Approved by: https://github.com/davidberard98
2022-07-25 15:44:33 +00:00
goldenxuett
fc3555ce4d [JIT] Modify is_nondeterministic to utilize tags in schemaInfo and integrate with ir.cpp (#81836)
- Modified is_nondeterministic method in SchemaInfo class to utilize tags.
- Modified isNonDeterministic method in ir.cpp to utilize SchemaInfo when a Node is an aten op.
- Added an assert to ensure that if a node is an aten op kind, it has a schema.
- Tested through verifying that all IR.cpp tests run, and through adding 2 custom determinism checks to test for the special dropout edge case and a general bernoulli case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81836
Approved by: https://github.com/davidberard98
2022-07-25 15:44:31 +00:00
goldenxuett
9a5fa15ea8 [JIT] Remove BatchNorm and InstanceNorm special cases from AliasDB and replace with SchemaInfo is_mutable checks (#81785)
- Generalized AnalyzeImpl cases for batchNorm and InstanceNorm in alias_analysis.cpp using schema_info.
- Tested by ensuring all aliasDB special case checks for batchNorm and instanceNorm pass as expected.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81785
Approved by: https://github.com/davidberard98
2022-07-23 05:50:39 +00:00
goldenxuett
c9497886fd [JIT] Modify is_mutable in FunctionSchema and SchemaInfo to have SchemaArgument parameter instead of index (#81784)
- Modify the is_mutable(size_t index) overload to become is_mutable(const SchemaArgument& argument) due to cases where one might want to check the mutability of either input or output arguments.
- Refactored all calls to the function to use this new overload
- Tested through is_mutable() tests in test_schema_info.cpp
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81784
Approved by: https://github.com/davidberard98
2022-07-20 22:09:56 +00:00
goldenxuett
1ddbc5a7dc [JIT] Remove has_side_effects functionality from SchemaInfo (#81575)
- This removes all functionality from https://github.com/pytorch/pytorch/pull/81002 due to a realization that the side effects check doesn't affect any ops outside of JIT.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81575
Approved by: https://github.com/davidberard98
2022-07-19 22:33:19 +00:00
goldenxuett
a6e716cfed [JIT] Add may_contains_alias function in SchemaInfo class (#81444)
- Created may_contain_alias method in SchemaInfo which is a wrapper around FunctionSchema may_contain_alias that also accounts for argument values. This is done using similar logic to AliasDB using an internal understanding of wildcard sets and container object
- Added a multitude of tests for various graph edge cases (inputs aliasing, outputs aliasing, multiple input wildcards, multiple container objects, etc...).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81444
Approved by: https://github.com/davidberard98
2022-07-19 04:29:22 +00:00
goldenxuett
47cdab6601 [JIT] Fix double wildcard edge case for may_alias in SchemaInfo and improve formatting (#81439)
- Create c10::AliasTypeSet type def of vector<TypePtr> to match alias_analysis.cpp formatting and improve readability.
- Move canAliasTypeSetsAlias, mapTypeToAliasTypeSet, getAliasTypeSetContainedTypes, and getCorrectList to public in function_schema.h for use in SchemaInfo class.
**In the future it might be better to find a different home for most of these functions since they don't depend on functionSchema. **
- Created hash function for SchemaArgument
- Add assert to ensure that there is only 1 input and 1 output with each alias set (excluding wildcard)
- Fixed double wildcard input edge case for may_alias. (This is the case where if there is a schema with the form (Tensor(a) a, Tensor(*) b, Tensor(*) c) -> Tensor, and the argument values for 'a' and 'b' cause them to alias, then 'a' may also alias 'c'.
- Added tests for double wildcard case in may_alias, mismatching types in may_alias, and the uniqueness internal assert.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81439
Approved by: https://github.com/davidberard98
2022-07-19 04:29:21 +00:00
zhang, xiaobing
86b86202b5 fix torch.config can't respect USE_MKLDNN flag issue (#75001)
Fixes https://github.com/pytorch/pytorch/issues/74949, which reports that torch.config can't respect USE_MKLDNN flag.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75001
Approved by: https://github.com/malfet
2022-07-17 15:00:48 +00:00
goldenxuett
e71f4e7958 [JIT] Implement may_contain_alias in FunctionSchema (#81352)
- Created may_contain_alias method in FunctionSchema to publicize more detailed aliasing information about inputs and outputs of a schema. This method returns whether the first argument may contain an alias to the second argument (ie if the first argument is a list[Tensor], it can contain an alias to the second argument of the second argument is Tensor(*)) and vice versa if bidirectional = true.
- Created helper methods are explained more thoroughly in detail in function_schema.h
-Tested may_contain_alias methods for basic functionality, bidirectional functionality, wildcard functionality and dual container functionality in test_schema_info.cpp.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81352
Approved by: https://github.com/davidberard98, https://github.com/Gamrix
2022-07-15 21:57:37 +00:00
goldenxuett
42ee1608d3 [JIT] Add special cases batch_norm, instance_norm and dropout for SchemaInfo (#81007)
- Added special cases for detach in is_non_deterministic() check and batch_norm and instance_norm in is_mutable() check in SchemaInfo().
- Added tests for the above special cases for detach, batch_norm and instance_norm.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81007
Approved by: https://github.com/davidberard98
2022-07-15 04:52:02 +00:00
goldenxuett
3b4964230e [JIT] Add side effects checks for ops in SchemaInfo subclass (#81002)
- Added has_side_effects method which returns whether a given op has side effects. Currently this is implemented with a hard-coded list of functions copied from ir.cpp in AliasDB, but this will eventually be implemented by returning with a given schema has the has_side_effects tag.
- Tested in test_schema_info.cpp with both an op with side effects and an op without side effects.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81002
Approved by: https://github.com/davidberard98
2022-07-13 00:39:30 +00:00
goldenxuett
14c28caed9 [JIT] Add determinism checks for ops in SchemaInfo subclass (#81000)
- Added is_non_deterministic which returns whether a given op is non-deterministic. Currently this is implemented with a hard-coded list of non-deterministic functions copied from ir.cpp in AliasDB, but this will eventually be implemented by returning with a given schema has the non_deterministic tag.
- Tested is_non_deterministic method with a deterministic op and a non deterministic op in test_schema_info.cpp

**Note that the case for op "aten::dropout(Tensor input, float p, bool train) -> Tensor" which is deterministic whenever "train=false" is not accounted for in this pr and will be fixed in a later pr. Currently "aten::dropout(Tensor input, float p, bool train) -> Tensor" is always considered nondeterministic.**
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81000
Approved by: https://github.com/davidberard98
2022-07-13 00:35:42 +00:00
goldenxuett
50ba94f5cc [JIT] Add aliasing checks in SchemaInfo with associated tests (#80984)
- Created may_alias method in SchemaInfo to update the implementation of FunctionSchema::may_alias for aliasing cases due to inputs aliasing.
- Created output_alias_map_ internal variable to check cases where outputs might alias due to inputs aliasing. This variable is updated in generateAliasMap().
- Added tests for various may_alias special cases (input - input, input - output, output - output) due to inputs aliasing causing other arguments to also alias.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80984
Approved by: https://github.com/davidberard98
2022-07-13 00:18:43 +00:00
goldenxuett
aa61fdb667 [JIT] Add argumentValue functions and is_mutable checks to SchemaInfo (#80972)
- Created addArgumentValue/s methods in SchemaInfo to pass argument values into the subclass. These are used for more accurate mutation, aliasing and determinism checks which include special cases.
- Added input_alias_map_ to keep track of which inputs alias each other. This is updated with the method generateAliasMap.
- Implemented is_mutable methods in SchemaInfo which also give information based on argument values. For instance, if two inputs alias and one is mutable by the schema, then the other will also be mutable.
- Tested Schema Info is_mutable implementation where inputs alias as mentioned above.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80972
Approved by: https://github.com/davidberard98
2022-07-13 00:16:41 +00:00
goldenxuett
e3a870986e [JIT] Add may_alias in FunctionSchema with associated tests (#80918)
- Created may_alias method in FunctionSchema to publicize aliasing information about inputs and outputs of a schema.
- Tested may_alias methods for basic functionality, exceptions, and wildcard functionality.

**Cases where elements of a container alias another argument will be handled with a new may_contain_alias method which will be created in a later pr**
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80918
Approved by: https://github.com/davidberard98
2022-07-12 18:07:23 +00:00