Commit graph

762 commits

Author SHA1 Message Date
Pavithran Ramachandran
6402e62454 Refractor flatbuffer jit code
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75239

Refractor flatbuffer_serializer to move JIT related code to a separate file .

Differential Revision: [D35301020](https://our.internmc.facebook.com/intern/diff/D35301020/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35301020/)!

Approved by: https://github.com/iseeyuan
2022-04-11 23:41:48 +00:00
Yulv-git
ac2d2e3a3d Fix some typos.
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75561
Approved by: https://github.com/albanD
2022-04-11 21:55:59 +00:00
Nikita Shulga
80ea6955af Add cuda-11.3+clang9 build workflow (take 2)
To be able to detect unused captures in GPU code lambdas (as gcc does not support this diagnostic)

Remove unused opts lambda capture in `ProcessGroupMPI.cpp` and `Distributions.cu`

Fix sign-compare in nvfuser benchmark and ignore signed unsigned comparison in nvfuser tests
Fixes https://github.com/pytorch/pytorch/issues/75475 by aliasing CMAKE_CUDA_HOST_COMPILER to C_COMPILER when clang is used
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75293
Approved by: https://github.com/atalman, https://github.com/seemethere
2022-04-11 17:13:01 +00:00
PyTorch MergeBot
8fe43d76d5 Revert "Add cuda-11.3+clang9 build workflow"
This reverts commit 709fcc862e.

Reverted https://github.com/pytorch/pytorch/pull/75293 on behalf of https://github.com/janeyx99
2022-04-11 15:24:59 +00:00
Nikita Shulga
709fcc862e Add cuda-11.3+clang9 build workflow
To be able to detect unused captures in GPU code lambdas (as gcc does not support this diagnostic)

Remove unused opts lambda capture in `ProcessGroupMPI.cpp` and `Distributions.cu`

Fix sign-compare in nvfuser benchmark and ignore signed unsigned comparison in nvfuser tests
Fixes https://github.com/pytorch/pytorch/issues/75475 by aliasing CMAKE_CUDA_HOST_COMPILER to C_COMPILER when clang is used
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75293
Approved by: https://github.com/atalman, https://github.com/seemethere
2022-04-11 14:10:57 +00:00
Pavithran Ramachandran
3001bda304 [PyTorchEdge] Backport from v9 flatbuffer to v8 pickle (#75201)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75201

In this diff:
1. Bump supported version to 9, which will serve as a placeholder for upcoming version bump to v9 for flatbuffer format migration.
2. Implements backport from v9 flatbuffer file to v8 pickle file.
ghstack-source-id: 153225189

(Note: this ignores all push blocking failures!)

Test Plan:
fb:
```
cd ~/fbsource/fbcode/ && buck test  -c fbcode.caffe2_enable_flatbuffer=1 caffe2/test/cpp/jit:jit -- LiteInterpreterTest.BackPortByteCodeModelAllVersions
Parsing buck files: finished in 0.7 sec
Downloaded 0/25 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 20.7 sec (100%) 21783/21783 jobs, 5/21783 updated

cd ~/fbsource/fbcode/ && buck test caffe2/test/cpp/jit:jit -- FlatbufferTest.FlatbufferBackPortTest
Parsing buck files: finished in 0.7 sec
Building: finished in 4.5 sec (100%) 12972/53298 jobs, 0/53298 updated
  Total time: 5.3 sec
More details at https://www.internalfb.com/intern/buck/build/b658d597-d358-4293-97cb-28e7612b96e8
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 35d5542d-6ee3-4c28-be10-1d822c7a6fef
Trace available for this run at /tmp/tpx-20220308-090347.891303-35d5542d-6ee3-4c28-be10-1d822c7a6fef/trace.log
RemoteExecution session id: reSessionID-35d5542d-6ee3-4c28-be10-1d822c7a6fef-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/8444249379196000
    ✓ ListingSuccess: caffe2/test/cpp/jit:jit : 490 tests discovered (22.838)
    ✓ Pass: caffe2/test/cpp/jit:jit - FlatbufferTest.FlatbufferBackPortTest (0.289)
Summary
  Pass: 1
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/8444249379196000
```

Reviewed By: iseeyuan

Differential Revision: D34702597

fbshipit-source-id: 5c203c29d13360d7934ce6e57557739e7038c05e
(cherry picked from commit 6189e08a2bd968fdab636f77cb6bd73d6c36beb2)
2022-04-07 19:43:57 +00:00
Martin Yuan
00c1e01ad0 Remove internal logic to handle bytecode version 3 (#57775)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57775

The minimum supported bytecode version is updated from 3 to 4. We no longer support version 3 bytecode models.

Why?
* There are hacky codes in operator loading, that performs differently on one operator on the global bytecode version 3. Instead operator related metadata should be passed (for example, in #56845). To allow future development, we remove the hacky way first.
* The bytecode version was bumped from 3 to 4 more than half a year ago. Since all the production models are all bumped to version 4, it's not practical to keep and maintain version 3. The risk to deprecate version 3 is low.

Test Plan: Imported from OSS

Reviewed By: raziel

Differential Revision: D28270791

Pulled By: cccclai

fbshipit-source-id: 70b1bd6352fdaae5f8d2173b81578d77018c8e44
(cherry picked from commit 3e930fa381cd01f3705116795c6426df992372fc)
2022-04-07 01:45:52 +00:00
Pavithran Ramachandran
f984e50f39 Extend jit::load to work on flatbuffer file; Take 2 (#75256)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75256

ghstack-source-id: 153138970

Test Plan: CI

Reviewed By: iseeyuan

Differential Revision: D35399581

fbshipit-source-id: dafe9d301009d3f70986ed92bfe06d160ab90ba0
(cherry picked from commit ccc860fd07946de5aae12bc179a0b8bbba83b997)
2022-04-06 17:54:01 +00:00
John Clow
26dcec152c Added support for SSA for ops not in a JIT graph
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74340

Approved by: https://github.com/eellison
2022-04-06 01:45:37 +00:00
Lu Fang
32e58c73c4 Back out "Extend jit::load to work on flatbuffer file" (#75244)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75244

Original commit changeset: d653a5af662a

Original Phabricator Diff: D35060736 (d9d34922a0)

Test Plan: Model loading test, verified that D35060736 (d9d34922a0) will cause the torch::save => torch::load failure.

Reviewed By: yinghai, jianyuh

Differential Revision: D35387009

fbshipit-source-id: 9d176992d402d57779e2af3d905b3c1538335298
(cherry picked from commit 6c8cc0d3b8a88b15e35702d70e18bbae8aa4628a)
2022-04-05 09:55:04 +00:00
Nikita Shulga
81d765ef1f Fix sign-compare violations in cpp tests
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75080

Approved by: https://github.com/atalman
2022-04-04 23:05:31 +00:00
Chen Lai
6efc5c1acf Rewrite upgrader bytecode version from 3 to 4 (content unchanged) (#75120)
Summary:
update the upgrader models by hacking backport logic - copy everything in the model and only rewrite the bytecode version to 4 in D35265596

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75120

ghstack-source-id: 152823046

Test Plan: CI

Reviewed By: qihqi

Differential Revision: D35321154

fbshipit-source-id: 333158bd0fd9b4819b3b7cf47d80c285934adf3e
(cherry picked from commit 74bb2da73a4d18f448b8486772643eac89eb759a)
2022-04-02 01:51:39 +00:00
Pavithran Ramachandran
d9d34922a0 Extend jit::load to work on flatbuffer file (#75022)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75022

Extending torch::jit::load to read flatbuffer file
ghstack-source-id: 152820697

Test Plan: CI

Reviewed By: iseeyuan

Differential Revision: D35060736

fbshipit-source-id: d653a5af662a46107ff4fd70209fd2a0a4d40f20
(cherry picked from commit 109e14a54bd279011c8f9066e6c29e8e0b1fc4db)
2022-04-02 01:33:34 +00:00
Pavithran Ramachandran
7aaa75af05 Extending _get_bytecode_version to support flatbuffers format (#75021)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75021

Extending `_get_bytecode_version` to support flatbuffers.
ghstack-source-id: 152771695

(Note: this ignores all push blocking failures!)

Test Plan:
```
~/fbsource/xplat] cd ~/fbsource/xplat/ && buck test //xplat/caffe2:test_lite_interpreter
Building: finished in 0.8 sec (100%) 327/327 jobs, 0/327 updated
  Total time: 0.9 sec
Testing: finished in 06:59.5 min (85 PASS/0 FAIL)
BUILD SUCCEEDED
RESULTS FOR //xplat/caffe2:test_lite_interpreter
PASS    412.3s 85 Passed   0 Skipped   0 Failed   //xplat/caffe2:test_lite_interpreter
TESTS PASSED
```

Reviewed By: iseeyuan

Differential Revision: D34900498

fbshipit-source-id: 65743076d43a933c5381ec128d0268f22c0a8441
(cherry picked from commit 457c76c7d1df6050b941c56a8198162e2e4a3388)
2022-04-01 15:05:37 +00:00
Nikolay Korovaiko
5177f95d21 Introducing SymInt to Pytorch (for tracing size arithmetic) (master rebase) (#74861)
Summary:
This PR introduces `SymInt` type to Pytorch which will be used by LTC and AOTAutograd for tracing size arithmetic and tests.
`SymInt` is a C++ union structure [int64_t, SymbolicIntNode*] that wraps around an int64_t field where the value of the field could be an index into a list of `shared_ptr<SymbolicIntNode>` or a real int.
This PR doesn't add any support for actually tracing symbolic ints. i.e. data_ for now can only contain real ints.

```
Goal 1: just to show we can add a type to PyTorch core. (wraps int) LANDEABLE
Finalize the naming - symint
Want the name to be short
Does invoke “size” - NO
SInt/SymInt/SymbolicInt
SInt could mean signed int
sym_int or symint or SymInt (originally it was “int”; capitalized implies object semantics, whereas lowercase implies value semantics)
JIT schema - symint
C++ - symint
```

See more details here: https://docs.google.com/document/d/1iiLNwR5ohAsw_ymfnOpDsyF6L9RTUaHMpD8 (d843f63f2a)YLw-jxEw

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74861

Reviewed By: qihqi, ngimel

Differential Revision: D35226230

Pulled By: Krovatkin

fbshipit-source-id: 34acf342bd50fcaa4d8d5dd49c2fd6a98823a5b3
(cherry picked from commit 218643f63ef181cabb92d13a6e837eb64f2dda3c)
2022-03-31 21:59:59 +00:00
jjsjann123
873ced7cd0 Nvfuser code bump 030122 (#73627)
Summary:
Things changed in this PR that requires review:

test/forward_backward_compatibility/check_forward_backward_compatibility.py

Our previous function overload extension names were wrong and has been updated in this PR, hence the compatibility list updated.

nvfuser code updates with bug fixes towards failures we encountered in OpInfoTests as well as failures reported by AOTAutograd team.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73627

Reviewed By: Chillee

Differential Revision: D34765458

Pulled By: davidberard98

fbshipit-source-id: c81f3d6a1b723fb3a8ba419b7f82227f70440ca7
(cherry picked from commit b6a2c362c37051e44fac31687b2fe272f776551e)
2022-03-31 08:18:22 +00:00
Elias Ellison
2ef5611f31 Add comments for adding shape function and linting (#73570)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73570

Approved by: https://github.com/huiguoo

Test Plan: contbuild & OSS CI, see 6d36bbde7e

Reviewed By: pbelevich

Differential Revision: D35192688

Pulled By: atalman

fbshipit-source-id: b12b80e6a6dd1adaa57a8facb6bb077989faa543
(cherry picked from commit e50478c02592597f12b8490ec5496f76c7d8b8cc)
2022-03-31 04:25:43 +00:00
Nikita Shulga
3036a0309d [skip ci]Revert "Add comments for adding shape function and linting"
This is a technical revert of 6d36bbde7e to reconcile it with e50478c02592597f12b8490ec5496f76c7d8b8cc (which is the same + lint changes applied)

Should be skipped during import
2022-03-30 21:21:28 -07:00
Dave Bort
f82b2d4a82 [PyTorchEdge] Make _load_parameters() handle flatbuffer inputs (#74580)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74580

Handle Flatbuffer-serialized parameters.

Make `_load_parameters()` detect the input data format and use the correct deserializer to load the parameters.

Also, rename `BytecodeDeserializer` to `IValueUnpickler` to make it clear that it unpickles an `IValue` and doesn't have anything to do with bytecode.
ghstack-source-id: 152487890

Test Plan:
New unit test shows a successful round trip from _save_parameters() to _load_parameters() using flatbuffers.

```
$ buck test //xplat/caffe2:test_lite_trainer //xplat/caffe2:test_lite_trainer_pickle_and_flatbuffer
Building: finished in 0.5 sec (100%) 346/346 jobs, 0/346 updated
  Total time: 0.6 sec
Testing: finished in 0.5 sec (26 PASS/0 FAIL)
BUILD SUCCEEDED
RESULTS FOR //xplat/caffe2:test_lite_trainer //xplat/caffe2:test_lite_trainer_pickle_and_flatbuffer
PASS    <100ms 13 Passed   0 Skipped   0 Failed   //xplat/caffe2:test_lite_trainer
PASS    <100ms 13 Passed   0 Skipped   0 Failed   //xplat/caffe2:test_lite_trainer_pickle_and_flatbuffer
TESTS PASSED
```

Reviewed By: qihqi

Differential Revision: D34488913

fbshipit-source-id: 8d2c0b895699f3b336115d33bf96d49cbf9245d2
(cherry picked from commit 319345deff260826197f8cdf5ac03071b412c72f)
2022-03-30 20:39:58 +00:00
Dave Bort
1659a267f9 [PyTorchEdge] Export flatbuffers from _save_parameters() (#74579)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74579

Now that we can convert a module to a flatbuffer, update `_save_parameters()` to optionally write to that format.

Also, rename the internal `ScriptModuleSerializer` class to `IValuePickler` to make it more clear that a) it's pickle-specific, and b) it serializes IValues, not Modules.
ghstack-source-id: 152487889

Test Plan:
New unit test shows that we can produce Flatbuffer-formatted output.

```
$ buck test //xplat/caffe2:test_lite_trainer //xplat/caffe2:test_lite_trainer_pickle_and_flatbuffer
Building: finished in 0.5 sec (100%) 346/346 jobs, 0/346 updated
  Total time: 0.6 sec
Testing: finished in 0.5 sec (26 PASS/0 FAIL)
BUILD SUCCEEDED
RESULTS FOR //xplat/caffe2:test_lite_trainer //xplat/caffe2:test_lite_trainer_pickle_and_flatbuffer
PASS    <100ms 13 Passed   0 Skipped   0 Failed   //xplat/caffe2:test_lite_trainer
PASS    <100ms 13 Passed   0 Skipped   0 Failed   //xplat/caffe2:test_lite_trainer_pickle_and_flatbuffer
TESTS PASSED
```

A new test in later commit D34488913 tests the full round trip.

Reviewed By: qihqi

Differential Revision: D34408538

fbshipit-source-id: eea183c31b5e1b2b75a65f384d8a479223a4ae72
(cherry picked from commit de310a15422b65fb7e443f7005d287d9f5f586bc)
2022-03-30 20:39:58 +00:00
Elias Ellison
6d36bbde7e Add comments for adding shape function and linting
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73570

Approved by: https://github.com/huiguoo
2022-03-29 23:02:22 +00:00
Elias Ellison
9c4a63787b Add api for changing function executor settings, hook up execution with decomposition registry (#74186)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74186

Make the execution settings mutable on function_impl so that we can set it for running op decompositions. Add mapping to function objects and show example in test of executing op decompositions.

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D34938125

Pulled By: eellison

fbshipit-source-id: adf108b2f6c1bd166910c6d7b94245661d67ce0d
(cherry picked from commit 9957e33803002d9e71abe4ff802769270b6960d3)
2022-03-29 18:38:52 +00:00
Elias Ellison
0ecf1add1b Introduce function-local settings for executor, expose in c++ (#74012)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74012

This allows setting an executor on a function. The first use case is use to decompositions in C++ without additional fusion passes etc which might not work with custom tensors like batched tensors/vmap. A subsequent use case might be taking advantage of invokees of JIT execution which guard on certain properties before invocation (such as complete shapes in AOT autograd, rank in lazy tensor).

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D34938124

Pulled By: eellison

fbshipit-source-id: cf7a45416457942b872322cab47d871a8336bdb5
(cherry picked from commit 9c600eb9ad0f2173f003e511268e97584edae36d)
2022-03-29 18:38:52 +00:00
Elias Ellison
6694fdaccd Clean up profiling mode and profiling executor strategy (#73875)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73875

Previously we had a few settings:
- getExecutor - which toggled between Profiling Executor and Legacy
- getGraphOptimize - if true, overrides PE/Legacy to run with simple executor (no optimizations)
and then...
- getProfilingMode - which would set PE to 0 specializtions.

The last mode is redundant with getGraphOptimize, we should just remove it and use getGraphOptimize in these cases. It would lead to potentially invalid combinations of logic - what does mean if getProfilingMode is true but getExecutor is set to false ? This would lead to a bug in specialize_autograd_zero in this case, see: https://github.com/pytorch/pytorch/blob/master/torch%2Fcsrc%2Fjit%2Fpasses%2Fspecialize_autogradzero.cpp#L93.

The tests here are failing but get fixed with the PR above it, so i'll squash for landing.

Test Plan: Imported from OSS

Reviewed By: cpuhrsch

Differential Revision: D34938130

Pulled By: eellison

fbshipit-source-id: 1a9c0ae7f6d1cfddc2ed3499a5af611053ae5e1b
(cherry picked from commit cf69ce3d155ba7d334022c42fb2cee54bb088c23)
2022-03-29 18:38:51 +00:00
Pavithran Ramachandran
fc2cf3d26f Back out "Revert D34805092: Extend _save_for_mobile and _load_for_mobile to support flatbuffer format; Default format is pickle + Change buck targets to support only pickle and pickle + flatbuffer for migration" (#74594)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74594

Extending `_save_for_mobile` and `_load_for_mobile` to support faltbuffer format with additional optional argument which is set to pick pickle by default.

Adding new binary target with suffix `_pickle_and_flatbuffer` to help migration.

Size test in D34909502 shows the size has regressed by ~40K but after removing pickle and comparing lite_predictors we have ~120K size measure that we will achieve when deprecating pickle and moving to flatbuffer

**BEFORE:**

```lang=mermaid
graph TD;
    torch_core-->torch_mobile_deserialize;

    torch_mobile_core-->torch_mobile_deserialize;

    jit_module_saving-->torch_core;
    jit_module_saving-->torch_mobile_core;

    torch_mobile_deserialize-->caffe2_serialize;
    torch_mobile_deserialize-->torch_mobile_module;

    caffe2_serialize-->miniz;

    flatbuffer_loader-->mobile_bytecode;
    flatbuffer_serializer-->mobile_bytecode;

    mobile_bytecode-->flatbuffer_2.0;

    flatbuffer_loader-->torch_mobile_module;
    flatbuffer_serializer-->torch_mobile_module;
```

**AFTER:**
```lang=mermaid
graph TD;
    torch_core-->torch_mobile_deserialize;

    torch_mobile_core-->torch_mobile_deserialize;

    jit_module_saving-->torch_core;
    jit_module_saving-->torch_mobile_core;

    torch_mobile_deserialize-->caffe2_serialize;
    torch_mobile_deserialize-->torch_mobile_module;

    caffe2_serialize-->miniz;

    flatbuffer_loader-->mobile_bytecode;
    flatbuffer_serializer-->mobile_bytecode;

    mobile_bytecode-->flatbuffer_2.0;

    torch_mobile_deserialize_pickle_and_flatbuffer-->|new| flatbuffer_loader;
    torch_mobile_deserialize_pickle_and_flatbuffer-->|new| torch_mobile_deserialize;
    torch_mobile_core_pickle_and_flatbuffer-->|new| torch_mobile_deserialize_pickle_and_flatbuffer;
    torch_core_pickle_and_flatbuffer-->|new| torch_mobile_deserialize_pickle_and_flatbuffer;

    jit_module_saving_pickle_and_flatbuffer-->|new| torch_core_pickle_and_flatbuffer;
    jit_module_saving_pickle_and_flatbuffer-->|new| torch_mobile_core_pickle_and_flatbuffer;

    flatbuffer_serializer-->torch_mobile_module;

    jit_module_saving_pickle_and_flatbuffer-->|new|jit_module_saving;
    jit_module_saving_pickle_and_flatbuffer-->|new|flatbuffer_serializer;

    flatbuffer_loader-->torch_mobile_module;
```

Original commit changeset: 780dfb6fd6ba

Original Phabricator Diff: D34805092 (284b2b7135)
ghstack-source-id: 152044801

(Note: this ignores all push blocking failures!)

Test Plan:
CI

```
~/fbsource/fbcode] cd ~/fbsource/fbcode/ && buck test -c fbcode.caffe2_enable_flatbuffer=1 //caffe2/test/cpp/jit:jit  -- FlatbufferTest.ExtraFiles
Parsing buck files: finished in 0.9 sec
Building: finished in 5.3 sec (100%) 12992/54304 jobs, 0/54304 updated
  Total time: 6.2 sec
More details at https://www.internalfb.com/intern/buck/build/2b387fff-f813-4cfa-b53f-eb2378630d4e
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: f93a84d6-e7ce-41a0-a97f-0ef3fa6d199d
Trace available for this run at /tmp/tpx-20220323-134108.766518-f93a84d6-e7ce-41a0-a97f-0ef3fa6d199d/trace.log
RemoteExecution session id: reSessionID-f93a84d6-e7ce-41a0-a97f-0ef3fa6d199d-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/4503599723101693
    ✓ ListingSuccess: caffe2/test/cpp/jit:jit : 486 tests discovered (19.122)
    ✓ Pass: caffe2/test/cpp/jit:jit - FlatbufferTest.ExtraFiles (0.187)
Summary
  Pass: 1
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/4503599723101693
```

Similar Build Deps Dags

```
[pavithran@devvm5216.vll0 /data/users/pavithran/fbsource] buck query 'allpaths(//xplat/caffe2:torch_mobile_all_ops_pickle_and_flatbuffer, //xplat/caffe2:torch_mobile_deserialize_pickle_and_flatbuffer)' --output-format dot-compact  | pastry
P486770901: https://www.internalfb.com/intern/paste/P486770901/

[pavithran@devvm5216.vll0 /data/users/pavithran/fbsource] buck query 'allpaths(//xplat/caffe2:torch_mobile_all_ops, //xplat/caffe2:torch_mobile_deserialize)' --output-format dot-compact  | pastry
P486771278: https://www.internalfb.com/intern/paste/P486771278/
```

pickle_and_flatbuffer: https://www.internalfb.com/intern/dgw/graph/?build_id=P486770901
pickle: https://www.internalfb.com/intern/dgw/graph/?build_id=P486771278

Reviewed By: iseeyuan

Differential Revision: D35067157

fbshipit-source-id: 9044259c17a2e0da79bd6aedb28efbdfd57e23e0
(cherry picked from commit f738069ec3a72e79da56172741d027de514e9e5f)
2022-03-24 21:51:05 +00:00
Nikita Shulga
c53b3ed20f Revert D34805092: Extend _save_for_mobile and _load_for_mobile to support flatbuffer format; Default format is pickle + Change buck targets to support only pickle and pickle + flatbuffer for migration
Test Plan: revert-hammer

Differential Revision:
D34805092 (284b2b7135)

Original commit changeset: 57f3fc81d68f

Original Phabricator Diff: D34805092 (284b2b7135)

fbshipit-source-id: 780dfb6fd6ba5f9348f24a2fb3c57971b7155541
(cherry picked from commit bebeb8b84e11c34cbde4857d0e1c291731a7c781)
2022-03-22 22:45:50 +00:00
Pavithran Ramachandran
284b2b7135 Extend _save_for_mobile and _load_for_mobile to support flatbuffer format; Default format is pickle + Change buck targets to support only pickle and pickle + flatbuffer for migration (#74209)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74209

Extending `_save_for_mobile` and `_load_for_mobile` to support faltbuffer format with additional optional argument which is set to pick pickle by default.

Adding new binary target with suffix `_pickle_and_flatbuffer` to help migration.

Size test in D34909502 shows the size has regressed by ~40K but after removing pickle and comparing lite_predictors we have ~120K size measure that we will achieve when deprecating pickle and moving to flatbuffer

**BEFORE:**

```lang=mermaid
graph TD;
    torch_core-->torch_mobile_deserialize;

    torch_mobile_core-->torch_mobile_deserialize;

    jit_module_saving-->torch_core;
    jit_module_saving-->torch_mobile_core;

    torch_mobile_deserialize-->caffe2_serialize;
    torch_mobile_deserialize-->torch_mobile_module;

    caffe2_serialize-->miniz;

    flatbuffer_loader-->mobile_bytecode;
    flatbuffer_serializer-->mobile_bytecode;

    mobile_bytecode-->flatbuffer_2.0;

    flatbuffer_loader-->torch_mobile_module;
    flatbuffer_serializer-->torch_mobile_module;
```

**AFTER:**
```lang=mermaid
graph TD;
    torch_core-->torch_mobile_deserialize;

    torch_mobile_core-->torch_mobile_deserialize;

    jit_module_saving-->torch_core;
    jit_module_saving-->torch_mobile_core;

    torch_mobile_deserialize-->caffe2_serialize;
    torch_mobile_deserialize-->torch_mobile_module;

    caffe2_serialize-->miniz;

    flatbuffer_loader-->mobile_bytecode;
    flatbuffer_serializer-->mobile_bytecode;

    mobile_bytecode-->flatbuffer_2.0;

    torch_mobile_deserialize_pickle_and_flatbuffer-->|new| flatbuffer_loader;
    torch_mobile_deserialize_pickle_and_flatbuffer-->|new| torch_mobile_deserialize;
    torch_mobile_core_pickle_and_flatbuffer-->|new| torch_mobile_deserialize_pickle_and_flatbuffer;
    torch_core_pickle_and_flatbuffer-->|new| torch_mobile_deserialize_pickle_and_flatbuffer;

    jit_module_saving_pickle_and_flatbuffer-->|new| torch_core_pickle_and_flatbuffer;
    jit_module_saving_pickle_and_flatbuffer-->|new| torch_mobile_core_pickle_and_flatbuffer;

    flatbuffer_serializer-->torch_mobile_module;

    jit_module_saving_pickle_and_flatbuffer-->|new|jit_module_saving;
    jit_module_saving_pickle_and_flatbuffer-->|new|flatbuffer_serializer;

    flatbuffer_loader-->torch_mobile_module;
```
ghstack-source-id: 151744258

Test Plan:
Similar Build Deps Dags

```
[pavithran@devvm5216.vll0 /data/users/pavithran/fbsource] buck query 'allpaths(//xplat/caffe2:torch_mobile_all_ops_pickle_and_flatbuffer, //xplat/caffe2:torch_mobile_deserialize_pickle_and_flatbuffer)' --output-format dot-compact  | pastry
P486770901: https://www.internalfb.com/intern/paste/P486770901/

[pavithran@devvm5216.vll0 /data/users/pavithran/fbsource] buck query 'allpaths(//xplat/caffe2:torch_mobile_all_ops, //xplat/caffe2:torch_mobile_deserialize)' --output-format dot-compact  | pastry
P486771278: https://www.internalfb.com/intern/paste/P486771278/
```

pickle_and_flatbuffer: https://www.internalfb.com/intern/dgw/graph/?build_id=P486770901
pickle: https://www.internalfb.com/intern/dgw/graph/?build_id=P486771278

Reviewed By: iseeyuan

Differential Revision: D34805092

fbshipit-source-id: 57f3fc81d68fce941a050c35bd8e6f05951183b3
(cherry picked from commit 671ae4ed29e65b86ffe507a503548d3e86ab0ea4)
2022-03-22 20:00:53 +00:00
Han Qi
4b4f652f79 [3/5] Put JIT source inside flatbuffer (#74245)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74245

title

Test Plan: unittest

Reviewed By: iseeyuan

Differential Revision: D34881612

fbshipit-source-id: 7037982e9267ad72b86e91cd5f2d92426d71dd56
(cherry picked from commit 88f34eb55b2bee6ef8ef27188e075fa2b8767fdf)
2022-03-17 18:46:47 +00:00
Han Qi
ded82ad7c7 Create method to map JIT module to (source, constant) and back. (#74119)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74119

  implemented function to generate source as ExtraFilesMap and constants

  wrote function to construct jit module given (ivalue, source,
  constant) tripple.

Test Plan: unittest

Reviewed By: pavithranrao

Differential Revision: D34803945

fbshipit-source-id: 2edc798407fe68294cb4c3c7516f5bd143df88c3
(cherry picked from commit 35e54e166b8f0f5cfe8f08c07866b59ae61ee79d)
2022-03-15 18:30:08 +00:00
Han Qi
0723639b60 Revert D34455360: Multisect successfully blamed D34455360 for test failures
Summary:
This diff is reverting D34455360 (61d6c43864)
D34455360 (61d6c43864) is making the following tests to fail and this revert diff is either the revert of the blame diff or the revert of the stack of diffs that need to be reverted to revert the blame diff

Tests affected:
- https://www.internalfb.com/intern/test/562950004334605/

Multisect link:
https://www.internalfb.com/intern/testinfra/multisect/756170

Test Plan: NA

Reviewed By: zhxchen17

Differential Revision: D34596156

fbshipit-source-id: a465bca0094db3caf6130c80f1ed49eea981359b
(cherry picked from commit ef5e5578c64ce9827570757fb016aafa9c782c6a)
2022-03-08 23:18:54 +00:00
Dave Bort
7b51629c53 [PyTorchEdge] Add getFileFormat() so we can differentiate Zip/Pickle from Flatbuffer (#73707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73707

Add a helper function to detect the file format from the first bytes of a data file or stream. This will be necessary during the migration from Pickle-serialized modules to Flatbuffer-serialized modules.
ghstack-source-id: 150384317

Test Plan:
Existing tests for ZIP+Pickle continue to pass.

New unit tests pass:
```
cd xplat && buck test //xplat/caffe2:test_lite_trainer //xplat/caffe2:test_lite_interpreter

Building: finished in 26.6 sec (100%) 3180/3180 jobs, 571/3180 updated
  Total time: 32.2 sec
Testing: finished in 07:08.3 min (89 PASS/0 FAIL)
BUILD SUCCEEDED
RESULTS FOR //xplat/caffe2:test_lite_interpreter //xplat/caffe2:test_lite_trainer
PASS    421.1s 81 Passed   0 Skipped   0 Failed   //xplat/caffe2:test_lite_interpreter
PASS     103ms  8 Passed   0 Skipped   0 Failed   //xplat/caffe2:test_lite_trainer
TESTS PASSED
```

Reviewed By: iseeyuan

Differential Revision: D34527859

fbshipit-source-id: ff2d1eabc2f8be1de2e44709c878e2d1a373f0df
(cherry picked from commit 5c394848346ab9e374c9e7eed479ad70ed09a7ae)
2022-03-04 19:35:41 +00:00
Han Qi
61d6c43864 Make debug_pkl smaller by only emitting unique traces. (#73368)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73368

debug_pkl file inside of pytorch's .pt file consists of a list of SourceRanges. Each SourceRange points to a Source which is a stack track, filename, and start, end numbers. Those are emitted in debug_pkl file as strings.
Since many SourceRange shares the same source, the string for trace can be deduped.
The newer format saves a set of unique traces in a tuple, then each SourceRange will save the offset of it's trace w.r.t. position in that tuple. (i.e. manually applying dictionary compression).
The above helps with smaller file size. On loading, if we copy each trace to Source as string the runtime memory would still blowup.
To mitigate this, we use SourceView directly instead of source which will take the reference of string inside of Deserializer and make that into string_view. This is safe because Deserializer is hold by Unpickler by shared_ptr, and Unpickler is also hold by shared_ptr by another Source object. That Source object will be alive during the model construction.

Test Plan:
unit test

Took original file (312271638_930.predictor.disagg.local); loaded with `torch.jit.load` save again with `torch.jit.save`. Unzip both, look at contents:
```
[qihan@devvm5585.vll0 ~]$ du archive -h
4.0K    archive/xl_model_weights
3.7M    archive/extra
8.0K    archive/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K    archive/code/__torch__/caffe2/torch/fb/model_transform
8.0K    archive/code/__torch__/caffe2/torch/fb
8.0K    archive/code/__torch__/caffe2/torch
8.0K    archive/code/__torch__/caffe2
20M     archive/code/__torch__/torch/fx/graph_module
20M     archive/code/__torch__/torch/fx
8.0K    archive/code/__torch__/torch/classes
20M     archive/code/__torch__/torch
20M     archive/code/__torch__
20M     archive/code
2.7M    archive/constants
35M     archive
[qihan@devvm5585.vll0 ~]$ du resaved -h
4.0K    resaved/extra
8.0K    resaved/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K    resaved/code/__torch__/caffe2/torch/fb/model_transform
8.0K    resaved/code/__torch__/caffe2/torch/fb
8.0K    resaved/code/__torch__/caffe2/torch
8.0K    resaved/code/__torch__/caffe2
1.3M    resaved/code/__torch__/torch/fx/graph_module
1.3M    resaved/code/__torch__/torch/fx
8.0K    resaved/code/__torch__/torch/classes
1.4M    resaved/code/__torch__/torch
1.4M    resaved/code/__torch__
1.4M    resaved/code
2.7M    resaved/constants
13M     resaved
[qihan@devvm5585.vll0 ~]$
```

Reviewed By: gmagogsfm

Differential Revision: D34455360

fbshipit-source-id: 8cc716f9bba7183746b1b4ecc33a2de34ac503b9
(cherry picked from commit f1a04730fc9ac8fdab6c8e4c44cb5529e42090e4)
2022-03-02 08:37:08 +00:00
Elias Ellison
d3d74e9040 Allow custom registration of shape functions (#73270)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73270

Together with open registration of NNC lowerings this should make possible to add support for custom operators, including internal fb-ops

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D34451275

Pulled By: eellison

fbshipit-source-id: ae8ae2deb93caa6770e738217461e65853897b55
(cherry picked from commit ea6b7e8a6d8f970a20e68d02eefc5c951e32aa07)
2022-02-28 17:44:45 +00:00
Pavithran Ramachandran
62eb7d64cf [PyTorchEdge] Extend flatbuffer to support extra files map (#72951)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72951

Extend flatbuffer to support extra files map

Flatbuffer schema has extra files. The users can write extra files by providing a `map<string, string>` which will be part of the flatbuffer model asset and and can be loaded back similar to pickle.
ghstack-source-id: 149622799

Test Plan:
fb:

```[pavithran@devvm5216.vll0 ~/fbsource/fbcode] cd ~/fbsource/fbcode/ && buck test caffe2/test/cpp/jit:jit -- FlatbufferTest.ExtraFiles
Parsing buck files: finished in 0.7 sec
Downloaded 0/8 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 20.0 sec (100%) 22343/22343 jobs, 4/22343 updated
  Total time: 20.7 sec
More details at https://www.internalfb.com/intern/buck/build/7dba5034-d623-4a1e-afa1-b0e809df7066
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 9c1ac1e0-a8c0-4a62-95df-8f49695aa7d1
Trace available for this run at /tmp/tpx-20220216-144630.207992/trace.log
RemoteExecution session id: reSessionID-9c1ac1e0-a8c0-4a62-95df-8f49695aa7d1-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/7318349470518809
    ✓ ListingSuccess: caffe2/test/cpp/jit:jit : 468 tests discovered (17.211)
    ✓ Pass: caffe2/test/cpp/jit:jit - FlatbufferTest.ExtraFiles (0.169)
Summary
  Pass: 1
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/7318349470518809````

Reviewed By: iseeyuan

Differential Revision: D34286346

fbshipit-source-id: 4e09ab25b8ed6af6f8923db3aab046c255f13bb8
(cherry picked from commit ce8d88e22a360b25253d8a75f428d523fa88a79a)
2022-02-24 19:39:32 +00:00
Jacob Szwejbka
faacb8ab36 [Pytorch Edge] Lean Runtime Test
Summary:
As far as I can tell theres no CI that actually runs the lean_runtime. This should add it I think. (Is this directory covered by CI?) Next up is to create some test for min_runtime_lib

(Note: this ignores all push blocking failures!)

Test Plan: buck test :lean_runtime_delegate_flatbuffer_test

Reviewed By: iseeyuan

Differential Revision: D34255148

fbshipit-source-id: b44693220e93869edd984bbcd17d33db4007a4ea
(cherry picked from commit 0a4a6b5bd2b4a1f8cce8bc1c4a22dad9539631c1)
2022-02-24 18:40:47 +00:00
Alban Desmaison
3bd1507ff2 Revert D33994011: Make debug_pkl smaller by only emitting unique traces.
Test Plan: revert-hammer

Differential Revision:
D33994011 (3d37f5b052)

Original commit changeset: 8e6224c6e942

Original Phabricator Diff: D33994011 (3d37f5b052)

fbshipit-source-id: 885e739efa1081382e1fcf9c6cccba92c57e9f7a
(cherry picked from commit a6d98c85a736c2eb321a6f38005dd0f5dc43eb87)
2022-02-24 16:38:55 +00:00
Han Qi
3d37f5b052 Make debug_pkl smaller by only emitting unique traces. (#72596)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72596

debug_pkl file inside of pytorch's .pt file consists of a list of SourceRanges. Each SourceRange points to a Source which is a stack track, filename, and start, end numbers. Those are emitted in debug_pkl file as strings.

Since many SourceRange shares the same source, the string for trace can be deduped.

The newer format saves a set of unique traces in a tuple, then each SourceRange will save the offset of it's trace w.r.t. position in that tuple. (i.e. manually applying dictionary compression).

The above helps with smaller file size. On loading, if we copy each trace to Source as string the runtime memory would still blowup.
To mitigate this, we use SourceView directly instead of source which will take the reference of string inside of Deserializer and make that into string_view. This is safe because Deserializer is hold by Unpickler by shared_ptr, and Unpickler is also hold by shared_ptr by another Source object. That Source object will be alive during the model construction.

Test Plan:
unit test

Took original file (312271638_930.predictor.disagg.local); loaded with `torch.jit.load` save again with `torch.jit.save`. Unzip both, look at contents:
```
[qihan@devvm5585.vll0 ~]$ du archive -h
4.0K    archive/xl_model_weights
3.7M    archive/extra
8.0K    archive/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K    archive/code/__torch__/caffe2/torch/fb/model_transform
8.0K    archive/code/__torch__/caffe2/torch/fb
8.0K    archive/code/__torch__/caffe2/torch
8.0K    archive/code/__torch__/caffe2
20M     archive/code/__torch__/torch/fx/graph_module
20M     archive/code/__torch__/torch/fx
8.0K    archive/code/__torch__/torch/classes
20M     archive/code/__torch__/torch
20M     archive/code/__torch__
20M     archive/code
2.7M    archive/constants
35M     archive
[qihan@devvm5585.vll0 ~]$ du resaved -h
4.0K    resaved/extra
8.0K    resaved/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K    resaved/code/__torch__/caffe2/torch/fb/model_transform
8.0K    resaved/code/__torch__/caffe2/torch/fb
8.0K    resaved/code/__torch__/caffe2/torch
8.0K    resaved/code/__torch__/caffe2
1.3M    resaved/code/__torch__/torch/fx/graph_module
1.3M    resaved/code/__torch__/torch/fx
8.0K    resaved/code/__torch__/torch/classes
1.4M    resaved/code/__torch__/torch
1.4M    resaved/code/__torch__
1.4M    resaved/code
2.7M    resaved/constants
13M     resaved
[qihan@devvm5585.vll0 ~]$
```

Reviewed By: JasonHanwen

Differential Revision: D33994011

fbshipit-source-id: 8e6224c6e942e91c3403f686c8f0937d1002ed41
(cherry picked from commit a7014dd4029308c95007f362a57c31796d686647)
2022-02-24 09:31:16 +00:00
Mike Iovine
d1c5f9e439 [JIT][SR] Introduce prim::IfThenElse (#72587)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72587

This pattern frequently appears in a few graphs:

```
%result = prim::If(%condition)
  block0():
    -> (%a)
  block1():
    -> (%b)
```

This is slow, particularly in static runtime. Static runtime creates memory planners/block runners for each sub-block, which eats up a lot of memory and introduces a lot of extra overhead for this relatively simple operation.

This diff introduces a new op that replaces nodes like the above with a single op meant to act like a ternary operator:

```
%result = prim::IfThenElse(%condition, %a, %b)
```

Test Plan: New unit tests

Reviewed By: eellison

Differential Revision: D34091789

fbshipit-source-id: eb6a8c460c39b4c019a1f4ab1f3f1e5b6edc400c
(cherry picked from commit 0f1b335e5b83f402bda2dcdd9ecb411e0b67c651)
2022-02-17 18:22:48 +00:00
Shunting Zhang
763ad1bf25 (2/2) Make TorchScript Preserve Fully Qualified Class Name for Python Exceptions: frontend change (#72899)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72899

Reland D33282878 (911d527b87). This is the frontend change.
ghstack-source-id: 149204031

Test Plan: Refer to D33282878 (911d527b87). Also check CI

Reviewed By: gmagogsfm

Differential Revision: D34252127

fbshipit-source-id: 27b17ddd4d05d904eb91fd9ee094d9121f00e388
(cherry picked from commit 1d276baca308110ac40111ccd622400b3bbdc864)
2022-02-16 03:45:15 +00:00
Michael Suo
7db4a48d92 Revert D33342569: (2/2) Make TorchScript Preserve Fully Qualified Class Name for Python Exceptions: frontend change
Test Plan: revert-hammer

Differential Revision:
D33342569 (856157fcee)

Original commit changeset: 57984ac67ae2

Original Phabricator Diff: D33342569 (856157fcee)

fbshipit-source-id: 4c12235a1776a3652e7f91e93b626705759d5176
(cherry picked from commit 4cbd7d8bab76fcf050e376c8528dba36541a779f)
2022-02-15 18:45:44 +00:00
Shunting Zhang
856157fcee (2/2) Make TorchScript Preserve Fully Qualified Class Name for Python Exceptions: frontend change (#70471)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70471

Reland D33282878 (911d527b87). This is the frontend change.
ghstack-source-id: 149114933

Test Plan: Refer to D33282878 (911d527b87). Also check CI

Reviewed By: gmagogsfm

Differential Revision: D33342569

fbshipit-source-id: 57984ac67ae2c56c38f72d3b1fb69105901fb472
(cherry picked from commit b47cc935ee1fd7aa63aa453a323a637bc2c22f3c)
2022-02-15 07:21:19 +00:00
Pavithran Ramachandran
a482aeb0ce [PyTorchEdge] backport v8 to v7 to support promoted ops as instruction (#71662)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71662

backport v8 to v7 to support promoted ops as instruction

a flag to help export as instruction from v8 and export as operators for v7 and below

Test Plan:
```
buck test caffe2/test/cpp/jit:jit -- LiteInterpreterTest.BackPortByteCodeModelAllVersions

Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/5629499620570927
    ✓ ListingSuccess: caffe2/test/cpp/jit:jit : 461 tests discovered (15.693)
    ✓ Pass: caffe2/test/cpp/jit:jit - LiteInterpreterTest.BackPortByteCodeModelAllVersions (2.712)
Summary
  Pass: 1
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/5629499620570927
```

```
buck run mode/opt //caffe2/torch/fb/mobile/upgrader_codegen:upgrader_codegen

buck test mode/opt //caffe2/test:upgrader_codegen -- mobile.test_upgrader_codegen.TestLiteScriptModule
Parsing buck files: finished in 0.8 sec
Downloaded 0/2 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 01:39.4 min (100%) 11031/11031 jobs, 2/11031 updated
  Total time: 01:40.2 min
More details at https://www.internalfb.com/intern/buck/build/a8b0e417-019c-44ba-be6b-23379411a965
BUILD SUCCEEDED
Tpx test run coordinator for Facebook. See https://fburl.com/tpx for details.
Running with tpx session id: 44fbfa66-cce8-4277-82ac-f89d79558581
Trace available for this run at /tmp/tpx-20220202-160956.915412/trace.log
RemoteExecution session id: reSessionID-44fbfa66-cce8-4277-82ac-f89d79558581-tpx
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/281475200877601
    ✓ ListingSuccess: caffe2/test:upgrader_codegen : 1 tests discovered (1.249)
    ✓ Pass: caffe2/test:upgrader_codegen - test_generate_bytecode (mobile.test_upgrader_codegen.TestLiteScriptModule) (1.365)
Summary
  Pass: 1
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/281475200877601
```

Reviewed By: iseeyuan

Differential Revision: D33719098

fbshipit-source-id: e2d2b23d298f98e4d4fcdfc344f7b8c6f92cff26
(cherry picked from commit 81b956c23abc19489b69eee986721252474d00dc)
2022-02-15 03:47:39 +00:00
jiej
2d110d514f Nvfuser code bump 2_1_2022 (#72127)
Summary:
Things changed in this PR that requires review:
1. aten/src/ATen/core/interned_strings.h
2. torch/csrc/jit/ir/alias_analysis.h : exposing createValue to allow efficient mutation
3. torch/csrc/jit/runtime/symbolic_shape_registry.cpp : added gelu/tanh/erf in registry
4. torch/jit/_script.py : throws scripting model sees autocast as decorator since it's not supported

nvfuser code update:
1. codegen improvements and performance tuning
2. integration bug fixes for shape expression logic
3. kernel segmentation update to address perf regression from horizontal fusion
4. scalar cpu tensor promotion to support inter-device operation between cpu scalar tensor and cuda tensor

Things reverted from local changes:
aten::gelu with approximation (tracked in PR: https://github.com/pytorch/pytorch/pull/61439)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72127

Reviewed By: HamidShojanazeri

Differential Revision: D34113233

Pulled By: jbschlosser

fbshipit-source-id: b82cde32b71e324eca0ea57cb8c9f9647278ca74
(cherry picked from commit e009bc5c4e943211c4953e6fdf7c9913fa66b3c9)
2022-02-15 00:43:16 +00:00
Jacob Szwejbka
52c516ecb8 [Pytorch Edge] Minor improve documentation in test_backend_with_compiler
Summary:
Went through all these files and the design doc to understand the to_backend api. Figured I could add some comments to these files to make the apis a little clearer for those that come after.

(Note: this ignores all push blocking failures!)

Test Plan: na

Reviewed By: raziel, larryliu0820

Differential Revision: D34221989

fbshipit-source-id: 699fcbd8714bfb6b58c6c0bf0e5fbc019d2ef6f8
(cherry picked from commit 0b3f5d73e8de216b4402aed2f2b80be8781f145b)
2022-02-14 23:44:46 +00:00
David Berard
c314750401 [JIT] enable profiling optional tensors (#70532)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70532

This adds profiling to Optional[Tensor] types

First, in profiling_record.cpp, profiling nodes are added to Optional[Tensor] inputs. The nodes record
(a) whether or not any `None` types are encountered, and
(b) of the Tensor types, what's the most specific type matching all of non-null tensors that were encoutered (shape, dtype, etc.)

In tensorexpr_fuser, when specializing types based on the profiled information, an Optional[Tensor] type will always be Optional[], but the Tensor type contained in the optional type can be specialized (e.g. `Optional[Float(2x2x2, cpu, etc)]`)

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33714748

Pulled By: davidberard98

fbshipit-source-id: 93c819054450de7ac84b112de1012c0c12e34120
(cherry picked from commit 21cfd8012396cf4cc8437cb1dbf6b3846720d1b9)
2022-02-08 22:52:26 +00:00
Han Qi
57f039b41f Fixing few bugs in torch flatbuffer (#72349)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72349

1. Interface call'd methods need to be registered to class. Previously all interface calls are inlined so  there was no such problem.
2. parseDoubleList and parseBoolList got reversed when refactoring.

Test Plan:
1. Get ASR's test model at
```
mkdir ~/asr1 && cd ~/asr1
fbpkg fetch speech.tuna.milan.ondevice.en_us
```
2. Convert model:
```
cd ~/fbsource
buck run //xplat/caffe2/fb/lite_predictor:convert_model -- --model=$HOME/asr1/pytorchmodel.pt --output_name=$HOME/asr1/pytorchmodel.ff
```
3. Ran lite_predictor_flatbuffer
```
 buck run //xplat/caffe2/fb/lite_predictor:lite_predictor_flatbuffer -- --model=$HOME/asr1/pytorchmodel.ff --method_to_call=encode_src --method_to_generate_input=get_all_bundled_inputs_for_encode_src

```

See perf metric generated (means loading and inference succeeded).

Reviewed By: gmagogsfm, zhxchen17

Differential Revision: D33959746

fbshipit-source-id: 24671e1189438119f477032eb6c29bd7736e74ca
(cherry picked from commit 5e188093506aaf5ba64728364474ebde6e4d3d7c)
2022-02-05 00:25:27 +00:00
CodemodService FBSourceClangFormatLinterBot
ed435e903f [AutoAccept][Codemod][FBSourceClangFormatLinter] Daily arc lint --take CLANGFORMAT
Reviewed By: zertosh

Differential Revision: D33938055

fbshipit-source-id: 6c0643a18f09854e87e183341f252c66dd6395a6
(cherry picked from commit fd183aedbc0f015bd43a01a28930093ab94ab41e)
2022-02-02 11:27:15 +00:00
Elias Ellison
cf1833df70 [WIP] add explicit dynamic fusion arg (#71173)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71173

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D33536222

Pulled By: eellison

fbshipit-source-id: a097408ecdd6e284432de128feb297993d882d52
(cherry picked from commit 0e3419b2d32e798539a35db9c47aa85f7487a524)
2022-02-01 19:07:02 +00:00
David Berard
99bc978b78 [JIT] Propagate requires_grad to autodiff subgraphs (#71666)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71666

When JIT autodiff is constructing a gradient computation graph, it will only add gradients for tensors that require_grad. Previously, require_grad information was **not** propagated to the subgraph that autodiff used; as a result, autodiff would calculate *all* gradients, even if requires_grad had never been set during profiling runs. In certain cases, this can lead to performance issues. For example, during training, the gradient of the input data is not needed, but is still computed.

This propagates requires_grad to the subgraph passed into autodiff, so that autodiff will not compute unnecessary gradients.

Test: `./bin/test_jit --gtest_filter="AutodiffRemoveUnusedGradientsTest.Linear"`

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D33725304

Pulled By: davidberard98

fbshipit-source-id: ca7ab4c9a6a26f94f93aff2d5a4135e125323ba1
(cherry picked from commit a97fe0556da1d74d04250c7cbcd1b8e9d8b41ebe)
2022-01-28 18:57:36 +00:00
John Clow
c85965600c Fix bug where frozen mod not used for OFI #68903 (#71436)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71436

Fixes issue #68903

Test Plan: Imported from OSS

Reviewed By: george-qi

Differential Revision: D33824857

Pulled By: Gamrix

fbshipit-source-id: 8d351feb4a621916f55003c58527a1e85eec476e
(cherry picked from commit 57bb4200403692adee1c09f22ce02b1534bad202)
2022-01-27 23:37:50 +00:00