Commit graph

415 commits

Author SHA1 Message Date
Yanan Cao
890b52e09f Reduce instability in runCleanUpPasses by reordering passes. (#41891)
Summary:
Currently constant pooling runs before const propagation, which can create more constants that need pooling. This gets in the way of serialization/deserialization stability, because runCleanUpPasses is called on a module each time a user serializes and deserializes it, and doing so multiple times can produce different saved modules.

This PR moves constant pooling after const propagation, which may slow down const propagation a little but side-steps the aforementioned problem.

test_constant_insertion in test_jit.py is also updated because, after fixing the pass ordering, the number of constants is no longer fixed, and it is extremely difficult to compute the exact number with the current convoluted test structure. So for now, I changed the test to check only that CSE doesn't change the number of "prim::constant" nodes, rather than comparing against a known number. Also left a TODO to improve this test.

The ConstantPropagation pass is replaced by ConstantPropagationImmutableTypes because the latter is what runCleanUpPasses uses. If not replaced, the former would create new CSE opportunities by folding more constants, which would defeat the purpose of the test case.
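
A minimal sketch of the intended ordering, assuming the usual torch::jit pass entry points (ConstantPropagationImmutableTypes, ConstantPooling); the actual runCleanUpPasses implementation contains more passes and may differ:

```
#include <torch/csrc/jit/ir/ir.h>
#include <torch/csrc/jit/passes/constant_pooling.h>
#include <torch/csrc/jit/passes/constant_propagation.h>

// Sketch only: fold constants first so that constant pooling sees every
// constant it may need to deduplicate, keeping repeated save/load round
// trips stable.
void runCleanUpPassesSketch(std::shared_ptr<torch::jit::Graph>& graph) {
  torch::jit::ConstantPropagationImmutableTypes(graph);
  torch::jit::ConstantPooling(graph);
}
```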

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41891

Reviewed By: colesbury

Differential Revision: D22701540

Pulled By: gmagogsfm

fbshipit-source-id: 8e60dbdcc54a93dac111d81b8d88fb39387224f5
2020-07-24 11:39:20 -07:00
Ann Shan
dfe7d27d0e implement lite parameter serializer (#41403)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41403

Test Plan: Imported from OSS

Reviewed By: kwanmacher

Differential Revision: D22611633

Pulled By: ann-ss

fbshipit-source-id: b391e8c96234b2e69f350119a11f688e920c7817
2020-07-23 14:25:44 -07:00
Ann Shan
1039bbf4eb add named parameters to mobile module (#41376)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41376

torch::jit::mobile::Module does not currently support accessing parameters via their attribute names, but torch::jit::Module does. This diff adds equivalent functionality to mobile::Module.
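
A hedged usage sketch; the accessor below mirrors torch::jit::Module::named_parameters() and is illustrative, so the exact shape of the mobile API added here may differ:

```
#include <iostream>
#include <torch/csrc/jit/mobile/import.h>
#include <torch/csrc/jit/mobile/module.h>

int main() {
  torch::jit::mobile::Module m = torch::jit::_load_for_mobile("model.ptl");
  // Illustrative: list parameters by their attribute names, as the full
  // JIT Module already allows.
  for (const auto& p : m.named_parameters()) {
    std::cout << p.name << ": " << p.value.sizes() << std::endl;
  }
  return 0;
}
```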

Test Plan: Imported from OSS

Reviewed By: iseeyuan

Differential Revision: D22609142

Pulled By: ann-ss

fbshipit-source-id: 1a5272ff336f99a3c0bb6194c6a6384754f47846
2020-07-20 15:57:49 -07:00
Ilia Cherniavskii
e7a09b4d17 RecordFunction in Dispatcher (#37587)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37587

Lifting RecordFunction up into the dispatcher code

Test Plan: Imported from OSS

Differential Revision: D21374246

fbshipit-source-id: 19f9c1719e6fd3990e451c5bbd771121e91128f7
2020-07-17 22:20:05 -07:00
Stanislau Hlebik
b774ce54f8 remediation of S205607
fbshipit-source-id: 798decc90db4f13770e97cdce3c0df7d5421b2a3
2020-07-17 17:19:47 -07:00
Stanislau Hlebik
8fdea489af remediation of S205607
fbshipit-source-id: 5113fe0c527595e4227ff827253b7414abbdf7ac
2020-07-17 17:17:03 -07:00
Mikhail Zolotukhin
5d7046522b [JIT] Teach IRPrinter and IRParser to handle 'requires_grad' and 'device' as a part of type info. (#41507)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41507

These fields have always been part of tensor types; this change just
makes them serializable through IR dumps.

Test Plan: Imported from OSS

Reviewed By: Krovatkin, ngimel

Differential Revision: D22563661

Pulled By: ZolotukhinM

fbshipit-source-id: f01aaa130b7e0005bf1ff21f65827fc24755b360
2020-07-17 10:27:04 -07:00
Meghan Lele
4972cf06a2 [JIT] Add out-of-source-tree to_backend tests (#41145)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41145

**Summary**
This commit adds out-of-source-tree tests for `to_backend`. These tests check
that a Module can be lowered to a backend, exported, loaded (in both
Python and C++) and executed.

**Fixes**
This commit fixes #40067.

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D22510076

Pulled By: SplitInfinity

fbshipit-source-id: f65964ef3092a095740f06636ed5b1eb0884492d
2020-07-14 10:57:04 -07:00
Zino Benaissa
690946c49d Generalize constant_table from tensor only to ivalue (#40718)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40718

Currently, every constant except tensors must be inlined during serialization;
tensors are stored in the constant table. This patch generalizes this capability
to any IValue. This is particularly useful for non-ASCII string literals that
cannot be inlined.

Test Plan: Imported from OSS

Differential Revision: D22298169

Pulled By: bzinodev

fbshipit-source-id: 88cc59af9cc45e426ca8002175593b9e431f4bac
2020-07-09 09:09:40 -07:00
Rohan Varma
bf9cc5c776 Add callback with TLS state API in futures (#40326)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40326

Adds a helper function `addCallbackWithTLSState` to both
torch/csrc/utils/future.h (which is used internally by the RPC framework) and the JIT
future. This helper is used to avoid having to manually pass in TLS state where it is needed, e.g. for RPC and `record_function_ops.cpp`. For example, the following:

```
at::ThreadLocalState tls_state;
fut->addCallback([tls_state = std::move(tls_state)]() {
  at::ThreadLocalStateGuard g(tls_state);
  some_cb_that_requires_tls_state();
});
```

becomes

```
fut->addCallbackWithTLSState(some_cb_that_requires_tls_state);
```
ghstack-source-id: 107383961

Test Plan: RPC Tests and added a test in test_misc.cpp

Differential Revision: D22147634

fbshipit-source-id: 46c02337b90ee58ca5a0861e932413c40d06ed4c
2020-07-08 23:25:35 -07:00
Brian Vaughan
2bc9ee97d1 Revert D22418731: [JIT] Add out-of-source-tree to_backend tests
Test Plan: revert-hammer

Differential Revision:
D22418731 (e2a291b396)

Original commit changeset: 621ba4efc1b1

fbshipit-source-id: 475ae24c5b612fe285035e5ebb92ffc66780a468
2020-07-08 13:11:45 -07:00
Meghan Lele
e2a291b396 [JIT] Add out-of-source-tree to_backend tests (#40842)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40842

**Summary**
This commit adds out-of-source-tree tests for `to_backend`. These tests check
that a Module can be lowered to a backend, exported, loaded (in both
Python and C++) and executed.

**Fixes**
This commit fixes #40067.

Test Plan: Imported from OSS

Differential Revision: D22418731

Pulled By: SplitInfinity

fbshipit-source-id: 621ba4efc1b121fa76c9c7ca377792ac7440d250
2020-07-07 21:00:43 -07:00
Meghan Lele
5a4c45f8d1 [JIT] Move TestBackend to test directory (#40840)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40840

**Summary**
This commit moves the TestBackend used for the JIT backend
extension to the tests directory. It was temporarily placed
in the source directory while figuring out some details of
the user experience for this feature.

**Test Plan**
`python test/test_jit.py TestBackends`

**Fixes**
This commit fixes #40067.

Test Plan: Imported from OSS

Differential Revision: D22418682

Pulled By: SplitInfinity

fbshipit-source-id: 9356af1341ec4d552a41c2a8929b327bc8b56057
2020-07-07 21:00:38 -07:00
James Reed
c0f9bf9bea s/torch::jit::class_/torch::class_/ (#40795)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40795
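
An illustrative before/after of the rename; the custom class, namespace, and method below are hypothetical and only meant to show the new registration spelling:

```
#include <string>
#include <vector>
#include <torch/custom_class.h>

struct MyStackClass : torch::CustomClassHolder {
  std::vector<std::string> stack_;
  void push(std::string s) {
    stack_.push_back(std::move(s));
  }
};

// Before: torch::jit::class_<MyStackClass>(...)
// After this change, custom classes are registered via torch::class_:
static auto reg =
    torch::class_<MyStackClass>("my_classes", "MyStackClass")
        .def(torch::init<>())
        .def("push", &MyStackClass::push);
```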

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D22314215

Pulled By: jamesr66a

fbshipit-source-id: a2fb5c6804d4014f8e437c6858a7be8cd3efb380
2020-07-06 15:53:33 -07:00
Christian Sarofeen
b9b4f05abf [nvFuser] Working towards reductions, codegen improvements (#40864)
Summary:
Have basic reduction fusion working, and have improved the code generator to approach the performance of eager-mode reductions. Coming soon will be pointwise-reduction fusions in a way that should prevent the possibility of hitting regressions. Also working on performant softmax kernels in the code generator, which may be our next fusion target.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40864

Reviewed By: ngimel

Differential Revision: D22392877

Pulled By: soumith

fbshipit-source-id: 457448a807d628b1035f6d90bc0abe8a87bf8447
2020-07-06 14:52:49 -07:00
Sebastian Messmer
53af9df557 Unify boxed function signature between jit and c10 (#37034)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37034

c10 takes a Stack* in boxed functions while JIT takes a Stack&.
c10 returns nothing while JIT returns an int that is always zero.

This changes JIT to follow the c10 behavior.
ghstack-source-id: 106834069
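
A sketch of the signature change (the function names below are illustrative):

```
#include <ATen/core/stack.h>

using torch::jit::Stack;

// Old JIT-style boxed kernel: takes Stack&, returns an int that was always 0.
int old_style_boxed_op(Stack& stack) {
  // ... pop inputs from the stack, push outputs ...
  return 0;
}

// Unified c10-style boxed kernel after this change: takes Stack*, returns void.
void new_style_boxed_op(Stack* stack) {
  // ... pop inputs from the stack, push outputs ...
}
```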

Test Plan: unit tests

Differential Revision: D20567950

fbshipit-source-id: 1a7aea291023afc52ae706957e9a5ca576fbb53b
2020-06-29 19:24:26 -07:00
Mike Ruberry
e66445878d Adds dynamic versioning pattern (#40279)
Summary:
BC NOTE:

This change makes it so modules saved with torch.jit.save in PyTorch 1.6 can be loaded by previous versions of PyTorch unless they use torch.div or (soon) torch.full. It also lets tensors saved using torch.save be loaded by previous versions. So this is the opposite of BC-breaking, but I'm using that label to highlight this issue since we don't have a "BC-improving" label.

PR NOTE:
When an operator's semantics change in PyTorch we want to do two things:

1) Preserve the semantics of older serialized Torchscript programs that use the operator
2) Ensure the new semantics are respected

Historically, this meant writing a Versioned Symbol that would remap older versions of the operator into current PyTorch code (1), and bumping the produced file format version (2). Unfortunately, bumping the produced file format version is a nuclear option for ensuring semantics are respected, since it also prevents older versions of PyTorch from loading anything (even tensors!) from newer versions.

Dynamic versioning addresses the nuclear consequences of bumping the produced file format version by only bumping it when necessary. That is, when an operator with changed semantics is detected in the serialized Torchscript. This will prevent Torchscript programs that use the changed operator from loading on earlier versions of PyTorch, as desired, but will have no impact on programs that don't use the changed operator.

Note that this change is only applicable when using torch.jit.save and torch.jit.load. torch.save pickles the given object using pickle (by default), which saves a function's Python definition directly.
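
For illustration, a sketch of the dynamic-versioning decision (not the real serializer code; the operator names and version numbers in the table are hypothetical):

```
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Start from the base produced file format version and bump it only when an
// operator with changed semantics appears in the serialized TorchScript.
int64_t chooseProducedFileFormatVersion(
    const std::vector<std::string>& ops_used,
    int64_t base_version) {
  // Hypothetical table: minimum file format version each changed op requires.
  static const std::unordered_map<std::string, int64_t> min_version_for_op = {
      {"aten::div", 4},
      {"aten::full", 5},
  };
  int64_t version = base_version;
  for (const auto& op : ops_used) {
    auto it = min_version_for_op.find(op);
    if (it != min_version_for_op.end() && it->second > version) {
      version = it->second;
    }
  }
  return version;
}
```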

No new tests for this behavior are added since the existing tests for versioned division in test_save_load already validate that models with div are loaded correctly at version 4.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40279

Reviewed By: dzhulgakov

Differential Revision: D22168291

Pulled By: mruberry

fbshipit-source-id: e71d6380e727e25123c7eedf6d80e5d7f1fe9f95
2020-06-24 12:52:50 -07:00
Mikhail Zolotukhin
79450edad3 [JIT] IRParser: properly parse negative numbers. (#39981)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39981

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D22032786

Pulled By: ZolotukhinM

fbshipit-source-id: b6c5237ac5c1c331d5053a620eb9a37a4c698125
2020-06-15 12:28:41 -07:00
Jeremy Lilley
569c85b45d [futures] Add assert to Future constValue() accessor, add hasValue(). (#39950)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39950

Per the comment in the code, constValue() should only be used in
the case where the future was complete and value was not an error.
Add an assert to enforce this.

Also, add hasValue() accessor for completeness.
ghstack-source-id: 105815597
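
A small usage sketch of the accessors, assuming the c10::ivalue::Future API described above:

```
#include <ATen/core/ivalue.h>

void useCompletedFuture(const c10::intrusive_ptr<c10::ivalue::Future>& fut) {
  // constValue() is only valid once the future completed without an error;
  // the new assert enforces this, and hasValue() lets callers query it.
  if (fut->completed() && fut->hasValue()) {
    c10::IValue v = fut->constValue();
    // ... use v ...
  }
}
```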

Test Plan: buck test mode/dev-nosan caffe2/test/cpp/jit:

Differential Revision: D22021776

fbshipit-source-id: b59b6c775eab344068a76f4cd8c3a9dc1f2a174e
2020-06-15 12:11:22 -07:00
Jerry Zhang
004aa089a6 [jit][subgraph_rewriter] Support list of filters (#39867)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39867

Support a list of filters in the subgraph rewriter; the rewrite executes only
when the match passes all filter checks. This is useful for letting different matches
share the same filters.
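
A sketch of how a filter list might be used, assuming the subgraph rewriter's filter signature of (Match, value map) -> bool; the filter and the relu -> relu_ pattern below are illustrative:

```
#include <unordered_map>
#include <torch/csrc/jit/ir/ir.h>
#include <torch/csrc/jit/ir/subgraph_matcher.h>
#include <torch/csrc/jit/passes/subgraph_rewrite.h>

using namespace torch::jit;

// Hypothetical filter: decides whether the rewrite should fire for a match.
bool matchIsInSupportedBlock(
    const Match& match,
    const std::unordered_map<std::string, Value*>& vmap) {
  return true; // placeholder check
}

void rewriteReluToInplace(std::shared_ptr<Graph>& graph) {
  SubgraphRewriter rewriter;
  rewriter.RegisterRewritePattern(
      R"IR(
graph(%x):
  %y = aten::relu(%x)
  return (%y))IR",
      R"IR(
graph(%x):
  %y = aten::relu_(%x)
  return (%y))IR");
  // With this change the rewrite runs only when a match passes every filter.
  rewriter.runOnGraph(graph, {matchIsInSupportedBlock});
}
```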

Test Plan: Imported from OSS

Differential Revision: D22009855

fbshipit-source-id: 67aab8d6326b2011a9061397699dc62ee9ad4e2d
2020-06-12 08:24:49 -07:00
Christian Sarofeen
80e5ebf989 [nvFuser] Transform replay refactor and minor updates (#39579)
Summary:
We've got quite a few things going on, preparing a push back to upstream so we don't get too desynced.

- Major refactor of transform replay. It is now far more robust and fixes bugs discovered in reductions. Preparing for extension to explicit broadcast ops which will be the last major memory pattern for op coverage. Broadcast ops will allow us to express up to and potentially beyond norms and gemms.

- Initial runtime expression evaluator. This allows us to evaluate expressions at runtime and will be useful for determining our grid/block layout at runtime, so we don't have to manually compute it according to the code we're trying to generate.

- Moving to int64 and double for scalar representations to match PyTorch JIT.

- Improvements in the codegen interface, where we return a Tensor-like object instead of the parent class Val.

- Add `addcmul` and `lerp` ops

- General updates, fixes, test additions, and test improvements.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39579

Differential Revision: D21974001

Pulled By: soumith

fbshipit-source-id: 7f7ccc91593466e948f3ce90f8f9b7fbc5c28de2
2020-06-11 23:04:24 -07:00
Nikolay Korovaiko
7f55197a57 Peel Loop (#39434)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39434

Differential Revision: D21857037

Pulled By: Krovatkin

fbshipit-source-id: 6583da167fe93d96e93f1c3d71f46f94e7f4e982
2020-06-10 13:48:18 -07:00
Yanan Cao
c22bbb2124 [JIT] Add Type::repr_str to return human-readable str (#39544)
Summary:
Clearly indicating that a type was inferred by PyTorch rather than explicitly annotated by the user makes many error messages more user-friendly.

Currently Type has two string conversion methods: str() for IR printing and python_str() for serialization and error message generation. If we want to include more information in type printing while maintaining serialization/deserialization correctness, we need to split python_str() into annotation_str() and repr_str().

annotation_str() is solely responsible for serialization and strictly matches the format of a Python type annotation. repr_str() is responsible for generating human-readable error messages that can include information like "this type is inferred, not explicitly annotated".
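
A usage sketch of the split on c10::Type, assuming the no-argument overloads:

```
#include <string>
#include <ATen/core/jit_type.h>

void printTypeStrings() {
  c10::TypePtr t = c10::TensorType::get();
  // annotation_str() matches the Python annotation format and is used for
  // serialization; repr_str() may carry extra context (e.g. that the type
  // was inferred) for error messages.
  std::string for_serialization = t->annotation_str();
  std::string for_error_messages = t->repr_str();
}
```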

Closes https://github.com/pytorch/pytorch/issues/39449
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39544

Differential Revision: D21978759

Pulled By: gmagogsfm

fbshipit-source-id: 733566f5a62e748b5ca4bb3c5943ebb6d5b664d0
2020-06-10 12:01:24 -07:00
Elias Ellison
2193fa119e [JIT] consider side effects when trying moves in alias analysis (#39497)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39497

Previously, we didn't consider side effects at all when moving nodes in alias analysis. It is never valid to reorder a node with a side effect. This has led to bugs when used with Bailouts.

Unfortunately this might cause regressions, but the previous behavior wasn't correct :/

Test Plan: Imported from OSS

Differential Revision: D21963774

Pulled By: eellison

fbshipit-source-id: 656995d1b82534eca65437ed4e397b2bf08a4dec
2020-06-09 19:32:55 -07:00
Jeremy Lilley
be3bbfc917 [futures] Add collectAny() to ivalue::Future for completeness (#39597)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39597

To complement collectAll(), this change adds collectAny(), and writes
up relevant unittest coverage.

We also remove the vector-based helper version of collectAll(), which
was of debatable usefulness in a previous change.
ghstack-source-id: 105527180
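
A minimal sketch, assuming collectAny() mirrors collectAll() and lives in the c10 namespace: the returned future completes as soon as any of the inputs does.

```
#include <ATen/core/ivalue.h>

c10::intrusive_ptr<c10::ivalue::Future> firstCompleted(
    c10::List<c10::intrusive_ptr<c10::ivalue::Future>> futures) {
  return c10::collectAny(futures);
}
```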

Test Plan: buck test mode/dev-nosan caffe2/test/cpp/jit/...

Differential Revision: D21910311

fbshipit-source-id: dbb3ca404672a3d751b1b3cf016e6084a9ff8040
2020-06-09 16:32:52 -07:00
Jeremy Lilley
b83fed8d4c [futures] Add c++ ivalue::Future collectAll() helper (#39119)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39119

Add some base C++ unittest coverage for ivalue::Future and, in
the process, add a basic collectAll() primitive, per #38937.

In the process, I realized that List<Future> is effectively
impossible to construct (since the Future's type is not templated
but rather passed in, getTypePtr_<T>::call() isn't defined),
so I added a workaround in List to make it possible.
ghstack-source-id: 105309650
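
A sketch of the workaround in use: because Future's element type is passed in rather than templated, the List<Future> is constructed with an explicit element type (the IntType element type below is just an example).

```
#include <ATen/core/ivalue.h>
#include <ATen/core/jit_type.h>

c10::intrusive_ptr<c10::ivalue::Future> collectTwo(
    c10::intrusive_ptr<c10::ivalue::Future> a,
    c10::intrusive_ptr<c10::ivalue::Future> b) {
  c10::List<c10::intrusive_ptr<c10::ivalue::Future>> futures(
      c10::FutureType::create(c10::IntType::get()));
  futures.push_back(std::move(a));
  futures.push_back(std::move(b));
  // The combined future completes once every input future completes.
  return c10::collectAll(futures);
}
```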

Test Plan: buck test mode/dev-nosan caffe2/test/cpp/jit/...

Differential Revision: D21756884

fbshipit-source-id: 5d40c8d1c55098de5497655c7b887f4f56508a37
2020-06-08 05:52:09 -07:00
Linbin Yu
b28422d444 add overload name for str cmp (#39607)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39607

Add an overload name for the string comparison macro to prevent duplicated op names in the lite interpreter.

Also reformatted some other files.

Test Plan:
verified these op schema are changed

```
-aten::eq(str a, str b) -> (bool)
+aten::eq.str(str a, str b) -> (bool)

-aten::ne(str a, str b) -> (bool)
+aten::ne.str(str a, str b) -> (bool)

-aten::lt(str a, str b) -> (bool)
+aten::lt.str(str a, str b) -> (bool)

-aten::gt(str a, str b) -> (bool)
+aten::gt.str(str a, str b) -> (bool)

-aten::le(str a, str b) -> (bool)
+aten::le.str(str a, str b) -> (bool)

-aten::ge(str a, str b) -> (bool)
+aten::ge.str(str a, str b) -> (bool)
```

Reviewed By: iseeyuan

Differential Revision: D21913049

fbshipit-source-id: 518db068c8c5b0efd19223f0bd94fc3351335dc4
2020-06-06 23:21:35 -07:00
Jerry Zhang
3669e45736 [jit][subgraph_matcher] Enable regex matching for string attributes of node (#39454)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39454

Test Plan: Imported from OSS

Differential Revision: D21876224

fbshipit-source-id: c0fdff3a4532d2a73b222353e2cad6cf52444697
2020-06-05 23:03:38 -07:00
Nikolay Korovaiko
97a2918a07 reduce number of bailout nodes (#38281)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38281

Differential Revision: D21665509

Pulled By: Krovatkin

fbshipit-source-id: c2c34b759aec30d0a161e582030ba994192ee4ec
2020-06-05 13:45:37 -07:00
Ilia Cherniavskii
abe2be2063 [resubmit] Use TensorMethods.cpp (#39385)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39385

see https://github.com/pytorch/pytorch/pull/37639

Test Plan:
https://github.com/pytorch/pytorch/pull/37639

Imported from OSS

Differential Revision: D21833287

fbshipit-source-id: 9928d3f4122903d0de67ad312e349352d5f5c19c
2020-06-02 20:27:51 -07:00
Edward Yang
2fe0fc2684 Revert D21374247: Use TensorMethods.cpp
Test Plan: revert-hammer

Differential Revision:
D21374247

Original commit changeset: 076964415079

fbshipit-source-id: 732ec8c561d1f37475c1b5549ba79c718e3a6db8
2020-06-01 08:12:09 -07:00
Ilia Cherniavskii
68e62b9ab6 Use TensorMethods.cpp (#37639)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37639

Changing TensorMethods.h to .cpp
Necessary to avoid incomplete types in dispatcher

Test Plan:
CI

Imported from OSS

checked mobile size, no change, small reduction in size in fbios
fbios: Succeeded
Change in Download Size for arm64 + 3x assets variation: -18.2 KiB
Change in Uncompressed Size for arm64 + 3x assets variation: -8.8 KiB

reran benchmark, no stat. significant difference

buck run mode/opt caffe2/caffe2/fb/high_perf_models/pytorch/benchmark_framework_overheads:benchmark_torchscript_model -- --model_file caffe2/caffe2/fb/high_perf_models/pytorch/benchmark_framework_overheads/addmodule.pt --num_runs 3

╷ @  68592d0d  41 minutes ago  iliacher  D21374247
╭─╯  Use TensorMethods.cpp

Created 3 benchmark runs on aibench for caffe2/caffe2/fb/high_perf_models/pytorch/benchmark_framework_overheads/addmodule.pt.
Links to the results:

* Adhoc run: https://our.intern.facebook.com/intern/aibench/details/1729113760

* Adhoc run: https://our.intern.facebook.com/intern/aibench/details/3867976782

* Adhoc run: https://our.intern.facebook.com/intern/aibench/details/2782186766

hg prev

@  7f501b42  Thursday at 14:26  bvaughan  D21764704
╷  short-circuit pow for complex 1 and 0 exponents

Created 3 benchmark runs on aibench for caffe2/caffe2/fb/high_perf_models/pytorch/benchmark_framework_overheads/addmodule.pt.
Links to the results:

* Adhoc run: https://our.intern.facebook.com/intern/aibench/details/2155256332

* Adhoc run: https://our.intern.facebook.com/intern/aibench/details/1802057074

* Adhoc run: https://our.intern.facebook.com/intern/aibench/details/4119590830

Differential Revision: D21374247

fbshipit-source-id: 076964415079cf84fb57f1f7b43d087afed86e1d
2020-05-31 17:11:12 -07:00
Ilia Cherniavskii
a5e023f28a Set RecordFunction id only when needed (#39265)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39265

In this PR we set the id of RecordFunction only when callbacks need it and when
there's at least one active callback.

Test Plan:
testRecordFunction unit test in test_misc.cpp
buck test mode/dev caffe2/test/cpp/jit:jit

https://our.intern.facebook.com/intern/testinfra/testrun/8725724291116413

Reviewed By: dzhulgakov

Differential Revision: D21790421

fbshipit-source-id: 016623d7f1a2a271921a71c0483061e232b40321
2020-05-29 15:34:44 -07:00
lixinyu
a04fb2ab22 [Reland] add xenial + cuda 9.2 + gcc 5.4 CI test (#39036)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39036

Test Plan: Imported from OSS

Differential Revision: D21731026

Pulled By: glaringlee

fbshipit-source-id: ae678f786f95e3687ed6b3f176fe6736a436c721
2020-05-28 19:48:18 -07:00
Luca Antiga
e088902b4a Add type-hint check for default arguments in TorchScript C++ frontend (#39021)
Summary:
This PR fixes https://github.com/pytorch/pytorch/issues/39020 by requiring users to type-hint default arguments to a TorchScript function when using the C++ frontend (the Python frontend inserts those type hints automatically).

Since this is a bit of a niche use case, I opted for the simpler solution of making type-hints mandatory for default arguments, as opposed to trying to type-infer them. I left a comment in the code justifying this choice.

Test is included.
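
A minimal sketch of the rule from the C++ frontend side; the script source and function name are illustrative, and the default argument carries an explicit type hint as now required:

```
#include <iostream>
#include <torch/jit.h>
#include <torch/script.h>

int main() {
  auto cu = torch::jit::compile(R"JIT(
def scaled(x: Tensor, factor: float = 2.0) -> Tensor:
    return x * factor
)JIT");
  auto out = cu->run_method("scaled", torch::ones({2, 2}));
  std::cout << out.toTensor() << std::endl;
  return 0;
}
```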

/cc t-vi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39021

Differential Revision: D21755317

Pulled By: suo

fbshipit-source-id: e007650d3bfb3a4c58c25ad2c3a17759898f303b
2020-05-28 01:42:04 -07:00
Nikita Shulga
c6e9e9359f [Codemod][GleanFbcode] Remove dead includes in caffe2/test (#39023)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39023

Reviewed By: orionr

Differential Revision: D21702529

fbshipit-source-id: 6945bba95609102409850b105a8a091e33b8acc9
2020-05-27 14:07:26 -07:00
Christian Sarofeen
8e69c3be17 [nvFuser] Reduction support in codegen, fp16 support (#38627)
Summary:
Adds reduction support for the code generator. Reductions are fully supported with the split/merge/reorder/rfactor/computeAt/unroll operators. There is also cross-thread (intra-block) reduction support.

The two remaining pieces missing for reduction support are:
- Safety: If cross thread reduction was used, child operators shouldn't be able to bind that thread dim anymore
- Cross block reduction: we will want inter-block reduction support to match parity with tensor iterator

This PR also provides FP16 support for fusions: we insert casts from FP16 inputs to FP32, and insert casts back to FP16 on FP16 outputs.

Also working towards reductions and shape inference for reductions in the fusion pass.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38627

Reviewed By: albanD

Differential Revision: D21663196

Pulled By: soumith

fbshipit-source-id: 3ff2df563f86c39cd5821ab9c1148149e5172a9e
2020-05-21 17:18:39 -07:00
Jerry Zhang
a8d8fc5532 [quant][graphmode] Different rule for add/add_/mul/mul_ (#38667)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38667

Test Plan: Imported from OSS

Differential Revision: D21633555

fbshipit-source-id: 03b0298e83bf4dbda41b048c0edc7bb92cd4e1df
2020-05-20 19:43:46 -07:00
Michael Voznesensky
f6f1384811 [JIT] Refactor attributes to support buffers and parameters as first class citizens, add support for iterating over named_buffers() (#37905)
Summary:
First part of https://github.com/pytorch/pytorch/issues/36211 - still a WIP, but asking for commentary to ensure this is the direction we want to go in.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37905

Differential Revision: D21633735

Pulled By: voznesenskym

fbshipit-source-id: f4e4302e40114513776c9e48867a90d72049e2e9
2020-05-18 23:23:43 -07:00
Michael Suo
0d220ef381 [torchbind] Better error message when missing init. (#37474)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37474

Previously we would segfault

Test Plan: Imported from OSS

Differential Revision: D21297542

Pulled By: suo

fbshipit-source-id: c7e2f828a250c490ec23fb51c6a4a642d3370e52
2020-05-13 17:38:31 -07:00
Ilia Cherniavskii
43dd8760d7 Move ThreadLocalDebugInfo to c10 (#37774)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37774

Move ThreadLocalDebugInfo from ATen to C10

Test Plan: Imported from OSS

Differential Revision: D21384249

Pulled By: ilia-cher

fbshipit-source-id: f9b5089a868f84a2ee013695a481fcc883d3c6b2
2020-05-11 19:27:41 -07:00
James Reed
a553935e3c [JIT] Expose magic methods on script::Object (#38167)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38167

Test Plan: Imported from OSS

Differential Revision: D21486709

Pulled By: jamesr66a

fbshipit-source-id: 17b44d979fc658768b0d64f7d8af6fb684043ea3
2020-05-11 15:01:15 -07:00
Ilia Cherniavskii
facc5e0cc4 Make profiler thread local (#36291)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36291

Move profiler state to be a thread-local property and
reuse the existing thread-local propagation mechanism to ensure
correct profiling of async tasks. This also makes
push/pop callbacks thread safe and easier to use in e.g. the
distributed profiler.

Test Plan:
USE_BLAS=MKL USE_MKLDNN=0 USE_CUDA=0 python setup.py develop install
./build/bin/test_jit

./build/bin/test_jit
python test/test_autograd.py
python test/test_jit.py

Differential Revision: D20938501

Pulled By: ilia-cher

fbshipit-source-id: c0c6c3eddcfea8fc7c14229534b7246a0ad25845
2020-05-07 14:52:49 -07:00
Ilia Cherniavskii
2ef4010593 Propagate TLS callbacks with ThreadLocalState (#37745)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37745

This PR makes it possible to set TLS callbacks and use
them transparently not only in the main thread but also
in any async tasks

Test Plan: Imported from OSS

Differential Revision: D21374873

Pulled By: ilia-cher

fbshipit-source-id: 3be2e121673b32d7694e17e794f3b474826dffe9
2020-05-07 14:52:44 -07:00
Ilia Cherniavskii
2d708cefcc Move RecordFunction into ATen (#37548)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37548

Moving RecordFunction from torch::autograd::profiler into at namespace

Test Plan:
CI

Imported from OSS

Differential Revision: D21315852

fbshipit-source-id: 4a4dbabf116c162f9aef0da8606590ec3f3847aa
2020-05-07 14:52:39 -07:00
Ilia Cherniavskii
c24c5f9684 Make RecordFunction callbacks thread local and modernize interface (#37491)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37491

This PR modernizes the RecordFunction API and adds thread-local callbacks
in addition to the global ones.

Changes:
 - support for TLS callbacks; this is going to be the foundation of the profiler and other tools
 - modernize the interface around a simple set of functions, (add|remove|has|clear)(Global|ThreadLocal)(Callback), and add RecordFunctionCallback to easily construct the callbacks to be passed (see the sketch after this list)
 - we also add `.setShouldRun` to the callback interface to support cases where simple uniform sampling is not enough
 - to properly support add/remove, introduce the idea of a callback handle returned by add
 - the internal implementation still uses SmallVector to store intermediate state (as before); in this case it is a vector of handles of the callbacks that were picked to run
 - to speed up runtime we keep these vectors sorted; this way we can quickly enumerate the callbacks that need to be run
 - added tests for new functionality
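
A rough sketch of the modernized interface, assuming the callback names listed above; exact signatures at this revision may differ:

```
#include <ATen/record_function.h>

void installAndRemoveCallback() {
  auto handle = at::addThreadLocalCallback(
      at::RecordFunctionCallback(
          [](const at::RecordFunction& fn) {
            // start callback, e.g. inspect fn.name()
          },
          [](const at::RecordFunction& fn) {
            // end callback
          }));
  // ... run some ops while the callback observes them ...
  at::removeCallback(handle); // the handle returned by add enables removal
}
```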

Test Plan:
BUILD_BINARY=1 USE_BLAS=MKL USE_MKLDNN=0 USE_CUDA=0 python setup.py
develop install
./build/bin/test_jit
CI

record_function_benchmark: https://gist.github.com/ilia-cher/f1e094dae47fe23e55e7672ac4dcda2f

Imported from OSS

Differential Revision: D21300448

fbshipit-source-id: 6d55c26dbf20b33d35c3f1604dcc07bb063c8c43
2020-05-07 14:51:02 -07:00
jiej
1667aa6451 [CUDA_FUSER] Expand operation support for cuda fuser (#37849)
Summary:
This PR adds more supported operations to the CUDA fuser, covering the major point-wise operations supported in the legacy fuser.

In an attempt to adapt to the legacy executor:
1. added a naive shape propagation pass on PyTorch JIT IR;
2. small refactor on graph partitioning;
3. fallback interpreter execution of fusion group;
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37849

Reviewed By: yf225

Differential Revision: D21444320

Pulled By: soumith

fbshipit-source-id: 712e18ab8497f8d58a07e6f8d200cdab52cf0d74
2020-05-07 09:21:09 -07:00
Michael Suo
b53e6bfd49 [jit] normalize getMethod (#37472)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37472

Our convention is for `findX` to return an optional version and `getX`
to assert that the X is there. Fix up `getMethod` to be consistent with
this convention.
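
For illustration, the convention as it reads on torch::jit::Module (this commit applies the same convention to getMethod):

```
#include <torch/script.h>

void runForward(const torch::jit::Module& module) {
  // find_method returns an optional you can probe safely...
  if (auto m = module.find_method("forward")) {
    auto out = (*m)({torch::ones({1})});
  }
  // ...while get_method asserts that the method exists.
  torch::jit::Method fwd = module.get_method("forward");
}
```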

Test Plan: Imported from OSS

Differential Revision: D21297543

Pulled By: suo

fbshipit-source-id: b40f56231cc8183e61bbb01fe5c0c113bcb6464d
2020-05-06 15:22:25 -07:00
Jerry Zhang
1ad46f470f [jit] __copy__ for RecursiveScriptModule (#36830)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36830

Test Plan:
build/bin/test_jit

Imported from OSS

Differential Revision: D21431012

fbshipit-source-id: 13a1bf9744ec95ea59622226c8d8a8d55ec3f0b0
2020-05-06 13:55:01 -07:00
Nikolay Korovaiko
4ed790d742 Adding symbolic sizes, contiguity, stride indices (#36101)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36101

Reviewed By: jamesr66a

Differential Revision: D20908711

Pulled By: Krovatkin

fbshipit-source-id: f90ce74acffeb645d7d906d07e293164d65ed7e6
2020-05-01 02:01:25 -07:00