Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34985
IValue is part of the overall runtime system, not just the JIT. So it
should be tested in the ATen tests.
The real motivation though is so that I can use gtest directly, not the
hacked-up version the JIT uses.
Test Plan: Imported from OSS
Differential Revision: D20537902
Pulled By: suo
fbshipit-source-id: 09897e015ecde24aa8996babeaa08d98db90ef0d
Summary:
1. Removed LossClosureOptimizer, and merged Optimizer into OptimizerBase (and renamed the merged class to Optimizer)
2. Merged the LBFGS-specific serialize test function and the generic test_serialize_optimizer function.
3. Added a BC-compatibility serialization test for LBFGS
4. Removed mentions of parameters_ in optimizer.cpp, de-virtualized all functions
5. Made defaults_ an optional argument in all optimizers except SGD
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34957
Test Plan: Imported from GitHub, without a `Test Plan:` line.
Differential Revision: D20518647
Pulled By: anjali411
fbshipit-source-id: 4760d1d29df1784e2d01e2a476d2a08e9df4ea1c
Summary:
Follow-ups after this PR:
* Remove `LossClosureOptimizer`, and merge `Optimizer` into `OptimizerBase` (and rename the merged class to Optimizer)
* Merge the LBFGS-specific serialize test function and the generic `test_serialize_optimizer` function, possibly by passing a bool `has_only_global_state` flag into the `test_serialize_optimizer` function to denote whether `size()` should be equal to 1 or 2?
* https://github.com/pytorch/pytorch/pull/34564#discussion_r393780303
* It seems that we don't have the equivalent `XORConvergence_LBFGS` test like the other optimizers, and it would be good to add one
* Remove mentions of `parameters_` in optimizer.cpp, de-virtualize all functions, and remove the `OptimizerBase(std::vector<Tensor> parameters)` constructor from `OptimizerBase`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34564
Test Plan: Imported from GitHub, without a `Test Plan:` line.
Differential Revision: D20495701
Pulled By: anjali411
fbshipit-source-id: 6d35286d2decb6f7dff93d9d3e57515770666622
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34842
This PR (hopefully the last one of its kind) merges changes from a
side branch where the tensor-expression-based fuser work has been done so
far. This PR is a squashed version of the changes in the side branch,
which is available here: https://github.com/bertmaher/pytorch
Differential Revision: D20478208
Test Plan: Imported from OSS
Pulled By: ZolotukhinM
fbshipit-source-id: 21556e009f1fd88099944732edba72ac40e9b9c0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34228
This PR adds LLVM codegen to tensor expressions. LLVM is added as an
optional build dependency specified with `USE_LLVM=<path_to_llvm>`
variable. If this variable is not set or LLVM is not found in the
specified path, the LLVM codegen is completely disabled.
Differential Revision: D20251832
Test Plan: Imported from OSS
Pulled By: ZolotukhinM
fbshipit-source-id: 77e203ab4421eb03afc64f8da17e0daab277ecc2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34227
This PR adds CUDA support to tensor expressions.
Differential Revision: D20251836
Test Plan: Imported from OSS
Pulled By: ZolotukhinM
fbshipit-source-id: ab36a55834cceff30c8371fef6cca1054a32f017
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34224
Our development has been happening on a side branch `pytorch_fusion` in
the `bertmaher/pytorch` fork. This PR moves over changes to the core classes
representing expressions and transformations on them.
At this moment, the tensor expressions are only used in tests.
Subsequent PRs add LLVM and CUDA codegen for tensor expressions and
implement a fuser on top of these.
This PR is huge as it is a squashed version of changes in the side
branch. It is not practical to pull changes one by one from the branch,
so here is the squashed version. If you're interested in seeing the
history of changes, please refer to https://github.com/bertmaher/pytorch
Differential Revision: D20251835
Test Plan: Imported from OSS
Pulled By: ZolotukhinM
fbshipit-source-id: 1a871acc09cf3c6f7fb4af40d408cdbb82dc7dab
Summary:
This PR refactors RNN / GRU / LSTM layers in C++ API to exactly match the implementation in Python API.
**BC-breaking changes:**
- Instead of returning `RNNOutput`, RNN / GRU forward method now returns `std::tuple<Tensor, Tensor>`, and LSTM forward method now returns `std::tuple<Tensor, std::tuple<Tensor, Tensor>>`, matching Python API.
- RNN / LSTM / GRU forward method now accepts the same inputs (input tensor and optionally hidden state), matching Python API.
- RNN / LSTM / GRU layers now have `forward_with_packed_input` method which accepts `PackedSequence` as input and optionally hidden state, matching the `forward(PackedSequence, ...)` variant in Python API.
- RNN / LSTM / GRU layers no longer have these fields: `w_ih` / `w_hh` / `b_ih` / `b_hh`. Instead, to access the weights and biases of the gates, users should do e.g. `rnn->named_parameters()["weight_ih_l0"]`, which mirrors the Python API `rnn.weight_ih_l0`.
- In `RNNOptions`
  - `tanh()` / `relu()` / `activation` are removed. Instead, `nonlinearity` is added which takes either `torch::kTanh` or `torch::kReLU`
  - `layers` -> `num_layers`
  - `with_bias` -> `bias`
- In `LSTMOptions`
  - `layers` -> `num_layers`
  - `with_bias` -> `bias`
- In `GRUOptions`
  - `layers` -> `num_layers`
  - `with_bias` -> `bias`
The majority of the changes in this PR focus on refactoring the implementations in `torch/csrc/api/src/nn/modules/rnn.cpp` to match the Python API. The RNN tests are then changed to reflect the revised API design.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34322
Differential Revision: D20458302
Pulled By: yf225
fbshipit-source-id: ffff2ae1ddb1c742c966956f6ad4d7fba03dc54d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34280
To make prim ops searchable by the lite interpreter, overload names need to be added for operators that share a name but have different schemas, for example `aten::add` in register_prim_ops.cpp. The schemas differ in their combination of argument and output types.
`"aten::add(str a, str b) ->str"`
`"aten::add(int a, int b) ->int"`
`"aten::add(float a, float b) ->float"`
`"aten::add(int a, float b) ->float"`
`"aten::add(float a, int b) ->float"`
`"aten::add(Scalar a, Scalar b) ->Scalar"`
Solution:
Use the argument type and/or output type (the same as the existing overload names). The overload name should be as short as possible while still distinguishing the operators. For other operators, see the source code changes for details.
`"aten::add.str(str a, str b) ->str"`
`"aten::add.int(int a, int b) ->int"`
`"aten::add.float(float a, float b) ->float"`
`"aten::add.int_float(int a, float b) ->float"`
`"aten::add.float_int(float a, int b) ->float"`
`"aten::add.Scalar_Scalar(Scalar a, Scalar b) ->Scalar"`
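As a rough illustration of why the overload names matter for lookup, the sketch below parses the schema strings listed above and registers them in a dictionary keyed by `name.overload`. The helpers `parse_schema` and `build_registry` are hypothetical names made up for this example; this is not how the lite interpreter or register_prim_ops.cpp is implemented, only a minimal model of the ambiguity that the overload names remove.

```python
import re

# Schema strings copied from the commit message above.
OLD_SCHEMAS = [
    "aten::add(str a, str b) ->str",
    "aten::add(int a, int b) ->int",
    "aten::add(float a, float b) ->float",
    "aten::add(int a, float b) ->float",
    "aten::add(float a, int b) ->float",
    "aten::add(Scalar a, Scalar b) ->Scalar",
]
NEW_SCHEMAS = [
    "aten::add.str(str a, str b) ->str",
    "aten::add.int(int a, int b) ->int",
    "aten::add.float(float a, float b) ->float",
    "aten::add.int_float(int a, float b) ->float",
    "aten::add.float_int(float a, int b) ->float",
    "aten::add.Scalar_Scalar(Scalar a, Scalar b) ->Scalar",
]

# name, optional ".overload", argument list, return type.
SCHEMA_RE = re.compile(
    r"^(?P<name>[\w:]+)(?:\.(?P<overload>\w+))?"
    r"\((?P<args>.*)\)\s*->\s*(?P<ret>\w+)$"
)


def parse_schema(schema):
    """Split a schema string into (name, overload, arg_types, return_type)."""
    match = SCHEMA_RE.match(schema)
    if match is None:
        raise ValueError(f"cannot parse schema: {schema!r}")
    arg_types = tuple(
        arg.split()[0] for arg in match["args"].split(",") if arg.strip()
    )
    return match["name"], match["overload"] or "", arg_types, match["ret"]


def build_registry(schemas):
    """Hypothetical lookup table keyed by 'name.overload' (or just 'name')."""
    registry = {}
    for schema in schemas:
        name, overload, arg_types, ret = parse_schema(schema)
        key = f"{name}.{overload}" if overload else name
        if key in registry:
            # Two schemas map to the same key: lookup would be ambiguous.
            raise ValueError(f"ambiguous operator key: {key}")
        registry[key] = (arg_types, ret)
    return registry
```

With the old strings every schema maps to the key `aten::add`, so `build_registry(OLD_SCHEMAS)` raises; with the new strings each operator gets its own key, such as `aten::add.int_float`.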
Test Plan: Imported from OSS
Differential Revision: D20456997
Pulled By: iseeyuan
fbshipit-source-id: 2c3dc324b4a4e045559f62c6cc2a10fbb9a72dcf
Summary:
This PR refactors RNN / GRU / LSTM layers in C++ API to exactly match the implementation in Python API.
**BC-breaking changes:**
- Instead of returning `RNNOutput`, RNN / GRU forward method now returns `std::tuple<Tensor, Tensor>`, and LSTM forward method now returns `std::tuple<Tensor, std::tuple<Tensor, Tensor>>`, matching Python API.
- RNN / LSTM / GRU forward method now accepts the same inputs (input tensor and optionally hidden state), matching Python API.
- RNN / LSTM / GRU layers now have `forward_with_packed_input` method which accepts `PackedSequence` as input and optionally hidden state, matching the `forward(PackedSequence, ...)` variant in Python API.
- In `RNNOptions`
  - `tanh()` / `relu()` / `activation` are removed. Instead, `nonlinearity` is added which takes either `torch::kTanh` or `torch::kReLU`
  - `layers` -> `num_layers`
  - `with_bias` -> `bias`
- In `LSTMOptions`
  - `layers` -> `num_layers`
  - `with_bias` -> `bias`
- In `GRUOptions`
  - `layers` -> `num_layers`
  - `with_bias` -> `bias`
The majority of the changes in this PR focus on refactoring the implementations in `torch/csrc/api/src/nn/modules/rnn.cpp` to match the Python API. The RNN tests are then changed to reflect the revised API design.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34322
Differential Revision: D20311699
Pulled By: yf225
fbshipit-source-id: e2b60fc7bac64367a8434647d74c08568a7b28f7
Summary:
This PR adds `RNNCell` / `LSTMCell` / `GRUCell` layers to the C++ frontend, with implementations exactly matching the Python API equivalent.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34400
Differential Revision: D20316859
Pulled By: yf225
fbshipit-source-id: bb7cee092622334043c0d0fd0fcb4e75e707699c
Summary:
Now that lists are no longer specialized, we can register only one operator for list ops that are generic to their element type.
This PR reorgs lists into three sets of ops:
- CREATE_GENERIC_LIST_OPS
- CREATE_SPECIALIZED_LIST_OPS
- CREATE_COMPARATOR_LIST_OPS_SPECIALIZED (we didn't bind certain specialized ops to Tensor)
This is important to land quickly because mobile is finalizing its bytecode soon, after which we could not remove these ops.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34520
Reviewed By: iseeyuan
Differential Revision: D20429775
Pulled By: eellison
fbshipit-source-id: ae6519f9b0f731eaa2bf4ac20736317d0a66b8a0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34626
We need to check has_storage() before looking at it in
cloneSparseTensors(), to avoid gratuitously throwing.
Ideally, we'd add a test for this (I wrote one up but had to disable it),
but it won't work until the JIT Pickler supports sparse tensors.
ghstack-source-id: 100018077
Test Plan: buck test mode/dev-nosan caffe2/torch/fb/distributed/thriftRpcAgent/...
Differential Revision: D20399971
fbshipit-source-id: 5debfa8140eb1f949d37336330223962cc320abc
Summary:
Now that lists are no longer specialized, we can register only one operator for list ops that are generic to their element type.
This PR reorgs lists into three sets of ops:
- CREATE_GENERIC_LIST_OPS
- CREATE_SPECIALIZED_LIST_OPS
- CREATE_COMPARATOR_LIST_OPS_SPECIALIZED (we didn't bind certain specialized ops to Tensor)
This is important to land quickly because mobile is finalizing its bytecode soon, after which we could not remove these ops.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34520
Differential Revision: D20368543
Pulled By: eellison
fbshipit-source-id: ad0c6d70d2a6be6ff0e948d6786052167fc43e27
Summary:
This PR is BC-breaking in the following way:
- The deprecated `torch::nn::BatchNorm` is removed in favor of `torch::nn::BatchNorm{1,2,3}d`
- The deprecated `torch::nn::FeatureDropout` is removed in favor of `torch::nn::Dropout{2,3}d`
- The deprecated `torch::nn::modules_ordered_dict` is removed. Users should write `Sequential sequential({{"m1", MyModule(1)}, {"m2", MyModule(2)}})` instead.
- The deprecated `torch::nn::init::Nonlinearity` is removed, in favor of the following enums:
  - `torch::kLinear`
  - `torch::kConv1D`
  - `torch::kConv2D`
  - `torch::kConv3D`
  - `torch::kConvTranspose1D`
  - `torch::kConvTranspose2D`
  - `torch::kConvTranspose3D`
  - `torch::kSigmoid`
  - `torch::kTanh`
  - `torch::kReLU`
  - `torch::kLeakyReLU`
- The deprecated `torch::nn::init::FanMode` is removed, in favor of the following enums:
  - `torch::kFanIn`
  - `torch::kFanOut`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34508
Differential Revision: D20351601
Pulled By: yf225
fbshipit-source-id: cca0cd112f29a31bb023e348ca8f82780e42bea3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34515
Once upon a time we thought this was necessary. In reality it is not, so
removing it.
For backcompat, our public interface (defined in `api/`) still has
typedefs to the old `script::` names.
There was only one collision: `Pass` as a `Stmt` and `Pass` as a graph
transform. I renamed one of them.
Test Plan: Imported from OSS
Differential Revision: D20353503
Pulled By: suo
fbshipit-source-id: 48bb911ce75120a8c9e0c6fb65262ef775dfba93
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34588
I constructed the patch by deleting OperatorOptions and then rerouting
all queries for AliasAnalysisKind to FunctionSchema. Some of the
behavior is kind of bogus: we really shouldn't be mutating FunctionSchema
after the fact, but that won't get fixed until we actually switch to
true schema merging.
Reland of https://github.com/pytorch/pytorch/pull/34160
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D20387079
Pulled By: ezyang
fbshipit-source-id: d189f7a6ad8cd186b88b6fbfa3f189994eea14e8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34160
I constructed the patch by deleting OperatorOptions and then rerouting
all queries for AliasAnalysisKind to FunctionSchema. Some of the
behavior is kind of bogus: we really shouldn't be mutating FunctionSchema
after the fact, but that won't get fixed until we actually switch to
true schema merging.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D20282846
Pulled By: ezyang
fbshipit-source-id: ba7bca6e8adc3365789639b88e54c4e881b1692e
Summary:
Stacked PRs
* #33474 - [jit] Remove list specializations from pickler
* **#33255 - [jit] Add type tags to lists/dicts in pickle**
This adds a global call to `torch.jit._pickle.restore_type_tags` for
lists and dicts so that we can preserve their types after serialization.
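To illustrate why a restore call is needed, here is a minimal sketch of the idea using JSON as a stand-in for pickle. Everything in it (`TypedList`, `TAG_KEY`, `save`, `load`, and this `restore_type_tags` function) is a hypothetical stand-in written for this example and is not the actual `torch.jit._pickle` code. It only shows that an empty list loses its element type unless the serialized form records a tag that a restore function reattaches on load.

```python
import json

TAG_KEY = "__restore_type_tags__"  # hypothetical marker, not a real format


class TypedList(list):
    """A list that remembers its declared type, e.g. 'List[int]'."""

    def __init__(self, items, type_tag):
        super().__init__(items)
        self.type_tag = type_tag


def restore_type_tags(value, type_tag):
    """Stand-in for the restore call: reattach the tag after loading."""
    return TypedList(value, type_tag)


def encode(value):
    # Wrap tagged lists so the tag travels with the data.
    if isinstance(value, TypedList):
        return {TAG_KEY: [[encode(v) for v in value], value.type_tag]}
    if isinstance(value, list):
        return [encode(v) for v in value]
    if isinstance(value, dict):
        return {k: encode(v) for k, v in value.items()}
    return value


def decode_hook(obj):
    # Called by json for every decoded object, innermost first.
    if TAG_KEY in obj:
        items, type_tag = obj[TAG_KEY]
        return restore_type_tags(items, type_tag)
    return obj


def save(value):
    return json.dumps(encode(value))


def load(text):
    return json.loads(text, object_hook=decode_hook)
```

Under these assumptions, `load(save(TypedList([], "List[int]")))` comes back with `type_tag == "List[int]"`, whereas an untagged empty list carries no element type at all after a round trip.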
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33255
Pulled By: driazati
Differential Revision: D20346780
fbshipit-source-id: c8534954ef4adb2e3c880401acbee30cd284f3db
Summary:
**Summary**
There is often a need to create a Tensor when writing IR by hand for JIT
optimisation pass unit tests. The only options for this today are real
Tensor creation functions like `aten::ones`. Any test that uses these functions
must also use the same default arguments as the Python/C++ API, which means
that all of the tests have to be updated when the API is updated. This commit
introduces a new primitive, `prim::MakeTestTensor` with schema `() -> Tensor` that
should be used in unit tests instead of real Tensor creation functions. This new
primitive has no public-facing API, so the maintenance burden is much lower.
**Testing**
This commit updates the alias analysis and DCE tests to use `prim::MakeTestTensor` instead of
`aten::rand`, `aten::ones`, and `aten::zeros`.
```
$ ./bin/test_jit
CUDA not available. Disabling CUDA and MultiCUDA tests
Note: Google Test filter = *-*_CUDA:*_MultiCUDA
[==========] Running 75 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 75 tests from JitTest
[ RUN ] JitTest.ADFormulas
[ OK ] JitTest.ADFormulas (82 ms)
[ RUN ] JitTest.Attributes
[ OK ] JitTest.Attributes (0 ms)
...
...
...
[ RUN ] JitTest.LiteInterpreterPrim
[ OK ] JitTest.LiteInterpreterPrim (0 ms)
[ RUN ] JitTest.LiteInterpreterLoadOrigJit
[ OK ] JitTest.LiteInterpreterLoadOrigJit (2 ms)
[----------] 75 tests from JitTest (150 ms total)
[----------] Global test environment tear-down
[==========] 75 tests from 1 test case ran. (150 ms total)
[ PASSED ] 75 tests.
```
**Fixes**
This pull request fixes https://github.com/pytorch/pytorch/issues/33500.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34334
Differential Revision: D20296437
Pulled By: SplitInfinity
fbshipit-source-id: df4e7b0881ae4913424e5a409bfa171a61c3e568
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26125
We already have some AVX2-optimized implementations that improve quantized kernel performance. This diff enables runtime dispatch for them.
Test Plan:
Sandcastle build and test
Also test with a python binary calling into vectorized op.
torch.__config__.show()
PyTorch built with:
- GCC 4.2
- clang 8.0.20181009
- Intel(R) Math Kernel Library Version 2017.0.3 Product Build 20170413 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v0.18.1 (Git Hash N/A)
- OpenMP 1
- **CPU capability usage: AVX2**
- Build settings:
Reviewed By: jamesr66a
Differential Revision: D17337251
fbshipit-source-id: 8e22d10011a12a4eaf54cea3485353eb1811d828
Summary:
**This PR is BC-breaking in the following way:**
In RMSpropOptions:
1. learning_rate is renamed to lr.
**Test plan before 1.5 release:**
Test that in 1.5 we can load a C++ RMSprop optimizer that was serialized in 1.4, and their states are the same.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33450
Differential Revision: D20366623
Pulled By: anjali411
fbshipit-source-id: 83250be9b583a766927e0e22a4de8b0765379451
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33807
afaik this is unused, so removing it from the source tree. RIP :(
Test Plan: Imported from OSS
Differential Revision: D20122118
Pulled By: suo
fbshipit-source-id: cb45943f5b9f969482301a2f9fe540326dbc78f2
Summary:
One example in the current docs for `torch::nn::ModuleList` doesn't compile, and this PR fixes it.
Fixes https://github.com/pytorch/pytorch/issues/32414.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34463
Test Plan: Imported from GitHub, without a `Test Plan:` line.
Differential Revision: D20331120
Pulled By: yf225
fbshipit-source-id: 50bb078fe1a900c9114d5434e92dc40ee13b52bf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34099
This change applies to IValue's future impl a few fixes
we discovered when using the torch::utils::Future<T> impl.
The parallel impls should probably eventually be merged, but until then:
- Don't hold the lock when invoking the callbacks. Holding it makes
it effectively impossible (it deadlocks) to call value() to get
the value from inside the callback.
- We discovered that it was slightly cleaner in practice to
notify condition variables prior to invoking callbacks
(best to unblock paused threads ASAP, before spawning new work).
- Fix some var naming inconsistency.
- Add some caffe2 cpp test coverage.
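The first two points can be sketched as follows. This is a minimal Python model written for illustration, not the IValue future implementation; the class `SketchFuture` and its method names are assumptions. It uses a non-reentrant lock to mimic a plain mutex, wakes waiters before running callbacks, and runs callbacks only after the lock is released so that a callback can call `value()`.

```python
import threading


class SketchFuture:
    """Hypothetical future used only to illustrate the locking order."""

    def __init__(self):
        # A plain Lock (not the default RLock) so that re-acquiring it from
        # a callback would deadlock, as it would with a non-recursive mutex.
        self._cond = threading.Condition(threading.Lock())
        self._completed = False
        self._value = None
        self._callbacks = []

    def mark_completed(self, value):
        with self._cond:
            if self._completed:
                raise RuntimeError("future already completed")
            self._value = value
            self._completed = True
            # Take ownership of the callbacks while still holding the lock.
            callbacks, self._callbacks = self._callbacks, []
            # Wake blocked waiters first, before any callback work starts.
            self._cond.notify_all()
        # The lock is released here, so callbacks may call value() safely.
        for callback in callbacks:
            callback()

    def add_callback(self, callback):
        with self._cond:
            if not self._completed:
                self._callbacks.append(callback)
                return
        # Already completed: run immediately, again without the lock.
        callback()

    def wait(self):
        with self._cond:
            while not self._completed:
                self._cond.wait()

    def value(self):
        with self._cond:
            if not self._completed:
                raise RuntimeError("future not completed yet")
            return self._value
```

If the `for callback in callbacks` loop were moved inside the `with` block, a callback that calls `value()` would block forever on the lock it is already running under.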
ghstack-source-id: 99336569
Test Plan:
```
buck test mode/dev //caffe2/test/cpp/jit:jit -- 'JitTest\.IValueFuture'
```
Differential Revision: D20203278
fbshipit-source-id: 6e805ba547899dab9aab458e4b23049db31f930e
Summary:
The init-list form of `at::indexing::Slice` (i.e. `tensor.index({{1, None, 2}, ...})` instead of `tensor.index({Slice(1, None, 2), ...})`) in the C++ API can easily be confused with list-form indexing in the Python API (e.g. `tensor[[1, 3, 2], ...]`), which hurts readability. This PR removes the init-list form of `at::indexing::Slice` to make the API less confusing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34255
Test Plan: Imported from GitHub, without a `Test Plan:` line.
Differential Revision: D20290166
Pulled By: yf225
fbshipit-source-id: abbcbeca0b179219e5e1f196a33ef8aec87ebb76
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34122
Earlier work added support for async RPC cases in which RecordFunction's
end callbacks might be called in a different thread; in addition, some
extra care was needed to handle the pointer to the parent function.
This PR makes RecordFunction aware of potentially multiple threads in
use, removes the unused parent() call, and restricts current()
RecordFunction to scope-based record functions (RECORD_FUNCTION macro).
Test Plan: unit tests
Differential Revision: D20297709
Pulled By: ilia-cher
fbshipit-source-id: 46a59e1b2eea0bbd8a59630385e193b38d30f9d1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33294
1. Serialize bytecode of __setstate__ and run it when loading the model.
2. One use case is quantization. To test this use case a few operators are registered temporarily for lite interpreter. The "_" prefix registration will be removed when the operators are all migrated to mobile.
Test Plan: Imported from OSS
Differential Revision: D20162898
Pulled By: iseeyuan
fbshipit-source-id: 7a3180807bf38fbce594d86993896861f12bb58c
Summary:
**Summary**
There is often a need to create a Tensor when writing IR by hand for JIT
optimisation pass unit tests. The only options for this today are real
Tensor creation functions like `aten::ones`. Any test that uses these functions
must also use the same default arguments as the Python/C++ API, which means
that all of the tests have to be updated when the API is updated. This commit
introduces a new primitive, `prim::MakeTestTensor` with schema `() -> Tensor` that
should be used in unit tests instead of real Tensor creation functions. This new
primitive has no public-facing API, so the maintenance burden is much lower.
**Testing**
This commit updates the alias analysis and DCE tests to use `prim::MakeTestTensor` instead of
`aten::rand`, `aten::ones`, and `aten::zeros`.
```
$ ./bin/test_jit
CUDA not available. Disabling CUDA and MultiCUDA tests
Note: Google Test filter = *-*_CUDA:*_MultiCUDA
[==========] Running 75 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 75 tests from JitTest
[ RUN ] JitTest.ADFormulas
[ OK ] JitTest.ADFormulas (82 ms)
[ RUN ] JitTest.Attributes
[ OK ] JitTest.Attributes (0 ms)
...
...
...
[ RUN ] JitTest.LiteInterpreterPrim
[ OK ] JitTest.LiteInterpreterPrim (0 ms)
[ RUN ] JitTest.LiteInterpreterLoadOrigJit
[ OK ] JitTest.LiteInterpreterLoadOrigJit (2 ms)
[----------] 75 tests from JitTest (150 ms total)
[----------] Global test environment tear-down
[==========] 75 tests from 1 test case ran. (150 ms total)
[ PASSED ] 75 tests.
```
**Fixes**
This pull request fixes https://github.com/pytorch/pytorch/issues/33500.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33914
Differential Revision: D20150304
Pulled By: SplitInfinity
fbshipit-source-id: c88f5289055a02dc20b7a5dcdf87469f9816d020