Commit graph

418 commits

Author SHA1 Message Date
Onyiee
ae11264583 Fixed type checking errors in node.py (#68124)
Summary:
Fixes [issue#67](https://github.com/MLH-Fellowship/pyre-check/issues/67)
This PR fixes the type checking errors in PyTorch's torch/fx/node.py.
The variables at 363:20 and 364:20 were declared with type `List[str]` but were assigned a value of `None`, causing an incompatible variable type error. Changing the type from `List[str]` to `Optional[List[str]]` fixes the error.
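
A minimal sketch of the pattern being fixed (variable name hypothetical):

```
from typing import List, Optional

# Before (rejected by the type checker): a List[str] annotation cannot hold None.
# normalized_args: List[str] = None

# After: Optional makes the None assignment legal.
normalized_args: Optional[List[str]] = None
```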

Signed-off-by: Onyemowo Agbo
cc onionymous 0xedward

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68124

Reviewed By: gmagogsfm

Differential Revision: D32322414

Pulled By: onionymous

fbshipit-source-id: be11bbbd463715ddf28a5ba78fb4adbf62878c80
2021-12-03 12:03:49 -08:00
Michael Suo
0aa9d177fe [fx] remove CPatcher (#69032)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69032

I am removing it because, for packaging-related reasons, it's easier if
torch.fx is a pure Python module.

I don't think there is much reason to keep it: this functionality was
experimental, has no known users currently, and we didn't have a clear
path to turning it on by default due to regressions in tracing
performance. Also, it was only ever enabled for `rand` and friends.

Technically, the removal of the `enable_cpatching` argument on
`symbolic_trace` and `Tracer.__init__` is BC-breaking, but the
docstrings clearly state that the argument is experimental and BC is not
guaranteed, so I think it's fine.
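
A minimal sketch of the BC-breaking signature change (the module below is illustrative):

```
import torch
import torch.fx

class M(torch.nn.Module):
    def forward(self, x):
        return x + 1

# Before this change: torch.fx.symbolic_trace(M(), enable_cpatching=True)
# After: the experimental argument is gone; plain symbolic tracing is unchanged.
traced = torch.fx.symbolic_trace(M())
print(traced.code)
```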

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D32706344

Pulled By: suo

fbshipit-source-id: 501648b5c3610ae71829b5e7db74e3b8c9e1a480
2021-11-30 11:59:57 -08:00
Kefei Lu
76e9dbb0f4 [torch.fx] add code-gen customizability and support for setting breakpoint in code-gen'd forward() call (#67139)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67139

This diff enables setting a breakpoint in the graph module's generated Python code. See the test plan for usage.

To support this functionality, and other similar customizations of the generated code, a code-transformer hook is added to `fx.Graph`. This allows flexible customization of `fx.Graph`'s code-gen behavior in composable and functional ways. See the test plan for its usage.

Test Plan:
### Use of `fx.experimental.debug.set_trace`

```
In [2]: from torch.fx.experimental.debug import set_trace

In [3]: set_trace(ttop)
Out[3]:
top(
  (a): Sub()
)

In [4]: ttop(1)
> /data/users/kefeilu/fbsource33/fbcode/buck-out/dev/gen/caffe2/torch/fb/fx2trt/<eval_with_key>.10(6)forward()
(Pdb) l
  1
  2
  3
  4     def forward(self, x):
  5         import pdb; pdb.set_trace()
  6  ->     a = self.a(x);  x = None
  7         getitem = a[0]
  8         getitem_1 = a[0];  a = None
  9         add = getitem + getitem_1;  getitem = getitem_1 = None
 10         return add
 11
(Pdb)
```

### Use of `on_generate_code`

```
In [1]: def insert_pdb(body):
   ...:     return ['import pdb; pdb.set_trace()\n', *body]
   ...:

In [8]: type(ttop)
Out[8]: torch.fx.graph_module.GraphModule.__new__.<locals>.GraphModuleImpl

In [10]: with ttop.graph.on_generate_code(lambda _: insert_pdb):
    ...:     ttop.recompile()
    ...:     print(f"== _on_generate_code should not be None: { ttop.graph._on_generate_code }")
    ...:     print(ttop.code)
    ...:

== _on_generate_code should not be None: <function insert_pdb at 0x7fc9895ddd30>

def forward(self, x):
    import pdb; pdb.set_trace()
    a = self.a(x);  x = None
    getitem = a[0]
    getitem_1 = a[0];  a = None
    add = getitem + getitem_1;  getitem = getitem_1 = None
    return add

In [11]: ttop.graph._on_generate_code  # restored to None

In [12]: ttop(1) # this should drop into pdb
> /data/users/kefeilu/fbsource33/fbcode/buck-out/dev/gen/caffe2/torch/fb/fx2trt/<eval_with_key>.6(6)forward()
(Pdb) l
  1
  2
  3
  4     def forward(self, x):
  5         import pdb; pdb.set_trace()
  6  ->     a = self.a(x);  x = None
  7         getitem = a[0]
  8         getitem_1 = a[0];  a = None
  9         add = getitem + getitem_1;  getitem = getitem_1 = None
 10         return add
 11
```

Reviewed By: jamesr66a

Differential Revision: D30736160

fbshipit-source-id: 9646867aae0461b5131dfd4ba9ee77a8c2ea9c93
2021-11-16 13:28:11 -08:00
liulixinkerry
257239972c Fix attr_to_scope's key in torch/utils/tensorboard/_pytorch_graph.py (#65692)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/65652

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65692

Reviewed By: Reubend

Differential Revision: D31678606

Pulled By: edward-io

fbshipit-source-id: 7c0bf740ee4f8c21bd01ced3ae70df23c9efadfb
2021-10-20 14:35:29 -07:00
Jerry Zhang
3d6d4f4322 [fx2trt][quant] Add lowering support for per channel quantization in fx2trt (#64787)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64787

This PR added support for lowering per-channel quantization and dequantization operators
in fx2trt. It also extends TensorMeta with extra arguments corresponding to per-channel quantized Tensors.
Initially I was thinking of adding a qparam that can capture everything, but currently we still have some lowering support
for fbgemm ops (which take scale and zero_point in the operator interface). I think we can move everything to qparams
after we deprecate lowering support for fbgemm ops in the future.
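
For reference, a minimal sketch of the per-channel quantize/dequantize ops being lowered (shapes and qparams are illustrative, not from this diff):

```
import torch

x = torch.randn(2, 3)
scales = torch.tensor([0.1, 0.2, 0.3])
zero_points = torch.zeros(3, dtype=torch.long)

# Quantize per channel along axis=1, then dequantize.
q = torch.quantize_per_channel(x, scales, zero_points, axis=1, dtype=torch.qint8)
print(q.dequantize())
```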

Test Plan:
Test for per channel weight:
```
python torch/fx/experimental/fx2trt/example/quantized_resnet_test.py
```

change BC compatibility test expect for TensorMeta
```
python test/test_fx.py TestFXAPIBackwardCompatibility.test_class_member_back_compat --accept
```

Imported from OSS

Reviewed By: jfix71, mrshenli, 842974287

Differential Revision: D30879848

fbshipit-source-id: 76c3804bb1d9343183ae53d9f02c1a3bf6c79e1c
2021-09-30 18:54:14 -07:00
Ansley Ussery
6831d8e379 Support Union in TorchScript (#64234)
Summary:
This PR replaces the https://github.com/pytorch/pytorch/pull/53180 PR stack, which has all the review discussions; a replacement was needed due to a messy Sandcastle issue.
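
A minimal sketch of what this enables (assuming PyTorch >= 1.10):

```
import torch
from typing import Union

@torch.jit.script
def describe(x: Union[int, str]) -> str:
    # isinstance checks refine the Union to a concrete type.
    if isinstance(x, int):
        return "int: " + str(x)
    return "str: " + x

print(describe(3), describe("hi"))
```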

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64234

Reviewed By: gmagogsfm

Differential Revision: D30656444

Pulled By: ansley

fbshipit-source-id: 77536c8bcc88162e2c72636026ca3c16891d669a
2021-09-03 06:12:24 -07:00
James Reed
e1c3e5f830 [resubmit][FX] Prototype for guarding against mutable operations in tracing (#64467)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64467

Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D30744870

Pulled By: jamesr66a

fbshipit-source-id: fc652f8b17748f90dbeb83fabf3bd5bb57d6ff1a
2021-09-02 21:13:21 -07:00
Eli Uriegas
32a93c2424 Revert D30675780: [FX] Prototype for guarding against mutable operations in tracing
Test Plan: revert-hammer

Differential Revision:
D30675780 (795387477f)

Original commit changeset: b2116b51dcc8

fbshipit-source-id: d4f1173f4989556ea54974f4c2739ef85a705fae
2021-09-02 16:07:29 -07:00
James Reed
795387477f [FX] Prototype for guarding against mutable operations in tracing (#64295)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64295

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D30675780

Pulled By: jamesr66a

fbshipit-source-id: b2116b51dcc87357f0c84192c4c336680875e27a
2021-09-02 15:17:04 -07:00
Michael Dagitses
b737629ff0 simplify op name determination into a single forward pass (#64261)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64261

Note that this does not preserve byte-for-byte compatibility with
existing names.

Test Plan:
* Rely on CI to catch gross errors.
* Merge after release cut to catch subtle issues.

Reviewed By: albanD

Differential Revision: D30700647

Pulled By: dagitses

fbshipit-source-id: 7b02f34b8fae3041240cc78fbc6bcae498c3acd4
2021-09-02 07:32:11 -07:00
James Reed
0c4e4e588e [FX] Rename reduce functions back to their old, public names (#64324)
Summary:
Unfortunately, pickle serializes these functions by name, so the earlier rename broke previously pickled modules; this renames them back. They are also now put under backward-compatibility enforcement.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64324

Test Plan: Local repro https://fb.workplace.com/groups/3440841732711443/permalink/4018921611570116/

Reviewed By: SplitInfinity, TailofJune

Differential Revision: D30684185

Pulled By: jamesr66a

fbshipit-source-id: 900701220155d15115cd0c07cf7774a2891bd04f
2021-08-31 22:36:11 -07:00
James Reed
538647fe1f [WIP][FX] BC guarantees for 1.10 (#63888)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63888

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D30523133

Pulled By: jamesr66a

fbshipit-source-id: b04cc0d842a74862f42ecba98b757310cd2ec7b0
2021-08-30 19:56:46 -07:00
Zhengxu Chen
51af772937 [jit] Set debug name for value coming out of GetAttr nodes. (#59123)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59123

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D28766023

fbshipit-source-id: 0919f4318fb5a7b1d5adc8f976dfc9309e233d13
2021-06-09 12:24:55 -07:00
Alexander
b435a27fb7 CUDA support in the CSR layout: constructors (#59010)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59010

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D28719287

Pulled By: bhosmer

fbshipit-source-id: fbb5784ccb5ce19dcca1f2f95c4ee16f9b7680c4
2021-05-26 16:39:43 -07:00
Alban Desmaison
032d6b0643 Revert D28112689: CUDA support in the CSR layout: constructors
Test Plan: revert-hammer

Differential Revision:
D28112689 (1416e57465)

Original commit changeset: f825cd4bce40

fbshipit-source-id: 421fc590797ac5fab6a55ac6f213361fbba7cd5b
2021-05-26 06:15:05 -07:00
Alexander
1416e57465 CUDA support in the CSR layout: constructors (#57274)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57274

Test Plan: Imported from OSS

Reviewed By: astaff

Differential Revision: D28112689

Pulled By: bhosmer

fbshipit-source-id: f825cd4bce402dd4c3f71db88854f77830b687b8
2021-05-26 01:36:20 -07:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
b0c27b44cf Enable backward/forward compatibility for TS runtime (#57498)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57498

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D28162448

Pulled By: tugsbayasgalan

fbshipit-source-id: 5c21ced42a22aca7cee089e876e9d98d32f68955
2021-05-07 15:41:45 -07:00
Alexander
0d41122e61 Eliminate global usage of torch.set_default_dtype in sparse test (#56393)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56393

Fixes for  gh-56369

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D27913266

Pulled By: mruberry

fbshipit-source-id: 2c590d3a2188aae251184f08c1a6a2c4c570d150
2021-04-27 15:23:14 -07:00
Alexander
18c89a904b Modernize test-suite in sparse tensor CSR (#56392)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56392

Fixes for gh-56371 and gh-56369

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D27913212

Pulled By: mruberry

fbshipit-source-id: 2c78fe9fa4b6c6b566d9eb01f71e6016d672a545
2021-04-27 15:22:17 -07:00
James Reed
68e0796466 [JIT][write path] Make NoneType annotation_str emit NoneType instead of None (#54746)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54746

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D27350331

Pulled By: jamesr66a

fbshipit-source-id: 3f44d6589c29f39378432d0b6b281d96bb4829e7
2021-04-12 17:36:45 -07:00
Alexander
6ee333cdb5 modernize test_sparse (#54572)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54572

Adding device-generic tests to `test_sparse`.
Follow-up PR: #54153

I think this is ready for review.
Looking forward to your comments. cc mruberry

Thanks

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D27562663

Pulled By: mruberry

fbshipit-source-id: c48973e707f779b529bc7f61b75103194b428987
2021-04-09 12:19:29 -07:00
Siqi Yan
317ff429d3 [TB] Support writing new style scalar (#53496)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53496

New style vs old style
b306651ab5/tensorboard/data_compat.py (L49-L53)

Writing in new style can help avoid the cost of migration
b306651ab5/tensorboard/data_compat.py (L46)

----

Test Plan:
buck run caffe2/test:tensorboard

 ---

Reviewed By: edward-io

Differential Revision: D26879076

fbshipit-source-id: 43cfe9e1ca52dad3efc10332715d39f1cc984862
2021-03-12 19:03:13 -08:00
Elias Ellison
752d808fa0 Trace linear as aten::linear (#51897)
Summary:
https://github.com/pytorch/pytorch/pull/51613 made `torch.nn.functional.linear` compile as `aten::linear`; this PR extends the same behavior to tracing.
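
A minimal sketch of the traced behavior (the exact graph output varies by version):

```
import torch
import torch.nn.functional as F

def f(x, w, b):
    return F.linear(x, w, b)

traced = torch.jit.trace(f, (torch.randn(2, 3), torch.randn(4, 3), torch.randn(4)))
# With this change, the trace should record a single aten::linear call
# instead of a decomposed matmul/add.
print(traced.graph)
```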

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51897

Reviewed By: albanD

Differential Revision: D26320711

Pulled By: eellison

fbshipit-source-id: a26d3c37323a0706313c6ebb210bad60eec6a64b
2021-02-19 10:20:42 -08:00
Yanan Cao
705fa7e964 [Usability] Capture argument names for traced functions and modules (#51775)
Summary:
Previously `torch.jit.trace` relied on AutoGrad hooks to infer the names of tensors in the computation, including those of function/method arguments. This often doesn't work out because:

- These names often do not exist
- The tracer uses the argument name of the first tensor operation on each tensor as the inferred argument name, and these tensor operations have programmatically generated names like `argument_1`

This PR extracts argument names directly from Python functions and passes them down to the tracer, which then assigns them to the correct graph inputs. This way, we always have the correct argument names captured in the IR.

This is useful both for debugging and for supporting the use of `InterfaceType` to represent traced modules.
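
A minimal sketch of the effect (argument names are illustrative):

```
import torch

def scale(input_tensor, weight):
    return input_tensor * weight

traced = torch.jit.trace(scale, (torch.randn(3), torch.randn(3)))
# Graph inputs should now carry the Python argument names
# (input_tensor, weight) rather than inferred names like argument_1.
print(traced.graph)
```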

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51775

Reviewed By: izdeby

Differential Revision: D26273105

Pulled By: gmagogsfm

fbshipit-source-id: 934a385041137dc3731bb6fa8657b11532fed9e5
2021-02-10 18:28:08 -08:00
albanD
ccd646696b Fix Module backward hooks for all Tensor inputs/outputs (#46163)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/598

This is BC-breaking as we now explicitly don't call the hook when there are no Tensors at the top level of the output.
This feature was not working anyway, as the returned grad_input/grad_output were wrong (not respecting the output structure, and giving wrong inputs for multi-Node Modules).

This is also BC-breaking as we now report the correct gradients for `nn.Module`s that contain multiple autograd `Node`s, whereas we used to return incorrect results before.
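
A minimal sketch of the hook behavior being fixed (hedged: assumes `register_full_backward_hook` is the relevant entry point in this version):

```
import torch
import torch.nn as nn

def hook(module, grad_input, grad_output):
    # grad_input/grad_output now match the module's Tensor inputs/outputs,
    # even when the module contains multiple autograd Nodes.
    print([None if g is None else tuple(g.shape) for g in grad_input])

m = nn.Sequential(nn.Linear(3, 4), nn.ReLU(), nn.Linear(4, 2))
m.register_full_backward_hook(hook)
m(torch.randn(5, 3, requires_grad=True)).sum().backward()
```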

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46163

Reviewed By: ailzhang, mruberry

Differential Revision: D24894180

Pulled By: albanD

fbshipit-source-id: e1b5d193d2818eb2f51e2a2722c7405c8bd13c2b
2020-12-18 09:04:36 -08:00
Yanan Cao
bdcf320bed Support custom exception message (#41907)
Summary:
Raise and assert used to have a hard-coded error message, "Exception"; the user-provided error message was ignored. This PR adds support for representing the user's error message in TorchScript.

This breaks backward compatibility because now we actually need to script the user's error message, which can potentially contain unscriptable expressions. Such programs can break when scripting, but saved models can still continue to work.

This increases an op count in test_mobile_optimizer.py because we now need aten::format to form the actual exception message.

This is built upon a WIP PR: https://github.com/pytorch/pytorch/pull/34112 by driazati
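
A minimal sketch of the newly supported behavior (error text is illustrative):

```
import torch

@torch.jit.script
def check(x: int) -> int:
    if x < 0:
        # The message below is now scripted (via aten::format) and preserved.
        raise ValueError("expected a non-negative value, got {}".format(x))
    return x

try:
    check(-1)
except Exception as e:
    print(e)
```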

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41907

Reviewed By: ngimel

Differential Revision: D22778301

Pulled By: gmagogsfm

fbshipit-source-id: 2b94f0db4ae9fe70c4cd03f4048e519ea96323ad
2020-08-01 13:03:45 -07:00
Stanislau Hlebik
b774ce54f8 remediation of S205607
fbshipit-source-id: 798decc90db4f13770e97cdce3c0df7d5421b2a3
2020-07-17 17:19:47 -07:00
Stanislau Hlebik
8fdea489af remediation of S205607
fbshipit-source-id: 5113fe0c527595e4227ff827253b7414abbdf7ac
2020-07-17 17:17:03 -07:00
Zino Benaissa
690946c49d Generalize constant_table from tensor only to ivalue (#40718)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40718

Currently, all constants except tensors must be inlined during serialization;
tensors are stored in the constant table. This patch generalizes this capability
to any IValue. This is particularly useful for non-ASCII string literals, which
cannot be inlined.
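
A minimal sketch of the case this enables (hedged; the save/load path is illustrative):

```
import io
import torch

@torch.jit.script
def greet() -> str:
    # A non-ASCII literal cannot be inlined into the generated source,
    # so it now goes through the generalized constant table.
    return "こんにちは"

buf = io.BytesIO()
torch.jit.save(greet, buf)
buf.seek(0)
print(torch.jit.load(buf)())
```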

Test Plan: Imported from OSS

Differential Revision: D22298169

Pulled By: bzinodev

fbshipit-source-id: 88cc59af9cc45e426ca8002175593b9e431f4bac
2020-07-09 09:09:40 -07:00
Natalia Gimelshein
502ec8f7f7 Revert D22227939: [TB] Add support for hparam domain_discrete
Test Plan: revert-hammer

Differential Revision:
D22227939 (4c25428c8c)

Original commit changeset: d2f0cd8e5632

fbshipit-source-id: c4329fcead69cb0f3d368a254d8756fb04be742d
2020-06-27 22:20:31 -07:00
Siqi Yan
4c25428c8c [TB] Add support for hparam domain_discrete
Summary: Add support for populating domain_discrete field in TensorBoard add_hparams API
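
A minimal usage sketch (assuming the public kwarg is `hparam_domain_discrete`; values are illustrative):

```
from torch.utils.tensorboard import SummaryWriter

with SummaryWriter() as w:
    w.add_hparams(
        {"optimizer": "sgd", "lr": 0.1},
        {"hparam/loss": 0.3},
        hparam_domain_discrete={"optimizer": ["sgd", "adam"]},
    )
```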

Test Plan: Unit test test_hparams_domain_discrete

Reviewed By: edward-io

Differential Revision: D22227939

fbshipit-source-id: d2f0cd8e5632cbcc578466ff3cd587ee74f847af
2020-06-27 14:07:24 -07:00
Michael Carilli
8066fba226 [RELAND2] Change AccumulateGrad to yield .grads that match weights' memory layout (#40358)
Summary:
https://github.com/pytorch/pytorch/pull/40129 fixed the error responsible for the first revert, but exposed another error in the same test.

This PR is intended as the "master copy" for merge, and it runs on full CI.
Two other PRs (restricted to run on a small subset of CI) support debugging DDP failures/hangs with multiple devices per process (`test_c10d.py:DistributedDataParallelTest.test_grad_layout_1devicemodule_2replicaperprocess`):
- https://github.com/pytorch/pytorch/pull/40290 tries the test with purely rowmajor-contiguous params on an untouched master.  In other words, https://github.com/pytorch/pytorch/pull/40290 contains none of this PR's diffs aside from the test itself.
- https://github.com/pytorch/pytorch/pull/40178, for comparison, tries the test with this PR's diffs.

Both fail the same way, indicating the failure is unrelated to this PR's other diffs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40358

Differential Revision: D22165785

Pulled By: albanD

fbshipit-source-id: ac7cdd79af5c080ab74341671392dca8e717554e
2020-06-22 17:13:21 -07:00
Alban Desmaison
08227fea4f Revert D22079377: [pytorch][PR] [RELAND] Change AccumulateGrad to yield .grads that match weights' memory layout
Test Plan: revert-hammer

Differential Revision:
D22079377

Original commit changeset: 9bd2b7e0c34f

fbshipit-source-id: c22cc349d790caa574eace0d63980854c33e5a59
2020-06-17 10:17:27 -07:00
Michael Carilli
1ec8ece2b9 [RELAND] Change AccumulateGrad to yield .grads that match weights' memory layout (#40129)
Summary:
https://github.com/pytorch/pytorch/pull/34904 was reverted because it had a misconfigured 4-GPU test that for some reason wasn't caught by external CI ([example failure](https://app.circleci.com/pipelines/github/pytorch/pytorch/181719/workflows/cfb37cd9-9a0c-4738-898b-d683934cd308/jobs/5868948/steps)).

This PR reverts the revert, and adds diffs that should repair the misconfigured test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40129

Differential Revision: D22079377

Pulled By: albanD

fbshipit-source-id: 9bd2b7e0c34fdaf887497b52037cfe82cba709c1
2020-06-17 09:02:54 -07:00
Alban Desmaison
f1e575a0bf Revert D20496044: [pytorch][PR] Change AccumulateGrad to yield .grads that match weights' memory layout
Test Plan: revert-hammer

Differential Revision:
D20496044

Original commit changeset: 248d680f4b1b

fbshipit-source-id: 6462b25e3fb9c8596c1da443389089f09c32df4d
2020-06-16 10:38:40 -07:00
Michael Carilli
2beb9690c3 Change AccumulateGrad to yield .grads that match weights' memory layout (#34904)
Summary:
Currently, whether `AccumulateGrad`  [steals](67cb018462/torch/csrc/autograd/functions/accumulate_grad.h (L42)) or [clones](67cb018462/torch/csrc/autograd/functions/accumulate_grad.h (L80)) an incoming gradient, the gradient ends up rowmajor contiguous, regardless of its param's layout.  If the param's layout is channels last, or otherwise not rowmajor contiguous, later kernels that apply gradients to params are forced into an uncoalesced memory access pattern for either the param or the gradient.  This may not sound like a big deal, but for any binary op on large tensors it's a >3X increase in gmem traffic => a 3X slowdown.

The present PR changes `AccumulateGrad` to prefer, where possible, stashing gradients that match their params' layouts (["Gradient Layout Contract"](https://github.com/pytorch/pytorch/pull/34904/files#diff-ef1a56d24f66b280dcdb401502d6a796R29-R38)).

Allowing `AccumulateGrad` to stash non-rowmajor-contiguous grads means DDP allreduces and DP reduces must allow non-rowmajor-contiguous grads.  This PR extends DDP and DP to allow gradients with non-rowmajor-contiguous strides as long as their layout is nonoverlapping and dense.

For good measure, I include changes that allow all five nccl primitives (allreduce, reduce, broadcast, allgather, reducescatter) to act on non-rowmajor-contiguous tensors (again as long as each input's layout is nonoverlapping and dense, and as long as all tensors participating in a given collective have the same layout).  The primitive comm changes aren't necessary to enable the DDP changes, but I wasn't sure this would end up true until I had written both sets of changes.  I think primitive comm enablement is reasonable to keep in the PR, especially since the code for it is simple.

Channels last params will be a major beneficiary of this PR, but I don't see it as channels-last-specific fix.  The spirit is layout matching in general:
- Grads should be stashed with memory layouts matching their params.
- Src and dst tensors on opposite ends of collectives should have matching dense layouts.
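
A minimal sketch of the contract's observable effect (hedged; actual layouts depend on the kernels involved):

```
import torch

# A channels-last conv param should now receive a channels-last grad.
conv = torch.nn.Conv2d(3, 8, 3).to(memory_format=torch.channels_last)
x = torch.randn(2, 3, 16, 16).to(memory_format=torch.channels_last)
conv(x).sum().backward()
print(conv.weight.grad.is_contiguous(memory_format=torch.channels_last))
```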

This PR also updates autograd docs to describe potential BC-breaking changes below.

## BC notes
ngimel albanD gchanan

#### BC-breaking
In the common case where the user lets AccumulateGrad decide grad layouts, strides for grads of dense but non-rowmajor-contiguous params will change.  Any user code that was accustomed to `view(-1)`ing these grads will break.

Also, the circumstances under which a grad can be stolen directly from the backward function that created it, as opposed to deep-copied by AccumulateGrad, have changed.  In most cases we expect silent performance improvement, because we expect channels-last-aware backward kernels will create channels last gradients for channels last params.  Now those can be stolen, whereas before this PR they were cloned and made rowmajor contiguous.  IMO this is a mild BC breakage.  Param backward hooks still see grads come in with whatever format the backward kernel gave them.  The only BC breakage potential I see is if user code relies somehow on a grad in a hook having or not having the same deep memory as the eventual `param.grad`.  Any such users hopefully know they're off the edge of the map and understand how to update their expectations.

#### BC escape hatches
At alband's recommendation, this PR's changes to AccumulateGrad do not alter the pre-PR code's decisions about whether grad is accumulated in or out of place.  Accumulations of new grads onto an existing `.grad` attribute were (usually) in-place before this PR and remain in-place after this PR, keeping the existing `.grad`'s layout.  After this PR, if the user wants to force accumulation into a grad with a particular layout, they can preset `param.grad` to a zeroed tensor with the desired strides or call `grad.contiguous(desired format)`.  This likely won't be as performant as letting AccumulateGrad establish grad layouts by cloning or stealing grads with contract-compliant strides, but at least users have a control point.

One limitation (present before this PR and unchanged by this PR):  Presetting `param.grad` does not ensure in-place accumulation all the time.  For example, if `create_graph=True`, or if incoming `new_grad` is dense and existing `variable_grad` is sparse, accumulation occurs out of place, and the out-of-place result may not match the existing grad's strides.

----------------------------
I also noticed some potential DDP improvements that I considered out of scope but want to mention for visibility:
1. make sure Reducer's ops sync with AccumulateGrad streams
2. ~to reduce CPU overhead and incur fewer kernel launches, lazily create flat `contents` tensors by a single `cat` kernel only when a bucket is full, instead of `copy_`ing grads into `contents` individually as soon as they are received.~  PR includes a [minor change](https://github.com/pytorch/pytorch/pull/34904/files#diff-c269190a925a4b0df49eda8a8f6c5bd3R312-R315) to divide grads while copying them into flat buffers, instead of copying them in, then dividing separately.  Without cat+div fusion, div-while-copying is the best we can do.
3. https://github.com/pytorch/pytorch/issues/38942
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34904

Differential Revision: D20496044

Pulled By: albanD

fbshipit-source-id: 248d680f4b1bf77b0a986451844ec6e254469217
2020-06-16 08:43:31 -07:00
Hong Xu
336e1ec592 Clean up error handling in is_nonzero and where in TensorCompare.cpp (#38150)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38150

Differential Revision: D21539736

Pulled By: ezyang

fbshipit-source-id: e390c12f5948192a552d66dcd1bb89b2cb45f170
2020-05-13 20:19:40 -07:00
Tzu-Wei Huang
609d5a4476 [tensorboard] Let hparam render values correctly (#31544)
Summary:
The root cause of the incorrect rendering is that numbers are treated as strings if the data type is not specified; the data is therefore sorted by its first digit.

closes https://github.com/pytorch/pytorch/issues/29906
 cc orionr sanekmelnikov
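
A minimal sketch of the affected API (values are illustrative):

```
from torch.utils.tensorboard import SummaryWriter

with SummaryWriter() as w:
    # Numeric hparam values are now typed as numbers, so the dashboard
    # sorts 2 before 10 instead of comparing them as strings.
    for lr in (0.1, 0.01, 0.001):
        w.add_hparams({"lr": lr}, {"hparam/accuracy": lr * 10})
```
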
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31544

Differential Revision: D21105403

Pulled By: natalialunova

fbshipit-source-id: a676ff5ab94c5bdb653615d43219604e54747e56
2020-05-08 00:05:16 -07:00
Peter Bell
675b3fc834 Prevent unbounded growth of sparse tensor in add operation (#36030)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/34964

Sparse CUDA add was implemented by just concatenating the indices and values of the tensors. If called repeatedly in a tight loop, this lets `nnz` grow unbounded; in the worst case of `x.add_(x)`, it grows exponentially.
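
A minimal repro sketch (run on CUDA to exercise the affected path):

```
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
i = torch.tensor([[0, 2]], device=device)
v = torch.tensor([1.0, 2.0], device=device)
x = torch.sparse_coo_tensor(i, v, (4,), device=device)
for _ in range(10):
    x.add_(x)
print(x._nnz())  # stays bounded after the fix instead of doubling each step
```
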
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36030

Differential Revision: D20873504

Pulled By: zou3519

fbshipit-source-id: d90ed8dda0c89571fb89e358757b5dde299513df
2020-05-01 12:05:15 -07:00
Elias Ellison
6bc8ffe824 [JIT] Optimize before inlining (#35562)
Summary:
Resubmit of https://github.com/pytorch/pytorch/pull/35424, only this time I run optimizations in the right order so the PR description is actually true.

This speeds up the inlining pass on a FairSeq model from 180s to 13s, and on a MaskRCNN model from 5s to 1.5s.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35562

Differential Revision: D20738922

Pulled By: eellison

fbshipit-source-id: 1439cf9d1f0bc780e2d64a744694f8b3b7ba4b70
2020-04-07 09:42:26 -07:00
Pritam Damania
f050b16dd9 Move pytorch distributed tests to separate folder for contbuild. (#30445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445

Create distributed and rpc directories under caffe2/test for better management
of unit tests.

Differential Revision: D18702786

fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606
2020-01-22 21:16:59 -08:00
Jonathan Reynolds
0c04763d59 Changes to get inlined graph and proper names after JIT updates (#30244)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30244

This makes several small changes to the tensorboard graph parsing methods to address the recent changes to the PyTorch JIT trace/graph.
- Inline graph to get information for all nodes
- Assign and propagate scope names to GetAttr nodes
- Prune all useless GetAttr nodes (any with a ClassType output type - tensors and primitives are kept)
- Create output nodes so output tensor shape can be examined

Reviewed By: sanekmelnikov

Differential Revision: D18556323

fbshipit-source-id: b73a809bacfa554c3fe9c4ae3563525f57539874
2019-11-21 16:59:28 -08:00
James Reed
449828378d Serialize ClassType as its qualname
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30058

Test Plan: Imported from OSS

Differential Revision: D18584269

Pulled By: jamesr66a

fbshipit-source-id: 5f1d0142bd7cd94eecbd2ed9250a0de47639040b
2019-11-20 16:17:26 -08:00
Edward Yang
4e21157e01 Revert "Revert D18171156: Merge Tensor and Variable." (#29299)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29299

This reverts commit 9c43b16df9, but also includes the changes from D18348622. Comments from there:

thpp-compatibility is used by admarket/adreview/service:adreviewservice and
libtorch is too big for the service to deal with.

thpp-compatibility doesn't support autograd, so we hack around dispatching
variables by using AutoNonVariableTypeMode everywhere we call into ATen,
so we never attempt to call into Variable stubs.  If you get it wrong,
you'll get an error like:

```
what():  Could not run 'aten::empty' with arguments from the 'VariableTensorId' backend. 'aten::empty' is only available for these backends: [SparseCPUTensorId, CPUTensorId, MkldnnCPUTensorId]. (lookup_ at caffe2/aten/src/ATen/core/dispatch/DispatchTable.h:298)
```

Test Plan:
Imported from OSS

```
buck test //thpp-compatibility/...
buck build mode/opt-clang admarket/adreview/service:adreviewservice
```

adreviewservice canary: https://our.intern.facebook.com/intern/ads/canary/422290029716387895 (comparing against parent comment due to current breakage) ==> experiment store https://our.intern.facebook.com/intern/experiment_store/experiment/43990006/
adfinder canary: https://our.intern.facebook.com/intern/ads/canary/422268535840333934
adindexer canary: https://our.intern.facebook.com/intern/ads/canary/422268550559034675

adreview second canary:  https://our.intern.facebook.com/intern/ads/canary/422307863515591925

canary without thpp-compat fixups https://our.intern.facebook.com/intern/ads/canary/422308951649168772

Reviewed By: dreiss

Differential Revision: D18353504

Pulled By: ezyang

fbshipit-source-id: 65feaba39fa07bb66762810909aeb38868668a30
2019-11-08 09:11:20 -08:00
Edward Yang
9c43b16df9 Revert D18171156: Merge Tensor and Variable.
Test Plan: revert-hammer

Differential Revision:
D18171156

Original commit changeset: 5b6a045beba3

fbshipit-source-id: f5581d902c2305018ea49f8473592be2a465560b
2019-11-06 10:57:00 -08:00
Edward Yang
25261a4776 Merge Tensor and Variable. (#28620)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28620

All Tensors are Variables now, they just happen to have requires_grad=False. Tensors ALWAYS have `VariableTensorId` in their type set.

When constructing this patch, I had to make decisions about what I would fix in this patch, and what I would leave for follow up PRs. Here is the cleanup that happens in this patch:

- The `is_variable` property is removed from TensorOptions. I removed this immediately because, unlike Tensor::is_variable, TensorOptions::is_variable doesn't respect our VariableTensorId thread-local state. This means that there were a bunch of places where TensorOptions::is_variable was false, which is obviously bogus in a world where tensor and variable are merged. Instead of keeping the method as a function that always returns true, I just opted to remove it entirely (it's not public API). All places we set `is_variable` are deleted.
  - Knock on effect: there is no longer a separate DeprecatedTypeProperties for the variable and non-variable versions of type.
  - Knock on effect: instead of asserting on TensorOptions::is_variable, instead we just test `at::impl::variable_is_excluded()`
- There is now only one copy of the cuDNN RNN dropout cache, not two (I'm not sure why we had two to begin with)

Some cleanup that doesn't happen in this patch:
- Eliminating unnecessary uses of `make_variable`
- Eliminating `Tensor::is_variable`

The most subtle part of this patch is retaining tracing behavior: the fact that everything is a Variable means that more code gets routed to VariableType than before; this can change traces. I identified two places where we didn't appropriately turn off VariableType, mostly factory functions:

- `torch.tensor` must turn off VariableType before invoking `at::empty` to construct the tensor, as it subsequently does direct data access
- `tensor_slow` (invoked when you pass a Python scalar to a tensor argument) must turn off VariableType before calling `scalar_to_tensor` so the scalar gets traced as constant, rather than as a call to `scalar_to_tensor`.

Honestly, these are all giant hacks, and should be replaced with a more specialized guard that just toggles tracing.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: dreiss

Differential Revision: D18171156

Pulled By: ezyang

fbshipit-source-id: 5b6a045beba37492647e350190f495114e86504d
2019-11-04 14:59:57 -08:00
Zachary DeVito
121839b2f8 Fix bugs in assignment to optionals (#25059)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25059

This fixes cases where a value annotated as `Optional` could not
be conditionally assigned to `None`:

```
x: Optional[int] = 4
if ...:
    x = None
```

Test Plan: Imported from OSS

Differential Revision: D16975166

Pulled By: zdevito

fbshipit-source-id: 5a7a81224d08b9447e1f4d957fcd882091e02f32
2019-08-26 13:47:54 -07:00
Zachary DeVito
f9f5af0ed7 Revert D16949314: [jit] Fix bugs in assignment to optionals
Test Plan: revert-hammer

Differential Revision:
D16949314

Original commit changeset: 7f63d88b30a3

fbshipit-source-id: d1f00de2ad9c3484b731ad1b24205ca60024355d
2019-08-22 16:50:48 -07:00
Zachary DeVito
bb79b61ce7 Fix bugs in assignment to optionals (#24989)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24989

This fixes cases where a value annotated as `Optional` could not
be conditionally assigned to `None`:

```
x: Optional[int] = 4
if ...:
    x = None
```

Test Plan: Imported from OSS

Differential Revision: D16949314

Pulled By: zdevito

fbshipit-source-id: 7f63d88b30a3f5b024c2a539aa74967c9202af00
2019-08-22 16:27:46 -07:00
Elias Ellison
e8ea44796e add support for multiple assignment statements (#24477)
Summary:
add support for: `a = b, c = (1, 2)`

partial fix for https://github.com/pytorch/pytorch/issues/24256
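
A minimal sketch of the newly supported form (hedged; tuple return-type inference is assumed to work here):

```
import torch

@torch.jit.script
def f():
    # Chained assignment with a tuple-unpacking target, now supported.
    a = b, c = (1, 2)
    return a, b, c

print(f())  # ((1, 2), 1, 2)
```
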
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24477

Differential Revision: D16963413

Pulled By: eellison

fbshipit-source-id: 0433a1e759b3aa719ef1b766bb5160f2ca814205
2019-08-22 10:17:14 -07:00