25 commits

Author SHA1 Message Date
Xiaomeng Yang
9243b64bff
[Caffe2] Update elementwise ops to support numpy style broadcast (#8070)
* Update elementwise ops to support numpy style broadcast

* Fix sqrt_op

* Fix compare ops

* Fix gradient test

* Fix optimizer legacy broadcast

* Fix legacy broadcast for elementwise ops

* Skip flaky test

* Fix eigen simple binary op

* Fix attention test

* Fix rnn test

* Fix LSTM test

* Fix tan grad

* Fix schema check
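
For reference, numpy-style broadcasting aligns two shapes from their trailing dimensions, and each pair of dimensions must either match or contain a 1. The following is a minimal pure-Python sketch of that shape rule only; it is not the Caffe2 implementation, and `broadcast_shape` is a hypothetical helper name.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the numpy-style broadcast of shapes a and b, or raise ValueError."""
    result = []
    # Walk both shapes from the trailing dimension; missing dims count as 1.
    for da, db in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            raise ValueError("shapes {} and {} do not broadcast".format(a, b))
    return tuple(reversed(result))
```

For example, `broadcast_shape((2, 3, 4), (4,))` yields `(2, 3, 4)`, while `broadcast_shape((2, 3), (4,))` raises `ValueError`.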
2018-06-05 15:49:16 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Bram Wasti
51897e52da fix all the broken tests from adding debug info (#2013) 2018-02-22 17:43:53 -08:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Luke Yeager
f775149205 tests: use assertRaises, not expectedFail
Summary:
I would expect that tests marked "expected failure" mean that there is a known issue in the code which will be fixed later. Both of these tests are simply verifying proper error-checking - nothing needs fixing.

Before (looks like something is wrong):
```
======================================= 2 xfailed in 0.27 seconds =======================================
```
After:
```
======================================= 2 passed in 0.28 seconds ========================================
```
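
The distinction can be sketched with the standard `unittest` module: a test that uses `assertRaises` is reported as a pass when the error occurs, instead of as an expected failure. `parse_positive` is a hypothetical function used only for illustration; it is not part of this repository.

```python
import unittest


def parse_positive(value):
    """Hypothetical function under test: rejects non-positive input."""
    if value <= 0:
        raise ValueError("value must be positive")
    return value


class ErrorCheckingTest(unittest.TestCase):
    def test_rejects_zero(self):
        # The error is the intended behavior, so assert it directly.
        with self.assertRaises(ValueError):
            parse_positive(0)


def run_suite():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ErrorCheckingTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```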
/cc akyrola gsethi523
Closes https://github.com/caffe2/caffe2/pull/1209

Differential Revision: D5825373

Pulled By: akyrola

fbshipit-source-id: 1b98f503e4e406f69567d02425532f43bd16a465
2017-09-13 11:39:35 -07:00
Aapo Kyrola
cbb85545ec warn about orphan StopGradient output
Summary: A common source of confusion is how to use StopGradient, and a typical bug is forgetting to specify input=output. This adds a sanity check to the gradient builder that checks whether any StopGradient outputs are orphaned.
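
A minimal sketch of such a check, assuming operators are simple records with `type`, `inputs`, and `outputs` fields (a hypothetical stand-in, not the Caffe2 `OperatorDef`), and assuming a StopGradient output counts as orphaned when no operator reads it:

```python
from collections import namedtuple

# Hypothetical stand-in for an operator definition.
Op = namedtuple("Op", ["type", "inputs", "outputs"])


def find_orphan_stop_gradients(ops):
    """Return StopGradient outputs that no operator in `ops` reads."""
    consumed = {name for op in ops for name in op.inputs}
    orphans = []
    for op in ops:
        if op.type == "StopGradient":
            orphans.extend(out for out in op.outputs if out not in consumed)
    return orphans
```

With input=output the StopGradient output is read by the op itself, so this sketch never flags the in-place form.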

Reviewed By: dzhulgakov

Differential Revision: D5458341

fbshipit-source-id: 056fef4f0ee53eb10e66e9be0ecb55b55f9cc3d7
2017-07-20 21:41:41 -07:00
Aapo Kyrola
ab0fe0a5f4 add debug information when there is blob version mismatch
Summary:
It is quite a common question when users get some variant of "blob has version 2 but gradient expects version 1" in their backward pass. The error message is completely unhelpful.
To remedy this, I added proper debug information that tells the user how the version number of a blob was incremented over time, i.e., which ops caused the version to go up. This should help them
understand the issue.
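
The kind of bookkeeping described above can be sketched as follows, assuming each write to a blob bumps its version by one and that operators are plain `(op_type, inputs, outputs)` tuples. Both are simplifications for illustration, and `version_history` is a hypothetical name.

```python
from collections import defaultdict


def version_history(ops):
    """Map each blob to the list of (op index, op type) that wrote it.

    `ops` is a list of (op_type, inputs, outputs) tuples; each write to a
    blob is treated as bumping its version by one.
    """
    history = defaultdict(list)
    for idx, (op_type, _inputs, outputs) in enumerate(ops):
        for blob in outputs:
            history[blob].append((idx, op_type))
    return dict(history)
```

The length of a blob's history is then its current version, and the entries name the ops responsible for each increment.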

Reviewed By: dzhulgakov

Differential Revision: D5358227

fbshipit-source-id: bc09d048ac33200c35d56460e44e86c2f2888f3f
2017-06-30 16:22:46 -07:00
Alexander Sidorov
83e6a0bec8 Revert uuid change to OperatorDef protobuf
Summary:
A few issues:

1. Randomization hurts memoization.
2. Even if we make it non-random, we can get key collisions when loading it back.
3. RNNs use prototxt for the step net, and apparently it's not forward compatible like normal protobuf is.

I am thinking of a better, less invasive solution now.

Reviewed By: jamesr66a

Differential Revision: D5272118

fbshipit-source-id: ab577fad04fbfc632e1fceffa923377a0d3da1be
2017-06-19 16:47:31 -07:00
Luke Yeager
8ef12951e0 Fix for protobuf with unicode_literals
Summary:
Python 2.7, Protobuf 2.6

    >                   op.ClearField('uuid')
    E                   TypeError: field name must be a string

Fix: http://python-future.org/imports.html#should-i-import-unicode-literals

/cc salexspb tomdz
Closes https://github.com/caffe2/caffe2/pull/804

Differential Revision: D5258494

Pulled By: akyrola

fbshipit-source-id: 04c473c1e55bf8caac0bfde7d86171c9f95e71a1
2017-06-15 13:22:57 -07:00
Aapo Kyrola
7ffd76db51 check operator schema before calling gradient creator
Summary: Hard-to-debug problems arise when a gradient creator fails because the forward op itself is incorrect. Add checking of the schema before calling the creator. Also clarify the error messages.

Reviewed By: Yangqing

Differential Revision: D5256016

fbshipit-source-id: 78550f7e2ce5b88e26b69fdae4be0eece52edfea
2017-06-15 13:04:58 -07:00
Alexander Sidorov
eebda50b79 Operator python traceback
Summary: This shows a Python Caffe2 user where a failed operator was created. The motivation for not storing this information directly in the protobuf is to avoid making it too verbose and to keep the ability to read protobufs of a net after a simple print() call.

Reviewed By: jamesr66a

Differential Revision: D5226047

fbshipit-source-id: 7edfe850e05a2ec209577142aa3368664a57a108
2017-06-13 18:50:02 -07:00
Alexander Sidorov
7f1385e70c Improve gradient accumulation of the framework: 1.5x - 2x
Summary:
We waste extra memory by creating two autosplit gradient
blobs and then accumulating them into the main one. Sometimes, when Sum
/ Sub ops are involved, we can avoid wasting extra memory at all.

Ideally we would not waste any memory and make ops add to the same
blob rather than calculating separate results and then merging
them. But it would require a substantial change to the framework and
rewriting a lot of operators.

Reviewed By: dzhulgakov

Differential Revision: D5157667

fbshipit-source-id: 8293824d6cdd971d8853ae90aee68e4a6d1e132b
2017-06-11 22:02:30 -07:00
Alexander Sidorov
264f75fdd0 ZeroGradient op
Summary:
When building a multi-layer static RNN, the last timestep of
the first layer (and other layers except the last one) doesn't get a
gradient for the cell state, as normally the user uses results only from
the last layer and the cell state doesn't go up either.

ZeroGradient provides a general solution for injecting 0 gradient
blobs. It is in some way similar to the StopGradient operator, which is
also special-cased.

Reviewed By: bwasti

Differential Revision: D5198375

fbshipit-source-id: a21d0cfb3676a77fac72e5897a200d0bd25fc6de
2017-06-08 16:02:38 -07:00
Alexander Sidorov
846240a340 Caffe2 gradient generator bug fix
Summary:
Bug repro is in a test. Generally speaking, accumulation was
not happening if len(ys) >= 2 (ys is the list of blobs we compute
gradients from) and some blob in the net was both in the ys list and also got
a gradient propagated from another element in ys.
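
The expected behavior can be illustrated with a tiny reverse pass over scalar ops of the form `out = coeff * inp`. This is a sketch under that simplifying assumption, not the Caffe2 gradient generator, and `backward` is a hypothetical name.

```python
def backward(ops, ys):
    """Accumulate gradients for a chain of scalar ops `out = coeff * inp`.

    `ops` is a list of (inp, out, coeff) in forward order; `ys` lists the
    blobs gradients are taken from, each seeded with gradient 1.0.
    """
    grads = {y: 1.0 for y in ys}
    for inp, out, coeff in reversed(ops):
        if out in grads:
            # Add to any gradient already present (for example a seed from
            # `ys`) instead of overwriting it.
            grads[inp] = grads.get(inp, 0.0) + coeff * grads[out]
    return grads
```

With `y1 = 2 * x`, `y2 = 3 * y1`, and `ys = ["y1", "y2"]`, the blob `y1` is both in `ys` and receives a gradient from `y2`, so its gradient is the sum of the seed and the propagated term.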

Reviewed By: akyrola

Differential Revision: D5121695

fbshipit-source-id: 282d88f2f4f6e27dadae311964f40246a2739130
2017-05-30 18:47:08 -07:00
Aapo Kyrola
570c6bb9b7 Fix backward pass computation when an input is used in a Fill-op input for shape
Summary:
Fix an issue that amyzhang encountered. She was using ConstantFill to create a blob of the same size as another blob. This caused the gradient op computation flow to be interrupted at the ConstantFill, since the gradient for the input blob was set to None (although it already had another gradient set). The correct solution is to avoid overwriting gradient assignments with None if they already have a gradient, UNLESS that blob is an output of the same op, as with the StopGradient op. (Note that Amy's problem was fixed by using a fixed-shape ConstantFill and Add with broadcast=1 instead, which is a better solution anyway.)

Not sure if I explained this well, but see the new unit tests. Before this change, the testAddAndDynamicConstant failed but the testAddAndStaticConstant succeeded.
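
The rule described above can be sketched as a small helper, assuming gradients are tracked in a plain dict from blob name to gradient name. `set_gradient` and its `is_op_output` flag are hypothetical and only illustrate the rule; they are not the actual code in this change.

```python
def set_gradient(grad_map, blob, grad, is_op_output=False):
    """Record `grad` for `blob`, keeping an existing gradient over None."""
    if grad is None and grad_map.get(blob) is not None and not is_op_output:
        return  # keep the gradient that is already set
    grad_map[blob] = grad
```

A None assignment therefore only takes effect when the blob has no gradient yet, or when the blob is an output of the same op, as with StopGradient.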

Reviewed By: dzhulgakov

Differential Revision: D4861176

fbshipit-source-id: 3b53621bfaba2e36786a5e4664145038995f6616
2017-04-11 19:32:22 -07:00
Aapo Kyrola
02937903cc add inference for gradient ops + a couple of missing shape inference functions + fix to scalars
Summary:
A bit too much stuff in one diff, so sorry:

1. Add inference for gradient types by using the fact that x_grad is the gradient of x and must be of the same shape. It is kind of awkward to use string matching, but in addition I rely on the operator actually being a gradient op.
2. dzhulgakov was right, scalar shape is () and not (1). Sorry, my earlier claim was #fakenews.
3. Added inference functions for MakeTwoClass, MomentumSGDUpdate and Cross entropy ops.
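
The name-matching idea in item 1 can be sketched as follows, assuming shapes are kept in a dict from blob name to tuple. `infer_gradient_shapes` is a hypothetical helper, and the sketch omits the check that the producing operator is a gradient op.

```python
def infer_gradient_shapes(shapes, blob_names, suffix="_grad"):
    """Give each `<name>_grad` blob the shape of `<name>` when it is known."""
    inferred = dict(shapes)
    for name in blob_names:
        if name.endswith(suffix):
            base = name[: -len(suffix)]
            if base in shapes and name not in inferred:
                inferred[name] = shapes[base]
    return inferred
```

In line with item 2, a scalar blob carries the shape `()`, so its gradient is inferred as `()` as well.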

Reviewed By: dzhulgakov

Differential Revision: D4569758

fbshipit-source-id: 0db13f33819777fdddefe21d4b1ebf906fcaf98c
2017-02-28 23:33:32 -08:00
Xian Li
64419a928d Implement EnsureDenseOp and EnsureDenseGradientOp.
Summary:
This operator always outputs dense gradients regardless of
the input gradients. For the forward pass, it passes inputs to outputs in place.

Reviewed By: xianjiec

Differential Revision: D4582511

fbshipit-source-id: 7eb2c5d2142aa05d373f06cab1e7f89d8b747d34
2017-02-22 07:16:26 -08:00
Simon Layton
12c4090ea5 Skip sparse tests if operators not available
Summary:
Only tests for SparseFunHash for now
Closes https://github.com/caffe2/caffe2/pull/60

Reviewed By: Yangqing

Differential Revision: D4348961

Pulled By: bwasti

fbshipit-source-id: cd05d73ccc711b42a7d33e7a6b65a9d1a9bfa7e6
2016-12-19 15:59:32 -08:00
Martin Raison
ea9a0f24bf automatic aggregation of sparse gradients
Summary:
This adds support for automatic aggregation of sparse gradients. We simply concatenate indices and values (no attempt to deduplicate, since this is already done before feeding into the optimizer). This should support various cases (indices and/or values can be generated by one or more gradient ops, or gradient outputs can be directly passed from inputs).

I tried to minimize the code footprint, but I introduced SparseGradGenMeta because GradGenMeta didn't lend itself very well to being used with sparse gradients.
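
The concatenation described above can be sketched in plain Python, assuming each sparse gradient is an `(indices, values)` pair of equal-length lists. `aggregate_sparse_gradients` is a hypothetical name, not the function added by this change.

```python
def aggregate_sparse_gradients(grads):
    """Concatenate (indices, values) pairs without deduplicating indices."""
    all_indices, all_values = [], []
    for indices, values in grads:
        if len(indices) != len(values):
            raise ValueError("indices and values must have equal length")
        all_indices.extend(indices)
        all_values.extend(values)
    return all_indices, all_values
```

Repeated indices are kept as separate entries, matching the "no attempt to deduplicate" note.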

Reviewed By: dzhulgakov

Differential Revision: D4219788

fbshipit-source-id: 1d074664cffd82a8764e4b1473ada6bc46e6c51a
2016-12-05 11:53:26 -08:00
Yangqing Jia
b23e51d467 chunky sync 2016-09-06 15:55:19 -07:00
Yangqing Jia
6463eebc7b chunky sync - build scripts to be written 2016-07-21 10:16:42 -07:00
Yangqing Jia
559053d3a8 chunky sync 2016-05-13 14:43:48 -07:00
Yangqing Jia
4ae1bbbd7e bugfix 2016-03-11 10:30:16 -08:00
Yangqing Jia
cf7ca23fc1 make caffe2.python build 2016-03-08 16:48:19 -08:00
Yangqing Jia
9ae880bb6f move pycaffe2 to caffe2.python 2016-03-08 15:45:30 -08:00
Renamed from pycaffe2/core_gradients_test.py