* Update elementwise ops to support numpy-style broadcast
* Fix sqrt_op
* Fix compare ops
* Fix gradient test
* Fix optimizer legacy broadcast
* Fix legacy broadcast for elementwise ops
* Skip flaky test
* Fix eigen simple binary op
* Fix attention test
* Fix rnn test
* Fix LSTM test
* Fix tan grad
* Fix schema check
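The broadcasting semantics these elementwise ops now follow are NumPy's: trailing dimensions are aligned, and a dimension of size 1 (or a missing dimension) is stretched to match. A minimal illustration in plain NumPy (illustrative only; the Caffe2 ops mirror these rules, they do not call NumPy):

```python
import numpy as np

# NumPy-style broadcasting: trailing dimensions are aligned, and a
# size-1 (or missing) dimension is stretched to match the other operand.
a = np.arange(6, dtype=np.float32).reshape(2, 3)    # shape (2, 3)
b = np.array([10.0, 20.0, 30.0], dtype=np.float32)  # shape (3,)

c = a + b  # b is broadcast across the first axis of a
print(c.shape)  # (2, 3)
```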
Summary:
I would expect that tests marked "expected failure" indicate a known issue in the code that will be fixed later. Both of these tests simply verify proper error checking; nothing needs fixing.
Before (looks like something is wrong):
```
======================================= 2 xfailed in 0.27 seconds =======================================
```
After:
```
======================================= 2 passed in 0.28 seconds ========================================
```
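One way to express such error-checking tests so they report as passes rather than xfails is `pytest.raises`. A sketch with a hypothetical `set_learning_rate` validator (not from the codebase):

```python
import pytest

def set_learning_rate(lr):
    """Hypothetical helper that validates its argument."""
    if lr <= 0:
        raise ValueError("learning rate must be positive")
    return lr

# Instead of marking this xfail, assert that the error is actually raised:
def test_rejects_nonpositive_lr():
    with pytest.raises(ValueError):
        set_learning_rate(-0.1)
```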
/cc akyrola gsethi523
Closes https://github.com/caffe2/caffe2/pull/1209
Differential Revision: D5825373
Pulled By: akyrola
fbshipit-source-id: 1b98f503e4e406f69567d02425532f43bd16a465
Summary: A common source of confusion is how to use StopGradient, and a typical bug is forgetting to specify input=output. This adds a sanity check to the gradient builder that checks whether some StopGradient outputs are orphaned.
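The check can be sketched like this, over a simplified list-of-tuples stand-in for a Caffe2 NetDef (the real implementation lives in the gradient builder):

```python
def find_orphaned_stop_gradients(ops):
    """Return StopGradient outputs that no op in the net consumes.

    ops: list of (op_type, inputs, outputs) tuples; a simplified
    stand-in for a Caffe2 NetDef. The intended in-place pattern
    (input == output) is covered because the blob appears as an
    input of the StopGradient op itself.
    """
    consumed = set()
    for _, inputs, _ in ops:
        consumed.update(inputs)
    return [out
            for op_type, _, outputs in ops
            if op_type == "StopGradient"
            for out in outputs
            if out not in consumed]
```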
Reviewed By: dzhulgakov
Differential Revision: D5458341
fbshipit-source-id: 056fef4f0ee53eb10e66e9be0ecb55b55f9cc3d7
Summary:
It is a quite common question when users get some variant of "blob has version 2 but gradient expects version 1" in their backward pass; the error message is completely unhelpful.
To remedy this, I added proper debug information that tells the user how the version number of a blob was incremented over time, i.e. which ops caused the version to go up. This should help in understanding the issue.
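A rough sketch of the bookkeeping this adds, using a hypothetical `SsaVersionTracker` class (names are illustrative, not Caffe2's actual internals):

```python
from collections import defaultdict

class SsaVersionTracker:
    """Track which op bumped each blob's version, so that a
    'blob has version 2 but gradient expects version 1' error
    can report the ops responsible. Illustrative stand-in for
    the gradient builder's internal bookkeeping."""

    def __init__(self):
        self.versions = defaultdict(int)
        self.history = defaultdict(list)  # blob -> [(version, op_name)]

    def record_op(self, op_name, outputs):
        # Every time an op writes a blob, its SSA version increments.
        for blob in outputs:
            self.versions[blob] += 1
            self.history[blob].append((self.versions[blob], op_name))

    def debug_info(self, blob):
        # Human-readable trail of how the blob reached its version.
        return "; ".join("v%d by %s" % (v, op) for v, op in self.history[blob])
```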
Reviewed By: dzhulgakov
Differential Revision: D5358227
fbshipit-source-id: bc09d048ac33200c35d56460e44e86c2f2888f3f
Summary:
A few issues:
1. Randomization hurts memoization.
2. Even if we make it non-random, we can get key collisions when loading it back.
3. RNNs use prototxt for the step net, and apparently it's not forward-compatible the way normal protobuf is.
I am thinking of a better, less invasive solution now.
Reviewed By: jamesr66a
Differential Revision: D5272118
fbshipit-source-id: ab577fad04fbfc632e1fceffa923377a0d3da1be
Summary: Hard-to-debug problems arise when a gradient creator fails because the forward op is itself incorrect. Add checking of the schema before calling the creator, and clarify the error messages.
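The idea can be sketched as follows, with a hypothetical schema registry mapping op type to input/output arity bounds (a simplified stand-in for Caffe2's OpSchema):

```python
def check_schema_before_gradient(op_type, num_inputs, num_outputs, schemas):
    """Validate the forward op against its schema before invoking the
    gradient creator, so a malformed forward op fails with a clear
    message instead of an obscure error inside the creator.

    schemas: dict mapping op type -> (min_in, max_in, min_out, max_out);
    an illustrative stand-in for Caffe2's OpSchema registry.
    """
    if op_type not in schemas:
        raise ValueError("No schema registered for op '%s'" % op_type)
    min_in, max_in, min_out, max_out = schemas[op_type]
    if not (min_in <= num_inputs <= max_in):
        raise ValueError("Op '%s' expects %d..%d inputs, got %d"
                         % (op_type, min_in, max_in, num_inputs))
    if not (min_out <= num_outputs <= max_out):
        raise ValueError("Op '%s' expects %d..%d outputs, got %d"
                         % (op_type, min_out, max_out, num_outputs))
```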
Reviewed By: Yangqing
Differential Revision: D5256016
fbshipit-source-id: 78550f7e2ce5b88e26b69fdae4be0eece52edfea
Summary: This is going to show a Python Caffe2 user where a failed operator was created. The motivation for keeping this information out of the protobuf itself is to avoid making it too verbose, and to keep protobufs of a net readable after a simple print() call.
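A rough sketch of the approach, assuming a hypothetical `make_op` helper: the creation stack is recorded in a side registry rather than stored on the op itself, so printing the net stays readable:

```python
import traceback

def make_op(op_type, inputs, outputs, debug_info_registry):
    """Record where an operator was created without bloating the protobuf.
    Hypothetical helper: the creation stack is kept in a side registry
    and only surfaced when the operator fails."""
    op = {"type": op_type, "inputs": inputs, "outputs": outputs}
    # Keep the stack out of the op dict itself so print(net) stays terse.
    debug_info_registry[id(op)] = "".join(traceback.format_stack(limit=5))
    return op
```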
Reviewed By: jamesr66a
Differential Revision: D5226047
fbshipit-source-id: 7edfe850e05a2ec209577142aa3368664a57a108
Summary:
We waste extra memory by creating two autosplit gradient
blobs and then accumulating them into the main one. Sometimes, when Sum
/ Sub ops are involved, we can avoid wasting extra memory entirely.
Ideally we would not waste any memory at all and would make ops add to the same
blob rather than calculating separate results and then merging
them. But that would require a substantial change to the framework and
rewriting a lot of operators.
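The memory trade-off can be illustrated with NumPy (illustrative only; Caffe2 blobs are not NumPy arrays):

```python
import numpy as np

# Wasteful pattern: two autosplit gradient blobs are materialized,
# then summed into the main gradient (three buffers alive at once).
g_autosplit_0 = np.full(4, 2.0, dtype=np.float32)
g_autosplit_1 = np.full(4, 3.0, dtype=np.float32)
g_main = g_autosplit_0 + g_autosplit_1

# Cheaper pattern this diff moves toward: accumulate each
# contribution into the main blob in place, no autosplit copies.
g_acc = np.zeros(4, dtype=np.float32)
g_acc += np.full(4, 2.0, dtype=np.float32)  # first contribution
g_acc += np.full(4, 3.0, dtype=np.float32)  # second contribution
```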
Reviewed By: dzhulgakov
Differential Revision: D5157667
fbshipit-source-id: 8293824d6cdd971d8853ae90aee68e4a6d1e132b
Summary:
When building a multi-layer static RNN, the last timestep of
the first layer (and of the other layers except the last one) doesn't get a
gradient for the cell state, since normally the user only consumes results from
the last layer and the cell state doesn't go up either.
ZeroGradient provides a general solution for injecting zero-gradient
blobs. It is in some ways similar to the StopGradient operator, which is
also special-cased.
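What ZeroGradient amounts to can be sketched in one line (illustrative; the real operator works on Caffe2 blobs, not NumPy arrays):

```python
import numpy as np

def zero_gradient(forward_output):
    """Sketch of ZeroGradient: for a blob that otherwise receives no
    gradient (e.g. the cell state of a non-final static-RNN layer),
    inject a gradient of zeros shaped like the forward blob."""
    return np.zeros_like(forward_output)
```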
Reviewed By: bwasti
Differential Revision: D5198375
fbshipit-source-id: a21d0cfb3676a77fac72e5897a200d0bd25fc6de
Summary:
The bug repro is in a test. Generally speaking, accumulation was
not happening if len(ys) >= 2 (the list of blobs we compute gradients
from) and some blob in the net was both in the ys list and also received
a gradient propagated from another element of ys.
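The corrected accumulation rule can be sketched as a hypothetical helper over scalar gradients (not the actual generator code):

```python
def accumulate_gradient(grad_map, blob, grad):
    """When a blob is both seeded with an initial gradient (it is in ys)
    and also receives a gradient propagated from another element of ys,
    the two contributions must be summed, not overwritten. Scalar
    gradients are used here for brevity."""
    if blob in grad_map:
        grad_map[blob] = grad_map[blob] + grad  # accumulate
    else:
        grad_map[blob] = grad
```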
Reviewed By: akyrola
Differential Revision: D5121695
fbshipit-source-id: 282d88f2f4f6e27dadae311964f40246a2739130
Summary:
Fix an issue that amyzhang encountered. She was using ConstantFill to create a blob of the same size as another blob. This interrupted the gradient computation flow through the ConstantFill, since the gradient for the input blob was set to None (even though it already had another gradient assigned). The correct fix is to avoid overwriting an existing gradient assignment with None, UNLESS that blob is an output of the same op, as with the StopGradient op. (Note that Amy's problem was also fixed by using a fixed-shape ConstantFill and Add with broadcast=1 instead, which is a better solution anyway.)
Not sure if I explained this well, but see the new unit tests. Before this change, testAddAndDynamicConstant failed while testAddAndStaticConstant succeeded.
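The fixed assignment rule can be sketched as a hypothetical helper (names are illustrative, not the real gradient builder):

```python
def assign_gradient(grad_map, blob, grad, is_own_output=False):
    """Sketch of the fixed rule: a None gradient (e.g. from ConstantFill,
    which generates no input gradient) must not overwrite a gradient
    the blob already has, unless the blob is an output of that same op,
    as with StopGradient."""
    if grad is None and blob in grad_map and not is_own_output:
        return  # keep the existing gradient
    grad_map[blob] = grad
```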
Reviewed By: dzhulgakov
Differential Revision: D4861176
fbshipit-source-id: 3b53621bfaba2e36786a5e4664145038995f6616
Summary:
A bit too much stuff in one diff, sorry:
1. Add inference for gradient types by using the fact that x_grad is the gradient of x and must have the same shape. Using string matching for this is kind of awkward, but in addition I rely on the operator actually being a gradient op.
2. dzhulgakov was right: a scalar's shape is () and not (1). Sorry, my earlier claim was #fakenews.
3. Added inference functions for MakeTwoClass, MomentumSGDUpdate and the cross-entropy ops.
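Point 1 can be sketched with a hypothetical helper (the `_grad` suffix convention is from the diff; everything else is illustrative):

```python
def infer_gradient_shape(blob, known_shapes, is_gradient_op):
    """Sketch of point 1: if 'x_grad' is produced by a gradient op, it
    must have the same shape as 'x'. The string match alone would be
    fragile, so it only applies when the producing op is actually a
    gradient op."""
    if is_gradient_op and blob.endswith("_grad"):
        forward_blob = blob[:-len("_grad")]
        if forward_blob in known_shapes:
            return known_shapes[forward_blob]
    return None  # shape unknown

# Point 2: a scalar's shape is (), not (1,).
SCALAR_SHAPE = ()
```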
Reviewed By: dzhulgakov
Differential Revision: D4569758
fbshipit-source-id: 0db13f33819777fdddefe21d4b1ebf906fcaf98c
Summary:
This operator always outputs dense gradients regardless of
the input gradients. In the forward pass, it passes inputs through to outputs in place.
Reviewed By: xianjiec
Differential Revision: D4582511
fbshipit-source-id: 7eb2c5d2142aa05d373f06cab1e7f89d8b747d34
Summary:
Only tests for SparseFunHash for now
Closes https://github.com/caffe2/caffe2/pull/60
Reviewed By: Yangqing
Differential Revision: D4348961
Pulled By: bwasti
fbshipit-source-id: cd05d73ccc711b42a7d33e7a6b65a9d1a9bfa7e6
Summary:
This adds support for automatic aggregation of sparse gradients. We simply concatenate indices and values (no attempt to deduplicate, since this is already done before feeding into the optimizer). This should support various cases (indices and/or values can be generated by one or more gradient ops, or gradient outputs can be directly passed from inputs).
I tried to minimize the code footprint, but I introduced SparseGradGenMeta because GradGenMeta didn't lend itself well to being used with sparse gradients.
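The concatenation scheme can be sketched with NumPy (illustrative; real sparse gradients are Caffe2 blobs):

```python
import numpy as np

def aggregate_sparse_gradients(grads):
    """Sketch of the aggregation: each sparse gradient is an
    (indices, values) pair, and contributions are simply concatenated
    with no deduplication, since duplicate indices are handled
    downstream before the optimizer consumes them."""
    indices = np.concatenate([g[0] for g in grads])
    values = np.concatenate([g[1] for g in grads])
    return indices, values
```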
Reviewed By: dzhulgakov
Differential Revision: D4219788
fbshipit-source-id: 1d074664cffd82a8764e4b1473ada6bc46e6c51a