Commit graph

74 commits

Author SHA1 Message Date
Kevin Matzen
04d02632e9 instance norm test fix
Summary:
Reduce the test input size for the instance norm gradient check. The larger size is currently timing out on stress tests.
e.g. failed: Timeout: Ran out of time before finding a satisfying example for test_instance_norm_gradients. Only found 2 examples in 125.39s.

Reviewed By: Yangqing

Differential Revision: D4608828

fbshipit-source-id: ce17a3ad28752d808efcbf79f1ea4238e63fb005
2017-02-25 14:31:42 -08:00
Peng Yang
8ab13eea6f delete redundant comment lines.
Summary: delete redundant comment lines.

Differential Revision: D4600596

fbshipit-source-id: 4bb619f9ff99d6f799e87970b6b6d5ea7de02c98
2017-02-24 11:04:36 -08:00
Deepak Gopinath
cd4ea42048 Allowing creation of random odd length arrays in RandGaussian
Summary: curandGenerateNormal can only generate arrays of even length. The MSRAFill and GaussianFill operators use the RandGaussian utility method, which in turn uses curandGenerateNormal. This is a test that runs the operators on both devices to generate odd-sized random arrays.
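
A minimal sketch of the behavior under test (blob name and the odd size are illustrative):

    from caffe2.python import core, workspace

    # GaussianFill with an odd-length shape; on GPU this previously hit
    # curandGenerateNormal's even-length requirement.
    op = core.CreateOperator("GaussianFill", [], ["out"],
                             shape=[7], mean=0.0, std=1.0)
    workspace.RunOperatorOnce(op)
    print(workspace.FetchBlob("out").shape)  # (7,)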

Differential Revision: D4602819

fbshipit-source-id: e65f5c731e925886cfa14afff482f7053bd020a0
2017-02-23 15:03:22 -08:00
Yury Zemlyanskiy
4a53ab3cb6 LSTMWithAttention implementation in Caffe2
Summary:
Implementation of ##LSTMWithAttention##

Still TBD:
1. There are problems with backpropagation, because the gradient is not implemented for ops with broadcasting
2. I need to make initial_recurrent_state be of shape [dim] rather than [1, batch_size, dim], so one doesn't need to provide batch_size to LSTMWithAttention

Differential Revision: D4298735

fbshipit-source-id: 8903fcff4d6a66647ee6d45a6ef28803fc3091e5
2017-02-23 04:08:34 -08:00
Andrew Tulloch
312821d36c Allow in-place instance norm.
Summary:
In-place is a ~30% speedup, but needs a change to torch2caffe
or a graph rewrite on the client.

Differential Revision: D4577582

fbshipit-source-id: c31bf8ba97f4fa4cedf355cf2475eb7bab48b304
2017-02-22 14:03:55 -08:00
Artem Volkhin
45e1905722 add support of fp16 to SparseLengthsSum and SparseLengthsMean
Summary: Another part of making DPER compatible with half-floats. This diff adds support for fp16 to the segment reduction operators used in DPER.
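
A minimal sketch of the fp16 path (shapes and blob names are illustrative):

    from caffe2.python import core, workspace
    import numpy as np

    # Segment sum over fp16 rows: lengths [2, 2] splits the 4 indices into
    # two segments, so "out" is a 2x4 fp16 tensor.
    workspace.FeedBlob("data", np.random.randn(6, 4).astype(np.float16))
    workspace.FeedBlob("indices", np.array([0, 2, 3, 5], dtype=np.int64))
    workspace.FeedBlob("lengths", np.array([2, 2], dtype=np.int32))
    workspace.RunOperatorOnce(core.CreateOperator(
        "SparseLengthsSum", ["data", "indices", "lengths"], ["out"]))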

Reviewed By: dzhulgakov

Differential Revision: D4587560

fbshipit-source-id: 0ae10648a7286a820bffaee802464dd9464584bc
2017-02-22 11:05:55 -08:00
Peng Yang
26be1977bf fix CrossEntropyOp bug for batch input
Summary: this fixes a bug in the Eigen implementation that computes cross-entropy for batch inputs

Reviewed By: salexspb

Differential Revision: D4582078

fbshipit-source-id: 4c92047e9dbbe219fcbef618a45c584c2fbfaad5
2017-02-21 17:34:31 -08:00
Alisson Gusatti Azzolini
04eccb8ebe Performance counters
Summary:
- Key-value store for counters.
- Counters are updated via macros that also export USDT probes.
- Counter values can be exported using caffe2 operators.
- Snapshot mechanism for tracking time-window counter values.

Reviewed By: dzhulgakov, pietern

Differential Revision: D4553761

fbshipit-source-id: 25a1a91a3168dcff2159c6fba7b357d3fd3aa9bf
2017-02-21 16:31:24 -08:00
Qichao Que
7f4d5e9900 Add feed label parser operator.
Summary: Add a feed label parser operator; this layer depends on D4520993.

Reviewed By: kennyhorror

Differential Revision: D4538797

fbshipit-source-id: 8efcd7b2f6962c30023c7464a13c125ba1a99dc4
2017-02-21 14:17:00 -08:00
Ahmed Taei
5bc3d2ef03 Add ReduceFront GPU Ops
Summary: Add GPU implementation for ReduceFront{Sum|Mean} Ops
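
A minimal CPU sketch of the op's semantics (add a CUDA device_option to exercise the new GPU path):

    from caffe2.python import core, workspace
    import numpy as np

    # ReduceFrontSum reduces over the first dimension: (2, 3) -> (3,).
    workspace.FeedBlob("X", np.arange(6, dtype=np.float32).reshape(2, 3))
    workspace.RunOperatorOnce(core.CreateOperator("ReduceFrontSum", ["X"], ["Y"]))
    print(workspace.FetchBlob("Y"))  # [3. 5. 7.]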

Differential Revision: D4577270

fbshipit-source-id: 697f498531af6b9da4a0138d2a9beb39234f9756
2017-02-17 16:46:42 -08:00
Xianjie Chen
d0621a2449 NextScopedBlob with well-defined behavior and respect namescope
Summary:
Remove the use of `NextName` in the layer model helper, so that the same function returns a `model_helper` that constructs an identical `Net` when called under the same NameScope.

`NextScopedBlob` should only take effect when there is a real name conflict; otherwise it returns a ScopedBlobReference.

This is critical for parameter blobs. In the long run, we need to be able to specify parameter blobs more explicitly (kennyhorror is working on this). This solution works in the short term for, e.g., two-tower sparse NN models.

Reviewed By: kennyhorror

Differential Revision: D4555423

fbshipit-source-id: 2c4b99a61392e5d51aa878f7346466a8f14be187
2017-02-16 17:16:36 -08:00
James Cross
b436788b16 LSTMUnit: pass through H values
Summary:
Pass the h-value recurrent output through unchanged at each LSTM step beyond the valid part of a sequence (computed from seqLengths, which allows batching sequences of different lengths). This enables using the final-step output of each sequence as the output when one vector is desired for the entire sequence. The gradient is also passed back unchanged.

Also made some cosmetic changes to recurrent_network_test.py (seq_lengths offset corrected; it should be in [1, T] rather than [0, T-1]).

Reviewed By: urikz

Differential Revision: D4540307

fbshipit-source-id: 73a9f6326069d713dcb0cdc8d17869317c6dbe96
2017-02-16 15:31:38 -08:00
Steven Strijakov
5429031917 Adding SoftmaxWithLoss operator to Shape Inference
Summary: This diff adds shape inference for the SoftmaxWithLoss Operator

Differential Revision: D4565835

fbshipit-source-id: 1c2db398524c765977ec4d8a22c9b986bf9faf82
2017-02-16 12:32:51 -08:00
Yury Zemlyanskiy
40534de705 Gradient for Copy operator
Summary:
One can find the reason why I need a gradient for CopyOp in this post: https://fb.facebook.com/groups/1405155842844877/permalink/1639683782725414/

The gradient for CopyOp is trivial when the device is the same (CPU, or the same GPU), but gets a little harder when the copy was made across two different GPUs.
I introduce a new operator, CopyOnDeviceLike, which takes an additional second input. The op copies the first input to the same device as the second one. The default implementation is exactly the same as CopyOp, but I specialize it for CUDAContext.
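
A minimal sketch of the new op (blob names illustrative; on a single CPU this is a plain copy, the interesting case is a cross-GPU copy):

    from caffe2.python import core, workspace
    import numpy as np

    # Copy X to whatever device the second input lives on.
    workspace.FeedBlob("X", np.ones((2, 3), dtype=np.float32))
    workspace.FeedBlob("Y", np.zeros(1, dtype=np.float32))
    workspace.RunOperatorOnce(core.CreateOperator(
        "CopyOnDeviceLike", ["X", "Y"], ["X_on_dev"]))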

Please let me know if I'm doing anything wrong here! This is my first caffe2 diff related to operator definitions.

Reviewed By: Yangqing

Differential Revision: D4557258

fbshipit-source-id: 9494be589cc1e5696bbbfe25b7622aaa4c9efe4a
2017-02-16 06:11:27 -08:00
Tullie Murrell
81d932b161 Add LeakyReluOp to caffe2
Summary: Adds LeakyRelu to caffe2 with a test.
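
A minimal sketch of the op (alpha value illustrative):

    from caffe2.python import core, workspace
    import numpy as np

    # LeakyRelu: y = x for x >= 0, y = alpha * x otherwise.
    workspace.FeedBlob("X", np.array([-2.0, -0.5, 1.0], dtype=np.float32))
    workspace.RunOperatorOnce(core.CreateOperator(
        "LeakyRelu", ["X"], ["Y"], alpha=0.1))
    print(workspace.FetchBlob("Y"))  # [-0.2, -0.05, 1.0]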

Reviewed By: bwasti

Differential Revision: D4511970

fbshipit-source-id: a7189c691ec1813b304bf04f2b73f1c61acd08e2
2017-02-15 16:00:45 -08:00
Aapo Kyrola
50a6897e80 Shape inference for ImageInput, NHWC2NCHW and StopGradient
Summary: As in the headline. I had missed these originally.

Reviewed By: kennyhorror

Differential Revision: D4560255

fbshipit-source-id: e69458e8a2574b981e40e915d87c8e16dadee7d6
2017-02-15 16:00:45 -08:00
James Cross
63901e9aca allow recurrent network gradient op to receive gradient on any combination of network output blobs
Summary:
(Caffe2) Modified the RecurrentNetworkGradient operator so that training is possible with any combination of the output blob(s) receiving gradient during the backward pass. This is realized through a new argument for the RecurrentNetwork op, outputs_with_grads, which takes a list of the indices of the output blobs that will receive gradient. The default (receiving gradient only from the first output blob) is unchanged.

A new unit test covers the case where outputs_with_grads = [1, 2], using the Python LSTM wrapper.

Reviewed By: urikz

Differential Revision: D4518516

fbshipit-source-id: 5c531582b20f3cf727d1aa91239b4d5a2b8a7c1f
2017-02-15 16:00:45 -08:00
Huazhong Ning
cb3c41b9a9 PiecewiseLinearTransformOp transforms binary predictions specially
Summary:
The existing op transforms the input in a general way: it needs M transform mappings to transform an NxM input tensor.
But for binary predictions X (an Nx2 tensor), we know that X[:, 0] = 1 - X[:, 1],
so we just need one mapping for X[:, 1]. After it is transformed, we can compute X[:, 0].
This diff handles that case.
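
A NumPy sketch of the special case (not the op's actual code; f stands for the single piecewise-linear mapping):

    import numpy as np

    def transform_binary(X, f):
        # X is Nx2 with X[:, 0] == 1 - X[:, 1]; only column 1 is mapped,
        # and column 0 is recovered as its complement.
        y1 = f(X[:, 1])
        return np.stack([1.0 - y1, y1], axis=1)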

Differential Revision: D4550441

fbshipit-source-id: 42d8c6e88d830c97628ee930b543740a32acf904
2017-02-15 16:00:44 -08:00
Kittipat Virochsiri
718786add7 UniqueUniformFillOp
Summary: This is like `UniformIntFill`, but it guarantees to return unique elements in the output, excluding the optional elements to avoid.
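
A hedged sketch of the op (argument layout assumed to follow `UniformIntFill`; the optional elements to avoid come in as an extra input):

    from caffe2.python import core, workspace

    # Draw 5 unique int32 samples from [0, 100]. An optional second input
    # can list elements to exclude from sampling (not shown here).
    workspace.RunOperatorOnce(core.CreateOperator(
        "UniqueUniformFill", [], ["out"],
        shape=[5], min=0, max=100, dtype=core.DataType.INT32))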

Reviewed By: xianjiec

Differential Revision: D4511814

fbshipit-source-id: 5dc98ee580616e60e46ee74ebb3f5ddd29a09965
2017-02-15 16:00:44 -08:00
Steven Strijakov
2de4b8840d Added MatMul operator inference
Summary: The MatMul operator now performs shape inference

Differential Revision: D4515770

fbshipit-source-id: 237b527cce306b4858452d430c8ecc8a79537aff
2017-02-14 15:32:14 -08:00
Kittipat Virochsiri
524bc07973 Change the schema of IndexLoad & IndexFreeze so that state change is captured by the framework
Summary: These operators update the state of the instance and therefore should have the instance in the output list.

Reviewed By: xianjiec

Differential Revision: D4554773

fbshipit-source-id: 556d484fcf58878308aa6b0f7cd7ea2446d3f29e
2017-02-14 10:05:12 -08:00
David Truong
60be25f4cd Added shape inference to padding operator for tensors
Summary: Can now infer the shape of the tensor

Differential Revision: D4529339

fbshipit-source-id: 33553611fd3ecd7fde4b7b432c7720255ddda8be
2017-02-13 11:04:13 -08:00
Amy Zhang
5c007be804 add soft label functionality to softmax with loss op
Differential Revision: D4527240

fbshipit-source-id: 548bf943857adb8f198348cc5b17ec52dc65bd2e
2017-02-10 09:01:53 -08:00
Andrew Dye
306fde233a Accept optional blob map for InferShapesAndTypes
Summary:
Shape inference allows Caffe2 to compute shapes of blobs without running a model. Update InferShapesAndTypes() to accept an optional blob:dimensions map so that external input blobs do not need to be part of the workspace.
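
A sketch of the new entry point (names and dimensions illustrative; none of the blobs need to exist in the workspace):

    from caffe2.python import core, workspace

    net = core.Net("sketch")
    net.FC(["data", "w", "b"], "fc1")
    shapes, types = workspace.InferShapesAndTypes(
        [net],
        blob_dimensions={"data": [16, 64], "w": [32, 64], "b": [32]})
    print(shapes.get("fc1"))  # expected: [16, 32]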

InferShapesAndTypes() in workspace.py conditionally calls the ...from_workspace or ...from_map bindings. Note I favored a small amount of code duplication here for the sake of readability. InferShapesAndTypes() in operator.cc has been refactored into mirrored entry points, invoking a common helper.

Other minor changes to address linter warnings.

Reviewed By: dzhulgakov

Differential Revision: D4524873

fbshipit-source-id: 56f863b759c016d7f23523f06fda3aa5bba22357
2017-02-08 15:04:24 -08:00
Steven Strijakov
e6a18d2e9a Added TransposeOp Inference
Summary: TransposeOp shape inference is now implemented

Differential Revision: D4517155

fbshipit-source-id: fb2b11c27231043f87a4c128b0eb3cbb60ab2c0c
2017-02-08 10:29:31 -08:00
Yury Zemlyanskiy
280718b40c Allow non-batched initial recurrent states for RecurrentNetworkOp
Summary: title

Reviewed By: salexspb

Differential Revision: D4493728

fbshipit-source-id: a9ba25bd325b413ed15c35754afb9ed562b1a60c
2017-02-06 15:01:36 -08:00
Aapo Kyrola
dcefc74a0c Shape and Type Inference Part1
Summary:
This is a bit of a large diff, sorry about that. It includes basic shape and type inference functionality, based on YQ's Schema scaffolding. I added some helper functions to make it easier to write simple translations.

Bigger refactoring was needed for ConvPoolBase so that we could use the shape inference already there in the schema.

I annotated enough operators to be able to infer forward-pass shapes for a basic convnet, and added a test for that. I intend to bootcamp some annotations and annotate enough to handle ResNets fully. Need to think about gradients and whether they could be annotated in an easier way.

Only shapes are now exposed to Python; types will follow later. Also, the inference is not yet called anywhere except the unit test.

Also, I am not sure everything is in the best location in the code, but it shouldn't be hard to move stuff around.

Reviewed By: dzhulgakov

Differential Revision: D4436818

fbshipit-source-id: eebee5937ccc9ac09c245465302388a1fae6933c
2017-02-02 22:29:22 -08:00
Alisson Gusatti Azzolini
000c53a7b1 AtomicCounter to return previous value on Reset.
Summary: This allows saving the previous value of the counter and sending it upstream without losing counts.
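
A minimal sketch of the change (blob names illustrative):

    from caffe2.python import core, workspace

    workspace.RunOperatorOnce(core.CreateOperator(
        "CreateCounter", [], ["counter"], init_count=0))
    for _ in range(3):
        workspace.RunOperatorOnce(core.CreateOperator(
            "CountUp", ["counter"], ["prev"]))
    # With this diff, ResetCounter can also emit the pre-reset value:
    workspace.RunOperatorOnce(core.CreateOperator(
        "ResetCounter", ["counter"], ["prev_count"]))
    print(workspace.FetchBlob("prev_count"))  # 3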

Reviewed By: kennyhorror

Differential Revision: D4497854

fbshipit-source-id: 28a7ad0ff1020bde26f78b1f59614b094d1e1881
2017-02-02 14:59:30 -08:00
Alexander Sidorov
b7fa6b2a8b remove recurrent_inputs in favor of recurrent_input_ids
Summary:
I had forgotten to remove this one. The rest of the move to indexing
instead of string names is coming after D4446813 lands, as scratches
aren't inputs or outputs and thus can't be indexed.

Reviewed By: urikz

Differential Revision: D4465748

fbshipit-source-id: 2ccbedfb35541ef4a2231d1480eef59025bd5290
2017-01-31 13:14:33 -08:00
Alexander Sidorov
d019ec793c improve flaky test
Summary: TestWarden was failing on some inputs

Reviewed By: Yangqing

Differential Revision: D4487293

fbshipit-source-id: 3da4b310a619c2b57f033b2dd7727f71403bfd68
2017-01-30 22:14:27 -08:00
Yury Zemlyanskiy
debd256177 Fix for gradient propagation for initial recurrent state for RecurrentNetwork
Summary: looks like we don't do a good job with initial recurrent input gradients yet. Here is a partial fix; the gradient check doesn't pass yet, but the shape is correct now.

Reviewed By: salexspb

Differential Revision: D4475447

fbshipit-source-id: 280f1f59f19e487fd0dce0d440609c50ddce294a
2017-01-30 18:59:32 -08:00
Yury Zemlyanskiy
22e1bdd6d1 Use stack workspaces in RecurrentNetwork
Summary: This diff uses stack workspaces in RecurrentNetwork, which makes it possible to simplify the implementation and get rid of scratches.

Reviewed By: salexspb

Differential Revision: D4446813

fbshipit-source-id: 514eec7e4300bdf492a9cb192b40cf4f89acf656
2017-01-27 11:44:26 -08:00
Vsevolod Oparin
319945df15 Test for FC operator + fix for docs
Summary: Test for FC operator + fix for docs

Differential Revision: D4473293

fbshipit-source-id: 6e6ebad007ee08b05184fda288ab74982c6b2219
2017-01-27 10:44:24 -08:00
Alexander Sidorov
8bff8014b3 print out inputs in lstm test to catch when it is flaky
Summary:
We get flaky LSTM tests on a numerical gradient check. I
would like to improve the accuracy of the latter, but first I need an
example. After landing this, TestWarden will find a bad input for me.

Reviewed By: urikz

Differential Revision: D4467223

fbshipit-source-id: 68d4bf22af11190f39fa28332c6d99efbb192132
2017-01-25 20:59:21 -08:00
Andrew Tulloch
0f870d4f40 Add error checking for too-small input in ConvPoolOpBase
Summary: Fixes segfaults that occur in Eigen and im2col/sgemm backends.

Reviewed By: Yangqing

Differential Revision: D4451772

fbshipit-source-id: 3cf21e5afb2fe300db4228933a82063db5f7091f
2017-01-25 17:44:22 -08:00
Yury Zemlyanskiy
0e3146e1e8 Remove recurrent_sizes from RecurrentNetwork
Summary: Remove the usage of recurrent_sizes, so recurrent states' sizes can depend on the input (e.g., the attention matrix for the beam decoder). I removed recurrent_sizes from the forward and backward steps.

Reviewed By: salexspb

Differential Revision: D4427688

fbshipit-source-id: 580420a294d309c86ec5cb4e677058623b7228e1
2017-01-24 23:14:25 -08:00
Alexander Sidorov
b1472a173a don't hardcode outputs order to work only for lstm + don't pass blob names for parameters
Summary:
In this diff I stop passing parameters by name and also remove the hardcoded output ids that were there specifically for LSTM to work. It also makes it possible to avoid using recurrent_sizes in the backward pass (for the forward pass this was done in D4427688).

Using a similar technique, it should be simple enough to eliminate blob-name passing entirely. Then we can fix scoping. These can be done in a later diff.

Reviewed By: urikz

Differential Revision: D4444614

fbshipit-source-id: 3580a76365502b9f2f09e3d8b7e78084ca739f00
2017-01-24 16:29:23 -08:00
Alexander Sidorov
f09da676d7 CNNModelHelper.LSTM test
Summary:
let's have a test for this so we don't break existing use cases
while iterating on RecurrentOp's code

Reviewed By: urikz

Differential Revision: D4456404

fbshipit-source-id: 79f2b88c1eed16106adf5b793b4c74441c7146c6
2017-01-24 15:59:24 -08:00
Yangqing Jia
e3ea3e8c12 MKL convolution operator
Summary: Closes https://github.com/caffe2/caffe2/pull/102

Differential Revision: D4448886

Pulled By: Yangqing

fbshipit-source-id: 914d11cd79107895a9755154df3526fcf71a31ea
2017-01-23 09:59:30 -08:00
Aapo Kyrola
06398e9bfb softmax-with-loss, gracefully handle cases when total weight is 0
Summary:
Spatial Softmax allows specifying locations that are not counted toward the loss. If none of the locations are counted, this resulted in NaNs and headaches. This diff fixes that by explicitly handling these cases.

+ assertion for label blob dimension(0)

Created a new test as well.

Differential Revision: D4442939

fbshipit-source-id: 8641bfad2a994e517ca3eda39345380a6ca1ba50
2017-01-20 15:29:21 -08:00
Kevin Matzen
6a7dd236fa instance norm
Summary: Added gradient and GPU implementations to the caffe2 InstanceNorm op

Reviewed By: Yangqing

Differential Revision: D4304808

fbshipit-source-id: 6feecaed589ea9f825260a49b39b4260da6e5426
2017-01-20 12:29:28 -08:00
Ahmed Taei
411059d649 Generate huffman tree
Summary:
In this diff:
[1] Change the output from generating all root-to-label paths to a TreeProto.
The TreeProto itself is required by inference, and we can use hsm_util to get
the paths from it.

[2] Fix hsm_util index assignment.

Differential Revision: D4416731

fbshipit-source-id: 657d8b9b4df6fa30c9f92d391cf7e07b5c5db1f8
2017-01-19 16:14:23 -08:00
Ahmed Taei
dd51336611 Fix label start index for HuffmanTreeHierarchyOp
Summary: Change label indices to be in the range [0, num_classes)

Differential Revision: D4416685

fbshipit-source-id: b16ca8539fd538ad62bf1298dbad3f1553956241
2017-01-19 15:14:53 -08:00
Yangqing Jia
91ebfa3c7c Unit test for big batch size avg pooling
Summary: basically copied test_pooling and hard-coded the values

Reviewed By: prigoyal

Differential Revision: D4428162

fbshipit-source-id: 6c0444ac8c21f08824df7ff53999a94967607dc4
2017-01-18 19:29:20 -08:00
Aapo Kyrola
4f1db36cff add CUDA gradient for Div
Summary: DivOp was missing a gradient for CUDA, so I implemented it. Also added an operator test.
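
A minimal sketch of running Div under CUDA, whose gradient this diff adds (device index and blob names illustrative):

    from caffe2.python import core, workspace
    from caffe2.proto import caffe2_pb2
    import numpy as np

    with core.DeviceScope(core.DeviceOption(caffe2_pb2.CUDA, 0)):
        workspace.FeedBlob("X", np.random.rand(4).astype(np.float32) + 1.0)
        workspace.FeedBlob("Y", np.random.rand(4).astype(np.float32) + 1.0)
        workspace.RunOperatorOnce(core.CreateOperator("Div", ["X", "Y"], ["Z"]))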

Differential Revision: D4396638

fbshipit-source-id: 9949e47aa3735bb418a0db003e2b2f4896056a71
2017-01-09 21:59:23 -08:00
Maxime Boucher
e2181a32ca Normalize rank loss gradient to avoid convergence issues when the number of pairs is really large
Summary:
Essentially, when the number of pairs is around 1000, only the positive samples in the list get a massive boost from all the negative examples. This diff normalizes the gradient and the loss by the number of pairs.

This diff also adds protection against NaN and more logging to help debug.
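
A plausible form of the normalization (a sketch, not taken verbatim from the diff): with pair set P, scores s, and pairwise loss \ell,

    L = \frac{1}{|P|} \sum_{(i,j) \in P} \ell(s_i - s_j)

so the gradient reaching each example scales as O(1) rather than O(|P|).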

Reviewed By: kdub0

Differential Revision: D4359782

fbshipit-source-id: 7240344ddb1f2f670d1eec1b03e7f6e413f3dfcc
2016-12-21 17:29:24 -08:00
Yangqing Jia
2c6a579859 Make all convolution operators allow optional bias term
Summary:
It used to be that only the cudnn engine supported it; now it should be
fully supported by any conv engine.

To ignore the bias, simply use a convolution op that has two inputs instead of
three. The gradient operator will automatically figure out that it does not
need to compute the bias gradient.
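
A minimal sketch of a bias-free convolution (shapes illustrative):

    from caffe2.python import core, workspace
    import numpy as np

    # Two inputs (X, W) instead of three (X, W, b) -> no bias term.
    workspace.FeedBlob("X", np.random.randn(1, 3, 8, 8).astype(np.float32))
    workspace.FeedBlob("W", np.random.randn(6, 3, 3, 3).astype(np.float32))
    workspace.RunOperatorOnce(core.CreateOperator(
        "Conv", ["X", "W"], ["Y"], kernel=3, pad=1, stride=1))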

Reviewed By: prigoyal

Differential Revision: D4354183

fbshipit-source-id: cf71b6289a254d15a6a663a85df63fbbaec3702b
2016-12-21 15:14:24 -08:00
Yury Zemlyanskiy
c2d28fb874 RNNs API simplification
Summary:
This is a first step in improving our RNN story. It provides a wrapper around the current RecurrentNetworkOp implementation that infers most of the redundant parameters and makes the API much simpler.

Also, in order to support general step nets, I added an extra argument to the RecurrentNetworkOp.

Future work:

1. Inferring step net output and internal blob (scratch) sizes and types
2. Avoiding accessing blobs by name in the C++ part
3. Removing the requirement for 1:1 input/output correspondence in the step net
4. Making the Python API support networks with operators like Sum on the border of the cell net (currently there is an issue with such networks, where gradient blobs on the side are not explicitly created).

Differential Revision: D4268503

fbshipit-source-id: f8a66491c2b55daa730caeed7e9f2b3921541b49
2016-12-21 09:29:43 -08:00
Yangqing Jia
6abf5c99dc Implement group convolution in the cudnn interface.
Summary:
This is ongoing work: currently the forward pass is implemented, but the backward pass
is yet to be done. We might want a CPU counterpart as well.
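
A sketch of the forward-only usage (shapes illustrative; with group=2, each half of the 8 output channels sees 3 of the 6 input channels, so W would be 8x3x3x3):

    from caffe2.python import core

    op = core.CreateOperator(
        "Conv", ["X", "W", "b"], ["Y"],
        kernel=3, pad=1, group=2, engine="CUDNN")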

I will wait for D4341288 to land and then make bias optional.

Reviewed By: prigoyal

Differential Revision: D4342210

fbshipit-source-id: 51bb0e98d917970bdc040d076b535beb8e994d9a
2016-12-20 13:44:44 -08:00
Maxime Boucher
a03692069e Adjust numerical precision of comparison to make test pass
Summary: see title

Differential Revision: D4351545

fbshipit-source-id: 1cca4552ea8f1051796a85724ba0c136ea38b5ec
2016-12-20 11:30:01 -08:00