Commit graph

1058 commits

Kevin Wilfong
5dba88b40b Caffe2 [easy]: Better exception logging in parallel_workers/data_workers
Summary: Instead of printing the exception with print(), use traceback.print_exc(). This way you get a stack trace.

Reviewed By: jay-mahadeokar

Differential Revision: D5604642

fbshipit-source-id: f8cb67e554305cd2fbed384a4a2040fa2b16e7c0
2017-08-10 15:27:19 -07:00
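A minimal illustration of the difference this commit makes (a hypothetical worker function, not the actual parallel_workers code); `format_exc()` is the string counterpart of `print_exc()`, used here so the result can be inspected:

```python
import traceback


def risky():
    # Stand-in for a fetcher/worker function that fails.
    raise ValueError("boom")


def describe_failure():
    try:
        risky()
    except Exception as e:
        # print(e) would show only "boom" with no context;
        # traceback.format_exc() / print_exc() carries the full
        # stack trace, making clear where the exception originated.
        return traceback.format_exc()


msg = describe_failure()
```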
James Cross
4758bd851b rectify args btw. train and translate
Summary: Make the command-line arguments pertaining to model architecture consistent between train.py and translate.py. Also use the s() scoping function for all intermediate blobs in attention.py (this is for compatibility with multi-headed attention).

Differential Revision: D5594312

fbshipit-source-id: cadf51d854b5a9174ec913f32c655be2abf111e5
2017-08-10 15:27:18 -07:00
Christopher Hay
f2dfb40302 Added amplitude argument to SinusoidPositionEncodingOp
Summary: In order to control the absolute scale/magnitude of the output of this op, added a tuning parameter: amplitude

Reviewed By: jamesr66a

Differential Revision: D5596574

fbshipit-source-id: 3b7e316de55cce6fd686da70aa5658ec3e99b070
2017-08-10 15:27:17 -07:00
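A numpy sketch of what a sinusoidal position encoding with an amplitude factor computes (the encoding from arXiv:1706.03762, section 3.5, scaled by `amplitude`; the function name and signature here are hypothetical, not the actual op's API):

```python
import numpy as np


def sinusoid_position_encoding(positions, embedding_size, amplitude=1.0):
    """Sinusoidal position encoding with an overall amplitude scale.

    Even columns hold sin(pos / 10000^(2i/d)), odd columns the matching
    cos; `amplitude` scales the whole output, controlling its magnitude.
    """
    positions = np.asarray(positions, dtype=np.float64)
    i = np.arange(embedding_size // 2, dtype=np.float64)
    rates = 1.0 / (10000.0 ** (2.0 * i / embedding_size))
    angles = positions[:, None] * rates[None, :]
    enc = np.empty((len(positions), embedding_size))
    enc[:, 0::2] = np.sin(angles)
    enc[:, 1::2] = np.cos(angles)
    return amplitude * enc


enc = sinusoid_position_encoding(range(4), 8, amplitude=2.0)
```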
Ahmed Taei
5bb1e6b817 Allow passing unsymmetric 2d kernels to brew.conv.
Reviewed By: jay-mahadeokar

Differential Revision: D5598523

fbshipit-source-id: 47135a8562f7c720badb2be677cb79730dc417a0
2017-08-10 15:27:16 -07:00
Kittipat Virochsiri
eb85258beb CreateMapOp
Summary: Add operator to create empty map

Reviewed By: xianjiec

Differential Revision: D5454652

fbshipit-source-id: ecad6cc58572b378962af08cf02063ef546ed58f
2017-08-09 13:32:19 -07:00
Tao Wu
7b86a34610 modify _LSTM into _RNN to adapt GRU
Summary: GRU differs from LSTM in that it has only hidden states and no cell states. Reusing the code of _LSTM is therefore problematic: we need to delete the part that creates the cell state, and change many other places that use a hard-coded 4 (hidden_all, hidden, cell_all, cell) to 2 (hidden_all, hidden). Otherwise GRU breaks during the backward pass, when the optimizer tries to apply gradients to each of the parameters: the cell state is never used, so there are no gradients for the corresponding parameters (i.e., cell_state_w, cell_state_b).

Differential Revision: D5589309

fbshipit-source-id: f5af67dfe0842acd68223f6da3e96a81639e8049
2017-08-09 13:24:45 -07:00
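The structural difference driving this refactor can be seen in a toy numpy sketch (standard textbook cells, not the Caffe2 implementation): an LSTM step carries two recurrent states while a GRU step carries only one, so code that hard-codes four LSTM state blobs cannot be reused unchanged.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h, c, W):
    """Toy LSTM step: returns (hidden, cell) -- two recurrent states."""
    z = np.concatenate([x, h]) @ W            # all four gates at once
    i, f, o, g = np.split(z, 4)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new


def gru_step(x, h, Wg, Wc):
    """Toy GRU step: returns only the hidden state -- no cell state."""
    z, r = np.split(sigmoid(np.concatenate([x, h]) @ Wg), 2)
    h_tilde = np.tanh(np.concatenate([x, r * h]) @ Wc)
    return (1 - z) * h + z * h_tilde


rng = np.random.default_rng(0)
x, h, c = rng.normal(size=4), np.zeros(3), np.zeros(3)
h1, c1 = lstm_step(x, h, c, rng.normal(size=(7, 12)))
h2 = gru_step(x, h, rng.normal(size=(7, 6)), rng.normal(size=(7, 3)))
```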
Aaron Markham
784ba07bf3 updated downloader to use s3 url without a redirect via the vanity url
Summary:
Model downloader was broken after the move on s3 to the vanity url, download.caffe2.ai. Using this as the url base hits a redirect, and will result in the script throwing a 403 error.  Rather than upgrading to urllib2 or putting in a bunch of code to handle a redirect on urllib, we can just use the non-vanity base url.
Closes https://github.com/caffe2/caffe2/pull/1020

Reviewed By: Yangqing

Differential Revision: D5568686

Pulled By: aaronmarkham

fbshipit-source-id: d88a6b3e1b7955835fc03b036dc54dec48316e7f
2017-08-09 12:25:30 -07:00
Junjie Bai
1ce95090ca Add support for specifying engine preferences
Reviewed By: Yangqing

Differential Revision: D5460994

fbshipit-source-id: 08a8af699eebec37defc070389a8415b3e81ac16
2017-08-09 00:47:18 -07:00
Priya Goyal
5c77cc8182 Exposing num_workers as parameter and enable recycling activations
Summary: as promised, a separate diff for dpm changes I made in experimental code

Reviewed By: pietern

Differential Revision: D5551304

fbshipit-source-id: 9013aeab6c388b1c415ffb2e36fb8dd6b8cf90b0
2017-08-08 19:48:41 -07:00
Andrei Chtcherbatchenko
a2204f0b1e Caffe2: Write CUDA version of OneHot operator
Summary: This diff implements CUDA version of OneHot operator.

Reviewed By: bddppq

Differential Revision: D5578543

fbshipit-source-id: 55b70e8ec6ee34b647b9140fecbba31b6968f403
2017-08-08 18:17:39 -07:00
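Functionally, OneHot maps integer indices to rows of an identity-like matrix; a numpy equivalent of what the CPU/CUDA versions compute (a sketch of the semantics, not the op's actual interface):

```python
import numpy as np


def one_hot(indices, index_size):
    """Return a (len(indices), index_size) float matrix with a single
    1.0 per row, at the position given by the corresponding index."""
    out = np.zeros((len(indices), index_size), dtype=np.float32)
    out[np.arange(len(indices)), indices] = 1.0
    return out


encoded = one_hot([2, 0, 1], index_size=4)
```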
Long Jin
ef64a4f6b2 Add conv layer and layer tests
Reviewed By: xianjiec

Differential Revision: D5569206

fbshipit-source-id: ed836315f3ee4d7983da94f2633a3085fe99194d
2017-08-08 10:57:43 -07:00
Jianlong Zhong
152d2ae3a8 Implement CUDA version of GRU operator
Summary: Add CUDA version of GRU operator

Reviewed By: jamesr66a

Differential Revision: D5571043

fbshipit-source-id: 332aa64fc8a9116cc33382f2b2907080e58c13b3
2017-08-08 10:57:40 -07:00
James Cross
9fcf676cfa testing for open-source seq2seq
Summary:
Fix multilayer inference in the Caffe2 example seq2seq code. (Rely on LSTMWithAttentionDecoder.apply rather than fixed state indices to determine the stepwise decoder output.)

Also assorted updates to bring the code in line with changes elsewhere in the codebase, and added unit tests ensuring that the training and inference networks generate the same loss, which should make these problems much easier to identify in the future.

Reviewed By: jamesr66a

Differential Revision: D5579803

fbshipit-source-id: 6e0f27340d981990ab8d0da58e63793222e7be87
2017-08-08 10:09:41 -07:00
Chonglin Sun
8ad382df3c implement LengthsTopK operator
Summary:
It was previously reverted because the gradient op lacked a schema. Added it back and resent.

Differences between this diff and the previously reverted one:
1. added a schema for the gradient operator
2. changed line 95 in kmax_pooling_op.h from CAFFE_ENFORCE to CAFFE_ENFORCE_GE

Reviewed By: xianjiec

Differential Revision: D5568867

fbshipit-source-id: 39813b389a5da803967a561249793afdfce00c58
2017-08-07 18:19:29 -07:00
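In numpy terms, LengthsTopK takes the top-k values within each segment of a flat tensor, where segments are given by a lengths vector. A sketch under that reading (zero-padding for short segments is an assumption here, not confirmed by the commit):

```python
import numpy as np


def lengths_top_k(values, lengths, k):
    """For each segment of `values` (segment sizes given by `lengths`),
    return its k largest values in descending order, zero-padded when a
    segment has fewer than k elements (padding behavior assumed)."""
    out, offset = [], 0
    for n in lengths:
        seg = np.sort(values[offset:offset + n])[::-1][:k]
        out.append(np.pad(seg, (0, k - len(seg))))
        offset += n
    return np.stack(out)


top = lengths_top_k(np.array([3., 1., 2., 9., 5.]), lengths=[3, 2], k=2)
```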
Ahmed Taei
8af625ede2 Implement gradients for Col2Im and Im2Col operators
Reviewed By: jay-mahadeokar

Differential Revision: D5576385

fbshipit-source-id: a0ca4f704fd861f7cc67079041b1d0772fc66920
2017-08-07 15:51:30 -07:00
Ahmed Taei
647f35e742 Fix SyncAllParamsDistributed for Python 3x
Summary:
In Python 3.x, dictionary values are not a list and can't be concatenated to a list;
this diff fixes that.

Reviewed By: andrewwdye

Differential Revision: D5576724

fbshipit-source-id: c60441857ceceb9c4a71122d2db5e9abad6d3fc2
2017-08-07 14:23:32 -07:00
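The underlying Python 3 pitfall, in isolation: `dict.values()` returns a view, not a list, so `+` concatenation fails until it is wrapped in `list()` (a generic sketch with made-up names, not the SyncAllParamsDistributed code):

```python
params = {"fc_w": 1, "fc_b": 2}
extra = [3]

# Python 2: params.values() was a list, so `params.values() + extra` worked.
# Python 3: it is a dict_values view, and `view + list` raises TypeError.
try:
    merged = params.values() + extra
except TypeError:
    merged = list(params.values()) + extra
```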
Ben Zhang
42fb87d0b1 L1Distance Row-wise, instead of cumulative
Summary:
The L1Distance operator used to return a single value, the L1 distance of the entire input, instead of one value per input row.

This fixes that.

Reviewed By: Yangqing

Differential Revision: D5570385

fbshipit-source-id: fbab0e0c9262ccbdb3af27262b8baacdeb2d0fc9
2017-08-07 14:09:25 -07:00
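Numerically, the change means the op now returns one distance per row rather than a single scalar over the whole input (numpy sketch of the two behaviors):

```python
import numpy as np

x = np.array([[1., 2.], [3., 4.]])
y = np.array([[1., 0.], [0., 4.]])

cumulative_l1 = np.abs(x - y).sum()        # old behavior: one scalar
rowwise_l1 = np.abs(x - y).sum(axis=1)     # new behavior: one value per row
```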
Jacqueline Xu
a1bf14d8e6 Building new randomized sparse nn model
Summary: New hybrid randomized sparse nn, which allows layers of sparse NN model to be randomized, semi-random, or learnable

Reviewed By: chocjy

Differential Revision: D5416489

fbshipit-source-id: eb8640ddf463865097ba054b9f8d63da7403024d
2017-08-07 12:48:58 -07:00
Zhicheng Yan
e7192c3b91 image_input_op_dense_multi_label
Summary:
To train an image model, we can also use a label embedding vector as supervision, as opposed to SoftmaxLoss/SigmoidCrossEntropyLoss.
In such cases, the label is a dense vector. This diff enables such use cases.

Reviewed By: panshen1

Differential Revision: D5556203

fbshipit-source-id: 52c61495e02fab457dc2d43e3345d7dbd5580ab7
2017-08-07 12:38:16 -07:00
Kevin Wilfong
d072701547 Caffe2: Refactor the core logic from data_workers.py into parallel_workers.py
Summary:
data_workers.py provides a really nice, easy way to run background threads for data input. Unfortunately, it's restrictive: the output of the fetcher function has to be a numpy array.

I pulled that core thread management out into parallel_workers, and updated the classes in data_workers to extend those classes. The main change was refactoring most of the queue handling logic out into QueueManager.

This way parallel_workers can be used to manage background threads without having to use the queue for output.

Reviewed By: akyrola

Differential Revision: D5538626

fbshipit-source-id: f382cc43f800ff90840582a378dc9b86ac05b613
2017-08-07 10:14:08 -07:00
Yangqing Jia
cc2c4d07d6 Always use assertAlmostEqual for floats when crossing python and C boundaries
Summary:
This fixes travis numerical issue.
Closes https://github.com/caffe2/caffe2/pull/1024

Differential Revision: D5571340

Pulled By: Yangqing

fbshipit-source-id: 097e6f91da68cc3eacf21fe109f342e0dddea189
2017-08-06 14:51:11 -07:00
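The rationale: values round-tripped through the C++ side (e.g. float32 tensors) rarely compare exactly equal to Python floats, so exact assertions are flaky. A self-contained illustration of why `assertAlmostEqual` is the right tool:

```python
import unittest


class FloatComparison(unittest.TestCase):
    def test_almost_equal(self):
        # 0.1 + 0.2 != 0.3 exactly in binary floating point, and
        # float32 round-trips introduce similar tiny errors.
        self.assertNotEqual(0.1 + 0.2, 0.3)
        self.assertAlmostEqual(0.1 + 0.2, 0.3, places=7)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(FloatComparison)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```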
Juan Miguel Pino
4d8a8c2e1e Implement dot attention
Summary:
Implement dot attention as described in https://arxiv.org/abs/1508.04025
This saves the computation of weighted encoder outputs in `rnn_cell.py`
When the encoder and decoder dimensions are different, we apply an FC, which corresponds to the general case below Figure 2.
Refactored unit tests.

Reviewed By: jhcross

Differential Revision: D5486976

fbshipit-source-id: f9e9aea675b3b072fbe631bc004199b90a9d95cb
2017-08-06 11:50:16 -07:00
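A numpy sketch of dot attention in the sense of arXiv:1508.04025: the score is a plain dot product between the decoder state and each encoder output, so no weighted-encoder-outputs precomputation is needed; the optional projection matrix stands in for the FC applied when dimensions differ (shapes and names here are illustrative, not the rnn_cell.py API):

```python
import numpy as np


def dot_attention(decoder_state, encoder_outputs, proj=None):
    """decoder_state: (d,); encoder_outputs: (T, e).
    `proj` is a (d, e) matrix standing in for the FC used when d != e."""
    query = decoder_state if proj is None else decoder_state @ proj
    scores = encoder_outputs @ query           # (T,) dot-product scores
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                   # softmax over time steps
    context = weights @ encoder_outputs        # (e,) attention context
    return context, weights


enc = np.array([[1., 0.], [0., 1.], [1., 1.]])
ctx, w = dot_attention(np.array([10., 0.]), enc)
```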
Jerry Pan
fac241bcbc Caffe2: add a DB that's wrapped around a BlobsQueue as an adapter for data from non-DB interface
Summary:
Caffe2: add a DB that's wrapped around a BlobsQueue as an adapter for data from non-DB interface.

This is useful for bridging the gap between DB interface data processing ops (TensorProtosDBInput, ImageInputOp etc.) and data that's coming from arbitrary Python or the pretty intricate Hive reader.

Reviewed By: akyrola

Differential Revision: D5554560

fbshipit-source-id: 01bb0056410f9ade205367d5fefc721f91f5b629
2017-08-06 11:50:14 -07:00
Szymon Piechowicz
12f25c8106 Revert D5545533: [pairatt] implement kMaxPooling operator
Summary:
This reverts commit 8378caaac528a71c154067168787ed493bfb0d37

bypass-lint

Differential Revision: D5545533

fbshipit-source-id: a8d9db807f5b22461b21b7589886cf54861e3757
2017-08-04 01:33:29 -07:00
Jiyan Yang
4b80ff89e2 Use softsign op for s=0 in arc-cosine feature map
Summary:
The current implementation for s=0 doesn't support the backward pass.
Switching to using the pow op instead as a temporary solution.

Reviewed By: jackielxu

Differential Revision: D5551742

fbshipit-source-id: 33db18325b3166d60933284ca1c4e2f88675c3d3
2017-08-03 23:35:11 -07:00
Pieter Noordhuis
d177846dbf Add prefix argument to FileStoreHandler
Summary:
This brings it up to par with how the RedisStoreHandler
works. The store handler configuration does not have to change and
only the run ID parameter changes across runs.

This was inconsistent and came up in https://github.com/caffe2/caffe2/issues/984.

Reviewed By: Yangqing

Differential Revision: D5539299

fbshipit-source-id: 3b5f31c6549b46c24bbd70ebc0bec150eac8b76c
2017-08-03 10:37:26 -07:00
Yiming Wu
8e1ecb1cfd async sparse length sum op
Summary:
This diff makes SparseLengthsSum(Gradient) async. It involves the following changes:

1. Adding INDICES to the gradient op's inputs so that we can make it async without device-host copies.
2. Registering the new 3-input op as the gradient for the CPU/GPU versions of SLS.
3. To avoid breaking old nets (they are mostly on CPU), I still register the old 2-input op, so the op schema will not complain when it encounters old nets that have the SLSGradient op in them.

wickedfoo  Sorry, this diff might bring you extra work migrating your optimization effort to this new async gradient op. But we think it is worth it. :(

Reviewed By: dzhulgakov

Differential Revision: D5423188

fbshipit-source-id: 62494a6c52a507c4a4688d5a9e1a2bc720d5370d
2017-08-03 03:04:15 -07:00
Christopher Hay
a4e6ca6956 Added Sinusoidal Position Encoding Op
Summary: Added a caffe2 operator to calculate the sinusoidal position encoding for word embeddings, as described on page 6 of https://arxiv.org/abs/1706.03762.

Reviewed By: jamesr66a

Differential Revision: D5533024

fbshipit-source-id: 1afb35cd7f9d8c71f2635b853e56b2c840f0bc1f
2017-08-03 01:46:46 -07:00
Chonglin Sun
4a8545e3c6 implement kMaxPooling operator
Summary: used by attention model

Differential Revision: D5545533

fbshipit-source-id: 8378caaac528a71c154067168787ed493bfb0d37
2017-08-03 00:48:34 -07:00
Wenlin Chen
adc5510ecb dynamic embedding
Summary: refactor get_categorical_limit

Reviewed By: xianjiec

Differential Revision: D5459389

fbshipit-source-id: 14a7e07394db52fb090c6923e341c34576fcb6d6
2017-08-03 00:33:18 -07:00
Jiyan Yang
a8695178aa Adding parameter sharing API to Dper2
Summary:
To achieve this, I modified the blob naming scheme defined in a layer.
Before, it was scope/fc_w and scope/fc_w_auto_0 (if there was another fc
    within the same scope).
Now I change it to scope/fc/w and scope/fc_auto_0/w.
That is, we rely on the uniqueness of the scoped layer name to define
names for blobs.

I also overwrote the create_param method in LayerModelHelper to let it
use the resolved name for blobs given the parameter-sharing context.

There are some details, such as making the initializer more structured,
that I still need to finalize.

Reviewed By: kennyhorror

Differential Revision: D5435132

fbshipit-source-id: a0525f5ea0977e255dd5ea765b38913f5951d455
2017-08-03 00:33:18 -07:00
Honghao Wei
cb1dd21280 adding operator lp_norm to support calculating l1 norm and l2 norm
Summary: Implement the LpNorm operator, which calculates the Lp norm of a tensor for regularization (p=1 or 2). Currently there is only the L1Distance operator, which calculates the L1 distance of two same-shape tensors. We want an op that takes only one input and outputs the L1 loss, and likewise for the L2 loss. We also plan to implement an l_{p,q} loss, but have not decided which p and q to take.

Reviewed By: xianjiec

Differential Revision: D5460051

fbshipit-source-id: d67a38fbc94afa52de26d4a53e4d2b7df3c50b6a
2017-08-02 15:09:08 -07:00
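In numpy terms, the new op reduces a single tensor to its L1 or L2 norm (a sketch of the semantics; whether the op returns the L2 norm or its square is not stated in the commit, so the plain norm is assumed here):

```python
import numpy as np


def lp_norm(x, p):
    """L1 norm (sum of absolute values) or L2 norm of a tensor.
    Assumes the plain L2 norm; the actual op may differ."""
    x = np.asarray(x, dtype=np.float64)
    if p == 1:
        return np.abs(x).sum()
    if p == 2:
        return np.sqrt((x * x).sum())
    raise ValueError("only p=1 and p=2 are supported")


l1 = lp_norm([3.0, -4.0], p=1)
l2 = lp_norm([3.0, -4.0], p=2)
```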
Simon Layton
ded2a5899e Option to set BN scale and bias initial values
Summary:
Necessary to reproduce setup from 1-hour imagenet paper
Closes https://github.com/caffe2/caffe2/pull/995

Differential Revision: D5547666

Pulled By: akyrola

fbshipit-source-id: cbd4396888b02f32c67e1fe7e53636329de64f1b
2017-08-02 11:38:57 -07:00
Aapo Kyrola
ab42a95b6f fast path for CUDNN global average pooling
Summary:
KaimingHe debugged a slow model and found that global average pooling was hideously slow, even with CUDNN. It turns out the CUDNN pooling op (especially the backward pass) is not optimized for global pooling.

This adds a fast path for global average pooling with NCHW. It is about 30x faster than CUDNN with 56 x 56 pooling, and about 3x faster than the equivalent ReduceBackSum.

I will bootcamp the max pooling.

Reviewed By: asaadaldien

Differential Revision: D5533059

fbshipit-source-id: 2d590693d737fa92184603663031d96f6145f304
2017-08-02 11:10:10 -07:00
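For NCHW input, global average pooling collapses each feature map to a single mean, so a fast path can reduce to one mean per (n, c) pair instead of running a general pooling kernel (numpy sketch of the equivalence):

```python
import numpy as np

x = np.arange(2 * 3 * 4 * 4, dtype=np.float64).reshape(2, 3, 4, 4)  # NCHW

# General pooling with a kernel covering the whole 4x4 spatial extent
# is just a per-(n, c) mean, which is what the fast path exploits.
pooled = x.mean(axis=(2, 3), keepdims=True)   # shape (2, 3, 1, 1)
```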
Alisson Gusatti Azzolini
0fc2bf26b4 Option to enforce batch size
Summary: This will throw away a few examples. It is desirable to keep the batch size constant for fully synchronous data-parallel training.

Reviewed By: dzhulgakov

Differential Revision: D5531788

fbshipit-source-id: e19385401155e731cfc5b25e8e9ea7c16c19d478
2017-08-01 22:29:55 -07:00
Yan Shang
c662480ea6 Return empty Struct when get_field has empty input
Summary:
Currently, `from_column_list` throws errors if the input col_names=[]. To solve
this, we fix the get_field function so that it creates an empty Struct when an
empty col_names is given.

Reviewed By: kittipatv

Differential Revision: D5543865

fbshipit-source-id: f6dfa25326e355f8ec24e5542761851a276beeb9
2017-08-01 19:49:47 -07:00
Junjie Bai
0c7ee02c37 Add CUDA implementation of BooleanUnmask and fixed some bugs in the test
Reviewed By: akyrola

Differential Revision: D5405606

fbshipit-source-id: fd755ee2ec3d742597f7f5500f54caa396db4da4
2017-08-01 16:51:40 -07:00
Ben Zhang
6314c1fc15 Transforms in Python
Summary: Allow the use of apply_transform() in the python API

Reviewed By: bwasti

Differential Revision: D5530483

fbshipit-source-id: 61a6d36fe125c89629fdeea040a717c453d84417
2017-08-01 16:51:38 -07:00
Thomas Dudziak
676bedd298 Fixes for Python 3 in caffe2/caffe2/fb/data
Summary: As title

Reviewed By: MisterTea

Differential Revision: D5532387

fbshipit-source-id: 0a51ca40b93cc2eb5371f0b86f2800354cd1939c
2017-08-01 15:22:55 -07:00
Kevin Wilfong
60cb55461e Caffe2: Support additional outputs in ImageInputOp
Summary: This allows users to add an arbitrary number of additional outputs to ImageInputOp. These are populated by reading additional TensorProto values from the TensorProtos provided by the DBReader and converting them into Tensors. As with labels, only ints and floats are supported, and multiple values are supported.

Reviewed By: panshen1

Differential Revision: D5502019

fbshipit-source-id: 5a8b61b3a8549272a112e8e02cd613d8f9a271ba
2017-08-01 14:36:05 -07:00
Bram Wasti
3a99698734 include numpy's other 32bit int type
Summary: forgot one :)

Reviewed By: akyrola

Differential Revision: D5534905

fbshipit-source-id: a0e58ca3922ec80f526f7586931ff3da8e9bcffc
2017-08-01 13:53:11 -07:00
Tao Wu
5d304a3b49 add gradient for SparseToDenseMask operator
Summary: add gradient for SparseToDenseMask operator

Reviewed By: kittipatv

Differential Revision: D5320792

fbshipit-source-id: 8ee7f1c87e8270ad6077ed197ce9512524069b59
2017-08-01 13:05:03 -07:00
Alisson Gusatti Azzolini
1968e03486 net_printer.to_string() accepts NetDef
Summary: Title.

Reviewed By: kennyhorror

Differential Revision: D5531925

fbshipit-source-id: 8f8961e6ab14d49720f74ec01c197ba9cc3e33ce
2017-08-01 10:17:29 -07:00
Szymon Piechowicz
3324db447f Caffe2: allow nets that don't use all input in net.ClonePartial
Summary: Caffe2: allow nets that don't use all input in net.ClonePartial

Differential Revision: D5535564

fbshipit-source-id: 0ec8fb3ade4d7d6cd4a702c9c265d9c77f27a627
2017-08-01 10:05:46 -07:00
Aapo Kyrola
e38015756a shape inference for Squeeze
Summary: Add a tensor inference function for Squeeze; refactor a bit.

Reviewed By: asaadaldien

Differential Revision: D5518880

fbshipit-source-id: 5b8cb9154f5f777d4be3612a96d7ed76a9068c0c
2017-07-31 16:04:24 -07:00
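Shape inference for Squeeze is purely static: the output dims are the input dims with the listed size-1 axes removed. A numpy analogue of the rule, checked against `np.squeeze`:

```python
import numpy as np


def squeeze_shape(shape, dims):
    """Infer the output shape of Squeeze: drop the given axes,
    each of which must have size 1."""
    assert all(shape[d] == 1 for d in dims), "squeezed dims must be size 1"
    return tuple(s for i, s in enumerate(shape) if i not in set(dims))


inferred = squeeze_shape((2, 1, 3, 1), dims=[1, 3])
actual = np.squeeze(np.zeros((2, 1, 3, 1)), axis=(1, 3)).shape
```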
Xiaolong Wang
82adbde878 pass layer_parameter shape to ps builder if it cannot be inferred from initializer
Summary:
Feed team uses distributed training and wants to also use transfer learning.

Currently, transfer learning is implemented by overwriting the layer parameter
initializer. Therefore, the PS builder can't correctly infer the parameter shape.

To fix this, add a field 'shape' in `layer_parameter` and set the shape if we
overwrite its initializer.

We also enforce a check of the parameter shape between the original initializer
and the loaded blob. (This adds extra cost.)

Differential Revision: D5520541

fbshipit-source-id: 80547dbd328b3f6cbfcea0b2daaf4004703dfe81
2017-07-31 16:04:23 -07:00
James Cross
8c65b5ab34 multilayer seq2seq
Summary: Several refinements to seq2seq example code, including support for multilayer LSTM.

Reviewed By: jamesr66a

Differential Revision: D5460372

fbshipit-source-id: d2eabf6aa9a5b5df7bbc341fd99c4e7d8322e717
2017-07-31 12:27:51 -07:00
Aapo Kyrola
8079abbaf1 fix traversal order
Summary: Memonger did not properly track the number of times a blob output has to be produced before an operator can be visited. Actually, I remember fixing this before, but well. This bug manifested in Priya's model (thanks prigoyal), and benz's model verifier nicely caught the wrong output.

Reviewed By: asaadaldien

Differential Revision: D5524912

fbshipit-source-id: 10f4d7056b84aba0274a918af508ea043e6026f9
2017-07-30 21:47:48 -07:00
Mingda Li
e3c45206ec Add a method to run a train net multiple times in layer_test_util.py
Summary: This method runs a train net multiple times, which enables testing layers with iteration-dependent behavior.

Differential Revision: D5493750

fbshipit-source-id: a7fb967a66f799aaf82acfadc4ecf66e0744da20
2017-07-28 19:56:05 -07:00
Aapo Kyrola
84b9d267dc add warnings about slow data input
Summary: One of my workflows was stuck because everstore/hive data input was experiencing networking issues (No route to host, etc.), but it was hard to tell this was happening because the errors were logged to stdout. Anyway, added simple logging to warn when the data workers' enqueue thread has not received new data for over 10 seconds.

Reviewed By: panshen1

Differential Revision: D5522816

fbshipit-source-id: a036c4afdfbbafea130a4251c1ca02c138d19a83
2017-07-28 18:21:42 -07:00