Commit graph

1444 commits

Author SHA1 Message Date
Martin Schatz
f233a3ebd8 Explicitly set default data type in seq2seq/translate.py
Summary: The word_rewards data type is mixed: ConstantFill assigns long, but the blob is later filled with float32. This causes issues when running the net from the outputted protobuf. This change makes the data type float32 for the lifetime of the blob.

Reviewed By: jhcross

Differential Revision: D6486723

fbshipit-source-id: c4ce5185a0a6d71b08b1819f2355e9354823b701
2017-12-07 11:21:01 -08:00
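A minimal sketch of pinning the dtype explicitly, assuming the usual Caffe2 Python API (the blob name and shape here are illustrative, not the actual seq2seq code):

```python
from caffe2.python import core

net = core.Net("decode")
# Without an explicit dtype, a ConstantFill that later gets overwritten
# with float32 data can leave the blob declared as a long tensor.
word_rewards = net.ConstantFill(
    [],
    "word_rewards",
    shape=[10],                 # illustrative shape
    value=0.0,
    dtype=core.DataType.FLOAT,  # keep the blob float32 for its lifetime
)
```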
James Reed
dc47319074 Implement AssertOp
Summary:
This can be used for testing and debugging. zdevito and I will primarily use this for our caffe2 script project.
Closes https://github.com/caffe2/caffe2/pull/1585

Reviewed By: zdevito

Differential Revision: D6501209

Pulled By: jamesr66a

fbshipit-source-id: fdd65e422c44b74bb6926320af506dcae13327f3
2017-12-06 17:18:52 -08:00
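A rough usage sketch, assuming the operator is registered as `Assert` and accepts an `error_msg` argument (both inferred from the commit title, not verified against the PR):

```python
import numpy as np
from caffe2.python import core, workspace

# The op should fail the run when any element of its input is falsy,
# which makes it handy for testing and debugging nets.
workspace.FeedBlob("all_finite", np.array([True], dtype=np.bool_))
workspace.RunOperatorOnce(
    core.CreateOperator("Assert", ["all_finite"], [], error_msg="check failed")
)
```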
Ilia Cherniavskii
79ac146808 Add if and while ops to brew
Summary:
Adding if and while control ops to brew, along with unit tests.
Note: unlike net_builder, where we can figure out which blobs are external and which are local to subnets, in brew we need to use the external_blobs param explicitly to point at external blobs.

Reviewed By: harouwu

Differential Revision: D6440508

fbshipit-source-id: c920f0af84b77ccb2d8462ffc7567bb1908c844a
2017-12-05 17:33:34 -08:00
Ahmed Taei
d1d6c0b12b Add CUDA implementation for ReplaceNaNOp
Reviewed By: jay-mahadeokar

Differential Revision: D6481993

fbshipit-source-id: cb253621795bb9de73d3e8bc1c8fc21b596d88c3
2017-12-05 13:34:51 -08:00
Simon Layton
a8250280bb Py3 test fixes
Summary:
\cc pietern
Closes https://github.com/caffe2/caffe2/pull/1555

Differential Revision: D6479902

Pulled By: pietern

fbshipit-source-id: 84647eddec45620b1ed603f4882ded2dd49adc43
2017-12-05 10:34:41 -08:00
James Reed
ea56e0d424 Implement BatchMatMul with Numpy-style batch broadcast semantics
Summary:
ONNX has decided to implement a single MatMul operator that borrows semantics from np.matmul: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.matmul.html

This PR introduces a new op that we can target for ONNX that mimics the numpy-style broadcast semantics
Closes https://github.com/caffe2/caffe2/pull/1507

Reviewed By: dzhulgakov

Differential Revision: D6389022

Pulled By: jamesr66a

fbshipit-source-id: a2270ad0042b1ddf6c65ba7cb10d83e0763cf950
2017-12-05 10:34:35 -08:00
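The numpy-style semantics the op mimics can be seen directly with np.matmul, where leading (batch) dimensions broadcast and only the last two dimensions are treated as matrices:

```python
import numpy as np

a = np.random.rand(4, 1, 3, 5)  # batch dims (4, 1), matrix dims (3, 5)
b = np.random.rand(1, 6, 5, 2)  # batch dims (1, 6), matrix dims (5, 2)
c = np.matmul(a, b)
print(c.shape)  # (4, 6, 3, 2): batch dims broadcast, matrix dims contract
```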
Hassan Eslami
8da31c240d Revert changes in blob names in optimizer
Summary: A while ago, we had to change some blob names in `optimizer.py` (specifically, the names of `iteration_mutex` and `optimizer_iteration`) to handle corner cases when preparing a net for parallel execution. This diff reverts those name changes.

Reviewed By: azzolini

Differential Revision: D6480819

fbshipit-source-id: a03a7aa9fad322a50e7785914b0eb0f8654e6d90
2017-12-04 19:32:45 -08:00
Jesse Hellemn
32500fe800 Reducing array sizes used in pack_ops_test to prevent timeouts during Travis CI builds
Summary:
Reduced the array sizes used in pack_ops_test to prevent timeouts
during Travis CI builds.

Reviewed By: enosair

Differential Revision: D6476703

fbshipit-source-id: 20ab871ae40349ca27186447a84135bbc5c351b1
2017-12-04 12:48:53 -08:00
Peter Goldsborough
540a9c279e Add LayerNormLSTM
Summary:
Adds a new `LSTMCell` subclass to the `rnn_cell` module that performs layer normalization on the fused input matrix. Moves around some code in `rnn_cell.py` to avoid copy-pasta. Adds relevant test cases to `rnn_cell_test.py`.

Had to fix `brew.layer_norm` first. See T24013870.

Reviewed By: jhcross

Differential Revision: D6454883

fbshipit-source-id: 0f4ea7a778cc5be6a7274f7b28c793f5dd7c6095
2017-12-04 10:48:37 -08:00
Pieter Noordhuis
e1e08d631a Always check cuDNN support in test_convolution_gradients
Summary:
Regardless of the device checker/gradient checker, we cannot run a
backward pass with cuDNN when NHWC is used.
Closes https://github.com/caffe2/caffe2/pull/1566

Differential Revision: D6474181

Pulled By: pietern

fbshipit-source-id: 727d7b4f2a1431a4d6675ffb76c5b60d3d7fa712
2017-12-04 08:50:39 -08:00
Pieter Noordhuis
41897e3e78 Suppress hypothesis health check in glu_op_test.py
Summary: Closes https://github.com/caffe2/caffe2/pull/1564

Differential Revision: D6472568

Pulled By: pietern

fbshipit-source-id: 4f1bd3a1ced6d77991531eb864d2cf5d39bc7c4f
2017-12-03 22:51:46 -08:00
Pieter Noordhuis
1351152362 Skip DeviceShiftTest if host has < 4 GPU devices
Summary: Closes https://github.com/caffe2/caffe2/pull/1563

Differential Revision: D6471667

Pulled By: pietern

fbshipit-source-id: 99efd21b98c00eb0a846ca8b395bdfd550fe02f1
2017-12-03 16:02:05 -08:00
Davin Wang
f2be3a4e5e Allow specifying device to prepare_prediction_net()
Summary:
This is supplementary to commit ce8267d425444f60ae650389fb41838847a44a5e. It allows specifying a device to prepare_prediction_net() so the prediction extractor can work with GPU.
Closes https://github.com/caffe2/caffe2/pull/1035

Differential Revision: D6467420

Pulled By: salexspb

fbshipit-source-id: b5b9a1536fb516e90b5e4b615403086943cfbe93
2017-12-03 10:32:08 -08:00
James Cross
2c190d2f05 update transformer code for layer_norm() API change
Summary: Quick fix for unit test broken by D6454290. This is my fault for approving while the tests covering the single callsite were broken.

Reviewed By: goldsborough

Differential Revision: D6466566

fbshipit-source-id: 2683be3d6bb184286e64fbde3e572946e39030c7
2017-12-01 20:19:31 -08:00
Peter Goldsborough
b43c1b2bed Fix and upgrade brew.layer_norm
Summary:
While working on layer normalization for LSTMs I encountered an issue where the layer norm parameters (which are the scale/gain and bias/shift from the paper) were not registered in the model for `brew.layer_norm`. salexspb explained that this is because it was using the `init_net_param` API instead of `create_param`. This diff fixes this.

While fixing this I noticed that `brew.layer_norm` actually had a bug where it was multiplying with the bias instead of adding it. Another issue was that the function was giving the scale and bias a shape of `[1]`; however, the paper (https://arxiv.org/pdf/1607.06450.pdf) specifies that, as for batch norm, there is one scale and bias parameter per neuron, i.e. the shape should be `[1, axis_dimension]`. The API now takes an explicit `dim_in` parameter (also more consistent with the other normalization functions in that module) so that this can be specified. See the tests for how this now looks.

Reviewed By: jhcross

Differential Revision: D6454290

fbshipit-source-id: fc00ca614de3190c40ab743e8984bec9e85fb58c
2017-12-01 14:18:28 -08:00
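A minimal numpy sketch of the fixed behavior described above (not the Caffe2 implementation itself): one scale and one bias per neuron, with the bias added rather than multiplied:

```python
import numpy as np

def layer_norm(x, scale, bias, eps=1e-5):
    # x: [batch, dim_in]; scale and bias: [1, dim_in], per the paper.
    mu = x.mean(axis=1, keepdims=True)
    sigma = x.std(axis=1, keepdims=True)
    normalized = (x - mu) / (sigma + eps)
    return normalized * scale + bias  # bias is added, not multiplied

x = np.random.rand(2, 4)
out = layer_norm(x, scale=np.ones((1, 4)), bias=np.zeros((1, 4)))
```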
Jesse Hellemn
3af2b8f428 Adding length verification check to pack_segments
Summary:
Adding a check to pack_segments to make sure the lengths passed in add up as expected.

Additionally, this starts to address https://fb.facebook.com/groups/1405155842844877/permalink/1977332432293879/ ; it might not fix that issue, but the check is still useful even if it does not.

Reviewed By: salexspb

Differential Revision: D6443490

fbshipit-source-id: 680dc763a788a550d321d97a556c5b46e3402dd1
2017-12-01 10:47:25 -08:00
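The verification amounts to checking that the lengths partition the outer dimension of the data exactly; a hypothetical sketch (the function name is illustrative, not the actual op code):

```python
import numpy as np

def check_segment_lengths(lengths, data):
    # The segment lengths must sum to the outer dimension of the data.
    total = int(np.sum(lengths))
    assert total == data.shape[0], (
        "sum of lengths (%d) != outer dimension of data (%d)"
        % (total, data.shape[0]))
```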
Pieter Noordhuis
3d1135c842 Skip remove_padding test because it is flaky
Summary:
Must be fixed in #1547
Closes https://github.com/caffe2/caffe2/pull/1548

Reviewed By: jhcross

Differential Revision: D6456373

Pulled By: pietern

fbshipit-source-id: 484a58e31506acfc8b8a0954f76796d14dfdfda3
2017-12-01 09:47:31 -08:00
Yan Shang
cf07820849 Enable SparseLengthsMean
Differential Revision: D6445834

fbshipit-source-id: 5cbc95e6975b2447dc82dbe293d0ddd9adf6b5a3
2017-11-30 16:04:38 -08:00
Xue Feng
0c588a500b Replace sigmoid + xent loss with SigmoidCrossEntropyWithLogits for better numerical stability
Summary: Replaced sigmoid + xent loss with SigmoidCrossEntropyWithLogits. The sigmoid layer computes the multinomial logistic loss of the sigmoid of its inputs. It's conceptually identical to a sigmoid layer followed by a multinomial logistic loss layer, but provides a more numerically stable gradient.

Reviewed By: xianjiec

Differential Revision: D6305455

fbshipit-source-id: 444c9f651fbdf13c3c52be5142769f8f98ed8770
2017-11-30 14:04:36 -08:00
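The numerical-stability gain comes from never exponentiating a large positive number; the standard stable form (a sketch of the math, not the Caffe2 kernel) is:

```python
import numpy as np

def sigmoid_xent_with_logits(logits, targets):
    # Equivalent to -[t*log(sigmoid(x)) + (1-t)*log(1-sigmoid(x))],
    # rewritten as max(x, 0) - x*t + log(1 + exp(-|x|)) so that exp()
    # never sees a large positive argument.
    return (np.maximum(logits, 0) - logits * targets
            + np.log1p(np.exp(-np.abs(logits))))
```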
Ellie Wen
fc3f88d8a4 higher order interaction of embeddings
Summary:
Get higher-order interactions of embeddings, similar to a cross net but applied at the embedding level.
Formula:
  e_(l+1,i) = element_wise_mul[e_(0,i), \sum_i(e_(l,i) * w_(l,i))] + e_(l,i) + b
where l denotes the l-th layer of this higher-order net and i denotes the i-th embedding in the list.

Finally, concat all the embeddings in the last layer, or concat the sum of each embedding, and attach the result to the output blob of the dot processor.

Differential Revision: D6244001

fbshipit-source-id: 96292914158347b79fc1299694d65605999b55e8
2017-11-30 08:51:09 -08:00
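A literal numpy reading of the formula above (shapes and names are assumptions for illustration, not the diff's code):

```python
import numpy as np

def higher_order_layer(e0, el, w, b):
    # e0, el: [num_embeddings, dim]; w: [num_embeddings, 1]; b: scalar.
    s = (el * w).sum(axis=0, keepdims=True)  # \sum_i e_(l,i) * w_(l,i)
    return e0 * s + el + b                   # element-wise mul + residual + bias

e0 = np.random.rand(5, 8)
out = higher_order_layer(e0, e0, np.random.rand(5, 1), 0.1)  # layer 0 -> 1
```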
Bingjun Sun
7e9724142a batched layer parameter loading for model initialization from an existing model
Summary:
Problem:
When we initialize a model from an existing model, we currently load the information for each layer parameter independently (in utils.py), including shape information. We have to load the whole model from the db_path every time we initialize one parameter (in layers.py). For example, in f31078253, the model needs to be initialized twice (not sure why); each time there are 152 layer parameters to load, and loading a model takes 10-50 min depending on resource status.
Restrictions:
1. _infer_shape_from_initializer in layers.py is called from multiple other places besides the ModelInitDefinition.INIT_MODEL_PATH branch of load_parameters_from_model_init_options in utils.py, which is the root cause of f31078253. So we still need to support the load operator in _infer_shape_from_initializer, and we need to batch the shape-blob loading outside of LayerParameter.
2. In the ModelInitDefinition.PARAMS branch of load_parameters_from_model_init_options in utils.py, the db_path can differ between parameters, so they are hard to batch.
Solution:
Batch the shape-blob loading in the ModelInitDefinition.INIT_MODEL_PATH branch of load_parameters_from_model_init_options in utils.py. We load the model once and generate the shape blobs of the layer parameters in the workspace, so that _infer_shape_from_initializer in layers.py can directly return shape blobs cached in the workspace without reloading the model. At the same time, _infer_shape_from_initializer can still fall back to a separate load operator if the shape blobs are not pre-loaded into the workspace (this logic can be used for ways of initializing a model other than from an existing model).
Right now we batch 500 layer parameters per load, which works fine, so for 152 layer parameters a single model load is enough.

Reviewed By: xianjiec

Differential Revision: D6397607

fbshipit-source-id: 54f6f61d6d8b70c82b74c2d72ac56cd010a710da
2017-11-29 22:17:51 -08:00
Qinqing Zheng
7374c981d8 CUDA support for PackSegments Op
Summary: Replace GPUFallbackOp with a native CUDA implementation.

Reviewed By: akyrola

Differential Revision: D6423200

fbshipit-source-id: 47dfecbc486e9a8bf0cc6b897ab8b6a2488caa34
2017-11-29 22:01:42 -08:00
Andrey Malevich
b766335753 Revert D6403523: [Part 2] Support regression with output transform in MTML for feed.
Summary:
This reverts commit faa0aab1227a27286b617e8e25adfbab3a349d2c

bypass-lint

Differential Revision: D6403523

fbshipit-source-id: eb43f348b09f2abcc52e101f43b0b9cc42a48ffb
2017-11-29 21:47:01 -08:00
Aapo Kyrola
2caca70a37 Allow shifting of activations / ops to other GPUs in data parallel model
Summary:
(Work in progress.) This diff will allow shifting activations to other GPUs in case the model does not fit into memory. To see the API, check the code in data_parallel_model_test, which tests shifting two activations from GPUs 0 and 1 to GPU 4, and from GPUs 2 and 3 to GPU 5.

I will need to test further on ResNets, and probably add copy operations to handle device change points.

Reviewed By: asaadaldien

Differential Revision: D5591674

fbshipit-source-id: eb12d23651a56d64fa4db91090c6474218705270
2017-11-29 21:17:00 -08:00
James Cross
0e21cd2eae CUDA implementation of RemovePadding operator
Summary:
This is a CUDA implementation of the RemovePadding operator, modeled on akyrola's implementation for AddPadding.

There's also an incidental spelling correction: GetAddPadingGradient -> GetAddPaddingGradient.

Reviewed By: akyrola

Differential Revision: D6439594

fbshipit-source-id: b29cd0c252021c58e150b901bbaad28a3bd3cc4a
2017-11-29 18:48:01 -08:00
Zachary DeVito
6811acbef9 Syntax for control flow in C2
Summary: Experimental code that allows you to write C2 NetDefs directly using Python-like syntax. This includes the ability to write native control flow (if, while) and have it turn into IfOp and WhileOp.

Reviewed By: jamesr66a, dzhulgakov

Differential Revision: D6123298

fbshipit-source-id: 25fc078b5769be61ac7fb3aa9a7c95bd88dccc30
2017-11-29 16:47:45 -08:00
Qichao Que
c9e181f50f Support regression with output transform in MTML for feed.
Summary: Support regression with output transform in MTML for feed.

Differential Revision: D6403523

fbshipit-source-id: faa0aab1227a27286b617e8e25adfbab3a349d2c
2017-11-29 15:47:19 -08:00
Pieter Noordhuis
6f218cef25 Suppress hypothesis health check in adagrad_test.py
Summary:
With some test seeds this warning starts firing.

This should be addressed in a better way, by not generating as many invalid examples.
Closes https://github.com/caffe2/caffe2/pull/1536

Reviewed By: bddppq

Differential Revision: D6437138

Pulled By: pietern

fbshipit-source-id: c619d928a585e3d887f686db5d98f841af10c56b
2017-11-29 11:35:04 -08:00
Yangqing Jia
4beb3ac3ab Properly guard cudnn backward path - NHWC is still not supported.
Summary:
TSIA. This was found in

https://github.com/caffe2/caffe2/pull/1530

Reviewed By: dzhulgakov

Differential Revision: D6434417

fbshipit-source-id: 2285c2f6252eb7f24e83357eb4887851b3adf690
2017-11-28 23:03:02 -08:00
Ilia Cherniavskii
38f166c13a Async executor with less polling
Summary:
Async executor based on async_polling (D5985110):
- Tasks schedule other tasks, using polling only when necessary (e.g. the
  CUDA->CPU case)
- Fully async, i.e. RunAsync returns immediately

Reviewed By: azzolini

Differential Revision: D6281681

fbshipit-source-id: 06e3723e1424ffab652c38ca7b279cf76e43fa44
2017-11-28 18:50:32 -08:00
Matthias Ochs
14cc15e8f4 fixed NCCL bug in data_parallel_model.py
Summary:
Changed the viewvalues() dict view into a Python list.

See issue: https://github.com/caffe2/caffe2/issues/1516
Closes https://github.com/caffe2/caffe2/pull/1532

Differential Revision: D6425901

Pulled By: akyrola

fbshipit-source-id: 37988abe29726aea86637e18eedb948b7c281008
2017-11-28 10:50:02 -08:00
Aapo Kyrola
a08909160e fix bug in CUDA AddPadding when lengths output is not provided
Summary:
enosair caught a bug where the operator returned too early if the lengths output was not provided. Fixed and added a test.

Also noticed that the op does not support the case where no lengths input is provided. Added a temporary CAFFE_THROW for this case; will fix later.

Reviewed By: enosair

Differential Revision: D6405585

fbshipit-source-id: a81717e1b39afde6e900ddd9049b820943aea9f1
2017-11-27 15:14:07 -08:00
Junjie Bai
3da9d7971d Suppress pytest filter_too_much health check
Summary:
Fix the Travis CI
Closes https://github.com/caffe2/caffe2/pull/1524

Reviewed By: dzhulgakov

Differential Revision: D6412499

Pulled By: bddppq

fbshipit-source-id: eaa5942c88d4edd65600d035e31d2300fd8ab3a8
2017-11-27 08:35:27 -08:00
Aapo Kyrola
0954775d28 AddPadding CUDA version
Summary: CUDA version of the AddPadding op. It first executes a prefix sum using Cub to compute the cumulative lengths array. Then it launches a kernel that uses this information to fill the output tensor with the start and end padding and the actual contents.

Reviewed By: asaadaldien

Differential Revision: D6391413

fbshipit-source-id: 45b431e5976674729e53cb4752c7753c1d8a69e8
2017-11-22 18:17:21 -08:00
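The role of the prefix sum is to turn per-segment lengths into start offsets; a plain-Python sketch of the same logic (the CUDA op does this with Cub on the GPU):

```python
import numpy as np

def add_padding(data, lengths, pad_width=1, pad_value=0.0):
    # Cumulative lengths give each segment's start offset in the packed input.
    offsets = np.concatenate([[0], np.cumsum(lengths)])
    out = []
    for i, n in enumerate(lengths):
        seg = data[offsets[i]:offsets[i] + n]
        pad = np.full((pad_width,) + data.shape[1:], pad_value)
        out.append(np.concatenate([pad, seg, pad]))  # start + contents + end
    return np.concatenate(out)
```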
Xianjie Chen
5250d7fd11 simplify logic for weighted pooling using id score list
Summary:
so that users can use the 'WeightedSum' pooling method when there is a mix of id list features and id score list features.

- It's still intuitive to have "WeightedSum" for id lists, and we do not need to introduce a new "UnWeightedSum" etc.

Reviewed By: chocjy

Differential Revision: D6369270

fbshipit-source-id: 722fa08d1a7986bc6ecf4c7cb02bbae0825bcab4
2017-11-22 17:32:04 -08:00
Andrew Tulloch
48415d83c8 Fix instance_norm_test.test_instance_norm_model_helper
Reviewed By: jerryzh168

Differential Revision: D6391749

fbshipit-source-id: ba861d401e358290782db8f360c430e3f3daae96
2017-11-22 15:05:29 -08:00
Liang Xiong
fc0c8c2316 minor refactoring in dper
Summary: Small changes made while reading through the dper code base. All of them are nits, but they somewhat helped me understand things.

Reviewed By: xianjiec

Differential Revision: D6389380

fbshipit-source-id: 3412052e4fcba199c6ffc84c6f7ae11bf8ff6ee9
2017-11-21 18:12:49 -08:00
Aapo Kyrola
daa450d656 add sanity check to model_helper.TensorProtosDBInput
Summary:
A Caffe2 user was confused when model.TensorProtosDBInput([reader]) did not work. This is because this outdated model helper function ignored the input blobs.
Added an assertion to enforce correct usage. I did not want to make this work with reader input as well, since this helper probably should not be used anyway.

Reviewed By: amanrajdce

Differential Revision: D6380326

fbshipit-source-id: 6a50c2861f7f58c06cbfe3e86bde0f17a2b443cb
2017-11-21 10:28:25 -08:00
Yan Shang
dcaaf51100 Support /sqrt(n) pooling
Differential Revision: D6378584

fbshipit-source-id: 3c6606c4e71afbd31dbb97ceeac38dfbe7b40090
2017-11-21 09:04:02 -08:00
Andrew Tulloch
77b78935f2 More extensions
Reviewed By: kevinbchen

Differential Revision: D6300944

fbshipit-source-id: e915c3f3d6b475752d8b7df82ec467d86f88a7c7
2017-11-20 17:18:51 -08:00
Andrew Dye
1ba3e14608 Throw Python exception from PythonOp instead of logging
Summary: Today, when a PythonOp throws an exception, we log the error and fail the op. Later we assert that the op/net/plan succeeds and throw with a generic message, so the user must tail the logs to find the real error. Instead, align with the exception handling of other ops and throw directly. This includes the full context of the exception in the error message.

Reviewed By: Yangqing, akyrola

Differential Revision: D6359684

fbshipit-source-id: 85133ba6562759607a3971449120647cbacce946
2017-11-20 09:03:17 -08:00
Qinqing Zheng
4471e15b76 BMUF cpu support
Summary: Change the interface so BMUF can run on CPUs.

Reviewed By: asaadaldien

Differential Revision: D6356026

fbshipit-source-id: f58a4da9f800d969145a1a376e118b0f3581f8c1
2017-11-19 23:41:25 -08:00
Yongqiang Wang
0e99334efb move print to logger
Summary: Further clean up data_worker's messy output.

Reviewed By: asaadaldien

Differential Revision: D6217857

fbshipit-source-id: 51cee29a687501d0f965422586fd6cb66a2d516a
2017-11-17 18:03:44 -08:00
Aarti Basant
5de880f3e1 Resume from epoch instead of re-starting a workflow from scratch when we retry
Reviewed By: anshulverma

Differential Revision: D6354076

fbshipit-source-id: d2bee93a1136fb07c46942649e90110d2e3ccb0e
2017-11-17 12:51:07 -08:00
Aapo Kyrola
1a02e72254 fix missing DPM .values() and .keys() to viewvalues() and viewkeys()
Summary: Reported by Simon Layton from NVIDIA: we had a couple of py3-incompatible expressions in data_parallel_model.

Reviewed By: azzolini

Differential Revision: D6349447

fbshipit-source-id: a09feb69396be43296400591a3bfed5b8c370b0d
2017-11-16 16:08:18 -08:00
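For reference, the py2/py3-compatible pattern, assuming the python-future package commonly used in Caffe2's Python code:

```python
from future.utils import viewkeys, viewvalues

d = {"gpu_0/w": 1, "gpu_1/w": 2}
# dict.viewvalues()/viewkeys() exist only on py2; these helpers work on both.
for v in viewvalues(d):
    print(v)
assert "gpu_0/w" in viewkeys(d)
```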
Yan Shang
24e83acbb9 Enable sampling in evaluation
Reviewed By: chocjy

Differential Revision: D6119768

fbshipit-source-id: c8447326008392df70ab10b04f84223cf6d882b1
2017-11-16 14:03:51 -08:00
Yiming Wu
127a55ae49 cast op for empty batch
Summary: The CUDA Cast op can now handle an empty batch.

Reviewed By: azzolini

Differential Revision: D6350138

fbshipit-source-id: 2f3d19f4d42ff34806aa9597690e66f6b4de1a6b
2017-11-16 12:20:20 -08:00
Wenyi Huang
d8dfaeeef7 Add batch-based/row-based sparse from/to dense operator
Summary:
Two ops: BatchSparseToDenseOp and DenseToBatchSparseOp, which are inverse operations of each other.

Details are described in the op docs.

These ops are used along with flexible topK, where the output is
lengths, indices, and values.
We want to do softmax on the values, but the dimension of each batch is different, so these ops convert the sparse representation to dense and vice versa. The two ops are also the gradient ops for each other.

Reviewed By: chocjy

Differential Revision: D6288338

fbshipit-source-id: 0ba9e611058b39e46e7414dcc5f39cab29915fa3
2017-11-16 00:59:21 -08:00
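A small numpy sketch of the sparse-to-dense direction under the (lengths, indices, values) layout described above; the function name and dense width are illustrative:

```python
import numpy as np

def batch_sparse_to_dense(lengths, indices, values, dense_dim, default=0.0):
    # lengths[i] consecutive (index, value) pairs belong to batch row i.
    dense = np.full((len(lengths), dense_dim), default)
    offsets = np.concatenate([[0], np.cumsum(lengths)])
    for i in range(len(lengths)):
        lo, hi = offsets[i], offsets[i + 1]
        dense[i, indices[lo:hi]] = values[lo:hi]
    return dense

print(batch_sparse_to_dense([2, 1], np.array([0, 3, 2]),
                            np.array([1.0, 2.0, 3.0]), dense_dim=4))
# [[1. 0. 0. 2.]
#  [0. 0. 3. 0.]]
```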
Xiaolong Wang
3bde37fbf0 Listwise Ranking -- LambdaNDCG
Summary:
This is part one: it adds the lambdaNDCG loss, which can be used to heuristically optimize the NDCG metric.

Differential Revision: D5830650

fbshipit-source-id: 1eb696337c9a77727ad40219c68f6468e2e097a5
2017-11-16 00:05:48 -08:00
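For context, the metric being optimized; this is the standard NDCG definition (a sketch, not the loss implementation from the diff):

```python
import numpy as np

def ndcg(relevances):
    # DCG with exponential gain; NDCG normalizes by the ideal ordering's DCG.
    def dcg(rels):
        discounts = np.log2(np.arange(2, len(rels) + 2))
        return np.sum((2.0 ** rels - 1.0) / discounts)
    ideal = dcg(np.sort(relevances)[::-1])
    return dcg(relevances) / ideal if ideal > 0 else 0.0

print(ndcg(np.array([3.0, 1.0, 2.0])))  # < 1.0; equals 1.0 iff already ideal
```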
Anshul Verma
a3afca6fc9 Minor documentation fix in NetBuilder
Summary: Came across this bug in the docs while figuring out NetBuilder from the code.

Reviewed By: volkhin

Differential Revision: D6341821

fbshipit-source-id: 8818f3d92681366bfe7b90d9d4da9f68ef6e4672
2017-11-15 16:22:22 -08:00