Commit graph

1221 commits

Author SHA1 Message Date
Luke Yeager
ebeaecbfa3 workspace_gpu: Get{CUDAVersion,DeviceProperties}
Summary:
Expose some useful utilities to Python
Closes https://github.com/caffe2/caffe2/pull/1216

Differential Revision: D5843888

Pulled By: akyrola

fbshipit-source-id: fc731781aec3c7cc6a4b7132f1624423d015abff
2017-09-17 20:01:34 -07:00
Xianjie Chen
eccfa1041c fix cuda GatherOp for empty batch
Summary: as title

Differential Revision: D5840432

fbshipit-source-id: 5d9021f152c21d24e91dc0cc3d95443782afc228
2017-09-15 17:40:43 -07:00
Dhruv Mahajan
c3fd31b1a2 weights for labels in image_input_op
Summary: Introduce weights for labels in the multi-label setting. An extra weight blob is introduced and read by the operator when the label setting is weighted sparse.

Reviewed By: kevinwilfong

Differential Revision: D5812467

fbshipit-source-id: efb209092e1e9effc915b0a753fa0c67b47a4fb6
2017-09-15 17:40:42 -07:00
Andrew Gallagher
9639ddd22f Cleanup omnibus-blacklist-hack rules
Summary:
Now that Buck supports a way to opt-out external C/C++ libs from omnibus linking,
this diff removes the hack we previously relied on (and which got copy-pasta-d everywhere).

Reviewed By: pixelb

Differential Revision: D5832450

fbshipit-source-id: cc3d12488f8498be6fb12bce1fedb3ad1accb518
2017-09-15 16:49:35 -07:00
Aapo Kyrola
9ec981b866 for CPU-data parallel, allow sharing model
Summary: On CPU, no need to replicate parameters. So try using only one copy (cpu_0) for parameters. Made resnet50_trainer use shared model in cpu mode.

Reviewed By: wesolwsk

Differential Revision: D5812181

fbshipit-source-id: 93254733edbc4a62bd74a629a68f5fa23f7e96ea
2017-09-15 16:19:37 -07:00
Aapo Kyrola
6b44a00c71 remove in-place Dropout from rnn_cell (bug in PR-1185)
Summary: This caused gradient generation problems. Output was made in-place in PR-1185, by mistake, I believe.

Differential Revision: D5844825

fbshipit-source-id: 4ad84d0fb468aafde9f78463b9acf89316e633ca
2017-09-15 14:03:33 -07:00
Matt Uyttendaele
af8f6c1bca adding unit tests to compphoto caffe2 projects
Summary: Ported existing adhoc test code to use python unittests. Small tweak to caffe2.python.hypothesis_test_util

Reviewed By: kmatzen

Differential Revision: D5837295

fbshipit-source-id: daa2360db3c18c7d4bda7785e7a0b9175f5858af
2017-09-15 12:49:37 -07:00
Pieter Noordhuis
27dde63358 Allow run of example resnet50_trainer without training data
Summary:
This is useful for pure throughput tests where
we don't care about training a real model.

Reviewed By: akyrola

Differential Revision: D5834293

fbshipit-source-id: dab528c9269fb713e6f6b42457966219c06e0a35
2017-09-15 09:45:11 -07:00
Huazhong Ning
1a89c6e1ec Decayed adagrad
Summary: When training on billions of data points, the Adagrad squared-gradient sum can become very large, which creates the problem of adding small numbers to big numbers. This diff allows decaying the Adagrad squared-gradient sum.
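
The numerical issue and the effect of a decay can be sketched in plain Python. This is an illustration only, not the operator changed in this diff: the function name `decayed_adagrad_step`, the `decay` parameter, and the exact update form (scaling the accumulated sum by `decay` before adding the new squared gradient) are assumptions.

```python
import math


def decayed_adagrad_step(param, grad, sq_sum, lr=0.01, decay=1.0, epsilon=1e-5):
    """One scalar Adagrad step with an optional decay on the squared-gradient sum.

    Hypothetical helper for illustration; decay=1.0 reduces to plain Adagrad.
    """
    sq_sum = decay * sq_sum + grad * grad
    param = param - lr * grad / (math.sqrt(sq_sum) + epsilon)
    return param, sq_sum


# Once the accumulator is large enough, a small addend is lost to rounding.
absorbed = (1e16 + 1.0) == 1e16  # True for IEEE 754 doubles

# Compare the accumulator with and without decay for a constant gradient of 1.
plain = 0.0
decayed = 0.0
for _ in range(1000):
    _, plain = decayed_adagrad_step(0.0, 1.0, plain, decay=1.0)
    _, decayed = decayed_adagrad_step(0.0, 1.0, decayed, decay=0.9)
# plain grows without bound (1000.0 here); decayed levels off near 1 / (1 - 0.9) = 10.
```

With a decay below 1 the accumulator stays bounded, so newly added squared gradients remain comparable in magnitude to the sum they are added to.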

Reviewed By: queqichao

Differential Revision: D5825932

fbshipit-source-id: 570224483b77d42ae53410fa2f767af86de167eb
2017-09-15 00:35:21 -07:00
Aapo Kyrola
fb45383ed6 resubmission of PR1175: fp16 BatchMatMul
Summary: PR 1175 caused a build error because gemmBatched was only under a specific #ifdef. Now put it outside the #ifdef, and things work.

Reviewed By: asaadaldien

Differential Revision: D5834868

fbshipit-source-id: 072a64c8f4b259ff7504104121766115b46b8aa0
2017-09-14 21:46:05 -07:00
Aapo Kyrola
1e37145872 Resnet50 should param init net before creating test net
Summary: Otherwise weights, biases are not created and test creation fails

Reviewed By: gsethi523

Differential Revision: D5836438

fbshipit-source-id: 32a75313b6b9ebecbfaa43ebd39f19c8eaba8cd1
2017-09-14 16:06:01 -07:00
Junjie Bai
86a9a06878 HTTPMessage in Python 3 does not have getheader
Summary: get and getheader are the same in Python 2
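
A minimal Python 3 sketch of the portable call. The helper name `get_header` is hypothetical and is not taken from this diff; `HTTPMessage` is the standard-library class from `http.client`.

```python
from http.client import HTTPMessage


def get_header(headers, name, default=None):
    """Read a header via get(), which exists on both Python 2 and Python 3 messages."""
    return headers.get(name, default)


headers = HTTPMessage()
headers["Content-Length"] = "42"

content_length = get_header(headers, "Content-Length")
has_getheader = hasattr(headers, "getheader")  # False on Python 3
```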

Reviewed By: akyrola

Differential Revision: D5836486

fbshipit-source-id: 3bacfccc872c44741d7f26c68ba967093fce45c2
2017-09-14 13:59:06 -07:00
Jerry Zhang
0e7bd68536 Allow one output for dropout at inference time
Summary: att

Reviewed By: bddppq

Differential Revision: D5680214

fbshipit-source-id: 19e731901cb5c9491100c61baefc4b75e6e8b262
2017-09-14 10:46:41 -07:00
Jerry Zhang
63a2b75027 Add option to remove legacy_pad in caffe_translator
Summary:
To speed up deprecating legacy_pad, we added the option
to remove legacy pad in the caffe_translator

Reviewed By: bddppq

Differential Revision: D5724079

fbshipit-source-id: 25465d26f35bd009aa71667c7c523047de42e802
2017-09-14 10:32:48 -07:00
Jongsoo Park
e9581e47a2 fix comment on core.Net.RunAllOnMKL
Summary: Fix comment on core.Net.RunAllOnMKL (the comment was actually for core.Net.RunAllOnGPU)

Reviewed By: zem7

Differential Revision: D5734309

fbshipit-source-id: 2cc40a99a2c0083c73ec1e4c8279f55f296a003c
2017-09-13 19:32:18 -07:00
Yangqing Jia
f0d0361609 Revert D5794634: [caffe2][PR] fp16: BatchMatMul
Summary:
This reverts commit 911c462824edec3de529a5a4385a4c437e24bf59

bypass-lint

Differential Revision: D5794634

fbshipit-source-id: 1863b02282329cbee6b10e5870f03051b4bb6c58
2017-09-13 18:46:47 -07:00
Luke Yeager
37af6566e1 fp16: LSTMUnit
Summary:
Was https://github.com/caffe2/caffe2/pull/1151
Closes https://github.com/caffe2/caffe2/pull/1191

Differential Revision: D5825387

Pulled By: akyrola

fbshipit-source-id: edb47c8bd7ffb72e1e587a9c5bfee9347e3d587e
2017-09-13 15:47:03 -07:00
Jerry Zhang
23f4f78c22 Functional C2
Summary:
Support calling C2 operators as functions, e.g.
```
import numpy as np
from caffe2.python.functional import Functional
X = np.random.randn(2, 3).astype(np.float32)  # illustrative input
Y = Functional.Relu(X)[0]
```
Only numpy arrays are supported as input for now.

Reviewed By: bddppq

Differential Revision: D5791821

fbshipit-source-id: 7e936ad52b8b304c5e210248bd6649fd066cd909
2017-09-13 15:37:28 -07:00
Junjie Bai
90ca470d70 Standardize operator argument "is_test"
Summary:
Also add the ability to mark an argument as required.

Added a string constant `OpSchema::Arg_IsTest` for `is_test` arg.
If users define the `is_test` argument with `ArgIsTest(...)`, it automatically becomes a required argument; meanwhile, users can still use `Arg("is_test", ...)` to define an optional `is_test` argument.

Reviewed By: akyrola

Differential Revision: D5812391

fbshipit-source-id: eaaba50d027813a8012389edc6c459de23c3c728
2017-09-13 14:35:27 -07:00
Luke Yeager
3cfc6f26e7 fp16: BatchMatMul
Summary:
Was https://github.com/caffe2/caffe2/pull/1151
Closes https://github.com/caffe2/caffe2/pull/1175

Reviewed By: Yangqing

Differential Revision: D5794634

Pulled By: akyrola

fbshipit-source-id: 911c462824edec3de529a5a4385a4c437e24bf59
2017-09-13 14:35:25 -07:00
Alisson Gusatti Azzolini
c07ebd2396 TrimDataset to ensure size is multiple of number of replicas
Summary: For data parallel, the batch size needs to be a multiple of the number of replicas. With this diff, this is done via Dataset(rec).trim(multiple_of=num_replicas)
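
The size arithmetic behind trimming to a multiple can be sketched as follows. The helper `trimmed_size` is hypothetical and only illustrates the rounding; which records the real `trim` drops is not stated here.

```python
def trimmed_size(num_records, multiple_of):
    """Largest size <= num_records that is divisible by multiple_of."""
    if multiple_of <= 0:
        raise ValueError("multiple_of must be positive")
    return num_records - num_records % multiple_of


# With 4 replicas, 10 records are trimmed to 8 so each replica gets 2.
size = trimmed_size(10, 4)
per_replica = size // 4
```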

Reviewed By: dzhulgakov, harouwu

Differential Revision: D5753861

fbshipit-source-id: c5d728b925707dbd3d1f500a93e67e185c223569
2017-09-13 12:17:21 -07:00
Luke Yeager
c313855523 Use brew in rnn_cell.py
Summary:
Was https://github.com/caffe2/caffe2/pull/1151.
Closes https://github.com/caffe2/caffe2/pull/1185

Differential Revision: D5794716

Pulled By: akyrola

fbshipit-source-id: c27d30d5d6dd7dacc47610150dcfef03343a7120
2017-09-13 12:02:57 -07:00
Luke Yeager
361bbb8b43 fp16: SumReduceLike
Summary:
Was https://github.com/caffe2/caffe2/pull/1151
Closes https://github.com/caffe2/caffe2/pull/1183

Differential Revision: D5794704

Pulled By: akyrola

fbshipit-source-id: e4dee46f753e9a8663057c81f23028f6246fba02
2017-09-13 11:46:23 -07:00
Luke Yeager
f775149205 tests: use assertRaises, not expectedFail
Summary:
I would expect that tests marked "expected failure" mean that there is a known issue in the code which will be fixed later. Both of these tests are simply verifying proper error-checking - nothing needs fixing.

Before (looks like something is wrong):
```
======================================= 2 xfailed in 0.27 seconds =======================================
```
After:
```
======================================= 2 passed in 0.28 seconds ========================================
```
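The difference between the two styles can be sketched with the standard `unittest` module. The function `checked_divide` and the test class are hypothetical examples, not the tests touched by this PR.

```python
import unittest


def checked_divide(a, b):
    """Hypothetical function whose error-checking we want to verify."""
    if b == 0:
        raise ValueError("b must be non-zero")
    return a / b


class ErrorCheckingTest(unittest.TestCase):
    @unittest.expectedFailure
    def test_marked_expected_failure(self):
        # Reported as an expected failure, which suggests a known bug.
        checked_divide(1, 0)

    def test_with_assert_raises(self):
        # Reported as a plain pass: the error is the intended behavior.
        with self.assertRaises(ValueError):
            checked_divide(1, 0)


def run_tests():
    """Run the test case and return the collected unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ErrorCheckingTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Both tests succeed, but only the `assertRaises` variant states that raising is the correct outcome.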
/cc akyrola gsethi523
Closes https://github.com/caffe2/caffe2/pull/1209

Differential Revision: D5825373

Pulled By: akyrola

fbshipit-source-id: 1b98f503e4e406f69567d02425532f43bd16a465
2017-09-13 11:39:35 -07:00
Sachin Padmanabhan
a198da5583 Added LengthMax Operator to Caffe2
Summary: Added LengthMax operator to Caffe2.

Reviewed By: dzhulgakov

Differential Revision: D5720124

fbshipit-source-id: 1995fea8e480c9a9f3e054d02801b03c1ce6c51b
2017-09-12 20:01:48 -07:00
Xian Li
68e7a0f2ed Enable target dialect token in inference.
Differential Revision: D5665714

fbshipit-source-id: 56ba88e72f71cae23d992e3ad7ea134c3d2c6d1d
2017-09-12 17:22:18 -07:00
Aapo Kyrola
ce36a972b0 fix timeouts in CloneOrCreateCommonWorld
Summary: The default value for timeout in CreateOrCloneCommonWorld does not work properly: if the value of dpm._DEFAULT_TIMEOUT is changed, the default still stays at the old 30s. Changed to use None as the default instead.
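
The underlying Python behavior is that default argument values are evaluated once, when the function is defined. A minimal sketch follows; the module-level `_DEFAULT_TIMEOUT` stands in for `dpm._DEFAULT_TIMEOUT`, and both function names are hypothetical.

```python
_DEFAULT_TIMEOUT = 30  # seconds; stand-in for dpm._DEFAULT_TIMEOUT


def timeout_bound_at_definition(timeout=_DEFAULT_TIMEOUT):
    # The default was captured when the function was defined.
    return timeout


def timeout_resolved_at_call(timeout=None):
    # None is a sentinel; the current module value is looked up on each call.
    if timeout is None:
        timeout = _DEFAULT_TIMEOUT
    return timeout


_DEFAULT_TIMEOUT = 120  # a caller changes the default later

stale = timeout_bound_at_definition()  # 30: the old default is still used
fresh = timeout_resolved_at_call()     # 120: the new default is picked up
```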

Reviewed By: pietern

Differential Revision: D5813228

fbshipit-source-id: f617ceec40a03893c27d3e13c426e1ca6b2114e2
2017-09-12 13:09:05 -07:00
Viswanath Sivakumar
583d031754 Operator to compute RoI region coordinates for RMAC
Summary:
Computes a fixed grid of RMAC region coordinates for a given 4D feature tensor
(NCHW) as described in https://arxiv.org/abs/1511.05879. The output is the
`roi` format expected by RoIPoolOp. To compute the actual RMAC itself, the
output of this op should be passed to RoIPoolOp.

Reviewed By: wickedfoo

Differential Revision: D5594994

fbshipit-source-id: 5edac98a18137b53555f9a16354419b424679c99
2017-09-12 12:47:17 -07:00
Xianjie Chen
be406b1e5f Revert D5639080: Caffe2: Cuda implementation for BatchOneHot operator
Summary:
This reverts commit 8ee280c4bab64c1fdfb7429ee2c9ac8c02933931

bypass-lint

Differential Revision: D5639080

fbshipit-source-id: cf522822b7cb5ba9a238ba7837f0f522e1f49b73
2017-09-12 11:51:14 -07:00
Aapo Kyrola
93bd3c77f8 AddBlobsSync()
Summary: Explicit function to sync blobs. Notice that this must be called before CreateNet(), and syncs the blobs every run.

Reviewed By: asaadaldien, jay-mahadeokar

Differential Revision: D5805891

fbshipit-source-id: 58a1bb47805d75d5cbead136e2e0e9fe663ea954
2017-09-12 10:33:22 -07:00
Xian Li
a782858285 Move go_token_id out of beam search constructor.
Summary: This will allow the same decoder to handle different go tokens.

Differential Revision: D5801811

fbshipit-source-id: ddd309963c97e32c728b15d2ccd4ba0c4ad5ebbe
2017-09-11 18:52:08 -07:00
Luke Yeager
944115c915 Bugfix for concat frontend
Summary:
When breaking out pooyadavoodi's change to `brew.concat` from https://github.com/caffe2/caffe2/pull/1151 to https://github.com/caffe2/caffe2/pull/1184, I made it throw an error instead of silently removing `order`. But `order` is always present because of [this](https://github.com/caffe2/caffe2/blob/v0.8.1/caffe2/python/model_helper.py#L118), so the frontend can never be used to set `axis`. That's bad. This PR changes the behavior back to Pooya's original implementation.
Closes https://github.com/caffe2/caffe2/pull/1202

Reviewed By: akyrola

Differential Revision: D5806488

Pulled By: pietern

fbshipit-source-id: ceaea77469688a66b269b8ed2944f0d3fe873940
2017-09-11 13:02:59 -07:00
Pieter Noordhuis
84167faf0f Enable use of GPUDirect through argument to Gloo AllreduceOp
Summary:
If the Gloo InfiniBand transport is used, the Gloo algorithms can use
GPUDirect to DMA directly from/to GPU memory. This is done through the
CudaDeviceWorkspace. This change adds a "gpu_direct" option to the
Allreduce operator that makes it use GPUDirect if the transport
supports it.
Closes https://github.com/caffe2/caffe2/pull/1203

Reviewed By: wesolwsk

Differential Revision: D5806366

Pulled By: pietern

fbshipit-source-id: 9e9a78f059f2b5c6e4fbf6574b7db4776a94696c
2017-09-11 13:02:58 -07:00
Mayank Rana
1c414426df Caffe2: Cuda implementation for BatchOneHot operator
Summary: Cuda implementation for BatchOneHot operator.

Reviewed By: lvdmaaten

Differential Revision: D5639080

fbshipit-source-id: 8ee280c4bab64c1fdfb7429ee2c9ac8c02933931
2017-09-11 08:24:44 -07:00
Aapo Kyrola
45f07238f4 make rnn executor figure out recurrent mappings from links
Summary: RNN executor previously relied on getting the mapping from x to x_prev (and gradients) from recurrent.py, but we can just infer them from links. This makes all models compatible with rnn executor, given enable_rnn_executor=1 argument.

Reviewed By: jamesr66a

Differential Revision: D5801436

fbshipit-source-id: 14d0e26dfbad6347f645d907da493187c98e9b17
2017-09-09 16:19:26 -07:00
Luke Yeager
1cf94854a4 fp16: SequenceMask
Summary:
Was https://github.com/caffe2/caffe2/pull/1151
Closes https://github.com/caffe2/caffe2/pull/1178

Reviewed By: bddppq

Differential Revision: D5794641

Pulled By: akyrola

fbshipit-source-id: c3bd99dde74317280a65af7cc7a36a6a734822f6
2017-09-09 13:02:38 -07:00
Pieter Noordhuis
d43ab4bec5 Create Gloo common world through MPI rendezvous
Summary:
Before this change there were two ways for machines to rendezvous for a
distributed run: shared file system or Redis. If you're using an MPI
cluster it is much more convenient to simply execute mpirun and expect
the "right thing (tm)" to happen. This change adds the "mpi_rendezvous"
option to the CreateCommonWorld operator. If this is set, the common
world size and rank will be pulled from the MPI context and Gloo
rendezvous takes place using MPI. Note that this does NOT mean the MPI
BTL is used; MPI is only used for rendezvous.
Closes https://github.com/caffe2/caffe2/pull/1190

Reviewed By: akyrola

Differential Revision: D5796060

Pulled By: pietern

fbshipit-source-id: f8276908d3f3afef2ac88594ad377e38c17d0226
2017-09-08 17:18:47 -07:00
Luke Yeager
6cf172c60d fp16: SumSqrElements
Summary:
Was https://github.com/caffe2/caffe2/pull/1151
Closes https://github.com/caffe2/caffe2/pull/1179

Differential Revision: D5794650

Pulled By: akyrola

fbshipit-source-id: 63e7973a88193a3b74ac4ba677df737889cbf0b6
2017-09-08 16:36:51 -07:00
Aapo Kyrola
cef2068eee enable setting rnn executor threads and max streams
Summary: As title. Made the configurations op-specific since many models run multiple RNNs.

Reviewed By: jamesr66a

Differential Revision: D5796208

fbshipit-source-id: 88173879dfff9f3f7bf583ccc4f4c6385cca5aca
2017-09-08 16:36:51 -07:00
Kittipat Virochsiri
27433e978c Make piper of PipedReaderBuilder takes arguments
Summary: Allow context to be passed into piper function

Reviewed By: volkhin

Differential Revision: D5684716

fbshipit-source-id: 693f0464fe28f8692d75901705a85a0a413a7bed
2017-09-08 13:46:29 -07:00
Honghao Wei
6763c14e84 add base class ModifierContext, rewrite OptimizerContext, add RegularizerContext
Summary:
`ModifierContext` is the base class for `OptimizerContext` and `RegularizationContext`.
`UseModifierBase` is the base class for `UseRegularizer` and `UseOptimizer`.

Most of the code in `OptimizerContext`, `RegularizationContext`, and other potential future context classes can be shared. We therefore implemented a new base class, `ModifierContext`, to support this.

The same holds for `UseRegularizer` and `UseOptimizer`, so we implemented a new base class called `UseModifierBase`.

This way, users only need to provide the API for the **get** and **has** operations and specify the **context class**.

**Note**
Mirrored code in fbandroid and fbobj will be added when this is finally checked in.

Reviewed By: kittipatv, xianjiec

Differential Revision: D5724613

fbshipit-source-id: de19bb822dcd41ec5c459d65065603a0abe2fd20
2017-09-08 11:39:23 -07:00
Honghao Wei
e76015040a add regularization in caffe2 and dper
Summary:
Regularization added for caffe2 and dper.

This regularization is intended for `dense feature` only. Sparse features would serve as individual optimizers; see `D5618405` and `D5534579` for details.

The implementation of dense regularization is similar to that of the optimizers. We now support `l1 norm` and `l2 norm` in the regularizer. In dper, we call a different regularization based on the regularization type defined in model_definition.thrift.

Reviewed By: xianjiec

Differential Revision: D5724851

fbshipit-source-id: 0fbee698cfeff1ac477fc9d07785406069f8d9c8
2017-09-08 11:39:22 -07:00
Pieter Noordhuis
b8eb8ced7d Add transport/interface arguments to CreateCommonWorld operator
Summary:
These arguments control which Gloo transport (TCP or IB) and which
network interface is used for the common world. If not specified, it
defaults to using TCP and the network interface for the IP that the
machine's hostname resolves to.

The valid values for the transport argument are "tcp" and "ibverbs".
For ibverbs to work, Gloo must have been compiled with ibverbs
support. If Gloo is built as part of Caffe2 (sourced from the
third_party directory), then you can pass -DUSE_IBVERBS=ON to CMake to
enable ibverbs support in Gloo.
Closes https://github.com/caffe2/caffe2/pull/1177

Reviewed By: akyrola

Differential Revision: D5789729

Pulled By: pietern

fbshipit-source-id: 0dea1a115c729e54c5c1f9fdd5fb29c14a834a82
2017-09-08 10:57:41 -07:00
Luke Yeager
03de05229e brew.concat: don't set both order and axis
Summary:
Was https://github.com/caffe2/caffe2/pull/1151.

pooyadavoodi says this was causing problems for him. I don't remember the details.
Closes https://github.com/caffe2/caffe2/pull/1184

Differential Revision: D5794711

Pulled By: akyrola

fbshipit-source-id: 4d75f2a9b30881ba662141c352ac556cb5d3cce6
2017-09-08 10:34:34 -07:00
Luke Yeager
1a2b229d47 fp16: add test for FC
Summary:
fp16 and TensorCore support was already added to the op in https://github.com/caffe2/caffe2/pull/1056. This adds a test.
Closes https://github.com/caffe2/caffe2/pull/1182

Differential Revision: D5794698

Pulled By: akyrola

fbshipit-source-id: b0d7ef317dfbb9d712b0b4646b38dc600b8434f1
2017-09-08 10:34:34 -07:00
James Reed
9aed89ac88 Allow specification of num_workers in PredictorExportMeta and enable for NMT beam search model
Summary:
The predictor export functions allowed a way to specify a net type, but no way to specify num_workers for when you use net type 'dag'. This adds that option to the PredictorExportMeta named tuple and populates the field in the exported protobuf. Also added parameters to callsites in NMT ensemble model class and model repackager to populate net_type and num_workers.

Using DAGNet for our base predictor net (not recurrent stepnets) speeds up our inference by 1.15x, since we can now run encoder forward and backward RecurrentNet's for each model in the ensemble in parallel.

Reviewed By: salexspb

Differential Revision: D5792203

fbshipit-source-id: cb9a8237a0cbe1a09645d4de051dfbb23f06dcfa
2017-09-07 22:48:45 -07:00
Yan Shang
6a883d1bc0 Remove dot_product layer
Summary: This dot_product layer was added before the functional layer existed. Now that we have the functional layer, the dot_product layer is no longer needed, so this diff removes it.

Reviewed By: kittipatv

Differential Revision: D5783303

fbshipit-source-id: 5d13f729918148ee57836fb47c48e6f24773654b
2017-09-07 18:48:30 -07:00
Xianjie Chen
ec713d437d make sure the output of sparse lookup layer is float
Summary: Currently, if reducer=None, the output is fp16.

Differential Revision: D5773560

fbshipit-source-id: 24d7e5fae366d70352582e9a1ee14c7613753b7a
2017-09-07 17:47:39 -07:00
Yan Shang
b6c9ecac7c Fix shape inference of distance_op
Summary: The shape inference of distance_op has issues (it only works when inputs are 1D tensors). This diff fixes the shape inference and the unit test.

Reviewed By: kittipatv

Differential Revision: D5788744

fbshipit-source-id: cb1b7facf7b9ccd64b54edca156325eceef50f33
2017-09-07 17:16:46 -07:00
Junjie Bai
176f8f9a19 Make ConvTranspose allow optional bias term
Reviewed By: jerryzh168

Differential Revision: D5755702

fbshipit-source-id: a00487ca376d09b68132162c53797f5af052d114
2017-09-07 17:16:43 -07:00