pytorch/caffe2/python
James Reed 9aed89ac88 Allow specification of num_workers in PredictorExportMeta and enable for NMT beam search model
Summary:
The predictor export functions let you specify a net type, but offered no way to specify num_workers when using net type 'dag'. This adds that option to the PredictorExportMeta named tuple and populates the field in the exported protobuf. It also adds parameters to the call sites in the NMT ensemble model class and model repackager to populate net_type and num_workers.

Using DAGNet for our base predictor net (not the recurrent stepnets) speeds up our inference by 1.15x, since we can now run the encoder forward and backward RecurrentNets for each model in the ensemble in parallel.

Reviewed By: salexspb

Differential Revision: D5792203

fbshipit-source-id: cb9a8237a0cbe1a09645d4de051dfbb23f06dcfa
2017-09-07 22:48:45 -07:00
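
The idea in the commit summary above, an export-metadata named tuple with optional net_type and num_workers fields that are copied into the exported net only when set, can be sketched roughly as follows. This is a standalone illustration, not the Caffe2 code: `ExportMetaSketch`, `to_net_fields`, and `predict_net_name` are hypothetical names, and the output keys `type` and `num_workers` are assumed to correspond to the fields of the exported net definition.

```python
from collections import namedtuple


class ExportMetaSketch(
    namedtuple("ExportMetaSketch", ["predict_net_name", "net_type", "num_workers"])
):
    """Hypothetical stand-in for an export-metadata named tuple.

    Only net_type and num_workers come from the commit summary;
    everything else here is an illustrative assumption.
    """

    __slots__ = ()

    def __new__(cls, predict_net_name, net_type=None, num_workers=None):
        # Both execution options are optional, so existing call sites
        # that pass neither keep working unchanged.
        return super(ExportMetaSketch, cls).__new__(
            cls, predict_net_name, net_type, num_workers
        )


def to_net_fields(meta):
    """Collect the fields that would be written to the exported net.

    Options left as None are skipped, so the consumer of the exported
    net falls back to its own defaults.
    """
    fields = {"name": meta.predict_net_name}
    if meta.net_type is not None:
        fields["type"] = meta.net_type
    if meta.num_workers is not None:
        fields["num_workers"] = meta.num_workers
    return fields


# Default export: no execution options are recorded.
default_fields = to_net_fields(ExportMetaSketch("predict_net"))

# DAG export: the net type and worker count are recorded together.
dag_fields = to_net_fields(
    ExportMetaSketch("predict_net", net_type="dag", num_workers=4)
)
```

Keeping both options optional means callers opt in to DAG execution explicitly; the worker count of 4 is an arbitrary example value, not one taken from the commit.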
docs
examples Add fp16 and tensorcore support to resnet50_trainer 2017-08-17 15:16:24 -07:00
helpers Layer norm brew wrapper 2017-08-17 11:17:47 -07:00
layers Remove dot_product layer 2017-09-07 18:48:30 -07:00
mint
mkl Support grouped convolutions in MKL 2017-07-25 14:19:02 -07:00
modeling Scaled training and fetching from the PS 2017-08-23 18:16:03 -07:00
models rectify args btw. train and translate 2017-08-10 15:27:18 -07:00
operator_test Fix shape inference of distance_op 2017-09-07 17:16:46 -07:00
predictor Allow specification of num_workers in PredictorExportMeta and enable for NMT beam search model 2017-09-07 22:48:45 -07:00
rnn Revert D5589309: modify _LSTM into _RNN to adapt GRU 2017-08-10 16:42:41 -07:00
_import_c_extension.py
allcompare_test.py Adding AllCompare-like function to data_parallel_model 2017-07-13 13:03:57 -07:00
attention.py soft-coverage attention 2017-08-31 21:21:54 -07:00
benchmark_generator.py A benchmark generator for individual ops 2017-08-31 17:33:21 -07:00
binarysize.py binary size util 2017-07-14 17:49:24 -07:00
brew.py Layer norm brew wrapper 2017-08-17 11:17:47 -07:00
brew_test.py Allow passing unsymmetric 2d kernels to brew.conv. 2017-08-10 15:27:16 -07:00
build.py Strip Operator Schema in mobile build 2017-08-22 13:31:08 -07:00
caffe_translator.py Read pretrained weights using binary mode in caffe_translator.py 2017-07-08 10:17:57 -07:00
caffe_translator_test.py
checkpoint.py Enable reader checkpoint 2017-09-05 14:21:25 -07:00
checkpoint_test.py Changes the checkpoint naming rules. 2017-08-17 22:16:42 -07:00
CMakeLists.txt
cnn.py
context.py Add default implementation of __call__ for context manager 2017-08-22 17:46:22 -07:00
context_test.py Add default implementation of __call__ for context manager 2017-08-22 17:46:22 -07:00
control.py
control_ops_util.py Control flow operators 2017-08-28 20:04:43 -07:00
control_test.py
convnet_benchmarks.py
convnet_benchmarks_test.py
core.py Handle bool's correctly in net.Const 2017-08-31 12:02:58 -07:00
core_gradients_test.py warn about orphan StopGradient output 2017-07-20 21:41:41 -07:00
core_test.py Support session in distributed realtime trainer 2017-08-16 10:28:55 -07:00
crf.py
data_parallel_model.py support device ids>10 2017-09-07 00:01:33 -07:00
data_parallel_model_test.py support device ids>10 2017-09-07 00:01:33 -07:00
data_workers.py Caffe2: Refactor the core logic from data_workers.py into parallel_workers.py 2017-08-07 10:14:08 -07:00
data_workers_test.py Caffe2: Refactor the core logic from data_workers.py into parallel_workers.py 2017-08-07 10:14:08 -07:00
dataio.py
dataio_test.py
dataset.py Option to enforce batch size 2017-08-01 22:29:55 -07:00
db_test.py
device_checker.py
dyndep.py
embedding_generation_benchmark.py Benchmark for embedding generation 2017-08-15 14:22:41 -07:00
empty.so
experiment_util.py
extension_loader.py
gradient_check_test.py
gradient_checker.py
gru_cell.py Revert D5589309: modify _LSTM into _RNN to adapt GRU 2017-08-10 16:42:41 -07:00
hsm_util.py
hypothesis_test.py EnsureDense/SparseToDense for CUDA 2017-09-01 09:33:05 -07:00
hypothesis_test_util.py Adding a range operator similar to np.arange 2017-08-18 14:45:56 -07:00
layer_model_helper.py Adding parameter sharing API to Dper2 2017-08-03 00:33:18 -07:00
layer_model_instantiator.py saving/loading CPU/GPU nets 2017-07-23 02:18:15 -07:00
layer_parameter_sharing_test.py Adding parameter sharing API to Dper2 2017-08-03 00:33:18 -07:00
layer_test_util.py Add a method to run a train net multiple times in layer_test_util.py 2017-07-28 19:56:05 -07:00
layers_test.py Create MergeIdListsLayer 2017-08-22 17:00:55 -07:00
lengths_reducer_rowwise_8bit_ops_test.py Rowwise quantization 2017-09-06 10:19:38 -07:00
load_save_test.py
lstm_benchmark.py threaded RNN executor for CPU, multi-stream executor CUDA 2017-09-06 12:26:30 -07:00
memonger.py insert Free ops when blob used last time + memory allocation estimator 2017-09-05 12:03:04 -07:00
memonger_test.py insert Free ops when blob used last time + memory allocation estimator 2017-09-05 12:03:04 -07:00
mkl_test_util.py Implement a filler op test 2017-07-25 14:18:57 -07:00
model_device_test.py
model_helper.py YellowFin GPU class and Python optimizer 2017-08-30 18:32:24 -07:00
mpi_python.cc
muji.py
muji_test.py
net_builder.py Tuning number of parameter servers based on performance estimation job 2017-08-30 18:03:59 -07:00
net_builder_test.py Control flow operators 2017-08-28 20:04:43 -07:00
net_drawer.py
net_printer.py net_printer.to_string() accepts NetDef 2017-08-01 10:17:29 -07:00
net_printer_test.py
optimizer.py YellowFin GPU class and Python optimizer 2017-08-30 18:32:24 -07:00
optimizer_context.py allow param_info to set optimizer 2017-07-12 08:49:48 -07:00
optimizer_test.py Disabled test for equivalency between Caffe2's and Numpy's YellowFin 2017-09-06 13:47:45 -07:00
optimizer_test_util.py Added support for scaling learning rate of Caffe2 optimizers during training 2017-08-25 19:04:47 -07:00
parallel_workers.py Caffe2 [easy]: Better exception logging in parallel_workers/data_workers 2017-08-10 15:27:19 -07:00
parallel_workers_test.py Caffe2: Refactor the core logic from data_workers.py into parallel_workers.py 2017-08-07 10:14:08 -07:00
parallelize_gpu_bmuf_distributed_test.py Added Nesterov 2017-08-11 13:52:43 -07:00
pipeline.py
pipeline_test.py Ability to dequeue and concat multiple records in a single QueueDequeue op 2017-08-31 10:48:59 -07:00
predictor_constants.py
pybind_state.cc Strip Operator Schema in mobile build 2017-08-22 13:31:08 -07:00
pybind_state.h fast simple-net memonger for C++ 2017-07-06 15:17:07 -07:00
pybind_state_gpu.cc
pybind_state_mkl.cc MKL code move 2017-07-26 20:21:55 -07:00
python_op_test.py
queue_util.py Ability to dequeue and concat multiple records in a single QueueDequeue op 2017-08-31 10:48:59 -07:00
record_queue.py
recurrent.py threaded RNN executor for CPU, multi-stream executor CUDA 2017-09-06 12:26:30 -07:00
rnn_cell.py Fix cell/hidden init issue, add copy states to test 2017-09-06 14:16:17 -07:00
schema.py logging the blob that has type error 2017-08-23 21:21:27 -07:00
schema_test.py Return empty Struct when get_field has empty input 2017-08-01 19:49:47 -07:00
scope.py
scope_test.py
session.py Adds the master setup plan to the model exporter. 2017-08-25 16:01:24 -07:00
session_test.py
sparse_to_dense_mask_test.py Add more enforces to SparseToDenseMask operator. 2017-09-02 02:16:24 -07:00
task.py Support session in distributed realtime trainer 2017-08-16 10:28:55 -07:00
test_util.py Clear the operator default engines before running operator tests 2017-08-29 17:47:20 -07:00
text_file_reader.py
timeout_guard.py Revert D5655753: [Caffe2] better straggler exit procedure 2017-08-25 14:23:09 -07:00
toy_regression_test.py
tt_core.py
tt_core_test.py
utils.py Update proto definition 2017-08-22 19:01:18 -07:00
visualize.py Python 3 compatible integer division 2017-07-06 11:47:12 -07:00
workspace.py ApplyTransformIfFaster 2017-08-17 15:36:51 -07:00
workspace_test.py ApplyTransformIfFaster 2017-08-17 15:36:51 -07:00