pytorch/caffe2/python
Deepak Gopinath 57ecd20197 seq2seq open source implementation
Summary:
OSS implementation of a seq2seq model in Caffe2. The script uses the Seq2SeqModelCaffe2 class to build and run the model. It takes training data as a text file with one sentence per line, builds a vocabulary, generates batches according to the batch size, and runs the net for a configurable number of epochs. It prints the total scalar loss at the end of each epoch.
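The data flow described above (sentences, then vocabulary, then padded batches, then a per-epoch loss total) can be sketched in plain Python. This is a minimal illustration, not the code in the commit: the names build_vocab, encode, make_batches and run_epochs, the <PAD>/<UNK> tokens, whitespace tokenization, and the step_fn callback standing in for one run of the net are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical special tokens; the actual script may use different ones.
PAD, UNK = "<PAD>", "<UNK>"


def build_vocab(sentences):
    """Map each whitespace-separated token to an integer id (most frequent first)."""
    counts = Counter(tok for s in sentences for tok in s.split())
    vocab = {PAD: 0, UNK: 1}
    # Sort by descending count, then alphabetically, so ids are deterministic.
    for tok, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        vocab[tok] = len(vocab)
    return vocab


def encode(sentence, vocab):
    """Turn a sentence into a list of ids, using UNK for unseen tokens."""
    return [vocab.get(tok, vocab[UNK]) for tok in sentence.split()]


def make_batches(sentences, vocab, batch_size):
    """Group sentences into batches, padding each batch to its longest sentence."""
    batches = []
    for start in range(0, len(sentences), batch_size):
        chunk = [encode(s, vocab) for s in sentences[start:start + batch_size]]
        max_len = max(len(ids) for ids in chunk)
        batches.append([ids + [vocab[PAD]] * (max_len - len(ids)) for ids in chunk])
    return batches


def run_epochs(batches, num_epochs, step_fn):
    """Call step_fn (assumed to return a scalar loss) on every batch, per epoch."""
    epoch_losses = []
    for epoch in range(num_epochs):
        total_loss = sum(step_fn(batch) for batch in batches)
        print("epoch %d: total loss %.4f" % (epoch, total_loss))
        epoch_losses.append(total_loss)
    return epoch_losses
```

In the real script, step_fn would correspond to feeding a batch into the workspace and running the training net once; here it is left as a parameter so the sketch stays independent of Caffe2.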

All FBLearner and neural_mt type system dependencies have been removed. Unimplemented and unnecessary methods have been removed to make the script simpler.
fblearner/flow/projects/langtech/translation/neural_mt/model_util_caffe2.py has been moved to caffe2/caffe2/python/examples/seq2seq_util.py and remains unchanged.

Potential TODOs:
  - Get the model running on GPU. Only GatherOp lacks a corresponding GPU implementation. Try adding CopyGPUToCPU before and CopyCPUToGPU after Gather, and use a CUDA DeviceOption.
  - Add evaluation on test data with a suitable metric (perplexity? BLEU?)
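For the evaluation TODO, perplexity can be computed as the exponential of the average negative log-likelihood per token. The sketch below assumes a hypothetical input: a flat list of natural-log probabilities that the model assigned to each reference token in the test set. How those values would be fetched from the Caffe2 net is not covered here.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(-(1/N) * sum of natural-log token probabilities)."""
    n = len(token_log_probs)
    if n == 0:
        raise ValueError("need at least one token to compute perplexity")
    return math.exp(-sum(token_log_probs) / n)
```

As a sanity check, a model that assigns probability 0.25 to every token has a perplexity of 4, which matches a uniform choice among four options.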

Reviewed By: urikz

Differential Revision: D4653333

fbshipit-source-id: 1c7d970ebc86afe23fad4d48854296bf54eb0f77
2017-03-09 16:18:08 -08:00
docs Documentation generation to wiki 2017-02-15 16:00:44 -08:00
examples seq2seq open source implementation 2017-03-09 16:18:08 -08:00
layers clean old unit test, add sum processor and sqrt pooling 2017-03-08 23:04:19 -08:00
mint goodbye old brewery 2017-01-04 20:58:35 -08:00
models Added model downloader 2017-02-22 12:47:15 -08:00
operator_test Allow use of ReversePackedSegs operator in CUDA context 2017-03-09 15:03:55 -08:00
_import_c_extension.py fbsync. TODO: check if build files need update. 2016-11-15 00:00:46 -08:00
attention.py Implement recurrent attention in C2 2017-03-08 11:21:28 -08:00
caffe_translator.py translator fix to solve Aaron's issue 2017-02-13 11:19:13 -08:00
caffe_translator_test.py protected legacy_pad_, replace DeleteDropout with is_test=True 2016-07-29 11:44:55 -07:00
checkpoint.py Fix issues pickling jobs 2017-02-21 20:47:27 -08:00
checkpoint_test.py Fix issues pickling jobs 2017-02-21 20:47:27 -08:00
CMakeLists.txt CMake completions work 2017-01-11 16:59:22 -08:00
cnn.py Do not initialize BN params if init_params is false. 2017-02-27 20:19:03 -08:00
context.py Make ContextManager thread-safe 2017-02-13 19:45:35 -08:00
context_test.py Make ContextManager thread-safe 2017-02-13 19:45:35 -08:00
control.py Better visualization for gpu training plan 2016-12-21 09:29:43 -08:00
control_test.py fbsync. TODO: check if build files need update. 2016-11-15 00:00:46 -08:00
convnet_benchmarks.py Convnet benchmark cudnn_ws 2017-03-02 15:32:37 -08:00
convnet_benchmarks_test.py chunky sync - build scripts to be written 2016-07-21 10:16:42 -07:00
core.py New approach to metrics. 2017-03-06 14:48:16 -08:00
core_gradients_test.py add inference for gradient ops + a couple of missing shape inference functions + fix to scalars 2017-02-28 23:33:32 -08:00
core_test.py NextScopedBlob with well-defined behavior and respect namescope 2017-02-16 17:16:36 -08:00
data_parallel_model.py data_parallel_model support for sparse gradients and CPU ops 2017-03-09 13:48:41 -08:00
data_parallel_model_test.py data_parallel_model support for sparse gradients and CPU ops 2017-03-09 13:48:41 -08:00
data_workers.py Remove use of logging module and np.random.randint() due to deadlocks with forks 2017-03-01 03:32:56 -08:00
data_workers_test.py close blobs queues when stopping + test 2017-02-27 10:07:57 -08:00
dataio.py fix typo in TextFileReader 2017-02-21 14:02:48 -08:00
dataio_test.py NextScopedBlob with well-defined behavior and respect namescope 2017-02-16 17:16:36 -08:00
dataset.py fbsync. TODO: check if build files need update. 2016-11-15 00:00:46 -08:00
db_test.py Fix db_test under tsan 2016-11-29 15:18:37 -08:00
device_checker.py chunky sync 2016-09-06 15:55:19 -07:00
dyndep.py fbsync. TODO: check if build files need update. 2016-11-15 00:00:46 -08:00
experiment_util.py use Pieter-MPI and fb.distributed 2016-11-29 15:18:36 -08:00
extension_loader.py fbsync 2016-10-07 13:08:53 -07:00
gradient_check_test.py Fix test cases: tensor of size 0 not supported by GPU ops yet. 2016-12-15 19:59:24 -08:00
gradient_checker.py fbsync 2016-10-07 13:08:53 -07:00
hsm_util.py Generate huffman tree 2017-01-19 16:14:23 -08:00
hypothesis_test.py add AccumulateHistogramOp 2017-03-08 19:37:32 -08:00
hypothesis_test_util.py Allow use of ReversePackedSegs operator in CUDA context 2017-03-09 15:03:55 -08:00
introspect_vis.py User input (Conv out, etc.) 2017-03-08 13:49:45 -08:00
layer_model_helper.py Use new metric interfaces in trainer workflows. 2017-03-07 12:46:52 -08:00
layer_model_instantiator.py Migrate realtime training workflows to use new metrics. 2017-03-08 23:49:41 -08:00
layers_test.py Add a way to describe layers in a more AdHoc manner. 2017-02-27 23:30:39 -08:00
load_save_test.py Add validation checks to load op 2017-03-06 09:46:35 -08:00
lstm_benchmark.py LSTM benchmark (Caffe2 RNN based) 2017-02-28 23:17:26 -08:00
memonger.py Fixes to topological sort, canonical blob naming, sharing final blob 2017-01-25 15:14:26 -08:00
memonger_test.py Gradient Input memory sharing using memonger blob sharing 2017-01-09 19:44:23 -08:00
mkl_test_util.py MKLDevice and MKLOperator 2016-12-15 19:59:24 -08:00
model_device_test.py Comment out NHWC Alexnet test for now 2017-01-23 13:59:29 -08:00
model_helper.py Added editDistance helper to caffe2 operators 2017-02-28 13:31:56 -08:00
mpi_python.cc Move mpi_python.cc to the python folder to be more consistent about source file locations. 2017-01-09 10:59:39 -08:00
muji.py fbsync 2016-10-07 13:08:53 -07:00
muji_test.py chunky sync - build scripts to be written 2016-07-21 10:16:42 -07:00
net_builder.py Improve "reporter net" design 2017-02-21 20:17:40 -08:00
net_builder_test.py Improvements+fixes for NetBuilder 2017-01-03 16:59:24 -08:00
net_drawer.py Add model graph to dper_example 2017-02-07 13:03:54 -08:00
net_printer.py Add task outputs and stop signals to net_printer 2017-03-07 01:21:40 -08:00
net_printer_test.py Debug/Analysis tools for Jobs/ExecutionSteps 2017-02-06 17:31:20 -08:00
optimizer.py refactor and modulize optimizers 2017-03-07 18:46:47 -08:00
optimizer_test.py refactor and modulize optimizers 2017-03-07 18:46:47 -08:00
optimizer_test_util.py refactor and modulize optimizers 2017-03-07 18:46:47 -08:00
pipeline.py Better names for nets, steps and tasks 2017-02-09 16:33:54 -08:00
pybind_state.cc Make ModelExporter.load_from_db() load to specific workspace 2017-03-08 09:31:42 -08:00
pybind_state.h Allow PythonOp to access the workspace 2016-12-05 11:53:26 -08:00
pybind_state_gpu.cc Cudnn v6 2017-02-28 17:46:33 -08:00
pybind_state_mkl.cc Expose MKLMemory to the Python Feed and Fetch interface, and misc changes 2016-11-29 15:18:36 -08:00
python_op_test.py Allow PythonOp to access the workspace 2016-12-05 11:53:26 -08:00
queue_util.py Better names for nets, steps and tasks 2017-02-09 16:33:54 -08:00
record_queue.py chunky sync 2016-09-06 15:55:19 -07:00
recurrent.py Implement recurrent attention in C2 2017-03-08 11:21:28 -08:00
schema.py Add a way to describe layers in a more AdHoc manner. 2017-02-27 23:30:39 -08:00
schema_test.py schema.Struct.__add__ 2017-02-06 13:47:58 -08:00
scope.py fbsync. TODO: check if build files need update. 2016-11-15 00:00:46 -08:00
scope_test.py fbsync. TODO: check if build files need update. 2016-11-15 00:00:46 -08:00
session.py Default LocalSession to current workspace. 2017-03-01 16:03:18 -08:00
session_test.py NextScopedBlob with well-defined behavior and respect namescope 2017-02-16 17:16:36 -08:00
sparse_to_dense_mask_test.py Fix few more operators to handle empty batches correctly. 2016-11-29 15:18:37 -08:00
task.py Gather perf counters for distributed jobs 2017-02-21 22:06:25 -08:00
test_util.py MKL convolution operator 2017-01-23 09:59:30 -08:00
text_file_reader.py fix typo in TextFileReader 2017-02-21 14:02:48 -08:00
timeout_guard.py Euthanize a process with timeout 2017-03-01 11:38:11 -08:00
toy_regression_test.py sync 2016-08-10 11:02:15 -07:00
tt_core.py sync 2016-08-10 11:02:15 -07:00
tt_core_test.py sync 2016-08-10 11:02:15 -07:00
utils.py Add a create your own dataset tutorial 2017-02-22 03:31:47 -08:00
visualize.py chunky sync 2016-05-13 14:43:48 -07:00
workspace.py backup functions for non-cuda cases 2017-02-28 22:07:54 -08:00
workspace_test.py Remove redundant and failing test of FeedBlob asserts 2016-12-22 14:59:28 -08:00