Commit graph

13 commits

Author SHA1 Message Date
Ahmed Taei
0a25926f4b CUDA implementation for GatherPadddingOp
Summary: AT

Reviewed By: enosair

Differential Revision: D6561996

fbshipit-source-id: ad03d6db8d4318e426ff96569bb3c93cba696926
2017-12-15 16:05:31 -08:00
Pieter Noordhuis
3d1135c842 Skip remove_padding test because it is flaky
Summary:
Must be fixed in #1547
Closes https://github.com/caffe2/caffe2/pull/1548

Reviewed By: jhcross

Differential Revision: D6456373

Pulled By: pietern

fbshipit-source-id: 484a58e31506acfc8b8a0954f76796d14dfdfda3
2017-12-01 09:47:31 -08:00
James Cross
0e21cd2eae CUDA implementation of RemovePadding operator
Summary:
This is a CUDA implementation of the RemovePadding operator, modeled on akyrola's implementation for AddPadding.

There's also an incidental spelling correction: GetAddPadingGradient -> GetAddPaddingGradient.

Reviewed By: akyrola

Differential Revision: D6439594

fbshipit-source-id: b29cd0c252021c58e150b901bbaad28a3bd3cc4a
2017-11-29 18:48:01 -08:00
Aapo Kyrola
a08909160e fix bug in CUDA AddPadding when lenghts output is not provided
Summary:
enosair caught bug that the operator returned too early if the lengths output was not provided. Fixed and added testing.

+ noticed the op does not support case when no lengths-input is provided. Added a temporary CAFFE_THROW for this case, will fix later

Reviewed By: enosair

Differential Revision: D6405585

fbshipit-source-id: a81717e1b39afde6e900ddd9049b820943aea9f1
2017-11-27 15:14:07 -08:00
Aapo Kyrola
0954775d28 AddPadding CUDA version
Summary: CUDA version of the AddPadding op. It first executes a prefix-sum using Cub to compute the cumulative lenghts array. Then it launches a kernel that uses this information to fill the output tensor with start, end paddding and the actual contents.

Reviewed By: asaadaldien

Differential Revision: D6391413

fbshipit-source-id: 45b431e5976674729e53cb4752c7753c1d8a69e8
2017-11-22 18:17:21 -08:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
James Cross
79c3a3af54 add gpu support for caffe2-seq2seq
Summary: Adding synchronous optimization on GPUs to the translation training pipeline, via data_parallel_model.Parallelize_GPU, which needs to be updated so there is some way of performing sparse parameter updates (e.g., on embedding tables), whether on GPU or CPU.

Reviewed By: urikz

Differential Revision: D4631914

fbshipit-source-id: 9cdd655f7dbda3f9b2733d459228b3e097892441
2017-03-17 05:19:14 -07:00
James Cross
c5621ded31 Allow use of ReversePackedSegs operator in CUDA context
Summary: ReversePackedSegs operator for CUDA. Input "lengths" (static integers) required to be in CPU memory.

Differential Revision: D4661281

fbshipit-source-id: c800c316c34015ba8e732dcbcaa8c4edaffdfeab
2017-03-09 15:03:55 -08:00
Yangqing Jia
5eb836880d Add unittest.main() lines to test scripts under python/operator_test
Summary:
Needed by oss.

This is done by running the following line:

  find . -name "*_test.py" -exec sed -i '$ a \\nif __name__ == "__main__":\n    import unittest\n    unittest.main()' {} \;

Reviewed By: ajtulloch

Differential Revision: D4223848

fbshipit-source-id: ef4696e9701d45962134841165c53e76a2e19233
2016-11-29 15:18:37 -08:00
Yangqing Jia
d1e9215184 fbsync 2016-10-07 13:08:53 -07:00
Yangqing Jia
b23e51d467 chunky sync 2016-09-06 15:55:19 -07:00
Yangqing Jia
bcea409c82 sync 2016-07-28 15:06:43 -07:00
Yangqing Jia
09bed67e4f add untracked files 2016-07-21 11:26:41 -07:00