pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-15 21:00:47 +00:00

Author	SHA1	Message	Date
Aapo Kyrola	e89474c496	fix forward_only mode Summary: Forward-only mode had broken at some point. Two things: RNNCell did not pass the parameter to recurrent.py and also recurrent.py was broken if forward_only=True after python3 codemod. Added test to rnn_cell_test to actually check the forward only parameter is passed to prevent future breakage. Reviewed By: jmp84 Differential Revision: D5639306 fbshipit-source-id: b1bbc39d59c3f3734b2f40a1c2f3740c733e0bd4	2017-08-17 10:19:04 -07:00
Jerry Zhang	a63e7314f3	Adding 1d-2d-3d Schemas for Conv and Pool Summary: Add Conv and Pool operators with dimensions. Reviewed By: bddppq Differential Revision: D5588614 fbshipit-source-id: 2552c40dc3ca180a6ab51817d60f0b85b97885d5	2017-08-17 09:45:54 -07:00
Jerry Zhang	4ca5735753	Allow inplace for spatial_bn_op Summary: att Reviewed By: Yangqing Differential Revision: D5644717 fbshipit-source-id: 1a020fe4ca7028056ce7bebddb7bfd1437998530	2017-08-17 09:18:55 -07:00
Badri Narayan Bhaskar	ae2aad9c0d	Operator to Merge ID_LIST features Summary: As an alternative to sharing embeddings, we want to explore merging the ID_LISTs in the net. This commit adds an operator to merge many ID_LIST features into a single one. Differential Revision: D5481523 fbshipit-source-id: 446121122a32de5682d5d75a165370bc8d776d03	2017-08-17 01:16:00 -07:00
Jingfei Du	b3029df1d0	Added window mode for caffe2 sequence operator Summary: This can be used for local attention to mask elements outside of a window Reviewed By: jamesr66a Differential Revision: D5643677 fbshipit-source-id: 92b33866258ccc7307d5bcf08234610aa3fb152d	2017-08-16 21:34:29 -07:00
Ahmed Taei	a0fe96d7cd	Rewrite memonger DAG in C++. Summary: This diff replaces the main of the memonger for dag algorithm _compute_blob_recycling_for_dag with a c++ implementation. Reviewed By: akyrola Differential Revision: D5544219 fbshipit-source-id: 9f868880c8d0eb997ad3dd39433f9d0b9216d303	2017-08-16 16:17:15 -07:00
Yiming Wu	a104dac193	remove unsed code and bring back single benchmark mode Summary: the old gpu single benchmark mode is lost in recent changes. We still need this mode to benchmark some operators. I also removed some unused ancient code Reviewed By: azzolini Differential Revision: D5628501 fbshipit-source-id: c5d2c6c99af18c41bead5d86c46a42f05821e2ff	2017-08-16 14:06:31 -07:00
Kevin Wilfong	1f47a80e88	Caffe2: diagonal fill op Summary: Caffe2: diagonal fill op Reviewed By: panshen1 Differential Revision: D4775640 fbshipit-source-id: bb388ffe223e6b153d4cde1fdad6f84a2bb65b0f	2017-08-16 13:05:11 -07:00
Bor-Yiing Su	30616ee309	Fixes the broken checkpoint test. Summary: Since we temporarily disable checkpointing the readers, we need to rename all the node names in the test to make it pass. Reviewed By: azzolini Differential Revision: D5640930 fbshipit-source-id: 1e61be31ddf9b6e28efd2eb8e6e91e63dcd83154	2017-08-16 11:24:50 -07:00
Lei Chen	14950a9082	Support session in distributed realtime trainer Summary: Convert from PlanDef ProtoBuf into python Plan object by recursively creating Nets and ExecutionSteps. Also support running Plan object directly in Session. Reviewed By: azzolini Differential Revision: D5608393 fbshipit-source-id: c0ae3b6da743a759af6db3b614a5a3935fe0b34c	2017-08-16 10:28:55 -07:00
Aapo Kyrola	a53192e334	Revert D5001637: [Caffe2][RNN] Threaded dependency-aware RNNExecutor (frontier/diagonal execution). Summary: This reverts commit 3d0a71593d73a9ff22f4c1a5c9abf2a4a0c633c8 bypass-lint Differential Revision: D5001637 fbshipit-source-id: 4d6250ae7e66ea0aa635a68d943d552e5db65b69	2017-08-16 03:21:49 -07:00
Aapo Kyrola	453c60ce28	Threaded dependency-aware RNNExecutor (frontier/diagonal execution). Summary: This diff adds dependency-aware concurrent/parallel execution of operators in stepnets. For CPU, we use multi-threaded execution. For CUDA, we use multiple streams and cuda events for parallelism and dependency tracking. Much of the diff is about computing dependency graph, which was quite tricky because we need to also avoid write-races of multiple operators running in multiple timesteps in parallel. Also, recurrent blobs "change name" when passing over timestep ("_prev"), so that needs to be handled as well. This diff also restores the link-ops that I unlanded earlier. The performance gain of this diff is very good for CPU (same perf as with static_dag, even better on forward-only). On CUDA, the gains are modest, at least with the sizes i was testing with. Reviewed By: salexspb Differential Revision: D5001637 fbshipit-source-id: 3d0a71593d73a9ff22f4c1a5c9abf2a4a0c633c8	2017-08-15 23:55:15 -07:00
Bor-Yiing Su	49ec942825	Temporarily disables the checkpoints for the readers. Summary: The hive reader checkpoints are broken because of D5582328. This breaks our offline simulator test as well. This is a temporary fix that disables the checkpoints for readers. Reviewed By: azzolini Differential Revision: D5637719 fbshipit-source-id: 4f31ae534cb7e981fcacbb721cbb2420249fad91	2017-08-15 19:36:11 -07:00
Yangqing Jia	1db7a99249	disable travis test for dpm test Summary: After this, we should have test going back to all green. Closes https://github.com/caffe2/caffe2/pull/1058 Reviewed By: harouwu Differential Revision: D5637495 Pulled By: Yangqing fbshipit-source-id: ac3ab5a27bc56e3bb08fa81aa8ed186cb7e8832b	2017-08-15 19:17:41 -07:00
Luke Yeager	f92fdd850d	Important typo in resnet50_trainer Summary: Closes https://github.com/caffe2/caffe2/pull/1092 Reviewed By: Yangqing Differential Revision: D5637489 Pulled By: harouwu fbshipit-source-id: 13609a3e14a45e640849268821fd8565fd7aae4d	2017-08-15 19:03:15 -07:00
Douglas Chen	e95b79a69c	Benchmark for embedding generation Summary: Adds a benchmark comparing two methods used to generate positional embeddings, table-based and sinusoid (as in the Transformer paper). Reviewed By: jamesr66a Differential Revision: D5625633 fbshipit-source-id: faee2d20ea0c3d9c41479c5114fa010ac49fab24	2017-08-15 14:22:41 -07:00
Alexander Sidorov	52befa4802	DataParallelModel: take param_init_net into account in _InferBlobDevice Summary: Here is my example: For static RNN timestep is created as a part of param_init_net. Before DPM assumed that it is CUDA blob by default and it participated in broadcasting causing Copy on line 798 to fail. No device mapping is correct for this blob. Reviewed By: akyrola Differential Revision: D5631716 fbshipit-source-id: 28c3eb17ecc3080c95c41d69a60bf7262d3907d4	2017-08-15 12:06:46 -07:00
Aapo Kyrola	c05c500a82	check _grad suffix Summary: Memonger had a subtle bug which caused it to recycle "splitinfo" outputs of Concat/Split. That is bad since they are in CPU device, and woult cause them to be realloaced. This caused big slowdown with Kaiming's trainer. Bug was that we checked for gradients as contaning "_grad" in the name, although we should only allow it as a suffix. Admittedly, this is not elegant to do string checking anyways, but that is how Caffe2 works now. Reviewed By: asaadaldien Differential Revision: D5627251 fbshipit-source-id: c12be2323109bf81c3725d8884c7ef024e010bd5	2017-08-14 19:47:59 -07:00
Juan Miguel Pino	434fa7f694	Reduce memory usage for dot attention Summary: Title Differential Revision: D5569996 fbshipit-source-id: c705fc7870ac3e71a071c3f808ac885a82334af2	2017-08-14 12:35:50 -07:00
James Reed	ffd9316b03	Use SequenceMask op in attention code for sequence masking Summary: Use the new SequenceMask op to mask out invalid positions in the attention mechanism rather than using PackSegments and UnpackSegments. This should help us on several fronts, including elision of host<>device copies and using fewer intermediate blobs Differential Revision: D5619156 fbshipit-source-id: e59c644236cee02f853d8743f9a938fb10adc73b	2017-08-12 19:17:49 -07:00
James Reed	a985355935	Gradient for SequenceMaskOp Summary: Implement backward pass for a SequenceMaskOp to replace https://github.com/caffe2/caffe2/blob/master/caffe2/python/attention.py#L54-L72. Reviewed By: akyrola Differential Revision: D5618373 fbshipit-source-id: b831fa69f51d9468c858961f922564159e12b46f	2017-08-12 14:34:29 -07:00
James Reed	0a828768e9	Implement SequenceMaskOp forward pass Summary: Implement forward pass for a SequenceMaskOp to replace https://github.com/caffe2/caffe2/blob/master/caffe2/python/attention.py#L54-L72. This implements two modes: a sequence-length based mode and a matrix triangle mode. Reviewed By: akyrola Differential Revision: D5615493 fbshipit-source-id: a2ce4a8e655d9b720049010a7856be052c5567eb	2017-08-12 14:34:28 -07:00
Bor-Yiing Su	8a5bdc383e	Fixes the flaky upload test Summary: The LocalSession does not work with the multi-node definitions. The test becomes flaky because of that. The fix is to create different LocalSession for each Node(), and run each node sequentially. Differential Revision: D5617857 fbshipit-source-id: a8079a90291b4c8b5aa6b471c33c06d18e59976c	2017-08-11 18:58:24 -07:00
Bor-Yiing Su	404f8ee9b4	Extends the jobrunner to support uploading checkpoints. Summary: 1. Adds one more step in the JobRunner class to upload checkpoints. 2. Adds one function to return the name of the checkpoint given the name of the node. Reviewed By: andrewwdye Differential Revision: D5597130 fbshipit-source-id: 570a55785e6227859e1115326d6cab077f0e7f72	2017-08-11 14:17:17 -07:00
Zhaoming Wu	399fc9fb09	Added Nesterov Summary: Added Nesterov momentum as an option for BMUF and corresponding tests Reviewed By: asaadaldien Differential Revision: D5599888 fbshipit-source-id: 30819c9e689347c8b75daddc7444bea9f54193ae	2017-08-11 13:52:43 -07:00
Jerry Pan	9372ff7a86	Caffe2: support Tensor in BlobsQueueDB Summary: Caffe2: support Tensor in BlobsQueueDB Reviewed By: kevinwilfong Differential Revision: D5589616 fbshipit-source-id: 66aa6092b6403960c4858abd986771b58be94106	2017-08-11 11:21:14 -07:00
Simon Layton	85788a0f65	Add TensorCore support Summary: Add support for TensorCore convolution and gemm on Volta hardware. Currently built on top of #1055 Closes https://github.com/caffe2/caffe2/pull/1056 Differential Revision: D5604068 Pulled By: Yangqing fbshipit-source-id: 100f67e26ed5fabb1dbb31dcd77f7ecb84de4ee7	2017-08-10 20:16:48 -07:00
Alexander Sidorov	a7be496fe2	Revert D5589309: modify _LSTM into _RNN to adapt GRU Summary: This reverts commit f5af67dfe0842acd68223f6da3e96a81639e8049 bypass-lint Differential Revision: D5589309 fbshipit-source-id: 79b0a3a9455829c3899472a1368ef36dc75f6e14	2017-08-10 16:42:41 -07:00
Kittipat Virochsiri	b91c2f5064	Make reservoir sampling thread safe Summary: Guarding reservoir sampling with mutex & fix the bug in counting number of new entries. Reviewed By: chocjy Differential Revision: D5503300 fbshipit-source-id: fd6b0bacb71fbab99d6d5df2c72da523fba02847	2017-08-10 15:27:21 -07:00
Kittipat Virochsiri	9c4872f4bc	Reservoir sampling with object ID deduplication Summary: Adding the option to dedup by object ID so that more frequent objects are not present more than once in the reservoir Reviewed By: chocjy Differential Revision: D5503109 fbshipit-source-id: e36c3ad8eea134d6c10a4c875fceadc0f843c976	2017-08-10 15:27:20 -07:00
Kittipat Virochsiri	f78af06f1b	Features collection with reservoir sampling Summary: Make the candidate pool less localized Reviewed By: chocjy Differential Revision: D5453289 fbshipit-source-id: 848cb7551d7112f6f47f2cf647bb0daca6eff341	2017-08-10 15:27:20 -07:00
Kevin Wilfong	5dba88b40b	Caffe2 [easy]: Better exception logging in parallel_workers/data_workers Summary: Instead of printing the exception using print() use traceback.print_exc() This way you get a stack trace Reviewed By: jay-mahadeokar Differential Revision: D5604642 fbshipit-source-id: f8cb67e554305cd2fbed384a4a2040fa2b16e7c0	2017-08-10 15:27:19 -07:00
James Cross	4758bd851b	rectify args btw. train and translate Summary: Make the command-line arguments pertaining to model architecture the same as between train.py and translate.py. Also use s() scoping function for all intermediate blobs in attention.py (this is for comatibility with multi-headed attention). Differential Revision: D5594312 fbshipit-source-id: cadf51d854b5a9174ec913f32c655be2abf111e5	2017-08-10 15:27:18 -07:00
Christopher Hay	f2dfb40302	Added amplitude argument to SinusoidPositionEncodingOp Summary: In order to control the absolute scale/magnitude of the output of this op, added a tuning parameter: amplitude Reviewed By: jamesr66a Differential Revision: D5596574 fbshipit-source-id: 3b7e316de55cce6fd686da70aa5658ec3e99b070	2017-08-10 15:27:17 -07:00
Ahmed Taei	5bb1e6b817	Allow passing unsymmetric 2d kernels to brew.conv. Reviewed By: jay-mahadeokar Differential Revision: D5598523 fbshipit-source-id: 47135a8562f7c720badb2be677cb79730dc417a0	2017-08-10 15:27:16 -07:00
Kittipat Virochsiri	eb85258beb	CreateMapOp Summary: Add operator to create empty map Reviewed By: xianjiec Differential Revision: D5454652 fbshipit-source-id: ecad6cc58572b378962af08cf02063ef546ed58f	2017-08-09 13:32:19 -07:00
Tao Wu	7b86a34610	modify _LSTM into _RNN to adapt GRU Summary: GRU is different than LSTM that it only has hidden states but no cell states. So in this case, reusing the code of _LSTM is problematic, as we need to delete the part of creating cell state, and change many other places that use hard-coded 4 (hidden_all, hidden, cell_all, cell) into 2 (hidden_all, hidden). Otherwise GRU will break during the backward pass, when the optimizer tries to apply gradient to each of the parameters, because cell state is never used, so it does not have gradients for the corresponding parameters (i.e., cell_state_w, cell_state_b). Differential Revision: D5589309 fbshipit-source-id: f5af67dfe0842acd68223f6da3e96a81639e8049	2017-08-09 13:24:45 -07:00
Aaron Markham	784ba07bf3	updated downloader to use s3 url without a redirect via the vanity url Summary: Model downloader was broken after the move on s3 to the vanity url, download.caffe2.ai. Using this as the url base hits a redirect, and will result in the script throwing a 403 error. Rather than upgrading to urllib2 or putting in a bunch of code to handle a redirect on urllib, we can just use the non-vanity base url. Closes https://github.com/caffe2/caffe2/pull/1020 Reviewed By: Yangqing Differential Revision: D5568686 Pulled By: aaronmarkham fbshipit-source-id: d88a6b3e1b7955835fc03b036dc54dec48316e7f	2017-08-09 12:25:30 -07:00
Junjie Bai	1ce95090ca	Add support for specifying engine preferences Reviewed By: Yangqing Differential Revision: D5460994 fbshipit-source-id: 08a8af699eebec37defc070389a8415b3e81ac16	2017-08-09 00:47:18 -07:00
Priya Goyal	5c77cc8182	Exposing num_workers as parameter and enable recycling activations Summary: as promised, a separate diff for dpm changes I made in experimental code Reviewed By: pietern Differential Revision: D5551304 fbshipit-source-id: 9013aeab6c388b1c415ffb2e36fb8dd6b8cf90b0	2017-08-08 19:48:41 -07:00
Andrei Chtcherbatchenko	a2204f0b1e	Caffe2: Write CUDA version of OneHot operator Summary: This diff implements CUDA version of OneHot operator. Reviewed By: bddppq Differential Revision: D5578543 fbshipit-source-id: 55b70e8ec6ee34b647b9140fecbba31b6968f403	2017-08-08 18:17:39 -07:00
Long Jin	ef64a4f6b2	Add conv layer and layer tests Reviewed By: xianjiec Differential Revision: D5569206 fbshipit-source-id: ed836315f3ee4d7983da94f2633a3085fe99194d	2017-08-08 10:57:43 -07:00
Jianlong Zhong	152d2ae3a8	Implement CUDA version of GRU operator Summary: Add CUDA version of GRU operator Reviewed By: jamesr66a Differential Revision: D5571043 fbshipit-source-id: 332aa64fc8a9116cc33382f2b2907080e58c13b3	2017-08-08 10:57:40 -07:00
James Cross	9fcf676cfa	testing for open-source seq2seq Summary: Fix multilayer inference in Caffe2 example seq2seq code. (Rely on LSTMWithAttentionDecoder.apply rather than fixed state indices to determine stepwise decoder output.) Also assorted updates to bring code in line with changes elsewhere in the codebase, and added unit tests which ensure that training and inference networks generate the same loss, which should make these problems much easier to identify in future. Reviewed By: jamesr66a Differential Revision: D5579803 fbshipit-source-id: 6e0f27340d981990ab8d0da58e63793222e7be87	2017-08-08 10:09:41 -07:00
Chonglin Sun	8ad382df3c	implement LengthsTopK operator Summary: It was reverted previously because of lack of schema for gradient op. Added it back and resend. difference between this diff and previous reverted diff: 1. added schema for gradient operator 2. change line:95 in kmax_pooling_op.h from CAFFE_ENFORCE to CAFFE_ENFORCE_GE Reviewed By: xianjiec Differential Revision: D5568867 fbshipit-source-id: 39813b389a5da803967a561249793afdfce00c58	2017-08-07 18:19:29 -07:00
Ahmed Taei	8af625ede2	Implement gradients for Col2Im and Im2Col operators Reviewed By: jay-mahadeokar Differential Revision: D5576385 fbshipit-source-id: a0ca4f704fd861f7cc67079041b1d0772fc66920	2017-08-07 15:51:30 -07:00
Ahmed Taei	647f35e742	Fix SyncAllParamsDistributed for Python 3x Summary: In Python 3x dictionary values aren't a list and can't be concatenated to a list this diff should fix that. Reviewed By: andrewwdye Differential Revision: D5576724 fbshipit-source-id: c60441857ceceb9c4a71122d2db5e9abad6d3fc2	2017-08-07 14:23:32 -07:00
Ben Zhang	42fb87d0b1	L1Distance Row-wise, instead of cumulative Summary: The L1Distance operator used to return a single value denoting the L1 of the entire input, instead of a vector for each input value. This fixes that. Reviewed By: Yangqing Differential Revision: D5570385 fbshipit-source-id: fbab0e0c9262ccbdb3af27262b8baacdeb2d0fc9	2017-08-07 14:09:25 -07:00
Jacqueline Xu	a1bf14d8e6	Building new randomized sparse nn model Summary: New hybrid randomized sparse nn, which allows layers of sparse NN model to be randomized, semi-random, or learnable Reviewed By: chocjy Differential Revision: D5416489 fbshipit-source-id: eb8640ddf463865097ba054b9f8d63da7403024d	2017-08-07 12:48:58 -07:00
Zhicheng Yan	e7192c3b91	image_input_op_dense_multi_label Summary: To train an image model, we also can use label embedding vector as supervision as opposed to using SoftmaxLoss/SigmoidCrossEntropyLoss. In such case, the label is a dense vector. This diff enables such use cases. Reviewed By: panshen1 Differential Revision: D5556203 fbshipit-source-id: 52c61495e02fab457dc2d43e3345d7dbd5580ab7	2017-08-07 12:38:16 -07:00

1 2 3 4 5 ...

1089 commits