pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-15 21:00:47 +00:00

Author	SHA1	Message	Date
Aaron Markham	58f7f2b441	doxygen python block added Summary: Closes https://github.com/caffe2/caffe2/pull/226 Differential Revision: D4793550 Pulled By: JoelMarcey fbshipit-source-id: cc33e58186304fa8dcac2ee9115dcc271d785b1e	2017-03-29 06:46:16 -07:00
Deepak Gopinath	422c65ca35	Removing unnecessary Copy after fixing gradients for external parameters Summary: Apart from copying gradient blobs for inputs with initial_cell_input, we needed to perform a similar operation for external parameters used by the step net Reviewed By: salexspb Differential Revision: D4752259 fbshipit-source-id: 13ee48cf583ed86221a4cc1cc9f57f5c3a7d2450	2017-03-23 15:04:22 -07:00
Yury Zemlyanskiy	ea66516d5e	Output attention weights from apply_xxx_attention methods Summary: OSS diff. We need it later for beam decoding. Differential Revision: D4747785 fbshipit-source-id: ce2d53ee2434216ace3c4ddbd40a9b68e9db7ec5	2017-03-21 19:01:58 -07:00
Yury Zemlyanskiy	93ff338ca7	Beam decoder for NMT in Caffe2 Summary: yolo5 Differential Revision: D4685076 fbshipit-source-id: b5534e441bb453f90e5210294f2dfff6b5c3b5b1	2017-03-20 22:03:59 -07:00
James Reed	33f41c06c0	Remove more instances of batch_size Summary: D4734505 part 2. Remove more instances of the batch_size parameter Reviewed By: urikz Differential Revision: D4736906 fbshipit-source-id: fc9d374e9308017d61c427890364c5ab9cec2edf	2017-03-19 22:31:30 -07:00
James Reed	17da5856ed	Remove batch_size parameter from attention and LSTMWithAttention interfaces Summary: Reshape based on tensor shapes in the graph rather than based on a passed-in batch_size parameter Reviewed By: urikz Differential Revision: D4734505 fbshipit-source-id: d9c23d85be84f61124106e752ef2b4f6945e2a07	2017-03-19 18:16:28 -07:00
Yury Zemlyanskiy	d1424c3265	Revert D4702086: Remove batch_size parameter from attention and LSTMWithAttention interfaces Summary: This reverts commit c4c1d8425cd36c1e86695918eaba2667c27e9601 Differential Revision: D4702086 fbshipit-source-id: 4620610b182bb84b9297b5de32782761ae89d20b	2017-03-17 17:36:47 -07:00
James Reed	10d95bd0f0	Remove batch_size parameter from attention and LSTMWithAttention interfaces Summary: Reshape based on tensor shapes in the graph rather than based on a passed-in batch_size parameter Reviewed By: urikz Differential Revision: D4702086 fbshipit-source-id: c4c1d8425cd36c1e86695918eaba2667c27e9601	2017-03-16 11:47:52 -07:00
James Reed	8de1db9eb6	Implement recurrent attention in C2 Summary: Super rough implementation of recurrent attention. Planning to factor out the common code between the two functions as well as train and eval. I want to get this out and get eyes on it sooner rather than later Differential Revision: D4647837 fbshipit-source-id: 54bc4e8ed0df6f04c86c425926decbe89f73b068	2017-03-08 11:21:28 -08:00
Yury Zemlyanskiy	4a53ab3cb6	LSTMWithAttention implementation in Caffe2 Summary: Implementation of LSTMWithAttention Still TBD: 1. There are problems with back propagation, because gradient is not implemented for ops with broadcasting 2. I need to make initial_recurrent_state to be of shape [dim] rather than [1, batch_size, dim], so one doesn't need to provide batch_size to LSTMWithAttention Differential Revision: D4298735 fbshipit-source-id: 8903fcff4d6a66647ee6d45a6ef28803fc3091e5	2017-02-23 04:08:34 -08:00

10 commits