Summary:
This diff adds eval nets to the layer model helper. It should be useful for
cases where train/eval nets need some extra input (usually some supervision)
for training/evaluation, e.g. various sampled layers.
Differential Revision: D4769453
fbshipit-source-id: 7a8ec7024051eab73b8869ec21e20b5f10fd9acb
Summary:
We should resize the workspace vector only when it needs to grow. Otherwise we end up destroying and recreating workspaces constantly when the sequence length varies.
Modified the lstm_benchmark test to randomize sequence length.
This provides a big perf improvement to the machine translation pipeline. Look at the recurrent network op runtimes below.
WITH:
I0328 12:17:54.073976 492094 prof_dag_net.cc:156] 136.271 ms/iter ( 120.987 ms/iter) RecurrentNetwork
I0328 12:17:54.073982 492094 prof_dag_net.cc:156] 190.074 ms/iter ( 156.828 ms/iter) RecurrentNetworkGradient
WITHOUT:
I0328 12:25:17.658206 518884 prof_dag_net.cc:156] 375.369 ms/iter ( 249.268 ms/iter) RecurrentNetwork
I0328 12:25:17.658211 518884 prof_dag_net.cc:156] 278.892 ms/iter ( 227.29 ms/iter) RecurrentNetworkGradient
With the LSTM benchmark, we get about a 2x speedup.
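A minimal sketch of the grow-only policy (illustrative Python only; the actual change is in the C++ RecurrentNetwork op, and these names are hypothetical):
```
def ensure_workspaces(workspaces, seq_len, create_workspace):
    """Grow the per-timestep workspace list only when seq_len exceeds the
    current capacity; shorter sequences reuse existing workspaces instead
    of destroying and recreating them every iteration."""
    while len(workspaces) < seq_len:
        workspaces.append(create_workspace())
    # Intentionally never shrink: extra workspaces are kept for later reuse.
    return workspaces[:seq_len]
```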
Reviewed By: jamesr66a
Differential Revision: D4789354
fbshipit-source-id: ad72f61974e35b0474abcacdc466ae9c6b4eb0ff
Summary: PadImage has no kernel parameters, which results in the pads_ parameters not being set (left at 0). I added a test case too.
Differential Revision: D4785230
fbshipit-source-id: fd475e7c41208e07fa7a363def9a45c6f82cddfe
Summary: This is useful for testing RNN cells.
Reviewed By: dzhulgakov
Differential Revision: D4720641
fbshipit-source-id: baa7df43357ed8af72ede64be3e0a642a40472df
Summary:
Instead of doing gemms in a for-loop (which is not parallelized), it is much better to do the batched matmuls using CUDA 8's new strided-batched version of gemm.
With the MT team's test, we get a 5-10% improvement in overall walltime, so this is a significant improvement:
----
Without batched gemm:
I0328 10:46:48.118605 58068 prof_dag_net.cc:136] 424.757 ms/iter ( 283.878 ms/iter) RecurrentNetwork
I0328 10:46:48.118609 58068 prof_dag_net.cc:136] 352.603 ms/iter ( 265.85 ms/iter) RecurrentNetworkGradient
With batched gemm:
I0328 10:53:48.169996 85617 prof_dag_net.cc:136] 407.438 ms/iter ( 269.564 ms/iter) RecurrentNetwork
I0328 10:53:48.169999 85617 prof_dag_net.cc:136] 322.393 ms/iter ( 287.625 ms/iter) RecurrentNetworkGradient
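Conceptually, the change replaces a loop of independent per-slice gemms with one batched call. A NumPy sketch of the equivalence (the real implementation uses cuBLAS's strided-batched gemm, not NumPy):
```
import numpy as np

batch, m, k, n = 8, 4, 5, 3
A = np.random.randn(batch, m, k).astype(np.float32)
B = np.random.randn(batch, k, n).astype(np.float32)

# For-loop of independent gemms (one call per slice, not parallelized across slices).
looped = np.stack([A[i] @ B[i] for i in range(batch)])

# Single batched matmul over the leading dimension (what strided-batched gemm provides).
batched = np.matmul(A, B)

assert np.allclose(looped, batched, atol=1e-5)
```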
Reviewed By: jamesr66a
Differential Revision: D4788272
fbshipit-source-id: 210e8b94c1e036b6ef0f039ce000d455258651f4
Summary:
This is pretty tricky to explain, but we can just use
backward_links. This way the whole cell would use a blob from the
states_grad tensor instead of having its own blob. This should also
save a bit of memory.
Differential Revision: D4770798
fbshipit-source-id: 673f85b2c2fdf42c47feeaa24d1e2bf086f012f9
Summary: Creates SparseMomentumSGDUpdate, a sparse version of MomentumSGDUpdate, to make that optimization method (via in-place updating operator) compatible with GradientSlices.
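A rough NumPy sketch of the intended semantics: the dense momentum update is applied only to the rows named by the gradient slice's indices (assuming the standard non-Nesterov form; the real op also handles the learning-rate blob and in-place outputs):
```
import numpy as np

def sparse_momentum_sgd_update(param, moment, grad_values, indices, lr, momentum):
    # Update only the sliced rows; all other rows of param/moment stay untouched.
    adjusted = lr * grad_values + momentum * moment[indices]
    moment[indices] = adjusted
    param[indices] -= adjusted
    return param, moment
```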
Differential Revision: D4784973
fbshipit-source-id: e6330f471a4d5f53589a6ac245e38f256ca7f354
Summary:
The `SamplingTrain` layer is a wrapper around another layer that subclasses `SamplingTrainableMixin`. When instantiated in the training context, `SamplingTrain` produces the sparse output of the wrapped layer; the output can be paired with `indices` to create a Map schema. When instantiated in the prediction context, the full output of the wrapped layer is produced.
This is like the SampledFC function in model helper, https://fburl.com/gi9g1awh, with the ability to be instantiated in both the training and prediction contexts.
I'd like to get consensus on whether we should introduce the `SamplingTrain` layer and the accompanying mixin. This can probably be accomplished in some other way, but I think this is not too bad.
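A schematic of the wrapper behaviour (hypothetical names; the real layer builds Caffe2 nets rather than calling the wrapped layer directly):
```
class SamplingTrain:
    """Wraps a layer that supports producing a sampled subset of its output."""

    def __init__(self, wrapped_layer, context):
        self.wrapped = wrapped_layer
        self.context = context

    def forward(self, inputs, indices=None):
        if self.context == "training":
            # Sparse output: only the sampled rows, paired with their indices
            # so the result can be viewed as a Map-like (indices -> values) record.
            values = self.wrapped.sampled_output(inputs, indices)
            return indices, values
        # Prediction context: produce the full output of the wrapped layer.
        return self.wrapped.full_output(inputs)
```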
Reviewed By: xianjiec
Differential Revision: D4689887
fbshipit-source-id: 7be8a52d82f3a09a053378146262df1047ab26a8
Summary:
Use data_parallel_model for seq2seq multi-GPU training. The main reason for the complexity here is that GatherOp hasn't yet been implemented on GPU.
This diff also adds a better clipping procedure: clip by global norm rather than by absolute value.
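Global-norm clipping rescales all gradients by a single factor when their combined L2 norm exceeds the threshold, instead of clamping each element independently. A NumPy sketch of the idea:
```
import numpy as np

def clip_by_global_norm(grads, clip_norm):
    # Combined L2 norm across all gradient tensors.
    global_norm = np.sqrt(sum(float(np.sum(g * g)) for g in grads))
    if global_norm > clip_norm:
        scale = clip_norm / global_norm
        grads = [g * scale for g in grads]
    return grads
```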
Differential Revision: D4778691
fbshipit-source-id: bff184dae02ecc227413fef51f48a4726e5d3825
Summary:
To evaluate from checkpoints, we need to load a model from the checkpoints.
However, the checkpoints store many more blobs than the model needs. This
function enables the model builder to load only the blobs associated with the
model into the workspace. After that, the model builder can evaluate the model
from the populated workspace.
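The idea, in a hedged Python sketch (the helper name and checkpoint representation are illustrative, not the actual builder API):
```
def load_model_blobs_from_checkpoint(model_blob_names, checkpoint_blobs, workspace):
    """Feed only the checkpoint blobs that the model actually references,
    ignoring the many extra blobs (optimizer state, counters, ...) the
    checkpoint also stores."""
    wanted = set(model_blob_names)
    for name, value in checkpoint_blobs.items():
        if name in wanted:
            workspace.FeedBlob(name, value)
```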
Reviewed By: azzolini
Differential Revision: D4751414
fbshipit-source-id: a7a420228d681fc2dcfd8573cf69a97b1abc2ef3
Summary: Currently, we cannot have layer constants because layer params are required to have a gradient and an optimizer. Global constants don't cut it here because each one can only be added once; therefore, a layer that adds any global constant can only be used once.
Differential Revision: D4773212
fbshipit-source-id: 5b60d31f3c1602afb04b61f6d30b8e3e06ed2de3
Summary:
D4690225 added support for nested field name lookup in nested
`schema.Struct`s. It would throw a KeyError when trying to access a nested
`List`'s field. Writing the lookup recursively avoids the need to enumerate
all complex field types in the lookup.
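The recursive idea, sketched on plain nested containers (the real code walks schema.Field objects, not dicts):
```
def lookup(field, dotted_name):
    """Resolve 'a.b.c' by recursing one component at a time, so any field
    type that knows its own children works without being special-cased."""
    head, _, rest = dotted_name.partition('.')
    child = field[head]
    return lookup(child, rest) if rest else child

nested = {'ids': {'values': [1, 2, 3], 'lengths': [3]}}
assert lookup(nested, 'ids.lengths') == [3]
```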
Differential Revision: D4719755
fbshipit-source-id: 37c87a32d730f0f45f72fb20894da3e32f820999
Summary: Creating PackSegments and UnpackSegments GPU operators using GPUFallbackOp for now. The op mainly does copying of blobs, so this is a reasonable solution until we have a native CUDA op.
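What PackSegments computes, as a NumPy reference (the GPU version simply falls back to the CPU logic via GPUFallbackOp; zero padding is assumed here):
```
import numpy as np

def pack_segments(lengths, data):
    """Pack a flat [sum(lengths), ...] tensor into [num_segments, max_len, ...],
    zero-padding segments shorter than the longest one."""
    max_len = int(max(lengths))
    out = np.zeros((len(lengths), max_len) + data.shape[1:], dtype=data.dtype)
    offset = 0
    for i, l in enumerate(lengths):
        out[i, :l] = data[offset:offset + l]
        offset += l
    return out

packed = pack_segments([2, 1], np.arange(6, dtype=np.float32).reshape(3, 2))
# packed.shape == (2, 2, 2); the second segment's second row is zero padding.
```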
Reviewed By: pietern
Differential Revision: D4761589
fbshipit-source-id: dd483b9e34ecb6b53925405e5b4c24859c549606
Summary: Allow drilling down on data throughput overall and per field.
Reviewed By: dzhulgakov
Differential Revision: D4622168
fbshipit-source-id: 1462bb2fac05824fda0c02f4f5f0b8713893e650
Summary:
Use AddNet and AddBlobs to add net and blobs to meta_net_def.
This is a codemod and does not change the functionality.
It is for preparation of the protobuf change.
Depends on: D4770648
Reviewed By: salexspb
Differential Revision: D4771110
fbshipit-source-id: 00cecb2105f2c332bd50c3c51b9a10e1004fa90f
Summary:
This was a nasty one to track down. This was the error message:
```
E0323 14:47:46.138900 2870 context_gpu.h:126] Encountered CUDA error: an illegal memory access was encountered
F0323 14:47:46.139143 2870 operator.h:176] Computation on device returned error in operator
input: "x_gpu_2" output: "loss" name: "" type: "AveragedLoss" device_option { device_type: 1 cuda_gpu_id: 1 }
```
Closes https://github.com/caffe2/caffe2/pull/220
Differential Revision: D4771086
Pulled By: Yangqing
fbshipit-source-id: f2d0f39f1647c84d97d9745f8a0305a389bfbc41
Summary:
Codemod to use a separate function, in preparation for a protobuf change later on.
It does not change the functionality.
Reviewed By: salexspb
Differential Revision: D4770648
fbshipit-source-id: d8090f45d31ffa5ca1dca47297fb7c196f34d8a6
Summary: We accumulate the values of this blob (param_grad) in another special internal blob anyway.
Differential Revision: D4768643
fbshipit-source-id: a9d08b7eafd25f278a8db722f9cdb1d0064b852a
Summary: Apart from copying gradient blobs for inputs with initial_cell_input, we needed to perform a similar operation for external parameters used by the step net
Reviewed By: salexspb
Differential Revision: D4752259
fbshipit-source-id: 13ee48cf583ed86221a4cc1cc9f57f5c3a7d2450
Summary:
currently the output schema and blobs are named "field_i", which is
bad for debugging. This diff allows us to specify output names.
Reviewed By: kennyhorror
Differential Revision: D4744949
fbshipit-source-id: 8ac4d3c75cacbb4c9b5f55793ac969fe1cf20467
Summary:
Add a ConvNd interface for Nd convolution and keep Conv for 2d convolution.
I added _BaseConv to share code between ConvNd and Conv.
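Roughly, the refactor looks like this (purely illustrative class sketch, not the actual helper code):
```
class _BaseConv:
    """Shared argument handling for the convolution helpers."""
    def _common_args(self, kernels, strides, pads):
        return dict(kernels=kernels, strides=strides, pads=pads)

class Conv(_BaseConv):
    def build(self, kernel, stride=1, pad=0):
        # 2d convolution keeps the familiar scalar kernel/stride/pad interface.
        return self._common_args([kernel] * 2, [stride] * 2, [pad] * 4)

class ConvNd(_BaseConv):
    def build(self, kernels, strides, pads):
        # Nd convolution takes explicit per-dimension lists.
        return self._common_args(kernels, strides, pads)
```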
Reviewed By: Yangqing
Differential Revision: D4660822
fbshipit-source-id: 8339421351ce9a36ce5a165f7fa455cfcc61733d
Summary:
This completes the fix that viswanathgs started in an earlier diff but did not
cover the full Caffe convention. It should have proper guards for all the stuff
that Caffe implies, either supporting it or throwing an explicit exception.
Reviewed By: viswanathgs
Differential Revision: D4751751
fbshipit-source-id: 474e921c33840cff333a631b7b19f881b39ebccd
Summary: This didn't work for the reason specified in the comments. Also some cleanup in the unit tests; inference now uses a custom workspace to run the cell net on.
Reviewed By: urikz
Differential Revision: D4742670
fbshipit-source-id: 04165c029fddec5ae31b20b207faf06d2fa20816
Summary:
aaronmarkham this solves your Windows build issue. Basically:
(1) VS 2017 does not have CUDA support yet, and we will be waiting on NVidia to do so.
(2) VS 2015 and 2017 need different cmake generator strings.
This PR shows how to determine those and also updates appveyor to do contbuild guard for the following 3 settings:
- VS2015 without cuda
- VS2017 without cuda
- VS2015 with cuda
Closes https://github.com/caffe2/caffe2/pull/210
Differential Revision: D4745007
Pulled By: Yangqing
fbshipit-source-id: 50952552843abd0eb6f4145d9f132daeee3a6794
Summary: Created `BatchDistillLRLoss` layer and added support for it in DPer2.
Differential Revision: D4718333
fbshipit-source-id: b873954ea704daafed94ac65fef47a20d56858e2
Summary: D4734505 part 2. Remove more instances of the batch_size parameter
Reviewed By: urikz
Differential Revision: D4736906
fbshipit-source-id: fc9d374e9308017d61c427890364c5ab9cec2edf
Summary: Reshape based on tensor shapes in the graph rather than based on a passed-in batch_size parameter
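The pattern, in a hedged Caffe2 Python sketch: derive the target shape from a tensor already in the graph (via the Shape op) and feed it to Reshape as its second input, instead of baking a batch_size constant into the net. Blob names here are illustrative:
```
from caffe2.python import core

net = core.Net("reshape_from_graph")
# Shape of a reference blob, computed at run time rather than passed in.
ref_shape = net.Shape(["encoder_outputs"], "ref_shape")
# Reshape accepts the new shape as a second input; no batch_size argument needed.
reshaped, old_shape = net.Reshape(
    ["attention_weights", ref_shape], ["attention_reshaped", "old_shape"]
)
```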
Reviewed By: urikz
Differential Revision: D4734505
fbshipit-source-id: d9c23d85be84f61124106e752ef2b4f6945e2a07
Summary: we don't use this one any more except in a few tests
Reviewed By: urikz
Differential Revision: D4731401
fbshipit-source-id: c5c28b7594e3251f501fc28455dfc9bd2093a836
Summary: Adding synchronous optimization on GPUs to the translation training pipeline, via data_parallel_model.Parallelize_GPU, which needs to be updated so there is some way of performing sparse parameter updates (e.g., on embedding tables), whether on GPU or CPU.
Reviewed By: urikz
Differential Revision: D4631914
fbshipit-source-id: 9cdd655f7dbda3f9b2733d459228b3e097892441
Summary: This adds a nearest neighbor interpolation resizing operator to caffe2. CPU only, NCHW only, no gradients. Also adds torch2caffe support. This is probably not optimal in terms of performance, but it works.
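A NumPy reference of nearest-neighbor resizing in NCHW layout (reference only; the operator itself is the C++ CPU implementation):
```
import numpy as np

def resize_nearest_nchw(x, height_scale, width_scale):
    n, c, h, w = x.shape
    out_h, out_w = int(h * height_scale), int(w * width_scale)
    # Map each output row/column back to its nearest source index.
    rows = np.minimum((np.arange(out_h) / height_scale).astype(int), h - 1)
    cols = np.minimum((np.arange(out_w) / width_scale).astype(int), w - 1)
    return x[:, :, rows[:, None], cols[None, :]]

x = np.random.randn(1, 3, 4, 4).astype(np.float32)
y = resize_nearest_nchw(x, 2.0, 2.0)  # -> shape (1, 3, 8, 8)
```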
Reviewed By: ajtulloch
Differential Revision: D4724244
fbshipit-source-id: b8295061141fb513da84acf91fdfd67264119059
Summary:
1. migrate the basic mtml model to dper 2
2. test dper 2 mtml model
3. test all optimizers
Reviewed By: kittipatv
Differential Revision: D4680215
fbshipit-source-id: 7aac5c59bdac22fcad8ed869b98e9e62dca1d337
Summary: A layer that takes a (label, prediction) pair and outputs the L2 loss.
Reviewed By: kittipatv
Differential Revision: D4702111
fbshipit-source-id: 09f2ede44d1b548e61096de741f1b2aa0b66bbcb