Commit graph

679 commits

Author SHA1 Message Date
Yiming Wu
0aeffa985e make sure mutex is on CPU too
Summary: Mutex is only supported on CPU, so we need to make sure Mutex and the following AtomicIter are both on CPU. This is critical for GPU SparseNN training

Differential Revision: D5093184

fbshipit-source-id: 021e6ba699a3208449fa4761cad6b0ec4544957e
2017-05-19 12:17:17 -07:00
Yiming Wu
65750349ba deprecate CNNModelHelper in python/operator_test dir
Summary:
deprecate CNNModelHelper in python/operator_test dir

BTW I found that there are two mkl_speed_test files. I am confused...

Reviewed By: salexspb

Differential Revision: D5094122

fbshipit-source-id: f6526f4de334f2245eb4c1f204a8ec9f23750d78
2017-05-19 12:17:17 -07:00
Ahmed Taei
32bf7a2c2b Generalize PoolingOp(cuDNN) to compute 2D and 3D pooling.
Reviewed By: akyrola

Differential Revision: D5090689

fbshipit-source-id: f9f11e12adc0ee8db088f3397a8c33aa31eb5deb
2017-05-19 10:19:00 -07:00
Yiming Wu
1b7497807f cnnmodelhelper deprecate warning
Summary: We will start our API migration process. Before that, I want to make sure people don't add new CNNModelHelper instances to our open-source code, so I put a deprecation warning here in advance

Reviewed By: salexspb

Differential Revision: D5093556

fbshipit-source-id: 74bf4a7782c2d882f72f202d48c72255d152b68a
2017-05-18 23:35:26 -07:00
Pooya Davoodi
307459eb62 Fix conv_test for CUDNN dilated convolution in NHWC
Summary:
CUDNN dilated convolution was added to V6. This version of CUDNN does not support NHWC for dilated convolution.

Fix conv_test.py so that it does not test CUDNN for dilated convolution in NHWC format.
Closes https://github.com/caffe2/caffe2/pull/598

Reviewed By: akyrola

Differential Revision: D5084835

Pulled By: asaadaldien

fbshipit-source-id: 3c0c5ed02c5d9232fca567e387ab6260d71e5aaf
2017-05-18 10:07:28 -07:00
James Reed
85f1d947dd Vectorize SigmoidOp on CPU
Summary: I noticed that Sigmoid was taking an inordinate amount of time in our NMT benchmark, so I looked at the implementation and it didn't seem optimal. I replaced the implementation with an Eigen version so that when the Eigen update goes through, we will get proper AVX(2) vectorization.
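The win described above comes from replacing a per-element scalar loop with a vectorized expression. A minimal sketch in numpy (standing in for the Eigen rewrite; this is not the actual Caffe2 kernel):

```python
import numpy as np

# Illustrative sketch: one vectorized expression over the whole array instead
# of an elementwise scalar loop, which is what lets Eigen emit AVX(2) code.
def sigmoid(x):
    x = np.asarray(x, dtype=np.float64)
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(np.array([0.0, 2.0])))
```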

Differential Revision: D5082464

fbshipit-source-id: aa951f7d730fc05198f7dd04076ec58d471b74c8
2017-05-17 20:33:36 -07:00
Ben Zhang
12edbcb154 Implemented L1Distance Operator for CUDA
Summary: Added L1Distance Operator for CUDA, as well as tests.

Reviewed By: bwasti

Differential Revision: D5071966

fbshipit-source-id: 4c3d862605e9123d955bf091efa67d0731bd816a
2017-05-17 17:32:53 -07:00
Pieter Noordhuis
bbd7aee9ab Revert D4952993: [Caffe2] fix mkl_sparse and migrate sparsity experiments
Summary: This reverts commit 86c03676ab4e47f04d2d0dd438a4a1c849bbbff0

Differential Revision: D4952993

fbshipit-source-id: 5c213c48ac44ce6aefccacc6d80534648d3c516a
2017-05-17 14:46:56 -07:00
James Cross
f27c9eea20 dropout for C2 multilayer
Summary:
Incorporate arbitrary dropout for encoder and decoder layers for Caffe2 NMT models using current configuration. This involves separate output processing (_prepare_output() and _prepare_output_sequence()) for the final layer in a MultiRNNCell.

Switching to using the newly introduced forward_only switch for RNN cells revealed an unrelated bug in our NetGradientChecker test, which urikz is investigating.

Reviewed By: salexspb

Differential Revision: D5031964

fbshipit-source-id: 19b49607d551aa3e2140041ef4e585f128c8f178
2017-05-17 11:32:47 -07:00
Aapo Kyrola
658c337f41 Error status for Gloo ops, and handling in elastic dpm
Summary: Add a RandomFailureOp and status-code handling to the elastic data parallel model

Reviewed By: andrewwdye

Differential Revision: D5065936

fbshipit-source-id: 24224f9ea414ee535c9e90cc28add5189354b0ef
2017-05-17 00:16:52 -07:00
Szymon Piechowicz
5ced84856a Caffe2: SparseToDenseMask: return key presence
Summary: Caffe2: SparseToDenseMask: return key presence

Reviewed By: matbd

Differential Revision: D5066863

fbshipit-source-id: 4f4dd141f6661829535cb77ff47cc0c230dce5d6
2017-05-16 20:22:03 -07:00
Yiming Wu
f359d70ae7 fix mkl_sparse and migrate sparsity experiments
Summary:
Migrate the experiments folder to the fb/sparse folder. Keep FunHashOp and SparseFunHashOp because they are now assumed to be default Ops in dper. What I did:

  # Migrated FunHashOp and SparseFunHashOp and their unit tests to core Caffe2; made sure tests pass.
  # Migrated other Ops in the experiments folder to the fb/sparse folder and wrote new TARGETS files for them; made sure tests pass.
  # Made sure all related tests pass.
  # Fixed the MKL definition along the way; made sure FC_Sparse is not compiled when there is no MKL support.

Reviewed By: salexspb

Differential Revision: D4952993

fbshipit-source-id: 86c03676ab4e47f04d2d0dd438a4a1c849bbbff0
2017-05-16 18:33:51 -07:00
James Cross
37c06a3ba8 residual connections in multilayer C2 ('add' only)
Summary:
Residual connections for multilayer RNN encoder/decoder for Caffe2 NMT model. Only supporting 'add' connections (the standard approach, which ves's TF experiments concluded was at least as good as other approaches), and also only implementing for residual_level >= 1 (which also fits our use case).

It is the responsibility of the config to ensure dimension compatibility: each level at and beyond residual_level (in both the encoder and decoder) should have the same number of units, with the exception that a bidirectional initial encoder layer should have half the number of units of the succeeding layer if that next layer is a residual layer.
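The dimension rule above can be expressed as a small check. This is a hypothetical helper, not part of the diff; it folds the bidirectional first layer's doubled output into the comparison:

```python
# Hypothetical config check: every layer at and beyond residual_level must
# share a unit count, except a bidirectional first encoder layer, whose
# output (2 * units) is what must match the succeeding residual layer.
def residual_dims_ok(layer_units, residual_level, first_is_bidirectional=False):
    if first_is_bidirectional:
        layer_units = [2 * layer_units[0]] + list(layer_units[1:])
    base = layer_units[residual_level]
    return all(u == base for u in layer_units[residual_level:])

# A 256-unit bidirectional first layer feeding 512-unit residual layers is OK.
assert residual_dims_ok([256, 512, 512], residual_level=1,
                        first_is_bidirectional=True)
```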

Differential Revision: D5023160

fbshipit-source-id: f38c1b140638fee78cf3ef7d6b4602dd462484ee
2017-05-16 17:04:58 -07:00
Yiming Wu
a28b01c155 rnn with brew
Summary:
Update rnn_cell.py and char_rnn.py example with new `brew` model.

- Deprecated CNNModelHelper
- replace all helper functions with brew helper functions
- Use `model.net.<SingleOp>` format to create bare bone Operator for better clarity.

Reviewed By: salexspb

Differential Revision: D5062963

fbshipit-source-id: 254f7b9059a29621027d2b09e932f3f81db2e0ce
2017-05-16 13:33:44 -07:00
Alisson Gusatti Azzolini
310f505da7 Remove application-specific comment.
Summary: This comment is not relevant for open-source.

Differential Revision: D5070835

fbshipit-source-id: 8e2dadae85566e7f6684d42f921daf7d345dc065
2017-05-16 12:17:03 -07:00
Yang Yang
769e668faf ttsn model fails to set optimizer for FC layer
Summary:
The FC ModelLayer needs an optimizer; it also seems the catch-all
that sets a default for missing optimizers had a bug

Reviewed By: xianjiec

Differential Revision: D5048302

fbshipit-source-id: cbbf641fb9ee4f4f89c5dbb132f7837ecdbe37a5
2017-05-16 11:26:02 -07:00
Yiming Wu
64d43dbb6e new resnet building with brew
Summary: new resnet building with brew

Reviewed By: akyrola

Differential Revision: D4945418

fbshipit-source-id: d90463834cbba2c35d625053ba8812e192df0adf
2017-05-15 22:47:24 -07:00
Ahmed Taei
25fd005dd9 Initial implementation of Blockwise Model Update Filtering (BMUF)
Summary:
A Single machine multi-GPU version of BMUF algorithm. BMUF is a modification to
model averaging where updates to global model is implemented as a filter:
param_t = param_(t-1) + delta_t
delta_t = \beta * delta_(t-1) + \alpha * (average(param_t) - param_(t-1))
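The filtered update above can be sketched in a few lines of numpy. Names here are illustrative, not the Caffe2 API; beta is the block momentum and alpha the block learning rate:

```python
import numpy as np

# Sketch of one BMUF step: average the per-GPU copies, then fold the
# block-level update into the global parameters through a filtered delta.
def bmuf_step(global_param, per_gpu_params, delta, alpha=1.0, beta=0.9):
    avg = np.mean(per_gpu_params, axis=0)             # average(param_t) over GPUs
    delta = beta * delta + alpha * (avg - global_param)
    return global_param + delta, delta

param = np.zeros(2)
delta = np.zeros(2)
param, delta = bmuf_step(param, [np.array([1.0, 3.0]), np.array([3.0, 1.0])], delta)
```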

Reviewed By: akyrola

Differential Revision: D4995057

fbshipit-source-id: 48176ba66d67eaf3fa4dee16d50d9589825ddba4
2017-05-15 18:18:15 -07:00
Huazhong Ning
e394b60a9c Support un-equal weight training for mtml models
Reviewed By: queqichao

Differential Revision: D5047939

fbshipit-source-id: 857d0d77e0413939e5774fa37d21b92a00d34bf0
2017-05-15 12:56:11 -07:00
Aaron Markham
ad37840329 fixed document generator for github
Summary: Fixed generator. Tweaked the output to fit github markdown template.

Reviewed By: bwasti

Differential Revision: D4569692

fbshipit-source-id: 87f497319cc8b258c6c75dc0837d728c5eda5636
2017-05-15 11:40:46 -07:00
Yiming Wu
3eeca5b5e0 arg scope in ModelHelper
Summary: Based on our discussion, we want an arg_map in ModelHelper and to create an arg_scope for that model within brew. This is now implemented

Reviewed By: salexspb

Differential Revision: D5042983

fbshipit-source-id: ddd2c7e9bca1be2f08a32f7252b44d3b60a57996
2017-05-12 11:18:59 -07:00
Du Tran
5989deb707 adding 3d operator translators
Summary: Adding caffe-to-caffe2 translators for Conv3D, Pooling3D, BatchNorm

Differential Revision: D4945495

fbshipit-source-id: fe3c97547507924a1409b977307b928ce78445f3
2017-05-11 23:01:44 -07:00
Yiming Wu
b070197e8a cuda unique op
Summary:
CUDA Unique op; unit test provided, benchmark against CPU below

Speedup results for synthetic real data. Input of size 20k, range [1, 10 million]: **~5x** speedup

  CPU 9.05795(ms) Unique
  GPU 1.79434(ms) Unique

Speedup results for 5x synthetic data. Input of size 1 million, range [1, 10 million]: **~13.7x** speedup

  CPU 54.7539(ms) Unique
  GPU 3.99473(ms) Unique
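The op being benchmarked deduplicates an index tensor. A sketch of the semantics with numpy (illustrative; the CUDA kernel and its output ordering may differ):

```python
import numpy as np

# Unique semantics: deduplicated values plus a remapping from each input
# element to its index in the unique output (useful for sparse gradients).
def unique_with_remapping(indices):
    uniq, remap = np.unique(indices, return_inverse=True)
    return uniq, remap

uniq, remap = unique_with_remapping(np.array([5, 3, 5, 7, 3]))
```

Note that `np.unique` returns the values sorted; the op's own ordering is an implementation detail.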

Reviewed By: akyrola

Differential Revision: D5007726

fbshipit-source-id: 0a00c518fd1809d0ae8c6cfcba09b0bd982ffaff
2017-05-11 21:08:10 -07:00
Huazhong Ning
942f53b5a6 gradient impact of task layers on shared is configurable
Reviewed By: chocjy

Differential Revision: D4943948

fbshipit-source-id: 2e26dfb30c6893b60985f693a823646ed3d3e0e3
2017-05-11 20:34:04 -07:00
Ben Zhang
93f1d0ca7c L1 Operator
Summary: Adds the L1 Distance operator to distance_op.
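What the operator computes can be sketched in numpy (illustrative, not the Caffe2 kernel): the per-row sum of absolute differences between two equal-shape inputs.

```python
import numpy as np

# L1 (Manhattan) distance per row between two equal-shape batches.
def l1_distance(x, y):
    return np.abs(np.asarray(x) - np.asarray(y)).sum(axis=-1)

d = l1_distance([[1.0, 2.0], [0.0, 0.0]], [[3.0, 0.0], [1.0, 1.0]])
print(d)  # row 0: |1-3| + |2-0| = 4; row 1: 1 + 1 = 2
```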

Reviewed By: bwasti

Differential Revision: D5007719

fbshipit-source-id: fd547c6645cf5f87305e9ebfd95ed918779c1d2a
2017-05-11 18:03:10 -07:00
Ahmed Taei
8df51a84ac Support 3D&1D SpatialBatchNorm[CPU]
Summary:
Generalize SpatialBatchNorm CPU Op to compute Spatial batch normalization for
1D, 2D & 3D input tensors.

Reviewed By: dutran

Differential Revision: D5043563

fbshipit-source-id: 7fcb933a628dd47f13aa622f63601a87382f09cd
2017-05-11 09:32:54 -07:00
Romain Cledat
e16ea46013 Extended ImageInputOp
Summary:
Added several features to the ImageInputOp:
  - bounding box (per image as well as default for the operator). For per-image, it
    only works in Caffe2 format and is passed as the third tensor in the form
    (ymin, xmin, height, width). For the operator, pass bounding_xmin, bounding_ymin,
    bounding_width and bounding_height as parameters.
  - per-channel mean/std. You can use the usual mean/std to pass a single
    value to be used for all channels or also pass mean_per_channel and std_per_channel
    to specify different values per channel. Order of channels is BGR.
  - A minimum size parameter that can be specified instead of the scale parameter.
    The minsize parameter will only scale the image if it is smaller than required.
    This differs from scale which will scale up as well as down. You can only specify
    one of scale or minsize.

Added a test case to test some of the features
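The scale-vs-minsize distinction above is the subtle part. A sketch of that behavior in plain Python (hypothetical helper, not the C++ op):

```python
# minsize only scales an image up when its shorter side is too small, while
# scale always resizes the shorter side to the target; exactly one may be set.
def rescaled_shorter_side(shorter, scale=None, minsize=None):
    assert (scale is None) != (minsize is None), "specify exactly one"
    if scale is not None:
        return scale                  # always resize, up or down
    return max(shorter, minsize)      # only scale up when smaller than required

assert rescaled_shorter_side(300, minsize=256) == 300  # already big enough
assert rescaled_shorter_side(200, minsize=256) == 256  # scaled up
assert rescaled_shorter_side(300, scale=256) == 256    # scaled down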

Differential Revision: D4874988

fbshipit-source-id: 437191052a46e9916defe8b100d7cc7864373f61
2017-05-10 17:52:01 -07:00
Yury Zemlyanskiy
e8c274cf16 Optimize memory usage for MI-LSTM
Summary:
Use ElementwiseLinearOps instead of manual Mul + Sum. That saves intermediate blobs.

For NMT use case

Before: https://our.intern.facebook.com/intern/fblearner/details/18060753
Time per step: 0.072
memory usage (per each of 2 GPUs): 9041MiB

After:https://our.intern.facebook.com/intern/fblearner/details/18107583
Time per step: 0.0715
Memory (per each GPU): 8560MiB
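The saving comes from fusing the per-feature multiply and add into one op. An illustrative numpy sketch of what ElementwiseLinear computes (not the Caffe2 kernel):

```python
import numpy as np

# Y[i, j] = X[i, j] * w[j] + b[j] in a single op, avoiding the intermediate
# blob that a separate Mul followed by an add would materialize.
def elementwise_linear(X, w, b):
    return X * w + b  # w and b broadcast over the batch dimension

Y = elementwise_linear(np.ones((2, 3)), np.array([1.0, 2.0, 3.0]), np.zeros(3))
```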

Reviewed By: akyrola

Differential Revision: D5038785

fbshipit-source-id: 4bc8155dbd0c87729e17236d68d62ca530aadb53
2017-05-10 16:53:43 -07:00
Xiaolong Wang
11bcdbc3f0 Load Parameters from Model
Summary:
In the Dper utility, add a function `load_parameters_from_model_init_options` to
allow initializing parameters from pretrained models

Reviewed By: xianjiec

Differential Revision: D4926075

fbshipit-source-id: 5ab563140b5b072c9ed076bbba1aca43e71c6ac5
2017-05-10 10:33:04 -07:00
Yury Zemlyanskiy
3abd0cb623 Add axis argument to SoftmaxWithLoss
Summary: ##axis## argument for SoftmaxWithLoss (it doesn't yet work for spatial case).

Reviewed By: akyrola

Differential Revision: D5025797

fbshipit-source-id: 9e3cf39223af3f2c8bb357f8d9fe952b7349f913
2017-05-09 19:36:00 -07:00
Alisson Gusatti Azzolini
75bc9f5e77 Relax requirement on token uniqueness
Summary: Relax requirement on token uniqueness since a few use cases broke after the uniqueness requirement was added in a previous diff.

Reviewed By: kittipatv

Differential Revision: D5034132

fbshipit-source-id: 327eb065923e6ea152a360324316f81b7fb9564b
2017-05-09 19:36:00 -07:00
Yury Zemlyanskiy
48de1ea165 Drop extra Reshape in attention calculation
Summary: We can avoid this extra Reshape.

Reviewed By: jamesr66a

Differential Revision: D5032874

fbshipit-source-id: 92bd568bc6bec53d7f81a64cfa96d2c610823f8c
2017-05-09 17:16:36 -07:00
Yury Zemlyanskiy
ae924be3ac Removing extra Reshapes in MILSTM with new broadcasted ops
Summary: D4873222 introduced SumReduceLike and removed the use_grad_hack ... hack. Remove unnecessary reshapes and kill use_grad_hack parameters.

Reviewed By: jamesr66a

Differential Revision: D4894243

fbshipit-source-id: c4f3f84abf95572d436b58bbdc2b18b21583c2f1
2017-05-09 14:11:04 -07:00
Xiaolong Wang
add840510f Refactor Optimizer to Allow scale_learning_rate
Summary:
In transfer learning, parameters initialized from a pretrained model might require
a different learning rate than those initialized otherwise. To this end, we
implement a Python solution where `base_learning_rate` is scaled by `scale`,
which is in turn set by `scale_learning_rate`. Alternatively, we could achieve
the same effect by rewriting the LearningRate operator in C++
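The effect of the scaling is simple to state in code. A hypothetical sketch (names are illustrative, not the actual Dper API):

```python
# Each parameter's effective learning rate is base_learning_rate multiplied
# by a per-parameter scale, so pretrained parameters can move more slowly.
def effective_lrs(base_learning_rate, param_scales):
    return {name: base_learning_rate * s for name, s in param_scales.items()}

lrs = effective_lrs(0.1, {"fc_pretrained/w": 0.01, "fc_new/w": 1.0})
```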

Reviewed By: kennyhorror

Differential Revision: D4992827

fbshipit-source-id: 8d7e87a61c95b3eb8ef733ec436f4060e865c0ac
2017-05-09 13:16:21 -07:00
Alisson Gusatti Azzolini
20d8de8d51 Parameter cost estimation job
Summary:
Adds a parameter cost estimation step before the actual training starts. The costs are later used in order to better shard the parameters across instances of the parameter server.

Things I needed to modify:
- A few changes to make ModelLayerHelper picklable
- Add support for stopping a distributed job after a number of stats reporting steps.
- Refactored run_dist_job to support collocating the reader with the trainer even when PS are present.
- Option to disable dense updates (when num_dense_servers=0).

Currently there's a huge overhead posed by having to launch a child workflow. I'll try to address that in a subsequent diff.

This is WIP because the other workflows need to be migrated as well.

I can break this down into smaller diffs if reviewers would prefer it.

Reviewed By: kennyhorror

Differential Revision: D4974752

fbshipit-source-id: 04c336acb2945f8f11324a221ffc6967818c0672
2017-05-09 13:02:24 -07:00
Alisson Gusatti Azzolini
bd8ed6641c Stabilize PythonOp token name
Summary: For distributed jobs, we were relying on the order the PythonOps were registered, which was very fragile.

Reviewed By: dzhulgakov

Differential Revision: D5016847

fbshipit-source-id: f5601467c5b0569d5e8a0efdd76abad0d703c5f5
2017-05-09 11:19:44 -07:00
Simon Layton
1d0ba2cfbd New cudnn ops
Summary:
cuDNN versions of dropout and LRN (for native fp16 support), port of Caffe's max pooling algo that uses an explicit mask to store locations (also supports fp16 storage)
Closes https://github.com/caffe2/caffe2/pull/396

Reviewed By: akyrola

Differential Revision: D4990880

Pulled By: asaadaldien

fbshipit-source-id: a716acffb656843e9b31e3e6808bd2d8aa959d03
2017-05-08 16:33:21 -07:00
Yury Zemlyanskiy
11052d03aa RNNCell API change: returns states and outputs
Summary:
Incorporating the definition of a cell's output and illustrating its usage by adding dropout to all types of cells.

I think we should try to get rid of aliases in RecurrentNetwork, so the output of applied_over_sequence is also always (state_1_all, state_2_all, ...). This way we can merge get_output_from_single_step, get_output_from_sequence and get_outputs_with_grads into a single method.

Let me know what you think!

Reviewed By: jhcross

Differential Revision: D4992913

fbshipit-source-id: 737939be336ad145f84e8733cd255d4f7188ef70
2017-05-08 15:19:48 -07:00
Yury Zemlyanskiy
b6a8dd1438 don't recompute small blob in attention
Summary: decoder_hidden_encoder_outputs_sum_tmp is tiny after D5010109, no need to recompute it.

Reviewed By: akyrola

Differential Revision: D5014335

fbshipit-source-id: cc9e8f91372889d10bd99c79366018cb3943a435
2017-05-08 13:06:06 -07:00
Kevin Matzen
0cb7774445 softplus op
Summary: Added softplus function, f(x) = ln(exp(x) + 1)
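The naive form of f(x) = ln(exp(x) + 1) overflows for large x. A numerically stable sketch (illustrative, not the Caffe2 kernel) uses the identity softplus(x) = max(x, 0) + log1p(exp(-|x|)):

```python
import numpy as np

# Stable softplus: for large positive x the exp() in the naive form would
# overflow; this form keeps the exponent argument non-positive.
def softplus(x):
    x = np.asarray(x, dtype=np.float64)
    return np.maximum(x, 0.0) + np.log1p(np.exp(-np.abs(x)))

print(softplus(np.array([-1.0, 0.0, 1000.0])))
```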

Reviewed By: akyrola

Differential Revision: D5011057

fbshipit-source-id: 5fddb1568fee625f81ea3a86a85d0f400c3ee278
2017-05-08 10:40:25 -07:00
Xianjie Chen
8a7f00d61b fix mean pooling
Summary:
Segment-based Ops require increasing segment ids without gaps. Lengths-based Ops do not
have this requirement.

Other pooling methods, e.g. LogExpMean, do not have Lengths-based Ops available yet.
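The difference between the two families can be sketched quickly: instead of segment ids that must be increasing and gap-free, a lengths vector partitions the rows. An illustrative lengths-based mean pooling (not the Caffe2 kernel):

```python
import numpy as np

# Mean-pool consecutive row groups whose sizes are given by `lengths`.
def lengths_mean(values, lengths):
    out, offset = [], 0
    for n in lengths:
        out.append(values[offset:offset + n].mean(axis=0))
        offset += n
    return np.stack(out)

pooled = lengths_mean(np.array([[1.0], [3.0], [5.0]]), [2, 1])
```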

Differential Revision: D5019165

fbshipit-source-id: ab01a220e10d4ed9fa2162939579d346607f905e
2017-05-08 01:09:07 -07:00
Jon Morton
ac1c63dda8 Add specialized ResizeNearest implementation for scale=2
Summary:
Specialized implementation of ResizeNearest for width_scale=2 and height_scale=2. This implementation doesn't use divides or calls to std::min, and is unrolled 2x over the width dimension. Also add a correctness test.

About 6x faster.

Reviewed By: ajtulloch

Differential Revision: D4928579

fbshipit-source-id: 5cc92a52bd688690fee907b4333d9c84b666f9c9
2017-05-07 21:10:11 -07:00
Aapo Kyrola
711ea1d4ac fix external inputs handling in AppendNet v2
Summary: External inputs must be computed before updating the _ops_output structure; otherwise, if the net to be appended outputs the external input, it is not added correctly

Differential Revision: D5013496

fbshipit-source-id: 6a83d0a6f1c63ef8ae7bec4d862c0ac2a690d47b
2017-05-05 21:50:57 -07:00
Du Tran
033ab9da1b Adding video data layer for caffe2
Summary: Adding a simple video data layer that allows reading video data from frames or videos and outputs a 5D tensor. It also allows multiple labels. The current implementation is based on ffmpeg

Differential Revision: D4801798

fbshipit-source-id: 46448e9c65fb055c2d71855447383a33ade0e444
2017-05-05 14:16:38 -07:00
Aapo Kyrola
a61778a628 fix recompute_blobs_on_backward
Summary: My previous refactoring broke recompute_blobs_on_backward, which was cleared.

Reviewed By: urikz

Differential Revision: D5013351

fbshipit-source-id: 5945778c0cff2ee2c7f5ad7b59b58f4305fa6a05
2017-05-05 14:01:34 -07:00
James Cross
5c667ebe4e AttentionCell
Summary:
This diff creates a generalized AttentionCell class, which will allow us to construct attention decoders out of arbitrary RNNCell components (with a particular view to using stacked, multi-layer RNNs).

In order to do this, we introduce a new optional input for RNNCell._apply which allows us to provide an additional input that is not processed by prepare_input(). Note that this is an argument only to _apply, not apply, since it is only meant to be used for additional recurrent connections to "embedded" cells, not for standalone RNNs.

Reviewed By: urikz

Differential Revision: D4998465

fbshipit-source-id: 473009ea4917e86e365f9d23aa2f11a46a94fd65
2017-05-05 12:33:01 -07:00
Yury Zemlyanskiy
d7f20c94fd Optimize memory for RNN attention
Summary:
The fix should save us (source_len - 1) * target_len * batch_size * encoder_output_size * 4 bytes for the forward pass. Typically, these values are 100 * 100 * 128 * 512 * 4 = 2.4GB.
Not entirely sure about backward pass.

Reviewed By: akyrola

Differential Revision: D5010109

fbshipit-source-id: 2ed68f3ebfd3b8362916d24af991482f1686e064
2017-05-05 12:18:50 -07:00
Eider Moore
0c6099ce25 Add __dir__ so autocomplete in iPython works.
Summary: It is good practice to provide __dir__ whenever __getattr__ is defined so that tooling works intelligently. In particular, it is hard to explore the available methods in IPython without tab completion.
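A minimal illustration of the pattern (hypothetical class, not the Caffe2 code): __getattr__ resolves attributes dynamically, and __dir__ advertises them so tab completion can find them.

```python
# Pairing __dir__ with __getattr__: dynamic attributes become discoverable.
class OpNamespace:
    _ops = {"Relu": "relu_impl", "Softmax": "softmax_impl"}

    def __getattr__(self, name):
        # Called only for attributes not found the normal way.
        try:
            return self._ops[name]
        except KeyError:
            raise AttributeError(name)

    def __dir__(self):
        # Default attributes plus the dynamically resolved ones.
        return sorted(set(super().__dir__()) | set(self._ops))

ns = OpNamespace()
assert "Relu" in dir(ns)  # tab completion in IPython now sees it
```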

Reviewed By: dzhulgakov

Differential Revision: D5006545

fbshipit-source-id: 1a150d91d54637d80b292764513943ff70d971b4
2017-05-05 11:32:06 -07:00
Heng Wang
8a2433eacb Add model saving and loading to resnet50_trainer.py
Summary:
Script caffe2/caffe2/python/examples/resnet50_trainer.py can be used to train a ResNet-50 model with Imagenet data (or similar).

However, currently the script does not actually save the model, so it is kind of useless.

Task 1: After each epoch, save the model in a file "<filename>_X.mdl" where X is the epoch number and <filename> is given as a command line parameter. By default, use "resnet50_model" as the filename.

Task 2: Add a functionality to restore the model from a previous file:
 - add a command line parameter "load_model", which user can use to specify a filename.
 - if this parameter is set, load the model parameters from the previous file

Reviewed By: prigoyal

Differential Revision: D4984340

fbshipit-source-id: 333e92679ba52a7effe9917fdfc2d55d652b868f
2017-05-05 10:08:37 -07:00
Aapo Kyrola
5c52392229 opsify AccumulateInputGradients
Summary:
Part of a project to turn all of the gradient accumulation business into ops in RecurrentNetworkGradientOp; this makes the accumulateInputGradients ops.

Also added a way to mark operators private so they don't appear in docs.

Reviewed By: salexspb

Differential Revision: D5006698

fbshipit-source-id: 226d7afb473290c8d0f936d2cc87640be3e06615
2017-05-05 09:13:39 -07:00