pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-15 21:00:47 +00:00

Author	SHA1	Message	Date
Bangsheng Tang	e5a7891038	dot product using matmul Summary: 1. PairwiseDotProduct in layers 2. add_axis argument in Concat and Split (just for backward propagation) Reviewed By: xianjiec Differential Revision: D5383208 fbshipit-source-id: 8e18ce371fff2da2da77b1a728142d69cd48e9c3	2017-07-17 23:20:37 -07:00
Tao Wu	427cc68ba2	added TensorInferenceFunction for ExpandDims operator; deleted Reshape layer. Summary: The diff added TensorInferenceFunction for ExpandDims operator, so that ExpandDims layer is no longer needed (it can be handled by functional layer) Reviewed By: kittipatv Differential Revision: D5430889 fbshipit-source-id: 4f895f2751663c45db4cc4f87e5114c63cda9fbb	2017-07-17 21:03:00 -07:00
Tao Wu	78c4c4f885	handle RecurrentNetwork operator when clone net Summary: added support of passing remap_funcs to clone_and_bind_net, so that it can pass it to clone method. Added other utils to ensure RecurrentNetwork operator is correctly cloned based on the remap_blob. The reason that RecurrentNetwork operator needs special treatment is that its arguments contain proto and blobs. Reviewed By: kittipatv Differential Revision: D5421532 fbshipit-source-id: 5de68365ce97df2de483f02ad260d78c8d35eead	2017-07-17 17:33:21 -07:00
Victor Gao	f7a92145d4	comment out unused parameter in pybind_state.cc Summary: This removes/comments out/silences one or more unused parameters in the files. We are going to enable `-Wunused-parameter` in fbcode and this fixes a case that automated tooling can't handle. This diff is automatically generated. Reviewers are added heuristically. Reviewed By: dzhulgakov Differential Revision: D5437217 fbshipit-source-id: c2fc5ed30e7ee47b8c40248f89a9f4304ce7c098	2017-07-17 15:57:49 -07:00
Aapo Kyrola	baef769035	add code comments to memonger Summary: Add some comments to dag-memonger to help asaadaldien with his C++ port. Reviewed By: asaadaldien Differential Revision: D5435459 fbshipit-source-id: dd5d482efb017418d22f42ee79fbd4668bd31bdd	2017-07-17 13:07:33 -07:00
Geet Sethi	2dc8851206	RNN Workspace Blob Extraction Summary: Added operator RecurrentNetworkBlobFetcherOp that takes as input a scratch workspace name and prefix, and copies over all blobs in the scratch workspace into the global workspace. This essentially extracts all intermediate recurrent network computation for each timestep. Added a wrapper in recurrent.py - retrieve_step_blobs(net, prefix='rnn') - which, when called after an rnn is run, will return a list of all blobs extracted from the net. Reviewed By: akyrola Differential Revision: D5421926 fbshipit-source-id: 0f35b466d77d3c719fb0e32de7dbcafc6c0d5225	2017-07-17 10:24:18 -07:00
Huazhong Ning	9e2c74cc58	Use scope name for dataset cursor Summary: Currently the dataset cursor blob is using a fixed name. When we read from multi input tables, the dataset cursor of each table is using the same blob. This messed up the split queue and crashed the reader pipelines (see the errors and failures in https://fb.quip.com/uzbIA7K0PgVe) Reviewed By: dragonxlwang, rayleichen Differential Revision: D5419863 fbshipit-source-id: 5983a3d8d2e286dc47c2ec38ed1dbbe30c7c9b49	2017-07-15 19:22:32 -07:00
Yangqing Jia	b6691277f5	binary size util Summary: This would allow us to inspect the binary size of the builds more easily. Reviewed By: jonmorton Differential Revision: D4553515 fbshipit-source-id: 95371bf67e66490a8653b874e1ff79cc987805e6	2017-07-14 17:49:24 -07:00
Honghao Wei	b68adec7bb	adding model loss logic Summary: Add api model.add_loss(), which allows adding loss, such as optimization and regularization. See change in sparse_nn.py, in which 'model.loss = loss' is changed to 'model.add_loss(loss)'. Reviewed By: xianjiec Differential Revision: D5399056 fbshipit-source-id: 13b2ced4b75d129a5ee4a9b0e989606c04d2ca8b	2017-07-14 16:25:23 -07:00
Alexander Sidorov	bd29260f47	hypothesis_test grad_reference bug fixes Summary: 1. it was easy to pass grad_reference which was just ignored due to missing output_to_grad 2. threshold was not passed to the gradient checking logic Reviewed By: dzhulgakov Differential Revision: D5425226 fbshipit-source-id: 2eb41f2601d5e356f7872e57724d08ab2e742329	2017-07-14 14:41:23 -07:00
Jacqueline Xu	2aa8fc7e8d	Implementing Semi-Random Features Layer Summary: - (Split diff from Arc Cosine) - Implemented [[ https://arxiv.org/pdf/1702.08882.pdf \| Semi-Random Features ]] Layer - Created a buck unit test for SRF Layer Reviewed By: chocjy Differential Revision: D5374803 fbshipit-source-id: 0293fd91ed5bc19614d418c2fce9c1cfdd1128ae	2017-07-14 13:15:50 -07:00
Junjie Bai	a305ce3ece	Fix broken seq2seq example Reviewed By: harouwu Differential Revision: D5423060 fbshipit-source-id: 4537b020546503a1f9cb237257ab3c42665ae07f	2017-07-13 23:31:54 -07:00
Aapo Kyrola	f44991b398	add timeout argument to DequeueBlobs; use 10 min timeout for data workers Summary: As title. This helps with (quite common) cases where data input is stuck for one reason or another, and the net execution never proceeds and is stuck forever. Reviewed By: andrewwdye Differential Revision: D5409885 fbshipit-source-id: 840261fd5964408f788fc0f50ece0d74193694ac	2017-07-13 18:52:03 -07:00
Honghao Wei	34f7acbedf	Report bugs in BatchNormalization, the dimension is wrong for second order Summary: The number input dimension for NHWC should be the last dimension C. Since batch size is omitted, it should be 2 instead of 3. Reviewed By: chocjy Differential Revision: D5418538 fbshipit-source-id: a6939a863817b7566198ea2a665a1d236a2cf63d	2017-07-13 18:31:18 -07:00
Ahmed Taei	13980d2bb5	Set device to the default device(CPU) when DeviceContext is None. Summary: Fix case when optimizer isn't called within a device scope context. Fix OptimizerContext lr blob names Reviewed By: volkhin Differential Revision: D5421046 fbshipit-source-id: 186a0d05f40d4442c5ba5736084626da73a0c0f1	2017-07-13 17:54:36 -07:00
Geet Sethi	ab0d631d6d	Adding AllCompare-like function to data_parallel_model Summary: Added function _RunComparison to data_parallel_model that checks if all shards in a given rendezvous have the same value for a given blob_name Reviewed By: wesolwsk Differential Revision: D5394164 fbshipit-source-id: c2b07d0f8d5846fa9887d53b0be091a8c057f106	2017-07-13 13:03:57 -07:00
Aapo Kyrola	59c0bb9e5a	fix for duplicate input case Summary: Fix a bug reported by dzhulgakov that occurs when an input blob is used twice in the same op --> it was released to the recycled blobs pool twice. Reviewed By: dzhulgakov, volkhin Differential Revision: D5414023 fbshipit-source-id: 861bb46fe901023cb9a496401736e6ecb77d5fae	2017-07-13 01:51:30 -07:00
Jiyan Yang	043640c3eb	Return top K classes Reviewed By: kittipatv Differential Revision: D5363481 fbshipit-source-id: 27ce37878434917c1a7c5f325ed77c989a1448af	2017-07-13 00:20:00 -07:00
Ahmed Taei	3faca65adf	Add a unit-test to validate sharing learning rate between Reviewed By: kennyhorror Differential Revision: D5413387 fbshipit-source-id: ff4022375183394ca9cee6faea5ac46e56079b86	2017-07-12 21:53:25 -07:00
Luke Yeager	82e318cf8b	Optimizer: one LR op per (device, optimizer) Summary: Try running this script through `nvprof`: ```py import numpy as np from caffe2.proto import caffe2_pb2 from caffe2.python import brew, core, optimizer, workspace from caffe2.python.model_helper import ModelHelper do = core.DeviceOption(caffe2_pb2.CUDA, 0) with core.DeviceScope(do): model = ModelHelper(arg_scope={'order': 'NCHW'}) conv1 = brew.conv(model, 'data', 'conv1', 1, 20, 5) pool1 = brew.max_pool(model, conv1, 'pool1', kernel=2, stride=2) conv2 = brew.conv(model, pool1, 'conv2', 20, 50, 5) pool2 = brew.max_pool(model, conv2, 'pool2', kernel=2, stride=2) fc3 = brew.fc(model, pool2, 'fc3', 50 * 4 * 4, 500) fc3 = brew.relu(model, fc3, fc3) pred = brew.fc(model, fc3, 'pred', 500, 10) softmax, loss = model.SoftmaxWithLoss([pred, 'label'], ['softmax', 'loss']) model.AddGradientOperators([loss]) optimizer.build_sgd(model, 0.01, policy='step', stepsize=1, gamma=0.999, momentum=0.9, nesterov=False) workspace.FeedBlob('data', np.zeros((1, 1, 28, 28), dtype=np.float32)) workspace.FeedBlob('label', np.zeros((1, 1), dtype=np.int32)) workspace.RunNetOnce(model.param_init_net) workspace.CreateNet(model.net) for _ in range(100): workspace.RunNet(model.net) ``` Before this change: ``` 1.55% 1.4185ms 837 1.6940us 1.6630us 2.4000us [CUDA memcpy HtoD] 0.72% 656.03us 200 3.2800us 3.1350us 3.5840us [CUDA memcpy DtoD] 0.39% 7.1574ms 1034 6.9220us 3.8300us 18.677us cudaMemcpyAsync 0.00% 34.180us 3 11.393us 9.0960us 12.910us cudaMemcpy ``` And after it (look at the third column): ``` 0.73% 657.15us 200 3.2850us 3.1040us 3.6160us [CUDA memcpy DtoD] 0.26% 235.07us 137 1.7150us 1.6640us 2.3680us [CUDA memcpy HtoD] 0.20% 3.4493ms 334 10.327us 6.4220us 16.958us cudaMemcpyAsync 0.00% 37.376us 3 12.458us 9.4120us 15.412us cudaMemcpy ``` That makes a pretty big difference in performance. Is there any particular reason you decided to have a separate `LearningRate` op for every parameter in `1317e3498c`? 
Closes https://github.com/caffe2/caffe2/pull/893 Reviewed By: kennyhorror Differential Revision: D5372541 Pulled By: asaadaldien fbshipit-source-id: 57357e1be2d58ce294058e9422fb3b1eddfca24d	2017-07-12 21:17:49 -07:00
Jiyan Yang	d6f5452240	Allow to import subclasses of layers Summary: We want it to be able to register children of layers who are not direct children of ModelLayer. This requires us to find subclasses of ModelLayer recursively. Reviewed By: kittipatv, kennyhorror Differential Revision: D5397120 fbshipit-source-id: cb1e03d72e3bedb960b1b865877a76e413218a71	2017-07-12 20:19:47 -07:00
Tao Wu	02aa5ad9fb	make functional layer return scalar if only one output Summary: This diff makes functional layer return scalar if only one output. This diff also corrects all other corresponding implementations. Reviewed By: kittipatv Differential Revision: D5386853 fbshipit-source-id: 1f00582f6ec23384b2a6db94e19952836755ef42	2017-07-12 11:34:31 -07:00
Geet Sethi	a68bb5e3f9	Added device scope checks to data_parallel_model and data_parallel_rendevous Summary: Added device scope checks to data_parallel_model and data_parallel_rendevous Added test to check that checks are working correctly to data_parallel_model_test Fixed device_scope error in test_synchronization_barrier Reviewed By: akyrola Differential Revision: D5403936 fbshipit-source-id: 849c1cd7452692efbc5ef74d2d60ede090c9c017	2017-07-12 10:47:28 -07:00
Tao Wu	74fd4bf9e4	quick fix for model_helper __init__ Summary: the init method should also make _parameters_info shared between self and param_model, since params is shared. Otherwise it can cause an inconsistency between _parameters_info and params. Examples of using param_model can be found in rnn_cell.py. Reviewed By: kennyhorror Differential Revision: D5405327 fbshipit-source-id: ca8079058e898f529906452163cda234cb30a7df	2017-07-12 08:49:48 -07:00
Tao Wu	b9e64ecef1	allow param_info to set optimizer Summary: this diff adds optimizer into param_info, and the associated implementations for modelhelper and brew to set optimizer for each individual parameter. Reviewed By: kennyhorror Differential Revision: D5385432 fbshipit-source-id: 5d682f9d1ab077e04a5d76a24d71470f4e64fc92	2017-07-12 08:49:48 -07:00
Mitchell Wortsman	823869ba79	Adding tanh to brew Summary: Added tanh to brew. Reviewed By: harouwu Differential Revision: D5395358 fbshipit-source-id: 8eb5303f503e10aec4c59b42055933198d67e9b3	2017-07-11 18:17:52 -07:00
Dmytro Dzhulgakov	67d2f45e2f	Fix net_printer.py Summary: Fix the unprintable characters fix :) Reviewed By: akyrola Differential Revision: D5398914 fbshipit-source-id: 2c607c497f15e324e863ff1dae7bb16199d4074e	2017-07-11 15:26:52 -07:00
Aapo Kyrola	192e0546bf	fix for back-and-forth models, pass reference instead of copy Summary: akirillov again presented me with a memonger-bug: his model that has kind of a 'back-and-forth structure' where blobs are passed left and right in a ladder-like structure, revealed a bug in memonger: I should pass the set of free blobs as a reference, not a copy so that the recyclings are properly accounted for. Hard to explain. Since we have the graph verifier, we can be more confident with these changes. I also added some helpful debug to the graph verifier. Differential Revision: D5396925 fbshipit-source-id: 0bffb3a0bf8532afcd6b5bc9331c779768a8c5c5	2017-07-11 10:52:14 -07:00
Jacqueline Xu	e89e71c595	Simplifying Random Fourier Features and layer test Summary: - Condensed operators in RFF layer - Adjusted RFF layer test; made test code more concise Reviewed By: chocjy Differential Revision: D5391436 fbshipit-source-id: 08748861cd6fb4a9e4cc9c8762996371492020a1	2017-07-11 00:40:53 -07:00
Robert Verkuil	97193478c7	Implemented GRUCell Summary: Implemented python logic and tests to create an RNNCell for GRU. Uses the preexisting GRU Unit Op code. Reviewed By: salexspb Differential Revision: D5364893 fbshipit-source-id: 2451d7ec8c2eacb8d8c9b7c893bfd21b65fb9d18	2017-07-10 17:52:25 -07:00
Robert Verkuil	2409c2e359	GRUUnit Op Backwards Pass Summary: Just an implementation of the forward pass of the GRU Unit Op, not the full RNNCell. Functions were created to mimic LSTM implementation as closely as possible. Backwards pass implementations are defined in GRU_unit_op.{h, cc} assertGradientChecks call added to gru_cell_test.py Reviewed By: salexspb Differential Revision: D5364856 fbshipit-source-id: 09cff4478091827763b40cc331e4e0abf0ec258f	2017-07-10 17:52:24 -07:00
Robert Verkuil	279f3f095e	Implemented Gated Recurrent Unit (GRU) c++ operator forward pass Summary: Just an implementation of the forward pass of the GRU Unit Op, not the full RNNCell. Functions were created to mimic LSTM implementation as closely as possible. Implementation defined in GRU_unit_op.{h, cc} tests put in gru_cell_test.py, which import rnn_cell_test_util.py for sigmoid, tanh, and _prepare_rnn functions. Reviewed By: jamesr66a Differential Revision: D5363697 fbshipit-source-id: f9ba9fe0be01ffc868dd22027be8be4975b84998	2017-07-10 17:52:23 -07:00
Robert Verkuil	48bd102b95	Moved sigmoid, tanh, and _prepare_lstm (renamed) to a util file. Summary: Moved sigmoid, tanh, and _prepare_lstm (renamed) to a util file. Also renamed _prepare_lstm to _prepare_rnn since it is being used for setting up both an LSTM and a GRU model. The reason for this commit is to allow the creation of GRU Op and testing code without copying and pasting code for sigmoid, tanh, and setting up an rnn unit op mode. Reviewed By: jamesr66a Differential Revision: D5363675 fbshipit-source-id: 352bd70378031f1d81606c9267e625c6728b18fd	2017-07-10 17:52:22 -07:00
Kevin Matzen	4b1ebd2f65	Fast path for serializing large floating-point tensors to protobuf Summary: Our existing serialization routines take a significant amount of time for large numpy arrays in order to verify the type of each element in the array as well as converting each element to a canonical type. For large floating-point tensors, such as model parameters, this checking and converting takes a significant amount of time. Adding a fast track path for just float32 arrays as this is the most common use case to worry about. Reviewed By: akyrola Differential Revision: D5389953 fbshipit-source-id: 26f44cb2426ea3efb849e7707b27d5485f69956c	2017-07-10 17:52:22 -07:00
Kevin Matzen	c096c188c3	minor leaky relu bug fixes Summary: numpy.random.rand generates samples from [0, 1) and therefore, the leaky relu test cases weren't testing negative inputs. Tests still pass after change. Leaky relu can be used in-place, but gradient took X rather than Y. Technically, the result is no different as it's just used for a sign test in the gradient, but updated it to take Y to reduce confusion. Differential Revision: D5390126 fbshipit-source-id: d0c428abbb2797eb33902a7d2a2f59d5e85daaa6	2017-07-10 16:04:45 -07:00
Kevin Matzen	720db19fa2	make GetComputedParams work like GetParams Summary: GetComputedParams tests namescopes with equality while GetParams tests with a prefix. Switching GetComputedParams to also use a prefix so that both functions have similar usages. Reviewed By: akyrola Differential Revision: D5389816 fbshipit-source-id: 0e43e4b491fccbad3b855b6b735dc2b91d7626c9	2017-07-10 12:30:44 -07:00
Junjie Bai	ff3996acb9	Add NormalizeL1Op for doing L1 normalization along given axis Reviewed By: salexspb Differential Revision: D5380220 fbshipit-source-id: 38fc56a1013c25b0c8b0fc161ca54fea412fb8b2	2017-07-10 10:10:36 -07:00
Jacqueline Xu	6ea71155c1	Implementing Arc Cosine Layer Summary: - Implemented the [[ http://cseweb.ucsd.edu/~saul/papers/nips09_kernel.pdf \| Arc Cosine ]] layer - Developed buck unit test for Arc Cosine Reviewed By: chocjy Differential Revision: D5367604 fbshipit-source-id: ffd3ee081bc055b06c075c34aa6ce329b62ce2e0	2017-07-10 10:10:36 -07:00
Jiyan Yang	3598bdd044	Modify samplingTrain layer to take more general inputs Summary: As desc. Reviewed By: kittipatv Differential Revision: D5363486 fbshipit-source-id: cb8fa65d750e80d2bf3e9909ca9b2d83a5548099	2017-07-08 22:19:55 -07:00
Guillaume Dumont	dc13345eb3	Read pretrained weights using binary mode in caffe_translator.py Summary: Binary mode must be explicitly specified when reading binary files under windows. Closes https://github.com/caffe2/caffe2/pull/883 Differential Revision: D5373073 Pulled By: Yangqing fbshipit-source-id: afedebdc74c954dbb6d24c0bccc192c8712c4c88	2017-07-08 10:17:57 -07:00
Bangsheng Tang	5f63f5697a	IndexHash Summary: 1. IndexHashOp 2. Helper class SparseFeatureHash 3. FeatureSpec changes to add desired_hash_size Reviewed By: kennyhorror Differential Revision: D5361370 fbshipit-source-id: bf02e3ca12b3654f1d291f77c8af9248b6c4ac55	2017-07-07 23:06:11 -07:00
Geet Sethi	86b6a6e2f8	Added PiecewiseLinearTransform CUDA Op Summary: Added a CUDA implementation of the PiecewiseLinearTransformOp. Differential Revision: D5378537 fbshipit-source-id: 38857f59f5cc52e16e1ecc97983a0b0b82a46c74	2017-07-07 15:20:00 -07:00
Clément Godard	cb7f17ab64	added gradients for ResizeNearest (CPU + CUDA) and ref Summary: # Added the gradients of the operation for both CPU and CUDA kernels. # Unified variable names across all ops. # Added reference implementation in numpy. # The gradient check needs a larger stepsize to succeed, is that normal? Reviewed By: akyrola Differential Revision: D5313682 fbshipit-source-id: aceb92649e01c5caeba8774e678f9095502d396c	2017-07-07 14:19:42 -07:00
Ralph Mao	febae7b20b	fix a bug in the report function of Data_Parallel Summary: replace params with sp, otherwise it will report an empty list Reviewed By: akyrola Differential Revision: D5382716 fbshipit-source-id: 34d8e6ee00cbe1718702e3d1f23ea12f8d65063e	2017-07-07 13:03:46 -07:00
Jacqueline Xu	8cedf35d55	Adding Random Fourier Features to SparseNN Model and Flow Summary: - Integrated RFF into the preprocessing workflow for dense features - Developed Flow interface to input RFF parameters - Created unit test for using RFF with sparseNN Reviewed By: chocjy Differential Revision: D5367534 fbshipit-source-id: 07307259c501a614d9ee68a731f0cc8ecd17db68	2017-07-07 09:39:32 -07:00
Aapo Kyrola	ad62e82179	fast simple-net memonger for C++ Summary: To be used with predictor "online": C++ version of memonger for simple nets. Very simple greedy algorithm. Works well at least on Resnet-50 inference graph: only 3 shared blobs are used. Next I will integrate this with predictor and run canary (separate diff). Reviewed By: asaadaldien Differential Revision: D5375392 fbshipit-source-id: d36e419e39a32e568e105657c27fb00c85a2535d	2017-07-06 15:17:07 -07:00
Guillaume Dumont	e8689dda8f	Python 3 compatible integer division Summary: As the title says. Closes https://github.com/caffe2/caffe2/pull/879 Differential Revision: D5372787 Pulled By: akyrola fbshipit-source-id: 0ff469c0d227f1b2252c1a0c4f6f8bebaac5580f	2017-07-06 11:47:12 -07:00
Andrew Dye	31f394f8b3	Add synchronization barrier API to data parallel model Summary: Add synchronization barrier API with configurable timeout. Users can call Synchronize() to join variable length execution before resuming multi-machine communication steps, i.e., resuming distributed training iterations after validation on a single machine. Reviewed By: akyrola Differential Revision: D5348387 fbshipit-source-id: 5826da10e6a60c50394c36c7cf47624f10191d11	2017-07-06 09:21:19 -07:00
Aapo Kyrola	21ba0ff560	small fix to when input blob is input to multiple ops Summary: Memonger had a bug that it crashes if an input blob was input to multiple ops. This fixes that and adds a test. Reviewed By: asaadaldien Differential Revision: D5374860 fbshipit-source-id: 1d5044001eacdbe6db43f69727da9297558f5c5c	2017-07-05 22:37:26 -07:00
Aapo Kyrola	2d133d4627	increase concurrency default Summary: Huge improvement in my tests, and it does not really hurt either. Reviewed By: wesolwsk Differential Revision: D5374925 fbshipit-source-id: c96a4ed2ca653120a82233c0037cbfded8a2d2a1	2017-07-05 21:46:31 -07:00
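
Commit d6f5452240 above describes finding subclasses of ModelLayer recursively so that classes that are not direct children can still be registered. The idea can be sketched roughly as follows, assuming plain Python classes. The names `all_subclasses`, `BaseLayer`, `SamplingLayer`, and `SamplingTrainLayer` are hypothetical and chosen only for illustration; they are not taken from the caffe2 source, and the actual change may be implemented differently.

```python
def all_subclasses(base):
    """Return every direct and indirect subclass of `base`, depth first."""
    found = []
    for sub in base.__subclasses__():
        # Record the direct child, then descend into its own children.
        if sub not in found:
            found.append(sub)
        for deeper in all_subclasses(sub):
            if deeper not in found:
                found.append(deeper)
    return found


# Hypothetical stand-ins for a layer hierarchy (not the real caffe2 classes).
class BaseLayer:
    pass


class SamplingLayer(BaseLayer):
    pass


class SamplingTrainLayer(SamplingLayer):
    pass


# __subclasses__() alone only reports direct children.
direct = [cls.__name__ for cls in BaseLayer.__subclasses__()]
# The recursive walk also reaches the grandchild.
names = [cls.__name__ for cls in all_subclasses(BaseLayer)]
print(direct)
print(names)
```

A registry built from `__subclasses__()` alone would miss `SamplingTrainLayer` here, which is the situation the commit summary describes.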


962 commits