pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-15 21:00:47 +00:00

Author	SHA1	Message	Date
Pieter Noordhuis	e1e08d631a	Always check cuDNN support in test_convolution_gradients Summary: Regardless of device checker/gradient checker we cannot run a backwards pass with cuDNN when NHWC is used. Closes https://github.com/caffe2/caffe2/pull/1566 Differential Revision: D6474181 Pulled By: pietern fbshipit-source-id: 727d7b4f2a1431a4d6675ffb76c5b60d3d7fa712	2017-12-04 08:50:39 -08:00
Pieter Noordhuis	41897e3e78	Supress hypothesis health check in glu_op_test.py Summary: Closes https://github.com/caffe2/caffe2/pull/1564 Differential Revision: D6472568 Pulled By: pietern fbshipit-source-id: 4f1bd3a1ced6d77991531eb864d2cf5d39bc7c4f	2017-12-03 22:51:46 -08:00
James Cross	2c190d2f05	update transformer code for layer_norm() API change Summary: Quick fix for unit test broken by D6454290. This is my fault for approving while the tests covering the single callsite were broken. Reviewed By: goldsborough Differential Revision: D6466566 fbshipit-source-id: 2683be3d6bb184286e64fbde3e572946e39030c7	2017-12-01 20:19:31 -08:00
Peter Goldsborough	b43c1b2bed	Fix and upgrade brew.layer_norm Summary: While working on layer normalization for LSTMs I encountered an issue where the layer norm parameters (which are the scale/gain and bias/shift from the paper) were not registered in the model for `brew.layer_norm`. salexspb explained that this is because it was using the `init_net_param` API instead of `create_param`. This diff fixes this. While fixing I noticed that I noticed that `brew.layer_norm` actually had a bug where it was multiplying with the bias instead of adding it. Another issue was that the function giving the scale and bias a shape of `[1]`, however the paper (https://arxiv.org/pdf/1607.06450.pdf) specifies that, like for batch norm, there is one scale and bias parameter per neuron, i.e. the shape should be `[1, axis_dimension]`. The API now takes an explicit `dim_in` parameter (also more consistent with other normalization functions in that module) so that this can be specified. See tests for how this now looks. Reviewed By: jhcross Differential Revision: D6454290 fbshipit-source-id: fc00ca614de3190c40ab743e8984bec9e85fb58c	2017-12-01 14:18:28 -08:00
Jesse Hellemn	3af2b8f428	Adding length verification check to pack_segments Summary: Adding a check to pack_segments to make sure the lengths passed in add up as expected. Additionally started to address https://fb.facebook.com/groups/1405155842844877/permalink/1977332432293879/ , but it might not fix that issue, but is still useful if it does not help that issue. Reviewed By: salexspb Differential Revision: D6443490 fbshipit-source-id: 680dc763a788a550d321d97a556c5b46e3402dd1	2017-12-01 10:47:25 -08:00
Pieter Noordhuis	3d1135c842	Skip remove_padding test because it is flaky Summary: Must be fixed in #1547 Closes https://github.com/caffe2/caffe2/pull/1548 Reviewed By: jhcross Differential Revision: D6456373 Pulled By: pietern fbshipit-source-id: 484a58e31506acfc8b8a0954f76796d14dfdfda3	2017-12-01 09:47:31 -08:00
Qinqing Zheng	7374c981d8	CUDA support for PackSegments Op Summary: Replace GPUFallbackOp by native CUDA implementation Reviewed By: akyrola Differential Revision: D6423200 fbshipit-source-id: 47dfecbc486e9a8bf0cc6b897ab8b6a2488caa34	2017-11-29 22:01:42 -08:00
James Cross	0e21cd2eae	CUDA implementation of RemovePadding operator Summary: This is a CUDA implementation of the RemovePadding operator, modeled on akyrola's implementation for AddPadding. There's also an incidental spelling correction: GetAddPadingGradient -> GetAddPaddingGradient. Reviewed By: akyrola Differential Revision: D6439594 fbshipit-source-id: b29cd0c252021c58e150b901bbaad28a3bd3cc4a	2017-11-29 18:48:01 -08:00
Pieter Noordhuis	6f218cef25	Supress hypothesis health check in adagrad_test.py Summary: With some test seeds this warning starts firing. Should be addressed in a better way, not generating as many invalid examples. Closes https://github.com/caffe2/caffe2/pull/1536 Reviewed By: bddppq Differential Revision: D6437138 Pulled By: pietern fbshipit-source-id: c619d928a585e3d887f686db5d98f841af10c56b	2017-11-29 11:35:04 -08:00
Yangqing Jia	4beb3ac3ab	Properly guard cudnn backward path - NHWC is still not supported. Summary: TSIA. This is found in https://github.com/caffe2/caffe2/pull/1530 Reviewed By: dzhulgakov Differential Revision: D6434417 fbshipit-source-id: 2285c2f6252eb7f24e83357eb4887851b3adf690	2017-11-28 23:03:02 -08:00
Aapo Kyrola	a08909160e	fix bug in CUDA AddPadding when lenghts output is not provided Summary: enosair caught bug that the operator returned too early if the lengths output was not provided. Fixed and added testing. + noticed the op does not support case when no lengths-input is provided. Added a temporary CAFFE_THROW for this case, will fix later Reviewed By: enosair Differential Revision: D6405585 fbshipit-source-id: a81717e1b39afde6e900ddd9049b820943aea9f1	2017-11-27 15:14:07 -08:00
Aapo Kyrola	0954775d28	AddPadding CUDA version Summary: CUDA version of the AddPadding op. It first executes a prefix-sum using Cub to compute the cumulative lenghts array. Then it launches a kernel that uses this information to fill the output tensor with start, end paddding and the actual contents. Reviewed By: asaadaldien Differential Revision: D6391413 fbshipit-source-id: 45b431e5976674729e53cb4752c7753c1d8a69e8	2017-11-22 18:17:21 -08:00
Andrew Tulloch	48415d83c8	Fix instance_norm_test.test_instance_norm_model_helper Reviewed By: jerryzh168 Differential Revision: D6391749 fbshipit-source-id: ba861d401e358290782db8f360c430e3f3daae96	2017-11-22 15:05:29 -08:00
Yiming Wu	127a55ae49	cast op for empty batch Summary: Cast op cuda can deal with empty batch now. Reviewed By: azzolini Differential Revision: D6350138 fbshipit-source-id: 2f3d19f4d42ff34806aa9597690e66f6b4de1a6b	2017-11-16 12:20:20 -08:00
Wenyi Huang	d8dfaeeef7	Add batch-based/row-based sparse from/to dense operator Summary: Two ops: BatchSparseToDenseOp and DenseToBatchSparseOp Inverse operations of each other. Details are described in op Doc These op is used along with flexible topK, where the output is lengths, indices, and values. We want to do softmax on the values, but the dimension of each batch is different. So these op will convert sparse representation to dense and vice versa. The two ops are also gradient op for each other. Reviewed By: chocjy Differential Revision: D6288338 fbshipit-source-id: 0ba9e611058b39e46e7414dcc5f39cab29915fa3	2017-11-16 00:59:21 -08:00
Xiaolong Wang	3bde37fbf0	Listwise Ranking -- LambdaNDCG Summary: This is part one: It adds lambdaNDCG loss which can be used to heuristically optimize the NDCG metric. Differential Revision: D5830650 fbshipit-source-id: 1eb696337c9a77727ad40219c68f6468e2e097a5	2017-11-16 00:05:48 -08:00
Simon Layton	1ab3fd1a29	Fix Batched Matmul test accuracy Summary: Datatypes was being handled badly in reference check, causing sporadic fails in CI. All batched mat-mul with fp16 data is performed as pseudo-fp16, with all math in fp32. Adjusted the reference implementation to reflect this. Adjusted the gradient check threshold to the best I could get to consistently pass. Closes https://github.com/caffe2/caffe2/pull/1406 Differential Revision: D6324431 Pulled By: pietern fbshipit-source-id: 83ff2584438a11f7a6db4599a4fb0e75e9e15a3d	2017-11-14 09:31:18 -08:00
James Reed	8701a2dfa3	Allow negative indices in Concat/Split ops Summary: Closes https://github.com/caffe2/caffe2/pull/1440 Reviewed By: dzhulgakov Differential Revision: D6290009 Pulled By: jamesr66a fbshipit-source-id: 93eaff6103211ff89ed63ecaf4aa96d38e6bed63	2017-11-13 18:32:24 -08:00
Yan Zhu	7b047c161d	NegateGradientOp and test Summary: add NegateGradientOp: in forward pass, this op simply copies the input to output. In backward pass, it flips the sign of gradients. Reviewed By: dragonxlwang Differential Revision: D6314456 fbshipit-source-id: 56afd8b131eff9f7e120ab7e4e87461df49649d4	2017-11-13 18:05:14 -08:00
Xianjie Chen	c04ec84e1a	disable uniform fill large blob Reviewed By: pietern Differential Revision: D6299413 fbshipit-source-id: 2ea4a5f1434060c3ab6fd42abd4052bdb10a37cc	2017-11-10 12:10:14 -08:00
Jeff Johnson	0440f3bf93	Reduce caffe2 GPU topk test sizes Summary: The topk GPU test was taking too much time, but there are still a variety of codepaths to test (k <= 1024, k > 1024, k == 1, k == n). Reduce the batch sizes and n to reduce time taken by the in-python CPU code equivalent. Reviewed By: pietern Differential Revision: D6272628 fbshipit-source-id: b8b8f3601f28bf64f144c73d7c9e915f40c84d70	2017-11-10 07:47:00 -08:00
Xianjie Chen	d1c73eb407	use size_t for rand fill functions in math Summary: The number of elements in the caffe2 blob can be larger than int32. Use size_t to prevent overflow. Reviewed By: ajtulloch Differential Revision: D6278363 fbshipit-source-id: 356e294c667a53360d8a65b56a63a39d5ce3384e	2017-11-09 18:44:46 -08:00
Wenyi Huang	7cedf80923	add flexible topK op Summary: Will probably rename to adaptive topK to be aligned with the layer name. The main difference from top_k op is that the K is not fixed as a layer parameter, instead this op takes in a blob that conatins K information for each row of the input data (batch mode). Reviewed By: chocjy Differential Revision: D6221209 fbshipit-source-id: f7fd575ff8f515d886d93278ad94fd17e8bd6fa5	2017-11-09 16:48:14 -08:00
Junjie Bai	e6fadfa76e	Relaxing checks for fp16 in BatchMatMul tests Reviewed By: pietern Differential Revision: D6275557 fbshipit-source-id: e336ba9c897b88801f1be1b32029c5af58ec3fc5	2017-11-08 13:42:28 -08:00
Pieter Noordhuis	348e29c49b	Don't run CUDA tests for ops without CUDA implementation Summary: Closes https://github.com/caffe2/caffe2/pull/1434 Reviewed By: houseroad, ilia-cher Differential Revision: D6272614 Pulled By: pietern fbshipit-source-id: 7b998b08ec02b03f88a6fd24a949b0d199b2aa37	2017-11-08 10:28:02 -08:00
Xianjie Chen	cbb03b8db8	add modulo operator Summary: as desc. Reviewed By: chocjy Differential Revision: D6240026 fbshipit-source-id: fa4dcccebc44b0a713946823b6f56e73d5d6146b	2017-11-06 16:44:16 -08:00
Alexander Sidorov	20feef45bc	NNFC operator: an FC with noTrans noTrans options Summary: This seems to be faster in a bunch of cases. Prefer to keep it as a separate op instead of MatMul + Add so its easy to compare perf on per op basis between this one and the baseline (normal FC) Reviewed By: akyrola Differential Revision: D6169187 fbshipit-source-id: 09b96325d44bd181896f396aec88b27314c435b0	2017-11-03 15:08:39 -07:00
Philipp Keller	68ed66a2c5	Faster BatchBoxCox Operator using MKL Summary: Use MKL VML vsPow() and row-major iteration for faster BatchBoxCox operator. Reviewed By: kennyhorror Differential Revision: D6042052 fbshipit-source-id: 54fc6b9184cb341672183a77730d79a271d09207	2017-11-03 12:04:03 -07:00
Dmytro Dzhulgakov	583bc63c98	Fix boundary checking in 8-bit sparselengthssum ops Summary: Before the boundary checking was happening after the first access for 8bit ops. Reviewed By: Yangqing Differential Revision: D6206753 fbshipit-source-id: 07ab240cae8c67b3048f03aa79af0b6399b9940b	2017-11-03 05:19:57 -07:00
Aapo Kyrola	14f95c2782	Updated brew SpatialBN to use initializers Summary: Updated brew SpatialBN to use initializers similar to other brew ops such as conv and fc instead of initilaizing all of its parameters itself within the brew call. Reviewed By: asaadaldien Differential Revision: D5840359 fbshipit-source-id: 9f3d688d4957605eaf7ecd2488bc26bfb1da3f78	2017-11-02 11:25:45 -07:00
Junjie Bai	7c2804ee90	Add support for doing broadcast with single elem dimensions at both ends Summary: Closes https://github.com/caffe2/caffe2/pull/1413 Reviewed By: jamesr66a Differential Revision: D6201556 Pulled By: bddppq fbshipit-source-id: 1d443e895dbb3f5b67a5a0e027977b7807df3de1	2017-11-01 18:33:11 -07:00
Dong Li	3bfabb4d5f	support float16 input for operator SparseAdagrad Summary: Implemented new CUDA class for operator SparseAdagrad. The param and moment inputs now can be float or float16. The functions for mixed-precision add/mult/store are defined in a separate head file ("caffe2/core/float16_util.h") for reuse purpose. Reviewed By: azzolini Differential Revision: D5880200 fbshipit-source-id: dca227f38629a03a9d771f42efe2c0b673075c4d	2017-10-30 19:32:30 -07:00
Aapo Kyrola	669ec0ccba	Added FP16 compute support to FC Op Summary: Allow the GEMMs in the FC/FCGradient Op to do FP16 compute instead of FP32 if the appropriate op flag is set. Reviewed By: asaadaldien Differential Revision: D5839777 fbshipit-source-id: 8051daedadf72bf56c298c1cf830b019b7019f43	2017-10-30 17:03:51 -07:00
Junjie Bai	b7a9f51de3	In BatchMatMul, add support for accepting inputs >=2d Summary: Closes https://github.com/caffe2/caffe2/pull/1399 Differential Revision: D6183083 Pulled By: bddppq fbshipit-source-id: 5c8f17c2de212fbc39a66c90aa2599b714f5ceb4	2017-10-29 23:38:33 -07:00
Qinqing Zheng	42ffb1ae07	support non-normalized weights Reviewed By: akyrola Differential Revision: D6158290 fbshipit-source-id: 4d54e5c0d0f91f23deab18da047df4d209d4c312	2017-10-27 23:18:25 -07:00
Tilak Sharma	7b7dcaf269	Initialize presence tensor if data is empty. Summary: See https://fb.facebook.com/groups/811605488888068/permalink/1645450575503551. Differential Revision: D6116836 fbshipit-source-id: 3072643eaf6f134bda7d224af3d5f8339da1f39d	2017-10-27 01:05:42 -07:00
Qing He	0b0d5b2b1d	Add tensor output that gives the sampled values Summary: Given an additional tensor containing the values corresponding to the weighted samples, add tensor output that contains the values selected by the sampled indexes. Reviewed By: akyrola Differential Revision: D6050094 fbshipit-source-id: 1eccc641b99e30d36ae83d49f630b018a53e4147	2017-10-26 16:04:57 -07:00
Jiyan Yang	6e33ae79df	Add gradient op for WeightedSum op Reviewed By: dzhulgakov Differential Revision: D6149163 fbshipit-source-id: 0e8cf400323233d001243bc5cb25a0025115a564	2017-10-26 00:16:51 -07:00
Ahmed Taei	5bb8ed67e3	Compute GLU for an arbitrary axis Summary: As in title Differential Revision: D6151804 fbshipit-source-id: bd0fa08be1676ebd1abd9720711c221c61c11ad1	2017-10-25 19:49:55 -07:00
Aapo Kyrola	2e4d8aa530	Added FP16/FP32 MomentumSGD + WeightDecay Update Ops Summary: Added two new ops, FP16MomentumSGDUpdate and FP32MomentumSGDUpdate, which perform both the momentum sgd and weight decay updates to a given parameter in a single op -- thus being more efficient. Also updated the standard momentum sgd test to test if nesterov momentum works. Reviewed By: asaadaldien Differential Revision: D5837837 fbshipit-source-id: 5ad487b9c59434491d3a4fcfdeed820db6083f57	2017-10-24 12:28:16 -07:00
Junjie Bai	ed08533a1e	Add CUDA version of ScatterAssign Reviewed By: houseroad Differential Revision: D6128352 fbshipit-source-id: ea59f4bc723ef929b0f6ed15797df776d8054422	2017-10-24 10:20:03 -07:00
Ahmed Taei	512a8015b8	Gated Linear Unit implementation Summary: As titled Differential Revision: D6117600 fbshipit-source-id: 84b0154dc4cf77cc9c9146e9a534c7485989346b	2017-10-23 18:14:57 -07:00
Yarik Markov	c6ef04db04	Add "dtype" parameter for GivenTensorOp Summary: Adding "dtype" parameter for the GivenTensorOp. Also, providing backwards compatibility for the existing code, byt supporting the templating if "dtype" is not provided. Reviewed By: bddppq Differential Revision: D6090049 fbshipit-source-id: f5deaa57b49f2280289975f4583aba5bc064a2bc	2017-10-23 16:06:37 -07:00
Qinqing Zheng	6a4182eead	weighted sample op cuda Summary: CUDA version of weighted sampling operator; minor changes for CPU version Reviewed By: asaadaldien Differential Revision: D6106668 fbshipit-source-id: 42d7607bd845a4a39cf5b89d7476904cb5928431	2017-10-21 18:49:59 -07:00
Badri Narayan Bhaskar	25bfffeafe	Swish Activation Function Summary: Swish: A self-gated activation function. https://arxiv.org/pdf/1710.05941.pdf Reviewed By: ajtulloch Differential Revision: D6100424 fbshipit-source-id: 0103d6d82e9ffb50106c98a8785e62b8808e9af1	2017-10-20 10:37:43 -07:00
Junjie Bai	ee62a595fc	ScatterAssign int types Summary: Closes https://github.com/caffe2/caffe2/pull/1357 Reviewed By: dzhulgakov Differential Revision: D6107036 Pulled By: bddppq fbshipit-source-id: 9278dae988c3c0656b4e4fd08bf7ca1e2eec3348	2017-10-19 23:22:54 -07:00
Dmytro Dzhulgakov	623f2bf815	Add GivenTensorInt64Fill on gpu Summary: Before we fix it properly with 'type' argument. Reviewed By: bddppq Differential Revision: D6103973 fbshipit-source-id: 8c00a93c373dd0ad0bbfe59944495f6574223ab6	2017-10-19 18:32:41 -07:00
Hassan Eslami	db6a9d2ae4	Fixes type inference for Slice and GivenTensorFill operators Summary: Currently, the type inference infers FLOAT as the type for all GivenTensorFill operators. However, the inferred type should match the actual operators. Also, for `Slice` operator, there is a corner case where type inference fails Reviewed By: azzolini Differential Revision: D6096813 fbshipit-source-id: d65b7c0f42436138cbc49d8a5a62374fa5e927e1	2017-10-19 14:02:21 -07:00
James Cross	96c6212513	repeat sequence mask for data dims Summary: Allow the application of sequence-length masking to be replicated along one or more minor axes. See task for details. Reviewed By: jamesr66a Differential Revision: D6090835 fbshipit-source-id: 9064232aa9b93246c582b6e0bae73be5dbe09e98	2017-10-18 18:08:08 -07:00
Bryan Wu	6ac393a32b	WeightedSigmoidCrossEntropyWithLogits Summary: Op for computing SigmoidCrossEntropyWithLogits with per-label, per-sample weights. Can be used for addressing class or label imbalance. Doc: Given three matrices: logits, targets, weights, all of the same shape, (batch_size, num_classes), computes the weighted sigmoid cross entropy between logits and targets. Specifically, at each position r,c, this computes weights[r, c] * crossentropy(sigmoid(logits[r, c]), targets[r, c]), and then averages over each row. Returns a tensor of shape (batch_size,) of losses for each example. Reviewed By: stephenyan1231 Differential Revision: D5997723 fbshipit-source-id: f3172325f1c98b6f26e1700131ef897b743a72fc	2017-10-16 17:34:38 -07:00

1 2 3 4 5 ...

447 commits