Commit graph

654 commits

Author SHA1 Message Date
Chenguang Xi
cdefc27795 Support lr adaption for SparseAdam and RowWiseSparseAdam (#11162)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11162

as title, fix pr test failure

Reviewed By: chocjy

Differential Revision: D9619308

fbshipit-source-id: 0a2228841ed8fadb15f07e94d3575aa701b10146
2018-09-17 10:29:03 -07:00
Xiaomeng Yang
7f7cda99cd Optimize order_swich_ops on GPU (#11404)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11404

Optimize order_swich_ops on GPU

Reviewed By: houseroad

Differential Revision: D9728642

fbshipit-source-id: 74ff62268856fb1613fa61eb214bed6ec6716632
2018-09-12 16:56:15 -07:00
Lukasz Wesolowski
4db21a1d8e Optimize LengthsTileOp on GPU to run a kernel instead of a sequence of memcopies (#11413)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11413

LengthsTileOp was implemented using a sequence of device memcopies initiated on the CPU. This was very slow. I changed it to use a kernel. TUM benchmark QPS improved from 13k QPS to 20k QPS as a result.

Reviewed By: manojkris, xianjiec

Differential Revision: D9724988

fbshipit-source-id: 2f98c697730982734d7c6a26d0b6967310d49900
2018-09-11 13:25:35 -07:00
Mingda Li
f2f43ad2da Add new LengthsSplit operator (#10974)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10974

Pull Request resolved: https://github.com/pytorch/pytorch/pull/10291

This new operator will do the following:

Given a LENGTHS vector and n_splits, output a "split" LENGTHS vector where:

1. Each length in input vector is split into n_splits values (thus output vector should have LENGTHS.size(0) * n_splits elements)
2. The new lengths in output should be evenly split, and if the length is not divisible by n_splits, then order new values in descending order. (e.g. n_splits = 3, length = 5 -> 2 2 1)
3. If n_splits > some element in the array, its split elements will contain 0s. (e.g. n_splits = 3, length = 2 - > 1 1 0)

Reviewed By: bddppq, chocjy

Differential Revision: D9013119

fbshipit-source-id: 82bf3371ec08c41fc3379177f0007afc142e0d84
2018-09-10 15:40:28 -07:00
Xiaomeng Yang
ec5404a449 Add cuda version of SpatialBNOp also optimize SpatialBN on CPU (#10888)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10888

Add cuda version of SpatialBNOp also optimize SpatialBN on CPU

Reviewed By: houseroad

Differential Revision: D9512435

fbshipit-source-id: 6f828c88d56d30dc9a2f98a297a161c35cc511b1
2018-09-06 18:26:13 -07:00
Will Feng
c9e66351a7 Port all PyTorch and Caffe2 jobs to CircleCI (#11264)
Summary:
This PR adds all PyTorch and Caffe2 job configs to CircleCI.

Steps for the CircleCI mini-trial:
- [ ] Make sure this PR passes Jenkins CI and fbcode internal tests
- [x] Approve this PR
- [ ] Ask CircleCI to turn up the number of build machines
- [ ] Land this PR so that the new `.circleci/config.yml` will take effect

Several Caffe2 tests are flaky on CircleCI machines and hence skipped when running on CircleCI. A proper fix for them will be worked on after a successful mini-trial.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11264

Differential Revision: D9656793

Pulled By: yf225

fbshipit-source-id: 7832e90018f3dff7651489c04a179d6742168fe1
2018-09-05 16:28:11 -07:00
Xiaomeng Yang
b3d559cdd1 Optimize WeightedSumOp for two inputs (#11049)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11049

Optimize WeightedSumOp for two inputs

Reviewed By: houseroad

Differential Revision: D9566692

fbshipit-source-id: 9aab1f02251d386b6f7d0699ae11eeb2ea2b5b4f
2018-09-01 11:54:55 -07:00
Edward Yang
3073051a18 Revert D9554375: Support lr adaption for SparseAdam and RowWiseSparseAdam
Differential Revision:
D9554375

Original commit changeset: b88768f470ef

fbshipit-source-id: 2c103c616c8680684892c7d9085fd7bb8289d2f1
2018-08-31 07:54:31 -07:00
Chenguang Xi
0555768e0f Support lr adaption for SparseAdam and RowWiseSparseAdam (#10993)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10993

as title

Reviewed By: chocjy

Differential Revision: D9554375

fbshipit-source-id: b88768f470ef7d023dd481c6a97b91594892f422
2018-08-31 00:55:39 -07:00
Ansha Yu
9fae8fcdff framework for committed serialized tests (#10594)
Summary:
Generate serialized test inputs/outputs/backward graphs of tests inside `caffe2/python/operator_test` that call assertSerializedOperatorCheck(). Tests should be decorated with serialized_test.collect_tests.given_and_seeded to run hypothesis tests that are actually random and a single fixed seeded hypothesis tests.

To use:
1. Refactor your test to be a SerializedTestCase
1a. Decorate it with given_and_seeded
1b. Call testWithArgs in main
2. Run your test with -g to generate the output. Check it in.
3. Subsequent runs of the test without generating the output will check against the checked in test case.

Details:
Run your test with `python caffe2/python/operator_test/[your_test].py -g`
Outputs are in `caffe2/python/serialized_test/data`. The operator tests outputs are in a further subdirectory `operator_test`, to allow for other tests in the future (model zoo tests?)

Currently, we've only refactored weighted_sum_test to use this, but in the next diff, we'll refactor as many as possible. The directory structure may also change as usually there are multiple tests in a single file, so we may create more structure to account for that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10594

Reviewed By: ezyang

Differential Revision: D9370359

Pulled By: ajyu

fbshipit-source-id: 2ce77389cd8bcc0255d3bccd61569833e545ede8
2018-08-30 22:41:46 -07:00
Tommy Yu
89834dfe64 Add GPU version of HardSigmoid Op to Caffe2 (#10955)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10955

Add GPU version of HardSigmoid Op to Caffe2. Updated test file to
include GPU tests.

Reviewed By: enosair

Differential Revision: D9499353

fbshipit-source-id: fcb51902063d0c3e4b10354533a8a42cf827c545
2018-08-29 14:55:29 -07:00
Tommy Yu
92ff070b83 Add CPU version of hard sigmoid operator to caffe2 (#10837)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10837

Add CPU version of hard sigmoid operator to caffe2. The definition of
this operator can be found here:
https://github.com/onnx/onnx/blob/master/docs/Operators.md#HardSigmoid.

Reviewed By: BIT-silence

Differential Revision: D9489536

fbshipit-source-id: 67b3171ed96d5ebcc8d500d93e7827a4a9705a81
2018-08-28 14:55:49 -07:00
Yanghan Wang
f64f6eed3a move HeatmapMaxKeypointOp unittest to oss
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10859

Reviewed By: newstzpz

Differential Revision: D9498312

fbshipit-source-id: 08b8a596f774c9102286019f286ca0b74d1f5304
2018-08-27 12:56:46 -07:00
Edward Yang
deda05e59f Revert D9395814: move HeatmapMaxKeypointOp unittest to oss
Differential Revision:
D9395814

Original commit changeset: 25073eb6b143

fbshipit-source-id: 56f2b7b57e3c6361e2d78e5ba7850ea3b89e98fb
2018-08-23 06:54:29 -07:00
Yanghan Wang
9a43fc5eaa move HeatmapMaxKeypointOp unittest to oss
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10674

Reviewed By: newstzpz

Differential Revision: D9395814

fbshipit-source-id: 25073eb6b143fc1e7cbf5f887545d2b7df15c9a9
2018-08-22 19:11:10 -07:00
Wei Wen
6c75fc0aa3 Intergrating stochastic quantization to easgd to reduce communication + supporting quantization on both sides (split from D8849770) (#10644)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10644

Depends on D8493264

Reviewed By: chocjy, boryiingsu

Differential Revision: D9347706

fbshipit-source-id: 6fdcc5b61098bf47ec9391b1f009b0e6a0615842
2018-08-22 17:10:03 -07:00
Huan Gui
7832e9d564 Add a bisect percentile operator (#10563)
Summary:
Add a bisect percentile operators with lower and upper bounds for interpolation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10563

Reviewed By: chocjy

Differential Revision: D7802182

Pulled By: olittle

fbshipit-source-id: 89ebfa8b3463adc2c89235fa3dfffa187a9d5417
2018-08-20 13:14:05 -07:00
Jongsoo Park
cc53807be5 group conv with NHWC layout (#10585)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10585

group conv with NHWC layout

Reviewed By: BIT-silence

Differential Revision: D7547497

fbshipit-source-id: da0ec5a4512c15a0a0d7b79e6ce00c1f8f77f661
2018-08-17 00:39:23 -07:00
Xiaomeng Yang
87cac4c2f1 Update Im2Col related to make preparation for group conv in NHWC order. (#10439)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10439

Update Im2Col related to make preparation for group conv in NHWC order.

Reviewed By: houseroad

Differential Revision: D9285344

fbshipit-source-id: 1377b0243acb880d2ad9cf73084529a787dcb97d
2018-08-15 17:10:24 -07:00
Eli Amesefe
c5b1aa93ee Export uint8 tensors as byte string in mobile_exporter and add GivenTensorByteStringToUInt8FillOp (#10385)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10385

Pull Request resolved: https://github.com/pytorch/pytorch/pull/10354

Pull Request resolved: https://github.com/pytorch/pytorch/pull/10316

Because Protobuf encodes uint8_t tensors using a less space efficient varint uin32_t encoding, we are adding a new operator that reads back a byte string into a uint8_t tensor.

Reviewed By: harouwu

Differential Revision: D9004839

fbshipit-source-id: dfd27085c813fdeff13fee15eef4a2e7fef72845
2018-08-15 14:26:50 -07:00
Jongsoo Park
d8ff7ad6f8 generalize order switch ops for 1-3d (#10395)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10395

Order switch ops (NCHW2NHWC and NHWC2NCHW) were only supporting 2D images.
This diff generalizes them to 1D and 3D, and also add a unit test we didn't have.

Reviewed By: protonu

Differential Revision: D9261177

fbshipit-source-id: 56e7ec54c9a8fb71781ac1336f3f28cf024b4bda
2018-08-15 10:09:31 -07:00
Peizhao Zhang
ce8e8feceb Fixed a bug in box_with_nms_limit where it may produce more bounding boxes than specified. (#10390)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10390

Fixed a bug in box_with_nms_limit where it may produce more bounding boxes than specified.
* The original code first finds the threshold for the boxes at the 'detectons_per_im' position, and filters out boxes lower than the threshold.
* In some cases that there are multiple boxes have the same threshold, the op will return more boxes than 'detectons_per_im'.

Reviewed By: wat3rBro

Differential Revision: D9252726

fbshipit-source-id: 63f40829bcd275cb181692bc7547c384cee01499
2018-08-14 23:54:23 -07:00
Peizhao Zhang
520f4f6cb9 Added some unit test for box_with_nms_limit_op. (#10389)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10389

Added some unit test for box_with_nms_limit_op.

Reviewed By: wat3rBro

Differential Revision: D9237860

fbshipit-source-id: 2d65744bd387314071b68d2a0c934289fc64a731
2018-08-14 11:55:03 -07:00
Wei Wen
ffb59e5f20 adding stochastic quantization caffe2 operators (encoder and decoder in CPU are implemented. GPU mode is pending)
Summary:
This operator implements b (1/2/4/8) bit stochastic quantization of a floating
matrix in a row-wise fashion. 8/b floating values are concatenated to a byte
and returned in uint8 tensor. PR: https://github.com/pytorch/pytorch/pull/8629

Reviewed By: harouwu

Differential Revision: D8493264

fbshipit-source-id: 01f64066568a1e5a2b87c6d2134bd31cdf119c02
2018-08-13 16:39:23 -07:00
Jerry Zhang
656bb320b7 EnforceFinite test (#10143)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10143

att

Reviewed By: xianjiec

Differential Revision: D9122444

fbshipit-source-id: 010abcc1eb64f084c00890e8de5f5d422b4b8d02
2018-08-03 10:31:29 -07:00
Junjie Bai
4778afb8bb In Expand support using -1 to indicate preserving original size (#10174)
Summary:
zrphercule

https://pytorch.org/docs/stable/tensors.html#torch.Tensor.expand
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10174

Differential Revision: D9136467

Pulled By: bddppq

fbshipit-source-id: 825c489899097acda8d43706964d78a104cdf583
2018-08-02 22:09:47 -07:00
Junjie Bai
dd527db711 Skip TestConvolution.test_convolution_sync on ROCM which caused random segfaults (#10179)
Summary:
https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang3.8-rocm1.7.1-ubuntu16.04-test/4701/console

petrex ashishfarmer rohithkrn
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10179

Differential Revision: D9139657

Pulled By: bddppq

fbshipit-source-id: 9b1bb2ad185ed16fff696ce026a5ee5fcf9cbaee
2018-08-02 21:09:27 -07:00
Lin Li
4a2f3cc45f Improve lars operator by applying clipping (#9905)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9905

This diff improves lars operator in Caffe2 by applying clipping to the computed learning rate

Reviewed By: pjh5

Differential Revision: D9020606

fbshipit-source-id: b579f1d628113c09366feac9406002f1ef4bd54f
2018-08-02 11:54:28 -07:00
Anshul Jain (B*8)
56974a06b5 Revert D8909766: [caffe2] Simplify order switch operators
Differential Revision:
D8909766

Original commit changeset: 17a302d5bf4a

fbshipit-source-id: 56c75a8ce27873ed1d5f194b9d6bf0049d8f21ba
2018-07-28 18:40:13 -07:00
Igor Milyakov
607688e928 Adding reciprocal operator and a test
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9908

Differential Revision: D9035809

Pulled By: virtan

fbshipit-source-id: bce1db46fd55faeeab18a3b266d25c8beeb08df7
2018-07-27 18:24:43 -07:00
Igor Milyakov
12a1af3731 Adding conv tests with explicit algo definition
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9798

Differential Revision: D9034663

Pulled By: virtan

fbshipit-source-id: d722f25f1dd00231ccc3ad5960bbbef63af02c2d
2018-07-27 17:39:17 -07:00
root
c3fe071483 Update hip files (#9826)
Summary:
The goal of this PR is to update the hip files to reflect relevant changes in cuda source files.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9826

Differential Revision: D9032840

Pulled By: bddppq

fbshipit-source-id: 504e55c46308eebfee3c9a7beea1f294fe03470f
2018-07-27 16:54:39 -07:00
Norman Mu
a532c1a48c Fix default argument value for CTCGreedyDecoder op (#9747)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9747

Currently the ctc_greedy_decoder op initializes the `merge_repeated` argument only if it has been provided by the user. Change to initialize in all cases.

Reviewed By: houseroad

Differential Revision: D8963635

fbshipit-source-id: 18955c7c26a77d9d7f5137e4dec085252ffabfeb
2018-07-27 16:33:07 -07:00
Jongsoo Park
e7ab093d93 Simplify order switch operators (#9581)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9581

Mostly to simplify code. Should also improve performance but order switch ops
don't take much time anyway.

Reviewed By: viswanathgs

Differential Revision: D8909766

fbshipit-source-id: 17a302d5bf4aba2755d88223fc01a41fd72c5919
2018-07-26 18:24:29 -07:00
Junjie Bai
bdbbcf068a Temporarily disable test_unique on rocm since it keeps running into segfault (#9872)
Summary:
petrex

https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang3.8-rocm1.7.1-ubuntu16.04-test/3758/console
https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang3.8-rocm1.7.1-ubuntu16.04-test/3757/console
https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-clang3.8-rocm1.7.1-ubuntu16.04-test/3752/console
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9872

Reviewed By: ezyang

Differential Revision: D9013335

Pulled By: bddppq

fbshipit-source-id: 80490a0fd4a86aa9c8454378c0edddc57d135c4e
2018-07-26 08:34:00 -07:00
Junjie Bai
997f46d1e1 Disable "filter too much" health check for fc operator tests (#9865)
Summary:
makes the CI flaky
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9865

Differential Revision: D9011882

Pulled By: bddppq

fbshipit-source-id: 5124ab97d258eed7585734d64fb01e5df98abd0d
2018-07-25 21:41:14 -07:00
Siddharth Goyal
4b61760738 Add Adadelta optimizer to caffe2 (#9088)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9088

Closes https://github.com/pytorch/pytorch/pull/9088

- Added CPU/GPU implementations of Adadelta and SparseAdadelta.
- Added corresponding Python unittests

Reviewed By: BIT-silence

Differential Revision: D8712169

fbshipit-source-id: 544e99e13b230a919672a7341b3715d64597c0be
2018-07-24 20:09:21 -07:00
Kittipat Virochsiri
2b134c72e6 Add interface to provide blob types to shape&type inference (#9643)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9643

Current map interface assumes float data type, which is not always correct.

Reviewed By: kennyhorror

Differential Revision: D8455784

fbshipit-source-id: b94a31267760f7f97c15aa4b03008affc347fd10
2018-07-24 11:58:05 -07:00
Junjie Bai
7af5883860 Eanble python tests on ROCM (#9616)
Summary:
petrex
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9616

Differential Revision: D8960623

Pulled By: bddppq

fbshipit-source-id: bde93bda6230094e6bf4badd8ee79f0688ae1993
2018-07-24 11:37:58 -07:00
Xiaomeng Yang
5df3eae89e Add 1x1 specialization for conv with NCHW order (#9671)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9671

Add 1x1 specialization for conv with NCHW order

Reviewed By: houseroad

Differential Revision: D8944686

fbshipit-source-id: 94bf44f69498b1934b7dfff4c0e989342c7bb61c
2018-07-23 18:54:58 -07:00
Norman Mu
ee2cc68259 Add ctc_beam_search_decoder op for caffe2 (#9622)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9622

Implement a ctc_beam_sarch_decoder operator based on ctc_greedy_decoder.

Differential Revision: D8903100

fbshipit-source-id: 38973632cb437e5cfcb9ed3a48ed6b901c10efa3
2018-07-23 13:40:24 -07:00
Xiaomeng Yang
a01d6f01b5 Update channel_shuffle_op and transpose 2d to speed up ShuffleNet (#9525)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9525

Update channel_shuffle_op and transpose 2d to speed up ShuffleNet

Reviewed By: houseroad

Differential Revision: D8889361

fbshipit-source-id: 60196e819b6842becc53b4859b62d4419a0e2c6e
2018-07-21 12:54:33 -07:00
Zhaoheng Ni
a3a6ab60cd Fix the error in UnpackSegmentsOp when calculating the gradient with "max_length" argument (#9598)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9598

The "max_length" should be passed to UnPackSegmentsOp if "max_length" is given when calling PackSegmentsOp.

Reviewed By: jerryzh168

Differential Revision: D8919799

fbshipit-source-id: 8c97aa717b69177b8a5d5d56892817d488853840
2018-07-20 11:09:34 -07:00
Zhishuai Zhang
6557856671 Fix l2 normalization when handling zero vector (#9594)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9594

When the input vector is a zero vector, the previous GPU code will give Nan in backward. We fix this.

Reviewed By: pjh5

Differential Revision: D8849732

fbshipit-source-id: 87b1fb1ee05dfdb0d43bcbe67e36f15896fe1706
2018-07-19 14:10:03 -07:00
Xiaomeng Yang
ca3b36aa6a Add implementation for batch_moments_op (#9510)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9510

Add implementation for batch_moments_op

Reviewed By: houseroad

Differential Revision: D8587654

fbshipit-source-id: d20f52cc8e900716c1057e68c147258dfda5245b
2018-07-18 11:59:54 -07:00
Viswanath Sivakumar
9235ff53f1 Clip horizontal bounding boxes during rotated detection for backward compatibility (#9403)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9403

In BBoxTransform and GenerateProposal ops, clip_boxes makes sure the bbox fits
within the images. For rotated boxes, this doesn't always make sense as there
could be multiple ways to clip a rotated box within an image boundary.
Moreover, clipping to a horizontal box means we leave out pixels of interest
potentially. Therefore, we clip only boxes with angle almost equal to 0 (with a
specified `angle_thresh` tolerance).

Reviewed By: pjh5

Differential Revision: D8828588

fbshipit-source-id: 39c1eafdb5d39d383780faa0a47e76149145e50c
2018-07-16 20:24:49 -07:00
Mark Richardson
88146484b4 Add support for .norm() pytorch onnx export and ReduceL1/ReduceL2 caffe2 operators (#9299)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9299

Onnx has ReduceL1 and ReduceL2 operators that would facilitate this, so allow pytorch to export those and allow caffe2 to run them.

I only implemented this on CPU so far.

Reviewed By: pjh5

Differential Revision: D8757381

fbshipit-source-id: 68afc9e2f90042a70929b73ace05a499b5c670c7
2018-07-14 10:54:13 -07:00
Zhaoheng Ni
5ac8a80f8b Add BatchBucketizeOp in caffe2 (#9385)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9385

The operator transform dense features to sparse features by bucketizing. Only the feature in indices tensor will be transformed and output.

Reviewed By: bddppq

Differential Revision: D8820351

fbshipit-source-id: a66cae546b870c6b2982ac20641f198334f2e853
2018-07-13 20:39:30 -07:00
Jian Zhang
9e2f2cab94 Implementation and operator test for Wngrad optimizer (#8999)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8999

Closes https://github.com/pytorch/pytorch/pull/8999

Implemented the WRgrad optimizer operator for dense (base case as well as the case with additional output for effective learning rate and update value) and sparse case.

Reviewed By: pjh5

Differential Revision: D8627933

fbshipit-source-id: a63cde46c04bcc6b428ab5f77a4b3b2beb66c046
2018-07-13 18:11:41 -07:00
Xiaomeng Yang
bb9ff58c6d Add cudnn activation ops (#9379)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9379

Add cudnn activation ops

Reviewed By: houseroad

Differential Revision: D8818013

fbshipit-source-id: d3881c634a46578b9331da07f9fdf7e1f31d7e8a
2018-07-12 23:18:56 -07:00