Commit graph

18 commits

Author SHA1 Message Date
Xiaomeng Yang
08a853b02c
Add rsqrt op in caffe2 (#7154) 2018-05-01 15:06:53 -07:00
Yinghai Lu
ef8f556212
[Caffe2] Changes done inside Facebook (#6378)
* fix unit test for sqrt op

From the error logging:

[idx, grad, grad_estimate] are:
[[ 146.            0.5           0.45776367]
 [ 147.            0.5           0.45776367]

The gradient == 0.5 is correct, which means the SqrtOp and its gradient are doing the right job. (Because y = sqrt(x), loss = y^2/2 = x/2, and therefore d(loss)/dx = 1/2 = 0.5.)

The test failed because of a numerical precision problem in grad_estimate (in the unit test). This can happen because the step_size is small and float precision is limited (when there are multiple elements in the tensor, we compute the loss as sum(y^2), and the large sum swallows the low-order bits of the perturbation).
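The precision effect can be reproduced with a small NumPy sketch (illustrative only; the real test uses the Caffe2 gradient checker with hypothesis): the analytic gradient is exactly 0.5, but a float32 central-difference estimate over a many-element sum drifts away from it.

```python
import numpy as np

def loss(x):
    # loss = sum(sqrt(x)^2) / 2 = sum(x) / 2, so d(loss)/dx_i = 0.5 exactly
    y = np.sqrt(x)
    return (y * y).sum() / x.dtype.type(2)

def grad_estimate(x, i, step, dtype):
    # central-difference estimate of d(loss)/dx_i at the given precision
    xp = x.astype(dtype); xp[i] += step
    xm = x.astype(dtype); xm[i] -= step
    return float(loss(xp) - loss(xm)) / (2.0 * step)

x = np.full(200, 1.0)
# float64 recovers 0.5 almost exactly; float32 with a small step drifts,
# because sum(y^2) over 200 elements quantizes away the perturbation.
g64 = grad_estimate(x, 0, 1e-3, np.float64)
g32 = grad_estimate(x, 0, 1e-4, np.float32)
```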

This diff
- increases the step size, and also moves the test cases further away from 0 (where the gradient of sqrt(x) is not well defined) to be safe :)
- also cleans up and merges the test cases for in-place vs. non-in-place

Tested with:

`CAFFE2_HYPOTHESIS_PROFILE=debug ai_bt caffe2/caffe2/python/operator_test:elementwise_ops_test -- "test_sqrt"`

* CompositeReader & CompositeReaderBuilder

A new type of reader that glues multiple readers together.

* Back out "Revert D7394363: [GanH]: Log D Trick for Cross Entropy with Sigmoid"

Original commit changeset: 9325a4356dbe

* [dai][WIP] convert params to int8 on ps before sending to trainer

Add float->uint8 conversion in addition to float->fp16 conversion in model_saver.

* [easy] improve unit test for sparse length sum ops

as desc.

#accept2ship

* Update GitHub upstream to 771fcb3455

* move sparse hash unique ops to OOS and add unit tests

- move the SparseHash version to OSS, since 'sparsehash' is already a dependency of Caffe2 OSS: https://fburl.com/arssw4n1
- The 'SparseHash' engine is also being used in OSS, so the SparseHash version should live in OSS to reduce confusion: https://fburl.com/o5ea7ah2

- fix the CUDA UniqueOp for the case when the batch is empty.
- add unit tests

* group_norm_op for caffe2

This is the cuda op for Group Normalization (GN): https://arxiv.org/abs/1803.08494

This code implements GN in one op that computes Y = gamma * (X - mu) / sigma + beta, together with its gradients. It is expected to have minimal memory consumption (similar to the BN op), avoiding the extra blobs that would be created if GN were implemented as several ops (e.g., reshape, norm_mean/std, affine_channel).
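As a reference for the math the fused op computes, the unfused forward pass can be sketched in NumPy (function name and NCHW layout are illustrative, not the Caffe2 API):

```python
import numpy as np

def group_norm(x, gamma, beta, groups, eps=1e-5):
    """Group Normalization forward pass, NCHW layout: normalize over each
    group's channels and spatial positions, then scale/shift per channel."""
    n, c, h, w = x.shape
    g = x.reshape(n, groups, c // groups, h, w)
    mu = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    y = ((g - mu) / np.sqrt(var + eps)).reshape(n, c, h, w)
    return gamma.reshape(1, c, 1, 1) * y + beta.reshape(1, c, 1, 1)
```

The fused GN op produces the same Y in a single kernel, without materializing the reshaped or normalized intermediates as separate blobs.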

* Resubmit D7405233: disappeared in D7464958

The OSS publish caused the op to go missing -- however, the test was still there.

* [c2] add sparse hash engine for cuda unique op

The SparseHash version of UniqueOp copies the input tensor to the CPU, uses a sparse hash map to compute the unique output, and then copies the result back to the GPU.
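The CPU-side step can be sketched in plain Python (the real op uses google::sparse_hash_map in C++; names here are illustrative):

```python
def hash_unique(ids):
    """First-occurrence-order unique plus a remapping of each input position,
    mirroring what the sparse-hash-backed UniqueOp computes on the CPU side."""
    seen = {}        # value -> index in the unique output
    unique = []
    remapping = []
    for v in ids:
        if v not in seen:
            seen[v] = len(unique)
            unique.append(v)
        remapping.append(seen[v])
    return unique, remapping
```

An empty input simply yields empty outputs, which is why the host round-trip also sidesteps empty-kernel launches on the GPU.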

* [dper][gpu] enable unit testing gpu trainer for sparse nn

Debug the GPU trainer using mock data in a unit test.

This makes it easier to develop the GPU trainer for new models.

* Reuse Gloo context for Synchronize() calls

Previously we were creating (and leaking) the Gloo context on each call to Synchronize(). Now we run the common world op and create the barrier net only once, then run the barrier net on each Synchronize() call. Since the timeout is associated with the Gloo context, assert that the timeout is fixed instead of trying to handle the complexity of multiple timeouts (and associated contexts).
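The reuse pattern reduces to lazy one-time initialization. A minimal sketch, where `create_context` and `make_barrier_net` stand in for the Gloo common-world op and barrier net (illustrative names, not the Caffe2 API):

```python
class BarrierSync:
    """Create the Gloo context and barrier net once, reuse them per call."""

    def __init__(self, create_context, make_barrier_net, timeout):
        self._create_context = create_context
        self._make_barrier_net = make_barrier_net
        self._timeout = timeout
        self._barrier_net = None

    def synchronize(self, timeout):
        # The timeout lives in the context, so it must stay fixed across calls.
        assert timeout == self._timeout, "Synchronize() timeout must not change"
        if self._barrier_net is None:
            ctx = self._create_context(self._timeout)        # common world op: run once
            self._barrier_net = self._make_barrier_net(ctx)  # barrier net: built once
        self._barrier_net()  # reused on every call, so no context is leaked
```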

* [GanH/WGAN][1/n]: add FC param clipping

as titled

* [mobile] minimizing changes between caffe2_benchmark and speed_benchmark

* [GanH]: enable diagnose within model

instead of looking up blob names, enable diagnosis directly inside the model

* Add `net_transformer_fun` option to DPM

This callback allows various transformations to be made to the
model after gradient operators have been added. The immediate motivation for
this is to allow transformations such as "checkpoint-and-recompute", which
trade off memory for additional compute.
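In Python terms the hook reduces to the pattern below (toy list-of-ops model; `checkpoint_transform` is a hypothetical example transformer, not DPM code):

```python
def build_with_transform(forward_ops, add_gradient_ops, net_transformer_fun=None):
    """Apply an optional net transformation after gradient ops are added."""
    ops = list(forward_ops) + add_gradient_ops(forward_ops)
    if net_transformer_fun is not None:
        ops = net_transformer_fun(ops)  # e.g. rewrite for checkpoint-and-recompute
    return ops

# Hypothetical transformer: drop cached activations, recompute them in backward
def checkpoint_transform(ops):
    return [op for op in ops if op != "cache_activation"] + ["recompute_activation"]
```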

Adding several callbacks like this has made DPM's API less than ideal at this
stage. However, I could not find any reasonable alternative.

* [DT] [33/n] Compile flow task groups

Task groups need to be compiled in order to pickle the object in FBLearner. I also changed the Job's compile function, since creating a new object is not necessary.

* Initial commit for sparse_normalize vectorization and benchmark

* [GanH]: LB Calibration for JSD

as titled

* Tracing event in async executor

Adding event tracing through the TRACE_EVENT macro in the async executor

* [Resubmit] D7409751 Reseting book-keeping blobs when the reservoir is reset

D7409751 got lost in D7464958

* Visualizing realtime weights values

We want to visualize the weight values as the optimizer iterates. This diff supports visualizing the weights at an assigned index.
Currently, we assume the blob to be 2-dimensional.

* [GanH][Easy]: Fix Homotopy Weighting

Apparently, there was a bug in the homotopy weight (alpha, beta) update.

* [c2] move sparse hash unique op out of oss

so that OSS does not need to depend on the Google hash map.

* Get rid of std::round as it's not supported on Android

* Revert changes on setup.py

* Skip shaky test on Dataio

* fix
2018-04-10 21:11:43 -07:00
Xianjie Chen
6ed9a0c3f2 fix cuda elementwise ops for empty batch
CUDA will fail to launch a kernel over an empty (zero-element) batch.
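The shape of the fix, sketched in Python terms (the actual fix is in the C++/CUDA op; the function name is illustrative): return an empty output before any kernel launch when the batch has zero elements.

```python
import numpy as np

def elementwise_sqr(x):
    # An empty batch must short-circuit: launching a CUDA kernel with a
    # zero-size grid fails, so the op returns an empty output tensor early.
    if x.size == 0:
        return np.empty_like(x)
    return x * x
```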
2018-03-27 18:10:39 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Joseph Spisak
cebf44e960 Element-wise tests now use or are seeded with hypothesis (#2181)
* Updated all element-wise tests to use hypothesis testing or at least use hypothesis seeds

* Updated tests to add seed to sqr function
2018-03-08 07:51:45 -08:00
Dmytro Dzhulgakov
7d141d4243 Changes done internally at Facebook (#2154)
f679c644e332 dzhulgakov [caffe2] Sync script - add ability to handle rebase conflicts
51729b061a15 dzhulgakov [caffe2] Changes done on GitHub
2018-03-06 01:23:54 -08:00
Joseph Spisak
11a736b682 Sqrt op (#2101)
* First attempt on sqrt op

* Adding the Sqrt op along with the test cases

* Made changes per @Yangqing's questions re: tensor format and used hypothesis to generate input tensor
2018-03-02 16:19:45 -08:00
Orion Reblitz-Richardson
ccea6924a2 Implementing Pow operator (this merges existing pow with a scalar and new pow with a tensor exponent). Second Try.
The old pow operator has been deleted from math_ops.cc, math_ops.cu and math_ops.h, while the new operator supporting scalar and tensor exponents has been added in pow_op.cc, pow_op.h and elementwise_op.cu.
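The merged operator's semantics can be sketched in NumPy (illustrative, not the Caffe2 signatures): one op accepts either a scalar exponent or a tensor exponent broadcastable against the base.

```python
import numpy as np

def pow_op(x, exponent):
    # `exponent` may be a Python scalar or a tensor broadcastable against x
    return np.power(x, exponent)

def pow_gradient_wrt_x(x, exponent, dy):
    # chain rule: dL/dx = dL/dy * exponent * x^(exponent - 1)
    return dy * exponent * np.power(x, exponent - 1)
```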
2018-02-21 18:31:45 -08:00
Pieter Noordhuis
52fa742c51 Revert D6893040: Implementing Pow operator (this merges existing pow with a scalar and new pow with a tensor exponent).
Summary:
This reverts commit 30f614beea6f859fee25ce4f85573142885dde45

bypass-lint

An infra SEV is better than not reverting this diff.
If you copy this password, see you in SEV Review!
cause_a_sev_many_files

Differential Revision:
D6893040

Original commit changeset: 30f614beea6f

fbshipit-source-id: 5e98a24699088283f864efe31234874bdacbe3c3
2018-02-14 10:34:08 -08:00
Maxim Naumov
f7cc8e8822 Implementing Pow operator (this merges existing pow with a scalar and new pow with a tensor exponent).
Summary: The old pow operator has been deleted from math_ops.cc, math_ops.cu and math_ops.h, while the new operator supporting scalar and tensor exponents has been added in pow_op.cc, pow_op.h and elementwise_op.cu.

Reviewed By: houseroad

Differential Revision: D6893040

fbshipit-source-id: 30f614beea6f859fee25ce4f85573142885dde45
2018-02-13 17:46:35 -08:00
Manoj Krishnan
6d32e36682 Caffe2 Operator: GPU implementation of Swish Activation
Summary: GPU (CUDA) implementation of the Swish activation function in Caffe2.

Reviewed By: Yangqing, xianjiec

Differential Revision: D6656907

fbshipit-source-id: f5f2c667055abf679728d2b5d43998895ddec708
2018-01-05 12:04:25 -08:00
Pieter Noordhuis
348e29c49b Don't run CUDA tests for ops without CUDA implementation
Summary: Closes https://github.com/caffe2/caffe2/pull/1434

Reviewed By: houseroad, ilia-cher

Differential Revision: D6272614

Pulled By: pietern

fbshipit-source-id: 7b998b08ec02b03f88a6fd24a949b0d199b2aa37
2017-11-08 10:28:02 -08:00
Badri Narayan Bhaskar
25bfffeafe Swish Activation Function
Summary:
Swish: A self-gated activation function.
https://arxiv.org/pdf/1710.05941.pdf
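For reference, the self-gated form from the paper is swish(x) = x * sigmoid(x). A minimal NumPy sketch of the forward and its analytic gradient (illustrative, not the Caffe2 op code):

```python
import numpy as np

def swish(x):
    # swish(x) = x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def swish_grad(x):
    # With s = sigmoid(x): d/dx [x * s] = s + x * s * (1 - s)
    s = 1.0 / (1.0 + np.exp(-x))
    return s + x * s * (1.0 - s)
```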

Reviewed By: ajtulloch

Differential Revision: D6100424

fbshipit-source-id: 0103d6d82e9ffb50106c98a8785e62b8808e9af1
2017-10-20 10:37:43 -07:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
James Reed
85f1d947dd Vectorize SigmoidOp on CPU
Summary: I noticed that Sigmoid was taking an inordinate amount of time in our NMT benchmark, so I looked at the implementation and it didn't seem optimal. I replaced the implementation with an Eigen version so that when the Eigen update goes through, we will get proper AVX(2) vectorization.
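The idea, translated into NumPy terms (the actual change uses Eigen in C++; this only illustrates the vectorization concept): compute sigmoid over the whole array in one pass instead of an element-by-element loop, with a sign split for numerical stability.

```python
import numpy as np

def sigmoid(x):
    # One vectorized pass over the array; the split avoids overflow in exp
    # for large-magnitude negative inputs.
    out = np.empty_like(x, dtype=np.float64)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    ex = np.exp(x[~pos])
    out[~pos] = ex / (1.0 + ex)
    return out
```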

Differential Revision: D5082464

fbshipit-source-id: aa951f7d730fc05198f7dd04076ec58d471b74c8
2017-05-17 20:33:36 -07:00
Aapo Kyrola
8fab453863 Sqr op and gradient
Summary: Due to popular demand, added an op to compute element-wise square + gradient for it (just for the fun of it).
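The op and its gradient amount to the following (a NumPy sketch of the math, not the Caffe2 implementation):

```python
import numpy as np

def sqr(x):
    # element-wise square
    return x * x

def sqr_gradient(x, dy):
    # chain rule: dL/dx = dL/dy * d(x^2)/dx = dy * 2x
    return 2.0 * x * dy
```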

Reviewed By: Yangqing

Differential Revision: D4664797

fbshipit-source-id: 0a29c7c249fdc72f51412bebd6ae352a7801cf05
2017-03-07 03:03:07 -08:00
Aapo Kyrola
8caa7cec8d CUDA version of Log
Summary: As in the title. Simple registration issue.

Reviewed By: Yangqing, jhcross

Differential Revision: D4655691

fbshipit-source-id: 661e4d5f1226ec05e099c84f4454aa07c6be4449
2017-03-04 00:32:03 -08:00
Aapo Kyrola
4f1db36cff add CUDA gradient for Div
Summary: DivOp was missing a CUDA gradient, so I implemented it. Also added an operator test.

Differential Revision: D4396638

fbshipit-source-id: 9949e47aa3735bb418a0db003e2b2f4896056a71
2017-01-09 21:59:23 -08:00