onnxruntime/onnxruntime/contrib_ops/cuda
Du Li 621b3ac03a
FFT contrib ops (#3381)
* add custom op skeleton

* Adding Rfft, Irfft kernels.

* Fix a few errors:
  1. make kernel stateless to avoid race condition
  2. reclaim cufft plan

* Adding MLFloat16 support

* Adding fp16 support for fft ops.

* Adding cufft plan cache.

* adding a util func

* adding copyright info.

* Accommodating PR comments.
2020-04-14 10:12:04 -07:00
activation                           Replacing CudaAsyncBuffer with TArray to improve perf (#3303)  2020-03-24 12:13:27 -07:00
bert                                 Change mask_index input of Attention op to be optional (#3459)  2020-04-12 22:55:37 -07:00
math                                 FFT contrib ops (#3381)  2020-04-14 10:12:04 -07:00
tensor                               move all contrib ops to contrib ops namespace (#1190)  2019-06-24 10:19:01 -07:00
conv_transpose_with_dynamic_pads.cc
conv_transpose_with_dynamic_pads.h
layer_norm.cc                        Revert an op version change (#3026)  2020-02-18 09:43:18 -08:00
layer_norm.h                         Add layernorm operator (#1967)  2019-10-03 11:32:13 -07:00
layer_norm_impl.cu                   use cublasHgemm for Volta GPU (#2074)  2019-10-14 17:29:13 -07:00
layer_norm_impl.h                    Add layernorm operator (#1967)  2019-10-03 11:32:13 -07:00