onnxruntime/onnxruntime/contrib_ops/cpu
Akshay Sonawane 56ad68120e
Add support to use sequence as input ids in decoder inputs to Beam Search CUDA Op (#15232)

### Description
Currently, the Beam Search op is supported only on the CPU EP. This change
adds support for the CUDA EP.

### Motivation and Context
- For Turing models, inference was throwing a segmentation fault because a
copy in CUDA memory was failing. Beam search support was also not present
in CUDA.
2023-04-13 13:35:33 -07:00
aten_ops
attnlstm Switch GSL to MS GSL 4.0.0 (#13416) 2022-10-29 04:15:20 -07:00
bert ROCm MHA (#15279) 2023-04-11 13:20:44 +08:00
fp16 Adding FP16 Global Average Pool operator (#15324) 2023-04-05 09:38:02 -07:00
math Switch GSL to MS GSL 4.0.0 (#13416) 2022-10-29 04:15:20 -07:00
quantization prefast C26451 (#14933) 2023-03-07 15:16:50 +08:00
tensor Introduce shrunken gather operator (#15396) 2023-04-07 15:12:58 +08:00
transformers Add support to use sequence as input ids in decoder inputs to Beam Search CUDA Op (#15232) 2023-04-13 13:35:33 -07:00
utils Add Environment Variables for cuda tensor dumper (#14780) 2023-02-22 15:59:10 -08:00
activations.cc QuickGelu Fusion (#12417) 2022-10-28 18:12:07 +08:00
activations.h Cjian/c4244 round 6 (#13663) 2022-11-16 16:26:11 -05:00
cdist.cc Cjian/c4244 round 1a (#13483) 2022-11-08 23:58:05 -05:00
cdist.h
conv_transpose_with_dynamic_pads.cc
conv_transpose_with_dynamic_pads.h
cpu_contrib_kernels.cc Support additional op domains in op reduction script. (#15424) 2023-04-11 08:57:51 -07:00
cpu_contrib_kernels.h
crop.cc
crop.h Switch GSL to MS GSL 4.0.0 (#13416) 2022-10-29 04:15:20 -07:00
crop_and_resize.cc code clean (#12392) 2022-08-01 14:12:35 +08:00
crop_and_resize.h
dynamicslice.cc
element_wise_ops.cc
element_wise_ops.h
expand_dims.cc
expand_dims.h code clean (#12392) 2022-08-01 14:12:35 +08:00
fused_activation.cc
fused_activation.h
fused_conv.cc
fused_gemm.cc
fused_matmul.cc
grid_sample.cc
image_scaler.cc
image_scaler.h Switch GSL to MS GSL 4.0.0 (#13416) 2022-10-29 04:15:20 -07:00
inverse.cc Switch GSL to MS GSL 4.0.0 (#13416) 2022-10-29 04:15:20 -07:00
layer_norm.cc Add ONNX LayerNormalization(17) (#12978) 2022-09-23 09:49:27 +10:00
layer_norm.h Add ONNX LayerNormalization(17) (#12978) 2022-09-23 09:49:27 +10:00
maxpool_with_mask.cc Update kernel matching logic: decouple from op schemas and remove kernel def hashes (#12791) 2022-09-20 14:24:59 -07:00
maxpool_with_mask.h Switch GSL to MS GSL 4.0.0 (#13416) 2022-10-29 04:15:20 -07:00
mean_variance_normalization_exp.cc
murmur_hash3.cc
murmur_hash3.h
nchwc_ops.cc Enable parallel output reordering in MlasReorderOutputNchw() (#13643) 2023-02-08 10:02:54 -08:00
nchwc_ops.h
sample.cc
sample.h code clean (#12392) 2022-08-01 14:12:35 +08:00
skip_layer_norm.cc Add tool to support packing mode for BERT model (#15283) 2023-03-31 08:46:47 -07:00
skip_layer_norm.h
tokenizer.cc Switch GSL to MS GSL 4.0.0 (#13416) 2022-10-29 04:15:20 -07:00
unique.cc Cjian/c4244 round 6 (#13663) 2022-11-16 16:26:11 -05:00
unique.h
word_conv_embedding.cc Fix round 4 (#13609) 2022-11-10 00:18:51 -05:00
word_conv_embedding.h