onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-24 19:43:35 +00:00

Akshay Sonawane 56ad68120e Add support to use sequence as input ids in decoder inputs to Beam Search CUDA Op (#15232). Description: the Beam Search op was previously supported only on the CPU EP; this change adds support for the CUDA EP. Motivation and context: inference for Turing models was throwing a segmentation fault because a copy failed in CUDA memory, and beam search support was not present in CUDA.	2023-04-13 13:35:33 -07:00
aten_ops
attnlstm	Switch GSL to MS GSL 4.0.0 (#13416)	2022-10-29 04:15:20 -07:00
bert	ROCm MHA (#15279)	2023-04-11 13:20:44 +08:00
fp16	Adding FP16 Global Average Pool operator (#15324)	2023-04-05 09:38:02 -07:00
math	Switch GSL to MS GSL 4.0.0 (#13416)	2022-10-29 04:15:20 -07:00
quantization	prefast C26451 (#14933)	2023-03-07 15:16:50 +08:00
tensor	Introduce shrunken gather operator (#15396)	2023-04-07 15:12:58 +08:00
transformers	Add support to use sequence as input ids in decoder inputs to Beam Search CUDA Op (#15232)	2023-04-13 13:35:33 -07:00
utils	Add Environment Variables for cuda tensor dumper (#14780)	2023-02-22 15:59:10 -08:00
activations.cc	QuickGelu Fusion (#12417)	2022-10-28 18:12:07 +08:00
activations.h	Cjian/c4244 round 6 (#13663)	2022-11-16 16:26:11 -05:00
cdist.cc	Cjian/c4244 round 1a (#13483)	2022-11-08 23:58:05 -05:00
cdist.h
conv_transpose_with_dynamic_pads.cc
conv_transpose_with_dynamic_pads.h
cpu_contrib_kernels.cc	Support additional op domains in op reduction script. (#15424)	2023-04-11 08:57:51 -07:00
cpu_contrib_kernels.h
crop.cc
crop.h	Switch GSL to MS GSL 4.0.0 (#13416)	2022-10-29 04:15:20 -07:00
crop_and_resize.cc
crop_and_resize.h
dynamicslice.cc
element_wise_ops.cc
element_wise_ops.h
expand_dims.cc
expand_dims.h
fused_activation.cc
fused_activation.h
fused_conv.cc
fused_gemm.cc
fused_matmul.cc
grid_sample.cc
image_scaler.cc
image_scaler.h	Switch GSL to MS GSL 4.0.0 (#13416)	2022-10-29 04:15:20 -07:00
inverse.cc	Switch GSL to MS GSL 4.0.0 (#13416)	2022-10-29 04:15:20 -07:00
layer_norm.cc
layer_norm.h
maxpool_with_mask.cc
maxpool_with_mask.h	Switch GSL to MS GSL 4.0.0 (#13416)	2022-10-29 04:15:20 -07:00
mean_variance_normalization_exp.cc
murmur_hash3.cc
murmur_hash3.h
nchwc_ops.cc	Enable parallel output reordering in MlasReorderOutputNchw() (#13643)	2023-02-08 10:02:54 -08:00
nchwc_ops.h
sample.cc
sample.h
skip_layer_norm.cc	Add tool to support packing mode for BERT model (#15283)	2023-03-31 08:46:47 -07:00
skip_layer_norm.h
tokenizer.cc	Switch GSL to MS GSL 4.0.0 (#13416)	2022-10-29 04:15:20 -07:00
unique.cc	Cjian/c4244 round 6 (#13663)	2022-11-16 16:26:11 -05:00
unique.h
word_conv_embedding.cc	Fix round 4 (#13609)	2022-11-10 00:18:51 -05:00
word_conv_embedding.h	Let mlas use session thread pool (#1609)	2019-08-16 13:21:15 -07:00