onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-14 20:48:00 +00:00

Author	SHA1	Message	Date
mindest	5b9369e93c	Fix typos according to reviewdog report. (#21335) ### Description Fix typos based on reviewdog report but with some exceptions/corrections.	2024-07-22 13:37:32 -07:00
Tianlei Wu	f25cf19375	Add helper functions to dump 4d tensors in CPU for debugging (#21043) Add some helper functions to dump 4D tensors to help with debugging. Example usage: (1) Change DUMP_TENSOR_LEVEL from 0 to 2 in contrib_ops/cpu/utils/debug_macros.h to enable dumping. Unless it is enabled, the dumping code is not built into the ORT binary. (2) Add a few lines to dump tensors, like ``` DUMP_CPU_TENSOR_INIT(); DUMP_CPU_TENSOR("tensor name", tensor_data, dim0, dim1, dim2, dim3); ``` Changes: - [x] Add functions to dump 4D int32/int64/float/half tensors in CPU - [x] Add functions to dump 4D int32/int64 tensors in CUDA - [x] Change namespace (remove .transformers from namespace, and move files to utils directory)	2024-06-14 17:32:27 -07:00
Justin Chu	cf19c3697d	Run clang-format in CI (#15524) ### Description Run clang-format in CI. Formatted all C/C++ and Objective-C/C++ files. Excluded ``` 'onnxruntime/core/mlas/', 'onnxruntime/contrib_ops/cuda/bert/tensorrt_fused_multihead_attention/', ``` because they contain assembly or are data-heavy. ### Motivation and Context Coding style consistency	2023-04-18 09:26:58 -07:00
Yufeng Li	8824f812e0	optimize topk for greedysearch (#14271) Optimize top 1 computation in greedysearch. For vocabulary size 50k on A100: - batch size 1: from 220us to 10.4us. - batch size 4: from 230us to 11.5us. For example, generating 50 tokens saves about 50 * 0.2ms = 10ms.	2023-01-13 15:03:49 -08:00
Ye Wang	c9a53c9255	Some changes to Sampling Op (#14218) ### Description 1. Add an optional input to pass in a seed. 2. Add two UTs: one for top_p=0.5, another for top_p=0.01 (create greedy search result, in convert_generation.py). 3. Fix a bug in the CPU kernel. Co-authored-by: Ubuntu <wy@v100-2.0cdb2e52twzevn1i4fi45bylyg.jx.internal.cloudapp.net>	2023-01-12 14:15:26 -08:00
Ye Wang	68518a1b72	Sampling op (#13426) ### Description Sampling op for CPU and CUDA, supporting the Hugging Face case and a custom case. Co-authored-by: Ubuntu <wy@v100-2.0cdb2e52twzevn1i4fi45bylyg.jx.internal.cloudapp.net>	2022-12-22 17:34:12 -08:00

6 commits
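
The Sampling op commits above mention a seed input and unit tests at top_p=0.5 and top_p=0.01, the latter producing a greedy search result. The following is a minimal Python sketch of generic top-p (nucleus) sampling that illustrates why a very small top_p tends to collapse to the greedy choice. It is not the ONNX Runtime kernel: the function names `softmax` and `top_p_sample`, the `seed` handling, and the example logits are all assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_sample(logits, top_p, seed=None):
    """Hypothetical helper: sample a token index with top-p filtering.

    Tokens are sorted by probability and kept until their cumulative
    probability reaches top_p; the sample is drawn from the kept set.
    """
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # A fixed seed makes the draw reproducible (assumed behaviour,
    # not taken from the ORT implementation).
    rng = random.Random(seed)
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]


# Illustrative logits only; token 1 has the highest probability.
logits = [1.0, 3.0, 0.2, 2.9]
greedy = max(range(len(logits)), key=lambda i: logits[i])

# With a tiny top_p only the most likely token survives the filter,
# so the sample matches the greedy pick regardless of the seed.
tiny_p_pick = top_p_sample(logits, top_p=0.01, seed=123)

# With top_p=0.5 the two most likely tokens are kept in this example,
# so the result depends on the seed but is reproducible for a given seed.
mid_p_pick = top_p_sample(logits, top_p=0.5, seed=123)
```

In this sketch the top token alone already exceeds a cumulative probability of 0.01, which is why the small-top_p case is deterministic; the actual operator may filter and normalize differently.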