onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-18 21:21:17 +00:00

kunal-vaishnavi 44d8ad93b2 Whisper Timestamps and Temperature (#19509)

### Description

This PR updates exporting and running the Whisper model with beam search by adding the following.

- Adds temperature as a graph input to the exported model
- Fixes the token ids by adding them as attributes to `WhisperBeamSearch`
- Fixes the timestamps test cases so they pass now
- Fixes a bug with invoking `torch.onnx.export`
- Cleans up the Whisper scripts and groups the arguments in `convert_to_onnx.py`
- Adds a `requirements.txt` file to specify package dependencies
- Adds `whisper-large-v3` to list of pretrained models
- Fixes a bug with missing cross-attention KV cache inputs in the decoder subgraph

### Motivation and Context

- This is a follow-up to [this PR](https://github.com/microsoft/onnxruntime/pull/19188).
- The incorrect token ids in the timestamps processor were first noticed during [this PR review](https://github.com/microsoft/onnxruntime/pull/17500#discussion_r1333520007). When they were originally added in [this PR](https://github.com/microsoft/onnxruntime/pull/15853), the offsets were previously constant across the Whisper model sizes. When comparing the new `whisper-large-v3` variant, the English-only variants (e.g. `whisper-tiny.en`), and the original variants (e.g. `whisper-tiny`), both the values and the offsets differ. Therefore, it is easier to set the token ids as attributes to `WhisperBeamSearch` when exporting to ensure the right values are used in the timestamps processor.
- The Hugging Face API for returning timestamps and the expected outputs from the PyTorch model have both changed.
- The fix for `torch.onnx.export` is a follow-up to [this PR review](https://github.com/microsoft/onnxruntime/pull/17179#issuecomment-1683001470).
- The argument grouping is a follow-up to [this PR review](https://github.com/microsoft/onnxruntime/pull/17500#discussion_r1333521721).
- Specific package versions are needed to run the Whisper scripts and the `requirements.txt` file ensures that these versions are installed.
- The `whisper-large-v3` variant is released and should be in the list of official pretrained models.
- After the changes from [this PR](https://github.com/microsoft/onnxruntime/pull/17316), the exported model is not loading in an ORT inference session because the cross-attention KV cache inputs are missing in the decoder subgraph.

2024-02-16 15:21:43 -08:00
..
common	ORT ETW dynamic logging that improves ORT diagnosability & performance (#18882)	2024-01-11 12:43:27 -08:00
contrib_ops	Speed Up DecoderMaskedSelfAttentionTest (#19531)	2024-02-15 20:22:36 -08:00
custom_op_registration
debug_node_inputs_outputs
framework	Disable CPU EP's allocator's arena when address sanitizer is enabled (#19485)	2024-02-12 09:39:49 -08:00
fuzzing
global_thread_pools	Replace T4 to A10 in Linux GPU workflow (#19205)	2024-01-23 10:49:24 -08:00
ir	Make session configuration options available to kernels via OpKernelInfo (#18897)	2024-01-13 10:02:43 +10:00
logging_apis	Remove two tests from test_logging_apis.cc (#19100)	2024-01-12 09:26:28 -08:00
mlas	[MLAS AArch64] SQNBitGemm optimization (#19272)	2024-01-30 14:29:12 -08:00
onnx	[QNN EP] Expose device-level session options (#19212)	2024-01-22 12:47:42 -08:00
opaque_api
optimizer	add GatherSliceToSplitFusion and Unittest (#19218)	2024-02-14 15:07:56 -08:00
perftest	Add option to skip session run in perf_test tool (#19501)	2024-02-12 19:11:40 -08:00
platform	Add capturestate / rundown ETW support logging for session and provider options (#19397)	2024-02-08 11:28:05 -08:00
proto
providers	QNN EP: Fuse DQ -> Q sequences into a QNN Convert op (#19511)	2024-02-16 14:36:05 -08:00
python	Whisper Timestamps and Temperature (#19509)	2024-02-16 15:21:43 -08:00
quantization	Disable CPU EP's allocator's arena when address sanitizer is enabled (#19485)	2024-02-12 09:39:49 -08:00
shared_lib	[ROCm] enable hipGraph (#18382)	2024-01-23 11:17:04 +08:00
testdata	Bump ruff linter to 0.2.1 (#19471)	2024-02-08 16:08:27 -08:00
unittest_main	Disable a few tests for wasm build (#19316)	2024-01-30 08:16:57 -08:00
util	Add initial support for CoreML ML Program to the CoreML EP. (#19347)	2024-02-15 08:46:03 +10:00
wasm	Bump follow-redirects from 1.15.2 to 1.15.4 in /onnxruntime/test/wasm (#19069)	2024-01-11 16:13:22 -08:00
win_getopt
xctest