onnxruntime/onnxruntime/test
kunal-vaishnavi 44d8ad93b2
Whisper Timestamps and Temperature (#19509)
### Description
This PR updates how the Whisper model with beam search is exported and
run. It makes the following changes.

- Adds temperature as a graph input to the exported model
- Fixes the token ids by adding them as attributes to
`WhisperBeamSearch`
- Fixes the timestamps test cases so that they pass
- Fixes a bug with invoking `torch.onnx.export`
- Cleans up the Whisper scripts and groups the arguments in
`convert_to_onnx.py`
- Adds a `requirements.txt` file to specify package dependencies
- Adds `whisper-large-v3` to the list of pretrained models
- Fixes a bug with missing cross-attention KV cache inputs in the
decoder subgraph
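
As a rough illustration of what a temperature input controls, the sketch
below applies the conventional definition of temperature (dividing the
logits by it before the softmax). It is not taken from the
`WhisperBeamSearch` implementation; the function name
`softmax_with_temperature` is hypothetical and the exact way the op
applies temperature is an assumption here.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Hypothetical helper: softmax over logits scaled by 1/temperature.

    Assumes the conventional definition of temperature; the actual
    WhisperBeamSearch op may differ in detail.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]
sharp = softmax_with_temperature(logits, temperature=0.5)
neutral = softmax_with_temperature(logits, temperature=1.0)
flat = softmax_with_temperature(logits, temperature=2.0)
```

Under this definition, a temperature below 1 concentrates probability on
the highest-scoring token, while a temperature above 1 flattens the
distribution.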

### Motivation and Context

- This is a follow-up to [this
PR](https://github.com/microsoft/onnxruntime/pull/19188).
- The incorrect token ids in the timestamps processor were first noticed
during [this PR
review](https://github.com/microsoft/onnxruntime/pull/17500#discussion_r1333520007).
When they were originally added in [this
PR](https://github.com/microsoft/onnxruntime/pull/15853), the offsets
were constant across the Whisper model sizes. When comparing
the new `whisper-large-v3` variant, the English-only variants (e.g.
`whisper-tiny.en`), and the original variants (e.g. `whisper-tiny`),
both the values and the offsets differ. Therefore, it is easier to set
the token ids as attributes to `WhisperBeamSearch` when exporting to
ensure the right values are used in the timestamps processor.
- The Hugging Face API for returning timestamps and the expected outputs
from the PyTorch model have both changed.
- The fix for `torch.onnx.export` is a follow-up to [this PR
review](https://github.com/microsoft/onnxruntime/pull/17179#issuecomment-1683001470).
- The argument grouping is a follow-up to [this PR
review](https://github.com/microsoft/onnxruntime/pull/17500#discussion_r1333521721).
- Specific package versions are needed to run the Whisper scripts and
the `requirements.txt` file ensures that these versions are installed.
- The `whisper-large-v3` variant has been released and should be in the list
of official pretrained models.
- After the changes from [this
PR](https://github.com/microsoft/onnxruntime/pull/17316), the exported
model is not loading in an ORT inference session because the
cross-attention KV cache inputs are missing in the decoder subgraph.
2024-02-16 15:21:43 -08:00
common ORT ETW dynamic logging that improves ORT diagnosability & performance (#18882) 2024-01-11 12:43:27 -08:00
contrib_ops Speed Up DecoderMaskedSelfAttentionTest (#19531) 2024-02-15 20:22:36 -08:00
custom_op_registration
debug_node_inputs_outputs
framework Disable CPU EP's allocator's arena when address sanitizer is enabled (#19485) 2024-02-12 09:39:49 -08:00
fuzzing
global_thread_pools Replace T4 to A10 in Linux GPU workflow (#19205) 2024-01-23 10:49:24 -08:00
ir Make session configuration options available to kernels via OpKernelInfo (#18897) 2024-01-13 10:02:43 +10:00
logging_apis Remove two tests from test_logging_apis.cc (#19100) 2024-01-12 09:26:28 -08:00
mlas [MLAS AArch64] SQNBitGemm optimization (#19272) 2024-01-30 14:29:12 -08:00
onnx [QNN EP] Expose device-level session options (#19212) 2024-01-22 12:47:42 -08:00
opaque_api
optimizer add GatherSliceToSplitFusion and Unittest (#19218) 2024-02-14 15:07:56 -08:00
perftest Add option to skip session run in perf_test tool (#19501) 2024-02-12 19:11:40 -08:00
platform Add capturestate / rundown ETW support logging for session and provider options (#19397) 2024-02-08 11:28:05 -08:00
proto
providers QNN EP: Fuse DQ -> Q sequences into a QNN Convert op (#19511) 2024-02-16 14:36:05 -08:00
python Whisper Timestamps and Temperature (#19509) 2024-02-16 15:21:43 -08:00
quantization Disable CPU EP's allocator's arena when address sanitizer is enabled (#19485) 2024-02-12 09:39:49 -08:00
shared_lib [ROCm] enable hipGraph (#18382) 2024-01-23 11:17:04 +08:00
testdata Bump ruff linter to 0.2.1 (#19471) 2024-02-08 16:08:27 -08:00
unittest_main Disable a few tests for wasm build (#19316) 2024-01-30 08:16:57 -08:00
util Add initial support for CoreML ML Program to the CoreML EP. (#19347) 2024-02-15 08:46:03 +10:00
wasm Bump follow-redirects from 1.15.2 to 1.15.4 in /onnxruntime/test/wasm (#19069) 2024-01-11 16:13:22 -08:00
win_getopt
xctest