onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-11 17:48:34 +00:00

Author	SHA1	Message	Date
Dmitri Smirnov	684e900e96	Remove NETSTANDARD1.1 moniker and NETSTD1.1 specific code (#16018 ) ### Description Remove NETSTANDARD1.1 moniker and NETSTD1.1 specific code. We no longer target this platform. ### Motivation and Context The NETSTANDARD1.1 target constrains development and the modern libraries we would like to use in the code, while it is apparently no longer required by customers.	2023-05-22 17:33:46 -07:00
RandySheriffH	d35361bf9d	Fix python pipeline for AzureEP without using root (#16023 ) Fix python pipeline for AzureEP without using root, this is for 1.15. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-22 16:38:47 -07:00
satyajandhyala	22a578c06c	Use node name to uniquify the subgraph nodes. (#15855 ) ### Description Use the unique name of the function node to uniquify the subgraph node names. ### Motivation and Context Prevent duplicate node names in the graph. https://github.com/microsoft/onnxruntime/issues/15849 --------- Co-authored-by: Satya Jandhyala <sajandhy@microsoft.com>	2023-05-22 16:15:14 -07:00
zhijiang	4dc4470cc7	Fix fusion for two LayerNorm sharing same input but with different weights (#15919 ) In gpt_j_residual (https://arxiv.org/pdf/2204.06745.pdf), two LN nodes share the same input. ORT runs the CSE graph optimization before LN fusion, which modifies the LN graph pattern and thus makes LN fusion fail. ![image](https://github.com/microsoft/onnxruntime/assets/10530022/40990fd6-796f-4edf-be0b-3203e8503678)	2023-05-22 08:26:36 +08:00
zhijiang	5607a7151a	Introduce register-efficient warp-wise Softmax (#15266 ) Improve softmax forward when the number of elements to softmax over is in (1024, 2048]. Several optimizations are done in the PR: 1. Originally ORT calls softmax_block_forward when the shape is 1500, which takes 5.53ms. ORT has another implementation, softmax_warp_forward, which needs only 4.74ms, so I modified the function selection logic to call the faster version. 2. softmax_warp_forward uses registers to cache the input in fp32 mode. This consumes many registers when the data size is large and makes warp occupancy quite low, also compiler can do some of its optimizations, so the PR implements another version of softmax_warp_forward that uses shared memory instead of registers to cache the input. Also, when the for loop in the function has many iterations, disabling loop unrolling makes the kernel faster still. Perf table comparing softmax_warp_forward1 (the original version) and softmax_warp_forward2: ![image](https://user-images.githubusercontent.com/43435212/228491963-cf87e3b3-e69e-454c-bab6-7e62a25bf76b.png) In the OpenAI Whisper case, the kernel speedup is 5.53ms / 3.03ms ≈ 1.82x, i.e. a gain of about 82% (softmax_block_forward vs softmax_warp_forward2).	2023-05-22 08:26:03 +08:00
Changming Sun	0204594f90	Cleanup WASM cmake code (#15996 ) ### Description Remove the "onnxruntime_BUILD_WEBASSEMBLY" cmake option. Use `if (CMAKE_SYSTEM_NAME STREQUAL "Emscripten")` instead. It makes some code look more natural. For example, ```cmake if (CMAKE_SYSTEM_NAME STREQUAL "iOS" OR CMAKE_SYSTEM_NAME STREQUAL "Android" OR onnxruntime_BUILD_WEBASSEMBLY) ``` becomes ```cmake if (CMAKE_SYSTEM_NAME STREQUAL "iOS" OR CMAKE_SYSTEM_NAME STREQUAL "Android" OR CMAKE_SYSTEM_NAME STREQUAL "Emscripten") ```	2023-05-20 18:07:39 -07:00
Yulong Wang	e9e6bedf37	[js/webgpu] generate operator table for webgpu (#15954 ) ### Description [js/webgpu] generate operator table for webgpu	2023-05-20 12:20:41 -07:00
Yulong Wang	18f17c555d	[js/webgpu] fix buffer size when download (#15990 ) ### Description fix buffer size when download. buffer size should always be padded to multiple of 4. resolved issue described in #15796 > ![Image](https://user-images.githubusercontent.com/26504141/239093785-9417dffc-6f00-47b2-956d-402b43bdb0a9.png)	2023-05-20 00:21:18 -07:00
Patrice Vignola	85cacf315b	[DML EP] Add MultiHeadAttention and fix Attention (#15727 )	2023-05-19 15:07:14 -07:00
Yulong Wang	dc06c255b4	fix transpose optimizer on GPU EP (#15988 ) ### Description because of #15618 , the default allocator changed to device allocator, which will be GPU instead of CPU. in transpose optimizer we expect to read data from initializers so a CPU allocator is required here. this change fixes transpose optimizer on GPU EP Fixes the issue referred to in #15869, #15796	2023-05-19 14:33:45 -07:00
Hector Li	4324d2173b	[QNN EP] Enable Qnn context cache to save model initialization time (#15815 ) ### Description Enable the QNN context cache feature to save model initialization time. Provider options: qnn_context_cache_enable|1 to enable the cache feature; qnn_context_cache_path to set the cache path, which defaults to model_file.onnx.bin. ### Motivation and Context Model initialization takes a long time because of the cost of converting the ONNX model to a QNN model. QNN has a feature to serialize the QNN context to a file, so the next time the user can load the cached context and execute the graph, saving that cost. --------- Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>	2023-05-19 10:52:17 -07:00
RandySheriffH	4dfb89b3ad	Implement mutex-free spin lock for task queue (#14834 ) Implemented a "lock-free" spinlock to save the CPU usage spent on context switching. The change has been tested on the queene service of the Ads team: the lock-free version of ORT (40 threads) cuts CPU usage on gen8 (128 logical processors on 8 NUMA nodes) Windows by nearly half, from 65% to 35%. For 32 cores, the curve is flat:
Anubis, 32 vCPU, windows, hugging face models, 95 percentile E2E latency in ms:
| model | mutex (ms) | mutex-free (ms) |
| --- | --- | --- |
| alvert_base_v2 | 34.21 | 34.09 |
| bert_large_uncased | 116.27 | 117.84 |
| bart_base | 72.06 | 71.99 |
| distilgpt2 | 25.43 | 25.02 |
| vit_base_patch16_224 | 37.33 | 37.76 |
Anubis, 32 vCPU win, Linux, 1st party models, 95 percentile E2E latency in ms:
| model | mutex (ms) | mutex-free (ms) |
| --- | --- | --- |
| deepthink_v2 | 24.35 | 22.95 |
| bing_feeds | 36.96 | 36.48 |
| deep_writes | 14.46 | 14.32 |
| keypoints | 9.34 | 7.69 |
| model11 | 1.71 | 1.66 |
| model12 | 1.82 | 1.44 |
| model2 | 4.21 | 3.95 |
| model6 | 1.08 | 1.05 |
| agiencoder | 0.99 | 0.93 |
| geminet_transformer | 5.32 | 5.24 |
--------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-19 10:12:10 -07:00
cloudhan	0b0a359520	[CAPI] CAPI impl refactor (#15974 ) 1. Better options string building 2. avoid potential `new` `delete`	2023-05-19 11:40:56 +08:00
Patrice Vignola	310b22aa0c	[DML EP] Update DirectML version to 1.12.0 (#16011 )	2023-05-18 19:37:12 -07:00
PeixuanZuo	d78bbf5ef2	[ROCm] remove ROCm5.2.3, ROCm5.3, ROCm5.4 from pipeline (#16004 ) remove ROCm5.2.3, ROCm5.3, ROCm5.4 from pipeline.	2023-05-19 10:29:01 +08:00
George Wu	a74fdeb7fc	fix unused var warning in contrib_ops/cuda/bert/attention.cc (#16010 ) fix https://github.com/microsoft/onnxruntime/issues/16000	2023-05-18 17:42:08 -07:00
Zhang Lei	0f8e66d905	optimization for whisper model with decoder masked multihead attention (#15827 ) * graph tools update * cuda kernel update * operator spec update and implementation update * greedy search bug fix on wrong assumption for cross/self attention input length * avoid use of "" name in value info when loading graph, which historically appears in many models	2023-05-18 15:38:31 -07:00
Changming Sun	be6c0bb53c	Update cgmanifests/generated/cgmanifest.json to fix a syntax error (#15997 ) ### Description In PR #15797, the author manually edited the cgmanifests/generated/cgmanifest.json file and made an error that makes the file ill-formed.	2023-05-18 15:03:06 -07:00
Yufeng Li	0fed00c04d	fix topo sort in quantization tool (#16003 ) ### Description Should not set up dependent node list for empty ('') input.	2023-05-18 13:43:52 -07:00
Jian Chen	ea7b2deffd	Removing C4090 warning suppression (#15994 ) ### Description Removing C4090 warning suppression after Windows pipelines adopt VS2022	2023-05-18 10:08:05 -07:00
Ashwini Khade	0c815a95b7	android package fix (#15999 ) ### Description This PR adds the training headers to the training android packages. ### Motivation and Context Training headers need to be added as part of the training android packages, however because of the typo in the cmake these headers were not being added. This PR fixes the issue.	2023-05-18 09:21:03 -07:00
Changming Sun	842b1a3472	Revert a change in #15797 : restore the correct version of emsdk (#15995 ) ### Description Revert a change in #15797: restore the correct version of emsdk ### Motivation and Context Without change, when you build it on Windows you will see: ``` 2023-05-17 19:41:30,093 build [INFO] - Activating emsdk... 2023-05-17 19:41:30,093 util.run [INFO] - Running subprocess in 'C:\src\onnxruntime2\cmake\external\emsdk' 'C:\src\onnxruntime2\cmake\external\emsdk\emsdk.bat' activate 3.1.37 error: tool or SDK not found: '3.1.37' ```	2023-05-18 07:41:38 -07:00
Edward Chen	648bedf91a	[CoreML EP] Minor changes to allow CoreML EP to handle more nodes and models. (#15993 ) ### Description Minor changes to allow CoreML EP to handle more nodes and models. - Remove graph input dynamic shape check from coreml::GetSupportedNodes(). Each node input is still checked. - Add check for optional input in coreml::IsInputSupported(). If an input does not exist it should not be considered unsupported. ### Motivation and Context Some CoreML EP checks seem too strict now.	2023-05-18 16:24:30 +10:00
cloudhan	5a8b892bdc	[C#] Address the concern of append EP throw (#15973 )	2023-05-18 11:53:54 +08:00
Edward Chen	6d46007028	Add explicit 'set +x' before printing a vso[] command to avoid output getting parsed again with a trailing quote. (#15986 ) Here's the motivating issue: https://github.com/microsoft/azure-pipelines-tasks/issues/10331 Noticed some problems in other repos so also updating usages in ORT. We may be fine now without it, but this change adds some safeguard against future additions of 'set -x' for debugging.	2023-05-17 19:30:28 -07:00
Changming Sun	d98763473a	Change CUDA pipelines to download CUDA SDK in every build job (#15915 ) ### Description Change CUDA pipelines to download CUDA SDK in every build job	2023-05-17 17:31:51 -07:00
cloudhan	856afa49dd	[C#] Add missing rocm csharp api (#15540 )	2023-05-18 08:15:19 +08:00
Linnea May	0d6416c0e9	DML EP Bitwise operators opset 18 (#15892 ) ### Description Add DML registration for the bitwise AND, OR, XOR, and NOT operators added in opset 18. --------- Co-authored-by: Linnea May <linneamay@microsoft.com>	2023-05-17 13:27:49 -07:00
Vrajang Parikh	5abaca9d69	add maybe unused attribute to vars only used for logging (#15970 ) ### Description Add maybe_unused attribute to variables that are only used for logging ### Motivation and Context Building ORT with training using Xcode 14.3 causes a `-Wunused-but-set-variable` error, as some variables are created and exclusively used for debug logging. Adding maybe_unused suppresses warnings on unused variables when logging is disabled and fixes the local build.	2023-05-17 10:24:13 -07:00
Yi Zhang	6d43d51eb0	[Fix] No test result report while not using ctest (#15976 ) ### Description 1. Set gtest output while ctest is set to empty. 2. onnx_src in _deps shouldn't be removed because onnx_test_pytorch_converted and onnx_test_pytorch_converted need to read data from onnx/backend/test/data/.. ### Motivation and Context The test result report is important for finding flaky tests. ### To do Tests are inconsistent. If ctest_path is empty, onnx_test_pytorch_converted and onnx_test_pytorch_converted will not be executed; if it is not empty, onnxruntime_mlas_test will not be executed. `270c09a37f/tools/ci_build/build.py (L1743-L1753)`	2023-05-17 08:31:16 -07:00
Jian Chen	2881d849d4	Update Win-CPU-2021 to onnxruntime-Win-CPU-2022 (#15967 ) ### Description After this PR, the following pools still need to be updated:
| old | new | note |
| --- | --- | --- |
| onnxruntime-Win2019-GPU-dml-A10 | tbd | |
| onnxruntime-Win2019-GPU-T4 | onnxruntime-Win2022-GPU-T4 | |
| onnxruntime-Win2019-GPU-training-T4 | onnxruntime-Win2022-GPU-T4 | Same as the above because we do not have many T4 GPUs |
| onnxruntime-tensorrt8-winbuild-T4 | tbd | |
| aiinfra-dml-winbuild | tbd | |	2023-05-17 08:29:27 -07:00
Yulong Wang	084d0d0d2d	Update github issue template for 'web': add EP (#15955 ) ### Description Update github issue template for 'web': add EP	2023-05-16 23:50:33 -07:00
Patrice Vignola	0ff915eba8	[DML EP] Add frequent upload heap flushing (#15960 ) This reduces peak nonlocal memory consumption when uploading large weights for big models (e.g. LLMs), while at the same time trying to keep the GPU as busy as possible. This change could be more sophisticated, but at this stage it is the most minimal and least risky change required to support LLMs.	2023-05-16 22:35:38 -07:00
stevenlix	270c09a37f	Add timestamp logits processor for whisper (#15853 ) Enable timestamp estimation and logits processing for Whisper model.	2023-05-16 21:40:00 -07:00
kailums	f62f722c70	integrate triton into ort (#15862 ) ### Description In some scenarios, Triton-written kernels are more performant than CK or other handwritten kernels, so we implement a framework that lets onnxruntime use these Triton-written kernels. This PR integrates Triton into ORT, so that ORT can use kernels written and compiled by Triton. The main changes focus on two parts: 1. a build part that compiles Triton-written kernels and combines them into libonnxruntime_providers_rocm.so 2. a loader and launcher in C++, for loading and launching Triton-written kernels. #### Build To compile Triton-written kernels, add a script `tools/ci_build/compile_triton.py`. This script dynamically loads all kernel files, compiles them, and generates `triton_kernel_infos.a` and `triton_kernel_infos.h`. `triton_kernel_infos.a` contains all compiled kernel instructions; this file is combined into libonnxruntime_providers_rocm.so using the --whole-archive flag. `triton_kernel_infos.h` defines a const array that contains the metadata for each compiled kernel. This metadata is used for load and launch, so the header file is included by 'triton_kernel.cu', which defines the load and launch functions. Add a build flag in build.py and CMakeList.txt; when building the ROCm provider, it calls the triton_kernel build command and generates all necessary files. #### C++ Load and Launch On the C++ side, we implement load and launch functions in triton_kernel.cu and triton_kernel.h. These two files are located in `providers/cuda`, and when compiling for ROCm they are hipified, so this part supports both CUDA and ROCm. Currently, however, we only call Triton kernels in ROCm. We also implement a softmax Triton op as an example. Because many kernels are generated for the different input shapes of softmax, we use TunableOp to select the best one.	2023-05-17 09:35:28 +08:00
Sheil Kumar	a7ad859e3a	DML EP Register Split18 (#15931 ) Register Split18 for DirectML. Split13 was previously implemented. Split18 adds a new attribute called "num_outputs" that must be used mutually exclusively with the "split" input. The "num_outputs" attribute will split the tensor evenly (and handles uneven splits). To implement, the DML split tensor just needs to be overridden in the presence of the "num_outputs" attribute. --------- Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>	2023-05-16 11:58:19 -07:00
Yulong Wang	04ea561fc8	[js/webgpu] throw error when WebGPU=ON and SIMD=OFF (#15924 ) ### Description throw error when WebGPU=ON and SIMD=OFF	2023-05-16 11:05:56 -07:00
Jian Chen	780442b9f6	Change windows machine pools to use VS2022  (#15806 ) ### Description
| Old pool | New pool | Notes |
| --- | --- | --- |
| onnxruntime-Win-CPU-2019 | onnxruntime-Win-CPU-2022 | |
| onnxruntime-Win2019-CPU-training | onnxruntime-Win2022-CPU-training-AMD | |
| onnxruntime-Win2019-CPU-training-AMD | onnxruntime-Win2022-CPU-training-AMD | Same as the above |
| onnxruntime-Win2019-GPU-dml-A10 | Needs to be created | You need to create a new image for it first |
| onnxruntime-Win2019-GPU-T4 | onnxruntime-Win2022-GPU-T4 | |
| onnxruntime-Win2019-GPU-training-T4 | onnxruntime-Win2022-GPU-T4 | Same as the above because we do not have many T4 GPUs |
| onnxruntime-tensorrt8-winbuild-T4 | TBD | TBD |
| Win-CPU-2021 | onnxruntime-Win-CPU-2022 | Will do it in next PR |
| Win-CPU-2019 | onnxruntime-Win2022-Intel-CPU | Intel CPU needed for win-ci-pipeline.yml -> `stage: x64_release_dnnl` |
### Motivation and Context With VS2022 we can take advantage of the 64-bit compiler. It also has better C++20 support.	2023-05-16 10:34:34 -07:00
RandySheriffH	7faad53632	Set default option for package name and build arg options (#15958 ) Set default value for parameters in nuget-zip pipeline, and only apply the configurations when they are not "NONE". --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-16 09:07:38 -07:00
Akash	1079df6aaa	Update StableDiffusion path after cloning repo (#15948 ) ### Description Correct path to SD files in README ### Motivation and Context Small typo in path	2023-05-16 08:39:27 -07:00
Baiju Meswani	6b7181d31d	Add C# API documentation for training (and some other changes) (#15935 )	2023-05-16 03:15:24 -07:00
Prathik Rao	a0ccb95f3c	add option to load pretrained weights for T5 model (#15951 ) ### Description Adds option to pass in pretrained weights file during T5 inference onnx export. Mimics the changes made to whisper: https://github.com/microsoft/onnxruntime/pull/15759 ### Motivation and Context Required for ONNX Runtime demo being presented at BUILD.	2023-05-15 22:52:35 -07:00
PeixuanZuo	e96f10d27b	[ROCm] reduce batch size to fix CI error (#15714 ) The ROCm CI batch size test occasionally fails. Try reducing the batch size to fix it. Error log: Non-zero status code returned while running FusedMatMul node. Name:'MatMul_2914_Grad/FusedMatMul_0' Status Message: HIP error hipErrorNotFound:named symbol not found Non-zero status code returned while running Gemm node. Name:'MatMul_2891_Grad/Gemm_5' Status Message: HIP error hipErrorNotFound:named symbol not found	2023-05-16 13:10:02 +08:00
Aung T Naing	bc5018a4e1	[QNN EP] test coverage for MaxPool (#15904 ) ### Description Added MaxPool tests to show the issues with MaxPool and also provide test coverage. The following tests are currently failing: ./onnxruntime_test_all --gtest_filter=.TestMaxPool [ FAILED ] 5 tests, listed below: [ FAILED ] QnnCPUBackendTests.TestMaxPool_Ceil [ FAILED ] QnnCPUBackendTests.TestMaxPool_Large_Input2_Ceil [ FAILED ] QnnHTPBackendTests.TestMaxPool_Large_Input_HTP_u8 [ FAILED ] QnnHTPBackendTests.TestMaxPool_Large_Input2_HTP_u8 [ FAILED ] QnnHTPBackendTests.TestMaxPool_Large_Input2_Ceil_HTP_u8 ### Motivation and Context Provide test coverage for MaxPool and debug model-related issues.	2023-05-15 21:35:50 -07:00
Yulong Wang	22a9a1a630	[js/webgpu] only register webgpu backend when it's available (#15922 ) ### Description only register webgpu backend when it's available	2023-05-15 18:09:31 -07:00
cloudhan	dc383ed4ce	Basic CSharp packaging support for ROCm EP (#15535 ) This PR mainly fixes build errors when trying to build nupkg for ROCm EP. It also slightly improves the packaging logic so that developers can produce the nupkg on Linux natively.	2023-05-16 07:27:38 +08:00
Yulong Wang	204111a79e	[js/webgpu] support proxy for webgpu (#15851 ) ### Description [js/webgpu] support proxy for webgpu. fixes #15832	2023-05-15 16:23:13 -07:00
Yulong Wang	f3b8130d1a	[js/web] support npm run pull:wasm [buildID] (#15877 ) ### Description support `npm run pull:wasm [buildID]` remove `npm run pull:wasm:debug` as it can be simply replaced with `npm run pull:wasm debug`.	2023-05-15 16:19:34 -07:00
Jian Chen	00c1da5e0a	Fixing NhwcFusedConv fp16 (#15950 ) ### Description This should produce a fused Resnet50.fp16.onnx	2023-05-15 15:34:41 -07:00
kunal-vaishnavi	5b663d6797	Whisper Multitask and Multilingual (#15936 ) ### Description This PR enables Whisper's multitask format and allows a user to use Whisper for multiple tasks (e.g. transcription, translation) and for multilingual purposes (e.g. English, Spanish). This PR also removes `attention_mask` as a required input for Whisper with beam search. ### Usage Here is an example of how you can use Whisper for English transcription.
```
import numpy as np
import onnxruntime as ort
from datasets import load_dataset
from transformers import AutoConfig, AutoProcessor

model = "openai/whisper-tiny"
config = AutoConfig.from_pretrained(model)
processor = AutoProcessor.from_pretrained(model)

forced_decoder_ids = processor.get_decoder_prompt_ids(language="english", task="transcribe")
# forced_decoder_ids is of the format [(1, 50259), (2, 50359), (3, 50363)] and needs to be
# of the format [50258, 50259, 50359, 50363] where 50258 is the start token id
forced_decoder_ids = [config.decoder_start_token_id] + list(map(lambda token: token[1], forced_decoder_ids))

ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
input_features = processor(ds[0]["audio"]["array"], return_tensors="np").input_features

inputs = {
    "input_features": np.float32(input_features),
    "max_length": np.array([26], dtype=np.int32),
    "min_length": np.array([1], dtype=np.int32),
    "num_beams": np.array([2], dtype=np.int32),
    "num_return_sequences": np.array([1], dtype=np.int32),
    "length_penalty": np.array([1.0], dtype=np.float32),
    "repetition_penalty": np.array([1.0], dtype=np.float32),
    "decoder_input_ids": np.array([forced_decoder_ids], dtype=np.int32),
}

sess = ort.InferenceSession("whisper-tiny_beamsearch.onnx", providers=["CPUExecutionProvider"])
outputs = sess.run(None, inputs)

# Print tokens and decoded output
print(outputs[0][0][0])
print(processor.decode(outputs[0][0][0]))
```
If you don't want to provide specific decoder input ids or you want Whisper to predict the output language and task, you can set `forced_decoder_ids = [config.decoder_start_token_id]` instead. ### Motivation and Context As seen in the figure below from the [OpenAI Whisper paper](https://cdn.openai.com/papers/whisper.pdf), Whisper can be used for multiple tasks and languages. ![Screenshot 2023-05-12 165215](https://github.com/microsoft/onnxruntime/assets/115581922/49335e39-a79c-4f78-92e9-89b034405f65)	2023-05-15 14:36:33 -07:00
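The `forced_decoder_ids` conversion in the Whisper usage example of #15936 can be checked in isolation, without downloading a model or dataset. The following is a minimal sketch, not code from the PR: the helper name `flatten_forced_decoder_ids` is hypothetical, and the token ids are the ones quoted in the commit message's own comment.

```python
from typing import List, Sequence, Tuple


def flatten_forced_decoder_ids(
    start_token_id: int, prompt_ids: Sequence[Tuple[int, int]]
) -> List[int]:
    """Hypothetical helper: prepend the start token id and keep only the
    token id from each (position, token_id) pair."""
    return [start_token_id] + [token_id for _, token_id in prompt_ids]


# Values quoted in the comment of the usage example; 50258 is the start token id.
prompt_ids = [(1, 50259), (2, 50359), (3, 50363)]
decoder_input_ids = flatten_forced_decoder_ids(50258, prompt_ids)
print(decoder_input_ids)  # [50258, 50259, 50359, 50363]
```

Passing an empty list of pairs yields just the start token, which corresponds to the case where Whisper is left to predict the output language and task.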

8854 commits