onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-02 03:55:34 +00:00

Author	SHA1	Message	Date
RandySheriffH	d35361bf9d	Fix python pipeline for AzureEP without using root (#16023 ) Fix python pipeline for AzureEP without using root, this is for 1.15. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-22 16:38:47 -07:00
Changming Sun	0204594f90	Cleanup WASM cmake code (#15996 ) ### Description Remove the "onnxruntime_BUILD_WEBASSEMBLY" cmake option. Use `if (CMAKE_SYSTEM_NAME STREQUAL "Emscripten")` instead. It makes some code look more natural. For example, ```cmake if (CMAKE_SYSTEM_NAME STREQUAL "iOS" OR CMAKE_SYSTEM_NAME STREQUAL "Android" OR onnxruntime_BUILD_WEBASSEMBLY) ``` becomes ```cmake if (CMAKE_SYSTEM_NAME STREQUAL "iOS" OR CMAKE_SYSTEM_NAME STREQUAL "Android" OR CMAKE_SYSTEM_NAME STREQUAL "Emscripten") ```	2023-05-20 18:07:39 -07:00
Hector Li	4324d2173b	[QNN EP] Enable Qnn context cache to save model initialization time (#15815 ) ### Description Enable Qnn Context cache feature to save model initialization time Provider options: qnn_context_cache_enable\|1 to enable the cache feature qnn_context_cache_path to set the cache path. It is set to model_file.onnx.bin by default. ### Motivation and Context Model initialization time takes long because the cost of conversion from Onnx model to Qnn model. Qnn have feature to serialize the Qnn context to file, then next time user can load it from the cache context and execute the graph to save the cost. --------- Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>	2023-05-19 10:52:17 -07:00
RandySheriffH	4dfb89b3ad	Implement mutex-free spin lock for task queue (#14834 ) Implemented "lock-free" spinlock to save CPU usage on context switching. The change has been tested on queene service of Ads team, the lock-free version of ort (40 threads) saves CPU usage on gen8 (128 logical processors on 8 numa nodes) windows by nearly half, from 65% to 35%. For 32 cores, the curve is flat: Anubis, 32 vCPU, windows, hugging face models, 95 percentile E2E latency in ms: model \| mutex(ms) \| mutex-free --- \| --- \| --- alvert_base_v2 \| 34.21 \| 34.09 bert_large_uncased \| 116.27\| 117.84 bart_base \| 72.06 \| 71.99 distilgpt2 \| 25.43 \| 25.02 vit_base_patch16_224 \| 37.33 \| 37.76 Anubis, 32 vCPU win, Linux, 1st party models, 95 percentile E2E latency in ms: model \| mutex(ms) \| mutex-free --- \| --- \| --- deepthink_v2 \| 24.35 \| 22.95 bing_feeds \| 36.96 \| 36.48 deep_writes \| 14.46 \| 14.32 keypoints \| 9.34 \| 7.69 model11 \| 1.71 \| 1.66 model12 \| 1.82 \| 1.44 model2 \| 4.21 \| 3.95 model6 \| 1.08 \| 1.05 agiencoder \| 0.99 \| 0.93 geminet_transformer \| 5.32 \| 5.24 --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-19 10:12:10 -07:00
Patrice Vignola	310b22aa0c	[DML EP] Update DirectML version to 1.12.0 (#16011 )	2023-05-18 19:37:12 -07:00
PeixuanZuo	d78bbf5ef2	[ROCm] remove ROCm5.2.3, ROCm5.3, ROCm5.4 from pipeline (#16004 ) remove ROCm5.2.3, ROCm5.3, ROCm5.4 from pipeline.	2023-05-19 10:29:01 +08:00
Edward Chen	6d46007028	Add explicit 'set +x' before printing a vso[] command to avoid output getting parsed again with a trailing quote. (#15986 ) Here's the motivating issue: https://github.com/microsoft/azure-pipelines-tasks/issues/10331 Noticed some problems in other repos so also updating usages in ORT. We may be fine now without it, but this change adds some safeguard against future additions of 'set -x' for debugging.	2023-05-17 19:30:28 -07:00
Changming Sun	d98763473a	Change CUDA pipelines to download CUDA SDK in every build job (#15915 ) ### Description Change CUDA pipelines to download CUDA SDK in every build job ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-05-17 17:31:51 -07:00
cloudhan	856afa49dd	[C#] Add missing rocm csharp api (#15540 )	2023-05-18 08:15:19 +08:00
Yi Zhang	6d43d51eb0	[Fix] No test result report while not using ctest (#15976 ) ### Description 1. Set gtest output while ctest is set to empty. 2. onnx_src in _deps shouldn't be removed because onnx_test_pytorch_converted and onnx_test_pytorch_converted need to read data from onnx/backend/test/data/.. ### Motivation and Context The test result report is important for finding flaky tests. ### To do Tests are inconsistent. If ctest_path is empty, onnx_test_pytorch_converted and onnx_test_pytorch_converted will not be executed; if it is not empty, onnxruntime_mlas_test will not be executed. `270c09a37f/tools/ci_build/build.py (L1743-L1753)`	2023-05-17 08:31:16 -07:00
Jian Chen	2881d849d4	Update Win-CPU-2021 to onnxruntime-Win-CPU-2022 (#15967 ) ### Description After this PR, the following pools still need to be updated. old\|new\|note ---\|---\|--- onnxruntime-Win2019-GPU-dml-A10\|tbd\| onnxruntime-Win2019-GPU-T4\|onnxruntime-Win2022-GPU-T4\| onnxruntime-Win2019-GPU-training-T4\|onnxruntime-Win2022-GPU-T4\|Same as the above because we do not have many T4 GPUs onnxruntime-tensorrt8-winbuild-T4\|tbd\| aiinfra-dml-winbuild\|tbd\| ### Motivation and Context	2023-05-17 08:29:27 -07:00
kailums	f62f722c70	integrate triton into ort (#15862 ) ### Description In some scenarios, triton-written kernels are more performant than CK or other handwritten kernels, so we implement a framework that lets onnxruntime use these triton-written kernels. This PR integrates triton into ort, so that ort can use kernels that are written and compiled by triton. The main changes focus on two parts: 1. a build part to compile triton-written kernels and combine them into libonnxruntime_providers_rocm.so 2. a loader and launcher in C++, for loading and launching triton-written kernels. #### Build To compile triton-written kernels, add a script `tools/ci_build/compile_triton.py`. This script dynamically loads all kernel files, compiles them, and generates `triton_kernel_infos.a` and `triton_kernel_infos.h`. `triton_kernel_infos.a` contains all compiled kernel instructions; this file is combined into libonnxruntime_providers_rocm.so using the --whole-archive flag. `triton_kernel_infos.h` defines a const array that contains the metadata for each compiled kernel. This metadata is used for load and launch, so this header file is included by 'triton_kernel.cu', which defines the load and launch functions. Add a build flag in build.py and CMakeList.txt; when building the rocm provider, it will call the triton_kernel build command and generate all necessary files. #### C++ Load and Launch On the C++ side, we implement load and launch functions in triton_kernel.cu and triton_kernel.h. These two files are located in `providers/cuda`, and when compiling rocm they will be hipified, so this part supports both cuda and rocm. Currently, however, we only call triton kernels in rocm. We also implement a softmax triton op as an example. Because many kernels are generated for the different input shapes of softmax, we use TunableOp to select the best one. ### Motivation and Context	2023-05-17 09:35:28 +08:00
Jian Chen	780442b9f6	Change windows machine pools to use VS2022 (#15806 ) ### Description Old pool \| New pool \| Notes -- \| -- \| -- onnxruntime-Win-CPU-2019 \| onnxruntime-Win-CPU-2022 \| onnxruntime-Win2019-CPU-training \| onnxruntime-Win2022-CPU-training-AMD \| onnxruntime-Win2019-CPU-training-AMD \| onnxruntime-Win2022-CPU-training-AMD \| Same as the above onnxruntime-Win2019-GPU-dml-A10 \| Needs to be created \| You need to create a new image for it first onnxruntime-Win2019-GPU-T4 \| onnxruntime-Win2022-GPU-T4 \| onnxruntime-Win2019-GPU-training-T4 \| onnxruntime-Win2022-GPU-T4 \| Same as the above because we do not have many T4 GPUs onnxruntime-tensorrt8-winbuild-T4\| TBD\|TBD Win-CPU-2021\|onnxruntime-Win-CPU-2022\| will do it in next PR Win-CPU-2019\|onnxruntime-Win2022-Intel-CPU'\| Intel CPU needed for win-ci-pipeline.yml -> `stage: x64_release_dnnl` ### Motivation and Context With VS2022 we can take advantage of the 64-bit compiler. It also has better C++20 support.	2023-05-16 10:34:34 -07:00
RandySheriffH	7faad53632	Set default option for package name and build arg options (#15958 ) Set default value for parameters in nuget-zip pipeline, and only apply the configurations when they are not "NONE". --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-16 09:07:38 -07:00
cloudhan	dc383ed4ce	Basic CSharp packaging support for ROCm EP (#15535 ) This PR mainly fixes build errors when trying to build the nupkg for ROCm EP. It also slightly improves the packaging logic so that developers can produce the nupkg on Linux natively.	2023-05-16 07:27:38 +08:00
yf711	825d691617	Unify cuda & trt version on few CIs (#15943 ) ### Description The cuda & trt version of some CIs didn't sync with the majority. Unifying cuda version as 11.8 and trt version as 8.6 on these CIs	2023-05-15 09:54:30 -07:00
Rachel Guo	18133ddadb	[doc] add LeakyRelu to coreml supported ops (#15944 ) ### Description <!-- Describe your changes. --> ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-05-15 09:46:30 -07:00
Adrian Lizarraga	5542e70dd1	[QNN EP] Update default QNN SDK version to 2.10 for QNN NuGet pipeline (#15899 ) ### Description Updates the default QNN SDK version to 2.10 for the QNN NuGet pipeline. ### Motivation and Context Ensures that the daily QNN NuGet pipeline builds ORT using the latest QNN SDK by default.	2023-05-15 09:17:42 -07:00
PeixuanZuo	af6cb2af87	[ROCm] update ROCm/MIGraphX CI to ROCm5.5 (#15905 ) update ROCm/MIGraphX CI to ROCm5.5. TODO: two PRs to fix failures in orttraining/orttraining/test/python/orttraining_test_ortmodule_api.py - test_gradient_correctness_minmax/test_gradient_correctness_argmax_unfold/test_gradient_correctness_argmax_diagonal (https://github.com/microsoft/onnxruntime/pull/15903) - test_ortmodule_attribute_name_collision_warning (https://github.com/microsoft/onnxruntime/pull/15884)	2023-05-15 10:28:15 +08:00
Yi Zhang	b20d5e85d5	Update Cuda to 11.8 in 2 Linux GPU workflows. (#15925 ) ### Description use template variable for cuda version ### Motivation and Context	2023-05-14 12:51:25 +08:00
RandySheriffH	7c4e8267e7	Implement openAI endpoint invoker for nuget (#15797 ) Implement openAI audio endpoint, and enable nuget packaging. --------- Co-authored-by: Randy Shuai <rashuai@microsoft.com>	2023-05-11 22:04:02 -07:00
Yi Zhang	0e7ae13e74	Run Linux GPU tests in docker container (#15872 ) ### Description Run Linux GPU tests in docker container ### Motivation and Context	2023-05-12 06:29:22 +08:00
Jian Chen	1a73d61829	Update eigen to 3.4 and remove the eigen from git submodule (#15875 ) ### Description Update eigen to 3.4 and remove the eigen from git submodule ### Motivation and Context We need to have eigen 3.4 for c++20	2023-05-11 11:56:59 -07:00
Changming Sun	7c58d013aa	Remove Ubuntu 18.04 usages (#15781 ) ### Description Remove Ubuntu 18.04 usages because it will be EOL this month. ### Motivation and Context	2023-05-11 11:44:00 -07:00
Yulong Wang	756cf3a76f	increase web CI timeout (#15876 ) ### Description The CI is extremely slow on downloading source code (~1MB/sec) so the web CI went timeout. This is blocking the PR/checks. Increase the timeout temporarily.	2023-05-11 11:17:46 -07:00
liqun Fu	ac9ae9f7c5	update onnx release 1.14 for docker files (#15680 ) ### Description this is for ort 1.15 release to work with onnx 1.14 It shall be merged after onnx 1.14 release and before ort 1.15 release. ### Motivation and Context --------- Signed-off-by: Liqun Fu <liqfu@microsoft.com>	2023-05-10 13:15:56 -07:00
Nat Kershaw (MSFT)	36c9ae0f58	Fix release version suffix for RC builds (#15865 )	2023-05-09 23:06:08 -07:00
Sumit Agarwal	b473e3f3c6	[DML EP] Update DirectML version to 1.11.0 (#15858 ) ### Description - Update DML version to 1.11.0 - Disable Gemm+Softmax fusion ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-05-09 12:48:15 -07:00
Jian Chen	34cb293c6b	Remove unused ADO YML pipeline template (#15857 ) ### Description Remove unused ADO YML pipeline template ### Motivation and Context Clean up and reduce our codebase.	2023-05-09 09:15:04 -07:00
Wanming Lin	00b1e79e04	Support WebNN EP (#15698 ) Description: This PR intends to enable WebNN EP in ONNX Runtime Web. It translates the ONNX nodes by [WebNN API](https://webmachinelearning.github.io/webnn/), which is implemented in C++ and uses Emscripten [Embind API](https://emscripten.org/docs/porting/connecting_cpp_and_javascript/embind.html#). Temporarily using preferred layout NHWC for WebNN graph partitions since the restriction in WebNN XNNPack backend implementation and the ongoing [discussion](https://github.com/webmachinelearning/webnn/issues/324) in WebNN spec that whether WebNN should support both 'NHWC' and 'NCHW' layouts. No WebNN native EP, only for Web. Motivation and Context: Allow ONNXRuntime Web developers to access WebNN API to benefit from hardware acceleration. WebNN API Implementation Status in Chromium: - Tracked in Chromium issue: [#1273291](https://bugs.chromium.org/p/chromium/issues/detail?id=1273291) - CPU device: based on XNNPack backend, and had been available on Chrome Canary M112 behind "#enable-experimental-web-platform-features" flag for Windows and Linux platforms. Further implementation for more ops is ongoing. - GPU device: based on DML, implementation is ongoing. Open: - GitHub CI: WebNN currently is only available on Chrome Canary/Dev with XNNPack backend for Linux and Windows. This is an open to reviewers to help identify which GitHub CI should involved the WebNN EP and guide me to enable it. Thanks!	2023-05-08 21:25:10 -07:00
Yulong Wang	0457fd0b40	upgrade emsdk to 3.1.37 (#15817 ) ### Description upgrade emsdk to 3.1.37 WIP branch to debug the mystery memory issue in web assembly multi-thread build.	2023-05-08 16:49:47 -07:00
Yi Zhang	045c623415	Make Nuget workflow easy to debug (#15808 ) ### Description Fix the bug in #15693 ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2023-05-08 20:53:08 +08:00
Nat Kershaw (MSFT)	5e9b42326c	Fix packaging pipeline for nightly builds (#15839 )	2023-05-07 20:42:38 -07:00
PeixuanZuo	41457885e0	[ROCm] add rocm5.5 to python package pipeline (#15820 ) add rocm5.5 to python packaging pipeline. https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=306082&view=results TODO: Remove version 5.2.3, 5.3.2 and 5.4 in the next PR.	2023-05-06 10:21:15 +08:00
Nat Kershaw (MSFT)	ed31e4b737	Add nuget release version suffix to support publishing rcs to nuget.org (#15791 )	2023-05-05 18:18:24 -07:00
Adrian Lizarraga	45f5c27632	[QNN EP] Update default QNN SDK to version 2.10.0 (#15818 ) ### Description - Updates the default QNN SDK for CI pipelines to version 2.10.0. - Disables convolution op tests that run on the QNN CPU backend due to a potential bug with QNN SDK 2.10.0. ### Motivation and Context Allows us to test the latest QNN SDK in default CI pipeline runs.	2023-05-05 13:01:21 -07:00
Guenther Schmuelling	5a43828b3d	update ort extensions to 94142d8391c9791ec71c38336436319a2d4ac7a0 (#15688 ) needed to get tokenizers/decode for whisper --------- Co-authored-by: Shalva Mist <shalvamist@microsoft.com>	2023-05-05 09:48:07 -07:00
Scott McKay	d1b2b35cd2	Various fixes to the CSharp setup (#15782 ) ### Description <!-- Describe your changes. --> Various fixes to the CSharp setup - fix warnings - fix invalid tests - update test sdk nuget package - enables testing on linux - fixes issue with some unit tests not running in CI - run unit tests in linux pipeline using dotnet ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Unit tests weren't breaking in CIs for both Windows and Linux builds and should have been.	2023-05-05 14:27:30 +10:00
Yulong Wang	4712009f8a	[js/web] add target ort.webgpu.min.js (#15780 ) ### Description add target ort.webgpu.min.js WebGPU is experimental feature, so I don't want to put webgpu into the ort.min.js file. This change adds 2 ways for users to access ort-web with webgpu: - using script tag: by URL `https://cdn.jsdelivr.net/npm/onnxruntime-web@1.15.0/dist/ort.webgpu.min.js` ( this URL is not ready yet ) - using `import()`: use `import { Tensor, InferenceSession } from 'onnxruntime-web/webgpu';` - 'onnxruntime-web/webgpu' instead of 'onnxruntime-web'	2023-05-04 10:05:39 -07:00
Yulong Wang	33d1372729	[wasm] revert emsdk to v3.1.19 (#15793 ) ### Description latest emsdk generated multi-thread version sometimes crash with unknown reason ( error: memory access out of bounds ). we don't want to break existing ort-web users, so revert emsdk back to 3.1.19 (same to what ort v1.14.0 uses)	2023-05-04 01:15:01 -07:00
Baiju Meswani	e464588a0e	Avoid generating training documentation during packaging (#15795 )	2023-05-03 19:09:07 -07:00
Changming Sun	1fb2f2605b	Update VERSION_NUMBER (#15773 ) ### Description 1. Update VERSION_NUMBER for preparing the upcoming release. This PR's commit will not be included in the 1.15 release branch 2. Delete package/rpm/onnxruntime.spec since it was not used in past years. ### Motivation and Context Preparing the release. Fixed [AB#15311](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/15311)	2023-05-03 15:07:34 -07:00
Baiju Meswani	ba7b83ff3c	Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776 )	2023-05-03 13:08:35 -07:00
Changming Sun	d53324d4a7	Update cmake version in a few places (#15775 ) ### Description They were missed in #15707 , because they are not in common places for Dockerfiles. Though this commit updated tools/ci_build/github/pai/rocm-ci-pipeline-env.Dockerfile, it won't automatically take effect. The image needs to be manually generated and pushed to a place, and before doing that our CMakeLists.txt also needs to be tweaked a little bit.	2023-05-02 22:56:28 -07:00
Yulong Wang	ef1f17f3dc	[wasm/JSEP] add threaded build to artifacts (#15777 ) ### Description This is the first part to create a webassembly artifacts for ort-web webgpu EP (wasm build). there will be following steps to consume the artifacts in web build	2023-05-02 17:53:44 -07:00
Baiju Meswani	2d519d21af	Python documentation for onnxruntime-training (#15765 )	2023-05-02 16:58:16 -07:00
Jian Chen	abdd4f518a	Update TRT Windows Cuda 11.6 to 11.8 (#15746 ) ### Description Update TRT Windows cuda 11.6 to 11.8 ### Motivation and Context We are adopting a newer version of CUDA systemwide.	2023-05-02 12:23:13 -07:00
Changming Sun	328cabb194	Download protoc from Github Release instead of Nuget (#15731 ) ### Description Download protoc from Github Release instead of Nuget to avoid having dependency on nuget.exe on Linux ### Motivation and Context To avoid having dependency on nuget.exe on Linux. Many users' build environment do not have nuget or dotnet.	2023-05-02 12:18:59 -07:00
Changming Sun	5352f6d9b0	Make "--cuda_version" build arg optional (#15758 ) ### Description This change will allow us building CUDA EP without installing CUDA SDK on Windows. ### Motivation and Context Nvidia's CUDA installer comes with a VS extension. In the past, we require installing the extension. It is a little bit inconvenient since: 1. Visual Studio must be installed before CUDA SDK. CUDA's installer will not install the extension if your machine doesn't have Visual Studio. 2. We need to install CUDA SDK on our build machines, instead of just downloading it and using it. After this change, we will not need to install CUDA SDK on our build machines. So it will be easier to add a support for a different CUDA version. Also, fix two PreFast warnings.	2023-05-01 18:00:47 -07:00
Ashwini Khade	0ffae8073b	Creating Nuget and Android packages for Training (#15712 ) ### Description This PR creates Nuget and Android for Training. ### Motivation and Context These packages are intended to be released in ORT 1.15 to enable On-Device Training Scenarios. ## Packaging Story for Learning On The Edge Release ### Nuget Packages: 1. New Native package -> Microsoft.ML.OnnxRuntime.Training (Native package will contain binaries for: win-x86, win-x64, win-arm, win-arm64, linux-x64, linux-arm64, android) 2. C# bindings will be added to existing package -> Microsoft.ML.OnnxRuntime.Managed ### Android Package published to Maven: 1. New package for training (full build) -> onnxruntime-training-android-full-aar ### Python Package published to PyPi: 1. Python bindings and offline tooling will be added to the existing ort training package -> onnxruntime-training	2023-05-01 12:59:56 -07:00
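
Commit 4324d2173b in the list above names two QNN EP provider options, `qnn_context_cache_enable` and `qnn_context_cache_path`. A minimal Python sketch of how they might be passed to a session follows. The option names and the default cache path (model file name plus `.bin`) are taken from that commit message. The helper names `qnn_cache_options` and `create_qnn_session` are hypothetical, and the sketch assumes an onnxruntime build that includes the QNN EP. Any other QNN options a real setup needs are left to the caller through `extra`, and later releases may use different option names.

```python
from typing import Dict, Optional


def qnn_cache_options(model_path: str,
                      enable_cache: bool = True,
                      cache_path: Optional[str] = None,
                      extra: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Build QNN EP provider options for the context cache (hypothetical helper)."""
    # Start from any caller-supplied options; provider option values are strings.
    options: Dict[str, str] = dict(extra or {})
    if enable_cache:
        options["qnn_context_cache_enable"] = "1"
        # Mirror the default described in the commit message: <model file>.bin
        options["qnn_context_cache_path"] = cache_path or f"{model_path}.bin"
    return options


def create_qnn_session(model_path: str, **kwargs):
    """Create an InferenceSession on the QNN EP (assumes a QNN-enabled onnxruntime)."""
    # Imported lazily so qnn_cache_options() can be used without onnxruntime installed.
    import onnxruntime as ort

    return ort.InferenceSession(
        model_path,
        providers=[("QNNExecutionProvider", qnn_cache_options(model_path, **kwargs))],
    )


if __name__ == "__main__":
    # Only builds and prints the options; no session is created here.
    print(qnn_cache_options("model_file.onnx"))
```

Keeping the option construction separate from session creation makes the cache settings easy to inspect or log before a session is built. Per the commit message, the first run pays the ONNX-to-QNN conversion cost, and later runs can load the serialized context from the cache path instead.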

1959 commits