onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-15 20:50:42 +00:00

Author	SHA1	Message	Date
Karim Vadsariya	655a23ff1d	[onnxruntime/build] Add new flag enable_generic_interface to build primary EPs by default (#23342 ) ### Description - Add a new build flag in build.py to build onnxruntime.dll with support for the interfaces of all primary EPs (QNN, TensorRT, OpenVINO, VitisAI). - Modify the onnxruntime.dll/onnxruntime_shared.dll build settings so that they no longer depend on an IHV SDK toolset being installed on the system. - Make CMake variables explicit about whether they build an EP or ORT, e.g. onnxruntime_USE_TENSORRT vs onnxruntime_USE_TENSORRT_INTERFACE, so that the build system can evolve to build ORT independently of EPs. ### Motivation and Context Build system changes required to evolve the repo to build the components independently while removing unnecessary dependencies --------- Co-authored-by: Lei Cao <jslhcl@gmail.com> Co-authored-by: Karim Vadsariya <kvadsariya@microsoft.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-01-28 15:24:09 -08:00
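The EP-versus-interface split described in #23342 can be pictured with a small sketch. Only `onnxruntime_USE_TENSORRT_INTERFACE` is named in the commit message; the other `*_INTERFACE` variable names, the `interface_defines` helper, and the way the flag is parsed here are assumptions made for illustration, not the actual `build.py` code.

```python
import argparse

# EPs the commit message lists as "primary". The CMake suffix is confirmed only
# for TENSORRT; the other three names are assumed by analogy.
PRIMARY_EPS = ["QNN", "TENSORRT", "OPENVINO", "VITISAI"]


def interface_defines(enable_generic_interface: bool) -> list[str]:
    """Return hypothetical CMake -D arguments for an interface-only build."""
    value = "ON" if enable_generic_interface else "OFF"
    return [f"-Donnxruntime_USE_{ep}_INTERFACE={value}" for ep in PRIMARY_EPS]


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    # Flag name taken from the commit title; its real definition may differ.
    parser.add_argument("--enable_generic_interface", action="store_true")
    return parser.parse_args(argv)


args = parse_args(["--enable_generic_interface"])
defines = interface_defines(args.enable_generic_interface)
print(defines)
```

The point of the sketch is that the interface variables are separate from the `onnxruntime_USE_<EP>` variables, so ORT can be configured without the vendor SDK that a full EP build would need.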
Yulong Wang	8db97a68f2	[webgpu] Bump version of Dawn to b9b4a370 (#23494 ) ### Description This PR updates the version of Dawn to `b9b4a37041dec3dd62ac92014a6cc1aece48d9f3` (ref: [chromium](`67f86f01dd/DEPS (399)`)) in the `deps.txt` file. The newer version of Dawn includes the previous changes from dawn.patch, so the patch file can be removed. There are a few small interface changes, and the code is updated accordingly.	2025-01-27 14:02:06 -08:00
Changming Sun	1fc9c4823d	Enable coremltools for Linux build (#23481 ) ### Description Enable coremltools for Linux build. In order to do this, I did: 1. Add uuid-devel to the Linux images and regenerate them. 2. Patch the coremltools code a little bit to add some missing header files. ### Motivation and Context To make the code simpler. Later on I will create another PR to remove the COREML_ENABLE_MLPROGRAM C/C++ macro. Also, after this PR I will bring more changes to onnxruntime_provider_coreml.cmake to make it work with vcpkg.	2025-01-24 18:18:37 -08:00
Adrian Lizarraga	3b4c7df4e9	[QNN EP] Make QNN EP a shared library (#23120 ) ### Description - Makes QNN EP a shared library by default when building with `--use_qnn` or `--use_qnn shared_lib`. Generates the following build artifacts: - Windows: `onnxruntime_providers_qnn.dll` and `onnxruntime_providers_shared.dll` - Linux: `libonnxruntime_providers_qnn.so` and `libonnxruntime_providers_shared.so` - Android: Not supported. Must build QNN EP as a static library. - Allows QNN EP to still be built as a static library with `--use_qnn static_lib`. This is primarily for the Android QNN AAR package. - Unit tests run for both the static and shared QNN EP builds. ### Detailed changes - Updates Java bindings to support both shared and static QNN EP builds. - Provider bridge API: - Adds logging sink ETW to the provider bridge. Allows EPs to register ETW callbacks for ORT logging. - Adds a variety of methods for onnxruntime objects that are needed by QNN EP. - QNN EP: - Adds `ort_api.h` and `ort_api.cc` that encapsulates the API provided by ORT in a manner that allows the EP to be built as either a shared or static library. - Adds custom function to transpose weights for Conv and Gemm (instead of adding util to provider bridge API). - Adds custom function to quantize data for LeakyRelu (instead of adding util to provider bridge API). - Adds custom ETW tracing for QNN profiling events: - shared library: defines its own TraceLogging provider handle - static library: uses ORT's TraceLogging provider handle and existing telemetry provider. - ORT-QNN Packages: - Python: Pipelines build QNN EP as a shared library by default. User can build a local python wheel with QNN EP as a static library by passing `--use_qnn static_lib`. - NuGet: Pipelines build QNN EP as a shared library by default. `build.py` currently enforces QNN EP to be built as a shared library. Can add support for building a QNN NuGet package with static later if deemed necessary. 
- Android: Pipelines build QNN EP as a static library. `build.py` enforces QNN EP to be built as a static library. Packaging multiple shared libraries into an Android AAR package is not currently supported due to the added need to also distribute a shared libcpp.so library. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2025-01-22 12:11:00 -08:00
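The artifact layout described in #23120 can be summarized as a lookup. The file names below are copied from the commit message; the `expected_qnn_artifacts` helper and its argument names are hypothetical and not part of onnxruntime.

```python
# Provider libraries the commit message lists for a shared QNN EP build.
QNN_SHARED_ARTIFACTS = {
    "windows": ["onnxruntime_providers_qnn.dll", "onnxruntime_providers_shared.dll"],
    "linux": ["libonnxruntime_providers_qnn.so", "libonnxruntime_providers_shared.so"],
}


def expected_qnn_artifacts(platform: str, mode: str = "shared_lib") -> list[str]:
    """Return the separate provider libraries expected for a QNN EP build."""
    if mode == "static_lib":
        # The commit message lists no separate provider files for static builds.
        return []
    if platform == "android":
        # Per the commit message, Android must use the static library build.
        raise ValueError("QNN EP must be built as a static library on Android")
    return QNN_SHARED_ARTIFACTS[platform]


print(expected_qnn_artifacts("linux"))
print(expected_qnn_artifacts("android", mode="static_lib"))
```

A check like this could be used after a local build to confirm that `--use_qnn` and `--use_qnn static_lib` produced the layout the commit describes.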
Jian Chen	628c0e00c4	Change macOS-13 to Ubuntu for android-java-api-aar-test.yml. (#23444 )	2025-01-21 17:07:20 -08:00
Jian Chen	899ea21ffe	Moving RN_CI Android Testing to Linux (#23422 ) ### Description Move the Android E2E test steps from macOS-13 to ubuntu22.04. ### Motivation and Context Reduce the dependency on macOS, which is deprecating the x64 version.	2025-01-21 11:55:29 -08:00
Jian Chen	83cb1e4a3c	Separate RN Android and iOS into 2 separate stages. (#23400 ) ### Description Separate RN Android and iOS into 2 separate stages. ### Motivation and Context Speed up the PR process.	2025-01-20 18:08:01 -08:00
Hector Li	f35924a891	Update Qnn SDK default version to 2.30 (#23411 ) ### Description Update Qnn SDK default version to 2.30	2025-01-17 22:36:35 -08:00
Changming Sun	d461ca9dcd	Update onnxruntime binary size checks ci pipeline's docker image (#23405 ) 1. Update onnxruntime binary size checks ci pipeline's docker image. Use a different docker image that is not manylinux based. The new one is smaller. 2. Add flatbuffers tools/ci_build/requirements/pybind/requirements.txt 3. Delete tools/ci_build/github/azure-pipelines/py-package-build-pipeline.yml. The pipeline was for generating packages for Olive, but it went unused. And the content is highly duplicated with our official python packaging pipeline. 4. A lot of YAML files reference pypa/manylinux git repo but do not use it. This PR removes the references.	2025-01-17 15:29:17 -08:00
Justin Chu	09c4cc7b36	Target py310 and modernize codebase with ruff (#23401 ) Change `target-version = "py310"` and modernize the code base with ruff.	2025-01-16 19:10:14 -08:00
Justin Chu	c7c8757a1c	Use ruff as the formatter to replace black-isort (#23397 ) Use ruff as the code formatter in place of black and isort since it is much faster, and as projects like PyTorch and ONNX have adopted ruff format as well. This PR include only auto-fixed changes in formatting.	2025-01-16 11:14:15 -08:00
Yulong Wang	080c67e900	[WebGPU] allow build WebGPU EP for WebAssembly (#23364 ) ### Description This PR allows WebGPU EP to be built with Emscripten for WebAssembly, Including: - cmake build files update to support correct setup for Emscripten. - code changes to fix build breaks for wasm - change in Web CI pipeline to add a build-only target for wasm with `--use_webgpu`.	2025-01-16 10:52:17 -08:00
Changming Sun	82aa355904	Update android_min_sdk_version/android_target_sdk_version (#23369 ) Update android_min_sdk_version to 24 and android_target_sdk_version to 34. Previously Jian already updated the values for some pipelines. This PR updates the other occurrences to make things consistent Why android_min_sdk_version is set to 24: Because React Native requires so: https://github.com/react-native-community/discussions-and-proposals/discussions/802 Why android_target_sdk_version is set to 34: Because according to Google Play's policy, new apps and app updates must target Android 14 (API level 34) to be submitted to Google Play. https://support.google.com/googleplay/android-developer/answer/11926878?hl=en	2025-01-16 08:03:31 -08:00
Jian Chen	331fc36b6a	Remove hot path for pre-0.70.15 RN fix (#23382 ) ### Description This undoes the changes from #23281	2025-01-15 16:16:38 -08:00
dependabot[bot]	1461a16e71	Bump ruff from 0.5.4 to 0.9.1 (#23328 ) Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.4 to 0.9.1. See [ruff's releases](https://github.com/astral-sh/ruff/releases), [ruff's changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md), and the [compare view](https://github.com/astral-sh/ruff/compare/0.5.4...0.9.1) for details. Per the 0.9.0 changelog, Ruff now formats code according to the 2025 style guide, so code might be formatted differently; that release does not remove or remap any existing stable rules. --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>	2025-01-15 11:11:17 -08:00
Changming Sun	6a7ea5c896	Update xnnpack, cpuinfo and pthreadpool (#23362 ) ### Description Update xnnpack to remove the dependency on psimd and fp16 libraries. However, coremltool still depends on them, which will be addressed later. Also, update CPUINFO because the latest xnnpack requires CPUINFO's avx10 support. ### Motivation and Context The fewer dependencies the better.	2025-01-15 09:42:15 -08:00
Yifan Li	5c3c7643db	Update range of gpu arch (#23309 ) ### Description * Remove deprecated gpu arch to control nuget/python package size (latest TRT supports sm75 Turing and newer arch) * Add 90 to support blackwell series in next release (86;89 not considered as adding them will rapidly increase package size)

| arch_range | Python-cuda12 | Nuget-cuda12 |
| -------------- | ------------------------------------------------------------ | ---------------------------------- |
| 60;61;70;75;80 | Linux: 279MB Win: 267MB | Linux: 247MB Win: 235MB |
| 75;80 | Linux: 174MB Win: 162MB | Linux: 168MB Win: 156MB |
| 75;80;90 | Linux: 299MB Win: 277MB | Linux: 294MB Win: 271MB |
| 75;80;86;89 | [Linux: MB Win: 390MB](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=647457&view=results) | Linux: 416MB Win: 383MB |
| 75;80;86;89;90 | [Linux: MB Win: 505MB](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=646536&view=results) | Linux: 541MB Win: 498MB |

### Motivation and Context Callout: While adding sm90 support, the build of cuda11.8+cudnn8 will be dropped in the coming ORT release, as the build has issue with blackwell (mentioned in comments) and demand on cuda 11 is minor, according to internal ort-cuda11 repo.	2025-01-14 14:27:34 -08:00
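The size trade-off behind #23309 is easier to see as differences. The sketch below uses only the Linux Python-cuda12 sizes that are fully stated in the commit message (the rows with a missing Linux figure are left out); the `size_delta_mb` helper is hypothetical.

```python
# Linux Python-cuda12 package sizes in MB, copied from the commit message.
LINUX_WHEEL_MB = {
    "60;61;70;75;80": 279,
    "75;80": 174,
    "75;80;90": 299,
}


def size_delta_mb(old_range: str, new_range: str) -> int:
    """Size change in MB when moving from one arch range to another."""
    return LINUX_WHEEL_MB[new_range] - LINUX_WHEEL_MB[old_range]


print(size_delta_mb("60;61;70;75;80", "75;80"))     # -105
print(size_delta_mb("75;80", "75;80;90"))           # 125
print(size_delta_mb("60;61;70;75;80", "75;80;90"))  # 20
```

Read this way, dropping the older architectures saves 105 MB, adding 90 costs 125 MB, and the net change from the previous range is 20 MB.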
Changming Sun	4e4fd2bdcf	Update ORT extension to the latest (#23314 ) Update ORT extension to the latest, to include some build system fixes.	2025-01-13 18:59:42 -08:00
Changming Sun	ecdeecae61	Update MACOSX_DEPLOYMENT_TARGET (#23308 ) Fix some inconsistency. All our iOS build should target iOS 15.1. All our macOS desktop build should target macOS 13.3 to align with the changes made in #17361	2025-01-10 14:25:32 -08:00
Changming Sun	e7d8596c7c	Update docker images: remove python 3.8 and 3.9 (#23310 ) Python 3.8 and 3.9 are removed from the new manylinux images, to reduce image size.	2025-01-10 13:09:04 -08:00
Changming Sun	0ec2171b9f	Update Linux docker images (#23244 ) The new images contain the following updates: 1. Added Git, Ninja and VCPKG to all docker images 2. Updated CPU containers' GCC version from 12 to 14 3. Pinned CUDA 12 images' CUDNN version to 9.5(The latest one is 9.6) 4. Addressed container supply chain warnings by building CUDA 12 images from scratch(avoid using Nvidia's prebuilt images) 5. Updated manylinux commit id to 75aeda9d18eafb323b00620537c8b4097d4bef48 Also, this PR updated some source code to make the CPU EP's source code compatible with GCC 14.	2025-01-09 10:20:33 -08:00
Changming Sun	3328eb3bb3	Update min iOS version to 15.1 to align with React Native 0.76 (#23292 ) Update min iOS version to 15.1 to align with React Native 0.76. We need to update React Native . See https://github.com/react-native-community/discussions-and-proposals/discussions/812 for background. Similar to PR #20773	2025-01-08 16:02:45 -08:00
Changming Sun	ccbe66d422	Update NDK (#23280 ) Similar to #21989	2025-01-08 13:57:23 -08:00
Jian Chen	da35cceac9	Add a temporary path to RN 0.69.3 to update the boost url (#23281 ) ### Description Add a temporary path to RN 0.69.3 to update the boost url ### Motivation and Context Fix the React-native CI until we update the RN to 0.70.15 or 0.73.3+ versions	2025-01-08 09:28:35 -08:00
Changming Sun	c6cbda3257	Update Python-Cuda-Publishing-Pipeline (#23253 ) ### Description 1. Currently Python-Cuda-Publishing-Pipeline only publishes Linux wheels, not Windows wheels. It is because recently we refactored the upstream pipeline("Python-CUDA-Packaging-Pipeline") to use 1ES PT. This PR fixed the issue 2. tools/ci_build/github/azure-pipelines/stages/py-win-gpu-stage.yml no longer includes component-governance-component-detection-steps.yml , because 1ES PT already inserted such a thing 3. Delete tools/ci_build/github/windows/eager/requirements.txt because it is no longer used. ### Motivation and Context The "Python-CUDA-Packaging-Pipeline" is for CUDA 12. "Python CUDA ALT Packaging Pipeline" is for CUDA 11. The two pipelines are very similar, except the CUDA versions are different. Each of them has three parts: build, test, publish. "Python-CUDA-Packaging-Pipeline" is the first part: build. "Python CUDA12 Package Test Pipeline" is the second part. "Python-Cuda-Publishing-Pipeline" is the third part that publishes the packages to an internal ADO feed.	2025-01-06 11:50:58 -08:00
Yulong Wang	21b4d2ac9f	fix pipeline build-perf-test-binaries (#23255 )	2025-01-05 22:28:41 -08:00
Changming Sun	b7ef81a034	Move Linux GPU CI pipeline to A10 (#23235 ) Move Linux GPU CI pipeline to A10 machines which are more advanced. Retire onnxruntime-Linux-GPU-T4 machine pool. Disable run_lean_attention test because the new machines do not have enough shared memory. ``` skip loading trt attention kernel fmha_mhca_fp16_128_256_sm86_kernel because no enough shared memory [E:onnxruntime:, sequential_executor.cc:505 ExecuteKernel] Non-zero status code returned while running MultiHeadAttention node. Name:'MultiHeadAttention_0' Status Message: CUDA error cudaErrorInvalidValue:invalid argument ```	2025-01-04 19:11:37 -08:00
Changming Sun	5d692b0136	Merge web machine pools (#23243 ) ### Description The Web CI pipeline uses three different Windows machine pools: 1. onnxruntime-Win2022-webgpu-A10 2. onnxruntime-Win2022-VS2022-webgpu-A10 3. onnxruntime-Win-CPU-2022-web This PR merges them together to reduce ongoing maintenance cost.	2025-01-03 13:53:17 -08:00
Changming Sun	afd3e81c94	Remove PostBuildCleanup (#23233 ) Remove PostBuildCleanup tasks since it is deprecated. It is to address a warning in our pipelines: "Task 'Post Build Cleanup' version 3 (PostBuildCleanup@3) is dependent on a Node version (6) that is end-of-life. Contact the extension owner for an updated version of the task. Task maintainers should review Node upgrade guidance: https://aka.ms/node-runner-guidance" Now the cleanup is controlled in another place: https://learn.microsoft.com/en-us/azure/devops/pipelines/yaml-schema/workspace?view=azure-pipelines The code change was generated by the following Linux command: ```bash find . -name \*.yml -exec sed -i '/PostBuildCleanup/,+2d' {} \; ```	2024-12-31 13:12:33 -08:00
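For readers less familiar with the sed address form `/PostBuildCleanup/,+2d` used in #23233 (GNU sed: delete each matching line plus the two lines after it), here is a rough Python equivalent. The function name and the sample YAML are made up for illustration; only the task name `PostBuildCleanup@3` comes from the commit message.

```python
import re


def drop_with_following(lines: list[str], pattern: str, extra: int = 2) -> list[str]:
    """Drop every line matching `pattern` together with the next `extra` lines."""
    kept = []
    skip = 0
    for line in lines:
        if skip > 0:
            # Still inside a deleted range; matches here do not restart it.
            skip -= 1
            continue
        if re.search(pattern, line):
            skip = extra
            continue
        kept.append(line)
    return kept


sample = [
    "steps:",
    "- task: PostBuildCleanup@3",
    "  displayName: 'Clean Agent Directories'",
    "  condition: always()",
    "- script: echo done",
]
cleaned = drop_with_following(sample, r"PostBuildCleanup")
print("\n".join(cleaned))
```

The fixed offset of two lines assumes every occurrence of the task is followed by exactly two lines that belong to it, which is the same assumption the original one-liner makes.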
liqun Fu	a9a881cc98	Integrate onnx 1.17.0 (#21897 ) ### Description <!-- Describe your changes. --> for ORT 1.21.0 release Create following related issues to track skipped tests due to updated ONNX operators in the ONNX 1.17.0 release: https://github.com/microsoft/onnxruntime/issues/23162 https://github.com/microsoft/onnxruntime/issues/23164 https://github.com/microsoft/onnxruntime/issues/23163 https://github.com/microsoft/onnxruntime/issues/23161 ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> --------- Signed-off-by: Liqun Fu <liqfu@microsoft.com> Signed-off-by: Liqun Fu <liqun.fu@microsoft.com> Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com> Co-authored-by: Yifan Li <109183385+yf711@users.noreply.github.com> Co-authored-by: yf711 <yifanl@microsoft.com>	2024-12-24 09:02:02 -08:00
Yifan Li	d9d07ad8ae	[TensorRT EP] support TensorRT 10.7-GA (#23011 ) ### Description Update CIs to TRT10.7	2024-12-19 10:39:15 -08:00
Yifan Li	a3bb3f1487	[TensorRT EP] New CIs to test TRT+minimal CUDA build (#23028 ) ### Description <!-- Describe your changes. --> New CI: [Linux_TRT_Minimal_CUDA_Test_CI](https://dev.azure.com/onnxruntime/onnxruntime/_build?definitionId=230&_a=summary) and [Win_TRT_Minimal_CUDA_Test_CI ](https://dev.azure.com/onnxruntime/onnxruntime/_build?definitionId=231) Setting config for new CI to monitor if there's no issue to build ORT-TRTEP with minimal CUDA * yaml content is following Linux TRT CI yaml, with different build arg/cache name * build arg is following [[TensorRT EP] Enable a minimal CUDA EP compilation without kernels](https://github.com/microsoft/onnxruntime/pull/19052#issuecomment-1888066851) ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> Monitor if user is able to build ORT-TRTEP-minimalCUDA without any blocker (which takes ~30min to build)	2024-12-19 10:30:39 -08:00
Yulong Wang	8680244ebc	Fix delay load for WebGPU EP and DML EP (#23111 ) ### Description This change fixes the DLL delay load problem for the WebGPU EP and DirectML EP. See detailed explanation below. ### Problem When onnxruntime.dll uses delay loading for its dependencies, the dependencies are loaded using `LoadLibraryEx()`, which searches the directory of the process (.exe) instead of the directory of this library (onnxruntime.dll). This is a problem for the Node.js binding and the Python binding, because Windows will try to find the dependencies in the directory of node.exe or python.exe, which is not the directory of onnxruntime.dll. There was a previous attempt to fix this by loading DirectML.dll during initialization of the onnxruntime Node.js binding, which works for the DML EP but is not a good solution because it does not really "delay" the load. For WebGPU, the situation became worse because webgpu_dawn.dll depends on dxil.dll and dxcompiler.dll, which are explicitly dynamically loaded in the code using `LoadLibraryA()`. This has the same DLL search problem. ### Solutions For onnxruntime.dll loading its direct dependencies, the problem can be resolved by setting the [`__pfnDliNotifyHook2` hook](https://learn.microsoft.com/en-us/cpp/build/reference/understanding-the-helper-function?view=msvc-170#structure-and-constant-definitions) to load from an absolute path that is constructed from the onnxruntime.dll folder and the DLL name. For webgpu_dawn.dll loading dxil.dll and dxcompiler.dll, since they are explicitly loaded in the code, the hook does not work. Instead, it can be resolved by ~~using WIN32 API `SetDllDirectory()` to add the onnxruntime.dll folder to the search path.~~ preloading the 2 DLLs from the onnxruntime.dll folder.	2024-12-19 10:23:48 -08:00
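The core of the fix in #23111 is the path computation: resolve each dependency next to onnxruntime.dll rather than next to the host executable. The actual change is a C++ delay-load hook; the sketch below only models the path construction in Python, and the `dependency_path` helper and the example install path are hypothetical.

```python
from pathlib import PureWindowsPath


def dependency_path(onnxruntime_dll: str, dependency_name: str) -> str:
    """Build an absolute path for a dependency located beside onnxruntime.dll."""
    return str(PureWindowsPath(onnxruntime_dll).parent / dependency_name)


# Hypothetical install location, used only to show the result.
ort = r"C:\app\node_modules\onnxruntime-node\bin\onnxruntime.dll"
for name in ("DirectML.dll", "dxil.dll", "dxcompiler.dll"):
    print(dependency_path(ort, name))
```

Loading from an absolute path like this sidesteps the default search order, which would otherwise start from the directory of node.exe or python.exe.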
Changming Sun	5d7030e4c6	Revert DML pipeline changes (#23135 ) ### Description Previously we wanted to add DirectML EP to existing onnxruntime Windows CUDA packages. After careful consideration, we will postpone the change. This PR reverts some pipeline changes previously made by @mszhanyi and @jchen351 .	2024-12-18 10:42:10 -08:00
Ankit Maheshkar	1f88284f96	OVEP 1.21.0 Development Updates (#23080 ) ### Description OVEP development changes for ORT 1.21 Release ### Motivation and Context - Has Critical Bug Fixes - Improved Performance optimizations for both memory & inference latency (https://github.com/intel/onnxruntime/pull/513) - Enabled Model Compilation using NPUW (https://github.com/intel/onnxruntime/pull/508) - Fixed support for EPContext embed mode 0 for lower memory utilization - Updated NuGet package name as `Intel.ML.OnnxRuntime.OpenVino` - Fixed QDQ Stripping logic on NPU	2024-12-11 22:26:32 -08:00
Yi Zhang	6ed77cc374	Deprecate macos-12 (#23017 ) ### Description <!-- Describe your changes. --> ### Motivation and Context ESRP code-sign task has supported .net 8, so we can remove macos-12	2024-12-05 14:07:21 +08:00
Jian Chen	f340b3cad3	Adding DML to python cuda package (#22606 )	2024-12-04 21:20:12 -05:00
Yulong Wang	a615bd6688	Bump version of Dawn to 12a3b24c4 (#23002 ) ### Description Upgrade version of Dawn. Removed dawn.patch, because all patches are included in upstream. Updated code that affected by API changes (`const char*` -> `WGPUStringView`) ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. -->	2024-12-04 09:47:16 -08:00
Jian Chen	9ed0c7fe26	Redo "Update Gradle version 8.7 and java version 17 within onnxruntime/java" (#22923 )	2024-12-02 18:34:25 -08:00
wejoncy	a24723df16	[CoreML ] ML Program more operators support [3/N] (#22710 ) ### Description - Erf - Round - Max - ReduceMax - ReduceMean - ReduceSum - Unsqueeze - Squeeze - Softmax ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> --------- Co-authored-by: Scott McKay <skottmckay@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-11-28 09:21:02 +08:00
Yi Zhang	b930b4ab5b	Limit PipAuthenticate in Private Project now (#22954 ) ### Description Fixes regression in post merge pipeline caused by #22612 ### Motivation and Context So far, the Public Project has no artifactFeeds	2024-11-27 13:32:35 +08:00
sheetalarkadam	f80afeb9a1	Override android qnn sdk version with pipeline param (#22895 ) We need to be able to control/override the exact version of the QNN SDK used for the Android build, because qnn-runtime (Maven package) releases lag behind QNN SDK releases.	2024-11-25 21:01:05 -08:00
Yi Zhang	85751e7276	Build DML in Windows GPU CI pipeline (#22869 ) ### Description Add a new stage to build CUDA and DML in the Windows GPU CI pipeline (PR checks) to prevent regressions introduced by new CUDA tests. Prefix the names of all tests in cuda/testcases with CudaEp so they can be skipped easily. ### Motivation and Context 1. CudaNhwcEP is added by default when using the CUDA EP. 2. If onnxruntime_ENABLE_CUDA_EP_INTERNAL_TES is enabled, the tests in tests/provider/cuda/testcases are added too. ### To do Add enable_pybind in the new stage. Currently, --enable_pybind triggers some Python tests, such as onnxruntime_test_python.py, which uses the get_available_providers() API. More discussion is needed to decide how to make it work.	2024-11-25 10:50:52 +08:00
kailums	1e605be166	bigmodel pipeline update cp38 to cp310 (#22793 ) ### Description When updating from cp38 to cp310, the bigmodel pipeline hit some issues: two jobs failed, stable_diffusion and whisper. 1. For stable_diffusion, we currently use "nvcr.io/nvidia/pytorch:22.11-py3" from the NVIDIA repo, which targets CUDA 11 and Python 3.8. NVIDIA does not provide a Python 3.10 version for CUDA 11; the latest version of this Docker image targets CUDA 12 and Python 3.10. To solve this, I use an ubuntu22.04 Docker image and install all the Python packages needed for this job. 2. For whisper, the original Docker image is ubuntu20.04, which doesn't have Python 3.10, so it has to be updated to ubuntu22.04.	2024-11-21 07:25:01 -08:00
Jian Chen	369d7bf887	Update the Docker image version (#22907 )	2024-11-21 19:38:39 +08:00
Yi Zhang	a28246a994	Revert "Update Gradle version 8.7 and java version 17 within onnxruntime/java (#22771)" (#22914 ) This reverts commit `632a36a233`. ### Motivation and Context Run E2E tests using Browserstack failed due to this PR.	2024-11-21 18:12:28 +08:00
Kyle	712bee13db	Fix Pipeline Timeout Issue (#22901 ) ### Description Extend the timeout for a job that always fails.	2024-11-20 17:18:50 +01:00
Changming Sun	13346fdf18	Cleanup code (#22827 ) ### Description 1. Delete TVM EP because it is no longer maintained 2. Delete ortmodule-related docker files and scripts.	2024-11-19 14:13:33 -08:00
Caroline Zhu	0d00fc3130	[mobile] Fix for mac-ios-packaging pipeline (#22879 ) ### Description Appends variant name to the Browserstack artifacts that are published so that we don't run into the error: "##[error]Artifact browserstack_test_artifacts already exists for build 609095." [Working pipeline run](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=609503&view=results) ### Motivation and Context - onnxruntime-ios-packaging-pipeline has been failing	2024-11-19 09:27:51 -08:00
Adrian Lizarraga	497b06f0a9	[QNN EP] QNN SDK 2.28.2 (#22844 ) ### Description - Updates pipelines to use QNN SDK 2.28.2.241116. - Re-enable LayerNormalization unit tests that failed with accuracy errors with the previous QNN SDK (2.28.0). - Update QNN EP to no longer provide a dummy bias for LayerNorm if the QNN SDK version is >= 2.28.0. ### Motivation and Context Use the latest QNN SDK. This version improves inference latency for certain customer models.	2024-11-18 20:10:36 -08:00

2192 commits