### Description
- Add new build flag in build.py to build onnxruntime.dll supporting
interfaces for all primary EPs( QNN, TensoRT, OpenVino, VitisAI).
- Modify onnxruntime.dll/onnxruntime_shared.dll build settings to remove
dependency of IHV SDK Toolset to be installed on the system.
- Change CMake variables to be explicit when building EP vs ORT. e.g.
onnxruntime_USE_TENSORRT vs onnxruntime_USE_TENSORRT_INTERFACE, to
evolve the build system to build ORT independent of EPs.
### Motivation and Context
Changes in the build system required to evolve the repo to build the
components independently while removing unnecessary dependencies
---------
Co-authored-by: Lei Cao <jslhcl@gmail.com>
Co-authored-by: Karim Vadsariya <kvadsariya@microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
### Description
This PR updates the version of Dawn to
`b9b4a37041dec3dd62ac92014a6cc1aece48d9f3` (ref:
[chromium](67f86f01dd/DEPS (399)))
in the `deps.txt` file.
The newer version of Dawn includes the previous changes from dawn.patch
so that we can remove the patch file.
There is a little interface changes and code is updated correspondingly.
### Description
Enable coremltools for Linux build. In order to do this, I did:
1. Add uuid-devel to the Linux images and regenerate them.
2. Patch the coremltools code a little bit to add some missing header
files.
### Motivation and Context
To make the code simpler. Later on I will create another PR to remove
the COREML_ENABLE_MLPROGRAM C/C++ macro.
Also, after this PR I will bring more changes to
onnxruntime_provider_coreml.cmake to make it work with vcpkg.
### Description
- Makes QNN EP a shared library **by default** when building with
`--use_qnn` or `--use_qnn shared_lib`. Generates the following build
artifacts:
- **Windows**: `onnxruntime_providers_qnn.dll` and
`onnxruntime_providers_shared.dll`
- **Linux**: `libonnxruntime_providers_qnn.so` and
`libonnxruntime_providers_shared.so`
- **Android**: Not supported. Must build QNN EP as a static library.
- Allows QNN EP to still be built as a static library with `--use_qnn
static_lib`. This is primarily for the Android QNN AAR package.
- Unit tests run for both the static and shared QNN EP builds.
### Detailed changes
- Updates Java bindings to support both shared and static QNN EP builds.
- Provider bridge API:
- Adds logging sink ETW to the provider bridge. Allows EPs to register
ETW callbacks for ORT logging.
- Adds a variety of methods for onnxruntime objects that are needed by
QNN EP.
- QNN EP:
- Adds `ort_api.h` and `ort_api.cc` that encapsulates the API provided
by ORT in a manner that allows the EP to be built as either a shared or
static library.
- Adds custom function to transpose weights for Conv and Gemm (instead
of adding util to provider bridge API).
- Adds custom function to quantize data for LeakyRelu (instead of adding
util to provider bridge API).
- Adds custom ETW tracing for QNN profiling events:
- shared library: defines its own TraceLogging provider handle
- static library: uses ORT's TraceLogging provider handle and existing
telemetry provider.
- ORT-QNN Packages:
- **Python**: Pipelines build QNN EP as a shared library by default.
User can build a local python wheel with QNN EP as a static library by
passing `--use_qnn static_lib`.
- **NuGet**: Pipelines build QNN EP as a shared library by default.
`build.py` currently enforces QNN EP to be built as a shared library.
Can add support for building a QNN NuGet package with static later if
deemed necessary.
- **Android**: Pipelines build QNN EP as a **static library**.
`build.py` enforces QNN EP to be built as a static library. Packaging
multiple shared libraries into an Android AAR package is not currently
supported due to the added need to also distribute a shared libcpp.so
library.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
Moving Android E2E test steps from Mac-OS13 to unbunt22.04
### Motivation and Context
Deduced the dependency on MacOS, which is deprecating the x64 version.
1. Update onnxruntime binary size checks ci pipeline's docker image. Use
a different docker image that is not manylinux based. The new one is
smaller.
2. Add flatbuffers tools/ci_build/requirements/pybind/requirements.txt
3. Delete
tools/ci_build/github/azure-pipelines/py-package-build-pipeline.yml. The
pipeline was for generating packages for Olive, but it went unused. And
the content is highly duplicated with our official python packaging
pipeline.
4. A lot of YAML files reference pypa/manylinux git repo but do not use
it. This PR removes the references.
Use ruff as the code formatter in place of black and isort since it is
much faster, and as projects like PyTorch and ONNX have adopted ruff
format as well.
This PR include only auto-fixed changes in formatting.
### Description
This PR allows WebGPU EP to be built with Emscripten for WebAssembly,
Including:
- cmake build files update to support correct setup for Emscripten.
- code changes to fix build breaks for wasm
- change in Web CI pipeline to add a build-only target for wasm with
`--use_webgpu`.
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.5.4 to 0.9.1.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/ruff/releases">ruff's
releases</a>.</em></p>
<blockquote>
<h2>0.9.1</h2>
<h2>Release Notes</h2>
<h3>Preview features</h3>
<ul>
<li>[<code>pycodestyle</code>] Run
<code>too-many-newlines-at-end-of-file</code> on each cell in notebooks
(<code>W391</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15308">#15308</a>)</li>
<li>[<code>ruff</code>] Omit diagnostic for shadowed private function
parameters in <code>used-dummy-variable</code> (<code>RUF052</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15376">#15376</a>)</li>
</ul>
<h3>Rule changes</h3>
<ul>
<li>[<code>flake8-bugbear</code>] Improve
<code>assert-raises-exception</code> message (<code>B017</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15389">#15389</a>)</li>
</ul>
<h3>Formatter</h3>
<ul>
<li>Preserve trailing end-of line comments for the last string literal
in implicitly concatenated strings (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15378">#15378</a>)</li>
</ul>
<h3>Server</h3>
<ul>
<li>Fix a bug where the server and client notebooks were out of sync
after reordering cells (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15398">#15398</a>)</li>
</ul>
<h3>Bug fixes</h3>
<ul>
<li>[<code>flake8-pie</code>] Correctly remove wrapping parentheses
(<code>PIE800</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15394">#15394</a>)</li>
<li>[<code>pyupgrade</code>] Handle comments and multiline expressions
correctly (<code>UP037</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15337">#15337</a>)</li>
</ul>
<h2>Contributors</h2>
<ul>
<li><a
href="https://github.com/AntoineD"><code>@AntoineD</code></a></li>
<li><a
href="https://github.com/InSyncWithFoo"><code>@InSyncWithFoo</code></a></li>
<li><a
href="https://github.com/MichaReiser"><code>@MichaReiser</code></a></li>
<li><a href="https://github.com/calumy"><code>@calumy</code></a></li>
<li><a
href="https://github.com/dcreager"><code>@dcreager</code></a></li>
<li><a
href="https://github.com/dhruvmanila"><code>@dhruvmanila</code></a></li>
<li><a href="https://github.com/dylwil3"><code>@dylwil3</code></a></li>
<li><a href="https://github.com/sharkdp"><code>@sharkdp</code></a></li>
<li><a href="https://github.com/tjkuson"><code>@tjkuson</code></a></li>
</ul>
<h2>Install ruff 0.9.1</h2>
<h3>Install prebuilt binaries via shell script</h3>
<pre lang="sh"><code>curl --proto '=https' --tlsv1.2 -LsSf
https://github.com/astral-sh/ruff/releases/download/0.9.1/ruff-installer.sh
| sh
</code></pre>
<h3>Install prebuilt binaries via powershell script</h3>
<pre lang="sh"><code>powershell -ExecutionPolicy ByPass -c "irm
https://github.com/astral-sh/ruff/releases/download/0.9.1/ruff-installer.ps1
| iex"
</code></pre>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md">ruff's
changelog</a>.</em></p>
<blockquote>
<h2>0.9.1</h2>
<h3>Preview features</h3>
<ul>
<li>[<code>pycodestyle</code>] Run
<code>too-many-newlines-at-end-of-file</code> on each cell in notebooks
(<code>W391</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15308">#15308</a>)</li>
<li>[<code>ruff</code>] Omit diagnostic for shadowed private function
parameters in <code>used-dummy-variable</code> (<code>RUF052</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15376">#15376</a>)</li>
</ul>
<h3>Rule changes</h3>
<ul>
<li>[<code>flake8-bugbear</code>] Improve
<code>assert-raises-exception</code> message (<code>B017</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15389">#15389</a>)</li>
</ul>
<h3>Formatter</h3>
<ul>
<li>Preserve trailing end-of line comments for the last string literal
in implicitly concatenated strings (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15378">#15378</a>)</li>
</ul>
<h3>Server</h3>
<ul>
<li>Fix a bug where the server and client notebooks were out of sync
after reordering cells (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15398">#15398</a>)</li>
</ul>
<h3>Bug fixes</h3>
<ul>
<li>[<code>flake8-pie</code>] Correctly remove wrapping parentheses
(<code>PIE800</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15394">#15394</a>)</li>
<li>[<code>pyupgrade</code>] Handle comments and multiline expressions
correctly (<code>UP037</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/pull/15337">#15337</a>)</li>
</ul>
<h2>0.9.0</h2>
<p>Check out the <a href="https://astral.sh/blog/ruff-v0.9.0">blog
post</a> for a migration guide and overview of the changes!</p>
<h3>Breaking changes</h3>
<p>Ruff now formats your code according to the 2025 style guide. As a
result, your code might now get formatted differently. See the formatter
section for a detailed list of changes.</p>
<p>This release doesn’t remove or remap any existing stable rules.</p>
<h3>Stabilization</h3>
<p>The following rules have been stabilized and are no longer in
preview:</p>
<ul>
<li><a
href="https://docs.astral.sh/ruff/rules/stdlib-module-shadowing/"><code>stdlib-module-shadowing</code></a>
(<code>A005</code>).
This rule has also been renamed: previously, it was called
<code>builtin-module-shadowing</code>.</li>
<li><a
href="https://docs.astral.sh/ruff/rules/builtin-lambda-argument-shadowing/"><code>builtin-lambda-argument-shadowing</code></a>
(<code>A006</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/slice-to-remove-prefix-or-suffix/"><code>slice-to-remove-prefix-or-suffix</code></a>
(<code>FURB188</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/boolean-chained-comparison/"><code>boolean-chained-comparison</code></a>
(<code>PLR1716</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/decimal-from-float-literal/"><code>decimal-from-float-literal</code></a>
(<code>RUF032</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/post-init-default/"><code>post-init-default</code></a>
(<code>RUF033</code>)</li>
<li><a
href="https://docs.astral.sh/ruff/rules/useless-if-else/"><code>useless-if-else</code></a>
(<code>RUF034</code>)</li>
</ul>
<p>The following behaviors have been stabilized:</p>
<ul>
<li><a
href="https://docs.astral.sh/ruff/rules/pytest-parametrize-names-wrong-type/"><code>pytest-parametrize-names-wrong-type</code></a>
(<code>PT006</code>): Detect <a
href="https://docs.pytest.org/en/7.1.x/how-to/parametrize.html#parametrize"><code>pytest.parametrize</code></a>
calls outside decorators and calls with keyword arguments.</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="12f86f39a4"><code>12f86f3</code></a>
Ruff 0.9.1 (<a
href="https://redirect.github.com/astral-sh/ruff/issues/15407">#15407</a>)</li>
<li><a
href="2b28d566a4"><code>2b28d56</code></a>
Associate a trailing end-of-line comment in a parenthesized implicit
concaten...</li>
<li><a
href="adca7bd95c"><code>adca7bd</code></a>
Remove pygments pin (<a
href="https://redirect.github.com/astral-sh/ruff/issues/15404">#15404</a>)</li>
<li><a
href="6b98a26452"><code>6b98a26</code></a>
[red-knot] Support <code>assert_type</code> (<a
href="https://redirect.github.com/astral-sh/ruff/issues/15194">#15194</a>)</li>
<li><a
href="c87463842a"><code>c874638</code></a>
[red-knot] Move tuple-containing-Never tests to Markdown (<a
href="https://redirect.github.com/astral-sh/ruff/issues/15402">#15402</a>)</li>
<li><a
href="c364b586f9"><code>c364b58</code></a>
[<code>flake8-pie</code>] Correctly remove wrapping parentheses
(<code>PIE800</code>) (<a
href="https://redirect.github.com/astral-sh/ruff/issues/15394">#15394</a>)</li>
<li><a
href="73d424ee5e"><code>73d424e</code></a>
Fix outdated doc for handling the default file types with the pre-commit
hook...</li>
<li><a
href="6e9ff445fd"><code>6e9ff44</code></a>
Insert the cells from the <code>start</code> position (<a
href="https://redirect.github.com/astral-sh/ruff/issues/15398">#15398</a>)</li>
<li><a
href="f2c3ddc5ea"><code>f2c3ddc</code></a>
[red-knot] Move intersection type tests to Markdown (<a
href="https://redirect.github.com/astral-sh/ruff/issues/15396">#15396</a>)</li>
<li><a
href="b861551b6a"><code>b861551</code></a>
Remove unnecessary backticks (<a
href="https://redirect.github.com/astral-sh/ruff/issues/15393">#15393</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/astral-sh/ruff/compare/0.5.4...0.9.1">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
### Description
Update xnnpack to remove the dependency on psimd and fp16 libraries.
However, coremltool still depends on them, which will be addressed
later.
Also, update CPUINFO because the latest xnnpack requires CPUINFO's avx10
support.
### Motivation and Context
The fewer dependencies the better.
### Description
<!-- Describe your changes. -->
* Remove deprecated gpu arch to control nuget/python package size
(latest TRT supports sm75 Turing and newer arch)
* Add 90 to support blackwell series in next release (86;89 not
considered as adding them will rapidly increase package size)
| arch_range | Python-cuda12 | Nuget-cuda12 |
| -------------- |
------------------------------------------------------------ |
---------------------------------- |
| 60;61;70;75;80 | Linux: 279MB Win: 267MB | Linux: 247MB Win: 235MB |
| 75;80 | Linux: 174MB Win: 162MB | Linux: 168MB Win: 156MB |
| **75;80;90** | **Linux: 299MB Win: 277MB** | **Linux: 294MB Win:
271MB** |
| 75;80;86;89 | [Linux: MB Win:
390MB](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=647457&view=results)
| Linux: 416MB Win: 383MB |
| 75;80;86;89;90 | [Linux: MB Win:
505MB](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=646536&view=results)
| Linux: 541MB Win: 498MB |
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Callout: While adding sm90 support, the build of cuda11.8+cudnn8 will be
dropped in the coming ORT release,
as the build has issue with blackwell (mentioned in comments) and demand
on cuda 11 is minor, according to internal ort-cuda11 repo.
Fix some inconsistency.
All our iOS build should target iOS 15.1.
All our macOS desktop build should target macOS 13.3 to align with the
changes made in #17361
The new images contain the following updates:
1. Added Git, Ninja and VCPKG to all docker images
2. Updated CPU containers' GCC version from 12 to 14
3. Pinned CUDA 12 images' CUDNN version to 9.5(The latest one is 9.6)
4. Addressed container supply chain warnings by building CUDA 12 images
from scratch(avoid using Nvidia's prebuilt images)
5. Updated manylinux commit id to
75aeda9d18eafb323b00620537c8b4097d4bef48
Also, this PR updated some source code to make the CPU EP's source code
compatible with GCC 14.
### Description
Add a temporary path to RN 0.69.3 to update the boost url
### Motivation and Context
Fix the React-native CI until we update the RN to 0.70.15 or 0.73.3+
versions
### Description
1. Currently Python-Cuda-Publishing-Pipeline only publishes Linux
wheels, not Windows wheels. It is because recently we refactored the
upstream pipeline("Python-CUDA-Packaging-Pipeline") to use 1ES PT. This
PR fixed the issue
2. tools/ci_build/github/azure-pipelines/stages/py-win-gpu-stage.yml no
longer includes component-governance-component-detection-steps.yml ,
because 1ES PT already inserted such a thing
3. Delete tools/ci_build/github/windows/eager/requirements.txt because
it is no longer used.
### Motivation and Context
The "Python-CUDA-Packaging-Pipeline" is for CUDA 12.
"Python CUDA ALT Packaging Pipeline" is for CUDA 11.
The two pipelines are very similar, except the CUDA versions are
different.
Each of them has three parts: build, test, publish.
"Python-CUDA-Packaging-Pipeline" is the first part: build.
"Python CUDA12 Package Test Pipeline" is the second part.
"Python-Cuda-Publishing-Pipeline" is the third part that publishes the
packages to an internal ADO feed.
Move Linux GPU CI pipeline to A10 machines which are more advanced.
Retire onnxruntime-Linux-GPU-T4 machine pool.
Disable run_lean_attention test because the new machines do not have
enough shared memory.
```
skip loading trt attention kernel fmha_mhca_fp16_128_256_sm86_kernel because no enough shared memory
[E:onnxruntime:, sequential_executor.cc:505 ExecuteKernel] Non-zero status code returned while running MultiHeadAttention node. Name:'MultiHeadAttention_0' Status Message: CUDA error cudaErrorInvalidValue:invalid argument
```
### Description
The Web CI pipeline uses three different Windows machine pools:
1. onnxruntime-Win2022-webgpu-A10
2. onnxruntime-Win2022-VS2022-webgpu-A10
3. onnxruntime-Win-CPU-2022-web
This PR merges them together to reduce ongoing maintenance cost.
Remove PostBuildCleanup tasks since it is deprecated. It is to address a
warning in our pipelines:
"Task 'Post Build Cleanup' version 3 (PostBuildCleanup@3) is dependent
on a Node version (6) that is end-of-life. Contact the extension owner
for an updated version of the task. Task maintainers should review Node
upgrade guidance: https://aka.ms/node-runner-guidance"
Now the cleanup is controlled in another place:
https://learn.microsoft.com/en-us/azure/devops/pipelines/yaml-schema/workspace?view=azure-pipelines
The code change was generated by the following Linux command:
```bash
find . -name \*.yml -exec sed -i '/PostBuildCleanup/,+2d' {} \;
```
### Description
<!-- Describe your changes. -->
Update CIs to TRT10.7
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
This change fixes the DLL delay load problem for the WebGPU EP and
DirectML EP. See detailed explanation below.
### Problem
When onnxruntime.dll uses delay loading for its dependencies, the
dependencies are loaded using `LoadLibraryEx()`, which search the
directory of process (.exe) instead of this library (onnxruntime.dll).
This is a problem for usages of Node.js binding and python binding,
because Windows will try to find the dependencies in the directory of
node.exe or python.exe, which is not the directory of onnxruntime.dll.
There was previous attempt to fix this by loading DirectML.dll in the
initialization of onnxruntime nodejs binding, which works for DML EP but
is not a good solution because it does not really "delay" the load.
For WebGPU, the situation became worse because webgpu_dawn.dll depends
on dxil.dll and dxcompiler.dll, which are explicitly dynamically loaded
in the code using `LoadLibraryA()`. This has the same problem of the DLL
search.
### Solutions
For onnxruntime.dll loading its direct dependencies, it can be resolved
by set the [`__pfnDliNotifyHook2`
hook](https://learn.microsoft.com/en-us/cpp/build/reference/understanding-the-helper-function?view=msvc-170#structure-and-constant-definitions)
to load from an absolute path that constructed from the onnxruntime.dll
folder and the DLL name.
For webgpu_dawn.dll loading dxil.dll and dxcompiler.dll, since they are
explicitly loaded in the code, the hook does not work. Instead, it can
be resolved by ~~using WIN32 API `SetDllDirectory()` to add the
onnxruntime.dll folder to the search path.~~ preloading the 2 DLLs from
the onnxruntime.dll folder .
### Description
Previously we wanted to add DirectML EP to existing onnxruntime Windows
CUDA packages. After careful consideration, we will postpone the change.
This PR reverts some pipeline changes previously made by @mszhanyi and
@jchen351 .
### Description
OVEP development changes for ORT 1.21 Release
### Motivation and Context
- Has Critical Bug Fixes
- Improved Performance optimizations for both memory & inference latency
(https://github.com/intel/onnxruntime/pull/513)
- Enabled Model Compilation using NPUW
(https://github.com/intel/onnxruntime/pull/508)
- Fixed support for EPContext embed mode 0 for lower memory utilization
- Updated NuGet package name as `Intel.ML.OnnxRuntime.OpenVino`
- Fixed QDQ Stripping logic on NPU
### Description
Upgrade version of Dawn.
Removed dawn.patch, because all patches are included in upstream.
Updated code that affected by API changes (`const char*` ->
`WGPUStringView`)
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
- Erf
- Round
- Max
- ReduceMax
- ReduceMean
- ReduceSum
- Unsqueeze
- Squeeze
- Softmax
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
---------
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
### Description
Fixes regression in post merge pipeline caused by #22612
### Motivation and Context
So far, there isn't the artifactFeeds in Public Project
We need to be able to control/override the exact version of qnn sdk used
for the android build as qnn-runtime (maven package) releases are slower
to QNN SDK releases.
### Description
Add a new stage to build cuda and dml in Windows GPU CI pipeline (PR
checks) to prevent regressions introduced by new cuda tests.
Update all tests in cuda/testcases name prefix to CudaEp for skipping
them easily
### Motivation and Context
1. CudaNhwcEP is added by default when using cuda ep
2. if onnxruntime_ENABLE_CUDA_EP_INTERNAL_TES is enable, the tests in
tests/provider/cuda/testcases is added too.
### To do
add enable_pybind in the new stage.
Now, --enable_pybind will trigger some python test, like
onnxruntime_test_python.py.
It uses the API of get_avaible_providers() .
More discussions are needed to decide how to make it works
### Description
<!-- Describe your changes. -->
when updating from cp38 to cp310, there has some issues for bigmodel
pipeine. there are two jobs failed: stable_diffusion and whisper.
1. for stable_diffusion, we are now using
"nvcr.io/nvidia/pytorch:22.11-py3" from nvidia repo. it is for cuda11
and python3.8. and they are not providing python3.10 version for cuda
11. the latest version of this docker image is for cuda12 and
python3.10. To solve this problem, i use a docker image of ubuntu22.04,
and then install all need python package for this job.
2. for whisper. the original docker image is ubuntu20.04 which doesn't
have python3.10, and has to update to ubuntu22.04.
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
…ime/java (#22771)"
This reverts commit 632a36a233.
### Description
<!-- Describe your changes. -->
### Motivation and Context
Run E2E tests using Browserstack failed due to this PR.
### Description
<!-- Describe your changes. -->
Extend timeout for always failed job.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
Appends variant name to the Browserstack artifacts that are published so
that we don't run into the error:
"##[error]Artifact browserstack_test_artifacts already exists for build
609095."
[Working pipeline
run](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=609503&view=results)
### Motivation and Context
- onnxruntime-ios-packaging-pipeline has been failing
### Description
- Updates pipelines to use QNN SDK 2.28.2.241116.
- Re-enable LayerNormalization unit tests that failed with accuracy
errors with the previous QNN SDK (2.28.0).
- Update QNN EP to no longer provide a dummy bias for LayerNorm if the
QNN SDK version is >= 2.28.0.
### Motivation and Context
Use the latest QNN SDK. This version improves inference latency for
certain customer models.