Approved cherry picks for ORT 1.19.2 release.
---------
Co-authored-by: Yi Zhang <zhanyi@microsoft.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Ye Wang <52801275+wangyems@users.noreply.github.com>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
Co-authored-by: aciddelgado <139922440+aciddelgado@users.noreply.github.com>
Co-authored-by: mindest <30493312+mindest@users.noreply.github.com>
Co-authored-by: Changming Sun <chasun@microsoft.com>
### Description
<!-- Describe your changes. -->
PRs marked for cherry-pick & bug fixes.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
ORT 1.19.0 Release Preparation
---------
Signed-off-by: Liqun Fu <liqfu@microsoft.com>
Co-authored-by: George Wu <jywu@microsoft.com>
Co-authored-by: liqun Fu <liqfu@microsoft.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: Yi Zhang <zhanyi@microsoft.com>
### Description
<!-- Describe your changes. -->
PRs marked for cherry-pick.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
ORT 1.19.0 Release Preparation
---------
Signed-off-by: Liqun Fu <liqfu@microsoft.com>
Signed-off-by: liqunfu <liqun.fu@microsoft.com>
Signed-off-by: Liqun Fu <liqun_fu@hotmail.com>
Co-authored-by: liqun Fu <liqfu@microsoft.com>
Co-authored-by: Jing Fang <126209182+fajin-corp@users.noreply.github.com>
Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>
Co-authored-by: Changming Sun <chasun@microsoft.com>
Co-authored-by: Sumit Agarwal <sumitagarwal330@gmail.com>
Co-authored-by: vraspar <vrajang@outlook.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Yi Zhang <zhanyi@microsoft.com>
Co-authored-by: jingyanwangms <47403504+jingyanwangms@users.noreply.github.com>
Co-authored-by: Yi Zhang <your@email.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
Co-authored-by: saurabh <saurabh1.kale@intel.com>
Co-authored-by: sfatimar <sahar.fatima@intel.com>
### Description
<!-- Describe your changes. -->
Critical changes required for an external developer (GeekBench)
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
ORT 1.19.0 Release Preparation
---------
Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>
### Description
Add support for Split Op
### Motivation and Context
Address operator gaps in high priority model.
---------
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
### Description
<!-- Describe your changes. -->
Update TRT OSS Parser to [latest 10.2-GA
branch](f161f95883)
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
This change adds a check for whether the scale in the QuantizeLinear (or
DequantizeLinear) is a positive scalar, and a new selector to disallow
removing the QDQ around MaxPool if it is not.
### Motivation and Context
Currently, the DropQDQNodesRules optimization removes QuantizeLinear and
DequantizeLinear nodes from DequantizeLinear ∘ MaxPool ∘ QuantizeLinear.
However, if the x_scale/y_scale values are non-positive, the
(de-)quantization changes the ordering of the elements in the input
value, so this optimization changes the results.
https://github.com/microsoft/onnxruntime/issues/21176
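To illustrate why the check matters, here is a minimal numpy sketch (illustrative only, with made-up values; MaxPool over a single window is reduced to a plain max, and the real check lives in the C++ QDQ selectors):

```python
import numpy as np

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

def quantize(x, scale, zero_point):
    return np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)

q = np.array([78, 98, 118], dtype=np.uint8)   # quantized input to the DQ -> MaxPool -> Q pattern
scale, zp = -0.1, 128                          # hypothetical non-positive scale

# QDQ kept: DQ -> MaxPool (in float) -> Q
kept = quantize(np.max(dequantize(q, scale, zp)), scale, zp)
# QDQ dropped: MaxPool runs directly on the quantized data
dropped = np.max(q)

print(kept, dropped)  # 78 vs 118 -- the negative scale reverses the ordering, so the results differ
```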
---------
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
The current labeling bot over-applies many of the labels (e.g., ep:CUDA and
platform:windows) and is missing some of the APIs + EPs.
We are working on migrating this workflow to GitHub policies, but would like to
use this fix in the meantime to avoid causing any issues with ORT 1.19.
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
Propagates NaN values in the min and max operators so that min or max
with a NaN in either input always produces NaN.
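For illustration, a small numpy sketch of the intended element-wise semantics (numpy's `np.minimum`/`np.maximum` already propagate NaN the same way; this is not the kernel code itself):

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0], dtype=np.float32)
b = np.array([2.0, 2.0, np.nan], dtype=np.float32)

# Expected behavior after this change: a NaN in either input yields NaN in the output.
print(np.minimum(a, b))  # [ 1. nan nan]
print(np.maximum(a, b))  # [ 2. nan nan]
```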
### Motivation and Context
Fixes #21455
### Description
Follow up PR for bug fixes on 1.19
### Motivation and Context
- Handles 1.19 Dockerfile fixes.
- Sets the default file naming of the epctx ONNX model to use the _ctx.onnx suffix.
- Creates epctx model directories if they don't exist.
---------
Co-authored-by: jatinwadhwa921 <110383850+jatinwadhwa921@users.noreply.github.com>
### Description
This PR adds a new option `ort.env.wasm.wasmBinary`, which allows users
to set a buffer containing the preloaded .wasm file content.
This PR should resolve the problem from the latest discussion in #20876.
Bug: https://github.com/microsoft/onnxruntime/issues/21467
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
Since the on-device training CPU packaging has been split into a separate
pipeline, its NuGet package publishing step must be moved as well.
### Motivation and Context
Fixes the exception in Nuget Publishing Packaging Pipeline caused by
#21485
### Description
Functionality extension for the SetOutputShape method in custom op shape inference.
### Motivation and Context
- **SetOutputShape** interface enhancement: the shape inference function needs to set both the tensor type and the shape. Add a parameter **type** to allow users to specify the tensor type, with **ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT** as the default value to ensure compatibility.
Co-authored-by: mingyue <mingyue@amd.com>
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
Delete tools/ci_build/github/azure-pipelines/win-gpu-ci-pipeline.yml
### Motivation and Context
This CI pipeline has been divided into 4 different pipelines.
### Description
The local CI setup for AIX reported test failures after the gtest 1.15.0
upgrade.
### Motivation and Context
The following test failures were observed after the gtest upgrade:
The following tests FAILED:
1 - onnxruntime_test_all (ILLEGAL)
7 - onnxruntime_logging_apis_test (Subprocess aborted)
To fix this, I am enabling pthread support under gtest. This was
disabled with the previous version of gtest for some reason.
With pthread support enabled, the above tests pass with gtest 1.15.0.
Bumps [Sixlabors.ImageSharp](https://github.com/SixLabors/ImageSharp)
from 2.1.8 to 2.1.9.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/SixLabors/ImageSharp/releases">Sixlabors.ImageSharp's
releases</a>.</em></p>
<blockquote>
<h2>v2.1.9</h2>
<h2>What's Changed</h2>
<ul>
<li>[2.1] Fix overflow in MemoryAllocator.Create(options) by <a
href="https://github.com/antonfirsov"><code>@antonfirsov</code></a> in
<a
href="https://redirect.github.com/SixLabors/ImageSharp/pull/2732">SixLabors/ImageSharp#2732</a></li>
<li>Backport GIF LZW fix to 2.1 by <a
href="https://github.com/antonfirsov"><code>@antonfirsov</code></a> in
<a
href="https://redirect.github.com/SixLabors/ImageSharp/pull/2756">SixLabors/ImageSharp#2756</a></li>
<li>Backport 2759 to 2.1.x by <a
href="https://github.com/antonfirsov"><code>@antonfirsov</code></a> in
<a
href="https://redirect.github.com/SixLabors/ImageSharp/pull/2770">SixLabors/ImageSharp#2770</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/SixLabors/ImageSharp/compare/v2.1.8...v2.1.9">https://github.com/SixLabors/ImageSharp/compare/v2.1.8...v2.1.9</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="9816ca4501"><code>9816ca4</code></a>
Merge pull request <a
href="https://redirect.github.com/SixLabors/ImageSharp/issues/2770">#2770</a>
from SixLabors/af/backport-2759-2.1.x</li>
<li><a
href="b33d666ab7"><code>b33d666</code></a>
handle DecodingMode</li>
<li><a
href="6b2030b549"><code>6b2030b</code></a>
Merge branch 'release/2.1.x' into af/backport-2759-2.1.x</li>
<li><a
href="8ffad3f480"><code>8ffad3f</code></a>
Issue2012BadMinCode should decode now</li>
<li><a
href="1f5bf23b9e"><code>1f5bf23</code></a>
skip Issue2758_DecodeWorks</li>
<li><a
href="3bf8c572a0"><code>3bf8c57</code></a>
manual port of 3.1 gif decoder</li>
<li><a
href="28c20ded87"><code>28c20de</code></a>
Clamp JPEG quality estimation results.</li>
<li><a
href="4b910e7f84"><code>4b910e7</code></a>
Decode LZW row by row</li>
<li><a
href="a1f2879771"><code>a1f2879</code></a>
Merge pull request <a
href="https://redirect.github.com/SixLabors/ImageSharp/issues/2756">#2756</a>
from SixLabors/af/git-av-2.1</li>
<li><a
href="898df7f8ca"><code>898df7f</code></a>
backport <a
href="https://redirect.github.com/SixLabors/ImageSharp/issues/2749">#2749</a>
to 2.1</li>
<li>Additional commits viewable in <a
href="https://github.com/SixLabors/ImageSharp/compare/v2.1.8...v2.1.9">compare
view</a></li>
</ul>
</details>
<br />
[Dependabot compatibility score](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/microsoft/onnxruntime/network/alerts).
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
The change in #21005 works for directly building wheels with `build.py`,
but ort-nightly-directml wheels, as well as the 1.18.1 release of the
onnxruntime-directml python wheel, still do not work with conda since
they're built from the `py-win-gpu.yml` pipeline, which uses
`install_third_party_deps.ps1` to set compile flags.
### Description
<!-- Describe your changes. -->
1. We decided to move the context node creation back to our own repo because it is more flexible to modify.
2. We found a bug related to the context node: it would change the inference order, so we fixed it in this PR as well.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
This is crucial for the Microsoft release next month.
---------
Co-authored-by: Yueqing Zhang <yueqingz@amd.com>
### Description
<!-- Describe your changes. -->
[VitisAI] 1. KernelDef supports StartVersion and EndVersion
2. CapabilityOps checks domain
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Co-authored-by: Zhenze Wang <zhenzew@xilinx.com>
### Description
<!-- Describe your changes. -->
Set version and other info in the Microsoft.ML.OnnxRuntime C# dll by
setting GenerateAssemblyInfo to true and passing in ORT version in the
CI.
Minor re-org of the order of properties so related things are grouped a
little better.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
#21475
### Description
Add QNN EP option context_node_name_prefix to set EPContext node name prefix
### Motivation and Context
To work around the QNN context PD memory limit, users need to split the model into pieces and generate the QNN context model for each piece separately. The EPContext nodes generated in the separate graphs may end up with the same node name, which causes issues when gluing those EPContext nodes together into a single model.
To avoid this, users can set context_node_name_prefix for each split piece to make the node names unique.
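As a rough sketch of how this could look from Python, assuming the option is exposed as the session config entry `ep.context_node_name_prefix` next to the existing `ep.context_enable` (the model file names and prefixes below are hypothetical):

```python
import onnxruntime as ort

def make_context_model(model_path: str, prefix: str) -> None:
    so = ort.SessionOptions()
    # Dump an EPContext model for this piece.
    so.add_session_config_entry("ep.context_enable", "1")
    # Assumed new option: prefix EPContext node names so the pieces stay unique.
    so.add_session_config_entry("ep.context_node_name_prefix", prefix)
    ort.InferenceSession(
        model_path,
        sess_options=so,
        # backend_path assumes the Windows HTP backend library.
        providers=[("QNNExecutionProvider", {"backend_path": "QnnHtp.dll"})],
    )

# Hypothetical split pieces of the original model.
make_context_model("part1.onnx", "piece1_")
make_context_model("part2.onnx", "piece2_")
```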
### Description
<!-- Describe your changes. -->
`enable_windows_arm64_qnn` and `enable_windows_x64_qnn` are true by
default but unnecessary for training. This change explicitly sets these
parameters to false for the training pipeline.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
ORT 1.19 Release Preparation
The WebNN spec recently changed the definition of argMax/argMin:
- Removed the selectLastIndex option, letting backends decide whether to select the last
index.
- Moved the axes option to an axis input.
### Description
This PR registers the ReduceMin-20 operator to the DML EP.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
Current behavior forces all L2 optimizers to loop until they hit the max
number of iterations.
Only update `modified` if the graph was actually modified.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Fix unnecessary loops of L2 optimizers during model loading.
### Description
<!-- Describe your changes. -->
Add these changes to one PR to simplify checkin
- Add Concat (#21423)
- Add DepthToSpace (#21426)
- Add LeakyRelu (#21453)
- Add test scripts (#21427)
- Add ability to set coreml flags from python (#21434)
Other changes:
- Updated partitioning utils to support dropping constant initializers
from a ComputeCapability's inputs.
- Noticed that the list of inputs to the CoreML model was unexpectedly
long because of this.
- We copy constant initializers into the CoreML model, so we don't need the
originals; if they remain as inputs, ORT can't free them because they
appear to be in use.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
Current failure is due to a version mismatch.
Use llvm-cov from the Android NDK instead of the system gcov so that the
version is correct.
Also comment out publishing to the Azure dashboard to simplify the
setup. The CI prints out the stats for review by developers.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Fix CI pipeline
### Description
Right now our "Zip-Nuget-Java-Nodejs Packaging Pipeline" is too big.
This OnDevice training part is independent of the others, so it can be
split out. Then our NPM Packaging pipeline will no longer depend on this
training stuff.
### Motivation and Context
Similar to #21235
Also, this PR fixes a problem: the "NuGet_Test_Linux_Training_CPU" job
downloads artifacts from "onnxruntime-linux-x64" to get the custom op
shared libs, but the job forgot to declare that it depends on
"Linux_C_API_Packaging_CPU_x64", which produces that artifact. Such
problems can be hard to find when a pipeline grows large.
### Description
* Swap CUDA versions 11.8/12.2 in GPU CIs
* Set CUDA 12 as the default version in the YAMLs for publishing NuGet/Python/Java
GPU packages
* Suppress warnings-as-errors for flash_api.cc during the ORT Windows build
Updating the Performance issue template so the "performance" label is
automatically applied.
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
### Description
<!-- Describe your changes. -->
We found that the text format could cause errors.
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Because the OS could change the string, we decided to save it as a
binary file.
### Description
Add OVEP features for 1.19
The PR includes:
- Added support for EpCtx with ORT session options for optimized
performance (see the sketch below).
- Bug fixes.
- Support for OV 2024.3.
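A hedged sketch of what enabling EpCtx through ORT session options might look like from Python, assuming the generic `ep.context_*` session config entries are used with the OpenVINO EP; the file names are examples only:

```python
import onnxruntime as ort

so = ort.SessionOptions()
so.add_session_config_entry("ep.context_enable", "1")       # dump an EPContext model
so.add_session_config_entry("ep.context_embed_mode", "1")   # embed the compiled blob in the node
so.add_session_config_entry("ep.context_file_path", "model_ctx.onnx")  # example output path

sess = ort.InferenceSession(
    "model.onnx",  # example input model
    sess_options=so,
    providers=["OpenVINOExecutionProvider"],
)
```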
---------
Co-authored-by: ubuntu <ubuntu@ubuntu-mtlp-118727.iind.intel.com>
Co-authored-by: vthaniel <vishnudas.thaniel.s@intel.com>
Co-authored-by: sfatimar <sahar.fatima@intel.com>
Co-authored-by: saurabhkale17 <saurabh1.kale@intel.com>
Co-authored-by: Maheshkar <ankit.maheshkar@intel.com>
### Description
<!-- Describe your changes. -->
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->