Commit graph

2137 commits

Author SHA1 Message Date
Caroline Zhu
b7f09d4c27
Increase timeout for orttraining-linux-gpu pipeline (#21844)
### Description
Increase timeout to 160 minutes

### Motivation and Context
- Recent runs of the orttraining-linux-gpu pipeline have been timing out
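In Azure Pipelines YAML, a timeout bump like this is typically a one-line setting; a hypothetical fragment (the job name is illustrative, not the actual pipeline definition):

```yaml
jobs:
  - job: orttraining_linux_gpu   # illustrative job name
    timeoutInMinutes: 160        # raised per this change
```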
2024-08-27 11:47:12 -07:00
Jian Chen
7f851f4e61
Removing docker_base_image parameter and variables (#21864)
### Description
Remove the `docker_base_image` parameter and variables from the CUDA
packaging pipeline.



### Motivation and Context
Since the docker image is hard-coded in

`onnxruntime/tools/ci_build/github/linux/docker/inference/x86_64/default/cuda12/Dockerfile`
and

`onnxruntime/tools/ci_build/github/linux/docker/inference/x86_64/default/cuda11/Dockerfile`
the parameter and variables are no longer needed.
2024-08-27 10:36:17 -07:00
Yi Zhang
2877de73e1
sign native dll with correct cert (#21854)
### Description
Fixed #21775



### Motivation and Context
The DLLs should be signed with keycode CP-230012;
the default is the test code-signing certificate.
2024-08-26 16:46:19 +08:00
Caroline Zhu
983c4d57a4
Fix typo for react native pipeline (#21845)
### Description
fix typo

### Motivation and Context
[RN pipeline
failing](https://dev.azure.com/onnxruntime/onnxruntime/_build?definitionId=188&_a=summary)
since #21578 with this error:

![image](https://github.com/user-attachments/assets/75e5b968-572f-42cc-9816-7940de464cfa)
2024-08-26 12:05:11 +10:00
Guenther Schmuelling
ba7baae994
Revert "Upgrade emsdk from 3.1.59 to 3.1.62" (#21817)
Reverts microsoft/onnxruntime#21421

Users are seeing chrome memory grow to 16GB before it crashes:
https://github.com/microsoft/onnxruntime/issues/21810

Revert for now so we have time to debug.
2024-08-22 11:21:00 -07:00
Jian Chen
6c1a3f85a6
Do not allow clearing Android logs if the emulator is not running (#21578)
### Description
Do not allow clearing Android logs if the emulator is not running



### Motivation and Context
Previously, if one of the earlier steps failed, the "Clearing Android
logs" step would hang until the pipeline timed out.
2024-08-22 10:18:01 -07:00
Yi Zhang
12f426c63f
update size limit check of training GPU wheel (#21762)
### Description



### Motivation and Context
The training wheel size limit should be 400 MB.
2024-08-21 09:30:05 +08:00
Tianlei Wu
7c93d5ded1
Upgrade pytorch_lightning to 2.3.3 to fix orttraining_amd_gpu_ci_pipeline (#21789)
### Description
Upgrade pytorch_lightning to fix orttraining_amd_gpu_ci_pipeline
```
#24 1.838 WARNING: Ignoring version 1.6.0 of pytorch_lightning since it has invalid metadata:
#24 1.838 Requested pytorch_lightning==1.6.0 from cee67f4849/pytorch_lightning-1.6.0-py3-none-any.whl has invalid metadata: .* suffix can only be used with `==` or `!=` operators
#24 1.838     torch (>=1.8.*)
#24 1.838            ~~~~~~^
#24 1.838 Please use pip<24.1 if you need to use this version.
#24 1.838 ERROR: Ignored the following versions that require a different python version: 1.14.0 Requires-Python >=3.10; 1.14.0rc1 Requires-Python >=3.10; 1.14.0rc2 Requires-Python >=3.10; 2.1.0 Requires-Python >=3.10; 2.1.0rc1 Requires-Python >=3.10
#24 1.838 ERROR: Could not find a version that satisfies the requirement pytorch_lightning==1.6.0 (from versions: 0.0.2, 0.2, 0.2.2, 0.2.3, 0.2.4, 0.2.4.1, 0.2.5, 0.2.5.1, 0.2.5.2, 0.2.6, 0.3, 0.3.1, 0.3.2, 0.3.3, 0.3.4, 0.3.4.1, 0.3.5, 0.3.6, 0.3.6.1, 0.3.6.3, 0.3.6.4, 0.3.6.5, 0.3.6.6, 0.3.6.7, 0.3.6.8, 0.3.6.9, 0.4.0, 0.4.1, 0.4.2, 0.4.3, 0.4.4, 0.4.5, 0.4.6, 0.4.7, 0.4.8, 0.4.9, 0.5.0, 0.5.1, 0.5.1.2, 0.5.1.3, 0.5.2, 0.5.2.1, 0.5.3, 0.5.3.1, 0.5.3.2, 0.5.3.3, 0.6.0, 0.7.1, 0.7.3, 0.7.5, 0.7.6, 0.8.1, 0.8.3, 0.8.4, 0.8.5, 0.9.0, 0.10.0, 1.0.0, 1.0.1, 1.0.2, 1.0.3, 1.0.4, 1.0.5, 1.0.6, 1.0.7, 1.0.8, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4, 1.1.5, 1.1.6, 1.1.7, 1.1.8, 1.2.0rc0, 1.2.0rc1, 1.2.0rc2, 1.2.0, 1.2.1, 1.2.2, 1.2.3, 1.2.4, 1.2.5, 1.2.6, 1.2.7, 1.2.8, 1.2.9, 1.2.10, 1.3.0rc1, 1.3.0rc2, 1.3.0rc3, 1.3.0, 1.3.1, 1.3.2, 1.3.3, 1.3.4, 1.3.5, 1.3.6, 1.3.7, 1.3.7.post0, 1.3.8, 1.4.0rc0, 1.4.0rc1, 1.4.0rc2, 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.7, 1.4.8, 1.4.9, 1.5.0rc0, 1.5.0rc1, 1.5.0, 1.5.1, 1.5.2, 1.5.3, 1.5.4, 1.5.5, 1.5.6, 1.5.7, 1.5.8, 1.5.9, 1.5.10, 1.6.0rc0, 1.6.0rc1, 1.6.0, 1.6.1, 1.6.2, 1.6.3, 1.6.4, 1.6.5, 1.7.0rc0, 1.7.0rc1, 1.7.0, 1.7.1, 1.7.2, 1.7.3, 1.7.4, 1.7.5, 1.7.6, 1.7.7, 1.8.0rc0, 1.8.0rc1, 1.8.0rc2, 1.8.0, 1.8.0.post1, 1.8.1, 1.8.2, 1.8.3, 1.8.3.post0, 1.8.3.post1, 1.8.3.post2, 1.8.4, 1.8.4.post0, 1.8.5, 1.8.5.post0, 1.8.6, 1.9.0rc0, 1.9.0, 1.9.1, 1.9.2, 1.9.3, 1.9.4, 1.9.5, 2.0.0rc0, 2.0.0, 2.0.1, 2.0.1.post0, 2.0.2, 2.0.3, 2.0.4, 2.0.5, 2.0.6, 2.0.7, 2.0.8, 2.0.9, 2.0.9.post0, 2.1.0rc0, 2.1.0rc1, 2.1.0, 2.1.1, 2.1.2, 2.1.3, 2.1.4, 2.2.0rc0, 2.2.0, 2.2.0.post0, 2.2.1, 2.2.2, 2.2.3, 2.2.4, 2.2.5, 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.4.0)
#24 1.838 ERROR: No matching distribution found for pytorch_lightning==1.6.0
```
2024-08-19 12:58:22 -07:00
jingyanwangms
c018ba43ef
[Running CI] [TensorRT EP] support TensorRT 10.3-GA (#21742)
### Description
- TensorRT 10.2.0.19 -> 10.3.0.26

### Motivation and Context
2024-08-18 13:26:41 -07:00
Edward Chen
63e8849992
build_aar_package.py - Check that executable is present before trying to copy it. (#21730)
Check that executable is present before trying to copy it.

Accommodate builds where we skip building the test executables.
2024-08-16 11:21:09 -07:00
Yi Zhang
8a59b4dc4b
Move Python Training CUDA 12.2 pipeline to another pool. (#21745)
### Description



### Motivation and Context
[Python Training CUDA 12.2
pipeline](https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1308&_a=summary)
has been consistently cancelled by the remote provider since Aug 2nd,
but other workflows using the same pool don't have this issue.
It looks like something is off in Azure DevOps.
Using another pool works, even though its SKU is smaller than the old
one.

### Verification
https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1308&_a=summary
2024-08-15 17:31:56 +08:00
Satya Kumar Jandhyala
6d8de1f7b8
Upgrade emsdk from 3.1.59 to 3.1.62 (#21421)
### Description
Upgrade emsdk to 3.1.62.



### Motivation and Context
The changes are required to clear wasm64 errors.
2024-08-14 12:38:52 -07:00
Prathik Rao
e32e3575d8
pin pytorch lightning version for training CI (#21731)
### Description

Pins pytorch-lightning package to version 2.3.3 since version >=2.4.0
requires torch > 2.1.0 which is not compatible with cu118.


### Motivation and Context

ORT 1.19 Release Preparation
2024-08-13 20:04:56 -07:00
Yi Zhang
6db3d63add
move the A100 stage to main build (#21722)
### Description



### Motivation and Context
As of today, we couldn't get enough A100 agent time to finish the jobs.
This PR makes the A100 job run only on the main branch, to unblock
other PRs in case capacity isn't restored soon.
2024-08-13 22:48:58 +08:00
George Wu
a8462ffb61
enable qnn python arm64ec packaging (#21575)
Create the x64 QNN Python package as arm64ec so it can be published
publicly.
2024-08-12 22:43:17 -07:00
Yulong Wang
6ae7e02d34
Web CI: make multi-browser test job optional (#21669)
### Description

This job is a little bit unstable. Make it optional to avoid blocking
other PRs before we revise it.
2024-08-09 23:53:26 -07:00
Scott McKay
410ae94e9e
Use zipped xcframework in nuget package (#21663)
### Description
The xcframework now uses symlinks to have the correct structure
according to Apple requirements. Symlinks are not supported by nuget on
Windows.

In order to work around that we can store a zip of the xcframeworks in
the nuget package.

### Motivation and Context
Fix nuget packaging build break
2024-08-09 17:38:18 +10:00
Tianlei Wu
a46e49b439
Unblock migraphx and linux GPU training ci pipelines (#21662)
### Description
* Fix migraphx build error caused by
https://github.com/microsoft/onnxruntime/pull/21598:
Add a conditional compile on code block that depends on ROCm >= 6.2.
Note that the pipeline uses ROCm 6.0.

Unblock orttraining-linux-gpu-ci-pipeline and
orttraining-ortmodule-distributed and orttraining-amd-gpu-ci-pipeline
pipelines:
* Disable a model test in linux GPU training ci pipelines caused by
https://github.com/microsoft/onnxruntime/pull/19470:
Sometimes the cuDNN frontend throws an exception that the cuDNN graph
does not support a Conv node of the keras_lotus_resnet3D model on V100
GPUs. Note that the same test does not throw an exception in other GPU
pipelines. The failure might be related to cuDNN 8.9 and the V100 GPUs
used in the pipeline (Ampere GPUs and cuDNN 9.x do not have the issue).
The actual fix requires fallback logic, which will take time to
implement, so we temporarily disable the test in training pipelines.
* Force-install torch for CUDA 11.8. (The docker image has torch 2.4.0
for CUDA 12.1 to build the torch extension, which is not compatible
with CUDA 11.8.) Note that this is a temporary workaround; a more
elegant fix is to ensure the right torch version in the docker build
step, which might need updates to install_python_deps.sh and the
corresponding requirements.txt.
* Skip test_gradient_correctness_conv1d since it causes a segmentation
fault. The root cause needs more investigation (maybe the cuDNN
frontend as well).
* Skip test_aten_attention since it causes an assert failure. The root
cause needs more investigation (maybe the torch version).
* Skip orttraining_ortmodule_distributed_tests.py since it fails with
an error that the compiler for the torch extension does not support
C++17. One possible fix is to set the following compile argument inside
setup.py of the extension fused_adam: extra_compile_args['cxx'] =
['-std=c++17']. However, given the urgency of unblocking the pipelines,
just disable the test for now.
* Skip test_softmax_bf16_large. For some reason,
torch.cuda.is_bf16_supported() returns True on V100 with torch 2.3.1,
so the test ran in CI, but V100 does not support bf16 natively.
* Fix a typo of "deterministic".
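The C++17 workaround suggested for the fused_adam torch extension could be sketched like this inside the extension's setup.py (illustrative only; the real setup.py and extension sources differ):

```python
# Force C++17 for the host compiler when building the torch extension,
# as suggested above (sketch; CppExtension usage shown in comments).
extra_compile_args = {"cxx": ["-std=c++17"]}

# In the real setup.py this dict would be passed to the extension, e.g.:
# from torch.utils.cpp_extension import CppExtension
# ext = CppExtension(name="fused_adam", sources=["fused_adam.cpp"],
#                    extra_compile_args=extra_compile_args)
print(extra_compile_args["cxx"][0])
```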

### Motivation and Context
2024-08-08 19:44:15 -07:00
Scott McKay
d616025884
Match changes in gh-pages PR (#21628)
### Description
Update to match #21627 and make the info for Split consistent.

As a Split that doesn't split anything is a no-op, it doesn't seem
meaningful to call that limitation out in the docs.


### Motivation and Context
2024-08-08 10:29:15 +10:00
Adrian Lizarraga
0acefc7988
[QNN EP] Update QNN SDK to 2.25 (#21623)
### Description
- Update pipelines to use QNN SDK 2.25 by default
- Update ifdef condition to apply workaround for QNN LayerNorm
validation bug to QNN SDK 2.25 (as well as 2.24)



### Motivation and Context
Use the latest QNN SDK
2024-08-06 09:08:48 -07:00
Yi Zhang
0d1da41ca8
Fix docker image layer caching to avoid redundant docker building and transient connection exceptions. (#21612)
### Description
Improve the docker commands so that docker image layer caching works.
This makes docker builds faster and more stable.
So far, the A100 pool's system disk is too small to use the docker
cache. We won't use the pipeline cache for docker images, and remove
some legacy code.

### Motivation and Context
There is often an exception like:
```
64.58 + curl https://nodejs.org/dist/v18.17.1/node-v18.17.1-linux-x64.tar.gz -sSL --retry 5 --retry-delay 30 --create-dirs -o /tmp/src/node-v18.17.1-linux-x64.tar.gz --fail
286.4 curl: (92) HTTP/2 stream 0 was not closed cleanly: INTERNAL_ERROR (err 2)
```
This is because the ONNX Runtime pipelines have been sending too many
requests to download Node.js during docker builds, which is the major
reason pipelines are failing now.

In fact, docker image layer caching never worked; we can always see the
scripts still running:
```
#9 [3/5] RUN cd /tmp/scripts && /tmp/scripts/install_centos.sh && /tmp/scripts/install_deps.sh && rm -rf /tmp/scripts
#9 0.234 /bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
#9 0.235 /bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
#9 0.235 /tmp/scripts/install_centos.sh: line 1: !/bin/bash: No such file or directory
#9 0.235 ++ '[' '!' -f /etc/yum.repos.d/microsoft-prod.repo ']'
#9 0.236 +++ tr -dc 0-9.
#9 0.236 +++ cut -d . -f1
#9 0.238 ++ os_major_version=8
....
#9 60.41 + curl https://nodejs.org/dist/v18.17.1/node-v18.17.1-linux-x64.tar.gz -sSL --retry 5 --retry-delay 30 --create-dirs -o /tmp/src/node-v18.17.1-linux-x64.tar.gz --fail
#9 60.59 + return 0
...
```

This PR improves the docker commands to make image layer caching work,
so CI won't send so many redundant requests to download Node.js.
```
#9 [2/5] ADD scripts /tmp/scripts
#9 CACHED

#10 [3/5] RUN cd /tmp/scripts && /tmp/scripts/install_centos.sh && /tmp/scripts/install_deps.sh && rm -rf /tmp/scripts
#10 CACHED

#11 [4/5] RUN adduser --uid 1000 onnxruntimedev
#11 CACHED

#12 [5/5] WORKDIR /home/onnxruntimedev
#12 CACHED
```

### Reference
https://docs.docker.com/build/drivers/
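The cached build above follows the standard layer-caching pattern: copy only the inputs a step needs before running it, so an unchanged input reuses the cached layer instead of re-running the download. A minimal illustrative Dockerfile (the base image is hypothetical; script paths mirror the log above):

```dockerfile
# Hypothetical base image, for illustration only.
FROM quay.io/centos/centos:stream8

# Copy only the install scripts first: if they are unchanged, the
# expensive RUN layer below is served from cache and Node.js is not
# downloaded again.
ADD scripts /tmp/scripts
RUN cd /tmp/scripts && /tmp/scripts/install_centos.sh && \
    /tmp/scripts/install_deps.sh && rm -rf /tmp/scripts

RUN adduser --uid 1000 onnxruntimedev
WORKDIR /home/onnxruntimedev
```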

---------

Co-authored-by: Yi Zhang <your@email.com>
2024-08-06 21:37:09 +08:00
Edward Chen
a5ce65d87a
Clean up some mobile package related files and their usages. (#21606)
The mobile packages have been removed.
2024-08-05 16:38:20 -07:00
Scott McKay
bcc01ac123
Updates to apple packaging (#21611)
### Description
<!-- Describe your changes. -->
Add ability to test packaging without rebuilding every time.
Add ability to comment out some platforms/architectures without the
scripts to assemble the c/obj-c packages breaking.
Update a couple of commands to preserve symlinks.


### Motivation and Context
Makes debugging packaging issues faster.
Creates a correct package for mac-catalyst and doesn't require setting
symlinks via a bash script.
2024-08-06 08:50:56 +10:00
vraspar
88c811b638
Restructure MacOS framework package to fix malformed Framework errors (#21536)
### Description

Refactor framework directory structure for MacOS packages

### Motivation and Context
Apple started enforcing specific [framework
structure](https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPFrameworks/Concepts/FrameworkAnatomy.html)
for macOS packages. We need to change how we package for macOS to
follow the guidelines.

Fixes following issue: [Malformed
Framework](https://github.com/microsoft/onnxruntime-swift-package-manager/issues/19
)
2024-08-04 12:47:16 -07:00
Julius Tischbein
1391354265
Adding CUDNN Frontend and use for CUDA NN Convolution (#19470)
### Description
Added the cuDNN frontend and used it for NHWC convolutions, optionally
fusing the activation.

#### Backward compatibility
- Models that already contain FusedConv can still run.
- If ORT is built with cuDNN 8, the cuDNN frontend will not be built
into the binary; the old kernels (using cuDNN backend APIs) are used.

#### Major Changes
- For cuDNN 9, we enable the cuDNN frontend to fuse convolution and
bias when the provider option `fuse_conv_bias=1` is set.
- Remove the FusedConv fusion from the graph transformer for the CUDA
provider, so FusedConv will no longer be added to the graph for the
CUDA EP.
- Update the cmake files for the cuDNN settings. The search order for
the cuDNN installation at build time is:
  * environment variable `CUDNN_PATH`
  * `onnxruntime_CUDNN_HOME` cmake extra define. If a build starts from
build.py/build.sh, the user can pass it via the `--cudnn_home`
parameter, or via the environment variable `CUDNN_HOME` if
`--cudnn_home` is not used.
  * the cuDNN Python package installation directory, like
python3.xx/site-packages/nvidia/cudnn
  * the CUDA installation path
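The search order above can be sketched in Python (illustrative only; the real logic lives in the cmake files, and `find_cudnn_home` is a hypothetical helper, not an ONNX Runtime API):

```python
import os
import site
from pathlib import Path

def find_cudnn_home(cmake_cudnn_home=None):
    """Illustrative sketch of the cuDNN search order described above."""
    candidates = [
        os.environ.get("CUDNN_PATH"),               # 1. CUDNN_PATH env var
        cmake_cudnn_home                            # 2. onnxruntime_CUDNN_HOME /
            or os.environ.get("CUDNN_HOME"),        #    --cudnn_home / CUDNN_HOME
    ]
    for sp in site.getsitepackages():               # 3. cudnn python package dir
        candidates.append(os.path.join(sp, "nvidia", "cudnn"))
    candidates.append(os.environ.get("CUDA_PATH"))  # 4. CUDA installation path
    for c in candidates:
        if c and Path(c).is_dir():
            return c
    return None
```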

#### Potential Issues

- If ORT is built with cuDNN 8, the FusedConv fusion is no longer done
automatically, so some models might see a performance regression. Users
who still want the FusedConv operator for performance reasons have
several workarounds: use an older version of onnxruntime, or use an
older version of ORT to save the optimized ONNX model and then run it
with the latest ORT. We believe the majority of users will have moved
to cuDNN 9 by the 1.20 release (the default in ORT and PyTorch will
have been cuDNN 9 for 3 months by then), so the impact is small.
- The cuDNN graph uses TF32 by default, and users cannot disable TF32
through the use_tf32 CUDA provider option. If a user encounters an
accuracy issue (like in testing), they have to set the environment
variable `NVIDIA_TF32_OVERRIDE=0` to disable TF32. The use_tf32
documentation needs to be updated later.
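The TF32 workaround from the last bullet, as a hedged sketch (the override must be set before the CUDA libraries initialize; session creation is left commented out since it requires an onnxruntime-gpu build):

```python
import os

# Disable TF32 for NVIDIA libraries via the override environment
# variable named above; set it before creating an InferenceSession
# with the CUDA EP, since use_tf32 does not affect cuDNN graphs here.
os.environ["NVIDIA_TF32_OVERRIDE"] = "0"

# import onnxruntime as ort
# sess = ort.InferenceSession("model.onnx",
#                             providers=["CUDAExecutionProvider"])
```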

#### Follow ups
This is one of the PRs targeting enabling NHWC convolution in the CUDA
EP by default when the device supports it. Other changes will follow to
make that possible.
(1) Enable `prefer_nhwc` by default for device with sm >= 70. 
(2) Change `fuse_conv_bias=1` by default after more testing.
(3) Add other NHWC operators (like Resize or UpSample).
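A hedged sketch of enabling the two options mentioned above (the option names come from this PR; actual session creation needs an onnxruntime-gpu build with cuDNN 9, so it is left commented out):

```python
# CUDA EP provider options: prefer the NHWC layout and fuse Conv + bias.
cuda_provider_options = {
    "prefer_nhwc": "1",
    "fuse_conv_bias": "1",
}
providers = [("CUDAExecutionProvider", cuda_provider_options)]

# import onnxruntime as ort
# sess = ort.InferenceSession("resnet50.onnx", providers=providers)
```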

### Motivation and Context

The new CUDNN Frontend library provides the functionality to fuse
operations and provides new heuristics for kernel selection. Here it
fuses the convolution with the pointwise bias operation. On the [NVIDIA
ResNet50](https://pytorch.org/hub/nvidia_deeplearningexamples_resnet50/)
we get a performance boost from 49.1144 ms to 42.4643 ms per inference
on a 2560x1440 input (`onnxruntime_perf_test -e cuda -I -q -r 100 -d 1 -i
'prefer_nhwc|1' resnet50.onnx`).

---------

Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
Co-authored-by: Maximilian Mueller <maximilianm@nvidia.com>
2024-08-02 15:16:42 -07:00
dependabot[bot]
3b73ef2bf7
Bump torch from 1.13.1 to 2.2.0 in /tools/ci_build/github/windows/eager (#21505)
Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/pytorch/pytorch/releases">torch's
releases</a>.</em></p>
<blockquote>
<h2>PyTorch 2.2: FlashAttention-v2, AOTInductor</h2>
<h1>PyTorch 2.2 Release Notes</h1>
<ul>
<li>Highlights</li>
<li>Backwards Incompatible Changes</li>
<li>Deprecations</li>
<li>New Features</li>
<li>Improvements</li>
<li>Bug fixes</li>
<li>Performance</li>
<li>Documentation</li>
</ul>
<h1>Highlights</h1>
<p>We are excited to announce the release of PyTorch® 2.2! PyTorch 2.2
offers ~2x performance improvements to
<code>scaled_dot_product_attention</code> via FlashAttention-v2
integration, as well as AOTInductor, a new ahead-of-time compilation and
deployment tool built for non-python server-side deployments.</p>
<p>This release also includes improved torch.compile support for
Optimizers, a number of new inductor optimizations, and a new logging
mechanism called TORCH_LOGS.</p>
<p><strong>Please note that we are <a
href="https://redirect.github.com/pytorch/pytorch/issues/114602">deprecating
macOS x86 support</a>, and PyTorch 2.2.x will be the last version that
supports macOS x64.</strong></p>
<p>Along with 2.2, we are also releasing a series of updates to the
PyTorch domain libraries. More details can be found in the library
updates blog.</p>
<p>This release is composed of 3,628 commits and 521 contributors since
PyTorch 2.1. We want to sincerely thank our dedicated community for your
contributions. As always, we encourage you to try these out and report
any issues as we improve 2.2. More information about how to get started
with the PyTorch 2-series can be found at our <a
href="https://pytorch.org/get-started/pytorch-2.0/">Getting Started</a>
page.</p>
<p>Summary:</p>
<ul>
<li><code>scaled_dot_product_attention</code> (SDPA) now supports
FlashAttention-2, yielding around 2x speedups compared to previous
versions.</li>
<li>PyTorch 2.2 introduces a new ahead-of-time extension of
TorchInductor called AOTInductor, designed to compile and deploy PyTorch
programs for non-python server-side.</li>
<li><code>torch.distributed</code> supports a new abstraction for
initializing and representing ProcessGroups called device_mesh.</li>
<li>PyTorch 2.2 ships a standardized, configurable logging mechanism
called TORCH_LOGS.</li>
<li>A number of torch.compile improvements are included in PyTorch 2.2,
including improved support for compiling Optimizers and improved
TorchInductor fusion and layout optimizations.</li>
<li>Please note that we are deprecating macOS x86 support, and PyTorch
2.2.x will be the last version that supports macOS x64.</li>
<li><code>torch.ao.quantization</code> now offers a prototype
<code>torch.export</code> based flow</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8ac9b20d4b"><code>8ac9b20</code></a>
Run docker release build on final tag (<a
href="https://redirect.github.com/pytorch/pytorch/issues/117131">#117131</a>)
(<a
href="https://redirect.github.com/pytorch/pytorch/issues/117182">#117182</a>)</li>
<li><a
href="2490352430"><code>2490352</code></a>
Fix cuInit test on Windows (<a
href="https://redirect.github.com/pytorch/pytorch/issues/117095">#117095</a>)</li>
<li><a
href="3a44bb713f"><code>3a44bb7</code></a>
[CI] Test that cuInit is not called during import (<a
href="https://redirect.github.com/pytorch/pytorch/issues/117043">#117043</a>)</li>
<li><a
href="1c8ba3847d"><code>1c8ba38</code></a>
[CI] Use jemalloc for CUDA builds (<a
href="https://redirect.github.com/pytorch/pytorch/issues/116900">#116900</a>)
(<a
href="https://redirect.github.com/pytorch/pytorch/issues/116988">#116988</a>)</li>
<li><a
href="96d2ddbafe"><code>96d2ddb</code></a>
Store user model to simplify
ONNXProgram.{adapt_torch_*,<strong>call</strong>} APIs (<a
href="https://redirect.github.com/pytorch/pytorch/issues/1152">#1152</a>...</li>
<li><a
href="738b4a560a"><code>738b4a5</code></a>
Update ONNX's IO Adapter to support FakeTensor with ExportedProgram (<a
href="https://redirect.github.com/pytorch/pytorch/issues/114407">#114407</a>)...</li>
<li><a
href="4cf10bf4dc"><code>4cf10bf</code></a>
[Cherry-pick] [Quant] [PT2] Enable batchnorm in
_move_exported_model_to_eval ...</li>
<li><a
href="7e97e4b4b6"><code>7e97e4b</code></a>
[AARCH64] Fall back to GEMM if mkldnn_matmul fails (<a
href="https://redirect.github.com/pytorch/pytorch/issues/115936">#115936</a>)
(<a
href="https://redirect.github.com/pytorch/pytorch/issues/116666">#116666</a>)</li>
<li><a
href="1a3e3c7cff"><code>1a3e3c7</code></a>
[CUDA] baddmm should fall back to addmm for batch=1 (<a
href="https://redirect.github.com/pytorch/pytorch/issues/114992">#114992</a>)
(<a
href="https://redirect.github.com/pytorch/pytorch/issues/116518">#116518</a>)</li>
<li><a
href="ab7505f78c"><code>ab7505f</code></a>
Fix broken PyYAML 6.0 on MacOS x86 (<a
href="https://redirect.github.com/pytorch/pytorch/issues/115956">#115956</a>)
(<a
href="https://redirect.github.com/pytorch/pytorch/issues/116551">#116551</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/pytorch/pytorch/compare/v1.13.1...v2.2.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=torch&package-manager=pip&previous-version=1.13.1&new-version=2.2.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.


---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/microsoft/onnxruntime/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-01 04:28:43 -07:00
vraspar
07d3be5b0e
CoreML: Add ML Program Split Op (#21456)
### Description

Add support for Split Op


### Motivation and Context
Address operator gaps in a high-priority model.

---------

Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2024-07-30 14:04:47 +10:00
Yifan Li
5d78b9a17b
[TensorRT EP] Update TRT OSS Parser to 10.2 (#21552)
### Description
Update TRT OSS Parser to [latest 10.2-GA
branch](f161f95883)


### Motivation and Context
2024-07-29 17:27:38 -07:00
Jian Chen
79537d0523
Remove tools/ci_build/github/android/run_nnapi_code_coverage.sh (#21371)
### Description
Remove tools/ci_build/github/android/run_nnapi_code_coverage.sh

### Motivation and Context
This file is no longer needed
2024-07-29 10:00:52 -07:00
Jian Chen
bc3713206d
Update QNN pipeline pool (#21482)
### Description
Update QNN pipeline pool 



### Motivation and Context
Ensure all our pipelines use the latest NDK version.
2024-07-29 10:00:21 -07:00
Yi Zhang
05cef469e8
Move on-device training packages publish step (#21539)
### Description
Since the on-device training CPU packaging has become a separate
pipeline, its NuGet package publishing step must be moved as well.

### Motivation and Context
Fixes the exception in the NuGet packaging publishing pipeline caused
by #21485.
2024-07-29 09:59:46 -07:00
Jian Chen
7e23212de9
Delete tools/ci_build/github/azure-pipelines/win-gpu-ci-pipeline.yml (#21529)
### Description
Delete tools/ci_build/github/azure-pipelines/win-gpu-ci-pipeline.yml


### Motivation and Context
This CI pipeline has been divided into 4 different pipelines.
2024-07-27 15:58:12 -07:00
maggie1059
10b4a3b90b
Fix conda failure for onnxruntime-directml (#21526)
The change in #21005 works for directly building wheels with `build.py`,
but ort-nightly-directml wheels, as well as the 1.18.1 release of the
onnxruntime-directml python wheel, still do not work with conda since
they're built from the `py-win-gpu.yml` pipeline, which uses
`install_third_party_deps.ps1` to set compile flags.
2024-07-26 22:26:38 -07:00
Scott McKay
5af423c7c0
Set version and other info in the C# dll (#21517)
### Description
Set version and other info in the Microsoft.ML.OnnxRuntime C# dll by
setting GenerateAssemblyInfo to true and passing in ORT version in the
CI.

Minor re-org of the order of properties so related things are grouped a
little better.

### Motivation and Context
#21475
2024-07-27 13:22:57 +10:00
Jian Chen
7db7c4e5c8
Separating all GPU stages into different Pipelines (#21521)
### Description
Separating all GPU stages into different Pipelines
2024-07-26 14:54:45 -07:00
Prathik Rao
278f0f5cd2
disables qnn in ort training cpu pipeline (#21510)
### Description

`enable_windows_arm64_qnn` and `enable_windows_x64_qnn` are true by
default but unnecessary for training. This change explicitly sets these
parameters to false for training pipeline.

### Motivation and Context

ORT 1.19 Release Preparation
2024-07-26 17:23:35 +08:00
Scott McKay
b0e1f7f798
CoreML: Aggregated changes to add all required ops for priority model (#21472)
### Description
Add these changes to one PR to simplify checkin
- Add Concat (#21423)
- Add DepthToSpace (#21426)
- Add LeakyRelu (#21453)
- Add test scripts (#21427)
- Add ability to set coreml flags from python (#21434)


Other changes
- Updated partitioning utils to support dropping constant initializers
from a ComputeCapability's inputs.
- Noticed that the list of inputs to the CoreML model was unexpectedly
long due to this.
- We copy constant initializers into the CoreML model, so we don't need
the originals; if they remain as inputs, ORT can't free them because
they appear to be in use.

### Motivation and Context
2024-07-26 08:29:33 +10:00
Scott McKay
3cdf4b917b
Fix Android CI Pipeline code coverage failure (#21504)
### Description
Current failure is due to a version mismatch.

Use llvm-cov from the Android NDK instead of the system gcov so that the
version is correct.

Also comment out publishing to the Azure dashboard to simplify the
setup. The CI prints out the stats for review by developers.

### Motivation and Context
Fix CI pipeline
2024-07-26 07:36:23 +10:00
Changming Sun
4167b68abf
Split ondevice training cpu packaging pipeline to a separated pipeline (#21485)
### Description
Right now our "Zip-Nuget-Java-Nodejs Packaging Pipeline" is too big.
The on-device training part is independent of the others, so it can be
split out. Then our NPM packaging pipeline will not depend on this
training stuff.

### Motivation and Context
Similar to #21235 

Also, this PR fixed a problem: the "NuGet_Test_Linux_Training_CPU" job
downloads artifacts from "onnxruntime-linux-x64" to get the custom-op
shared libs, but forgot to declare that it depends on
"Linux_C_API_Packaging_CPU_x64", which produces the artifact. Such
problems can be hard to find when a pipeline gets big.
2024-07-25 10:58:34 -07:00
Yifan Li
ebcb7075eb
Set CUDA12 as default in GPU packages (#21438)
### Description
* Swap cuda version 11.8/12.2 in GPU CIs
* Set CUDA12 as default version in yamls of publishing nuget/python/java
GPU packages
* Suppress warnings-as-errors for flash_api.cc during the ORT Windows
build
2024-07-25 10:17:16 -07:00
Adrian Lizarraga
eb9b377306
[QNN EP] Update to QNN SDK 2.24.0 (#21463)
### Description
- Update pipelines to use QNN SDK 2.24 by default
- Update QNN_Nuget_Windows pipeline to build the csharp solution without
mobile projects (fixes errors).
- Implement a workaround for a QNN 2.24 validation bug for LayerNorm ops
without an explicit bias input.
- Enable the Relu unit test, which now passes because Relu is no
longer fused into QuantizeLinear for QNN EP.
- Fix bug where a negative quantization axis is not properly normalized
for per-channel int4 conv.
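The negative-axis fix mentioned above follows the standard ONNX convention; as an illustrative sketch (not the EP's actual code), normalizing a possibly negative axis against the tensor rank looks like this, where `normalize_axis` is a hypothetical helper:

```python
def normalize_axis(axis, rank):
    """Map a possibly negative ONNX axis into the range [0, rank)."""
    if axis < -rank or axis >= rank:
        raise ValueError(f"axis {axis} out of range for rank {rank}")
    # Negative axes count from the end, so -1 means the last dimension.
    return axis + rank if axis < 0 else axis
```

For a rank-4 weight tensor, an ONNX quantization axis of -1 normalizes to 3.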



### Motivation and Context
Update the QNN SDK.
2024-07-24 10:17:12 -07:00
Changming Sun
b04adcc381
Update copy_strip_binary.sh: use "make install" instead (#21464)
### Description
Before this change, copy_strip_binary.sh manually copied each file from
ONNX Runtime's build folder to an artifact folder, which is error-prone
when dealing with symbolic links for shared libraries.
This PR changes the packaging pipelines to run "make install" first,
before packaging the shared libraries.


### Motivation and Context

Recently, because of feature request #21281, we changed
libonnxruntime.so's SONAME. Now every package that contains this shared
library must also contain libonnxruntime.so.1, so we need to change the
packaging scripts to include this file. Instead of manually
constructing the symlink layout, using `make install` is much easier and
makes things more consistent because it is a standard way of making
packages.

**Breaking change:**
After this change, the **inference** tarballs that are published to our
GitHub release pages will no longer contain ORT **training** headers.
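The SONAME symlink layout described above can be sketched as follows; this is a minimal illustration of what `make install` produces, with `make_soname_layout` being a hypothetical helper, not part of the packaging scripts:

```python
import os

def make_soname_layout(libdir, version="1"):
    """Create libonnxruntime.so.<version> plus the libonnxruntime.so
    symlink pointing at it, mirroring the installed layout."""
    real = os.path.join(libdir, f"libonnxruntime.so.{version}")
    link = os.path.join(libdir, "libonnxruntime.so")
    with open(real, "wb"):
        pass  # stand-in for the real shared library
    # The dev symlink points at the SONAME-versioned file by relative name.
    os.symlink(os.path.basename(real), link)
    return real, link
```

Packaging both the real file and the symlink is exactly what manual copying tended to get wrong.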
2024-07-24 10:02:00 -07:00
Scott McKay
2580d935cb
CoreML: Add ML Program ConvTranspose (#21416)
### Description
Add ML Program ConvTranspose
- some limitations to simplify the implementation for now
- some limitations due to flaky CoreML output

Added support for non-contiguous MLMultiArray output, which we see with
some unit tests when the CPU-only flag is not set (e.g. the innermost dim
has a min size of 16 but the test output only has 8 values).
- support only one non-contiguous dim to keep it simple
- manually tested as we don't have a setup that can test objective-c
code
- test code is in model.mm and can be enabled via ifdef if we need to
validate any future changes
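The single-non-contiguous-dim copy described above amounts to gathering the valid prefix of each padded innermost row; a minimal sketch (hypothetical helper, not the objective-c implementation):

```python
def copy_noncontiguous_rows(buf, num_rows, row_len, row_stride):
    """Copy `row_len` valid elements from each of `num_rows` rows that are
    `row_stride` elements apart, producing a contiguous result."""
    out = []
    for r in range(num_rows):
        start = r * row_stride
        # Only the first row_len elements of each stride-sized row are valid.
        out.extend(buf[start:start + row_len])
    return out
```

E.g. with an innermost stride of 16 but only 8 valid values per row, rows 0 and 1 come from offsets 0 and 16.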



### Motivation and Context
Address operator gaps in high priority model.

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2024-07-24 16:08:20 +10:00
Scott McKay
1df9aa2f08
CoreML: Add GridSample ML Program support (#21431)
### Description
Add GridSample ML Program support

One combination of inputs has diffs between the PyTorch-generated unit
test data and CoreML's output. Disabling it until needed, as the
investigation may take a while.


### Motivation and Context
High priorities models
2024-07-24 11:04:48 +10:00
George Wu
c65afcea55
fix python qnn pipelines issues (#21462)
`build_py_params` wasn't plumbed through for the Python QNN pipelines.
Also incorporate the fixes for the deprecated numpy version option from
https://github.com/microsoft/onnxruntime/pull/21459
2024-07-23 15:54:44 -07:00
Changming Sun
f70215d4e6
Update C++ dependencies (#21410)
1. Update google benchmark from 1.8.3 to 1.8.5
2. Update google test from commit in main branch to tag 1.15.0 
3. Update pybind11 from 2.12.0 to 2.13.1
4. Update pytorch cpuinfo to include the support for Arm Neoverse V2,
Cortex X4, A720 and A520.
5. Update re2 from 2024-05-01 to 2024-07-02
6. Update cmake to 3.30.1
7. Update Linux docker images
8. Fix a warning in test/perftest/ort_test_session.cc:826:37: error:
implicit conversion loses integer precision: 'streamoff' (aka 'long
long') to 'const std::streamsize' (aka 'const long')
[-Werror,-Wshorten-64-to-32]
2024-07-23 10:00:36 -07:00
Scott McKay
0f1f3b7705
CoreML: ML Program Slice (#21433)
### Description
Add support for Slice


### Motivation and Context
High priority models.
2024-07-23 20:21:55 +10:00
mindest
5b9369e93c
Fix typos according to reviewdog report. (#21335)
### Description
Fix typos based on reviewdog report but with some
exceptions/corrections.
2024-07-22 13:37:32 -07:00
Jian Chen
4e75605eec
Replace inline pip install with pip install from requirements*.txt (#21106)
### Description
Replace inline pip install with pip install from requirements*.txt



### Motivation and Context
So that Component Governance (CG) can recognize the dependencies.

### Dependency

- [x] https://github.com/microsoft/onnxruntime/pull/21085
2024-07-22 12:39:10 -07:00
Scott McKay
34cd2e8ed8
Add CoreML ML Program Resize (#21370)
### Description
Add CoreML ML Program Resize
- refactor existing logic to try and simplify and share between
NeuralNetwork and MLProgram checks
- add handling for the new antialias and axes attributes - this should
have been done when setting the CoreML EP max opset to 21

### Motivation and Context
Support priority models
2024-07-20 09:35:05 +10:00
Changming Sun
9140d9b1ff
Update azure-kusto-data and azure-kusto-ingest (#21409)
A vulnerability has been found in the Kusto SDK. We need to update it to
latest to address a security alert.
2024-07-18 14:26:26 -07:00
Yifan Li
bb76ead96c
[TensorRT EP] support TensorRT 10.2-GA (#21395)
### Description
* promote trt version to 10.2.0.19
* EP_Perf CI: clean config of legacy TRT<8.6, promote test env to
trt10.2-cu118/cu125
* skip two tests as Float8/BF16 are supported by TRT>10.0 but TRT CIs
are not hardware-compatible on these:
 ```
 1: [  FAILED  ] 2 tests, listed below:
 1: [  FAILED  ] IsInfTest.test_isinf_bfloat16
 1: [  FAILED  ] IsInfTest.test_Float8E4M3FN
 ```

2024-07-18 12:11:52 -07:00
kailums
1b38c05544
change ci docker image to rocm6.1 (#21296)
### Description
There is a bug in a kernel running on ROCm 6.0, so change the CI docker
image to ROCm 6.1.

For the torch package installed in the docker image, switch to the ROCm
repo when the version is not 6.0.

2024-07-18 14:50:01 +08:00
vraspar
fa287042ca
Add ML Program support for transpose op (#21364)
### Description
Add support for transpose op



### Motivation and Context
Enable support for Autodesk model
2024-07-16 16:34:58 -07:00
vraspar
218301403d
Add ML Program support for basic activation ops (#21326)
### Description

Add support for:

- Sigmoid
- Relu
- Tanh

### Motivation and Context
Enable support for Autodesk model
2024-07-15 22:30:20 -07:00
George Wu
4005d12ed4
add vitisai ep build stage to Windows CPU Pipeline (#21361)
To prevent VitisAI EP build breaks, add a stage to the Windows CPU
CI Pipeline that builds the Vitis AI EP on Windows.
There are no external dependencies for builds. Tests have to be disabled,
though, as the EP has external SW/HW dependencies.
This will at least allow us to prevent build breaks, which have happened
on multiple occasions recently.

tested
https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1432346&view=results
and it seems to run fine.
2024-07-15 19:34:08 -07:00
Jian Chen
c03e6fff4c
Combining android build and test step into one job (#21340)
### Description
Combining android build and test step into one job


### Motivation and Context
Reduce runtime by removing additional machine allocation, and artifact
uploading and downloading.

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2024-07-15 14:44:03 -07:00
Yi Zhang
f2ebd1cd6b
[Fix] Exception in iosDynamicFramework Post-Merge workflow (#21262)
### Description
The exception was caused by
3dd6fcc089


Why I added skip_macos_test: there is a new exception in
https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1425579&view=logs&j=c90c5af3-67d5-5936-5a62-71c93ebfca65&t=01038f35-8e78-5801-1aa1-d9647bb65858
```
2024-07-05T14:41:09.3864740Z mkdir -p /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Products/Debug/macos_package_testUITests.xctest/Contents/Frameworks
2024-07-05T14:41:09.3933430Z mkdir: /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Products/Debug/macos_package_testUITests.xctest: Operation not permitted
2024-07-05T14:41:09.3996760Z /var/folders/0f/b0mzpg5d31z074x3z5lzkdxc0000gn/T/tmp97ycvwq5/apple_package_test/Pods/Target Support Files/Pods-macos_package_testUITests/Pods-macos_package_testUITests-frameworks.sh: line 7: realpath: command not found
2024-07-05T14:41:09.4003170Z :18: error: Unexpected failure
2024-07-05T14:41:11.1323470Z error: Sandbox: mkdir(72212) deny(1) file-write-create /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Products/Debug/macos_package_testUITests.xctest (in target 'macos_package_testUITests' from project 'apple_package_test')
2024-07-05T14:41:11.1325620Z 
2024-07-05T14:41:11.8731110Z 
2024-07-05T14:41:11.8733040Z Test session results, code coverage, and logs:
2024-07-05T14:41:11.8734820Z 	/Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Logs/Test/Test-macos_package_test-2024.07.05_14-40-38-+0000.xcresult
2024-07-05T14:41:11.8735530Z 
2024-07-05T14:41:11.8906210Z Testing failed:
2024-07-05T14:41:11.8911060Z 	Sandbox: mkdir(72212) deny(1) file-write-create /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Products/Debug/macos_package_testUITests.xctest
2024-07-05T14:41:11.8912570Z 	Unexpected failure
2024-07-05T14:41:11.8913690Z 	Testing cancelled because the build failed.
2024-07-05T14:41:11.8914380Z 
2024-07-05T14:41:11.8914970Z ** TEST FAILED **
2024-07-05T14:41:11.8915480Z 
2024-07-05T14:41:11.8915780Z 
2024-07-05T14:41:11.8916750Z The following build commands failed:
2024-07-05T14:41:11.8919280Z 	PhaseScriptExecution [CP]\ Embed\ Pods\ Frameworks /Users/runner/Library/Developer/Xcode/DerivedData/apple_package_test-akksnidsbpojopfdqrclgsoqqerv/Build/Intermediates.noindex/apple_package_test.build/Debug/macos_package_testUITests.build/Script-059136A7770CA5376C30F2FD.sh (in target 'macos_package_testUITests' from project 'apple_package_test')
2024-07-05T14:41:11.8922180Z (1 failure)
```

I also found that the macOS test is skipped in

9ef28f092f/tools/ci_build/github/azure-pipelines/templates/c-api-cpu.yml (L119-L127)
as well.
Maybe it is a known issue.
2024-07-12 09:24:12 -07:00
Edward Chen
33e7c7f6ec
Enable Android CI build stages to run in parallel. (#21314)
Enable Android CI build stages to run in parallel to possibly reduce total build time.
2024-07-11 10:09:09 -07:00
Yi Zhang
41ea47be1e
Move QNN nuget package stages out of the big Nuget packaging pipeline. (#21306)
### Description
1. Remove the QNN stages from the big packaging pipeline.
2. Add nightly package publishing to the current [QNN Nuget
pipeline](https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1234)



### Motivation and Context
Reduce the complexity of the big Nuget packaging pipelines.

---------

Co-authored-by: Yi Zhang <your@email.com>
2024-07-11 09:07:23 -07:00
Changming Sun
fe6ef404b5
Enable LTO for Android build (#21243)
### Description
Enable LTO for Android build, which can reduce binary size by 6%.
2024-07-10 18:44:17 -07:00
Changming Sun
8749fa381e
Update absl (#21300)
### Description
Our macOS pipelines are failing because of a build error in absl.
However, the bug fix we need is not available in the latest ABSL
release.

Here is the issue: https://github.com/abseil/abseil-cpp/pull/1536
And here is the fix:
779a3565ac

GTest uses ABSL, but this ABSL target also depends on GTest, so it is
a circular dependency. We should be able to avoid that by not building
ABSL's tests. However, the version we are using has a problem with that:
it has a cmake target that still depends on GTest even when testing
is disabled.

It's strange that we suddenly hit this problem and that it only happens on macOS.
2024-07-10 11:14:15 -07:00
Jian Chen
d1c19e79ea
Update OpenVino CI Ubuntu to 22.04 (#21127)
### Description
[Update OpenVino CI Ubuntu to
22.04](312fab5b3f)



### Motivation and Context
Ubuntu 22.04 is needed for linux C++20
2024-07-09 09:56:44 -07:00
Yi Zhang
30b6e82e7d
Make ROCm packaging stages to a single workflow (#21235)
### Description
Make current ROCm packaging stages to a single workflow.
Reduce the possibility that a single failed stage prevents all nightly
packages from being generated.



### Motivation and Context
Our plan is to reduce the complexity of the current zip-nuget pipeline
to improve the stability and performance of nightly package generation.
The ROCm packaging stages have no dependencies on other packaging jobs,
and they are the most time-consuming route.
After this change, the duration of the most-used CPU/CUDA/Mobile
packaging workflow can be reduced roughly from 3h20m to 2h30m.
2024-07-04 11:07:04 +08:00
cloudhan
f39ee14b46
Add GQA support for ROCm (#21032) 2024-07-03 14:55:31 +08:00
Yi Zhang
beb2496748
Templatize publishing nuget package (#21199)
### Description
This is a prerequisite step for reducing the complexity of the current
zip-nuget pipeline.
Some packaging tasks can be cut from the most complex nuget pipeline
and published easily on their own.

2024-07-02 09:24:19 +08:00
Jian Chen
9007ede102
Update upstream packaging pipeline name to make it more meaningful. (#21154)
### Description
Update upstream packaging pipeline name to make it more meaningful.



### Motivation and Context
The upstream pipeline used to build only NuGet packages, but now it
also builds Zip and Java packages, so changing the name makes it more
meaningful.
2024-06-28 21:40:09 -07:00
Jian Chen
0cbe7eec5e
Update nuget to use NuGet 6.10.x (#21209)
### Description
Update nuget to use NuGet 6.10.x
2024-06-28 19:49:54 -07:00
Yi Zhang
587e92c279
Add FP32 and INT4 test in Llama2 (#21187)
2024-06-28 06:18:26 +08:00
Changming Sun
d1ab94c2b0
Add compatibility for NumPy 2.0 (#21085)
### Description

As suggested by SciPy's doc, we will
`Build against NumPy 2.0.0, then it will work for all NumPy versions
with the same major version number (NumPy does maintain backwards ABI
compatibility), and as far back as NumPy 1.19 series at the time of
writing`

I think it works because in
[numpyconfig.h#L64](https://github.com/numpy/numpy/blob/main/numpy/_core/include/numpy/numpyconfig.h#L64)
there is a macro NPY_FEATURE_VERSION. By default it is set to
NPY_1_19_API_VERSION. And the NPY_FEATURE_VERSION macro controls ABI.
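The compatibility rule this relies on can be sketched as a small check; this is an illustrative model of the NumPy ABI policy (an extension compiled against a given feature version runs on any NumPy at least that new), with `runtime_compatible` and the tuple encoding being assumptions for the example, not a NumPy API:

```python
# Feature version is encoded as a (major, minor) tuple for this sketch.
NPY_1_19 = (1, 19)

def runtime_compatible(feature_version, runtime_version):
    """An extension built with NPY_FEATURE_VERSION == feature_version
    loads on any NumPy runtime >= feature_version, since NumPy keeps
    backwards ABI compatibility."""
    return runtime_version >= feature_version
```

So a wheel built against NumPy 2.0 with the default NPY_1_19 feature version still loads on NumPy 1.19 through 2.x.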

This PR only upgrades the build-time dependency; when a user installs
ONNX Runtime, they can still use numpy 1.x.

### Motivation and Context
Recently numpy published a new version, 2.0.0, which is incompatible with the latest ONNX Runtime release.
2024-06-27 13:50:53 -07:00
PeixuanZuo
446aa986a1
[ROCm] Extend the Pipeline restriction time (#21158)
ROCm EP builds are taking longer.
2024-06-27 15:36:04 +08:00
Jian Chen
f81c0ec32a
Remove warning suppression from Java Packaging pipeline. (#21010)
### Description
Remove warning suppression from Java Packaging pipeline.


### Motivation and Context
We want the CI step not to produce warning.
2024-06-24 16:46:21 -07:00
aciddelgado
ebd0368bb0
Make Flash Attention work on Windows (#21015)
### Description
Previously, Flash Attention only worked on Linux systems. This PR will
make it work and enable it to be built and run on Windows.

Limitations of Flash Attention in Windows: Requires CUDA 12.

### Motivation and Context
This will significantly increase the performance of Windows-based LLMs
on hardware with sm >= 80.

To illustrate the improvement of Flash Attention over Memory Efficient
Attention, here are some average benchmark numbers for the GQA operator,
run with configurations based on several recent models (Llama, Mixtral,
Phi-3). The benchmarks were obtained on RTX4090 GPU using the test
script located at
(onnxruntime/test/python/transformers/benchmark_gqa_windows.py).

* Clarifying Note: These benchmarks are just for the GQA operator, not
the entire model.

### Memory Efficient Attention Kernel Benchmarks:

| Model Name | Max Sequence Length | Inference Interval (ms) | Throughput (samples/second) |
|------------|---------------------|-------------------------|-----------------------------|
| Llama3-8B (Average Prompt) | 8192 | 0.19790525 | 13105.63425 |
| Llama3-8B (Average Token) | 8192 | 0.207775538 | 12025.10172 |
| Llama3-70B (Average Prompt) | 8192 | 0.216049167 | 11563.31185 |
| Llama3-70B (Average Token) | 8192 | 0.209730731 | 12284.38149 |
| Mixtral-8x22B-v0.1 (Average Prompt) | 32768 | 0.371928785 | 7031.440056 |
| Mixtral-8x22B-v0.1 (Average Token) | 32768 | 0.2996659 | 7607.947159 |
| Phi-3-mini-128k (Average Prompt) | 131072 | 0.183195867 | 15542.0852 |
| Phi-3-mini-128k (Average Token) | 131072 | 0.198215688 | 12874.53494 |
| Phi-3-small-128k (Average Prompt) | 65536 | 2.9884929 | 2332.584142 |
| Phi-3-small-128k (Average Token) | 65536 | 0.845072406 | 2877.85822 |
| Phi-3-medium-128K (Average Prompt) | 32768 | 0.324974429 | 8094.909517 |
| Phi-3-medium-128K (Average Token) | 32768 | 0.263662567 | 8978.463687 |

### Flash Attention Kernel Benchmarks:

| Model Name | Max Sequence Length | Inference Interval (ms) | Throughput (samples/second) |
|------------|---------------------|-------------------------|-----------------------------|
| Llama3-8B (Average Prompt) | 8192 | 0.163566292 | 16213.69057 |
| Llama3-8B (Average Token) | 8192 | 0.161643692 | 16196.14715 |
| Llama3-70B (Average Prompt) | 8192 | 0.160510375 | 17448.67753 |
| Llama3-70B (Average Token) | 8192 | 0.169427308 | 14702.62043 |
| Mixtral-8x22B-v0.1 (Average Prompt) | 32768 | 0.164121964 | 15618.51301 |
| Mixtral-8x22B-v0.1 (Average Token) | 32768 | 0.1715865 | 14524.32273 |
| Phi-3-mini-128k (Average Prompt) | 131072 | 0.167527167 | 14576.725 |
| Phi-3-mini-128k (Average Token) | 131072 | 0.175940594 | 15762.051 |
| Phi-3-small-128k (Average Prompt) | 65536 | 0.162719733 | 17824.494 |
| Phi-3-small-128k (Average Token) | 65536 | 0.14977525 | 16749.19858 |
| Phi-3-medium-128K (Average Prompt) | 32768 | 0.156490786 | 17679.2513 |
| Phi-3-medium-128K (Average Token) | 32768 | 0.165333833 | 14932.26079 |

Flash Attention is consistently faster for every configuration we
benchmarked, with improvements in our trials ranging from ~20% to ~650%.

In addition to these improvements in performance, Flash Attention has
better memory usage. For example, Memory Efficient Attention cannot
handle a max sequence length higher than 32,768, but Flash Attention can
handle max sequence lengths at least as high as 131,072.

---------

Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
2024-06-24 09:43:49 -07:00
Yi Zhang
5b5ce0bfb0
Add UsePython Task in Nuget Publish workflow (#21144)
### Description
Otherwise it would fail in 

b95982e588/tools/ci_build/github/azure-pipelines/publish-nuget.yml (L78-L81)



### Motivation and Context
The Windows CPU image was migrated to a managed image.


### Verification Link
https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1313
2024-06-24 13:36:13 +08:00
Yi Zhang
69d522f4e9
[Fix] use cmdline in Final Jar Testing Stage for new managed Windows Image (#21130)
### Description
There is no bash command in the managed Windows image.
Use a CmdLine step instead.



### Verified Link

https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=491902&view=logs&j=f1f8e11e-a9fa-53e5-cd29-3ba2c1988550
2024-06-21 12:41:06 +08:00
Changming Sun
efcaa835b1
Update generate_nuspec_for_native_nuget.py for training (#21112)
### Description
Similar to #21096 , but this one is for ORT training nuget package.
2024-06-20 16:13:31 -07:00
Changming Sun
bd3a9ee99d
Add UsePythonVersion (#21109)
### Description
The machine has multiple python installations and none of them is in
PATH. Therefore we should explicitly set the python version via this task
to avoid surprises.

### Motivation and Context
Similar to #21095
2024-06-19 20:47:21 -07:00
Changming Sun
27f3ac78d4
Delete RoslynAnalyzers (#21104)
### Description
Delete RoslynAnalyzers. Use CodeQL instead.


### Motivation and Context
Now we already have CodeQL, which is modern and also covers C# code. The
RoslynAnalyzers check is not in our pull request pipelines, and the
"RoslynAnalyzers@2" task is outdated and needs to be upgraded. I will
delete it for now since we already have CodeQL.
2024-06-19 20:11:15 -07:00
Changming Sun
be423747b1
Delete pyop (#21094)
### Description
Remove the "--enable_language_interop_ops" build flag, because the code
is incompatible with the latest numpy, and the build flag is not used
anywhere except a macOS CI pipeline. It does not seem to have a ship
plan.


### Motivation and Context
The build error was:
```
onnxruntime/core/language_interop_ops/pyop/pyop.cc:122:85: error: no member named 'elsize' in '_PyArray_Descr'
                                  static_cast<int64_t>(PyArray_DescrFromType(type)->elsize),
                                                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~  ^
```
2024-06-19 16:21:33 -07:00
Adrian Lizarraga
3ae5df1d18
[QNN EP] Update QNN SDK to 2.23.0 (#21008)
### Description
- Updates CI pipelines to use QNN SDK 2.23.0 by default.
- QNN SDK adds support for int64 Cast. This allows QNN EP to support
ONNX ArgMax/ArgMin/TopK operators that generate an int64 graph output.

Example translation of ArgMax:
- **ONNX**:    input --> ArgMax --> output (int64)
- **QNN**: input --> ArgMax --> Cast (int32 to int64) --> output (int64)

### Motivation and Context
Update onnxruntime to use the latest QNN SDK.
2024-06-19 12:37:42 -07:00
Jian Chen
6a0d64e65c
Component Gov round 7 (#21051)
### Description
ignoreDirectories does not recursively include subfolders as we
thought it would. We need to add the additional subfolders explicitly.



### Motivation and Context
Fix CG :
1.
https://aiinfra.visualstudio.com/Lotus/_componentGovernance/218239/alert/11474679?typeId=25427568
2.
https://aiinfra.visualstudio.com/Lotus/_componentGovernance/218239/alert/11475140?typeId=25421034&pipelinesTrackingFilter=0
2024-06-19 11:07:02 -07:00
Yi Zhang
cc3168bcbb
Add UsePython task in Nuget_Packaging_CPU stage (#21095)
### Description
supplement of https://github.com/microsoft/onnxruntime/pull/21062



2024-06-19 21:09:37 +08:00
Scott McKay
5fc60f36f2
Update to the net8 MAUI targets. Remove Xamarin. (#21062)
### Description
Xamarin is EOL so remove support.
The MAUI targets are EOL and need updating.
https://dotnet.microsoft.com/en-us/platform/support/policy/maui

Other cleanups:
- netcoreapp3.1 is EOL
- the net6 macos target was added in the mistaken belief that it was for
MAUI mac support, but that actually comes via the mac-catalyst target,
which we recently added support for.
- some CIs that were using the old build setup of splitting pre-net6
targets. The ORT C# bindings csproj was updated last year and the
`PreNet6` and `SelectedTargets` properties no longer exist as they were
replaced by the simpler `IncludeMobileTargets` property.

### Motivation and Context
Remove EOL components.
#21058
2024-06-19 16:20:58 +10:00
cloudhan
ddd4ce3cb7
[ROCm] Update ck to use ck_tile (#21030) 2024-06-19 14:06:10 +08:00
Yi Zhang
809cb26ace
Use A100 for LLama2 model test (#21068)
2024-06-18 11:04:02 +08:00
Changming Sun
9ef4f1b789
Update pybind11 (#21072)
### Description
Upgrade pybind11 to the latest as suggested by @gnought in #21063

### Motivation and Context
Recently numpy released a new version, which caused a compatibility
issue between the latest numpy version and the latest ONNX Runtime version.
2024-06-17 19:50:57 -07:00
Scott McKay
159fe9d4f3
Update to mobile model usability checker (#19843)
### Description

- Add check for CoreML MLProgram supported ops
- Only check usability with ORT Mobile package if requested
- this package will be deprecated, so the info is a) of minimal value and b)
can be confusing.
- Output more things at INFO level
- a lot of meaningful info was only output at DEBUG level. The default
INFO level is more useful
  - dump full partition info at DEBUG level
- Check subgraphs fully
  - CoreML can handle a subgraph
- TBD if we want to add support for adding a subgraph to the parent
graph for Loop and If nodes
    - most likely will be required for simple If nodes to be performant
- Check 5D CoreML limitation

### Motivation and Context
Improve helper tools

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2024-06-18 07:50:33 +10:00
Changming Sun
80a60d9a65
Update ONNX installing script (#21044)
Avoid using command-line flags to pass in CMAKE_PREFIX_PATH; use
environment variables instead.
Otherwise, the value of CMAKE_PREFIX_PATH could get decoded twice. For
example, if the prefix is `C:\a\root`, then in
tools/ci_build/github/windows/helpers.ps1 we set it in Env:CMAKE_ARGS,
which is consumed by ONNX. When ONNX gets the value and decodes it,
ONNX ends up with `C:aroot` instead. Because that path doesn't exist,
CMAKE_PREFIX_PATH couldn't take effect when the script installs
ONNX. This PR fixes the issue.

The issue was discovered when I tried to upgrade cmake to a newer
version. Our Windows CPU CI build pipeline currently uses cmake 3.27. In
the main branch, even when the CMAKE_PREFIX_PATH setting does not work,
cmake can still find protoc.exe through the PATH directories. However,
starting from 3.28 cmake changed this: with the newer cmake versions the
find_library(), find_path(), and find_file() cmake commands no longer
search in installation prefixes derived from the PATH environment
variable.
Ye Wang
f35dd1407f
custom allreduce cuda kernel (#20703)
### Description

Conditionally route to custom AllReduce kernel when buffer size and gpu
numbers meet certain requirements. Otherwise, keep using NCCL's
AllReduce.
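The conditional routing described above can be sketched as a tiny dispatch function; the threshold and supported world sizes below are made-up placeholders for illustration (the PR's actual conditions are not stated here), and `choose_allreduce` is a hypothetical helper:

```python
def choose_allreduce(buffer_bytes, world_size,
                     max_custom_bytes=8 * 1024 * 1024,
                     supported_world_sizes=(2, 4, 8)):
    """Pick the custom kernel only when both the buffer size and the
    number of GPUs meet the (illustrative) requirements; otherwise
    fall back to NCCL's AllReduce."""
    if world_size in supported_world_sizes and buffer_bytes <= max_custom_bytes:
        return "custom"
    return "nccl"
```

Large buffers or unsupported GPU counts fall back to NCCL.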


---------

Co-authored-by: Ye Wang <wangye@microsoft.com@h100vm-ort.kxelwkzfzxguje5bxvwxxs135a.gvxx.internal.cloudapp.net>
Co-authored-by: Your Name <you@example.com>
2024-06-13 11:09:49 -07:00
Jian Chen
9daed5565a
Component Governance Fix round 6 (#21021)
2024-06-13 09:10:51 -07:00
Changming Sun
73271dd329
Move jobs in onnxruntime-Win2022-GPU-T4 machine pool to onnxruntime-Win2022-GPU-A10 (#21023)
### Description
Move jobs in onnxruntime-Win2022-GPU-T4 machine pool to
onnxruntime-Win2022-GPU-A10

### Motivation and Context
To reduce the variants of VM images we need to maintain. Now we have 3:
1. Windows 2022 CPU
2. Windows 2022 GPU A10
3. Windows 2022 GPU T4

This change allows us removing the last one.
2024-06-12 22:04:40 -07:00
Changming Sun
99f0fe3fae
Fix a few issues in "Zip-Nuget-Java-Nodejs Packaging Pipeline" (#21014)
### Description
Fix a few issues in the Windows TRT job in "Zip-Nuget-Java-Nodejs
Packaging Pipeline":
1. It is a Windows job. It should not use bash (which is usually not
available on Windows).
2. When it sets ADO variables, it missed a semicolon.

Here is the doc of how to set ADO vars via scripts:
https://learn.microsoft.com/en-us/azure/devops/pipelines/process/set-variables-scripts?view=azure-devops&tabs=bash

You can see that it needs a semicolon. Without the semicolon, the
variables will have an extra quotation mark in their values.
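The logging-command format from the linked docs can be sketched as follows; `set_variable_command` is a hypothetical helper that builds the string a script would echo, with the trailing semicolon after the property list:

```python
def set_variable_command(name, value, is_output=False):
    """Build an Azure DevOps ##vso[task.setvariable] logging command.
    Note the semicolon terminating the property list."""
    props = f"variable={name};"
    if is_output:
        props += "isoutput=true;"
    return f"##vso[task.setvariable {props}]{value}"
```

A script would print this string to stdout for the agent to pick up.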
2024-06-12 09:44:24 -07:00
Yi Zhang
17d5dc503f
Upgrade ESRP signing task from v2 to v5 (#20995)
2024-06-12 08:31:53 +08:00
Tianlei Wu
b3fc9b5a0e
[CUDA] upgrade cutlass to 3.5.0 (#20940)
### Description
Upgrade cutlass to 3.5 to fix build errors using CUDA 12.4 or 12.5 in
Windows
- [x] Upgrade cutlass to 3.5.0.
- [x] Fix flash attention build error with latest cutlass header files
and APIs. This fix is provided by @wangyems.
- [x] Update efficient attention to use new cutlass fmha interface.
- [x] Patch cutlass to fix `hrsqrt` not found error for sm < 53.
- [x] Disable TF32 Staged Accumulation to fix blkq4_fp16_gemm_sm80_test
build error for cuda 11.8 to 12.3.
- [x] Disable TRT 10 deprecation warnings.

The following are not included in this PR:
* TRT provider replaces the deprecated APIs.
* Fix blkq4_fp16_gemm_sm80_test build error for cuda 12.4 or 12.5. This
test is not built by default unless you add `--cmake_extra_defines
onnxruntime_ENABLE_CUDA_EP_INTERNAL_TESTS=ON` in build command.

To integrate to rel-1.18.1: Either bring in other changes (like onnx
1.16.1), or generate manifest and upload a new ONNX Runtime Build Time
Deps artifact based on rel-1.18.1.

### Motivation and Context
https://github.com/microsoft/onnxruntime/issues/19891
https://github.com/microsoft/onnxruntime/issues/20924
https://github.com/microsoft/onnxruntime/issues/20953
2024-06-11 13:32:15 -07:00
Jian Chen
05032e5e5f
Updating cudnn from 8 to 9 on existing cuda 12 docker image (#20925)
### Description
Add support for cuDNN 9.


### Motivation and Context
Keep the existing CUDA 12.2 image with NVIDIA driver 535.
2024-06-11 09:37:16 -07:00
Changming Sun
ae4a2e6b3f
Publish Build Symbols for DML nightly nuget package (#20988)
### Description
Publish Build Symbols for DML nightly nuget package.
2024-06-10 17:53:22 -07:00
Changming Sun
dc545d366d
Publish debug symbols for Windows python packages (#20973)
### Description
1. Publish debug symbols for Windows python packages. This PR will
publish them to ADO. Later on I will also replicate them to Microsoft
Symbol Server.
2. Build the packages in Release mode instead of RelWithDebInfo, to be
consistent with the other platforms(Linux/macOS/...)


### Motivation and Context
To help debug things. Sometimes we found an issue, but we couldn't debug
it because we didn't have symbols, and once we rebuilt the package
locally the issue was gone. This change would be helpful for such
scenarios.

Build log:
https://aiinfra.visualstudio.com/Lotus/_build?definitionId=841
2024-06-10 12:33:49 -07:00
Edward Chen
981893c318
Remove deprecated "mobile" packages (#20941)
# Description

This PR removes the building of the ORT "mobile" packages and much of the associated infrastructure which is no longer needed.

Not removed yet - tools/ci_build/github/android/mobile_package.required_operators.config and the helper scripts that depend on it.

# Motivation and Context

The mobile packages were deprecated in 1.18. Users should use the full packages (Android - onnxruntime-android, iOS - onnxruntime-c/onnxruntime-objc) instead or do a custom build.
2024-06-07 16:20:32 -05:00
Changming Sun
a53f692832
Update c-api-noopenmp-packaging-pipelines.yml: remove CUDA version parameter (#20955)
### Description
Update c-api-noopenmp-packaging-pipelines.yml: remove CUDA version
parameter.
To reduce confusion: this pipeline is for generating CUDA 11 packages
only, not CUDA 12.

### Motivation and Context
In the last release we accidentally published CUDA 12 (instead of CUDA
11) packages to nuget.org.
We also tried to publish CUDA 12 packages to
https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/ORT-Nightly.
Luckily it didn't go through because a package with the same version
number already existed there. Every time someone runs this pipeline
with the CUDA version set to 12, the built packages are published to
https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/ORT-Nightly.
And the GenAI team's build pipelines are based on the nightly packages, so
sometimes the GenAI team builds their packages with CUDA 12 and sometimes
with CUDA 11, which is very random.
Therefore, please limit the use of pipeline parameters. Most Azure
DevOps yml files are template files and should use parameters, but the
top-level yml files should be more careful about that.
2024-06-07 11:19:59 -07:00
Jian Chen
d32adb26f2
Refactor deprecated gradle syntax (#20922)
To replace deprecated APIs.
Verify with the `Gradle cmakeCheck` step of the
`Windows_Packaging_CPU_x64_default` stage of the Zip-Nuge-...
pipeline.
2024-06-07 11:08:52 -07:00
Jian Chen
96228c86a0
Adding Job names to jobs without a name (#20961)
### Description
Adding Job names to jobs without a name

### Motivation and Context
This way we will know which job fails the CG scan.
2024-06-06 19:09:21 -07:00
Adrian Lizarraga
b5eb9e8a8a
[QNN EP] Update to QNN SDK 2.22 (#20628)
### Description
- Updates pipelines to use QNN SDK 2.22 by default.
- Linux QNN pipeline now uses an Ubuntu 22.04 image (required by QNN
SDK)
- Android QNN pipeline still uses the current Ubuntu 20.04 image. Will
update in a separate PR.
- Disables QDQ LayerNorm test that triggers QNN's graph finalization
error on QNN 2.22
- Increases accuracy tolerance for various HTP tests so that they pass
on Windows arm64.



### Motivation and Context
Test QNN EP with latest QNN SDK version by default.

---------

Signed-off-by: adrianlizarraga <adlizarraga@microsoft.com>
2024-06-05 18:25:23 -07:00
Jian Chen
5faeaf6437
Remove failOnStderr from Gradle cmakeCheck (#20919)
### Description
Remove failOnStderr from Gradle cmakeCheck



### Motivation and Context
Gradle is still using deprecated APIs.
2024-06-04 13:54:49 -07:00
liqun Fu
51bc53580d
Update to onnx 1.16.1 (#20702) 2024-06-04 11:06:28 -07:00
Changming Sun
3dd6fcc089
Upgrade min ios version to 13.0 (#20773)
To align with Office and other MS products.
Office's support policy is:
"Office for iPad and iPhone is supported on the two most recent versions
of iOS and iPadOS. When a new version of iOS or iPadOS is released, the
Office Operating System requirement becomes the two most recent
versions: the new version of iOS or iPadOS and the previous version."
(from https://products.office.com/office-system-requirements)

The latest iOS version is 17. So they support both 17 and 16. Here I set
our min iOS version to 13 so that it will be a superset of what Office
supports.

This change would allow us to use C++17's std::filesystem feature in the
core framework. The modifications were generated by running
```bash
 find . -type f -exec sed -i "s/apple_deploy_target[ =]12.0/apple_deploy_target=13.0/g"  {} \;
```

Cannot use 15.0 because otherwise iOS packaging would fail with:

```
/Users/runner/work/1/b/apple_framework/intermediates/iphoneos_arm64/Release/_deps/coremltools-src/mlmodel/src/MILBlob/Util/Span.hpp:288:9: error: cannot use 'throw' with exceptions disabled
        MILVerifyIsTrue(index < Size(), std::range_error, "index out of bounds");
```

The Google OSS libraries we use only officially support iOS 15+.
2024-06-04 10:15:20 -07:00
Yi Zhang
c5087b9b58
Improve stable diffusion image parity test stability (#20904)
### Description
1. Add one image to the whitelist; if that image is hit, the pipeline
status is a warning.
2. Adjust the image parity test tolerance.



### Motivation and Context
Improve pipeline stability.
2024-06-04 10:19:32 +08:00
Jian Chen
456ab09d17
Component Governance fix round 5 (#20905)
…over the case where there is only single repo checked out

### Description
Adding $(Build.SourcesDirectory)/cmake/external/onnx/third_party to
cover the case where only a single repo is checked out.



### Motivation and Context
Fix CG issue
https://aiinfra.visualstudio.com/Lotus/_componentGovernance/97926/alert/8862110?typeId=16576846
2024-06-03 14:22:22 -07:00
Jian Chen
ae8df4db8f
Split java's gradle build and test (#20817)
### Description

This PR allows `./gradlew cmakeCheck` to fail on the
Windows_Packaging_(CUDA|TensorRT) job. This way, it will still generate
all the necessary jar and pom files needed for later stages to consume,
while `./gradlew cmakeCheck` will also be run again in the
Windows_Packaging_(CUDA|TensorRT)_Testing stage.


### Motivation and Context
Reduce the time of all Java packaging stages by 30+ minutes.
2024-06-03 14:08:45 -07:00
Changming Sun
d13cabf7f9
Upgrade GCC and remove the dependency on GCC8's experimental std::filesystem implementation (#20893)
### Description
This PR upgrades CUDA 11 build pipelines' GCC version from 8 to 11. 

### Motivation and Context

GCC 8 has an experimental std::filesystem implementation which is not ABI
compatible with the formal one in later GCC releases. It didn't cause
trouble for us; however, the ONNX community has encountered this issue
often (for example, https://github.com/onnx/onnx/issues/6047). So this PR
increases the minimum supported GCC version from 8 to 9, and removes the
references to GCC's "stdc++fs" library. Please note we compile our code
on RHEL8 and RHEL8's libstdc++ doesn't have the fs library, which means
the binaries in ONNX Runtime's official packages always static link to
the fs library. It is just a matter of which version of the library, an
experimental one or a more mature one. And it is an implementation
detail that is not visible from outside. Anyway, a newer GCC is better.
It will give us the chance to use many C++20 features.

#### Why were we using GCC 8?
It is because all our Linux packages were built on RHEL8 or its
equivalents. The default GCC version in RHEL8 is 8. RHEL also provides
additional GCC versions from RH devtoolset. UBI8 is the abbreviation of
Red Hat Universal Base Image 8, which is the containerized RHEL8. UBI8
is free, which means it doesn't require a subscription (while RHEL does).
The only devtoolset that UBI8 provides is GCC 12, which is too new to be
used with CUDA 11.8. And our CUDA 11.8 build env is a docker
image from Nvidia that is based on UBI8.
#### How the problem is solved
Almalinux is an alternative to RHEL. Almalinux 8 provides GCC 11. And
the CUDA 11.8 docker image from Nvidia is open source, which means we
can rebuild the image based on Almalinux 8 to get GCC 11. I've done
this, but I cannot republish the new image due to various complicated
license restrictions. Therefore I put them at an internal location in
onnxruntimebuildcache.azurecr.io.
2024-06-03 10:14:08 -07:00
Jian Chen
217b66fd85
Update py-publishing pipeline to use the resource from the packaging pipeline (#20888)
### Motivation and Context
To allow nightly releases to be automatically triggered.
2024-06-01 16:10:02 -07:00
Changming Sun
67bc9438d7
Update training packaging pipeline's docker files (#20853)
### Description
Similar to #20786. The last PR wasn't able to update all pipelines and
all docker files; this is a follow-up to it.

### Motivation and Context
1. To extract the common part as a reusable build infra among different
ONNX Runtime projects.
2. Avoid hitting docker hub's limit: 429 Too Many Requests - Server
message: toomanyrequests: You have reached your pull rate limit. You may
increase the limit by authenticating and upgrading:
https://www.docker.com/increase-rate-limit
2024-05-30 23:48:42 -07:00
Edward Chen
a508130456
Address React Native pipeline component detection timeout (#20871)
mac-react-native-ci-pipeline.yml:
- We don't need to run component detection for PR builds so just disable it there.

npm-packaging-pipeline.yml:
- Manually added component detection task was being added twice - removed one.
- Increased timeout of stage where component detection is run since the existing timeout was close for some builds.
2024-05-30 16:37:03 -07:00
Changming Sun
65ef270e06
Update Aten pipeline's docker file to use UBI8 (#20856)
### Description
It currently uses CentOS 7, which is EOL. This PR updates it to UBI8.

### Motivation and Context
To deprecate CentOS 7.
2024-05-30 07:38:15 -07:00
Jian Chen
228713f635
Adding a publishing stage to publish the Java CUDA 12 package to ADO (#20834) 2024-05-29 16:24:23 -07:00
Vincent Wang
e77f238dc6
Update Torch Version to Fix ATen CPU Pipeline Failure (#20845)
Update Torch Version to Fix ATen CPU Pipeline Failure.
2024-05-29 16:04:18 +08:00
Edward Chen
535e9d7114
Update package_release_tasks.py (#20835)
1. Move azcopy environment variables out of script and into an Azure DevOps variable group. Move towards consolidating the managed identity client ID definition in one place.
2. Disable azcopy overwrite. We don't want to accidentally change the files for a released package.
2024-05-28 17:50:25 -07:00
Adrian Lizarraga
e78b18a2fb
Increase ComponentDetection timeout for React Native CI (#20800)
### Description
Runs of the React Native CI are timing out during ComponentDetection
after 8 minutes. This increases the timeout value.



### Motivation and Context
Runs of the React Native CI are timing out during ComponentDetection.
2024-05-28 08:36:38 -07:00
Jian Chen
b1b8cb05dc
Adding java build and packaging stage to cuda-packaging-pipeline.yml (#20812)
### Description
Adding java build/packaging stage to `cuda-packaging-pipeline.yml`



### Motivation and Context
This way we can publish the Java CUDA 12 package along with the NuGet
CUDA 12 package.
2024-05-27 07:59:19 -07:00
Changming Sun
439ed92b96
Remove TVM EP's pipeline (#20813)
### Description
Temporarily remove TVM EP's pipeline until someone helps us upgrade TVM
to a newer version which is compatible with the latest ONNX.

### Motivation and Context
The ONNX version that TVM EP uses has a known security vulnerability. We
cannot continue using it in our hosted build environment. This change is temporary.
2024-05-25 20:42:41 -07:00
Jian Chen
fe24006425
Fix Nuget Cuda pipeline package pipeline (#20741)
### Description

This PR adds protoc.exe to fix the NuGet CUDA pipeline, which also
allows it to build Java for various CUDA versions.
2024-05-24 09:15:57 -07:00
Changming Sun
535a030b1e
Remove manylinux build scripts from python packaging pipeline (#20786)
### Description
Use a common set of prebuilt manylinux base images to build the
packages, to avoid building the manylinux part again and again. The base
images can be used in GenAI and other projects too.
This PR also updates the GCC version for inference python CUDA11/CUDA12
builds from 8 to 11. Later on I will update all other CUDA pipelines to
use GCC 11, to avoid the issue described in
https://github.com/onnx/onnx/issues/6047 and
https://github.com/microsoft/onnxruntime-genai/issues/257 .

### Motivation and Context
To extract the common part as a reusable build infra among different
ONNX Runtime projects.
2024-05-24 08:18:22 -07:00
Jian Chen
884acd4598
Fix Nuget-Cuda publish pipeline (#20794)
### Description
Previously all feeds were set to nightly; the official release feed ID
was not set.


2024-05-23 18:27:46 -07:00
Changming Sun
b522df0ae4
Update RE2 to the latest (#20775)
Update RE2 to the latest.

To keep the components up to date.
2024-05-23 14:30:15 -07:00
Yi Zhang
fa8670fe5b
Add a test image for stable diffusion (#20780) 2024-05-23 08:50:23 -07:00
Jian Chen
d4fe4b5b51
Replace ubuntu-latest with onnxruntime-Ubuntu2204-AMD-CPU (#20736)
2024-05-22 13:36:02 -07:00
Jian Chen
0a10a3003a
component-governance fix round 4 (#20754)
2024-05-22 11:05:24 -07:00
Jian Chen
372974e5d6
Using CPU pool to build Linux GPU C API Package (#20648)
2024-05-20 15:25:14 -07:00
Jian Chen
ddafbf2224
Component Governance fix round 3 (#20689)
2024-05-20 13:39:09 -07:00
Jian Chen
11df22b59b
Reenabling Nuget Cuda Packaging Pipeline (#20688)
2024-05-20 10:37:15 -07:00
Edward Chen
fefae0cd04
Add Mac CI GitHub Actions workflow (#20717)
Add a new GitHub Actions workflow, `.github/workflows/mac.yml`. It contains these jobs:
- ARM64 MacOS CI build.
- Objective-C static analysis build. This was moved over from another Azure DevOps pipeline to make it more visible.
2024-05-20 10:27:03 -07:00
Yulong Wang
036fcd93d4
[js/web] optimize module export and deployment (#20165)
### Description

This PR makes a number of optimizations to onnxruntime-web's module export
and deployment.

See each section below for more details.

#### Preview

>
[onnxruntime-web@1.19.0-esmtest.20240513-a16cd2bd21](https://www.npmjs.com/package/onnxruntime-web/v/1.19.0-esmtest.20240513-a16cd2bd21)

> ~~onnxruntime-web@1.19.0-esmtest.20240430-c7edbcc63d~~

> ~~onnxruntime-web@1.18.0-esmtest.20240428-624c681c83~~

> ~~onnxruntime-web@1.18.0-esmtest.20240411-1abb64e894~~

<details>
<summary><h4>Breaking changes</h4></summary>

There is no code change required, but there are a few differences
regarding **code import**, **flags**, **bundler config** and
**deployment steps**.

#### Importing:

Import table is changed. See following for details.

<details>
<summary><h5>Current import table:</h5></summary>

| Target Name | Path for "import" or "require" | WebGL | JSEP | wasm |
Proxy | Training |
  |------|-----|-----|-----|-----|-----|-----|
  | `ort` (default) | `onnxruntime-web` | ✔️ |  | ✔️ | ✔️ |  |
  | `ort.all` | `onnxruntime-web/experimental` | ✔️ | ✔️ | ✔️ | ✔️ |  |
  | `ort.node` | `onnxruntime-web` |  |  | ✔️ |  |  |
| `ort.training` | `onnxruntime-web/training` |  |  | ✔️ |
✔️<sup>\[1]</sup> | ✔️ |
  | `ort.wasm` | `onnxruntime-web/wasm` |  |  | ✔️ | ✔️ |  |
  | `ort.wasm-core` | `onnxruntime-web/wasm-core` |  |  | ✔️ |  |  |
| `ort.webgl` | `onnxruntime-web/webgl` | ✔️ |  |  | ✔️<sup>\[2]</sup>
|  |
  | `ort.webgpu` | `onnxruntime-web/webgpu` |  | ✔️ | ✔️ | ✔️ |  |

* [1] Not tested; may not actually work.
* [2] Not working; this is a mistake in the build config.

</details>

<details>
<summary><h5>Proposed update:</h5></summary>

| Target Name | Path for "import" or "require" | WebGL | JSEP | wasm |
Proxy | Training |
  |------|-----|-----|-----|-----|-----|-----|
  | `ort` (default) | `onnxruntime-web` | ✔️ |  | ✔️ | ✔️ |  |
| `ort.all` |
~~`onnxruntime-web/experimental`~~<br/>`onnxruntime-web/all` | ✔️ | ✔️ |
✔️ | ✔️ |  |
  | `ort.node` | `onnxruntime-web` |  |  | ✔️ |  |  |
  | `ort.training` | `onnxruntime-web/training` |  |  | ✔️ | ✔️ | ✔️ |
  | `ort.wasm` | `onnxruntime-web/wasm` |  |  | ✔️ | ✔️ |  |
| ~~`ort.wasm-core`~~ | ~~`onnxruntime-web/wasm-core`~~ | ~~~~ | ~~~~
| ~~✔️~~ | ~~~~ | ~~~~ |
  | `ort.webgl` | `onnxruntime-web/webgl` | ✔️ |  |  | ~~✔️~~  |  |
  | `ort.webgpu` | `onnxruntime-web/webgpu` |  | ✔️ | ✔️ | ✔️ |  |

</details>

#### Flags:

The following flags are deprecated:
- `env.wasm.simd` (boolean): will be ignored. SIMD is always enabled in
the build.

The following flags changed their type:
- `env.wasm.wasmPaths`: When using this flag as a string (for the URL
prefix), nothing is changed. When using this flag as an object (for
per-file path override), the type changed:
  ```diff
  -  export interface Old_WasmFilePaths{
  -    'ort-wasm.wasm'?: string;
  -    'ort-wasm-threaded.wasm'?: string;
  -    'ort-wasm-simd.wasm'?: string;
  -    'ort-training-wasm-simd.wasm'?: string;
  -    'ort-wasm-simd-threaded.wasm'?: string;
  -  };
  +  export interface New_WasmFilePaths {
  +    /**
  +     * Specify the override path for the main .wasm file.
  +     *
  +     * This path should be an absolute path.
  +     *
  +     * If not modified, the filename of the .wasm file is:
  +     * - `ort-wasm-simd-threaded.wasm` for default build
+ * - `ort-wasm-simd-threaded.jsep.wasm` for JSEP build (with WebGPU and
WebNN)
  +     * - `ort-training-wasm-simd-threaded.wasm` for training build
  +     */
  +    wasm?: URL|string;
  +    /**
  +     * Specify the override path for the main .mjs file.
  +     *
  +     * This path should be an absolute path.
  +     *
  +     * If not modified, the filename of the .mjs file is:
  +     * - `ort-wasm-simd-threaded.mjs` for default build
+ * - `ort-wasm-simd-threaded.jsep.mjs` for JSEP build (with WebGPU and
WebNN)
  +     * - `ort-training-wasm-simd-threaded.mjs` for training build
  +     */
  +    mjs?: URL|string;
  +  }
  ```
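For illustration, a minimal sketch of the new object form (the URLs below are hypothetical placeholders, not real artifact paths):

```javascript
// Hypothetical override object using the new two-field shape described
// above; the example.com URLs are placeholders, not real CDN paths.
const wasmPathsOverride = {
  wasm: "https://example.com/dist/ort-wasm-simd-threaded.wasm",
  mjs: "https://example.com/dist/ort-wasm-simd-threaded.mjs",
};

// In an application this would be assigned before creating a session:
//   import * as ort from "onnxruntime-web";
//   ort.env.wasm.wasmPaths = wasmPathsOverride;
console.log(Object.keys(wasmPathsOverride)); // prints: [ 'wasm', 'mjs' ]
```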

#### Bundler compatibility:

Config changes are needed for bundlers. See usage examples in
/js/web/test/e2e/ for Webpack, Parcel and Rollup.

#### Deployment:

- if consuming from a CDN, there is no breaking change.
- if consuming from a local server, you need to copy all `ort-*.wasm` and
`ort-*.mjs` files (6 files in total) from the dist folder (previously only
the `ort-*.wasm` files needed to be copied).

</details>
<details>
<summary><h4>Problems</h4></summary>

There are a few problems with the current module export and deployment:

- The script URL cannot be correctly inferred when imported as ESM.
- Workers are forcefully encoded using a Blob URL, which prevents
onnxruntime-web from working in CSP environments and Node.js when using
the proxy or multi-threading feature.
- Generated JS code (by Emscripten) is encoded using
`function.toString()`, which is unstable and error-prone.
- Running with a different Emscripten build always requires the build
step, making it difficult to swap artifacts during development/debugging.
</details>
<details>
<summary><h4>Goals</h4></summary>

- Full ESM support
- Support variances of ways to import. Including:
- import from HTML's `<script>` tag (IIFE format, exporting to global
variable `ort`)
    ```html
<script
src="https://example.com/cdn-path-to-onnxruntime-web/dist/ort.min.js"></script>
    ```
  - import from source code inside `<script type="module">` tag (ESM)
    ```html
    <script type="module">
import * as ort from
"https://example.com/cdn-path-to-onnxruntime-web/dist/ort.min.mjs";

      // using 'ort'
    </script>
    ```
- import in a CommonJS project (CJS format, resolve from package.json
"exports" field)
    ```js
    // myProject/main.js
    const ort = require('onnxruntime-web');
    ```
- import in an ESM project (ESM format, resolve from package.json
"exports" field)
    ```js
    // myProject/main.js (or main.mjs)
    import * as ort from 'onnxruntime-web';
    ```
- Support popular bundlers when importing onnxruntime-web into a CJS/ESM
project.
  - webpack (esm requires extra post-process step)
  - rollup
  - parcel (esm requires extra post-process step)
  - More bundlers **TBD**
- Multi-threading support for Node.js

NOTE: keeping single JavaScript file (the all-in-one bundle) is no
longer a goal. This is because technically there is a conflict with the
other requirements.
</details>

<details>
<summary><h4>Important Design Decisions</h4></summary>

- Drop support of single JavaScript output.
- The current onnxruntime-web distribution uses a single JavaScript file
to include all code. While there are a few benefits, it also creates
problems as mentioned above. Since ESM is used more and more widely, and
browsers are enforcing more restrictive security checks and requirements,
the old Blob-based solution is going to be replaced.
- To achieve the requirements, specifically the CSP environment support,
we have to offer a non-Blob-based solution. Therefore, we have to
distribute multiple files and drop the single-file solution.

- Do not run parser/postprocess on Emscripten generated JavaScript.
- Emscripten is evolving quickly, so we should only depend on what's in
its documentation instead of particular implementation details (for
example, we currently patch its code to deal with a special variable
`_scriptDir`).
  - Keep the generated files as-is also helps to:
    - reduce the size of ort.min.js
- make it easier to replace build artifacts when in development/debug

- Drop support for non-SIMD and non-MultiThread. This helps to reduce
the number of artifacts in distribution.
  - (fixed-sized) SIMD is supported in any mainstream JS environment.
- Multi-thread as WebAssembly feature is supported in any mainstream JS
environment. In some environment the feature is guarded with cross
origin policy, but it can still work if not trying to create any worker.

- Use ESM output for Emscripten generated JavaScript.
- There are 2 ways to dynamically import classic (UMD) modules, and
neither of them is recommended:
- dynamically creating a `<script>` tag. This changes the HTML structure
and has quite a lot of compatibility issues.
- using `fetch()` and `eval()`. However, `eval` is strongly discouraged
because of the large perf hit.
- Importing ESM is easy - just use the `import()` call.
Considering ESM is widely supported in modern browsers and Node.js, this
is the better option.

- Add Blob based solution as a fallback for cross-origin workers.
- There is still wide use of importing onnxruntime-web from a CDN. For
this usage, workers can be created by using `fetch()` + `Blob` to
create a same-origin Blob URL.
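
  The fallback above can be sketched roughly as follows (a hypothetical browser-only helper, not the actual onnxruntime-web implementation):

  ```javascript
  // Hypothetical helper illustrating the cross-origin fallback: fetch the
  // module text, wrap it in a same-origin Blob URL, then dynamically
  // import() it so that workers created from it are same-origin.
  async function importSameOrigin(url) {
    const response = await fetch(url);
    const source = await response.text();
    const blobUrl = URL.createObjectURL(
      new Blob([source], { type: "text/javascript" })
    );
    try {
      return await import(blobUrl);
    } finally {
      // Release the Blob URL once the module has been loaded.
      URL.revokeObjectURL(blobUrl);
    }
  }
  ```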

</details>

<details>
<summary><h4>Distribution File Manifest</h4></summary>

The distribution folder contains the following files:

- WebAssembly artifacts. These files are the result of compiling the
ONNX Runtime C++ code to WebAssembly by Emscripten.

  | File Name | Build Flags |
  |------|-----|
| ort-wasm-simd-threaded.mjs <br/> ort-wasm-simd-threaded.wasm |
`--enable_wasm_simd` <br/> `--enable_wasm_threads` |
| ort-training-wasm-simd-threaded.mjs <br/>
ort-training-wasm-simd-threaded.wasm | `--enable_training_apis` <br/>
`--enable_wasm_simd` <br/> `--enable_wasm_threads` |
| ort-wasm-simd-threaded.jsep.mjs <br/> ort-wasm-simd-threaded.jsep.wasm
| `--enable_wasm_simd` <br/> `--enable_wasm_threads` <br/> `--use_jsep`
<br/> `--use_webnn` |

- onnxruntime-web JavaScript artifacts. These files are generated by
ESBuild as the entry point for onnxruntime-web.

  There are multiple build targets for different use cases:
  | Target Name | Path for "import" or "require" | Description |
  |------|-----|-----|
  | `ort` | `onnxruntime-web` | The default target. |
  | `ort.all` | `onnxruntime-web/all` | The target including webgl. |
  | `ort.node` | `onnxruntime-web` | The default target for Node.js. |
| `ort.training` | `onnxruntime-web/training` | The target including
training APIs |
| `ort.wasm` | `onnxruntime-web/wasm` | The target including only
WebAssembly (CPU) EP |
| `ort.webgl` | `onnxruntime-web/webgl` | The target including only
WebGL EP |


  For each target, there are multiple files generated:
  | File Name | Description |
  |------|-----|
| [target].js | The entry point for the target. IIFE and CommonJS
format. |
  | [target].mjs | The entry point for the target. ESM format. |
| [target].min.js <br/> [target].min.js.map | The entry point for the
target. Minimized with sourcemap. IIFE and CommonJS format. |
| [target].min.mjs <br/> [target].min.mjs.map | The entry point for the
target. Minimized with sourcemap. ESM format. |
| [target].proxy.mjs | (if applicable) The proxy ESM module for the
target. |
| [target].proxy.min.mjs <br/> [target].proxy.min.mjs.map | (if
applicable) The proxy ESM module for the target. Minimized with
sourcemap. |

</details>

<details>
<summary><h4>Dynamic Import Explained</h4></summary>

- Local Served | No Proxy:
  ```
  [Bundle or ort.min.js]
    |
    + import()--> [ort-wasm-simd-threaded.mjs]
                    |
+ WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm]
                    |
+ new Worker()--> [ort-wasm-simd-threaded.mjs (worker)]
                                        |
+ WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm]
  ```
- Local Served | Proxy:
  ```
  [Bundle or ort.min.js]
    |
    + import()--> [ort.proxy.min.mjs]
                    |
                    + new Worker()--> [ort.proxy.min.mjs (worker)]
                                        |
+ import()--> [ort-wasm-simd-threaded.mjs]
                                                        |
+ WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm]
                                                        |
+ new Worker()--> [ort-wasm-simd-threaded.mjs (worker)]
|
+ WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm]
  ```
- Cross Origin | No Proxy:
  ```
  [Bundle or ort.min.js]
    |
    + fetch('ort-wasm-simd-threaded.mjs')
        |
        + URL.createObjectURL(res.blob())
        |
        + import()--> [blob:... (ort-wasm-simd-threaded)]
                        |
+ WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm]
                        |
+ new Worker()--> [blob:... (ort-wasm-simd-threaded) (worker)]
                                            |
+ WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm]
  ```

- Cross Origin | Proxy
  ```
  [Bundle or ort.min.js]
    |
    + fetch('ort.proxy.min.mjs')
        |
        + URL.createObjectURL(res.blob())
        |
        + import()--> [blob:... (ort.proxy)]
                        |
+ new Worker()--> [blob:... (ort.proxy) (worker)]
                                            |
+ fetch('ort-wasm-simd-threaded.mjs')
                                                |
+ URL.createObjectURL(res.blob())
                                                |
+ import()--> [blob:... (ort-wasm-simd-threaded)]
                                                                |
+ WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm]
                                                                |
+ new Worker()--> [blob:... (ort-wasm-simd-threaded) (worker)]
|
+ WebAssembly.instantiateStreaming()--> [ort-wasm-simd-threaded.wasm]
  ```
</details>
2024-05-20 09:51:16 -07:00
Edward Chen
e81c8676e3
MatMulNBits + Add fusion (#20587)
- Add MatMulNBits Bias input
- Add graph transformer to fuse MatMulNBits + Add
2024-05-16 11:00:59 -07:00
Yifan Li
47a178b518
[EP Perf] Fix on EP Perf (#20683)
### Description
* Partially revert [previous
change](https://github.com/microsoft/onnxruntime/pull/19804), and
   * Redo concurrency_test_result parser outside of post.py
* Add support of syncing memtest result to db


### Motivation and Context
To fix the error when CI is running on two model groups.
- When running on two model groups, the [previous
change](https://github.com/microsoft/onnxruntime/pull/19804) wrongly
navigates two levels up in the directory after running one model group,
while one level is needed. After that, the script can't find another
model group.
- Running on one model group can't repro the issue
2024-05-15 21:38:52 -07:00
Jian Chen
d1e66f0446
Increase NPM ComponentDetection.Timeout: 1200 (#20681)
2024-05-15 13:41:59 -07:00
Jian Chen
87ed1e3e3f
Component governance fix round 2 (#20679) 2024-05-14 17:15:15 -07:00
Edward Chen
113aa2992f
Update React Native CI (#20673)
- Move iOS package build to separate job so it can run in parallel with Android AAR build and be decoupled from the test stage. The test stage fails sometimes (not infrequently) and may need to be re-run.
- Update stop iOS simulator step so it doesn't fail if the start step doesn't run.
2024-05-14 14:10:56 -07:00
Jian Chen
83a871f890
Fix critical and High issues from Component Governance (#20611)
2024-05-14 09:17:23 -07:00
Hector Li
0e11d0c4f8
Enable Qnn nuget nightly (#20662)
### Description
Enable Qnn nuget nightly
2024-05-13 21:28:43 -07:00
Yi Zhang
c131ea89e1
Nuget Publish pipelines should be triggered by rel-* automatically too. (#20652)
### Description
Also set `allowPackageConflicts = true`:
`#allowPackageConflicts: false # boolean. Optional. Use when command =
push && nuGetFeedType = internal. Allow duplicates to be skipped.
Default: false.`

https://learn.microsoft.com/en-us/azure/devops/pipelines/tasks/reference/nuget-command-v2?view=azure-pipelines

Once the publish partially fails, we don't need to rerun the whole
package generation workflow.
2024-05-13 13:18:16 -07:00
Edward Chen
90d49ccb9a
Allow path pattern to be specified in package_release_tasks.py. (#20650)
Do more in the Python helper script so the Bash code in the release definition can be simplified.
2024-05-13 09:16:04 -07:00
Jian Chen
4fe565a62a
Java CUDA 12 support (#20583)
### Description

- This PR combine all CUDA 12 stage into the Zip-nuget-... pipeline.
- It also enables the cuda12 support



2024-05-10 14:16:22 -07:00
George Wu
a0c4bd4da7
[qnn ep] sign onnxruntime.dll/pyd for qnn packages (#20634)
Sign only onnxruntime.dll and onnxruntime_pybind11_state.pyd in the
packages.
2024-05-09 20:45:44 -07:00
Yi Zhang
5a18818e1d
Migrate training storage from SAS to managed identity (#20618)
### Description
orttrainingtestdatascus only stores mnist, whose size is only 64M, in
Azure Files.
To meet security requirements and reduce maintenance cost, move the test
data to lotusscus, saved in Azure Blob storage.
2024-05-09 15:44:29 -07:00
Jian Chen
d1cbb3e076
The time for nuget pkg should be consistent (#20522)
This pull request primarily involves changes to the build scripts in the
`tools/ci_build/github/azure-pipelines` directory. The changes add build
date and time information to the build process. This is achieved by
introducing two new parameters, `BuildDate` and `BuildTime`, and
incorporating them into the `msbuildArguments` in multiple locations.

Addition of new parameters:

*
[`tools/ci_build/github/azure-pipelines/templates/c-api-cpu.yml`](diffhunk://#diff-00815920cc190d10fdebceac0c3a4b8a59e408684ae38177dfe7f96cae276c59R309-R310):
Added `BuildDate` and `BuildTime` parameters using the pipeline's start
time.

Incorporation of new parameters in `msbuildArguments`:

*
[`tools/ci_build/github/azure-pipelines/c-api-noopenmp-packaging-pipelines.yml`](diffhunk://#diff-efb530efd945fdd9d3e1b92e53d25cc8db7df2e28071c364b07a7193092de01bL947-R948):
Added `CurrentDate` and `CurrentTime` parameters to `msbuildArguments`
in multiple locations.
[[1]](diffhunk://#diff-efb530efd945fdd9d3e1b92e53d25cc8db7df2e28071c364b07a7193092de01bL947-R948)
[[2]](diffhunk://#diff-efb530efd945fdd9d3e1b92e53d25cc8db7df2e28071c364b07a7193092de01bL1092-R1093)
[[3]](diffhunk://#diff-efb530efd945fdd9d3e1b92e53d25cc8db7df2e28071c364b07a7193092de01bL1114-R1115)
[[4]](diffhunk://#diff-efb530efd945fdd9d3e1b92e53d25cc8db7df2e28071c364b07a7193092de01bL1137-R1138)
*
[`tools/ci_build/github/azure-pipelines/templates/c-api-cpu.yml`](diffhunk://#diff-00815920cc190d10fdebceac0c3a4b8a59e408684ae38177dfe7f96cae276c59L446-R448):
Incorporated the `CurrentDate` and `CurrentTime` parameters into
`msbuildArguments`.
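The idea described above — computing the date and time once at pipeline start and passing the same values to every job's `msbuildArguments` — can be sketched as follows. This is a minimal illustrative Python sketch, not the pipeline's actual implementation; the real pipeline uses Azure DevOps expressions, and the format strings and the `make_msbuild_time_args` helper name here are assumptions:

```python
from datetime import datetime, timezone

def make_msbuild_time_args(now: datetime) -> str:
    """Derive a single, consistent date/time pair for the packaging jobs.

    Capturing the time once (at pipeline start) and passing it to each job
    keeps every package produced in one run stamped identically, instead of
    letting each job read the clock on its own.
    """
    build_date = now.strftime("%Y%m%d")  # assumed date format
    build_time = now.strftime("%H%M")    # assumed time format
    return f"/p:CurrentDate={build_date} /p:CurrentTime={build_time}"

# Every job receives the same pre-computed values rather than calling
# datetime.now() itself, so the stamps cannot drift between jobs.
start = datetime(2024, 5, 9, 11, 35, tzinfo=timezone.utc)
print(make_msbuild_time_args(start))  # prints "/p:CurrentDate=20240509 /p:CurrentTime=1135"
```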
2024-05-09 11:35:45 -07:00
Edward Chen
a0db2187ee
Update CocoaPods package release script. (#20608)
- Update method for uploading to Azure storage to use managed identity.
- Allow helper script tasks to be split across different calls.
- Rewrite helper script in Python.

Motivation:
Recently the Azure storage account configuration was changed and now the old way of uploading to it no longer works.
2024-05-08 16:17:26 -07:00
Changming Sun
08b637350a
Remove an extra space in azure_scale_set_vm_mount_test_data.sh (#20584) 2024-05-08 09:46:50 -07:00
Scott McKay
8d09baf49f
Clarify when protobuf dependency builds protoc (#20542)
### Description
Currently, figuring out whether the protobuf dependency is building protoc
is a little obtuse and inconsistent:
* in some places we directly set protobuf_BUILD_PROTOC_BINARIES to OFF
to indicate the protobuf dependency is not building protoc
  * e.g. macOS/iOS/visionOS builds
* for a user-provided protoc path we don't set
protobuf_BUILD_PROTOC_BINARIES, and logic inside protobuf_function.cmake
determines whether `protobuf::protoc` is added as a dependency or not
*
0dda8b0c44/cmake/external/protobuf_function.cmake (L40-L45)

To be more consistent/explicit, set protobuf_BUILD_PROTOC_BINARIES to
OFF when ONNX_CUSTOM_PROTOC_EXECUTABLE is set and valid.

Remove the outdated script that built an external protoc binary for use in
later builds. The build setup now fetches a pre-built protoc, so
there's no need for this additional build.

### Motivation and Context
Make it easier to figure out if protoc is coming from the protobuf
dependency.
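The rule the change makes explicit — disable building protoc binaries whenever a valid custom protoc executable is supplied — can be modeled roughly as below. This is an illustrative Python sketch of the CMake-level decision, not the actual cmake code; the `should_build_protoc` helper name is an assumption:

```python
import os
from typing import Optional

def should_build_protoc(custom_protoc_executable: Optional[str]) -> bool:
    """Model of the protoc-selection rule described above.

    If the user supplies a valid ONNX_CUSTOM_PROTOC_EXECUTABLE, the
    protobuf dependency should not build its own protoc, i.e.
    protobuf_BUILD_PROTOC_BINARIES is explicitly set to OFF.
    """
    if custom_protoc_executable and os.path.isfile(custom_protoc_executable):
        return False  # protobuf_BUILD_PROTOC_BINARIES = OFF
    return True       # let the protobuf dependency build protoc

print(should_build_protoc(None))  # no custom protoc -> build it; prints True
```

Making the flag explicit in every configuration, rather than relying on logic buried in protobuf_function.cmake, is what removes the inconsistency noted in the description.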
2024-05-08 08:30:11 +10:00
aciddelgado
4e27841bdb
fix gqa cpu nan bug (#20521)
### Description
There was a bug with GQA on CPU where, in the token case, with batch_size >
1 and with past_present_share_buffer off, the output would occasionally
contain NaNs. This PR fixes that. It also updates the documentation and
fixes position-id generation for rotary embedding in CUDA in the prompt case.



### Motivation and Context
This PR fixes the GQA CPU bug, updates the documentation, and
makes seqlens_k irrelevant for the prompt case, which helps prevent
user error.
2024-05-07 15:19:26 -07:00
Adrian Lizarraga
0dda8b0c44
[QNN EP] Update QNN SDK to 2.21 (#20534)
### Description
- Updates QNN pipelines to use QNN SDK 2.21
- Downloads QNN SDK from Azure storage to avoid having to rebuild images
when a new version is released.


### Motivation and Context
Test with the latest QNN SDK.
2024-05-01 20:17:35 -07:00
Scott McKay
f9febc4f35
Remove usage of 'required reason' iOS API from protobuf (#20529)
### Description

Using certain APIs is about to require a [privacy
manifest](https://developer.apple.com/documentation/bundleresources/privacy_manifest_files/describing_use_of_required_reason_api)
to be added to a package.

Our version of protobuf uses `mach_absolute_time`. Patch as per
https://github.com/protocolbuffers/protobuf/pull/15662/ to remove usage.

### Motivation and Context
Usage of this API will require a privacy manifest for an iOS app to be
accepted as of 5/1/2024.
#20519
2024-05-02 08:21:08 +10:00