Commit graph

1975 commits

Author SHA1 Message Date
Jian Chen
d1c19e79ea
Update OpenVino CI Ubuntu to 22.04 (#21127)
### Description
[Update OpenVino CI Ubuntu to
22.04](312fab5b3f)



### Motivation and Context
Ubuntu 22.04 is needed for linux C++20
2024-07-09 09:56:44 -07:00
Yi Zhang
30b6e82e7d
Make ROCm packaging stages to a single workflow (#21235)
### Description
Make current ROCm packaging stages to a single workflow.
Reduce the possibility of all nightly packages can't be generated by one
failed stage



### Motivation and Context
Our plan is to reduce the complexity of the current zip-nuget pipeline
to improve the stability and performance of nightly packages generation.
ROCm packaging stages has no dependencies with other packaging jobs and
it's the most time-consuming route.
After this change, the most used CPU/CUDA/Mobile packaging workflow
duration can be reduced roughly from 3h20m to 2h30m.
2024-07-04 11:07:04 +08:00
cloudhan
f39ee14b46
Add GQA support for ROCm (#21032) 2024-07-03 14:55:31 +08:00
Yi Zhang
beb2496748
Templatize publishing nuget package (#21199)
### Description
It's the prerequisite step of reducing complexity of current zip-nuget
pipeline.
Some packaging tasks could be cut from the most complex nuget pipline
and easily be published

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-07-02 09:24:19 +08:00
Jian Chen
9007ede102
Update upstream packaging pipeline name to make it more meaningful. (#21154)
### Description
Update upstream packaging pipeline name to make it more meaningful.



### Motivation and Context
The upstream pipeline used to only building Nuget packages, but now it
also builds Zip and Java. So change the name will make it more
meaningful.
2024-06-28 21:40:09 -07:00
Jian Chen
0cbe7eec5e
Uppdate nuget to Use Nuget 6.10.x (#21209)
### Description
Uppdate nuget to Use Nuget 6.10.x
2024-06-28 19:49:54 -07:00
Yi Zhang
587e92c279
Add FP32 and INT4 test in Llama2 (#21187)
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-06-28 06:18:26 +08:00
Changming Sun
d1ab94c2b0
Add compatibility for NumPy 2.0 (#21085)
### Description

As suggested by SciPy's doc, we will
`Build against NumPy 2.0.0, then it will work for all NumPy versions
with the same major version number (NumPy does maintain backwards ABI
compatibility), and as far back as NumPy 1.19 series at the time of
writing`

I think it works because in
[numpyconfig.h#L64](https://github.com/numpy/numpy/blob/main/numpy/_core/include/numpy/numpyconfig.h#L64)
there is a macro NPY_FEATURE_VERSION. By default it is set to
NPY_1_19_API_VERSION. And the NPY_FEATURE_VERSION macro controls ABI.

This PR only upgrade the build time dependency; When a user installs
ONNX Runtime, they still can use numpy 1.x.

### Motivation and Context
Recently numpy published a new version, 2.0.0, which is incompatible with the latest ONNX Runtime release.
2024-06-27 13:50:53 -07:00
PeixuanZuo
446aa986a1
[ROCm] Extend the Pipeline restriction time (#21158)
ROCm EP builds are taking longer.
2024-06-27 15:36:04 +08:00
Jian Chen
f81c0ec32a
Remove warning suppression from Java Packaging pipeline. (#21010)
### Description
Remove warning suppression from Java Packaging pipeline.


### Motivation and Context
We want the CI step not to produce warning.
2024-06-24 16:46:21 -07:00
aciddelgado
ebd0368bb0
Make Flash Attention work on Windows (#21015)
### Description
Previously, Flash Attention only worked on Linux systems. This PR will
make it work and enable it to be built and run on Windows.

Limitations of Flash Attention in Windows: Requires CUDA 12.

### Motivation and Context
This will significantly increase the performance of Windows-based LLM's
with hardware sm>=80.

To illustrate the improvement of Flash Attention over Memory Efficient
Attention, here are some average benchmark numbers for the GQA operator,
run with configurations based on several recent models (Llama, Mixtral,
Phi-3). The benchmarks were obtained on RTX4090 GPU using the test
script located at
(onnxruntime/test/python/transformers/benchmark_gqa_windows.py).

* Clarifying Note: These benchmarks are just for the GQA operator, not
the entire model.

### Memory Efficient Attention Kernel Benchmarks:
| Model Name | Max Sequence Length | Inference Interval (ms) |
Throughput (samples/second) |

|----------------------------------------|---------------------|-------------------------|-----------------------------|
| Llama3-8B (Average Prompt) | 8192 | 0.19790525 | 13105.63425 |
| Llama3-8B (Average Token) | 8192 | 0.207775538 | 12025.10172 |
| Llama3-70B (Average Prompt) | 8192 | 0.216049167 | 11563.31185 |
| Llama3-70B (Average Token) | 8192 | 0.209730731 | 12284.38149 |
| Mixtral-8x22B-v0.1 (Average Prompt) | 32768 | 0.371928785 |
7031.440056 |
| Mixtral-8x22B-v0.1 (Average Token) | 32768 | 0.2996659 | 7607.947159 |
| Phi-3-mini-128k (Average Prompt) | 131072 | 0.183195867 | 15542.0852 |
| Phi-3-mini-128k (Average Token) | 131072 | 0.198215688 | 12874.53494 |
| Phi-3-small-128k (Average Prompt) | 65536 | 2.9884929 | 2332.584142 |
| Phi-3-small-128k (Average Token) | 65536 | 0.845072406 | 2877.85822 |
| Phi-3-medium-128K (Average Prompt) | 32768 | 0.324974429 | 8094.909517
|
| Phi-3-medium-128K (Average Token) | 32768 | 0.263662567 | 8978.463687
|

### Flash Attention Kernel Benchmarks:
| Model Name | Max Sequence Length | Inference Interval (ms) |
Throughput (samples/second) |

|--------------------------------------|---------------------|-------------------------|-----------------------------|
| Llama3-8B (Average Prompt) | 8192 | 0.163566292 | 16213.69057 |
| Llama3-8B (Average Token) | 8192 | 0.161643692 | 16196.14715 |
| Llama3-70B (Average Prompt) | 8192 | 0.160510375 | 17448.67753 |
| Llama3-70B (Average Token) | 8192 | 0.169427308 | 14702.62043 |
| Mixtral-8x22B-v0.1 (Average Prompt) | 32768 | 0.164121964 |
15618.51301 |
| Mixtral-8x22B-v0.1 (Average Token) | 32768 | 0.1715865 | 14524.32273 |
| Phi-3-mini-128k (Average Prompt) | 131072 | 0.167527167 | 14576.725 |
| Phi-3-mini-128k (Average Token) | 131072 | 0.175940594 | 15762.051 |
| Phi-3-small-128k (Average Prompt) | 65536 | 0.162719733 | 17824.494 |
| Phi-3-small-128k (Average Token) | 65536 | 0.14977525 | 16749.19858 |
| Phi-3-medium-128K (Average Prompt) | 32768 | 0.156490786 | 17679.2513
|
| Phi-3-medium-128K (Average Token) | 32768 | 0.165333833 | 14932.26079
|

Flash Attention is consistently faster for every configuration we
benchmarked, with improvements in our trials ranging from ~20% to ~650%.

In addition to these improvements in performance, Flash Attention has
better memory usage. For example, Memory Efficient Attention cannot
handle a max sequence length higher than 32,768, but Flash Attention can
handle max sequence lengths at least as high as 131,072.

---------

Co-authored-by: Tianlei Wu <tlwu@microsoft.com>
2024-06-24 09:43:49 -07:00
Yi Zhang
5b5ce0bfb0
Add UsePython Task in Nuget Publish workflow (#21144)
### Description
Otherwise it would fail in 

b95982e588/tools/ci_build/github/azure-pipelines/publish-nuget.yml (L78-L81)



### Motivation and Context
The Windows CPU image is migrated  to managed image


### Verification Link
https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1313
2024-06-24 13:36:13 +08:00
Yi Zhang
69d522f4e9
[Fix] use cmdline in Final Jar Testing Stage for new managed Windows Image (#21130)
### Description
No bash command in Managed Windows image.
Use CmdlLine step instead.



### Verified Link

https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=491902&view=logs&j=f1f8e11e-a9fa-53e5-cd29-3ba2c1988550
2024-06-21 12:41:06 +08:00
Changming Sun
efcaa835b1
Update generate_nuspec_for_native_nuget.py for training (#21112)
### Description
Similar to #21096 , but this one is for ORT training nuget package.
2024-06-20 16:13:31 -07:00
Changming Sun
bd3a9ee99d
Add UsePythonVersion (#21109)
### Description
The machine has multiple python installations and none of them is in
PATH. Therefore we should explicitly set python version via this task to
avoid having surprises.

### Motivation and Context
Similar to #21095
2024-06-19 20:47:21 -07:00
Changming Sun
27f3ac78d4
Delete RoslynAnalyzers (#21104)
### Description
Delete RoslynAnalyzers. Use CodeQL instead.


### Motivation and Context
Now we already have CodeQL which is modern and also covers C# code. The
RoslynAnalyzers one is not in our pull request pipelines. The
"RoslynAnalyzers@2" task is outdated and needs be upgraded. I will
delete it for now since we already have CodeQL.
2024-06-19 20:11:15 -07:00
Changming Sun
be423747b1
Delete pyop (#21094)
### Description
Remove the "--enable_language_interop_ops" build flag, because the code
is incompatible with the latest numpy, and the build flag is not used
anywhere except a macOS CI pipeline. It does not seem to have a ship
plan.


### Motivation and Context
The build error was:
```
onnxruntime/core/language_interop_ops/pyop/pyop.cc:122:85: error: no member named 'elsize' in '_PyArray_Descr'
                                  static_cast<int64_t>(PyArray_DescrFromType(type)->elsize),
                                                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~  ^
```
2024-06-19 16:21:33 -07:00
Adrian Lizarraga
3ae5df1d18
[QNN EP] Update QNN SDK to 2.23.0 (#21008)
### Description
- Updates CI pipelines to use QNN SDK 2.23.0 by default.
- QNN SDK adds support for int64 Cast. This allows QNN EP to support
ONNX ArgMax/ArgMin/TopK operators that generate an int64 graph output.

Example translation of ArgMax:
- **ONNX**:    input --> ArgMax --> output (int64)
- **QNN**: input --> ArgMax --> Cast (int32 to int64) --> output (int64)

### Motivation and Context
Update onnxruntime to use the latest QNN SDK.
2024-06-19 12:37:42 -07:00
Jian Chen
6a0d64e65c
Component Gov round 7 (#21051)
### Description
ignoreDirectories does not recursively include sub folders like we
thought it would. We need to add additional sub folders.



### Motivation and Context
Fix CG :
1.
https://aiinfra.visualstudio.com/Lotus/_componentGovernance/218239/alert/11474679?typeId=25427568
2.
https://aiinfra.visualstudio.com/Lotus/_componentGovernance/218239/alert/11475140?typeId=25421034&pipelinesTrackingFilter=0
2024-06-19 11:07:02 -07:00
Yi Zhang
cc3168bcbb
Add UsePython task in Nuget_Packaging_CPU stage (#21095)
### Description
supplement of https://github.com/microsoft/onnxruntime/pull/21062



### Motivation and Context
2024-06-19 21:09:37 +08:00
Scott McKay
5fc60f36f2
Update to the net8 MAUI targets. Remove Xamarin. (#21062)
### Description
<!-- Describe your changes. -->
Xamarin is EOL so remove support.
The MAUI targets are EOL and need updating.
https://dotnet.microsoft.com/en-us/platform/support/policy/maui

Other cleanups:
- netcoreapp3.1 is EOL
- the net6 macos target was added in the mistaken belief that was for
MAUI mac support, but that is actually via the mac-catalyst target which
we recently added support for.
- some CIs that were using the old build setup of splitting pre-net6
targets. The ORT C# bindings csproj was updated last year and the
`PreNet6` and `SelectedTargets` properties no longer exist as they were
replaced by the simpler `IncludeMobileTargets` property.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Remove EOL components.
#21058
2024-06-19 16:20:58 +10:00
cloudhan
ddd4ce3cb7
[ROCm] Update ck to use ck_tile (#21030) 2024-06-19 14:06:10 +08:00
Yi Zhang
809cb26ace
Use A100 for LLama2 model test (#21068)
### Description




### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-06-18 11:04:02 +08:00
Changming Sun
9ef4f1b789
Update pybind11 (#21072)
### Description
Upgrade pybind11 to the latest as suggested by @gnought in #21063

### Motivation and Context
Recently numpy released a new version, which caused compatibility issue
between the latest numpy version and the latest ONNX Runtime version.
2024-06-17 19:50:57 -07:00
Scott McKay
159fe9d4f3
Update to mobile model usability checker (#19843)
### Description
<!-- Describe your changes. -->

- Add check for CoreML MLProgram supported ops
- Only check usability with ORT Mobile package if requested
- this package will be deprecated so info is a) of minimal value and b)
can be confusing.
- Output more things at INFO level
- a lot of meaningful info was only output at DEBUG level. The default
INFO level is more useful
  - dump full partition info at DEBUG level
- Check subgraphs fully
  - CoreML can handle a subgraph
- TBD if we want to add support for adding a subgraph to the parent
graph for Loop and If nodes
    - most likely will be required for simple If nodes to be performant
- Check 5D CoreML limitation

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Improve helper tools

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2024-06-18 07:50:33 +10:00
Changming Sun
80a60d9a65
Update ONNX installing script (#21044)
Avoid using command line flags to pass in CMAKE_PREFIX_PATH. Use
environment variables instead.
Because, otherwise the value of CMAKE_PREFIX_PATH could get encoded
twice. For example, if the prefix is `C:\a\root`, then in
tools/ci_build/github/windows/helpers.ps1 we set it in Env:CMAKE_ARGS
which will be consumed by ONNX. Then when ONNX get it and decoded it,
ONNX will get `C:aroot` instead. Then because the path doesn't exist,
the CMAKE_PREFIX_PATH couldn't take effect when the script installs
ONNX. This PR fixes the issue.

The issue got discovered when I tried to upgrade cmake to a newer
version. Now our Windows CPU CI build pipeline uses cmake 3.27. In the
main branch even the CMAKE_PREFIX_PATH setting does not work, cmake
still can find protoc.exe from the directories. However, starting from
3.28 cmake changed it. With the newer cmake versions the find_library(),
find_path(), and find_file() cmake commands no longer search in
installation prefixes derived from the PATH environment variable.
2024-06-13 23:49:41 -07:00
Ye Wang
f35dd1407f
custom allreduce cuda kernel (#20703)
### Description
<!-- Describe your changes. -->

Conditionally route to custom AllReduce kernel when buffer size and gpu
numbers meet certain requirements. Otherwise, keep using NCCL's
AllReduce.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

---------

Co-authored-by: Ye Wang <wangye@microsoft.com@h100vm-ort.kxelwkzfzxguje5bxvwxxs135a.gvxx.internal.cloudapp.net>
Co-authored-by: Your Name <you@example.com>
2024-06-13 11:09:49 -07:00
Jian Chen
9daed5565a
Component Governance Fix round 6 (#21021)
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-06-13 09:10:51 -07:00
Changming Sun
73271dd329
Move jobs in onnxruntime-Win2022-GPU-T4 machine pool to onnxruntime-Win2022-GPU-A10 (#21023)
### Description
Move jobs in onnxruntime-Win2022-GPU-T4 machine pool to
onnxruntime-Win2022-GPU-A10

### Motivation and Context
To reduce the variants of VM images we need to maintain. Now we have 3:
1. Windows 2022 CPU
2. Windows 2022 GPU A10
3. Windows 2022 GPU T4

This change allows us removing the last one.
2024-06-12 22:04:40 -07:00
Changming Sun
99f0fe3fae
Fix a few issues in "Zip-Nuget-Java-Nodejs Packaging Pipeline" (#21014)
### Description
Fix a few issues in the Windows TRT job in "Zip-Nuget-Java-Nodejs
Packaging Pipeline":
1. It is a Windows job. It should not use bash(which is usually not
available on Windows).
2. When it sets ADO vars, it missed a semicolon 

Here is the doc of how to set ADO vars via scripts:
https://learn.microsoft.com/en-us/azure/devops/pipelines/process/set-variables-scripts?view=azure-devops&tabs=bash

You could see it needs a semicolon . Without the semicolon , the vars
will have an extra quotation mark in their values.
2024-06-12 09:44:24 -07:00
Yi Zhang
17d5dc503f
Upgrade ESRP signing task from v2 to v5 (#20995)
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-06-12 08:31:53 +08:00
Tianlei Wu
b3fc9b5a0e
[CUDA] upgrade cutlass to 3.5.0 (#20940)
### Description
Upgrade cutlass to 3.5 to fix build errors using CUDA 12.4 or 12.5 in
Windows
- [x] Upgrade cutlass to 3.5.0.
- [x] Fix flash attention build error with latest cutlass header files
and APIs. This fix is provided by @wangyems.
- [x] Update efficient attention to use new cutlass fmha interface.
- [x] Patch cutlass to fix `hrsqrt` not found error for sm < 53.
- [x] Disable TF32 Staged Accumulation to fix blkq4_fp16_gemm_sm80_test
build error for cuda 11.8 to 12.3.
- [x] Disable TRT 10 deprecate warnings. 

The following are not included in this PR:
* TRT provider replaces the deprecated APIs.
* Fix blkq4_fp16_gemm_sm80_test build error for cuda 12.4 or 12.5. This
test is not built by default unless you add `--cmake_extra_defines
onnxruntime_ENABLE_CUDA_EP_INTERNAL_TESTS=ON` in build command.

To integrate to rel-1.18.1: Either bring in other changes (like onnx
1.16.1), or generate manifest and upload a new ONNX Runtime Build Time
Deps artifact based on rel-1.18.1.

### Motivation and Context
https://github.com/microsoft/onnxruntime/issues/19891
https://github.com/microsoft/onnxruntime/issues/20924
https://github.com/microsoft/onnxruntime/issues/20953
2024-06-11 13:32:15 -07:00
Jian Chen
05032e5e5f
Updating cudnn from 8 to 9 on exsiting cuda 12 docker image (#20925)
### Description
Adding support of cudnn 9 


### Motivation and Context
Keep exsiting  cuda 12.2 with nvidia dirver 535
2024-06-11 09:37:16 -07:00
Changming Sun
ae4a2e6b3f
Publish Build Symbols for DML nightly nuget package (#20988)
### Description
Publish Build Symbols for DML nightly nuget package.
2024-06-10 17:53:22 -07:00
Changming Sun
dc545d366d
Publish debug symbols for Windows python packages (#20973)
### Description
1. Publish debug symbols for Windows python packages. This PR will
publish them to ADO. Later on I will also replicate them to Microsoft
Symbol Server.
2. Build the packages in Release mode instead of RelWithDebInfo, to be
consistent with the other platforms(Linux/macOS/...)


### Motivation and Context
To help debug things. Sometimes we found an issue, but we couldn't debug
it because we didn't have symbols, and once we rebuilt the package
locally the issue was gone. This change would be helpful for such
scenarios.

Build log:
https://aiinfra.visualstudio.com/Lotus/_build?definitionId=841
2024-06-10 12:33:49 -07:00
Edward Chen
981893c318
Remove deprecated "mobile" packages (#20941)
# Description

This PR removes the building of the ORT "mobile" packages and much of the associated infrastructure which is no longer needed.

Not removed yet - tools/ci_build/github/android/mobile_package.required_operators.config and the helper scripts that depend on it.

# Motivation and Context

The mobile packages were deprecated in 1.18. Users should use the full packages (Android - onnxruntime-android, iOS - onnxruntime-c/onnxruntime-objc) instead or do a custom build.
2024-06-07 16:20:32 -05:00
Changming Sun
a53f692832
Update c-api-noopenmp-packaging-pipelines.yml: remove CUDA version parameter (#20955)
### Description
Update c-api-noopenmp-packaging-pipelines.yml: remove CUDA version
parameter
To reduce confusion. This pipeline is for generating CUDA 11 packages.
Just it. Not CUDA 12.

### Motivation and Context
In the last release we accidentally published CUDA 12(instead of CUDA
11) packages to nuget.org.
We also tried to publish CUDA 12 packages to
https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/ORT-Nightly.
Luckily it didn't go through because a package with the same version
number already existed there. Every time when someone runs this pipeline
with CUDA version set to 12, the built packages will be published to
https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/ORT-Nightly.
And GenAI team's build pipelines are based on the nightly packages. So
sometimes GenAI team builds their packages with CUDA 12 and sometimes
with CUDA 11, which is very random.
Therefore, please limit the use of pipeline parameters. Most Azure
DevOps yml files are template files. They should use parameters. But the
top level yml files should be more careful on that.
2024-06-07 11:19:59 -07:00
Jian Chen
d32adb26f2
Refactor deprecated gradle syntax (#20922)
To replaced deprecated API. 
Should verify with the `Gradle cmakeCheck` step from
`Windows_Packaging_CPU_x64_default` stage from the Zip-Nuge-...
pipeline.
2024-06-07 11:08:52 -07:00
Jian Chen
96228c86a0
Adding Job names to jobs without a name (#20961)
### Description
Adding Job names to jobs without a name

### Motivation and Context
This way we will know which job fails CG scan.
2024-06-06 19:09:21 -07:00
Adrian Lizarraga
b5eb9e8a8a
[QNN EP] Update to QNN SDK 2.22 (#20628)
### Description
- Updates pipelines to use QNN SDK 2.22 by default.
- Linux QNN pipeline now uses an Ubuntu 22.04 image (required by QNN
SDK)
- Android QNN pipeline still uses the current Ubuntu 20.04 image. Will
update in a separate PR.
- Disables QDQ LayerNorm test that triggers QNN's graph finalization
error on QNN 2.22
- Increases accuracy tolerance for various HTP tests so that they pass
on Windows arm64.



### Motivation and Context
Test QNN EP with latest QNN SDK version by default.

---------

Signed-off-by: adrianlizarraga <adlizarraga@microsoft.com>
2024-06-05 18:25:23 -07:00
Jian Chen
5faeaf6437
Remove failOnStderr from Gradle cmakeCheck (#20919)
### Description
Remove failOnStderr from Gradle cmakeCheck



### Motivation and Context
The Gradle is still using the deprecated API
2024-06-04 13:54:49 -07:00
liqun Fu
51bc53580d
Update to onnx 1.16.1 (#20702) 2024-06-04 11:06:28 -07:00
Changming Sun
3dd6fcc089
Upgrade min ios version to 13.0 (#20773)
To align with Office and other MS products.
Office's support policy is:
"Office for iPad and iPhone is supported on the two most recent versions
of iOS and iPadOS. When a new version of iOS or iPadOS is released, the
Office Operating System requirement becomes the two most recent
versions: the new version of iOS or iPadOS and the previous version."
(from https://products.office.com/office-system-requirements)

The latest iOS version is 17. So they support both 17 and 16. Here I set
our min iOS version to 13 so that it will be a superset of what Office
supports.

This change would allow us using C++17's std::filesystem feature in the
core framework. The modifications were generated by running
```bash
 find . -type f -exec sed -i "s/apple_deploy_target[ =]12.0/apple_deploy_target=13.0/g"  {} \;
```

Cannot use 15.0 because otherwise iOS packaging would fail with:

```
/Users/runner/work/1/b/apple_framework/intermediates/iphoneos_arm64/Release/_deps/coremltools-src/mlmodel/src/MILBlob/Util/Span.hpp:288:9: error: cannot use 'throw' with exceptions disabled
        MILVerifyIsTrue(index < Size(), std::range_error, "index out of bounds");
```

The Google OSS libraries we use only officially support iOS 15+.
2024-06-04 10:15:20 -07:00
Yi Zhang
c5087b9b58
Improve stable diffusion image parity test stability (#20904)
### Description
1. Add one image into whitelist, but if the image is hit, the pipeline
status is warning.
2. adjust the image parity test tolerance



### Motivation and Context
improve pipeline stability
2024-06-04 10:19:32 +08:00
Jian Chen
456ab09d17
Component Governance fix round 5 (#20905)
…over the case where there is only single repo checked out

### Description
adding $(Build.SourcesDirectory)/cmake/external/onnx/third_party to
cover the case where there is only single repo checked out



### Motivation and Context
Fix CG issue
https://aiinfra.visualstudio.com/Lotus/_componentGovernance/97926/alert/8862110?typeId=16576846
2024-06-03 14:22:22 -07:00
Jian Chen
ae8df4db8f
Split java's gradle build and test (#20817)
### Description

This PR to allow `./gradlew cmakeCheck` failed on
Windows_Packaging_(CUDA|TensorRT) Job. This way, it will still generate
all nessary jar and pom file need for later stage to consume while
`./gradlew cmakeCheck`will be also run again in the
Windows_Packaging_(CUDA|TensorRT)_Testing stage.


### Motivation and Context
Reduce the time of All java packaging stages by 30+ min.
2024-06-03 14:08:45 -07:00
Changming Sun
d13cabf7f9
Upgrade GCC and remove the dependency on GCC8's experimental std::filesystem implementation (#20893)
### Description
This PR upgrades CUDA 11 build pipelines' GCC version from 8 to 11. 

### Motivation and Context

GCC8 has an experimental std::filesystem implementation which is not ABI
compatible with the formal one in later GCC releases. It didn't cause
trouble for us, however, ONNX community has encountered this issue much.
For example, https://github.com/onnx/onnx/issues/6047 . So this PR
increases the minimum supported GCC version from 8 to 9, and removes the
references to GCC's "stdc++fs" library. Please note we compile our code
on RHEL8 and RHEL8's libstdc++ doesn't have the fs library, which means
the binaries in ONNX Runtime's official packages always static link to
the fs library. It is just a matter of which version of the library, an
experimental one or a more mature one. And it is an implementation
detail that is not visible from outside. Anyway, a newer GCC is better.
It will give us the chance to use many C++20 features.

#### Why we were using GCC 8?
It is because all our Linux packages were built on RHEL8 or its
equivalents. The default GCC version in RHEL8 is 8. RHEL also provides
additional GCC versions from RH devtoolset. UBI8 is the abbreviation of
Red Hat Universal Base Image 8, which is the containerized RHEL8. UBI8
is free, which means it doesn't require a subscription(while RHEL does).
The only devtoolset that UBI8 provides is GCC 12, which is too new for
being used with CUDA 11.8. And our CUDA 11.8's build env is a docker
image from Nvidia that is based on UBI8.
#### How the problem is solved
Almalinux is an alternative to RHEL. Almalinux 8 provides GCC 11. And
the CUDA 11.8 docker image from Nvidia is open source, which means we
can rebuild the image based on Almalinux 8 to get GCC 11. I've done
this, but I cannot republish the new image due to various complicated
license restrictions. Therefore I put them at an internal location in
onnxruntimebuildcache.azurecr.io.
2024-06-03 10:14:08 -07:00
Jian Chen
217b66fd85
Update py-publishing pipeline to use the resoure from packaging pipeline (#20888)
### Description
<!-- Describe your changes. -->



### Motivation and Context
To allow nightly release to be automatic triggered
2024-06-01 16:10:02 -07:00
Changming Sun
67bc9438d7
Update training packaging pipeline's docker files (#20853)
### Description
Similar to #20786 . The last PR was able to update all pipelines and all
docker files. This is a follow-up to that PR.

### Motivation and Context
1. To extract the common part as a reusable build infra among different
ONNX Runtime projects.
2. Avoid hitting docker hub's limit: 429 Too Many Requests - Server
message: toomanyrequests: You have reached your pull rate limit. You may
increase the limit by authenticating and upgrading:
https://www.docker.com/increase-rate-limit
2024-05-30 23:48:42 -07:00
Edward Chen
a508130456
Address React Native pipeline component detection timeout (#20871)
mac-react-native-ci-pipeline.yml:
- We don't need to run component detection for PR builds so just disable it there.

npm-packaging-pipeline.yml:
- Manually added component detection task was being added twice - removed one.
- Increased timeout of stage where component detection is run since the existing timeout was close for some builds.
2024-05-30 16:37:03 -07:00