Commit graph

2818 commits

Author SHA1 Message Date
Changming Sun
5d7030e4c6
Revert DML pipeline changes (#23135)
### Description
Previously we wanted to add DirectML EP to existing onnxruntime Windows
CUDA packages. After careful consideration, we will postpone the change.
This PR reverts some pipeline changes previously made by @mszhanyi and
@jchen351 .
2024-12-18 10:42:10 -08:00
Ankit Maheshkar
1f88284f96
OVEP 1.21.0 Development Updates (#23080)
### Description
OVEP development changes for ORT 1.21 Release
 
 
### Motivation and Context
- Has Critical Bug Fixes
- Improved Performance optimizations for both memory & inference latency
(https://github.com/intel/onnxruntime/pull/513)
- Enabled Model Compilation using NPUW
(https://github.com/intel/onnxruntime/pull/508)
- Fixed support for EPContext embed mode 0 for lower memory utilization
- Updated NuGet package name as `Intel.ML.OnnxRuntime.OpenVino`
- Fixed QDQ Stripping logic on NPU
2024-12-11 22:26:32 -08:00
A-Satti
b14b4ec703
Restore Qspectre flag (#23060)
Restore a removed Qspectre flag and update comment

### Motivation and Context
Adjustment for PR
f5293d253c
2024-12-09 21:52:21 -08:00
A-Satti
f5293d253c
Update Intel Thread Counts (#22894)
### Description
The default thread count methodology in onnxruntime did not account for
upcoming Intel microarchitectures, leading to a suboptimal thread
count. Optimizing the thread count for new Intel microarchitectures
reveals gains on the majority of models across datatypes, with up to
~1.5x speedup.


### Motivation and Context
Applications should run on Intel with the most performant thread
configuration for the majority of models. With new microarchitectures,
adjusting the thread count methodology is required to take advantage of
their differences.
2024-12-06 13:56:50 -08:00
Yi Zhang
6ed77cc374
Deprecate macos-12 (#23017)
### Motivation and Context
ESRP code-sign task has supported .net 8, so we can remove macos-12
2024-12-05 14:07:21 +08:00
Jian Chen
f340b3cad3
Adding DML to python cuda package (#22606) 2024-12-04 21:20:12 -05:00
Yulong Wang
a615bd6688
Bump version of Dawn to 12a3b24c4 (#23002)
### Description

Upgrade version of Dawn.

Removed dawn.patch, because all patches are included in upstream.

Updated code affected by API changes (`const char*` ->
`WGPUStringView`)


2024-12-04 09:47:16 -08:00
Jian Chen
9ed0c7fe26
Redo "Update Gradle version 8.7 and java version 17 within onnxruntime/java" (#22923)
2024-12-02 18:34:25 -08:00
Jian Chen
6c2ff5fc55
Refactor emulator start and stop functions for clarity and efficiency (#22861)
### Description
This pull request introduces several enhancements and new
functionalities to the `tools/python/util/android/android.py` file,
focusing on improving the management of Android emulators. The most
important changes include adding a timeout parameter to the
`start_emulator` function, adding checks to prevent multiple emulators
from running simultaneously, and introducing new utility functions to
manage emulator processes more effectively.

Enhancements to `start_emulator` function:

* Added a `timeout_minutes` parameter to the `start_emulator` function
to make the startup timeout configurable.
* Added a check to prevent starting a new emulator if one with the same
AVD name is already running.
* Included additional emulator arguments `-verbose` for better control
and debugging.
* Added a final verification step to ensure the emulator has started
successfully.

New utility functions for managing emulator processes:

* Introduced `check_emulator_running_using_avd_name `,
`check_emulator_running_using_process`, and
`check_emulator_running_using_pid` to check if an emulator is running
based on AVD name, process instance, or PID, respectively.
* Added `stop_emulator_by_proc` and `stop_emulator_by_pid` functions to
stop the emulator process using a `subprocess.Popen` instance or PID,
with a configurable timeout.
* Updated the `stop_emulator` function to use the new utility functions
for stopping the emulator process.

These changes enhance the robustness and flexibility of the emulator
management utilities, making it easier to handle different scenarios in
CI environments and development workflows.
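The stop-by-process behavior described above can be sketched as follows. This is a hypothetical illustration based on the summary, not the actual code in `tools/python/util/android/android.py`; the function name matches the description, but the body and defaults are assumptions.

```python
import subprocess


def stop_emulator_by_proc(proc, timeout_seconds=20):
    """Politely terminate an emulator process, escalating to kill on timeout.

    `proc` is a subprocess.Popen instance, as in the description above.
    """
    if proc.poll() is not None:
        return  # process already exited
    proc.terminate()
    try:
        proc.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # did not exit in time; force-kill
        proc.wait()
```

The same terminate-then-kill pattern would apply to the PID-based variant, using `os.kill` instead of `Popen` methods.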




---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
2024-12-02 09:29:17 -08:00
wejoncy
a24723df16
[CoreML ] ML Program more operators support [3/N] (#22710)
### Description
- Erf
- Round
- Max
- ReduceMax
- ReduceMean
- ReduceSum
- Unsqueeze
- Squeeze
- Softmax




---------

Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-11-28 09:21:02 +08:00
Yi Zhang
b930b4ab5b
Limit PipAuthenticate in Private Project now (#22954)
### Description
Fixes regression in post merge pipeline caused by #22612



### Motivation and Context
So far, the artifactFeeds are not available in the Public Project
2024-11-27 13:32:35 +08:00
sheetalarkadam
f80afeb9a1
Override android qnn sdk version with pipeline param (#22895)
We need to be able to control/override the exact version of the QNN SDK used
for the Android build, as qnn-runtime (Maven package) releases are slower
than QNN SDK releases.
2024-11-25 21:01:05 -08:00
Yi Zhang
85751e7276
Build DML in Windows GPU CI pipeline (#22869)
### Description
Add a new stage to build CUDA and DML in the Windows GPU CI pipeline (PR
checks) to prevent regressions introduced by new CUDA tests.
Update the name prefix of all tests in cuda/testcases to CudaEp so they
can be skipped easily.

### Motivation and Context
1. CudaNhwcEP is added by default when using the CUDA EP.
2. If onnxruntime_ENABLE_CUDA_EP_INTERNAL_TESTS is enabled, the tests in
tests/provider/cuda/testcases are added too.

### To do
Add enable_pybind in the new stage.
Currently, --enable_pybind triggers some python tests, like
onnxruntime_test_python.py, which use the get_available_providers() API.
More discussion is needed to decide how to make it work.
2024-11-25 10:50:52 +08:00
Tianlei Wu
c97dd6e3c1
Update transformers test requirements (#22911)
### Description

* Install PyTorch for transformers tests. The installation is before
python tests so that it can use torch if needed.
* Update protobuf and numpy versions used in transformers test.

### Motivation and Context

Currently, transformers tests are enabled in the following CI pipelines:
* Linux CPU CI Pipeline (torch for cpu-only)
* Linux GPU CI Pipeline (torch for cuda 12)
* Windows GPU CUDA CI Pipeline (torch for cpu-only right now, note that
we might change it to torch for cuda 12 in the future).

For ROCm CI Pipeline, transformer tests are enabled but skipped since
onnx package is not installed in CI.

Previously, torch was not installed before python tests, so some tests
depending on torch were skipped like
[test_bind_onnx_types_not_supported_by_numpy](f6e1d44829/onnxruntime/test/python/onnxruntime_test_python_iobinding.py (L199))
or [test
user_compute_stream](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/test/python/onnxruntime_test_python.py#L465-L476).

In this PR, we changed build.py to install torch before running python
tests.
2024-11-22 09:45:12 -08:00
Tianlei Wu
8d99b1a8dc
reduce GQA test combinations (#22918)
### Description
* Reduce GQA test combinations to save about 35 minutes test time in CI
pipelines.
* Show latency of transformers tests
* Use seed in DMMHA test to avoid random failure.
* For test_flash_attn_rocm.py, change the test skipping condition from "has cuda
ep" to "not has rocm ep", so that it does not run in cpu build.
* For test_flash_attn_cuda.py, move flash attention and memory efficient
attention tests to different classes, so that we can skip a test suite
instead of checking in each test.

### Motivation and Context
It takes too long to run GQA tests in CI pipelines since there are too
many combinations.

###### Linux GPU CI Pipeline
Before: 5097 passed, 68 skipped, 8 warnings in 1954.64s (0:32:34)
After:  150 passed, 176 skipped, 8 warnings in 530.38s (0:08:50)
Time Saved: **1424** seconds (0:23:44)

###### Windows GPU CUDA CI Pipeline
Before: 1781 passed, 72 skipped, 6 warnings in 605.48s (0:10:05)
After: 116 passed, 118 skipped, 6 warnings in 275.48s (0:04:35) 
Time Saved: **330** seconds (0:05:30)

###### Linux CPU CI Pipeline
Before: 5093 passed, 72 skipped, 4 warnings in 467.04s (0:07:47)
- 212.96s transformers/test_gqa_cpu.py::TestGQA::test_gqa_past
- 154.12s transformers/test_gqa_cpu.py::TestGQA::test_gqa_no_past
- 26.45s
transformers/test_gqa_cpu.py::TestGQA::test_gqa_interactive_one_batch

After: 116 passed, 210 skipped, 4 warnings in 93.41s (0:01:33)
- 0.97s  transformers/test_gqa_cpu.py::TestGQA::test_gqa_past
- 19.23s transformers/test_gqa_cpu.py::TestGQA::test_gqa_no_past
- 2.41s
transformers/test_gqa_cpu.py::TestGQA::test_gqa_interactive_one_batch

Time Saved: **374** seconds (0:06:14).
2024-11-21 12:26:46 -08:00
Tianlei Wu
55f0559e5d
Update attention fusion to support SDPA pattern (#22629)
### Description
Match the new SDPA pattern for huggingface BERT models exported from the
latest transformers package.

Some changes of transformers tests in CI pipeline:
(1) Enable tests for bert, distilbert and roberta models in CI.
(2) Remove out-of-date tests for huggingface models that were marked as
slow and not enabled in CI pipeline.
(3) Upgrade transformers package version to the latest.

### Motivation and Context

Recent huggingface transformers use torch SDPA in BERT modeling. The
graph pattern change causes attention fusion to stop working. Update
the fusion script to match the new pattern.
2024-11-21 09:42:41 -08:00
kailums
1e605be166
bigmodel pipeline update cp38 to cp310 (#22793)
### Description
When updating from cp38 to cp310, there were some issues with the bigmodel
pipeline; two jobs failed: stable_diffusion and whisper.

1. For stable_diffusion, we were using the
"nvcr.io/nvidia/pytorch:22.11-py3" image from the NVIDIA repo, which is for
CUDA 11 and Python 3.8; NVIDIA does not provide a Python 3.10 build of it
for CUDA 11, and the latest version of this image is for CUDA 12 and
Python 3.10. To solve this, the job now uses an Ubuntu 22.04 image and
installs all the python packages the job needs.
2. For whisper, the original image was Ubuntu 20.04, which doesn't
have Python 3.10, so it had to be updated to Ubuntu 22.04.
2024-11-21 07:25:01 -08:00
Jian Chen
369d7bf887
Update the Docker image version (#22907)
2024-11-21 19:38:39 +08:00
Yi Zhang
a28246a994
Revert "Update Gradle version 8.7 and java version 17 within onnxrunt… (#22914)
…ime/java (#22771)"

This reverts commit 632a36a233.

### Motivation and Context
Running E2E tests using BrowserStack failed due to this PR.
2024-11-21 18:12:28 +08:00
Kyle
712bee13db
Fix Pipeline Timeout Issue (#22901)
### Description
Extend the timeout for a job that always failed.


2024-11-20 17:18:50 +01:00
Changming Sun
13346fdf18
Cleanup code (#22827)
### Description
1. Delete the TVM EP because it is no longer maintained.
2. Delete ortmodule-related docker files and scripts.
2024-11-19 14:13:33 -08:00
Caroline Zhu
0d00fc3130
[mobile] Fix for mac-ios-packaging pipeline (#22879)
### Description
Appends variant name to the Browserstack artifacts that are published so
that we don't run into the error:
"##[error]Artifact browserstack_test_artifacts already exists for build
609095."

[Working pipeline
run](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=609503&view=results)


### Motivation and Context
- onnxruntime-ios-packaging-pipeline has been failing
2024-11-19 09:27:51 -08:00
Adrian Lizarraga
497b06f0a9
[QNN EP] QNN SDK 2.28.2 (#22844)
### Description
- Updates pipelines to use QNN SDK 2.28.2.241116.
- Re-enable LayerNormalization unit tests that failed with accuracy
errors with the previous QNN SDK (2.28.0).
- Update QNN EP to no longer provide a dummy bias for LayerNorm if the
QNN SDK version is >= 2.28.0.
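The dummy-bias change above amounts to a version gate. A minimal sketch, assuming a simple (major, minor) version comparison (the function name and representation are hypothetical, not QNN EP code):

```python
def needs_dummy_layernorm_bias(sdk_version):
    """Return True when a dummy LayerNorm bias is still needed.

    sdk_version is a (major, minor) tuple, e.g. (2, 28). Per the
    description above, QNN SDK >= 2.28.0 no longer needs the dummy bias.
    """
    return sdk_version < (2, 28)
```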


### Motivation and Context
Use the latest QNN SDK. This version improves inference latency for
certain customer models.
2024-11-18 20:10:36 -08:00
Yi Zhang
135d8b2beb
Fix CUDA/DML package exception caused by ENABLE_CUDA_NHWC_OPS (#22851)
### Description
Now, ENABLE_CUDA_NHWC_OPS is enabled by default. This adds a new chance
for the CUDA provider to be created while both CUDA and DML are enabled.


2024-11-18 10:46:23 +08:00
Preetha Veeramalai
ac9c135b95
Ovep develop 1.21 (#22824)
### Description
OVEP development changes for ORT 1.21 Release


### Motivation and Context
- Has critical bug fixes
- Enabled support for concurrent execution of models
- Support for OV 2024.5
- Memory optimizations for the NPU platform

---------

Co-authored-by: jatinwadhwa921 <jatin.wadhwa@intel.com>
Co-authored-by: Ankit Maheshkar <ankit.maheshkar@intel.com>
Co-authored-by: sfatimar <sahar.fatima@intel.com>
Co-authored-by: saurabhkale17 <saurabh1.kale@intel.com>
Co-authored-by: TejalKhade28 <tejal.khade@intel.com>
Co-authored-by: Javier E. Martinez <javier.e.martinez@intel.com>
2024-11-14 20:10:07 -08:00
Jian Chen
632a36a233
Update Gradle version 8.7 and java version 17 within onnxruntime/java (#22771)
### Description
This change updates the Gradle version within the Java project to 8.7 and
upgrades Java to 17. The Gradle version for react-native was also updated
to 7.5 to make it compatible with the changes from the Java directory.
However, the target Java version remains the same; the Java version for
these will be upgraded in a separate PR.

This is split from #22206

### Motivation and Context
This is the first step to upgrade the react native version.
2024-11-14 17:10:44 -08:00
Yifan Li
562ddce270
Re-enable test symbolic shape infer (#22737)
### Description
It seems that after CI updated to py310, numpy got updated to 2.0 and sympy
1.2 failed to cast a float numpy array.
Pin sympy to 1.13 when py>=3.9 and re-enable the unit test.
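For illustration, such a conditional pin could be expressed as a requirements-style constraint (the exact marker syntax here is an assumption based on the description, not the file as committed):

```
sympy==1.13 ; python_version >= "3.9"
```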

### Motivation and Context
Error: Linux CPU CI
2024-11-14 11:28:00 -08:00
Jian Chen
c645bd202c
Fix spellchecks from Optional Lint (#22802)
2024-11-14 10:27:33 -08:00
Jian Chen
5659d055ee
Fix Linux CI pipeline where ep was not provided for py-packaging-linux-test-cpu.yml (#22828)
### Description
The current linux-ci-pipeline was broken due to missing parameters for the
`py-packaging-linux-test-cpu.yml` template


### Motivation and Context
Fix Linux CI pipeline
2024-11-14 09:41:37 -08:00
Jian Chen
f423b737a9
Fix Linux python CUDA package pipeline (#22803)
### Description
Making the `-p` option optional in the Linux python CUDA package pipeline



### Motivation and Context
The Linux stage of the Python-CUDA-Packaging-Pipeline has failed since the
merge of #22773
2024-11-13 14:20:21 -08:00
Jian Chen
75a44582ba
Update all JDK version to 17 (#22786) 2024-11-12 11:42:18 -08:00
Adrian Lizarraga
b1e0930eab
Fix build for linux python wheel (#22801)
### Description
Fixes command for building Linux python packages by preventing an empty
`-p` command-line option from being passed to a subsequent build script:
1f3b675453/tools/ci_build/github/linux/run_python_dockerbuild.sh (L37)



### Motivation and Context
A recent
[PR](https://github.com/microsoft/onnxruntime/pull/22773) introduced a new
optional command-line option (`-p`) to pass custom python exe paths. We
need to check if the option is empty before forwarding it to a
separate build script.
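The guard described above can be sketched in a few lines. This is a hypothetical illustration (the function and base arguments are invented for the example; the real logic lives in a shell script):

```python
def docker_build_args(python_exe=None):
    """Build an argument list, forwarding -p only when a path was given."""
    args = ["--build-dir", "build"]  # illustrative base arguments
    if python_exe:  # empty string or None: no bare "-p" is forwarded
        args += ["-p", python_exe]
    return args
```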
2024-11-11 15:20:07 -08:00
Jian Chen
885a7acd45
Fix warning - LegacyKeyValueFormat: "ENV key=value" should be used instead of legacy "ENV key value" format (#22800)
### Description
This PR fixes the warning `LegacyKeyValueFormat: "ENV key=value" should be
used instead of legacy "ENV key value" format` in all Dockerfiles.


2024-11-11 13:05:34 -08:00
sheetalarkadam
e8f1d73b0b
Add Android QNN Browserstack test (#22434)



### Motivation and Context
Real device test in CI
2024-11-10 16:10:29 -08:00
Yi Zhang
ef281f850a
Add XNNPack build on Linux ARM64 and improve Linux CPU (#22773)
### Description
1. Add XNNPack build on Linux ARM64
2. Build only one python wheel for PR requests.

[AB#49763](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/49763)



### Motivation and Context
Why add the XNNPACK build on Linux ARM64 rather than Windows ARM64?
Because KleidiAI doesn't support Windows:

```
IF(XNNPACK_TARGET_PROCESSOR STREQUAL "arm64" AND XNNPACK_ENABLE_ARM_I8MM AND NOT CMAKE_C_COMPILER_ID STREQUAL "MSVC")
  IF (XNNPACK_ENABLE_KLEIDIAI)
    MESSAGE(STATUS "Enabling KleidiAI for Arm64")
  ENDIF()
ELSE()
  SET(XNNPACK_ENABLE_KLEIDIAI OFF)
ENDIF()
```

---------
2024-11-09 11:26:19 +08:00
Jian Chen
e7987a6b0b
Replace reference to python 3.8 with python 3.10 (#22692)
### Description
This PR sets the default python to 3.10, except for
tools/ci_build/github/azure-pipelines/bigmodels-ci-pipeline.yml. This is
needed because we are no longer using python 3.8.

This PR excludes changes for the Big Models CI because it will require
additional changes, which will be tracked in
USER STORY 52729



2024-11-07 16:51:40 -08:00
Yifan Li
3b7a6eba69
[TensorRT EP] support TensorRT 10.6-GA (#22644)
### Description
* Update CI with TRT 10.6
* Update oss parser to [10.6-GA-ORT-DDS
](https://github.com/onnx/onnx-tensorrt/tree/10.6-GA-ORT-DDS) and update
dependency version
* Update Py-cuda11 CI to use trt10.6


### Motivation and Context
(There will be a 3rd PR to further reduce trt_version hardcoding)
2024-11-06 14:33:46 -08:00
Caroline Zhu
0221693e43
[Mobile] Add E2E BrowserStack tests for iOS tests (#22610)
### Description
- Changes running the E2E iOS tests from running in App Center to
running in BrowserStack
- Steps for running locally can be found in the OneNote

### Motivation and Context
- Follow-up of #22117 
- App Center (the previous platform for running E2E mobile tests) is
getting deprecated in 2025

### Misc info
Additional build steps were required to get the necessary testing
artifacts for BrowserStack. App Center consumed an entire folder, while
BrowserStack requests the following:
1. a ZIP file of all the tests
2. an IPA file of the test app

#### Flow
Here is a rough outline of what is happening in the pipeline:
1. The build_and_assemble_apple_pods.py script builds the relevant
frameworks (currently, this means packages for iOS and Mac)
2. The test_apple_packages.py script installs the necessary cocoapods
for later steps
3. The Xcode build-for-testing task builds the iOS target for the test
app
4. Now that the test app and the tests have been built, we can zip them,
creating the tests .zip file
5. To create the IPA file, we need to create a .plist XML file, which is
generated by the generate_plist.py script.
- Attempts to use the Xcode@5 task to automatically generate the plist
file failed.
- Also, building for testing generates some plist files -- these cannot
be used to export an IPA file.
6. We run the Xcode task to build an .xcarchive file, which is required
for creating an IPA file.
7. We use xcodebuild in a script step to build an IPA file with the
xcarchive and plist files from the last two steps.
8. Finally, we can run the tests using the BrowserStack script.

---------

Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2024-11-06 11:22:29 -08:00
Tianlei Wu
72186bbb71
[CUDA] Build nhwc ops by default (#22648)
### Description

* Build cuda nhwc ops by default.
* Deprecate `--enable_cuda_nhwc_ops` in build.py and add
`--disable_cuda_nhwc_ops` option

Note that it requires cuDNN 9.x. If you build with cuDNN 8, NHWC ops
will be disabled automatically.

### Motivation and Context

In general, NHWC is faster than NCHW for convolution in Nvidia GPUs with
Tensor Cores, and this could improve performance for vision models.

This is the first step to prefer NHWC for CUDA in 1.21 release. Next
step is to do some tests on popular vision models. If it help in most
models and devices, set `prefer_nhwc=1` as default cuda provider option.
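For context, a CUDA provider option such as `prefer_nhwc` is passed per provider when creating a session; a minimal sketch (the session-creation call is commented out since it needs a CUDA build and a model file):

```python
# Provider list with per-provider options; "prefer_nhwc" is the CUDA
# provider option discussed above.
providers = [
    ("CUDAExecutionProvider", {"prefer_nhwc": "1"}),
    "CPUExecutionProvider",
]

# import onnxruntime as ort
# sess = ort.InferenceSession("model.onnx", providers=providers)
```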
2024-11-06 09:54:55 -08:00
Jian Chen
deee48002c
Enable CUDA Python Test (#22717)
2024-11-05 16:26:50 -08:00
Hector Li
017246260f
support Qnn 2 28 (#22724)
### Description
Support QNN 2.28 and update the default QNN version to 2.28 in the build
pipeline.
2024-11-05 15:41:15 -08:00
Jian Chen
3711a655bc
Update DNNL CI python to 310 (#22691)
2024-11-05 09:14:48 -08:00
Yi Zhang
33a2059ced
Remove webgpu ep in mobile packaging stages (#22725)
### Description
The nuget-zip-java packaging pipeline has been failing for 4 days since it
was introduced in #22591
2024-11-05 09:14:26 -08:00
Changming Sun
66980e4646
Refactor the cmake code that is related to delay loading (#22646)
### Description
Refactor the cmake code related to delay loading. Provide a cmake option
to control whether delay loading should be enabled, and disable the option
when python is enabled, due to a known issue.

### Motivation and Context
ONNX Runtime's python package depends on DirectML.dll, but supposedly
the DLL should be delay loaded.
This PR only refactors the code; it doesn't change the behavior.
2024-11-04 16:30:50 -08:00
Kyle
74adfc2099
Nuget Windows AI Pipeline, Disable SDL Submodules. (#22711)
### Description
Set SDL's git submodule to false. 


### Motivation and Context
* The previous job's SDL logs contain a 'git submodule sync' command,
which synchronizes all submodules.

* After setting SDL's git submodules to false, the logs no longer contain
the 'git submodule sync' command.
2024-11-04 08:39:28 -08:00
wejoncy
9daf7664fc
[CoreML] ML Program more ops (2/N) (#22480)
- cast
 - argmax
 - gelu
 - LayerNorm
 - GroupNorm
 - InstanceNorm


---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-11-01 08:37:56 +08:00
Yi Zhang
8e8b62b8b5
Build CUDA and DML together (#22602)
### Description
Now we need to build CUDA and DML in one package, but the CUDA EP and DML
EP can't run in one process; it throws the exception `the GPU device
instance has been suspended`.
So the issue is that the CUDA EP and DML EP coexist at compile time but
can't coexist at run time.

This PR splits the CUDA EP tests and DML EP tests in all unit tests.
The solution is to use 2 environment variables, NO_CUDA_TEST and
NO_DML_TEST, in CI.

For example, if NO_CUDA_TEST is set, the DefaultCudaExecutionProvider
will be nullptr, and the test will not run with the CUDA EP.
In debugging, the CUDAExecutionProvider will not be called.
I think, as long as CUDA functions, like cudaSetDevice, are not called,
DML EP tests can pass.

Disabled the java test testDirectML because it doesn't work now even
without the CUDA EP.
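The environment-variable gating described above can be sketched as follows. The variable names come from the description; the helper function itself is hypothetical, not the actual test-framework code:

```python
import os


def default_test_eps(env=os.environ):
    """Pick which EPs the test run should exercise, per NO_*_TEST vars."""
    eps = []
    if not env.get("NO_CUDA_TEST"):
        eps.append("CUDAExecutionProvider")
    if not env.get("NO_DML_TEST"):
        eps.append("DmlExecutionProvider")
    return eps
```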
2024-10-31 15:51:13 -07:00
Yulong Wang
7a8fa12850
Add implementation of WebGPU EP (#22591)
### Description

This PR adds the actual implementation of the WebGPU EP based on
https://github.com/microsoft/onnxruntime/pull/22318.

This change includes the following:

<details>
<summary><b>core framework of WebGPU EP</b></summary>

  - WebGPU EP factory classes for:
    - handling WebGPU options
    - creating WebGPU EP instance
    - creating WebGPU context
  - WebGPU Execution Provider classes
    - GPU Buffer allocator
    - data transfer
  - Buffer management classes
    - Buffer Manager
    - BufferCacheManager
      - DisabledCacheManager
      - SimpleCacheManager
      - LazyReleaseCacheManager
      - BucketCacheManager
  - Program classes
    - Program (base)
    - Program Cache Key
    - Program Manager
  - Shader helper classes
    - Shader Helper
    - ShaderIndicesHelper
    - ShaderVariableHelper
  - Utils
    - GPU Query based profiler
    - compute context
    - string utils
  - Miscs
    - Python binding webgpu support (basic)
 
</details>

<details>
<summary><b>Kernel implementation</b></summary>


  - onnx.ai (default opset):
- Elementwise (math): Abs, Neg, Floor, Ceil, Reciprocal, Sqrt, Exp, Erf,
Log, Sin, Cos, Tan, Asin, Acos, Atan, Sinh, Cosh, Asinh, Acosh, Atanh,
Tanh, Not, Cast
- Elementwise (activation): Sigmoid, HardSigmoid, Clip, Elu, Relu,
LeakyRelu, ThresholdedRelu, Gelu
- Binary (math): Add, Sub, Mul, Div, Pow, Equal, Greater,
GreaterOrEqual, Less, LessOrEqual
    - (Tensors): Shape, Reshape, Squeeze, Unsqueeze
    - Where
    - Transpose
    - Concat
    - Expand
    - Gather
    - Tile
    - Range
    - LayerNormalization
  - com.microsoft
    - FastGelu
    - MatMulNBits
    - MultiHeadAttention
    - RotaryEmbedding
    - SkipLayerNormalization
    - LayerNormalization
    - SimplifiedLayerNormalization
    - SkipSimplifiedLayerNormalization

</details>

<details>
<summary><b>Build, test and CI pipeline integration</b></summary>

  - build works for Windows, macOS and iOS
  - support onnxruntime_test_all and python node test
  - added a new unit test for `--use_external_dawn` build flag.
  - updated MacOS pipeline to build with WebGPU support
  - added a new pipeline for WebGPU Windows

</details>

This change does not include:

- Node.js binding support for WebGPU (will be a separate PR)
2024-10-29 18:29:40 -07:00
Indy Zhu
e2e837584f
[DML EP] Update DML to 1.15.4 (#22635)
### Motivation and Context
We want the customer to use the latest DirectML.
2024-10-29 17:13:57 -07:00
Yifan Li
951d9aa99f
[TensorRT EP] Refactor TRT version update logic & apply TRT 10.5 (#22483)
### Description
* Leverage template `common-variables.yml` and reduce usage of hardcoded
trt_version

8391b24447/tools/ci_build/github/azure-pipelines/templates/common-variables.yml (L2-L7)
* Among all CI yamls, this PR reduces the usage of hardcoded trt_version
from 40 places to 6 by importing trt_version from `common-variables.yml`
* Apply TRT 10.5 and re-enable control flow op test
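Importing a shared version from a variables template looks roughly like this in an Azure Pipelines yaml (the variable name `trt_version` is an assumption for illustration; the template path follows the link above):

```yaml
variables:
- template: templates/common-variables.yml

steps:
- script: echo "Building against TensorRT $(trt_version)"
```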


### Motivation and Context
- Reduce usage of hardcoding trt_version among all CI ymls

### Next refactor PR 
will work on reducing usage of hardcoding trt_version among
`.dockerfile`, `.bat` and remaining 2 yml files
(download_win_gpu_library.yml & set-winenv.yml, which are step-template
yaml that can't import variables)
2024-10-29 09:23:41 -07:00
Changming Sun
3641d184f8
Add pipauth to more ADO pipelines and enable CSV (#22612)
### Description
1. Add pipauth to more ADO pipeline. (We will use a private ADO feed to
fetch python packages in these pipeline, to improve security)
2. Enforce codeSignValidation(CSV).

### Motivation and Context
Fulfill some internal compliance requirements.
2024-10-28 16:39:22 -07:00
kailums
dd28f09ce2
fix issue when build with hipblasLt on rocm6.1 (#22553)
### Description

The hipblasLt library is released with ROCm 6.x, and onnxruntime's current
code needs some modifications to match the new hipblasLt API.


2024-10-28 13:57:08 +08:00
Tianlei Wu
b4afc6266f
[ROCm] Python 3.10 in ROCm CI, and ROCm 6.2.3 in MigraphX CI (#22527)
### Description
Upgrade python from 3.9 to 3.10 in ROCm and MigraphX docker files and CI
pipelines. Upgrade ROCm version to 6.2.3 in most places except ROCm CI,
see comment below.

Some improvements/upgrades on ROCm/Migraphx docker or pipeline:
* rocm 6.0/6.1.3 => 6.2.3
* python 3.9 => 3.10
* Ubuntu 20.04 => 22.04
* Also upgrade ml_dtypes, numpy and scipy packages.
* Fix message "ROCm version from ..." with correct file path in
CMakeList.txt
* Exclude some NHWC tests since ROCm EP lacks support for NHWC
convolution.

#### ROCm CI Pipeline:
ROCm 6.1.3 is kept in the pipeline for now.
- Failed after upgrading to ROCm 6.2.3: `HIPBLAS_STATUS_INVALID_VALUE ;
GPU=0 ; hostname=76123b390aed ;
file=/onnxruntime_src/onnxruntime/core/providers/rocm/rocm_execution_provider.cc
; line=170 ; expr=hipblasSetStream(hipblas_handle_, stream);` . It need
further investigation.
- cupy issues:
(1) It currently supports numpy < 1.27, might not work with numpy 2.x.
So we locked numpy==1.26.4 for now.
(2) cupy support of ROCm 6.2 is still in progress:
https://github.com/cupy/cupy/issues/8606.

Note on miniconda issues: its libstdc++.so.6 and libgcc_s.so.1 might
conflict with the system ones, so we created links to use the
system ones.

#### MigraphX CI pipeline

MigraphX CI does not use cupy, and we are able to use ROCm 6.2.3 and
numpy 2.x in the pipeline.

#### Other attempts

Other things that I've tried which might help in the future: 

Attempt to use a single docker file for both ROCm and Migraphx:
https://github.com/microsoft/onnxruntime/pull/22478

Upgrade to ubuntu 24.04 and python 3.12, and use venv like
[this](27903e7ff1/tools/ci_build/github/linux/docker/rocm-ci-pipeline-env.Dockerfile).

### Motivation and Context
In 1.20 release, ROCm nuget packaging pipeline will use 6.2:
https://github.com/microsoft/onnxruntime/pull/22461.
This upgrades rocm to 6.2.3 in CI pipelines to be consistent.
2024-10-25 11:47:16 -07:00
dependabot[bot]
7acbd51912
Bump onnx from 1.16.1 to 1.17.0 in /tools/ci_build/github/linux/docker/inference/aarch64/python/cpu/scripts (#22593)
Bumps [onnx](https://github.com/onnx/onnx) from 1.16.1 to 1.17.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/onnx/onnx/releases">onnx's
releases</a>.</em></p>
<blockquote>
<h2>v1.17.0</h2>
<p>ONNX v1.17.0 is now available with exciting new features! We would
like to thank everyone who contributed to this release!
Please visit <a href="https://onnx.ai/">onnx.ai</a> to learn more about
ONNX and associated projects.</p>
<h1>Key Updates</h1>
<h2>ai.onnx Opset 22</h2>
<ul>
<li>Update to support bfloat16:
<ul>
<li><a
href="https://onnx.ai/onnx/operators/onnx__Acos.html#acos-22">Acos</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Acosh.html#acosh-22">Acosh</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Asin.html#asin-22">Asin</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Asinh.html#asinh-22">Asinh</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Atan.html#atan-22">Atan</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Atanh.html#atanh-22">Atanh</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__AveragePool.html#averagepool-22">AveragePool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Bernoulli.html#bernoulli-22">Bernoulli</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Conv.html#conv-22">Conv</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__ConvTranspose.html#convtranspose-22">ConvTranspose</a>,
<a href="https://onnx.ai/onnx/operators/onnx__Cos.html#cos-22">Cos</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Cosh.html#cosh-22">Cosh</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__DeformConv.html#deformconv-22">DeformConv</a>,
<a href="https://onnx.ai/onnx/operators/onnx__Det.html#det-22">Det</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Dropout.html#dropout-22">Dropout</a>,
<a href="https://onnx.ai/onnx/operators/onnx__Elu.html#elu-22">Elu</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__EyeLike.html#eyelike-22">EyeLike</a>,
<a href="https://onnx.ai/onnx/operators/onnx__GRU.html#gru-22">GRU</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__GlobalAveragePool.html#globalaveragepool-22">GlobalAveragePool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__GlobalLpPool.html#globallppool-22">GlobalLpPool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__GlobalMaxPool.html#globalmaxpool-22">GlobalMaxPool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__GridSample.html#gridsample-22">GridSample</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__HardSigmoid.html#hardsigmoid-22">HardSigmoid</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__HardSwish.html#hardswish-22">HardSwish</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__InstanceNormalization.html#instancenormalization-22">InstanceNormalization</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__LSTM.html#lstm-22">LSTM</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__LpNormalization.html#lpnormalization-22">LpNormalization</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__LpPool.html#lppool-22">LpPool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__MaxPool.html#maxpool-22">MaxPool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__MaxRoiPool.html#maxroipool-22">MaxRoiPool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__MaxUnpool.html#maxunpool-22">MaxUnpool</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Mish.html#mish-22">Mish</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Multinomial.html#multinomial-22">Multinomial</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__NegativeLogLikelihoodLoss.html#negativeloglikelihoodloss-22">NegativeLogLikelihoodLoss</a>,
<a href="https://onnx.ai/onnx/operators/onnx__RNN.html#rnn-22">RNN</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__RandomNormal.html#randomnormal-22">RandomNormal</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__RandomNormalLike.html#randomnormallike-22">RandomNormalLike</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__RandomUniform.html#randomuniform-22">RandomUniform</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__RandomUniformLike.html#randomuniformlike-22">RandomUniformLike</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__RoiAlign.html#roialign-22">RoiAlign</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Round.html#round-22">Round</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Selu.html#selu-22">Selu</a>,
<a href="https://onnx.ai/onnx/operators/onnx__Sin.html#sin-22">Sin</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Sinh.html#sinh-22">Sinh</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Softplus.html#softplus-22">Softplus</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__Softsign.html#softsign-22">Softsign</a>,
<a href="https://onnx.ai/onnx/operators/onnx__Tan.html#tan-22">Tan</a>,
<a
href="https://onnx.ai/onnx/operators/onnx__ThresholdedRelu.html#thresholdedrelu-22">ThresholdedRelu</a></li>
</ul>
</li>
</ul>
<h2>Python Changes</h2>
<ul>
<li>Support for numpy &gt;= 2.0</li>
</ul>
<h1>Bug fixes and infrastructure improvements</h1>
<ul>
<li>Fix Check URLs errors <a
href="https://redirect.github.com/onnx/onnx/pull/5972">5972</a></li>
<li>Use CMAKE_PREFIX_PATH in finding libprotobuf <a
href="https://redirect.github.com/onnx/onnx/pull/5975">5975</a></li>
<li>Bump main VERSION_NUMBER to 1.17.0 <a
href="https://redirect.github.com/onnx/onnx/pull/5968">5968</a></li>
<li>Fix source and pip tar.gz builds on s390x systems <a
href="https://redirect.github.com/onnx/onnx/pull/5984">5984</a></li>
<li>Fix unique_name <a
href="https://redirect.github.com/onnx/onnx/pull/5992">5992</a></li>
<li>Fix SegFault bug in shape inference <a
href="https://redirect.github.com/onnx/onnx/pull/5990">5990</a></li>
<li>Fix onnx.compose when connecting subgraphs <a
href="https://redirect.github.com/onnx/onnx/pull/5991">5991</a></li>
<li>Fix conversion from split 11 to split 18 <a
href="https://redirect.github.com/onnx/onnx/pull/6020">6020</a></li>
<li>Update error messages for NegativeLogLikelihoodLoss inference
function <a
href="https://redirect.github.com/onnx/onnx/pull/6021">6021</a></li>
<li>Generalize input/output number check in shape inference <a
href="https://redirect.github.com/onnx/onnx/pull/6005">6005</a></li>
<li>Replace rank inference with shape inference for Einsum op <a
href="https://redirect.github.com/onnx/onnx/pull/6010">6010</a></li>
<li>build from source instruction with latest cmake change <a
href="https://redirect.github.com/onnx/onnx/pull/6038">6038</a></li>
<li>Handle OneHot's depth value during shape inference <a
href="https://redirect.github.com/onnx/onnx/pull/5963">5963</a></li>
<li>Not to install cmake in pyproject.toml on Windows <a
href="https://redirect.github.com/onnx/onnx/pull/6045">6045</a></li>
<li>fix a skipped shape infer code <a
href="https://redirect.github.com/onnx/onnx/pull/6049">6049</a></li>
<li>Include the &quot;.onnxtext&quot; extension in supported
serialization format <a
href="https://redirect.github.com/onnx/onnx/pull/6051">6051</a></li>
<li>Allow ReferenceEvaluator to return intermediate results <a
href="https://redirect.github.com/onnx/onnx/pull/6066">6066</a></li>
<li>Fix 1 typo in numpy_helper.py <a
href="https://redirect.github.com/onnx/onnx/pull/6041">6041</a></li>
<li>Remove benchmarking code <a
href="https://redirect.github.com/onnx/onnx/pull/6076">6076</a></li>
<li>Prevent crash on import after GCC 8 builds <a
href="https://redirect.github.com/onnx/onnx/pull/6048">6048</a></li>
<li>Check graph outputs are defined <a
href="https://redirect.github.com/onnx/onnx/pull/6083">6083</a></li>
<li>Enable additional ruff rules <a
href="https://redirect.github.com/onnx/onnx/pull/6032">6032</a></li>
<li>Add missing shape inference check for DequantizeLinear <a
href="https://redirect.github.com/onnx/onnx/pull/6080">6080</a></li>
<li>Add bfloat16 to all relevant ops <a
href="https://redirect.github.com/onnx/onnx/pull/6099">6099</a></li>
<li>fix(ci): install python dependencies with --only-binary :all: in
manylinux <a
href="https://redirect.github.com/onnx/onnx/pull/6120">6120</a></li>
<li>fix: install google-re2 with --only-binary option <a
href="https://redirect.github.com/onnx/onnx/pull/6129">6129</a></li>
<li>Specify axis parameter for DequantizeLinear when input rank is 1 <a
href="https://redirect.github.com/onnx/onnx/pull/6095">6095</a></li>
<li>Pin onnxruntime to 1.17.3 for release CIs <a
href="https://redirect.github.com/onnx/onnx/pull/6143">6143</a></li>
<li>Fix INT4 TensorProto byte size is 5x larger than expected with
negative values <a
href="https://redirect.github.com/onnx/onnx/pull/6161">6161</a></li>
<li>Mitigate tarball directory traversal risks <a
href="https://redirect.github.com/onnx/onnx/pull/6164">6164</a></li>
<li>Fix reference implementation for ScatterND with 4D tensors <a
href="https://redirect.github.com/onnx/onnx/pull/6174">6174</a></li>
<li>Addition of group &gt; 1 in test and in backend for ConvTranspose <a
href="https://redirect.github.com/onnx/onnx/pull/6175">6175</a></li>
<li>Support for bfloat16 for binary, unary operators in reference
implementation <a
href="https://redirect.github.com/onnx/onnx/pull/6166">6166</a></li>
<li>Refactor windows workflow to work on standard windows <a
href="https://redirect.github.com/onnx/onnx/pull/6190">6190</a></li>
<li>Fix a few crashes while running shape inference <a
href="https://redirect.github.com/onnx/onnx/pull/6195">6195</a></li>
<li>Update onnx to work with numpy&gt;=2.0 <a
href="https://redirect.github.com/onnx/onnx/pull/6196">6196</a></li>
<li>Use sets to improve performance of dfs search <a
href="https://redirect.github.com/onnx/onnx/pull/6213">6213</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b8baa84466"><code>b8baa84</code></a>
Set version 1.17.0 for official release (<a
href="https://redirect.github.com/onnx/onnx/issues/6405">#6405</a>)</li>
<li><a
href="6d77b80821"><code>6d77b80</code></a>
[Cherry-Pick] Fix main url checks (<a
href="https://redirect.github.com/onnx/onnx/issues/6312">#6312</a>) (<a
href="https://redirect.github.com/onnx/onnx/issues/6327">#6327</a>)</li>
<li><a
href="174938d8b7"><code>174938d</code></a>
[Cherry-Pick] Fix protobuf pkg 5.28.0 failing on Windows (<a
href="https://redirect.github.com/onnx/onnx/issues/6342">#6342</a>) (<a
href="https://redirect.github.com/onnx/onnx/issues/6347">#6347</a>)</li>
<li><a
href="f18d5931ad"><code>f18d593</code></a>
[Cherry-Pick] Remove unused variables (<a
href="https://redirect.github.com/onnx/onnx/issues/6303">#6303</a>) (<a
href="https://redirect.github.com/onnx/onnx/issues/6324">#6324</a>)</li>
<li><a
href="c58890537f"><code>c588905</code></a>
Set version in rel-1.17.0 to 1.17.0rc1 (<a
href="https://redirect.github.com/onnx/onnx/issues/6317">#6317</a>)</li>
<li><a
href="4392c2c9ae"><code>4392c2c</code></a>
Prepare for rel-1.17.0 (<a
href="https://redirect.github.com/onnx/onnx/issues/6281">#6281</a>)</li>
<li><a
href="cb54169e4f"><code>cb54169</code></a>
Update ort filter to 1.20.0 to skip tests known to fail with ort 1.19.0
(<a
href="https://redirect.github.com/onnx/onnx/issues/6306">#6306</a>)</li>
<li><a
href="99e1fd352c"><code>99e1fd3</code></a>
Bump reviewdog/action-misspell from 1.21.0 to 1.23.0 (<a
href="https://redirect.github.com/onnx/onnx/issues/6268">#6268</a>)</li>
<li><a
href="1920565505"><code>1920565</code></a>
Bump ossf/scorecard-action from 2.3.3 to 2.4.0 (<a
href="https://redirect.github.com/onnx/onnx/issues/6273">#6273</a>)</li>
<li><a
href="2e8f2289b9"><code>2e8f228</code></a>
Bump mypy from 1.10.1 to 1.11.1 (<a
href="https://redirect.github.com/onnx/onnx/issues/6275">#6275</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/onnx/onnx/compare/v1.16.1...v1.17.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=onnx&package-manager=pip&previous-version=1.16.1&new-version=1.17.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/microsoft/onnxruntime/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-25 10:03:43 -07:00
Kyle
10bdf6e797
Fix Maven Sha256 Checksum Issue (#22600)
### Description
<!-- Describe your changes. -->
**Changes applied to Maven-related signing:** 
* Windows sha256 file encoded in UTF-8 (no BOM).
* PowerShell script task updated to the latest version; the previous 5.1
version only supports UTF-8 with BOM.
* Windows sha256 file content in the format 'sha256value
*filename.extension'.
* Linux sha256 file content in the format 'sha256value *filename.extension'.

**More information about PowerShell encoding:**
Windows PowerShell encoding reference: [about_Character_Encoding -
PowerShell | Microsoft
Learn](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_character_encoding?view=powershell-7.4)
- for version 5.1, it only has 'UTF8 Uses UTF-8 (with BOM).'
- for version v7.1 and higher, it has:
     utf8: Encodes in UTF-8 format (no BOM).
     utf8BOM: Encodes in UTF-8 format with Byte Order Mark (BOM)
     utf8NoBOM: Encodes in UTF-8 format without Byte Order Mark (BOM)
2024-10-25 08:13:02 -07:00
Satya Kumar Jandhyala
4ed5bec2e7
[JS/WebGPU] Support WASM64 (#21836)
### Description
Support wasm64



### Motivation and Context
Overcome memory limitations

---------

Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
2024-10-24 20:21:51 -07:00
Jian Chen
3fe7aa3b59
Adding new Python package testing pipeline for Cuda Alt (#22584)
2024-10-24 19:24:53 -07:00
Changming Sun
15556c492d
Use a private PIP feed in 1ES pipeline (#22590) 2024-10-24 19:10:30 -07:00
Scott McKay
b9903617b6
Exclude padding section from minimal build size report (#22578)
### Description
Should make the binary size report more stable as changes < 4K can occur
when a padding boundary is crossed.


2024-10-25 08:14:15 +10:00
Jian Chen
3ae7c3c0a6
Enable 1ES on Python CUDA Package Pipelines (#22560)
### Description
The following 3 CUDA packaging pipelines should be enabled with 1ES
after this pull request.
•
[Python-CUDA-Packaging-Pipeline](https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1299&view=runs)
• [Python CUDA Alt Packaging
Pipeline](https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1626)
• [Python DML Packaging
Pipeline](https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1625)

This should also fix the issue where the [Python packaging
pipeline](https://aiinfra.visualstudio.com/Lotus/_build?definitionId=841&_a=summary)
failed because it could not find `publish_symbols`.


2024-10-24 09:51:00 -07:00
Kyle
70be2eb6da
Migrate Nuget Windows AI Pipeline to Use 1ES Template (#22572) 2024-10-24 09:15:39 -07:00
Yulong Wang
ef7f1ce08b
Update Node.js version from 18.x to 20.x in CI pipelines (#22576) 2024-10-24 07:34:42 -07:00
Kyle
d9ca84ef96
Add DoEsrp Check for Signature Verification (#22570)
### Description
Add DoEsrp Check for Signature Verification
2024-10-24 16:55:36 +08:00
Changming Sun
a25c9315ea
Move ORT Training pipeline to github actions (#22543)
Move ORT Training pipeline to github actions and enable CodeQL scan for the code(including inference code).
We will move all pull request pipelines to Github Actions.
2024-10-23 11:57:15 -07:00
Jian Chen
ffaddead0a
Refactor cuda packaging pipeline (#22542)
2024-10-23 08:14:10 -07:00
Tianlei Wu
63a07c1838
update pipeline name list in run_CIs_for_external_pr.py (#22540)
### Description
Update list of CI pipelines to trigger for external PRs.

### Motivation and Context
The pipelines triggered for external PRs are not consistent with
internal PRs.
2024-10-22 17:14:48 -07:00
Tianlei Wu
8a04ab421d
[CUDA] upgrade opencv in stable diffusion demo (#22470)
### Description
(1) Upgrade opencv
(2) Add some comments about onnxruntime-gpu installation

### Motivation and Context
opencv-python was locked to an older version, which has security
vulnerabilities: see https://github.com/microsoft/onnxruntime/pull/22445
for more info
2024-10-21 23:20:49 -07:00
Changming Sun
88676e62b9
Remove nsync (#20413)
### Description
1. Remove the onnxruntime::OrtMutex class and replace it with
~absl::Mutex~ std::mutex.
2. After this change, most source files will not include <Windows.h>
indirectly.


### Motivation and Context
To reduce the number of deps we have, and address some Github issues
that are related to build ONNX Runtime from source.
In PR #3000, I added a custom implementation of std::mutex. It was
mainly because at that time std::mutex's default constructor was not
trivial on Windows. If you had such a mutex as a global var, it could
not be initialized at compile time. Then VC++ team fixed this issue.
Therefore we don't need this custom implementation anymore.

This PR also removes nsync. I ran several models tests on Linux. I
didn't see any perf difference.
This PR also reverts PR #21005 , which is no longer needed since conda
has updated its msvc runtime DLL.

This PR unblocks #22173 and resolves #22092 . We have a lot of open
issues with nsync. This PR can resolve all of them.
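The std::mutex point above can be sketched in a few lines (illustrative names only, not ORT's actual code): once VC++ made std::mutex's default constructor constexpr, a namespace-scope mutex is constant-initialized and no custom wrapper is needed:

```cpp
#include <mutex>

// With std::mutex's constexpr default constructor, this global is
// constant-initialized at compile time -- no dynamic initializer runs
// before main(), so an OrtMutex-style wrapper is no longer required.
// (Hypothetical names for illustration.)
static std::mutex g_counter_mutex;
static int g_counter = 0;

void Increment() {
  std::lock_guard<std::mutex> lock(g_counter_mutex);
  ++g_counter;
}
```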
2024-10-21 15:32:14 -07:00
Changming Sun
c7138a2630
Update CMake (#22516)
This pull request upgrades the CMake version from v3.31.0-rc1 to
v3.31.0-rc2 to include a bug fix for CUDA
https://gitlab.kitware.com/cmake/cmake/-/merge_requests/9902 from Nvidia
company.

AB#51692
2024-10-21 07:51:05 -07:00
kailums
3174e3da57
update pipeline python version from 3.8 to 3.12 (#22517)
### Description
Python 3.8 is going to reach EOL:

https://discuss.python.org/t/python-3-13-0-final-has-been-released/
https://discuss.python.org/t/python-3-8-is-now-officially-eol/66983

Therefore we update the CI pipelines that were still using Python 3.8 to Python 3.12.
2024-10-21 07:50:31 -07:00
Jeff Daily
5aabc53121
[ROCm] redo hipify of version controlled files (#22449)
### Description
Updates the ROCm EP opsets to match the current CUDA EP opsets. Also
enable the test CApiTest.basic_cuda_graph_with_annotation.

Note that some changes are whitespace-only. These changes were made to
improve the comparison of corresponding ROCm and CUDA EP source files
when using a side by side diff tool.

### Motivation and Context
The ROCm EP derives from the CUDA EP. Many source files are shared
between the EPs and "hipified" during the ROCm EP build, however quite a
few files within the ROCm EP are under source control after their
initial hipification. Over time these ROCm EP files get stale relative
to their CUDA EP counterparts. It becomes necessary to re-hipify these
otherwise static files in order to pick up important changes such as
opset differences.
2024-10-18 12:40:54 -07:00
Edward Chen
7964d3aef6
Specify iOS simulator runtime version (#22474)
- Allow specification of iOS simulator runtime version to use.
- Pick simulator runtime version (iphonesimulator 16.4) that is supported by the Xcode version (14.3.1) that we use.
- Disable CoreML EP's DepthToSpace op support for CoreML version less than 7, with DCR mode, and FP16 input. It doesn't produce the correct output in this case.
- Some cleanup of iOS test infrastructure.
2024-10-18 09:26:06 -07:00
Yulong Wang
1247d69c28
Add onnxtestdata cache for win-web-multi-browsers pipeline (#22477)
### Description

Apply onnxtestdata cache to win-web-multi-browsers pipeline

Same change that applied to win-web-ci #16659
2024-10-17 12:03:29 -07:00
Hector Li
ac98bcae37
Update QNN default version to 2.27 in CI pipeline (#22471)
### Description
Update QNN default version to 2.27 in CI pipeline
2024-10-16 22:05:47 -07:00
Changming Sun
f9e623e4d1
Update CMake to 3.31.0rc1 (#22433)
To include a bug fix:
https://gitlab.kitware.com/cmake/cmake/-/merge_requests/9890

Discussion:

https://discourse.cmake.org/t/cmake-incorrectly-links-to-nvrtc-builtins/12723/4

This bug fix should be included in our upcoming release, because right
now our GPU package depends on “libnvrtc-builtins.so.12.2" which has a
hardcoded CUDA version: 12.2. The minor CUDA version should not be
there.
2024-10-16 11:50:13 -07:00
Caroline Zhu
691de83892
Enable BrowserStack tests (#22457)
### Description
BrowserStack account issues have been resolved -- this PR enables E2E
browserstack tests in the pipeline again
2024-10-16 11:10:12 -07:00
PeixuanZuo
bf604428aa
[ROCm] Update ROCm Nuget pipeline to ROCm 6.2 (#22461)
1. Update ROCm Nuget pipeline build version to ROCm 6.2
2. Update AMD-GPU Agent Pool base docker image for ROCm Nuget pipeline
test stage. Search the `AMD GPU pipeline Nuget` page in OneNote to see how
to update it.

passed pipeline:
https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=580846&view=results
2024-10-16 10:36:49 -07:00
Jian Chen
af00a20f8a
Change ORT nightly python packages' name (#22450)
### Description
Our nightly CPU python package's name is "ort-nightly" instead of
"onnxruntime", for historical reasons (TensorFlow did the same).
Now we would prefer to make them the same.
This change applies to all nightly python packages, including CPU,
GPU (CUDA), and possibly others.


2024-10-15 18:44:59 -07:00
Caroline Zhu
6407d81b35
Disable BrowserStack testing stage (#22438)
### Description
We are seeing this [packaging
pipeline](https://aiinfra.visualstudio.com/Lotus/_build?definitionId=940&_a=summary)
fail because we are running into BrowserStack account issues. Disabling
this step until issues are resolved
2024-10-15 13:27:05 -07:00
Jeff Daily
8c21680ffc
[ROCm] prefer hip interfaces over roc during hipify (#22394)
### Description
Change the hipify step to remove the -roc option to hipify-perl. This
will prefer hipblas over rocblas. rocblas can still be called directly
such as in TunableOp.

### Motivation and Context
hip interfaces are preferred over roc for porting from cuda to hip.
Calling roc interfaces is meant for ROCm-specific enhancements or
extensions.
2024-10-14 20:34:03 -07:00
Changming Sun
4af593a722
Add python 3.13 support (#22380)
1. Add python 3.13 to our python packaging pipelines
2. Because numpy 2.0.0 doesn't support free-threaded python, this PR also
upgrades numpy to the latest version.
3. Delete some unused files.
2024-10-14 18:07:54 -07:00
Edward Chen
04404ea482
Fix Xcode 16 iOS build issues (#22379)
- Work around Xcode 16 iOS test build issue: `error: Multiple commands produce '.../PlugIns'`.
- Fix link error in iOS static framework test.
- Update build.py to check for the right kind of build before running iOS tests on the simulator.
- Update Xcode 16 build images to 'macos-15' because that's the only image that will have Xcode 16 soon. See https://github.com/actions/runner-images/issues/10703.
2024-10-14 09:24:38 -07:00
Yi Zhang
72cc72cc21
New rocm nuget publish pipeline (#22418)
### Description
Add a new pipeline to publish ROCM package to ADO




### Test Link
https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1615
2024-10-13 08:30:06 +08:00
Edward Chen
d7367653ab
Remove clean_docker_image_cache.py and clean-build-docker-image-cache-pipeline.yml. (#22409)
Clean up old script and build definition.
2024-10-11 14:25:13 -07:00
Kyle
cdebf37105
Add Digital Signature to DLLs in Maven Build (#22401)
### Description
* Add digital signature to dll files in jar files.
* Jar file names: onnxruntime-{version}.jar,
onnxruntime_gpu-{version}.jar

### Motivation and Context
#19204
2024-10-11 12:14:03 -07:00
sheetalarkadam
c06ecd415c
RC releases to Maven for Android (#22391)
### Description
Allows alpha, beta, and rc version releases to Maven for Android
artifacts.



### Motivation and Context
Helpful to release rc versions or test artifacts to Maven for testing.
For example, a new QNN android package is being released and it will be
nice to test the RC version for dependencies before release

## Future Work
Allow RC version for all Maven artifacts.
2024-10-11 08:58:02 -07:00
Changming Sun
6ada97c84c
Fix a build issue when statically link to MSVC Runtime (#22393)
Yesterday I updated ABSL to a newer version which added a new cmake
option: ABSL_MSVC_STATIC_RUNTIME . I wasn't aware of it. This PR fixes
it.
2024-10-10 20:09:13 -07:00
sheetalarkadam
dd2ea8469e
Add qnn android package (#22296)
### Description
Pre built QNN Android package


### Future Work
1. Setting up CI with Browserstack- onnxruntime_tests and Android test
2. ESRP Release to Maven
2024-10-10 10:37:22 -07:00
Changming Sun
2bef89c171
Upgrade absl to the latest released version (#22365)
### Description
Resolve #21976 .  
 
ABSL generally does not have forward/backward compatibility. Our code is
only compatible with one fixed LTS version. So it's important to fix the
version number there when using find_package to detect an installed
version.
2024-10-09 20:21:40 -07:00
Changming Sun
dcf1e0c3b0
Re-enable CUDA 12 python package test pipeline (#22370)
### Description
It runs after "Python-CUDA-Packaging-Pipeline" that runs on a CPU
machine that skipped all tests.
This testing pipeline is for doing the tests.
2024-10-09 20:21:27 -07:00
Yi Zhang
25b1c38e87
Add conv fp16 kernel in xnnpack EP (#22301)
### Description
Add FP16 kernels of Conv and ConvTranspose

[AB#50186](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/50186)




---------
2024-10-10 08:48:09 +08:00
Hector Li
3b00024b55
Fix the QNN nuget package issue (#22358)
Fix the QNN nuget package issue

### Description
Inside the package, folder name \runtimes\win-arm64\ was changed to \runtimes\win-ARM64\, which breaks lib copy settings in Microsoft.ML.OnnxRuntime.QNN.props.


### Motivation and Context
Fix issue: https://github.com/microsoft/onnxruntime/issues/21692
2024-10-09 08:41:23 -07:00
Changming Sun
9ee963110e
Update manylinux version (#22355)
### Description
Update the commit from 59600894a2c1c18290944b83e989bfe618975230 to
1887322ed36d522409a6b805d4e7942cf76a8e40


### Motivation and Context
The new one has python 3.13.

AB#50959
2024-10-08 23:11:11 -07:00
Yulong Wang
c5d28cac4d
Initial WebGPU EP checkin (#22318)
### Description

This change introduces the WebGPU EP into ONNX Runtime.

To make the PR as simple as possible, this PR excluded the following:
- C API changes for WebGPU EP
- actual implementation of WebGPU EP. Currently in this PR, WebGPU is a
stub implementation that does not register any kernel.
- Python IO Binding update
- Node.js IO Binding update

This PR now contains only 43 file changes (while the working branch
contains 130+) and hopefully this makes it easier to review.

There is going to be separated PRs for each mentioned above.

Current working branch: #21904
2024-10-08 16:10:46 -07:00
Changming Sun
d98340968e
Stop publishing python 3.8/3.9 packages (#22343)
### Description
1. Stop publishing python 3.8/3.9 packages, to align with numpy. 
2. Add a trigger for CUDA12's python test pipeline.
2024-10-08 09:50:05 -07:00
Changming Sun
715b74d61a
Re-enable codesign for maven packages (#22308)
### Description
PR #22217 was reverted.  This PR re-enables it.


### Motivation and Context
2024-10-04 14:30:17 -07:00
Tianlei Wu
f3f33bfa05
Upgrade cutlass to 3.5.1 and cudnn frontend to 1.7.0 (#22316)
### Description
Upgrade cutlass to 3.5.1
Upgrade cudnn_frontend to 1.7.0
2024-10-04 11:48:50 -07:00
jingyanwangms
bb0c1f0a05
Update cuda version in release pipeline (#22305)
### Description
With the TensorRT 10.4 update, the name of the TensorRT Windows package changed.


2024-10-03 22:28:28 -07:00
Caroline Zhu
c73e6afa6c
Migrate Android Java E2E tests from App Center to Browserstack (#22117)
### Description
- removed installing AppCenter + pipeline step that runs AppCenter
Espresso tests
- added script for running AppCenter tests

### Motivation and Context
App Center is getting deprecated in the next year + we have upcoming
Android work that depends on working E2E testing.

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2024-10-02 15:04:58 -07:00
Edward Chen
c24e55b1f1
[Java] Add API for appending QNN EP (#22208)
- Add Java API for appending QNN EP
- Update Java unit test setup
  - Fix issues with setting system properties for tests
  - Unify Windows/non-Windows setup to simplify
2024-10-01 10:18:04 -07:00