Commit graph

878 commits

Author SHA1 Message Date
Yifan Li
0274b7b82f
fix on trtCudaVersion (#23616)
### Description
<!-- Describe your changes. -->
TensorRT 10.8 zip file has suffix of cuda-12.8, not 12.6


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2025-02-08 14:20:00 -08:00
Changming Sun
328a13c06d
Enable VCPKG in more pipelines (#23590)
### Description
Enable VCPKG in more pipelines
2025-02-06 10:10:31 -08:00
Yifan Li
6728d6085d
[TensorRT EP] support TensorRT 10.8-GA (#23592)
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2025-02-06 10:05:57 -08:00
Changming Sun
5f6a3158f8
Enable VCPKG in CI build (#23426)
### Description
1. Enable VCPKG flag in Windows CPU CI build pipelines. 
2. Increased the min supported cmake version from 3.26 to 3.28. Because
of it, drop the support for the old way of finding python by
"find_package(PythonLibs)". Therefore, in build.py we no longer set
"PYTHON_EXECUTABLE" cmake var when doing cmake configure.
3. Added "xnnpack-ep" as a feature for ORT's vcpkg config.
4. Added asset cache support for ORT's vcpkg build
5. Added VCPKG triplet files for Android build.
6. Set VCPKG triplet to "universal2-osx" if CMAKE_OSX_ARCHITECTURES was
found in cmake extra defines.
7. Removed a small piece of code in build.py that supported CUDA
versions < 11.8.
8. Fixed an issue that CMAKE_OSX_ARCHITECTURES sometimes got specified
twice when build.py invoked cmake.
9. Added more model tests to Android build. After this change, we will
test all ONNX versions instead of just the latest one.
10. Fixed issues related to build.py's "--build_nuget"
parameter. Also enabled the flag in most Windows CPU CI build jobs.
11. Removed a restriction in build.py that disallowed cross-compiling
Windows ARM64 nuget package on Windows x86.
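
Item 6 above can be illustrated with a short sketch. The function name `pick_vcpkg_triplet`, the `default_triplet` argument, and the assumption that extra defines arrive as `KEY=VALUE` strings are hypothetical and are not taken from build.py; only the `universal2-osx` triplet name and the `CMAKE_OSX_ARCHITECTURES` define come from the description.

```python
from typing import Iterable, Optional


def pick_vcpkg_triplet(cmake_extra_defines: Iterable[str],
                       default_triplet: Optional[str] = None) -> Optional[str]:
    """Return "universal2-osx" when CMAKE_OSX_ARCHITECTURES is among the defines.

    Hypothetical helper: each define is assumed to be a "KEY=VALUE" string.
    """
    for define in cmake_extra_defines:
        key, _, _value = define.partition("=")
        if key.strip() == "CMAKE_OSX_ARCHITECTURES":
            return "universal2-osx"
    # No macOS architecture define found: fall back to whatever the caller chose.
    return default_triplet
```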
 
### Motivation and Context
Adopt vcpkg.
2025-02-05 10:58:53 -08:00
Hector Li
c29ca1cb41
Update QNN default version to 2.31 (#23573)
Update QNN default version to 2.31
2025-02-04 16:24:54 -08:00
Yulong Wang
8db97a68f2
[webgpu] Bump version of Dawn to b9b4a370 (#23494)
### Description

This PR updates the version of Dawn to
`b9b4a37041dec3dd62ac92014a6cc1aece48d9f3` (ref:
[chromium](67f86f01dd/DEPS (399)))
in the `deps.txt` file.

The newer version of Dawn includes the previous changes from dawn.patch
so that we can remove the patch file.

There are a few small interface changes, and the code is updated accordingly.
2025-01-27 14:02:06 -08:00
Adrian Lizarraga
3b4c7df4e9
[QNN EP] Make QNN EP a shared library (#23120)
### Description
- Makes QNN EP a shared library **by default** when building with
  `--use_qnn` or `--use_qnn shared_lib`. Generates the following build
  artifacts:
  - **Windows**: `onnxruntime_providers_qnn.dll` and
    `onnxruntime_providers_shared.dll`
  - **Linux**: `libonnxruntime_providers_qnn.so` and
    `libonnxruntime_providers_shared.so`
  - **Android**: Not supported. Must build QNN EP as a static library.
- Allows QNN EP to still be built as a static library with `--use_qnn
  static_lib`. This is primarily for the Android QNN AAR package.
- Unit tests run for both the static and shared QNN EP builds.

### Detailed changes
- Updates Java bindings to support both shared and static QNN EP builds.
- Provider bridge API:
  - Adds logging sink ETW to the provider bridge. Allows EPs to register
    ETW callbacks for ORT logging.
  - Adds a variety of methods for onnxruntime objects that are needed by
    QNN EP.
- QNN EP:
  - Adds `ort_api.h` and `ort_api.cc` that encapsulates the API provided
    by ORT in a manner that allows the EP to be built as either a shared or
    static library.
  - Adds custom function to transpose weights for Conv and Gemm (instead
    of adding util to provider bridge API).
  - Adds custom function to quantize data for LeakyRelu (instead of adding
    util to provider bridge API).
  - Adds custom ETW tracing for QNN profiling events:
    - shared library: defines its own TraceLogging provider handle
    - static library: uses ORT's TraceLogging provider handle and existing
      telemetry provider.
- ORT-QNN Packages:
  - **Python**: Pipelines build QNN EP as a shared library by default.
    User can build a local python wheel with QNN EP as a static library by
    passing `--use_qnn static_lib`.
  - **NuGet**: Pipelines build QNN EP as a shared library by default.
    `build.py` currently enforces QNN EP to be built as a shared library.
    Can add support for building a QNN NuGet package with static later if
    deemed necessary.
  - **Android**: Pipelines build QNN EP as a **static library**.
    `build.py` enforces QNN EP to be built as a static library. Packaging
    multiple shared libraries into an Android AAR package is not currently
    supported due to the added need to also distribute a shared libcpp.so
    library.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2025-01-22 12:11:00 -08:00
Jian Chen
628c0e00c4
Change MacOS-13 to ubuntu on for android-java-api-aar-test.yml. (#23444)
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2025-01-21 17:07:20 -08:00
Jian Chen
899ea21ffe
Moving RN_CI Android Testing to Linux (#23422)
### Description
Move the Android E2E test steps from macOS 13 to Ubuntu 22.04.



### Motivation and Context
Reduced the dependency on macOS, which is deprecating the x64 version.
2025-01-21 11:55:29 -08:00
Jian Chen
83cb1e4a3c
Seperate RN andriod and IOS into 2 separated Stages. (#23400)
### Description
Separate RN Android and iOS into two separate stages.



### Motivation and Context
Speed up the PR process.
2025-01-20 18:08:01 -08:00
Hector Li
f35924a891
Update Qnn SDK default version to 2.30 (#23411)
### Description
Update Qnn SDK default version to 2.30
2025-01-17 22:36:35 -08:00
Changming Sun
d461ca9dcd
Update onnxruntime binary size checks ci pipeline's docker image (#23405)
1. Update onnxruntime binary size checks ci pipeline's docker image. Use
a different docker image that is not manylinux based. The new one is
smaller.
2. Add flatbuffers to tools/ci_build/requirements/pybind/requirements.txt
3. Delete
tools/ci_build/github/azure-pipelines/py-package-build-pipeline.yml. The
pipeline was for generating packages for Olive, but it went unused, and
its content largely duplicates our official Python packaging
pipeline.
4. A lot of YAML files reference pypa/manylinux git repo but do not use
it. This PR removes the references.
2025-01-17 15:29:17 -08:00
Yulong Wang
080c67e900
[WebGPU] allow build WebGPU EP for WebAssembly (#23364)
### Description

This PR allows WebGPU EP to be built with Emscripten for WebAssembly,
including:


- cmake build files update to support correct setup for Emscripten.
- code changes to fix build breaks for wasm
- change in Web CI pipeline to add a build-only target for wasm with
`--use_webgpu`.
2025-01-16 10:52:17 -08:00
Jian Chen
331fc36b6a
Remove hot path for pre-0.70.15 RN fix (#23382)
### Description
This undoes the changes from #23281.
2025-01-15 16:16:38 -08:00
Changming Sun
6a7ea5c896
Update xnnpack, cpuinfo and pthreadpool (#23362)
### Description
Update xnnpack to remove the dependency on psimd and fp16 libraries.
However, coremltool still depends on them, which will be addressed
later.

Also, update CPUINFO because the latest xnnpack requires CPUINFO's avx10
support.

### Motivation and Context
The fewer dependencies the better.
2025-01-15 09:42:15 -08:00
Yifan Li
5c3c7643db
Update range of gpu arch (#23309)
### Description
<!-- Describe your changes. -->
* Remove deprecated gpu arch to control nuget/python package size
(latest TRT supports sm75 Turing and newer arch)
* Add 90 to support blackwell series in next release (86;89 not
considered as adding them will rapidly increase package size)

| arch_range | Python-cuda12 | Nuget-cuda12 |
| -------------- | ------------------------------------------------------------ | ---------------------------------- |
| 60;61;70;75;80 | Linux: 279MB Win: 267MB | Linux: 247MB Win: 235MB |
| 75;80 | Linux: 174MB Win: 162MB | Linux: 168MB Win: 156MB |
| **75;80;90** | **Linux: 299MB Win: 277MB** | **Linux: 294MB Win: 271MB** |
| 75;80;86;89 | [Linux: MB Win: 390MB](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=647457&view=results) | Linux: 416MB Win: 383MB |
| 75;80;86;89;90 | [Linux: MB Win: 505MB](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=646536&view=results) | Linux: 541MB Win: 498MB |

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

Callout: While adding sm90 support, the cuda11.8+cudnn8 build will be
dropped in the coming ORT release,
as the build has issues with Blackwell (mentioned in comments) and demand
for CUDA 11 is minor, according to the internal ort-cuda11 repo.
2025-01-14 14:27:34 -08:00
Changming Sun
4e4fd2bdcf
Update ORT extension to the latest (#23314)
Update ORT extension to the latest, to include some build system fixes.
2025-01-13 18:59:42 -08:00
Changming Sun
ccbe66d422
Update NDK (#23280)
Similar to #21989
2025-01-08 13:57:23 -08:00
Jian Chen
da35cceac9
Add a temporary path to RN 0.69.3 to update the boost url (#23281)
### Description
Add a temporary path to RN 0.69.3 to update the boost url


### Motivation and Context
Fix the React-native CI until we update the RN to 0.70.15 or 0.73.3+
versions
2025-01-08 09:28:35 -08:00
Changming Sun
5d692b0136
Merge web machine pools (#23243)
### Description
The Web CI pipeline uses three different Windows machine pools:
1. onnxruntime-Win2022-webgpu-A10
2. onnxruntime-Win2022-VS2022-webgpu-A10
3. onnxruntime-Win-CPU-2022-web

This PR merges them together to reduce ongoing maintenance cost.
2025-01-03 13:53:17 -08:00
Changming Sun
afd3e81c94
Remove PostBuildCleanup (#23233)
Remove PostBuildCleanup tasks since they are deprecated. This addresses a
warning in our pipelines:

"Task 'Post Build Cleanup' version 3 (PostBuildCleanup@3) is dependent
on a Node version (6) that is end-of-life. Contact the extension owner
for an updated version of the task. Task maintainers should review Node
upgrade guidance: https://aka.ms/node-runner-guidance"

Now the cleanup is controlled in another place:

https://learn.microsoft.com/en-us/azure/devops/pipelines/yaml-schema/workspace?view=azure-pipelines


The code change was generated by the following Linux command:
```bash
find . -name \*.yml -exec sed -i '/PostBuildCleanup/,+2d' {} \;
```
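
For readers without GNU sed, the effect of `/PostBuildCleanup/,+2d` on one file's lines can be sketched in Python. The function name `drop_match_plus_two` is hypothetical. It deletes each matching line together with the two lines that follow it, which is what the command above is understood to do. Walking the `*.yml` files and writing them back is left out.

```python
import re
from typing import List


def drop_match_plus_two(lines: List[str], pattern: str = "PostBuildCleanup") -> List[str]:
    """Mimic GNU sed '/pattern/,+2d': delete each matching line and the two after it."""
    kept = []
    skip = 0  # number of following lines still to be dropped
    for line in lines:
        if skip > 0:
            skip -= 1
            continue
        if re.search(pattern, line):
            skip = 2
            continue
        kept.append(line)
    return kept
```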
2024-12-31 13:12:33 -08:00
liqun Fu
a9a881cc98
Integrate onnx 1.17.0 (#21897)
### Description
<!-- Describe your changes. -->
for ORT 1.21.0 release

Created the following issues to track tests skipped due to updated
ONNX operators in the ONNX 1.17.0 release:
https://github.com/microsoft/onnxruntime/issues/23162
https://github.com/microsoft/onnxruntime/issues/23164
https://github.com/microsoft/onnxruntime/issues/23163
https://github.com/microsoft/onnxruntime/issues/23161

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

---------

Signed-off-by: Liqun Fu <liqfu@microsoft.com>
Signed-off-by: Liqun Fu <liqun.fu@microsoft.com>
Co-authored-by: Guenther Schmuelling <guschmue@microsoft.com>
Co-authored-by: Yifan Li <109183385+yf711@users.noreply.github.com>
Co-authored-by: yf711 <yifanl@microsoft.com>
2024-12-24 09:02:02 -08:00
Yifan Li
d9d07ad8ae
[TensorRT EP] support TensorRT 10.7-GA (#23011)
### Description
<!-- Describe your changes. -->
Update CIs to TRT10.7

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-12-19 10:39:15 -08:00
Changming Sun
5d7030e4c6
Revert DML pipeline changes (#23135)
### Description
Previously we wanted to add DirectML EP to existing onnxruntime Windows
CUDA packages. After careful consideration, we will postpone the change.
This PR reverts some pipeline changes previously made by @mszhanyi and
@jchen351 .
2024-12-18 10:42:10 -08:00
Yi Zhang
6ed77cc374
Deprecate macos-12 (#23017)
### Description
<!-- Describe your changes. -->



### Motivation and Context
The ESRP code-sign task now supports .NET 8, so we can remove macos-12.
2024-12-05 14:07:21 +08:00
Yulong Wang
a615bd6688
Bump version of Dawn to 12a3b24c4 (#23002)
### Description

Upgrade version of Dawn.

Removed dawn.patch, because all patches are included in upstream.

Updated code affected by API changes (`const char*` ->
`WGPUStringView`)


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-12-04 09:47:16 -08:00
Jian Chen
9ed0c7fe26
Redo "Update Gradle version 8.7 and java version 17 within onnxruntime/java" (#22923)
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-12-02 18:34:25 -08:00
Yi Zhang
b930b4ab5b
Limit PipAuthenticate in Private Project now (#22954)
### Description
Fixes regression in post merge pipeline caused by #22612



### Motivation and Context
So far, the Public Project has no artifactFeeds.
2024-11-27 13:32:35 +08:00
sheetalarkadam
f80afeb9a1
Override android qnn sdk version with pipeline param (#22895)
We need to be able to control/override the exact version of the QNN SDK used
for the Android build, as qnn-runtime (Maven package) releases lag
behind QNN SDK releases.
2024-11-25 21:01:05 -08:00
Yi Zhang
85751e7276
Build DML in Windows GPU CI pipeline (#22869)
### Description
Add a new stage to build CUDA and DML in the Windows GPU CI pipeline (PR
checks) to prevent regressions introduced by new CUDA tests.
Update the name prefix of all tests in cuda/testcases to CudaEp so they
can be skipped easily.

### Motivation and Context
1. CudaNhwcEP is added by default when using the CUDA EP.
2. If onnxruntime_ENABLE_CUDA_EP_INTERNAL_TES is enabled, the tests in
tests/provider/cuda/testcases are added too.

### To do
Add enable_pybind in the new stage.
Currently, --enable_pybind triggers some Python tests, such as
onnxruntime_test_python.py,
which uses the get_available_providers() API.
More discussion is needed to decide how to make it work.
2024-11-25 10:50:52 +08:00
Jian Chen
369d7bf887
Update the Docker image version (#22907)
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-11-21 19:38:39 +08:00
Yi Zhang
a28246a994
Revert "Update Gradle version 8.7 and java version 17 within onnxrunt… (#22914)
…ime/java (#22771)"

This reverts commit 632a36a233.

### Description
<!-- Describe your changes. -->



### Motivation and Context
Running E2E tests using Browserstack failed due to this PR.
2024-11-21 18:12:28 +08:00
Kyle
712bee13db
Fix Pipeline Timeout Issue (#22901)
### Description
<!-- Describe your changes. -->
Extend the timeout for a job that always failed.


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-11-20 17:18:50 +01:00
Changming Sun
13346fdf18
Cleanup code (#22827)
### Description
1.  Delete TVM EP because it is no longer maintained.
2.  Delete ortmodule related docker files and scripts.
2024-11-19 14:13:33 -08:00
Caroline Zhu
0d00fc3130
[mobile] Fix for mac-ios-packaging pipeline (#22879)
### Description
Appends variant name to the Browserstack artifacts that are published so
that we don't run into the error:
"##[error]Artifact browserstack_test_artifacts already exists for build
609095."

[Working pipeline
run](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=609503&view=results)


### Motivation and Context
- onnxruntime-ios-packaging-pipeline has been failing
2024-11-19 09:27:51 -08:00
Adrian Lizarraga
497b06f0a9
[QNN EP] QNN SDK 2.28.2 (#22844)
### Description
- Updates pipelines to use QNN SDK 2.28.2.241116.
- Re-enable LayerNormalization unit tests that failed with accuracy
errors with the previous QNN SDK (2.28.0).
- Update QNN EP to no longer provide a dummy bias for LayerNorm if the
QNN SDK version is >= 2.28.0.


### Motivation and Context
Use the latest QNN SDK. This version improves inference latency for
certain customer models.
2024-11-18 20:10:36 -08:00
Yi Zhang
135d8b2beb
Fix CUDA/DML package exception caused by ENABLE_CUDA_NHWC_OPS (#22851)
### Description
ENABLE_CUDA_NHWC_OPS is now enabled by default.
This adds another code path that creates the CUDA provider when both
CUDA and DML are enabled.


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-11-18 10:46:23 +08:00
Jian Chen
632a36a233
Update Gradle version 8.7 and java version 17 within onnxruntime/java (#22771)
### Description
This change updates the Gradle version within the Java project to 8.7
and upgrades Java to 17. The Gradle version from react-native was
also updated to 7.5 to make it compatible with changes from the Java
directory. However, the target Java version remains the same. The Java
version for these will be upgraded in a separate PR.

This is split from #22206.

### Motivation and Context
This is the first step to upgrade the react native version.
2024-11-14 17:10:44 -08:00
Jian Chen
75a44582ba
Update all JDK version to 17 (#22786)
2024-11-12 11:42:18 -08:00
sheetalarkadam
e8f1d73b0b
Add Android QNN Browserstack test (#22434)
Add Android QNN Browserstack test



### Motivation and Context
Real device test in CI
2024-11-10 16:10:29 -08:00
Yi Zhang
ef281f850a
Add XNNPack build on Linux ARM64 and improve Linux CPU (#22773)
### Description
1. Add XNNPack build on Linux ARM64
2. Build only one python wheel for PR request.

[AB#49763](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/49763)



### Motivation and Context
Why add the XNNPack build on Linux ARM64 rather than Windows ARM64?
Because KleidiAI doesn't support Windows:

```
IF(XNNPACK_TARGET_PROCESSOR STREQUAL "arm64" AND XNNPACK_ENABLE_ARM_I8MM AND NOT CMAKE_C_COMPILER_ID STREQUAL "MSVC")
  IF (XNNPACK_ENABLE_KLEIDIAI)
    MESSAGE(STATUS "Enabling KleidiAI for Arm64")
  ENDIF()
ELSE()
  SET(XNNPACK_ENABLE_KLEIDIAI OFF)
ENDIF()
```

---------
2024-11-09 11:26:19 +08:00
Jian Chen
e7987a6b0b
Replace reference to python 3.8 with python 3.10 (#22692)
### Description
This PR sets the default Python to 3.10 everywhere except
tools/ci_build/github/azure-pipelines/bigmodels-ci-pipeline.yml. This is
needed because we are no longer using Python 3.8.

This PR excludes changes for Big Models CI because they require
additional changes, which will be tracked in
USER STORY 52729.



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2024-11-07 16:51:40 -08:00
Yifan Li
3b7a6eba69
[TensorRT EP] support TensorRT 10.6-GA (#22644)
### Description
<!-- Describe your changes. -->
* Update CI with TRT 10.6
* Update oss parser to [10.6-GA-ORT-DDS
](https://github.com/onnx/onnx-tensorrt/tree/10.6-GA-ORT-DDS) and update
dependency version
* Update Py-cuda11 CI to use trt10.6


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
(There will be 3rd PR to further reduce trt_version hardcoding)
2024-11-06 14:33:46 -08:00
Caroline Zhu
0221693e43
[Mobile] Add E2E BrowserStack tests for iOS tests (#22610)
### Description
- Changes running the E2E iOS tests from running in App Center to
running in BrowserStack
- Steps for running locally can be found in the OneNote

### Motivation and Context
- Follow-up of #22117 
- App Center (the previous platform for running E2E mobile tests) is
getting deprecated in 2025

### Misc info
Additional build steps were required to get the necessary testing
artifacts for BrowserStack. App Center consumed an entire folder, while
BrowserStack requests the following:
1. a ZIP file of all the tests
2. an IPA file of the test app

#### Flow
Here is a rough outline of what is happening in the pipeline:
1. The build_and_assemble_apple_pods.py script builds the relevant
frameworks (currently, this means packages for iOS and Mac)
4. The test_apple_packages.py script installs the necessary cocoapods
for later steps
5. XCode task to build for testing builds the iOS target for the test
app
6. Now that the test app and the tests have been built, we can zip them,
creating the tests .zip file
7. To create the IPA file, we need to create a .plist XML file which is
generated by the generate_plist.py script.
- Attempts to use the Xcode@5 task to automatically generate the plist
file failed.
- Also, building for testing generates some plist files -- these cannot
be used to export an IPA file.
8. We run the Xcode task to build an .xcarchive file, which is required
for creating an IPA file.
9. We use xcodebuild in a script step to build an IPA file with the
xcarchive and plist files from the last two steps.
10. Finally, we can run the tests using the BrowserStack script.
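
The plist generation in step 7 can be sketched with Python's standard `plistlib`. The contents of generate_plist.py are not shown here, so the function name `build_export_options`, the chosen keys (`method`, `teamID`, `signingStyle`), and the placeholder team identifier are all assumptions for illustration only.

```python
import plistlib


def build_export_options(method: str = "development",
                         team_id: str = "ABCDE12345") -> bytes:
    """Serialize a minimal export-options dictionary as an XML plist.

    Hypothetical sketch: keys and values are illustrative, not the real script's.
    """
    options = {
        "method": method,        # distribution method for the export step (assumed)
        "teamID": team_id,       # placeholder team identifier (hypothetical)
        "signingStyle": "manual",
    }
    return plistlib.dumps(options, fmt=plistlib.FMT_XML)
```

The returned bytes could be written to a file and handed to the IPA export step. The signing details would have to come from the actual pipeline configuration.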

---------

Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2024-11-06 11:22:29 -08:00
Hector Li
017246260f
support Qnn 2 28 (#22724)
### Description
support Qnn 2.28
update default qnn version to 2.28 in build pipeline
2024-11-05 15:41:15 -08:00
Yi Zhang
8e8b62b8b5
Build CUDA and DML together (#22602)
### Description
We now need to build CUDA and DML in one package,
but CUDA EP and DML EP can't run in one process:
doing so throws the exception `the GPU device instance has been
suspended`.
So CUDA EP and DML EP can coexist at compile time but not
at run time.

This PR splits the CUDA EP tests and DML EP tests in all unit tests.
The solution is to use two environment variables, NO_CUDA_TEST and
NO_DML_TEST, in CI.

For example, if NO_CUDA_TEST is set, the DefaultCudaExecutionProvider
will be nullptr, and the test will not run with CUDA EP.
In debugging, the CUDAExecutionProvider will not be called. 
I think, as long as cuda functions, like cudaSetDevice, are not called,
DML EP tests can pass.

Disabled java test of testDIrectML because it doesn't work now even
without CUDA EP.
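
The gating described above can be sketched as follows. The real change lives in the C++ test utilities, so this Python version only illustrates the idea: the helper names are hypothetical, and treating mere presence of the variable as "set" is an assumption, since the description does not say which values count.

```python
import os
from typing import List, Optional


def default_provider(name: str, disable_env: str) -> Optional[str]:
    """Return the provider name, or None when the disabling variable is set.

    None plays the role of the nullptr mentioned in the description.
    """
    if os.environ.get(disable_env) is not None:
        return None
    return name


def providers_under_test() -> List[str]:
    """Collect the providers a test run would exercise in this process."""
    candidates = [
        default_provider("CUDAExecutionProvider", "NO_CUDA_TEST"),
        default_provider("DmlExecutionProvider", "NO_DML_TEST"),
    ]
    return [p for p in candidates if p is not None]
```

With `NO_CUDA_TEST` set, only the DML provider is returned, so no CUDA code path is exercised in that process.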
2024-10-31 15:51:13 -07:00
Yulong Wang
7a8fa12850
Add implementation of WebGPU EP (#22591)
### Description

This PR adds the actual implementation of the WebGPU EP based on
https://github.com/microsoft/onnxruntime/pull/22318.

This change includes the following:

<details>
<summary><b>core framework of WebGPU EP</b></summary>

  - WebGPU EP factory classes for:
    - handling WebGPU options
    - creating WebGPU EP instance
    - creating WebGPU context
  - WebGPU Execution Provider classes
    - GPU Buffer allocator
    - data transfer
  - Buffer management classes
    - Buffer Manager
    - BufferCacheManager
      - DisabledCacheManager
      - SimpleCacheManager
      - LazyReleaseCacheManager
      - BucketCacheManager
  - Program classes
    - Program (base)
    - Program Cache Key
    - Program Manager
  - Shader helper classes
    - Shader Helper
    - ShaderIndicesHelper
    - ShaderVariableHelper
  - Utils
    - GPU Query based profiler
    - compute context
    - string utils
  - Miscs
    - Python binding webgpu support (basic)
 
</details>

<details>
<summary><b>Kernel implementation</b></summary>


  - onnx.ai (default opset):
    - Elementwise (math): Abs, Neg, Floor, Ceil, Reciprocal, Sqrt, Exp, Erf,
      Log, Sin, Cos, Tan, Asin, Acos, Atan, Sinh, Cosh, Asinh, Acosh, Atanh,
      Tanh, Not, Cast
    - Elementwise (activation): Sigmoid, HardSigmoid, Clip, Elu, Relu,
      LeakyRelu, ThresholdedRelu, Gelu
    - Binary (math): Add, Sub, Mul, Div, Pow, Equal, Greater,
      GreaterOrEqual, Less, LessOrEqual
    - (Tensors): Shape, Reshape, Squeeze, Unsqueeze
    - Where
    - Transpose
    - Concat
    - Expand
    - Gather
    - Tile
    - Range
    - LayerNormalization
  - com.microsoft
    - FastGelu
    - MatMulNBits
    - MultiHeadAttention
    - RotaryEmbedding
    - SkipLayerNormalization
    - LayerNormalization
    - SimplifiedLayerNormalization
    - SkipSimplifiedLayerNormalization

</details>

<details>
<summary><b>Build, test and CI pipeline integration</b></summary>

  - build works for Windows, macOS and iOS
  - support onnxruntime_test_all and python node test
  - added a new unit test for `--use_external_dawn` build flag.
  - updated MacOS pipeline to build with WebGPU support
  - added a new pipeline for WebGPU Windows

</details>

This change does not include:

- Node.js binding support for WebGPU (will be a separate PR)
2024-10-29 18:29:40 -07:00
Yifan Li
951d9aa99f
[TensorRT EP] Refactor TRT version update logic & apply TRT 10.5 (#22483)
### Description
<!-- Describe your changes. -->
* Leverage template `common-variables.yml` and reduce usage of hardcoded
trt_version

8391b24447/tools/ci_build/github/azure-pipelines/templates/common-variables.yml (L2-L7)
* Among all CI yamls, this PR reduces usage of hardcoding trt_version
from 40 to 6, by importing trt_version from `common-variables.yml`
* Apply TRT 10.5 and re-enable control flow op test


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
- Reduce usage of hardcoding trt_version among all CI ymls

### Next refactor PR 
will work on reducing usage of hardcoding trt_version among
`.dockerfile`, `.bat` and remaining 2 yml files
(download_win_gpu_library.yml & set-winenv.yml, which are step-template
yaml that can't import variables)
2024-10-29 09:23:41 -07:00
Changming Sun
3641d184f8
Add pipauth to more ADO pipelines and enable CSV (#22612)
### Description
1. Add pipauth to more ADO pipeline. (We will use a private ADO feed to
fetch python packages in these pipeline, to improve security)
2. Enforce codeSignValidation(CSV).

### Motivation and Context
Fulfill some internal compliance requirements.
2024-10-28 16:39:22 -07:00
Kyle
10bdf6e797
Fix Maven Sha256 Checksum Issue (#22600)
### Description
<!-- Describe your changes. -->
**Changes applied to maven related signing:** 
* Windows sha256 file encoded by utf8(no BOM)
* The PowerShell script task now uses the latest version; the previous 5.1 version only
supports UTF-8 with BOM.
* Windows sha256 file content in format 'sha256value
*filename.extension'.
* Linux sha256 file content in format 'sha256value *filename.extension'.

**More information about powershell encoding:**
Windows powershell encoding reference: [about_Character_Encoding -
PowerShell | Microsoft
Learn](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_character_encoding?view=powershell-7.4)
- for version 5.1, it only has 'UTF8 Uses UTF-8 (with BOM).'
- for version v7.1 and higher, it has:
     utf8: Encodes in UTF-8 format (no BOM).
     utf8BOM: Encodes in UTF-8 format with Byte Order Mark (BOM)
     utf8NoBOM: Encodes in UTF-8 format without Byte Order Mark (BOM)
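
The checksum file format described above (`sha256value *filename.extension`, UTF-8 without BOM) can be sketched in Python. The function name `write_sha256_file`, the `.sha256` suffix on the output file, and the absence of a trailing newline are assumptions; the pipeline itself uses PowerShell and Linux tooling, not this code.

```python
import hashlib
from pathlib import Path


def write_sha256_file(artifact: Path) -> Path:
    """Write '<sha256> *<filename>' next to the artifact, UTF-8 without BOM."""
    digest = hashlib.sha256(artifact.read_bytes()).hexdigest()
    out = artifact.with_name(artifact.name + ".sha256")  # suffix is an assumption
    # Python's "utf-8" codec never emits a BOM ("utf-8-sig" would add one).
    out.write_bytes(f"{digest} *{artifact.name}".encode("utf-8"))
    return out
```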
2024-10-25 08:13:02 -07:00