### Description
Updates the ROCm EP opsets to match the current CUDA EP opsets. Also
enables the test CApiTest.basic_cuda_graph_with_annotation.
Note that some changes are whitespace-only. These changes were made to
improve the comparison of corresponding ROCm and CUDA EP source files
when using a side by side diff tool.
### Motivation and Context
The ROCm EP derives from the CUDA EP. Many source files are shared
between the EPs and "hipified" during the ROCm EP build; however, quite a
few files within the ROCm EP are under source control after their
initial hipification. Over time, these ROCm EP files become stale relative
to their CUDA EP counterparts, so it becomes necessary to re-hipify these
otherwise static files to pick up important changes such as
opset differences.
- Allow specification of iOS simulator runtime version to use.
- Pick simulator runtime version (iphonesimulator 16.4) that is supported by the Xcode version (14.3.1) that we use.
- Disable CoreML EP's DepthToSpace op support for CoreML version less than 7, with DCR mode, and FP16 input. It doesn't produce the correct output in this case.
- Some cleanup of iOS test infrastructure.
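
The DepthToSpace restriction above can be read as a single predicate over three conditions. The following is a minimal sketch of that logic only; `depth_to_space_supported` and its parameters are hypothetical names for illustration, and the actual check lives in the CoreML EP's C++ code.

```python
def depth_to_space_supported(coreml_version: int, mode: str, input_is_fp16: bool) -> bool:
    """Return False only for the combination known to give incorrect output.

    Hypothetical helper: names and types are assumptions, not the EP's API.
    """
    # All three conditions must hold for the op to be rejected.
    known_bad = coreml_version < 7 and mode == "DCR" and input_is_fp16
    return not known_bad
```

Any other combination (CoreML 7 or later, CRD mode, or non-FP16 input) stays supported under this reading.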
### Description
Our nightly CPU Python package is named "ort-nightly" instead of
"onnxruntime" for historical reasons; TensorFlow followed a similar
convention.
We would now prefer the names to be the same.
Make this change for all nightly Python packages, including CPU,
GPU (CUDA), and possibly others.
### Motivation and Context
### Description
Change the hipify step to remove the -roc option to hipify-perl. This
will prefer hipblas over rocblas. rocblas can still be called directly
such as in TunableOp.
### Motivation and Context
hip interfaces are preferred over roc for porting from cuda to hip.
Calling roc interfaces is meant for ROCm-specific enhancements or
extensions.
1. Add python 3.13 to our python packaging pipelines
2. Because numpy 2.0.0 doesn't support free-threaded Python, this PR also
upgrades numpy to the latest version
3. Delete some unused files.
- Work around Xcode 16 iOS test build issue: `error: Multiple commands produce '.../PlugIns'`.
- Fix link error in iOS static framework test.
- Update build.py to check for the right kind of build before running iOS tests on the simulator.
- Update Xcode 16 build images to 'macos-15' because that's the only image that will have Xcode 16 soon. See https://github.com/actions/runner-images/issues/10703.
### Description
Add a new pipeline to publish ROCM package to ADO
### Motivation and Context
### Test Link
https://dev.azure.com/aiinfra/Lotus/_build?definitionId=1615
### Description
* Add digital signature to dll files in jar files.
* Jar file names: onnxruntime-{version}.jar,
onnxruntime_gpu-{version}.jar
### Motivation and Context
#19204
### Description
Allows alpha, beta and rc version releases to Maven for Android
artifacts.
### Motivation and Context
Helpful for releasing rc versions or test artifacts to Maven for testing.
For example, a new QNN Android package is being released, and it would be
nice to test the RC version for dependencies before release.
## Future Work
Allow RC version for all Maven artifacts.
### Description
Pre built QNN Android package
### Future Work
1. Set up CI with BrowserStack: onnxruntime_tests and Android test
2. ESRP Release to Maven
### Description
Resolves #21976.
ABSL generally does not have forward/backward compatibility. Our code is
only compatible with one fixed LTS version. So it's important to fix the
version number there when using find_package to detect an installed
version.
### Description
This pipeline runs after "Python-CUDA-Packaging-Pipeline", which runs on a
CPU machine and skips all tests.
This testing pipeline runs those tests.
Fix the QNN nuget package issue
### Description
Inside the package, folder name \runtimes\win-arm64\ was changed to \runtimes\win-ARM64\, which breaks lib copy settings in Microsoft.ML.OnnxRuntime.QNN.props.
### Motivation and Context
Fix issue: https://github.com/microsoft/onnxruntime/issues/21692
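
A rename like this is easy to miss because the two folder names differ only by letter case. As a rough sketch, a packaging check could flag such mismatches before publishing; `find_case_mismatches` is a hypothetical helper, not part of the existing build scripts, and forward slashes are used here only for simplicity.

```python
def find_case_mismatches(expected: list[str], actual: list[str]) -> list[tuple[str, str]]:
    """Pair each expected path with an actual path that differs only by letter case."""
    # Index the actual paths by their case-folded form for case-insensitive lookup.
    by_folded = {path.casefold(): path for path in actual}
    mismatches = []
    for want in expected:
        got = by_folded.get(want.casefold())
        if got is not None and got != want:
            mismatches.append((want, got))
    return mismatches


print(find_case_mismatches(["runtimes/win-arm64"], ["runtimes/win-ARM64"]))
```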
### Description
Update the commit from 59600894a2c1c18290944b83e989bfe618975230 to
1887322ed36d522409a6b805d4e7942cf76a8e40
### Motivation and Context
The new one has python 3.13.
AB#50959
### Description
This change introduces the WebGPU EP into ONNX Runtime.
To make the PR as simple as possible, this PR excluded the following:
- C API changes for WebGPU EP
- actual implementation of WebGPU EP. Currently in this PR, WebGPU is a
stub implementation that does not register any kernel.
- Python IO Binding update
- Node.js IO Binding update
This PR now contains only 43 file changes (while the working branch
contains 130+) and hopefully this makes it easier to review.
There will be separate PRs for each item mentioned above.
Current working branch: #21904
### Description
With the TensorRT 10.4 update, the name of the TensorRT Windows package changed.
### Motivation and Context
### Description
- removed installing AppCenter + pipeline step that runs AppCenter
Espresso tests
- added script for running AppCenter tests
### Motivation and Context
App Center is being deprecated next year, and we have upcoming
Android work that depends on working E2E testing.
---------
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
- Add Java API for appending QNN EP
- Update Java unit test setup
- Fix issues with setting system properties for tests
- Unify Windows/non-Windows setup to simplify
### Description
NS is not developed anymore and ORT doesn't use it for int4 inference
either. Remove it to clean up the code
### Motivation and Context
### Description
Fix syntax so usability checker works as expected.
### Motivation and Context
### Description
If the UseA100 variable is 1, the A100 jobs run in PR checks.
Fixes
[AB#50333](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/50333)
### Motivation and Context
We would like more big models that need an A100 to be tested in PR
checks, but Azure may sometimes decommission A100 agents without
notification, which blocks merging PRs.
This PR improves on the current workaround, which makes those jobs run
only on the main branch.
Once we find that the A100 agents have all been decommissioned by Azure,
we can change the UseA100 variable to 0 to disable the A100 jobs in PR
checks.
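
The gating described here amounts to a small decision function. The sketch below is illustrative only: `should_run_a100_job` is a hypothetical name, the real condition is expressed in the pipeline definition rather than in Python, and the assumption that main-branch runs are unaffected by UseA100 is not stated in the PR.

```python
def should_run_a100_job(use_a100: str, is_pull_request: bool) -> bool:
    """Decide whether an A100 job runs, given the UseA100 pipeline variable."""
    if not is_pull_request:
        # Assumption: runs on the main branch are not gated by UseA100.
        return True
    # In PR checks, the A100 jobs run only when the variable is set to 1.
    return use_a100 == "1"
```

Setting the variable to 0 then disables the A100 jobs in PR checks without a code change.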
### Description
Support Float16 for CoreML MLProgram EP.
Operations:
"Add", "Mul", "Sub", "Div", "Pow", "Sqrt", "Reciprocal",
"Sigmoid", "Tanh", "Relu", "LeakyRelu", "Concat", "GridSample",
"GlobalAveragePool",
"Clip", "DepthToSpace", "Resize", "Slice", "Conv",
"ConvTranspose", "GlobalMaxPool", "Gemm", "MatMul",
"AveragePool", "MaxPool", "Reshape", "Split", "Transpose"
### Motivation and Context
---------
Co-authored-by: Scott McKay <skottmckay@gmail.com>
### Description
Jar maven signing:
- GnuPG
- sha256.
Jar packages artifacts:
- onnxruntime-android-full-aar
- onnxruntime-java
- onnxruntime-java-gpu
### Motivation and Context
Previously, the packages were signed manually.
Goal: sign them automatically.
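
For the sha256 part of the signing step, a checksum file can be produced with the standard library alone. This is a minimal sketch, not the pipeline's actual script: `write_sha256_file` and the `.sha256` suffix are assumptions, and the GnuPG signature is not shown.

```python
import hashlib
from pathlib import Path


def write_sha256_file(artifact: Path) -> Path:
    """Write the hex SHA-256 digest of `artifact` next to it and return that path."""
    digest = hashlib.sha256()
    with artifact.open("rb") as f:
        # Read in 1 MiB chunks so large jar/aar files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    # Assumed naming convention: <artifact name>.sha256
    checksum_path = artifact.with_name(artifact.name + ".sha256")
    checksum_path.write_text(digest.hexdigest())
    return checksum_path
```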
### Description
TensorRT 10.4 is now GA; update to 10.4.
### Motivation and Context
### Description
Fix regression caused by #17361
### Motivation and Context
### Description
Update XNNPack to latest version (Sep 4)
- Some op outputs have changed; channel and stride parameters have moved into
the reshape function.
e.g.
96962a602d
- The input parameters of XNNPACK's resize-related functions have changed significantly.
- KleidiAI is added as a dependency on ARM64.
- The latest XNNPACK includes two static libs, microkernels-prod and
xnnpack.
Without microkernels-prod, the build fails with an "Undefined symbols" error.
- Add ORT_TARGET_PROCESSOR to get the real processor target in CMake
### Description
See https://github.com/microsoft/onnxruntime-extensions/pull/476
and https://github.com/actions/runner-images/issues/7671
### Motivation and Context
### Current issue
- [ ] With the default Xcode 15.2 that comes with macOS-13, we need to
update the boost container header boost/container_hash/hash.hpp version
to pass the build.
- [x] With Xcode 14.2, the build passed but `Run React Native Detox
Android e2e Test` failed.
Possibly a flaky test: https://github.com/microsoft/onnxruntime/pull/21969
- [x] With Xcode 14.3.1, we encountered the following issue in `Build React
Native Detox iOS e2e Tests`:
```
ld: file not found: /Applications/Xcode_14.3.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/arc/libarclite_iphonesimulator.a
clang: error: linker command failed with exit code 1 (use -v to see invocation)
```
Appending the following code to the end of both ios/Podfile files fixed
the issue:
```
post_install do |installer|
  installer.generated_projects.each do |project|
    project.targets.each do |target|
      target.build_configurations.each do |config|
        config.build_settings['IPHONEOS_DEPLOYMENT_TARGET'] = '13.0'
      end
    end
  end
end
```
- [x] https://github.com/facebook/react-native/issues/32483
Applied the following changes to ios/Podfile:
```
pre_install do |installer|
  # Custom pre-install script or commands
  puts "Running pre-install script..."
  # Recommended fix for https://github.com/facebook/react-native/issues/32483
  # from https://github.com/facebook/react-native/issues/32483#issuecomment-966784501
  system("sed -i '' 's/typedef uint8_t clockid_t;//' \"${SRCROOT}/Pods/RCT-Folly/folly/portability/Time.h\"")
end
```
- [ ] Detox environment setup exceeded the timeout of 120000ms during
the iOS e2e test
### dependent
- [x] https://github.com/microsoft/onnxruntime/pull/21159
---------
Co-authored-by: Changming Sun <chasun@microsoft.com>
### Description
This PR makes the following updates to the Arm Compute Library execution
provider:
- Target Arm Compute Library 24.07
- Add support for the following operators:
- Conv (FP16)
- NhwcConv
- QLinearConv
- MatMul
- FusedMatMul
- MatMulIntegerToFloat
- Optimize memory usage and performance
- Expose the enable_fast_math setting
- Use the main runtime thread pool
### Motivation and Context
These updates improve performance and memory usage, and enable use of a
more recent version of Arm Compute Library.
@microsoft-github-policy-service agree company="Arm Ltd"
---------
Signed-off-by: Michael Tyler <michael.tyler@arm.com>
### Description
### Motivation and Context
The parameter isn't correct.
It may only be by chance that it has had no negative impact so far.
d8e64bb529/cmake/CMakeLists.txt (L1712-L1717)