Commit graph

1460 commits

Author SHA1 Message Date
PeixuanZuo
d78bbf5ef2
[ROCm] remove ROCm5.2.3, ROCm5.3, ROCm5.4 from pipeline (#16004)
remove ROCm5.2.3, ROCm5.3, ROCm5.4 from pipeline.
2023-05-19 10:29:01 +08:00
Edward Chen
6d46007028
Add explicit 'set +x' before printing a vso[] command to avoid output getting parsed again with a trailing quote. (#15986)
Here's the motivating issue:
https://github.com/microsoft/azure-pipelines-tasks/issues/10331

Noticed some problems in other repos so also updating usages in ORT.

We may be fine now without it, but this change adds some safeguard against future additions of 'set -x' for debugging.
2023-05-17 19:30:28 -07:00
Changming Sun
d98763473a
Change CUDA pipelines to download CUDA SDK in every build job (#15915)
### Description
Change CUDA pipelines to download CUDA SDK in every build job


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2023-05-17 17:31:51 -07:00
Yi Zhang
6d43d51eb0
[Fix] No test result report while not using ctest (#15976)
### Description
1. Set gtest output while ctest is set to empty.
2. onnx_src in _deps shouldn't be removed because
onnx_test_pytorch_converted and onnx_test_pytorch_converted need to read
data from onnx/backend/test/data/..

### Motivation and Context
Test result report is important to find the flaky tests.

### To do
Tests are not inconsistent.
If ctest_path is empty, onnx_test_pytorch_converted and
onnx_test_pytorch_converted will not be executed, if it's not,
onnxruntime_mlas_test will not be executed.


270c09a37f/tools/ci_build/build.py (L1743-L1753)
2023-05-17 08:31:16 -07:00
Jian Chen
2881d849d4
Update Win-CPU-2021 to onnxruntime-Win-CPU-2022 (#15967)
### Description
After this PR there are following pool need to be updated.

old|new|note
---|---|---
onnxruntime-Win2019-GPU-dml-A10|tbd|
onnxruntime-Win2019-GPU-T4|onnxruntime-Win2022-GPU-T4|
onnxruntime-Win2019-GPU-training-T4|onnxruntime-Win2022-GPU-T4|ame as
the above because we do not have many T4 GPUs
onnxruntime-tensorrt8-winbuild-T4|tbd|
aiinfra-dml-winbuild|tbd|
### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2023-05-17 08:29:27 -07:00
Jian Chen
780442b9f6
Change windows machine pools to use VS2022
 (#15806)
### Description
<!-- Describe your changes. -->



Old pool | New pool | Notes
-- | -- | --
onnxruntime-Win-CPU-2019 | onnxruntime-Win-CPU-2022 |  
onnxruntime-Win2019-CPU-training | onnxruntime-Win2022-CPU-training-AMD
|  
onnxruntime-Win2019-CPU-training-AMD |
onnxruntime-Win2022-CPU-training-AMD | Same as the above
onnxruntime-Win2019-GPU-dml-A10 | Need be created | You need to create a
new image for it first
onnxruntime-Win2019-GPU-T4 | onnxruntime-Win2022-GPU-T4 |  
onnxruntime-Win2019-GPU-training-T4 | onnxruntime-Win2022-GPU-T4 | Same
as the above because we do not have many T4 GPUs
onnxruntime-tensorrt8-winbuild-T4| TBD|TBD
Win-CPU-2021|onnxruntime-Win-CPU-2022| will do it in next PR
Win-CPU-2019|onnxruntime-Win2022-Intel-CPU'| Intel CPU needed for
win-ci-pipeline.yml -> `stage: x64_release_dnnl`

<br class="Apple-interchange-newline">

### Motivation and Context
With vs2022 we can take the advantage of 64bit compiler. It also with
better c++20 support
2023-05-16 10:34:34 -07:00
RandySheriffH
7faad53632
Set default option for package name and build arg options (#15958)
Set default value for parameters in nuget-zip pipeline, and only apply
the configurations when they are not "NONE".

---------

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-05-16 09:07:38 -07:00
yf711
825d691617
Unify cuda & trt version on few CIs (#15943)
### Description
The cuda & trt version of some CIs didn't sync with the majority. 
Unifying cuda version as 11.8 and trt version as 8.6 on these CIs
2023-05-15 09:54:30 -07:00
Rachel Guo
18133ddadb
[doc] add LeakyRelu to coreml supported ops (#15944)
### Description
<!-- Describe your changes. -->



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2023-05-15 09:46:30 -07:00
Adrian Lizarraga
5542e70dd1
[QNN EP] Update default QNN SDK version to 2.10 for QNN NuGet pipeline (#15899)
### Description
Updates the default QNN SDK version to 2.10 for the QNN NuGet pipeline.

### Motivation and Context
Ensures that the daily QNN NuGet pipeline builds ORT using the latest
QNN SDK by default.
2023-05-15 09:17:42 -07:00
PeixuanZuo
af6cb2af87
[ROCm] update ROCm/MIGraphX CI to ROCm5.5 (#15905)
update ROCm/MIGraphX CI to ROC5.5.

TODO:
two PR to fix failure on
orttraining/orttraining/test/python/orttraining_test_ortmodule_api.py
-
test_gradient_correctness_minmax/test_gradient_correctness_argmax_unfold/test_gradient_correctness_argmax_diagonal
(https://github.com/microsoft/onnxruntime/pull/15903)
- test_ortmodule_attribute_name_collision_warning
(https://github.com/microsoft/onnxruntime/pull/15884)
2023-05-15 10:28:15 +08:00
Yi Zhang
b20d5e85d5
Update Cuda to 11.8 in 2 Linux GPU workflows. (#15925)
### Description
use template variable for cuda version


### Motivation and Context
2023-05-14 12:51:25 +08:00
RandySheriffH
7c4e8267e7
Implement openAI endpoint invoker for nuget (#15797)
Implement openAI audio endpoint, and enable nuget packaging.

---------

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-05-11 22:04:02 -07:00
Yi Zhang
0e7ae13e74
Run Linux GPU tests in docker container (#15872)
### Description
Run  Linux GPU tests in docker container

### Motivation and Context
2023-05-12 06:29:22 +08:00
Jian Chen
1a73d61829
Update eigen to 3.4 and remove the eigen from git submodule (#15875)
### Description
Update eigen to 3.4 and remove the eigen from git submodule

### Motivation and Context
We need to have eigen 3.4 for c++20
2023-05-11 11:56:59 -07:00
Changming Sun
7c58d013aa
Remove Ubuntu 18.04 usages (#15781)
### Description
Remove Ubuntu 18.04 usages because it will be EOL this month.

### Motivation and Context
2023-05-11 11:44:00 -07:00
Yulong Wang
756cf3a76f
increase web CI timeout (#15876)
### Description
The CI is extremely slow on downloading source code (~1MB/sec) so the
web CI went timeout. This is blocking the PR/checks.

Increase the timeout temporarily.
2023-05-11 11:17:46 -07:00
liqun Fu
ac9ae9f7c5
update onnx release 1.14 for docker files (#15680)
### Description
this is for ort 1.15 release to work with onnx 1.14
It shall be merged after onnx 1.14 release and before ort 1.15 release.


### Motivation and Context

---------

Signed-off-by: Liqun Fu <liqfu@microsoft.com>
2023-05-10 13:15:56 -07:00
Nat Kershaw (MSFT)
36c9ae0f58
Fix release version suffix for RC builds (#15865) 2023-05-09 23:06:08 -07:00
Jian Chen
34cb293c6b
Remove unused ADO YML pipeline template (#15857)
### Description
Remove unused ADO YML pipeline template



### Motivation and Context
Clean up and reduce our codebase.
2023-05-09 09:15:04 -07:00
Yulong Wang
0457fd0b40
upgrade emsdk to 3.1.37 (#15817)
### Description
upgrade emsdk to 3.1.37

WIP branch to debug the mystery memory issue in web assembly
multi-thread build.
2023-05-08 16:49:47 -07:00
Yi Zhang
045c623415
Make Nuget workflow easy to debug (#15808)
### Description
Fix the bug in #15693 



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2023-05-08 20:53:08 +08:00
Nat Kershaw (MSFT)
5e9b42326c
Fix packaging pipeline for nightly builds (#15839) 2023-05-07 20:42:38 -07:00
PeixuanZuo
41457885e0
[ROCm] add rocm5.5 to python package pipeline (#15820)
add rocm5.5 to python packaging pipeline.

https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=306082&view=results

TODO:
Remove version 5.2.3, 5.3.2 and 5.4 in the next PR.
2023-05-06 10:21:15 +08:00
Nat Kershaw (MSFT)
ed31e4b737
Add nuget release version suffix to support publishing rcs to nuget.org (#15791) 2023-05-05 18:18:24 -07:00
Adrian Lizarraga
45f5c27632
[QNN EP] Update default QNN SDK to version 2.10.0 (#15818)
### Description
- Updates the default QNN SDK for CI pipelines to version 2.10.0.
- Disables convolution op tests that run on the QNN CPU backend due to a
potential bug with QNN SDK 2.10.0.

### Motivation and Context
Allows us to test the latest QNN SDK in default CI pipeline runs.
2023-05-05 13:01:21 -07:00
Guenther Schmuelling
5a43828b3d
update ort extensions to 94142d8391c9791ec71c38336436319a2d4ac7a0 (#15688)
needed to get tokenizers/decode for whisper

---------

Co-authored-by: Shalva Mist <shalvamist@microsoft.com>
2023-05-05 09:48:07 -07:00
Scott McKay
d1b2b35cd2
Various fixes to the CSharp setup (#15782)
### Description
<!-- Describe your changes. -->
Various fixes to the CSharp setup
- fix warnings
- fix invalid tests
- update test sdk nuget package
  - enables testing on linux
  - fixes issue with some unit tests not running in CI
- run unit tests in linux pipeline using dotnet

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Unit tests weren't breaking in CIs for both Windows and Linux builds and
should have been.
2023-05-05 14:27:30 +10:00
Yulong Wang
4712009f8a
[js/web] add target ort.webgpu.min.js (#15780)
### Description
add target ort.webgpu.min.js

WebGPU is experimental feature, so I don't want to put webgpu into the
ort.min.js file. This change adds 2 ways for users to access ort-web
with webgpu:
- using script tag: by URL
`https://cdn.jsdelivr.net/npm/onnxruntime-web@1.15.0/dist/ort.webgpu.min.js`
( this URL is not ready yet )
- using `import()`: use `import { Tensor, InferenceSession } from
'onnxruntime-web/webgpu';` - 'onnxruntime-web/webgpu' instead of
'onnxruntime-web'
2023-05-04 10:05:39 -07:00
Yulong Wang
33d1372729
[wasm] revert emsdk to v3.1.19 (#15793)
### Description
latest emsdk generated multi-thread version sometimes crash with unknown
reason ( error: memory access out of bounds ).

we don't want to break existing ort-web users, so revert emsdk back to
3.1.19 (same to what ort v1.14.0 uses)
2023-05-04 01:15:01 -07:00
Baiju Meswani
e464588a0e
Avoid generating training documentation during packaging (#15795) 2023-05-03 19:09:07 -07:00
Changming Sun
d53324d4a7
Update cmake version in a few places (#15775)
### Description
They were missed in #15707 , because they are not in common places for Dockerfiles.

Though this commit updated tools/ci_build/github/pai/rocm-ci-pipeline-env.Dockerfile, it won't automatically take effect. The image needs to be manually generated and pushed to a place, and before doing that our CMakeLists.txt also needs to be tweaked a little bit.
2023-05-02 22:56:28 -07:00
Yulong Wang
ef1f17f3dc
[wasm/JSEP] add threaded build to artifacts (#15777)
### Description
This is the first part to create a webassembly artifacts for ort-web
webgpu EP (wasm build).

there will be following steps to consume the artifacts in web build
2023-05-02 17:53:44 -07:00
Jian Chen
abdd4f518a
Update TRT Windows Cuda 11.6 to 11.8 (#15746)
### Description
Update TRT Windows cuda 11.6 to 11.8



### Motivation and Context
We are adapting newer version of cuda systemwide.
2023-05-02 12:23:13 -07:00
Changming Sun
328cabb194
Download protoc from Github Release instead of Nuget (#15731)
### Description
Download protoc from Github Release instead of Nuget to avoid having
dependency on nuget.exe on Linux

### Motivation and Context
To avoid having dependency on nuget.exe on Linux. Many users' build
environment do not have nuget or dotnet.
2023-05-02 12:18:59 -07:00
Ashwini Khade
0ffae8073b
Creating Nuget and Android packages for Training (#15712)
### Description
This PR creates Nuget and Android for Training. 


### Motivation and Context
These packages are intended to be released in ORT 1.15 to enable
On-Device Training Scenarios.

## Packaging Story for Learning On The Edge Release
### Nuget Packages:
1. New Native package -> **Microsoft.ML.OnnxRuntime.Training** (Native
package will contain binaries for: win-x86, win-x64, win-arm, win-arm64,
linux-x64, linux-arm64, android)
2. C# bindings will be added to existing package ->
**Microsoft.ML.OnnxRuntime.Managed**

### Android Package published to Maven:
1. New package for training (full build) ->
**onnxruntime-training-android-full-aar**

### Python Package published to PyPi:
1. Python bindings and offline tooling will be added to the existing ort
training package -> **onnxruntime-training**
2023-05-01 12:59:56 -07:00
Changming Sun
176161348e
Revert "make nuget workflow easy to debug. (#15693)" (#15744)
This reverts commit 53ff50d19a because it
make the nuget pipeline fail.
2023-04-29 19:05:01 -07:00
Jian Chen
ec2f038c6d
Update Nuget pipeline's Linux CUDA job to cuda 11.8 (#15516)
### Description
Fixed AB#14497
2023-04-29 07:38:18 -07:00
Adrian Lizarraga
191deb4235
[QNN EP] Nuget package (#15711)
Adds pipeline for QNN NuGet package (x64 and arm64).
2023-04-28 19:33:14 -07:00
Edward Chen
c415bc725f
Add 'name' key to xcodebuild 'destination' option. (#15690) 2023-04-28 08:52:18 -07:00
Changming Sun
5b826b1bc3
Update cmake version in Linux build (#15707)
### Description
All our Windows build pipelines already uses cmake 3.26 except one
pipeline: QNN ARM64.
This PR does the same for Linux build pipelines.

### Motivation and Context
This change is related to #15704 .
2023-04-27 20:02:33 -07:00
Adrian Lizarraga
be5c582e65
[QNN EP] Update to QNN SDK 2.9.0 (#15709)
### Description
- Update to QNN SDK 2.9.0 for QNN pipelines
- Temporarily disable warnings as errors for QNN Windows x64 pipeline
- Note that this pipeline did not previously run to completion. It also
currently does not run for pull requests.

### Motivation and Context
Need to update and test the latest available version of the QNN SDK.
2023-04-27 13:44:09 -07:00
Changming Sun
d3d232b047
Rename onnxruntime-Linux-CPU-2019 machine pool (#15691)
Rename onnxruntime-Linux-CPU-2019 machine pool to
"onnxruntime-Ubuntu2004-AMD-CPU". The old one has an internal error and
stuck there. I cannot make any change to it. It has been like this for
more than 1 week. So I created a new pool with the same setting except
the name is different.
Also, move some android pipelines to
"onnxruntime-Linux-CPU-For-Android-CI" which uses a standard image from
https://github.com/actions/runner-images
2023-04-27 12:46:18 -07:00
yf711
2e1f92a986
Fix EP Perf pipeline (#15507)
### Description
* Update TensorRT 8.6 lib dependencies in dockerfile of TRT EP Perf
pipeline
* Avoid using `--allow_running_as_root` and build ORT with non-root user


### Motivation and Context
To fix the build issue on EP perf pipeline

Fixed
[AB#14615]
2023-04-27 10:09:14 -07:00
Yi Zhang
8cda1ffa28
Fix error in post-merge pipeline (#15717)
### Description
Get the right drive letter on Windows

### Motivation and Context
Build Directory might be in drive C
2023-04-27 10:05:15 -07:00
Yi Zhang
53ff50d19a
make nuget workflow easy to debug. (#15693)
### Description
Add parameters to make some stages could use other run's intermediate
output.

### Motivation and Context
nuget workflow has 38 stages of 4 layers.
We had to run the whole workflow from begining to test one stage.
It could make life easier to run only one stage for testing.
like

![image](https://user-images.githubusercontent.com/16190118/234453721-e6e9a4bd-5e0b-4101-a18e-d5cf60615c9f.png)

### N.B.
In this PR, Nuget_Test_Linux_CPU, Nuget_Test_LinuxGPU and
Jar_Packaging_GPU are enabled as the first step.
So I can start to move tests from Linux host to container
2023-04-27 14:54:14 +08:00
yf711
28985c47b7
[TensorRT EP] Unleash opset16-17 onnx model tests (#15657)
### Description
In 2021 we restricted onnx node test CI execution in range of opset
14-15 for ORT-TRT, which was the latest opset that TRT EP could support

Update this range to opset 14-17 to improve the ORT-TRT unit test
coverage, as [Nvidia announced that TRT 8.6 supported
opset17](https://github.com/onnx/onnx-tensorrt/blob/main/docs/operators.md)
2023-04-26 11:44:19 -07:00
yf711
d701dcd027
Fix Linux MultiGPU TensorRT CI (#15697)
### Description
* Reverting default TensorRT version to 8.5 as temporary fix
  
* Apart from that, this PR temporarily leaves this CI as a place to
validate user behavior that uses TRT 8.5 with latest ORT

### Context
* This CI pool equips 2xTesla M60 GPUs, which are no longer supported by
TensorRT 8.6.
* Currently, other CIs are using single-T4 VM but there's no VM with
2xT4 or other suitable dualGPU in the range.
* Once we decide which VM instance for this CI to migrate to, TRT8.6 can
be enabled on this CI

* According to
[Nvidia](https://docs.nvidia.com/deeplearning/tensorrt/release-notes/index.html):
* TensorRT 8.5.3 was the last release supporting NVIDIA Kepler (SM 3.x)
and NVIDIA Maxwell (SM 5.x) devices. *These devices are no longer
supported in TensorRT 8.6*. NVIDIA Pascal (SM 6.x) devices are
deprecated in TensorRT 8.6.
2023-04-26 10:01:33 -07:00
Xavier Dupré
699c9a520b
Fix TVM pipelines (#15653)
### Description
Fix TVM pipelines by adding missing dependancy of TVM (attrs).
2023-04-26 09:55:05 +02:00
Yulong Wang
b98317b907
[js/webgpu] following up for JSEP/WebGPU code cleanup (#15666)
### Description
This PR resolves a part of non-critical comments from code review
comments in #14579.

- use `USE_JSEP` instead of `USE_JS` in build definition to make it less
ambiguous
- remove unused util functions from util.ts
- fix transpose.h
- other misc fixes
2023-04-25 21:20:03 -07:00