Commit graph

2046 commits

Author SHA1 Message Date
Yi Zhang
bd95a8ea77
update onnxruntime-gpu-winbuild-T4 to onnxruntime-Win2022-GPU-T4 (#16838)
### Description


### Motivation and Context
It's also used to upgrade visual studio to VS2022.
onnxruntime-gpu-winbuild-T4 and onnxruntime-gpu-tensorrt8-winbuild-t4
are using the image based on one dev branch and VS2019

To avoid breaking the current CIs, we move jobs running on
onnxruntime-gpu-winbuild-T4/onnxruntime-gpu-tensorrt8-winbuild-t4 to
onnxruntime-Win2022-GPU-T4.
2023-07-27 08:38:20 -07:00
Wang, Mengni
fe463d4957
Support SmoothQuant for ORT static quantization (#16288)
### Description

Support SmoothQuant for ORT static quantization via intel neural
compressor

> Note:
Please use neural-compressor==2.2 to try SmoothQuant function.

### Motivation and Context
For large language models (LLMs) with gigantic parameters, the
systematic outliers make quantification of activations difficult. As a
training free post-training quantization (PTQ) solution, SmoothQuant
offline migrates this difficulty from activations to weights with a
mathematically equivalent transformation. Integrating SmoothQuant into
ORT quantization can benefit the accuracy of INT8 LLMs.

---------

Signed-off-by: Mengni Wang <mengni.wang@intel.com>
2023-07-26 18:56:45 -07:00
Justin Chu
0c1a5098dc
Disable PERF* rules in ruff to allow better readability (#16834)
### Description

Disable two PERF* rules in ruff to allow better readability. Rational
commented inline. This change also removes the unused noqa directives
because of the rule change.

### Motivation and Context

Readability
2023-07-25 15:38:22 -07:00
Edward Chen
e01365f80b
Update upload_pod_archive_and_update_podspec.sh to take path pattern (#16810)
Update upload_pod_archive_and_update_podspec.sh to take a pod archive path glob pattern. The actual pod archive path has a version suffix which changes.
2023-07-25 08:55:31 -07:00
Yi Zhang
38db5eca65
replace onnxruntime-Win-CPU-2019 with onnxruntime-Win-CPU-2022 (#16844)
### Description
<!-- Describe your changes. -->



### Motivation and Context
upgrade to VS2022
2023-07-25 23:05:34 +08:00
Yi Zhang
f88f0d8e36
Upgrade 4 stages in nuget pipeline to VS2022 (#16825)
### Description


### Motivation and Context
Continue upgrading to VS2022

### Verfication

https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=331377&view=results

N.B.
In practice, SDLNativeRules@3 doesn't support VS2019.
2023-07-25 14:22:39 +08:00
Yulong Wang
8b30dc11d7
Update run_CIs_for_external_pr.py to skip passed checks (#16808)
### Description
Update run_CIs_for_external_pr.py to skip passed checks
2023-07-25 16:11:53 +10:00
PeixuanZuo
8ede2f139e
[ROCm] Optimize ROCm CI pipeline 2 (#16691)
- Set `KERNEL_EXPLORER_TEST_USE_CUPY=1` to replace numpy with cupy on
kernel explorer test.

KERNEL_EXPLORER_TEST_USE_CUPY=0 The CPU utilization is shown as below:

![image](https://github.com/microsoft/onnxruntime/assets/94887879/91724b78-0b4e-4cbd-ad88-83cad9976472)

KERNEL_EXPLORER_TEST_USE_CUPY=1 The CPU utilization is shown as below:

![image](https://github.com/microsoft/onnxruntime/assets/94887879/58239911-667c-4d5f-bb78-deca60d0266f)


- Use `Bash@3`.
- Update shell script.
2023-07-24 13:57:48 +08:00
Chi Lo
21ef14476b
Bug fix for nested control flow ops for TRT EP (#16343)
Current TRT EP can support model which has nested control flow ops
(multiple level subgraphs). But it fails at a case where the subgraph
has outer scope value that is defined several levels up in the top-level
graph, in this case, the outer scope value is the input of the top-level
graph. The outer scope values are not properly handled during TRT EP's
subgraph reconstruction stage and fails at `graph.resolve()`.

The way ORT gets capability from EPs is a bottom-up approach meaning
inner most subgraph gets handled first. TRT EP reconstructs each
subgraph level by level and following modifications are made to fix the
outer scope values issue:

- `SetGraphOuterScopeValuesAndInputs()` and `SetAllGraphInputs()` are
added to handle outer scope values and add those values as graph inputs
if needed in order to make `graph.resolve()` happy.
- Change to use `GetNodeArgIncludingParentGraphs` so that when creating
the fused TRT node for some subgraphs in`
Graph::CreateFusedSubGraphNode()`, it can get the NodeArgs for outer
scope values from top-level graph.


This PR fixes https://github.com/microsoft/onnxruntime/issues/16217
2023-07-23 16:16:17 -07:00
Yi Zhang
3252ff2cb7
Change DML GPU pool in Windows GPU workflow use Visual Studio 2022 (#16784)
### Description
1. use the pool with VS2022
2. upgrade System.Memory to 4.5.5


### Motivation and Context
Solve the build error while using VS2022:
`[Failure] Msbuild failed when processing the file
'D:\a\_work\1\s\csharp\src\Microsoft.ML.OnnxRuntime\Microsoft.ML.OnnxRuntime.csproj'
with message: Method not found: 'System.ReadOnlySpan`1<Char>
Microsoft.IO.Path.GetFileName(System.ReadOnlySpan`1<Char>)'`

Ref:
https://stackoverflow.com/questions/73399777/azure-build-failing-due-to-method-not-found-system-readonlyspan1char-micros
2023-07-23 10:07:21 +08:00
Justin Chu
d79515041c
[Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789)
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at
bottom):
* __->__ #16789

Bump ruff to 0.0.278 and fix new lint errors. I added noqa to all
existing RUF012 errors which requires mutable class variables to be
annotated with `ClassVar`, as well as all PERF issues.

Signed-off-by: Justin Chu <justinchu@microsoft.com>
2023-07-21 12:53:41 -07:00
Baiju Meswani
538d2412ef
Objective-C Add Support to Create and Query String ORTValues (#16764)
This pull request contains a few changes:

1. Adds support for string ort values.
2. Fixes the training minimal build (that was broken with #16601) by
putting custom op registration behind #ifdefs
3. Fixes the iOS pod package generation (that was again broken with
#16601) by explicitly providing paths to be copied during pod creation.
2023-07-20 17:39:29 -07:00
Adrian Lizarraga
a8c263f92c
[QNN EP] Update QNN SDK to 2.12 (#16750)
### Description
- Updates the default QNN SDK to 2.12 for CI pipelines
- Adds a disabled InstanceNormalization test for regression on QNN SDK
2.12
- Cleans up logs for unsupported ops.

### Motivation and Context
Test with the latest QNN SDK.
2023-07-20 16:22:14 -07:00
Xavier Dupré
2bc9fbb621
Fix url in the code documentation (graph optimizations) (#16770)
### Description
Fix a wrong url in the documentation as mentioned in issue #16678.



### Motivation and Context
Better documentation.
2023-07-20 07:02:22 -07:00
Yi Zhang
c314d7724f
Update dml gpu pool to onnxruntime-Win2022-GPU-dml-A10 (#16765)
### Description
onnxruntime-Win2022-GPU-dml-A10 is using VS2022.



### Motivation and Context
1. Upgrade VS2019 to VS2022 to fix prefast issue.
2023-07-20 16:52:13 +08:00
Edward Chen
fc1f463ff1
[ios] Enable training package in packaging pipeline (#16683)
Build iOS training package in packaging pipeline.
Refactor iOS packaging pipeline to build different package variants in parallel.
2023-07-19 19:55:00 -07:00
saurabh
24566058b3
ovep dockerfile and wheel docs changes (#16482)
### Description
This PR is includes changes in the documentation of _readmeOV.rst_ file
and also the changes in the dockerfile which enables to build ORT with
latest OpenVINO 2023.0.0



### Motivation and Context
Modified the dockerfile to incorporate the latest version of OpenVINO
(2023.0.0) for building Onnxruntime.
The changes in the PR aim to improve the overall user experience by
providing accurate and up-to-date documentation while leveraging latest
OpenVINO 2023.0.0
2023-07-19 09:01:09 -07:00
Scott McKay
ad90352a68
Add MAUI test app that can be used to test model loading and performance (#16658)
### Description
<!-- Describe your changes. -->
MAUI test app with tooling to add model and generated or provided input
test data.

The app will load the model and validate the output. It can also run a
specified number of iterations to provide basic performance information.

<img width="401" alt="image"
src="https://github.com/microsoft/onnxruntime/assets/979079/daf3af13-fb22-4cbb-9159-486b483a7485">

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Primarily to make it easier to test an arbitrary model on iOS. A MAUI
app allows testing on all platforms.

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2023-07-18 08:21:18 +10:00
Edward Chen
df8843c4a7
Upgrade old Python version in packaging pipeline (#16667)
- Upgrade from Python 3.6 to 3.8 in packaging pipeline.
- Raise build.py minimum required Python version.
2023-07-17 08:24:47 -07:00
Adrian Lizarraga
19169afe30
[QNN EP] Add option to skip unit tests in the QNN NuGet packaging pipeline (#16164)
Add option to skip unit tests in the QNN NuGet packaging pipeline.
2023-07-14 10:52:05 -07:00
Yi Zhang
36b121d8c2
add more check to Web CI on cache restore (#16689)
### Description
<!-- Describe your changes. -->



### Motivation and Context
Make sure the data is correct.
2023-07-14 10:00:13 +08:00
Scott McKay
a3fc04ba74
Fix CodeCoverage pipeline (#16684)
### Description
<!-- Describe your changes. -->
Delete second reference to onnxruntime_api_tests_without_env in the code
coverage commands. One was removed in #16373 and the duplicate wasn't
noticed.



### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Fix pipeline.
2023-07-14 07:47:04 +10:00
PeixuanZuo
ebc311365b
[ROCm] Optimize ROCm CI to reduce time (#16620)
This PR mainly optimize ROCm CI test to reduce time and CPU utilization.

- use smaller batch size on strided_batched_gemm/batched_gemm test
- disable cpu training test
- fix test_e2e_padding_elimination Occasional failures on ROCm.
2023-07-13 10:58:03 +08:00
Yi Zhang
f3b40abe29
Use pipeline cache to cache onnx node test data. (#16659)
### Description
Use pipeline cache instead of reading data from the image.


### Motivation and Context
1. To reduce the browser dependency of custom image.
2. The onnx node test data is less than 30M and the cache download time
is very short.
2023-07-13 09:26:27 +08:00
PeixuanZuo
596dbe277e
[ROCm] add upgrade to fix security issue (#16668) 2023-07-12 17:57:18 +08:00
Edward Chen
1b8d5c43c2
Fix builds (#16646)
- Fix some more `shorten-64-to-32` warnings
- Move minimum build.py Python version back to 3.6
2023-07-11 19:21:25 -07:00
Scott McKay
ce68a4c06a
Fix Linux build failure when onnxruntime_DISABLE_ABSEIL=ON (#16373)
### Description
<!-- Describe your changes. -->
Add ort_value.h to session_options.h so OrtValue is defined. 

Update a unit test binary to add required include paths. Adding
ort_value.h pulls in more data type headers.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
#16193
2023-07-12 11:23:18 +10:00
mindest
347c963d5c
[ROCm] Add ROCm Triton TunableOp for GroupNorm (#16196)
### Description
- Refactor existing Triton TunableOp-related code (based on work in
#15862)
- Add GroupNorm Triton implementation
2023-07-11 13:55:30 +08:00
Yulong Wang
5b6c1394cb
[js/test] CI: use pre-downloaded testdata in image (#16562)
### Description
update web CI to use pre-downloaded testdata in image
2023-07-10 22:22:14 -07:00
PeixuanZuo
2fd5e1cc39
[ROCm] fix shell bug (#16641)
`set -ex` with `grep` will exit when grep doesn't meet any string.
2023-07-10 17:31:27 +08:00
PeixuanZuo
cb4bf4f5c8
[ROCm] Move ROCm build step on CPU only machine (#16596)
- Move ROCm build step on CPU only machine
- Add the performance data of the huggingface bert-large model on the
MI200
- At the beginning of the test step, check the agent's GPU usage and
kill the threads occupying the GPU, which may be left over from previous
tasks that exited abnormally.
- Use different docker images during the build and test steps. The
difference is the `uid` and `user` when build docker image and create
docker container.
2023-07-10 11:55:10 +08:00
Xavier Dupré
47a0289ee6
[CI] Removes type2 in process_registration and fix Windows GPU Reduced Ops CI Pipeline (#16530)
### Description
Windows GPU Reduced Ops CI Pipeline is broken due to the introduction of
a second template type in registered kernels. The python code checking
the registration is broken due to that. This PR addresses this issue on
the python side by keeping only one type equal to the concatenation of
the two types.
2023-07-07 18:21:06 +02:00
Edward Chen
6be7b03e53
Enable -Wshorten-64-to-32 warning if available. (#16524)
- Fix some warnings from Xcode build (`-Wshorten-64-to-32`).
- Enable `-Wshorten-64-to-32` warning if available. Currently it's not fully enabled for `onnxruntime_test_all` and `onnxruntime_providers_xnnpack` yet.
- Some clean up in build.py including setting CMake generator more consistently.
2023-07-07 08:11:44 -07:00
Edward Chen
e22b0836e7
[objc] Update docs and fix static analysis build (#16617)
- Update some documentation comments.
- Use onnxruntime_training.h as the umbrella header so training API docs are included in generated docs.
- Fix static analysis build.
2023-07-07 07:58:54 -07:00
Scott McKay
697dd12f6e
Re-organize the transpose optimization and layout transformation files. (#16246)
### Description
<!-- Describe your changes. -->
Split out the more basic changes from #15552 for easier review.

Re-organize to clarify the structure
- Separate out generic base functionality from ORT specific components
  - pass in handlers for internal ORT ops to Optimize
- Split out layout transformation from transpose optimization
- Separate out level 1 transpose optimizer
- Cleanup some naming to try and clarify things like an optimizer vs.
general optimization code

Most of the changes are from this movement of code.

Two implementation changes:
- the extended handlers are queried first in GetHandler
- allows the extended handlers to override the default behaviour for an
ONNX operator
- simplify the Optimize function to remove OptimizerMode. 
- `can_modify_node` is used instead of `mode` and
`ignore_assigned_nodes` and a long description of the current usage is
added. I don't _think_ that changes the current behavior and hopefully
clarifies what happens and when, and makes the base transpose optimizer
implementation more generic.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Create a cleaner separation to support adding EP specific logic next to
cleanly handle where an EP has additional layout sensitive behaviour
required (e.g. it's Resize implementation only handles one layout).
2023-07-07 08:24:47 +10:00
Yi Zhang
fed08e070a
Add compiler cache in linux wasm build (#16579)
### Description
Add compiler cache in wasm build to accelerate web ci

### Motivation and Context
It could reduce the pipeline duration by 30 minutes.
web ci could be completed in 2 hours with cache.

https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1053219&view=results
2023-07-06 06:58:48 +08:00
Vrajang Parikh
fd8ad9b950
Enable iOS packaging for training (#16525)
### Description
Enable support for building iOS packages/CocoaPods with training API

- Add `Training` Package variant and config files in current iOS
packaging utilities to enable creation of training packages

### Motivation and Context
This PR introduces new `Training` variant in
`build_and_assemble_ios_pods.py` script which allows creating pods for
iOS with training API enabled.

The sample script to build training pods:

```
python3 tools/ci_build/github/apple/build_and_assemble_ios_pods.py --variant Training \
--build-settings-file  tools/ci_build/github/apple/default_full_ios_training_framework_build_settings.json \ 
-b=-- path_to_protoc_exe=<path/to/protoc>
``` 

Note: build settings file should have `--enable_training` as a build
parameter.


Simply adding training packaging increases the duration of the Azure
pipeline for packaging by 70 minutes. To address this issue, we need to
parallelize pod creation. In order not to further strain the pipeline,
the changes for training packaging will be added in another PR, which
optimizes the packaging pipeline.

---------

Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
2023-07-05 13:27:59 -07:00
PeixuanZuo
e2526714e2
[ROCm] Move MIGraphX build step on CPU only machine (#16582)
- Move MIGraphX build step on CPU only machine
- Use ccache on build step
- Not pass host uid into docker build process.
2023-07-05 13:55:28 +08:00
Wei-Sheng Chin
a0a5f57581
[DORT] Use new FX-to-ONNX exporter (#16450)
The ONNX exporter in DORT have been moved to PyTorch as a formal
feature. We therefore switch to consume the exporter from PyTorch
instead of maintaining two duplicates.
2023-07-04 13:13:04 -07:00
PeixuanZuo
d540c7da0f
[ROCm] Add ROCm5.6 to python package pipeline (#16572)
Add ROCm5.6 to python package pipeline.
2023-07-04 18:18:12 +08:00
pengwa
ac100ebb64
Fix orttraining-ortmodule-distributed CI (#16569)
### Fix orttraining-ortmodule-distributed CI

https://pypi.org/project/pydantic/#history released version 2.0 1st
July, Deepspeed has known issue on newer version of it
(https://github.com/microsoft/DeepSpeed/issues/3280). So fix this by add
similar check as DS did in
https://github.com/microsoft/DeepSpeed/pull/3290
2023-07-03 13:18:59 +08:00
Scott McKay
2fd25de360
Use verbose logging in Android emulator in React Native CI (#16528)
### Description
<!-- Describe your changes. -->
Set emulator logging to verbose to see if it helps with intermittent
React Native CI failures when emulator crashes at startup


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
2023-06-30 11:51:20 +10:00
Yi Zhang
fb7e1f133f
[Fix] TSA Upload failed in nuget pipeline. (#16476)
### Description
partially revert PR  #16244.


### Motivation and Context
npm pipeline couldn't triggered if nuget pipeline status is warning.


### Test Run

https://dev.azure.com/aiinfra/Lotus/_build/results?buildId=321873&view=logs&s=b17bed5b-cc14-5026-390a-fb2feea063f2
2023-06-28 06:40:49 +08:00
Rachel Guo
892b1b19ea
[js/rn] limit x86_64 arch in detox xcodebuild for react native e2e test (#16460)
### Description
<!-- Describe your changes. -->

As title.




### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

Works with local onnxruntime-c pod in js/rn/e2e test.
2023-06-27 09:45:04 -07:00
guyang3532
4768ac5f30
Fix onnxruntime-CI-nightly-ort-pipeline Failure (#16495)
The image for the onnxruntime-CI-nightly-ort-pipeline is too old. 
The ort package in the image is older than latest test code in nightly
ci. This causes the nightly ci failed.
2023-06-27 23:19:23 +08:00
Yi Zhang
6e9541046e
extend react native ci timeout limit (#16469)
### Description
<!-- Describe your changes. -->

### Motivation and Context
2 consecutive runs in npm pipeline failed due to time out
2023-06-27 08:44:03 +08:00
Yifan Li
e2c214d81f
[TensorRT EP] TRT 8.6 minor version update (#16475)
### Description
* Minor version update: TRT 8.6.0.12->8.6.1.6
  * CI pipeline ymls/dockerfiles are updated
* cgmanifest.json/deps.txt/download-deps.yml are updated; Win trt
binaries uploaded to [win img
307029](https://aiinfra.visualstudio.com/AI%20Infra%20Management/_build/results?buildId=307029&view=results)
* Re-enable unit tests which were failed in 8.6.0 and re-gained support
in 8.6.1
2023-06-26 10:44:27 -07:00
PeixuanZuo
7e211f0e03
[ROCm] Move mount data step into docker container (#16471)
Some CI jobs may interrupted unexpectedly and didn't execute umount data
step. The data left in host device will cause `device or resource busy`
and make subsequent CI jobs fail.

Move the mount data step into docker container, the host machine will
not be occupied when CI jobs exit incorrectly.
2023-06-26 10:25:06 +08:00
Rachel Guo
04dbdc96bf
[js/rn] Fix React Native CI pipeline E2E test (#16447)
### Description
<!-- Describe your changes. -->

Based on this kindly provided quick fix:
https://github.com/microsoft/onnxruntime/pull/16411

See more description in the above linked pr about bumping AGP version,
etc.

Also fixed import header file path in detox e2e test.

### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

Good build:

https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1041757&view=logs&j=de302ec2-2305-57e0-e8c6-cd89c569f2a3&t=9894c870-b8ce-548d-51ff-8f44d21a4117&l=18
2023-06-22 14:33:49 -07:00
Yi Zhang
8e8840f1de
Enable Web CI on Linux (#16419)
### Description
1. Enable Web ci on Linux

### Motivation and Context
1. speed up web ci, the duration can be reduced from 160 minutes to 130
minutes, a time saving of 20% could be be achieved.
The total computation time is 455 minutes now. Moved to Linux, it could
be reduced to 336 minutes.
2. It's the first step to enable compilation cache for emscripten
3. per Yulong's request, build_web stages are still using windows pool


![image](https://github.com/microsoft/onnxruntime/assets/16190118/c9496408-74bd-45ea-b4ae-a4dd2a574d17)


https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1038382&view=results
2023-06-22 15:42:58 +08:00