Commit graph

1223 commits

Author SHA1 Message Date
Adam Louly
cf8bf0c141
add on device training to the packaging pipelines (#13446)
### Description
Enable the on-device training APIs in the packaging pipelines.



### Motivation and Context
Add the on-device training flag so that the on-device training APIs can be
enabled for federated learning scenarios.

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-10-25 15:03:34 -07:00
Changming Sun
a396a91c9a
Move build machines with Nvidia M60 GPUs to Nvidia T4 (#13170) 2022-10-25 11:21:13 -07:00
PeixuanZuo
4b2b588895
[ROCm] Fix azcopy issue on ROCm ci pipeline (#13365)
### Description

Use a SAS token to fix the error `failed to perform copy command due to error:
no SAS token or OAuth token is present and the resource is not public`.

Generate a SAS token for the target data, add it to Key Vault, and use it as
a pipeline variable.
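
A minimal sketch of how a pipeline script might attach the SAS token to the source URL before handing it to azcopy. The environment variable name `DATA_SAS_TOKEN`, the storage URL, and the helper `with_sas_token` are hypothetical and are not taken from the PR.

```python
import os
from urllib.parse import urlsplit, urlunsplit


def with_sas_token(blob_url: str, sas_token: str) -> str:
    """Return blob_url with the SAS token appended to its query string."""
    token = sas_token.lstrip("?")  # tolerate tokens copied with a leading "?"
    if not token:
        raise ValueError("SAS token is empty; the copy would fail on a private resource")
    parts = urlsplit(blob_url)
    query = f"{parts.query}&{token}" if parts.query else token
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


if __name__ == "__main__":
    # Assumption: the pipeline variable is exposed to the script as an env var.
    token = os.environ.get("DATA_SAS_TOKEN", "sv=PLACEHOLDER&sig=PLACEHOLDER")
    url = with_sas_token("https://example.blob.core.windows.net/data/models", token)
    # Print only the part before the query string so the token never reaches the log.
    print(url.split("?", 1)[0] + "?<redacted>")
```

Keeping the token in a secret variable and redacting it from output avoids leaking it into build logs.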


### Motivation and Context

Co-authored-by: peixuanzuo <peixuanzuo@linmif39a000004.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>
2022-10-20 12:08:57 +08:00
PeixuanZuo
665fb346ab
[ROCm] set parallel=16 when build on ROCm CI (#13368)
### Description

The ROCm CI build step takes more than one hour. Set parallel=16 when building
on ROCm CI to reduce build time.

### Motivation and Context

Co-authored-by: peixuanzuo <peixuanzuo@linmif39a000004.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>
2022-10-20 11:36:00 +08:00
Adrian Lizarraga
418304743d
[EP-Perf-Dashboard] Update table schemas (#13327)
Updates EP perf benchmarking scripts to upload new data with an improved table schema. In order to preserve compatibility with the current benchmarking pipeline, we still upload data that uses the old schema as well. These changes are required in order to improve data filtering capabilities and general UX in dashboards that visualize this data.

Details:
- EP names are no longer hardcoded as columns for tables that store inference latency, session creation times, memory usage, and model/EP status.
- Add explicit branch, commit ID, and commit date columns to all tables
- Improvements to the docker image building scripts (simplify docker image build; support installing binary TensorRT packages)
- Remove use of deprecated DataFrame.append in favor of pandas.concat.
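
To illustrate the first two schema changes: when EP names are columns, adding an EP means adding a column, whereas one row per (model, EP) pair turns the EP into a value that a dashboard can filter on. The sketch below assumes hypothetical column names (`model`, `ep`, `latency_ms`, `branch`, `commit_id`, `commit_date`); the actual table schemas are not shown in the commit message.

```python
from typing import Dict, List


def wide_to_long(rows: List[Dict[str, object]], ep_names: List[str],
                 branch: str, commit_id: str, commit_date: str) -> List[Dict[str, object]]:
    """Turn one-column-per-EP rows into one row per (model, EP) pair."""
    long_rows = []
    for row in rows:
        for ep in ep_names:
            if row.get(ep) is None:
                continue  # this EP produced no measurement for this model
            long_rows.append({
                "model": row["model"],
                "ep": ep,
                "latency_ms": row[ep],
                # Explicit provenance columns, repeated on every row.
                "branch": branch,
                "commit_id": commit_id,
                "commit_date": commit_date,
            })
    return long_rows


if __name__ == "__main__":
    # Illustrative values only.
    old_rows = [{"model": "resnet50", "CPU": 12.5, "CUDA": 3.1, "TRT": None}]
    for r in wide_to_long(old_rows, ["CPU", "CUDA", "TRT"], "main", "418304743d", "2022-10-19"):
        print(r)
```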
2022-10-19 16:15:05 -07:00
Edward Chen
2fa18ea77e
[React Native CI] Record more info to debug E2E test (#13329)
Record more info from the React Native CI E2E test. In particular, log the view hierarchy when exiting the test and dump logs from Android emulator to the build output.
2022-10-18 17:21:28 -07:00
Adam Louly
61ee5585b2
update the nightly build to use the latest ptca image. (#13309)
### Description
Update the ptca image used in the nightly pipeline.

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-10-17 14:12:03 -07:00
PeixuanZuo
b4853a978a
[ROCm] add rocm python package pipeline with --use_rocm_profiling (#13068)
### Description

ROCm developers always need to build the onnxruntime *.whl with
`--enable_rocm_profiling`.
Add a ROCm dev Python package pipeline that produces a *.whl built with
`--enable_rocm_profiling`.
The dev *.whl is uploaded to Azure storage and can be downloaded from
https://download.onnxruntime.ai/onnxruntime_nightly_rocm53.profiling.html


### Motivation and Context
2022-10-17 10:11:20 +08:00
Wei-Sheng Chin
dc324b1d90
[LazyTensor] Make LORT Build Again with Latest PyTorch (#13303)
`python setup.py develop` doesn't install PyTorch as a normal package in
site-packages anymore, and the user must stay in PyTorch's root
directory to call `import torch`. This breaks LORT tests because
LORT tests contain `import torch` and are called outside PyTorch's root
directory. To make PyTorch a normal package again, this PR builds PyTorch
with `python setup.py install`.
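
A small diagnostic sketch for the situation described above: it reports whether a module resolves to a file under site-packages, as opposed to a source checkout found only via the current directory. The helper name `installed_in_site_packages` is hypothetical, and the check is an illustration rather than something the PR adds.

```python
import importlib.util
import sysconfig
from pathlib import Path


def installed_in_site_packages(module_name: str) -> bool:
    """Return True if module_name resolves to a file under site-packages."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None or spec.origin in ("built-in", "frozen"):
        return False  # not importable, a namespace package, or compiled into the interpreter
    origin = Path(spec.origin).resolve()
    # purelib and platlib are the two site-packages locations of this interpreter.
    site_dirs = {Path(sysconfig.get_paths()[key]).resolve() for key in ("purelib", "platlib")}
    return any(site_dir in origin.parents for site_dir in site_dirs)


if __name__ == "__main__":
    # Run from outside the PyTorch root directory to mimic how the LORT tests are called.
    print("torch in site-packages:", installed_in_site_packages("torch"))
```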
2022-10-13 13:56:17 -07:00
PeixuanZuo
6895918b1c
[ROCm] Revert CI pipeline to ROCm5.2.3 (#13297)
### Description

Unit tests run slower with ROCm5.3 than with ROCm5.2.3, so revert to ROCm5.2.3.
We will update to ROCm5.3 once the issue is resolved by AMD.

### Motivation and Context
2022-10-12 10:47:33 -07:00
Edward Chen
9422438782
Objective-C static analysis - use different llvm path to try to find clang-tidy. (#13280)
Use different llvm path to try to find clang-tidy. Sometimes the build fails because it can't find clang-tidy. Hopefully this path works better.
2022-10-12 10:16:26 -07:00
Yi Zhang
67bde18d0d
Update Win_GPU_CI trigger (#13290)
### Description
Supplement to #13248

Add PR trigger 

https://learn.microsoft.com/en-us/azure/devops/pipelines/repos/github?view=azure-devops&tabs=yaml#pr-triggers

fix: master -> main

Tested with #13289 #13292

NB:
the real pipeline is always triggered if the workflow YAML changes, even
if it is listed in the path filter.

### Motivation and Context
Make sure the real pipeline does not run in the backend.
2022-10-12 15:22:42 +08:00
PeixuanZuo
b2353fa737
[ROCm] Add ROCm5.3 to python package pipeline (#13249)
### Description

1. Remove ROCm5.1.1 and ROCm5.2 from ROCm python package pipeline
2. Add ROCm5.3 to ROCm python package pipeline
pipeline:

https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=237172&view=results

### Motivation and Context
2022-10-12 07:23:42 +08:00
Yi Zhang
6b499db7e1
increase ios pipeline timeout limit (#13268)
### Description



### Motivation and Context
Timeout failures have become more frequent.
2022-10-11 14:07:04 +08:00
Yi Zhang
ea128cdb18
skip windows GPU check if changes only in doc (#13248)
### Description
Use a path filter and a fake workflow to skip the Windows GPU check if there
are only changes in docs.
Refs:

https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/defining-the-mergeability-of-pull-requests/troubleshooting-required-status-checks#handling-skipped-but-required-checks

The fake GitHub YAML is generated by code.
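
The decision behind the path filter can be sketched as follows. This is only an illustration of the logic; the real pipelines express it declaratively in YAML path filters, and the patterns in `DOC_PATTERNS` and the function names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Sequence

# Hypothetical doc-only patterns; the filter used by the PR is not shown in the message.
DOC_PATTERNS: Sequence[str] = ("docs/*", "*.md")


def only_docs_changed(changed_files: Iterable[str],
                      doc_patterns: Sequence[str] = DOC_PATTERNS) -> bool:
    """Return True if every changed file matches at least one doc pattern."""
    files = list(changed_files)
    if not files:
        return False  # nothing changed: be conservative and run the real check
    return all(any(fnmatchcase(f, p) for p in doc_patterns) for f in files)


def workflow_to_run(changed_files: Iterable[str]) -> str:
    """Pick the 'fake' workflow (which only reports success) for doc-only changes."""
    return "fake" if only_docs_changed(changed_files) else "real"


if __name__ == "__main__":
    print(workflow_to_run(["docs/build.md"]))
    print(workflow_to_run(["docs/build.md", "onnxruntime/core/session.cc"]))
```

The fake workflow exists because a required status check that is skipped would otherwise leave the pull request blocked, as described in the linked GitHub documentation.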

### Motivation and Context

### Verification
In this PR:
win-gpu-ci-pipeline.yml and .github are updated, so the real
Windows GPU workflows are always triggered.

In #13256:
To avoid updating win-gpu-ci-pipeline.yml, I added the path filter on the
DevOps page. The fake Windows GPU workflows were triggered, and the real
workflows were skipped.
2022-10-11 13:51:44 +08:00
PeixuanZuo
4d25b9c8f0
[ROCm] Update ROCm and MIGraphX CI pipeline to ROCm5.3 (#13257)
### Description

1. Update the ROCm pipeline and the MIGraphX pipeline to ROCm5.3.
The ROCm pipeline ran the ortmodule test once and then disabled it:
https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=777794&view=logs&j=48b14a85-ff1a-5ca4-53fa-8ea420d27feb&t=9c199f35-fc50-565d-6c65-5162c9bb1b04
2. Add `workspace: clean: all`.


### Motivation and Context
2022-10-11 13:47:22 +08:00
Edward Chen
00146b2541
Add onnxruntime_BUILD_UNIT_TESTS=OFF definition to iOS package build options. (#13238)
Add onnxruntime_BUILD_UNIT_TESTS=OFF definition to iOS package build options. The `--skip_tests` option is already specified.
2022-10-10 18:00:17 -07:00
Edward Chen
d411bd277e
Increase iOS packaging pipeline timeout. (#13233)
Increase iOS packaging pipeline timeout to 300 minutes.
2022-10-07 14:49:16 -07:00
Jian Chen
6662ece4a1
increase timeout to 5 hours (#13226)
### Description
Increase MacOS pipeline timeout to 5 hours



### Motivation and Context
The timeout blocks the release pipeline.
2022-10-07 13:02:48 -04:00
cloudhan
51ac6617f5
Fix warnings and enable dev mode for ROCm CI (#13223)
Fix warnings and enable dev mode for ROCm CI:

* Fix ROCm headers complaining "This file is deprecated. Use the header file from ..."
* Disable the signed/unsigned comparison warning for kernel explorer
* Fix unused and nodiscard warnings
* Enable dev mode for ROCm CI
* Work around error "unknown warning option '-Wno-nonnull-compare'" in kernel explorer by using '-Wno-unknown-warning-option' to ignore the unknown option
* Fix error "unused parameter 'mask'"
* Fix warning "instantiation of variable 'onnxruntime::rocm::Consts<float>::One' required here, but no definition is available", etc. Fixed by using C++17's inline (implied by constexpr) static initialization.
* Remove unused variable
* Add the missing `override` specifier
2022-10-07 09:45:01 +08:00
Edward Chen
4e37464cc5
Add build configuration to binary size checks pipeline. (#13208)
Add another build configuration to binary size checks pipeline. Enable additional configurations to be added more easily.
2022-10-05 12:39:19 -07:00
cloudhan
72076b1eb2
Update ROCm CI to use HIP LANGUAGE (#13214)
Update ROCm CI before relanding tunable GEMM #12853. This PR also updates
composable kernel to use CMake's HIP language support so that we can
mix a C/C++ compiler with the HIP compiler instead of being locked to hip-clang.
2022-10-05 16:15:16 +08:00
Yulong Wang
82786baed1
[js/web] add 'xnnpack' to EP list (#12723)
**Description**: This PR adds support for "XNNPACK EP" in ORTWeb and
changes how ORTWeb deals with "backends", or "EPs", in the
API.

**Background**: The term "backend" was introduced in ONNX.js to represent
a TypeScript type that implements a "backend" interface, which is a
similar but different concept to ORT's EP (execution provider). There
were 3 backends in ONNX.js: "cpu", "wasm" and "webgl".

When ORT Web was launched, the concept was carried over to help users
integrate smoothly. Technically, when the "wasm" backend is used, users need
to also specify the "EP" in the session options. Because it may be
complicated and confusing for users to figure out the difference between
"backend" and "EP", the JS API hides the "backend" concept and maps
names to backends and EPs:
"webgl" (Name) <==> "onnxjsBackend" (Backend)
"wasm" (Name) <==> "wasmBackend" (Backend) <==> "CPU" (EP)

**Details**:
The following changes are applied in this PR:
1. allow multi-registration for backends using the same name. This is
for use scenarios where both "onnxruntime-node" and "onnxruntime-web"
are consumed in a Node.js app (so "cpu" will be registered twice in
this scenario).
2. re-assign priority values to backends. I give 100 as base to "cpu"
for node and react_native, and 10 as base to "cpu" in web.
3. add "cpu", "xnnpack" as new names of backends.
4. update onnxruntime wasm exported functions to support EP
registration.
5. update implementations in ort web to handle execution providers in
session options.
6. add '--use_xnnpack' as default build flag for ort-web
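
Items 1 and 2 can be pictured with a small registry sketch. It is written in Python for illustration and is not the ORT Web implementation: the class `BackendRegistry`, its methods, and the conflict rule (higher priority wins, equal priority with a different backend is an error) are assumptions. Only the base priorities 100 and 10 come from the description above.

```python
from typing import Dict, List, Tuple


class BackendRegistry:
    """Name -> (backend, priority) map that tolerates repeated registration."""

    def __init__(self) -> None:
        self._backends: Dict[str, Tuple[object, int]] = {}

    def register(self, name: str, backend: object, priority: int) -> None:
        current = self._backends.get(name)
        if current is not None:
            current_backend, current_priority = current
            if current_priority > priority:
                return  # a higher-priority backend already owns this name
            if current_priority == priority and current_backend is not backend:
                raise ValueError(f'cannot register backend "{name}" twice at priority {priority}')
        self._backends[name] = (backend, priority)

    def resolve(self, name: str) -> object:
        return self._backends[name][0]

    def names_by_priority(self) -> List[str]:
        # Highest priority first; ties keep registration order (sorted is stable).
        return sorted(self._backends, key=lambda n: -self._backends[n][1])


if __name__ == "__main__":
    registry = BackendRegistry()
    web_cpu, node_cpu = object(), object()
    registry.register("cpu", web_cpu, 10)    # base priority for "cpu" in web
    registry.register("cpu", node_cpu, 100)  # base priority for "cpu" in node
    print(registry.resolve("cpu") is node_cpu)
```

With this policy, an app that loads both packages ends up with the node "cpu" backend regardless of import order.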
2022-10-03 10:38:45 -07:00
Baiju Meswani
0cf17b1921
Add linux debug training package to nightly pipeline (#13192) 2022-10-01 06:58:43 -07:00
Yulong Wang
054464dce2
fix XNNPACK on WebAssembly SIMD (#13161)
### Description

fix XNNPACK on WebAssembly SIMD.

The flag "-msimd128" needs to be applied to every source file when compiling
for WASM SIMD. Currently only some of the source files are compiled with
this flag, so we get inconsistent results for
`sizeof(xnn_f32_minmax_params)` because the type definition includes an
`#ifdef` for `__wasm_simd128__`. The inconsistency causes garbage data to be
written to a stack variable and eventually causes the crash.

XNNPACK libraries are C libraries, so the build flags need to be applied not
only to `CMAKE_CXX_FLAGS` but also to `CMAKE_C_FLAGS`.
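
The failure mode can be illustrated with `ctypes`. The two layouts below are hypothetical stand-ins, not the real definition of `xnn_f32_minmax_params`; they only show how a type whose fields depend on a preprocessor flag ends up with two sizes, so code built with the flag writes past a buffer allocated by code built without it.

```python
import ctypes


class ParamsScalar(ctypes.Structure):
    """Layout seen by a file compiled WITHOUT the SIMD flag (hypothetical)."""
    _fields_ = [("min", ctypes.c_float), ("max", ctypes.c_float)]


class ParamsSimd(ctypes.Structure):
    """Layout seen by a file compiled WITH the SIMD flag (hypothetical)."""
    _fields_ = [("min", ctypes.c_float * 2), ("max", ctypes.c_float * 2)]


def overflow_bytes(caller_layout, callee_layout) -> int:
    """Bytes the callee would write past the caller's stack allocation."""
    return max(0, ctypes.sizeof(callee_layout) - ctypes.sizeof(caller_layout))


if __name__ == "__main__":
    print("caller allocates:", ctypes.sizeof(ParamsScalar), "bytes")
    print("callee initializes:", ctypes.sizeof(ParamsSimd), "bytes")
    print("overflow:", overflow_bytes(ParamsScalar, ParamsSimd), "bytes")
```

When every translation unit is compiled with the same flag, both sides agree on one layout and the overflow is zero.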
2022-09-30 16:34:15 -07:00
Changming Sun
5f1bc8ff56
Add "--parallel" to the build flags of WASM pipeline (#13179) 2022-09-30 06:54:39 -07:00
Yi Zhang
a862b0cad1
increase ios_CI_coreml stage timeout limit (#13157)
### Description
As the title says.

### Motivation and Context
Recently, the workflow has been canceled due to timeout more
frequently.
2022-09-30 14:45:14 +08:00
PeixuanZuo
3157cdb19a
[ROCm] Fix MIGraphX ciagent user Permissions issues (#13137)
### Description

Fix the MIGraphX CI pipeline failure.

The MIGraphX pipeline is disabled for now. It will be enabled when this PR merges.

### Motivation and Context
2022-09-29 10:25:02 +08:00
Baiju Meswani
5182d6610d
Upgrade pytorch to 1.12.1 for training pipelines (#13128) 2022-09-28 17:59:49 -07:00
sfatimar
c9a86fa27f
Openvino GPU Unit/Python Tests fix failure (#13122)
### Description
This PR fixes the iGPU unit and Python tests.
It also adds the packaging pip package to the manylinux Dockerfile build.


### Motivation and Context
This change is required to make sure iGPU Unit Test/Python Tests with OV
are fixed

Co-authored-by: shamaksx <shamax.kshirsagar@intel.com>
Co-authored-by: mayavijx <mayax.vijayan@intel.com>
Co-authored-by: pratiksha <pratikshax.bapusaheb.vanse@intel.com>
Co-authored-by: pratiksha <mohsinx.mohammad@intel.com>
Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com>
Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com>
Co-authored-by: nmaajidk <n.maajid.khan@intel.com>
Co-authored-by: Mateusz Tabaka <mateusz.tabaka@intel.com>
2022-09-28 16:00:06 -07:00
Edward Chen
55ae71c160
Reduce Objective-C static analysis build time. (#13149) 2022-09-28 15:49:48 -07:00
PeixuanZuo
5e4ebbd9d9
[ROCm] add MIGraphX ci pipeline (#11569)
**Description**:
Add a MIGraphX CI pipeline covering the build and unit tests.
This PR is based on #11492

Pipeline :
https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=765711&view=results
2022-09-28 10:59:30 +08:00
Baiju Meswani
f99d00fa38
Add rel* branches to upload training packages to final storage (#13124) 2022-09-27 17:20:17 -07:00
leqiao-1
43766ee36d
Fix OLive build pipeline (#13114) 2022-09-27 10:19:58 -07:00
RandySheriffH
237ccc01c7
Remove one last nuphar reference (#13111)
Remove one last nuphar reference.
2022-09-26 23:02:36 -07:00
RandySheriffH
77a066c700
Drop nuphar from java API (#13107)
Drop nuphar from:

- java API
- tvm.cmake
- run_build.sh
2022-09-26 17:06:08 -07:00
Edward Chen
b62ba0b5a7
Remove old enable_linux_gpu_tests parameter from template invocation. (#13102)
Remove old enable_linux_gpu_tests parameter from template invocation in build-perf-test-binaries-pipeline.yml.
2022-09-26 16:27:40 -07:00
RandySheriffH
a83a9ed6b0
Remove miscellaneous nuphar configs (#13070)
Remove a handful of nuphar related configurations after deprecation.

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-09-26 13:41:28 -07:00
Changming Sun
7116825aef
Add CMAKE_CUDA_ARCHITECTURES list to python packaging pipeline (#13081) 2022-09-26 10:22:43 -07:00
mayavijx
ade0d29174
Updated Dockerfile.ubuntu_openvino with OV 2022.2 official release (#13069)
Updated Dockerfile.ubuntu_openvino, which previously used only a pre-release,
to use the OV 2022.2 official release.
2022-09-26 00:15:52 -07:00
dependabot[bot]
6587a85f8f
Bump protobuf from 3.18.1 to 3.18.3 in /tools/ci_build/github/linux/tvm
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.18.1 to 3.18.3.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.18.1...v3.18.3)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-09-24 21:12:16 -07:00
dependabot[bot]
c1ff4b468d
Bump protobuf in /tools/ci_build/github/linux/docker/scripts/manylinux
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.18.1 to 3.18.3.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.18.1...v3.18.3)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-09-24 15:21:50 -07:00
dependabot[bot]
63c3b21902
Bump protobuf from 3.18.1 to 3.18.3 in /tools/ci_build/github/linux/docker/inference/x64/python/cpu/scripts (#13080) 2022-09-23 22:15:36 -07:00
Changming Sun
9e21ffb649
Add license header to some files. (#13074) 2022-09-23 18:46:02 -07:00
Baiju Meswani
8bb16ab900
Propagate environment variable to docker image (#13031) 2022-09-23 11:23:49 -07:00
Changming Sun
eafd67b8fd
Update CUDA version to 11.6 and refactor python packaging pipeline (#13002)
1. Update CUDA version from 11.4 to 11.6.
2. Update Manylinux version
3. Upgrade GCC version from 10 to 11 for most x86_64 pipelines. CentOS 7 ARM64 doesn't have GCC 11 yet.
4. Refactor the Python packaging pipeline:
    a. Split the Linux GPU build job into two parts, build and test, so that the build part doesn't need to use a GPU machine.
    b. Make the Linux GPU build job and the Linux CPU build job more similar: they share the same bash script and YAML file.
5. Temporarily disable Attention_Mask1D_Fp16_B2_FusedNoPadding because it is causing one of our packaging pipelines to fail. I have created an ADO task for this.
2022-09-23 00:29:27 -07:00
Scott McKay
078ceab1db
Use full ORT package for onnxruntime-react-native. (#13037)
**Description**: 
Use full ORT package for onnxruntime-react-native.

Left the params required for the mobile build in comments so they're
easily discovered if we need to create onnxruntime-react-native-mobile
in the future.

**Motivation and Context**
Remove barrier to using ORT with react native as the mobile package that
was being used supports a limited range of opsets/operators/types, and
requires ORT format models. The full package will run any model.
2022-09-23 07:20:03 +10:00
sfatimar
cccbe90764
Openvino ep 2022.2 v4.2 (#13023)
These changes align ORT with the OV 2022.2 release. Changes:
CPU FP16 support, dGPU support, RHEL Dockerfile, Ubuntu 20 Dockerfile.

**Motivation and Context**
- This change is required to ensure ORT-OpenVINO Execution Provider is
aligned with latest changes.

Co-authored-by: mayavijx <mayax.vijayan@intel.com>
Co-authored-by: shamaksx <shamax.kshirsagar@intel.com>
Co-authored-by: pratiksha <pratikshax.bapusaheb.vanse@intel.com>
Co-authored-by: pratiksha <mohsinx.mohammad@intel.com>
Co-authored-by: Sahar Fatima <sfatima.3001@gmail.com>
Co-authored-by: Preetha Veeramalai <preetha.veeramalai@intel.com>
Co-authored-by: nmaajidk <n.maajid.khan@intel.com>
Co-authored-by: Mateusz Tabaka <mateusz.tabaka@intel.com>
Co-authored-by: intel <intel@iotgecsp-nuc04.iind.intel.com>
2022-09-22 12:31:40 -07:00
Adrian Lizarraga
39e20686a0
[EP Perf Dashboard] Fix incorrect calls to trtexec with fp16 inputs (#13018) 2022-09-21 10:31:45 -07:00
Yi Zhang
8356e3b9b0
Add onnx single node test data to tests (#12822)
1. Add node test data to the current model tests.
2. Support filtering tests by opset version.
3. Remove the old filter based on ONNX version. To avoid confusion, ONLY
the opset version filter is supported in onnxruntime_test_all.
4. Support reading ONNX test data from an absolute path on Windows.
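
A rough sketch of an opset-based filter, assuming test data lives under directories named like `opset17`. That directory convention, the function names, and the sample paths are assumptions for illustration; the commit message does not describe how onnxruntime_test_all implements the filter.

```python
import re
from typing import Iterable, List, Optional, Set

_OPSET_DIR = re.compile(r"^opset(\d+)$")


def opset_of(test_path: str) -> Optional[int]:
    """Extract the opset version from a path component such as 'opset17'."""
    # Normalize separators so Windows absolute paths are handled too.
    for part in test_path.replace("\\", "/").split("/"):
        match = _OPSET_DIR.match(part)
        if match:
            return int(match.group(1))
    return None


def filter_by_opset(test_paths: Iterable[str], enabled_opsets: Set[int]) -> List[str]:
    """Keep only the tests whose opset version is enabled."""
    return [p for p in test_paths if opset_of(p) in enabled_opsets]


if __name__ == "__main__":
    paths = [r"C:\data\opset17\test_abs", "/data/opset9/test_add", "/data/misc/test_x"]
    print(filter_by_opset(paths, {17}))
```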
2022-09-21 10:02:57 -07:00