### Description
1. Remove Linux jobs for ORT-Extension combined build
2. Add a macOS build job for ORT-Extension combined build
3. Adjust the yaml file so that it can support two different ADO
instances.
### Motivation and Context
To test our code better. And it will enable us to run such tests for
every commit in the main branch. It would be easier for us to figure out
which change caused a build break.
See
[AB#13435](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/13435)
### Description
To reduce CUDA package's size a little bit. 37 is for Tesla K80. Azure's
NC-series uses it, but in most cases CUDA can dynamic generate device
code .
### Description
Fixes the DML release build for 1.14.1. This was initially fixed by
https://github.com/microsoft/onnxruntime/pull/13417 for 1.13.1, but the
changes didn't make their way back to the main branch.
Integrate TensorRT 8.5
- Update TensorRT EP to support TensorRT 8.5
- Update relevant CI pipelines
- Disable known non-supported ops for TensorRT
- Make timeout configurable.
We observe more than [20
hours](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=256729&view=logs&j=71ce39d8-054f-502a-dcd0-e89fa9931f40)
of running unit tests with TensorRT 8.5 in package pipelines. Because we
can't use placeholder to significantly reduce testing time (c-api
application test will deadlock) in package pipelines, we only run
subsets of model tests and unit tests that are related to TRT (add new
build flag--test_all_timeout and set it to 72000 seconds by package
pipelines). Just to remember, we still run all the tests in TensorRT CI
pipelines to have full test coverage.
- include https://github.com/microsoft/onnxruntime/pull/13918 to fix
onnx-tensorrt compile error.
Co-authored-by: George Wu <jywu@microsoft.com>
Patch Protobuf and ONNX's cmake files and enforce BinSkim check.
This PR has overlap with #13523 . I would prefer to get this one merged
first so that we can finished the BinSkim work, and I try to make this
PR as small as possible.
### Description
Update protobuf-java to version 3.21.7. This change only impact tests.
### Motivation and Context
The current version exhibits CVE-2022-3509
1. Move DML packaging pipelines to aiinfra-dml-winbuild machine pool
2. Delete
tools/ci_build/github/azure-pipelines/templates/windowsai-nuget-build.yml
because the pipeline has been migrated to Onebranch. I monitored it for
months, it worked well.
1. Update CUDA version from 11.4 to 11.6.
2. Update Manylinux version
3. Upgrade GCC version from 10 to 11 for most x86_64 pipelines. CentOS 7 ARM64 doesn't have GCC 11 yet.
4. Refactor python packaging pipeline:
a. Split Linux GPU build job to two parts, build and test, so that the
build part doesn't need to use a GPU machine
b. Make the Linux GPU build job and Linux CPU build job more similar: share the same bash script and yaml file.
5. Temporarily disable Attention_Mask1D_Fp16_B2_FusedNoPadding because it is causing one of our packaging pipeline to fail. I have created an ADO task for this.
1. add node test data to current model tests
2. support opset version to filter tests.
3. remove old filter based on onnx version. To avoid confusion, ONLY
support opset version filter in onnxruntime_test_all
4. support read onnx test data from absolute path on Windows.
1. Move the Linux ARM64 part of python packaging pipeline to a real ARM64 machine pool
2. Refactor the Linux CPU build jobs of python packaging pipeline to two parts: build and test. The test part will be exempted from Cyber EO compliance requirements as it won't affect the final bits we publish. This refactoring is to reduce dependencies in the build part. For example, this PR remove pytorch from the build dependencies.
3. Combine DML nuget packaging pipeline with "Zip-Nuget-Java-Nodejs Packaging Pipeline" as they all produce ORT nuget packages. Also, publish DML nuget packages and ORT GPU nuget packages to https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/ORT-Nightly feed.
1. Delete the build scripts that were copied from manylinux project. Use "git checkout" instead.
2. Update manylinux version to get python 3.11. Related issue: Python 3.11 support #12343
3. Change the cuda version of linux gpu build job of nuget packaging pipeline from cuda 11.4 to cuda 11.6 to match the TRT job within the same pipeline.. (A lot other places need be updated as well, but I'd prefer to put them in another PR)
4. Make dockerfile names static. For example, replace tools/ci_build/github/linux/docker/$(DockerFile) to tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cpu . The former one relies on a runtime variable $(DockerFile), Template Parameters are expanded early in processing a pipeline run when most variables are not available. It like C++ macros vs variables.
* Add net6 targets.
Remove maccatalyst as we don't have a native build targetting that.
* Set platform in macos targets
* Add targetFramework entries
* Move NativeLib.DllName definition and set using preprocessor values for simplicity. Couldn't get it to build with the preprocessor based setup when it was in a separate file.
Update the nuspec generation to set platform version for .net6 targets. TODO: Validate versions. I copied them from the managed nuget package the packaging pipeline generated prior to adding targets. Possibly w could/should lower some of the versions.
Hopefully the need to specify a version goes away when the release version of VS2022 supports .net6.
* Try android 31.1 as https://github.com/actions/virtual-environments/blob/main/images/win/Windows2022-Readme.md suggests that should be available on the CI machines
* Fix patch version mismatch
Add some extra debug info in case it helps
* Debug nuget location in CI
* Add workspace entry back in
* Add steps
* One more attempt with hardcoded nuget.exe path and original android31.0 version
* Better fix - found explicit nuget download and updated version there.
* flake8 fixes
* Fix black complaints.
* Exit Microsoft_ML_OnnxRuntime_CheckPrerequisites for net6 iOS.
* Removed outdated comment
* Add .net6 support to the C# nuget package.
Currently requires jumping through a lot of hoops due to .net 6 only being supported in the preview release of VS 2022.
Build existing targets using msbuild.
Add .net6 targets and build using dotnet.
Create nuget package with combined targets.
A few misc automated changes from VS to spacing and adding a couple of properties.
* update trt 8.4ga
* trt 8.4 linux ci pipeline
* fix cmake
* placeholder_builder
* trt 8.4 windows pipeline
* gpu package pipeline
* trt 8.4.1.5 , packaging pipeline updates
* python packaging
* ctest timeout
* python packaging test
* bump timeout
* python format
* format
* revert
* newline
* enable trt python tests
* typo
* python format
* disable on windows
* add c-api test for package
* fix bug for running c-api test for package
* refine run application script
* remove redundant code
* include CUDA test
* Remove testing CUDA EP temporarily
* fix bug
* Code refactor
* try to fix YAML bug
* try to fix YAML bug
* try to fix YAML bug
* fix bug for multiple directories in Pipelines
* fix bug
* add comments and fix bug
* Update c-api-noopenmp-packaging-pipelines.yml
* Remove failOnStandardError flag in Pipelines
* update base image from 11.4.0 to 11.4.2
* update Linux TRT GPU pipeline to TRT 8.2
* update onnx-tensorrt to 8.2-GA
* disable failing TensorRT 8.2 tests.
* update pad test.
* fix
* update win trt ci pipeline to trt 8.2
* test run with cuda 11.4 and cudnn 8.2
* increase timeout
* revert
* revert
* update packaging pipelines to use trt 8.2
* fix typo
* update trt gpu perf pipeline to trt 8.2
* increase timeout
* delete deprecated ci-perf-pipeline.yml
* bump timeout
* adjust timeout packaging
- Only set them as targets for the ORT nuget package
- Use OrtPackageId as the condition for inclusion, if installed
- need to do the nuget restore via msbuild so that this property is set correctly
- Add desktop-only version of the C# sln as there is no way to exclude the mobile specific csproj's from an sln
- use this when applicable if someone is running build.py with the `--build_nuget` flag
Other
- remove attempt to include symbols in the nuget package as nuget doesn't support symbols in native packages
- update build.py to use `nuget` and not a windows specific path and filename for a linux build with `--build_nuget`
Add Xamarin support to the ORT nuget packages.
- Update C# code to support Xamarin builds for iOS and Android
- refactor some things to split out common code
- include iOS and Android ORT native shared library in native nuget package
* Update to CUDA11.4 and TensorRT-8.0.3.4
* update trt pool, remove cudnn from setup_env_gpu.bat
* revert pool
* test gpu package pipeline on t4
* back out changes
* back out changes
Co-authored-by: George Wu <jywu@microsoft.com>
* Add netstandard2.0 to nuget managed package.
Re-does PR that was backed out due to packaging pipeline changes.
Allows deprecation of netstandard1.1 in the following release as netstandard2 is the preferred lowest level framework.
* modify for test
* modify for test
* modify for test
* modify for test
* modify for test
* modify for test
* prepare for PR
* Rename cuda directory to gpu directory in tarball
* Fix gpu java package
* fix bug
* fix small bug
* Add onnxruntime_providers_shared.dll into gpu nuget package
* Modify for test
* Temporarily remove for test
* Modify for test
* Modify for test
* Test packging Windows combined GPU
* Test packging Windows combined GPU
* Test packging Windows combined GPU
* Test packging Windows combined GPU
* modify for test
* modify for test
* fix bug
* Modify for test
* Modify for test
* Modify for test
* Modify for test
* Modify for test
* Modify for test
* Modify for test
* Modify for test
* Prepare for PR
* Prepare for PR
* Code refactor
* Rename proper Artifact name
* Rename intermediate Artifact names
* Revert Artifact Names
* Rename Artifact Names
* Modify Artifact name
* Modify Artifact name
* Modify Artifact name
* Update Java package
* Update Java package
* fix bug to change artifact name
* Fix bug for the wrong file path
* Fix no fetching correct artifact and test
* temporarily modify for test
* undo the change for test
* Merge CPU/GPU nuget pipeline
* Include TensorRT EP libraries into existing GPU nuget package pipeline
* modify to use correct YAML
* Modify for test
* modify for test
* Add depedance
* Add depedance (cont.)
* modify for test
* Add create TensorRT nuget package
* modify for test
* modify for test
* Merge CPU/GPU nuget pipeline
* Include TensorRT EP libraries into existing GPU nuget package pipeline
* modify to use correct YAML
* Modify for test
* modify for test
* Add depedance
* Add depedance (cont.)
* modify for test
* Add create TensorRT nuget package
* modify for test
* fix merge bug
* code refactor
* code refactor
* modify for test
* modify for test
* modify for test
* modify for test
* modify for test
* modify for test
* cleanup
* modify for test
* fix bug
* modify for test
* refactor
* fix bug and test
* Modify for test
* Modify for test
* Modify for test
* Modify for test
* Prepare for PR
* Prepare for PR
* code refacotr from review
* Remove naming 'Microsoft.ML.OnnxRuntime.TensorRT' to avoid confusion
* Add linux TensorRT libraries
* Remove redundant variable in YMAL
* revert file
* undo revert file
* Modify regular expression so that it can capture the correct file
* Remove newline at end of file
* small fix
* Revert to CUDA11.1 on Windows
* Add unit tests for nuget package on Linux
Co-authored-by: Changming Sun <chasun@microsoft.com>
* initial update from 11.1 to 11.4
* change 11.4.1 to 11.4.0
* adjusting to match nvidia/cuda image tags
* adjusting to match nvidia/cuda image tags centos7
* correction to 11.4.0
* correction to 11.4.0
* update to cuda 11.4
* change training back to 11.1
* change training back to 11.1
* point to correct nvcr.io/nvidia/cuda 11.4.1 image
* change centos8 to centos7
* correct cudnn path
* Update linux-gpu-ci-pipeline.yml for Azure Pipelines
* Update c-api-noopenmp-packaging-pipelines.yml
* need to resolve centos images but remove space and change to 11.4
* Update linux-gpu-ci-pipeline.yml
* add cudnn to docker image
* bump devtoolset to 10
* revert cuda 11.4 change to setup_env_trt
* orttraining back to 11.1
* use nvcr.io
* Fix previous change back to cuda 11.1
* update cudnn path
* use cudnn image (revert if failure)
Merge CPU/GPU nuget pipeline. The old GPU nuget pipeline will be only for DML.
TODO: the result GPU package contains PDB files for some of the DLLs, but not all. It is due to the refactoring of CUDA EP to pluggable DLLs. At that time we forgot to copy the PDB files. However, I can't add them in now. Because currently the package is already 220MB large. If the missed PDB files were added, then it will be oversize. nuget.org doesn't accept >250MB packages.
1. Update manylinux build scripts. This will add [PEP600](https://www.python.org/dev/peps/pep-0600/)(manylinux2 tags) support. numpy has adopted this new feature, we should do the same. The old build script files were copied from https://github.com/pypa/manylinux, but they has been deleted and replaced in the upstream repo. The manylinux repo doesn't have a manylinux2014 branch anymore. So I'm removing the obsolete code, sync the files with the latest master.
2. Update GPU CUDA version from 11.0 to 11.1(after a discussion with PMs).
3. Delete tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda10_2. (Merged the content to tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda11)
4. Modernize the cmake code of how to locate python devel files. It was suggested in https://github.com/onnx/onnx/pull/1631 .
5. Remove `onnxruntime_MSVC_STATIC_RUNTIME` and `onnxruntime_GCC_STATIC_CPP_RUNTIME` build options. Now cmake has builtin support for it. Starting from cmake 3.15, we can use `CMAKE_MSVC_RUNTIME_LIBRARY` cmake variable to choose which MSVC runtime library we want to use.
6. Update Ubuntu docker images that used in our CI build from Ubuntu 18.04 to Ubuntu 20.04.
7. Update GCC version in CUDA 11.1 pipelines from 8.x to 9.3.1
8. Split Linux GPU CI pipeline to two jobs: build the code on a CPU machine then run the tests on another GPU machines. In the past we didn't test our python packages. We only tested the pre-packed files. So we didn't catch the rpath issue in CI build.
9. Add a CentOS machine pool and test our Linux GPU build on real CentOS machines.
10. Rework ARM64 Linux GPU python packaging pipeline. Previously it uses cross-compiling therefore we must static link to C Runtime. But now have pluggable EP API and it doesn't support static link. So I changed to use qemu emulation instead. Now the build is 10x slower than before. But it is more extensible.
1. Remove openmp related packaging pipelines and build jobs.
2. Set continueOnError to true for the TSAUpload tasks. Their service is unstable recently.
3. Update Ubuntu 16 docker images to Ubuntu 18, in prepare for getting C++17 support
4. Cherry-pick the changes in 1.7.1 to the master: updating CFLAGS/CXXFLAGS to strip out debug symbols