onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-23 02:38:28 +00:00

Author	SHA1	Message	Date
Jian Chen	3ea27c2925	Create a new Nuget Package pipeline for CUDA 12 (#18135 )	2023-11-28 09:03:46 -08:00
Jian Chen	7c18c60bc2	Change cuda image for tensorRT to the one with cudnn8 (#18102 )	2023-10-26 16:28:57 -07:00
Jian Chen	76e275baf4	Merge Cuda docker files into a single one (#18020 )	2023-10-24 15:17:36 -07:00
Hariharan Seshadri	9356986730	Fix AMD builds and enable testing NHWC CUDA ops in one GPU CI (#17972 ) ### Description This PR: (1) Fixes AMD builds after #17200 broke them (Need to remember to run AMD builds while trying to merge external CUDA PRs next time) (2) Turn on the NHWC CUDA feature in the Linux GPU CI. The extra time spent in building a few more files and running a few more tests will not be much. Test Linux GPU CI run : https://dev.azure.com/onnxruntime/onnxruntime/_build/results?buildId=1170770 ### Motivation and Context Keep the NHWC CUDA ops tested (https://github.com/microsoft/onnxruntime/pull/17200) and guard against regressions	2023-10-17 09:23:52 -07:00
Changming Sun	c6b0d185b4	Update cmake to 3.27 and upgrade Linux CUDA docker files from CentOS7 to UBI8 (#16856 ) ### Description 1. Update docker files and their build instructions. ARM64 and x86_64 can use the same docker file. 2. Upgrade Linux CUDA pipeline's base docker image from CentOS7 to UBI8 AB#18990	2023-09-05 18:12:10 -07:00
Yi Zhang	d4a61ac71f	PR triggers generated by code (#17247 ) ### Description 1. Refactor the trigger rules generation. 2. Skip all doc changes in PR pipelines. ### Motivation and Context Generate all trigger rules by running set-trigger-rules.py to reduce inconsistencies. It's easy to make mistakes when copying and pasting manually. For example, these 2 excludes are different. Why? `4e6cec4d09/tools/ci_build/github/azure-pipelines/linux-ci-pipeline.yml (L16-L18)` `4e6cec4d09/tools/ci_build/github/azure-pipelines/linux-gpu-ci-pipeline.yml (L27-L29)` ### Note All changes in workflow yamls are generated by code. Please review the skip-js.yml, skip-docs.yml and set-trigger-rules.py. @fs-eire, please double check the filter rules in skip-js.yml and the skipped workflows `7023c2edff/tools/ci_build/set-trigger-rules.py (L14-L41)`	2023-08-30 05:57:03 +08:00
Jian Chen	45f52987a2	Web CI Pipeline Isolation (#17005 )	2023-08-14 10:37:37 -07:00
Yi Zhang	555414f1aa	Set PR trigger rules (#16987 ) ### Description Add a script to insert the trigger rules into workflow yamls. As a first step, skip the Windows GPU and Linux GPU workflows when there is only a doc change. ### Motivation and Context Make it easy to skip workflows for doc changes. [AB#18201](https://aiinfra.visualstudio.com/6a833879-cd9b-44a4-a9de-adc2d818f13c/_workitems/edit/18201)	2023-08-04 08:21:07 -07:00
Changming Sun	6b5b79872b	Avoid taking dependency on dl.fedoraproject.org (#16202 ) ### Description 1. Avoid taking dependency on dl.fedoraproject.org The website is not very stable. Our build pipelines often fail to fetch packages from there. 2. Update manylinux to the latest version	2023-06-02 07:41:46 -07:00
Yi Zhang	6d43d51eb0	[Fix] No test result report while not using ctest (#15976 ) ### Description 1. Set gtest output while ctest is set to empty. 2. onnx_src in _deps shouldn't be removed because onnx_test_pytorch_converted and onnx_test_pytorch_converted need to read data from onnx/backend/test/data/.. ### Motivation and Context Test result report is important to find the flaky tests. ### To do Tests are not consistent. If ctest_path is empty, onnx_test_pytorch_converted and onnx_test_pytorch_converted will not be executed; if it's not, onnxruntime_mlas_test will not be executed. `270c09a37f/tools/ci_build/build.py (L1743-L1753)`	2023-05-17 08:31:16 -07:00
Yi Zhang	b20d5e85d5	Update Cuda to 11.8 in 2 Linux GPU workflows. (#15925 ) ### Description use template variable for cuda version ### Motivation and Context	2023-05-14 12:51:25 +08:00
Yi Zhang	0e7ae13e74	Run Linux GPU tests in docker container (#15872 ) ### Description Run Linux GPU tests in docker container ### Motivation and Context	2023-05-12 06:29:22 +08:00
Changming Sun	d3d232b047	Rename onnxruntime-Linux-CPU-2019 machine pool (#15691 ) Rename onnxruntime-Linux-CPU-2019 machine pool to "onnxruntime-Ubuntu2004-AMD-CPU". The old one has an internal error and is stuck there. I cannot make any change to it. It has been like this for more than 1 week. So I created a new pool with the same settings except that the name is different. Also, move some android pipelines to "onnxruntime-Linux-CPU-For-Android-CI" which uses a standard image from https://github.com/actions/runner-images	2023-04-27 12:46:18 -07:00
Yi Zhang	0ea965c541	clear cache stat. after building (#15439 ) ### Description Add `ccache -z` after every build. ### Motivation and Context Uploaded Cache stat shouldn't include cache stat.	2023-04-10 13:56:55 +08:00
Jian Chen	792d411135	Update python 3.11 and remove 3.7 for Linux (#15214 ) ### Description Update python 3.11 and remove 3.7 ### Motivation and Context Update python 3.11 and remove 3.7 --------- Co-authored-by: Ubuntu <chasun@chasunlinux.lw3b1xzoyrkuzm34swpscft0ff.dx.internal.cloudapp.net>	2023-03-27 14:46:30 -07:00
Changming Sun	63cc1bb26a	Move Linux CPU pipelines to an AMD CPU pool which is cheaper (#15144 ) ### Description 1. Move Linux CPU pipelines to an AMD CPU pool which is cheaper 2. Enable CCache for orttraining pipeline ### Motivation and Context Azure AMD CPU machines are generally much cheaper than Intel CPU machines. However, they don't have local disks.	2023-03-27 14:10:08 -07:00
Edward Chen	c46c7ccba5	Update Gradle version (#14862 ) - Update Gradle version used in most places from 6.8.3 to 8.0.1. Update Android Gradle Plugin version where applicable. Not updated in this change: React Native Android projects (under `js/react_native/`). That can be done later along with updating the React Native projects. - Add Gradle wrapper in `java/` to make it easier to consistently use a specific Gradle version.	2023-03-08 12:22:06 -08:00
Yi Zhang	aa9fbed3d4	Add compilation cache for Linux GPU (#13995 )	2022-12-16 16:38:12 +08:00
Changming Sun	04900f96c1	Improve dependency management (#13523 ) ## Description 1. Convert some git submodules to cmake external projects 2. Update nsync from [1.23.0](https://github.com/google/nsync/releases/tag/1.23.0) to [1.25.0](https://github.com/google/nsync/releases/tag/1.25.0) 3. Update re2 from 2021-06-01 to 2022-06-01 4. Update wil from an old commit to 1.0.220914.1 tag 5. Update gtest to a newer commit so that it can optionally leverage absl/re2 for parsing command line flags. The following git submodules are deleted: 1. FP16 2. safeint 3. XNNPACK 4. cxxopts 5. dlpack 6. flatbuffers 7. googlebenchmark 8. json 9. mimalloc 10. mp11 11. pthreadpool More will come. ## Motivation and Context There are 3 ways of integrating 3rd party C/C++ libraries into ONNX Runtime: 1. Install them to a system location, then use cmake's find_package module to locate them. 2. Use git submodules 3. Use cmake's external projects (externalproject_add). At first when this project was just started, we considered both option 2 and option 3. We preferred option 2 because: 1. It's easier to handle authentication. At first this project was not open source, and it had some other non-public dependencies. If we use git submodule, ADO will handle authentication smoothly. Otherwise we need to manually pass tokens around and be very careful not to expose them in build logs. 2. At that time, cmake fetched dependencies after "cmake" finished generating vcprojects/makefiles. So it was very difficult to make cflags consistent. Since cmake 3.11, it has a new command: FetchContent, which fetches dependencies when it generates vcprojects/makefiles just before add_subdirectories, so the parent project's variables/settings can be easily passed to the child projects. And when the project went on, we had some new concerns: 1. As we started to have more and more EPs and build configs, the number of submodules grew quickly. For most developers, most ORT submodules are not relevant to them. 
They shouldn't need to download all of them. 2. It is impossible to let two different build configs use two different versions of the same dependency. For example, right now we have protobuf 3.18.3 in the submodules. Then every EP must use the same version. Whenever we have a need to upgrade protobuf, we need to coordinate across the whole team and many external developers. I can't manage it anymore. 3. Some projects want to manage the dependencies in a different way, either because of their preference or because of compliance requirements. For example, some Microsoft teams want to use vcpkg, but we don't want to force every user of onnxruntime to use vcpkg. 4. Someone wants to dynamically link to protobuf, but our build script only does static linking. 5. It is hard to handle security vulnerabilities. For example, whenever protobuf has a security patch, we have a lot of things to do. But if we allowed people to build ORT with a different version of protobuf without changing ORT's source code, customers who build ORT from source would be able to act on such things more quickly. They would not need to wait for ORT to have a patch release. 6. Every time we do a release, github will also publish a source file zip file and a source file tarball for us. But they are not usable, because they miss submodules. ### New features After this change, users will be able to: 1. Build the dependencies in the way they want, then install them to somewhere (for example, /usr or a temp folder). 2. Or download the dependencies by using cmake commands from these dependencies' official websites. 3. Similar to the above, but use your private mirrors to mitigate supply chain risks. 4. Use different versions of the dependencies, as long as our source code is compatible with them. For example, you can't use protobuf 3.20.x, as it needs code changes in ONNX Runtime. 5. Only download the things the current build needs. 6. Avoid building external dependencies again and again in every build. 
### Breaking change The onnxruntime_PREFER_SYSTEM_LIB build option is removed; you can think of it as being ON by default from now on. If you don't like the new behavior, you can set FETCHCONTENT_TRY_FIND_PACKAGE_MODE to NEVER. Besides, for those who relied on the onnxruntime_PREFER_SYSTEM_LIB build option, please be aware that this PR will change find_package calls from Module mode to Config mode. For example, in the past if you had installed protobuf with apt-get from Ubuntu 20.04's official repo, find_package could find it and use it. But after this PR, it won't. This is because the protobuf version provided by Ubuntu 20.04 is too old to support "config mode". It can be resolved by getting a newer version of protobuf from somewhere.	2022-12-01 09:51:59 -08:00
Changming Sun	a396a91c9a	Move build machines with Nvidia M60 GPUs to Nvidia T4 (#13170 )	2022-10-25 11:21:13 -07:00
Changming Sun	eafd67b8fd	Update CUDA version to 11.6 and refactor python packaging pipeline (#13002 ) 1. Update CUDA version from 11.4 to 11.6. 2. Update Manylinux version 3. Upgrade GCC version from 10 to 11 for most x86_64 pipelines. CentOS 7 ARM64 doesn't have GCC 11 yet. 4. Refactor python packaging pipeline: a. Split Linux GPU build job into two parts, build and test, so that the build part doesn't need to use a GPU machine b. Make the Linux GPU build job and Linux CPU build job more similar: share the same bash script and yaml file. 5. Temporarily disable Attention_Mask1D_Fp16_B2_FusedNoPadding because it is causing one of our packaging pipelines to fail. I have created an ADO task for this.	2022-09-23 00:29:27 -07:00
Changming Sun	b2b4f703a5	Move Linux GPU CI pipeline to T4 (#12996 ) Move Linux GPU CI pipeline to T4	2022-09-20 20:21:32 -07:00
Changming Sun	5d610bc8eb	Disable CG task in PR pipelines (#12426 )	2022-08-02 19:01:41 -07:00
Changming Sun	7b4ce0c1e1	Delete the build scripts that were copied from manylinux project (#12358 ) 1. Delete the build scripts that were copied from manylinux project. Use "git checkout" instead. 2. Update manylinux version to get python 3.11. Related issue: Python 3.11 support #12343 3. Change the cuda version of linux gpu build job of nuget packaging pipeline from cuda 11.4 to cuda 11.6 to match the TRT job within the same pipeline. (A lot of other places need to be updated as well, but I'd prefer to put them in another PR) 4. Make dockerfile names static. For example, replace tools/ci_build/github/linux/docker/$(DockerFile) with tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cpu . The former relies on a runtime variable, $(DockerFile), but template parameters are expanded early in processing a pipeline run, when most variables are not available. It is like C++ macros versus variables.	2022-07-29 18:24:19 -07:00
Gary Miguel	4bf22e2a40	Update ONNX to 1.12 (#11924 ) Follow-ups that need to happen after this and before the next ORT release: * Support SequenceMap with https://github.com/microsoft/onnxruntime/pull/11731 * Support signal ops with https://github.com/microsoft/onnxruntime/pull/11778 Follow-ups that need to happen after this but don't necessarily need to happen before the release: * Implement LayerNormalization kernel for opset version 17: https://github.com/microsoft/onnxruntime/issues/11916 Fixes #11640	2022-06-21 17:19:52 -07:00
Changming Sun	57b51e72d7	Linux CI: uninstall onnx before installing it (#11428 )	2022-05-04 08:49:37 -07:00
Changming Sun	588a66e221	Add cleanup steps to the build jobs which run in Linux CPU machine pool (#11078 )	2022-03-31 22:34:12 -07:00
Changming Sun	cc3a3476ed	Uninstall onnxruntime-training before running local tests (#10827 ) * Uninstall onnxruntime-training before running local tests	2022-03-09 18:45:04 -08:00
dependabot[bot]	4d943c9bd3	Bump numpy from 1.16.6 to 1.21.0 in /tools/ci_build/github/linux/docker/scripts/manylinux (#10387 ) * Bump numpy in /tools/ci_build/github/linux/docker/scripts/manylinux	2022-03-07 20:39:49 -08:00
Dmitri Smirnov	7e092a7e3f	Reduce number of memory allocations based on a customer profiling case (#10193 ) Add abseil and inlined containers typedefs Introduce TensorShapeVector for shape building. Use gsl::span<const T> to make interfaces accept different types of vector like args. Introduce InineShapeVectorT for shape capacity typed instantiations Refactor cuda slice along with provider shared interfaces Refactor Concat, Conv, Pad Build with Conv Einsum and ConvTranspose refactored. Remove TensorShape::GetDimsAsVector() Refactor SliceIterator and SliceIteratorBase Refactor broadcast Refactor Pads for twice as long Remove memory planner intermediate shapes vector Refactor orttraining Fix passing TensorShapeVector to tests Remove abseil copy and submodule, use FetchContent_Declare/Fetch Path with separate command Make RocmAsyncBuffer accept anything convertible to span. Adjust Linux GPU pipeline.	2022-01-24 10:40:46 -08:00
RandySheriffH	9345894c82	Add build option to enable cuda profiling (#9875 )	2021-11-29 22:44:50 -08:00
Chun-Wei Chen	ac57afc3a6	Update ONNX to 1.10 globally in CIs (#9751 ) * Bump ONNX 1.10.2 globally * load ONNX_VERSION from VERSION_NUMBER * /	2021-11-15 15:28:26 -08:00
Olivia Jain	60089f7093	Cuda11.4 (#8709 ) * initial update from 11.1 to 11.4 * change 11.4.1 to 11.4.0 * adjusting to match nvidia/cuda image tags * adjusting to match nvidia/cuda image tags centos7 * correction to 11.4.0 * correction to 11.4.0 * update to cuda 11.4 * change training back to 11.1 * change training back to 11.1 * point to correct nvcr.io/nvidia/cuda 11.4.1 image * change centos8 to centos7 * correct cudnn path * Update linux-gpu-ci-pipeline.yml for Azure Pipelines * Update c-api-noopenmp-packaging-pipelines.yml * need to resolve centos images but remove space and change to 11.4 * Update linux-gpu-ci-pipeline.yml * add cudnn to docker image * bump devtoolset to 10 * revert cuda 11.4 change to setup_env_trt * orttraining back to 11.1 * use nvcr.io * Fix previous change back to cuda 11.1 * update cudnn path * use cudnn image (revert if failure)	2021-08-17 16:36:26 -07:00
Thiago Crepaldi	83be3759bc	Add post-install command to build PyTorch CPP extensions from within onnxruntime package (#8027 ) ORTModule requires two PyTorch CPP extensions that are currently JIT compiled. The runtime compilation can cause issues in some environments without all build requirements or in environments with multiple instances of ORTModule running in parallel. This PR creates a custom command to compile such extensions, which must be manually executed before ORTModule is executed for the first time. When users try to use ORTModule before the extensions are compiled, an error with instructions is raised. PyTorch CPP Extensions for ORTModule can be compiled by running: python -m onnxruntime.training.ortmodule.torch_cpp_extensions.install A full build environment is needed for this.	2021-06-28 18:11:58 -07:00
Changming Sun	cba4bc11c7	Split Linux CPU CI pipeline (#8097 )	2021-06-21 10:52:30 -07:00
Changming Sun	b854f2399d	Update manylinux build scripts and GPU CUDA version from 11.0 to 11.1 (#7632 ) 1. Update manylinux build scripts. This will add [PEP600](https://www.python.org/dev/peps/pep-0600/)(manylinux2 tags) support. numpy has adopted this new feature, we should do the same. The old build script files were copied from https://github.com/pypa/manylinux, but they have been deleted and replaced in the upstream repo. The manylinux repo doesn't have a manylinux2014 branch anymore. So I'm removing the obsolete code and syncing the files with the latest master. 2. Update GPU CUDA version from 11.0 to 11.1 (after a discussion with PMs). 3. Delete tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda10_2. (Merged the content to tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda11) 4. Modernize the cmake code of how to locate python devel files. It was suggested in https://github.com/onnx/onnx/pull/1631 . 5. Remove `onnxruntime_MSVC_STATIC_RUNTIME` and `onnxruntime_GCC_STATIC_CPP_RUNTIME` build options. Now cmake has builtin support for it. Starting from cmake 3.15, we can use the `CMAKE_MSVC_RUNTIME_LIBRARY` cmake variable to choose which MSVC runtime library we want to use. 6. Update Ubuntu docker images that are used in our CI build from Ubuntu 18.04 to Ubuntu 20.04. 7. Update GCC version in CUDA 11.1 pipelines from 8.x to 9.3.1 8. Split Linux GPU CI pipeline into two jobs: build the code on a CPU machine, then run the tests on another GPU machine. In the past we didn't test our python packages. We only tested the pre-packed files. So we didn't catch the rpath issue in CI build. 9. Add a CentOS machine pool and test our Linux GPU build on real CentOS machines. 10. Rework ARM64 Linux GPU python packaging pipeline. Previously it used cross-compiling, therefore we had to statically link to the C Runtime. But now we have a pluggable EP API and it doesn't support static linking. So I changed to use qemu emulation instead. Now the build is 10x slower than before. 
But it is more extensible.	2021-06-02 23:36:49 -07:00
Edward Chen	04679e31ab	Specify CUDA compute capability 7.5 in Linux GPU build (#7203 ) Recently a build agent pool was changed to use T4 GPUs (CUDA compute capability 7.5). Updating some CUDA build options to accommodate that.	2021-03-31 18:51:44 -07:00
Changming Sun	4161758058	Remove openmp related packaging pipeline (#6991 ) 1. Remove openmp related packaging pipelines and build jobs. 2. Set continueOnError to true for the TSAUpload tasks. Their service has been unstable recently. 3. Update Ubuntu 16 docker images to Ubuntu 18, in preparation for getting C++17 support 4. Cherry-pick the changes in 1.7.1 to the master: updating CFLAGS/CXXFLAGS to strip out debug symbols	2021-03-12 10:02:59 -08:00
Changming Sun	b5bd14fc9f	Update GPU packaging pipelines to cuda11 and fix the other build break issues (#6585 ) Update gpu packaging pipelines to CUDA11 In the next release we will use CUDA 11. And our CUDA 11 build suddenly became broken because recently CentOS 7 posted an update of glibc. The version of glibc was changed from 2.17-317.el7 to 2.17-322.el7_9. But the newer one isn't compatible with CUDA 11. We have to downgrade it.	2021-02-05 16:58:37 -08:00
Edward Chen	6d642a3dba	Replace direct pulls from image cache container registry with get_docker_image.py, build definition clean up. (#5906 )	2020-12-01 19:10:23 -08:00
Changming Sun	2d9dcc4576	Add python 3.9 support (#5874 ) 1. Add python 3.9 support(except Linux ARM) 2. Add Windows GPU python 3.8 to our packaging pipeline.	2020-11-30 12:02:48 -08:00
Changming Sun	26db396b4b	Reduce the number of CI build variants (#5856 )	2020-11-18 20:41:30 -08:00
Changming Sun	85f945a875	Regenerate CI build docker images (#5850 )	2020-11-18 14:36:59 -08:00
Ashwini Khade	1cca903680	update onnx commit id (#5594 ) * update onnx commit id * update onnx commit for docker images * update docker images	2020-11-02 09:46:36 -08:00
Ashwini Khade	df22611026	Update ONNX commit (#5487 ) * update ONNX * update onnx + register kernels for reduction ops * bug fix kernel reg * update cgmanifests * revert unsqueeze op 13 registration * filter ops which are not implemented yet * filter some tests * update onnx commit to include conv transpose bug fix * update docker images * undo not required test changes * fix test failures	2020-10-21 07:22:20 -07:00
Changming Sun	17f1178c2e	Downgrade GCC (#5269 ) Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2020-09-24 21:14:54 -07:00
Changming Sun	a0a435abc6	Add sympy==1.1.1 to Linux docker image (#5177 )	2020-09-15 16:08:49 -07:00
Changming Sun	c5efb0085d	Update Linux GPU build pipelines to CUDA 10.2 (#5120 ) * Update Linux GPU build pipelines to CUDA 10.2	2020-09-10 17:40:51 -07:00
Changming Sun	d5d5e37e76	Build system enhancements (#5012 ) 1. Add a docker file for CUDA11 2. Support setting CUDA_ARCHITECTURES from command line.	2020-09-02 10:13:26 -07:00
Ashwini Khade	8679a7244e	Enable rejecting models based on onnx opset (#4912 ) * enable rejecting models based on onnx opset * enable unreleased opsets in linux and mac CI * test fixes and more updates * enable unreleased opsets in CI builds * enable released opsets in linux cis * try fix windows ci yml * yml fixes * update yml * yml updates post master merge * review comments * bug fix	2020-08-31 13:35:36 -07:00


62 commits