onnxruntime/tools/ci_build/github/azure-pipelines
Tang, Cheng 8f34c8c8ed
Introduce collective ops to ort inference build (#14399)
### Description
Introduce collective ops into the onnxruntime inference build, including:
1) AllReduce and AllGather schemas in the contrib ops, controlled by the
USE_MPI flag
2) AllReduce and AllGather kernels in the CUDA EP, controlled by the
ORT_USE_NCCL flag
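
For readers unfamiliar with the two collectives named above, the following is a minimal pure-Python sketch of their semantics only. It is not the onnxruntime kernel code: it assumes a sum reduction for AllReduce and rank-order concatenation for AllGather, and the function names `all_reduce_sum` and `all_gather` are hypothetical helpers invented for illustration.

```python
from typing import List


def all_reduce_sum(rank_inputs: List[List[float]]) -> List[List[float]]:
    """Simulate AllReduce (assumed sum): every rank receives the element-wise sum."""
    total = [sum(vals) for vals in zip(*rank_inputs)]
    # Each rank gets its own copy of the reduced tensor.
    return [list(total) for _ in rank_inputs]


def all_gather(rank_inputs: List[List[float]]) -> List[List[float]]:
    """Simulate AllGather: every rank receives all inputs concatenated in rank order."""
    gathered = [x for chunk in rank_inputs for x in chunk]
    return [list(gathered) for _ in rank_inputs]


# Two simulated ranks (e.g. two GPUs), each holding a 2-element tensor.
inputs = [[1.0, 2.0], [10.0, 20.0]]
reduced = all_reduce_sum(inputs)
gathered = all_gather(inputs)
```

In a real multi-GPU run each rank is a separate process and the data exchange is done by a communication library; the lists here only stand in for per-rank tensors.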


### Motivation and Context
Enable the collective ops in the onnxruntime inference build so that we
can run distributed inference across multiple GPUs.
The original ncclAllReduce ops in the training build require a fairly complex
configuration, which is not suitable for the inference case, and they are
already broken, so we introduce a new implementation.

---------

Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2023-02-07 13:47:48 -08:00
nodejs/templates Delete cpu-esrp-pipeline.yml (#13623) 2022-11-14 19:00:40 -08:00
nuget/templates Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
templates Revert mimalloc from v2.0.9 to v2.0.3 (#14603) 2023-02-07 09:58:25 -08:00
android-x86_64-crosscompile-ci-pipeline.yml Free OrtStatus in ASSERT_ORT_STATUS_OK, make run_android_emulator.py work with newer JDK version (#14369) 2023-01-20 09:27:47 -08:00
anybuild.yml
binary-size-checks-pipeline.yml Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
build-perf-test-binaries-pipeline.yml Remove old enable_linux_gpu_tests parameter from template invocation. (#13102) 2022-09-26 16:27:40 -07:00
c-api-noopenmp-packaging-pipelines.yml Update package pipelines to support TRT 8.5 (#13998) 2022-12-16 15:01:50 -08:00
clean-build-docker-image-cache-pipeline.yml
linux-ci-pipeline.yml Use today's cache only (#14120) 2023-01-04 17:48:52 +08:00
linux-cpu-aten-pipeline.yml Add Cache in Linux CPU Aten Pipeline (#14313) 2023-01-17 10:49:29 +08:00
linux-cpu-eager-pipeline.yml Enable ORT in TorchDynamo (#13259) 2022-11-01 11:19:29 -07:00
linux-cpu-minimal-build-ci-pipeline.yml Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
linux-dnnl-ci-pipeline.yml Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
linux-gpu-ci-pipeline.yml Add compilation cache for Linux GPU (#13995) 2022-12-16 16:38:12 +08:00
linux-gpu-tensorrt-ci-pipeline.yml [TensorRT EP] support TensorRT 8.5 (#13867) 2022-12-14 13:06:03 -08:00
linux-gpu-tensorrt-daily-perf-pipeline.yml Update package pipelines to support TRT 8.5 (#13998) 2022-12-16 15:01:50 -08:00
linux-gpu-tensorrt-packaging-pipeline.yml [TensorRT EP] support TensorRT 8.5 (#13867) 2022-12-14 13:06:03 -08:00
linux-migraphx-ci-pipeline.yml [ROCm] Update ROCm and MigraphX CI to ROCm5.4 (#14011) 2022-12-22 10:01:05 +08:00
linux-multi-gpu-ci-pipeline.yml Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
linux-multi-gpu-tensorrt-ci-pipeline.yml
linux-openvino-ci-pipeline.yml Openvino ep 2022.3 v4.3 (#14210) 2023-01-11 16:31:26 -08:00
linux-openvino-nightly-pipeline.yml
mac-ci-pipeline.yml make WITHCACHE as an option in MacOS workflow (#14188) 2023-01-10 10:54:19 +08:00
mac-coreml-ci-pipeline.yml
mac-ios-ci-pipeline.yml Add ability to register custom ops by specifying a function name (#14177) 2023-01-12 15:11:34 +10:00
mac-ios-packaging-pipeline.yml Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
mac-objc-static-analysis-ci-pipeline.yml Objective-C static analysis - use different llvm path to try to find clang-tidy. (#13280) 2022-10-12 10:16:26 -07:00
mac-react-native-ci-pipeline.yml
npm-packaging-pipeline.yml
orttraining-linux-ci-pipeline.yml increase the time limit as more unit tests added (#14327) 2023-01-18 15:51:21 -08:00
orttraining-linux-external-custom-ops.yml
orttraining-linux-gpu-amd-e2e-test-ci-pipeline.yml [ROCm] Fix azcopy issue on ROCm ci pipeline (#13365) 2022-10-20 12:08:57 +08:00
orttraining-linux-gpu-ci-pipeline.yml Update torch to 1.13.1 in CI and packaging pipelines for ort training (#14055) 2023-01-03 20:03:33 -08:00
orttraining-linux-gpu-distributed-e2e-test-pipeline.yml
orttraining-linux-gpu-docker-release-pipeline.yml
orttraining-linux-gpu-ortmodule-distributed-test-ci-pipeline.yml Introduce collective ops to ort inference build (#14399) 2023-02-07 13:47:48 -08:00
orttraining-linux-gpu-ortmodule-test-clear-cache-pipeline.yml
orttraining-linux-gpu-training-apis.yml Refactor training build options (#13964) 2023-01-03 13:28:16 -08:00
orttraining-linux-nightly-ortmodule-test-pipeline.yml Fix onnxruntime-CI-nightly-ort-pipeline Failure (#14464) 2023-01-28 16:05:56 +08:00
orttraining-mac-ci-pipeline.yml Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
orttraining-pai-ci-pipeline.yml Enable ccache for HIP objects (#14465) 2023-01-28 22:34:24 +08:00
orttraining-py-packaging-pipeline-cpu.yml Add support for python 3.10 for onnxruntime-training cuda and cpu (#14100) 2023-02-02 11:32:41 -08:00
orttraining-py-packaging-pipeline-cuda116.yml Create dedicated build for training api (#14136) 2023-01-10 20:58:04 -08:00
orttraining-py-packaging-pipeline-rocm.yml [ROCm] Add ROCm5.4 to python package pipeline (#14012) 2022-12-22 10:01:40 +08:00
post-merge-jobs.yml Android package custom build script update (#14403) 2023-01-25 09:19:05 -08:00
py-package-build-pipeline.yml Fix OLive build pipeline (#13114) 2022-09-27 10:19:58 -07:00
py-package-test-pipeline.yml
py-packaging-pipeline.yml CloudEP (#13855) 2023-01-03 10:03:15 -08:00
python-checks-ci-pipeline.yml
sign_ov_ep_binaries.yml Move build machines with Nvidia M60 GPUs to Nvidia T4 (#13170) 2022-10-25 11:21:13 -07:00
snpe-ep-nuget-packaging-pipeline.yml Add yml file for Snpe EP build (#13494) 2022-10-28 19:47:50 -07:00
web-ci-pipeline.yml
web-packaging-pipeline.yml
win-ci-fuzz-testing.yml Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
win-ci-pipeline.yml Remove intermedia obj files once build finished (#14361) 2023-01-20 13:37:15 +08:00
win-eager-ci-pipeline.yml
win-gpu-ci-pipeline.yml Enable cache for msbuild (#14085) 2023-01-06 11:19:57 +08:00
win-gpu-reduce-op-ci-pipeline.yml Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
win-gpu-tensorrt-ci-pipeline.yml [TensorRT EP] support TensorRT 8.5 (#13867) 2022-12-14 13:06:03 -08:00