onnxruntime/tools/ci_build/github/azure-pipelines
Changming Sun 57dfd15d7b
Remove dnf update from docker build scripts (#17551)
### Description
1. Remove 'dnf update' from docker build scripts, because it upgrades TRT
packages from CUDA 11.x to CUDA 12.x.
To reproduce it, you can run the following commands in a CentOS CUDA
11.x docker image such as nvidia/cuda:11.8.0-cudnn8-devel-ubi8.
```
export v=8.6.1.6-1.cuda11.8
dnf install -y libnvinfer8-${v} libnvparsers8-${v} libnvonnxparsers8-${v} libnvinfer-plugin8-${v} libnvinfer-vc-plugin8-${v} libnvinfer-devel-${v} libnvparsers-devel-${v} libnvonnxparsers-devel-${v} libnvinfer-plugin-devel-${v} libnvinfer-vc-plugin-devel-${v} libnvinfer-headers-devel-${v} libnvinfer-headers-plugin-devel-${v}
dnf update -y
```
The last command produces the following output:
```
========================================================================================================================
 Package                                     Architecture       Version                          Repository        Size
========================================================================================================================
Upgrading:
 libnvinfer-devel                            x86_64             8.6.1.6-1.cuda12.0               cuda             542 M
 libnvinfer-headers-devel                    x86_64             8.6.1.6-1.cuda12.0               cuda             118 k
 libnvinfer-headers-plugin-devel             x86_64             8.6.1.6-1.cuda12.0               cuda              14 k
 libnvinfer-plugin-devel                     x86_64             8.6.1.6-1.cuda12.0               cuda              13 M
 libnvinfer-plugin8                          x86_64             8.6.1.6-1.cuda12.0               cuda              13 M
 libnvinfer-vc-plugin-devel                  x86_64             8.6.1.6-1.cuda12.0               cuda             107 k
 libnvinfer-vc-plugin8                       x86_64             8.6.1.6-1.cuda12.0               cuda             251 k
 libnvinfer8                                 x86_64             8.6.1.6-1.cuda12.0               cuda             543 M
 libnvonnxparsers-devel                      x86_64             8.6.1.6-1.cuda12.0               cuda             467 k
 libnvonnxparsers8                           x86_64             8.6.1.6-1.cuda12.0               cuda             757 k
 libnvparsers-devel                          x86_64             8.6.1.6-1.cuda12.0               cuda             2.0 M
 libnvparsers8                               x86_64             8.6.1.6-1.cuda12.0               cuda             854 k
Installing dependencies:
 cuda-toolkit-12-0-config-common             noarch             12.0.146-1                       cuda             7.7 k
 cuda-toolkit-12-config-common               noarch             12.2.140-1                       cuda             7.9 k
 libcublas-12-0                              x86_64             12.0.2.224-1                     cuda             361 M
 libcublas-devel-12-0                        x86_64             12.0.2.224-1                     cuda             397 M

Transaction Summary
========================================================================================================================

```
As the output shows, the upgraded packages are CUDA 12 builds.
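
A check along the following lines could catch such an upgrade during the image build. This is only a sketch: `check_trt_cuda_suffix` is a hypothetical helper that is not part of the existing scripts, and it assumes the installed versions are fed to it as `name version` pairs, for example from an `rpm -qa` query.

```bash
#!/usr/bin/env bash
# Hypothetical guard (not part of the existing build scripts).
# Reads "name version" pairs on stdin, for example from:
#   rpm -qa --queryformat '%{NAME} %{VERSION}-%{RELEASE}\n' 'libnv*'
# and fails if any version does not end with the expected CUDA suffix.
check_trt_cuda_suffix() {
    local expected="$1"   # e.g. "cuda11.8"
    local name version bad=0
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while read -r name version || [ -n "$name" ]; do
        [ -z "$name" ] && continue
        case "$version" in
            *".${expected}") ;;   # suffix matches, nothing to do
            *) echo "mismatch: ${name} ${version}" >&2; bad=1 ;;
        esac
    done
    return "$bad"
}
```

Piping the package list into `check_trt_cuda_suffix cuda11.8` right after the install step would make the build fail as soon as a `cuda12.0` package appears.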

The problem can also be solved by locking the package versions with the
"dnf versionlock" command right after installing the CUDA/TRT packages.
However, going forward, for better reproducibility, I suggest
pinning dnf package versions manually in the installation scripts, as we
do for TRT now.

```bash
v="8.6.1.6-1.cuda11.8" &&\
    yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo &&\
    yum -y install libnvinfer8-${v} libnvparsers8-${v} libnvonnxparsers8-${v} libnvinfer-plugin8-${v} libnvinfer-vc-plugin8-${v}\
        libnvinfer-devel-${v} libnvparsers-devel-${v} libnvonnxparsers-devel-${v} libnvinfer-plugin-devel-${v} libnvinfer-vc-plugin-devel-${v} libnvinfer-headers-devel-${v}  libnvinfer-headers-plugin-devel-${v}
```
When we need to upgrade a package because of a security alert or for
some other reason, we change the version string manually instead of
relying on "dnf update". Although this approach takes more effort, it
makes our pipelines more stable.
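
For comparison, the "dnf versionlock" alternative mentioned earlier might look roughly like the sketch below. It assumes the versionlock plugin is available as `python3-dnf-plugin-versionlock` in the image's repositories. `lock_trt_packages` and the `DNF` override are hypothetical names introduced here for illustration, and the package globs are only an example selection.

```bash
#!/usr/bin/env bash
# Sketch of the "dnf versionlock" alternative; not used by the current scripts.
# Assumption: the plugin package is named python3-dnf-plugin-versionlock.
# DNF can be overridden (e.g. DNF=echo) to preview the commands without running them.
lock_trt_packages() {
    local dnf="${DNF:-dnf}"
    # Install the plugin that provides the "versionlock" subcommand.
    "$dnf" install -y python3-dnf-plugin-versionlock || return 1
    # Pin the currently installed versions of every package matching the globs.
    "$dnf" versionlock add 'libnvinfer*' 'libnvparsers*' 'libnvonnxparsers*' || return 1
}
```

Calling `lock_trt_packages` right after the TRT install step would keep a later `dnf update` from replacing the locked packages, but the pinned versions would then live in the image state rather than in the script, which is why explicit version strings are preferred here.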

2. Move python test to docker
### Motivation and Context
Right now the nightly GPU package mixes CUDA 11.x and CUDA 12.x,
and the resulting package is unusable (it crashes every time).
2023-09-21 07:33:29 -07:00
nodejs/templates
nuget/templates Run Nuget_Test_Linux_GPU in container (#17452) 2023-09-08 13:41:20 +08:00
templates Remove dnf update from docker build scripts (#17551) 2023-09-21 07:33:29 -07:00
triggers Pr trggiers generated by code (#17247) 2023-08-30 05:57:03 +08:00
android-arm64-v8a-QNN-crosscompile-ci-pipeline.yml [QNN EP] Update QNN SDK to version 2.14.1 (#17467) 2023-09-11 21:07:50 -07:00
android-x86_64-crosscompile-ci-pipeline.yml Update C/C++ dependencies: abseil, date, nsync, googletest, wil, mp11, cpuinfo and safeint (#15470) 2023-09-08 13:35:04 -07:00
binary-size-checks-pipeline.yml
build-perf-test-binaries-pipeline.yml
c-api-noopenmp-packaging-pipelines.yml Run Final_Jar_Testing_Linux_GPU in docker (#17533) 2023-09-15 08:35:55 -07:00
clean-build-docker-image-cache-pipeline.yml
linux-ci-pipeline.yml Remove dnf update from docker build scripts (#17551) 2023-09-21 07:33:29 -07:00
linux-cpu-aten-pipeline.yml Pr trggiers generated by code (#17247) 2023-08-30 05:57:03 +08:00
linux-cpu-eager-pipeline.yml Pr trggiers generated by code (#17247) 2023-08-30 05:57:03 +08:00
linux-cpu-minimal-build-ci-pipeline.yml Refine build script for adding disable selected data types option (#17284) 2023-08-31 13:32:55 -07:00
linux-dnnl-ci-pipeline.yml Upgrade Centos7 to Alamlinux8 (#16907) 2023-08-29 21:05:36 -07:00
linux-gpu-ci-pipeline.yml Update cmake to 3.27 and upgrade Linux CUDA docker files from CentOS7 to UBI8 (#16856) 2023-09-05 18:12:10 -07:00
linux-gpu-tensorrt-ci-pipeline.yml Fix Bug: Step failed but not exited with error (#17442) 2023-09-07 14:33:31 +08:00
linux-gpu-tensorrt-daily-perf-pipeline.yml
linux-migraphx-ci-pipeline.yml Pr trggiers generated by code (#17247) 2023-08-30 05:57:03 +08:00
linux-multi-gpu-tensorrt-ci-pipeline.yml Pr trggiers generated by code (#17247) 2023-08-30 05:57:03 +08:00
linux-openvino-ci-pipeline.yml Pr trggiers generated by code (#17247) 2023-08-30 05:57:03 +08:00
linux-qnn-ci-pipeline.yml [QNN EP] Update QNN SDK to version 2.14.1 (#17467) 2023-09-11 21:07:50 -07:00
mac-ci-pipeline.yml Pr trggiers generated by code (#17247) 2023-08-30 05:57:03 +08:00
mac-coreml-ci-pipeline.yml Pr trggiers generated by code (#17247) 2023-08-30 05:57:03 +08:00
mac-ios-ci-pipeline.yml Update C/C++ dependencies: abseil, date, nsync, googletest, wil, mp11, cpuinfo and safeint (#15470) 2023-09-08 13:35:04 -07:00
mac-ios-packaging-pipeline.yml Pr trggiers generated by code (#17247) 2023-08-30 05:57:03 +08:00
mac-objc-static-analysis-ci-pipeline.yml
mac-react-native-ci-pipeline.yml Pr trggiers generated by code (#17247) 2023-08-30 05:57:03 +08:00
npm-packaging-pipeline.yml Update npm-packaging-pipeline.yml to always use artifacts from main branch (#17604) 2023-09-19 14:42:08 -07:00
orttraining-linux-ci-pipeline.yml Update cmake to 3.27 and upgrade Linux CUDA docker files from CentOS7 to UBI8 (#16856) 2023-09-05 18:12:10 -07:00
orttraining-linux-gpu-ci-pipeline.yml Updates to training pipelines (#17292) 2023-09-08 11:57:12 -07:00
orttraining-linux-gpu-ortmodule-distributed-test-ci-pipeline.yml Pr trggiers generated by code (#17247) 2023-08-30 05:57:03 +08:00
orttraining-linux-nightly-ortmodule-test-pipeline.yml update acpt image for the training ci nightly (#17521) 2023-09-12 22:32:20 -07:00
orttraining-mac-ci-pipeline.yml Pr trggiers generated by code (#17247) 2023-08-30 05:57:03 +08:00
orttraining-pai-ci-pipeline.yml [ROCm] add manylinux build test for ROCm CI (#17621) 2023-09-21 10:45:16 +08:00
orttraining-py-packaging-pipeline-cpu.yml Upgrade Centos7 to Alamlinux8 (#16907) 2023-08-29 21:05:36 -07:00
orttraining-py-packaging-pipeline-cuda.yml Updates to training pipelines (#17292) 2023-09-08 11:57:12 -07:00
orttraining-py-packaging-pipeline-rocm.yml
post-merge-jobs.yml Delete all Prefast tasks (#17522) 2023-09-12 17:40:49 -07:00
py-package-build-pipeline.yml
py-package-test-pipeline.yml Remove dnf update from docker build scripts (#17551) 2023-09-21 07:33:29 -07:00
py-packaging-pipeline.yml
qnn-ep-nuget-packaging-pipeline.yml [QNN EP] Update QNN SDK to version 2.14.1 (#17467) 2023-09-11 21:07:50 -07:00
web-ci-pipeline.yml [web] a few updates to web pipeline (#17485) 2023-09-11 11:43:42 -07:00
win-ci-fuzz-testing.yml
win-ci-pipeline.yml Delete all Prefast tasks (#17522) 2023-09-12 17:40:49 -07:00
win-gpu-ci-pipeline.yml Delete all Prefast tasks (#17522) 2023-09-12 17:40:49 -07:00
win-gpu-reduce-op-ci-pipeline.yml
win-gpu-tensorrt-ci-pipeline.yml Pr trggiers generated by code (#17247) 2023-08-30 05:57:03 +08:00
win-qnn-arm64-ci-pipeline.yml [QNN EP] Update QNN SDK to version 2.14.1 (#17467) 2023-09-11 21:07:50 -07:00
win-qnn-ci-pipeline.yml Improve Win QNNEP pipeline (#17586) 2023-09-19 07:36:17 +08:00