onnxruntime/tools/ci_build/github/azure-pipelines/templates
Changming Sun 57dfd15d7b
Remove dnf update from docker build scripts (#17551)
### Description
1. Remove 'dnf update' from docker build scripts, because it upgrades TRT
packages from CUDA 11.x to CUDA 12.x.
To reproduce it, you can run the following commands in a CentOS CUDA
11.x docker image such as nvidia/cuda:11.8.0-cudnn8-devel-ubi8.
```
export v=8.6.1.6-1.cuda11.8
dnf install -y libnvinfer8-${v} libnvparsers8-${v} libnvonnxparsers8-${v} libnvinfer-plugin8-${v} libnvinfer-vc-plugin8-${v} libnvinfer-devel-${v} libnvparsers-devel-${v} libnvonnxparsers-devel-${v} libnvinfer-plugin-devel-${v} libnvinfer-vc-plugin-devel-${v} libnvinfer-headers-devel-${v} libnvinfer-headers-plugin-devel-${v}
dnf update -y
```
The last command produces the following output:
```
========================================================================================================================
 Package                                     Architecture       Version                          Repository        Size
========================================================================================================================
Upgrading:
 libnvinfer-devel                            x86_64             8.6.1.6-1.cuda12.0               cuda             542 M
 libnvinfer-headers-devel                    x86_64             8.6.1.6-1.cuda12.0               cuda             118 k
 libnvinfer-headers-plugin-devel             x86_64             8.6.1.6-1.cuda12.0               cuda              14 k
 libnvinfer-plugin-devel                     x86_64             8.6.1.6-1.cuda12.0               cuda              13 M
 libnvinfer-plugin8                          x86_64             8.6.1.6-1.cuda12.0               cuda              13 M
 libnvinfer-vc-plugin-devel                  x86_64             8.6.1.6-1.cuda12.0               cuda             107 k
 libnvinfer-vc-plugin8                       x86_64             8.6.1.6-1.cuda12.0               cuda             251 k
 libnvinfer8                                 x86_64             8.6.1.6-1.cuda12.0               cuda             543 M
 libnvonnxparsers-devel                      x86_64             8.6.1.6-1.cuda12.0               cuda             467 k
 libnvonnxparsers8                           x86_64             8.6.1.6-1.cuda12.0               cuda             757 k
 libnvparsers-devel                          x86_64             8.6.1.6-1.cuda12.0               cuda             2.0 M
 libnvparsers8                               x86_64             8.6.1.6-1.cuda12.0               cuda             854 k
Installing dependencies:
 cuda-toolkit-12-0-config-common             noarch             12.0.146-1                       cuda             7.7 k
 cuda-toolkit-12-config-common               noarch             12.2.140-1                       cuda             7.9 k
 libcublas-12-0                              x86_64             12.0.2.224-1                     cuda             361 M
 libcublas-devel-12-0                        x86_64             12.0.2.224-1                     cuda             397 M

Transaction Summary
========================================================================================================================

```
As the output shows, the upgraded packages are all CUDA 12 packages.
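
A check like the one below could catch this kind of drift after an image is built. It is only a sketch: `find_cuda_mismatches` is a hypothetical helper, not part of the build scripts, and it assumes the CUDA variant is encoded as the final component of the package release string, as in the output above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): reads "name version-release"
# lines on stdin and prints every package whose version-release does not end
# with the expected CUDA tag.
find_cuda_mismatches() {
    local expected="$1"   # for example "cuda11.8"
    local name ver
    while read -r name ver; do
        # Skip blank lines.
        [[ -z "$name" ]] && continue
        if [[ "$ver" != *".${expected}" ]]; then
            printf '%s %s\n' "$name" "$ver"
        fi
    done
}

# On a real image the input could come from rpm, for example:
#   rpm -qa --queryformat '%{NAME} %{VERSION}-%{RELEASE}\n' 'libnv*' \
#       | find_cuda_mismatches cuda11.8
```

Any line printed by the function names a package that no longer matches the CUDA version the image is supposed to use.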

The problem can also be solved by locking the package versions with the
"dnf versionlock" command right after installing the CUDA/TRT packages.
However, going forward, to get better reproducibility, I suggest
manually pinning dnf package versions in the installation scripts, as we
already do for TRT:

```bash
v="8.6.1.6-1.cuda11.8" &&\
    yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo &&\
    yum -y install libnvinfer8-${v} libnvparsers8-${v} libnvonnxparsers8-${v} libnvinfer-plugin8-${v} libnvinfer-vc-plugin8-${v}\
        libnvinfer-devel-${v} libnvparsers-devel-${v} libnvonnxparsers-devel-${v} libnvinfer-plugin-devel-${v} libnvinfer-vc-plugin-devel-${v} libnvinfer-headers-devel-${v}  libnvinfer-headers-plugin-devel-${v}
```
When we need to upgrade a package because of a security alert or for some
other reason, we manually change the version string instead of relying
on "dnf update". Though this approach takes more effort, it makes our
pipelines more stable.
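
The two options discussed above (pinning versions in the install command, and locking them afterwards) can be sketched together as follows. This is an illustration rather than the actual script: `pinned_specs` is a hypothetical helper, and the commented-out commands assume that the CUDA repository is already configured and that the versionlock plugin is available as `python3-dnf-plugin-versionlock`, as on RHEL 8 style images.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints one "name-version" spec per package name,
# so a single version string is applied consistently to every package.
pinned_specs() {
    local version="$1"
    shift
    local pkg
    for pkg in "$@"; do
        printf '%s-%s\n' "$pkg" "$version"
    done
}

# Possible use inside a docker build step (assumptions noted above):
#   v="8.6.1.6-1.cuda11.8"
#   dnf install -y $(pinned_specs "$v" libnvinfer8 libnvinfer-devel)
#
# Optional extra guard: lock the installed versions so that a later
# "dnf update" cannot replace them.
#   dnf install -y python3-dnf-plugin-versionlock
#   dnf versionlock add 'libnvinfer*'
```

Upgrading a package then means changing the single version string and rebuilding the image.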

2. Move python test to docker
### Motivation and Context
Right now the nightly GPU package mixes CUDA 11.x and CUDA 12.x,
and the resulting package is unusable (it crashes every time).
2023-09-21 07:33:29 -07:00
jobs Improve Win QNNEP pipeline (#17586) 2023-09-19 07:36:17 +08:00
stages Use name of temporary provisioning profile. (#17459) 2023-09-12 10:56:35 -07:00
android-binary-size-check-stage.yml Upgrade Centos7 to Alamlinux8 (#16907) 2023-08-29 21:05:36 -07:00
android-dump-logs-from-steps.yml
android-java-api-aar-test.yml
android-java-api-aar.yml Upgrade Centos7 to Alamlinux8 (#16907) 2023-08-29 21:05:36 -07:00
build-linux-wasm-step.yml
c-api-artifacts-package-and-publish-steps-posix.yml
c-api-artifacts-package-and-publish-steps-windows.yml
c-api-cpu.yml Revert the yaml file changes in "Nodejs_Packaging_CPU" build job (#17441) 2023-09-06 20:20:55 -07:00
c-api-linux-cpu.yml Remove dnf update from docker build scripts (#17551) 2023-09-21 07:33:29 -07:00
check-cache-stats.yml
clean-agent-build-directory-step.yml Build nuget pkg for ROCm (#16791) 2023-08-28 13:35:08 +08:00
common-variables.yml Update cmake to 3.27 and upgrade Linux CUDA docker files from CentOS7 to UBI8 (#16856) 2023-09-05 18:12:10 -07:00
compliance.yml Delete all Prefast tasks (#17522) 2023-09-12 17:40:49 -07:00
component-governance-component-detection-steps.yml Move DML build job's Prefast task to a CPU machine pool (#17192) 2023-08-17 13:16:29 -07:00
download-deps.yml Update C/C++ dependencies: abseil, date, nsync, googletest, wil, mp11, cpuinfo and safeint (#15470) 2023-09-08 13:35:04 -07:00
esrp_nuget.yml
explicitly-defined-final-tasks.yml
flex-downloadPipelineArtifact.yml Run Final_Jar_Testing_Linux_GPU in docker (#17533) 2023-09-15 08:35:55 -07:00
get-docker-image-steps.yml
install-appcenter.yml
java-api-artifacts-package-and-publish-steps-posix.yml
linux-build-step-with-cache.yml
linux-ci.yml
linux-cpu-packaging-pipeline.yml Update cmake to 3.27 and upgrade Linux CUDA docker files from CentOS7 to UBI8 (#16856) 2023-09-05 18:12:10 -07:00
linux-gpu-tensorrt-packaging-pipeline.yml Update cmake to 3.27 and upgrade Linux CUDA docker files from CentOS7 to UBI8 (#16856) 2023-09-05 18:12:10 -07:00
linux-wasm-ci.yml Add training WASM generation to Web CI pipeline (#17319) 2023-09-08 15:49:47 -07:00
linux-web-init-and-check.yml
mac-build-step-with-cache.yml
mac-cpu-packaging-pipeline.yml
mac-cpu-packaging-steps.yml
mac-cpu-packing-jobs.yml
mac-esrp-dylib.yml
nodejs-artifacts-package-and-publish-steps-posix.yml
nodejs-artifacts-package-and-publish-steps-windows.yml
ondevice-training-cpu-packaging-pipeline.yml
orttraining-linux-gpu-test-ci-pipeline.yml Updates to training pipelines (#17292) 2023-09-08 11:57:12 -07:00
publish-nuget.yml Build nuget pkg for ROCm (#16791) 2023-08-28 13:35:08 +08:00
py-linux-gpu.yml Update cmake to 3.27 and upgrade Linux CUDA docker files from CentOS7 to UBI8 (#16856) 2023-09-05 18:12:10 -07:00
py-linux.yml Update cmake to 3.27 and upgrade Linux CUDA docker files from CentOS7 to UBI8 (#16856) 2023-09-05 18:12:10 -07:00
py-package-smoking-test.yml Remove dnf update from docker build scripts (#17551) 2023-09-21 07:33:29 -07:00
py-packaging-linux-test-cpu.yml Remove dnf update from docker build scripts (#17551) 2023-09-21 07:33:29 -07:00
py-packaging-linux-test-cuda.yml Remove dnf update from docker build scripts (#17551) 2023-09-21 07:33:29 -07:00
py-packaging-selectable-stage.yml
py-packaging-stage.yml Delete all Prefast tasks (#17522) 2023-09-12 17:40:49 -07:00
py-packaging-training-cuda-stage.yml Update cmake to 3.27 and upgrade Linux CUDA docker files from CentOS7 to UBI8 (#16856) 2023-09-05 18:12:10 -07:00
py-win-gpu.yml Delete all Prefast tasks (#17522) 2023-09-12 17:40:49 -07:00
react-native-ci.yml
rocm.yml Fix ROCM's nightly build (#17518) 2023-09-13 08:50:14 -07:00
run-docker-build-steps.yml
set-nightly-build-option-variable-step.yml
set-python-manylinux-variables-step.yml
set-version-number-variables-step.yml
telemetry-steps.yml
upload-code-coverage-data.yml
use-android-ndk.yml
use-xcode-version.yml
validate-package.yml
web-browserstack-ci.yml
web-ci.yml [web] a few updates to web pipeline (#17485) 2023-09-11 11:43:42 -07:00
win-ci.yml Delete all Prefast tasks (#17522) 2023-09-12 17:40:49 -07:00
win-esrp-dll.yml
win-wasm-ci.yml
win-web-ci.yml [js/web] add a test flag to customize chromium flags (#17545) 2023-09-14 10:05:31 -07:00
win-web-multi-browsers.yml
with-container-registry-steps.yml