onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-20 19:12:24 +00:00

History

Yi Zhang 0d1da41ca8 Fix docker image layer caching to avoid redundant docker building and transient connection exceptions. (#21612)	2024-08-06 21:37:09 +08:00

### Description

Improve the docker commands so that docker image layer caching works. This makes docker builds faster and more stable. So far, the A100 pool's system disk is too small to use the docker cache. We will no longer use the pipeline cache for docker images, and some legacy code is removed.

### Motivation and Context

Builds often fail with an exception like:

```
64.58 + curl https://nodejs.org/dist/v18.17.1/node-v18.17.1-linux-x64.tar.gz -sSL --retry 5 --retry-delay 30 --create-dirs -o /tmp/src/node-v18.17.1-linux-x64.tar.gz --fail
286.4 curl: (92) HTTP/2 stream 0 was not closed cleanly: INTERNAL_ERROR (err 2)
```

This happens because the onnxruntime pipelines have been sending too many requests to download Node.js during docker builds, which is currently the major cause of pipeline failures.

In fact, docker image layer caching has never worked. We can always see that the scripts are still running:

```
#9 [3/5] RUN cd /tmp/scripts && /tmp/scripts/install_centos.sh && /tmp/scripts/install_deps.sh && rm -rf /tmp/scripts
#9 0.234 /bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
#9 0.235 /bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
#9 0.235 /tmp/scripts/install_centos.sh: line 1: !/bin/bash: No such file or directory
#9 0.235 ++ '[' '!' -f /etc/yum.repos.d/microsoft-prod.repo ']'
#9 0.236 +++ tr -dc 0-9.
#9 0.236 +++ cut -d . -f1
#9 0.238 ++ os_major_version=8
....
#9 60.41 + curl https://nodejs.org/dist/v18.17.1/node-v18.17.1-linux-x64.tar.gz -sSL --retry 5 --retry-delay 30 --create-dirs -o /tmp/src/node-v18.17.1-linux-x64.tar.gz --fail
#9 60.59 + return 0
...
```

This PR improves the docker command so that image layer caching works. As a result, CI no longer sends so many redundant requests to download Node.js:

```
#9 [2/5] ADD scripts /tmp/scripts
#9 CACHED
#10 [3/5] RUN cd /tmp/scripts && /tmp/scripts/install_centos.sh && /tmp/scripts/install_deps.sh && rm -rf /tmp/scripts
#10 CACHED
#11 [4/5] RUN adduser --uid 1000 onnxruntimedev
#11 CACHED
#12 [5/5] WORKDIR /home/onnxruntimedev
#12 CACHED
```

### Reference

https://docs.docker.com/build/drivers/

---------

Co-authored-by: Yi Zhang <your@email.com>
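
The difference between the two build logs in the commit message (steps that re-run versus steps marked `CACHED`) can be checked mechanically. The following is a minimal sketch, not part of the PR: the function name `cached_steps` and the two sample constants are hypothetical, and it assumes the log follows the `#<id> [<i>/<n>] ...` and `#<id> CACHED` line shapes shown in the excerpts.

```python
import re

# Line shapes assumed from the log excerpts in the commit message.
STEP_RE = re.compile(r"^#(\d+) \[(\d+)/(\d+)\] ")   # e.g. "#9 [3/5] RUN ..."
CACHED_RE = re.compile(r"^#(\d+) CACHED$")          # e.g. "#9 CACHED"


def cached_steps(log_text):
    """Map each step label (e.g. '3/5') to True if the log marks it CACHED."""
    labels = {}  # log id ("9") -> step label ("3/5")
    cached = {}  # step label -> whether a CACHED line was seen for it
    for raw in log_text.splitlines():
        line = raw.strip()
        step = STEP_RE.match(line)
        if step:
            log_id, index, total = step.groups()
            labels[log_id] = f"{index}/{total}"
            cached.setdefault(labels[log_id], False)
            continue
        hit = CACHED_RE.match(line)
        if hit and hit.group(1) in labels:
            cached[labels[hit.group(1)]] = True
    return cached


# Hypothetical sample inputs, copied from the excerpts in the commit message.
UNCACHED_LOG = """\
#9 [3/5] RUN cd /tmp/scripts && /tmp/scripts/install_centos.sh && /tmp/scripts/install_deps.sh && rm -rf /tmp/scripts
#9 0.234 /bin/sh: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
#9 60.59 + return 0
"""

CACHED_LOG = """\
#9 [2/5] ADD scripts /tmp/scripts
#9 CACHED
#10 [3/5] RUN cd /tmp/scripts && /tmp/scripts/install_centos.sh && /tmp/scripts/install_deps.sh && rm -rf /tmp/scripts
#10 CACHED
#11 [4/5] RUN adduser --uid 1000 onnxruntimedev
#11 CACHED
#12 [5/5] WORKDIR /home/onnxruntimedev
#12 CACHED
"""
```

Calling `cached_steps(UNCACHED_LOG)` reports step `3/5` as not cached, while `cached_steps(CACHED_LOG)` reports all four listed steps as cached. A check like this could flag a pipeline run in which the install scripts, and therefore the Node.js download, executed again.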
nodejs/templates	Adding Job names to jobs without a name (#20961 )	2024-06-06 19:09:21 -07:00
nuget/templates	[TensorRT EP] support TensorRT 10.2-GA (#21395 )	2024-07-18 12:11:52 -07:00
stages	Set CUDA12 as default in GPU packages (#21438 )	2024-07-25 10:17:16 -07:00
templates	Fix docker image layer caching to avoid redundant docker building and transient connection exceptions. (#21612 )	2024-08-06 21:37:09 +08:00
triggers	Pr trggiers generated by code (#17247 )	2023-08-30 05:57:03 +08:00
android-arm64-v8a-QNN-crosscompile-ci-pipeline.yml	Update QNN pipeline pool (#21482 )	2024-07-29 10:00:21 -07:00
android-x86_64-crosscompile-ci-pipeline.yml	Fix Android CI Pipeline code coverage failure (#21504 )	2024-07-26 07:36:23 +10:00
bigmodels-ci-pipeline.yml	Fix docker image layer caching to avoid redundant docker building and transient connection exceptions. (#21612 )	2024-08-06 21:37:09 +08:00
binary-size-checks-pipeline.yml	Clean up some mobile package related files and their usages. (#21606 )	2024-08-05 16:38:20 -07:00
build-perf-test-binaries-pipeline.yml	Upgrade Ubuntu machine pool from 20.04 to 22.04 (#19117 )	2024-01-16 17:25:18 -08:00
c-api-noopenmp-packaging-pipelines.yml	Split ondevice training cpu packaging pipeline to a separated pipeline (#21485 )	2024-07-25 10:58:34 -07:00
c-api-training-packaging-pipelines.yml	Move on-device training packages publish step (#21539 )	2024-07-29 09:59:46 -07:00
clean-build-docker-image-cache-pipeline.yml	Upgrade Ubuntu machine pool from 20.04 to 22.04 (#19117 )	2024-01-16 17:25:18 -08:00
cuda-packaging-pipeline.yml	[TensorRT EP] support TensorRT 10.2-GA (#21395 )	2024-07-18 12:11:52 -07:00
linux-ci-pipeline.yml	Update training packaging pipeline's docker files (#20853 )	2024-05-30 23:48:42 -07:00
linux-cpu-aten-pipeline.yml	Update Aten pipeline's docker file to use UBI8 (#20856 )	2024-05-30 07:38:15 -07:00
linux-cpu-eager-pipeline.yml	Update Aten pipeline's docker file to use UBI8 (#20856 )	2024-05-30 07:38:15 -07:00
linux-cpu-minimal-build-ci-pipeline.yml	Update training packaging pipeline's docker files (#20853 )	2024-05-30 23:48:42 -07:00
linux-dnnl-ci-pipeline.yml	Update training packaging pipeline's docker files (#20853 )	2024-05-30 23:48:42 -07:00
linux-gpu-ci-pipeline.yml	Set CUDA12 as default in GPU packages (#21438 )	2024-07-25 10:17:16 -07:00
linux-gpu-tensorrt-ci-pipeline.yml	Set CUDA12 as default in GPU packages (#21438 )	2024-07-25 10:17:16 -07:00
linux-gpu-tensorrt-daily-perf-pipeline.yml	Set CUDA12 as default in GPU packages (#21438 )	2024-07-25 10:17:16 -07:00
linux-migraphx-ci-pipeline.yml	change ci docker image to rocm6.1 (#21296 )	2024-07-18 14:50:01 +08:00
linux-openvino-ci-pipeline.yml	Update OpenVino CI Ubuntu to 22.04 (#21127 )	2024-07-09 09:56:44 -07:00
linux-qnn-ci-pipeline.yml	[QNN EP] Update to QNN SDK 2.24.0 (#21463 )	2024-07-24 10:17:12 -07:00
mac-ci-pipeline.yml	Delete pyop (#21094 )	2024-06-19 16:21:33 -07:00
mac-coreml-ci-pipeline.yml	Switch a portion of CI/packaging jobs to MacOS12 (#19908 )	2024-03-19 14:54:58 -07:00
mac-ios-ci-pipeline.yml	Upgrade min ios version to 13.0 (#20773 )	2024-06-04 10:15:20 -07:00
mac-ios-packaging-pipeline.yml	Upgrade min ios version to 13.0 (#20773 )	2024-06-04 10:15:20 -07:00
mac-react-native-ci-pipeline.yml	Address React Native pipeline component detection timeout (#20871 )	2024-05-30 16:37:03 -07:00
npm-packaging-pipeline.yml	Increase NPM ComponentDetection.Timeout: 1200 (#20681 )	2024-05-15 13:41:59 -07:00
nuget-cuda-publishing-pipeline.yml	Set CUDA12 as default in GPU packages (#21438 )	2024-07-25 10:17:16 -07:00
orttraining-linux-ci-pipeline.yml	Remove manylinux build scripts from python packaging pipeline (#20786 )	2024-05-24 08:18:22 -07:00
orttraining-linux-gpu-ci-pipeline.yml	Update nodejs to 18.x (#17657 )	2023-09-25 14:12:11 -07:00
orttraining-linux-gpu-ortmodule-distributed-test-ci-pipeline.yml	custom allreduce cuda kernel (#20703 )	2024-06-13 11:09:49 -07:00
orttraining-linux-nightly-ortmodule-test-pipeline.yml	ORTModule memory improvement (#18924 )	2024-01-16 08:57:37 +08:00
orttraining-mac-ci-pipeline.yml	Pr trggiers generated by code (#17247 )	2023-08-30 05:57:03 +08:00
orttraining-pai-ci-pipeline.yml	Replace inline pip install with pip install from requirements*.txt (#21106 )	2024-07-22 12:39:10 -07:00
orttraining-py-packaging-pipeline-cpu.yml	disables qnn in ort training cpu pipeline (#21510 )	2024-07-26 17:23:35 +08:00
orttraining-py-packaging-pipeline-cuda.yml	Update training packaging pipeline's docker files (#20853 )	2024-05-30 23:48:42 -07:00
orttraining-py-packaging-pipeline-cuda12.yml	Update training packaging pipeline's docker files (#20853 )	2024-05-30 23:48:42 -07:00
orttraining-py-packaging-pipeline-rocm.yml	[ROCm] Update ck to use ck_tile (#21030 )	2024-06-19 14:06:10 +08:00
post-merge-jobs.yml	[TensorRT EP] support TensorRT 10.2-GA (#21395 )	2024-07-18 12:11:52 -07:00
publish-nuget.yml	Move on-device training packages publish step (#21539 )	2024-07-29 09:59:46 -07:00
py-cuda-package-test-pipeline.yml	Adding new pipeline for python cuda testing (#18718 )	2023-12-18 18:13:03 -08:00
py-cuda-packaging-pipeline.yml	Remove manylinux build scripts from python packaging pipeline (#20786 )	2024-05-24 08:18:22 -07:00
py-cuda-publishing-pipeline.yml	Set CUDA12 as default in GPU packages (#21438 )	2024-07-25 10:17:16 -07:00
py-package-build-pipeline.yml	OpenVINO EP Rel 1.18 Changes (#20337 )	2024-04-19 00:31:38 -07:00
py-package-test-pipeline.yml	[TensorRT EP] support TensorRT 10.2-GA (#21395 )	2024-07-18 12:11:52 -07:00
py-packaging-pipeline.yml	[QNN EP] Update to QNN SDK 2.24.0 (#21463 )	2024-07-24 10:17:12 -07:00
qnn-ep-nuget-packaging-pipeline.yml	[QNN EP] Update to QNN SDK 2.24.0 (#21463 )	2024-07-24 10:17:12 -07:00
rocm-nuget-packaging-pipeline.yml	Make ROCm packaging stages to a single workflow (#21235 )	2024-07-04 11:07:04 +08:00
web-ci-pipeline.yml	Fix typos according to reviewdog report. (#21335 )	2024-07-22 13:37:32 -07:00
win-ci-fuzz-testing.yml	Uppdate nuget to Use Nuget 6.10.x (#21209 )	2024-06-28 19:49:54 -07:00
win-ci-pipeline.yml	add vitisai ep build stage to Windows CPU Pipeline (#21361 )	2024-07-15 19:34:08 -07:00
win-gpu-cuda-ci-pipeline.yml	Separating all GPU stages into different Pipelines (#21521 )	2024-07-26 14:54:45 -07:00
win-gpu-dml-ci-pipeline.yml	Separating all GPU stages into different Pipelines (#21521 )	2024-07-26 14:54:45 -07:00
win-gpu-doc-gen-ci-pipeline.yml	Separating all GPU stages into different Pipelines (#21521 )	2024-07-26 14:54:45 -07:00
win-gpu-reduce-op-ci-pipeline.yml	Move jobs in onnxruntime-Win2022-GPU-T4 machine pool to onnxruntime-Win2022-GPU-A10 (#21023 )	2024-06-12 22:04:40 -07:00
win-gpu-tensorrt-ci-pipeline.yml	Set CUDA12 as default in GPU packages (#21438 )	2024-07-25 10:17:16 -07:00
win-gpu-training-ci-pipeline.yml	Separating all GPU stages into different Pipelines (#21521 )	2024-07-26 14:54:45 -07:00
win-qnn-arm64-ci-pipeline.yml	[QNN EP] Update to QNN SDK 2.24.0 (#21463 )	2024-07-24 10:17:12 -07:00
win-qnn-ci-pipeline.yml	[QNN EP] Update to QNN SDK 2.24.0 (#21463 )	2024-07-24 10:17:12 -07:00