mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-17 21:10:43 +00:00
### Description
Upgrade Python from 3.9 to 3.10 in the ROCm and MIGraphX docker files and CI
pipelines. Upgrade ROCm to 6.2.3 everywhere except the ROCm CI pipeline
(see the ROCm CI Pipeline section below).
Improvements and upgrades to the ROCm/MIGraphX docker files and pipelines:
* ROCm 6.0/6.1.3 => 6.2.3
* Python 3.9 => 3.10
* Ubuntu 20.04 => 22.04
* Upgrade the ml_dtypes, numpy and scipy packages.
* Fix the "ROCm version from ..." message in CMakeLists.txt so that it
shows the correct file path.
* Exclude some NHWC tests, since the ROCm EP lacks support for NHWC
convolution.
#### ROCm CI Pipeline:
ROCm 6.1.3 is kept in the pipeline for now.
- The pipeline failed after upgrading to ROCm 6.2.3: `HIPBLAS_STATUS_INVALID_VALUE ;
GPU=0 ; hostname=76123b390aed ;
file=/onnxruntime_src/onnxruntime/core/providers/rocm/rocm_execution_provider.cc
; line=170 ; expr=hipblasSetStream(hipblas_handle_, stream);`. This needs
further investigation.
- cupy issues:
(1) cupy currently supports numpy < 1.27 and might not work with numpy 2.x,
so numpy is pinned to 1.26.4 for now.
(2) cupy support for ROCm 6.2 is still in progress:
https://github.com/cupy/cupy/issues/8606.
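
The numpy constraint in item (1) can be checked in a script before cupy is installed. The following is a minimal sketch, not part of the pipeline: `version_lt` is a hypothetical helper, and it assumes GNU `sort` with `-V` (version sort), which Ubuntu ships.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is strictly lower than $2.
# Relies on the version ordering of GNU sort -V.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# The pinned release satisfies "numpy < 1.27"; a 2.x release does not.
if version_lt "1.26.4" "1.27"; then
    echo "numpy 1.26.4 satisfies numpy < 1.27"
fi
if ! version_lt "2.1.0" "1.27"; then
    echo "numpy 2.1.0 does not satisfy numpy < 1.27"
fi
```

In a real image the first argument could come from `python3 -c 'import numpy; print(numpy.__version__)'`.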
Note on miniconda: its bundled libstdc++.so.6 and libgcc_s.so.1 might
conflict with the system ones, so we created symbolic links that point to
the system copies.
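
The linking step can be sketched as a small shell function. This is a sketch under stated assumptions, not the pipeline's actual script: `link_system_lib` and its arguments are hypothetical names, and the paths in the comments assume an x86_64 Ubuntu layout with conda installed under /opt/miniconda.

```shell
#!/bin/sh
# Hypothetical helper: replace a conda environment's copy of a runtime
# library with a symbolic link to the system copy.
#   $1: system library, e.g. /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#   $2: environment lib directory, e.g. /opt/miniconda/envs/<env>/lib
link_system_lib() {
    system_lib="$1"
    env_lib_dir="$2"
    if [ ! -e "$system_lib" ]; then
        echo "missing system library: $system_lib" >&2
        return 1
    fi
    mkdir -p "$env_lib_dir"
    # -f replaces the copy bundled with conda, -s creates a symbolic link.
    ln -sf "$system_lib" "$env_lib_dir/$(basename "$system_lib")"
}
```

It would be called once per library, for example for libstdc++.so.6 and again for libgcc_s.so.1.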
#### MigraphX CI pipeline
The MIGraphX CI does not use cupy, so it can use ROCm 6.2.3 and
numpy 2.x in the pipeline.
#### Other attempts
Other things that I tried, which might help in the future:
* Use a single docker file for both ROCm and MIGraphX:
https://github.com/microsoft/onnxruntime/pull/22478
* Upgrade to Ubuntu 24.04 and Python 3.12, and use venv like
[this](27903e7ff1/tools/ci_build/github/linux/docker/rocm-ci-pipeline-env.Dockerfile).
### Motivation and Context
In the 1.20 release, the ROCm NuGet packaging pipeline will use ROCm 6.2:
https://github.com/microsoft/onnxruntime/pull/22461.
This change upgrades ROCm to 6.2.3 in the CI pipelines for consistency.
83 lines
3 KiB
Docker
# Refer to https://github.com/RadeonOpenCompute/ROCm-docker/blob/master/dev/Dockerfile-ubuntu-22.04-complete
FROM ubuntu:22.04

ARG ROCM_VERSION=6.2.3
ARG AMDGPU_VERSION=${ROCM_VERSION}
ARG APT_PREF='Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600'

CMD ["/bin/bash"]

RUN echo "$APT_PREF" > /etc/apt/preferences.d/rocm-pin-600

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && \
    apt-get install -y --no-install-recommends ca-certificates curl libnuma-dev gnupg && \
    curl -sL https://repo.radeon.com/rocm/rocm.gpg.key | apt-key add - && \
    printf "deb [arch=amd64] https://repo.radeon.com/rocm/apt/$ROCM_VERSION/ jammy main" | tee /etc/apt/sources.list.d/rocm.list && \
    printf "deb [arch=amd64] https://repo.radeon.com/amdgpu/$AMDGPU_VERSION/ubuntu jammy main" | tee /etc/apt/sources.list.d/amdgpu.list && \
    apt-get update && apt-get install -y --no-install-recommends \
    sudo \
    libelf1 \
    kmod \
    file \
    python3 \
    python3-pip \
    rocm-dev \
    rocm-libs \
    build-essential && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

RUN groupadd -g 109 render

# Upgrade to meet security requirements
RUN apt-get update -y && apt-get upgrade -y && apt-get autoremove -y && \
    apt-get install -y locales cifs-utils wget half libnuma-dev lsb-release && \
    apt-get clean -y

ENV MIGRAPHX_DISABLE_FAST_GELU=1
RUN locale-gen en_US.UTF-8
RUN update-locale LANG=en_US.UTF-8
ENV LC_ALL=C.UTF-8
ENV LANG=C.UTF-8

WORKDIR /stage

# Cmake
ENV CMAKE_VERSION=3.30.1
# Use ${CMAKE_VERSION} for the tarball name too, so a version bump only touches one line
RUN cd /usr/local && \
    wget -q https://github.com/Kitware/CMake/releases/download/v${CMAKE_VERSION}/cmake-${CMAKE_VERSION}-Linux-x86_64.tar.gz && \
    tar -zxf /usr/local/cmake-${CMAKE_VERSION}-Linux-x86_64.tar.gz --strip=1 -C /usr

# ccache
RUN mkdir -p /tmp/ccache && \
    cd /tmp/ccache && \
    wget -q -O - https://github.com/ccache/ccache/releases/download/v4.7.4/ccache-4.7.4-linux-x86_64.tar.xz | tar --strip 1 -J -xf - && \
    cp /tmp/ccache/ccache /usr/bin && \
    rm -rf /tmp/ccache

# Install Conda
ENV PATH=/opt/miniconda/bin:${PATH}
RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh --no-check-certificate && /bin/bash ~/miniconda.sh -b -p /opt/miniconda && \
    conda init bash && \
    conda config --set auto_activate_base false && \
    conda update -y --all && \
    rm ~/miniconda.sh && conda clean -ya

# Create migraphx-ci environment
ENV CONDA_ENVIRONMENT_PATH=/opt/miniconda/envs/migraphx-ci
ENV CONDA_DEFAULT_ENV=migraphx-ci
RUN conda create -y -n ${CONDA_DEFAULT_ENV} python=3.10
ENV PATH=${CONDA_ENVIRONMENT_PATH}/bin:${PATH}

# Enable migraphx-ci environment
SHELL ["conda", "run", "-n", "migraphx-ci", "/bin/bash", "-c"]

# Link the environment's libstdc++.so.6 to the system copy so that version `GLIBCXX_3.4.30' is found
RUN ln -sf /usr/lib/x86_64-linux-gnu/libstdc++.so.6 ${CONDA_ENVIRONMENT_PATH}/bin/../lib/libstdc++.so.6

# Install migraphx
RUN apt-get update && apt-get install -y migraphx

RUN pip install numpy packaging ml_dtypes==0.5.0