onnxruntime/dockerfiles/Dockerfile.rocm

25 lines
1.1 KiB
Text
Raw Permalink Normal View History

# --------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.
# --------------------------------------------------------------
# Dockerfile to run ONNXRuntime with ROCm integration
#--------------------------------------------------------------------------
[ROCm] Python 3.10 in ROCm CI, and ROCm 6.2.3 in MigraphX CI (#22527) ### Description Upgrade python from 3.9 to 3.10 in ROCm and MigraphX docker files and CI pipelines. Upgrade ROCm version to 6.2.3 in most places except ROCm CI, see comment below. Some improvements/upgrades on ROCm/Migraphx docker or pipeline: * rocm 6.0/6.1.3 => 6.2.3 * python 3.9 => 3.10 * Ubuntu 20.04 => 22.04 * Also upgrade ml_dtypes, numpy and scipy packages. * Fix message "ROCm version from ..." with correct file path in CMakeList.txt * Exclude some NHWC tests since ROCm EP lacks support for NHWC convolution. #### ROCm CI Pipeline: ROCm 6.1.3 is kept in the pipeline for now. - Failed after upgrading to ROCm 6.2.3: `HIPBLAS_STATUS_INVALID_VALUE ; GPU=0 ; hostname=76123b390aed ; file=/onnxruntime_src/onnxruntime/core/providers/rocm/rocm_execution_provider.cc ; line=170 ; expr=hipblasSetStream(hipblas_handle_, stream);` . It need further investigation. - cupy issues: (1) It currently supports numpy < 1.27, might not work with numpy 2.x. So we locked numpy==1.26.4 for now. (2) cupy support of ROCm 6.2 is still in progress: https://github.com/cupy/cupy/issues/8606. Note that miniconda issues: its libstdc++.so.6 and libgcc_s.so.1 might have conflict with the system ones. So we created links to use the system ones. #### MigraphX CI pipeline MigraphX CI does not use cupy, and we are able to use ROCm 6.2.3 and numpy 2.x in the pipeline. #### Other attempts Other things that I've tried which might help in the future: Attempt to use a single docker file for both ROCm and Migraphx: https://github.com/microsoft/onnxruntime/pull/22478 Upgrade to ubuntu 24.04 and python 3.12, and use venv like [this](https://github.com/microsoft/onnxruntime/blob/27903e7ff1dd7256cd2b277c03766b4f2ad9e2f1/tools/ci_build/github/linux/docker/rocm-ci-pipeline-env.Dockerfile). ### Motivation and Context In 1.20 release, ROCm nuget packaging pipeline will use 6.2: https://github.com/microsoft/onnxruntime/pull/22461. This upgrades rocm to 6.2.3 in CI pipelines to be consistent.
2024-10-25 18:47:16 +00:00
FROM rocm/pytorch:rocm6.2.3_ubuntu22.04_py3.10_pytorch_release_2.3.0
ARG ONNXRUNTIME_REPO=https://github.com/Microsoft/onnxruntime
ARG ONNXRUNTIME_BRANCH=main
WORKDIR /code
ENV PATH=/code/cmake-3.27.3-linux-x86_64/bin:${PATH}
# Prepare onnxruntime repository & build onnxruntime
RUN git clone --single-branch --branch ${ONNXRUNTIME_BRANCH} --recursive ${ONNXRUNTIME_REPO} onnxruntime &&\
/bin/sh onnxruntime/dockerfiles/scripts/install_common_deps.sh &&\
cd onnxruntime &&\
/bin/sh ./build.sh --allow_running_as_root --config Release --build_wheel --update --build --parallel --cmake_extra_defines\
ONNXRUNTIME_VERSION=$(cat ./VERSION_NUMBER) --use_rocm --rocm_home=/opt/rocm &&\
pip install /code/onnxruntime/build/Linux/Release/dist/*.whl &&\
cd ..