Mirror of https://github.com/saymrwulf/onnxruntime.git, synced 2026-05-14 20:48:00 +00:00
### Description

* Remove deprecated GPU architectures to control the NuGet/Python package size (the latest TensorRT supports sm75, Turing, and newer architectures).
* Add 90 to support the Blackwell series in the next release (86 and 89 are not included, as adding them would rapidly increase the package size).

| arch_range | Python-cuda12 | Nuget-cuda12 |
| -------------- | ------------------------------------------------------------ | ---------------------------------- |
| 60;61;70;75;80 | Linux: 279MB, Win: 267MB | Linux: 247MB, Win: 235MB |
| 75;80 | Linux: 174MB, Win: 162MB | Linux: 168MB, Win: 156MB |
| **75;80;90** | **Linux: 299MB, Win: 277MB** | **Linux: 294MB, Win: 271MB** |
| 75;80;86;89 | [Linux: MB, Win: 390MB](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=647457&view=results) | Linux: 416MB, Win: 383MB |
| 75;80;86;89;90 | [Linux: MB, Win: 505MB](https://aiinfra.visualstudio.com/Lotus/_build/results?buildId=646536&view=results) | Linux: 541MB, Win: 498MB |

### Motivation and Context

Callout: while adding sm90 support, the cuda11.8+cudnn8 build will be dropped in the coming ORT release, since that build has an issue with Blackwell (mentioned in the comments) and demand for CUDA 11 is minor, according to the internal ort-cuda11 repo.
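The arch list above works because CUDA cubins are binary-compatible within a compute-capability major version: a cubin built for X.Y runs on GPUs with major X and minor ≥ Y, which is why 86 and 89 are already served by the 80 cubin. A minimal sketch of that coverage rule (a hypothetical helper, not part of this PR; GPUs with a newer major such as Blackwell would instead rely on JIT from the embedded compute_90 PTX, which this SASS-only check deliberately ignores):

```python
# Hypothetical helper sketching CUDA cubin (SASS) binary compatibility:
# a cubin built for compute capability X.Y runs on GPUs of the same
# major version X with minor version >= Y.
BUILT_ARCHS = [(7, 5), (8, 0), (9, 0)]  # CMAKE_CUDA_ARCHITECTURES=75;80;90

def sass_covered(cap, built=BUILT_ARCHS):
    """Return True if a GPU with compute capability `cap` can run one
    of the embedded cubins directly (no PTX JIT needed)."""
    major, minor = cap
    return any(major == b_major and minor >= b_minor
               for b_major, b_minor in built)

# sm86 (Ampere consumer) and sm89 (Ada) reuse the sm80 cubin:
print(sass_covered((8, 6)))  # True
print(sass_covered((8, 9)))  # True
# sm70 (Volta) is below the sm75 floor and is no longer covered:
print(sass_covered((7, 0)))  # False
```

This is why dropping 86;89 from the list saves hundreds of MB without losing support for those GPUs.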
32 lines
1.6 KiB
Text
# --------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.
# --------------------------------------------------------------
# Dockerfile to run ONNXRuntime with TensorRT integration
# NVIDIA TensorRT base image
ARG TRT_CONTAINER_VERSION=22.12
FROM nvcr.io/nvidia/tensorrt:${TRT_CONTAINER_VERSION}-py3
ARG ONNXRUNTIME_REPO=https://github.com/Microsoft/onnxruntime
ARG ONNXRUNTIME_BRANCH=main
# Adjust as needed
# Check your CUDA arch: https://developer.nvidia.com/cuda-gpus
ARG CMAKE_CUDA_ARCHITECTURES=75;80;90
RUN apt-get update &&\
apt-get install -y sudo git bash unattended-upgrades
RUN unattended-upgrade
WORKDIR /code
ENV PATH=/usr/local/nvidia/bin:/usr/local/cuda/bin:/code/cmake-3.27.3-linux-x86_64/bin:/opt/miniconda/bin:${PATH}
# Prepare onnxruntime repository & build onnxruntime with TensorRT
RUN git clone --single-branch --branch ${ONNXRUNTIME_BRANCH} --recursive ${ONNXRUNTIME_REPO} onnxruntime &&\
/bin/sh onnxruntime/dockerfiles/scripts/install_common_deps.sh &&\
trt_version=${TRT_VERSION:0:3} &&\
/bin/sh onnxruntime/dockerfiles/scripts/checkout_submodules.sh ${trt_version} &&\
cd onnxruntime &&\
/bin/sh build.sh --allow_running_as_root --parallel --build_shared_lib --cuda_home /usr/local/cuda --cudnn_home /usr/lib/x86_64-linux-gnu/ --use_tensorrt --tensorrt_home /usr/lib/x86_64-linux-gnu/ --config Release --build_wheel --skip_tests --skip_submodule_sync --cmake_extra_defines '"CMAKE_CUDA_ARCHITECTURES='${CMAKE_CUDA_ARCHITECTURES}'"' &&\
pip install /code/onnxruntime/build/Linux/Release/dist/*.whl &&\
cd ..
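The `trt_version=${TRT_VERSION:0:3}` step in the Dockerfile derives the TensorRT major.minor prefix from the `TRT_VERSION` environment variable provided by the NVIDIA base container, using shell substring expansion (first three characters). A minimal sketch with a sample value (the version shown here is an assumption for illustration only):

```shell
# Sketch of the version-prefix extraction used in the Dockerfile's RUN step.
# ${TRT_VERSION:0:3} takes the first 3 characters, e.g. "8.6.1.6" -> "8.6".
# In the image, TRT_VERSION comes from the nvcr.io/nvidia/tensorrt base image;
# here it is a sample value.
TRT_VERSION="8.6.1.6"
trt_version=${TRT_VERSION:0:3}
echo "$trt_version"
```

Note that substring expansion is a bash feature, not POSIX sh; the three-character slice also assumes a single-digit major version.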