onnxruntime/setup.py
Ryan Hill c99aa3a3f3
Ryanunderhill/cuda shared (#7626)
* First iteration of making cuda a shared provider.
Separated out shared OpKernel change, so doing this to merge with that change.

* More cuda shared library refactoring

* More cuda shared library refactoring

* More build options tested, converted the training ops over.

* Fix merge breaks

* Fix submodules

* Fix submodules

* Fix submodules

* Fix python

* Fix compile errors

* Duplicate symbol fix

* Test fix for ROCM provider

* Another ROCM test workaround

* ROCM Build Test

* ROCM build fix

* ROCM

* ROCM

* ROCM

* ROCM

* ROCM

* ROCM test

* Reduce header dependencies

* Remove redundant namespace

* Test fix for linux

* Fix linux build

* Fix Eigen build error

* Fix unused parameter warning

* Test link error

* Another linker test

* Linker test

* Linker test

* Another test

* Another build test

* Fix linux link error

* Build test

* Fix control flow ops to use common base class with core code

* Remove extra qualifiers

* Fix template syntax for linux

* Fix cuda memory leak

* Fix pybind

* Test disabling cast

* Cleanup

* Restore cuda in test

* Remove more header dependencies

* Test not adding cuda provider to session

* Make GetProviderInfo_CUDA throw

* No-op cuda provider creation

* Fix some setup issues

* Fix memory cleanup on unload

* Diagnostics

* Don't unload library

* Add diagnostics

* Fix deleting registry at right time.

* Test disabling profiler

* Fix merge break

* Revert profiler change

* Move unloading of shared providers into Environment

* Free more global allocations before library unloads

* Add more diagnostics

* Move unloading back to the OrtEnv as there are multiple Environments created during a session.

Remove some library dependencies for tests.

* Fix more cmake files

* ERROR -> WARNING

* Fix python shutdown

* Test not using dml in pipeline

* Change python version and disable dml

* Update python version

* Test adding unload method for shared providers

* Disable DLL test

* Python test

* Revert "Python test"

This reverts commit c7ec2cfe98.

* Revert "Disable DLL test"

This reverts commit e901cb93aa.

* Revert "Test adding unload method for shared providers"

This reverts commit c427b78799.

* Point to RyanWinGPU

* Revert python version

* Fix id_to_allocator_map

* Another python exit test

* Remove extra debug messages
Try a more clean python shutdown through DllMain

* Revert DllMain idea, it didn't work

* Merge conflicts

* Fix merge with master issues.

* Comments

* Undo edit to file

* Cleanup + new training ops

* Revert yml changes

* Fix another merge error

* ROCM fix

* ROCM fix v2

* Put back Linux hack, it is necessary

* Stupid fixes

* Fix submodule out of sync

* ROCM fix 3

* ROCM 4

* Test java fix

* Fix typos

* Java test on my VM

* Fix build error

* Spotless fix

* Leave temp file around to load properly

* Fix cleanup on exit

* Fix break

* Java comments

* Remove LongformerAttentionBase workaround

* Spotless fix

* Switch yml back to regular build pool

* Revert "Switch yml back to regular build pool"

This reverts commit be35fc2a5a.

* Code review feedback

* Fix errors due to merge

* Spotless fix

* Fix minimal build

* Java fix for non cuda case

* Java fix for CPU build

* Fix Nuphar?

* Fix nuphar 2

* Fix formatting

* Revert "Remove LongformerAttentionBase workaround"

This reverts commit 648679b370.

* Training fix

* Another java fix

* Formatting

* Formatting

* For orttraining

* Last orttraining build fix...

* training fixes

* Fix test provider error

* Missing pass command

* Removed in wrong spot

* Python typo

* Python typos

* Python crash on exit, possibly due to unloading of libraries.

* Remove test_execution_provider from training build
Only enable python atexit on windows
Remove assert on provider library exit

* Still can't unload providers in python, alas.

* Disable Nvtx temporarily

* MPI Kernels for Training

* MPI Kernels part 2

* Patch through INcclService

* Oops, wrong CMakeLists

* Missing namespace

* Fix missing ()

* Move INcclService::GetInstance around to link nicer

* Missing }

* Missing MPI libraries for Cuda

* Add extra GetType functions used by MPI

* Missing Nccl library

* Remove LOGS statements as a test

* Add in a couple more missing GetType methods

* Update comments

* Missed a logging reference in mpi_context.h

* Convert aten_op to shared (due to marge with master)

* Test moving DistributedRunContext instance into shared provider layer
(with purpose error to verify it's being built properly)

* Test passed, now with fix

* Missing static

* Oops, scope DistributedRunContext to just NCCL

* Merge related issues and code review feedback.

* Merge error

* Bump to rel-1.9.1 (#7684)

* Formatting

* Code review feedback for Java build on non Windows

* Remove cupti library dependency from core library

* Test Java pipeline fix

* Linux build fix

* Revert "Linux build fix"

This reverts commit a73a811516.

* Revert "Remove cupti library dependency from core library"

This reverts commit 6a889ee8bf.

* Packaging pipeline fixes to copy cuda shared provider for tensorrt & standard packages

* Add cuda to Tensorrt nuget package

* onnxruntime_common still has a cuda header dependency

Co-authored-by: ashbhandare <ash.bhandare@gmail.com>
2021-05-20 07:53:47 -07:00

444 lines
18 KiB
Python

#-------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.
#--------------------------------------------------------------------------
from setuptools import setup, find_packages, Extension
from distutils import log as logger
from distutils.command.build_ext import build_ext as _build_ext
from glob import glob
from os import path, getcwd, environ, remove, walk, makedirs, listdir
from shutil import copyfile, copytree, rmtree
import platform
import subprocess
import sys
import datetime
nightly_build = False
featurizers_build = False
package_name = 'onnxruntime'
wheel_name_suffix = None
def parse_arg_remove_boolean(argv, arg_name):
arg_value = False
if arg_name in sys.argv:
arg_value = True
argv.remove(arg_name)
return arg_value
def parse_arg_remove_string(argv, arg_name_equal):
arg_value = None
for arg in sys.argv[1:]:
if arg.startswith(arg_name_equal):
arg_value = arg[len(arg_name_equal):]
sys.argv.remove(arg)
break
return arg_value
# Any combination of the following arguments can be applied
featurizers_build = parse_arg_remove_boolean(sys.argv, '--use_featurizers')
if parse_arg_remove_boolean(sys.argv, '--nightly_build'):
package_name = 'ort-nightly'
nightly_build = True
wheel_name_suffix = parse_arg_remove_string(sys.argv, '--wheel_name_suffix=')
cuda_version = None
rocm_version = None
# The following arguments are mutually exclusive
if parse_arg_remove_boolean(sys.argv, '--use_tensorrt'):
package_name = 'onnxruntime-gpu-tensorrt' if not nightly_build else 'ort-trt-nightly'
elif parse_arg_remove_boolean(sys.argv, '--use_cuda'):
package_name = 'onnxruntime-gpu' if not nightly_build else 'ort-gpu-nightly'
cuda_version = parse_arg_remove_string(sys.argv, '--cuda_version=')
elif parse_arg_remove_boolean(sys.argv, '--use_rocm'):
package_name = 'onnxruntime-rocm' if not nightly_build else 'ort-rocm-nightly'
rocm_version = parse_arg_remove_string(sys.argv, '--rocm_version=')
elif parse_arg_remove_boolean(sys.argv, '--use_openvino'):
package_name = 'onnxruntime-openvino'
elif parse_arg_remove_boolean(sys.argv, '--use_dnnl'):
package_name = 'onnxruntime-dnnl'
elif parse_arg_remove_boolean(sys.argv, '--use_nuphar'):
package_name = 'onnxruntime-nuphar'
elif parse_arg_remove_boolean(sys.argv, '--use_vitisai'):
package_name = 'onnxruntime-vitisai'
elif parse_arg_remove_boolean(sys.argv, '--use_acl'):
package_name = 'onnxruntime-acl'
elif parse_arg_remove_boolean(sys.argv, '--use_armnn'):
package_name = 'onnxruntime-armnn'
elif parse_arg_remove_boolean(sys.argv, '--use_dml'):
package_name = 'onnxruntime-dml'
# PEP 513 defined manylinux1_x86_64 and manylinux1_i686
# PEP 571 defined manylinux2010_x86_64 and manylinux2010_i686
# PEP 599 defines the following platform tags:
# manylinux2014_x86_64
# manylinux2014_i686
# manylinux2014_aarch64
# manylinux2014_armv7l
# manylinux2014_ppc64
# manylinux2014_ppc64le
# manylinux2014_s390x
manylinux_tags = [
'manylinux1_x86_64',
'manylinux1_i686',
'manylinux2010_x86_64',
'manylinux2010_i686',
'manylinux2014_x86_64',
'manylinux2014_i686',
'manylinux2014_aarch64',
'manylinux2014_armv7l',
'manylinux2014_ppc64',
'manylinux2014_ppc64le',
'manylinux2014_s390x',
]
is_manylinux = environ.get('AUDITWHEEL_PLAT', None) in manylinux_tags
class build_ext(_build_ext):
def build_extension(self, ext):
dest_file = self.get_ext_fullpath(ext.name)
logger.info('copying %s -> %s', ext.sources[0], dest_file)
copyfile(ext.sources[0], dest_file)
try:
from wheel.bdist_wheel import bdist_wheel as _bdist_wheel
class bdist_wheel(_bdist_wheel):
def finalize_options(self):
_bdist_wheel.finalize_options(self)
if not is_manylinux:
self.root_is_pure = False
def _rewrite_ld_preload(self, to_preload):
with open('onnxruntime/capi/_ld_preload.py', 'rt') as f:
ld_preload = f.read().splitlines()
with open('onnxruntime/capi/_ld_preload.py', 'wt') as f:
for line in ld_preload:
f.write(line)
f.write('\n')
if 'LD_PRELOAD_BEGIN_MARK' in line:
break
if len(to_preload) > 0:
f.write('from ctypes import CDLL, RTLD_GLOBAL\n')
for library in to_preload:
f.write('_{} = CDLL("{}", mode=RTLD_GLOBAL)\n'.format(library.split('.')[0], library))
def run(self):
if is_manylinux:
source = 'onnxruntime/capi/onnxruntime_pybind11_state.so'
dest = 'onnxruntime/capi/onnxruntime_pybind11_state_manylinux1.so'
logger.info('copying %s -> %s', source, dest)
copyfile(source, dest)
result = subprocess.run(['patchelf', '--print-needed', dest], check=True, stdout=subprocess.PIPE, universal_newlines=True)
cuda_dependencies = ['libcublas.so', 'libcudnn.so', 'libcudart.so', 'libcurand.so', 'libcufft.so', 'libnvToolsExt.so']
cuda_dependencies.extend(['librccl.so', 'libamdhip64.so', 'librocblas.so', 'libMIOpen.so', 'libhsa-runtime64.so', 'libhsakmt.so'])
to_preload = []
args = ['patchelf', '--debug']
for line in result.stdout.split('\n'):
for dependency in cuda_dependencies:
if dependency in line:
to_preload.append(line)
args.extend(['--remove-needed', line])
args.append(dest)
if len(to_preload) > 0:
subprocess.run(args, check=True, stdout=subprocess.PIPE)
self._rewrite_ld_preload(to_preload)
_bdist_wheel.run(self)
if is_manylinux:
file = glob(path.join(self.dist_dir, '*linux*.whl'))[0]
logger.info('repairing %s for manylinux1', file)
try:
subprocess.run(['auditwheel', 'repair', '-w', self.dist_dir, file], check=True, stdout=subprocess.PIPE)
finally:
logger.info('removing %s', file)
remove(file)
except ImportError as error:
print("Error importing dependencies:")
print(error)
bdist_wheel = None
# Additional binaries
if platform.system() == 'Linux':
libs = ['onnxruntime_pybind11_state.so', 'libdnnl.so.2', 'libmklml_intel.so', 'libmklml_gnu.so', 'libiomp5.so', 'mimalloc.so']
# DNNL, TensorRT & OpenVINO EPs are built as shared libs
libs.extend(['libonnxruntime_providers_shared.so'])
libs.extend(['libonnxruntime_providers_dnnl.so'])
libs.extend(['libonnxruntime_providers_tensorrt.so'])
libs.extend(['libonnxruntime_providers_openvino.so'])
libs.extend(['libonnxruntime_providers_cuda.so'])
# Nuphar Libs
libs.extend(['libtvm.so.0.5.1'])
if nightly_build:
libs.extend(['libonnxruntime_pywrapper.so'])
elif platform.system() == "Darwin":
libs = ['onnxruntime_pybind11_state.so', 'libdnnl.2.dylib', 'mimalloc.so'] # TODO add libmklml and libiomp5 later.
# DNNL & TensorRT EPs are built as shared libs
libs.extend(['libonnxruntime_providers_shared.dylib'])
libs.extend(['libonnxruntime_providers_dnnl.dylib'])
libs.extend(['libonnxruntime_providers_tensorrt.dylib'])
libs.extend(['libonnxruntime_providers_cuda.dylib'])
if nightly_build:
libs.extend(['libonnxruntime_pywrapper.dylib'])
else:
libs = ['onnxruntime_pybind11_state.pyd', 'dnnl.dll', 'mklml.dll', 'libiomp5md.dll']
# DNNL, TensorRT & OpenVINO EPs are built as shared libs
libs.extend(['onnxruntime_providers_shared.dll'])
libs.extend(['onnxruntime_providers_dnnl.dll'])
libs.extend(['onnxruntime_providers_tensorrt.dll'])
libs.extend(['onnxruntime_providers_openvino.dll'])
libs.extend(['onnxruntime_providers_cuda.dll'])
# DirectML Libs
libs.extend(['DirectML.dll'])
# Nuphar Libs
libs.extend(['tvm.dll'])
if nightly_build:
libs.extend(['onnxruntime_pywrapper.dll'])
if is_manylinux:
data = ['capi/libonnxruntime_pywrapper.so'] if nightly_build else []
ext_modules = [
Extension(
'onnxruntime.capi.onnxruntime_pybind11_state',
['onnxruntime/capi/onnxruntime_pybind11_state_manylinux1.so'],
),
]
else:
data = [path.join('capi', x) for x in libs if path.isfile(path.join('onnxruntime', 'capi', x))]
ext_modules = []
# Additional examples
examples_names = ["mul_1.onnx", "logreg_iris.onnx", "sigmoid.onnx"]
examples = [path.join('datasets', x) for x in examples_names]
# Extra files such as EULA and ThirdPartyNotices
extra = ["LICENSE", "ThirdPartyNotices.txt", "Privacy.md"]
# Description
README = path.join(getcwd(), "docs/python/README.rst")
if not path.exists(README):
this = path.dirname(__file__)
README = path.join(this, "docs/python/README.rst")
if not path.exists(README):
raise FileNotFoundError("Unable to find 'README.rst'")
with open(README) as f:
long_description = f.read()
packages = [
'onnxruntime',
'onnxruntime.backend',
'onnxruntime.capi',
'onnxruntime.capi.training',
'onnxruntime.datasets',
'onnxruntime.tools',
'onnxruntime.tools.ort_format_model',
'onnxruntime.tools.ort_format_model.ort_flatbuffers_py',
'onnxruntime.tools.ort_format_model.ort_flatbuffers_py.experimental',
'onnxruntime.tools.ort_format_model.ort_flatbuffers_py.experimental.fbs',
'onnxruntime.quantization',
'onnxruntime.quantization.operators',
'onnxruntime.quantization.CalTableFlatBuffers',
'onnxruntime.transformers',
'onnxruntime.transformers.longformer',
]
requirements_file = "requirements.txt"
local_version = None
enable_training = parse_arg_remove_boolean(sys.argv, '--enable_training')
if enable_training:
packages.extend(['onnxruntime.training',
'onnxruntime.training.amp',
'onnxruntime.training.optim',
'onnxruntime.training.ortmodule'])
requirements_file = "requirements-training.txt"
# with training, we want to follow this naming convention:
# stable:
# onnxruntime-training-1.7.0+cu111-cp36-cp36m-linux_x86_64.whl
# nightly:
# onnxruntime-training-1.7.0.dev20210408+cu111-cp36-cp36m-linux_x86_64.whl
# this is needed immediately by pytorch/ort so that the user is able to
# install an onnxruntime training package with matching torch cuda version.
package_name = 'onnxruntime-training'
if cuda_version:
# removing '.' to make local Cuda version number in the same form as Pytorch.
local_version = '+cu' + cuda_version.replace('.', '')
if rocm_version:
# removing '.' to make Cuda version number in the same form as Pytorch.
rocm_version = rocm_version.replace('.', '')
local_version = '+rocm' + rocm_version
package_data = {}
data_files = []
if package_name == 'onnxruntime-nuphar':
packages += ["onnxruntime.nuphar"]
extra += [path.join('nuphar', 'NUPHAR_CACHE_VERSION')]
if featurizers_build:
# Copy the featurizer data from its current directory into the onnx runtime directory so that the
# content can be included as module data.
# Apparently, the root_dir is different based on how the script is invoked
source_root_dir = None
dest_root_dir = None
for potential_source_prefix, potential_dest_prefix in [
(getcwd(), getcwd()),
(path.dirname(__file__), path.dirname(__file__)),
(path.join(getcwd(), ".."), getcwd()),
]:
potential_dir = path.join(potential_source_prefix, "external", "FeaturizersLibrary", "Data")
if path.isdir(potential_dir):
source_root_dir = potential_source_prefix
dest_root_dir = potential_dest_prefix
break
if source_root_dir is None:
raise Exception("Unable to find the build root dir")
assert dest_root_dir is not None
featurizer_source_dir = path.join(source_root_dir, "external", "FeaturizersLibrary", "Data")
assert path.isdir(featurizer_source_dir), featurizer_source_dir
featurizer_dest_dir = path.join(dest_root_dir, "onnxruntime", "FeaturizersLibrary", "Data")
if path.isdir(featurizer_dest_dir):
rmtree(featurizer_dest_dir)
for item in listdir(featurizer_source_dir):
this_featurizer_source_fullpath = path.join(featurizer_source_dir)
assert path.isdir(this_featurizer_source_fullpath), this_featurizer_source_fullpath
copytree(this_featurizer_source_fullpath, featurizer_dest_dir)
packages.append("onnxruntime.FeaturizersLibrary.Data.{}".format(item))
package_data[packages[-1]] = listdir(path.join(featurizer_dest_dir, item))
package_data["onnxruntime"] = data + examples + extra
version_number = ''
with open('VERSION_NUMBER') as f:
version_number = f.readline().strip()
if nightly_build:
# https://docs.microsoft.com/en-us/azure/devops/pipelines/build/variables
build_suffix = environ.get('BUILD_BUILDNUMBER')
if build_suffix is None:
# The following line is only for local testing
build_suffix = str(datetime.datetime.now().date().strftime("%Y%m%d"))
else:
build_suffix = build_suffix.replace('.', '')
if enable_training:
from packaging import version
from packaging.version import Version
# with training package, we need to bump up version minor number so that
# nightly releases take precedence over the latest release when --pre is used during pip install.
# eventually this shall be the behavior of all onnxruntime releases.
# alternatively we may bump up version number right after every release.
ort_version = version.parse(version_number)
if isinstance(ort_version, Version):
version_number = '{major}.{minor}.{macro}'.format(
major=ort_version.major,
minor=ort_version.minor + 1,
macro=ort_version.micro)
version_number = version_number + ".dev" + build_suffix
if local_version:
version_number = version_number + local_version
if wheel_name_suffix:
package_name = "{}_{}".format(package_name, wheel_name_suffix)
cmd_classes = {}
if bdist_wheel is not None:
cmd_classes['bdist_wheel'] = bdist_wheel
cmd_classes['build_ext'] = build_ext
requirements_path = path.join(getcwd(), requirements_file)
if not path.exists(requirements_path):
this = path.dirname(__file__)
requirements_path = path.join(this, requirements_file)
if not path.exists(requirements_path):
raise FileNotFoundError("Unable to find " + requirements_file)
with open(requirements_path) as f:
install_requires = f.read().splitlines()
if enable_training:
def save_build_and_package_info(package_name, version_number, cuda_version):
sys.path.append(path.join(path.dirname(__file__), 'onnxruntime', 'python'))
from onnxruntime_collect_build_info import find_cudart_versions
version_path = path.join('onnxruntime', 'capi', 'build_and_package_info.py')
with open(version_path, 'w') as f:
f.write("package_name = '{}'\n".format(package_name))
f.write("__version__ = '{}'\n".format(version_number))
if cuda_version:
f.write("cuda_version = '{}'\n".format(cuda_version))
# cudart_versions are integers
cudart_versions = find_cudart_versions(build_env=True)
if len(cudart_versions) == 1:
f.write("cudart_version = {}\n".format(cudart_versions[0]))
else:
print(
"Error getting cudart version. ",
"did not find any cudart library" if len(cudart_versions) == 0 else "found multiple cudart libraries")
else:
# TODO: rocm
pass
save_build_and_package_info(package_name, version_number, cuda_version)
# Setup
setup(
name=package_name,
version=version_number,
description='ONNX Runtime is a runtime accelerator for Machine Learning models',
long_description=long_description,
author='Microsoft Corporation',
author_email='onnxruntime@microsoft.com',
cmdclass=cmd_classes,
license="MIT License",
packages=packages,
ext_modules=ext_modules,
package_data=package_data,
url="https://onnxruntime.ai",
download_url='https://github.com/microsoft/onnxruntime/tags',
data_files=data_files,
install_requires=install_requires,
keywords='onnx machine learning',
entry_points= {
'console_scripts': [
'onnxruntime_test = onnxruntime.tools.onnxruntime_test:main',
]
},
classifiers=[
'Development Status :: 5 - Production/Stable',
'Intended Audience :: Developers',
'License :: OSI Approved :: MIT License',
'Operating System :: POSIX :: Linux',
'Operating System :: Microsoft :: Windows',
'Operating System :: MacOS',
'Topic :: Scientific/Engineering',
'Topic :: Scientific/Engineering :: Mathematics',
'Topic :: Scientific/Engineering :: Artificial Intelligence',
'Topic :: Software Development',
'Topic :: Software Development :: Libraries',
'Topic :: Software Development :: Libraries :: Python Modules',
'Programming Language :: Python',
'Programming Language :: Python :: 3 :: Only',
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
'Programming Language :: Python :: 3.8',
'Programming Language :: Python :: 3.9'],
)