import sys
import torch.cuda
import os
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CppExtension, CUDAExtension
from torch.utils.cpp_extension import CUDA_HOME, ROCM_HOME
from torch.testing._internal.common_utils import IS_WINDOWS

if sys.platform == 'win32':
    # MSVC: /sdl enables additional security checks; /permissive- (standards
    # conformance mode) is skipped on the VC 14.16 toolset.
    vc_version = os.getenv('VCToolsVersion', '')
    if vc_version.startswith('14.16.'):
        CXX_FLAGS = ['/sdl']
    else:
        CXX_FLAGS = ['/sdl', '/permissive-']
else:
    # Everywhere else, just build with debug symbols.
    CXX_FLAGS = ['-g']

Add option to use ninja to compile ahead-of-time cpp_extensions (#32495)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32495
Background
------------------------------
Previously, ninja was used to compile and link inline cpp_extensions, while
ahead-of-time cpp_extensions were compiled with distutils. This PR adds the
ability to compile (but not link) ahead-of-time cpp_extensions with ninja.
The main motivation is to speed up cpp_extension builds: distutils does not
compile in parallel. With this PR and the new option enabled, on my machine:
- torchvision compilation goes from 3m43s to 49s
- nestedtensor compilation goes from 2m0s to 28s
User-facing changes
------------------------------
I added a `use_ninja` flag to BuildExtension; it defaults to `True`. When
`use_ninja` is True (see the sketch after this list):
- the build attempts to use ninja.
- If ninja cannot be used, a warning is emitted and the build falls back to
distutils.
- Ninja cannot be used on Windows (not yet implemented; I'll open a new
issue for this) or when ninja cannot be found on the system.
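
A minimal sketch of opting in explicitly (the package and file names here
are placeholders, not taken from the PR):

    from setuptools import setup
    from torch.utils.cpp_extension import BuildExtension, CppExtension

    setup(
        name='my_extension',
        ext_modules=[CppExtension('my_extension', ['my_extension.cpp'])],
        # use_ninja defaults to True; ninja compiles the objects while
        # distutils still performs the link. If ninja is unavailable, a
        # warning is emitted and the whole build falls back to distutils.
        cmdclass={'build_ext': BuildExtension.with_options(use_ninja=True)},
    )
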
Implementation Details
------------------------------
This PR makes this change in two steps. Please let me know if it would be
easier to review if I split this up into a stacked diff.
The changes are:
1) Refactor _write_ninja_file to separate the policy (which compiler flags
to pass) from the mechanism (how to write the ninja file and run the
compilation).
2) Call _write_ninja_file and _run_ninja_build while building ahead-of-time
cpp_extensions. These are used only to compile objects; distutils still
handles the linking.
Change 1: refactor _write_ninja_file to separate policy from mechanism
- I split _write_ninja_file into _write_ninja_file and
_write_ninja_file_to_build_library.
- I renamed _build_extension_module to _run_ninja_build.
Change 2: call _write_ninja_file while building ahead-of-time cpp_extensions
- _write_ninja_file_and_compile_objects calls _write_ninja_file to build
only the object files.
- We monkey-patch distutils.CCompiler.compile to call
_write_ninja_file_and_compile_objects (a sketch of this pattern follows
this section).
- distutils still handles the linking step. Linking is not a bottleneck,
so it was not a concern.
- This change only works on Unix-based systems; our Windows code goes down
a different codepath and I did not want to disturb it.
- If a system does not support ninja, we raise a warning and fall back to
the original compilation path.
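
The monkey-patching pattern, as a simplified self-contained sketch (this is
not the code from the PR; the real helpers live in torch.utils.cpp_extension,
and _compile_objects_with_ninja below is a stand-in):

    import distutils.ccompiler

    def _compile_objects_with_ninja(sources, objects):
        # Stand-in: the real helper writes a build.ninja describing one
        # compile rule per source file, then runs ninja on it.
        print('would compile', sources, '->', objects)

    def _patched_compile(self, sources, output_dir=None, **kwargs):
        # Compile the object files with ninja, but return their names so
        # that distutils can still perform the (unpatched) linking step.
        objects = self.object_filenames(sources, output_dir=output_dir or '')
        _compile_objects_with_ninja(sources, objects)
        return objects

    distutils.ccompiler.CCompiler.compile = _patched_compile
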
Test Plan
------------------------------
Ad-hoc testing
- I built torchvision using pytorch master and printed out the build
commands. Next, I used this branch to build torchvision and looked at the
ninja file. I compared the ninja file with the build commands and asserted
that they were functionally the same.
- I repeated the above for pytorch/nestedtensor.
PyTorch test suite
- I split `test_cpp_extensions` into `test_cpp_extensions_aot` and
`test_cpp_extensions_jit`. The AOT (ahead-of-time) version tests
ahead-of-time compilation and the JIT version tests just-in-time
compilation (not to be confused with TorchScript).
- `test_cpp_extensions_aot` is run twice by run_test.py: once with a module
built with ninja, and once with a module built without ninja (a hypothetical
driver for such a double build is sketched at the end of this file).
- run_test.py asserts that when we are building with use_ninja=True, ninja
is actually available on the system.
Test Plan: Imported from OSS
Differential Revision: D19730432
Pulled By: zou3519
fbshipit-source-id: 819590d01cf65e8da5a1e8019b8b3084792fee90

# The test suite builds this module both with and without ninja; the choice
# is communicated through the USE_NINJA environment variable.
USE_NINJA = os.getenv('USE_NINJA') == '1'

ext_modules = [
    CppExtension(
        'torch_test_cpp_extension.cpp', ['extension.cpp'],
        extra_compile_args=CXX_FLAGS),
    CppExtension(
        'torch_test_cpp_extension.msnpu', ['msnpu_extension.cpp'],
        extra_compile_args=CXX_FLAGS),
    CppExtension(
        'torch_test_cpp_extension.rng', ['rng_extension.cpp'],
        extra_compile_args=CXX_FLAGS),
]

if torch.cuda.is_available() and (CUDA_HOME is not None or ROCM_HOME is not None):
    extension = CUDAExtension(
        'torch_test_cpp_extension.cuda', [
            'cuda_extension.cpp',
            'cuda_extension_kernel.cu',
            'cuda_extension_kernel2.cu',
        ],
        extra_compile_args={'cxx': CXX_FLAGS,
                            'nvcc': ['-O2']})
    ext_modules.append(extension)

if not IS_WINDOWS:  # MSVC has a bug compiling this example
    if torch.cuda.is_available() and (CUDA_HOME is not None or ROCM_HOME is not None):
        extension = CUDAExtension(
            'torch_test_cpp_extension.torch_library', [
                'torch_library.cu',
            ],
            extra_compile_args={'cxx': CXX_FLAGS,
                                'nvcc': ['-O2']})
        ext_modules.append(extension)

setup(
    name='torch_test_cpp_extension',
    packages=['torch_test_cpp_extension'],
    ext_modules=ext_modules,
    include_dirs='self_compiler_include_dirs_test',
    cmdclass={'build_ext': BuildExtension.with_options(use_ninja=USE_NINJA)})
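
How the two test builds might drive this file (a hypothetical driver, shown
for illustration: the USE_NINJA flag and the double build come from the PR
description above, but the exact invocation is assumed):

    import os
    import subprocess
    import sys

    # Build the extension in place twice: once compiling with ninja,
    # once falling back to plain distutils compilation.
    for use_ninja in ('1', '0'):
        env = dict(os.environ, USE_NINJA=use_ninja)
        subprocess.check_call(
            [sys.executable, 'setup.py', 'build_ext', '--inplace'],
            env=env,
        )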