pytorch/tools
Wenlei Xie 2ecb2c7931 Pass Scalar by reference (#53583)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53583

`Scalar` takes 32 bytes because `c10::complex<double>`
requires 16-byte alignment. Passing `Scalar` by reference
shows about a 1% improvement in instruction count.
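
The 32-byte figure can be reproduced with a small standalone sketch. `FakeComplexDouble` and `FakeScalar` are simplified stand-ins for the real types, not their actual definitions:

```
#include <cstdint>

// Stand-in for c10::complex<double>, which is explicitly 16-byte aligned
// (plain std::complex<double> is typically only 8-byte aligned).
struct alignas(16) FakeComplexDouble {
  double real;
  double imag;
};

// Stand-in for c10::Scalar: a small tag plus a union whose most-aligned
// member forces 16-byte alignment, padding the whole struct to 32 bytes.
struct FakeScalar {
  int32_t tag;
  union Payload {
    double d;
    int64_t i;
    bool b;
    FakeComplexDouble z;
  } v;
};

static_assert(alignof(FakeScalar) == 16, "alignment inherited from complex member");
static_assert(sizeof(FakeScalar) == 32, "4-byte tag + padding + 16-byte payload");
```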

All the changes in this commit are codemodded except for
the following 4 files (which generate the signatures):
```
tools/codegen/api/cpp.py
tools/codegen/api/native.py
tools/codegen/api/structured.py
caffe2/contrib/aten/gen_op.py
```

# Codemod

## Main Step

For the codemod part, here are the main commands used:
```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
```

As you can tell, this codemods both `Scalar` and `optional<Scalar>`. Apply these commands iteratively until reaching a fix-point (since one method signature might contain multiple `Scalar` parameters).
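
To make the fix-point concrete, here is a before/after sketch; `clamp_stub` and the placeholder types are illustrative, not taken from the codebase:

```
// Placeholder types so the fragment stands alone; the real code uses
// at::Tensor and at::Scalar.
struct Tensor {};
struct Scalar {};

// Before the codemod:
//   Tensor clamp_stub(const Tensor& self, Scalar min, Scalar max);
// After pass 1, fastmod resumes matching past "min", so "max" survives:
//   Tensor clamp_stub(const Tensor& self, const Scalar& min, Scalar max);
// After pass 2, the fix-point:
Tensor clamp_stub(const Tensor& self, const Scalar& min, const Scalar& max);
```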

In retrospect, excluding `third_party` and `torch/csrc/jit` would have been a good idea. (I reverted those changes manually later; see https://github.com/pytorch/pytorch/pull/53479 as a reference.)

## Pre-Step

Prior to applying the main commands, since some `Scalar` parameters appear as `at::Scalar` or `c10::Scalar`, I codemodded some of them in advance. Here is an incomplete list:
```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
```
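
As a hedged illustration of why this pre-step matters (`add_out_stub` is made up for the sketch):

```
namespace at { struct Tensor {}; struct Scalar {}; }  // placeholders

// Before (illustrative): void add_out_stub(at::Tensor& out, at::Scalar alpha);
// The pre-step rewrites the qualified name as a whole:
void add_out_stub(at::Tensor& out, const at::Scalar& alpha);

// Had the pre-step been skipped, the main command's capture group would
// split the qualifier and emit "at::const Scalar& alpha", which is
// invalid C++ and is exactly what the fixup pass below repairs.
```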

## Fixup
There are a couple of post-codemod fixups. For example, `const Scalar` gets codemodded into `const const Scalar&`, and `at::Scalar` gets codemodded into `at::const Scalar&` (if the pre-step is not done comprehensively). Here is an incomplete list:
```
fastmod --extensions cpp 'const const Scalar' 'const Scalar'
fastmod --extensions h 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod --extensions cpp 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod 'at::const Scalar&' 'const at::Scalar&'
```
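
For instance, the `const const` case comes from signatures that already took `const Scalar` by value (`fill_stub` and the placeholder types are illustrative):

```
struct Tensor {};
struct Scalar {};  // placeholders for the sketch

// Originally:
//   Tensor fill_stub(const Tensor& self, const Scalar value);
// The main command prefixes "const ... &" again, yielding the ill-formed
//   Tensor fill_stub(const Tensor& self, const const Scalar& value);
// and the fixup collapses it back to:
Tensor fill_stub(const Tensor& self, const Scalar& value);
```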

## Supplementary

`.cu` and `.mm` files also need to be codemodded, for example:

```
fastmod --extensions cu 'at::const Scalar&' 'const at::Scalar&'
fastmod --extensions mm '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
```

Function pointers are not matched by the main command, so they need their own patterns. Here is an incomplete list:

```
# Cover case: using index_fill_fn = void(*)(TensorIterator & iter, int64_t dim, int64_t self_dim_size, int64_t self_dim_stride, Scalar source);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'

# Cover case: using softplus_fn = void (*)(TensorIterator&, Scalar, Scalar);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions cpp '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)optional<Scalar>([, \)])' '${1}const optional<Scalar>&${2}'
```
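
After those patterns run, the two aliases from the comments above read roughly as follows (a sketch with placeholder types; the real aliases live in ATen headers):

```
#include <cstdint>

struct TensorIterator;  // opaque in this sketch
struct Scalar {};       // placeholder for at::Scalar

// The main command anchors on "identifier(", which never occurs inside a
// function-pointer type like "void (*)(...)", hence the extra patterns.
using index_fill_fn = void (*)(TensorIterator& iter, int64_t dim,
                               int64_t self_dim_size, int64_t self_dim_stride,
                               const Scalar& source);
using softplus_fn = void (*)(TensorIterator&, const Scalar&, const Scalar&);
```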

Some corner cases need to be fixed manually.

ghstack-source-id: 123970306

Test Plan: Imported from OSS

Reviewed By: smessmer

Differential Revision: D26904445

fbshipit-source-id: 8d8a002af4b5125f153a32f03c6956be7ae5671d
2021-03-15 23:17:06 -07:00
amd_build [ROCm] rename HIP_HCC_FLAGS to HIP_CLANG_FLAGS (#50917) 2021-01-22 07:24:05 -08:00
autograd Pass Scalar by reference (#53583) 2021-03-15 23:17:06 -07:00
clang_format_hash [tools] Update clang-format linux hash (#50520) 2021-01-13 20:50:56 -08:00
code_analyzer [pytorch][bot] update mobile op deps (#52110) 2021-02-12 14:50:29 -08:00
code_coverage [reland] Report test time regressions (#50171) 2021-02-08 15:35:21 -08:00
codegen Pass Scalar by reference (#53583) 2021-03-15 23:17:06 -07:00
config Bazel build of pytorch with gating CI (#36011) 2020-04-06 22:50:33 -07:00
docker Forbid trailing whitespace (#53406) 2021-03-05 17:22:55 -08:00
fast_nvcc Use .gv instead of .dot for Graphviz in fast_nvcc (#53208) 2021-03-03 15:01:21 -08:00
jit Remove generated_unboxing_wrappers and setManuallyBoxedKernel (#49251) 2021-01-06 14:22:50 -08:00
lite_interpreter [PyTorch] Move selected_mobile_ops.h codegen function to tools (#53786) 2021-03-12 00:13:03 -08:00
pyi Updates rounding_mode documentation to remove "true" (#52202) 2021-02-12 09:19:39 -08:00
rules remediation of S205607 2020-07-17 17:19:47 -07:00
setup_helpers Generate header with version #defines for LibTorch (#50073) 2021-02-03 22:18:53 -08:00
shared Introducing TORCH_CUDA_CPP_API and TORCH_CUDA_CU_API to the code (#50627) 2021-01-21 19:09:11 -08:00
__init__.py remediation of S205607 2020-07-17 17:19:47 -07:00
build_libtorch.py Remove Incorrect Comment in tools/build_libtorch and remove Python2 support in the module import (#44888) 2020-09-18 10:03:36 -07:00
build_pytorch_libs.py Cleanup unused code for Python < 3.6 (#47822) 2020-11-13 21:37:01 -08:00
build_variables.bzl Move as_view/increment_version to its separate key. (#53342) 2021-03-15 14:47:12 -07:00
clang_format_all.py Replaced whitelist with allowlist (#45796) 2020-10-06 09:18:51 -07:00
clang_format_ci.sh Removed whitelist reference from tools/clang_format_ci.sh (#41636) 2020-07-21 12:32:14 -07:00
clang_format_utils.py Type-annotate tools/generate_torch_version (#51637) 2021-02-03 18:07:01 -08:00
clang_tidy.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
download_mnist.py Remove Incorrect Comment in tools/build_libtorch and remove Python2 support in the module import (#44888) 2020-09-18 10:03:36 -07:00
flake8_hook.py
generate_torch_version.py Fix local version generation (#52898) 2021-02-26 10:57:07 -08:00
generated_dirs.txt
git-clang-format clang-format don't run on master (#37058) 2020-04-22 11:37:22 -07:00
git-pre-commit [ONNX] Utilize ONNX shape inference for ONNX exporter (#40628) 2020-08-30 18:35:46 -07:00
git_add_generated_dirs.sh
git_reset_generated_dirs.sh
nightly.py Bugfix nightly checkout tool to work on Windows (#49274) 2021-01-06 16:14:51 -08:00
pytorch.version
README.md Add script to display history for a single test across multiple jobs over time (#52000) 2021-02-11 13:27:49 -08:00
test_history.py Fix typo in tools/test_history.py (#53514) 2021-03-08 11:42:30 -08:00
update_disabled_tests.sh we should have a config-based way to skip flaky tests (#30978) 2019-12-17 11:58:43 -08:00

This folder contains a number of scripts which are used as part of the PyTorch build process. This directory also doubles as a Python module hierarchy (thus the __init__.py).

Overview

Modern infrastructure:

  • autograd - Code generation for autograd. This includes definitions of all our derivatives.
  • jit - Code generation for JIT.
  • shared - Generic infrastructure that scripts in tools may find useful.
    • module_loader.py - Makes it easier to import arbitrary Python files in a script, without having to add them to the PYTHONPATH first.

Legacy infrastructure (we should kill this):

  • cwrap - Implementation of legacy code generation for THNN/THCUNN. This is used by nnwrap.

Build system pieces:

  • setup_helpers - Helper code for searching for third-party dependencies on the user system.
  • build_pytorch_libs.py - Cross-platform script that builds all of the constituent libraries of PyTorch, but not the PyTorch Python extension itself.
  • build_libtorch.py - Script for building libtorch, a standalone C++ library without Python support. This build script is tested in CI.
  • fast_nvcc - Mostly-transparent wrapper over nvcc that parallelizes compilation when used to build CUDA files for multiple architectures at once.
    • fast_nvcc.py - Python script, entrypoint to the fast nvcc wrapper.

Developer tools which you might find useful:

Important if you want to run on AMD GPU:

  • amd_build - HIPify scripts, for transpiling CUDA into AMD HIP. Right now, PyTorch and Caffe2 share logic for how to do this transpilation, but have separate entry-points for transpiling either PyTorch or Caffe2 code.
    • build_amd.py - Top-level entry point for HIPifying our codebase.

Tools which are only situationally useful: