pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-14 20:57:59 +00:00

Jiakai Liu 7bd7a3d806 add AutoNonVariableTypeMode for USE_STATIC_DISPATCH on JIT->ATen path (#27274)	2019-10-03 12:24:29 -07:00

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27274

This is yet another fix to address #26764.

PR #26908 toggles NonVariableTypeMode in ATen dispatcher, which is where USE_STATIC_DISPATCH takes place thus it's most logically sound place to do such tweaks.

However, we observed nontrivial perf regression due to this fix. Turns out the numel() tensor method gets called in several for-loops thus incurs ~7M thread_local updates in a single forward call:
```
7173330 numel
    558 size
    416 q_scale
    302 _empty_affine_quantized
    288 contiguous
    257 q_zero_point
    216 qscheme
    173 empty
    110 set_
    105 as_strided
    104 permute
...
```

As numel() is not called from a single place so a natural workaround is to update function_wrapper.py so that it only adds the guard on gen_namespace_function() case and ignore the gen_tensor_method() case. But some tensor methods are actually being called from JIT side directly (e.g. "aten::eq_" -> "(self).eq_") so the only "band aid" left on the table is to insert guard on JIT->aten path as originally did on #26868 - this is a simplified version of it as it doesn't hurt to extend the NonVariableMode scope a little bit to also cover stack drop/pack calls.

On Android we only expose JIT API so we don't need worry about TensorMethods being called directly. On iOS we don't provide a wrapper yet but we can mention this caveat in the doc. Hopefully by the time it's widely used we can finish Variable/Tensor unification and remove all these hacks.

Test Plan:
- Verified it runs quantized/fp32 MobileNetV2 models;
- Verified it fixes the perf regression (revert #26908 separately);

Differential Revision: D17732489

Pulled By: ljk53

fbshipit-source-id: c14ca66aebc6b6f17ad6efac7ca47f9487c98de5
amd_build	Fix compiler unwrapping step in jenkins build scripts for Caffe2/PyTorch on ROCm (#25409)	2019-09-17 13:50:42 -07:00
autograd	Add memory format argument to the `clone` operator (#27106)	2019-10-03 12:08:47 -07:00
docker
jit	add AutoNonVariableTypeMode for USE_STATIC_DISPATCH on JIT->ATen path (#27274)	2019-10-03 12:24:29 -07:00
pyi	Port new_full to ATen. (#25583)	2019-09-04 14:34:43 -07:00
setup_helpers	Avoid configuring ROCm if USE_CUDA is on. (#26910)	2019-10-03 08:31:03 -07:00
shared	Kill declared_type and ignore_check from THFormal. (#26284)	2019-09-17 07:40:33 -07:00
__init__.py
aten_mirror.sh
build_libtorch.py	Specify build dir as a global variable in BUILD_DIR in the build system.	2019-07-25 07:19:47 -07:00
build_pytorch_libs.py	remove tools/setup_helpers/cudnn.py (#25876)	2019-09-24 07:44:33 -07:00
build_variables.py	Add send and recv backward functions for builtin operators RPC. (#25527)	2019-10-03 01:18:46 -07:00
clang_format.py
clang_tidy.py	Fix clang-tidy script (#25652)	2019-09-04 09:46:26 -07:00
download_mnist.py
flake8_hook.py
generated_dirs.txt
git-pre-commit	Remove THD (#22065)	2019-06-25 12:19:13 -07:00
git_add_generated_dirs.sh
git_reset_generated_dirs.sh
pytorch.version
README.md	Stop doing nn wrap. (#25353)	2019-08-30 07:42:20 -07:00
run-clang-tidy-in-ci.sh	fix clang-tidy failing on master (#25121)	2019-08-23 13:50:24 -07:00

README.md

This folder contains scripts that are used as part of the PyTorch build process. It also doubles as a Python module hierarchy (hence the __init__.py).

Overview

Modern infrastructure:

autograd - Code generation for autograd. This includes definitions of all our derivatives.
jit - Code generation for the JIT.
shared - Generic infrastructure that scripts in tools may find useful.
- module_loader.py - Makes it easier to import arbitrary Python files in a script, without having to add them to the PYTHONPATH first.
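
Importing a Python file by path, without adding its directory to PYTHONPATH, can be done with the standard `importlib.util` module. The following is a minimal sketch of that idea and not the contents of module_loader.py; the helper name `import_module_from_path` and the throwaway file `example_helper.py` are assumptions made for illustration.

```python
import importlib.util
import os
import tempfile


def import_module_from_path(name, path):
    """Load the Python file at `path` as a module named `name`.

    Hypothetical helper for illustration; it does not modify sys.path.
    """
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError("cannot load {!r} from {!r}".format(name, path))
    module = importlib.util.module_from_spec(spec)
    # Run the file's top-level code inside the new module object.
    spec.loader.exec_module(module)
    return module


if __name__ == "__main__":
    # Write a throwaway file outside sys.path, then import it by path.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example_helper.py")
        with open(path, "w") as f:
            f.write("ANSWER = 42\n")
        helper = import_module_from_path("example_helper", path)
        print(helper.ANSWER)
```

Because the module is not registered in `sys.modules` here, each call re-executes the file; a script that needs caching would have to add that itself.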

Legacy infrastructure (we should kill this):

cwrap - Implementation of legacy code generation for THNN/THCUNN. This is used by nnwrap.

Build system pieces:

setup_helpers - Helper code for searching for third-party dependencies on the user system.
build_pytorch_libs.sh - Script that builds all of the constituent libraries of PyTorch, but not the PyTorch Python extension itself. We are working on eliminating this script in favor of a unified cmake build.
build_pytorch_libs.bat - Same as above, but for Windows.
build_libtorch.py - Script for building libtorch, a standalone C++ library without Python support. This build script is tested in CI.

Developer tools which you might find useful:

clang_tidy.py - Script for running clang-tidy on the lines of your code that you changed.
git_add_generated_dirs.sh and git_reset_generated_dirs.sh - Use these to force-add generated files to your Git index, so that you can conveniently run diffs on them when working on code generation. (See also generated_dirs.txt, which specifies the list of directories with generated files.)
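
The force-add workflow can be sketched roughly as follows. This is an illustration only, not the contents of the shell scripts: it assumes the list file holds one directory per line, and the function names and placeholder paths are hypothetical. The commands are built but not executed.

```python
import os
import tempfile


def read_generated_dirs(list_path):
    """Return the non-empty, non-comment lines of the list file.

    Assumes one directory per line; that format is an assumption.
    """
    with open(list_path) as f:
        lines = [line.strip() for line in f]
    return [line for line in lines if line and not line.startswith("#")]


def git_force_add_command(dirs):
    """Build a `git add -f` command; -f also stages ignored paths."""
    return ["git", "add", "-f", "--"] + list(dirs)


def git_reset_command(dirs):
    """Build a `git reset` command that unstages the same paths again."""
    return ["git", "reset", "--"] + list(dirs)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        list_path = os.path.join(tmp, "generated_dirs.txt")
        with open(list_path, "w") as f:
            # Placeholder paths, not the real list.
            f.write("path/to/generated_a/\npath/to/generated_b/\n")
        dirs = read_generated_dirs(list_path)
    print(" ".join(git_force_add_command(dirs)))
    print(" ".join(git_reset_command(dirs)))
```

Once the generated files are staged, `git diff` shows how a change to the code generator alters its output.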

Important if you want to run on AMD GPUs:

amd_build - HIPify scripts for transpiling CUDA into AMD HIP. Right now, PyTorch and Caffe2 share the transpilation logic but have separate entry points for transpiling PyTorch or Caffe2 code.
- build_amd.py - Top-level entry point for HIPifying our codebase.
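
To give a feel for what source-to-source transpilation from CUDA to HIP involves, here is a toy sketch based on whole-identifier text substitution. It is not the HIPify implementation in amd_build: the four-entry mapping and the `hipify_source` function are assumptions for illustration, and real tooling covers far more names and cases.

```python
import re

# Toy mapping of a few CUDA runtime names to their HIP counterparts.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
}

# Match whole identifiers only, so names such as my_cudaMalloc_wrapper
# are left untouched.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in CUDA_TO_HIP) + r")\b"
)


def hipify_source(source):
    """Replace each known CUDA name in `source` with its HIP name."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(1)], source)


if __name__ == "__main__":
    cuda_code = (
        "#include <cuda_runtime.h>\n"
        "cudaMalloc(&ptr, size);\n"
        "cudaFree(ptr);\n"
    )
    print(hipify_source(cuda_code))
```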

Tools which are only situationally useful:

aten_mirror.sh - Mirroring script responsible for keeping https://github.com/zdevito/ATen up-to-date.
docker - Dockerfile for running (but not developing) PyTorch, using the official conda binary distribution. Context: https://github.com/pytorch/pytorch/issues/1619
download_mnist.py - Download the MNIST dataset; this is necessary if you want to run the C++ API tests.
run-clang-tidy-in-ci.sh - Checks that C++ code is clang-tidy clean in CI on Travis.