pytorch

mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-15 21:00:47 +00:00

Author	SHA1	Message	Date
Jiakai Liu	9e0ce72e9e	[pytorch] change op dependency output to use double-quoted strings (#32464)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32464
Changed to double-quoted strings to make FB linter happy.
Test Plan: Imported from OSS
Differential Revision: D19507859
Pulled By: ljk53
fbshipit-source-id: fa70535c7fbea73214b3b0efb0532184b5ee6854	2020-01-24 15:27:28 -08:00
Jiakai Liu	fc598f9023	generate op dependency graph as python code
Summary:
Add support to print op dependencies as python code so that both the custom build script and BUCK can import it without a yaml parser.
Test Plan:
- generate the file:
```
ANALYZE_TORCH=1 FORMAT=py DEPLOY=1 tools/code_analyzer/build.sh -closure=false
```
- load the file in python:
```
python
>>> from tools.code_analyzer.generated.torch import TORCH_DEPS
>>> print(TORCH_DEPS)
```
Differential Revision: D18894639
Pulled By: ljk53
fbshipit-source-id: e304d0525a07a13cf6e8a9317cd22637200d044c	2020-01-02 20:26:28 -08:00
Jiakai Liu	be55874f2c	style fixes to code analyzer (#30808)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30808
Addressed some comments on #29550 after it landed.
Test Plan:
```
LLVM_DIR=... ANALYZE_TEST=1 CHECK_RESULT=1 tools/code_analyzer/build.sh
LLVM_DIR=... ANALYZE_TORCH=1 tools/code_analyzer/build.sh -closure=false -debug_path=true
```
Differential Revision: D18835100
Pulled By: ljk53
fbshipit-source-id: 991d292ddc0211a88b04d0bdc24719f471c7786e	2019-12-05 11:25:37 -08:00
Jiakai Liu	c0299d2707	add LLVM code analyzer in order to replace static dispatch
Summary:
[Why static dispatch]
Static dispatch was introduced to allow stripping out unused ops at link time (with “gc-sections” linker flag) for mobile build.
The alternative approaches to do "non-static" dispatch are:
* virtual methods - old ATen dispatcher, which has already been deprecated;
* registry pattern - used by caffe2, c10 and JIT;
However, none of them are “gc-sections” friendly. Global registers are root symbols - the linker cannot strip out any op if we use the registry pattern for mobile.
[Why static dispatch isn’t great]
* One more code path to maintain;
* Need to recompile the framework to add new backends/ops;
* Doesn’t support AutoGrad yet, and thus blocks on-device training;
[Static Code Analysis]
This PR introduces an LLVM analysis pass. It takes LLVM bitcode / assembly as input and generates a dependency graph among aten ops. From a set of root ops used by a model, we can calculate the transitive closure of all dependent ops, then we can ask codegen to only register these ops.
[Approach]
To generate the dependency graph it searches for 3 types of connections in LLVM bitcode / assembly:
1) op registration: op name (schema string literal) -> registered function;
2) regular function call: function -> function;
3) op invocation: function -> op name (schema string literal)
For 2) it uses an algorithm similar to llvm::LazyCallGraph - it not only looks into call/invoke instructions but also recursively searches for function pointers in each instruction's operands.
For 1) and 3) it searches for connections between operator name string literals / function pointers and c10 op registration/invocation API calls in the LLVM IR graph via "use" edges (bi-directional):
1. llvm::Value has a "users()" method to get other llvm::Value nodes that use the value;
2. most types derive from llvm::User, which has an "operands()" method to get other llvm::Value nodes being used by the value;
[Limitation]
For now the search doesn't go beyond the function boundary because the references to op name string literals and c10 op registration/invocation APIs are almost always in the same function.
The script uses regular expressions to identify c10 API calls:
* op_schema_pattern="^(aten|quantized|profiler|_test)::[^ ]+"
* op_register_pattern="c10::RegisterOperators::(op|checkSchemaAndRegisterOp_)"
* op_invoke_pattern="c10::Dispatcher::findSchema|callOp"
If we create helper functions around the c10 API (e.g. the "callOp" method defined in aten/native), we could simply add them to the regular expressions used to identify the c10 API.
[Example]
In the following example, it finds out:
1) the registered function for the "quantized::add" operator;
2) one possible call path to the at::empty() function;
3) the called operator name "aten::empty":
- "quantized::add"
- c10::detail::wrap_kernel_functor_unboxed_<at::native::(anonymous namespace)::QAdd<false>, at::Tensor (at::Tensor, at::Tensor, double, long)>::call(c10::OperatorKernel, at::Tensor, at::Tensor, double, long)
- at::native::(anonymous namespace)::QAdd<false>::operator()(at::Tensor, at::Tensor, double, long)
- void at::native::DispatchStub<void ()(at::Tensor&, at::Tensor const&, at::Tensor const&), at::native::qadd_stub>::operator()<at::Tensor&, at::Tensor const&, at::Tensor const&>(c10::DeviceType, at::Tensor&, at::Tensor const&, at::Tensor const&)
- at::native::DispatchStub<void ()(at::Tensor&, at::Tensor const&, at::Tensor const&), at::native::qadd_stub>::choose_cpu_impl()
- void at::native::(anonymous namespace)::qadd_kernel<false>(at::Tensor&, at::Tensor const&, at::Tensor const&)
- at::TensorIterator::binary_op(at::Tensor&, at::Tensor const&, at::Tensor const&, bool)
- at::TensorIterator::build()
- at::TensorIterator::fast_set_up()
- at::empty(c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>)
- "aten::empty"
[How do we know it’s correct?]
Built a test project that contains different op registration/invocation patterns found in the pytorch codebase, including both codegen and non-codegen cases.
* Tried different optimization flags “-O0”, “-O3” - the result seems to be stable.
* Filtered by common patterns: “aten::”, “at::”, “at::native”, “at::CPUType”, “at::TypeDefault” - manually checked that the relationships between function schema strings and corresponding implementations were captured.
* It can print instruction-level data flow and show a warning message if it encounters unexpected cases (e.g.: found 0 or multiple op names per registration/invocation API call, found 0 registered functions, etc).
* Verified consistent results on different linux / macOS hosts. It can handle different STL library ABIs reliably, including rare corner cases for short string literals.
[Known issues]
* Doesn’t handle C code yet;
* Doesn’t handle overload names yet (all variants are collapsed into the main op name);
Test Plan:
```
LLVM_DIR=... ANALYZE_TEST=1 CHECK_RESULT=1 scripts/build_code_analyzer.sh
```
Differential Revision: D18428118
Pulled By: ljk53
fbshipit-source-id: d505363fa0cbbcdae87492c1f2c29464f6df2fed	2019-12-04 01:02:33 -08:00
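
The "transitive closure from a set of root ops" step described in c0299d2707 can be illustrated with a short Python sketch. This is not the analyzer's code: the function name `transitive_closure`, the graph layout (a dict mapping an op name to the op names it depends on), and the `aten::mul` / `aten::empty_like` edges are assumptions made for illustration. Only the `quantized::add` -> `aten::empty` edge comes from the commit message's example, and the real layout of the generated `TORCH_DEPS` may differ.

```python
from collections import deque


def transitive_closure(deps, root_ops):
    """Return every op reachable from root_ops by following dependency edges."""
    seen = set(root_ops)
    queue = deque(root_ops)
    while queue:
        op = queue.popleft()
        # Ops missing from the graph are treated as having no dependencies.
        for dep in deps.get(op, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


# Hypothetical dependency graph (assumed layout: op name -> list of op names).
DEPS = {
    "quantized::add": ["aten::empty"],     # edge taken from the commit's example
    "aten::empty": [],
    "aten::mul": ["aten::empty_like"],     # made-up edge for illustration
    "aten::empty_like": ["aten::empty"],   # made-up edge for illustration
}

# Ops a model using only quantized::add would need registered.
print(sorted(transitive_closure(DEPS, ["quantized::add"])))
# → ['aten::empty', 'quantized::add']
```

A breadth-first walk with a visited set is enough here because only reachability matters, and the visited set also keeps the walk finite if the graph contains cycles.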

4 commits
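
The three patterns listed under [Limitation] in c0299d2707 can be tried out in Python as a rough sketch. The patterns are written here as plain alternation (`|`). The `classify` helper and the sample strings are hypothetical, and Python's `re` module stands in for whatever regex engine the analyzer itself uses, so edge-case behaviour may not match.

```python
import re

# Patterns quoted from the commit message, written with plain alternation.
OP_SCHEMA = re.compile(r"^(aten|quantized|profiler|_test)::[^ ]+")
OP_REGISTER = re.compile(r"c10::RegisterOperators::(op|checkSchemaAndRegisterOp_)")
OP_INVOKE = re.compile(r"c10::Dispatcher::findSchema|callOp")


def classify(text):
    """Hypothetical helper: label a string by the first pattern it matches."""
    if OP_SCHEMA.search(text):
        return "op_schema"
    if OP_REGISTER.search(text):
        return "op_register"
    if OP_INVOKE.search(text):
        return "op_invoke"
    return "other"


# Made-up sample strings, loosely modelled on the names in the commit message.
for sample in [
    "aten::empty",
    "c10::RegisterOperators::op(...)",
    "c10::Dispatcher::findSchema(...)",
    "at::TensorIterator::build()",
]:
    print(sample, "->", classify(sample))
```

Note that `op_invoke_pattern` has a top-level alternation, so it matches either the full `c10::Dispatcher::findSchema` name or any string containing `callOp`, which is how a helper such as the "callOp" method mentioned in the commit would be picked up.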