2021-04-13 16:21:51 +00:00
|
|
|
from tools.codegen.model import (Argument, Arguments, BaseTy, BaseType,
|
|
|
|
|
FunctionSchema, ListType, NativeFunction,
|
|
|
|
|
OptionalType, Return, SelfArgument,
|
|
|
|
|
TensorOptionsArguments, Type, assert_never)
|
2021-04-16 18:40:22 +00:00
|
|
|
from tools.codegen.api.types import (ArgName, BaseCType, Binding, ConstRefCType, NamedCType, CType,
|
add C++ namespacing logic to ctypes (#55047)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55047
Added namespaces to all of the `CTypes` printed in the codegen. This is pretty much required if we want to use codegen externally, since we can no longer assume that we're inside of the `at::` namespace.
Important changes are in `types.py`.
How do we add the notion of namespaces to C++ types without people having to write "at::Tensor" everywhere? Before this PR, `CType` held a raw string representing the type, i.e. `BaseCType("Tensor", binds)`. This PR introduces a set of singleton base C++ types in `types.py`, that know how to print their namespace. Instead, we'd write `BaseCType(tensorT, binds)`, where printing `tensorT` will properly print out "at::Tensor".
This also means that you can't create arbitrary `CTypes`. If we need a new C++ type in the codegen, we need to add it to the list in `types.py`.
One blip in the design: we don't want to change `RegistrationDeclarations.yaml`, since that'll break external backends that ingest it. I added separate functions to display types without the namespace that are used to create RegistrationDeclarations.yaml`. With an external codegen API though, we can eventually kill it :)
I also didn't realize until this PR that `Declarations.yaml` is still directly in use, by some python/autograd codegen. Rather than keep that yaml byte-for-byte compatible, I just updated the callsites in the autograd codegen to work with namespaces. In the NEXT pr, I try to clean up some of the autograd codegen to stop using raw strings to match against C++ types.
Test Plan: Imported from OSS
Reviewed By: bhosmer
Differential Revision: D27708349
Pulled By: bdhirsh
fbshipit-source-id: 56a4f81fc101795bcb9ee1f722121480fb2356ad
2021-04-16 18:40:22 +00:00
|
|
|
MutRefCType, ArrayCType, ListCType, VectorCType, ArrayRefCType,
|
|
|
|
|
OptionalCType, TupleCType, SpecialArgName, boolT, scalarT,
|
|
|
|
|
tensorListT, dimnameListT, tensorT, voidT,
|
|
|
|
|
BaseTypeToCppMapping, intArrayRefT, tensorOptionsT)
|
2021-01-04 19:51:28 +00:00
|
|
|
from typing import Optional, Sequence, Union, List, Set
|
2020-08-31 15:58:32 +00:00
|
|
|
|
|
|
|
|
# This file describes the translation of JIT schema to the public C++
|
|
|
|
|
# API, which is what people use when they call functions like at::add.
|
|
|
|
|
#
|
|
|
|
|
# Prominent characteristics of the C++ API:
|
|
|
|
|
#
|
|
|
|
|
# - dtype, layout, device and pin_memory are collected into
|
2020-10-13 15:24:07 +00:00
|
|
|
# a single C++ type TensorOptions (the native functions API
|
2020-08-31 15:58:32 +00:00
|
|
|
# also has this, but tensor options is really most relevant
|
|
|
|
|
# for the C++ API; it makes calling kwarg factory functions
|
|
|
|
|
# pleasant)
|
|
|
|
|
#
|
|
|
|
|
# - defaulting lives here (in fact, the dispatcher is completely
|
|
|
|
|
# oblivious of defaults!)
|
|
|
|
|
#
|
|
|
|
|
# BTW: policy on name collisions: we try not to have types with
|
|
|
|
|
# collisions, but functions are fair game to collide
|
|
|
|
|
|
2020-12-08 11:41:19 +00:00
|
|
|
def name(func: FunctionSchema, *, faithful_name_for_out_overloads: bool = False) -> str:
|
2020-08-31 15:58:32 +00:00
|
|
|
name = str(func.name.name)
|
|
|
|
|
if func.is_out_fn():
|
2020-12-08 11:41:19 +00:00
|
|
|
if faithful_name_for_out_overloads:
|
|
|
|
|
name += '_outf'
|
|
|
|
|
else:
|
|
|
|
|
name += '_out'
|
|
|
|
|
|
2020-08-31 15:58:32 +00:00
|
|
|
return name
|
|
|
|
|
|
|
|
|
|
# Translation of "value types" in JIT schema to C++ API type. Value
|
2020-10-23 08:28:49 +00:00
|
|
|
# types look the same no matter if they are argument types or return
|
2020-08-31 15:58:32 +00:00
|
|
|
# types. Returns None if the type in question is not a value type.
|
2021-04-16 18:40:22 +00:00
|
|
|
def valuetype_type(t: Type, *, binds: ArgName) -> Optional[NamedCType]:
|
2020-08-31 15:58:32 +00:00
|
|
|
if isinstance(t, BaseType):
|
Pass Scalar by reference (#53583)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53583
`Scalar` takes 32 bytes due to `c10::complex<double>`
requires aligning to 16 bytes. Passing Scalar by reference
shows about 1% improvements on instruction count.
All the changes in this commit are codemoded except for
the following 4 files (which code-gen signatures):
```
tools/codegen/api/cpp.py
tools/codegen/api/native.py
tools/codegen/api/structured.py
caffe2/contrib/aten/gen_op.py
```
# Codemode
## Main Step
For the codemod part, here is the main command used:
```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
```
As you can tell, it codemods both `Scalar` and `optional<Scalar>`. Apply these commands iteratively until reaching a fix-point (since one method signature might contain multiple `Scalar` parameter).
In retrospect, excluding `thrid_party` and `torch/csrc/jit` would be a good idea. (I revert it manually later, see https://github.com/pytorch/pytorch/pull/53479 as an reference).
## Pre-Step
Prior to applying the main command, as some `Scalar` are presented as `at::Scalar` or `c10::Scalar`, so I codemod some of them in advance. Here is an incomplete list:
```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
```
## Fixup
There are a couple of post codemod fixup. For example, `const Scalar` will be codemoded into `const const Scalar&`. `at:Scalar` will be codemoded into `at::const Scalar&` (if `Pre-step` is not done comprehensively). Here is an incomplete list:
```
fastmod --extensions cpp 'const const Scalar' 'const Scalar'
fastmod --extensions h 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod --extensions cpp 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod 'at::const Scalar&' 'const at::Scalar&'
```
## Supplementary
`cu` and `mm` files also need to be codemoded, for example:
```
fastmod --extensions cu 'at::const Scalar&' 'const at::Scalar&'
fastmod --extensions mm '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
```
Function pointers are not codemoded. Here is an incomplete list:
```
# Cover case: using index_fill_fn = void(*)(TensorIterator & iter, int64_t dim, int64_t self_dim_size, int64_t self_dim_stride, Scalar source);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
# Cover case: using softplus_fn = void (*)(TensorIterator&, Scalar, Scalar);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions cpp '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)optional<Scalar>([, \)])' '${1}const optional<Scalar>&${2}'
```
Some corner cases needs to be manually fixed.
ghstack-source-id: 123970306
Test Plan: Imported from OSS
Reviewed By: smessmer
Differential Revision: D26904445
fbshipit-source-id: 8d8a002af4b5125f153a32f03c6956be7ae5671d
2021-03-16 06:13:12 +00:00
|
|
|
if t.name == BaseTy.Tensor or t.name == BaseTy.Scalar:
|
2020-08-31 15:58:32 +00:00
|
|
|
return None
|
add C++ namespacing logic to ctypes (#55047)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55047
Added namespaces to all of the `CTypes` printed in the codegen. This is pretty much required if we want to use codegen externally, since we can no longer assume that we're inside of the `at::` namespace.
Important changes are in `types.py`.
How do we add the notion of namespaces to C++ types without people having to write "at::Tensor" everywhere? Before this PR, `CType` held a raw string representing the type, i.e. `BaseCType("Tensor", binds)`. This PR introduces a set of singleton base C++ types in `types.py`, that know how to print their namespace. Instead, we'd write `BaseCType(tensorT, binds)`, where printing `tensorT` will properly print out "at::Tensor".
This also means that you can't create arbitrary `CTypes`. If we need a new C++ type in the codegen, we need to add it to the list in `types.py`.
One blip in the design: we don't want to change `RegistrationDeclarations.yaml`, since that'll break external backends that ingest it. I added separate functions to display types without the namespace that are used to create RegistrationDeclarations.yaml`. With an external codegen API though, we can eventually kill it :)
I also didn't realize until this PR that `Declarations.yaml` is still directly in use, by some python/autograd codegen. Rather than keep that yaml byte-for-byte compatible, I just updated the callsites in the autograd codegen to work with namespaces. In the NEXT pr, I try to clean up some of the autograd codegen to stop using raw strings to match against C++ types.
Test Plan: Imported from OSS
Reviewed By: bhosmer
Differential Revision: D27708349
Pulled By: bdhirsh
fbshipit-source-id: 56a4f81fc101795bcb9ee1f722121480fb2356ad
2021-04-16 18:40:22 +00:00
|
|
|
# All other BaseType currently map directly to BaseCppTypes.
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, BaseCType(BaseTypeToCppMapping[t.name]))
|
2020-08-31 15:58:32 +00:00
|
|
|
elif isinstance(t, OptionalType):
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
elem = valuetype_type(t.elem, binds=binds)
|
2020-08-31 15:58:32 +00:00
|
|
|
if elem is None:
|
|
|
|
|
return None
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, OptionalCType(elem.type))
|
2020-08-31 15:58:32 +00:00
|
|
|
elif isinstance(t, ListType):
|
|
|
|
|
if str(t.elem) == 'bool':
|
|
|
|
|
assert t.size is not None
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, ArrayCType(BaseCType(boolT), t.size))
|
2020-08-31 15:58:32 +00:00
|
|
|
else:
|
|
|
|
|
return None
|
|
|
|
|
else:
|
|
|
|
|
raise AssertionError(f"unrecognized type {repr(t)}")
|
|
|
|
|
|
|
|
|
|
# Translation of types occuring in JIT arguments to a C++ argument type.
|
2021-04-16 18:40:22 +00:00
|
|
|
def argumenttype_type(t: Type, *, mutable: bool, binds: ArgName) -> NamedCType:
|
2020-08-31 15:58:32 +00:00
|
|
|
# If it's a value type, do the value type translation
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
r = valuetype_type(t, binds=binds)
|
2020-08-31 15:58:32 +00:00
|
|
|
if r is not None:
|
|
|
|
|
return r
|
|
|
|
|
|
|
|
|
|
if isinstance(t, BaseType):
|
|
|
|
|
if t.name == BaseTy.Tensor:
|
|
|
|
|
if mutable:
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, MutRefCType(BaseCType(tensorT)))
|
2020-08-31 15:58:32 +00:00
|
|
|
else:
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, ConstRefCType(BaseCType(tensorT)))
|
Pass Scalar by reference (#53583)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53583
`Scalar` takes 32 bytes due to `c10::complex<double>`
requires aligning to 16 bytes. Passing Scalar by reference
shows about 1% improvements on instruction count.
All the changes in this commit are codemoded except for
the following 4 files (which code-gen signatures):
```
tools/codegen/api/cpp.py
tools/codegen/api/native.py
tools/codegen/api/structured.py
caffe2/contrib/aten/gen_op.py
```
# Codemode
## Main Step
For the codemod part, here is the main command used:
```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
```
As you can tell, it codemods both `Scalar` and `optional<Scalar>`. Apply these commands iteratively until reaching a fix-point (since one method signature might contain multiple `Scalar` parameter).
In retrospect, excluding `thrid_party` and `torch/csrc/jit` would be a good idea. (I revert it manually later, see https://github.com/pytorch/pytorch/pull/53479 as an reference).
## Pre-Step
Prior to applying the main command, as some `Scalar` are presented as `at::Scalar` or `c10::Scalar`, so I codemod some of them in advance. Here is an incomplete list:
```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
```
## Fixup
There are a couple of post codemod fixup. For example, `const Scalar` will be codemoded into `const const Scalar&`. `at:Scalar` will be codemoded into `at::const Scalar&` (if `Pre-step` is not done comprehensively). Here is an incomplete list:
```
fastmod --extensions cpp 'const const Scalar' 'const Scalar'
fastmod --extensions h 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod --extensions cpp 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod 'at::const Scalar&' 'const at::Scalar&'
```
## Supplementary
`cu` and `mm` files also need to be codemoded, for example:
```
fastmod --extensions cu 'at::const Scalar&' 'const at::Scalar&'
fastmod --extensions mm '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
```
Function pointers are not codemoded. Here is an incomplete list:
```
# Cover case: using index_fill_fn = void(*)(TensorIterator & iter, int64_t dim, int64_t self_dim_size, int64_t self_dim_stride, Scalar source);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
# Cover case: using softplus_fn = void (*)(TensorIterator&, Scalar, Scalar);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions cpp '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)optional<Scalar>([, \)])' '${1}const optional<Scalar>&${2}'
```
Some corner cases needs to be manually fixed.
ghstack-source-id: 123970306
Test Plan: Imported from OSS
Reviewed By: smessmer
Differential Revision: D26904445
fbshipit-source-id: 8d8a002af4b5125f153a32f03c6956be7ae5671d
2021-03-16 06:13:12 +00:00
|
|
|
elif t.name == BaseTy.Scalar:
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, ConstRefCType(BaseCType(scalarT)))
|
2020-08-31 15:58:32 +00:00
|
|
|
else:
|
|
|
|
|
raise AssertionError(f"base type should have been value type {t}")
|
|
|
|
|
elif isinstance(t, OptionalType):
|
|
|
|
|
if str(t.elem) == 'Tensor':
|
|
|
|
|
if mutable:
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, MutRefCType(BaseCType(tensorT))) # TODO: fix this discrepancy
|
2020-08-31 15:58:32 +00:00
|
|
|
else:
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, ConstRefCType(OptionalCType(BaseCType(tensorT))))
|
Pass Scalar by reference (#53583)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53583
`Scalar` takes 32 bytes due to `c10::complex<double>`
requires aligning to 16 bytes. Passing Scalar by reference
shows about 1% improvements on instruction count.
All the changes in this commit are codemoded except for
the following 4 files (which code-gen signatures):
```
tools/codegen/api/cpp.py
tools/codegen/api/native.py
tools/codegen/api/structured.py
caffe2/contrib/aten/gen_op.py
```
# Codemode
## Main Step
For the codemod part, here is the main command used:
```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
```
As you can tell, it codemods both `Scalar` and `optional<Scalar>`. Apply these commands iteratively until reaching a fix-point (since one method signature might contain multiple `Scalar` parameter).
In retrospect, excluding `thrid_party` and `torch/csrc/jit` would be a good idea. (I revert it manually later, see https://github.com/pytorch/pytorch/pull/53479 as an reference).
## Pre-Step
Prior to applying the main command, as some `Scalar` are presented as `at::Scalar` or `c10::Scalar`, so I codemod some of them in advance. Here is an incomplete list:
```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
```
## Fixup
There are a couple of post codemod fixup. For example, `const Scalar` will be codemoded into `const const Scalar&`. `at:Scalar` will be codemoded into `at::const Scalar&` (if `Pre-step` is not done comprehensively). Here is an incomplete list:
```
fastmod --extensions cpp 'const const Scalar' 'const Scalar'
fastmod --extensions h 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod --extensions cpp 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod 'at::const Scalar&' 'const at::Scalar&'
```
## Supplementary
`cu` and `mm` files also need to be codemoded, for example:
```
fastmod --extensions cu 'at::const Scalar&' 'const at::Scalar&'
fastmod --extensions mm '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
```
Function pointers are not codemoded. Here is an incomplete list:
```
# Cover case: using index_fill_fn = void(*)(TensorIterator & iter, int64_t dim, int64_t self_dim_size, int64_t self_dim_stride, Scalar source);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
# Cover case: using softplus_fn = void (*)(TensorIterator&, Scalar, Scalar);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions cpp '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)optional<Scalar>([, \)])' '${1}const optional<Scalar>&${2}'
```
Some corner cases needs to be manually fixed.
ghstack-source-id: 123970306
Test Plan: Imported from OSS
Reviewed By: smessmer
Differential Revision: D26904445
fbshipit-source-id: 8d8a002af4b5125f153a32f03c6956be7ae5671d
2021-03-16 06:13:12 +00:00
|
|
|
elif str(t.elem) == 'Scalar':
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, ConstRefCType(OptionalCType(BaseCType(scalarT))))
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
elem = argumenttype_type(t.elem, mutable=mutable, binds=binds)
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, OptionalCType(elem.type))
|
2020-08-31 15:58:32 +00:00
|
|
|
elif isinstance(t, ListType):
|
|
|
|
|
# TODO: remove these special cases, ArrayRef fallthrough works fine
|
|
|
|
|
if str(t.elem) == 'int':
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, BaseCType(intArrayRefT))
|
2020-08-31 15:58:32 +00:00
|
|
|
elif str(t.elem) == 'Tensor':
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, BaseCType(tensorListT))
|
Pass Scalar by reference (#53583)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53583
`Scalar` takes 32 bytes due to `c10::complex<double>`
requires aligning to 16 bytes. Passing Scalar by reference
shows about 1% improvements on instruction count.
All the changes in this commit are codemoded except for
the following 4 files (which code-gen signatures):
```
tools/codegen/api/cpp.py
tools/codegen/api/native.py
tools/codegen/api/structured.py
caffe2/contrib/aten/gen_op.py
```
# Codemode
## Main Step
For the codemod part, here is the main command used:
```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
```
As you can tell, it codemods both `Scalar` and `optional<Scalar>`. Apply these commands iteratively until reaching a fix-point (since one method signature might contain multiple `Scalar` parameter).
In retrospect, excluding `thrid_party` and `torch/csrc/jit` would be a good idea. (I revert it manually later, see https://github.com/pytorch/pytorch/pull/53479 as an reference).
## Pre-Step
Prior to applying the main command, as some `Scalar` are presented as `at::Scalar` or `c10::Scalar`, so I codemod some of them in advance. Here is an incomplete list:
```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
```
## Fixup
There are a couple of post codemod fixup. For example, `const Scalar` will be codemoded into `const const Scalar&`. `at:Scalar` will be codemoded into `at::const Scalar&` (if `Pre-step` is not done comprehensively). Here is an incomplete list:
```
fastmod --extensions cpp 'const const Scalar' 'const Scalar'
fastmod --extensions h 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod --extensions cpp 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod 'at::const Scalar&' 'const at::Scalar&'
```
## Supplementary
`cu` and `mm` files also need to be codemoded, for example:
```
fastmod --extensions cu 'at::const Scalar&' 'const at::Scalar&'
fastmod --extensions mm '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
```
Function pointers are not codemoded. Here is an incomplete list:
```
# Cover case: using index_fill_fn = void(*)(TensorIterator & iter, int64_t dim, int64_t self_dim_size, int64_t self_dim_stride, Scalar source);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
# Cover case: using softplus_fn = void (*)(TensorIterator&, Scalar, Scalar);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions cpp '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)optional<Scalar>([, \)])' '${1}const optional<Scalar>&${2}'
```
Some corner cases needs to be manually fixed.
ghstack-source-id: 123970306
Test Plan: Imported from OSS
Reviewed By: smessmer
Differential Revision: D26904445
fbshipit-source-id: 8d8a002af4b5125f153a32f03c6956be7ae5671d
2021-03-16 06:13:12 +00:00
|
|
|
elif str(t.elem) == 'Scalar':
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, ArrayRefCType(BaseCType(scalarT)))
|
2020-08-31 15:58:32 +00:00
|
|
|
elif str(t.elem) == 'Dimname':
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, BaseCType(dimnameListT))
|
Making ops c10-full: list of optional tensors (#49138)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49138
See for details: https://fb.quip.com/QRtJAin66lPN
We need to model optional types explicitly, mostly for schema inference. So we cannot pass a `Tensor?[]` as `ArrayRef<Tensor>`, instead we need to pass it as an optional type. This PR changes it to `torch::List<c10::optional<Tensor>>`. It also makes the ops c10-full that were blocked by this.
## Backwards Compatibility
- This should not break the Python API because the representation in Python is the same and python_arg_parser just transforms the python list into a `List<optional<Tensor>>` instead of into a `List<Tensor>`.
- This should not break serialized models because there's some logic that allows loading a serialized `List<Tensor>` as `List<optional<Tensor>>`, see https://github.com/pytorch/pytorch/pull/49138/files#diff-9315f5dd045f47114c677174dcaa2f982721233eee1aa19068a42ff3ef775315R57
- This will break backwards compatibility for the C++ API. There is no implicit conversion from `ArrayRef<Tensor>` (which was the old argument type) to `List<optional<Tensor>>`. One common call pattern is `tensor.index({indices_tensor})`, where indices_tensor is another `Tensor`, and that will continue working because the `{}` initializer_list constructor for `List<optional<Tensor>>` can take `Tensor` elements that are implicitly converted to `optional<Tensor>`, but another common call pattern was `tensor.index(indices_tensor)`, where previously, the `Tensor` got implicitly converted to an `ArrayRef<Tensor>`, and to implicitly convert `Tensor -> optional<Tensor> -> List<optional<Tensor>>` would be two implicit conversions. C++ doesn't allow chaining. two implicit conversions. So those call sites have to be rewritten to `tensor.index({indices_tensor})`.
ghstack-source-id: 119269131
Test Plan:
## Benchmarks (C++ instruction counts):
### Forward
#### Script
```py
from torch.utils.benchmark import Timer
counts = Timer(
stmt="""
auto t = {{op call to measure}};
""",
setup="""
using namespace torch::indexing;
auto x = torch::ones({4, 4, 4});
""",
language="cpp",
).collect_callgrind(number=1_000)
print(counts)
```
#### Results
| Op call |before |after |delta | |
|------------------------------------------------------------------------|---------|--------|-------|------|
|x[0] = 1 |11566015 |11566015|0 |0.00% |
|x.index({0}) |6807019 |6801019 |-6000 |-0.09%|
|x.index({0, 0}) |13529019 |13557019|28000 |0.21% |
|x.index({0, 0, 0}) |10677004 |10692004|15000 |0.14% |
|x.index({"..."}) |5512015 |5506015 |-6000 |-0.11%|
|x.index({Slice(None, None, None)}) |6866016 |6936016 |70000 |1.02% |
|x.index({None}) |8554015 |8548015 |-6000 |-0.07%|
|x.index({false}) |22400000 |22744000|344000 |1.54% |
|x.index({true}) |27624088 |27264393|-359695|-1.30%|
|x.index({"...", 0, true, Slice(1, None, 2), torch::tensor({1, 2})})|123472000|123463306|-8694|-0.01%|
### Autograd
#### Script
```py
from torch.utils.benchmark import Timer
counts = Timer(
stmt="""
auto t = {{op call to measure}};
""",
setup="""
using namespace torch::indexing;
auto x = torch::ones({4, 4, 4}, torch::requires_grad());
""",
language="cpp",
).collect_callgrind(number=1_000)
print(counts)
```
Note: the script measures the **forward** path of an op call with autograd enabled (i.e. calls into VariableType). It does not measure the backward path.
#### Results
| Op call |before |after |delta | |
|------------------------------------------------------------------------|---------|--------|-------|------|
|x.index({0}) |14839019|14833019|-6000| 0.00% |
|x.index({0, 0}) |28342019|28370019|28000| 0.00% |
|x.index({0, 0, 0}) |24434004|24449004|15000| 0.00% |
|x.index({"..."}) |12773015|12767015|-6000| 0.00% |
|x.index({Slice(None, None, None)}) |14837016|14907016|70000| 0.47% |
|x.index({None}) |15926015|15920015|-6000| 0.00% |
|x.index({false}) |36958000|37477000|519000| 1.40% |
|x.index({true}) |41971408|42426094|454686| 1.08% |
|x.index({"...", 0, true, Slice(1, None, 2), torch::tensor({1, 2})}) |168184392|164545682|-3638710| -2.16% |
Reviewed By: bhosmer
Differential Revision: D25454632
fbshipit-source-id: 28ab0cffbbdbdff1c40b4130ca62ee72f981b76d
2021-01-04 13:01:02 +00:00
|
|
|
elif str(t.elem) == 'Tensor?':
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, ConstRefCType(ListCType(OptionalCType(BaseCType(tensorT)))))
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
elem = argumenttype_type(t.elem, mutable=mutable, binds=binds)
|
2021-04-16 18:40:22 +00:00
|
|
|
return NamedCType(binds, ArrayRefCType(elem.type))
|
2020-08-31 15:58:32 +00:00
|
|
|
else:
|
|
|
|
|
raise AssertionError(f"unrecognized type {repr(t)}")
|
|
|
|
|
|
|
|
|
|
# Translate a JIT argument into its C++ type
|
2021-04-16 18:40:22 +00:00
|
|
|
def argument_type(a: Argument, *, binds: ArgName) -> NamedCType:
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
return argumenttype_type(a.type, mutable=a.is_write, binds=binds)
|
2020-08-31 15:58:32 +00:00
|
|
|
|
|
|
|
|
# Translation of a (non-multi) return type from JIT to C++
|
2021-04-16 18:40:22 +00:00
|
|
|
# N.B: returntype_type returns a CType, not a NamedCType.
|
|
|
|
|
# This is mostly because of the mismatch between return types and return names.
|
|
|
|
|
# e.g. a function with a return type of 'void' has 0 return names,
|
|
|
|
|
# and a function with a return type of 'std::tuple' has >1 return name.
|
2021-04-16 18:40:22 +00:00
|
|
|
def returntype_type(t: Type, *, mutable: bool) -> CType:
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
# placeholder is ignored
|
|
|
|
|
r = valuetype_type(t, binds="__placeholder__")
|
2020-08-31 15:58:32 +00:00
|
|
|
if r is not None:
|
2021-04-16 18:40:22 +00:00
|
|
|
return r.type
|
2020-08-31 15:58:32 +00:00
|
|
|
|
|
|
|
|
if isinstance(t, BaseType):
|
|
|
|
|
if t.name == BaseTy.Tensor:
|
|
|
|
|
if mutable:
|
2021-04-16 18:40:22 +00:00
|
|
|
return MutRefCType(BaseCType(tensorT))
|
2020-08-31 15:58:32 +00:00
|
|
|
else:
|
2021-04-16 18:40:22 +00:00
|
|
|
return BaseCType(tensorT)
|
Pass Scalar by reference (#53583)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53583
`Scalar` takes 32 bytes due to `c10::complex<double>`
requires aligning to 16 bytes. Passing Scalar by reference
shows about 1% improvements on instruction count.
All the changes in this commit are codemoded except for
the following 4 files (which code-gen signatures):
```
tools/codegen/api/cpp.py
tools/codegen/api/native.py
tools/codegen/api/structured.py
caffe2/contrib/aten/gen_op.py
```
# Codemode
## Main Step
For the codemod part, here is the main command used:
```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)optional<Scalar> (\w+)' '${1}const optional<Scalar>& ${2}'
```
As you can tell, it codemods both `Scalar` and `optional<Scalar>`. Apply these commands iteratively until reaching a fix-point (since one method signature might contain multiple `Scalar` parameter).
In retrospect, excluding `thrid_party` and `torch/csrc/jit` would be a good idea. (I revert it manually later, see https://github.com/pytorch/pytorch/pull/53479 as an reference).
## Pre-Step
Prior to applying the main command, as some `Scalar` are presented as `at::Scalar` or `c10::Scalar`, so I codemod some of them in advance. Here is an incomplete list:
```
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)at::Scalar (\w+)' '${1}const at::Scalar& ${2}'
fastmod --extensions h '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
fastmod --extensions cpp '([a-zA-Z_+]\([^)]*,?\s*)c10::optional<Scalar> (\w+)' '${1}const c10::optional<Scalar>& ${2}'
```
## Fixup
There are a couple of post codemod fixup. For example, `const Scalar` will be codemoded into `const const Scalar&`. `at:Scalar` will be codemoded into `at::const Scalar&` (if `Pre-step` is not done comprehensively). Here is an incomplete list:
```
fastmod --extensions cpp 'const const Scalar' 'const Scalar'
fastmod --extensions h 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod --extensions cpp 'const const c10::optional<Scalar>' 'const c10::optional<Scalar>'
fastmod 'at::const Scalar&' 'const at::Scalar&'
```
## Supplementary
`cu` and `mm` files also need to be codemoded, for example:
```
fastmod --extensions cu 'at::const Scalar&' 'const at::Scalar&'
fastmod --extensions mm '([a-zA-Z_+]\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
```
Function pointers are not codemoded. Here is an incomplete list:
```
# Cover case: using index_fill_fn = void(*)(TensorIterator & iter, int64_t dim, int64_t self_dim_size, int64_t self_dim_stride, Scalar source);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar (\w+)' '${1}const Scalar& ${2}'
# Cover case: using softplus_fn = void (*)(TensorIterator&, Scalar, Scalar);
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions cpp '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)Scalar([, \)])' '${1}const Scalar&${2}'
fastmod --extensions h '(void\s*\(\s*\*\s*\)\([^)]*,?\s*)optional<Scalar>([, \)])' '${1}const optional<Scalar>&${2}'
```
Some corner cases needs to be manually fixed.
ghstack-source-id: 123970306
Test Plan: Imported from OSS
Reviewed By: smessmer
Differential Revision: D26904445
fbshipit-source-id: 8d8a002af4b5125f153a32f03c6956be7ae5671d
2021-03-16 06:13:12 +00:00
|
|
|
elif t.name == BaseTy.Scalar:
|
2021-04-16 18:40:22 +00:00
|
|
|
return BaseCType(scalarT)
|
2020-08-31 15:58:32 +00:00
|
|
|
elif isinstance(t, ListType):
|
|
|
|
|
elem = returntype_type(t.elem, mutable=mutable)
|
|
|
|
|
assert t.size is None, f"fixed size list returns not supported: {t}"
|
2021-04-16 18:40:22 +00:00
|
|
|
return VectorCType(elem)
|
2020-08-31 15:58:32 +00:00
|
|
|
|
|
|
|
|
raise AssertionError(f"unrecognized return type {t}")
|
|
|
|
|
|
|
|
|
|
# Translation of a single return to its C++ type
|
2021-04-16 18:40:22 +00:00
|
|
|
def return_type(r: Return) -> CType:
|
2020-08-31 15:58:32 +00:00
|
|
|
return returntype_type(r.type, mutable=r.is_write)
|
|
|
|
|
|
|
|
|
|
# Translation of a full (possibly multi) return from JIT to its C++ type
|
2021-04-16 18:40:22 +00:00
|
|
|
def returns_type(rs: Sequence[Return]) -> CType:
|
2020-08-31 15:58:32 +00:00
|
|
|
if len(rs) == 0:
|
2021-04-16 18:40:22 +00:00
|
|
|
return BaseCType(voidT)
|
2020-08-31 15:58:32 +00:00
|
|
|
elif len(rs) == 1:
|
|
|
|
|
return return_type(rs[0])
|
|
|
|
|
else:
|
2021-04-16 18:40:22 +00:00
|
|
|
return TupleCType([return_type(r) for r in rs])
|
2020-08-31 15:58:32 +00:00
|
|
|
|
2021-04-21 18:06:26 +00:00
|
|
|
def return_names(f: NativeFunction) -> Sequence[str]:
|
2020-11-09 19:55:37 +00:00
|
|
|
returns: List[str] = []
|
|
|
|
|
for i, r in enumerate(f.func.returns):
|
|
|
|
|
# If we have an inplace function, the return argument is
|
|
|
|
|
# implicitly named self.
|
|
|
|
|
# TODO: Consider incorporating this into the data model
|
|
|
|
|
if f.func.name.name.inplace:
|
|
|
|
|
assert i == 0, "illegal inplace function with multiple returns"
|
|
|
|
|
name = 'self'
|
|
|
|
|
# If we are out function, the name is the name of the
|
|
|
|
|
# corresponding output function (r.name will get recorded
|
|
|
|
|
# in field_name later.)
|
|
|
|
|
elif f.func.is_out_fn():
|
2020-12-02 15:47:13 +00:00
|
|
|
name = f.func.arguments.out[i].name
|
2020-11-09 19:55:37 +00:00
|
|
|
# If the return argument is explicitly named...
|
|
|
|
|
elif r.name:
|
|
|
|
|
name_conflict = any(r.name == a.name for a in f.func.schema_order_arguments())
|
|
|
|
|
if name_conflict and not f.func.is_out_fn():
|
|
|
|
|
name = f'{r.name}_return'
|
|
|
|
|
else:
|
|
|
|
|
name = r.name
|
2021-04-21 18:06:26 +00:00
|
|
|
# If there is no explicit name, we just name the output result,
|
2020-11-09 19:55:37 +00:00
|
|
|
# unless it's a multi-return, in which case it's result0,
|
|
|
|
|
# result1, etc (zero-indexed)
|
|
|
|
|
else:
|
2021-04-21 18:06:26 +00:00
|
|
|
name = 'result' if len(f.func.returns) == 1 else f'result{i}'
|
2020-11-09 19:55:37 +00:00
|
|
|
returns.append(name)
|
|
|
|
|
return returns
|
|
|
|
|
|
2020-08-31 15:58:32 +00:00
|
|
|
JIT_TO_CPP_DEFAULT = {
|
|
|
|
|
'False': 'false',
|
|
|
|
|
'True': 'true',
|
|
|
|
|
'None': 'c10::nullopt', # UGH this one is type directed
|
|
|
|
|
'Mean': 'at::Reduction::Mean',
|
|
|
|
|
'[]': '{}',
|
|
|
|
|
'contiguous_format': 'MemoryFormat::Contiguous',
|
2020-10-02 11:06:27 +00:00
|
|
|
'long': 'at::kLong',
|
2020-08-31 15:58:32 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
# Convert a JIT default into C++ expression representing the default
|
|
|
|
|
def default_expr(d: str, t: Type) -> str:
|
|
|
|
|
if d == 'None' and str(t) == 'Tensor?':
|
|
|
|
|
return '{}'
|
2020-10-07 04:51:12 +00:00
|
|
|
if isinstance(t, BaseType) and t.name is BaseTy.str:
|
|
|
|
|
# Schema allows single quotes but C++ needs double
|
|
|
|
|
if len(d) >= 2 and d[0] == "'" and d[-1] == "'":
|
|
|
|
|
s = ''
|
|
|
|
|
i = 1
|
|
|
|
|
while i + 1 < len(d):
|
|
|
|
|
if d[i] != '\\':
|
|
|
|
|
if d[i] == '"':
|
|
|
|
|
s += '\\"'
|
|
|
|
|
else:
|
|
|
|
|
s += d[i]
|
|
|
|
|
i += 1
|
|
|
|
|
else:
|
|
|
|
|
if d[i + 1] == "'":
|
|
|
|
|
s += "'"
|
|
|
|
|
else:
|
|
|
|
|
s += d[i:i + 2]
|
|
|
|
|
i += 2
|
|
|
|
|
|
|
|
|
|
return f'"{s}"'
|
2020-10-08 05:25:38 +00:00
|
|
|
|
|
|
|
|
if isinstance(t, OptionalType):
|
|
|
|
|
if d == 'None':
|
|
|
|
|
return 'c10::nullopt'
|
|
|
|
|
|
|
|
|
|
return default_expr(d, t.elem)
|
|
|
|
|
|
|
|
|
|
if isinstance(t, ListType):
|
|
|
|
|
if (d.startswith('[') and d.endswith(']')):
|
|
|
|
|
return '{' + d[1:-1] + '}'
|
|
|
|
|
elif t.size is None:
|
|
|
|
|
# NOTE: Sized lists can have scalar defaults
|
|
|
|
|
raise ValueError(f"Expected a list default '[...]' but found: '{d}'")
|
|
|
|
|
|
2020-08-31 15:58:32 +00:00
|
|
|
return JIT_TO_CPP_DEFAULT.get(d, d)
|
|
|
|
|
|
|
|
|
|
# Convert an argument into its C++ API form
|
Rewrite implementation of faithful cpp signatures (#45890)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45890
This rewrite is as per my comments at https://github.com/pytorch/pytorch/pull/44087#issuecomment-701664506
I did the rewrite by reverting #44087 and then reimplementing it on top.
You may find it easier to review by diffing against master with only #44087
reverted.
There are two main ideas.
First, we now factor cpp argument processing into two phases operating
on three representations of data:
1. `FunctionSchema` - this is the source from native_functions.yaml
2. `Union[Argument, ThisArgument, TensorOptionsArgument]` - this is
the arguments after doing some basic semantic analysis to group
them (for TensorOptions) or identify the this argument (if this
is a method). There is only ever one of these per functions.
3. `Union[CppArgument, CppThisArgument, CppTensorOptionsArgument]` -
this is the arguments after we've elaborated them to C++. There
may be multiple of these per actual C++ signature.
You can think of (2) as common processing, whereas (3) bakes in specific
assumptions about whether or not you have a faithful or non-faithful
signature.
Second, we now have CppSignature and CppSignatureGroup representing
the *total* public C++ API signature. So those dataclasses are what
know how to render definitions/declarations, and you no longer have
to manually type it out in the Functions/TensorMethods codegen.
Here is an exhaustive accounting of the changes.
tools.codegen.api.types
- CppSignature and CppSignatureGroup got moved to tools.codegen.api.types
- Add new CppThisArgument and CppTensorOptionsArguments (modeled off
of ThisArgument and TensorOptionsArguments) so that we can retain
high level semantic structure even after elaborating terms with C++
API information. Once this is done, we can refine
CppArgument.argument to no longer contain a ThisArgument (ThisArgument
is always translated to CppThisArgument. Note that this doesn't
apply to TensorOptionsArguments, as those may be expanded or not
expanded, and so you could get a single CppArgument for 'options')
- Add no_default() functional mutator to easily remove default arguments
from CppArgument and friends
- Add an explicit_arguments() method to CppArgument and friends to
extract (flat) argument list that must be explicitly written in the signature.
This is everything except (Cpp)ThisArgument, and is also convenient
when you don't care about the extra structure of
CppTensorOptionsArguments
tools.codegen.api.cpp
- group_arguments is back, and it doesn't send things directly to a
CppSignatureGroup; instead, it moves us from representation (1) to (2)
(perhaps it should live in model). Here I changed my mind from my
PR comment; I discovered it was not necessary to do classification at
grouping time, and it was simpler and easier to do it later.
- argument got split into argument_not_this/argument/argument_faithful.
argument and argument_faithful are obvious enough what they do,
and I needed argument_not_this as a more refined version of argument
so that I could get the types to work out on TensorOptionsArguments
tools.codegen.api.dispatcher
- Here we start seeing the payoff. The old version of this code had a
"scatter" mode and a "gather" mode. We don't need that anymore:
cppargument_exprs is 100% type-directed via the passed in cpp
arguments. I am able to write the functions without any reference
to use_c10_dispatcher
tools.codegen.gen
- Instead of having exprs_str and types_str functions, I moved these to
live directly on CppSignature, since it seemed pretty logical.
- The actual codegen for TensorMethods/Functions is greatly simplified,
since (1) all of the heavy lifting is now happening in
CppSignature(Group) construction, and (2) I don't need to proxy one
way or another, the new dispatcher translation code is able to handle
both cases no problem. There is a little faffing about with ordering
to reduce the old and new diff which could be removed afterwards.
Here are codegen diffs. For use_c10_dispatcher: full:
```
+// aten::_cudnn_init_dropout_state(float dropout, bool train, int dropout_seed, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False) -> Tensor
Tensor _cudnn_init_dropout_state(double dropout, bool train, int64_t dropout_seed, const TensorOptions & options) {
- return _cudnn_init_dropout_state(dropout, train, dropout_seed, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
+ static auto op = c10::Dispatcher::singleton()
+ .findSchemaOrThrow("aten::_cudnn_init_dropout_state", "")
+ .typed<Tensor (double, bool, int64_t, c10::optional<ScalarType>, c10::optional<Layout>, c10::optional<Device>, c10::optional<bool>)>();
+ return op.call(dropout, train, dropout_seed, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
```
Otherwise:
```
+// aten::empty_meta(int[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, MemoryFormat? memory_format=None) -> Tensor
Tensor empty_meta(IntArrayRef size, c10::optional<ScalarType> dtype, c10::optional<Layout> layout, c10::optional<Device> device, c10::optional<bool> pin_memory, c10::optional<MemoryFormat> memory_format) {
- return empty_meta(size, TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory), memory_format);
+ static auto op = c10::Dispatcher::singleton()
+ .findSchemaOrThrow("aten::empty_meta", "")
+ .typed<Tensor (IntArrayRef, const TensorOptions &, c10::optional<MemoryFormat>)>();
+ return op.call(size, TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory), memory_format);
}
```
Things that I probably did not get right:
- The Union[Argument, TensorOptionsArguments, ThisArgument] and
the Cpp variants are starting to get a little unwieldy. Not sure if
this means I should add a supertype (or at the very least an
alias); in some cases I do purposely omit one of these from the Union
- Code may not necessarily live in the most logical files. There isn't
very much rhyme or reason to it.
- The fields on CppSignature. They're not very well constrained and
it will be better if people don't use them directly.
- Disambiguation. We should do this properly in #44087 and we don't
need special logic for deleting defaulting for faithful signatures;
there is a more general story here.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: smessmer
Differential Revision: D24144035
Pulled By: ezyang
fbshipit-source-id: a185f8bf9df8b44ca5718a7a44dac23cefd11c0a
2020-10-13 15:24:07 +00:00
|
|
|
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
def argument(
|
|
|
|
|
a: Union[Argument, TensorOptionsArguments, SelfArgument],
|
2021-01-04 19:51:28 +00:00
|
|
|
*, cpp_no_default_args: Set[str], method: bool, faithful: bool,
|
|
|
|
|
has_tensor_options: bool
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
) -> List[Binding]:
|
2021-01-04 19:51:28 +00:00
|
|
|
def sub_argument(a: Union[Argument, TensorOptionsArguments, SelfArgument]) -> List[Binding]:
|
|
|
|
|
return argument(
|
|
|
|
|
a, cpp_no_default_args=cpp_no_default_args, method=method, faithful=faithful,
|
|
|
|
|
has_tensor_options=has_tensor_options)
|
|
|
|
|
|
2020-08-31 15:58:32 +00:00
|
|
|
if isinstance(a, Argument):
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
binds: ArgName
|
|
|
|
|
if a.name == "memory_format" and has_tensor_options:
|
|
|
|
|
binds = SpecialArgName.possibly_redundant_memory_format
|
|
|
|
|
else:
|
|
|
|
|
binds = a.name
|
2021-01-04 19:51:28 +00:00
|
|
|
default: Optional[str] = None
|
|
|
|
|
if a.name not in cpp_no_default_args and a.default is not None:
|
|
|
|
|
default = default_expr(a.default, a.type)
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
return [Binding(
|
2021-04-16 18:40:22 +00:00
|
|
|
nctype=argument_type(a, binds=binds),
|
2020-08-31 15:58:32 +00:00
|
|
|
name=a.name,
|
2021-01-04 19:51:28 +00:00
|
|
|
default=default,
|
2020-08-31 15:58:32 +00:00
|
|
|
argument=a,
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
)]
|
2020-08-31 15:58:32 +00:00
|
|
|
elif isinstance(a, TensorOptionsArguments):
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
if faithful:
|
2021-01-04 19:51:28 +00:00
|
|
|
return sub_argument(a.dtype) + sub_argument(a.layout) + \
|
|
|
|
|
sub_argument(a.device) + sub_argument(a.pin_memory)
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
else:
|
|
|
|
|
default = None
|
2021-01-04 19:51:28 +00:00
|
|
|
# Enforced by NativeFunction.__post_init__
|
|
|
|
|
assert 'options' not in cpp_no_default_args
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
if all(x.default == "None" for x in a.all()):
|
|
|
|
|
default = '{}'
|
|
|
|
|
elif a.dtype.default == "long":
|
|
|
|
|
default = 'at::kLong' # TODO: this is wrong
|
|
|
|
|
return [Binding(
|
2021-04-16 18:40:22 +00:00
|
|
|
nctype=NamedCType('options', BaseCType(tensorOptionsT)),
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
name='options',
|
|
|
|
|
default=default,
|
|
|
|
|
argument=a,
|
|
|
|
|
)]
|
|
|
|
|
elif isinstance(a, SelfArgument):
|
2020-12-11 02:12:21 +00:00
|
|
|
if method:
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
# Caller is responsible for installing implicit this in context!
|
|
|
|
|
return []
|
2020-12-11 02:12:21 +00:00
|
|
|
else:
|
2021-01-04 19:51:28 +00:00
|
|
|
return sub_argument(a.argument)
|
Rewrite implementation of faithful cpp signatures (#45890)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45890
This rewrite is as per my comments at https://github.com/pytorch/pytorch/pull/44087#issuecomment-701664506
I did the rewrite by reverting #44087 and then reimplementing it on top.
You may find it easier to review by diffing against master with only #44087
reverted.
There are two main ideas.
First, we now factor cpp argument processing into two phases operating
on three representations of data:
1. `FunctionSchema` - this is the source from native_functions.yaml
2. `Union[Argument, ThisArgument, TensorOptionsArgument]` - this is
the arguments after doing some basic semantic analysis to group
them (for TensorOptions) or identify the this argument (if this
is a method). There is only ever one of these per functions.
3. `Union[CppArgument, CppThisArgument, CppTensorOptionsArgument]` -
this is the arguments after we've elaborated them to C++. There
may be multiple of these per actual C++ signature.
You can think of (2) as common processing, whereas (3) bakes in specific
assumptions about whether or not you have a faithful or non-faithful
signature.
Second, we now have CppSignature and CppSignatureGroup representing
the *total* public C++ API signature. So those dataclasses are what
know how to render definitions/declarations, and you no longer have
to manually type it out in the Functions/TensorMethods codegen.
Here is an exhaustive accounting of the changes.
tools.codegen.api.types
- CppSignature and CppSignatureGroup got moved to tools.codegen.api.types
- Add new CppThisArgument and CppTensorOptionsArguments (modeled off
of ThisArgument and TensorOptionsArguments) so that we can retain
high level semantic structure even after elaborating terms with C++
API information. Once this is done, we can refine
CppArgument.argument to no longer contain a ThisArgument (ThisArgument
is always translated to CppThisArgument. Note that this doesn't
apply to TensorOptionsArguments, as those may be expanded or not
expanded, and so you could get a single CppArgument for 'options')
- Add no_default() functional mutator to easily remove default arguments
from CppArgument and friends
- Add an explicit_arguments() method to CppArgument and friends to
extract (flat) argument list that must be explicitly written in the signature.
This is everything except (Cpp)ThisArgument, and is also convenient
when you don't care about the extra structure of
CppTensorOptionsArguments
tools.codegen.api.cpp
- group_arguments is back, and it doesn't send things directly to a
CppSignatureGroup; instead, it moves us from representation (1) to (2)
(perhaps it should live in model). Here I changed my mind from my
PR comment; I discovered it was not necessary to do classification at
grouping time, and it was simpler and easier to do it later.
- argument got split into argument_not_this/argument/argument_faithful.
argument and argument_faithful are obvious enough what they do,
and I needed argument_not_this as a more refined version of argument
so that I could get the types to work out on TensorOptionsArguments
tools.codegen.api.dispatcher
- Here we start seeing the payoff. The old version of this code had a
"scatter" mode and a "gather" mode. We don't need that anymore:
cppargument_exprs is 100% type-directed via the passed in cpp
arguments. I am able to write the functions without any reference
to use_c10_dispatcher
tools.codegen.gen
- Instead of having exprs_str and types_str functions, I moved these to
live directly on CppSignature, since it seemed pretty logical.
- The actual codegen for TensorMethods/Functions is greatly simplified,
since (1) all of the heavy lifting is now happening in
CppSignature(Group) construction, and (2) I don't need to proxy one
way or another, the new dispatcher translation code is able to handle
both cases no problem. There is a little faffing about with ordering
to reduce the old and new diff which could be removed afterwards.
Here are codegen diffs. For use_c10_dispatcher: full:
```
+// aten::_cudnn_init_dropout_state(float dropout, bool train, int dropout_seed, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False) -> Tensor
Tensor _cudnn_init_dropout_state(double dropout, bool train, int64_t dropout_seed, const TensorOptions & options) {
- return _cudnn_init_dropout_state(dropout, train, dropout_seed, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
+ static auto op = c10::Dispatcher::singleton()
+ .findSchemaOrThrow("aten::_cudnn_init_dropout_state", "")
+ .typed<Tensor (double, bool, int64_t, c10::optional<ScalarType>, c10::optional<Layout>, c10::optional<Device>, c10::optional<bool>)>();
+ return op.call(dropout, train, dropout_seed, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
```
Otherwise:
```
+// aten::empty_meta(int[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, MemoryFormat? memory_format=None) -> Tensor
Tensor empty_meta(IntArrayRef size, c10::optional<ScalarType> dtype, c10::optional<Layout> layout, c10::optional<Device> device, c10::optional<bool> pin_memory, c10::optional<MemoryFormat> memory_format) {
- return empty_meta(size, TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory), memory_format);
+ static auto op = c10::Dispatcher::singleton()
+ .findSchemaOrThrow("aten::empty_meta", "")
+ .typed<Tensor (IntArrayRef, const TensorOptions &, c10::optional<MemoryFormat>)>();
+ return op.call(size, TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory), memory_format);
}
```
Things that I probably did not get right:
- The Union[Argument, TensorOptionsArguments, ThisArgument] and
the Cpp variants are starting to get a little unwieldy. Not sure if
this means I should add a supertype (or at the very least an
alias); in some cases I do purposely omit one of these from the Union
- Code may not necessarily live in the most logical files. There isn't
very much rhyme or reason to it.
- The fields on CppSignature. They're not very well constrained and
it will be better if people don't use them directly.
- Disambiguation. We should do this properly in #44087 and we don't
need special logic for deleting defaulting for faithful signatures;
there is a more general story here.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: smessmer
Differential Revision: D24144035
Pulled By: ezyang
fbshipit-source-id: a185f8bf9df8b44ca5718a7a44dac23cefd11c0a
2020-10-13 15:24:07 +00:00
|
|
|
else:
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
assert_never(a)
|
2020-10-02 11:06:27 +00:00
|
|
|
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
def arguments(
|
|
|
|
|
arguments: Arguments,
|
2021-01-04 19:51:28 +00:00
|
|
|
*, faithful: bool, method: bool, cpp_no_default_args: Set[str]
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
) -> List[Binding]:
|
|
|
|
|
args: List[Union[Argument, TensorOptionsArguments, SelfArgument]] = []
|
|
|
|
|
if faithful:
|
|
|
|
|
args.extend(arguments.non_out)
|
|
|
|
|
args.extend(arguments.out)
|
Rewrite implementation of faithful cpp signatures (#45890)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45890
This rewrite is as per my comments at https://github.com/pytorch/pytorch/pull/44087#issuecomment-701664506
I did the rewrite by reverting #44087 and then reimplementing it on top.
You may find it easier to review by diffing against master with only #44087
reverted.
There are two main ideas.
First, we now factor cpp argument processing into two phases operating
on three representations of data:
1. `FunctionSchema` - this is the source from native_functions.yaml
2. `Union[Argument, ThisArgument, TensorOptionsArgument]` - this is
the arguments after doing some basic semantic analysis to group
them (for TensorOptions) or identify the this argument (if this
is a method). There is only ever one of these per functions.
3. `Union[CppArgument, CppThisArgument, CppTensorOptionsArgument]` -
this is the arguments after we've elaborated them to C++. There
may be multiple of these per actual C++ signature.
You can think of (2) as common processing, whereas (3) bakes in specific
assumptions about whether or not you have a faithful or non-faithful
signature.
Second, we now have CppSignature and CppSignatureGroup representing
the *total* public C++ API signature. So those dataclasses are what
know how to render definitions/declarations, and you no longer have
to manually type it out in the Functions/TensorMethods codegen.
Here is an exhaustive accounting of the changes.
tools.codegen.api.types
- CppSignature and CppSignatureGroup got moved to tools.codegen.api.types
- Add new CppThisArgument and CppTensorOptionsArguments (modeled off
of ThisArgument and TensorOptionsArguments) so that we can retain
high level semantic structure even after elaborating terms with C++
API information. Once this is done, we can refine
CppArgument.argument to no longer contain a ThisArgument (ThisArgument
is always translated to CppThisArgument. Note that this doesn't
apply to TensorOptionsArguments, as those may be expanded or not
expanded, and so you could get a single CppArgument for 'options')
- Add no_default() functional mutator to easily remove default arguments
from CppArgument and friends
- Add an explicit_arguments() method to CppArgument and friends to
extract (flat) argument list that must be explicitly written in the signature.
This is everything except (Cpp)ThisArgument, and is also convenient
when you don't care about the extra structure of
CppTensorOptionsArguments
tools.codegen.api.cpp
- group_arguments is back, and it doesn't send things directly to a
CppSignatureGroup; instead, it moves us from representation (1) to (2)
(perhaps it should live in model). Here I changed my mind from my
PR comment; I discovered it was not necessary to do classification at
grouping time, and it was simpler and easier to do it later.
- argument got split into argument_not_this/argument/argument_faithful.
argument and argument_faithful are obvious enough what they do,
and I needed argument_not_this as a more refined version of argument
so that I could get the types to work out on TensorOptionsArguments
tools.codegen.api.dispatcher
- Here we start seeing the payoff. The old version of this code had a
"scatter" mode and a "gather" mode. We don't need that anymore:
cppargument_exprs is 100% type-directed via the passed in cpp
arguments. I am able to write the functions without any reference
to use_c10_dispatcher
tools.codegen.gen
- Instead of having exprs_str and types_str functions, I moved these to
live directly on CppSignature, since it seemed pretty logical.
- The actual codegen for TensorMethods/Functions is greatly simplified,
since (1) all of the heavy lifting is now happening in
CppSignature(Group) construction, and (2) I don't need to proxy one
way or another, the new dispatcher translation code is able to handle
both cases no problem. There is a little faffing about with ordering
to reduce the old and new diff which could be removed afterwards.
Here are codegen diffs. For use_c10_dispatcher: full:
```
+// aten::_cudnn_init_dropout_state(float dropout, bool train, int dropout_seed, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=False) -> Tensor
Tensor _cudnn_init_dropout_state(double dropout, bool train, int64_t dropout_seed, const TensorOptions & options) {
- return _cudnn_init_dropout_state(dropout, train, dropout_seed, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
+ static auto op = c10::Dispatcher::singleton()
+ .findSchemaOrThrow("aten::_cudnn_init_dropout_state", "")
+ .typed<Tensor (double, bool, int64_t, c10::optional<ScalarType>, c10::optional<Layout>, c10::optional<Device>, c10::optional<bool>)>();
+ return op.call(dropout, train, dropout_seed, optTypeMetaToScalarType(options.dtype_opt()), options.layout_opt(), options.device_opt(), options.pinned_memory_opt());
}
```
Otherwise:
```
+// aten::empty_meta(int[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, MemoryFormat? memory_format=None) -> Tensor
Tensor empty_meta(IntArrayRef size, c10::optional<ScalarType> dtype, c10::optional<Layout> layout, c10::optional<Device> device, c10::optional<bool> pin_memory, c10::optional<MemoryFormat> memory_format) {
- return empty_meta(size, TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory), memory_format);
+ static auto op = c10::Dispatcher::singleton()
+ .findSchemaOrThrow("aten::empty_meta", "")
+ .typed<Tensor (IntArrayRef, const TensorOptions &, c10::optional<MemoryFormat>)>();
+ return op.call(size, TensorOptions().dtype(dtype).layout(layout).device(device).pinned_memory(pin_memory), memory_format);
}
```
Things that I probably did not get right:
- The Union[Argument, TensorOptionsArguments, ThisArgument] and
the Cpp variants are starting to get a little unwieldy. Not sure if
this means I should add a supertype (or at the very least an
alias); in some cases I do purposely omit one of these from the Union
- Code may not necessarily live in the most logical files. There isn't
very much rhyme or reason to it.
- The fields on CppSignature. They're not very well constrained and
it will be better if people don't use them directly.
- Disambiguation. We should do this properly in #44087 and we don't
need special logic for deleting defaulting for faithful signatures;
there is a more general story here.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: smessmer
Differential Revision: D24144035
Pulled By: ezyang
fbshipit-source-id: a185f8bf9df8b44ca5718a7a44dac23cefd11c0a
2020-10-13 15:24:07 +00:00
|
|
|
else:
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
args.extend(arguments.out)
|
|
|
|
|
args.extend(arguments.non_out)
|
|
|
|
|
return [
|
|
|
|
|
r.no_default() if faithful else r for a in args
|
2021-01-04 19:51:28 +00:00
|
|
|
for r in argument(
|
|
|
|
|
a, faithful=faithful, method=method,
|
|
|
|
|
has_tensor_options=arguments.tensor_options is not None,
|
|
|
|
|
cpp_no_default_args=cpp_no_default_args)
|
Introduce tools.codegen.api.translate (#49122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49122
cpparguments_exprs has induced a lot of head scratching in many recent PRs for how to structure the code in a good way. This PR eliminates the old algorithm for an entirely new algorithm inspired by logic programming. The net result is shorter, cleaner and should be more robust to future changes.
This PR is a bit of a whopper. Here is the order to review it.
- tools/codegen/api/types.py
- Deleted CppArgument, CppArgumentPackIface (and subclasses), CppExpr, DispatcherExpr, DispatcherArgument, NativeExpr, NativeArgument, MetaArgument. All things previously called XArgument are now Binding. All things previously called XExpr are now Expr. I deleted the `__str__` implementation on Binding and fixed all call sites not to use it. On Binding, I renamed `str_no_default` and `str_default` to `defn` and `decl` for better symmetry with the corresponding signature concepts, although I'm open to naming them back to their original versions.
- Obviously, things are less type safe without the class distinctions. So I introduce a new ADT called CType. CType represents the *semantic C++ type* of a binding: it is both the C++ type (e.g., `const Tensor&`) as well as the argument name that specifies what the binding denotes (e.g., `other`). Every binding now records its CType. The key observation here is that you don't actually care if a given expression is from the cpp or dispatcher or native API; what you care is having enough information to know what the expression means, so you can use it appropriately. CType has this information. For the most part, ArgNames are just the string names of the arguments as you see them in JIT schema, but there is one case (`possibly_redundant_memory_format`) where we encode a little extra information. Unlike the plain strings we previously used to represent C++ types, CType have a little bit of structure around optional and references, because the translation code needs to work around these concepts.
- I took the opportunity to kill all of the private fields like `_arguments` and `_returns_type` (since the argument types don't make sense anymore). Everything is computed for you on the fly. If this is a perf problem in codegen we can start using `cached_property` decorator.
- All of the heavy lifting in CppSignature.argument_packs has been moved to the cpp module. We'll head over there next. Similarly, all of the exprs methods are now calling translate, the new functionality which we haven't gotten to yet
- tools/codegen/api/cpp.py
- We refactor all of the type computation functions to return CType instead of str. Because CTypes need to know the denotation, there is a new `binds: ArgName` argument to most functions that provides the denotation, so we can slot it in. (An alternative would have been to construct CTypes without denotations and then fill them in post-facto, but I didn't do it this way. One downside is there are some places where I need a CType without denotation, so I fill these in with `__placeholder__` whenever this happens).
- `argument` and `arguments` are now extremely simple. There is no more Pack business, just produce one or more Bindings. The one thing of note is that when both a `memory_format` and `options` are in scope, we label the memory format as `possibly_redundant_memory_format`. This will be used in translation
- tools/codegen/api/dispatcher.py and tools/codegen/api/native.py - same deal as cpp.py. One thing is that `cpparguments_exprs` is deleted; that is in the translator
- tools/codegen/api/translate.py - the translator! It uses a very simple backwards deduction engine to work out how to fill in the arguments of functions. There are comments in the file that explain how it works.
- Everything else: just some small call site tweaks for places when I changed API.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: ljk53
Differential Revision: D25455887
Pulled By: ezyang
fbshipit-source-id: 90dc58d420d4cc49281aa8647987c69f3ed42fa6
2020-12-17 00:15:52 +00:00
|
|
|
]
|