torch
=====
.. automodule:: torch
.. currentmodule:: torch
Tensors
-------
.. autosummary::
:toctree: generated
:nosignatures:
is_tensor
is_storage
is_complex
is_conj
is_floating_point
is_nonzero
set_default_dtype
get_default_dtype
set_default_device
get_default_device
set_default_tensor_type
numel
set_printoptions
set_flush_denormal
.. _tensor-creation-ops:
Creation Ops
~~~~~~~~~~~~
.. note::
Random sampling creation ops are listed under :ref:`random-sampling` and
include:
:func:`torch.rand`
:func:`torch.rand_like`
:func:`torch.randn`
:func:`torch.randn_like`
:func:`torch.randint`
:func:`torch.randint_like`
:func:`torch.randperm`
You may also use :func:`torch.empty` with the :ref:`inplace-random-sampling`
methods to create :class:`torch.Tensor` s with values sampled from a broader
range of distributions.
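
As a quick, minimal sketch (values are random, so results differ per run), the two approaches can be mixed::

    # Random-sampling creation ops build a new tensor directly, while
    # torch.empty + an in-place sampling method draws from a broader
    # set of distributions.
    u = torch.rand(2, 3)                    # uniform samples in [0, 1)
    g = torch.randn(2, 3)                   # standard normal samples
    e = torch.empty(2, 3).exponential_()    # in-place draw from Exp(1)
    c = torch.empty(2, 3).cauchy_()         # in-place draw from Cauchy(0, 1)
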
.. autosummary::
:toctree: generated
:nosignatures:
tensor
sparse_coo_tensor
sparse_csr_tensor
sparse_csc_tensor
sparse_bsr_tensor
sparse_bsc_tensor
asarray
as_tensor
as_strided
from_file
from_numpy
from_dlpack
frombuffer
zeros
zeros_like
ones
ones_like
arange
range
linspace
logspace
eye
empty
empty_like
empty_strided
full
full_like
quantize_per_tensor
quantize_per_channel
dequantize
complex
polar
heaviside
.. _indexing-slicing-joining:
Indexing, Slicing, Joining, Mutating Ops
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated
:nosignatures:
adjoint
argwhere
cat
concat
concatenate
conj
chunk
dsplit
column_stack
dstack
gather
hsplit
hstack
index_add
index_copy
index_reduce
index_select
masked_select
movedim
moveaxis
narrow
narrow_copy
nonzero
permute
reshape
row_stack
select
scatter
diagonal_scatter
select_scatter
slice_scatter
scatter_add
scatter_reduce
split
squeeze
stack
swapaxes
swapdims
t
take
take_along_dim
tensor_split
tile
transpose
unbind
unravel_index
unsqueeze
vsplit
vstack
where
.. _accelerators:
Accelerators
----------------------------------
Within the PyTorch repo, we define an "Accelerator" as a :class:`torch.device` that is being used
alongside a CPU to speed up computation. These devices use an asynchronous execution scheme,
using :class:`torch.Stream` and :class:`torch.Event` as their main way to perform synchronization.
We also assume that only one such accelerator can be available at once on a given host. This allows
us to use the current accelerator as the default device for relevant concepts such as pinned memory,
Stream device_type, FSDP, etc.
As of today, accelerator devices are (in no particular order) :doc:`"CUDA" <cuda>`, :doc:`"MTIA" <mtia>`,
:doc:`"XPU" <xpu>`, and PrivateUse1 (many devices not in the PyTorch repo itself).
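
For illustration, a minimal sketch using the CUDA backend (assuming a CUDA-capable accelerator is present; other accelerators expose analogous stream and event objects)::

    if torch.cuda.is_available():
        s = torch.cuda.Stream()
        done = torch.cuda.Event()
        with torch.cuda.stream(s):          # queue work on a side stream
            x = torch.randn(1024, 1024, device="cuda")
            y = x @ x
            done.record(s)                  # mark the point where this work finishes
        done.synchronize()                  # block the host until that point is reached
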
.. autosummary::
:toctree: generated
:nosignatures:
Stream
Event
.. _generators:
Generators
----------------------------------
.. autosummary::
:toctree: generated
:nosignatures:
Generator
.. _random-sampling:
Random sampling
----------------------------------
.. autosummary::
:toctree: generated
:nosignatures:
seed
manual_seed
initial_seed
get_rng_state
set_rng_state
.. autoattribute:: torch.default_generator
:annotation: Returns the default CPU torch.Generator
.. The following doesn't actually seem to exist.
https://github.com/pytorch/pytorch/issues/27780
.. autoattribute:: torch.cuda.default_generators
:annotation: If cuda is available, returns a tuple of default CUDA torch.Generator-s.
The number of CUDA torch.Generator-s returned is equal to the number of
GPUs available in the system.
.. autosummary::
:toctree: generated
:nosignatures:
bernoulli
multinomial
normal
poisson
rand
rand_like
randint
randint_like
randn
randn_like
randperm
.. _inplace-random-sampling:
In-place random sampling
~~~~~~~~~~~~~~~~~~~~~~~~
There are a few more in-place random sampling functions defined on Tensors as well. Click through to refer to their documentation (a short usage sketch follows the list):
- :func:`torch.Tensor.bernoulli_` - in-place version of :func:`torch.bernoulli`
- :func:`torch.Tensor.cauchy_` - numbers drawn from the Cauchy distribution
- :func:`torch.Tensor.exponential_` - numbers drawn from the exponential distribution
- :func:`torch.Tensor.geometric_` - elements drawn from the geometric distribution
- :func:`torch.Tensor.log_normal_` - samples from the log-normal distribution
- :func:`torch.Tensor.normal_` - in-place version of :func:`torch.normal`
- :func:`torch.Tensor.random_` - numbers sampled from the discrete uniform distribution
- :func:`torch.Tensor.uniform_` - numbers sampled from the continuous uniform distribution
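
A minimal sketch of how these methods are typically used (each call below mutates ``t`` in place and also returns it)::

    t = torch.empty(3, 4)
    t.uniform_(0, 1)               # fill with samples from U(0, 1)
    t.normal_(mean=0.0, std=2.0)   # overwrite with samples from N(0, 2**2)
    t.bernoulli_(0.25)             # overwrite with 0/1 draws, P(1) = 0.25
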
Quasi-random sampling
~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated
:nosignatures:
:template: sobolengine.rst
quasirandom.SobolEngine
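
For example, a short sketch of drawing a scrambled Sobol sequence (the ``seed`` value is arbitrary)::

    from torch.quasirandom import SobolEngine

    engine = SobolEngine(dimension=3, scramble=True, seed=0)
    points = engine.draw(8)   # 8 quasi-random points in [0, 1)^3, shape (8, 3)
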
Serialization
----------------------------------
.. autosummary::
:toctree: generated
:nosignatures:
save
load
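
A minimal round-trip sketch (the file name is arbitrary; ``weights_only=True`` restricts loading to plain tensors and other safe types on recent releases)::

    t = torch.arange(4.0)
    torch.save(t, "example.pt")                           # serialize to disk
    loaded = torch.load("example.pt", weights_only=True)  # deserialize
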
Parallelism
----------------------------------
.. autosummary::
:toctree: generated
:nosignatures:
get_num_threads
set_num_threads
get_num_interop_threads
set_num_interop_threads
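
For example, a small sketch of inspecting and adjusting the intra-op thread pool (inter-op threads must be set before any inter-op parallel work has run)::

    n = torch.get_num_threads()           # threads used inside a single op (e.g. one matmul)
    torch.set_num_threads(max(1, n // 2))
    torch.get_num_interop_threads()       # threads used across independent ops
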
.. _torch-rst-local-disable-grad:
Locally disabling gradient computation
--------------------------------------
The context managers :func:`torch.no_grad`, :func:`torch.enable_grad`, and
:func:`torch.set_grad_enabled` are helpful for locally disabling and enabling
gradient computation. See :ref:`locally-disable-grad` for more details on
their usage. These context managers are thread local, so they won't
work if you send work to another thread using the ``threading`` module, etc.
Examples::
>>> x = torch.zeros(1, requires_grad=True)
>>> with torch.no_grad():
... y = x * 2
>>> y.requires_grad
False
>>> is_train = False
>>> with torch.set_grad_enabled(is_train):
... y = x * 2
>>> y.requires_grad
False
>>> torch.set_grad_enabled(True) # this can also be used as a function
>>> y = x * 2
>>> y.requires_grad
True
>>> torch.set_grad_enabled(False)
>>> y = x * 2
>>> y.requires_grad
False
.. autosummary::
:toctree: generated
:nosignatures:
no_grad
enable_grad
autograd.grad_mode.set_grad_enabled
is_grad_enabled
autograd.grad_mode.inference_mode
is_inference_mode_enabled
Math operations
---------------
Constants
~~~~~~~~~~~~~~~~~~~~~~
======================================= ===========================================
``inf`` A floating-point positive infinity. Alias for :attr:`math.inf`.
``nan`` A floating-point "not a number" value. This value is not a legal number. Alias for :attr:`math.nan`.
======================================= ===========================================
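
For example::

    x = torch.tensor([1.0, torch.inf, torch.nan])
    torch.isinf(x)               # tensor([False,  True, False])
    torch.isnan(x)               # tensor([False, False,  True])
    torch.inf == float("inf")    # True
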
Pointwise Ops
~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated
:nosignatures:
abs
absolute
acos
arccos
acosh
arccosh
add
addcdiv
addcmul
angle
asin
arcsin
asinh
arcsinh
atan
arctan
atanh
arctanh
atan2
arctan2
bitwise_not
bitwise_and
bitwise_or
bitwise_xor
bitwise_left_shift
bitwise_right_shift
ceil
clamp
clip
conj_physical
copysign
cos
cosh
deg2rad
div
divide
digamma
erf
erfc
erfinv
exp
exp2
expm1
fake_quantize_per_channel_affine
fake_quantize_per_tensor_affine
fix
float_power
floor
floor_divide
fmod
frac
frexp
gradient
imag
ldexp
lerp
lgamma
log
log10
log1p
log2
logaddexp
logaddexp2
logical_and
logical_not
logical_or
logical_xor
logit
hypot
i0
igamma
igammac
mul
multiply
mvlgamma
nan_to_num
neg
negative
nextafter
polygamma
positive
pow
quantized_batch_norm
quantized_max_pool1d
quantized_max_pool2d
rad2deg
real
reciprocal
remainder
round
rsqrt
sigmoid
sign
sgn
signbit
sin
sinc
sinh
softmax
sqrt
square
sub
subtract
tan
tanh
true_divide
trunc
xlogy
Reduction Ops
~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated
:nosignatures:
argmax
argmin
amax
amin
aminmax
all
any
max
min
dist
logsumexp
mean
nanmean
median
nanmedian
mode
norm
nansum
prod
quantile
nanquantile
std
std_mean
sum
unique
unique_consecutive
var
var_mean
count_nonzero
Comparison Ops
~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated
:nosignatures:
allclose
argsort
eq
equal
ge
greater_equal
gt
greater
isclose
isfinite
isin
isinf
isposinf
isneginf
isnan
isreal
kthvalue
le
less_equal
lt
less
maximum
minimum
fmax
fmin
ne
not_equal
sort
topk
msort
Spectral Ops
~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated
:nosignatures:
stft
istft
bartlett_window
blackman_window
hamming_window
hann_window
kaiser_window
Other Operations
~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated
:nosignatures:
atleast_1d
atleast_2d
atleast_3d
bincount
block_diag
broadcast_tensors
broadcast_to
broadcast_shapes
bucketize
cartesian_prod
cdist
clone
combinations
corrcoef
cov
cross
cummax
cummin
cumprod
cumsum
diag
diag_embed
diagflat
diagonal
diff
einsum
flatten
flip
fliplr
flipud
kron
rot90
gcd
histc
histogram
histogramdd
meshgrid
lcm
logcumsumexp
ravel
renorm
repeat_interleave
roll
searchsorted
tensordot
trace
tril
tril_indices
triu
triu_indices
unflatten
vander
view_as_real
view_as_complex
resolve_conj
resolve_neg
BLAS and LAPACK Operations
~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autosummary::
:toctree: generated
:nosignatures:
addbmm
addmm
addmv
addr
baddbmm
bmm
chain_matmul
cholesky
cholesky_inverse
cholesky_solve
dot
geqrf
ger
inner
inverse
det
logdet
slogdet
lu
lu_solve
lu_unpack
matmul
matrix_power
matrix_exp
mm
mv
orgqr
ormqr
outer
pinverse
qr
svd
svd_lowrank
pca_lowrank
lobpcg
trapz
trapezoid
cumulative_trapezoid
triangular_solve
vdot
Foreach Operations
~~~~~~~~~~~~~~~~~~
.. warning::
This API is in beta and subject to future changes.
Forward-mode AD is not supported.
.. autosummary::
:toctree: generated
:nosignatures:
_foreach_abs
_foreach_abs_
_foreach_acos
_foreach_acos_
_foreach_asin
_foreach_asin_
_foreach_atan
_foreach_atan_
_foreach_ceil
_foreach_ceil_
_foreach_cos
_foreach_cos_
_foreach_cosh
_foreach_cosh_
_foreach_erf
_foreach_erf_
_foreach_erfc
_foreach_erfc_
_foreach_exp
_foreach_exp_
_foreach_expm1
_foreach_expm1_
_foreach_floor
_foreach_floor_
_foreach_log
_foreach_log_
_foreach_log10
_foreach_log10_
_foreach_log1p
_foreach_log1p_
_foreach_log2
_foreach_log2_
_foreach_neg
_foreach_neg_
_foreach_tan
_foreach_tan_
_foreach_sin
_foreach_sin_
_foreach_sinh
_foreach_sinh_
_foreach_round
_foreach_round_
_foreach_sqrt
_foreach_sqrt_
_foreach_lgamma
_foreach_lgamma_
_foreach_frac
_foreach_frac_
_foreach_reciprocal
_foreach_reciprocal_
_foreach_sigmoid
_foreach_sigmoid_
_foreach_trunc
_foreach_trunc_
_foreach_zero_
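
As a small sketch, each ``_foreach_*`` function takes a list (or tuple) of tensors and applies the op to every element in a single call::

    xs = [torch.randn(3), torch.randn(5)]
    ys = torch._foreach_abs(xs)    # returns a new sequence of tensors
    torch._foreach_abs_(xs)        # in-place variant: mutates each tensor in xs
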
Utilities
----------------------------------
.. autosummary::
:toctree: generated
:nosignatures:
compiled_with_cxx11_abi
result_type
can_cast
promote_types
use_deterministic_algorithms
are_deterministic_algorithms_enabled
is_deterministic_algorithms_warn_only_enabled
set_deterministic_debug_mode
get_deterministic_debug_mode
set_float32_matmul_precision
get_float32_matmul_precision
set_warn_always
get_device_module
is_warn_always_enabled
vmap
_assert
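
For example, the type-promotion helpers can be queried directly::

    torch.result_type(torch.ones(1, dtype=torch.int32), 1.0)   # torch.float32
    torch.promote_types(torch.float16, torch.int64)            # torch.float16
    torch.can_cast(torch.float64, torch.int32)                 # False
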
Symbolic Numbers
----------------
.. autoclass:: SymInt
:members:
.. autoclass:: SymFloat
:members:
.. autoclass:: SymBool
:members:
.. autosummary::
:toctree: generated
:nosignatures:
sym_float
sym_fresh_size
sym_int
sym_max
sym_min
sym_not
sym_ite
sym_sum
Export Path
-------------
.. autosummary::
:toctree: generated
:nosignatures:
.. warning::
This feature is a prototype and may have compatibility-breaking changes in the future.
export
generated/exportdb/index
Control Flow
------------
.. warning::
This feature is a prototype and may have compatibility-breaking changes in the future.
.. autosummary::
:toctree: generated
:nosignatures:
cond
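
A minimal sketch of :func:`torch.cond` (a prototype API, so the exact signature may change); the predicate selects which branch runs on the operands::

    def true_fn(x):
        return x.sin()

    def false_fn(x):
        return x.cos()

    x = torch.randn(4)
    out = torch.cond(x.sum() > 0, true_fn, false_fn, (x,))
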
Optimizations
-------------
.. autosummary::
:toctree: generated
:nosignatures:
compile
`torch.compile documentation <https://pytorch.org/docs/main/torch.compiler.html>`__
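
A minimal usage sketch; the first call compiles the function for the observed input shapes and later calls reuse the compiled artifact::

    def f(x):
        return torch.sin(x) + torch.cos(x)

    compiled_f = torch.compile(f)        # returns an optimized callable
    out = compiled_f(torch.randn(100))
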
Operator Tags
------------------------------------
.. autoclass:: Tag
:members:
.. Empty submodules added only for tracking.
.. py:module:: torch.contrib
.. py:module:: torch.utils.backcompat
.. This module is only used internally for ROCm builds.
.. py:module:: torch.utils.hipify
.. This module needs to be documented. Adding here in the meantime
.. for tracking purposes
.. py:module:: torch.utils.model_dump
.. py:module:: torch.utils.viz
.. py:module:: torch.functional
.. py:module:: torch.quasirandom
.. py:module:: torch.return_types
.. py:module:: torch.serialization
.. py:module:: torch.signal.windows.windows
.. py:module:: torch.sparse.semi_structured
.. py:module:: torch.storage
.. py:module:: torch.torch_version
.. py:module:: torch.types
.. py:module:: torch.version