pytorch/torch
Edward Z. Yang b6028acfa4 Add _assert_scalar and teach Inductor to codegen it (#114148)
Inductor codegen for `_assert_async` is currently disabled because we don't really understand how to codegen `scalar_to_tensor` on a Sympy expression. I initially tried to see if I could get this to work, but I ran into a weird problem involving stride sorting, so I decided to fix it properly by not going through a tensor at all.

So we introduce an `_assert_scalar` which takes a scalar as an argument, avoiding the need to turn a SymBool into a tensor before asserting on it. I also add `_functional_assert_scalar` for good luck, although this doesn't do anything right now because https://github.com/pytorch/pytorch/pull/104203 still hasn't been landed.

I need to customize the codegen for this operator, so I decided to implement it directly in Inductor, rather than trying to treat it as a generic ExternKernel. This leads to the new AssertScalar IR node, which is written carefully so that it doesn't get DCE'd by Inductor.
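For intuition, the difference between the two assert paths can be sketched in plain Python. This is a conceptual sketch only, not PyTorch's actual implementation; the function names and the list standing in for a tensor are illustrative:

```python
# Conceptual sketch, not PyTorch internals: contrasting an assert that
# routes through a tensor with one that stays on the scalar.

def assert_async_via_tensor(cond: bool) -> None:
    # _assert_async-style: the boolean is first wrapped into a tensor
    # (scalar_to_tensor), which is the step that made codegen awkward
    # for a Sympy expression.
    tensor_like = [cond]  # stand-in for scalar_to_tensor(cond)
    if not tensor_like[0]:
        raise RuntimeError("assert_async failed")

def assert_scalar(cond: bool, msg: str) -> None:
    # _assert_scalar-style: operate on the scalar directly, so generated
    # wrapper code can emit a plain `if not cond: raise ...` guard
    # without materializing a tensor.
    if not cond:
        raise RuntimeError(msg)

assert_scalar(2 + 2 == 4, "expected 2 + 2 == 4")
```

In generated wrapper code, the scalar form can then appear as an ordinary conditional raise on the symbolic expression, which is roughly why it is easier to codegen than the tensor round-trip.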

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114148
Approved by: https://github.com/jansel
2024-01-09 23:21:26 +00:00
_awaits
_C [AOTI] Add pybind for AOTIModelContainerRunnerCpu and AOTIModelContainerRunnerCuda (#116269) 2024-01-04 18:58:24 +00:00
_C_flatbuffer
_custom_op Allow functionalization to work with optional mutable (#114803) 2023-11-30 23:48:03 +00:00
_decomp Add decomp for pad_sequence (#116285) 2023-12-27 23:56:51 +00:00
_dispatch
_dynamo Add _assert_scalar and teach Inductor to codegen it (#114148) 2024-01-09 23:21:26 +00:00
_export [export] Remove hacks for passing pinned version test. (#116871) 2024-01-06 18:09:27 +00:00
_functorch Enable reverse view_funcs by default for python subclasses (#116512) 2024-01-05 16:48:12 +00:00
_higher_order_ops Experimental non-strict mode (#114658) 2024-01-04 12:24:58 +00:00
_inductor Add _assert_scalar and teach Inductor to codegen it (#114148) 2024-01-09 23:21:26 +00:00
_lazy
_library Refactor can_auto_functionalize (#115134) 2023-12-05 22:43:06 +00:00
_logging Make TORCH_LOGS="dist_ddp" include DDPOptimizer logs (#116794) 2024-01-05 21:31:42 +00:00
_numpy [dynamo] Fix np.issubdtype (#116459) 2024-01-05 01:48:07 +00:00
_prims
_prims_common Add a decomposition for take() (#114813) 2023-12-22 18:14:57 +00:00
_refs Add decomposition for torch.block_diag (#115096) 2023-12-11 20:04:22 +00:00
_subclasses Experimental non-strict mode (#114658) 2024-01-04 12:24:58 +00:00
_vendor
amp
ao [quant][pt2e][xnnpack_quantizer] add support for linear_relu (#117052) 2024-01-09 23:19:52 +00:00
autograd Update torch.autograd.graph logging to not print out grad_output (#116523) 2024-01-09 20:40:02 +00:00
backends Add config to disable TransformerEncoder/MHA fastpath (#112212) 2024-01-02 23:59:30 +00:00
compiler Add a wrapper to transform a NumPy function into a PyTorch function (#114610) 2024-01-02 18:35:29 +00:00
contrib
cpu
csrc Enables private_use_one lazy_init by PrivateUse1HooksInterface (#115067) 2024-01-09 20:12:08 +00:00
cuda Try creating a bf16 tensor as a last resort of is_bf16_supported(). (#115924) 2024-01-01 01:15:30 +00:00
distributed [reland] unflatten_tensor on compute stream for DTensorExtension (#117020) 2024-01-09 21:25:15 +00:00
distributions Fix hang in VonMises rejection sampling for small values of concentration (#114498) 2023-12-04 23:07:06 +00:00
export [export][refactor][6/n] Remove equality_constraints (#116979) 2024-01-09 19:04:47 +00:00
fft
func
futures
fx Add _assert_scalar and teach Inductor to codegen it (#114148) 2024-01-09 23:21:26 +00:00
jit [BE]: Update flake8 to v6.1.0 and fix lints (#116591) 2024-01-03 06:04:44 +00:00
legacy
lib
linalg
masked
monitor
mps
multiprocessing Robustify torch.multiprocessing.spawn error reporting to be less deadlock prone (#114688) 2023-12-09 03:36:43 +00:00
nested Support squeeze.dim for jagged NT (#116891) 2024-01-06 01:00:53 +00:00
nn Fix TransformerEncoderLayer for bias=False (#116760) 2024-01-05 00:13:10 +00:00
onnx [ONNX] Fix output mismatch issue of repeat_interleave when dim is None (#116689) 2024-01-03 18:38:00 +00:00
optim [BE]: Enable F821 and fix bugs (#116579) 2024-01-01 08:40:46 +00:00
package [BE]: Add better handling of pathlib.Path with os calls (#116564) 2023-12-31 01:46:03 +00:00
profiler
quantization
signal Fix NaN bug in torch.signal.windows.kaiser (#116470) 2024-01-08 22:24:52 +00:00
sparse Update F32 sparse semi-structured support for CUTLASS back-end (#116017) 2023-12-22 16:53:04 +00:00
special
testing [quant][pt2e][xnnpack_quantizer] add support for linear_relu (#117052) 2024-01-09 23:19:52 +00:00
utils Make not passing use_reentrant back to warning instead of erroring and clarify docs (#116710) 2024-01-09 20:58:49 +00:00
__config__.py
__future__.py
__init__.py [BE]: Update flake8 to v6.1.0 and fix lints (#116591) 2024-01-03 06:04:44 +00:00
_appdirs.py
_classes.py
_compile.py
_custom_ops.py
_deploy.py
_guards.py
_jit_internal.py
_linalg_utils.py
_lobpcg.py
_lowrank.py
_meta_registrations.py [CPU] Add flash attention mask version (#115913) 2024-01-07 04:58:23 +00:00
_namedtensor_internals.py
_ops.py pre_dispatch aot_export (#115188) 2023-12-25 04:51:21 +00:00
_python_dispatcher.py
_sources.py
_storage_docs.py
_streambase.py
_tensor.py Fix torch.detach doc-string (#115850) 2023-12-22 20:04:33 +00:00
_tensor_docs.py Bring docstring to .pyi file (#114705) 2024-01-09 18:37:16 +00:00
_tensor_str.py
_torch_docs.py Bring docstring to .pyi file (#114705) 2024-01-09 18:37:16 +00:00
_utils.py pre_dispatch aot_export (#115188) 2023-12-25 04:51:21 +00:00
_utils_internal.py [inductor][Observability] Add log for Optimus to enable easier debug (#110452) 2023-12-01 18:25:56 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py
abi-check.cpp
CMakeLists.txt [BE] [cuDNN] Always build assuming cuDNN >= 8.1 (#95722) 2024-01-03 15:41:28 +00:00
custom_class.h
custom_class_detail.h
extension.h
functional.py
hub.py Increase hub download chunk size (#116536) 2024-01-03 17:38:45 +00:00
library.h
library.py
overrides.py Introduce reverse view_funcs (#115894) 2024-01-05 16:48:12 +00:00
py.typed
quasirandom.py
random.py
README.txt
return_types.py
script.h
serialization.py [BE]: Use os.fspath and os.PathLike in torch serialization (#116562) 2023-12-30 20:53:10 +00:00
storage.py
torch_version.py
types.py
version.py.tpl

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.