pytorch/torch
eellison 71e8a2bda4 Expand inductor codegen dtype asserts, fix scan (#146067)
We were codegenning intermediate dtype asserts in some places but not all. This PR expands the assertions and fixes a newly failing assertion in

`TORCHINDUCTOR_COMPILE_THREADS=1 TORCH_LOGS="output_code" PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=1 python test/inductor/test_torchinductor_opinfo.py TestInductorOpInfoCUDA.test_comprehensive_logcumsumexp_cuda_float16` for scan.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146067
Approved by: https://github.com/shunting314, https://github.com/jansel
2025-02-07 06:35:47 +00:00
_awaits
_C [CUDA][cuBLAS] Add fp16 accumulate option to cuBLAS/cuBLASLt (#144441) 2025-02-06 19:04:50 +00:00
_C_flatbuffer
_custom_op
_decomp
_dispatch
_dynamo [dynamo][fullgraph] Do not skip frame with fullgraph=True (#146527) 2025-02-06 18:56:07 +00:00
_export [export] make stack_trace optional in insert_custom_op_guards (#146438) 2025-02-06 01:48:26 +00:00
_functorch Add torch.func.debug_unwrap (#146528) 2025-02-06 18:48:09 +00:00
_higher_order_ops [auto_functionalized] Support Tensor(a!)[]? (#145400) 2025-02-05 14:52:39 +00:00
_inductor Expand inductor codegen dtype asserts, fix scan (#146067) 2025-02-07 06:35:47 +00:00
_lazy
_library [BE][Ez]: ISC001 Auto concatenate implicit one line strings (#146408) 2025-02-04 19:07:04 +00:00
_logging use DTRACE_ENV_VAR as the trace logs directory if set (#146412) 2025-02-04 20:54:28 +00:00
_numpy
_prims
_prims_common [dynamo] Disable compiling on elementwise_type_promotion_wrapper (#146219) 2025-02-03 18:02:48 +00:00
_refs Re-add stft option to align window for center = false (#146379) 2025-02-06 14:07:13 +00:00
_strobelight
_subclasses Fix aten.to when input is a tensor constant (#146220) 2025-02-01 11:07:33 +00:00
_vendor
accelerator
amp
ao [BE]: Enable ruff SLOT checks (#146276) 2025-02-04 19:18:23 +00:00
autograd update _unsafe_set_version_counter to accept lists of tensors (#137921) 2025-02-04 04:51:11 +00:00
backends [CUDA][cuBLAS] Add fp16 accumulate option to cuBLAS/cuBLASLt (#144441) 2025-02-06 19:04:50 +00:00
compiler
contrib
cpu [CPUInductor] Fix SVE256 detection (#146207) 2025-02-01 18:51:34 +00:00
csrc Remove some NOLINT (#146610) 2025-02-07 01:50:06 +00:00
cuda [inductor triton] Disable incorrect TF32 usage on CUDA capability < 8 (#145684) 2025-01-28 22:01:08 +00:00
distributed [2/N][cp][example] flex attention in context parallel (backward pass) (#146397) 2025-02-06 19:50:02 +00:00
distributions torch.distributions: replace numbers.Number with torch.types.Number. (#145086) 2025-01-27 20:24:55 +00:00
export [export][dynamic shapes] log provenance for locals & symbols for non-strict (#143378) 2025-02-07 05:46:05 +00:00
fft
func Add torch.func.debug_unwrap (#146528) 2025-02-06 18:48:09 +00:00
futures
fx [export][dynamic shapes] log provenance for locals & symbols for non-strict (#143378) 2025-02-07 05:46:05 +00:00
jit
legacy
lib
linalg
masked
monitor add WaitCounter type interface and get rid of type errors (#146175) 2025-02-01 23:24:52 +00:00
mps
mtia
multiprocessing
nested Small improvements to NJT matrix multiplies (#146405) 2025-02-06 04:51:12 +00:00
nn Fix torch.nn.functional.one_hot param num_classes optional description (#146470) 2025-02-06 07:48:05 +00:00
onnx [ONNX] Create deprecation warning on dynamo_export (#146425) 2025-02-07 04:20:46 +00:00
optim [BE][Ez]: ISC001 Auto concatenate implicit one line strings (#146408) 2025-02-04 19:07:04 +00:00
package [BE][Ez]: ISC001 Auto concatenate implicit one line strings (#146408) 2025-02-04 19:07:04 +00:00
profiler execution trace export supports gzip format (#146179) 2025-02-01 01:25:25 +00:00
quantization
signal
sparse
special
testing Small improvements to NJT matrix multiplies (#146405) 2025-02-06 04:51:12 +00:00
utils Fixed a typo in dataset.py (#146600) 2025-02-07 05:09:51 +00:00
xpu
__config__.py
__future__.py
__init__.py Torch device backend autoload fix (#145611) 2025-01-31 19:27:42 +00:00
_appdirs.py
_classes.py
_compile.py
_custom_ops.py
_deploy.py
_environment.py
_guards.py
_jit_internal.py PEP585: Missed conversions (#145342) 2025-01-29 05:24:36 +00:00
_linalg_utils.py
_lobpcg.py
_lowrank.py
_meta_registrations.py nonzero_static with symint size (#146006) 2025-01-30 23:42:42 +00:00
_namedtensor_internals.py
_ops.py [Dynamo][Trace PyDispatcher] Remove disable from HigherOrderOperator.__call__ (#146270) 2025-02-03 21:47:54 +00:00
_python_dispatcher.py
_size_docs.py
_sources.py
_storage_docs.py
_streambase.py
_tensor.py Re-add stft option to align window for center = false (#146379) 2025-02-06 14:07:13 +00:00
_tensor_docs.py Re-add stft option to align window for center = false (#146379) 2025-02-06 14:07:13 +00:00
_tensor_str.py [BE][Ez]: ISC001 Auto concatenate implicit one line strings (#146408) 2025-02-04 19:07:04 +00:00
_thread_safe_fork.py
_torch_docs.py Add overloads to diagonal docs (#144214) 2025-01-31 15:53:59 +00:00
_utils.py [BE]: Enable ruff SLOT checks (#146276) 2025-02-04 19:18:23 +00:00
_utils_internal.py
_VF.py
_vmap_internals.py
_weights_only_unpickler.py
abi-check.cpp
CMakeLists.txt
custom_class.h
custom_class_detail.h
extension.h
functional.py Re-add stft option to align window for center = false (#146379) 2025-02-06 14:07:13 +00:00
hub.py
library.h Remove trivial dispatch_key_allowlist_check function (#146169) 2025-01-31 19:59:40 +00:00
library.py [opcheck] Improve error reporting; allow atol/rtol overrides (#146488) 2025-02-05 21:25:06 +00:00
overrides.py Re-add stft option to align window for center = false (#146379) 2025-02-06 14:07:13 +00:00
py.typed
quasirandom.py
random.py
README.txt
return_types.py
script.h
serialization.py Add option to serialization config to reduce random reads from get_record_offset when loading with mmap=True (#143880) 2025-01-31 17:09:20 +00:00
storage.py
torch_version.py [BE]: Enable ruff SLOT checks (#146276) 2025-02-04 19:18:23 +00:00
types.py Improve typing in torch/types.py (#145237) 2025-01-28 05:29:12 +00:00
version.py.tpl

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.
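The pattern the note asks for can be sketched with a minimal opaque-struct example. The names below (`Counter`, `counter_new`, and so on) are invented for illustration and are not part of the real TH API; the point is only the split between a public C header that exposes accessor functions and an internal `.hpp`-style definition whose fields external clients must not touch.

```c
#include <stdlib.h>

/* "counter.h" (public): opaque handle plus accessor functions.
 * Clients see only this declaration, never the struct layout. */
typedef struct Counter Counter;
Counter *counter_new(void);
int counter_get(const Counter *c);
void counter_incr(Counter *c);
void counter_free(Counter *c);

/* "counter.hpp"-style internal detail: the actual layout.
 * External code that includes this and pokes at `value` directly
 * is committing the abstraction violation the note warns about. */
struct Counter {
    int value;
};

Counter *counter_new(void) {
    Counter *c = malloc(sizeof *c);
    c->value = 0;
    return c;
}

int counter_get(const Counter *c) { return c->value; }

void counter_incr(Counter *c) { c->value += 1; }

void counter_free(Counter *c) { free(c); }
```

Keeping the struct definition out of the public header means its guts can be refactored (as the note anticipates for THTensor) without breaking well-behaved clients; only the few marked violation sites need updating.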