pytorch/torch
eellison 47af7cc962 Add compiler bisector (#131936)
This is a utility to aid torch.compile debugging. You provide a function that returns True on success and False on failure, or you can do the check out of process and report each result by running `bisect_helper` with `good` or `bad`.
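
For the in-process route, a minimal sketch of what the check function can look like (the `do_bisect` entry point, the toy `model`/`inp`, and the tolerance are assumptions for illustration, not something this PR description specifies):

```
import torch
from torch._inductor.bisect_helper import BisectionManager

model = torch.nn.Linear(8, 8)
inp = torch.randn(4, 8)

def check() -> bool:
    # Hypothetical repro: recompile and compare against eager.
    torch._dynamo.reset()
    compiled = torch.compile(model)
    return torch.allclose(compiled(inp), model(inp), atol=1e-2)

# Assumed entry point (`do_bisect`): the bisector re-runs `check` while
# narrowing down the first failing backend and subsystem.
BisectionManager.do_bisect(check)
```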

The bisector will first go through the backends (`eager`, `aot_eager`, `aot_eager_decomp_partition`, `inductor`) to find the first failing backend. Then it will go through the subsystems within that backend (currently a limited set, but it could be expanded) and try to find the first subsystem for which disabling it fixes the problem. Once it has found the failing subsystem, it will count the number of times the subsystem is applied and then bisect through those applications.
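
A rough sketch of the search this describes (illustrative pseudocode only; `run_check`, `subsystems_for`, and `count_applications` are stand-in callables, not the actual implementation):

```
from typing import Callable, Sequence

BACKENDS = ["eager", "aot_eager", "aot_eager_decomp_partition", "inductor"]

def find_culprit(
    run_check: Callable[..., bool],          # True on success, False on failure
    subsystems_for: Callable[[str], Sequence[str]],
    count_applications: Callable[[str, str], int],
):
    # 1. Walk the backends in order and stop at the first one that fails.
    backend = next(b for b in BACKENDS if not run_check(backend=b))
    # 2. Find the first subsystem in that backend whose disabling fixes the run.
    subsystem = next(
        s for s in subsystems_for(backend) if run_check(backend=backend, disable=s)
    )
    # 3. Bisect over how many times the subsystem is applied to pinpoint the
    #    single application that introduces the failure.
    lo, hi = 0, count_applications(backend, subsystem)
    while lo < hi:
        mid = (lo + hi) // 2
        if run_check(backend=backend, subsystem=subsystem, max_applications=mid):
            lo = mid + 1  # still succeeds with `mid` applications; culprit is later
        else:
            hi = mid      # already fails; culprit is at or before `mid`
    return backend, subsystem, lo
```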

An example of how to hook it up for the `aot_eager_decomp_partition` backend and the `decomposition` subsystem:

```
    # Inside the decomposition lookup (e.g. in aot_autograd): ask the bisector
    # whether the `decomposition` subsystem is disabled for the current step.
    from torch._inductor.bisect_helper import BisectionManager
    if op in CURRENT_DECOMPOSITION_TABLE:
        if BisectionManager.disable_subsystem("aot_eager_decomp_partition", "decomposition", lambda: repr(op)):
            return NotImplemented
```

Once it has discovered the problematic change, it will print out the associated debug info, and you can pin the same limits with the `TORCH_BISECT_BACKEND`, `TORCH_BISECT_SUBSYSTEM`, and `TORCH_BISECT_MAX` environment variables.
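
For example, to pin a later run to the discovered failure point (a sketch; the values are illustrative, and they must be set before the compilation you want to reproduce):

```
import os

# Illustrative values; use the ones printed by the bisector.
os.environ["TORCH_BISECT_BACKEND"] = "aot_eager_decomp_partition"
os.environ["TORCH_BISECT_SUBSYSTEM"] = "decomposition"
os.environ["TORCH_BISECT_MAX"] = "12"
```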

We could add further options as an automated way of going through a checklist when checking for divergence, e.g. a mode to emulate amp casts.

Fix for https://github.com/pytorch/pytorch/issues/126546

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131936
Approved by: https://github.com/ezyang
2024-10-09 20:34:11 +00:00
_awaits
_C [Dynamo] Remove ignored modes from torch function mode stack guard (#135503) (#137116) 2024-10-09 02:29:40 +00:00
_C_flatbuffer
_custom_op
_decomp [TorchRec][PT2 compile] enable dynamo in _get_user_embeddings (#136798) 2024-10-09 17:19:45 +00:00
_dispatch
_dynamo Revert "Log chromium event for automatic dynamic reasons (#137491)" 2024-10-09 20:24:12 +00:00
_export Add original forward names to schema so that prettify pass works (#136887) 2024-10-08 04:21:02 +00:00
_functorch Tensorify compute on Python scalars (#136674) 2024-10-09 18:51:41 +00:00
_higher_order_ops [Dynamo] Move flex attention torch function mode to traceable HOP file (#137120) 2024-10-09 02:29:40 +00:00
_inductor Add compiler bisector (#131936) 2024-10-09 20:34:11 +00:00
_lazy
_library Proper handling of arguments passed by in kwargs inside zip_schema (#137311) 2024-10-04 21:50:31 +00:00
_logging Don't actually import module when checking if its valid (#136548) 2024-09-25 20:47:32 +00:00
_numpy
_prims Fix AOT Graph capture not propagating non_blocking copy parameter to … (#136513) 2024-10-01 00:32:47 +00:00
_prims_common Fix six broken tests in test_ops.py (#136653) 2024-09-30 20:32:55 +00:00
_refs Fix typo in _normalize ref (#137079) 2024-10-02 19:06:48 +00:00
_strobelight [Pytorch] Cleanup Strobelight URL and shorten for readability (#136102) 2024-09-16 18:10:33 +00:00
_subclasses Revert "Disallow FakeTensor.data_ptr access in eager mode (#137221)" 2024-10-07 21:46:13 +00:00
_vendor
amp
ao Change to export_for_training in quantize_pt2e tests (#137233) 2024-10-04 18:33:02 +00:00
autograd Param fixes in docstring (#136097) 2024-09-21 18:56:34 +00:00
backends Clarify opt-einsum usage, fix #127109 (#137596) 2024-10-09 20:31:24 +00:00
compiler
contrib
cpu
csrc [NCCL][Profiler] Add functionality to call dump function of NCCL profiler plugin (#137523) 2024-10-09 18:19:33 +00:00
cuda raw_alloc ignores PYTORCH_NO_CUDA_MEMORY_CACHING (#131114) 2024-10-04 15:36:29 +00:00
distributed [FSDP2] Added shard_placement_fn arg (#137496) 2024-10-09 19:13:32 +00:00
distributions [BE]: Update mypy to 1.11.2 (#133816) 2024-09-16 19:44:11 +00:00
export Lift restriction on training IR for unflatten (#137470) 2024-10-08 22:30:24 +00:00
fft
func
futures
fx Add compiler bisector (#131936) 2024-10-09 20:34:11 +00:00
jit
legacy
lib
linalg docs: clarify alias usage for x parameter in vector_norm function (#136921) 2024-09-30 02:50:06 +00:00
masked [BE]: Update mypy to 1.11.2 (#133816) 2024-09-16 19:44:11 +00:00
monitor
mps
mtia [MTIA] Support torch.cuda.get_device_capability equivalent API on MTIA (#135889) 2024-09-17 17:42:56 +00:00
multiprocessing multiprocessing.spawn: allow a grace period when shutdown (#131278) 2024-10-07 12:37:34 +00:00
nested Fix to() on non-contiguous NJTs (#137124) 2024-10-08 15:11:05 +00:00
nn [Dynamo] Move flex attention torch function mode to traceable HOP file (#137120) 2024-10-09 02:29:40 +00:00
onnx [ONNX] Implement patch for jit.isinstance (#137592) 2024-10-09 18:06:52 +00:00
optim Minorly reorder optim kwargs in docs, fixes #137391 (#137531) 2024-10-09 04:14:45 +00:00
package [3.13] fix 3.13 pickle error in torch/package (#136049) 2024-09-14 14:28:09 +00:00
profiler [Profiler] Torch Profiler distributed info is not JSON serializable (#135548) 2024-09-13 02:22:33 +00:00
quantization
signal
sparse [sparse][semi-structured] Add float8 dtype support to 24 sparsity (#136397) 2024-09-27 21:37:34 +00:00
special
testing Fix to() on non-contiguous NJTs (#137124) 2024-10-08 15:11:05 +00:00
utils Revert "Introduce torch.sym_sum (#136429)" 2024-10-09 20:08:01 +00:00
xpu Use torch.Stream&torch.Event for Dynamo capature (#134850) 2024-10-02 14:15:33 +00:00
__config__.py
__future__.py
__init__.py Add compiler bisector (#131936) 2024-10-09 20:34:11 +00:00
_appdirs.py
_classes.py
_compile.py
_custom_ops.py
_deploy.py
_environment.py Improve is_fbcode functionality (#136871) 2024-09-27 21:19:01 +00:00
_guards.py Turn on type-checking in torch.fx.experimental.symbolic_shapes (#136972) 2024-10-01 13:22:10 +00:00
_jit_internal.py
_linalg_utils.py
_lobpcg.py
_lowrank.py
_meta_registrations.py Revert "Introduce torch.sym_sum (#136429)" 2024-10-09 20:08:01 +00:00
_namedtensor_internals.py
_ops.py Add type annotations for higher order ops/flex_attention (#137065) 2024-10-02 04:39:25 +00:00
_python_dispatcher.py
_size_docs.py
_sources.py
_storage_docs.py
_streambase.py Use torch.Stream&torch.Event for Dynamo capature (#134850) 2024-10-02 14:15:33 +00:00
_tensor.py Remove dependency on numpy for serialization for XLA/open registration devices without numpy (#137444) 2024-10-09 19:35:55 +00:00
_tensor_docs.py Revert "Add deterministic path for CUDA cumsum (#136224)" 2024-09-27 12:54:47 +00:00
_tensor_str.py
_thread_safe_fork.py [inductor] parallel compile: add import of thread_safe_fork for internal (#137155) 2024-10-03 17:37:21 +00:00
_torch_docs.py Add torch.squeeze parameter description to declare allowed type (#137485) 2024-10-09 05:29:13 +00:00
_utils.py Remove dependency on numpy for serialization for XLA/open registration devices without numpy (#137444) 2024-10-09 19:35:55 +00:00
_utils_internal.py Log compile ids to pt2_remote_cache and pt2_compile_events (#137431) 2024-10-08 18:04:48 +00:00
_VF.py
_vmap_internals.py
_weights_only_unpickler.py Remove dependency on numpy for serialization for XLA/open registration devices without numpy (#137444) 2024-10-09 19:35:55 +00:00
abi-check.cpp
CMakeLists.txt
custom_class.h
custom_class_detail.h
extension.h
functional.py Clarify opt-einsum usage, fix #127109 (#137596) 2024-10-09 20:31:24 +00:00
hub.py torch.hub: add get_dir/set_dir type hints (#134906) 2024-09-12 03:53:29 +00:00
library.h
library.py noop on torch.library APIs under torch::deploy (multipy) (#136645) 2024-09-26 02:34:34 +00:00
overrides.py Revert "Introduce torch.sym_sum (#136429)" 2024-10-09 20:08:01 +00:00
py.typed
quasirandom.py
random.py
README.txt
return_types.py
script.h
serialization.py [3.13] fix 3.13 pickle error in serialization.py (#136034) 2024-09-14 00:02:40 +00:00
storage.py Fix serialization for torch.uint16, torch.uint32, torch.uint64 (#137184) 2024-10-03 14:56:11 +00:00
torch_version.py
types.py
version.py.tpl

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.