pytorch/torch/_inductor/codegen
Blaine Burton Rister a1bfb39a31 [Inductor] Expand Identity ops prior to block pattern matching (#146000)
# Feature

Inductor sometimes uses `Identity` functions to group the terms of an expression. While this is convenient in some scenarios, it can frustrate pattern matching. For example, when we match an indexing expression to determine whether it can be represented as a block pointer, that analysis should be invariant to `Identity` functions.

This PR adds a few features to achieve this invariance.
 - Create a new expansion mode `expr.expand(identity=True)`, which removes all `Identity` functions from the expression.
 - Preprocess the expression with this expansion prior to pattern matching.
 - Bonus: create a new test utility function called `dummy_graph()`, which creates a simple `GraphLowering`. This is useful for testing the pattern matcher, as we need to initialize `V.graph` before we can access `V.graph.sizevars`.
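
To illustrate why the matcher should be invariant to `Identity`, here is a minimal, self-contained sketch. It does not use Inductor or sympy: `Node`, `strip_identity`, and `match_stride` are hypothetical names invented for this example, and the toy expression tree only stands in for the real indexing expressions. The idea is the same as in this PR: strip the `Identity` wrappers first, then run the pattern matcher.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

# Toy expression tree (an assumption for illustration, not Inductor's types).
Expr = Union[int, str, "Node"]


@dataclass(frozen=True)
class Node:
    op: str  # "add", "mul", or "identity"
    args: Tuple[Expr, ...]


def strip_identity(expr: Expr) -> Expr:
    """Recursively remove every "identity" wrapper from the tree."""
    if not isinstance(expr, Node):
        return expr  # leaf: integer constant or symbol name
    args = tuple(strip_identity(a) for a in expr.args)
    if expr.op == "identity":
        (inner,) = args  # an identity node wraps exactly one sub-expression
        return inner
    return Node(expr.op, args)


def match_stride(expr: Expr, var: str) -> Optional[int]:
    """Return the stride if expr is exactly `stride * var`, else None."""
    if isinstance(expr, Node) and expr.op == "mul" and len(expr.args) == 2:
        stride, sym = expr.args
        if isinstance(stride, int) and sym == var:
            return stride
    return None


# 4 * Identity(x): the wrapper hides the symbol from the naive matcher.
wrapped = Node("mul", (4, Node("identity", ("x",))))

no_preprocess = match_stride(wrapped, "x")
with_preprocess = match_stride(strip_identity(wrapped), "x")
```

Without preprocessing, the match fails because the matcher sees an `identity` node where it expects the symbol; after stripping, the same pattern matches and the stride is recovered.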

# Test plan
This PR adds a few new unit tests:
 - Added a unit test specifically for `expr.expand(identity=True)`.
 - Added a new unit test module for the block pattern matcher. Tested that we can correctly match some example patterns containing Identity ops.

I originally intended to add an end-to-end test that compiles pointwise cat and maps the corresponding memory accesses to block pointers. However, it looks like that will take more work, since the [relevant code path](https://github.com/pytorch/pytorch/blob/main/torch/_inductor/codegen/triton.py#L1306) disables block pointer analysis. It might be better to defer that to a future PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146000
Approved by: https://github.com/eellison, https://github.com/jansel
2025-02-08 18:11:53 +00:00
aoti_runtime cpp_wrapper: Move #includes to per-device header files (#145932) 2025-01-29 21:08:45 +00:00
cuda [inductor] Fix test error test_force_cutlass_backend_aoti_cexpr_codegen (#146564) 2025-02-06 23:02:41 +00:00
rocm [inductor] Add typing to common.CSE (#145993) 2025-02-04 16:05:39 +00:00
xpu [inductor] Add types to DeviceOpOverrides (#145913) 2025-02-01 16:33:49 +00:00
__init__.py
aoti_hipify_utils.py
block_analysis.py [Inductor] Expand Identity ops prior to block pattern matching (#146000) 2025-02-08 18:11:53 +00:00
common.py [inductor] Minor compile time optimizations in DefaultHandler (#146282) 2025-02-08 18:00:40 +00:00
cpp.py [inductor] Refactor op handlers part 1 (#146235) 2025-02-04 23:35:53 +00:00
cpp_bmm_template.py [inductor][4/N] triton support post-#5512, fix constexpr signatures (#145583) 2025-01-29 05:46:05 +00:00
cpp_flex_attention_template.py [Inductor-CPU] Add profiling support for codegened flex attention kernels (#145894) 2025-01-29 20:54:46 +00:00
cpp_gemm_template.py
cpp_grouped_gemm_template.py [inductor] Finish typing common.py (#146225) 2025-02-04 23:35:33 +00:00
cpp_micro_gemm.py [CPUInductor] Fix SVE256 detection (#146207) 2025-02-01 18:51:34 +00:00
cpp_prefix.h
cpp_template.py Fix assertion failure in gemm template lowering (#146353) 2025-02-08 01:52:20 +00:00
cpp_template_kernel.py [inductor] Add typing to common.KernelArgs (#145916) 2025-02-04 16:05:39 +00:00
cpp_utils.py cpp_wrapper: enable all CPU repro tests (#145655) 2025-02-04 22:05:59 +00:00
cpp_wrapper_cpu.py Revert "[while_loop][inductor] support sym expression as cond_fn output (#146222)" 2025-02-07 16:19:41 +00:00
cpp_wrapper_cpu_array_ref.py cpp_wrapper: Move #includes to per-device header files (#145932) 2025-01-29 21:08:45 +00:00
cpp_wrapper_gpu.py cpp_wrapper: Move #includes to per-device header files (#145932) 2025-01-29 21:08:45 +00:00
cpu_device_op_overrides.py [inductor] Add types to DeviceOpOverrides (#145913) 2025-02-01 16:33:49 +00:00
cuda_combined_scheduling.py [BE] Type annotate wrapper_benchmark.py and cuda_combined_scheduling.py (#145542) 2025-01-30 03:53:52 +00:00
debug_utils.py cpp_wrapper: Move #includes to per-device header files (#145932) 2025-01-29 21:08:45 +00:00
halide.py [inductor] Refactor op handlers part 5 (#146257) 2025-02-08 18:00:30 +00:00
memory_planning.py
mps.py [inductor] Refactor op handlers part 5 (#146257) 2025-02-08 18:00:30 +00:00
mps_device_op_overrides.py [inductor] Add types to DeviceOpOverrides (#145913) 2025-02-01 16:33:49 +00:00
multi_kernel.py [inductor][4/N] triton support post-#5512, fix constexpr signatures (#145583) 2025-01-29 05:46:05 +00:00
simd.py [inductor] Add typing to common.CSE (#145993) 2025-02-04 16:05:39 +00:00
simd_kernel_features.py
triton.py [Inductor] Expand Identity ops prior to block pattern matching (#146000) 2025-02-08 18:11:53 +00:00
triton_combo_kernel.py [inductor] Add typing to common.KernelArgs (#145916) 2025-02-04 16:05:39 +00:00
triton_split_scan.py [inductor] Add typing to common.CSE (#145993) 2025-02-04 16:05:39 +00:00
triton_utils.py [inductor][5/N] triton support post-#5512, fix 1 and None handling (#145515) 2025-02-01 02:11:48 +00:00
wrapper.py Revert "[while_loop][inductor] support sym expression as cond_fn output (#146222)" 2025-02-07 16:19:41 +00:00