[inductor] set sanitize_overflow=False for triton kernels (#139502)

Upstream Triton introduced overflow checks in https://github.com/triton-lang/triton/pull/4589. These checks likely add some overhead and currently have some correctness bugs (e.g. https://github.com/triton-lang/triton/pull/5033). Set `sanitize_overflow=False` but keep `debug=True`, so that `device_assert` keeps working without the additional asserts that `sanitize_overflow` adds.
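
The resulting compile options can be sketched as follows. This is only an illustration, assuming `compile_meta` is a plain dict with the keys shown in the diff; `build_compile_options` is a hypothetical helper name and not a function in inductor.

```python
def build_compile_options(compile_meta):
    """Hypothetical helper mirroring the options dict touched by this change."""
    return {
        "num_warps": compile_meta["num_warps"],
        "num_stages": compile_meta["num_stages"],
        # debug is passed through unchanged, so device_assert stays available
        "debug": compile_meta["debug"],
        # always off: skip the extra asserts added for overflow checks
        "sanitize_overflow": False,
    }


# Example values are made up for illustration.
options = build_compile_options({"num_warps": 4, "num_stages": 2, "debug": True})
print(options)
```

The point of the sketch is that `sanitize_overflow` is hard-coded to `False` regardless of the `debug` setting, while `debug` still follows `compile_meta`.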

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139502
Approved by: https://github.com/bertmaher
David Berard 2024-11-01 11:56:15 -07:00 committed by PyTorch MergeBot
parent da395384a2
commit 60542eeb33

@@ -442,6 +442,7 @@ class CachingAutotuner(KernelInterface):
"num_warps": compile_meta["num_warps"],
"num_stages": compile_meta["num_stages"],
"debug": compile_meta["debug"],
"sanitize_overflow": False, # turn off additional asserts added for overflow checks
}
if self.device_props.type == "hip":
if "waves_per_eu" in compile_meta: