pytorch/c10
Nikita Shulga 2328dcccb9 [MPSInductor] Implement Welford reduction (#146703)
Still a work in progress: the fallback works as expected, but the custom shader does not.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146703
Approved by: https://github.com/jansel, https://github.com/dcci
2025-02-08 05:00:00 +00:00
benchmark Set RUNPATH so installed tests can find the required shared libraries (#136627) 2024-10-25 09:38:08 +00:00
core Use std::string_view (#145906) 2025-01-30 03:14:27 +00:00
cuda [Windows][ROCm] Fix c10 hip tests (#146599) 2025-02-06 23:41:25 +00:00
hip Fix hardcoded ROCm paths in Caffe2Targets.cmake (#136283) 2024-09-26 00:34:43 +00:00
macros [ROCm][Windows] Fix export macros (#144098) 2025-01-04 17:12:46 +00:00
metal [MPSInductor] Implement Welford reduction (#146703) 2025-02-08 05:00:00 +00:00
mobile [2/N] Fix extra warnings brought by clang-tidy-17 (#137459) 2024-10-08 19:05:02 +00:00
test Fix cppcoreguidelines-init-variables ignorance (#141795) 2025-01-28 17:11:37 +00:00
util [ROCm][Windows] Fix unrecognized _BitScanReverse intrinsic (#146606) 2025-02-06 23:47:18 +00:00
xpu Filter out iGPU if dGPU is found on XPU (#144378) 2025-01-29 15:53:16 +00:00
BUCK.oss
BUILD.bazel
build.bzl
CMakeLists.txt [pytorch][monitoring] Dynamic backend for WaitCounter (#135967) 2024-09-15 18:07:49 +00:00
ovrsource_defs.bzl