pytorch/torch/backends
Latest commit: 9ee506bd93 by Eddie Yan, 2025-02-06 19:04:50 +00:00
[CUDA][cuBLAS] Add fp16 accumulate option to cuBLAS/cuBLASLt (#144441)
A test for `cublasGemmEx` was added; the best way to exercise the other APIs still needs to be worked out...

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144441
Approved by: https://github.com/Chillee, https://github.com/malfet
_coreml
_nnapi
cpu
cuda           [CUDA][cuBLAS] Add fp16 accumulate option to cuBLAS/cuBLASLt (#144441)              2025-02-06 19:04:50 +00:00
cudnn
cusparselt
kleidiai       Revert "Reverting the PR adding Kleidiai-based int4 kernels (#145392)" (#145505)    2025-01-23 18:50:59 +00:00
mha
mkl
mkldnn
mps
nnpack
openmp
opt_einsum
quantized
xeon
xnnpack
__init__.py    Revert "Reverting the PR adding Kleidiai-based int4 kernels (#145392)" (#145505)    2025-01-23 18:50:59 +00:00