pytorch/torch/backends
eqy a6763b7b81 [CUDA][cuBLAS] Add fp16 accumulate option to cuBLAS/cuBLASLt (#144441)
Test for `cublasGemmEx` added, still need to figure out the best way to exercise the other APIs...

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144441
Approved by: https://github.com/Chillee
2025-01-15 18:37:55 +00:00
..
_coreml
_nnapi
cpu
cuda [CUDA][cuBLAS] Add fp16 accumulate option to cuBLAS/cuBLASLt (#144441) 2025-01-15 18:37:55 +00:00
cudnn
cusparselt
kleidiai [ARM][feat]: Add 4 bit dynamic quantization matmuls & KleidiAI Backend (#134124) 2024-12-20 19:32:03 +00:00
mha
mkl
mkldnn
mps
nnpack
openmp
opt_einsum
quantized
xeon
xnnpack
__init__.py [ARM][feat]: Add 4 bit dynamic quantization matmuls & KleidiAI Backend (#134124) 2024-12-20 19:32:03 +00:00