onnxruntime/onnxruntime/core/mlas
RajalakshmiSR 5d8c5409ab
POWER10: QGEMM optimization (#10642)
* POWER10: QGEMM optimization

This patch makes use of POWER10 MMA feature for QGEMM function.
This optimization includes signed and unsigned cases. Tested with
gcc11 and clang-14; there are no new failures.

* Changes as per review comments

Co-authored-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
2022-03-02 08:36:26 -08:00
inc Symmetric QGEMM (#10289) 2022-01-24 10:49:04 -08:00
lib POWER10: QGEMM optimization (#10642) 2022-03-02 08:36:26 -08:00
.clang-format Adding interface for batched integer gemm (#7249) 2021-04-15 10:25:31 -07:00
README.md

About MLAS

MLAS is a compute library containing processor-optimized GEMM kernels and platform-specific threading code.

Unit tests for MLAS

Unit tests for the SGEMM kernels are available under onnxruntime\test\mlas. These tests run over a range of inputs chosen to exercise the various special cases for aligned and unaligned outputs. A test run has failed if any "mismatch" string is printed.
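
The testing pattern described above can be illustrated with a minimal sketch. This is not the code under onnxruntime\test\mlas and it does not call the MLAS API: `ReferenceSgemm`, `BlockedSgemm`, `CheckOne`, and `RunSgemmChecks` are hypothetical names, and `BlockedSgemm` is only a stand-in for an optimized kernel. The sketch also assumes that "unaligned" means an output width that is not a multiple of the kernel's block width, which may differ from what the real tests cover.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Naive reference: C = A * B, row-major, A is MxK, B is KxN, C has leading dimension ldc.
void ReferenceSgemm(size_t M, size_t N, size_t K,
                    const float* A, const float* B, float* C, size_t ldc) {
    for (size_t m = 0; m < M; ++m) {
        for (size_t n = 0; n < N; ++n) {
            float sum = 0.0f;
            for (size_t k = 0; k < K; ++k) sum += A[m * K + k] * B[k * N + n];
            C[m * ldc + n] = sum;
        }
    }
}

// Hypothetical stand-in for an optimized kernel: handles four output columns
// at a time, then falls back to a scalar loop for the leftover columns.
void BlockedSgemm(size_t M, size_t N, size_t K,
                  const float* A, const float* B, float* C, size_t ldc) {
    constexpr size_t kBlock = 4;
    for (size_t m = 0; m < M; ++m) {
        size_t n = 0;
        for (; n + kBlock <= N; n += kBlock) {
            float acc[kBlock] = {0.0f, 0.0f, 0.0f, 0.0f};
            for (size_t k = 0; k < K; ++k) {
                const float a = A[m * K + k];
                for (size_t j = 0; j < kBlock; ++j) acc[j] += a * B[k * N + n + j];
            }
            for (size_t j = 0; j < kBlock; ++j) C[m * ldc + n + j] = acc[j];
        }
        for (; n < N; ++n) {  // tail: output width not a multiple of kBlock
            float sum = 0.0f;
            for (size_t k = 0; k < K; ++k) sum += A[m * K + k] * B[k * N + n];
            C[m * ldc + n] = sum;
        }
    }
}

// Compares both implementations for one shape; returns the number of mismatches.
int CheckOne(size_t M, size_t N, size_t K) {
    std::vector<float> A(M * K), B(K * N), expected(M * N), actual(M * N);
    // Small integer-valued inputs keep every sum exactly representable in float.
    for (size_t i = 0; i < A.size(); ++i) A[i] = static_cast<float>(static_cast<int>(i % 7) - 3);
    for (size_t i = 0; i < B.size(); ++i) B[i] = static_cast<float>(static_cast<int>(i % 5) - 2);
    ReferenceSgemm(M, N, K, A.data(), B.data(), expected.data(), N);
    BlockedSgemm(M, N, K, A.data(), B.data(), actual.data(), N);
    int mismatches = 0;
    for (size_t i = 0; i < expected.size(); ++i) {
        if (expected[i] != actual[i]) {
            std::printf("mismatch M=%zu N=%zu K=%zu index=%zu\n", M, N, K, i);
            ++mismatches;
        }
    }
    return mismatches;
}

// Sweeps a range of shapes, including widths that are and are not multiples of 4.
int RunSgemmChecks() {
    const size_t depths[] = {1, 3, 8, 17};
    int mismatches = 0;
    for (size_t M = 1; M <= 5; ++M)
        for (size_t N = 1; N <= 9; ++N)
            for (size_t K : depths) mismatches += CheckOne(M, N, K);
    return mismatches;
}
```

A test driver would call `RunSgemmChecks()` and treat a nonzero return value, or any printed "mismatch" line, as a failure.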