onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-29 23:06:41 +00:00


Edward Chen 0a4d76d98b MLAS AArch64 quantized int4 Gemm kernel (#18031)	2023-11-15 09:31:54 -08:00
- Implement MLAS function for quantized 4-bit int Gemm (Gemm with float A and quantized 4-bit int B) for ARM NEON. This is an initial implementation. Only the M=1 path (with M being number of rows of A and C) has any optimization attempted so far. More optimization to come in future PRs.
- Connect MatMulNBits contrib op to MLAS function.
contrib_ops	MLAS AArch64 quantized int4 Gemm kernel (#18031)	2023-11-15 09:31:54 -08:00
core	MLAS AArch64 quantized int4 Gemm kernel (#18031)	2023-11-15 09:31:54 -08:00
python	SDXL demo: consistent opt shape and seed (#18445)	2023-11-14 20:24:32 -08:00
test	MLAS AArch64 quantized int4 Gemm kernel (#18031)	2023-11-15 09:31:54 -08:00
tool/etw
wasm	[js/web/training] Add CreateTrainingSession (#17891)	2023-10-26 09:22:10 -07:00
__init__.py	Python API to check whether collective ops are available or not (#17730)	2023-09-29 14:11:05 -07:00
ReformatSource.ps1
ReformatSourcePython.bat
VSCodeCoverage.runsettings