Mirror of https://github.com/saymrwulf/onnxruntime.git (synced 2026-05-15 20:50:42 +00:00)
### Description

Support SmoothQuant for ORT static quantization via Intel Neural Compressor.

> Note: Please use neural-compressor==2.2 to try the SmoothQuant function.

### Motivation and Context

For large language models (LLMs) with gigantic parameter counts, systematic outliers make quantization of activations difficult. As a training-free post-training quantization (PTQ) solution, SmoothQuant migrates this difficulty offline from activations to weights with a mathematically equivalent transformation. Integrating SmoothQuant into ORT quantization can improve the accuracy of INT8 LLMs.

---------

Signed-off-by: Mengni Wang <mengni.wang@intel.com>
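The "mathematically equivalent transformation" at the core of SmoothQuant can be sketched in a few lines of NumPy. This is a minimal illustration of the idea, not the neural-compressor implementation; the per-channel smoothing factor follows the SmoothQuant formulation, s_j = max|X_j|^alpha / max|W_j|^(1-alpha), with the migration strength alpha (typically 0.5) chosen here for illustration:

```python
import numpy as np

def smooth(X, W, alpha=0.5):
    """Sketch of the SmoothQuant transform: scale activations down and
    weights up per input channel so that X @ W is unchanged, while the
    activation outliers (which are hard to quantize) shrink."""
    act_max = np.abs(X).max(axis=0)              # per-channel activation range
    wgt_max = np.abs(W).max(axis=1)              # per-channel weight range
    s = act_max**alpha / wgt_max**(1 - alpha)    # smoothing factors
    return X / s, W * s[:, None]                 # X' = X diag(s)^-1, W' = diag(s) W

rng = np.random.default_rng(0)
# Hypothetical activations with one outlier channel (column 1 scaled by 50).
X = rng.normal(size=(4, 8)) * np.array([1, 50, 1, 1, 1, 1, 1, 1])
W = rng.normal(size=(8, 3))

Xs, Ws = smooth(X, W)
assert np.allclose(X @ W, Xs @ Ws)               # equivalent output
assert np.abs(Xs).max() < np.abs(X).max()        # activation outlier reduced
```

The assertions show the two properties the PR description relies on: the transform is exact (no retraining needed, hence "training free"), and the activation range that the INT8 quantizer must cover gets smaller, at the cost of a wider weight range.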
Directory contents:

- manylinux/
- training/
- install-protobuf.sh
- install_ninja.sh
- install_openmpi.sh
- install_os_deps.sh
- install_protobuf.sh
- install_python_deps.sh
- install_rust.sh
- install_ubuntu.sh
- requirements.txt