onnxruntime/tools/ci_build/github/linux/docker/scripts/manylinux
Wang, Mengni fe463d4957
Support SmoothQuant for ORT static quantization (#16288)
### Description

Support SmoothQuant for ORT static quantization via Intel Neural
Compressor.

> Note:
> Please use neural-compressor==2.2 to try the SmoothQuant function.

### Motivation and Context
For large language models (LLMs) with very large parameter counts,
systematic outliers make activations difficult to quantize. As a
training-free post-training quantization (PTQ) solution, SmoothQuant
migrates this difficulty offline from activations to weights with a
mathematically equivalent transformation. Integrating SmoothQuant into
ORT quantization can improve the accuracy of INT8 LLMs.
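
The mathematically equivalent transformation can be illustrated with a minimal pure-Python sketch. It is not the code added by this change and does not use the ORT or neural-compressor APIs; the helper names (`matmul`, `smooth_scales`) and the toy matrices are hypothetical. It assumes the per-channel scale commonly given for SmoothQuant, `s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)`, with `alpha = 0.5`.

```python
# Illustrative sketch only: SmoothQuant-style smoothing on a toy linear layer.
# All names here are hypothetical and are not ORT or neural-compressor APIs.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def smooth_scales(acts, weights, alpha=0.5):
    """Per-input-channel scale s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)."""
    scales = []
    for j in range(len(weights)):
        act_max = max(abs(row[j]) for row in acts)   # activation range of channel j
        w_max = max(abs(w) for w in weights[j])      # weight range of channel j
        scales.append(act_max ** alpha / w_max ** (1 - alpha))
    return scales

# Activations X (tokens x channels); channel 1 carries an outlier.
X = [[0.5, 40.0], [-1.0, -60.0]]
# Weights W (channels x outputs).
W = [[0.2, -0.4], [0.1, 0.3]]

s = smooth_scales(X, W)

# Divide each activation channel by s_j and multiply the matching weight row by s_j.
X_smooth = [[x / s_j for x, s_j in zip(row, s)] for row in X]
W_smooth = [[w * s_j for w in row] for row, s_j in zip(W, s)]

Y = matmul(X, W)                      # original layer output
Y_smooth = matmul(X_smooth, W_smooth) # output after smoothing

print("scales:", s)
print("outputs match:", all(
    abs(a - b) < 1e-9 for ra, rb in zip(Y, Y_smooth) for a, b in zip(ra, rb)
))
```

Because the scale applied to each activation channel is cancelled by the scale on the matching weight row, the layer output is unchanged up to floating-point rounding, while the outlier channel's activation range shrinks and becomes easier to quantize.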

---------

Signed-off-by: Mengni Wang <mengni.wang@intel.com>
2023-07-26 18:56:45 -07:00
..
install_centos.sh
install_deps.sh
install_deps_aten.sh
install_deps_eager.sh
install_deps_lort.sh [DORT] Use new FX-to-ONNX exporter (#16450) 2023-07-04 13:13:04 -07:00
install_shared_deps.sh
install_ubuntuos.sh
requirements.txt Support SmoothQuant for ORT static quantization (#16288) 2023-07-26 18:56:45 -07:00