onnxruntime/tools/ci_build/github/linux/docker
Wang, Mengni fe463d4957
Support SmoothQuant for ORT static quantization (#16288)
### Description

Add SmoothQuant support to ORT static quantization via Intel Neural
Compressor.

> Note:
> Please use neural-compressor==2.2 to try the SmoothQuant feature.

### Motivation and Context
For large language models (LLMs) with very large parameter counts,
systematic outliers make quantization of activations difficult. As a
training-free post-training quantization (PTQ) solution, SmoothQuant
migrates this difficulty offline from activations to weights with a
mathematically equivalent transformation. Integrating SmoothQuant into
ORT quantization can improve the accuracy of INT8 LLMs.
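
The equivalent transformation mentioned above can be illustrated with a minimal pure-Python sketch. It is not the code added by this PR and does not call neural-compressor or ORT; the helper names `matmul` and `smooth` are hypothetical, and the per-channel scale `max|X_j|^alpha / max|W_j|^(1 - alpha)` follows the formula from the SmoothQuant paper, with `alpha=0.5` assumed as the migration strength.

```python
def matmul(a, b):
    """Plain nested-list matrix product (illustrative helper, not a library call)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def smooth(x, w, alpha=0.5):
    """Rescale activations x (tokens x channels) and weights w (channels x outputs).

    Assumes every channel has a non-zero maximum in both x and w.
    Returns (x_s, w_s, scales) such that x_s @ w_s equals x @ w.
    """
    n_ch = len(w)
    # Per-input-channel absolute maxima of activations and weights.
    act_max = [max(abs(row[j]) for row in x) for j in range(n_ch)]
    w_max = [max(abs(v) for v in w[j]) for j in range(n_ch)]
    # Smoothing scale per channel: larger alpha moves more difficulty to the weights.
    scales = [(a ** alpha) / (m ** (1 - alpha)) for a, m in zip(act_max, w_max)]
    x_s = [[v / s for v, s in zip(row, scales)] for row in x]  # X * diag(s)^-1
    w_s = [[v * s for v in row] for row, s in zip(w, scales)]  # diag(s) * W
    return x_s, w_s, scales


# Channel 1 of the activations carries an outlier (100.0).
x = [[1.0, 100.0], [2.0, -80.0]]
w = [[0.5, -0.25], [0.1, 0.2]]
x_s, w_s, scales = smooth(x, w)
print(matmul(x, w))      # original product
print(matmul(x_s, w_s))  # same product up to floating-point error
print(max(abs(r[1]) for r in x_s))  # outlier channel now has a smaller range
```

Because the scaling is folded into the weights offline, the product is unchanged while the activation range that the quantizer must cover becomes narrower.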

---------

Signed-off-by: Mengni Wang <mengni.wang@intel.com>
2023-07-26 18:56:45 -07:00
inference Upgrade old Python version in packaging pipeline (#16667) 2023-07-17 08:24:47 -07:00
scripts Support SmoothQuant for ORT static quantization (#16288) 2023-07-26 18:56:45 -07:00
Dockerfile.arm_yocto
Dockerfile.manylinux2014_aten_cpu
Dockerfile.manylinux2014_cpu Update python 3.11 and remove 3.7 for Linux (#15214) 2023-03-27 14:46:30 -07:00
Dockerfile.manylinux2014_cuda11 Update python 3.11 and remove 3.7 for Linux (#15214) 2023-03-27 14:46:30 -07:00
Dockerfile.manylinux2014_cuda11_6_tensorrt8_4 Update python 3.11 and remove 3.7 for Linux (#15214) 2023-03-27 14:46:30 -07:00
Dockerfile.manylinux2014_cuda11_6_tensorrt8_5 Update python 3.11 and remove 3.7 for Linux (#15214) 2023-03-27 14:46:30 -07:00
Dockerfile.manylinux2014_cuda11_8_tensorrt8_6 [TensorRT EP] TRT 8.6 minor version update (#16475) 2023-06-26 10:44:27 -07:00
Dockerfile.manylinux2014_eager_cpu
Dockerfile.manylinux2014_lort_cpu Make ORT callable from various Pytorch compilers (LazyTensor, TorchDynamo, etc) (#10460) 2022-08-22 09:40:40 -07:00
Dockerfile.manylinux2014_rocm [ROCm] remove ROCm5.2.3, ROCm5.3, ROCm5.4 from pipeline (#16004) 2023-05-19 10:29:01 +08:00
Dockerfile.manylinux2014_training_cuda11_8 Add support for cuda 11.8 and python 3.11 for training (#15548) 2023-04-20 12:56:45 -07:00
Dockerfile.ubuntu_cuda11_6_tensorrt8_4 Update cmake version in Linux build (#15707) 2023-04-27 20:02:33 -07:00
Dockerfile.ubuntu_cuda11_8_tensorrt8_5 Update cmake version in Linux build (#15707) 2023-04-27 20:02:33 -07:00
Dockerfile.ubuntu_cuda11_8_tensorrt8_6 [TensorRT EP] TRT 8.6 minor version update (#16475) 2023-06-26 10:44:27 -07:00
Dockerfile.ubuntu_for_arm
Dockerfile.ubuntu_gpu_training Add support for cuda 11.8 and python 3.11 for training (#15548) 2023-04-20 12:56:45 -07:00
Dockerfile.ubuntu_openvino ovep dockerfile and wheel docs changes (#16482) 2023-07-19 09:01:09 -07:00
Dockerfile.ubuntu_tensorrt Unblock Linux MultiGPU TensorRT CI (#16446) 2023-06-21 17:15:39 -07:00
Dockerfile.ubuntu_tensorrt_bin ort build flag fix (#16072) 2023-05-24 12:32:10 -07:00
Dockerfile_manylinux2014_openvino_multipython Gradle clean up (#14973) 2023-03-10 10:50:32 -08:00
manylinux-entrypoint
manylinux.patch Avoid taking dependency on dl.fedoraproject.org (#16202) 2023-06-02 07:41:46 -07:00
migraphx-ci-pipeline-env.Dockerfile [ROCm] Optimize ROCm CI pipeline 2 (#16691) 2023-07-24 13:57:48 +08:00