onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-14 20:48:00 +00:00

Wang, Mengni fe463d4957 Support SmoothQuant for ORT static quantization (#16288)	2023-07-26 18:56:45 -07:00

### Description
Support SmoothQuant for ORT static quantization via Intel Neural Compressor.

> Note: Please use neural-compressor==2.2 to try the SmoothQuant function.

### Motivation and Context
For large language models (LLMs) with very large parameter counts, systematic outliers make quantization of activations difficult. As a training-free post-training quantization (PTQ) solution, SmoothQuant migrates this difficulty offline from activations to weights with a mathematically equivalent transformation. Integrating SmoothQuant into ORT quantization can improve the accuracy of INT8 LLMs.

---------

Signed-off-by: Mengni Wang <mengni.wang@intel.com>
inference	Upgrade old Python version in packaging pipeline (#16667)	2023-07-17 08:24:47 -07:00
scripts	Support SmoothQuant for ORT static quantization (#16288)	2023-07-26 18:56:45 -07:00
Dockerfile.arm_yocto
Dockerfile.manylinux2014_aten_cpu
Dockerfile.manylinux2014_cpu	Update python 3.11 and remove 3.7 for Linux (#15214)	2023-03-27 14:46:30 -07:00
Dockerfile.manylinux2014_cuda11	Update python 3.11 and remove 3.7 for Linux (#15214)	2023-03-27 14:46:30 -07:00
Dockerfile.manylinux2014_cuda11_6_tensorrt8_4	Update python 3.11 and remove 3.7 for Linux (#15214)	2023-03-27 14:46:30 -07:00
Dockerfile.manylinux2014_cuda11_6_tensorrt8_5	Update python 3.11 and remove 3.7 for Linux (#15214)	2023-03-27 14:46:30 -07:00
Dockerfile.manylinux2014_cuda11_8_tensorrt8_6	[TensorRT EP] TRT 8.6 minor version update (#16475)	2023-06-26 10:44:27 -07:00
Dockerfile.manylinux2014_eager_cpu
Dockerfile.manylinux2014_lort_cpu	Make ORT callable from various Pytorch compilers (LazyTensor, TorchDynamo, etc) (#10460)	2022-08-22 09:40:40 -07:00
Dockerfile.manylinux2014_rocm	[ROCm] remove ROCm5.2.3, ROCm5.3, ROCm5.4 from pipeline (#16004)	2023-05-19 10:29:01 +08:00
Dockerfile.manylinux2014_training_cuda11_8	Add support for cuda 11.8 and python 3.11 for training (#15548)	2023-04-20 12:56:45 -07:00
Dockerfile.ubuntu_cuda11_6_tensorrt8_4	Update cmake version in Linux build (#15707)	2023-04-27 20:02:33 -07:00
Dockerfile.ubuntu_cuda11_8_tensorrt8_5	Update cmake version in Linux build (#15707)	2023-04-27 20:02:33 -07:00
Dockerfile.ubuntu_cuda11_8_tensorrt8_6	[TensorRT EP] TRT 8.6 minor version update (#16475)	2023-06-26 10:44:27 -07:00
Dockerfile.ubuntu_for_arm
Dockerfile.ubuntu_gpu_training	Add support for cuda 11.8 and python 3.11 for training (#15548)	2023-04-20 12:56:45 -07:00
Dockerfile.ubuntu_openvino	ovep dockerfile and wheel docs changes (#16482)	2023-07-19 09:01:09 -07:00
Dockerfile.ubuntu_tensorrt	Unblock Linux MultiGPU TensorRT CI (#16446)	2023-06-21 17:15:39 -07:00
Dockerfile.ubuntu_tensorrt_bin	ort build flag fix (#16072)	2023-05-24 12:32:10 -07:00
Dockerfile_manylinux2014_openvino_multipython	Gradle clean up (#14973)	2023-03-10 10:50:32 -08:00
manylinux-entrypoint
manylinux.patch	Avoid taking dependency on dl.fedoraproject.org (#16202)	2023-06-02 07:41:46 -07:00
migraphx-ci-pipeline-env.Dockerfile	[ROCm] Optimize ROCm CI pipeline 2 (#16691)	2023-07-24 13:57:48 +08:00