onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-16 01:33:39 +00:00

Wang, Mengni fe463d4957 Support SmoothQuant for ORT static quantization (#16288)

### Description
Support SmoothQuant for ORT static quantization via intel neural compressor

> Note: Please use neural-compressor==2.2 to try SmoothQuant function.

### Motivation and Context
For large language models (LLMs) with gigantic parameters, the systematic outliers make quantification of activations difficult. As a training free post-training quantization (PTQ) solution, SmoothQuant offline migrates this difficulty from activations to weights with a mathematically equivalent transformation. Integrating SmoothQuant into ORT quantization can benefit the accuracy of INT8 LLMs.

---------

Signed-off-by: Mengni Wang <mengni.wang@intel.com>

2023-07-26 18:56:45 -07:00
..
github	Support SmoothQuant for ORT static quantization (#16288)	2023-07-26 18:56:45 -07:00
__init__.py
amd_hipify.py	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
build.py	Bug fix for nested control flow ops for TRT EP (#16343)	2023-07-23 16:16:17 -07:00
clean_docker_image_cache.py	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
compile_triton.py	[Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789)	2023-07-21 12:53:41 -07:00
coverage.py
gen_def.py	Basic CSharp packaging support for ROCm EP (#15535)	2023-05-16 07:27:38 +08:00
get_docker_image.py	[Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789)	2023-07-21 12:53:41 -07:00
logger.py
op_registration_utils.py	[CI] Removes type2 in process_registration and fix Windows GPU Reduced Ops CI Pipeline (#16530)	2023-07-07 18:21:06 +02:00
op_registration_validator.py	[CI] Removes type2 in process_registration and fix Windows GPU Reduced Ops CI Pipeline (#16530)	2023-07-07 18:21:06 +02:00
patch_manylinux.py	[Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789)	2023-07-21 12:53:41 -07:00
policheck_exclusions.xml	Exculde hipify option from policheck (#13431)	2022-10-25 16:35:16 +08:00
reduce_op_kernels.py	Re-organize the transpose optimization and layout transformation files. (#16246)	2023-07-07 08:24:47 +10:00
replace_urls_in_deps.py	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
requirements.txt	Bump transformers from 4.24.0 to 4.30.0 in /tools/ci_build (#16331)	2023-06-16 13:08:46 -07:00
update_tsaoptions.py	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
upload_python_package_to_azure_storage.py	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00