mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-09 00:30:53 +00:00

anujj 23d48ea647 Add TensorRT-Model-Optimizer INT4 AWQ support in onnxruntime tools (#22390) [TensorRT-Model-Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer) has an implementation of INT4 AWQ. This change adds support in the onnxruntime tools for quantizing models with TensorRT-Model-Optimizer.	2024-10-11 13:31:54 -07:00
CalTableFlatBuffers
execution_providers	Add changes for strided calibration (#20949)	2024-06-21 08:23:23 -07:00
fusions	[QNN Quantization] Ensure fused nodes have names (#19650)	2024-02-27 02:27:35 -08:00
operators	[QNN EP] Add support for GatherElements (#15966)	2024-08-19 14:33:40 -07:00
__init__.py
base_quantizer.py	Add overflow protection for quantization bias to reduce quantization precision loss (#21645)	2024-08-28 14:29:17 -07:00
calibrate.py	Fix conversion of TensorData, TensorsData to json (#22166)	2024-10-06 19:13:03 -07:00
matmul_4bits_quantizer.py	Add TensorRT-Model-Optimizer INT4 AWQ support in onnxruntime tools (#22390)	2024-10-11 13:31:54 -07:00
matmul_bnb4_quantizer.py	Fix argparser in `matmul_bnb4_quantizer` (#19812)	2024-03-07 11:31:34 -08:00
onnx_model.py	[QDQ Quant] Support mixed-precision integer quantization via overrides (#19925)	2024-03-23 11:05:08 -07:00
onnx_quantizer.py	Fix missing argument when calling _get_quantize_input_nodes (#20245)	2024-04-25 00:46:48 +02:00
preprocess.py
qdq_loss_debug.py
qdq_quantizer.py	[QNN Quant tool] Fix validation of per-channel overrides for models with external data (#21656)	2024-08-09 14:46:52 -07:00
quant_utils.py	Fix conversion of TensorData, TensorsData to json (#22166)	2024-10-06 19:13:03 -07:00
quantize.py	Added a tool to quantize Gather to GatherBlockQuantized (#21697)	2024-08-19 10:25:36 -07:00
README.md
registry.py	[QNN EP] Add support for GatherElements (#15966)	2024-08-19 14:33:40 -07:00
shape_inference.py	Update api backward compatibility (#20136)	2024-04-01 21:37:56 -07:00
tensor_quant_overrides.py	[QNN Quant tool] Fix validation of per-channel overrides for models with external data (#21656)	2024-08-09 14:46:52 -07:00

README.md

Quantization Tool

This tool quantizes selected ONNX models. Whether a given model can be quantized depends on the operators it contains. See https://onnxruntime.ai/docs/performance/quantization.html for usage details and https://github.com/microsoft/onnxruntime-inference-examples/tree/main/quantization for examples.
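
As a rough illustration of how the tool is typically invoked, the sketch below wraps dynamic (weight-only) quantization via `quantize_dynamic` from `onnxruntime.quantization`. It is a minimal sketch, not the documented workflow: the helper names `quantized_path` and `quantize_weights_int8` and the `.quant.onnx` naming scheme are assumptions made for this example, and the linked documentation remains the reference for supported options.

```python
from pathlib import Path


def quantized_path(model_path):
    """Hypothetical naming helper: model.onnx -> model.quant.onnx."""
    p = Path(model_path)
    return p.with_name(p.stem + ".quant" + p.suffix)


def quantize_weights_int8(model_path):
    """Dynamically quantize a model's weights to int8 and return the output path.

    Assumes onnxruntime (with its quantization tools) is installed and that
    the model only uses operators the tool supports.
    """
    # Imported lazily so this module loads even without onnxruntime installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    out_path = quantized_path(model_path)
    quantize_dynamic(str(model_path), str(out_path), weight_type=QuantType.QInt8)
    return out_path
```

Calling `quantize_weights_int8("model.onnx")` would write the quantized model next to the original. Static quantization additionally needs calibration data, which this sketch does not cover.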