onnxruntime/onnxruntime/python/tools/quantization
Xavier Dupré 905faea3b2
Fix static quantization for QDQ and Percentile distribution (#17649)
### Description
One quantization case was not covered by the existing unit tests.
This PR fixes that case and adds a unit test to cover it. It fixes
issue #17619.



### Motivation and Context
2023-09-25 10:11:58 -07:00
CalTableFlatBuffers
operators Fix static quantization for QDQ and Percentile distribution (#17649) 2023-09-25 10:11:58 -07:00
__init__.py add int4 quantization code in python (#17077) 2023-08-11 15:17:58 -07:00
calibrate.py Fix static quantization for QDQ and Percentile distribution (#17649) 2023-09-25 10:11:58 -07:00
matmul_weight4_quantizer.py Bugfixes: dangling pointers and python property typo (#17285) 2023-08-29 12:50:15 -07:00
onnx_model.py Updating QDQ to support Float8E4M3FN (#16550) 2023-08-08 12:18:48 +02:00
onnx_quantizer.py [QNN/CPU EP] Add 16-bit Quantize/Dequantize contrib ops (#17015) 2023-09-18 09:43:34 -07:00
preprocess.py Update default external_data_location for pre-process of quantization (#16399) 2023-06-20 09:37:17 -07:00
qdq_loss_debug.py Disable PERF* rules in ruff to allow better readability (#16834) 2023-07-25 15:38:22 -07:00
qdq_quantizer.py Fix static quantization for QDQ and Percentile distribution (#17649) 2023-09-25 10:11:58 -07:00
quant_utils.py [QNN/CPU EP] Add 16-bit Quantize/Dequantize contrib ops (#17015) 2023-09-18 09:43:34 -07:00
quantize.py [QNN/CPU EP] Add 16-bit Quantize/Dequantize contrib ops (#17015) 2023-09-18 09:43:34 -07:00
README.md
registry.py
shape_inference.py Issue #17098: Shape inferencing fails during quantization for large models (#17100) 2023-08-15 18:38:14 -07:00

# Quantization Tool

This tool quantizes select ONNX models. Whether a model is supported depends on the operators it contains. For usage details, see https://onnxruntime.ai/docs/performance/quantization.html; for examples, see https://github.com/microsoft/onnxruntime-inference-examples/tree/main/quantization.
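
To give a rough feel for what static quantization computes, here is a minimal sketch in plain Python. It is not the code in `calibrate.py` or `quant_utils.py`; the function names (`percentile_range`, `affine_params`, `quantize`, `dequantize`) are hypothetical and exist only for this illustration. It assumes asymmetric uint8 quantization and a simple rank-based percentile clip; the real percentile calibrator may work differently (for example, on histograms).

```python
def percentile_range(values, percentile=99.0):
    """Clip both tails by rank and return the (low, high) range that remains.

    Illustrative only: a plain rank-based clip over raw samples.
    """
    ordered = sorted(values)
    n = len(ordered)
    tail = (100.0 - percentile) / 100.0
    hi_idx = int((1.0 - tail) * (n - 1))
    lo_idx = (n - 1) - hi_idx
    return ordered[lo_idx], ordered[hi_idx]


def affine_params(rmin, rmax, qmin=0, qmax=255):
    """Derive (scale, zero_point) mapping [rmin, rmax] onto [qmin, qmax]."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    if scale == 0.0:
        # Degenerate all-zero range: fall back to a neutral scale.
        return 1.0, qmin
    zero_point = int(round(qmin - rmin / scale))
    return scale, max(qmin, min(qmax, zero_point))


def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Map a float to an integer code, saturating at the type limits."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to its approximate float value."""
    return (q - zero_point) * scale


# Activations in [-64, 191] plus two outliers that a min/max range would keep.
samples = [float(i) for i in range(-64, 192)] + [-5000.0, 5000.0]
low, high = percentile_range(samples, percentile=99.0)
scale, zero_point = affine_params(low, high)
```

The point of the percentile step is that a few outliers would otherwise stretch the range and waste most of the 256 available codes. Values outside the clipped range saturate at 0 or 255 when quantized.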