* Make split work for quantization.
* NHWC transformer support for split operator
* Refactor according to feedback. Test cases will be added soon.
* Fix build error on windows.
* Add test case for split op on uint8_t support
* Add nhwc_transformer_test for split uint8_t support
* Make some changes according to PR feedback.
* Initial implementation of generating calibration dynamic range table
* Initialize validation support for Quantization
* Initialize validation support for Quantization (cont.)
* Improve validation support for Quantization
* Rewrite/Refine for calibration and validation
* Rewrite/Refine for calibration and validation (cont.)
* Refine code
* Add data reader for BERT
* Add flatbuffers to serialize calibration table
* Refine code and add BERT evaluation
* Refine the code
* minor modification
* Add preprocess/postprocess of vision team yolov3 and refine the code
* Update annotation
* Make bbox coordinates more accurate
* Fix bug
* Add support of batch processing
* Batch processing for model zoo yolov3
* Add batch inference for evaluation
* Refine the code
* Add README
* Add comments
* Refine the code for PR
* Remove batch support checking in data_reader and refine the code
* Refine the code for PR
* Refine the code for PR review
Co-authored-by: Olivia Jain <oljain@microsoft.com>
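
For reference, the calibration dynamic range table mentioned above can be thought of as a per-tensor (min, max) record accumulated over calibration batches. The sketch below is illustrative only: the function names and the dict-of-lists input layout are assumptions made for this example, not the tool's actual interface.

```python
from typing import Dict, Iterable, List, Tuple

RangeTable = Dict[str, Tuple[float, float]]


def update_ranges(table: RangeTable, batch_outputs: Dict[str, List[float]]) -> RangeTable:
    """Widen the stored (min, max) of each tensor with values from one batch."""
    for name, values in batch_outputs.items():
        lo, hi = min(values), max(values)
        if name in table:
            old_lo, old_hi = table[name]
            lo, hi = min(lo, old_lo), max(hi, old_hi)
        table[name] = (lo, hi)
    return table


def build_range_table(batches: Iterable[Dict[str, List[float]]]) -> RangeTable:
    """Fold all calibration batches into a single dynamic range table."""
    table: RangeTable = {}
    for batch_outputs in batches:
        update_ranges(table, batch_outputs)
    return table
```

Keeping only the running min and max means the table stays small no matter how many batches are processed, which is one plausible reason to serialize it separately from the model.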
Quantize LSTM:
1. Dynamically quantize the MatMul operations inside the LSTM. Activation functions are not quantized.
2. Support per-channel quantization of the input weight and recurrent weight.
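
As an illustration of what per-channel support for the LSTM weights means, the sketch below gives each weight row its own scale. It assumes symmetric int8 quantization with one channel per row; the commit does not state the scheme or the channel axis, and `quantize_per_channel` is a hypothetical helper, not part of the tool.

```python
from typing import List, Tuple


def quantize_per_channel(
    weights: List[List[float]], qmax: int = 127
) -> Tuple[List[List[int]], List[float]]:
    """Quantize each row (channel) with its own symmetric scale.

    Assumption: symmetric int8 range [-qmax, qmax], one scale per row.
    """
    quantized: List[List[int]] = []
    scales: List[float] = []
    for row in weights:
        max_abs = max(abs(v) for v in row)
        # An all-zero channel gets scale 1.0 to avoid dividing by zero.
        scale = max_abs / qmax if max_abs > 0 else 1.0
        q_row = [max(-qmax, min(qmax, round(v / scale))) for v in row]
        quantized.append(q_row)
        scales.append(scale)
    return quantized, scales
```

Compared with a single per-tensor scale, a channel with small weights keeps more resolution because its scale is not dominated by a larger channel.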
* remove shape inference and fix the problem with saving large models
* remove unnecessary import
* refine code and add external format for quantize_qat
* remove initializers in tensors_to_calibrate
* small refine
Co-authored-by: t-yguo <t-yguo@microsoft.com>
Improve quantization tools:
1. Support QAT
2. Make the quantization tool register operators.
3. Make the API clearer to use
Co-authored-by: t-yguo <t-yguo@microsoft.com>
* improve calibration tool
* modify calibration interface name
* refine calibrate and calibrate_user
* refine and add type info
* add e2e user example file
* remove unnecessary files
* remove test images no longer needed
* update readme document
Co-authored-by: t-yguo <t-yguo@microsoft.com>
* Make quantization support GPT-2 past state
* Make OpTester able to generate reference outputs from a model. This removes the need to compute expected outputs manually, which is impossible in some cases.
Fix 3 bugs:
1. Node names are duplicated in calibration augment_graph if the name of the node to quantize is empty.
2. If output nodes are quantized, the output values are quantized and not dequantized back.
3. Gather with data type int64 should not be quantized.
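
To illustrate the empty-name bug above: if names for added calibration nodes are derived from the name of the node to quantize, every unnamed node yields the same derived name. One possible remedy is to give unnamed nodes a unique placeholder first. This is a sketch of that idea only; `assign_unique_names` and the `calib_node` prefix are hypothetical and do not describe the actual fix.

```python
from typing import List


def assign_unique_names(node_names: List[str], prefix: str = "calib_node") -> List[str]:
    """Replace empty node names with generated names that collide with nothing."""
    used = {name for name in node_names if name}
    result: List[str] = []
    counter = 0
    for name in node_names:
        if not name:
            candidate = f"{prefix}_{counter}"
            # Skip generated names that are already taken by a real node.
            while candidate in used:
                counter += 1
                candidate = f"{prefix}_{counter}"
            used.add(candidate)
            counter += 1
            name = candidate
        result.append(name)
    return result
```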
Update ReformatSourcePython.bat to use YAPF to format Python code, and add the onnxruntime\test directory to the set of directories to be formatted.
Add onnxruntime\.style.yapf for configuration. The style is based on Google's, except that the maximum column width is 120.
Format Python scripts using ReformatSourcePython.bat.
* add calibration tool
* add model for e2e example
* format readme
* some more formatting updates
* plus a few more updates
* plus review comments
* plus updates
* more updates