Commit graph

19 commits

Author SHA1 Message Date
Yufeng Li
5dc7339be6
Add quantization tool to python package (#4458)
* Add quantization tool to python package
2020-07-08 21:42:53 -07:00
Yufeng Li
fc5e65a22d
Add quantization support for GPT2 past state and use model to generate outputs in OpTester (#4340)
* Make quantization support GPT2 past state
* Make OpTester able to generate reference outputs with a model. With it, there is no need to compute outputs manually, which is impossible in some cases.
2020-06-26 09:29:29 -07:00
Yufeng Li
197da135eb
Implement quantized Attention on cpu (#4111)
* Implement QAttention on CPU
* support QAttention in quantization tool
* refine attention code
* add more unit tests
2020-06-03 13:42:00 -07:00
Yufeng Li
7c774e967a
support quantization of optimized model with ir<4 (#3853) 2020-05-13 11:16:37 -07:00
Zhang Lei
c365822808
Refactor calibrate.py. Add QLinearAdd and QLinearMul support. Fix bugs loading JPGs that are not strictly RGB, and typos in the load_batch call. (#3542) 2020-04-18 17:10:55 -07:00
Yufeng Li
af618278f6
fix bugs in quantization and calibration tools (#3329)
Fix 3 bugs:
Node names are duplicated in calibration augment_graph if the name of the node to quantize is empty.
If output nodes are quantized, output values are quantized and not dequantized back.
Gather with data type int64 should not be quantized
2020-03-30 17:50:25 -07:00
Tianlei Wu
403f99cd77
Use yapf to format python (#3276)
Update ReformatSourcePython.bat to use YAPF to format python code, and add onnxruntime\test directory to be formatted.

Add onnxruntime\.style.yapf for configuration. The style is based on google, except max column width 120.

Format python scripts using ReformatSourcePython.bat.
2020-03-20 14:34:10 -07:00
Yufeng Li
a69d859912
fix quantize_bias (#3270) 2020-03-20 11:36:47 -07:00
Tracy Sharpe
fe0b2b2abd
QLinearConv speed up (#3196)
For x86/x64 builds, change the QLinearConv op to use MLAS for the u8u8=s32 GEMM, then requantize the intermediate buffer to u8.
2020-03-13 16:54:55 -07:00
Yufeng Li
c69194ec4c
fix the missing return in _get_quantize_input_nodes and format code with yapf (#3199)
* fix the missing return for function _get_quantize_input_nodes

* format quantization code with yapf
2020-03-13 09:28:41 -07:00
Yufeng Li
876d0c5430
Make quantization parameters as constant weight instead of overrideable (#3160) 2020-03-10 08:35:02 -07:00
Yufeng Li
1d2b8115e2
Support u8u8 in quantization tool (#3140) 2020-03-05 14:42:46 -08:00
Ashwini Khade
807a59c55d
Add calibration tool (#2845)
* add calibration tool

* add model for e2e example

* format readme

* some more formatting updates

* plus a few more updates

* plus review comments

* plus updates

* more updates
2020-01-20 14:49:35 -08:00
Ashwini Khade
8643f3ebbb
add domain check for nodes + update documentation (#2831) 2020-01-14 11:15:50 -08:00
Ashwini Khade
d197079473
quantization script updates (#2208) 2019-10-21 12:25:52 -07:00
Jonny Shipton
df472cbfbd
quantise: Don't error when initializer graph input is missing (#1872)
ONNX IR version 4 and above do not require graph initializers to have
corresponding graph inputs.
2019-09-30 21:57:00 +00:00
Ashwini Khade
0f6cf9a335
enable quantizing specific nodes (#1742) 2019-09-03 11:04:17 -07:00
Ashwini Khade
16087f3133
update default values for weight quantization (#1564) 2019-08-05 21:39:37 -07:00
PhaniShekhar
e26e11b9f7
Quantization tool to support quantization of Conv and MatMul nodes. (#1057)
* Move quantization tool from onnx to onnxruntime

* Fix some issues

* Use u8_s8 for asymmetric mode and u8_u8 for symmetric mode irrespective of whether inputs are initializers or from previous

* Address PR comments

* Fix error message formatting

* Separate static/dynamic and quantization mode
2019-06-18 20:44:45 -07:00