onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-09 00:30:53 +00:00

Author	SHA1	Message	Date
Yufeng Li	5dc7339be6	Add quantization tool to python package (#4458 ) * Add quantization tool to python package	2020-07-08 21:42:53 -07:00
Yufeng Li	fc5e65a22d	Add quantization support for GPT2 past state and use model to generate outputs in OpTester (#4340 ) * Make quantization support GPT2 past state * Make OpTester able to generate reference outputs with a model, so outputs no longer need to be computed manually, which is impossible in some cases.	2020-06-26 09:29:29 -07:00
Yufeng Li	197da135eb	Implement quantized Attention on cpu (#4111 ) * Implement QAttention on CPU * support QAttention in quantization tool * refine attention code * add more unit tests	2020-06-03 13:42:00 -07:00
Yufeng Li	7c774e967a	support quantization of optimized model with ir<4 (#3853 )	2020-05-13 11:16:37 -07:00
Zhang Lei	c365822808	Refactor parts of calibrate.py. Add QLinearAdd and QLinearMul support. Fix bugs when loading JPGs that are not strict RGB, and typos in the load_batch call. (#3542 )	2020-04-18 17:10:55 -07:00
Yufeng Li	af618278f6	fix bugs in quantization and calibration tools (#3329 ) Fix 3 bugs: (1) node names are duplicated in calibration augment_graph if the name of the node to quantize is empty; (2) if output nodes are quantized, output values are quantized and not dequantized back; (3) Gather with data type int64 should not be quantized.	2020-03-30 17:50:25 -07:00
Tianlei Wu	403f99cd77	Use yapf to format python (#3276 ) Update ReformatSourcePython.bat to use YAPF to format python code, and add onnxruntime\test directory to be formatted. Add onnxruntime\.style.yapf for configuration. The style is based on google, except max column width 120. Format python scripts using ReformatSourcePython.bat.	2020-03-20 14:34:10 -07:00
Yufeng Li	a69d859912	fix quantize_bias (#3270 )	2020-03-20 11:36:47 -07:00
Tracy Sharpe	fe0b2b2abd	QLinearConv speed up (#3196 ) For x86/x64 builds, change the QLinearConv op to use MLAS for the u8u8=s32 GEMM, then requantize the intermediate buffer to u8.	2020-03-13 16:54:55 -07:00
Yufeng Li	c69194ec4c	fix the missing return in _get_quantize_input_nodes and format code with yapf (#3199 ) * fix the missing return for function _get_quantize_input_nodes * format quantization code with yapf	2020-03-13 09:28:41 -07:00
Yufeng Li	876d0c5430	Make quantization parameters constant weights instead of overridable (#3160 )	2020-03-10 08:35:02 -07:00
Yufeng Li	1d2b8115e2	Support u8u8 in quantization tool (#3140 )	2020-03-05 14:42:46 -08:00
Ashwini Khade	807a59c55d	Add calibration tool (#2845 ) * add calibration tool * add model for e2e example * format readme * some more formatting updates * plus a few more updates * plus review comments * plus updates * more updates	2020-01-20 14:49:35 -08:00
Ashwini Khade	8643f3ebbb	add domain check for nodes + update documentation (#2831 )	2020-01-14 11:15:50 -08:00
Ashwini Khade	d197079473	quantization script updates (#2208 )	2019-10-21 12:25:52 -07:00
Jonny Shipton	df472cbfbd	quantise: Don't error when initializer graph input is missing (#1872 ) ONNX IR version 4 and above do not require graph initializers to have corresponding graph inputs.	2019-09-30 21:57:00 +00:00
Ashwini Khade	0f6cf9a335	enable quantizing specific nodes (#1742 )	2019-09-03 11:04:17 -07:00
Ashwini Khade	16087f3133	update default values for weight quantization (#1564 )	2019-08-05 21:39:37 -07:00
PhaniShekhar	e26e11b9f7	Quantization tool to support quantization of Conv and MatMul nodes. (#1057 ) * Move quantization tool from onnx to onnxruntime * Fix some issues * Use u8_s8 for asymmetric mode and u8_u8 for symmetric mode irrespective of whether inputs are initializers or from previous * Address PR comments * Fix error message formatting * Separate static/dynamic and quantization mode	2019-06-18 20:44:45 -07:00

19 commits