onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-14 01:13:40 +00:00

Author	SHA1	Message	Date
Zhang Lei	f77ff1bc3d	Quantization support for split operator with its NHWC support (#6107 ) * Make split working for quantization. * NHWC transformer support for split operator * Refactor some according to Feedback. Will add test cases soon. * Fix build error on windows. * Add test case for split op on uint8_t support * Add nhwc_transformer_test for split uint8_t support * Some change according to PR feedbacks.	2021-01-13 10:05:34 -08:00
Chi Lo	945fae8f56	Lochi/quantization tool for trt (#6103 ) * Initial implementation of generating calibration dynamic range table * Initialize validation support for Quantization * Initialize validation support for Quantization (cont.) * Improve validation support for Quantization * Improve validation support for Quantization * Rewrite/Refine for calibration and validation * Rewrite/Refine for calibration and validation (cont.) * Refine code * Refine code * Add data reader for BERT * Add flatbuffers to serialize calibration table * Refine code and add BERT evaluation * Refine the code * minor modification * Add preprocess/postprocess of vision team yolov3 and refine the code * Update annotation * Make bbox coordinates more accurate * Fix bug * Add support of batch processing * Batch processing for model zoo yolov3 * Add batch inference for evaluation * Refine the code * Add README * Add comments * Refine the code for PR * Remove batch support checking in data_reader and refine the code * Refine the code for PR * Refine the code for PR review Co-authored-by: Olivia Jain <oljain@microsoft.com>	2020-12-21 20:59:08 -08:00
Zhang Lei	648c9c7789	Fix bugs for 1: Calibrator should check model inputs; 2: (#6017 ) quantize_inupts forgot to use parameter initializer_use_weight_qtyp.	2020-12-03 00:00:16 -08:00
Zhang Lei	9992f0f812	Implement QLinear GlobalAveragePool with sse2/neon. (#5838 ) Add QLinear Global Average Pool for quantization for ARM and SSE2. Co-authored-by: Tracy Sharpe <tracysh@microsoft.com>	2020-11-23 19:23:58 -08:00
Yufeng Li	6f86c4dbe3	Quantize LSTM (#5595 ) Quantize LSTM: 1. dynamically quantizes MatMul inside the LSTM. It doesn't quantize activation function. 2. support per-channel on the input weight and recurrent weight.	2020-11-18 11:21:49 -08:00
Peichen Xie	e8c0f5d0ff	Update the quantization script to support GEMM (transB==1) (#5432 ) * Modify onnx_quantizer.py * Fix topology order issues * Handle more cases	2020-11-17 21:24:48 -08:00
Zhang Lei	77b1eea9cf	Add option to allow quantize_input() use input_qtype for initializers. (#5721 )	2020-11-06 09:33:24 -08:00
Yufeng Li	5c4543e194	Calibrate float tensor only (#5704 )	2020-11-04 23:55:48 -08:00
Yufeng Li	6c2162e97a	Fix quantization of Conv1D with bias (#5491 ) * Fix reshape for Conv with bias	2020-10-20 15:27:26 -07:00
Yufeng Li	b04cf2d229	Update ORT to 1.5.1 in Bert Quantization Notebook (#5396 ) * Update ORT to 1.5.1 in Bert Quantization Notebook	2020-10-08 09:55:01 -07:00
Yufeng Li	e8b9aa1f29	fix quantization of EmbeddingLayerNorm (#5321 )	2020-10-01 20:08:43 -07:00
RRRachelllll555	507f5bf5f6	Update test calibrate script (#5185 ) * update test_calibrate according to latest calibrate.py * fix datasize bug in e2e example Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-09-27 21:59:56 -07:00
Yufeng Li	61ba5b501a	Fix bug in the back to back quantization of matmul and conv (#5264 ) * fix bug in the back to back quantization of matmul and conv * fix bug in back to back gather	2020-09-23 08:47:20 -07:00
RRRachelllll555	f7c1e51810	Remove shape inference and fix save large model(>2g) issue (#5210 ) * remove shape inference and fix save large model problem * remove unnecessary import * refine code and add external format for quantize_qat * remove initializers in tensors_to_calibrate * small refine Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-09-18 08:46:31 -07:00
Pranav Prakash	f5df96256c	Fix order of returned values in quantize_weight_per_channel (#5205 ) Must match returned order of `quantize_inputs`	2020-09-17 17:57:46 -07:00
Zhang Lei	cd0386b649	MaxPool versioning in quantization tools. (#5194 ) MaxPool versioning in quantization tools.	2020-09-16 22:52:24 -07:00
Yufeng Li	3068a835f1	Fix quantization of 1-D conv with bias (#5157 )	2020-09-14 18:07:14 -07:00
Andrei Shadrikov	82b25e1731	Fix datasize call in calibrate (#5110 ) * Moving datasize to the interface. * Reverting changes and addressing the comment	2020-09-14 18:06:23 -07:00
Zhang Lei	d45e49dd2b	Add LeakyRelu and Sigmoid QLinear Quantization support (#5116 ) * Add LeakyRelu and Sigmoid QLinear Quantization support * Change due to reflect master changes.	2020-09-14 14:46:24 -07:00
Yufeng Li	20b2f45b24	Support per-channel quantization of weight tensor (#5057 ) * Support per-channel quantization of weight tensor * rename util functions * fix bugs in calibrate * add support of reduce_range * refine opset check	2020-09-14 11:53:50 -07:00
Yufeng Li	ffc2b25a3a	Quantization tool improvement (#4933 ) Improve quantization tools: 1. Support QAT 2. Make quantization tool to register Operators. 3. Make the API clear to use Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-09-01 09:07:46 -07:00
RRRachelllll555	9a6db9b9f4	Fix next node access bug in calibration tool (#4863 ) * fix bug in calibration tool * fix next node access bugs * rm file in wrong folder * refine * optimize * refine * refine format * refine Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-08-21 20:48:54 -07:00
Yufeng Li	0575881949	Update quantization notebook to pytorch 1.6 (#4834 )	2020-08-18 14:20:46 -07:00
Vagif	6499a38b7d	Add the missing onnx_proto import (#4705 ) * add missing onnx_proto import * Fix TensorProto usage in calibrate.py * remove unused imports	2020-08-10 12:46:21 -07:00
RRRachelllll555	f3fc8ca954	Add input tensor calibration (#4619 ) * add input tensor calibration * set default fusions to be true Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-07-28 14:04:41 -07:00
Yufeng Li	9c75c29403	refine opset version getter (#4602 )	2020-07-24 10:34:56 -07:00
RRRachelllll555	c5df918744	improve calibration tool (#4561 ) * improve calibration tool * modify calibration interface name * modify calibration interface name * refine calibrate and calibrate_user * refine and add type info * refine and add type info * add e2e user example file * remove unnecessary files * remote test images no longer needed * update readme document Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-07-22 21:31:49 -07:00
Yufeng Li	822b23ff2f	Add support of EmbeddingLayerNorm (#4562 )	2020-07-21 21:43:02 -07:00
Yufeng Li	e92e0860c8	BERT quantization notebook (#4543 ) * BERT quantization notebook * update notebooks * more benchmark * add version info	2020-07-20 18:23:37 -07:00
Yufeng Li	d4db83858b	Only quantize gather with initializer (#4469 )	2020-07-09 13:33:43 -07:00
Yufeng Li	5dc7339be6	Add quantization tool to python package (#4458 ) * Add quantization tool to python package	2020-07-08 21:42:53 -07:00
Yufeng Li	fc5e65a22d	Add quantization support for GPT2 past state and use model to generate outputs in OpTester (#4340 ) * Make quantization support GPT2 past state * Make OpTester to be able to generate reference outputs with a model. With it, there is no need to compute outputs manually, which are impossible for some cases.	2020-06-26 09:29:29 -07:00
Yufeng Li	197da135eb	Implement quantized Attention on cpu (#4111 ) * Implement QAttention on CPU * support QAttention in quantization tool * refine attention code * add more unit tests	2020-06-03 13:42:00 -07:00
Yufeng Li	7c774e967a	support quantization of optimized model with ir<4 (#3853 )	2020-05-13 11:16:37 -07:00
Zhang Lei	eab61e87ce	Fix quantization tool bugs when model nodes have no name. (#3854 ) Fix bugs when model nodes have no name.	2020-05-12 20:38:26 -07:00
Zhang Lei	c365822808	Refactor parts of calibrate.py. Add QLinearAdd and QLinearMul support. Fix bugs loading JPGs that are not strict RGB, and typos in the load_batch call. (#3542 )	2020-04-18 17:10:55 -07:00
Yufeng Li	af618278f6	fix bugs in quantization and calibration tools (#3329 ) Fix 3 bugs: 1. Node names are duplicated in calibration augment_graph if the name of the node to quantize is empty. 2. If output nodes are quantized, output values are quantized and not dequantized back. 3. Gather with data type int64 should not be quantized.	2020-03-30 17:50:25 -07:00
Tianlei Wu	403f99cd77	Use yapf to format python (#3276 ) Update ReformatSourcePython.bat to use YAPF to format python code, and add onnxruntime\test directory to be formatted. Add onnxruntime\.style.yapf for configuration. The style is based on google, except max column width 120. Format python scripts using ReformatSourcePython.bat.	2020-03-20 14:34:10 -07:00
Yufeng Li	a69d859912	fix quantize_bias (#3270 )	2020-03-20 11:36:47 -07:00
Tracy Sharpe	fe0b2b2abd	QLinearConv speed up (#3196 ) For x86/x64 builds, change the QLinearConv op to use MLAS for the u8u8=s32 GEMM, then requantize the intermediate buffer to u8.	2020-03-13 16:54:55 -07:00
Yufeng Li	c69194ec4c	fix the missing return in _get_quantize_input_nodes and format code with yapf (#3199 ) * fix the missing return for function _get_quantize_input_nodes * format quantization code with yapf	2020-03-13 09:28:41 -07:00
Yufeng Li	876d0c5430	Make quantization parameters constant weights instead of overridable (#3160 )	2020-03-10 08:35:02 -07:00
Yufeng Li	1d2b8115e2	Support u8u8 in quantization tool (#3140 )	2020-03-05 14:42:46 -08:00
Ashwini Khade	807a59c55d	Add calibration tool (#2845 ) * add calibration tool * add model for e2e example * format readme * some more formatting updates * plus a few more updates * plus review comments * plus updates * more updates	2020-01-20 14:49:35 -08:00
Ashwini Khade	8643f3ebbb	add domain check for nodes + update documentation (#2831 )	2020-01-14 11:15:50 -08:00
Ashwini Khade	cc75e5a162	update quantization doc (#2783 ) * update documentation for quantization script * plus some spell corrections	2020-01-13 10:52:46 -08:00
Ashwini Khade	d197079473	quantization script updates (#2208 )	2019-10-21 12:25:52 -07:00
Jonny Shipton	df472cbfbd	quantise: Don't error when initializer graph input is missing (#1872 ) ONNX IR version 4 and above do not require graph initializers to have corresponding graph inputs.	2019-09-30 21:57:00 +00:00
Ashwini Khade	0f6cf9a335	enable quantizing specific nodes (#1742 )	2019-09-03 11:04:17 -07:00
Ashwini Khade	16087f3133	update default values for weight quantization (#1564 )	2019-08-05 21:39:37 -07:00


51 commits