onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-02 03:55:34 +00:00

Author	SHA1	Message	Date
Patrice Vignola	18ef0fafc4	Merged PR 5551793: Merge with latest ORT master	2021-01-07 20:01:53 +00:00
Ryan Lai	054fb4d3f6	Merged PR 5490021: Merge latest github master into dmldev branch Last RI was on Monday : `0afbdfd81c`	2020-12-11 01:13:32 +00:00
Tianlei Wu	cdb91208a3	longformer onnx conversion and benchmark tools (#6007 ) * initial implementation of longformer tools for onnx conversion and benchmark * Support ONNX conversion for transformers 4.0 Add an option to optimize onnx model, and export fp16 model	2020-12-03 11:37:30 -08:00
Cecilia Liu	3b198c9614	Support Fusion for 1 and 2 Inputs Bert Models Converted From tf (#5993 ) Support fusion for 1 and 2 inputs Bert models converted from tf	2020-12-03 10:52:33 -08:00
Zhang Lei	648c9c7789	Fix bugs for 1: Calibrator should check model inputs; 2: (#6017 ) quantize_inputs forgot to use parameter initializer_use_weight_qtype.	2020-12-03 00:00:16 -08:00
Ye Wang	5f516899bf	optimize a bert model converted using tf2onnx (#5492 ) * optimize a bert model converted using tf2onnx * add test data * update * remove comments * format * Revert "format" This reverts commit f8ae88cb564bce5caf4780e56561403f3ba3d524. * Revert "remove comments" This reverts commit 59d8a693581a731fd0291b70fe2c9cec6c4950fe. * add a squeeze node to convert a 3-d mask to 2-d * update * update * verify and add comments	2020-12-01 11:19:16 -08:00
KeDengMS	ee908eb0aa	Symbolic shape inference: fix rank for ConstantOfShape (#5912 )	2020-11-24 14:50:41 -08:00
Zhang Lei	9992f0f812	Implement QLinear GlobalAveragePool with sse2/neon. (#5838 ) Add QLinear Global Average Pool for quantization for ARM and SSE2. Co-authored-by: Tracy Sharpe <tracysh@microsoft.com>	2020-11-23 19:23:58 -08:00
Ye Wang	3d5b48a894	remove use_cdn when loading pretrained model (#5900 )	2020-11-23 14:26:55 -08:00
Olivia Jain	3738ca7e10	Improve perf testing (#5760 ) * build off a specific commit and archive wheel file * rename to fp32, prefix results w/ commit, add CPU col * rename 99th to 90 percentile * get symbolic_shape from master each time * add install archive wheel, parallel build * shortening hash	2020-11-20 16:03:09 -08:00
Yufeng Li	6f86c4dbe3	Quantize LSTM (#5595 ) Quantize LSTM: 1. dynamically quantizes MatMul inside the LSTM. It doesn't quantize activation function. 2. support per-channel on the input weight and recurrent weight.	2020-11-18 11:21:49 -08:00
Peichen Xie	e8c0f5d0ff	Update the quantization script to support GEMM (transB==1) (#5432 ) * Modify onnx_quantizer.py * Fix topology order issues * Handle more cases	2020-11-17 21:24:48 -08:00
Chi Lo	92292de135	Tensorrt perf tool (#5436 ) * Add YAML file for pipeline * Modify typo * Add working directory * Modify and test * Modify and test * Modify and test * Modify and test * Modify * Modify * Modify * Modify * Make sure to copy all the result files * Add clean up * Modify * Modify agent pool name * Upload only specific artifacts * Modify * Integrated CI Pipeline for running TRT perf as well as added the “large amount of models” into perf model target * Fix bug * Fix bug * Add reading the information regarding previously known failing models and then skip testing them during benchmark/validation * Modify the script file for CI * Replace print with logger.info * Fix bug * Fix bug * Refine the code * Modify the script so that it can capture script segmentation fault while running ORT * Fix bug * fix bug * fix bug * Add debug info * fix bug * Refine perf code * Refine the code * fix bug * Code refactoring * change many-models path * remove metadata after validation/benchmark are done * Update README.md * Fix bug so that metadata doesn't hold stale value * Remove hardcode and update README * Add arguments to the script to make it run correctly * Update linux-gpu-tensorrt-ci-perf-pipeline.yml for Azure Pipelines * Update linux-gpu-tensorrt-ci-perf-pipeline.yml for Azure Pipelines * Fix bug so that metadata doesn't hold stale value * Fix small bug of finding test dataset directory for FP16 test data, as well as modification of some output information * use -i random for perf test of TRT changes Co-authored-by: Olivia Jain <oljain@microsoft.com>	2020-11-06 12:27:42 -08:00
Ye Wang	95e6da7957	Revert saving optimized model as external data (#5690 ) * revert and add support for saving external data * review comments * update	2020-11-06 11:54:19 -08:00
Zhang Lei	77b1eea9cf	Add option to allow quantize_input() use input_qtype for initializers. (#5721 )	2020-11-06 09:33:24 -08:00
Yufeng Li	5c4543e194	Calibrate float tensor only (#5704 )	2020-11-04 23:55:48 -08:00
Ye Wang	a028ca41ec	Optimize flaubert (#5651 ) * optimize flaubert * fix an issue and format * revert non-relevant change * review comments	2020-11-03 09:51:42 -08:00
Tianlei Wu	2c02530603	Bert Model Profiling Tool (#5654 ) * Add profiler tool for BERT models	2020-11-02 13:47:37 -08:00
Derek Murray	ff538b8d3a	Minor fixes in BERT Inference notebook (#5637 ) Add missing commas to the code example.	2020-11-02 09:49:23 -08:00
KeDengMS	32bf6390ad	Some fixes to symbolic shape inference (#5642 ) * Some fixes to symbolic shape inference 1. Topological sort before iteration in graph 2. Fix a case in slice: start=100000, end=-100000, step=-1, dim=2 3. Fix Nuphar Gemm test's random seed 4. Slice opset 1 axes is optional	2020-10-30 19:28:47 -07:00
Tianlei Wu	1f304fbee7	Attention with past and no unidirectional mask (#5557 ) * Update fusions to support shared node, and mask of all ones	2020-10-21 20:12:02 -07:00
Yufeng Li	6c2162e97a	Fix quantization of Conv1D with bias (#5491 ) * Fix reshape for Conv with bias	2020-10-20 15:27:26 -07:00
KeDengMS	e1a54c4090	Symbolic shape inference: fix a bug in shape merge (#5519 ) * Symbolic shape inference: fix a bug in shape merge OpType Where: input0: ['mt_src_tokens_batch', 1, 1, 'mt_src_tokens_len'] input1: [] input2: ['mt_prev_output_tokens_batch', 12, 'mt_prev_output_tokens_len', 'floor(mt_src_tokens_batchmt_src_tokens_len/mt_prev_output_tokens_batch)'] 1 output: [None, 12, 'mt_prev_output_tokens_len', None] Undo unintended TRT change	2020-10-16 17:54:57 -07:00
Ye Wang	67315d8ae0	Optimize openai-gpt/albert model and add fusion test (#5466 ) * optimize openai-gpt * add huggingface model fusion test * move albert's attention fusion here * add test for albert fusion	2020-10-13 19:24:14 -07:00
Ye Wang	90f976d060	Some improvements on transformers tool (#5383 ) * modify tensorflow benchmark gpu setting * add export from tf choice in script * fix typo * match more embedlayernorm pattern * format	2020-10-08 19:35:17 -07:00
Tianlei Wu	15696b8fce	bump version to 1.5.2 (#5420 )	2020-10-08 16:30:13 -07:00
Yufeng Li	b04cf2d229	Update ORT to 1.5.1 in Bert Quantization Notebook (#5396 ) * Update ORT to 1.5.1 in Bert Quantization Notebook	2020-10-08 09:55:01 -07:00
Tianlei Wu	8ee2b08325	Allow benchmark different threads (#5390 )	2020-10-07 11:13:01 -07:00
Tianlei Wu	094384781e	Add --use_external_data_format in convert_to_onnx.py (#5393 )	2020-10-07 09:42:02 -07:00
Tianlei Wu	f5e4c0ea04	Fix benchmark_gpt2 model verification (#5343 )	2020-10-02 13:53:02 -07:00
Tianlei Wu	e33de20861	Update gpt2 notebook for int8 quantization (#5346 ) * Update gpt2 notebook for ORT 1.5 * add sections for int8 quantization including QAT note	2020-10-02 09:41:52 -07:00
Yufeng Li	e8b9aa1f29	fix quantization of EmbeddingLayerNorm (#5321 )	2020-10-01 20:08:43 -07:00
KeDengMS	7495dc167a	Symbolic shape inference: fix a bug in auto_merge when broadcasting (#5349 ) The bug happens when merging following shapes: input0: [1, 1, 'Min(1024, input1_dynamic_axes_3)', 'Min(1024, input1_dynamic_axes_3)'] input1: ['input1_dynamic_axes_1*input1_dynamic_axes_2', 12, 'input1_dynamic_axes_3', 'input1_dynamic_axes_3'] input2: [] The fix is to avoid broadcasting merge on input2	2020-10-01 15:24:00 -07:00
Ye Wang	caed6c264c	Add tf2pytorch wrapper in transformers tool (#5316 ) * init checkin * format * refactor * review comments	2020-10-01 13:58:58 -07:00
Ye Wang	1a12f510fc	Support T5 benchmarking in transformers tool (#5133 ) * init checkin * review comments * modify according to transformers release	2020-09-29 22:58:28 -07:00
RRRachelllll555	507f5bf5f6	Update test calibrate script (#5185 ) * update test_calibrate according to latest calibrate.py * fix datasize bug in e2e example Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-09-27 21:59:56 -07:00
KeDengMS	5a71819be6	Symbolic shape inference: fix a case for concat (#5277 ) * Symbolic shape inference: fix a case when concat requires merge multiple dims * Fix a bug triggered in newer version of sympy Fix a bug in output data type guessing	2020-09-24 08:16:47 -07:00
Yufeng Li	61ba5b501a	Fix bug in the back to back quantization of matmul and conv (#5264 ) * fix bug in the back to back quantization of matmul and conv * fix bug in back to back gather	2020-09-23 08:47:20 -07:00
Tianlei Wu	3bbce69185	bump version to 1.5.1 (#5258 )	2020-09-22 20:57:34 -07:00
KeDengMS	8dceebda0e	[Training/Python] Add option to enable symbolic shape inference (#5107 ) This change adds symbolic shape inference to ORT training which helps static memory planning for model like BART.	2020-09-22 10:49:07 -07:00
Ye Wang	65740deb10	Fix a bug in EmbedLayerNorm fusion (#5150 ) * fix embedlayernorm bug * review comments * interim checkin * review comments * Fix core dump in MacOS * remove unnecessary lines * update document * Update graph_utils.cc * Update onnx_exporter.py * resolve comments	2020-09-21 12:26:14 -07:00
KeDengMS	ce3b67e0cd	[Python] Move symbolic_shape_infer from nuphar to tools (#5162 ) * [Python] Move symbolic shape inference from nuphar to tools * Fix PEP8 ERROR	2020-09-18 09:31:06 -07:00
RRRachelllll555	f7c1e51810	Remove shape inference and fix save large model(>2g) issue (#5210 ) * remove shape inference and fix save large model problem * remove unnecessary import * refine code and add external format for quantize_qat * remove initializers in tensors_to_calibrate * small refine Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-09-18 08:46:31 -07:00
Pranav Prakash	f5df96256c	Fix order of returned values in quantize_weight_per_channel (#5205 ) Must match returned order of `quantize_inputs`	2020-09-17 17:57:46 -07:00
Zhang Lei	cd0386b649	MaxPool versioning in quantization tools. (#5194 ) MaxPool versioning in quantization tools.	2020-09-16 22:52:24 -07:00
Chi Lo	9f526f45ac	TensorRT Perf Tool (#4900 ) * Initialize tensorrt perf script * Add bert-squad dependencies * Modified code to make ort inference with CUDA/Tensorrt * Add get CUDA/TRT version * uncomment bert-squad * Add BERT-SQUAD inputs.json * Add FastRCNN * Make preprocess/validation into common functions * Add MaskRCNN and SSD and consolidate the code * Add dependencies for MaskRCNN * following modifications are made: - create common fetch function to get inputs/outputs of model from ONNX model zoo. - create common validation function to compare inference outputs with reference outputs from ONNX model zoo. - move run/repeat time to argument list. (still working on other arguments, like fp16 or fp32, latency percentile). - generate table in csv file to show the latency comparison (TRT vs CUDA) side by side. * Add approach to analyze profiling file and also update model related settings * Add models * Add most of models from ONNX model zoo * Add model input name and print all the model names at the end of run * Add system info * Add TRT fp16 support * Refine the code * Handle TRT fall back and modify the way to get input data * Refine code * Modify code * Add more precise approach to measure inference * Add io-binding * Add YoLoV4 * Refine the code * Refine the code * Add models * Add yolov4 notebook for jetson device * Update notebook * Update notebook * Add CVS models * Add missing model * Add support of float16 * Add new way to get trt version * Add "validate" and "benchmark" mode * Add randomly generated input * Refine perf script * Refine the code. * Add README * Refine the code * Update README.md * Refine code * Update README.md * Remove all the model related python and instead using model_list.json as models configuration. Refine the benchmark.py * Refine the code Co-authored-by: Chi Lo <lochi@microsoft.com>	2020-09-15 10:06:01 -07:00
Yufeng Li	3068a835f1	Fix quantization of 1-D conv with bias (#5157 )	2020-09-14 18:07:14 -07:00
Andrei Shadrikov	82b25e1731	Fix datasize call in calibrate (#5110 ) * Moving datasize to the interface. * Reverting changes and addressing the comment	2020-09-14 18:06:23 -07:00
Zhang Lei	d45e49dd2b	Add LeakyRelu and Sigmoid QLinear Quantization support (#5116 ) * Add LeakyRelu and Sigmoid QLinear Quantization support * Change due to reflect master changes.	2020-09-14 14:46:24 -07:00
Yufeng Li	20b2f45b24	Support per-channel quantization of weight tensor (#5057 ) * Support per-channel quantization of weight tensor * rename util functions * fix bugs in calibrate * add support of reduce_range * refine opset check	2020-09-14 11:53:50 -07:00

161 commits