onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-14 18:12:05 +00:00

Author	SHA1	Message	Date
RRRachelllll555	f7c1e51810	Remove shape inference and fix save large model(>2g) issue (#5210 ) * remove shape inference and fix save large model problem * remove unnecessary import * refine code and add external format for quantize_qat * remove initializers in tensors_to_calibrate * small refine Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-09-18 08:46:31 -07:00
Pranav Prakash	f5df96256c	Fix order of returned values in quantize_weight_per_channel (#5205 ) Must match returned order of `quantize_inputs`	2020-09-17 17:57:46 -07:00
Zhang Lei	cd0386b649	MaxPool versioning in quantization tools. (#5194 ) MaxPool versioning in quantization tools.	2020-09-16 22:52:24 -07:00
Chi Lo	9f526f45ac	TensorRT Perf Tool (#4900 ) * Initialize tensorrt perf script * Add bert-squad dependencies * Modified code to make ort inference with CUDA/Tensorrt * Add get CUDA/TRT version * uncomment bert-squad * Add BERT-SQUAD inputs.json * Add FastRCNN * Make preprocess/validation in to common functions * Add MaskRCNN and SSD and consolidate the code * Add dependencies for MaskRCNN * following modifications are made: - create common fetch function to get inputs/outputs of model from ONNX model zoo. - create common validation function to compare inference outputs with reference outputs from ONNX model zoo. - move run/repeat time to argument list. (still working on other arguments, like fp16 or fp32, latency percentile). - generate table in csv file to show the latency comparison (TRT vs CUDA) side by side. * Add approache to analyze profling file and also update model related settings * Add models * Add most of models from ONNX model zoo * Add model input name and print all the model names at the end of run * Add system info * Add TRT fp16 support * Refine the code * Handle TRT fall back and modify the way to get input data * Refine code * Modify code * Add more precise approach to measure inference * Add io-binding * Add YoLoV4 * Refine the code * Refine the code * Add models * Add yolov4 notebook for jetson device * Update notebook * Update notebook * Add CVS models * Add missing model * Add support of float16 * Add new way to get trt version * Add "validate" and "benchmark" mode * Add randomly generated input * Refine perf script * Refine the code. * Add README * Refine the code * Update README.md * Refine code * Update README.md * Remove all the model related python and instead using model_list.json as models configuration. Refine the benchmark.py * Refine the code Co-authored-by: Chi Lo <lochi@microsoft.com>	2020-09-15 10:06:01 -07:00
Yufeng Li	3068a835f1	Fix quantization of 1-D conv with bias (#5157 )	2020-09-14 18:07:14 -07:00
Andrei Shadrikov	82b25e1731	Fix datasize call in calibrate (#5110 ) * Moving datasize to the interface. * Reverting changes and adressing the comment	2020-09-14 18:06:23 -07:00
S. Manohar Karlapalem	f7edf0aa57	[OpenVINO-EP] Enable EP config options for VPU hardware (#5119 ) * Added config flags for VPU Fast Recompile * clean-up ifdefs * Add VPU Fast compile config option Adds an option that enables Fast compilation of models to VPU hardware specific format. * Add config option to choose specific device id for inference Inference of all subgraphs will be scheduled only on this device even if other devices of the same type are available. * Add Python API to list available device IDs * code cleanup * Add second C/C++ API with settings string parameter Adds an additional C/C++ API that allows passing multiple key-value pairs for settings as a single string. Multiple settings are delimited by '\n' while the key and value within a setting are delimited by '|'. * Append 'Ex' to the extended C/C++ API * Use set_providers Py API to set config options. Uses Session.set_providers Python API to set EP runtime config options as key/val pairs Deprecated older module function definitions for config settings. Updates documentation. * avoid globals for py config options where possible Co-authored-by: intel <you@example.com>	2020-09-14 15:46:14 -07:00
Zhang Lei	d45e49dd2b	Add LeakyRelu and Sigmoid QLinear Quantization support (#5116 ) * Add LeakyRelu and Sigmoid QLinear Quantization support * Change due to reflect master changes.	2020-09-14 14:46:24 -07:00
Yufeng Li	20b2f45b24	Support per-channel quantization of weight tensor (#5057 ) * Support per-channel quantization of weight tensor * rename util functions * fix bugs in calibrate * add support of reduce_range * refine opset check	2020-09-14 11:53:50 -07:00
Ye Wang	5302fe4079	A fix in load_pretrained_model() (#5137 ) * Fix in load_pretrained_model * Update onnx_exporter.py	2020-09-11 17:23:02 -07:00
Tianlei Wu	7511021e0e	Save Gpt2 test data (#5132 ) (1) Save gpt2 test data during test generation. (2) Use torch fp32 model as baseline when onnx model is fp16. (3) Refine logic to compose onnx model path	2020-09-11 14:31:49 -07:00
Ye Wang	89509f256a	Not fuse SkipLayerNorm when add has initializer input (#5123 )	2020-09-11 11:46:31 -07:00
Ye Wang	879751f3b7	Support Tensorflow benchmarking and onnx export in transformers tool (#5068 ) * init checkin for tf export and tf benchmark * small fix on argparse * refactor * review comments * review comments	2020-09-11 00:47:37 -07:00
Tianlei Wu	c5d4ae0401	Add transformers tools to python package (#5090 ) * Add transformers to onnxruntime python package	2020-09-10 15:42:15 -07:00
Ye Wang	b23e08b85c	Add AutoModel selector in transformers tool (#5051 ) * Add AutoModel selector in transformers tool * change distilbert--squad's pipeline to AutoModelForQuestionAnswering rule base selector and add model_class as parameter * Update huggingface_models.py * review comments	2020-09-08 15:06:04 -07:00
Cameron Maske	4553b2eecd	Expose DirectML provider to python (conflicts resolved from #3359 ) (#4630 )	2020-09-08 14:34:09 -07:00
Hariharan Seshadri	e1ed0fde2b	Prevent registering both DML and CUDA EPs in an ML op test (#5078 )	2020-09-08 11:13:50 -07:00
Xiang Zhang	0dad79b495	Add SetLanguageProjection C Api and use it in four projections (#5023 ) * Add SetLanguageProjection C Api and use it in four projections * static cast enum languageprojection to uint32_t * resolve comments * fix typo and line added unintentionally * revert unecessary change * reorder c# api * add TensorAt and CreateAndRegisterAllocator in Csharp to keep the same order as C apis	2020-09-04 14:26:39 -07:00
Ashwini Khade	9ba2cfb71b	fix py packaging pipeline (#5038 ) * add test skip logic when opset > allowed opset * fix attribute error * plus fix	2020-09-03 09:32:10 -07:00
Scott McKay	28445c88f9	Changes to enable saving and loading an ORT format model (#4995 ) * Changes to enable saving and loading an ORT format model via the public APIs. Cleanup session.py to try and make slightly more understandable. More refactoring is needed here. Couple of bug fixes * Fix bug in handling NodeArg serialization for optional inputs which has a name and no type info. * Address PR comments - tweak SessionOptions config to avoid double lookup - merge duplicated functionality in python binding around registering an EP with optional options Fix a couple of build issues. * Update C API to be consistent with python API - only load model in InferenceSession ctor if required - support loading ORT model in minimal build * Fix nodejs test. We get an invalid path error from LoadInterOp first now * Another attempt at fixing nodejs test. Error message depends on whether ENABLE_LANGUAGE_INTEROP_OPS is defined. Make the output consistent. The interop implementation looks suspicious given it appears to be internal code that is going via the public api. TBD if that should be fixed. * Fix couple of build issues. * Disable test temporarily so PR can be checked in. Will fix in separate PR that adds final pieces for minimal build as the test is required there. * Give up on nodejs test and make the match simpler. Fix init call in TrainingSession python to not pass through sess. it wasn't being used in Session anyway so passing it through just adds confusion. * Fix call to Session.__init__ in TrainingSession. Session now initializes Session._sess to None to make it clearer where the 'ownership' of that member is, and that needs to happen before TrainingSession sets it.	2020-09-03 09:10:48 -07:00
Hariharan Seshadri	a9db287bd7	Return windows error code for library loading and unloading failure (#5036 )	2020-09-02 18:07:36 -07:00
Ye Wang	b4e9e98cee	Add more huggingface models in benchmark tools (#4986 ) * checkin more huggingface models * review comments * review comments	2020-09-02 16:41:58 -07:00
Hariharan Seshadri	d30dd41c0e	Remove public default ctor in PyInferenceSession and replace it with a protected ctor (#4990 )	2020-09-01 17:10:36 -07:00
Tianlei Wu	a47cae031f	Use raw attention mask in BERT related fusions (#4889 ) * Use raw attention mask in fusion * update python scripts to use raw attention mask by default	2020-09-01 13:22:20 -07:00
Yufeng Li	ffc2b25a3a	Quantization tool improvement (#4933 ) Improve quantization tools: 1. Support QAT 2. Make quantization tool to register Operators. 3. Make the API clear to use Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-09-01 09:07:46 -07:00
Hariharan Seshadri	7045910d10	Support RegisterCustomOpsLibrary via the Python API (#4764 )	2020-08-28 13:24:29 -07:00
Scott McKay	08eb15068c	Exclude the Map types from the build if ML ops are disabled. (#4908 ) * Exclude the Map types from the build if ML ops are disabled. They're the only ops that use Map.	2020-08-27 17:48:12 +10:00
Tianlei Wu	268d2283c0	Export GPT-2 ONNX model without postion_ids and attention_mask inputs (#4852 ) * Export GPT-2 ONNX model without postion_ids and attention_mask inputs * allow benchmark_gpt2 on user's model * refactor: get_dummy_inputs returns a data class.	2020-08-24 13:05:25 -07:00
Scott McKay	db7669b225	Reduce ONNX dependency in minimal build (#4890 ) * Next round of changes. Remove inclusion of ONNX schema header Exclude custom registry related things Move IsConstantInitializer from graph_utils to Graph as it's needed in a minimal build and graph_utils is excluded.	2020-08-23 07:02:13 +10:00
RRRachelllll555	9a6db9b9f4	Fix next node access bug in calibration tool (#4863 ) * fix bug in calibration tool * fix next node access bugs * rm file in wrong folder * refine * optimize * refine * refine format * refine Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-08-21 20:48:54 -07:00
Yufeng Li	0575881949	Update quantization notebook to pytorch 1.6 (#4834 )	2020-08-18 14:20:46 -07:00
gwang-msft	dee7596724	Add a generic collection of session configurations to the SessionOptions (#4718 ) * adding generic configurations for session options * fix a build break on linux * fix training ci build break * fix training ci build break * addressed CR comments * fix traning ci build break * move config_key from enum to string * add c# api * add python api * fix build break * move prepacking from 2 new api entries to session options configs * fix traning ci build break * add python test, update some comments, move const key definition to avoid build break * addressed comments * move definitions of keys to common.h * move api to version 5 * remove accidental change in build.py * remove pragma to avoid build break * addressed CR comments * fix the python build break, and move location of config keys definition * small typo changes	2020-08-18 13:40:40 -07:00
Tianlei Wu	1ce2982f65	Update GPT-2 notebook using IO Binding example (#4799 )	2020-08-17 10:43:36 -07:00
Tianlei Wu	a69ca63895	add --no_attention_mask option (#4750 ) output producer name and version in optimized model. avoid removing initializer that existed in graph output	2020-08-12 15:56:25 -07:00
Tianlei Wu	316d1a9e69	Update benchmark for large model or model name with non-alphanumeric. (#4743 ) * Export model > 2GB using external data format	2020-08-10 12:58:01 -07:00
Vagif	6499a38b7d	Add the missing onnx_proto import (#4705 ) * add missing onnx_proto import * Fix TensorProto usage in calibrate.py * remove unused imports	2020-08-10 12:46:21 -07:00
Tianlei Wu	9c729d1719	Update notebook for mac since onnxruntime 1.3 or 1.4 in mac does not have openmp (#4732 )	2020-08-07 14:01:48 -07:00
Ye Wang	61726e58f0	fix (#4697 )	2020-08-07 13:08:41 -07:00
Yufeng Li	b22091dc91	Add the framework to support prepack (#4413 ) * add support of prepack * add support for QAttention and DynamicQuantizeMatMul * add an use_prepacking option * add use_prepacking in c_sharp api	2020-08-07 09:39:19 -07:00
Tianlei Wu	e70e9e2f67	refine machine_info and output onnxruntime_tools version (#4679 ) * output onnxruntime_tools version * change get_machine_info return data type to string	2020-08-02 18:20:59 -07:00
Ye Wang	b1bfff34e0	Support distill-bert fusion in transformers tool (#4631 ) * checkin attention * checkin embedlayer but cause invalid onnx model * resolve comments * fix comments * check return values * add version limit * fix comments * add warning	2020-07-31 17:57:54 -07:00
Tianlei Wu	3588c5b545	Add GPT-2 test generation to convert_to_onnx.py (#4670 ) * add gpt2 tester * add an option to include output latency.	2020-07-30 21:03:53 -07:00
Tianlei Wu	326cc686df	Update notebook: disable GPU for tensorflow (#4649 )	2020-07-29 10:09:06 -07:00
RRRachelllll555	f3fc8ca954	Add input tensor calibration (#4619 ) * add input tensor calibration * set default fusions to be true Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-07-28 14:04:41 -07:00
Yufeng Li	a06cf6a3b3	Show quantization model size in benchmark of transformer (#4626 ) * Show quantization model size in benchmark of transformer * refine model size calculation	2020-07-27 23:56:33 -07:00
Hariharan Seshadri	9510f26744	[Python] Support more APIs for the SessionOptions class (#4596 )	2020-07-24 12:56:54 -07:00
Yufeng Li	9c75c29403	refine opset version getter (#4602 )	2020-07-24 10:34:56 -07:00
Tianlei Wu	ace41b8064	Force return_tuple=True to handle transformers breaking change of output format. (#4599 )	2020-07-23 11:35:41 -07:00
Tianlei Wu	ea87c0d028	Update Transformer Optimizer documents (#4591 ) (1) Add bert-base-cased and gpt2 benchmark results on V100 (2) Update list of supported models. (3) Add comments to gpt2_helper. (4) Use IO Binding in test parity by default.	2020-07-23 08:38:39 -07:00
RRRachelllll555	c5df918744	improve calibration tool (#4561 ) * improve calibration tool * modify calibration interface name * modify calibration interface name * refine calibrate and calibrate_user * refine and add type info * refine and add type info * add e2e user example file * remove unnecessary files * remote test images no longer needed * update readme document Co-authored-by: t-yguo <t-yguo@microsoft.com>	2020-07-22 21:31:49 -07:00

245 commits