onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-10 00:38:54 +00:00

Author	SHA1	Message	Date
Hariharan Seshadri	d42399e1b0	Allow querying a GraphProto's doc_string as part of ModelMetadata (#6248 )	2021-01-05 22:18:03 -08:00
liqunfu	addb4b8c2b	Liqun/speech model loop to scan (#6070 ) Provide a tool to convert Loop to Scan for Nuphar performance Fix Nuphar CI pipeline failures. Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-01-05 15:15:23 -08:00
Olivia Jain	c8de3f355a	Refactor EP Perf Tool (#6202 ) * merge master, keep postprocess status commit * download float16.py every time * using variables to reference eps * adding ACL EP to ep perf tool * accuracy with absolute tolerance configurable * add acl to dict + remove commented line	2021-01-04 08:50:41 -08:00
Changming Sun	1b23b28706	Remove MKLML/openblas/jemalloc build config (#6212 )	2020-12-30 17:18:19 -08:00
Chi Lo	945fae8f56	Lochi/quantization tool for trt (#6103 ) * Initial implementation of generating calibration dynamic range table * Initialize validation support for Quantization * Initialize validation support for Quantization (cont.) * Improve validation support for Quantization * Improve validation support for Quantization * Rewrite/Refine for calibration and validation * Rewrite/Refine for calibration and validation (cont.) * Refine code * Refine code * Add data reader for BERT * Add flatbuffers to serialize calibration table * Refine code and add BERT evaluation * Refine the code * minor modification * Add preprocess/postprocess of vision team yolov3 and refine the code * Update annotation * Make bbox cooridates more accurate * Fix bug * Add support of batch processing * Batch processing for model zoo yolov3 * Add batch inference for evaluation * Refine the code * Add README * Add comments * Refine the code for PR * Remove batch support checking in data_reader and refine the code * Refine the code for PR * Refine the code for PR review Co-authored-by: Olivia Jain <oljain@microsoft.com>	2020-12-21 20:59:08 -08:00
Olivia Jain	234e94b4e1	Add Status.csv to EP Perf Tool (#6167 ) * merge master, keep postprocess status commit * download float16.py every time * removing hardcoded values	2020-12-21 20:23:19 -08:00
Cecilia Liu	980a93c164	Model Fusion For Bart (#6105 ) Fusion fix for Bart models	2020-12-15 14:30:15 -08:00
Edward Chen	64709b1335	Deprecate Python global configuration functions [Part 1] (#5923 ) Enable options to be set via execution provider (EP)-specific options and log deprecation warning from current global configuration functions.	2020-12-15 11:32:43 -08:00
ashbhandare	b1a75d0e98	Enable passing initial optimizer state while creating training session (#5869 ) * Support to pass initial optimizer states to optimizer graph builder * Changes for passing init optim state to training session config * Pass optimizer state through cpp and python frontend * Cleanup * Review comments * Fix windows and mac CI * Review comments * review comments * Review comments * Frontend review changes * Fix CI	2020-12-08 21:20:51 -05:00
Ye Wang	fa06be2133	Support export >2G model when using optimizer.py only (#6014 ) * checkin * add warning if user specifies the same input and output path	2020-12-07 17:18:49 -08:00
Tianlei Wu	51fbe87b9b	Update profiler tool to support gpt2 and longformer models (#6011 ) * support gpt2 and longformer in profiler tool * rename bert_profiler to profiler * Add --basic_optimization to allow user to use basic level of graph optimization * Add --kernel_time_only to filter kernel time and exclude fence time * Add --threshold to filter nodes that with low run time percentage.	2020-12-07 10:33:41 -08:00
Changming Sun	925879a8b0	Remove python 3.8 Windows GPU build from python packaging pipeline (#6054 ) Revert the last few changes to get the pipeline back to a normal state.	2020-12-07 10:23:07 -08:00
George Wu	020efc9002	fix windows cuda support for python 3.8 + (#6046 ) * fix * noqa * fix. * remove unused import	2020-12-07 10:09:22 -08:00
Tianlei Wu	cdb91208a3	longformer onnx conversion and benchmark tools (#6007 ) * initial implementation of longformer tools for onnx conversion and benchmark * Support ONNX conversion for transformers 4.0 Add an option to optimize onnx model, and export fp16 model	2020-12-03 11:37:30 -08:00
Cecilia Liu	3b198c9614	Support Fusion for 1 and 2 Inputs Bert Models Converted From tf (#5993 ) Support fusion for 1 and 2 inputs Bert models converted from tf	2020-12-03 10:52:33 -08:00
Zhang Lei	648c9c7789	Fix bugs for 1: Calibrator should check model inputs; 2: (#6017 ) quantize_inupts forgot to use parameter initializer_use_weight_qtyp.	2020-12-03 00:00:16 -08:00
Ye Wang	5f516899bf	optimize a bert model converted using tf2onnx (#5492 ) * optimize a bert model converted using tf2onnx * add test data * update * remove comments * format * Revert "format" This reverts commit f8ae88cb564bce5caf4780e56561403f3ba3d524. * Revert "remove comments" This reverts commit 59d8a693581a731fd0291b70fe2c9cec6c4950fe. * add a squeeze node to convert a 3-d mask to 2-d * update * update * verify and add comments	2020-12-01 11:19:16 -08:00
Changming Sun	2d9dcc4576	Add python 3.9 support (#5874 ) 1. Add python 3.9 support(except Linux ARM) 2. Add Windows GPU python 3.8 to our packaging pipeline.	2020-11-30 12:02:48 -08:00
Ivan Stojiljkovic	015fbb3dbb	Add support for Python 3.8+ on Windows when CUDA is enabled (#5956 )	2020-11-26 15:52:30 -08:00
KeDengMS	ee908eb0aa	Symbolic shape inference: fix rank for ConstantOfShape (#5912 )	2020-11-24 14:50:41 -08:00
Zhang Lei	9992f0f812	Implement QLinear GlobalAveragePool with sse2/neon. (#5838 ) Add QLinear Global Average Pool for quantization for ARM and SSE2. Co-authored-by: Tracy Sharpe <tracysh@microsoft.com>	2020-11-23 19:23:58 -08:00
sfatimar	916410151c	Fix for hetero multi python binding with new shared library (#5895 ) Co-authored-by: sfatimar <sahar.fatima@intel/com>	2020-11-23 15:41:10 -08:00
Ye Wang	3d5b48a894	remove use_cdn when loading pretrained model (#5900 )	2020-11-23 14:26:55 -08:00
Hariharan Seshadri	d46dbeafd3	Expose knobs to create and share (CPU) allocators across sessions in C# and Python (#5634 )	2020-11-21 14:12:33 -08:00
Ryan Hill	ba739a8000	Convert OpenVINO into a shared provider (#5778 ) Same as Dnnl and TensorRT before it, now with more methods and more cleanup.	2020-11-20 17:39:57 -08:00
Olivia Jain	3738ca7e10	Improve perf testing (#5760 ) * build off a specific commit and archive wheel file * rename to fp32, prefix results w/ commit, add CPU col * rename 99th to 90 percentile * get symbolic_shape from master each time * add install archive wheel, parallel build * shortening hash	2020-11-20 16:03:09 -08:00
Scott McKay	f0142da59c	Add NNAPI to providers that can be used via the python bindings. (#5867 ) Update ORT model conversion script - add args for specifying optimization level and whether to use NNAPI - add logic to create a list of required ops and ORT format model that can be used with NNAPI	2020-11-21 09:18:35 +10:00
Takeshi Watanabe	a622533ecc	Support profile_file_prefix in python binding (#5864 )	2020-11-20 14:28:50 -08:00
S. Manohar Karlapalem	ff58f621fa	Remove nGraph Execution Provider (#5858 ) * Remove nGraph Execution Provider Pursuant to nGraph deprecation notice: https://github.com/microsoft/onnxruntime/blob/master/docs/execution_providers/nGraph-ExecutionProvider.md#deprecation-notice Deprecation Notice \| \| \| \| --- \| --- \| \| Deprecation Begins \| June 1, 2020 \| \| Removal Date \| December 1, 2020 \| Starting with the OpenVINO™ toolkit 2020.2 release, all of the features previously available through nGraph have been merged into the OpenVINO™ toolkit. As a result, all the features previously available through ONNX RT Execution Provider for nGraph have been merged with ONNX RT Execution Provider for OpenVINO™ toolkit. Therefore, ONNX RT Execution Provider for nGraph will be deprecated starting June 1, 2020 and will be completely removed on December 1, 2020. Users are recommended to migrate to the ONNX RT Execution Provider for OpenVINO™ toolkit as the unified solution for all AI inferencing on Intel® hardware. * Remove nGraph Licence info from ThirdPartyNotices.txt * Use simple Test.Run() for tests without EP exclusions To be consistent with rest of test code. * Remove nGraph EP functions from Java code	2020-11-19 16:47:55 -08:00
Hariharan Seshadri	62508ef0e4	Revert "Remove MKLML build config (#5559 )" (#5855 )	2020-11-19 10:53:08 -08:00
Yufeng Li	6f86c4dbe3	Quantize LSTM (#5595 ) Quantize LSTM: 1. dynamically quantizes MatMul inside the LSTM. It doesn't quantize activation function. 2. support per-channel on the input weight and recurrent weight.	2020-11-18 11:21:49 -08:00
Peichen Xie	e8c0f5d0ff	Update the quantization script to support GEMM (transB==1) (#5432 ) * Modify onnx_quantizer.py * Fix topology order issues * Handle more cases	2020-11-17 21:24:48 -08:00
Scott McKay	7b76b57fc8	Support EPs that compile nodes in a minimal build. (#5776 ) * Support EPs that compile nodes in a minimal build. This enables NNAPI being used.	2020-11-17 13:52:22 +10:00
Maajid khan	a84a058f9e	[OpenVINO-EP] Enabling Multi Device support (#5740 ) * Enabling Multi Device support for UEP Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor fix added *Added a simple fix to determine OpenVINO version for Arm build as well Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>	2020-11-11 15:16:30 -08:00
Chi Lo	92292de135	Tensorrt perf tool (#5436 ) * Add YAML file for pipeline * Modify typo * Add working directory * Modify and test * Modfiy and test * Modify and test * Modify and test * Modify * Modify * Modify * Modify * Make sure to copy all the result files * Add clearn up * Modify * Modify agent pool name * Upload only specific artifacts * Modify * Integrated CI Pipeline for running TRT perf as well as added the “large amount of models” into perf model target * Fix bug * Fix bug * Add reading the information regarding previously known failing models and then skip testing them during benchmark/validation * Modify the script file for CI * Replace print with logger.info * Fix bug * Fix bug * Refine the code * Modify the script so that it can capture script segmentation fault while running ORT * Fix bug * fix bug * fix bug * Add debug info * fix bug * Refine perf code * Refine the code * fix bug * Code refactoring * change many-models path * remove metadata after validation/benchmark are done * Update README.md * Fix bug so that metadata doesn't hold stale value * Remove hardcode and update README * Add arguments to the script to make it run correctly * Update linux-gpu-tensorrt-ci-perf-pipeline.yml for Azure Pipelines * Update linux-gpu-tensorrt-ci-perf-pipeline.yml for Azure Pipelines * Fix bug so that metadata doesn't hold stale value * Fix small bug of finding test dataset directory for FP16 test data, as well as modification of some output information * use -i random for perf test of TRT changes Co-authored-by: Olivia Jain <oljain@microsoft.com>	2020-11-06 12:27:42 -08:00
Ye Wang	95e6da7957	Revert saving optimized model as external data (#5690 ) * revert and add support for saving external data * review comments * update	2020-11-06 11:54:19 -08:00
Zhang Lei	77b1eea9cf	Add option to allow quantize_input() use input_qtype for initializers. (#5721 )	2020-11-06 09:33:24 -08:00
Yufeng Li	5c4543e194	Calibrate float tensor only (#5704 )	2020-11-04 23:55:48 -08:00
Ye Wang	a028ca41ec	Optimize flaubert (#5651 ) * optimize flaubert * fix an issue and format * revert non-relevant change * review comments	2020-11-03 09:51:42 -08:00
Wei-Sheng Chin	8856c2595b	Sync the two IDs in OrtMemoryInfo when calling ctor (#5663 ) * Sync the two IDs in OrtMemoryInfo when calling ctor * Also fix the same problem for output	2020-11-02 23:22:47 -08:00
Tianlei Wu	2c02530603	Bert Model Profiling Tool (#5654 ) * Add profiler tool for BERT models	2020-11-02 13:47:37 -08:00
Derek Murray	ff538b8d3a	Minor fixes in BERT Inference notebook (#5637 ) Add missing commas to the code example.	2020-11-02 09:49:23 -08:00
Maajid khan	d98062da0c	[OpenVINO-EP] Hetero support (#5627 ) * Implement Hetero in UEP * Added security checks to take valid Hetero combinations as device type * Integrating Hetero features * Get the statistics Report in Debug Mode Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Passing right device type for vadm_baackend Added simple fix to pick the right device type when using vadm_backend with Hetero as well. Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Fixed batching logic for 2020.4 and above * Fixed flake8 PEP8 errors Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor Fixes Added Added security checks for device_type passed in for Hetero build during run time code cleanup Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor changes Added Fixed batch_size bug in vadm_backend code cleanup *Documentation updated for Hetero Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com>	2020-10-30 22:35:08 -07:00
KeDengMS	32bf6390ad	Some fixes to symbolic shape inference (#5642 ) * Some fixes to symbolic shape inference 1. Topological sort before iteration in graph 2. Fix a case in slice: start=100000, end=-100000, step=-1, dim=2 3. Fix Nuphar Gemm test's random seed 4. Slice opset 1 axes is optional	2020-10-30 19:28:47 -07:00
Weixing Zhang	aec4cb489e	ROCm EP for AMD GPU (#5480 ) The ROCm EP is designed and implemented based on AMD GPU software stack named ROCm. Here is the link for the details about ROCm: https://rocmdocs.amd.com/en/latest/ ROCm EP was created based on the following things: 1. AMD GPU programming language: HIP 2. AMD GPU HIP language runtime: amdhip64 3. BLAS: rocBLAS, hipBLAS 4. DNN: miOpen 5. Collective Communication library: RCCL 6. cub: hipCub 7. … Current status: BERT-L and GPT2 training can be ran on AMD GPU with data parallel. Next: 1. Make more GPU code be sharable between ROCm EP and CUDA EP since HIP language and HIP runtime API are very close to CUDA. 2. Continue improving the implementation. 3. Continue GPU kernel optimization. 4. Support model parallelism on ROCm EP. …… The rocm kernels have been removed from this commit and will be in a separate PR. Since the original PR was too big(~180 files), it was suggested to split the PR into two parts, one is rocm-kernels, the other is non rocm kernels. Co-authored-by: Weixing Zhang <wezhan@microsoft.com> Co-authored-by: sabreshao <sabre.shao@amd.com> Co-authored-by: anghostcici <11013544+anghostcici@users.noreply.github.com> Co-authored-by: Suffian Khan <sukha@microsoft.com> Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>	2020-10-29 17:13:04 -07:00
Maajid khan	ddf83d1ace	Maajid/multi threading 2 (#5568 ) * Enabled multi-threading for OpenVino EP ->Enabled support for concurrent_session_runs Run UEP using concurrent_session_runs > 1 Enabled support for ORT_PARALLEL ExecutionMode ->Documentation Added for Enabling MultiThreading Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor Fixes added Configure the value of nireq during Runtime Documentation typos rectified and details added for Multi_Threaded Inference Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Some checks added for this fix Added checks to invalidate wrong nireq value and assigned it to default value of 8 Added new config options for enable_vpu_fast_compile which were changed w.r.t OpenVINO_2021.1 Release Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>	2020-10-27 14:48:12 -07:00
Tianlei Wu	1f304fbee7	Attention with past and no unidirectional mask (#5557 ) * Update fusions to support shared node, and mask of all ones	2020-10-21 20:12:02 -07:00
Changming Sun	5802fe1699	Remove MKLML build config (#5559 ) Remove MKLML build config	2020-10-21 13:11:25 -07:00
Hariharan Seshadri	4291c57322	[C# and Python APIs] Expose knobs to enable/disable platform telemetry collection (#5481 )	2020-10-21 10:32:13 -07:00
Yufeng Li	6c2162e97a	Fix quantization of Conv1D with bias (#5491 ) * Fix reshape for Conv with bias	2020-10-20 15:27:26 -07:00

323 commits