onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-03 03:58:54 +00:00

Author	SHA1	Message	Date
Xavier Dupré	84addcd2cf	Support double for operator ReduceMean, ReduceLogSumExp (#6217 ) * Support double for operators ReduceMean, ReduceLogSumExp	2020-12-31 11:24:54 +01:00
Xavier Dupré	5968a91ea6	Support double for operator Gemm + fix bug in gemm implementation for cuda, rocm when sizeof(type) != sizeof(float) (#6223 ) * Support double for operator Gemm * fix type size while copying data in gemm operator for GPU * fix type in gemm implementation for rocm	2020-12-31 11:24:16 +01:00
Xavier Dupré	70e2f96ef4	Support double for operator TopK + fix one bug in TopK implementation for GPU for double (#6220 ) * Support double for operator TopK * add static classes for topk/double * fix cast issue in topk	2020-12-31 11:23:19 +01:00
Tracy Sharpe	ecb2e119e4	MLAS: handle MlasGemm(M/N/K==0) cases (#6238 )	2020-12-30 23:25:10 -08:00
Hariharan Seshadri	4cc2ffef21	Support MLFloat16 type in Pow opset-12 CUDA kernel (#6233 )	2020-12-30 20:41:59 -08:00
William Tambellini	39a988ce1c	Upgrade build.py to assert for python 3.6+ Upgrade build.py to assert for python 3.6+ as python 3.5 cannot build today's master anymore.	2020-12-30 20:17:09 -08:00
Changming Sun	c15a858745	Update the readme file	2020-12-30 20:16:45 -08:00
Changming Sun	3911105f09	Remove python 3.5	2020-12-30 20:16:45 -08:00
Changming Sun	1b23b28706	Remove MKLML/openblas/jemalloc build config (#6212 )	2020-12-30 17:18:19 -08:00
Michael Giba	5c584b2636	Removed executor todo that looks dead. (#6234 )	2020-12-30 17:17:37 -08:00
Michael Goin	bbb6b416f0	Fix ImportError in build.py (#6231 ) There is a possible ImportError where build.py can import the wrong 'util' package if there are others present in `sys.path` already	2020-12-30 14:22:55 -08:00
Xavier Dupré	df7e2f3c1e	Support double for operators Relu, Tanh, Sigmoid (#6221 )	2020-12-29 18:25:23 +01:00
Xavier Dupré	111ac299cc	Support double for operators Where, LpNormalisation (#6034 )	2020-12-28 12:53:44 +01:00
Xavier Dupré	2d09db67b4	Support double for operators Log, Reciprocal, Sum (CPU) (#6032 ) * Support double for operators Log, Reciprocal, Sum * remove test erf_double	2020-12-28 12:53:18 +01:00
Xavier Dupré	8a0f5c50ab	Minor change to improve performance for operator Pad. (#5537 ) * small improvement for pad	2020-12-28 12:52:41 +01:00
Jesse Benson	7ccdfed1a6	Remove most ROCm-specific element-wise code and reuse CUDA element-wise code.	2020-12-27 10:30:29 -08:00
Jesse Benson	52228a703c	Use TArray in AMD element-wise kernels, rather than manually copying memory to device.	2020-12-27 10:30:29 -08:00
Changming Sun	1fc7f92f25	Fix a memory leak in test_inference.cc (#6201 ) * Fix a memory leak in test_inference.cc	2020-12-25 13:02:21 -08:00
sfatimar	7347996942	Openvino ep 2021.2 (#6196 ) * Enabling fasterrcnn variant and vehicle detector * changes for 2021_2 branch * yolov3_pytorch commit * fixed braces in basic_backend.cc * ci information added * faster rcnn variant and vehicle detector changes were made in 2021.1 and not in 2021.2 * some changes to support unit tests * disable some tests which are failing * fix myriad tests for vehicle detector * Did some cleanup cleaned up comments Disabled Add_Broadcast_0x1 and Add_Broadcast_1x0 tests on MYRIAD_FP16 backend due to a bug cleaned up capability_2021_2.cc file Removed extra conditions which were added for some validation in backend_utils Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * yolov3 pytorch workaround to ensure that the output names are matched * gemmoptest fixed on myriad * Fixed MYRIADX CPP Test Failures Expand,GatherND,Range,Round op's are only supported in model where op with float input data types are not supported and fixed Scatter and ScatterElements op's with negative axis are fixed Reshape op with 0 dim value are not supported and fixed Disabled InstanceNorm_2 test on MYRIADX Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> make changes to yolov3 pytorch * Fixed python unit tests Fixed failing python tests on vpu, GPU and CPU Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> Fixes POW op failures on GPU_FP16 Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Clean up capability_2021_2.cc Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Updated docx for MultiThreading option Added extra info on setting the num_of_threads option using the API and its actual usage Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> fixed slice and removed extra prints * Disabled failing python tests Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Minor changes added in capability_2021_2 Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * made changes to slice to avoid failures * Disabling FP16 support for GPU_FP32 ->Inferencing an FP16 model on GPU_FP32 leads to accuracy mismatches. so, we would rather use GPU_FP16 to infer an FP16 model on GPU Device Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * Updated docx for Inferencing a FP16 Model Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com> * fix for mask rcnn * Script for installing openvino from source * Updated with openvino 2021.2 online installation * code comment fixes fixed accuracy mismatch for div * Update OpenvinoEP-ExecutionProvider.md updated for 2021.2 branch * Update README.md updated dockerfile documentation * Update BUILD.md build.md update documentation * permission change of install_openvino.sh * made changes to align with microsoft onnxruntime changes * Updated with ov 2021.2.200 Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com> Co-authored-by: sfatimar <sahar.fatima@intel/com> Co-authored-by: MaajidKhan <n.maajidkhan@gmail.com> Co-authored-by: mohdansx <mohdx.ansari@intel.com>	2020-12-23 08:47:22 -08:00
Ryan Lai	0494a0f95f	Add ability to skip GPU tests based on GPU adapter name (#6198 ) * Implement conversion from ortvalue to Itensor for string tensors and comparing sequence of maps of strings to floats * PR comments * Add ability to skip gpu tests according to adapter description * spacing * spacing * spacing	2020-12-22 15:20:23 -08:00
Jesse Benson	c562952750	Dockerfile to build onnxruntime with ROCm 4.0	2020-12-22 10:21:12 -08:00
Ryan Lai	21395f8e24	Implement comparing outputs that are sequence of maps of strings to floats (#6180 ) * Implement conversion from ortvalue to Itensor for string tensors and comparing sequence of maps of strings to floats * PR comments	2020-12-22 09:52:29 -08:00
baijumeswani	a8b482681a	Clean up checkpoint tests to use the new checkpoint functions (#6188 ) * add deprecation warning for old checkpoint functions * update all the distributed checkpoint tests to use new checkpoint functions	2020-12-22 09:15:57 -08:00
Hariharan Seshadri	04b3e0ef5e	Condition fix in Resize operator (#6193 )	2020-12-22 00:05:31 -08:00
Hariharan Seshadri	fc27074bae	Implement ScatterND for CUDA EP (#6184 )	2020-12-22 00:04:20 -08:00
Chi Lo	945fae8f56	Lochi/quantization tool for trt (#6103 ) * Initial implementation of generating calibration dynamic range table * Initialize validation support for Quantization * Initialize validation support for Quantization (cont.) * Improve validation support for Quantization * Improve validation support for Quantization * Rewrite/Refine for calibration and validation * Rewrite/Refine for calibration and validation (cont.) * Refine code * Refine code * Add data reader for BERT * Add flatbuffers to serialize calibration table * Refine code and add BERT evaluation * Refine the code * minor modification * Add preprocess/postprocess of vision team yolov3 and refine the code * Update annotation * Make bbox coordinates more accurate * Fix bug * Add support of batch processing * Batch processing for model zoo yolov3 * Add batch inference for evaluation * Refine the code * Add README * Add comments * Refine the code for PR * Remove batch support checking in data_reader and refine the code * Refine the code for PR * Refine the code for PR review Co-authored-by: Olivia Jain <oljain@microsoft.com>	2020-12-21 20:59:08 -08:00
Olivia Jain	234e94b4e1	Add Status.csv to EP Perf Tool (#6167 ) * merge master, keep postprocess status commit * download float16.py everytime * removing hardcoded values	2020-12-21 20:23:19 -08:00
Suffian Khan	67ac6ae4e0	Tune fast Gelu to use exp(x) instead of tanh(x) on Rocm platform (#6174 ) * tune fast gelu to use exp(x) instead of tanh(x) on rocm * update to use expression 2/(1+exp(-2x))-1 for stability	2020-12-21 16:25:21 -08:00
Weixing Zhang	53307a5f2e	improve perf for softmax (#6128 ) * improve perf for both gathergrad and softmax * revert the change in gathergrad and will be done in another PR. * address comments from code review.	2020-12-21 14:15:54 -08:00
S. Manohar Karlapalem	ea9cfa554a	Add usage details of unified MCR container image (#6182 ) Going forward, a single unified docker image will be published in MCR. The hardware accelerator target choice will have to be made in the application using OpenVINO EP's runtime config options.	2020-12-21 11:48:54 -08:00
satyajandhyala	201d0dbb1a	Android coverage dashboard (#6163 ) * Write the report to a file. * Post code coverage to the Dashboard database.	2020-12-21 10:34:01 -08:00
jingyanwangms	f874260b9e	Backend APIs for checkpointing (#5803 ) * Add backend API GetOptimizerState and GetModelState * add GetPartitionInfoMap	2020-12-21 08:21:29 -08:00
Scott McKay	2da8060f34	Helper for compiling EP to generate deterministic unique ids for use in MetaDef names (#6156 ) * Create a helper for generating unique ids that can be used by an EP that creates compiled nodes and needs ids to be deterministic for a model when used in multiple sessions. Added to IExecutionProvider as this can potentially be used by all compiling EPs and is more robust than a simplistic counter (although EP implementer is free to choose either approach). * Restructure the helper so it can be called across the EP bridge. Add ability to call id generation helper from EP bridge - convert DNNL EP to use helper to validate Address issue where a new Model may be loaded into the same address as a previous one. - hash the bytes in the Graph instance (1728 bytes currently) to use as the key to the full hash for the model Add lock around id generation to ensure no issues if multiple sessions partitions graphs at exactly the same time. - Extremely unlikely but would be hard to debug and the locking cost is not an issue as it's only incurred during graph partitioning and not execution.	2020-12-21 12:17:58 +10:00
Edward Chen	cd3a5acca0	Update get_docker_image.py to enable use without image cache container registry. (#6177 ) Update get_docker_image.py to enable use without image cache container registry.	2020-12-18 19:01:02 -08:00
Derek Murray	11b0a5401e	Fix typo in BERT pretraining script (#6175 ) A misplaced `}` meant that the `'enable_adasum'` option was interpreted incorrectly, causing the test to fail.	2020-12-18 16:38:14 -08:00
Guoyu Wang	bbb52e9274	[NNAPI EP] Enable per-channel quantization for QlinearConv (#6155 ) * Enable qlinearconv per-channel quantization * Fix the android CI test failure * Add Android Version Check for Per-Channel Quant * Address PR comments * Fix some minor issues * Add verification of per-channel zero points * Make the error tolerance configurable	2020-12-18 16:13:22 -08:00
baijumeswani	39aedbc97f	aggregate model states only for the case when mixed precision was true (#6176 )	2020-12-18 14:09:32 -08:00
Pranav Sharma	86493e6d0c	Update documentation for contributing a PR and add deprecation notices for PyOp and ORT server. (#6172 )	2020-12-18 02:00:42 -08:00
Sergii Dymchenko	824ef9a1de	Don't try to bind unused inputs in the Training frontend (#6166 )	2020-12-17 21:41:28 -08:00
baijumeswani	adc2071043	save_checkpoint, load_checkpoint and aggregate_checkpoints (#6136 ) * save_checkpoint and load_checkpoint implementations * checkpoint aggregation logic * unit tests for save_checkpoint, load_checkpoint and aggregate_checkpoints	2020-12-17 21:01:36 -08:00
Guoyu Wang	c339bb2da9	Remove ignored build warnings for pybind on Mac (#6165 )	2020-12-17 19:54:28 -08:00
Yufeng Li	98d8a3e335	Revert "Fuse MatMulIntegerToFloat only when scales are scalar (#6008 )" (#6169 ) This reverts commit `f2dcba7afe`.	2020-12-17 19:53:50 -08:00
Du Li	34725ae520	Bugfix for topk cuda kernel (#6164 ) * fix the issue that std::numeric_limits cannot handle half type * adding a test Co-authored-by: Du Li <duli@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-12-17 17:59:37 -08:00
Jay Rodge	dec703b62d	Update TensorRT-ExecutionProvider.md (#6161 )	2020-12-17 17:10:40 -08:00
Tixxx	32c67c2944	Deprecating Horovod and refactored Adasum computations (#5468 ) deprecated horovod submodule refactored adasum logic to be ort-native added tests for native kernel and e2e tests	2020-12-17 16:21:33 -08:00
Pranav Sharma	efa1b0d864	Minor fix to satisfy c++14 (#6162 )	2020-12-17 13:53:24 -08:00
Juliana Franco	36c03b32e9	Using a map of ops to stages as input of partition function. (#5940 ) * New partition algorithm running before AD * Convert cut_group_info into device map. Work in progress -- works for bert-tiny with pp=2 * Removing code for partition of bwd graphs * Remove old code * Adding some verification code * Handle Shared Initializer * Renaming rank with stage * Added first unit test * new test * redundant check * undo change in bert * Moved cut-based partition to testing utils file Co-authored-by: xzhu1900 Co-authored-by: wschin * New conversion function and tests * minor * remove test that is not needed2 * improve GetDeviceAssignment and PR comments * minor changes * PR comments * improving documentation and variable naming * add documentation * Variable naming and docs * more doc improvements * more doc improvements * missing static cast * Fix test file for windows * Fix test file for windows * Fix test file for windows * stage id is not the same as rank id * PR comments * PR comments * More comments * More comments	2020-12-17 09:03:33 -08:00
Tracy Sharpe	503b61d897	MLAS: add NEON version of int8 depthwise convolution (#6152 )	2020-12-16 18:39:10 -08:00
Edward Chen	0fa04bdc50	Fix clean_docker_image_cache.py detection of image pushes. (#6151 ) Fix clean_docker_image_cache.py detection of image pushes. They were being ignored because the expected HTTP status code was wrong. For pushes, it's 201 instead of 200.	2020-12-16 17:25:22 -08:00
Changming Sun	344a2a8ee5	Revert "work around of the build break in mac (#6069 )" (#6150 ) This reverts commit `3cae28699b`.	2020-12-16 14:41:18 -08:00
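
Note on commit 67ac6ae4e0 (fast Gelu on ROCm): the message says tanh(x) was replaced by the expression 2/(1+exp(-2x))-1. The two are mathematically identical, and the sketch below only illustrates that identity in plain Python. It is not the ROCm kernel. The function names, the overflow guard, and the GELU constants (the widely used tanh approximation) are assumptions made for this example.

```python
import math


def tanh_via_exp(x: float) -> float:
    """tanh(x) written as 2 / (1 + exp(-2x)) - 1, the form named in the commit."""
    z = -2.0 * x
    # Guard added for this sketch only: math.exp raises OverflowError for
    # arguments above roughly 709, and the result has saturated to -1 by then.
    if z > 700.0:
        return -1.0
    return 2.0 / (1.0 + math.exp(z)) - 1.0


def fast_gelu(x: float) -> float:
    """Common tanh approximation of GELU (assumed form, not taken from the kernel)."""
    u = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + tanh_via_exp(u))


if __name__ == "__main__":
    for value in (-3.0, -0.5, 0.0, 0.5, 3.0):
        print(value, tanh_via_exp(value), math.tanh(value), fast_gelu(value))
```

Running the script prints both tanh forms side by side for comparison.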

3986 commits