onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-04 04:07:22 +00:00

Author	SHA1	Message	Date
Justin Chu	4ec5fe5c8a	Github action: Inline lint python / js / cpp (#11328 ) Uses the reviewdog action which supports inline reporting.	2022-04-26 14:17:28 -07:00
Chi Lo	0292356bd7	TensorRT EP engine cache serialization/deserialization refactor (#11045 ) * Code refactor * fix bug * modify comment * modify test for the new ORT TRT cache behavior * update comment * rename variable * fix bug for not having trt context * Custom parameters (#10964) * get inputs independently for trtexec * track one process only * remove engine and profile files * change time to commit time * add runtime option for io binding * move to commit date * fixes * add option for graph optimization * cleanup docker script * note second time creation * allow for parameters to be configured from pipeline at runtime * uncomment * include optional arguments at runtime * post second session creation * update cmake version * Revert "update cmake version" This reverts commit 09a1364eae68610724c8e90eeea777b7ee03f74b. * Move data format import * Perf FasterRCNN + MaskRCNN (#11102) * add faster mask * fix paths * add a test scenario that - if engine cache is present, trt ep should load the engine cache and run inference * Revert "Merge branch 'trt_cache_refactor' of https://github.com/microsoft/onnxruntime into trt_cache_refactor" This reverts commit 8edc574de1ea6055534f33a57b9365c721c2eb29, reversing changes made to 0c92e5b2b1d453527001fe731ed4ccfc79e6adad. Co-authored-by: Olivia Jain <oljain@microsoft.com>	2022-04-26 11:07:48 -07:00
Hariharan Seshadri	81d78706fe	Add location planning logic for implicit inputs which are graph inputs (#11320 )	2022-04-26 10:39:48 -07:00
Justin Chu	fdce4fa6af	Format all python files under onnxruntime with black and isort (#11324 ) Description: Format all python files under onnxruntime with black and isort. After checking in, we can use .git-blame-ignore-revs to ignore the formatting PR in git blame. #11315, #11316	2022-04-26 09:35:16 -07:00
Yi Zhang	13f86e7d56	print mac agent info (#11338 )	2022-04-26 09:27:55 +08:00
Changming Sun	aaa583e776	Refactor the model tests code in onnxruntime_test_all.exe (#11300 ) Update the code to use OrtApis instead of the old onnxruntime::InferenceSession class. Mainly because the old one doesn't support custom op. We are trying to convert some EPs to custom ops. Hopefully they can continue to leverage this test set.	2022-04-25 11:52:51 -07:00
Justin Chu	6fb29f5b9a	Add python docstring linting in vscode settings (#11316 ) Add python docstring linting in vscode settings Use black and isort for python code formatting in VScode. Import sorting enabled on save. Code formatting available in VSCode with manual trigger. Adopted from pytorch https://github.com/pytorch/pytorch/blob/master/.vscode/settings_recommended.json	2022-04-23 06:23:04 -07:00
Yi Zhang	532e2536cc	increase timeout in PR build (#11319 ) * increase timeout * show mac agent info * Revert "show mac agent info" This reverts commit a646ebefff8940a3044f1984107856db33319eb8. * increase timeout in PR test	2022-04-23 16:01:21 +08:00
Adrian Lizarraga	f069951835	[trt-perf-test] Pass TensorRT/CUDA EP options via dictionary argument (#11231 ) * Enable users to pass a dictionary of TensorRT and CUDA EP options to the EP perf benchmark.py script. * Post specified EP options to database.	2022-04-22 11:22:25 -07:00
Yi Zhang	ba1e9a218e	increase timeout (#11310 )	2022-04-22 13:55:04 +08:00
Ye Wang	daf87fd0dd	specify the path for gpt2_helper in onnx_exporter.py (#11301 )	2022-04-21 21:01:40 -07:00
Hariharan Seshadri	23b01258b5	Fix how the output tensor is created in CUDA SpaceDepth ops (#11302 )	2022-04-21 19:58:57 -07:00
Olivia Jain	86cdabbcfd	Add OpenVINO Pipeline Status to README (#11299 ) * update ov pipeline definition ID * Update ov build status	2022-04-21 15:59:50 -07:00
Edward Chen	4d0214f851	Move Contains() helper function to a higher common.h. (#11289 )	2022-04-21 09:31:48 -07:00
Gary Miguel	7aa4af238a	Add strict_shape_type_inference config option (#11081 ) Prior to this, certain shape and type errors were surfaced only when the model was using the latest known op set version. Providing users an explicit option allows for better testing of code that produces models, which includes unit tests within this repo and other repos such as the TF-ONNX and PT-ONNX converters. Remove the previous behavior which seems quite counter-intuitive: an otherwise identical model with a later op set version should be treated identically in this regard. The option defaults to false to avoid causing errors for users that rely on the previous permissive behavior. Turned on the strict enforcement by default in OpTester, which revealed a few disagreements between ORT and ONNX on what the correct output shape should be. Fix shape inference bug in ReduceSumTraining with noop_with_empty_axes=1 which was revealed. Fix TensorOpTest.Unsqueeze_scalar, which was testing negative axes on an op set version where the op did not actually support negative axes. Fixes #9506.	2022-04-21 08:32:40 -07:00
Scott McKay	c5de493a8a	Exclude EPs that aren't available on mobile to try and fix Xamarin build error on M1. (#11267 )	2022-04-21 07:01:46 +10:00
Edward Chen	4854a09340	Consolidate utils::ToTensorProtoElementType, TypeToDataType, and data_types_internal::ToTensorDataType. (#9824 )	2022-04-20 12:45:53 -07:00
Changming Sun	2cacd18d51	Fix an SAL annotation error	2022-04-20 12:02:30 -07:00
Tianlei Wu	1d96cbec73	Move gpt2 script to models\gpt2 sub-directory (#11256 ) * move gpt-2 scripts to models\gpt2 * change gpt2 beam search helper to make test_gpt2 passes	2022-04-20 11:09:26 -07:00
Chi Lo	cb46d79108	Model tests refactor (#11194 ) * Update model test * update comment * create map to hold OnnxModelInfo so test doesn't need to reload the model again * revert the code and use GTEST_SKIP() to skip test * fix bug * revert LATEST_ONNX_OPSET_SUPPORTED_BY_TENSORRT	2022-04-20 10:14:28 -07:00
Scott McKay	af249943a1	Increase the timeout so the packaging pipeline stops failing. TODO: Someone should investigate why the AARCH64 build takes 3+ hours and reduce it if possible. Assuming it's using an emulator given the x64 build with the same arguments takes 13 minutes.	2022-04-20 09:36:37 -07:00
cloudhan	013306c940	[MinBuild] 132KB minimal build binary size reduction via dummy __cxa_demangle (#11071 ) Minimal build binary size reduction via dummy __cxa_demangle	2022-04-21 00:11:10 +08:00
Edward Chen	180b3f7cc2	Update QDQFinalCleanup transformer to also handle removing DQ/Q node pairs. (#11219 ) ` -> DQ -> Q -> ` where DQ and Q have the same scale and zero point is not necessary.	2022-04-20 09:03:12 -07:00
Edward Chen	e3ff4a6bfa	Fix NNAPI EP error when handling external node adjacent to partition. (#11233 ) Move a check for a graph output (for the partition) prior to iterating the downstream nodes to avoid trying to get a NodeUnit for a node that is outside of the partition.	2022-04-20 08:53:29 -07:00
Zhang Lei	70d97bdf53	Support only one input in QLinearConcat (#11265 )	2022-04-19 20:55:51 -07:00
Yufeng Li	2e6c2177af	remove deprecated quantize api (#11263 )	2022-04-19 19:41:55 -07:00
Maxiwell	acb555c4c7	ppc64le: Optimizing the MlasMaximumPool() to use VSX instructions (#11216 ) It runs on Power8, Power9, and Power10	2022-04-19 15:13:55 -07:00
Tianlei Wu	bab9b80f1f	auto mixed precision for t5 (#11252 )	2022-04-19 12:42:11 -07:00
Yulong Wang	5ee8e2e491	[js] use NPM and yarn to upgrade package version (#11059 )	2022-04-19 12:28:13 -07:00
Vincent Wang	06026fe8e6	SizeInBytes Fix for Strided Tensor (#11224 ) * SizeInBytes Fix for Strided Tensor * resolve comments	2022-04-19 15:13:00 +08:00
Edward Chen	3dac66698b	Add option to specify onnxruntime repo URL in tools/android_custom_build/build_custom_android_package.py. (#11250 )	2022-04-18 19:29:41 -07:00
Lukas Berbuer	efb0928e2b	Fix find_package for benchmark	2022-04-18 15:25:43 -07:00
Dmitri Smirnov	98faaa7e2f	Scoped GIL release in run_with_iobinding (#11248 )	2022-04-18 13:07:45 -07:00
Yufeng Li	dec99657a1	Improve onnx shape inference in quant tool (#11106 ) onnx.shape_inference.infer_shapes only works for model size < 2GB, while onnx.shape_inference.infer_shapes_path works for all models. This PR replaces infer_shapes with infer_shapes_path.	2022-04-18 08:07:31 -07:00
pengwa	9765ef8b4e	fix build warnings (#11213 ) * fix build warning	2022-04-18 21:09:09 +08:00
Vincent Wang	0bad5b1b5a	[CUDA] Rollback TileMemcpy and TileBatchedMemcpy when Block Size is Small (#11187 )	2022-04-16 07:46:43 +08:00
George Nash	d9eeb48393	One dnn v2.6 update (#11220 ) * Disable training code in DNNL LayerNorm code The capability code already does not claim the LayerNorm and SkipLayerNorm that require more than one output. However, building with training enabled was causing issues. The training specific code has been removed even when building with training enabled. Signed-off-by: George Nash <george.nash@intel.com> * Fix for DNNL FusedMatMul op. The bug was in the transpose code. Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com> * Use agreed upon memory format type when runnig Pooling Gradient in dnnl ep The dnnl ep does not currently have a way to pass memory_format information between the forward pooling primitive to the backward pooling primitive. This change explicitly sets the memory_format to use match that of Onnxruntime. For both the forward and backward pooling code. This will prevent using un-matched memory format that could result in an `unimplemented` error from dnnl ep. Signed-off-by: George Nash <george.nash@intel.com> * Update dnnl ep to use OneDNN v2.6 Do not run ReduceInfLogSum on the kDnnlExecutionProvider due to a calculation bug when doing Log or infinity valuse. The fix for this issue will be part of the next OneDNN release. Signed-off-by: George Nash <george.nash@intel.com> * Update PrintMemory function in dnnl ep This modification can be used to enable/disable memory printing for dnnl ep develpers. This is considered a developer only feature and is disabled by default. It must be enabled and code recompiled to use. Even if it is enabled it will not actually print any memory because the developer needs to take the extra step of spefifying the memory that will be printed to the screen. Signed-off-by: George Nash <george.nash@intel.com> * Update binary ops to run on intel GPU when using dnnl ep Binary ops (i.e. Add, Div, Mul, and Sub ) was updated to no longer call GetMemoryAndReshape in the past this would move the memory from CPU to the GPU. This extra call is no longer needed since it is taken care of by the GetMemoryInOrtFormat call. Removing the GetMemoryAndReshape prevented copying the memory to GPU twice. Signed-off-by: George Nash <george.nash@intel.com> Co-authored-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>	2022-04-15 12:51:11 -07:00
sumitsays	227bc7264e	Fixed compilation error for ARM architecture (#11223 ) Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>	2022-04-15 09:24:21 -07:00
ytaous	bc296c706e	MatMulScaleFusion - handling scale input (#11121 ) * scale input * more condition check * alternative * per comments * fix comments Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-04-14 21:54:04 -07:00
Yi Zhang	94032357e2	use int storage (#11185 )	2022-04-15 09:56:36 +08:00
Ahmad Zakaria	63ff391b16	add AppendExecutionProvider_CUDA_V2 to the C++ api (#11153 )	2022-04-14 17:33:27 -07:00
chausner	c2b4054c74	Fix typos	2022-04-14 13:53:50 -07:00
stevenlix	5216a43c9d	Consolidate TensorRT subgraphs to reduce inference overhead (#11211 ) * add trt node list consolidation * add more log * fix typo * seperate cycle detection and removal * update * change function name Co-authored-by: Ubuntu <azureuser@orttrtlinuxdev.bxgbzpva45kedp3rhbsbit4phb.jx.internal.cloudapp.net>	2022-04-14 11:05:27 -07:00
Faruk D	a00d24066a	Fix CITATION.cff and add automatic validation of your citation metadata (#10478 ) * Add cffconvert.yml to validate CITATION.cff * Fix CITATION.cff by removing duplicate title and correcting the license Co-authored-by: Abel Soares Siqueira <abel.s.siqueira@gmail.com>	2022-04-13 10:03:52 -07:00
Vincent Wang	9707181257	fix build error (#11199 )	2022-04-13 13:09:19 +08:00
Scott McKay	3b3b23bcf9	Add new python helper dirs to wheel. (#11196 )	2022-04-13 13:34:07 +10:00
Chen Fu	0d0edc071f	Detecting ARM64 CPU core micro-architectures in Windows (#11145 ) Some micro-architectures of power efficient cores in ARMv8 system have narrow 64b load/store resources, which require specialized computing kernels in MLAS. We leverage pytorch CPUinfo package for detecting these cores. Unfortunately CPUinfo package does not work on Windows. This commit implements ARM64 micro-architecture detection.	2022-04-12 16:47:11 -07:00
ashbhandare	ddb17294b2	Fix gradient builder for Cast (#11008 ) * fix grad builder for cast * reviw comments Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net> Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>	2022-04-12 16:08:21 -07:00
Gary Miguel	e84c338989	minor improvements to CONTRIBUTING doc (#11080 )	2022-04-12 15:22:34 -07:00
Faith Xu	5337972f92	Update to use teams instead of individual GH handles (#11163 ) * Update to use teams instead of individual GH handles * Fix typo * Update CODEOWNERS * Update CODEOWNERS * Update team name	2022-04-12 12:06:12 -07:00

6687 commits