onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-20 19:12:24 +00:00

Author	SHA1	Message	Date
Chris Seymour	db61eb4cd7	Update ONNX_Runtime_Perf_Tuning.md (#1378 )	2019-07-17 19:14:43 -07:00
Tracy Sharpe	f47f6fd020	Fix MaxPool when using dilation > 1 plus non-zero padding (#1320 ) MaxPool with dilation > 1 and padding did not compute the correct start index. Added code to fix and test cases to cover this.	2019-07-17 17:33:29 -07:00
Changming Sun	fbdd905440	Switch some of the linux pipelines to use the new data download script (#1379 )	2019-07-17 16:06:02 -07:00
avidiyal	859a57d781	Updated Dockerfile for OpenvinoEP (#1362 ) * Updated Dockerfile for OpenvinoEP Signed-off-by: avidiyal <akhila.vidiyala@intel.com> * Changed the license Signed-off-by: avidiyal <akhila.vidiyala@intel.com> * resolving conflicts * Reviews fixed	2019-07-17 14:52:59 -07:00
Yuan Yu	93fb62bb3e	More code cleanup (#1405 ) * More code cleanup * More cleanup	2019-07-17 14:45:50 -07:00
Yufeng Li	a7b1a8969c	simplify nocontribops-ci and fix build break (#1422 ) simplify nocontribops-ci and fix build break	2019-07-17 13:43:40 -07:00
Tracy Sharpe	4383615cf6	implement conv+clip fusion (#1412 ) This change implements Conv+Clip activation fusion for FusedConv and NCHWc convolutions. The Clip operation runs in the thread context that is producing the convolution output.	2019-07-17 12:16:45 -07:00
suryasidd	d2cc086bee	[OpenVINO EP] Minor bug fixes (#1388 ) * Minor bug fixes for accelerators * Added dimensionality checks for each graph input for GPU * Disabled some tests for MYRIAD and GPU * This change is required for running some of the models on OpenVINO instead of falling back to default CPU EP Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com> * PR Feedback Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com> * Fix missing bracket Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>	2019-07-17 10:48:54 -07:00
R. G. Esteves	8720fe62e3	Added missing libraries to Windows wheel (#1415 )	2019-07-17 05:54:09 -07:00
Changming Sun	c2aa2056b5	Sample for imagenet and batch prediction (#1372 ) * Sample for imagenet and batch prediction (Will add a readme later)	2019-07-16 14:23:45 -07:00
Changming Sun	d38badffdb	Disable mklml in Windows Build	2019-07-16 11:09:17 -07:00
Raymond Yang	a203077dcd	Relax timeout in CI system (#1394 ) * Relax timeout in CI system (temporary) * Relax timeout on TensorRT pipeline	2019-07-15 15:10:08 -07:00
Scott McKay	07a2466d9f	Use INFO instead of WARNING for an unused graph input. (#1235 ) * Use INFO instead of WARNING for an unused graph input. * Drop severity of unused initializer as well * Update to output a warning level message if removing an initializer that is never used, and an info level message if removing an initializer that optimization has made redundant.	2019-07-15 20:29:30 +10:00
Yang Chen	fa4b956f12	replace onnx:: with ONNX_NAMESPACE:: (#1376 ) * replace onnx:: with ONNX_NAMESPACE:: * Fixed issue for building shared libs * address CR feedback * address more CR feedback	2019-07-15 01:06:53 -07:00
Scott McKay	61b733ce6d	Update optimizers to be able to utilize a constant initializer from an ancestor graph (#1346 ) * Now that we check for a constant initializer in an ancestor graph we also need to be able to retrieve and replace that initializer. Add helpers to do so. Update optimizers to use the new helpers. Fix bug in UnsqueezeElimination where it wasn't checking if the initializer it was replacing was constant.	2019-07-15 12:41:01 +10:00
Tracy Sharpe	d4ce31ea6d	cleanup fused conv activation handling (#1403 ) * cleanup fused conv activation handling * fix build break * fix mkldnn build break	2019-07-14 16:34:16 -07:00
Yuan Yu	c139e3ab33	Remove a few useless unique_ptrs (#1401 )	2019-07-13 16:15:29 -07:00
Tracy Sharpe	719e58d831	Use MLAS to retrieve the CPU preferred tensor buffer alignment (#1377 ) Add MlasGetPreferredBufferAlignment() for use by CPUAllocator::Alloc to get the byte alignment for CPU tensors. Using MLAS allows the value to be based on the platform the binary is running on instead of a constant value fixed at compile time.	2019-07-12 22:22:46 -07:00
Changming Sun	5a6f1c10d6	Add OrtCreateStatus to the symbol list	2019-07-12 15:10:58 -07:00
Ke Zhang	3bf0e364e2	Move CopyTensor out of IExecutionProvider interface. (#1268 ) * add ortdevice class * add data transfer manager for copying tensors. * update * add data transfer for gpu * fix constexpr build break. * update * remove unnecessary header files. * remove unnecessary header files. * add dependency * add dependency * add dependency * add dependency * fix linux build break. * update * fix build break * fix build break * fix build break * update * update * update c api. * update to not use OrtCreateAllocatorInfo * change to all eps . * fix linux build break * remove useless codes. * update * move datatransfermanager in session state * update * fix cuda build break. * fix comments * fix windows GPU build. * fix comments * fix build break * fix comments * fix test failure * update * fix comments * fix onnx runtime server. * update * fix test failure. * fix comments * fix comment	2019-07-11 14:49:20 -07:00
jignparm	e580b76305	Fix ARM64 build + Add NuGet pipeline including ARM binaries (#1335 ) * Add arm64 nocontribops pipeline * minor fix * Added new template for arm build -- disable all tests * fix build command * add arm64 flag for msbuild * add arm leg as upstream dependency * update platform to arm64 for msbuild * remove test task from arm build * remove ESRP signing of C# dlls in arm build * Updated to work for both --arm and --arm64 * Make the cross compiling cmake flags symmetric * Add dynamic check for /Wno-error flag, instead of extra build option * remove extra full-stop	2019-07-11 11:49:17 -07:00
Maik Riechert	bfda9ca1c1	Make sure submodule urls are up-to-date (#1357 ) This extends build.py to run git submodule sync --recursive before running git submodule update --init --recursive. This makes sure submodule URLs are up-to-date.	2019-07-10 13:11:59 -07:00
Changming Sun	20f6c84fd2	Switch to use nvidia-docker2 command format	2019-07-10 13:11:07 -07:00
S. Manohar Karlapalem	a7fcd60572	Add missing 'openvino' option in perftest Usage message (#1367 )	2019-07-10 10:58:18 -07:00
Faith Xu	aba7271ad7	Fix links (#1371 )	2019-07-10 08:34:31 -07:00
Tracy Sharpe	823fa3f39c	Integrate MLAS NCHWc support into ONNX Runtime (#1327 ) This change integrates the NCHWc support recently added to MLAS into ONNX Runtime. When using "-o 3" optimizations, then the runtime will do a NCHWc layout optimization pass to convert standard ONNX operators such as Conv/MaxPool to the com.microsoft.nchwc domain with weights and biases reordered for speed.	2019-07-09 20:41:19 -07:00
Hector Li	42c18762f3	Update the log message for fallback case. (#1370 ) Log a warning if the fallback is caused by a functional limitation. Log an info message if the fallback is by design, e.g. nodes between Shape (CPU output) -> CUDA nodes .. -> ReShape (CPU input)	2019-07-09 16:54:40 -07:00
Tracy Sharpe	c483a1e3c6	Use simpler GEMM function for MatMul operator (#1365 ) More cleanup of the math files. Instead of using templates to instantiate a full GEMM for the types added for MatMul (integers and double), use a simpler MatMul function that doesn't do any transposing and assumes alpha=1 and beta=0.	2019-07-09 15:07:50 -07:00
jignparm	57225cd4ee	Add C++ API test for NuGet package (#1364 )	2019-07-09 13:51:51 -07:00
Hector Li	298f30546b	Fix the random UT failure for RNN/GRU cases which have padded sequenc… (#1361 ) Fix the random UT failure for RNN/GRU cases which have padded sequence. e.g. max_seq = 2. batch_size =2, sequence_lengths = {2, 1}. For the output beyond the shorter sequence {1}, we should initialize the value to 0. Root cause: Cudnn library doesn't guarantee the value beyond the shorter sequence. Fix: Initialize the output Y data to all 0 before calling cudnn library.	2019-07-09 13:28:11 -07:00
Changming Sun	27da857b51	Fix an SAL annotation in onnxruntime_c_api.h	2019-07-09 10:14:58 -07:00
Vinitra Swamy	6b32c77804	Dockerfiles for TensorRT, CUDA, build from source (#922 ) * dockerfile updates for BYOC scenario * updates for 3 different build versions * updating to remove libopenblas, python3, python3-pip * Including LICENSE-IMAGE.txt for CUDA/TensorRT dockerfiles * remove unnecessary cmake files * fixing comment typo * optimizing dockerfile.source as per review suggestions (not working currently) * Optimizing dockerfiles with install_dependencies script * update dockerfile with --cmake_extra_defines version number * add &&\ for license copy lines * updates, adding miniconda to path, reincluded clearing the pycache * adding maintainer note * update readme instructions * update tensorrt versioning in dockerfile	2019-07-09 02:03:55 -07:00
Maik Riechert	3cae067a9b	fix non-standard u_int32_t type (#1358 )	2019-07-09 00:19:58 -07:00
Scott McKay	ac6a4afb0f	Add validation of shape when re-using a buffer in ExecutionFrame (#1356 ) * Check for empty string as dim_param in allocation planner. * Validate shape is compatible at runtime when re-using Tensor.	2019-07-09 14:59:07 +10:00
Changming Sun	58d6ff3f13	Remove AgentPool setting in CI yaml	2019-07-08 15:40:54 -07:00
Tracy Sharpe	3a588860cc	remove unused math routines (#1354 ) This change removes a number of unused math helpers from core/util/math.h. Most operators are already using MLAS or Eigen directly.	2019-07-08 14:05:27 -07:00
Pranav Sharma	e9ce51ead4	Make GetTensorShapeFromTensorShapeProto return TensorShape and not its internal representation. (#1353 )	2019-07-08 11:45:55 -07:00
Faith Xu	5b93b02c69	Issue template update (#1339 ) * Update to include urgency * Wording update * Wording update	2019-07-07 23:38:52 -07:00
Faith Xu	b7ae0d5694	Fix link (#1351 ) * Fix link * Update PyOp.md	2019-07-07 21:56:18 -07:00
R. G. Esteves	93528d9b3c	Reduce memory footprint of nGraph (#1296 ) * Fix unnecessary memory allocation in MKLDNN 1x1 convolution. * remove the patch header.	2019-07-07 20:23:19 -07:00
NonStatic	9f9ff19bdc	Copy shared library after build ORT Server (#1347 )	2019-07-07 20:21:16 -07:00
Hariharan Seshadri	2714576d0a	Update ONNX Runtime server doc to reference Jupyter notebook (#1340 )	2019-07-05 14:30:17 -07:00
Colin Versteeg	a8ff209ab6	Refactor Onnx runtime Server to only use public APIs (#1271 ) * replace log sinks * limit headers to include dir * first changes to do dynamic linking * wip for using cxx api * remove weird dangling dependency * building with tests failing * finish updating converters * fix const * initial introduction of typedef * change logging to use spdlog * get tests passing * clang format * map logging levels better * clean up unused imports * trent cr comments * clang-format * code review comments * changing buffer use to reserve * Dynamically link * revert tvm * update binary uploading * catch exceptions by const-ref * Revert "revert tvm" This reverts commit 387676dd1018134d15eb71fa126f7caf94380800. * fix typo * update versioning of lib	2019-07-04 01:08:14 -07:00
Scott McKay	e3919d3fce	Cleanup naming of test input to use .onnx for models. (#1337 ) * Cleanup naming of test input to use .onnx for models. * Remove file deleted on master	2019-07-04 13:10:29 +10:00
KeDengMS	0d204f3f06	Implementation of TVM codegen library (#888 ) Description: This change adds the common part of TVM based codegen library. It includes following parts: * Microsoft TVM Inventory (MTI): a set of TVM ops for neural networks, similar to TOPI * Compiler pass for traversing ONNX graph and generate TVM ops * Compiler pass for traversing generated graph and specify TVM schedule * Compiler pass for handling weight layout * Utils for debugging Motivation and Context: TVM is an open deep learning compiler stack for cpu, gpu and specialized accelerators. To leverage it in ONNX, we built an execution provider named Nuphar. Currently, Nuphar gets good performance on CPUs with AVX2 on quantized LSTM models. This codegen library was part of Nuphar execution provider. It is split out for sharing with other execution providers, as we'd like to reuse TVM in more devices.	2019-07-03 10:32:59 -07:00
Scott McKay	9d3b6b3a49	Disallow overriding initializers if IR version < 4 (#1324 ) Description: Disallow overriding an initializer via a graph input if the IR version is < 4. This enforces an implicit assumption that initializers should be treated as constant, and allows constant folding to be done on a model with an older IR version. Separate constant and overridable initializers so that it's clear which ones constant folding can utilize. Update Graph to not add all initializers to the graph inputs when the graph is manually created (i.e. not loaded from a GraphProto) and the IR version is >= 4. Motivation and Context In order to do constant folding we need to know which initializers can be treated as constant and which are overridable. All initializers were required to have a matching graph input prior to IR version 4, technically making all of them overridable. The intention however was for them to be treated as constants, and this change enforces that intent. The benefit of doing so is that constant folding will work for models with IR version < 4. The cost is that if someone is actually overriding an initializer they will need to update the IR version of their model to version 4 in order to keep doing so. The belief is that this is a very small subset of usage (e.g. models involving feeding in a truncated sequence) and the cost to update that small subset is warranted by the benefit of constant folding being able to be enabled on all older models without them needing an IR version update.	2019-07-03 18:43:38 +10:00
Hector Li	2a6c69de2b	Implement the Concat CUDA kernel (#1333 ) * Improve CUDA kernel performance for Concat. Implement the kernel code instead of using cudaMemCpy in a loop. * Update the index lookup part for Concat & Split	2019-07-02 23:08:59 -07:00
Faith Xu	5e54bbffec	PyOp documentation Revisions (#1318 ) * Revisions * Minor fix	2019-07-02 18:00:51 -07:00
Ryan Hill	1bf80e30fa	Ryanunderhill/MNIST sample (#1330 )	2019-07-02 14:41:27 -07:00
RandySheriffH	bf6a9f9c27	Rashuai/py op example (#1325 ) * add scikit example * format text * format doc	2019-07-02 09:53:49 -07:00

1010 commits