onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-18 18:52:16 +00:00

Author	SHA1	Message	Date
Mika Fischer	b2658b3594	Cache CUDNN convolution benchmark results in cuda::Conv kernels (#712 ) * Cache CUDNN convolution benchmark results in cuda::Conv kernels Previously, the best convolution algorithm was determined by running cudnnFindConvolutionForwardAlgorithmEx and cudnnFindConvolutionBackwardDataAlgorithmEx on every shape change. This is very detrimental for variable input shapes, such as variable batch sizes. This change adds a map to cache previously determined benchmark results. The caching results in significant speedups for variable input shapes. * Use LRU to limit cached benchmark results * Only cache benchmark results for a fixed weight shape In case the weight shape changes, all cached results are discarded. * Use padded shape as key for cached benchmarks * Add constant for max number of cached benchmark results * Use unordered_map to store cached benchmark results * Only store the parameters that are actually needed	2019-04-15 22:15:14 -07:00
Tracy Sharpe	f19d9a4907	Reduce code size of kernel registration (#833 ) Some changes that reduce the size of the release onnxruntime.dll by 170KB: Change the ONNX_OPERATOR_KERNEL macros to not create a unique virtual class per kernel create lambda, but instead use a generic class with the raw function address supplied at BuildCreateKernelInfo time. Changed the execution providers to use a table driven approach to calling the BuildCreateKernelInfo functions instead of a massive function with construct/call/delete sequences. The CreateFunc in data_types.h didn't need to be a std::function, eliminating more lambda virtual classes. N.B. To accommodate MSVC 14.11 toolchain (used for CUDA builds), the operator+() syntax cannot be used to retrieve the raw function address. The older toolchain can't resolve between cdecl/vectorcall and gives up. An explicit cast is needed to help the compiler along.	2019-04-15 16:39:59 -07:00
Pranav Sharma	049ba2d747	Exclude tests that fail when contrib ops are disabled. (#835 )	2019-04-15 15:57:48 -07:00
Pranav Sharma	4b4a359943	Exclude unreferenced global data and op doc strings in the opschema object. The first causes a decrease in the binary size by at least 85k. The latter reduces resident memory size. (#823 ) * Exclude unreferenced global data and op doc strings in the opschema object. The first causes a decrease in the binary size by at least 85k. The latter reduces resident memory size. * Update onnx to incorporate my PR that fixes SetDoc compiler warnings	2019-04-15 15:57:19 -07:00
Ashwini Khade	e999af61b2	bug fix for shape inference (#834 )	2019-04-15 15:51:12 -07:00
Raymond Yang	fabdbdc130	Update test retrieval following #828 (#836 ) * Enable nightly build * Update fetch file names * Fix * Update setup.py * Update run_dockerbuild.sh * Resolve comments * Update test data	2019-04-15 14:51:20 -07:00
Dmitri Smirnov	6194a92249	Fix empty input handling in Tokenizer. (#826 )	2019-04-15 14:46:17 -07:00
Changming Sun	2c0b8e965e	Disable test data local cache in Linux CI pipelines	2019-04-12 22:23:16 -07:00
Changming Sun	e493ba2219	Fix memory leaks in perf test runner	2019-04-12 00:51:33 -07:00
Raymond Yang	1936d141a7	Create nightly build for python packages (#817 ) * Enable nightly build * Update fetch file names * Fix * Update setup.py * Update run_dockerbuild.sh * Resolve comments	2019-04-11 22:06:18 -07:00
Tracy Sharpe	c55e2de593	Status class optimizations (#824 ) optimize onnxruntime::common::Status to reduce code size	2019-04-11 21:57:01 -07:00
Hariharan Seshadri	b6936e71cb	Avoid postfix iterator increment in a loop in Slice op and some minor formatting fixes (#820 ) * Initial commit * Fix comment * More nit fixes	2019-04-11 17:44:32 -07:00
Hariharan Seshadri	ccf3566c35	Register kernel for Dropout (opset 10) for opset compliance (#813 )	2019-04-11 13:55:43 -07:00
Pranav Sharma	6577c3dddf	Extract debug symbols in a separate file and strip the binary. (#811 ) * Ensure Linux binaries are built with debug info. Extract debug info out of the main binaries. Strip the main binaries. * add binutils * add uname * add binutils * remove linux portion	2019-04-11 12:02:50 -07:00
Ryan Hill	1ff29bfb3d	Fix x86 calling convention break (#814 )	2019-04-11 10:41:07 -07:00
Hector Li	0741baf867	Update NMS to support max_output_boxes_per_class = 0. NMS will do nothing for this case. (#816 )	2019-04-11 10:09:33 -07:00
Hariharan Seshadri	56749a84ee	Implement opset v10 changes for Slice operator (#772 )	2019-04-10 22:06:05 -07:00
jignparm	53038b33ed	BuildFusedKernelDef uses N^2 algorithm verifying input constraints; session load time is huge for fused nodes (#804 ) This optimization is required for WinML to prevent unit test time out.	2019-04-11 09:52:10 +08:00
jignparm	d17ae5c093	MKLML pipeline - update C# and CMake to handle dll dependencies (#810 ) * Refactor NuGet to allow arbitrary namespaces * Move csharp build to end of cmake * Minor edit to ensure dll generation in sequence	2019-04-10 18:16:02 -07:00
Jesse Benson	24d80b4bda	Add support for BrainSlice execution provider in Python, if onnxruntime is built with it.	2019-04-10 17:37:21 -07:00
Ashwini Khade	10b113f144	update onnx to bring in quantized ops (#808 ) * update onnx + move quantized ops kernels and test to onnx + remove exp ops * update onnx * Revert "update onnx" This reverts commit 533abfc297e75473a74505fb89921ffc05c46a1c. * add generated csharp test file	2019-04-10 17:20:35 -07:00
Changming Sun	4bc3d6027d	Build perf test runner only if onnxruntime_BUILD_SHARED_LIB is ON	2019-04-10 13:16:56 -07:00
Raymond Yang	3dcf82a1f9	Disable some flaky tests with CUDA9 (#805 ) * Disable failing cuda tests	2019-04-10 01:02:32 -07:00
jignparm	4e3391ef60	Refactor NuGet to allow arbitrary PackageId names (e.g. Microsoft.ML.OnnxRuntime.MKLML) (#797 ) * Refactor NuGet to allow arbitrary namespaces * Move csharp build to end of cmake	2019-04-09 22:48:00 -07:00
Ashwini Khade	e7090d7202	move all removed exp ops to contrib ops (#786 ) * move all removed exp ops to contrib ops * fix cuda build failure * bug fix * move some tests to contrib ops + cosmetic changes * Revert "move some tests to contrib ops + cosmetic changes" This reverts commit 4cda9297e257a6f6b902724e8113bf5d5a62df29.	2019-04-09 22:26:48 -07:00
Changming Sun	0d4055def4	Integrate tensorflow into onnxruntime_perf_test tool	2019-04-09 15:55:08 -07:00
jignparm	9467c5f967	Update version to 0.3.1 (patch release) (#798 ) * bump up version number (#752) * bump up version number * Minor change to kick off build * update version to 0.3.1	2019-04-09 14:48:56 -07:00
Xavier Dupré	ccd7e801a0	Fix #612 , TfidfVectorizer handles empty matrices as an input (#702 ) * Fix #612, TfidfVectorizer handles empty matrices as an input * Add more unit tests, better consistency of error messages * Update tfidfvectorizer.cc * better comment * fix comments * add unit test failure for an empty input {0, 1}	2019-04-09 10:55:24 -07:00
Yufeng Li	39951f35f4	Use template windows-build-tools-setup-steps.yml in win pipelines (#794 ) 1. Update nuget restore to 4.3 for capi pipeline 2. Use template windows-build-tools-setup-steps.yml in win pipelines.	2019-04-08 21:35:33 -07:00
jywu-msft	d91555f99e	fix for tensorrt_basic_test not being run. (#792 )	2019-04-08 13:18:36 -07:00
Hariharan Seshadri	5cf72030b2	Rename misleading test names in ConvTranspose op tests (#788 )	2019-04-06 17:01:26 -07:00
jywu-msft	571291c323	build.sh: don't require user to set --use_full_protobuf with --use_tensorrt option. we can set it implicitly. (#780 ) * use_full_protobuf if tensorrt build option is enabled. * update BUILD.md sections on MKLDNN and TensorRT/full_protobuf option	2019-04-06 10:11:57 -07:00
Yufeng Li	cea2a40bf1	Clean up ExecutionProvider in CSharp (#783 )	2019-04-05 22:29:54 -07:00
Ryan Hill	fda1d0dce9	Ryanunderhill/ocr custom op (#744 ) * Adding a custom op interface to the C API to remove shared library dependency. * Remove old custom op test * Rework how custom ops handle inputs/outputs to enable custom op output shape calculation in the compute method * Add a nicer C++ API for custom ops and switch the tests to use it.	2019-04-05 18:53:20 -07:00
Tao Qin	58ef1306d4	Copy inputs and outputs directly in InferenceSession::SaveModelMetadata (#777 ) * Copy required inputs and outputs directly in InferenceSession::SaveModelMetadata * trivial * trivial	2019-04-05 15:16:55 -07:00
ybrnathan	3eddb2d61e	Add optimization level as cmd line arguments (#776 ) * Add optimization level as cmd line arguments * fix the help info and add option.	2019-04-05 14:44:28 -07:00
utsabsingharoy	36ed91ee9f	CustomRegistry should use composition instead of inheritance CustomRegistry should use composition instead of inheritance	2019-04-05 14:14:10 -07:00
Changming Sun	867e961ee8	Remove mkldnn_sgemm from math_util.cc If it is needed, it can be used explicitly in mkldnn provider.	2019-04-05 14:13:10 -07:00
Hariharan Seshadri	ffd9071168	expose graph node name returning non-zero status code (#714 ) * Initial commit	2019-04-05 12:50:58 -07:00
Peng Wang (AI FWK)	f4021cf30a	fix a minor inaccurate error message	2019-04-05 12:07:28 -07:00
Yufeng Li	ef9a4d98cb	Expose parallel execution option in C# API (#767 ) * Expose parallel execution option * delete unnecessary file * add doc * update nuget restore to 4.3.0 * resolve comments * remove unnecessary file * make git ignore csharp/Directory.Build.props * fix yaml config for nuget 4.3	2019-04-05 12:05:56 -07:00
Changming Sun	43521c0de7	update	2019-04-04 21:22:27 -07:00
Scott McKay	65c50bb25b	Create NodeArg for all initializers if IR version is > 3. (#742 ) Previously all initializers had to have matching graph inputs and the NodeArg was guaranteed to be created via graph input processing.	2019-04-05 14:09:27 +10:00
Mika Fischer	2674d9bd8a	Fix profiling with C API (#710 ) Currently, when using OrtEnableProfiling to enable profiling using the C API, the profile output file is created but is always empty. The reason is that InferenceSession::EndProfiling() needs to be called to write the profiling data to the output file. However there's currently no way to call this function via the C API. This adds a call to EndProfiling() to the destructor of the session if profiling is enabled in the session options.	2019-04-04 16:23:18 -07:00
Raymond Yang	c35d9797d7	Add unique identifier in function subgraph (#771 ) * Add unique identifier in function subgraph * enable test	2019-04-04 13:31:20 -07:00
Changming Sun	289663fd85	reformat onnxruntime_c_api.cc	2019-04-04 13:03:13 -07:00
Changming Sun	290112d614	Update onnx (#761 ) * update onnx	2019-04-04 10:58:45 -07:00
Changming Sun	261a9078a5	Remove redundant code in ml::write_scores	2019-04-04 01:21:48 -07:00
Konstantinos Karanasos	512cfdd9fe	Generalize node removal (#743 ) Generalize node removal method in graph_utils. This is a higher-level method that keeps the graph consistent so that no Resolve is needed after the removal of a node. The new method supports the removal of nodes with a single input (be it an incoming node or an initializer) and a single output (but allowing multiple output edges of that output). It also takes into account the case that one of the output edges is fed to a subgraph. Also updated the rewrite rules to use this new, less restrictive method, and improved the rules' conditions. Introduced a GraphEdge struct to simplify various methods in graph_utils.	2019-04-03 22:34:20 -07:00
Changming Sun	7af35ac1e6	Fix two warnings in graph lib	2019-04-03 19:41:11 -07:00
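
The caching scheme described in commit b2658b3594 (an LRU-bounded map of benchmark results, discarded whenever the weight shape changes) can be illustrated with a small sketch. This is not the code from that commit, which is C++ and is not shown on this page. The class name `BenchmarkCache`, the constant `MAX_CACHED_RESULTS`, and the `fake_benchmark` stand-in are hypothetical, and the sketch keys on the plain input shape rather than the padded shape.

```python
from collections import OrderedDict

MAX_CACHED_RESULTS = 3  # hypothetical limit; the commit's actual constant is not shown here


class BenchmarkCache:
    """LRU cache mapping an input shape to a previously selected algorithm."""

    def __init__(self, max_entries=MAX_CACHED_RESULTS):
        self.max_entries = max_entries
        self.weight_shape = None
        self.entries = OrderedDict()

    def lookup(self, weight_shape, input_shape, run_benchmark):
        # A changed weight shape invalidates every cached result.
        if weight_shape != self.weight_shape:
            self.entries.clear()
            self.weight_shape = weight_shape
        key = tuple(input_shape)
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        result = run_benchmark(key)  # expensive search, run once per new shape
        self.entries[key] = result
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return result


calls = []  # records every shape that triggered a real benchmark run


def fake_benchmark(shape):
    # Stand-in for the expensive algorithm search (assumption for illustration).
    calls.append(shape)
    return "algo_" + "x".join(str(d) for d in shape)


cache = BenchmarkCache(max_entries=2)
weights = (64, 3, 3, 3)
cache.lookup(weights, (1, 3, 32, 32), fake_benchmark)  # miss: runs the benchmark
cache.lookup(weights, (8, 3, 32, 32), fake_benchmark)  # miss: new batch size
cache.lookup(weights, (1, 3, 32, 32), fake_benchmark)  # hit: no benchmark run
```

With a variable batch size, only the first occurrence of each shape pays for the search; repeated shapes are served from the map until they are evicted or the weight shape changes.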

11997 commits