onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-10 17:37:14 +00:00

Author	SHA1	Message	Date
Tracy Sharpe	7a96cfc8f5	operator code cleanup (#4228 ) Search/replace of the pattern "const auto foo = tensor.Shape()" to "const auto& foo = tensor.Shape()" to avoid unneeded copies at runtime and reduce code size (8KB drop for onnxruntime.dll). Remove some unnecessary header includes.	2020-06-13 14:47:44 -07:00
jornt-xilinx	c55f6d76be	[Vitis-AI EP] Fix to enable multi-output subgraphs inside Vitis-AI EP + edit docs (#4171 )	2020-06-13 04:56:07 -07:00
Wei-Sheng Chin	de9da123cf	Enable static memory planning for pipeline. (#4204 ) * Enable static memory planning for pipeline. 1. We fix a bug when resolving symbolic shape for scalars. 2. We pass the original inputs to all pipeline stages so that the symbolic shapes can be resolved. * Further Improvements 1. Address comments. 2. Further reduce activation size by ~50% when pipeline is on. This is done by removing all but one gradient tensor from the last RecordEvent in the backward pass. * Address a comment * Fix Windows build	2020-06-12 21:43:50 -07:00
Hariharan Seshadri	b377266eb3	Fix Mac build linker warnings (#4155 )	2020-06-12 21:10:12 -07:00
Hariharan Seshadri	91a41298cc	Fix ORT build when onnxruntime_PYBIND_EXPORT_OPSCHEMA is enabled (#3954 )	2020-06-12 19:32:57 -07:00
Tracy Sharpe	155e22d1ab	MLAS: fuse float output into quantized GEMM (#4215 ) Add more variants of MlasGemm that do a u8x8 GEMM with the output type as float. This fuses the common sequence of MatMulInteger + Cast + Mul(OutputScale) + optional Add(BiasVector).	2020-06-12 17:50:40 -07:00
Tiago Koji Castro Shibata	2e3607c7cd	Remove hardcoded desktop lib (#4193 )	2020-06-12 16:51:54 -07:00
Edward Chen	f74861841e	Fix dangling pointer to local string variable in onnxruntime_pybind_state.cc.	2020-06-12 14:28:39 -07:00
Edward Chen	6b4f652017	Clean up status checks in gradient_graph_builder_test.cc.	2020-06-12 14:28:39 -07:00
Edward Chen	7096e6f5ef	Reduce severity of GraphAugmenter logging statement.	2020-06-12 14:28:39 -07:00
Changming Sun	6f4320fb85	Fix the python package name issue (#4207 ) Fix the package name issue. In my last change (#4197), which enabled code signing, I forgot to pass the additional flags to setup.py.	2020-06-12 08:32:59 -07:00
Yufeng Li	87d68d8531	matmul integer fusion (#4195 ) * Introduce DynamicQuantizeMatMul It fuses DynamicQuantizeLinear, MatMul and following cast, multiplier. It gets float in and float out for quantized matmul. We have a MLAS kernel in implementation for this op.	2020-06-11 21:42:09 -07:00
Tianlei Wu	2605faef88	Add past state support in Attention Op for GPT-2 (#4107 ) Update Attention op to allow past state input and output. Add fusion script and tests	2020-06-11 14:19:55 -07:00
pengwa	e6ccb1ac28	GatherNDGrad for CPU (#4123 ) * GatherNDGrad on CPU * Remove __CUDA_ARCH__ check in .cc files	2020-06-12 02:43:49 +08:00
Xueyun Zhu	65a682354b	enable pipeline to run with mixed precision (#4113 ) * enable pipeline to run with mixed precision * address feedback * address feedback * test log * pipe information if test fails * ci failure	2020-06-10 22:16:24 -07:00
Changming Sun	8f8d899bf2	Enable code sign in c api pipeline and python pipeline	2020-06-10 19:31:22 -07:00
Yulong Wang	73bc6be5d1	build: split nodejs binding build and test to avoid timeout issue (#4188 ) * split nodejs binding build and test * enable nodejs tests	2020-06-10 19:16:32 -07:00
Matthew Hill	117b2e7743	Fix GPU memory leak on TensorRT (#4172 )	2020-06-10 16:56:51 -07:00
Dmitri Smirnov	af0750ba1b	Java GPU artifact naming (#4179 ) Modify gradle build so artifactID has _gpu for GPU builds. Pass USE_CUDA flag on CUDA build. Adjust publishing pipelines to extract POM from the correct path. Co-Authored-By: @Craigacp	2020-06-10 11:15:48 -07:00
George Wu	e8ed14bcb3	disable MEMLEAK CHECKER for openvino	2020-06-10 11:12:17 -07:00
stevenlix	c296884fc3	bump up ORT version to 1.3.1 (#4181 )	2020-06-10 08:44:03 -07:00
Changming Sun	c0bdbc0b39	Enable telemetry for the C API and python pipeline (#4174 )	2020-06-10 00:07:46 -07:00
Tracy Sharpe	35d9f396c4	MLAS: refactor quantized GEMM loops (#4182 )	2020-06-09 23:28:55 -07:00
George Wu	9d65ce53bc	move back to toolset 14.16 to possibly work around nvcc bug (#4180 )	2020-06-09 19:36:30 -07:00
Changming Sun	a7366d82af	Disable nuphar large model test (#4173 ) Disable nuphar large model test because it takes too long (40+ minutes), while the default CPU provider takes about 5 minutes. After this change we still keep many other nuphar model tests, which should be enough.	2020-06-09 17:45:17 -07:00
Ashwini Khade	9eba9fba7c	Fix for BiasGelu fusion optimizer (#4160 ) * Fix for BiasGelu fusion optimizer * changes per review comments	2020-06-09 14:33:34 -07:00
Yulong Wang	2b3ce1b090	add script to support update nodejs binding version (#4164 )	2020-06-09 13:12:55 -07:00
Sheil Kumar	4377ff4a1a	Enable .NET Core 2.0 and .NET Framework 4.6.1 in Microsoft.AI.MachineLearning NuGet package (#4125 ) * add project to download cswinrt and build winrt c# interop dll * Add to nuget package * reverse if check * run generation before core compile * add generated files to compile * update .net package to binplace native libs * add props to .netstandard2.0 folder * auto binplace ml native binaries * force 'Any CPU' platform build * Fix anycpu and platform targets * fix flake errors * fix variable order * fix flake pep8 errors, semicolon Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-06-09 09:08:19 -07:00
Scott McKay	28d12dc4f0	Try to avoid std::move in return whilst keeping CentOS build happy. (#4163 )	2020-06-09 21:41:49 +10:00
oak-tree	541eafb41a	Fixed the link to model test documentation (#4011 )	2020-06-08 17:27:55 -07:00
Changming Sun	2ab3a19728	Enlarge the read buffer size in C#/Java test code (#4150 ) 1. Enlarge the read buffer size further, so that our code can run even faster. TODO: apply similar changes to Python and some other language bindings. 2. Add coreml_VGG16_ImageNet to the test exclusion set of x86_32. It is not a new model, but previously we didn't run the test against x86_32.	2020-06-08 16:13:11 -07:00
Tiago Koji Castro Shibata	8eb6a539bd	Hardcode WinML tests umbrella lib (#4161 )	2020-06-08 15:24:08 -07:00
suffiank	7f5339505e	Discover trainable parameters using reverse DFS from loss node (#4116 ) Discover trainable parameters using reverse DFS from loss node, omitting recursion along untrainable inputs. Co-authored-by: suffian khan <sukha@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: suffian khan <sukha@microsoft.com>	2020-06-08 14:16:10 -07:00
Yulong Wang	842be1535d	[Node.js binding] add linux and mac package (#4157 ) * try mac pipeline * fix path separator * copy prebuilds folder * split esrp yaml for win/mac * disable mac signing temporarily * add linux * fix indent * add nodetool in linux * add nodetool in win-ci-2019 * replace linux build by custom docker scripts * use manylinux as node 12.16 not working on centos6 * try ubuntu * loosen timeout for test case - multiple runs calls	2020-06-08 14:12:05 -07:00
Sergii Dymchenko	653417ae4b	Fix scaler->scalar typo. (#4142 )	2020-06-08 13:02:12 -07:00
Tiago Koji Castro Shibata	6bbd18efd0	Hardcode WinML umbrella lib to windowsapp.lib (#4133 )	2020-06-08 11:04:44 -07:00
Wenbing Li	ee35320974	Fixes for Python scripts in ONNXRuntime (#4135 ) * Fixes for Python scripts in ONNXRuntime * update according to the comments	2020-06-08 10:27:32 -07:00
Faith Xu	3390431d80	Update MCR image table (#4137 )	2020-06-08 10:13:13 -07:00
Changming Sun	5a5f44eed7	Add softmax to the mnist example (#4149 )	2020-06-08 09:33:50 -07:00
Dmitri Smirnov	4e1dac67cd	Address memory leak and improve memory handling (#4124 ) Fix memory leak when a Python list passed as a feed. Create a custom allocator that can take ownership of python arrays that are created inside pybind. Allow direct memory use if continuous array is a copy because we now can take ownership of it by the allocator.	2020-06-08 09:29:46 -07:00
Cecilia Liu	b8db8076cb	Fix MKLML Tests Run (#4144 ) Add a path to LD_LIBRARY_PATH to fix library not found error when running mklml test cases.	2020-06-06 20:28:53 -07:00
Tianlei Wu	7c8e1580a1	Add check of graph output in Bert Fusions (#4126 ) * Refine node output edge checking in bert related fusions	2020-06-06 00:06:07 -07:00
liqunfu	ffed43e9b8	handle loss and name marching wrappers (#4066 ) * handle loss and name marching wrappers Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-06-05 23:34:26 -07:00
Yulong Wang	2aab20b4ea	[Node.js binding] upgrade node-addon-api to 3.0 (#4148 )	2020-06-05 21:24:34 -07:00
Miguel de Icaza	ea368f69db	Add Swift/macOS sample, a port of the Windows MNist sample	2020-06-05 21:16:41 -07:00
Yulong Wang	2e58097f8f	fix build: pipeline Node.js version to 12.16.3 (#4145 )	2020-06-05 17:56:03 -07:00
Bowen Bao	1e5307d458	Bug fix for parameter names of models not using wrapper (#4061 ) * bug fix for models not using wrapper * add test case for no wrapper case * update test case to use internal learning rate * fix bug with frozen weight update	2020-06-05 12:03:38 -07:00
Scott McKay	9790e19424	Handle mem pattern allocation failure better. Make BFCArena behavior more consistent (#4062 ) * Fixes from investigating issue running BERT-Squad model with larger batch sizes. When the batch size gets large enough the initial run will be successful (no memory pattern in use) but the second will fail to allocate the memory pattern block. The cause of this failure is that we still have the smaller blocks from the first run allocated, as BFCArena has no logic to free those. This essentially results in 2x the memory being required to run the model. There was inconsistency in BFCArena::Extend which on one path threw an exception if it couldn't do the allocation, and on another just returned false (resulting in Alloc returning a nullptr). Make the behavior consistent by always throwing if BFCArena fails to find a buffer to return. There are a huge number of places in the code where we assume Alloc returns a valid pointer so throwing will result in more correct behavior as a whole. It's also consistent with what happens when CUDA or the standard library fails to allocate memory. Next, update ExecutionFrame to check for this failure and not insert a memory block entry if it happens. With the existing code if BFCArena Alloc returned a nullptr we happily inserted that in the blocks, delaying detection of the failure to when we attempted to use the block in AllocateMLValueTensorSelfOwnBufferHelper. Finally update AllocateMLValueTensorSelfOwnBufferHelper to expect a location may not have a block. A log message will be provided when the block allocation fails so it's not necessary to have more on each individual allocation that would have used the block. Falls through to default behavior of doing a normal allocation.	2020-06-05 18:54:01 +10:00
Thiago Crepaldi	81101c9efd	Fix DropoutGrad op (#4052 ) Dropout op was recently changed to accept a new input named 'training_mode', which is passed in to DropoutGrad automatically. This PR updates the DropoutGrad schema to accommodate the new input. Tests were also updated to reflect the API change. Co-authored-by: Thiago Crepaldi <thiag.crepaldi@microsoft.com>	2020-06-04 15:00:02 -07:00
Dmitri Smirnov	6199ef1375	Change group id to com.microsoft.onnxruntime per requirements.	2020-06-03 22:30:13 -07:00
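
Commit 7f5339505e describes discovering trainable parameters with a reverse DFS from the loss node that does not recurse along untrainable inputs. The following is a minimal sketch of that idea only, not the onnxruntime implementation: the graph representation (a dict mapping each tensor name to the inputs of the node that produces it), the function name, and all tensor names are hypothetical.

```python
from typing import Dict, List, Set


def discover_trainable_parameters(
    producers: Dict[str, List[str]],  # hypothetical: tensor -> inputs of its producer node
    loss: str,
    parameters: Set[str],
    untrainable: Set[str],
) -> Set[str]:
    """Walk backwards from the loss and collect the parameters it depends on."""
    found: Set[str] = set()
    visited: Set[str] = set()
    stack = [loss]
    while stack:
        name = stack.pop()
        # Skip tensors already seen and stop at untrainable inputs.
        if name in visited or name in untrainable:
            continue
        visited.add(name)
        if name in parameters:
            found.add(name)
            continue
        # Follow the edges in reverse: from a tensor to the inputs of its producer.
        stack.extend(producers.get(name, []))
    return found


# Toy graph (all names are made up for illustration).
graph = {
    "loss": ["logits", "labels"],
    "logits": ["hidden", "w2", "frozen"],
    "hidden": ["x", "w1"],
    "frozen": ["w4"],          # sits behind an untrainable input
    "unused_out": ["x", "w3"],  # does not feed the loss
}
trainable = discover_trainable_parameters(
    graph,
    loss="loss",
    parameters={"w1", "w2", "w3", "w4"},
    untrainable={"labels", "x", "frozen"},
)
```

In this toy graph only `w1` and `w2` are collected: `w3` never reaches the loss, and `w4` is only reachable through an input marked untrainable.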


2695 commits
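
Commit 155e22d1ab mentions fusing the sequence MatMulInteger + Cast + Mul(OutputScale) + optional Add(BiasVector) into a quantized GEMM with float output. The sketch below illustrates, in plain Python, why the fused form gives the same result as the four separate steps. It is an assumption-laden toy, not MLAS code: both function names are hypothetical, zero points are ignored, and the scale is a single scalar.

```python
def unfused(a, b, scale, bias=None):
    """Run the four steps separately, materializing each intermediate."""
    # MatMulInteger: integer accumulation.
    acc = [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]
    # Cast: integer accumulator to float.
    as_float = [[float(v) for v in row] for row in acc]
    # Mul(OutputScale).
    scaled = [[v * scale for v in row] for row in as_float]
    # Optional Add(BiasVector), one bias value per output column.
    if bias is not None:
        scaled = [[v + bias[j] for j, v in enumerate(row)] for row in scaled]
    return scaled


def fused(a, b, scale, bias=None):
    """Apply cast, scale, and bias as each accumulator is finished."""
    columns = len(b[0])
    out = []
    for row in a:
        out_row = []
        for j in range(columns):
            acc = 0
            for k, a_val in enumerate(row):
                acc += a_val * b[k][j]
            value = float(acc) * scale
            if bias is not None:
                value += bias[j]
            out_row.append(value)
        out.append(out_row)
    return out


# Small 8-bit-range inputs chosen for illustration.
a_u8 = [[1, 2], [3, 4]]
b_u8 = [[5, 6], [7, 8]]
fused_result = fused(a_u8, b_u8, scale=0.5, bias=[1.0, -1.0])
unfused_result = unfused(a_u8, b_u8, scale=0.5, bias=[1.0, -1.0])
```

The fused version never stores the intermediate integer and float matrices, which is the general motivation for this kind of fusion.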