2687 commits

Author SHA1 Message Date
Edward Chen
6b4f652017 Clean up status checks in gradient_graph_builder_test.cc. 2020-06-12 14:28:39 -07:00
Edward Chen
7096e6f5ef Reduce severity of GraphAugmenter logging statement. 2020-06-12 14:28:39 -07:00
Changming Sun
6f4320fb85
Fix the python package name issue (#4207)
Fix the Python package name issue. In my last change (#4197), which enabled code signing, I forgot to pass the additional flags to setup.py,
2020-06-12 08:32:59 -07:00
Yufeng Li
87d68d8531
matmul integer fusion (#4195)
* Introduce DynamicQuantizeMatMul
It fuses DynamicQuantizeLinear, MatMul, and the cast and multiplier that follow. It takes float input and produces float output for a quantized matmul. The op is implemented with an MLAS kernel.
2020-06-11 21:42:09 -07:00
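The fusion described in commit 87d68d8531 can be pictured with a small pure-Python sketch. It is not the MLAS kernel or the ONNX Runtime implementation. It assumes uint8 quantization following the ONNX DynamicQuantizeLinear formula, and it assumes the weight matrix is already quantized. All function names here are hypothetical, chosen for illustration only.

```python
def dynamic_quantize_linear(x):
    """Quantize a float matrix to uint8; returns (quantized, scale, zero_point)."""
    flat = [v for row in x for v in row]
    lo = min(0.0, min(flat))          # the range must include zero
    hi = max(0.0, max(flat))
    scale = (hi - lo) / 255.0
    if scale == 0.0:                  # all-zero input: any positive scale works
        scale = 1.0
    zero_point = int(min(255, max(0, round(-lo / scale))))
    q = [[int(min(255, max(0, round(v / scale) + zero_point))) for v in row]
         for row in x]
    return q, scale, zero_point


def matmul_integer(a_q, a_zp, b_q, b_zp):
    """Integer matmul on zero-point-corrected values."""
    cols = list(zip(*b_q))
    return [[sum((a - a_zp) * (b - b_zp) for a, b in zip(row, col))
             for col in cols]
            for row in a_q]


def dynamic_quantize_matmul(a, b_q, b_scale, b_zp):
    """Float in, float out: quantize A, integer matmul, cast, then rescale."""
    a_q, a_scale, a_zp = dynamic_quantize_linear(a)
    acc = matmul_integer(a_q, a_zp, b_q, b_zp)
    multiplier = a_scale * b_scale    # the trailing "multiplier" step
    return [[float(v) * multiplier for v in row] for row in acc]


# Usage: quantize the weights once, then run the fused op on float activations.
weights = [[0.25, 1.0], [-1.5, 2.0]]
w_q, w_scale, w_zp = dynamic_quantize_linear(weights)
activations = [[1.0, -2.0], [0.5, 3.0]]
approx = dynamic_quantize_matmul(activations, w_q, w_scale, w_zp)
```

The result approximates the float product up to quantization error, which is why the caller sees float in and float out even though the inner loop is integer arithmetic.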
Tianlei Wu
2605faef88
Add past state support in Attention Op for GPT-2 (#4107)
Update Attention op to allow past state input and output.
Add fusion script and tests
2020-06-11 14:19:55 -07:00
pengwa
e6ccb1ac28
GatherNDGrad for CPU (#4123)
* GatherNDGrad on CPU

* Remove __CUDA_ARCH__ check in .cc files
2020-06-12 02:43:49 +08:00
Xueyun Zhu
65a682354b
enable pipeline to run with mixed precision (#4113)
* enable pipeline to run with mixed precision

* address feedback

* address feedback

* test log

* pipe information if test fails

* ci failure
2020-06-10 22:16:24 -07:00
Changming Sun
8f8d899bf2 Enable code sign in c api pipeline and python pipeline 2020-06-10 19:31:22 -07:00
Yulong Wang
73bc6be5d1
build: split nodejs binding build and test to avoid timeout issue (#4188)
* split nodejs binding build and test

* enable nodejs tests
2020-06-10 19:16:32 -07:00
Matthew Hill
117b2e7743
Fix GPU memory leak on TensorRT (#4172) 2020-06-10 16:56:51 -07:00
Dmitri Smirnov
af0750ba1b
Java GPU artifact naming (#4179)
Modify gradle build so artifactID has _gpu for GPU builds.
  Pass USE_CUDA flag on CUDA build
  Adjust publishing pipelines to extract POM from a correct path.

Co-Authored-By: @Craigacp
2020-06-10 11:15:48 -07:00
George Wu
e8ed14bcb3 disable MEMLEAK CHECKER for openvino 2020-06-10 11:12:17 -07:00
stevenlix
c296884fc3
bump up ORT version to 1.3.1 (#4181) 2020-06-10 08:44:03 -07:00
Changming Sun
c0bdbc0b39
Enable telemetry for the C API and python pipeline (#4174) 2020-06-10 00:07:46 -07:00
Tracy Sharpe
35d9f396c4
MLAS: refactor quantized GEMM loops (#4182) 2020-06-09 23:28:55 -07:00
George Wu
9d65ce53bc
move back to toolset 14.16 to possibly work around nvcc bug (#4180) 2020-06-09 19:36:30 -07:00
Changming Sun
a7366d82af
Disable nuphar large model test (#4173)
Disable the nuphar large model test because it takes too long (40+ minutes), while the default CPU provider takes about 5 minutes. After this change we still keep many other nuphar model tests, which I think should be enough.
2020-06-09 17:45:17 -07:00
Ashwini Khade
9eba9fba7c
Fix for BiasGelu fusion optimizer (#4160)
* Fix for BiasGelu fusion optimizer

* changes per review comments
2020-06-09 14:33:34 -07:00
Yulong Wang
2b3ce1b090
add script to support update nodejs binding version (#4164) 2020-06-09 13:12:55 -07:00
Sheil Kumar
4377ff4a1a
Enable .NET Core 2.0 and .NET Framework 4.6.1 in Microsoft.AI.MachineLearning NuGet package (#4125)
* add project to download cswinrt and build winrt c# interop dll

* Add to nuget package

* reverse if check

* run generation before core compile

* add generated files to compile

* update .net package to binplace native libs

* add props to .netstandard2.0 folder

* auto binplace ml native binaries

* force 'Any CPU' platform build

* Fix anycpu and platform targets

* fix flake errors

* fix variable order

* fix flake pep8 errors, semicolon

Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
2020-06-09 09:08:19 -07:00
Scott McKay
28d12dc4f0
Try to avoid std::move in return whilst keeping CentOS build happy. (#4163) 2020-06-09 21:41:49 +10:00
oak-tree
541eafb41a
Fixed the link to model test documentation (#4011) 2020-06-08 17:27:55 -07:00
Changming Sun
2ab3a19728
Enlarge the read buffer size in C#/Java test code (#4150)
1. Enlarge the read buffer size further, so that our code can run even faster. TODO: apply similar changes to Python and some other language bindings.
2. Add coreml_VGG16_ImageNet to the test exclusion set of x86_32. It is not a new model but previously we didn't run the test against x86_32.
2020-06-08 16:13:11 -07:00
Tiago Koji Castro Shibata
8eb6a539bd
Hardcode WinML tests umbrella lib (#4161) 2020-06-08 15:24:08 -07:00
suffiank
7f5339505e
Discover trainable parameters using reverse DFS from loss node (#4116)
Discover trainable parameters using reverse DFS from loss node, omitting recursion along untrainable inputs.

Co-authored-by: suffian khan <sukha@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
Co-authored-by: suffian khan <sukha@microsoft.com>
2020-06-08 14:16:10 -07:00
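The traversal named in commit 7f5339505e can be sketched as follows. This is a minimal illustration, not the code from the PR. It assumes the graph is given as a mapping from each tensor name to the input names of the node that produces it, and that `initializers` and `untrainable` are plain sets of tensor names. The function name and all of these parameters are hypothetical.

```python
def discover_trainable_parameters(node_inputs, loss_output, initializers,
                                  untrainable=frozenset()):
    """Walk backwards from the loss and collect the weights that feed it.

    node_inputs:  dict mapping a tensor name to the input tensor names of
                  the node that produces it (hypothetical representation).
    loss_output:  name of the loss tensor where the reverse DFS starts.
    initializers: set of weight tensor names (leaves of the graph).
    untrainable:  tensor names the search must not recurse along.
    """
    trainable = set()
    visited = set()
    stack = [loss_output]
    while stack:
        name = stack.pop()
        if name in visited or name in untrainable:
            continue                      # skip repeats and untrainable inputs
        visited.add(name)
        if name in initializers:
            trainable.add(name)           # a weight reachable from the loss
            continue
        stack.extend(node_inputs.get(name, ()))
    return trainable


# Usage: w3 is only reachable through the untrainable "mask" input,
# and w4 does not feed the loss at all.
graph = {
    "h": ["x", "w1"],
    "logits": ["h", "w2"],
    "mask": ["w3"],
    "loss": ["logits", "labels", "mask"],
    "aux": ["w4"],
}
params = discover_trainable_parameters(
    graph, "loss", initializers={"w1", "w2", "w3", "w4"}, untrainable={"mask"})
```

Starting from the loss rather than from the weights means parameters that never influence the loss are left out without any extra bookkeeping.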
Yulong Wang
842be1535d
[Node.js binding] add linux and mac package (#4157)
* try mac pipeline

* fix path separator

* copy prebuilds folder

* split esrp yaml for win/mac

* disable mac signing temporarily

* add linux

* fix indent

* add nodetool in linux

* add nodetool in win-ci-2019

* replace linux build by custom docker scripts

* use manylinux as node 12.16 not working on centos6

* try ubuntu

* loosen timeout for test case - multiple runs calls
2020-06-08 14:12:05 -07:00
Sergii Dymchenko
653417ae4b
Fix scaler->scalar typo. (#4142) 2020-06-08 13:02:12 -07:00
Tiago Koji Castro Shibata
6bbd18efd0
Hardcode WinML umbrella lib to windowsapp.lib (#4133) 2020-06-08 11:04:44 -07:00
Wenbing Li
ee35320974
Fixes for Python scripts in ONNXRuntime (#4135)
* Fixes for Python scripts in ONNXRuntime

* update according to the comments
2020-06-08 10:27:32 -07:00
Faith Xu
3390431d80
Update MCR image table (#4137) 2020-06-08 10:13:13 -07:00
Changming Sun
5a5f44eed7
Add softmax to the mnist example (#4149) 2020-06-08 09:33:50 -07:00
Dmitri Smirnov
4e1dac67cd
Address memory leak and improve memory handling (#4124)
Fix memory leak when a Python list is passed as a feed.
  Create a custom allocator that can take ownership of Python
  arrays that are created inside pybind.
  Allow direct memory use if a contiguous array is a copy, because
  the allocator can now take ownership of it.
2020-06-08 09:29:46 -07:00
Cecilia Liu
b8db8076cb
Fix MKLML Tests Run (#4144)
Add a path to LD_LIBRARY_PATH to fix library not found error when running mklml test cases.
2020-06-06 20:28:53 -07:00
Tianlei Wu
7c8e1580a1
Add check of graph output in Bert Fusions (#4126)
* Refine node output edge checking in bert related fusions
2020-06-06 00:06:07 -07:00
liqunfu
ffed43e9b8
handle loss and name marching wrappers (#4066)
* handle loss and name marching wrappers

Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-06-05 23:34:26 -07:00
Yulong Wang
2aab20b4ea
[Node.js binding] upgrade node-addon-api to 3.0 (#4148) 2020-06-05 21:24:34 -07:00
Miguel de Icaza
ea368f69db Add Swift/macOS sample, a port of the Windows MNist sample 2020-06-05 21:16:41 -07:00
Yulong Wang
2e58097f8f
fix build: pipeline Node.js version to 12.16.3 (#4145) 2020-06-05 17:56:03 -07:00
Bowen Bao
1e5307d458
Bug fix for parameter names of models not using wrapper (#4061)
* bug fix for models not using wrapper

* add test case for no wrapper case

* update test case to use internal learning rate

* fix bug with frozen weight update
2020-06-05 12:03:38 -07:00
Scott McKay
9790e19424
Handle mem pattern allocation failure better. Make BFCArena behavior more consistent (#4062)
* Fixes from investigating issue running BERT-Squad model with larger batch sizes. When the batch size gets large enough the initial run will be successful (no memory pattern in use) but the second will fail to allocate the memory pattern block.

The cause of this failure is that we still have the smaller blocks from the first run allocated, as BFCArena has no logic to free those. This essentially results in 2x the memory being required to run the model.

There was inconsistency in BFCArena::Extend which on one path threw an exception if it couldn't do the allocation, and on another just returned false (resulting in Alloc returning a nullptr). Make the behavior consistent by always throwing if BFCArena fails to find a buffer to return. There are a huge number of places in the code where we assume Alloc returns a valid pointer so throwing will result in more correct behavior as a whole. It's also consistent with what happens when CUDA or the standard library fails to allocate memory.

Next, update ExecutionFrame to check for this failure and not insert a memory block entry if it happens. With the existing code if BFCArena Alloc returned a nullptr we happily inserted that in the blocks, delaying detection of the failure to when we attempted to use the block in AllocateMLValueTensorSelfOwnBufferHelper.

Finally update AllocateMLValueTensorSelfOwnBufferHelper to expect a location may not have a block. A log message will be provided when the block allocation fails so it's not necessary to have more on each individual allocation that would have used the block. Falls through to default behavior of doing a normal allocation.
2020-06-05 18:54:01 +10:00
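The control flow described in commit 9790e19424 (allocation always raises on failure, the caller skips the pattern block, and each tensor falls back to a normal allocation) can be sketched with a toy model. This is not BFCArena or ExecutionFrame. `ToyArena` and `allocate_tensors` are hypothetical names, and a plain `bytearray` stands in for the default allocation path.

```python
class ToyArena:
    """Fixed-capacity arena whose alloc() raises instead of returning None."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0

    def alloc(self, size):
        if self.used + size > self.capacity:
            # Consistent behavior: always raise when no buffer can be returned.
            raise MemoryError(f"arena cannot allocate {size} bytes")
        self.used += size
        return bytearray(size)


def allocate_tensors(arena, pattern_block_size, tensor_sizes):
    """Try one block for the whole memory pattern, else allocate per tensor."""
    try:
        block = arena.alloc(pattern_block_size)
    except MemoryError:
        block = None                      # do not record a block entry
    buffers = []
    offset = 0
    for size in tensor_sizes:
        if block is not None:
            # Carve the tensor out of the pre-allocated pattern block.
            buffers.append(memoryview(block)[offset:offset + size])
            offset += size
        else:
            # Fall through to a normal allocation (stand-in: a fresh bytearray).
            buffers.append(bytearray(size))
    return block, buffers


# Usage: the small arena cannot hold the 32-byte pattern block,
# so both tensors fall back to individual allocations.
small_block, small_buffers = allocate_tensors(ToyArena(16), 32, [8, 8])
large_arena = ToyArena(64)
large_block, large_buffers = allocate_tensors(large_arena, 32, [8, 8])
```

Raising at the point of failure keeps the error next to its cause, instead of surfacing later when a missing block is first used.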
Thiago Crepaldi
81101c9efd
Fix DropoutGrad op (#4052)
Dropout op was recently changed to accept a new input named
'training_mode', which is passed in to DropoutGrad automatically.

This PR updates the DropoutGrad schema to accommodate the new input.
Tests were also updated to reflect the API change.

Co-authored-by: Thiago Crepaldi <thiag.crepaldi@microsoft.com>
2020-06-04 15:00:02 -07:00
Dmitri Smirnov
6199ef1375 Change group id to com.microsoft.onnxruntime per requirements. 2020-06-03 22:30:13 -07:00
Scott McKay
16cef90e29
General enhancements/cleanups to test exes (#4109)
* General enhancements/cleanups to test exes
  - Support running onnxruntime_perf_test with no output file
    - if you're profiling the output file is often unused and can be very large
  - Allow failure to override early success if doing multiple runs of a test using onnx_test_runner
    - e.g. if the second run fails that's more important as a final status
  - Clarify ownership semantics
  - Cleanup naming, line lengths, usage of references for required parameters etc.
2020-06-04 07:01:39 +10:00
Yufeng Li
197da135eb
Implement quantized Attention on cpu (#4111)
* Implement QAttention on CPU
* support QAttention in quantization tool
* refine attention code
* add more unit tests
2020-06-03 13:42:00 -07:00
Andrews548
62b44527e5
Add ArmNN Execution Provider (#3714)
* Add ArmNN Execution Provider

Add a new execution provider targeting Arm architecture based on ArmNN.
Validated on NXP i.MX8QM CPU with ResNet50, MobileNetv2 and VGG models.

reviewed-by: mike.caraman@nxp.com

* Minor fixes

- renamed onnxruntime_ARMNN_RELU_USECPU to onnxruntime_ARMNN_RELU_USE_CPU
- fixed acl typo

* remove extra includes. added exception for ArmNN in test

* fix indentation

* Separated the activation implementation from the cpu and fixed the blockage from the endif

Co-authored-by: Andrei-Alexandru <andrei-alexandru.avram@nxp.com>
2020-06-03 22:57:51 +05:30
Scott McKay
62af8da3f6 Use OrtMutex and OrtCondVar everywhere instead of std::mutex/std::condition_variable for consistency.
Needed to change the MissingTrack enum naming due to ort_mutex.h including Windows.h which #defines TRUE and FALSE (via inclusion of fdi_fci_types.h), breaking usage of MissingTrack::TRUE and MissingTrack::FALSE.
2020-06-03 08:42:16 -07:00
liqunfu
905c535626
still need to make the test stable. Lower the acc number a bit to make the test pass for now (#4117)
Co-authored-by: liqun fu <liqun@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-06-02 21:37:48 -07:00
KeDengMS
d63b90538e
Symbolic shape inference exit on models without onnx opset used (#4090)
* Symbolic shape inference exit on models without onnx opset used

* Temporary fix for ConvTranspose with symbolic input dims

Co-authored-by: Changming Sun <me@sunchangming.com>
2020-06-02 19:39:46 -07:00
KeDengMS
6f8a4f4cad Fix Nuphar test failure 2020-06-02 18:03:38 -07:00
KeDengMS
32d8a76f2f Fix Nuphar build in gcc 7 (Ubuntu 18.04) 2020-06-02 18:03:38 -07:00