659 commits

Author SHA1 Message Date
Raymond Yang
2a2de42bb2
Add docker image clean script (#844)
* Add docker image clean script

* Change the command so it does not generate a warning if no such image is present

* Update linux-gpu-ci-pipeline.yml

* Update linux-ci-pipeline.yml

* Update azure-pipelines-py-packaging.yml
2019-04-17 11:20:41 -07:00
Hector Li
f1af493b75
Fix some issues in CUDA GRU and ReduceSum (#845)
* Fix issues in the GRU GPU implementation. cudnnGetRNNWorkspaceSize could fail because some descriptors were defined as local variables and were destroyed.

* Fix the issue for ReduceSum. cudnnReduceTensor for ReduceSum has an issue when the input and output have the same size; in that case we just need to copy the data.
2019-04-17 09:48:08 -07:00
Rui Xia
9fb7e98c0b fix the allocator type in lru of cuda conv algorithm cache. (#848) 2019-04-16 23:58:58 -07:00
Ke Zhang
41dc3130f5
no need to put initializers (for constant nodes) into graph inputs. (#665)
* constant node should not be put into graph inputs any more.

* simplify graph input/output set logic.

* refactor comments.

* remove adding initializers as graph inputs when creating graph from scratch.
2019-04-17 07:38:08 +08:00
RandySheriffH
60d71d63b5
Rashuai/onnx test reduce mem (#790)
* define new test load function

* remove bak file

* add stat operator

* add arguments

* fix comments

* try enable fp16_tiny_yolov2 on linux

* fix compile err

* try enable fp16_tiny_yolov2
2019-04-16 15:47:52 -07:00
Tracy Sharpe
3a8b9a4918
fix trivial size_t warnings (#843) 2019-04-16 14:37:50 -07:00
Ashwini Khade
14d63b5f45
generate transformers bug fix (#838)
* fix graph transformer generation

* add more tests

* cosmetic changes

* more changes per review
2019-04-16 14:10:33 -07:00
Du Li
1818835795
Adding kernels for Resize op (#809)
* Adding the kernel for Resize op.

* Fixing a bug in nearest neighbour.

* remove gpu resize kernel.
will add it in another pr.

* fix a bug.

* Accommodating PR comments.
2019-04-16 13:05:40 -07:00
Pranav Sharma
29ad798c56
Update license - came up during IP scan (#841)
* support non-tensor types

* support non-tensor types.

* support non-tensor types.

* fix compilation issues

* fix compilation issues

* Build without mkldnn for release packages. We'll default to MLAS.

* Update license - came up during IP scan
2019-04-16 11:01:02 -07:00
Ashwini Khade
07e6dfa7ab
update onnx and enable tests for qlinearconv (#840) 2019-04-16 09:43:17 -07:00
jignparm
7775551a6f
Refactor C# and native packaging tests (#825)
* Refactor C# and native packaging tests

* Pass package name into docker

* add libiomp5ml.dll required by mklml.dll
2019-04-16 00:00:07 -07:00
Pranav Sharma
54e04cb8bb
cherry pick PR from 0.3.1 release - enable MSVC static runtime (#837) 2019-04-15 22:37:47 -07:00
Mika Fischer
b2658b3594 Cache CUDNN convolution benchmark results in cuda::Conv kernels (#712)
* Cache CUDNN convolution benchmark results in cuda::Conv kernels

Previously, the best convolution algorithm was determined by running
cudnnFindConvolutionForwardAlgorithmEx and cudnnFindConvolutionBackwardDataAlgorithmEx
on every shape change.

This is very detrimental for variable input shapes, such as variable batch
sizes.

This change adds a map to cache previously determined benchmark results.

The caching results in significant speedups for variable input shapes.

* Use LRU to limit cached benchmark results

* Only cache benchmark results for a fixed weight shape

In case the weight shape changes, all cached results are discarded.

* Use padded shape as key for cached benchmarks

* Add constant for max number of cached benchmark results

* Use unordered_map to store cached benchmark results

* Only store the parameters that are actually needed
2019-04-15 22:15:14 -07:00
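The caching scheme described in #712 can be sketched roughly as follows. This is an illustration in Python, not the code from the commit: `BenchmarkCache`, `MAX_CACHED_RESULTS`, and the `run_benchmark` callback are hypothetical names, and the callback stands in for the expensive cuDNN algorithm search. It shows the pieces the message lists: an LRU-bounded map keyed on the padded input shape, discarded whenever the weight shape changes.

```python
from collections import OrderedDict

# Hypothetical limit; the commit adds a constant for this but its value is not shown here.
MAX_CACHED_RESULTS = 4


class BenchmarkCache:
    """LRU cache of benchmark results, valid for one weight shape at a time."""

    def __init__(self, max_entries=MAX_CACHED_RESULTS):
        self._max_entries = max_entries
        self._weight_shape = None
        self._results = OrderedDict()  # padded input shape -> benchmark result

    def lookup(self, weight_shape, padded_input_shape, run_benchmark):
        # A new weight shape invalidates every cached result.
        if weight_shape != self._weight_shape:
            self._results.clear()
            self._weight_shape = weight_shape

        if padded_input_shape in self._results:
            # Cache hit: mark the entry as most recently used.
            self._results.move_to_end(padded_input_shape)
            return self._results[padded_input_shape]

        # Cache miss: run the (expensive) benchmark once and remember it.
        result = run_benchmark(padded_input_shape)
        self._results[padded_input_shape] = result
        if len(self._results) > self._max_entries:
            # Evict the least recently used entry.
            self._results.popitem(last=False)
        return result
```

With variable batch sizes, each distinct shape pays for the benchmark once; repeated shapes are served from the map until they are evicted.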
Tracy Sharpe
f19d9a4907
Reduce code size of kernel registration (#833)
Some changes that reduce the size of the release onnxruntime.dll by 170KB:

Change the ONNX_OPERATOR_KERNEL macros to not create a unique virtual class per kernel create lambda, but instead use a generic class with the raw function address supplied at BuildCreateKernelInfo time.

Changed the execution providers to use a table driven approach to calling the BuildCreateKernelInfo functions instead of a massive function with construct/call/delete sequences.

The CreateFunc in data_types.h didn't need to be a std::function, eliminating more lambda virtual classes.

N.B. To accommodate MSVC 14.11 toolchain (used for CUDA builds), the operator+() syntax cannot be used to retrieve the raw function address. The older toolchain can't resolve between cdecl/vectorcall and gives up. An explicit cast is needed to help the compiler along.
2019-04-15 16:39:59 -07:00
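The table-driven registration idea from #833 might look like the sketch below. It is written in Python purely for illustration (the actual change is in C++ macros), and every name in it (`KernelCreateInfo`, `BUILD_INFO_TABLE`, `register_kernels`, the toy kernel classes) is an assumption rather than the project's API. The point is that one generic record holds a plain function reference, and a loop over a table replaces a long hand-written sequence of calls.

```python
from typing import Callable, Dict, NamedTuple


class KernelCreateInfo(NamedTuple):
    """One generic record type shared by every kernel (hypothetical layout)."""
    op_name: str
    create: Callable[[], object]  # plain function reference, no per-kernel class


class AddKernel:
    pass


class MulKernel:
    pass


def build_add_info() -> KernelCreateInfo:
    return KernelCreateInfo("Add", AddKernel)


def build_mul_info() -> KernelCreateInfo:
    return KernelCreateInfo("Mul", MulKernel)


# The table: adding a kernel means adding one entry here, not another
# construct/call/delete sequence inside a large function.
BUILD_INFO_TABLE = (build_add_info, build_mul_info)


def register_kernels(registry: Dict[str, Callable[[], object]]) -> Dict[str, Callable[[], object]]:
    """Walk the table and store each kernel's create function by op name."""
    for build_info in BUILD_INFO_TABLE:
        info = build_info()
        registry[info.op_name] = info.create
    return registry
```

In a compiled language the same shape tends to shrink code size because the loop body is emitted once, while the table itself is only data.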
Pranav Sharma
049ba2d747
Exclude tests that fail when contrib ops are disabled. (#835) 2019-04-15 15:57:48 -07:00
Pranav Sharma
4b4a359943
Exclude unreferenced global data and op doc strings in the opschema object. The first causes a decrease in the binary size by at least 85k. The latter reduces resident memory size. (#823)
* Exclude unreferenced global data and op doc strings in the opschema object. The first causes a decrease in the binary size by at least 85k. The latter reduces resident memory size.

* Update onnx to incorporate my PR that fixes SetDoc compiler warnings
2019-04-15 15:57:19 -07:00
Ashwini Khade
e999af61b2
bug fix for shape inference (#834) 2019-04-15 15:51:12 -07:00
Raymond Yang
fabdbdc130
Update test retrieval following #828 (#836)
* Enable nightly build

* Update fetch file names

* Fix

* Update setup.py

* Update run_dockerbuild.sh

* Resolve comments

* Update test data
2019-04-15 14:51:20 -07:00
Dmitri Smirnov
6194a92249
Fix empty input handling in Tokenizer. (#826) 2019-04-15 14:46:17 -07:00
Changming Sun
2c0b8e965e Disable test data local cache in Linux CI pipelines 2019-04-12 22:23:16 -07:00
Changming Sun
e493ba2219 Fix memory leaks in perf test runner 2019-04-12 00:51:33 -07:00
Raymond Yang
1936d141a7
Create nightly build for python packages (#817)
* Enable nightly build

* Update fetch file names

* Fix

* Update setup.py

* Update run_dockerbuild.sh

* Resolve comments
2019-04-11 22:06:18 -07:00
Tracy Sharpe
c55e2de593
Status class optimizations (#824)
optimize onnxruntime::common::Status to reduce code size
2019-04-11 21:57:01 -07:00
Hariharan Seshadri
b6936e71cb
Avoid postfix iterator increment in a loop in Slice op and some minor formatting fixes (#820)
* Initial commit

* Fix comment

* More nit fixes
2019-04-11 17:44:32 -07:00
Hariharan Seshadri
ccf3566c35 Register kernel for Dropout (opset 10) for opset compliance (#813) 2019-04-11 13:55:43 -07:00
Pranav Sharma
6577c3dddf
Extract debug symbols in a separate file and strip the binary. (#811)
* Ensure Linux binaries are built with debug info. Extract debug info out of the main binaries. Strip the main binaries.

* add binutils

* add uname

* add binutils

* remove linux portion
2019-04-11 12:02:50 -07:00
Ryan Hill
1ff29bfb3d
Fix x86 calling convention break (#814) 2019-04-11 10:41:07 -07:00
Hector Li
0741baf867
Update NMS to support max_output_boxes_per_class = 0. NMS will do nothing for this case. (#816) 2019-04-11 10:09:33 -07:00
Hariharan Seshadri
56749a84ee
Implement opset v10 changes for Slice operator (#772) 2019-04-10 22:06:05 -07:00
jignparm
53038b33ed BuildFusedKernelDef uses N^2 algorithm verifying input constraints; session load time is huge for fused nodes (#804)
This optimization is required for WinML to prevent unit test time out.
2019-04-11 09:52:10 +08:00
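The message for #804 does not say how the N^2 behaviour was removed, so the following is only a generic sketch of the usual remedy, with hypothetical function names: replace a linear scan per item with a hash-set membership test, which turns a quadratic pass into a roughly linear one while giving the same result.

```python
def unique_names_quadratic(names):
    """Collect distinct names in order; each membership test scans a list (O(N^2) overall)."""
    seen = []
    for name in names:
        if name not in seen:  # linear scan of everything collected so far
            seen.append(name)
    return seen


def unique_names_linear(names):
    """Same result, but membership is checked against a set (O(N) on average)."""
    seen = set()
    ordered = []
    for name in names:
        if name not in seen:  # constant-time lookup on average
            seen.add(name)
            ordered.append(name)
    return ordered
```

For a fused node with thousands of inputs, the difference between the two versions is the difference between millions of comparisons and thousands.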
jignparm
d17ae5c093
MKLML pipeline - update C# and CMake to handle dll dependencies (#810)
* Refactor NuGet to allow arbitrary namespaces

* Move csharp build to end of cmake

* Minor edit to ensure dll generation in sequence
2019-04-10 18:16:02 -07:00
Jesse Benson
24d80b4bda Add support for BrainSlice execution provider in Python, if onnxruntime is built with it. 2019-04-10 17:37:21 -07:00
Ashwini Khade
10b113f144
update onnx to bring in quantized ops (#808)
* update onnx + move quantized ops kernels and test to onnx + remove exp ops

* update onnx

* Revert "update onnx"

This reverts commit 533abfc297e75473a74505fb89921ffc05c46a1c.

* add generated csharp test file
2019-04-10 17:20:35 -07:00
Changming Sun
4bc3d6027d Build perf test runner only if onnxruntime_BUILD_SHARED_LIB is ON 2019-04-10 13:16:56 -07:00
Raymond Yang
3dcf82a1f9 Disable some flaky tests with CUDA9 (#805)
* Disable failing cuda tests
2019-04-10 01:02:32 -07:00
jignparm
4e3391ef60
Refactor NuGet to allow arbitrary PackageId names (e.g. Microsoft.ML.OnnxRuntime.MKLML) (#797)
* Refactor NuGet to allow arbitrary namespaces

* Move csharp build to end of cmake
2019-04-09 22:48:00 -07:00
Ashwini Khade
e7090d7202
move all removed exp ops to contrib ops (#786)
* move all removed exp ops to contrib ops

* fix cuda build failure

* bug fix

* move some tests to contrib ops + cosmetic changes

* Revert "move some tests to contrib ops + cosmetic changes"

This reverts commit 4cda9297e257a6f6b902724e8113bf5d5a62df29.
2019-04-09 22:26:48 -07:00
Changming Sun
0d4055def4 Integrate tensorflow into onnxruntime_perf_test tool 2019-04-09 15:55:08 -07:00
jignparm
9467c5f967
Update version to 0.3.1 (patch release) (#798)
* bump up version number (#752)

* bump up version number

* Minor change to kick off build

* update version to 0.3.1
2019-04-09 14:48:56 -07:00
Xavier Dupré
ccd7e801a0 Fix #612, TfidfVectorizer handles empty matrices as an input (#702)
* Fix #612, TfidfVectorizer handles empty matrices as an input

* Add more unit tests, better consistency of error messages

* Update tfidfvectorizer.cc

* better comment

* fix comments

* add unit test failure for an empty input {0, 1}
2019-04-09 10:55:24 -07:00
Yufeng Li
39951f35f4
Use template windows-build-tools-setup-steps.yml in win pipelines (#794)
1. Update nuget restore to 4.3 for capi pipeline
2. Use template windows-build-tools-setup-steps.yml in win pipelines.
2019-04-08 21:35:33 -07:00
jywu-msft
d91555f99e
fix for tensorrt_basic_test not being run. (#792) 2019-04-08 13:18:36 -07:00
Hariharan Seshadri
5cf72030b2
Rename misleading test names in ConvTranspose op tests (#788) 2019-04-06 17:01:26 -07:00
jywu-msft
571291c323 build.sh: don't require user to set --use_full_protobuf with --use_tensorrt option. we can set it implicitly. (#780)
* use_full_protobuf if tensorrt build option is enabled.

* update BUILD.md sections on MKLDNN and TensorRT/full_protobuf option
2019-04-06 10:11:57 -07:00
Yufeng Li
cea2a40bf1
Clean up ExecutionProvider in CSharp (#783) 2019-04-05 22:29:54 -07:00
Ryan Hill
fda1d0dce9
Ryanunderhill/ocr custom op (#744)
* Adding a custom op interface to the C API to remove shared library dependency.
* Remove old custom op test
* Rework how custom ops handle inputs/outputs to enable custom op output shape calculation in the compute method
* Add a nicer C++ API for custom ops and switch the tests to use it.
2019-04-05 18:53:20 -07:00
Tao Qin
58ef1306d4
Copy inputs and outputs directly in InferenceSession::SaveModelMetadata (#777)
* Copy required inputs and outputs directly in InferenceSession::SaveModelMetadata

* trivial

* trivial
2019-04-05 15:16:55 -07:00
ybrnathan
3eddb2d61e
Add optimization level as cmd line arguments (#776)
* Add optimization level as cmd line arguments

* fix the help info and add option.
2019-04-05 14:44:28 -07:00
utsabsingharoy
36ed91ee9f CustomRegistry should use composition instead of inheritance
2019-04-05 14:14:10 -07:00
Changming Sun
867e961ee8 Remove mkldnn_sgemm from math_util.cc
If it is needed, it can be used explicitly in mkldnn provider.
2019-04-05 14:13:10 -07:00