Commit graph

722 commits

Author SHA1 Message Date
nivas-x86
3b0dda0aca nGraph: Avoid input and output data copies (#940) 2019-04-30 12:10:28 -07:00
Raymond Yang
01cd7eaca8 Bump up onnx version (#936)
* bump up onnx version
2019-04-30 08:44:32 -07:00
Changming Sun
2d00c62005 perf test: support more data types for the TF backend 2019-04-29 20:23:33 -07:00
Konstantinos Karanasos
1b7d1f2645
Convert constant folding to a transformer (#866) 2019-04-29 18:12:49 -07:00
Sergii Dymchenko
5bcd77e488 Use O_TRUNC when saving ONNX models to prevent possible file corruption. (#887) 2019-04-29 13:45:25 -07:00
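As a minimal sketch of why the flag matters (not the code from this commit): opening an existing file for writing without `O_TRUNC` leaves the old tail in place when the new content is shorter. The helper names `write_file` and `demo`, and the file name `model.onnx`, are hypothetical and used only for illustration.

```python
import os
import tempfile


def write_file(path, data, truncate):
    """Write bytes to path, optionally truncating an existing file first."""
    flags = os.O_WRONLY | os.O_CREAT
    if truncate:
        flags |= os.O_TRUNC  # discard any previous content before writing
    fd = os.open(path, flags, 0o644)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def demo():
    """Return (bytes left without O_TRUNC, bytes left with O_TRUNC)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.onnx")  # hypothetical file name
        write_file(path, b"A" * 16, truncate=True)   # older, longer file
        write_file(path, b"BBBB", truncate=False)    # shorter overwrite
        with open(path, "rb") as f:
            stale = f.read()  # new bytes followed by the old tail
        write_file(path, b"BBBB", truncate=True)
        with open(path, "rb") as f:
            clean = f.read()  # only the new bytes
    return stale, clean
```

Without truncation, a reader of the shorter file would also see leftover bytes from the older one, which is the kind of corruption the commit title refers to.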
Du Li
1658efd953
Adding Resize CUDA kernel (#869)
* Adding resize cuda kernel

* Fix a bug.

* Fix CI errors

* Fix CI issues.

* update onnx

* fix a linux CI issue.

* fix cuda kernels
2019-04-29 13:35:38 -07:00
Maik Riechert
ded7eeb033 make builds more robust (#906) (#932) 2019-04-29 12:58:20 -07:00
Du Li
8d6114038f
Adding cuda Greater kernel (#933) 2019-04-29 10:30:51 -07:00
Ke Zhang
f39a8d1f59
allow users to set graph inputs and outputs fully. (#905)
* allow users to set graph inputs and outputs fully.

* update

* update the comments of the APIs

* update

* remove commented-out code.

* fix test failures.

* fix comments.

* add more checks to throw a not-supported exception for now.
2019-04-29 15:58:39 +08:00
jignparm
bb58806872 Adding versioned dlls to tar/zip packages (#928)
* Adding versioned dlls to tar/zip packages

* fix syntax error

* fix version name of dylib

* minor fix in the target

* update pattern for versioned dylib files
2019-04-28 22:44:49 -07:00
Dmitri Smirnov
93d798b8ca Add locale configuration doc. 2019-04-27 21:16:49 -07:00
hariharans29
71ddba7254 Disable flaky model test in CUDA build 2019-04-27 21:15:20 -07:00
jignparm
861b9fda45
Add link to build within Nuget package (#926)
* Add link to build within Nuget package

* Update buildID to build uri

* add url prefix to build id
2019-04-27 13:41:20 -07:00
Hariharan Seshadri
06e0f7e3e7
Minor changes to support inclusion of x86 bits in the Nuget packaging pipeline (#916)
* initial commit

* More changes

* More changes

* Adding stuff back to the targets xml

* More changes v3

* More changes v4

* More changes v5

* More changes v6

* More changes v7

* More changes v9

* Disable CSharp tests for now

* More changes

* Revert file to same status

* Update props file for x86

* Change to usage of TargetArchitecture instead of PlatformTarget

* Update targets.xml

* Minor formatting nit fix

* Update based on PR comments
2019-04-27 00:41:26 -07:00
Konstantinos Karanasos
7a42ffd15f
Fix in Slice Elimination (issue #885) (#918)
Slice elimination should not be triggered when starts or ends is negative; small fix in op set domain validation. Fixes issue #885.
2019-04-26 21:04:29 -07:00
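The rule in the commit above can be illustrated with a small sketch. This is not the transformer's actual code: the function `is_removable_slice`, its parameters, and the use of a known input shape are assumptions made for illustration. The idea is that a Slice is only a candidate for removal when it keeps every element, and any negative `starts` or `ends` value is treated conservatively because it counts from the end of the axis.

```python
INT_MAX = 2**31 - 1  # common "slice to the end" sentinel (assumption)


def is_removable_slice(starts, ends, axes, shape):
    """Return True only if the slice provably keeps the whole tensor.

    Hypothetical helper: starts/ends/axes are equal-length lists of ints,
    shape is the known input shape.
    """
    for start, end, axis in zip(starts, ends, axes):
        # Negative values index from the end of the axis, so the slice
        # may drop elements; do not eliminate it.
        if start < 0 or end < 0:
            return False
        # A no-op slice starts at 0 and ends at or past the axis length.
        if start != 0 or end < shape[axis]:
            return False
    return True
```

For example, `ends=[-1]` on an axis of length 4 drops the last element, so the node has to stay in the graph.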
Ashwini Khade
90544ed766
bump version number for release (#911)
* bump version number for release

* + review comments
2019-04-26 16:28:16 -07:00
nivas-x86
b5ea3973c4 nGraph EP disable new quant tests (#920)
* Disable quant op test cases

* enable ngraph quant ops
2019-04-26 14:21:20 -07:00
Faith Xu
3812c881f7 Readme updates (#912)
* Updates

* Minor updates

* Updates

* Updates

* Add version link

* Update README.md
2019-04-26 13:55:26 -07:00
Raymond Yang
38f1f69432
Add a temporary bypass of artifacts permission issue (#921)
* Try using blob

* Try using blob

* Update working directory

* Update windows-build-tools-setup-steps.yml
2019-04-26 13:34:41 -07:00
Ryan Hill
5ed3db914e
Update custom op help (#914)
* Update AddingCustomOp.md
2019-04-26 11:22:46 -07:00
Bowen Bao
0cef3b53df Fix scatter assertion offset failure (#913) 2019-04-26 11:20:53 -07:00
Hector Li
8633e9ffda
Fix an issue where ORT crashes if the model has a subgraph created by control-flow ops such as Loop, If, or Scan. (#917)
Root cause:
The crash is caused by a null thread pool in the op. The op is inside the subgraph. The thread pool is set on the session_state, but it is not passed to the session_state owned by the subgraph.
Fix:
Pass the thread pool to the session_state owned by the subgraph when it is created in CreateSubgraphSessionState.
2019-04-26 10:57:06 -07:00
Du Li
da0f9bf9a4
fixing a bug in resize op (#910)
* fixing a bug in resize op

* Enable onnx tests

* merge master
2019-04-26 06:48:51 -07:00
RandySheriffH
5e3a266709
Rashuai/top k0 (#909)
* allow opset 10 topk to accept K==0

* add test case

* return empty tensor

* fix comment
2019-04-25 21:32:49 -07:00
Yufeng Li
0d2181cf85
Remove parallelfor for certain ops (#908)
ParallelFor makes the MaxPool, Gather and Reduce ops slower. This PR:

* removes ParallelFor for those ops
* adds the Windows thread pool back for SGEMM.
2019-04-25 19:38:59 -07:00
Dmitri Smirnov
893b48e92a
Implement Mod operator (#900)
Implement Mod operator
2019-04-25 17:49:11 -07:00
Scott McKay
b8eaa88bd4
Migrate ReverseSequence from contrib op to ONNX opset 10 (#896)
* Update ONNX to 70c9026ca11b0af0050f8186bea6cab94636947f to pickup ReverseSequence op.

Copy ReverseSequence from contrib ops to ONNX (keep contrib op in this commit), and update to use int64_t for sequence_lens input.

* Copy ReverseSequence from contrib to ONNX and update to use int64_t for sequence_lens.

Maintain contrib op in this commit.

* Remove contrib op as it was temporary and only used internally.

* Remove contrib op schema defs.

* Cleanup contrib_defs.cc
2019-04-26 09:26:46 +10:00
jignparm
8ed3eed7b5
Fix ceil_mode not defined for mkldnn pool.cc warning (#907) 2019-04-25 12:32:28 -07:00
Pranav Sharma
80ac858016
Remove OSes/architectures that we don't build on and have no CI for. (#904)
* support non-tensor types

* support non-tensor types.

* support non-tensor types.

* fix compilation issues

* fix compilation issues

* Build without mkldnn for release packages. We'll default to MLAS.

* Remove OSes/architectures that we don't build on and have no CI for.
2019-04-25 11:48:48 -07:00
jignparm
125a77bec4
MaxPool+AvgPool - opset 10 [Dilation and ceil_mode] (#873)
* initial checkin for dilations and ceil support

* add unit tests for ceil_mode

* Update to use versioned_kernel for AveragePool

* update mkldnn/poolc..

* Add versioning to cuda_execution_providers.cc

* minor fix

* minor fix

* minor fix

* minor fix

* minor fix

* Folding PR comments

* Folded PR comments

* removed copy of dilations from contrib ops
2019-04-24 21:59:08 -07:00
Pranav Sharma
3ff97de8da
Modify roialign to conform with the new onnx spec and take it out from contrib ops. (#901)
* support non-tensor types

* support non-tensor types.

* support non-tensor types.

* fix compilation issues

* fix compilation issues

* Build without mkldnn for release packages. We'll default to MLAS.

* Modify roialign to conform with the new onnx spec and take it out from contrib ops.
2019-04-24 20:30:44 -07:00
Yufeng Li
73bc09421c Fix deadlock in parallel executor (#891)
Fix deadlock in parallel executor
  Execute immediately if ParallelFor has only 1 task
2019-04-24 15:55:04 -07:00
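The commit above does not describe the deadlock in detail, so the following is only a sketch of the general pattern, assuming a hypothetical `parallel_for` helper built on Python's `ThreadPoolExecutor`. If a task already running on a pool worker submits a single sub-task to the same pool and waits for it, it can hang when no other worker is free. Running a single task inline on the calling thread avoids the wait.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def parallel_for(pool, n_tasks, fn):
    """Run fn(0..n_tasks-1) and return the results in order.

    Hypothetical helper: a single task runs inline on the calling thread
    instead of being queued on the pool.
    """
    if n_tasks <= 0:
        return []
    if n_tasks == 1:
        return [fn(0)]  # execute immediately, no hand-off to the pool
    futures = [pool.submit(fn, i) for i in range(n_tasks)]
    return [f.result() for f in futures]


def nested_single_task():
    """Call parallel_for from inside the pool's only worker thread."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        outer = pool.submit(
            lambda: parallel_for(
                pool, 1, lambda i: threading.current_thread().name
            )
        )
        # Completes because the inner task never waits on the busy worker.
        return outer.result(timeout=5)
```

With the inline shortcut removed, `nested_single_task` would block: the only worker would be waiting on a future that only it could run.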
nivas-x86
ba3b82648e ng ep update1 (#895) 2019-04-24 10:35:26 -07:00
Dmitri Smirnov
95ac7a2f35
Implement separators as regex (#857)
* Implement separators as regex
2019-04-24 10:23:45 -07:00
Changming Sun
1f066d4dc4 Update onnx (#893) 2019-04-24 21:31:49 +10:00
Hariharan Seshadri
9d89b23d81
BatchNorm CPU does not support non-spatial cases - explicitly handle such cases (#890)
* BatchNorm CPU does not support non-spatial cases

* skip test in c#

* Update comments
2019-04-23 21:37:21 -07:00
Yufeng Li
d0f846aad5
Tuning GRU performance for batch size >= 2 (#644)
GRU with batch size > 1 was implemented on the assumption that Lotus uses single-threaded Eigen GEMM. That assumption no longer holds: MLAS and MKLML support multi-threading, and we no longer rely on Eigen GEMM.
This PR implements batch size > 1 the same way as batch size == 1. With this change, we get about a 2x performance gain for GRU. Please refer to the performance test below:
Batch_size | Seq_length | input_size | hidden_size | Old (ms) | New (ms)
8 | 30 | 512 | 512 | 19.16 | 10.47
16 | 30 | 512 | 512 | 28.13 | 15.15
32 | 30 | 512 | 512 | 36.97 | 26.89
8 | 30 | 1024 | 1024 | 142.853 | 55.67
16 | 30 | 1024 | 1024 | 184.397 | 72.32
32 | 30 | 1024 | 1024 | 236.364 | 112.78
2019-04-23 14:50:11 -07:00
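The "about 2x" figure in the commit above can be checked against its own table. The sketch below copies the timings from the commit message and divides old by new; the last row assumes the column separator missing in the original falls between `1024` and `236.364`. The names `GRU_TIMINGS_MS` and `speedups` are made up for this example.

```python
# (batch_size, seq_length, input_size, hidden_size, old_ms, new_ms)
GRU_TIMINGS_MS = [
    (8, 30, 512, 512, 19.16, 10.47),
    (16, 30, 512, 512, 28.13, 15.15),
    (32, 30, 512, 512, 36.97, 26.89),
    (8, 30, 1024, 1024, 142.853, 55.67),
    (16, 30, 1024, 1024, 184.397, 72.32),
    (32, 30, 1024, 1024, 236.364, 112.78),  # separator position assumed
]


def speedups(rows):
    """Return old/new time ratios, one per row, rounded to 2 decimals."""
    return [round(old / new, 2) for *_, old, new in rows]


if __name__ == "__main__":
    for row, ratio in zip(GRU_TIMINGS_MS, speedups(GRU_TIMINGS_MS)):
        print(f"batch={row[0]:>2} hidden={row[3]:>4} speedup={ratio}x")
```

On these numbers the ratios fall between roughly 1.4x and 2.6x, with the larger hidden size showing the bigger gains.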
Changming Sun
80d69515ed
C API: catch exceptions in OrtCreateStatus (#821) 2019-04-23 14:41:44 -07:00
Changming Sun
11806529d0
Update test data (#864)
Add:

1. mxnet_arcface
2. tf_mobilenet_v1_1.0_224
3. tf_mobilenet_v2_1.0_224
4. tf_mobilenet_v2_1.4_224
5. tf_inception_v2
2019-04-23 13:24:24 -07:00
Hariharan Seshadri
4b4b585f58
Fix minor bugs in RemoveDuplicateGraphTransformer (#883) 2019-04-23 11:30:20 -07:00
Ashwini Khade
fb3b63438d
Add python api for graph optimization level (#882) 2019-04-23 11:11:35 -07:00
ybrnathan
b0a37477db Fix memory corruption issue when CPU->CUDA memcpy is involved (#879) 2019-04-22 20:21:14 -07:00
Dmitri Smirnov
7d7627b1ac
Implement IsInf (#871)
* Implement IsInf.
2019-04-22 17:45:54 -07:00
Yufeng Li
0bf12e9dbf
Add option to enable/disable memory pattern back (#872)
Memory pattern doesn't work for the parallel executor by design. Enabling memory pattern for the parallel executor logs a warning and hurts performance.
Add option to enable/disable memory pattern back.
2019-04-22 13:49:41 -07:00
Hector Li
e8d722003a
Move NMS to Onnx domain (#865)
* move files

* move files

* Remove NonMaxSuppression from Contrib op, move it to Onnx domain, opset 10

* move NMS out of namespace contrib

* update data type in UT

* update to latest onnx

* white list the node test for Mod which is not implemented yet
2019-04-22 13:24:27 -07:00
Lei Zhang
2947e1f9d4 Fix onehot code arm build break. 2019-04-22 11:06:20 -07:00
Changming Sun
2879ee8bd2 Fix a few warnings (#762)
* Fix warning in tensor_type_and_shape.cc

tensor_type_and_shape.cc:139:18: error: ‘out’ may be used uninitialized in this function [-Werror=maybe-uninitialized]

* fix warnings
2019-04-22 09:45:02 +10:00
Tracy Sharpe
cb69c65756
Update MLAS to be able to build standalone again (#874)
Change MLAS to be able to build standalone without onnxruntime header dependencies. This is enabled when building with MLAS_NO_ONNXRUNTIME_THREADPOOL defined.
mlas.h had been changed to include the ThreadPool header, but this header now just has a forward reference for the class. The header was also doing a "using onnxruntime::concurrency"; that has been removed and the external mlas.h users fixed up as needed.
As before, if ThreadPool==nullptr, then MLAS uses OpenMP or falls back to a single threaded implementation. The build option to use the Win32 system thread pool has been removed as onnxruntime can't hit that path and I don't use that option for standalone tests anymore.
2019-04-21 14:04:15 -07:00
nivas-x86
a4d7052aeb Add nGraph Execution Provider (#832)
* Add nGraph Execution Provider

* feedback changes 1

* feedback2

* Feedback and upgrade nGraph

* Feedback 4

* Fix CI

* Disable new ops
2019-04-20 17:02:35 -07:00
Changming Sun
7e1edbb9a2 Fix a build error in onnxruntime/core/common/threadpool.cc 2019-04-19 15:59:15 -07:00