Commit graph

7863 commits

Author SHA1 Message Date
jignparm
bb58806872 Adding versioned dlls to tar/zip packages (#928)
* Adding versioned dlls to tar/zip packages

* fix syntax error

* fix version name of dylib

* minor fix in the target

* update pattern for versioned dylib files
2019-04-28 22:44:49 -07:00
Dmitri Smirnov
93d798b8ca Add locale configuration doc. 2019-04-27 21:16:49 -07:00
hariharans29
71ddba7254 Disable flaky model test in CUDA build 2019-04-27 21:15:20 -07:00
jignparm
861b9fda45
Add link to build within Nuget package (#926)
* Add link to build within Nuget package

* Update buildID to build uri

* add url prefix to build id
2019-04-27 13:41:20 -07:00
Hariharan Seshadri
06e0f7e3e7
Minor changes to support inclusion of x86 bits in the Nuget packaging pipeline (#916)
* initial commit

* More changes

* More changes

* Adding stuff back to the targets xml

* More changes v3

* More changes v4

* More changes v5

* More changes v6

* More changes v7

* More changes v9

* Disable CSharp tests for now

* More changes

* Revert file to same status

* Update props file for x86

* Change to usage of TargetArchitecture instead of PlatformTarget

* Update targets.xml

* Minor formatting nit fix

* Update based on PR comments
2019-04-27 00:41:26 -07:00
Konstantinos Karanasos
7a42ffd15f
Fix in Slice Elimination (issue #885) (#918)
Slice elimination should not be triggered when starts or ends is negative; small fix in op set domain validation. Fixes issue #885.
2019-04-26 21:04:29 -07:00
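The guard described in the Slice Elimination fix above can be sketched as follows. This is a minimal illustration, not the onnxruntime code: `can_eliminate_slice` and `END_SENTINEL` are hypothetical names, and the rule that a large end value means "slice to the end" is an assumption made for the example. Negative starts or ends count from the end of an axis, so without shape information the slice cannot be proven to be a no-op.

```python
from typing import Sequence

# Assumption for this sketch: an end value at or above this sentinel
# means "slice to the end of the axis".
END_SENTINEL = 2**31 - 1


def can_eliminate_slice(starts: Sequence[int], ends: Sequence[int]) -> bool:
    """Return True only when the slice is trivially a no-op on every axis."""
    if len(starts) != len(ends):
        return False
    for start, end in zip(starts, ends):
        # Negative indices are relative to the axis length, which is not
        # known here, so never eliminate in that case.
        if start < 0 or end < 0:
            return False
        # A no-op slice starts at 0 and runs to the end of the axis.
        if start != 0 or end < END_SENTINEL:
            return False
    return True
```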
Ashwini Khade
90544ed766
bump version number for release (#911)
* bump version number for release

* + review comments
2019-04-26 16:28:16 -07:00
nivas-x86
b5ea3973c4 nGraph EP disable new quant tests (#920)
* Disable quant op test cases

* enable ngraph quant ops
2019-04-26 14:21:20 -07:00
Faith Xu
3812c881f7 Readme updates (#912)
* Updates

* Minor updates

* Updates

* Updates

* Add version link

* Update README.md
2019-04-26 13:55:26 -07:00
Raymond Yang
38f1f69432
Add a temporary bypass of artifacts permission issue (#921)
* Try using blob

* Try using blob

* Update working directory

* Update windows-build-tools-setup-steps.yml
2019-04-26 13:34:41 -07:00
Ryan Hill
5ed3db914e
Update custom op help (#914)
* Update AddingCustomOp.md
2019-04-26 11:22:46 -07:00
Bowen Bao
0cef3b53df Fix scatter assertion offset failure (#913) 2019-04-26 11:20:53 -07:00
Hector Li
8633e9ffda
Fix an issue where ORT crashes if the model has a subgraph created by control-flow ops such as Loop, If, or Scan. (#917)
Root cause:
The crash is caused by a null threadpool in the op. The op is inside the subgraph. The threadpool is set on the session_state, but it is not passed to the session_state owned by the subgraph.
Fix:
Pass the threadpool to the session_state owned by the subgraph in CreateSubgraphSessionState.
2019-04-26 10:57:06 -07:00
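The fix described in #917 above amounts to handing the parent's threadpool to each subgraph's state when that state is created. A minimal sketch of that pattern, assuming a simplified `SessionState` class (the Python class and method names here are hypothetical and only mirror the names used in the commit message):

```python
class SessionState:
    """Simplified stand-in for a session state that may own subgraph states."""

    def __init__(self, thread_pool=None):
        self.thread_pool = thread_pool
        self.subgraph_states = {}

    def create_subgraph_session_state(self, node_name):
        # Pass the parent's threadpool down so ops inside the subgraph
        # do not see a null pool.
        child = SessionState(thread_pool=self.thread_pool)
        self.subgraph_states[node_name] = child
        return child
```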
Du Li
da0f9bf9a4
fixing a bug in resize op (#910)
* fixing a bug in resize op

* Enable onnx tests

* merge master
2019-04-26 06:48:51 -07:00
RandySheriffH
5e3a266709
Rashuai/top k0 (#909)
* allow opset 10 topk to accept K==0

* add test case

* return empty tensor

* fix comment
2019-04-25 21:32:49 -07:00
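The K==0 behavior described in #909 above can be illustrated on a flat list. This is a sketch of the edge case only, not the onnxruntime kernel; `top_k` is a hypothetical helper, and breaking ties by lower index is an assumption of the example.

```python
from typing import List, Sequence, Tuple


def top_k(values: Sequence[float], k: int) -> Tuple[List[float], List[int]]:
    """Return the k largest values and their indices, largest first."""
    if k < 0:
        raise ValueError("k must be non-negative")
    if k == 0:
        # K == 0 is accepted and yields empty outputs instead of an error.
        return [], []
    # Sort indices by descending value; ties go to the lower index.
    order = sorted(range(len(values)), key=lambda i: (-values[i], i))[:k]
    return [values[i] for i in order], order
```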
Yufeng Li
0d2181cf85
Remove parallelfor for certain ops (#908)
Parallelfor makes maxpool, gather and reduce ops slower. This PR:

* removes parallelfor for those ops
* adds the Windows thread pool back for sgemm.
2019-04-25 19:38:59 -07:00
Dmitri Smirnov
893b48e92a
Implement Mod operator (#900)
Implement Mod operator
2019-04-25 17:49:11 -07:00
Scott McKay
b8eaa88bd4
Migrate ReverseSequence from contrib op to ONNX opset 10 (#896)
* Update ONNX to 70c9026ca11b0af0050f8186bea6cab94636947f to pickup ReverseSequence op.

Copy ReverseSequence from contrib ops to ONNX (keep contrib op in this commit), and update to use int64_t for sequence_lens input.

* Copy ReverseSequence from contrib to ONNX and update to use int64_t for sequence_lens.

Maintain contrib op in this commit.

* Remove contrib op as it was temporary and only used internally.

* Remove contrib op schema defs.

* Cleanup contrib_defs.cc
2019-04-26 09:26:46 +10:00
jignparm
8ed3eed7b5
Fix ceil_mode not defined for mkldnn pool.cc warning (#907) 2019-04-25 12:32:28 -07:00
Pranav Sharma
80ac858016
Remove OSes/architectures that we don't build on and have no CI for. (#904)
* support non-tensor types

* support non-tensor types.

* support non-tensor types.

* fix compilation issues

* fix compilation issues

* Build without mkldnn for release packages. We'll default to MLAS.

* Remove OSes/architectures that we don't build on and have no CI for.
2019-04-25 11:48:48 -07:00
jignparm
125a77bec4
MaxPool+AvgPool - opset 10 [Dilation and ceil_mode] (#873)
* initial checkin for dilations and ceil support

* add unit tests for ceil_mode

* Update to use versioned_kernel for AveragePool

* update mkldnn/poolc..

* Add versioning to cuda_execution_providers.cc

* minor fix

* minor fix

* minor fix

* minor fix

* minor fix

* Folding PR comments

* Folded PR comments

* removed copy of dilations from contrib ops
2019-04-24 21:59:08 -07:00
Pranav Sharma
3ff97de8da
Modify roialign to conform with the new onnx spec and take it out from contrib ops. (#901)
* support non-tensor types

* support non-tensor types.

* support non-tensor types.

* fix compilation issues

* fix compilation issues

* Build without mkldnn for release packages. We'll default to MLAS.

* Modify roialign to conform with the new onnx spec and take it out from contrib ops.
2019-04-24 20:30:44 -07:00
Yufeng Li
73bc09421c Fix deadlock in parallel executor (#891)
Fix deadlock in parallel executor
  Execute immediately if ParallelFor has only 1 task
2019-04-24 15:55:04 -07:00
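The "execute immediately" shortcut in #891 above can be sketched with Python's standard thread pool. This is an illustration of the idea, not the onnxruntime implementation: `parallel_for` is a hypothetical helper, and the commit does not spell out the exact cause of the original deadlock. What the sketch shows is that a single task run inline never has to wait for a free pool worker.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def parallel_for(pool: ThreadPoolExecutor, total: int, fn: Callable[[int], None]) -> None:
    """Call fn(i) for i in range(total), using the pool when it helps."""
    if total <= 0:
        return
    if total == 1:
        # Only one task: run it on the calling thread instead of queueing it.
        fn(0)
        return
    futures = [pool.submit(fn, i) for i in range(total)]
    for future in futures:
        future.result()  # wait for completion and re-raise any exception
```

With a pool of one worker, a task that itself calls `parallel_for(pool, 1, fn)` still completes, because the inner call does not need a second worker.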
nivas-x86
ba3b82648e ng ep update1 (#895) 2019-04-24 10:35:26 -07:00
Dmitri Smirnov
95ac7a2f35
Implement separators as regex (#857)
* Implement separators as regex
2019-04-24 10:23:45 -07:00
Changming Sun
1f066d4dc4 Update onnx (#893) 2019-04-24 21:31:49 +10:00
Hariharan Seshadri
9d89b23d81
BatchNorm CPU does not support non-spatial cases - explicitly handle such cases (#890)
* BatchNorm CPU does not support non-spatial cases

* skip test in c#

* Update comments
2019-04-23 21:37:21 -07:00
Yufeng Li
d0f846aad5
Tuning GRU performance for batch size >= 2 (#644)
GRU with batch size > 1 was implemented on the assumption that Lotus uses single-threaded Eigen Gemm. That assumption no longer holds: MLAS and MKLML support multi-threading, and we no longer rely on Eigen Gemm.
This PR implements batch size > 1 the same way as batch size == 1. With this change, we get about a 2x performance gain for GRU. Please refer to the performance test below:
Batch_size | Seq_length | input_size | hidden_size | Old (ms) | New (ms)
8 | 30 | 512 | 512 | 19.16 | 10.47
16 | 30 | 512 | 512 | 28.13 | 15.15
32 | 30 | 512 | 512 | 36.97 | 26.89
8 | 30 | 1024 | 1024 | 142.853 | 55.67
16 | 30 | 1024 | 1024 | 184.397 | 72.32
32 | 30 | 1024 | 1024 | 236.364 | 112.78
2019-04-23 14:50:11 -07:00
Changming Sun
80d69515ed
C API: catch exceptions in OrtCreateStatus (#821) 2019-04-23 14:41:44 -07:00
Changming Sun
11806529d0
Update test data (#864)
Add:

1. mxnet_arcface
2. tf_mobilenet_v1_1.0_224
3. tf_mobilenet_v2_1.0_224
4. tf_mobilenet_v2_1.4_224
5. tf_inception_v2
2019-04-23 13:24:24 -07:00
Hariharan Seshadri
4b4b585f58
Fix minor bugs in RemoveDuplicateGraphTransformer (#883) 2019-04-23 11:30:20 -07:00
Ashwini Khade
fb3b63438d
Add python api for graph optimization level (#882) 2019-04-23 11:11:35 -07:00
ybrnathan
b0a37477db Fix memory corruption issue when CPU->CUDA memcpy is involved (#879) 2019-04-22 20:21:14 -07:00
Dmitri Smirnov
7d7627b1ac
Implement IsInf (#871)
* Implement IsInf.
2019-04-22 17:45:54 -07:00
Yufeng Li
0bf12e9dbf
Add option to enable/disable memory pattern back (#872)
Memory pattern doesn't work for the parallel executor by design. Enabling memory pattern for the parallel executor logs a warning and hurts performance.
Add option to enable/disable memory pattern back.
2019-04-22 13:49:41 -07:00
Hector Li
e8d722003a
Move NMS to Onnx domain (#865)
* move files

* move files

* Remove NonMaxSuppression from Contrib op, move it to Onnx domain, opset 10

* move NMS out of namespace contrib

* update data type in UT

* update to latest onnx

* white list the node test for Mod which is not implemented yet
2019-04-22 13:24:27 -07:00
Lei Zhang
2947e1f9d4 Fix onehot code arm build break. 2019-04-22 11:06:20 -07:00
Changming Sun
2879ee8bd2 Fix a few warnings (#762)
* Fix warning in tensor_type_and_shape.cc

tensor_type_and_shape.cc:139:18: error: ‘out’ may be used uninitialized in this function [-Werror=maybe-uninitialized]

* fix warnings
2019-04-22 09:45:02 +10:00
Tracy Sharpe
cb69c65756
Update MLAS to be able to build standalone again (#874)
Change MLAS to be able to build standalone without onnxruntime header dependencies. This is enabled when building with MLAS_NO_ONNXRUNTIME_THREADPOOL defined.
mlas.h had been changed to include the ThreadPool header, but this header now just has a forward reference for the class. The header was also doing a "using onnxruntime::concurrency"; that has been removed and the external mlas.h users fixed up as needed.
As before, if ThreadPool==nullptr, then MLAS uses OpenMP or falls back to a single threaded implementation. The build option to use the Win32 system thread pool has been removed as onnxruntime can't hit that path and I don't use that option for standalone tests anymore.
2019-04-21 14:04:15 -07:00
nivas-x86
a4d7052aeb Add nGraph Execution Provider (#832)
* Add nGraph Execution Provider

* feedback changes 1

* feedback2

* Feedback and upgrade nGraph

* Feedback 4

* Fix CI

* Disable new ops
2019-04-20 17:02:35 -07:00
Changming Sun
7e1edbb9a2 Fix a build error in onnxruntime/core/common/threadpool.cc 2019-04-19 15:59:15 -07:00
Changming Sun
d78c340eac update onnx (#861)
* update onnx

* ignore some tests
2019-04-19 10:52:47 -07:00
jignparm
b2268a6378
removing specific target framework for c-api test (#860) 2019-04-18 23:58:18 -07:00
Pranav Sharma
07a4ecbddb
Disable tests for certain models (Cherry pick from 0.3.1) (#842)
* Disable tests for certain models (Cherry pick from 0.3.1)

* Disable more tests

* More tests

* even more tests

* Fix gpu builds

* Disable L2 transformers

* Env variable to disable contrib ops for csharp tests
2019-04-18 23:57:52 -07:00
Pranav Sharma
780aad8fd0 Eliminate unused code and data from Linux binaries. (#849) 2019-04-18 23:00:27 -07:00
Konstantinos Karanasos
f09a76d669 Don't trigger constant folding in subgraphs (#858)
* don't trigger constant folding in subgraphs
2019-04-18 22:59:19 -07:00
Changming Sun
687bac455d Convert eigen to a submodule and update it to the latest version 2019-04-18 21:24:56 -07:00
Konstantinos Karanasos
ada90086f7
More efficient rule-based transformer (#815)
Introduce a quick pre-filtering of rules based on the node op types they are targeting.
The goal is to avoid evaluating all rules for all nodes. Instead, for each node, we will only be evaluating the rules associated with its op type.
2019-04-18 17:10:13 -07:00
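The pre-filtering described in #815 above can be sketched as an index from op type to the rules that target it. This is a minimal illustration, not the onnxruntime implementation; `Node`, `Rule`, `RuleIndex`, and the rule names in the usage note are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class Node:
    name: str
    op_type: str


@dataclass
class Rule:
    name: str
    target_op_types: Tuple[str, ...]   # op types this rule can apply to
    condition: Callable[[Node], bool]  # the full check for a single node


class RuleIndex:
    """Groups rules by target op type so each node only sees relevant rules."""

    def __init__(self, rules: Iterable[Rule]) -> None:
        self._by_op: Dict[str, List[Rule]] = defaultdict(list)
        for rule in rules:
            for op_type in rule.target_op_types:
                self._by_op[op_type].append(rule)

    def matching_rules(self, node: Node) -> List[str]:
        # Only rules registered for this node's op type are evaluated;
        # rules for other op types are never touched.
        candidates = self._by_op.get(node.op_type, [])
        return [rule.name for rule in candidates if rule.condition(node)]
```

For example, with one rule registered for `Identity` and one for `Slice`, a `Conv` node evaluates no rule conditions at all.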
Bowen Bao
ed0c86cd90 update onnx to fix matmul shape inference (#847)
* update onnx to fix matmul shape inference

* update onnx submodule hash in cgmanifest.json and ci scripts
2019-04-18 14:52:48 -07:00
stevenlix
f2694ab526
Enable provider unit tests for TensorRT (#802)
* Update provider_test_utils.cc

* Update tensorrt_execution_provider.h

* Update tensorrt_execution_provider.cc

* Update gemm_test.cc

* Update softmax_test.cc

* Update logsoftmax_test.cc

* Update matmul_test.cc

* Update batch_norm_op_test.cc

* Update conv_op_test.cc

* Update batch_norm_op_test.cc

* Update softmax_test.cc

* Update conv_transpose_op_test.cc

* Update instance_norm_op_test.cc

* Update flatten_op_test.cc

* Update loop_test.cc

* Disable failed tests for TensorRT

* Disable unsupported tests for TensorRT

* Disable unsupported tests for TensorRT

* Disable unsupported tests for TensorRT

* Disable unsupported tests for TensorRT

* Update matmul_test.cc

* Update logsoftmax_test.cc

* Update topk_op_test.cc

* disable unsupported tests for TensorRT

* resolve conflicts

* Update identity_op_test.cc

* Update activation_op_test.cc

* make max batch size configurable and simplify the code for disabling unsupported tests

* make max batch size configurable at runtime

* update tensorrt ci pipeline

* move max batch size to private

* Update tensorrt_execution_provider.cc

* Update tensorrt_execution_provider.h

* Update tensorrt_execution_provider.cc

* add comments on the test changes

* Update tensorrt_execution_provider.h

* Update tensorrt_execution_provider.cc

* Update build.py
2019-04-18 13:20:37 -07:00