1579 commits

Author SHA1 Message Date
Paul McDaniel
94fc7bccff merge layer_dev_paulm 2019-11-19 10:58:27 -08:00
Paul McDaniel
d8941f11b1 fixed map and sequence when passing stl types across the ABI.
found a leak in nvidia driver, but skipped it.
all winmlapitests pass now
2019-11-19 10:54:51 -08:00
Paul McDaniel
3f67aaaf81
Layer dev paulm (#2426)
* model moved over.
everything builds clean.
step !

* weak ref comment

* added a wrapper for RoGetActivationFactory to hook back into winml for creating winml objects.
fixes model load.

* fixed some lifetime management.
fixed the debug build.
squeezenet passes using winmlrunner for CPU and GPU

* PR feedback.

* couple of fixes and coded getmutabledata()

* fixed 2 more heap corruptions
2019-11-18 19:47:38 -08:00
Paul McDaniel
3da841bb73 fixed 2 more heap corruptions 2019-11-18 19:46:32 -08:00
Paul McDaniel
72b3a91bbd Merge remote-tracking branch 'origin/layer_dev' into layer_dev_paulm 2019-11-18 18:44:26 -08:00
Paul McDaniel
8c28f0816f
Layer dev paulm (#2425)
* model moved over.
everything builds clean.
step !

* weak ref comment

* added a wrapper for RoGetActivationFactory to hook back into winml for creating winml objects.
fixes model load.

* fixed some lifetime management.
fixed the debug build.
squeezenet passes using winmlrunner for CPU and GPU

* PR feedback.

* couple of fixes and coded getmutabledata()
2019-11-18 18:42:54 -08:00
Paul McDaniel
9253e2392d merge 2019-11-18 18:40:26 -08:00
Paul McDaniel
acc6ea525b couple of fixes and coded getmutabledata() 2019-11-18 18:31:04 -08:00
Paul McDaniel
be37e4c225
Layer dev paulm (#2424)
* model moved over.
everything builds clean.
step !

* weak ref comment

* added a wrapper for RoGetActivationFactory to hook back into winml for creating winml objects.
fixes model load.

* fixed some lifetime management.
fixed the debug build.
squeezenet passes using winmlrunner for CPU and GPU

* PR feedback.
2019-11-18 11:07:54 -08:00
Paul McDaniel
a3542e1128 PR feedback. 2019-11-18 11:06:08 -08:00
Paul McDaniel
54c785ff68 Merge remote-tracking branch 'origin/layer_dev' into layer_dev_paulm 2019-11-18 09:52:10 -08:00
Paul McDaniel
5a1177acaa
Layer dev paulm (#2423)
* model moved over.
everything builds clean.
step !

* weak ref comment

* added a wrapper for RoGetActivationFactory to hook back into winml for creating winml objects.
fixes model load.

* fixed some lifetime management.
fixed the debug build.
squeezenet passes using winmlrunner for CPU and GPU
2019-11-18 09:51:39 -08:00
Paul McDaniel
b4047a0aad fixed some lifetime management.
fixed the debug build.
squeezenet passes using winmlrunner for CPU and GPU
2019-11-18 09:50:25 -08:00
Paul McDaniel
5bd2c1ef21
Layer dev paulm (#2414)
* model moved over.
everything builds clean.
step !

* weak ref comment

* added a wrapper for RoGetActivationFactory to hook back into winml for creating winml objects.
fixes model load.
2019-11-15 16:48:38 -08:00
Paul McDaniel
7f9a7f5abe added a wrapper for RoGetActivationFactory to hook back into winml for creating winml objects.
fixes model load.
2019-11-15 16:47:33 -08:00
Paul McDaniel
8f95b7739a Merge remote-tracking branch 'origin/layer_dev' into layer_dev_paulm 2019-11-15 13:20:11 -08:00
Paul McDaniel
2bfa3c67c6
Layer dev paulm (#2408)
* model moved over.
everything builds clean.
step !

* weak ref comment
2019-11-15 13:17:22 -08:00
Paul McDaniel
f32bbd5cb7 weak ref comment 2019-11-15 13:15:57 -08:00
Paul McDaniel
f07fdf96b4 model moved over.
everything builds clean.
step !
2019-11-15 10:54:44 -08:00
Paul McDaniel
00cee34ec0 Merge branch 'layer_dev' of https://github.com/microsoft/onnxruntime into layer_dev 2019-11-15 09:54:13 -08:00
Paul McDaniel
8b37bd03ac Merge remote-tracking branch 'origin/windowsai' into layer_dev 2019-11-15 09:53:55 -08:00
Paul McDaniel
5350abe19d
LearningModelSession is cleaned up to use the adapter, and parts of b… (#2382)
this is a big PR. we are going to move it up to layer_dev, which is still an L3, so we are still safe to do agile work there.

we are going to move this into the L3 so that ryan can start doing integration testing.

we will pause for a full code review and integration test result prior to going into the L2.

>>>> raw comments from previous commits >>> 

* LearningModelSession is cleaned up to use the adapter, and parts of binding are.
* moved everything in the winmladapter
made it all nano-com, using WRL to construct objects on the ORT side.
base interfaces for everything for winml to call
cleaned up a bunch of winml to use the base interfaces.
* more pieces
* GetData across the abi.
* renamed some namespaces
cleaned up OrtValue
cleaned up Tensor
cleaned up custom ops.
everything *but* learningmodel should be clean
* make sure it's building. winml.dll is still a monolith.
2019-11-14 17:44:07 -08:00
Paul McDaniel
5406801670
Task 23998197: add winml_lib_core into onnxruntime.dll (#2368)
* Task 23998197: add winml_lib_core into onnxruntime.dll

* PR feedback
build break on perf_test
2019-11-11 14:34:19 -08:00
Brian Martin
a3a6a97407
update build instructions to include --build_shared_lib (#2358)
* update build instructions to include --build_shared_lib

* fix line breaks
2019-11-08 14:26:04 -08:00
Paul McDaniel
b6f5eef1d9 more snipping to get core into ort 2019-11-08 13:23:44 -08:00
Ryan Lai
444bfcc26e Initial changes for layering 2019-11-07 16:50:24 -08:00
Brian Martin
b94ae8e965
Merged PR 3985217: add onecoreuap_apiset.lib in order to avoid linking against kernel32.lib etc (#2346)
add onecoreuap_apiset.lib in order to avoid linking against kernel32.lib etc and violating our OS layering requirements.

We linked against onecoreuap_apiset.lib in VB so we will continue doing this, but I am still unsure why we do not link against onecore instead, since that is where we ship. However, since Sheil is the owner of this code we will wait to discuss with him before changing anything.
2019-11-07 14:29:11 -08:00
Adrian Tsai
7390b64af5 Initial Commit 2019-11-07 11:51:44 -08:00
baowenlei
0f1e24f4a9 [NupharEP] tensorize int8 GEMM for avx (#2142)
* finish avx tensorization and save state

* split tests for better debug

* add missing avx option

* update configure for AVX

* update tensorize avx support

* Merged PR 5327: Fix llvm cross compilation

Fix llvm cross compilation

Related work items: #4080
2019-11-06 14:35:13 -08:00
KeDengMS
58e6aaa414
Fix crash in releasing TLS from CUDA EP dtor (#2329)
The destruction order of thread_local/global/static objects depends on implementation details of compilers and OS. The bug happens when the thread_local is already out of scope while the static EP is being destructed, causing an access violation in the EP's destructor when it accesses the thread_local.

The fix is to maintain ownership inside EP with a mapping from tid to ThreadLocalContext, to avoid accessing thread_local in EP's destructor. This way, no matter what the destruction order is, no access violation would be triggered.
2019-11-06 13:00:17 -08:00
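The ownership pattern described in the commit above (58e6aaa414) can be sketched as follows. This is only an illustration in Python, not the CUDA EP's C++ code, and Python does not reproduce the C++ destruction-order problem itself. The names `Provider`, `context`, and `close` are hypothetical; only `ThreadLocalContext` and the idea of a map keyed by thread id come from the commit message.

```python
import threading


class ThreadLocalContext:
    """Stand-in for per-thread resources (handles, scratch buffers, ...)."""

    def __init__(self, tid):
        self.tid = tid
        self.released = False

    def release(self):
        self.released = True


class Provider:
    """Hypothetical provider that owns every per-thread context itself."""

    def __init__(self):
        self._lock = threading.Lock()
        self._contexts = {}  # thread id -> ThreadLocalContext

    def context(self):
        # Look up (or lazily create) the calling thread's context in the
        # provider-owned map instead of in thread-local storage.
        tid = threading.get_ident()
        with self._lock:
            ctx = self._contexts.get(tid)
            if ctx is None:
                ctx = self._contexts[tid] = ThreadLocalContext(tid)
            return ctx

    def close(self):
        # Teardown only touches state the provider owns, so it does not
        # matter whether the threads (or their thread-local storage) still exist.
        with self._lock:
            contexts, self._contexts = self._contexts, {}
        for ctx in contexts.values():
            ctx.release()
        return len(contexts)
```

Because the map lives inside the provider, teardown never reads thread-local state, which is the property the fix relies on.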
Yulong Wang
c0b8926863
implement CPU contrib OP EmbedLayerNormalization (#2332) 2019-11-06 12:27:08 -08:00
George Wu
06a6d74a67
update ngraph dockerfile. add python lib location to LD_LIBRARY_PATH for cuda/tensorrt Dockerfiles. (#2330) 2019-11-06 11:29:55 -08:00
Vinitra Swamy
ace19129b9 MCR Docker Images v1.0.0 refresh (#2302)
* update dockerfile table with new MCR tags

* add new openvino dockerfiles to table
2019-11-05 22:06:47 -08:00
Patrick Foley
151075790d [OpenVINO-EP] Update to latest version: OpenVINO 2019 R3.1 (#2308)
* Updates OpenVINO EP to latest version: 2019 R3.1

* Reviews fixed

* Update Dockerfile.openvino

* Addressed PR comments and disabled model tests temporarily

* Update Dockerfile.ubuntu_openvino
2019-11-05 19:55:46 -08:00
Dwayne Robinson
db454beacf
TensorDesc::Placement test failure - cherry pick Vibranium fix. (#2328) 2019-11-05 18:18:31 -08:00
Scott McKay
67ec626d88
Copy blocks in Slice when possible (#2312)
* Add logic to try and flatten inner dimensions being copied by Slice and do a block copy if they can be.
Do a block copy for just the innermost dimension where possible (applies even if we don't flatten inner dimensions).
2019-11-06 10:53:30 +10:00
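The block-copy idea in the commit above (67ec626d88) can be sketched as follows. This is a minimal sketch in Python, not the ONNX Runtime implementation: it assumes a flat row-major buffer, a step of 1 on every axis, and in-range non-negative `starts`/`ends`. The function name `slice_block_copy` is hypothetical.

```python
from itertools import product
from math import prod


def slice_block_copy(data, shape, starts, ends):
    """Slice a flat row-major buffer (step 1 on every axis) using block copies."""
    rank = len(shape)
    # Row-major strides, counted in elements.
    strides = [prod(shape[i + 1:]) for i in range(rank)]

    # Trailing axes that are taken in full are contiguous in the input,
    # so they can be flattened into a single block.
    first_full = rank
    while (first_full > 0
           and starts[first_full - 1] == 0
           and ends[first_full - 1] == shape[first_full - 1]):
        first_full -= 1

    if first_full == 0:
        # Every axis is taken in full: the whole tensor is one block.
        return list(data[:prod(shape)])

    # The innermost sliced axis is still contiguous over [start, end), so it
    # joins the block even when no full axes follow it.
    k = first_full - 1
    block_len = (ends[k] - starts[k]) * strides[k]

    out = []
    outer_ranges = [range(starts[i], ends[i]) for i in range(k)]
    for idx in product(*outer_ranges):
        offset = sum(i * s for i, s in zip(idx, strides)) + starts[k] * strides[k]
        out.extend(data[offset:offset + block_len])  # one block copy
    return out
```

For a `[2, 3, 4]` tensor sliced as `[0:2, 1:3, 0:4]`, the last axis is taken in full, so each outer index copies one block of 8 elements instead of 8 single elements.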
Changming Sun
104f3b2a59 Exclude candy from CUDA tests 2019-11-05 15:22:09 -08:00
Changming Sun
143ae98a37
Fix a bug in onnxruntime_pybind_state.cc when TENSORRT is enabled (#2326) 2019-11-05 15:04:50 -08:00
George
8a102c6e99 apply eigen patch only for ACL. 2019-11-05 13:53:53 -08:00
Changming Sun
5ce4d4fc49 Fix a test failure when it runs on FreeBSD 2019-11-04 23:47:37 -08:00
Yufeng Li
035913d42f
Support int32_t for Reduction (#2317) 2019-11-04 20:52:01 -08:00
manashgoswami
d5c36bfff2 Updated links in docs (#2303)
* Update README.md

* Update README.md

* Update README.md
2019-11-03 09:10:56 -08:00
Faith Xu
556bae17a5 Fix versions table (#2309)
* Update table values

* Fix onnxml opset version
2019-11-03 08:58:21 -08:00
Yulong Wang
cba93f7c8d fix Gelu CPU: remove MayInplace() declaration (#2306) 2019-11-01 18:10:05 -07:00
Yulong Wang
204a6872d3
remove unused param 'input_count' in ConcatImpl (#2304) 2019-11-01 15:50:11 -07:00
Tianlei Wu
a6b2c9fc09
Fix mask in EmbedLayerNormalization (#2300) 2019-11-01 13:49:55 -07:00
KeDengMS
6e65dcf588
[NupharEP] symbolic_shape_infer improvements (#2299)
- Improves symbolic shape inference in following ways:
1. Extend suggested merge to map to literals with --auto_merge. For example, MatMul of ['ax1', 'ax2'] x [128, 256] would now map 'ax2' to 128
2. Add --int_max option to simplify computations like Min(100000, 'dim') to be 'dim'. This helps ops like Slice to generate correct shape, i.e. start=0, end=Min(100000, dim - 2) on dim. It was previously treated as equal, since sympy cannot determine Min(100000, dim - 2) < dim.
- Fix a bug in the create_shared script on Windows where the AOT dll is not generated because linking fails when there are too many obj files
- Fix a bug for Split since TOPI does not support split on symbolic dimension.
- Some build warning fixes for NupharEP.
2019-11-01 11:34:52 -07:00
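The two shape-inference improvements in the commit above (6e65dcf588) can be sketched as follows. This is a simplified pure-Python sketch, not the symbolic_shape_infer script, which works on sympy expressions. Here symbolic dims are plain strings, and the helpers `merge_matmul_dims` and `simplify_min` are hypothetical names; `int_max` stands in for the `--int_max` option.

```python
def merge_matmul_dims(lhs_shape, rhs_shape):
    """Infer a 2-D MatMul output shape and record any suggested dim merge."""
    lhs_k, rhs_k = lhs_shape[1], rhs_shape[0]
    merges = {}
    if lhs_k != rhs_k:
        if isinstance(lhs_k, str):
            # Symbolic dim on the left: map it to whatever the right side has,
            # which may now be a literal (the auto-merge extension).
            merges[lhs_k] = rhs_k
        elif isinstance(rhs_k, str):
            merges[rhs_k] = lhs_k
        else:
            raise ValueError(f"incompatible inner dims: {lhs_k} vs {rhs_k}")
    return [lhs_shape[0], rhs_shape[1]], merges


def simplify_min(a, b, int_max=None):
    """Simplify Min(a, b); literals >= int_max are treated as 'no limit'."""
    if isinstance(a, int) and isinstance(b, int):
        return min(a, b)
    if int_max is not None:
        if isinstance(a, int) and a >= int_max:
            return b
        if isinstance(b, int) and b >= int_max:
            return a
    # Without further knowledge the expression stays unresolved.
    return ("Min", a, b)
```

With these assumptions, `merge_matmul_dims(["ax1", "ax2"], [128, 256])` suggests mapping `ax2` to 128, and `simplify_min(100000, "dim - 2", int_max=100000)` reduces to the symbolic operand, which is what lets a Slice end of `Min(100000, dim - 2)` resolve to a usable shape.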
Tianlei Wu
bc85d43809
Dump cuda tensor data (#2243)
* dump cuda tensor

* move data_type definition

* Dump cuda tensors for cuda build only.
Output tensor location (if it is not in CPU or pinned)

* update for cuda build

* Update for code review feedback

* update for CR feedback

* use data transfer manager for tensor copy
2019-10-31 21:09:10 -07:00
Scott McKay
7a5de9c958
Add a python script with a number of helper actions for creating/editing/dumping onnx test runner format pb files (#2294)
* Add a python script with a number of helper actions for creating/editing/dumping onnx test runner format pb files.
2019-11-01 06:39:14 +10:00
mikecaraman
358b517d49 [v2] Add ACL (Arm Compute Library) execution provider (#2258)
* Guard unused parameter

Guard unused parameter for Linux Arm and other cases.

* Add ACL (Arm Compute Library) execution provider

Add a new execution provider targeting Arm architecture based on Arm Compute Library.
Validated on NXP i.MX8QM CPU with ResNet50, MobileNetv2 and VGG models.
All unit tests are passing.

Comparative performance improvements for ResNet50v1 model obtained with
onnxruntime_perf_test:
             A72   2xA72   A53   4xA53
ACL vs CPU   16%   9%      21%   13%

Usage documentation available in ACL-ExecutionProvider.

* Fix eigen unused parameter

Fix eigen unused parameter error for Arm cross-compilation.
2019-10-31 12:25:36 -07:00