Commit graph

339 commits

Author SHA1 Message Date
stevenlix
8ea7197b82 trt (#361)
* updated cmake files for tensorrt
2019-01-23 13:28:13 -08:00
Harry Summer
904d7c6ec8 Add --cuda_version option to enable manually specifying cuda version 2019-01-22 20:47:28 -08:00
Bowen Bao
d040b452cb Expand: add additional supported types. (#364) 2019-01-22 19:07:36 -08:00
jignparm
ea816615eb
remove use_tvm from base script. Put it in yaml configuration (#363) 2019-01-22 16:46:38 -08:00
Hector Li
647cc2dced
use gemm to replace matmul + add (#234)
* matmul add fusion

* add shape check on Gemm input C

* work around the issue with RemoveNode

* update the version support

* If MatMul has shape [K] * [K, N], update it to [1, K] * [K, N], so that it can work for Gemm

* Fuse Gemm+Activation into FusedGemm

* test

* revert the change that fuses MatMul with shape [K]*[K, N] into Gemm as shape [1, K]*[K, N]; it may cause a runtime failure, because we can't change the input data shape.

* revert the change that reshapes MatMul from [K]*[K, N] to [1, K]*[K, N]. It enables fusing MatMul + Add into Gemm, but the data is not aware of the change, so the data shape is still [K]*[K, N], which causes a runtime issue.

* 1. Fix build issue for CUDA
2. Update Gemm so that we can fuse Matmul [K] * [K, N] + Add [1, N] into Gemm with shape [1,K] * [K, N] + [1, N]

* Fix build issue

* Fuse the activation node even if it connects to the output

* resolve the merge conflicts

* Add test model for Gemm+Activation fusion
2019-01-22 15:21:55 -08:00
Scott McKay
8b55596dfe
The CUDA compiler doesn't support gsl::suppress so disable when __NVCC__ is defined. (#358) 2019-01-22 17:42:33 +10:00
Changming Sun
c87929e949 Use nsync for implementing condition variable 2019-01-21 22:59:42 -08:00
Du Li
1653ba9fcc
Optimizing Upsample op (#352) 2019-01-18 16:36:00 -08:00
Tracy Sharpe
22337bb641
fix linaro build (#355) 2019-01-18 16:11:53 -08:00
jignparm
0a21226b09 comment out 16-bit float models in C# (#351) 2019-01-18 14:16:53 -08:00
Tracy Sharpe
6f30bec040 Implement MLAS convolution+activation fusion (#354)
* conv+activation fusion
2019-01-18 14:16:28 -08:00
Ke Zhang
6831fc16ed Kezhan/kernel registry refine (#346)
* refactor kernel registry to make it a little bit more readable.

* update

* update cudaexecutionprovider

* fix build break

* fix comments

* fix build break
2019-01-18 09:55:30 -08:00
Changming Sun
948cc03490 upgrade onnx 2019-01-17 13:10:30 -08:00
Changming Sun
21713b7a41 Reduce test parallelism for cuda model tests 2019-01-17 13:10:30 -08:00
Changming Sun
36c62d84b4 remove ConstantLike OP 2019-01-17 13:10:30 -08:00
Scott McKay
9f3ae4279f Handle copy to/from non-CPU devices across control flow nodes (#339) 2019-01-17 10:51:23 -08:00
Changming Sun
c2704b5afb cleanup code (#343) 2019-01-16 17:12:22 -08:00
jignparm
b3f0d0b659
added unit test to guard against native API changes (#337)
* added unit test to guard against native API changes

* Removed cuda and mkldnn from API checks

* Updated per some code comments
2019-01-16 16:53:06 -08:00
Hector Li
790cda6ea7
Fix the issue which causes wrong output. (#342)
Root cause:
cudaStreamWaitEvent is used after copying data from GPU memory to CPU memory, but the following node has CPU code that depends on the data. cudaEventSynchronize should be used instead.
Fix:
Add code in the executor to check the input memory type first; if it wants CPU memory, pass the CPUExecutionProvider type to BeforeUsingAsInput, which then uses cudaEventSynchronize to wait for the write event.
2019-01-16 14:47:18 -08:00
Ashwini Khade
5d0e024284 Askhade/add quantized matmul (#295)
* Quantized Matmul Operators

* fix type inference after master merge

* bug fix for linux

* Plus review comments

* fix a check

* fix build error
2019-01-16 13:36:25 -08:00
Changming Sun
34afa0a598 Delete onnxruntime_exec 2019-01-16 11:18:44 -08:00
Changming Sun
d23f01dcd9 Suppress warnings for gemmlowp 2019-01-15 22:29:30 -08:00
Ashwin Kumar
95b8941e9d
Fix Seg fault when repeats input contain a 0 (#336)
* Fix Seg fault when repeats input contain a 0

* refine
2019-01-15 21:34:04 -08:00
Scott McKay
f678f58750
Revert to ignoring optional subgraph inputs (#306)
* Revert to ignoring optional subgraph inputs due to abandoning PR 216. Restores previous behaviour that changed a couple of days ago with the Scan v9 checkin.

* Update to allow either all inputs, or just required inputs to be provided for the subgraph.

* Update IterateSequence to prefer all inputs over required inputs.
2019-01-16 11:58:19 +10:00
Changming Sun
6225d5fe1e
Update test data (#334)
* update test data
2019-01-15 17:01:46 -08:00
Ashwin Kumar
492d9fd6cc
Use Eigen ThreadPool in OnnxRuntime (#323)
* switch to nonblocking threadpool in inference session and sessions state

* switch to eigen threadpool - first draft

* refine

* refine

* add a switch to easily revert back to windows thread pool

* switch thread pool in test runner and turn on leak checker

* remove unnecessary files

* fix build error

* more build fixes

* catch exceptions in parallel executor

* fix mac build error

* fix mac build error

* more build fixes

* more mac build fixes

* fix cv issue

* change macro to include cuda compiler for disabled compiler warning

* try switching the macro to win32 only

* test #error

* move #disable warning to the top

* Update onnxruntime_framework.cmake

* move eigen include to public scope

* turn off eigenthreadpool by default and add todo comment
2019-01-15 15:19:30 -08:00
Ke Zhang
139abda393
convinteger implementation based on gemmlowp (#294)
* update

* cmake change

* rename

* update

* update

* add cmake

* fix build warnings.

* fix comments

* update cmake to avoid running gemmlowp tests

* update cmake

* update

* fix build break

* update

* fix comments

* fix test failure

* add one more test case with padding.

* fix conv implementation of mkldnn and cuda to use updated computekernelshape function.

* fix linux ci build break
2019-01-15 14:39:50 -08:00
Hector Li
835b511fa8 cuda fix to unblock the tf model tests (#333)
* Check the pads attribute on Conv, and auto fallback to CPU if it's not symmetric padding

* Insert copy nodes after all graph transformers. Running the cast transformer before the memory copy transformer causes some issues.
2019-01-15 14:05:47 -08:00
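The symmetric-padding check mentioned in commit 835b511fa8 above could look roughly like the sketch below. It relies on the ONNX Conv convention that pads lists all begin values followed by all end values; the function names and the provider strings "CUDA" and "CPU" are hypothetical placeholders, not onnxruntime identifiers.

```python
def has_symmetric_padding(pads):
    """Return True when each axis has equal begin and end padding.

    ONNX Conv stores pads as [x1_begin, x2_begin, ..., x1_end, x2_end, ...].
    """
    if len(pads) % 2 != 0:
        raise ValueError("pads must hold a begin and an end value per axis")
    half = len(pads) // 2
    return list(pads[:half]) == list(pads[half:])


def choose_provider(pads, preferred="CUDA"):
    """Hypothetical fallback rule: use CPU when the padding is asymmetric."""
    return preferred if has_symmetric_padding(pads) else "CPU"


if __name__ == "__main__":
    print(choose_provider([1, 1, 1, 1]))  # symmetric 2-D padding
    print(choose_provider([0, 0, 1, 1]))  # pads only at the end of each axis
```

Padding that applies only at the end of each axis, as in the second call, is the kind of input that would take the CPU path under this rule.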
Changming Sun
7977871740 Split build pipeline 2019-01-15 12:30:59 -08:00
jignparm
7e3923b9b3
Fix for non-wide characters in strings for linux - for c#-native interop (#326)
* Fix for non-wide characters in strings for linux - for c#-native interop

* update some unit tests

* added unicode and utf-8 encoding explicitly for file names
2019-01-15 01:41:32 -08:00
Hector Li
779123cf55
Upsample opset 9 cuda implementation (#330) 2019-01-14 23:04:46 -08:00
Raymond Yang
0efc48a11a
Install dotnet sdk on linux ci (#320)
* Try install dotnet sdk on linux ci

* Fix install script

* Add configurable os version in docker build script

* Avoid using ARG in docker
2019-01-14 17:51:45 -08:00
Edward Chen
677918cd9a Added generation of C# project properties file containing actual build directory. 2019-01-14 16:02:13 -08:00
Shah Asaduzzaman (ASAD)
c955cd8278 changed csharp runtime folder name to win from win10 2019-01-14 16:01:57 -08:00
Changming Sun
25c1e68988 Fix: roi_pool operator implementation error about FLT_MIN 2019-01-14 15:21:37 -08:00
Changming Sun
ef5679949a
Fix a c# build issue when mkldnn is not enabled (#321) 2019-01-14 14:22:02 -08:00
jignparm
3b83f062fc remove delayload from mkldnn (#276) 2019-01-14 14:13:28 -08:00
Changming Sun
260639c327 Add missing EXCLUDE_FROM_ALL keyword to nsync submodule 2019-01-11 16:34:55 -08:00
Faith Xu
2d067ec65c Update with link for C# GPU Nuget package 2019-01-11 15:39:18 -08:00
Yufeng Li
4735bb1ccb
Add onnx protobuf format that supports large model (#313)
* Add onnx protobuf format that supports large model

* Add optional for DataLocation
2019-01-11 10:23:18 -08:00
Sreekanth Yalachigere
05b9440fce mkldnn:Conv weight optimization (#256)
* mkldnn:Conv weight optimization

* weight optimization: review changes

* lock_guard and mutex for thread safe

* mutex added to provider

* lock to ReOrder done only once

* removed #ifndef mkldnn_hpp

* keep re-ordered mem buffer in scope

* applied clang format

* review updates: map to unordered map

* conv_mutex to mutex_
2019-01-11 08:48:20 -08:00
Xavier Dupré
8c40313e28
Update documentation to reflect the latest changes (#311)
- removes markdown output
- rename intro into index
- uses skl2onnx anywhere possible instead of onnxmltools
2019-01-11 12:41:42 +01:00
Du Li
7641ee9a2b suppress a warning. 2019-01-10 19:28:18 -08:00
Bowen Bao
d22429c5b2 Update compare_mlvalue for tests (#290)
* values should be considered matched if both of them are inf, or both of them are nan.
2019-01-10 18:13:46 -08:00
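The matching rule from commit d22429c5b2 above can be sketched as follows. This is not the actual compare_mlvalue code; values_match and its tolerance defaults are hypothetical, and requiring infinities to share the same sign is an assumption the commit message does not state.

```python
import math


def values_match(expected, actual, rel_tol=1e-5, abs_tol=1e-8):
    """Compare two floats, treating nan == nan and inf == inf as matches."""
    # NaN never compares equal to itself, so it needs an explicit check.
    if math.isnan(expected) and math.isnan(actual):
        return True
    # Assumption: infinities match only when they have the same sign.
    if math.isinf(expected) or math.isinf(actual):
        return expected == actual
    # Finite values (or a lone NaN, which never matches) use a tolerance.
    return math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=abs_tol)


if __name__ == "__main__":
    nan, inf = float("nan"), float("inf")
    print(values_match(nan, nan), values_match(inf, inf), values_match(inf, -inf))
```

Without the explicit NaN branch, a test whose expected output legitimately contains NaN would always be reported as a mismatch.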
Changming Sun
02962ce9d8
Update ABI.md (#299)
* Update ABI.md
2019-01-10 16:34:42 -08:00
Yufeng Li
02852a0881
Remove OperatorParser tool (#279) 2019-01-10 16:14:37 -08:00
Pranav Sharma
223773d278
Implement ROI Align for object detection. (#308)
* Implement ROI Align for object detection.

* Fix Mac build

* Fix Mac build
2019-01-10 11:34:55 -08:00
Changming Sun
6b3044ddd3 Update AddingExecutionProvider.md 2019-01-10 11:11:54 -08:00
Randy
fa0ea9a273
implement dynamic slice cuda (#286)
* implement dynamic slice cuda

* add template parameter

* add declaration

* init base class

* exclude case from cuda

* use cuda mapped type

* separate function implementation

* add cpy logic

* refactor

* add type check

* use InputMemoryType

* merge functions
2019-01-10 09:42:18 -08:00
Ryan Hill
98a92547bf
Ryanunderhill/c api 8 (#297)
* Make OrtAllocator not be reference counted

* Make the allocator interface more type safe

* Fix build break

* Build break fix

* Build break fix

* Mistake in previous build fix.

* Fix review comments + build break

* Missed the export symbols

* C specific error, need 'struct' keyword in one case.

* Function calling OrtReleaseObject instead of OrtReleaseEnv
2019-01-10 02:06:29 -08:00