964 commits

Author SHA1 Message Date
Hector Li
2a6c69de2b
Implement the Concat CUDA kernel (#1333)
* Improve CUDA kernel performance for Concat. Implement the kernel code instead of using cudaMemCpy in a loop.

* Update the index lookup part for Concat & Split
2019-07-02 23:08:59 -07:00
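The index lookup mentioned in #1333 above can be pictured with a small Python sketch. This is not the CUDA code from the PR. It only illustrates, under assumptions, how a single pass over output elements can replace a loop of per-input copies. The function name `concat_gather` and the `(outer, axis, inner)` view of each tensor are hypothetical.

```python
from bisect import bisect_right


def concat_gather(inputs, axis_sizes, outer, inner):
    """Concatenate flat buffers along one axis in a single pass.

    inputs[i] is a flat list with logical shape (outer, axis_sizes[i], inner).
    Hypothetical helper for illustration only.
    """
    # Prefix sums along the concat axis: starts[i] is where input i begins.
    starts = [0]
    for size in axis_sizes:
        starts.append(starts[-1] + size)
    total_axis = starts[-1]

    output = [None] * (outer * total_axis * inner)
    # Each loop iteration stands in for the work of one GPU thread.
    for out_idx in range(len(output)):
        o, rem = divmod(out_idx, total_axis * inner)
        a, i = divmod(rem, inner)
        src = bisect_right(starts, a) - 1      # which input owns position a
        local_a = a - starts[src]              # position inside that input
        output[out_idx] = inputs[src][(o * axis_sizes[src] + local_a) * inner + i]
    return output
```

Because every output element is computed independently, the same mapping can be evaluated by one thread per element instead of issuing one copy per input slice.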
Faith Xu
5e54bbffec PyOp documentation Revisions (#1318)
* Revisions

* Minor fix
2019-07-02 18:00:51 -07:00
Ryan Hill
1bf80e30fa
Ryanunderhill/MNIST sample (#1330) 2019-07-02 14:41:27 -07:00
RandySheriffH
bf6a9f9c27
Rashuai/py op example (#1325)
* add scikit example

* format text

* format doc
2019-07-02 09:53:49 -07:00
daquexian
c65489a47f Initial PR for NNAPI execution provider (#1220)
* init

* Update DNNLibrary

* Update DNNLibrary, set compiler flags, it compiles now

* Add more missing flags, add test

* Update DNNLibrary

* Update Compile method, fix allocator and some other bugs

* Update DNNLibrary

* Implement CopyTensor

* Not delete state explicitly since it is managed by unique_ptr

* Add the missing files when SingleUnitTestProjct is ON

* misc changes

* Fix wrong name in provider factory

* Add my own test

* Update the code of add node into graph, and add the missing initializer into graph

* Fix the bug that re-build the graph produces extra output

* Update DNNLibrary

* Transpose nchw (ONNX) -> nhwc (NNAPI)

* Add license

* Add GetSupportedNodes method (implement it later)

* Rename onnxruntime_nnapi_test->onnxruntime_nnapi_squeezenet_test

* Update squeezenet_test.cpp after rebase master

* Remove squeezenet_test.cpp since it is almost same with the c++ sample

* Update DNNLibrary for GetSupportedNodes

* Update GetSupportedNodes

* Revert "Remove squeezenet_test.cpp since it is almost same with the c++ sample"

This reverts commit a97575fd9ff49e50ba1dc8d8154790d8cd86c48d.

* Update DNNLibrary

* Fix multiple outputs bug

* Remove GetKernelRegistry

* Revert "Revert "Remove squeezenet_test.cpp since it is almost same with the c++ sample""

This reverts commit 2a0670e9cbf10ea654111ce39e198a4be0ddd838.

* Set default memory type of NNAPI EP

* Add CPUOutput allocator

* Update DNNLibrary for multiple outputs

* Fix bug of nhwc->nchw

* Remove GetExecutionHandle()
2019-07-02 06:03:29 -07:00
xkszltl
98ea675e40 Fix typo: op[s]iops -> op[t]ions. (#1329)
Resolve https://github.com/microsoft/onnxruntime/issues/1322
2019-07-01 21:25:38 -07:00
Faith Xu
f67a1629fc Documentation reorganization (#1143)
* Update Versioning.md

* Update Versioning.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update BUILD.md

* Update HighLevelDesign.md

* Update Versioning.md

* Update README.md

* Update tool compat table

* typo

* Updates based on feedback

* Update template to include model

* Updates based on feedback

* Typos
2019-07-01 17:11:50 -07:00
S. Manohar Karlapalem
5af2aec3a2 [OpenVINO-EP] use logging infrastructure to display EP log messages (#1289)
* Add logging messages

Implements logging messages at INFO, WARNING and ERROR levels.
Utilizes ONNX-RT's logging infrastructure.
Reversed IsGraphSupported check logic to facilitate logging.

* Add MO exception text to WARNING log message
2019-07-02 08:28:20 +10:00
Changming Sun
28759e2f6f Uninstall the preinstalled cmake in tensorrt image because it's too old (#1316) 2019-07-01 15:08:01 -07:00
Hariharan Seshadri
a077ac8df5
Support non-default negative axis value and intuitive data type combination for OneHot op (#1317)
* Handle nondefault negative axis value

* Support more intuitive data types for this op
2019-07-01 14:29:33 -07:00
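As a rough illustration of what handling a negative axis means for OneHot (#1317 above), the sketch below encodes a 1-D list of indices. It is a simplified stand-in, not the operator's implementation: it assumes a rank-1 input, and the function name and defaults are hypothetical.

```python
def one_hot(indices, depth, axis=-1, on_value=1, off_value=0):
    """One-hot encode a 1-D list of indices (illustrative sketch).

    The output has rank 2, so valid axis values are -2, -1, 0 and 1.
    """
    out_rank = 2  # input rank (1) plus the new one-hot dimension
    if not -out_rank <= axis < out_rank:
        raise ValueError(f"axis {axis} out of range for output rank {out_rank}")
    if axis < 0:
        axis += out_rank  # -1 becomes 1, -2 becomes 0

    # Build with the one-hot dimension last, then transpose if it comes first.
    rows = [[on_value if d == idx else off_value for d in range(depth)]
            for idx in indices]
    if axis == 0:
        rows = [list(col) for col in zip(*rows)]
    return rows
```

With this normalization, `axis=-2` and `axis=0` produce the same layout, as do `axis=-1` and `axis=1`.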
Ashwini Khade
2698edbc98
enable tests (#1310) 2019-06-28 12:04:14 -07:00
Dmitri Smirnov
2f698bd54b
Fix NMS const_cast that modified kernel state creating (#1303)
* Fix NMS const_cast that modified kernel state creating
  thread safety issue. Re-factor for future CUDA implementation.
2019-06-28 09:41:17 -07:00
Matthieu Darbois
04d581995d Use manylinux2010 image to build linux python wheels (#1282)
* Update cuda for python wheels

* Update cuda for python wheels

* Update cuda for python wheels

* Update azure-pipelines-py-packaging.yml

* Update to cuda 10

* Only test win gpu

* Update cuda for python wheels

* Use manylinux2010 image to build linux python wheels

Allow wheels built to truly be compliant with a manylinux policy
2019-06-27 15:45:06 -07:00
Scott McKay
0951f53c80 Update ONNX to d94f99d21a9a0820d58966410ceaf525132f85f1 to pick up change to checker that makes ssd_mobilenet model load 20x faster by avoiding unnecessary copies. (#1307) 2019-06-27 08:39:41 -07:00
jignparm
59de37af1f
Add CUDA Expand operator (#1292)
* Add CUDA expand operator

* Reset counter variables when striding

* Reset counter variables when striding

* use fast_divmod and other PR comments

* Fix merge variable rename

* Fix indentation per PR comment

* Remove maxpool_argmax

* Reduce number of type templates for Expand operator

* removed all types

* Commit updated cuda_execution_provider.cc
2019-06-27 02:31:56 -07:00
ybrnathan
a79ab5ec5b
Add document for ONNX Runtime latency profiling and JSON file viewing. (#1301) 2019-06-26 21:58:10 -07:00
Pranav Sharma
b8d370029f
Check that specific inputs are constants in Conv(Add|Mul|BN)Fusion rules (#1270)
* Check for non-existent initializers while fusing conv and add.

* Fix other places where initializer can be null

* Add check if initializer is an input

* update the models to comply with the new ONNX spec.

In new ONNX spec, the initializers should not be in inputs.

* Fix previous temporary code

* Add negative test

* Revert changes to conv_bn_fusion and conv_mul_fusion

* making helper IsNodeArgConstant a little more general; updating remaining Conv*Fusion rules

* minor comment

* AllNodeInputsAreConstant to use new function
2019-06-26 15:42:05 -07:00
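The constant check described in #1270 above can be sketched as follows. This is a Python analogue written for illustration, not the C++ helper from the PR. The function names, the set-based graph representation, and the choice of which Conv inputs must be constant are all assumptions.

```python
def is_node_arg_constant(name, initializers, graph_inputs):
    """True if `name` has an initializer and is not also a graph input.

    An initializer that is listed as a graph input could be overridden by
    the caller at run time, so it is not treated as constant here.
    """
    return name in initializers and name not in graph_inputs


def can_fuse_conv_add(conv_inputs, add_operand, initializers, graph_inputs):
    """Allow fusion only when every relevant input is constant (sketch).

    conv_inputs: names of the Conv weight and optional bias (None if absent).
    add_operand: name of the Add input that does not come from the Conv.
    """
    names = [n for n in conv_inputs if n] + [add_operand]
    return all(is_node_arg_constant(n, initializers, graph_inputs) for n in names)
```

A missing initializer and an initializer that doubles as a graph input are both rejected, which covers the two failure cases the commit bullets mention.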
jignparm
089b1ef3bb Enable max/average pooling onnx_test_runner tests (#1129)
* Enable max/average pooling tests

* minor edit

* comment out maxpool_with_argmax
2019-06-26 13:45:35 -07:00
Ashwini Khade
05a222a961 enable quantization tests (#1293) 2019-06-26 10:08:19 -07:00
Tracy Sharpe
3ebad81abc
MLAS: NCHWc low-level changes (#1283)
Implementation of the MLAS changes for NCHWc convolution/pooling support. These changes adopt the blocking format used by MKL-DNN and other convolution libraries for better performance.
2019-06-25 16:57:30 -07:00
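To make the blocked layout from #1283 above concrete, here is a small sketch of reordering a flat NCHW buffer into NCHWc. It is not the MLAS code. The block size is a free parameter here, and padding the last channel block with zeros is an assumption of this sketch.

```python
def nchwc_offset(n, c, h, w, C, H, W, block):
    """Flat offset of element (n, c, h, w) in an NCHWc buffer (sketch)."""
    c_blocks = (C + block - 1) // block   # channel blocks, last one may be padded
    cb, ci = divmod(c, block)             # block index and position inside it
    return (((n * c_blocks + cb) * H + h) * W + w) * block + ci


def reorder_nchw_to_nchwc(data, N, C, H, W, block):
    """Copy a flat NCHW list into a zero-padded flat NCHWc list."""
    c_blocks = (C + block - 1) // block
    out = [0] * (N * c_blocks * H * W * block)
    for n in range(N):
        for c in range(C):
            for h in range(H):
                for w in range(W):
                    src = ((n * C + c) * H + h) * W + w
                    out[nchwc_offset(n, c, h, w, C, H, W, block)] = data[src]
    return out
```

The point of the layout is that the `block` channels of one pixel sit next to each other in memory, so a vector load can fetch them together.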
Scott McKay
a462328d9d
Handle case where the Loop 'M' and 'cond' inputs can be considered scalars but the rank doesn't match the subgraph. Use the subgraph rank when creating the MLValue instance for the subgraph input. (#1285) 2019-06-26 09:33:11 +10:00
RandySheriffH
c0cf2213bc
Parallelize TreeEnsembleClassifier batch prediction (#1276)
* use openmp for loop

* Fix windows compile err

* fix windows com err
2019-06-25 12:13:05 -07:00
jignparm
a56b294428
Activate compliance tasks for private builds, and also set a daily scheduler (#1280)
* add compliance and build schedules

* cpu-esrp-pipeline.yml

* update schedule time for testing

* add schedules to all pipelines
2019-06-25 11:13:55 -07:00
Hariharan Seshadri
a542769b50 Fix misleading notion that Flatten op is not supported for opset 10 (#1107)
Use a more standard way of defining the valid opset range for Flatten-1 and Flatten-9 for both CPU and CUDA implementations.
2019-06-25 19:37:11 +10:00
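The idea behind the opset ranges in #1107 above can be shown with a toy lookup table. This is not ONNX Runtime's kernel registration mechanism. The table layout and `find_kernel` are hypothetical. The sketch assumes the older version covers opsets 1 through 8 and the newer one is open-ended.

```python
KERNELS = [
    # (op_type, first_opset, last_opset or None for open-ended, kernel name)
    ("Flatten", 1, 8, "Flatten-1"),
    ("Flatten", 9, None, "Flatten-9"),
]


def find_kernel(op_type, opset, kernels=KERNELS):
    """Return the kernel whose opset range contains `opset`, else None."""
    for name, first, last, kernel in kernels:
        if name == op_type and first <= opset and (last is None or opset <= last):
            return kernel
    return None
```

With an open-ended upper bound on the newest entry, a model at opset 10 still resolves to the Flatten-9 kernel instead of appearing unsupported.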
Scott McKay
86dc3b4360
Fix bug in the transformer that removes unnecessary Cast nodes where it was re-processing removed nodes leading to multiple calls to RemoveNode for the same node. (#1291)
Description:
The remove duplicate Cast logic was processing a node already removed, leading to multiple calls to remove the same node causing an error. Add a check so that nodes marked for removal are skipped.

Motivation and Context
If a model has 3 Cast nodes in a row the bug would cause an exception to be thrown due to multiple calls to remove the same node. This causes the latest optimized tf2onnx conversion of ssd_mobilenet to break.
2019-06-25 15:17:08 +10:00
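The bookkeeping problem fixed in #1291 above can be reproduced in a few lines. The sketch models a linear chain of op types and treats any Cast that directly follows another Cast as removable. That is a deliberate simplification: the real transformer applies its own conditions, and the surviving Cast would need to take over the final target type. The function and its `skip_removed` flag are hypothetical.

```python
def remove_redundant_casts(ops, skip_removed=True):
    """Return the indices of nodes that survive Cast de-duplication (sketch).

    ops: list of op-type strings forming a linear chain.
    skip_removed: when False, mimics the bug by re-processing removed nodes.
    """
    removed = set()

    def remove_node(i):
        if i in removed:
            raise RuntimeError(f"node {i} removed twice")
        removed.add(i)

    for i, op in enumerate(ops):
        if skip_removed and i in removed:
            continue  # the added check: skip nodes already marked for removal
        if op != "Cast":
            continue
        j = i + 1
        while j < len(ops) and ops[j] == "Cast":
            remove_node(j)
            j += 1
    return [i for i in range(len(ops)) if i not in removed]
```

With two consecutive Casts the missing check goes unnoticed. With three, the second Cast is visited again after its removal and tries to remove the third one a second time, which matches the failure described in the commit message.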
David Fan
c9d83a52a8
Implement contrib op CropAndResize (#1277)
* Implement contrib op CropAndResize

* Implement contrib op CropAndResize
2019-06-24 18:34:35 -07:00
Ashwini Khade
06642dbbac
selectively enable quantization tests for ngraph (#1290)
* selectively enable quantization tests for ngraph

* completely disable qlinearconv test for ngraph
2019-06-24 18:20:05 -07:00
Ashwini Khade
a571ea74a6 update onnx (#1287) 2019-06-24 14:17:27 -07:00
Ashwini Khade
01715c0ff1
update doc "How_To_Update_ONNX_Dev_Notes" (#1288)
* update documentation to match current code

* plus some wording changes
2019-06-24 12:59:30 -07:00
Ashwini Khade
92dc5c506d
move all contrib ops to contrib ops namespace (#1190)
* move all contrib ops to one place

* namespace changes

* bug fix - remove redundant file after merge master

* plus more minor bug fixes

* bug fix

* fix extra space in include header + namespace fix

* fix linux build failure:

* fix test group names

* remove redundant test
2019-06-24 10:19:01 -07:00
Scott McKay
12d70abd29
Propagate errors from parallel executor (#1262)
* Capture and return any errors from parallel execution so the behavior is equivalent to when the sequential executor is used.
2019-06-24 19:21:11 +10:00
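A minimal Python sketch of the behavior described in #1262 above: errors raised by tasks running in parallel are captured and returned so that the caller sees the same result as with sequential execution. This uses the standard `concurrent.futures` module and is unrelated to the executor's actual C++ code. The `(ok, message)` return convention is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def run_sequential(tasks):
    """Run tasks in order and stop at the first failure."""
    for task in tasks:
        try:
            task()
        except Exception as exc:
            return (False, str(exc))
    return (True, "")


def run_parallel(tasks, max_workers=4):
    """Run tasks on a thread pool and report the first failure, if any."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        for future in futures:          # checked in submission order
            exc = future.exception()    # blocks until the task finishes
            if exc is not None:
                return (False, str(exc))
    return (True, "")
```

Checking the futures in submission order keeps the reported error deterministic even though the tasks themselves may finish in any order.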
RandySheriffH
671c15a56a
Treat attribute warning as non-error on cross compiling ARM (#1261)
* abandon attribute error on cross compiling

* install dep lib
2019-06-23 17:59:38 -07:00
Yufeng Li
2bfbcd323a
fix OrtValue Release condition (#1182)
CUDAFence::CanRelease should also check write_event_
2019-06-22 11:29:23 -07:00
Pranav Sharma
204bd38d6a
Add ability to set graph optimization level in onnx_test_runner. (#1275) 2019-06-21 18:36:40 -07:00
Ryan Hill
c8db2d507e
Actually add the C++ headers to the nuget packages (#1267) 2019-06-21 14:36:50 -07:00
Scott McKay
4d765dc6d0
Return error message from status instead of swallowing it. (#1221)
* Return error message from status instead of swallowing it.

* Return OrtValue* from OpKernelContext::GetOrCreateOutputMLValue

* Add unstaged change.
2019-06-22 06:26:42 +10:00
ybrnathan
18b7d2b18a
Add document of ONNXRuntime performance tuning (#1266)
* Add document of ONNXRuntime performance tuning

* Clarify MKL-ML
2019-06-21 10:38:22 -07:00
Vinitra Swamy
3b71701f91
Update onnxruntime server docker file with ONNXRUNTIME_VERSION in cmake files (#1259) 2019-06-21 10:28:06 -07:00
Changming Sun
3275e44c62
Change the memory alignment for default cpu provider (#1269) 2019-06-20 20:19:11 -07:00
Changming Sun
eb833057d2
Update Model_Test.md (#1264) 2019-06-20 19:51:54 -07:00
jignparm
d3e5474c1d
Refactor CI pipelines - add GPU NuGet pipelines and ESRP code signing steps (#1247)
* Simplify linux gpu pipeline

* Refactor win-gpu-ci-pipeline.yml

* Set cuda environment variables for testing and version

* Remove variables from starter script

* minor fix

* Add GPU Nuget pipeline

* Set DisableContribOps environment variable for Linux package tests

* Add ESRP tasks

* Add ESRP signing templates

* Test out hardcode value of ESRP

* Test out hardcode value of ESRP

* Test out hardcode value of ESRP

* Test out hardcode value of ESRP

* test variable expansion

* test variable expansion

* test variable expansion

* test variable expansion

* test variable expansion

* test variable expansion

* test out variable expansion

* test variable expansion

* test variable expansion

* test variable expansion

* test variable expansion

* test variable expansion

* test variable expansion

* test variable expansion

* test variable expansion

* update cpu pipeline to conditionally esrp sign

* Set C# GPU tests to run only if env var is set

* Refactor for easy parameter passing

* refactored esrp templates

* remove variables from template

* Add packaging variables back to pipelines

* update C# for cuda 10

* Merge vars and parameters for gpu pipeline

* remove vars from mklml pipeline

* display envvars on terminal

* Clean up C# cuda tests, and upgrade to Cuda10

* Introduce CUDNN_PATH pipeline variable

* YAML variables are always uppercased (not true with classic)

* Update C# GPU test to be more meaningful

* remove macos from gpu tests

* remove debugging info for DisableContribOps option

* Remove DisableContrib ops parameters -- use variables only

* Fix typo from = to -

* remove debug steps

* fix typo

* remove unused variable TESTONGPU from some templates

* clean up CUDA env setup scripts

* Remove CUDNN_PATH from setup_env_cuda.bat
2019-06-20 19:41:30 -07:00
Changming Sun
766c6b6163
Add an API to retrieve the ORT version (#1263)
* Add an API to retrieve the ORT version
2019-06-20 15:42:12 -07:00
Ashwini Khade
5b87f07d80
enable matmul tests (#1255)
* enable matmul and gemm tests

* correction
2019-06-20 10:32:15 -07:00
Jorgen Thelin
ba25ea3643 Allow building Docker container based on a different git repo. (#1222)
- Introduce Docker build ARG `ONNXRUNTIME_REPO`
  to allow building Docker container based on a different git repo.

Example docker build command:

```bash
cd dockerfiles
docker build -t onnx-runtime \
  --build-arg ONNXRUNTIME_REPO=https://github.com/jthelin/onnxruntime \
  --build-arg ONNXRUNTIME_SERVER_BRANCH=my-branch \
  -f Dockerfile.server .
```

- Add a basic `.dockerignore` file, to cut down the number of files passed into the Docker build context.
2019-06-20 09:55:42 -07:00
KeDengMS
df68111b98
Fix a bug that fused func manager in subgraph session state is nullptr (#1251)
Description: Fixes the fused func manager being nullptr when a fused function runs inside a subgraph session state.

Motivation and Context

The bug occurs when running fused functions created by IExecutionProvider::Compile inside a subgraph (e.g. Scan), and it causes a crash.
The problem is that FuncInfo is collected into the main graph's session state before the subgraph session state is created.
The fix is to share FuncInfo between the main graph and the subgraph.
2019-06-19 14:37:21 -07:00
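The sharing fix described in #1251 above can be pictured with a small sketch in which a subgraph state receives its parent's function manager instead of starting with none. The class and method names are hypothetical Python stand-ins, not the C++ types from the PR.

```python
class FusedFuncManager:
    """Holds compiled fused functions by name (illustrative stand-in)."""

    def __init__(self):
        self._funcs = {}

    def add(self, name, func):
        self._funcs[name] = func

    def get(self, name):
        return self._funcs[name]


class SessionState:
    def __init__(self, func_manager=None):
        # A subgraph state reuses the parent's manager rather than its own.
        self.func_manager = func_manager if func_manager is not None else FusedFuncManager()
        self.subgraph_states = []

    def create_subgraph_state(self):
        child = SessionState(func_manager=self.func_manager)
        self.subgraph_states.append(child)
        return child
```

Functions registered on the main state before the subgraph state exists are still visible from the subgraph, because both hold the same manager object.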
Zhang Lei
23838d9c2a Add enable/disable mem pattern api for python and csharp. (#1227) 2019-06-19 11:17:21 -07:00
PhaniShekhar
e26e11b9f7 Quantization tool to support quantization of Conv and MatMul nodes. (#1057)
* Move quantization tool from onnx to onnxruntime

* Fix some issues

* Use u8_s8 for asymmetric mode and u8_u8 for symmetric mode irrespective of whether inputs are initializers or from previous

* Address PR comments

* Fix error message formatting

* Separate static/dynamic and quantization mode
2019-06-18 20:44:45 -07:00
Changming Sun
051ee681a3
Add a few suggestions to coding guideline (#1238)
* Add a few suggestions to coding guideline
2019-06-18 19:28:21 -07:00
Raymond Yang
c96049fe4a Update ONNX version to include new fixes/changes (#1250) 2019-06-18 14:39:36 -07:00
Scott McKay
6477d4e756
Return better output shape for Loop with zero iterations (#1233)
* Attempt to provide the correct rank for an output from a Loop node when there are no iterations.

For a loop output (vs. loop carried dependency) the first dimension is the iteration count so will have a value of 0 and the output size will be zero. Use the rank of the matching subgraph output if available.

If the subgraph output rank is not available, output a warning and use a rank 1 shape of {0}.
2019-06-19 07:31:13 +10:00
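The shape rule described in #1233 above can be sketched as a small function. It is an illustration only: the function name is hypothetical, `None` stands for an unknown rank or dimension, and filling unknown dimensions with 0 is an assumption of this sketch, not something the commit message states.

```python
import warnings


def loop_output_shape_zero_iterations(subgraph_output_shape):
    """Shape of a Loop output when the loop runs zero times (sketch).

    subgraph_output_shape: list of dims for the matching subgraph output
    (None for an unknown dim), or None if the rank is unavailable.
    """
    if subgraph_output_shape is None:
        warnings.warn("subgraph output rank unavailable; using shape [0]")
        return [0]
    # The leading dimension is the iteration count, which is 0 here.
    # Assumption: unknown per-iteration dims are reported as 0.
    return [0] + [d if d is not None else 0 for d in subgraph_output_shape]
```

In both branches the first dimension is 0, so the output holds no elements. The difference is only whether downstream consumers see the expected rank.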