Commit graph

1227 commits

Author SHA1 Message Date
Changming Sun
bd5451b4ed
Don't define USE_OPENMP if the compiler doesn't support OpenMP (#1836) 2019-09-13 16:42:50 -07:00
shahasad
aac6021549
Add NuGet feed publish to nuget pipeline (#1833) 2019-09-13 15:27:35 -07:00
Vinitra Swamy
2a21df2309
Updates to CUDA and TensorRT dockerfiles for v0.5.0 (#1731)
* updates to cuda and tensorrt dockerfiles for v0.5.0

* add table of build tags
2019-09-13 14:16:47 -07:00
jywu-msft
634f554471
python api's for execution provider registration (#1826)
* add python api's for get/set execution providers, checking available and all providers.

* add back deleted code.

* minimize peak memory consumption for sess.set_providers() api. need to remove references to underlying _sess object

* fix typo.

* add validation for set_providers(), address other review feedback.
2019-09-13 11:25:18 -07:00
KeDengMS
cf22ea6893
Upgrade TVM for a fix in tvm::Integer with int64_t input (#1824)
* Upgrade TVM for a fix in tvm::Integer with int64_t input
2019-09-12 23:15:13 -07:00
Faith Xu
a60283845b
Update link format and example sections in readme (#1729)
* Fix broken link and minor wording updates

* Update links to use relative paths

* Update sample section organization

* Fix a few more links

* Update links to relative paths

* Fix link urls

* Update links to relative paths

* Update link to perf test doc page

* Update links to relative paths

* Update to relative paths for links

* Update link
2019-09-12 17:49:29 -07:00
Hector Li
a0ba25f98f
Fix allocation failures on multiple CUDA devices. Some places hard-coded the allocator to always use device 0. (#1815)
Fix allocation failures on multiple CUDA devices. Some places hard-coded the allocator to always use device 0.
2019-09-12 15:31:59 -07:00
KeDengMS
18f7377269
[Nuphar] improvements in symbolic_shape_infer and model editor (#1787)
* use unaligned buffer for Nuphar in onnxruntime_perf_test to avoid crash
symbolic shape inference fixes to support more sophisticated models
remove useless code from model_quantizer
* Run symbolic shape inference for subgraphs in Loop/Scan
* Allow a symbolic dimension to merge with an int dimension
2019-09-12 15:11:52 -07:00
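The "allow a symbolic dimension to merge with an int dimension" item in the Nuphar commit above (18f7377269) can be pictured with a small sketch. This is only an illustration of the general idea, not the code in symbolic_shape_infer: the helper names `merge_dims` and `merge_shapes` are hypothetical, and the rule that a known int wins over a symbolic name is an assumption made for this example.

```python
def merge_dims(a, b):
    """Merge two dimension values that are expected to describe the same axis.

    Hypothetical helper for illustration only. A dimension is either an int
    (known size) or a str (symbolic name such as "batch").
    """
    a_known = isinstance(a, int)
    b_known = isinstance(b, int)
    if a_known and b_known:
        if a != b:
            raise ValueError(f"conflicting dimensions: {a} vs {b}")
        return a
    if a_known:
        return a  # assumption: the concrete int replaces the symbol
    if b_known:
        return b
    return a  # both symbolic: keep the first name (arbitrary choice)


def merge_shapes(shape_a, shape_b):
    """Merge two shapes of equal rank, dimension by dimension."""
    if len(shape_a) != len(shape_b):
        raise ValueError("rank mismatch")
    return [merge_dims(a, b) for a, b in zip(shape_a, shape_b)]
```

For example, `merge_shapes(["batch", 128], [4, 128])` resolves the symbolic `"batch"` axis to the known size 4 under the rule assumed here.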
Bowen Bao
8712a523a4
Bump onnx to latest (#1756)
* Bump onnx to latest

Update onnx.in.proto with changes for SparseTensor.

* add temp skip tests

* remove passed tests from skip list

* skip more tests for new ops in opset 11

* skip crashing tests

* update handling of new attribute types sparse tensor and sparse tensors

* advance onnx commit and remove skip cpu_flaky_tests

* temporarily skip yolo3 model test due to resize opset10 shape inference regression

* update proto for onnxruntime server

* advance onnx commit further
2019-09-12 11:46:49 -07:00
Pranav Sharma
f8c3442880
Part 2 of renaming AllocatorInfo to MemoryInfo. (#1804)
* Mention OrtCreateSessionFromArray in C API doc

* Part 2 of renaming AllocatorInfo to MemoryInfo.

* pr comments

* fix comment
2019-09-12 08:19:29 -07:00
ybrnathan
397713a2d1
Add Int64 support to ReduceMin, and Add Double support to Neg op (#1812) 2019-09-11 17:15:02 -07:00
Hector Li
2b8677b210
Enable Openvino nightly build on edge device (#1684)
1. Add an OpenVINO GPU nightly build pipeline. The test runs on an Intel UP Squared edge device, which is hosted locally rather than on an Azure VM. A smaller set of model test data is kept on the edge device.

2. Update the build condition for openvino GPU so it works for GPU_FP32, GPU_FP16

3. Add an option to install_ubuntu.sh to exclude the packages used for Nuphar, to save disk space, since edge devices usually have limited disk space.
2019-09-11 16:36:12 -07:00
Dmitri Smirnov
fe8915863c
Implement C API entry points for creating and fetching non-standard types to OrtValue (#1714)
C/C++ Opaque APIs
 Add new virtual interfaces for NonTensorType
 Implement entry points.
 Add shared header for the data container.
 Add export symbols.
 Add serialization/deserialization.
 Implement model with Opaque types.
 Rework opqaue_api_test as a standalone executable.
2019-09-11 14:52:47 -07:00
Chi Lo
d9fa632863
Add Cuda Kernel for Not operator (#1801)
* Add Cuda Kernel for Not operator
* Register Not CUDA Kernel
2019-09-11 14:30:44 -07:00
Dmitri Smirnov
a9e4de2cea
Follow up on proto3 compatibility. (#1799)
This provides additional has_*() methods abstraction/replacement for proto3 compatibility.
2019-09-11 11:36:13 -07:00
Scott McKay
3b7f047a49
General performance testing tooling improvements (#1577)
* Miscellaneous updates to help with perf testing
2019-09-11 19:46:59 +10:00
Scott McKay
6586afc8eb
Refine the output shape calculation to avoid unnecessary re-allocations and vector insert operations. (#1781) 2019-09-11 14:31:53 +10:00
Scott McKay
35c5c4d418
A subgraph may have no inputs (e.g. subgraph in If has no explicit inputs) or value infos (e.g. a subgraph with just an If node in it). (#1083)
It should always have outputs but in case it doesn't (nothing fails currently if it doesn't even though that makes it meaningless) make sure it also has a node.
2019-09-11 11:03:55 +10:00
Hariharan Seshadri
206278ca44
Fix error message in Cast op (#1792) 2019-09-10 15:40:53 -07:00
Pranav Sharma
f9d85d654a
Add GetDataTransfer() interface in the EP. (#1773)
* Mention OrtCreateSessionFromArray in C API doc

* Add GetDataTransfer() interface in the EP.

* Check return status of RegisterDataTransfer

* Address PR comments
2019-09-10 14:07:17 -07:00
ybrnathan
bd48660592
Add Cuda Kernel for Less operator (#1790)
* Add Cuda Kernel for operator Less

* Register Less CUDA Kernel
2019-09-10 11:33:57 -07:00
Ran Cohen
b32f24a3f9
added support for Less(double) (#1722) 2019-09-10 11:15:01 -07:00
Pranav Sharma
0b609d3e68
Add make_unique implementation for use with C++11. (#1793)
* Mention OrtCreateSessionFromArray in C API doc

* Add make_unique implementation for use with C++11

* Add cgmanifest and TPN files as well

* Add annotation to cgmanifest to identify the component that uses the dependency
2019-09-09 23:55:44 -07:00
Scott McKay
98dbdb1e0b
Rework the feed/fetch copy setup so that it can be calculated prior to subgraph execution (#1761)
* Rework the feed/fetch copy setup so that it can be calculated upfront by the control flow nodes. Also simplifies how it all works.
Update the control flow nodes to do the calculation prior to graph execution.
2019-09-10 15:46:00 +10:00
Scott McKay
2e242a4089
Clarify naming of the API involving the RunOptions terminate flag. (#1768)
* Clarify naming of the RunOptions terminate flag.

* Update C# code to use new names.
2019-09-10 08:32:33 +10:00
Dmitri Smirnov
75f241d02c
Enhance compatibility with proto3 and replace or abstract has_*() methods. (#1778)
Enhance proto3 compatibility.
  Replace has_*() method to corresponding enum handling so we can deal with
  proto3 generated stream from proto2 code.
  Add utility wrappers for remaining has_*() methods so we can
  easily deal with them if/when we switch to proto3.
2019-09-09 14:07:30 -07:00
shahasad
6a5b11756b
Conditionally export execution provider apis in csharp (#1724) 2019-09-09 11:17:44 -07:00
Tracy Sharpe
071a0c2522
MLAS: MlasSgemm refactoring (#1749)
Refactor the SGEMM kernels to resynchronize the code between Windows/Linux and remove unneeded binary bloat from a different zero/add mode kernel. Another goal is to get to a cleaner state for then doing a DGEMM kernel.
2019-09-06 22:26:28 -07:00
Tracy Sharpe
a324ad7b96
MLAS: clang u8u8 GEMM fix 2019-09-06 09:11:10 -07:00
Ashwini Khade
b2a2326a45
add dequantize and quantize back to contrib ops (#1712) 2019-09-06 08:55:42 -07:00
Scott McKay
e1a12b1760
Fix some unnecessary copies of the Node attributes (#1763) 2019-09-06 17:00:35 +10:00
Pranav Sharma
52fe574fed
Rename OrtAllocatorInfo to OrtMemoryInfo to make it more obvious. (#1758)
* Mention OrtCreateSessionFromArray in C API doc

* Rename OrtAllocatorInfo to OrtMemoryInfo to avoid confusion
2019-09-05 14:20:37 -07:00
KeDengMS
58fe5a6bf1
Enable Nuphar docker build, and reinstate Nuphar tests (#1757)
Enable Nuphar EP docker build
Revert back to LLVM 6.0.1
Reinstate disabled Softmax tests caused by LLVM 8.0.1
Reinstate Nuphar Python test due to stale sympy version
Increase build timeout of Linux CI
2019-09-05 08:50:48 -07:00
Yang Chen
eddb9d78f9
fixed "unreachable code" warnings on Windows (#1755)
When NUPHAR_USE_MKL or NUPHAR_USE_AVX2 is not defined, we got
"unreachable code" warnings on Windows, which were turned into
errors and broke the build.
2019-09-04 20:30:47 -07:00
Pranav Sharma
7c5b3a5ecc
Update coding guidelines to prefer using make_unique for heap allocations (unless where not possible). (#1730)
* Mention OrtCreateSessionFromArray in C API doc

* Fix perf test executable due to removal of certain C APIs

* fix linux build

* Avoid duplication

* Update coding guidelines to prefer using make_unique for heap allocations (unless where not possible).
2019-09-04 19:16:16 -07:00
manashgoswami
3d44c55092
Updated docs related to base images (#1753)
* Update README.md

* Update onnx-inference-byoc-gpu-cpu-aks.ipynb

* Update README.md
2019-09-04 10:33:41 -07:00
Tomasz Dołbniak
4ed8d4b30e
Put the initializers at the end of the cluster inputs list (#1751)
Restore the missing variable
2019-09-03 15:09:37 -07:00
suryasidd
9523977cc2
Added emotion ferplus support (#1752)
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
2019-09-03 15:01:22 -07:00
Changming Sun
94d9161166
Add nuphar to Linux CI build (#1750) 2019-09-03 11:39:27 -07:00
Ashwini Khade
0f6cf9a335
enable quantizing specific nodes (#1742) 2019-09-03 11:04:17 -07:00
Pranav Sharma
ad7ab3d880
Enforce shape validation. (#1716)
* Mention OrtCreateSessionFromArray in C API doc

* Enforce shape validation.

* Update broken models
2019-09-02 20:00:37 -07:00
KeDengMS
c9240f4e93
Implementation of Nuphar execution provider (#881)
* Implement Nuphar execution provider

Nuphar execution provider is a TVM-based compilation provider. It has shown great speedups for RNN models using Scan.
This PR is mainly for a preview of the shared codegen library for other TVM-based providers.

* Fix submodules

* Fix TVM submodule

* Update Nuphar to latest and resolve conflicts

* Remove stale files caused by merge -X theirs

* Revert heap buffer change to not introduce onnxruntime_framework into onnxruntime_perf_test

* Fix bad merge

* Merge from Nuphar

* Fix warning treated as error, revert some unnecessary changes

* Revert some more test changes

* Some more test revert or comments to make review easier
New tests could be added later

* One more revert of unnecessary changes

* More change revert. Test could be added back later.
2019-09-01 23:01:47 -07:00
Sreekanth Yalachigere
f4a6d267c1
MKL-DNN EP: control flow fix (#1740)
* moved subgraph_index to MklDnn Execution Provider

* code cleanup
2019-08-31 09:58:59 -07:00
Takeshi Watanabe
259863758e
Fix typo in NMS code
Fix typo in NMS code
2019-08-30 22:37:36 -07:00
Hector Li
dc9c89546d
Update the docker file for OpenVINO (#1741)
Update the docker file for OpenVINO which is used for AML
2019-08-30 22:32:24 -07:00
shahasad
833e18345d
Publish perf tool with nightly build (#1728) 2019-08-30 11:25:55 -07:00
Hector Li
810ee0068f
Fix an issue where the CUDA EP falls back too many nodes to CPU in some cases, causing huge data copies. If a node's inputs are all initializers, the node shouldn't fall back to CPU. (#1727)
Fix an issue where the CUDA EP falls back too many nodes to CPU in some cases, causing huge data copies.
https://github.com/microsoft/onnxruntime/issues/1675

Currently, if a node's inputs are all initializers, the CUDA EP falls it back to CPU, along with some of the nodes below it. This can cause huge data copies. In the case reported by a user, the model has several Slice nodes with inputs from initializers and a Concat op that concatenates the Slice outputs. The concatenated data is large (16MB), which makes the copy from CPU to GPU quite costly because it is a synchronous copy.

Fix
If a node's inputs are all initializers, the node shouldn't fall back to CPU.
2019-08-29 13:54:17 -07:00
Pranav Sharma
25d02a33c8
Fix reading of onnx domain causing one of the automl models to break in 0.5 release. (#1694)
* Mention OrtCreateSessionFromArray in C API doc

* Fix registration of Equal op causing one of the automl models to break in 0.5 release.

* updates...
2019-08-29 12:18:39 -07:00
Ashwini Khade
e54904e6a3
add implementation for dynamic quantize linear (#1697) 2019-08-29 11:40:19 -07:00
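For readers unfamiliar with dynamic linear quantization (commit e54904e6a3 above), the following is a minimal pure-Python sketch of the arithmetic, assuming the uint8 scheme described for the ONNX DynamicQuantizeLinear operator: the input range is widened to include zero, the scale is the range divided by 255, and values are rounded half-to-even and saturated to [0, 255]. The function name and the handling of an all-zero input are assumptions for illustration and do not reflect the kernel added by the commit.

```python
def dynamic_quantize_linear(x):
    """Quantize a list of floats to uint8, returning (y, scale, zero_point).

    Illustrative sketch only; not the onnxruntime kernel.
    """
    qmin, qmax = 0, 255
    lo = min(0.0, min(x))  # widen the range so that 0.0 is representable
    hi = max(0.0, max(x))
    scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:
        # Assumption for this sketch: an all-zero input maps to zeros.
        return [0] * len(x), 1.0, 0

    def saturate(v):
        return int(min(qmax, max(qmin, v)))

    # Python's round() rounds halves to the nearest even integer.
    zero_point = saturate(round(qmin - lo / scale))
    y = [saturate(round(v / scale) + zero_point) for v in x]
    return y, scale, zero_point
```

With the input `[0.0, 2.0, -3.0]` the range is 5, so the scale is 5/255 and the zero point is 153; the largest value maps to 255 and the smallest to 0.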
Hariharan Seshadri
4b5b037289
Support 'Bilinear' mode for 2D inputs in Resize and Upsample kernels (#1679)
* Support bilinear mode with actual 2D inputs in Resize and upsample

* Fix build break

* Fix build break

* Add test

* CUDA changes

* Resolve PR comments

* Resolve comments
2019-08-29 11:34:31 -07:00
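As a rough picture of what bilinear resizing of a plain 2D input involves (commit 4b5b037289 above), here is a small pure-Python sketch. It assumes the output coordinate is mapped back to the input as `out_index / scale` and that neighbours past the edge are clamped; the function name `resize_bilinear_2d` is hypothetical and the code is not taken from the Resize or Upsample kernels.

```python
def resize_bilinear_2d(image, scale_h, scale_w):
    """Resize a 2D grid (list of equal-length rows) with bilinear interpolation.

    Illustrative sketch; assumes input coordinate = output index / scale.
    """
    in_h, in_w = len(image), len(image[0])
    out_h, out_w = int(in_h * scale_h), int(in_w * scale_w)
    out = []
    for i in range(out_h):
        y = i / scale_h
        y0 = min(int(y), in_h - 1)
        y1 = min(y0 + 1, in_h - 1)  # clamp at the bottom edge
        dy = y - y0
        row = []
        for j in range(out_w):
            x = j / scale_w
            x0 = min(int(x), in_w - 1)
            x1 = min(x0 + 1, in_w - 1)  # clamp at the right edge
            dx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = image[y0][x0] * (1 - dx) + image[y0][x1] * dx
            bottom = image[y1][x0] * (1 - dx) + image[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out
```

Upsampling `[[1, 2], [3, 4]]` by 2 in both dimensions yields a 4x4 grid whose interior values are averages of their neighbours, while the last row and column repeat because of the edge clamping.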