1021 commits

Author SHA1 Message Date
Sreekanth Yalachigere
f3c74ec3e9 Reduce memory footprint of MKL-DNN EP (#1429)
* MKL-DNN EP memory fix patch

* Call default provider for Opset10

* opset 10 fix

* removed email header from patch

* UseSubgraph method refactored
2019-07-18 22:57:00 -07:00
Yulong Wang
887930e6c2 inference overheads optimizations (#1392)
This change makes optimizations in various places. It consists of part of PR #1240 (with the problematic part removed) and some other trivial fixes.

1. Reduce unnecessary copies when constructing vectors or objects that contain a vector as a member. Use std::move where applicable.
2. Use std::vector<std::reference_wrapper<const TensorShape>> instead of std::vector<TensorShape> when it is only used for constant reference access.
3. Calculate the key BEFORE (instead of AFTER) acquiring the lock in SessionState::GetMemoryPatternGroup.
Other trivial fixes (the code should be straightforward and self-explanatory).
2019-07-18 19:40:48 -07:00
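Item 3 of commit 887930e6c2 (calculate the key before acquiring the lock) can be illustrated with a minimal Python sketch. This is not the ONNX Runtime code: the `PatternCache` class, its `make_key` helper, and the string "pattern" values are hypothetical stand-ins, and the only point shown is that the key computation happens outside the critical section.

```python
import threading


class PatternCache:
    """Hypothetical cache keyed by input shapes (not the ONNX Runtime class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._patterns = {}

    @staticmethod
    def make_key(shapes):
        # Potentially expensive: normalizes every shape into a hashable tuple.
        return tuple(tuple(int(d) for d in shape) for shape in shapes)

    def get(self, shapes):
        key = self.make_key(shapes)   # computed BEFORE taking the lock
        with self._lock:              # critical section is only a dict lookup
            return self._patterns.get(key)

    def put(self, shapes, pattern):
        key = self.make_key(shapes)   # same idea on the write path
        with self._lock:
            self._patterns[key] = pattern


cache = PatternCache()
cache.put([[1, 3, 224, 224]], "pattern-A")
result = cache.get([(1, 3, 224, 224)])   # same shapes, so the same key
missing = cache.get([[1, 3, 32, 32]])    # never stored
```

Because the key depends only on the caller's arguments, computing it first shortens the time other threads spend waiting on the lock without changing the result.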
Tracy Sharpe
07ecd59e8f flatten conv2d when input_width==kernel_width (#1435) 2019-07-18 19:11:45 -07:00
Yuan Yu
c843c393e4 More code cleanup (#1406)
* more cleanup

* More cleanup

* fix build break.

* update
2019-07-18 18:21:16 -07:00
daquexian
bbf64c2c45 Update cgmanifest.json and ThirdPartyNotices.txt for DNNLibrary (#1431) 2019-07-18 14:58:43 -07:00
kile0
7a681fb964 Improve build throughput and enable using the Visual Studio 2019 cmake generator (#1411) 2019-07-18 11:43:11 -07:00
Colin Versteeg
5ee0f185dc Add GRPC support to ONNX Runtime Server (#1144)
* add grpc

* add-submodule

* Revert "add-submodule"

This reverts commit e35994b25035ce310a98909658582bff759ee358.

* fix submodule

* IT BUILDS

* Initial commit of prediction_service_impl.cpp

* Server builds and runs!

* add request id, health and reflection. GRPC is done

* enable channelz for monitoring

* GRPC unit tests

* clang format

* add unit tests

* Add function tests for GRPC

* add grpc to model_zoo_tests

* revert update protobuf to 3.7.0

* update submodules

* builds but runs some gflags tests which fail

* get build working

* confine build changes to onnxruntime_server.cmake

* update build files

* code review comments

* Maik's code review comments

* update cares version to fix compilation issue

* update build to fix c-ares

* code review comments

* update cgmanifest.json

* remove extraneous file

* Klein comments.

* update ci based on discussions for go dependency

* fix tag issue

* fix build issues

* remove stray submodule

* update dockerfile and build script

* dynamic linking changes

* update build script

* code review comments

* update dockerfile

* update script for mount

* code review comments
2019-07-18 11:10:38 -07:00
Yufeng Li
6c41809655 Build Shared Library with cuda 10.1 (#1418)
Change the logic to find the cublas dll.
The name pattern of cublas changed in 10.1: it no longer includes the minor version in its name.
2019-07-18 09:51:19 -07:00
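The naming change described in commit 6c41809655 can be sketched as a small helper. The function `cublas_dll_name` is hypothetical, and the concrete file names (for example `cublas64_100.dll` for CUDA 10.0 and `cublas64_10.dll` for CUDA 10.1) reflect my understanding of the Windows naming rather than anything stated in the commit, so treat them as assumptions to verify against an actual CUDA installation.

```python
def cublas_dll_name(cuda_version):
    """Guess the Windows cuBLAS DLL name for a 'major.minor' CUDA version string."""
    major, minor = (int(part) for part in cuda_version.split(".")[:2])
    if (major, minor) >= (10, 1):
        # Assumed newer pattern: the minor version is no longer part of the name.
        return f"cublas64_{major}.dll"
    # Assumed older pattern: major and minor digits are concatenated.
    return f"cublas64_{major}{minor}.dll"


old_name = cublas_dll_name("10.0")
new_name = cublas_dll_name("10.1")
```

A build script could use a helper like this to decide which file to look for, falling back to a directory scan if the guessed name is absent.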
Yufeng Li
02ded802ab cleanup more useless unique_ptr (#1427) 2019-07-18 09:50:48 -07:00
Hector Li
1ff957f96e CUDNN_RNN_DATA_LAYOUT_SEQ_MAJOR_UNPACKED works with CUDNN_RNN_PADDED_… (#1428)
CUDNN_RNN_DATA_LAYOUT_SEQ_MAJOR_UNPACKED works with CUDNN_RNN_PADDED_IO_ENABLED, so that it will auto fill 0 for the shorter sequences.
2019-07-18 09:17:44 -07:00
Ke Zhang
f720166887 register gpu data transfer only when there's nvidia gpu related eps. (#1420) 2019-07-17 21:12:18 -07:00
Chris Seymour
db61eb4cd7 Update ONNX_Runtime_Perf_Tuning.md (#1378) 2019-07-17 19:14:43 -07:00
Tracy Sharpe
f47f6fd020 Fix MaxPool when using dilation > 1 plus non-zero padding (#1320)
MaxPool with dilation > 1 and padding did not compute the correct start index. Added code to fix and test cases to cover this.
2019-07-17 17:33:29 -07:00
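To make the interaction between dilation and padding in commit f47f6fd020 concrete, here is a reference 1-D max pooling sketch. It is not the ONNX Runtime kernel: `max_pool_1d` is a hypothetical helper that skips taps falling in the padding, and it only shows that the window start must stay at `o * stride - pad` (possibly negative) so that dilated taps land on the right elements.

```python
def max_pool_1d(x, kernel, stride=1, dilation=1, pad=0):
    """Reference 1-D max pooling; taps that fall in the padding are skipped."""
    effective_kernel = dilation * (kernel - 1) + 1
    out_len = (len(x) + 2 * pad - effective_kernel) // stride + 1
    out = []
    for o in range(out_len):
        start = o * stride - pad       # may be negative when pad > 0
        best = float("-inf")
        for k in range(kernel):
            i = start + k * dilation   # step by the dilation from the true start
            if 0 <= i < len(x):        # ignore positions inside the padding
                best = max(best, x[i])
        out.append(best)
    return out


result = max_pool_1d([1, 5, 2, 4, 3], kernel=2, dilation=2, pad=1)
# result == [5, 2, 5, 3, 4]
```

One way such a bug can arise (not necessarily the one fixed in the commit) is clamping a negative start to 0 before stepping by the dilation: for the first output above, that would read indices 0 and 2 and yield 2 instead of 5.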
Changming Sun
fbdd905440 Switch some of the linux pipelines to use the new data download script (#1379) 2019-07-17 16:06:02 -07:00
avidiyal
859a57d781 Updated Dockerfile for OpenvinoEP (#1362)
* Updated Dockerfile for OpenvinoEP

Signed-off-by: avidiyal <akhila.vidiyala@intel.com>

* Changed the license

Signed-off-by: avidiyal <akhila.vidiyala@intel.com>

* resolving conflicts

* Reviews fixed
2019-07-17 14:52:59 -07:00
Yuan Yu
93fb62bb3e More code cleanup (#1405)
* More code cleanup

* More cleanup
2019-07-17 14:45:50 -07:00
Yufeng Li
a7b1a8969c simplify nocontribops-ci and fix build break (#1422)
simplify nocontribops-ci and fix build break
2019-07-17 13:43:40 -07:00
Tracy Sharpe
4383615cf6 implement conv+clip fusion (#1412)
This change implements Conv+Clip activation fusion for FusedConv and NCHWc convolutions. The Clip operation runs in the thread context that is producing the convolution output.
2019-07-17 12:16:45 -07:00
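The idea behind the Conv+Clip fusion in commit 4383615cf6 can be sketched with a toy 1-D convolution. The functions below are hypothetical and far simpler than FusedConv or the NCHWc kernels; they only show that clipping each value as it is produced gives the same result as a separate Clip pass over the finished output.

```python
def conv1d(x, w, bias=0.0):
    """Plain 1-D convolution (no padding, stride 1)."""
    n = len(x) - len(w) + 1
    return [sum(x[i + k] * w[k] for k in range(len(w))) + bias for i in range(n)]


def clip(values, lo, hi):
    """Separate Clip pass over an already computed output."""
    return [min(max(v, lo), hi) for v in values]


def fused_conv1d_clip(x, w, lo, hi, bias=0.0):
    """Convolution with the clip applied as each output value is produced."""
    out = []
    for i in range(len(x) - len(w) + 1):
        acc = bias
        for k in range(len(w)):
            acc += x[i + k] * w[k]
        out.append(min(max(acc, lo), hi))   # clip before the value is stored
    return out


x = [1.0, -2.0, 3.0, 4.0, -5.0]
w = [1.0, 2.0]
unfused = clip(conv1d(x, w), 0.0, 6.0)
fused = fused_conv1d_clip(x, w, 0.0, 6.0)
# both are [0.0, 4.0, 6.0, 0.0]
```

The fused form avoids a second pass over the output buffer, which is the usual motivation for this kind of activation fusion.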
suryasidd
d2cc086bee [OpenVINO EP] Minor bug fixes (#1388)
* Minor bug fixes for accelerators

* Added dimensionality checks for each graph input for GPU

* Disabled some tests for MYRIAD and GPU

* This change is required for running some of the models on
  OpenVINO instead of falling back to default CPU EP

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* PR Feedback

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Fix missing bracket

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
2019-07-17 10:48:54 -07:00
R. G. Esteves
8720fe62e3 Added missing libraries to Windows wheel (#1415) 2019-07-17 05:54:09 -07:00
Changming Sun
c2aa2056b5 Sample for imagenet and batch prediction (#1372)
* Sample for imagenet and batch prediction
(Will add a readme later)
2019-07-16 14:23:45 -07:00
Changming Sun
d38badffdb Disable mklml in Windows Build 2019-07-16 11:09:17 -07:00
Raymond Yang
a203077dcd Relax timeout in CI system (#1394)
* Relax timeout in CI system (temporary)

* Relax timeout on TensorRT pipeline
2019-07-15 15:10:08 -07:00
Scott McKay
07a2466d9f Use INFO instead of WARNING for an unused graph input. (#1235)
* Use INFO instead of WARNING for an unused graph input.

* Drop severity of unused initializer as well

* Update to output a warning level message if removing an initializer that is never used, and an info level message if removing an initializer that optimization has made redundant.
2019-07-15 20:29:30 +10:00
Yang Chen
fa4b956f12 replace onnx:: with ONNX_NAMESPACE:: (#1376)
* replace onnx:: with ONNX_NAMESPACE::

* Fixed issue for building shared libs

* address CR feedback

* address more CR feedback
2019-07-15 01:06:53 -07:00
Scott McKay
61b733ce6d Update optimizers to be able to utilize a constant initializer from an ancestor graph (#1346)
* Now that we check for a constant initializer in an ancestor graph we also need to be able to retrieve and replace that initializer.
Add helpers to do so.
Update optimizers to use the new helpers.
Fix bug in UnsqueezeElimination where it wasn't checking if the initializer it was replacing was constant.
2019-07-15 12:41:01 +10:00
Tracy Sharpe
d4ce31ea6d cleanup fused conv activation handling (#1403)
* cleanup fused conv activation handling

* fix build break

* fix mkldnn build break
2019-07-14 16:34:16 -07:00
Yuan Yu
c139e3ab33 Remove a few useless unique_ptrs (#1401) 2019-07-13 16:15:29 -07:00
Tracy Sharpe
719e58d831 Use MLAS to retrieve the CPU preferred tensor buffer alignment (#1377)
Add MlasGetPreferredBufferAlignment() for use by CPUAllocator::Alloc to get the byte alignment for CPU tensors. Using MLAS allows the value to be based on the platform the binary is running on instead of a constant value fixed at compile time.
2019-07-12 22:22:46 -07:00
Changming Sun
5a6f1c10d6 Add OrtCreateStatus to the symbol list 2019-07-12 15:10:58 -07:00
Ke Zhang
3bf0e364e2 Move CopyTensor out of IExecutionProvider interface. (#1268)
* add ortdevice class

* add data transfer manager for copying tensors.

* update

* add data transfer for gpu

* fix constexpr build break.

* update

* remove unnecessary header files.

* remove unnecessary header files.

* add dependency

* add dependency

* add dependency

* add dependency

* fix linux build break.

* update

* fix build break

* fix build break

* fix build break

* update

* update

* update c api.

* update to not use OrtCreateAllocatorInfo

* change to all eps .

* fix linux build break

* remove useless codes.

* update

* move datatransfermanager in session state

* update

* fix cuda build break.

* fix comments

* fix windows GPU build.

* fix comments

* fix build break

* fix comments

* fix test failure

* update

* fix comments

* fix onnx runtime server.

* update

* fix test failure.

* fix comments

* fix comment
2019-07-11 14:49:20 -07:00
jignparm
e580b76305 Fix ARM64 build + Add NuGet pipeline including ARM binaries (#1335)
* Add arm64 nocontribops pipeline

* minor fix

* Added new template for arm build -- disable all tests

* fix build command

* add arm64 flag for msbuild

* add arm leg as upstream dependency

* update platform to arm64 for msbuild

* remove test task from arm build

* remove ESRP signing of C# dlls in arm build

* Updated to work for both --arm and --arm64

* Make the cross compiling cmake flags symmetric

* Add dynamic check for /Wno-error flag, instead of extra build option

* remove extra full-stop
2019-07-11 11:49:17 -07:00
Maik Riechert
bfda9ca1c1 Make sure submodule urls are up-to-date (#1357)
This extends build.py to run git submodule sync --recursive before running git submodule update --init --recursive. This makes sure submodule URLs are up-to-date.
2019-07-10 13:11:59 -07:00
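The command ordering described in commit bfda9ca1c1 can be sketched as follows. This is not the code from build.py: `submodule_commands` and `update_submodules` are hypothetical helpers, and the `run` callback is injected so the sketch can be exercised without touching a real repository. The two git command lines themselves are the ones named in the commit message.

```python
def submodule_commands():
    """Return the git commands in the order they should run."""
    return [
        # First refresh submodule URLs from .gitmodules ...
        ["git", "submodule", "sync", "--recursive"],
        # ... then fetch and check out the submodules.
        ["git", "submodule", "update", "--init", "--recursive"],
    ]


def update_submodules(source_dir, run):
    """Run each command in source_dir using the supplied runner."""
    for cmd in submodule_commands():
        run(cmd, cwd=source_dir)


# Record the calls instead of executing git, to show the ordering.
recorded = []
update_submodules("/path/to/repo", lambda cmd, cwd: recorded.append((cwd, cmd)))
```

In a real script the runner could be something like `lambda cmd, cwd: subprocess.run(cmd, cwd=cwd, check=True)`, so that a failing git command stops the build.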
Changming Sun
20f6c84fd2 Switch to use nvidia-docker2 command format 2019-07-10 13:11:07 -07:00
S. Manohar Karlapalem
a7fcd60572 Add missing 'openvino' option in perftest Usage message (#1367) 2019-07-10 10:58:18 -07:00
Faith Xu
aba7271ad7 Fix links (#1371) 2019-07-10 08:34:31 -07:00
Tracy Sharpe
823fa3f39c Integrate MLAS NCHWc support into ONNX Runtime (#1327)
This change integrates the NCHWc support recently added to MLAS into ONNX Runtime. When using "-o 3" optimizations, the runtime performs an NCHWc layout optimization pass that converts standard ONNX operators such as Conv/MaxPool to the com.microsoft.nchwc domain, with weights and biases reordered for speed.
2019-07-09 20:41:19 -07:00
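For readers unfamiliar with the NCHWc layout mentioned in commit 823fa3f39c, the following sketch reorders a single image from channel-major order into blocks of channels. It is only an illustration: `nchw_to_nchwc` is a hypothetical helper, the block size of 2 is arbitrary, and the zero-padding of a partial block is an assumption, since the commit message does not describe the block size or padding behavior used by MLAS.

```python
def nchw_to_nchwc(image, block):
    """Reorder one image from [C][H][W] to [C/block][H][W][block], zero-padding C."""
    channels, height, width = len(image), len(image[0]), len(image[0][0])
    num_blocks = (channels + block - 1) // block
    out = [[[[0.0] * block for _ in range(width)] for _ in range(height)]
           for _ in range(num_blocks)]
    for c in range(channels):
        for h in range(height):
            for w in range(width):
                # Channel c lands in block c // block, at lane c % block.
                out[c // block][h][w][c % block] = image[c][h][w]
    return out


# C=3, H=1, W=2; the value encodes (channel, column) as c * 10 + w.
image = [[[float(c * 10 + w) for w in range(2)]] for c in range(3)]
blocked = nchw_to_nchwc(image, block=2)
# blocked[0][0][0] == [0.0, 10.0]   channels 0 and 1 at column 0
# blocked[1][0][1] == [21.0, 0.0]   channel 2 at column 1, plus zero padding
```

Keeping a block of channels contiguous for each pixel is what lets a vectorized kernel load several channels with one instruction.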
Hector Li
42c18762f3 Update the log message for fallback case. (#1370)
Log a warning if the fallback is caused by a functional limitation.
Log an info message if the fallback is by design, e.g. nodes between Shape (CPU output) -> CUDA nodes .. -> ReShape (CPU input).
2019-07-09 16:54:40 -07:00
Tracy Sharpe
c483a1e3c6 Use simpler GEMM function for MatMul operator (#1365)
More cleanup of the math files. Instead of using templates to instantiate a full GEMM for the types added for MatMul (integers and double), use a simpler MatMul function that doesn't do any transposing and assumes alpha=1 and beta=0.
2019-07-09 15:07:50 -07:00
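The simplification described in commit c483a1e3c6 can be sketched by comparing a general GEMM with a specialized matrix multiply. Both functions are hypothetical pure-Python stand-ins rather than the MLAS or ONNX Runtime routines; they only show which parameters disappear when no transposing is done and alpha=1 and beta=0 are assumed.

```python
def gemm(a, b, c, alpha=1.0, beta=0.0, trans_a=False, trans_b=False):
    """General form: C = alpha * op(A) @ op(B) + beta * C."""
    if trans_a:
        a = [list(col) for col in zip(*a)]
    if trans_b:
        b = [list(col) for col in zip(*b)]
    m, k, n = len(a), len(b), len(b[0])
    return [[alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
             for j in range(n)] for i in range(m)]


def matmul(a, b):
    """Specialized form: no transposes, alpha = 1, beta = 0."""
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
product = matmul(a, b)                      # [[19, 22], [43, 50]]
general = gemm(a, b, [[0, 0], [0, 0]])      # same values via the general path
```

The specialized function needs no scaling, no accumulation into an existing C, and no transpose handling, which is why it is a smaller thing to instantiate for extra element types.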
jignparm
57225cd4ee Add C++ API test for NuGet package (#1364) 2019-07-09 13:51:51 -07:00
Hector Li
298f30546b Fix the random UT failure for RNN/GRU cases which have padded sequenc… (#1361)
Fix the random UT failure for RNN/GRU cases which have padded sequences, e.g. max_seq = 2, batch_size = 2, sequence_lengths = {2, 1}. For the output beyond the shorter sequence {1}, we should initialize the value to 0.

Root cause:
The cuDNN library doesn't guarantee the value beyond the shorter sequence.
Fix:
Initialize the output Y data to all 0 before calling the cuDNN library.
2019-07-09 13:28:11 -07:00
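The fix in commit 298f30546b (zero the output before the backend writes to it) can be sketched with the shapes from the commit message. `run_padded_rnn` and its running-sum `step` are hypothetical stand-ins for the real RNN call, which is made through cuDNN and is not reproduced here.

```python
def run_padded_rnn(inputs, sequence_lengths, step):
    """inputs[t][b] is the input at time t for batch entry b; returns Y[t][b]."""
    max_seq, batch = len(inputs), len(sequence_lengths)
    # Initialize every output to 0 first, so positions past a sequence's
    # length hold a defined value instead of whatever the backend left there.
    y = [[0.0] * batch for _ in range(max_seq)]
    for b, length in enumerate(sequence_lengths):
        state = 0.0
        for t in range(length):          # only the valid time steps are written
            state = step(state, inputs[t][b])
            y[t][b] = state
    return y


# max_seq = 2, batch_size = 2, sequence_lengths = {2, 1}
inputs = [[1.0, 10.0], [2.0, 99.0]]      # 99.0 is padding for the shorter sequence
y = run_padded_rnn(inputs, [2, 1], step=lambda s, x: s + x)
# y == [[1.0, 10.0], [3.0, 0.0]]
```

The padded input value 99.0 never reaches the output, and the slot past the shorter sequence reads as 0.0 on every run, which is what makes a unit test on it deterministic.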
Changming Sun
27da857b51 Fix an SAL annotation in onnxruntime_c_api.h 2019-07-09 10:14:58 -07:00
Vinitra Swamy
6b32c77804 Dockerfiles for TensorRT, CUDA, build from source (#922)
* dockerfile updates for BYOC scenario

* updates for 3 different build versions

* updating to remove libopenblas, python3, python3-pip

* Including LICENSE-IMAGE.txt for CUDA/TensorRT dockerfiles

* remove unnecessary cmake files

* fixing comment typo

* optimizing dockerfile.source as per review suggestions (not working currently)

* Optimizing dockerfiles with install_dependencies script

* update dockerfile with --cmake_extra_defines version number

* add &&\ for license copy lines

* updates, adding miniconda to path, reincluded clearing the pycache

* adding maintainer note

* update readme instructions

* update tensorrt versioning in dockerfile
2019-07-09 02:03:55 -07:00
Maik Riechert
3cae067a9b fix non-standard u_int32_t type (#1358) 2019-07-09 00:19:58 -07:00
Scott McKay
ac6a4afb0f Add validation of shape when re-using a buffer in ExecutionFrame (#1356)
* Check for empty string as dim_param in allocation planner.
* Validate shape is compatible at runtime when re-using Tensor.
2019-07-09 14:59:07 +10:00
Changming Sun
58d6ff3f13 Remove AgentPool setting in CI yaml 2019-07-08 15:40:54 -07:00
Tracy Sharpe
3a588860cc remove unused math routines (#1354)
This change removes a number of unused math helpers from core/util/math.h. Most operators are already using MLAS or Eigen directly.
2019-07-08 14:05:27 -07:00
Pranav Sharma
e9ce51ead4 Make GetTensorShapeFromTensorShapeProto return TensorShape and not its internal representation. (#1353) 2019-07-08 11:45:55 -07:00
Faith Xu
5b93b02c69 Issue template update (#1339)
* Update to include urgency

* Wording update

* Wording update
2019-07-07 23:38:52 -07:00
Faith Xu
b7ae0d5694 Fix link (#1351)
* Fix link

* Update PyOp.md
2019-07-07 21:56:18 -07:00