Commit graph

1193 commits

Author SHA1 Message Date
Pranav Sharma
7c5b3a5ecc
Update coding guidelines to prefer make_unique for heap allocations (except where not possible). (#1730)
* Mention OrtCreateSessionFromArray in C API doc

* Fix perf test executable due to removal of certain C APIs

* fix linux build

* Avoid duplication

* Update coding guidelines to prefer make_unique for heap allocations (except where not possible).
2019-09-04 19:16:16 -07:00
manashgoswami
3d44c55092 Updated docs related to base images (#1753)
* Update README.md

* Update onnx-inference-byoc-gpu-cpu-aks.ipynb

* Update README.md
2019-09-04 10:33:41 -07:00
Tomasz Dołbniak
4ed8d4b30e Put the initializers at the end of the cluster inputs list (#1751)
Restore the missing variable
2019-09-03 15:09:37 -07:00
suryasidd
9523977cc2 Added emotion ferplus support (#1752)
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
2019-09-03 15:01:22 -07:00
Changming Sun
94d9161166
Add nuphar to Linux CI build (#1750) 2019-09-03 11:39:27 -07:00
Ashwini Khade
0f6cf9a335
enable quantizing specific nodes (#1742) 2019-09-03 11:04:17 -07:00
Pranav Sharma
ad7ab3d880
Enforce shape validation. (#1716)
* Mention OrtCreateSessionFromArray in C API doc

* Enforce shape validation.

* Update broken models
2019-09-02 20:00:37 -07:00
KeDengMS
c9240f4e93
Implementation of Nuphar execution provider (#881)
* Implement Nuphar execution provider

Nuphar execution provider is a TVM-based compilation provider. It has shown great speedups for RNN models using Scan.
This PR is mainly for a preview of the shared codegen library for other TVM-based providers.

* Fix submodules

* Fix TVM submodule

* Update Nuphar to latest and resolve conflicts

* Remove stale files caused by merge -X theirs

* Revert heap buffer change to not introduce onnxruntime_framework into onnxruntime_perf_test

* Fix bad merge

* Merge from Nuphar

* Fix warning treated as error, revert some unnecessary changes

* Revert some more test changes

* Some more test revert or comments to make review easier
New tests could be added later

* One more revert of unnecessary changes

* More change revert. Test could be added back later.
2019-09-01 23:01:47 -07:00
Sreekanth Yalachigere
f4a6d267c1 MKL-DNN EP: control flow fix (#1740)
* moved subgraph_index to MklDnn Execution Provider

* code cleanup
2019-08-31 09:58:59 -07:00
Takeshi Watanabe
259863758e Fix typo in NMS code
2019-08-30 22:37:36 -07:00
Hector Li
dc9c89546d
Update the docker file for OpenVINO (#1741)
Update the docker file for OpenVINO which is used for AML
2019-08-30 22:32:24 -07:00
shahasad
833e18345d
Publish perf tool with nightly build (#1728) 2019-08-30 11:25:55 -07:00
Hector Li
810ee0068f
Fix an issue where the CUDA EP falls back too many nodes to CPU in some cases, causing huge data copies. If a node's inputs are all initializers, the node should not fall back to CPU. (#1727)
Fixes an issue where the CUDA EP falls back too many nodes to CPU in some cases, which causes huge data copies.
https://github.com/microsoft/onnxruntime/issues/1675

Currently, if a node's inputs are all initializers, the CUDA EP assigns the node to CPU, along with some of the nodes below it. This can cause a huge data copy. In the case reported by a user, the model has several Slice nodes whose inputs come from initializers, and a Concat op that concatenates the Slice outputs. The concatenated data is large (16MB), which makes the copy from CPU to GPU quite costly because it is a synchronous copy.

Fix
If a node's inputs are all initializers, the node should not fall back to CPU.
2019-08-29 13:54:17 -07:00
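The placement change described in commit 810ee0068f can be illustrated with a toy model. This is only a sketch, not ONNX Runtime code: `place_nodes`, the tuple-based graph, and the rule that a node fed only by CPU-placed nodes also lands on CPU are assumptions made for illustration.

```python
def place_nodes(nodes, initializers, fallback_on_initializer_inputs):
    """Assign each node to "CPU" or "CUDA".

    nodes: list of (name, input_names, output_names) in topological order.
    initializers: set of tensor names that are constant initializers.
    fallback_on_initializer_inputs: True models the behaviour before the fix.
    """
    cpu_outputs = set()  # tensors produced by nodes already placed on CPU
    placement = {}
    for name, inputs, outputs in nodes:
        from_initializers = bool(inputs) and all(i in initializers for i in inputs)
        # Assumed propagation rule: a node fed only by CPU nodes follows them.
        from_cpu_nodes = bool(inputs) and all(i in cpu_outputs for i in inputs)
        on_cpu = (fallback_on_initializer_inputs and from_initializers) or from_cpu_nodes
        placement[name] = "CPU" if on_cpu else "CUDA"
        if on_cpu:
            cpu_outputs.update(outputs)
    return placement


# Shape of the reported case: Slice nodes reading initializers, then a Concat.
initializers = {"w0", "w1"}
graph = [
    ("slice0", ["w0"], ["s0"]),
    ("slice1", ["w1"], ["s1"]),
    ("concat", ["s0", "s1"], ["c"]),
]

before = place_nodes(graph, initializers, fallback_on_initializer_inputs=True)
after = place_nodes(graph, initializers, fallback_on_initializer_inputs=False)
print(before)
print(after)
```

In this toy graph the old rule drags the Concat onto CPU as well, so its large output would have to be copied to the GPU for any later CUDA node; with the rule disabled, every node stays on CUDA.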
Pranav Sharma
25d02a33c8
Fix reading of onnx domain causing one of the automl models to break in 0.5 release. (#1694)
* Mention OrtCreateSessionFromArray in C API doc

* Fix registration of Equal op causing one of the automl models to break in 0.5 release.

* updates...
2019-08-29 12:18:39 -07:00
Ashwini Khade
e54904e6a3
add implementation for dynamic quantize linear (#1697) 2019-08-29 11:40:19 -07:00
Hariharan Seshadri
4b5b037289
Support 'Bilinear' mode for 2D inputs in Resize and Upsample kernels (#1679)
* Support bilinear mode with actual 2D inputs in Resize and upsample

* Fix build break

* Fix build break

* Add test

* CUDA changes

* Resolve PR comments

* Resolve comments
2019-08-29 11:34:31 -07:00
rakelkar
0f7c01b49b Use exec form of ENTRYPOINT for docker server (#1690)
* Use exec form of ENTRYPOINT for docker server

# Issue
The entrypoint currently uses the shell form, which prevents users from passing in any command-line arguments. Also, passing a model_path in means the server only works if the env var is set; however, this is not what the error message says!
```
$ docker run -v /home/rakelkar/try/onnxzoo/style:/mnt/models -it   mcr.microsoft.com/onnxruntime/server --model_path /mnt/models/model.onnx
Version: local_build
Commit ID: default

model_path must be the location of a valid file
Allowed options:
  -h [ --help ]               Shows a help message and exits
  --log_level arg (=info)     Logging level. Allowed options (case sensitive): 
                              verbose, info, warning, error, fatal
  --model_path arg            Path to ONNX model
  --address arg (=0.0.0.0)    The base HTTP address
  --http_port arg (=8001)     HTTP port to listen to requests
  --num_http_threads arg (=4) Number of http threads
  --grpc_port arg (=50051)    GRPC port to listen to requests
```
# Fix
1. remove the env var
2. use the exec form

* Update readme to use model_path arg
2019-08-29 10:18:08 -07:00
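Why the shell form in commit 0f7c01b49b swallows arguments can be reproduced outside Docker. The sketch below assumes a POSIX `sh` and `echo` are on the PATH and uses `echo server` as a stand-in for the real server binary. In the shell form the command is in effect run as `sh -c "<command>"`, so extra arguments become positional parameters of the shell instead of reaching the command.

```python
import subprocess

# Arguments a user would append to `docker run <image> ...`.
extra_args = ["--model_path", "/mnt/models/model.onnx"]

# Shell form: the command is wrapped in `sh -c "..."`. The extra arguments
# are handed to the shell as $0, $1 and never reach the wrapped command.
shell_form = subprocess.run(
    ["sh", "-c", "echo server"] + extra_args,
    capture_output=True, text=True, check=True,
).stdout.strip()

# Exec form: the command is executed directly, so the extra arguments are
# appended to its own argument list.
exec_form = subprocess.run(
    ["echo", "server"] + extra_args,
    capture_output=True, text=True, check=True,
).stdout.strip()

print("shell form saw:", shell_form)
print("exec form saw: ", exec_form)
```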
KeDengMS
068b568472
Add support for int8 x uint8 for MatMulInteger, and int16 x int16 custom op (#1391)
Description: This change adds the necessary quantization support on CPU for mixed int8/uint8, as well as int16, for matrix multiply operations that output int32.

Motivation and Context

Integer operations are critical for the performance of quantized models.
The current MatMulInteger implementation on CPU only supports uint8 x uint8, while the spec supports int8 x uint8. Having a default CPU implementation that fully supports the spec would help accuracy verification.
In addition, some models may need to quantize to int16, but the MatMulInteger op does not support that yet. A custom op, MatMulInteger16, is added to support such models.
2019-08-28 21:40:24 -07:00
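As a reference for what commit 068b568472 computes, here is a minimal pure-Python sketch of an integer matrix multiply with zero points and a 32-bit result. It is not the MLAS or ONNX Runtime kernel; `matmul_integer` and its list-of-lists layout are hypothetical, and the zero-point handling follows the general MatMulInteger definition of subtracting each zero point before multiplying.

```python
INT32_MIN, INT32_MAX = -(2 ** 31), 2 ** 31 - 1


def matmul_integer(a, b, a_zero_point=0, b_zero_point=0):
    """Multiply integer matrices a (M x K) and b (K x N) into an M x N result.

    Each element has its zero point subtracted first, and every accumulated
    value is checked to fit in int32, mirroring an int32 output tensor.
    """
    inner = len(b)
    cols = len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    out = []
    for row in a:
        out_row = []
        for j in range(cols):
            acc = 0
            for k in range(inner):
                acc += (row[k] - a_zero_point) * (b[k][j] - b_zero_point)
            if not INT32_MIN <= acc <= INT32_MAX:
                raise OverflowError("accumulator does not fit in int32")
            out_row.append(acc)
        out.append(out_row)
    return out


# int8 values on the left, uint8 values (zero point 128) on the right.
a_int8 = [[-128, 127]]
b_uint8 = [[255], [0]]
result = matmul_integer(a_int8, b_uint8, a_zero_point=0, b_zero_point=128)
print(result)
```

A slow reference like this is mainly useful for the accuracy verification the commit message mentions: an optimized kernel can be compared against it element by element.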
KeDengMS
8fc8910a0e
Allow input used across execution providers as long as they use the same allocator device (#1715)
Description: Currently ORT throws an error when one input is used by different EPs. This change removes that restriction, as long as the providers use the same allocator device.

Motivation and Context

It is possible to share inputs across EPs now that allocations are device-based instead of EP-based.
2019-08-28 20:30:00 -07:00
Changming Sun
81ad48080b
Remove TaskThreadPool (#1713) 2019-08-28 18:00:10 -07:00
Tracy Sharpe
73312b8195
MLAS: Android sgemm kernel build fix (#1710)
Fix the aarch64 kernel to build properly with the Android NDK (specifically clang).
2019-08-28 16:14:12 -07:00
Tracy Sharpe
14eae293bf
remove @PCGOTREL x64 usage (#1707)
Avoid the need for @PCGOTREL relocations by annotating MLAS global data shared with assembly modules with __attribute__((visibility("hidden"))).
2019-08-28 11:27:16 -07:00
Faith Xu
d9cdf4b4ed
Doc updates (#1522)
* Updates

* Remove preview texts

* Update README.md

* Updates

* Update README.md

* Update README.md

* Minor wording update

* Update README.md

* Update doc on CUDA version

* revert update

* Update readme for issue #1558

* Clean up example section

* Cosmetic updates

- Add an index of build instructions for browsability
- Update build CUDA version from 9.1 to 10

* Fix broken link

* Update README to reflect upgrade to pip requirement

* Update CuDNN version for Linux Python packages

* Clean up content

Updated ordering and add table of contents

* Minor format fixes

* Move Android NNAPI under EP section

* Add link to operator support documentation

* Fix typo

* typo fix

* remove todo section
2019-08-27 21:31:19 -07:00
Ashwini Khade
8813b79c5b
make gemmlowp default for arm (#1701)
* make gemmlowp default for arm

* force use_gemmlowp in header for default case

* remove unnecessary white space
2019-08-27 15:52:03 -07:00
shahasad
121d308a33
Python API naming and other cleanup (#1678)
- Make the naming of properties in python SessionOptions and RunOptions consistent with other apis.
- Remove unnecessary apis
2019-08-27 12:48:46 -07:00
jywu-msft
938200de9b
fix typo in max batch size error msg. (#1687) 2019-08-27 11:15:18 -07:00
Ashwini Khade
961b14ac4a
use MLAS for QGEMM in matmulInteger and convInteger (#1692)
* use mlas qgemm for u8u8_s32 gemms

* update test
2019-08-26 18:13:22 -07:00
Tracy Sharpe
a8998b07b5
treat zero point properly (#1686) 2019-08-26 11:10:28 -07:00
shahasad
f25847bccd
More fixes on the NuGet CPU CI pipeline (#1688)
- Fix the Windows end-to-end test in NuGet CI
- Skip the TestModelSerialization, because it is failing on Linux. Must be fixed before API is released for use. Owner is notified.
2019-08-23 18:13:13 -07:00
KeDengMS
5873bdbb3f
Share default CPU allocator with Mlas preferred alignment (#1682)
Description: Make the default CPU allocator use the MLAS preferred alignment.

Motivation and Context

This is needed so that the C API has an aligned default CPU allocator, the same as the one in the CPU provider.
2019-08-23 12:06:35 -07:00
Pranav Sharma
4035fe842e
Don't create the default allocator every single time. Rename API accordingly. Expose Session/Run log severity levels. (#1615)
* Mention OrtCreateSessionFromArray in C API doc

* Don't create the default allocator every single time. Rename API accordingly.

* Don't create the default allocator every single time. Rename API accordingly.

* updates...

* updates...

* PR comments

* fix typo in license header

* fix build
2019-08-23 10:33:20 -07:00
suryasidd
7408dec0bf Added some mo optimizations to improve performance (#1674)
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
2019-08-22 19:16:01 -07:00
Negin Raoof
addf32fa2a int64 support for 'where' op (#1666) 2019-08-22 16:14:18 -07:00
Yufeng Li
c9a4fe2b7b
Add support of ReduceSum int64 (#1664)
* Add support of ReduceSum int64

* add unit test for int64
2019-08-22 13:53:01 -07:00
Ashwini Khade
d2569d3761
update clip for opset 11 (#1661)
* update clip for opset 11

* exclude ngraph provider for clip unit tests

* exclude ngraph for all clip opset 11 tests

* fix op version
2019-08-22 13:07:22 -07:00
Changming Sun
4de0aa8049
Optimize kernel index (#1672) 2019-08-22 10:26:35 -07:00
shahasad
a818740d91
Support Tensor<bool> and Tensor<Int8> in C# API. Support Tensor<string> as input. Fix a bug in the InferenceSession Run() with RunOptions (#1671)
- Support bool-Tensor and int8-Tensor in input-output of C# api
- Support string-tensor as input in C# api
- Fix a bug in InferenceSession.Run() -- RunOptions was not passed into the native call
2019-08-22 10:14:50 -07:00
Ke Zhang
b53f40a886
update set fetches for execution with allocation plan. (#1668) 2019-08-21 19:58:05 -07:00
shahasad
6f70a78e1f
Fix a few errors in the NuGet pipeline (still broken) (#1656) 2019-08-21 15:42:23 -07:00
Tommy Trimeloni
97d0a46afc nGraph EP Optimizations (#1630)
* Added check for unnecessary function initializations, and removed lock from unneeded areas of code.

* Added LRU cache to EP.

* Bugfixes for nGraph EP Optimization PR

* Changed default cache size to 500 and refactored mutex readability.

* Fixed unsafe environment variable fetch for Windows.

* Cleaned up Windows environment functions and cleaned up mutexes.
2019-08-21 14:04:53 -07:00
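Commit 97d0a46afc mentions an LRU cache with a default size of 500, guarded by mutexes. The following is a generic sketch of that idea and not the nGraph EP implementation: the `LRUCache` class, its method names, and the use of a single lock are assumptions for illustration.

```python
import threading
from collections import OrderedDict


class LRUCache:
    """Least-recently-used cache; evicts the oldest entry when full."""

    def __init__(self, capacity=500):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._entries = OrderedDict()  # oldest entry first, newest last
        self._lock = threading.Lock()  # one lock guards every access

    def get(self, key, default=None):
        with self._lock:
            if key not in self._entries:
                return default
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]

    def put(self, key, value):
        with self._lock:
            self._entries[key] = value
            self._entries.move_to_end(key)
            if len(self._entries) > self._capacity:
                self._entries.popitem(last=False)  # drop least recently used

    def __len__(self):
        with self._lock:
            return len(self._entries)


cache = LRUCache(capacity=2)
cache.put("shape_a", "compiled_a")
cache.put("shape_b", "compiled_b")
cache.get("shape_a")                # touch shape_a so shape_b becomes oldest
cache.put("shape_c", "compiled_c")  # evicts shape_b
print(len(cache), cache.get("shape_b"))
```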
Scott McKay
a68a20e415
Add details of which node was not able to be placed on an execution provider. (#1665) 2019-08-21 13:31:00 -07:00
Hector Li
e652a236b4
cudnnRNNForwardInferenceEx doesn't support 0-length sequences in the batches
Fix an issue where cudnnRNNForwardInferenceEx doesn't support 0-length sequences in the batches.

Solution:
Before calling cudnnRNNForwardInferenceEx, reset each 0-length sequence to 1 and keep an array that tracks the IDs of the batches with 0-length sequences. Once the result is available, call a CUDA kernel that masks the output using the batch IDs tracked in the array.
2019-08-21 09:59:43 -07:00
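The workaround in commit e652a236b4 has two steps, clamp then mask, which can be sketched on the host in plain Python. This is not the CUDA kernel from the commit: the function names and the `output[time][batch]` layout are hypothetical, and masking is assumed here to mean writing zeros.

```python
def clamp_zero_lengths(seq_lens):
    """Replace 0-length sequences with length 1 and remember their batch IDs."""
    zero_batches = [b for b, n in enumerate(seq_lens) if n == 0]
    clamped = [max(n, 1) for n in seq_lens]
    return clamped, zero_batches


def mask_zero_batches(output, zero_batches):
    """Zero the output of every batch whose original sequence length was 0.

    output[t][b] is the list of hidden values for time step t and batch b.
    A new nested list is returned; the input is left untouched.
    """
    masked = [[list(hidden) for hidden in step] for step in output]
    for step in masked:
        for b in zero_batches:
            step[b] = [0.0] * len(step[b])
    return masked


seq_lens = [2, 0, 1]
clamped, zero_batches = clamp_zero_lengths(seq_lens)

# Stand-in for the RNN result: 2 time steps, 3 batches, hidden size 2.
rnn_output = [
    [[0.1, 0.2], [0.9, 0.9], [0.3, 0.4]],
    [[0.5, 0.6], [0.9, 0.9], [0.0, 0.0]],
]
masked = mask_zero_batches(rnn_output, zero_batches)
print(clamped, zero_batches)
print(masked)
```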
Emma Ning
d0d82432f3
Update PyTorch Section for supported onnx version (#1635)
The PyTorch exporter in PyTorch 1.2 can now natively support multiple opsets
2019-08-20 13:56:19 -07:00
Scott McKay
5311c1b2b5
Check return value from CreateFeedsFetchesManager. (#1653)
Also clean up a couple of unused variables.
2019-08-20 12:20:21 -07:00
Changming Sun
7be5695fad
Remove --whole-archive (#1655) 2019-08-20 12:04:10 -07:00
jywu-msft
68d496c7ca
fix bug on windows where ops were always getting dumped. (#1648) 2019-08-20 10:48:29 -07:00
Changming Sun
a1b3c64038
Fix memory leak in MLAS unit test (#1654) 2019-08-19 19:53:51 -07:00
Pranav Sharma
377dcf60ac
Update onnx test runner documentation (#1651)
* Mention OrtCreateSessionFromArray in C API doc

* Update perf tool documentation to reflect the new graph optimization enums. Relax constraint for enable_all.

* Update one more doc

* Update onnx test runner documentation

* Add default in the docs
2019-08-19 18:28:09 -07:00
Changming Sun
224dde7ef1
Allow users to disable multithreading (#1647) 2019-08-19 18:12:39 -07:00
Pranav Sharma
6f3a835d38 Update perf tool documentation to reflect the new graph optimization enums. Relax constraint for enable_all. (#1650) 2019-08-19 14:27:33 -07:00