Commit graph

1138 commits

Author SHA1 Message Date
jywu-msft
372b657900 update TRT EP CI's to use latest model.zip (#1637) 2019-08-16 17:44:22 -07:00
Changming Sun
6b89c7ad04
Let mlas use session thread pool (#1609)
1. Let mlas use session thread pool
2. Remove onnxruntime_USE_MLAS cmake option
3. Remove the win32 thread pool code inside mlas

mlas will:

1. use the ort thread pool if one is passed in
2. use openmp if the threadpool parameter is nullptr and openmp is enabled
3. run single threaded if the threadpool parameter is nullptr and openmp is disabled.
2019-08-16 13:21:15 -07:00
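The fallback order listed in the mlas commit above can be sketched as a small selection function. This is only an illustrative Python sketch of the decision order, not the mlas C++ code; the names `choose_threading`, `thread_pool`, and `openmp_enabled` are hypothetical.

```python
from typing import Optional


def choose_threading(thread_pool: Optional[object], openmp_enabled: bool) -> str:
    """Return the threading strategy implied by the commit message.

    Hypothetical helper: `thread_pool` stands in for the session thread pool
    pointer, and `openmp_enabled` for whether the build has openmp enabled.
    """
    if thread_pool is not None:
        # A thread pool was passed in, so it takes priority.
        return "ort_thread_pool"
    if openmp_enabled:
        # No thread pool (nullptr in the C++ code), but openmp is available.
        return "openmp"
    # No thread pool and no openmp: run on the calling thread only.
    return "single_threaded"
```

The point of the ordering is that an explicitly supplied pool always wins, so the openmp setting only matters when no pool is given.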
Hariharan Seshadri
44a42a6a98
Fix parsing initial hidden state in RNN (#1626)
* Fix the way initial hidden state is used for reverse direction in RNN

* Add test case

* Updates
2019-08-16 10:12:46 -07:00
shahasad
f9834105aa removed --gen_doc (#1633) 2019-08-16 09:52:36 -07:00
shahasad
c9eb13a638
Copy System.Numerics.Tensors sources from dotnet/corefx into onnxruntime (#1605)
Copy System.Numerics.Tensors sources from dotnet/corefx into onnxruntime
2019-08-15 17:28:47 -07:00
Ashwini Khade
0044be6259
update onnx to latest commit (#1622)
* update onnx to latest commit

* Disable and/or fix failing tests

* disable not yet implemented tests for opset 11

* disable tests

* fix bug in mkldnn fp16 graph check
2019-08-15 17:10:32 -07:00
Hariharan Seshadri
1835640d94
Support int64 for ReduceMax (#1625) 2019-08-15 14:48:59 -07:00
Dmitri Smirnov
17c8fe44e3
Integrate featurizers (#1573)
Added Sample Featurizer and Infrastructure
  Make featurizers and unit tests compile and run with GTest.
  Create definitions for the first featurizer kernel.
  Add new operator domain.
  Create datetime_transformer kernel and build.
  Move OPAQUE type definitions for featurizer kernels out to a separate cc.
  Register them with the type system.
  Provide unit tests for new AutoML DateTimeTransformer kernel.
  Make necessary adjustments to the test infrastructure to make it run
  with new types.
2019-08-15 13:59:59 -07:00
Hariharan Seshadri
7545b795df
Fix incorrect box offset computation in NMS op (#1624)
* More changes

* Fix NMS

* nits
2019-08-15 11:41:10 -07:00
shahasad
0c5d2c998b
Generate documentation from the registered operator kernels (#1395)
- Added python script for generating markdown doc from the registered opkernels. 
- Made some conditional changes in the pybind to expose necessary python API
- Added some missing type-constraints in the op kernel registrations
2019-08-14 18:12:24 -07:00
Pranav Sharma
8d12ce45cf
Use a friendly enum for graph optimization level. (#1586)
* Mention OrtCreateSessionFromArray in C API doc

* review changes

* use enum for graph optimization level

* Use explicit values for enums

* updates...

* Add friendly enum for graph optimization levels in C, C# and Python APIs.

* Fix linux build

* Fix build breakage due to master merge

* PR comments
2019-08-14 17:12:08 -07:00
jywu-msft
24d17f4353
Fix trtlogger segfault. re-enable SoftPlus unit test for TRT. add doc… (#1623)
* Fix trtlogger segfault. re-enable SoftPlus unit test for TRT. add documentation for ORT_TENSORRT* env vars.

* Update TensorRT-ExecutionProvider.md
2019-08-14 16:34:39 -07:00
Hariharan Seshadri
09db1e06b5
Make changes to pipeline template to include missing headers in tars/zips (#1617) 2019-08-14 13:51:29 -07:00
shahasad
a6a5acedda
Cleanup csharp API SessionOptions and RunOptions to be consistent with other APIs (#1570)
- Updated SessionOptions API to use properties instead of setter/getter methods. 
- Added missing APIs. 
- Added RunOptions.
2019-08-14 12:02:02 -07:00
Ke Zhang
bd64ca3019
Kezhan/execute graph refactoring (#1553)
* checking execution provider logic updated.

* fix the logic of copy input and output.

* update

* update

* update

* update

* update

* update

* fix ngraph failure.

* fix comments
2019-08-14 01:07:05 -07:00
Scott McKay
b405482cfa
Remove copy of generator in Multinomial (#1611)
* Remove copy of generator in Multinomial so that different values are generated each time.
Add ability to test
2019-08-14 10:58:54 +10:00
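The Multinomial fix above concerns a general pitfall: sampling from a copy of a random generator never advances the original, so every call repeats the same values. A minimal Python sketch of that effect follows; it assumes nothing about the actual C++ kernel, and `sample_with_copy` and `sample_in_place` are hypothetical names.

```python
import random
from typing import List


def sample_with_copy(generator: random.Random, n: int) -> List[float]:
    """Draw n values from a copy of the generator (the buggy pattern)."""
    local = random.Random()
    # Copy the state; the caller's generator is never advanced.
    local.setstate(generator.getstate())
    return [local.random() for _ in range(n)]


def sample_in_place(generator: random.Random, n: int) -> List[float]:
    """Draw n values from the generator itself, advancing its state."""
    return [generator.random() for _ in range(n)]
```

Calling `sample_with_copy` twice on the same generator returns identical lists, while `sample_in_place` returns fresh values on each call.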
Scott McKay
b5de1324ef
Fix log message truncation on Windows when printf formatting is used. (#1599)
* Fix log message truncation and add unit test. On Windows vsnprintf_s returns -1 when truncating, so we need to differentiate that from a real error.
2019-08-14 07:53:45 +10:00
pulkittomar
a50a63aa9e Serialize optimized onnx model (#1470)
* Model serialization

* Removed duplicate symbol

* Minor update

* Review comments

* add tests

* Model serialization

* Removed duplicate symbol

* Minor update

* Merged PR 1106437: Model Serialization in onnxruntime

* Review comments

* Merged PR 1107226: Review comments

Review comments

* add tests

* Fixed merge conflict

* Correct python tests

* InferenceSession Refeed Test

* Replace use of widechar const literal-L

* Fixed failing tests

* Updated comment

* Removed unnecessary session options

* Spell check on comments

* Do not serialize when level 3 optimization specified

* Updated error logs

* Changed log severity to WARN
2019-08-12 18:43:40 -07:00
Scott McKay
8a559d75ae
Minor perf improvements. (#1580)
* Minor perf improvements.

- Cache the vector sizes in IExecutionFrame and NodeIndexInfo to avoid calls to size().
  - 2 instructions instead of 10
- Remove an unnecessary check in IExecutionFrame
  - add a check to the ctor so we guarantee it's unnecessary
- Reserve memory for the vectors in BroadcastIterator
  - saves reallocs if more than one value is added
    - but rare with the mlperf models for multiple values to be added so benefit is limited.
  - slight tweak to the Broadcaster ctor code to make it more readable
2019-08-13 09:05:48 +10:00
Pranav Sharma
a6a4c4c079
Fix perf test executable. (#1598)
* Mention OrtCreateSessionFromArray in C API doc

* Fix perf test executable due to removal of certain C APIs

* fix linux build

* Avoid duplication

* Fix mem leak
2019-08-12 09:49:29 -07:00
AlbertSadovnikov
ce3c8f98dd Fix for CPU random ops seed narrowing conversion. (#1594) 2019-08-12 09:01:13 -07:00
Malik Shahzad Muzaffar
df9b1b8ec8 Include io_win32.h only if builds on windows (#1587)
* Include io_win32.h only if builds on windows

* looks like include order matters
2019-08-12 08:18:42 -07:00
Tomasz Dołbniak
69baf9e800 Update nGraph to v0.22.1 (#1582)
* Update nGraph to 0.21 and adjust the EP

* Share the graph initializers between custom ops

* Update nGraph to 0.22 and exclude Gather entirely

* Enable building on Windows with nGraph v0.21.1-rc.0

* Disable the unsigned input Shrink op tests for nGraph until the next update

* Line-shortening code refactor

* Fix for the master branch merge artifact

* MKLDNN patches adjustment for Windows

* Exclude MatMulInteger for non-const zero points

* Exclude ConvInteger for non-const zero points

* Enable full Cast op support

* Use the v0.22.1 tag

* Skip ConvTranspose_InvalidKernelShape test for ngraph provider

* Create sub-graph ModelProto from fused_node
2019-08-10 17:41:08 -07:00
Ashwini Khade
7be40b2946
put all gemmlowp common code in one place (#1590)
* put all gemmlowp common code in one place

* fix gpu build failures

* minor update
2019-08-10 17:01:07 -07:00
Ke Zhang
59c9d83f35
add int64 support for less op. (#1604) 2019-08-09 17:16:57 -07:00
Wei-Sheng Chin
0187d876cb Implement new LabelEncoder in opset 2 in ML domain (#1393)
* Implement new LabelEncoder in opset 2 in ML domain

* Fix compilation error

* Fix tests

* Include ONNX's fix

* Formatting and addressing a comment

* Address a minor comment
2019-08-09 14:03:58 -07:00
manashgoswami
6d783e8a07 Added license files in the base image (#1595)
* Update Dockerfile.openvino

* Update Dockerfile.cuda

* Update Dockerfile.cuda

* Update Dockerfile.openvino

* Update Dockerfile.cuda

* added ThirdParty notice file to base image.

* corrected license file name
2019-08-09 13:02:06 -07:00
ybrnathan
9b83545f66
Optimize Fence checking performance (#1593)
* For the majority of nodes, we do not need to do a fence check; we only need to do FenceCheck for CPU<->GPU mem sync nodes.
But we currently pay the fence check cost for every single node and every single input and output.

This change minimizes the fence check so that it is done only when necessary.
2019-08-08 20:16:13 -07:00
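The idea in the fence-check commit above, paying the check only for nodes that need it, can be sketched by comparing how many checks each approach performs. This is a rough Python illustration under stated assumptions, not the onnxruntime executor: the `Node` class, its `is_mem_sync` flag, and both counting functions are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    is_mem_sync: bool  # hypothetical flag: node syncs memory between CPU and GPU
    num_inputs: int = 1
    num_outputs: int = 1


def count_fence_checks_naive(nodes: List[Node]) -> int:
    """Check every input and output of every node."""
    return sum(n.num_inputs + n.num_outputs for n in nodes)


def count_fence_checks_filtered(nodes: List[Node]) -> int:
    """Check inputs and outputs only for CPU<->GPU mem sync nodes."""
    return sum(n.num_inputs + n.num_outputs for n in nodes if n.is_mem_sync)
```

For a graph where mem sync nodes are rare, the filtered count is a small fraction of the naive one, which is the saving the commit describes.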
stevenlix
1c5b15c2b8
Remove memory copy between TensorRT and CUDA (#1561)
* remove memory copy between CUDA and TRT

* add info to RegisterExecutionProvider input

* use new IDeviceAllocator for trt allocator

* remove SetDefaultInputsMemoryType from TRT EP

* remove onnx-tensorrt 5.0

* add submodule onnx-tensorrt branch 5.1

* remove redundancy

* Update transformer_memcpy.cc

* Update tensorrt_execution_provider.cc

* switch to TensorRT 5.1.5.0

* update python binding

* disable failed test case on TensorRT

* Update activation_op_test.cc

* upgrade to TensorRT container 19.06

* update according to feedback

* add comments

* remove tensorrt allocator and use cuda(gpu) allocator

* update onnx-tensorrt submodule

* change ci build cuda directory name
2019-08-08 19:31:39 -07:00
Hector Li
38d78542c3
Fix race condition issue in RNN/LSTM/GRU (#1544)
Fix race condition issue in RNN/LSTM/GRU.

Description:
The filter_desc and rnn_desc could also be changed in compute, which can run on multiple threads. This causes a race condition.

Fix:
create temporary cudnn descriptors
cache cudnn_dropout_desc_ which won't change
2019-08-08 14:18:41 -07:00
Scott McKay
6e430c0526
A few performance improvements coming out of ssd_mobilenet and ssd_resnet34 analysis (#1578)
* A few performance improvements:
 - Make the iteration in NonZero more efficient by using a raw pointer and simplifying the increment logic
   - add another unit test to check the new logic works with 3 dimensional tensor
   - gains about 2% for ssd_mobilenet
 - Avoid floating point operations on each iteration on Concat
   - about 0.5% for ssd_mobilenet and ssd_resnet34
 - Put common case first in ExecutionFrame::AllocateAsPerAllocationPlan to avoid unnecessary call to IsSparseTensor
   - about 0.05% for ssd_mobilenet
 - Minor tweak to put some ctors in the TensorShape header so they can be inlined more easily
2019-08-08 07:20:00 +10:00
Pranav Sharma
a443b013dd
Remove unneeded C APIs + some refactoring. (#1555)
* Mention OrtCreateSessionFromArray in C API doc

* c api changes after review (1)

* updates...

* fixes

* Reorder include
2019-08-07 11:05:29 -07:00
Ashwini Khade
a93ece2727
update quantizelinear to process int8 input (#1576) 2019-08-07 10:09:15 -07:00
Changming Sun
aeb0bcb4a3 parallel build 2019-08-07 08:38:26 -07:00
Hariharan Seshadri
9a34089f67
Add more type support for OneHot op (#1565) 2019-08-06 17:45:42 -07:00
Changming Sun
9e926fef1c
Add a doc for cmake (#1524) 2019-08-06 07:51:53 -07:00
Changming Sun
65ff02fdb0
Set job timeout for code coverage pipeline to 120min(#1563) 2019-08-06 07:49:31 -07:00
Ashwini Khade
16087f3133
update default values for weight quantization (#1564) 2019-08-05 21:39:37 -07:00
Changming Sun
7ee8aca1bf
Avoid downloading test data into C:\ (#1562) 2019-08-05 19:53:15 -07:00
S. Manohar Karlapalem
05bbb3065c [OpenVINO-EP] Update hardware branding of VAD-R as VAD-M (#1552)
Replaces all occurrences of VAD-R/VAD_R with VAD-M/VAD_M.
Aligns with the official hardware branding.
2019-08-05 15:28:46 -07:00
Hariharan Seshadri
ceb8f1c1a2
Modify the kernel declaration for Shrink op (#1554)
* Add capability for the input and output of Shrink op to share a common buffer

* Cosmetic change
2019-08-05 13:21:04 -07:00
pengwa
6c271c63ac
add test cases for commit c019bb9355a511f471e55e7302b26e1d370ed46a (#1556) 2019-08-04 17:18:45 +08:00
jywu-msft
8a6bfe00af
roll back model test update for ngraph provider. (#1551) 2019-08-02 15:53:32 -07:00
Yufeng Li
a098be12ba
Register kernel for Greater int64 (#1546)
Register int64 for Greater and refactor the register code
2019-08-02 14:01:43 -07:00
Ke Zhang
cb71c69d5e
checking execution provider logic updated. (#1547) 2019-08-02 13:29:39 -07:00
daquexian
93cb29f958 [WIP] NNAPI EP Update (#1540) 2019-08-01 22:25:56 -07:00
Scott McKay
9fb8867a24
Don't create implicit input for outer scope value if there is a subgraph input with the same name. (#1186)
* If there is an outer scope value that matches a subgraph input, don't create an implicit input from the outer scope value.

Minor unrelated change for issue noticed while debugging: Use unordered_set for implicit inputs so we don't add them multiple times.

* Add unit test based on onnx issue.
2019-08-02 07:23:41 +10:00
Ke Zhang
1cf5ebc4c5
copyfromhost/copytohost are not needed for mkldnn ep (#1532)
* memcpy is not necessary for mkldnn ep to copy from/to host.

* update
2019-08-01 13:22:15 -07:00
Hariharan Seshadri
624411bb69
Upload correct ESRP signed package (#1531) (#1534) 2019-08-01 10:56:18 -07:00
Changming Sun
3045a5f88b
Update test data (#1512)
* Update test data
2019-08-01 10:42:08 -07:00