1162 commits

Author SHA1 Message Date
suryasidd
7408dec0bf Added some mo optimizations to improve performance (#1674)
Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
2019-08-22 19:16:01 -07:00
Negin Raoof
addf32fa2a int64 support for 'where' op (#1666) 2019-08-22 16:14:18 -07:00
Yufeng Li
c9a4fe2b7b
Add support of ReduceSum int64 (#1664)
* Add support of ReduceSum int64

* add unit test for int64
2019-08-22 13:53:01 -07:00
Ashwini Khade
d2569d3761
update clip for opset 11 (#1661)
* update clip for opset 11

* exclude ngraph provider for clip unit tests

* exclude ngraph for all clip opset 11 tests

* fix op version
2019-08-22 13:07:22 -07:00
Changming Sun
4de0aa8049
Optimize kernel index (#1672) 2019-08-22 10:26:35 -07:00
shahasad
a818740d91
Support Tensor<bool> and Tensor<Int8> in C# API. Support Tensor<string> as input. Fix a bug in the InferenceSession Run() with RunOptions (#1671)
- Support bool-Tensor and int8-Tensor in input-output of C# api
- Support string-tensor as input in C# api
- Fix a bug in InferenceSession.Run() -- RunOptions was not passed into the native call
2019-08-22 10:14:50 -07:00
Ke Zhang
b53f40a886
update set fetches for execution with allocation plan. (#1668) 2019-08-21 19:58:05 -07:00
shahasad
6f70a78e1f
Fix a few errors in the NuGet pipeline (still broken) (#1656) 2019-08-21 15:42:23 -07:00
Tommy Trimeloni
97d0a46afc nGraph EP Optimizations (#1630)
* Added check for unnecessary function initializations, and removed lock from unneeded areas of code.

* Added LRU cache to EP.

* Bugfixes for nGraph EP Optimization PR

* Changed default cache size to 500 and refactored mutex readability.

* Fixed unsafe environmental variable fetch for Windows.

* Cleaned up Windows environment functions and cleaned up mutexes.
2019-08-21 14:04:53 -07:00
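The entry above mentions adding an LRU cache to the nGraph EP with a default size of 500. As a rough illustration of the eviction policy only, here is a minimal sketch; the class name `LRUCache` and its methods are hypothetical, and the log does not say what the EP actually caches or how it is keyed.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical least-recently-used cache; not the nGraph EP's code."""

    def __init__(self, capacity=500):  # 500 mirrors the default named in the commit
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._entries = OrderedDict()

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._entries:
            return None
        # A hit makes the entry the most recently used one.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, value):
        """Insert or update an entry, evicting the oldest if over capacity."""
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self._capacity:
            # popitem(last=False) removes the least recently used entry.
            self._entries.popitem(last=False)
```

A real execution provider would also need locking around `get` and `put`, which the same commit touches on when it mentions mutex cleanup.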
Scott McKay
a68a20e415
Add details of which node was not able to be placed on an execution provider. (#1665) 2019-08-21 13:31:00 -07:00
Hector Li
e652a236b4
cudnnRNNForwardInferenceEx doesn't support 0 sequence in the batches
Fix the issue that cudnnRNNForwardInferenceEx doesn't support 0 sequence in the batches

Solution:
Reset the 0 sequence to 1 for the batches before calling cudnnRNNForwardInferenceEx, and keep an array to track the batch IDs that have a 0 sequence. Once the result is available, call a CUDA kernel to mask the output using the batch IDs tracked in the array.
2019-08-21 09:59:43 -07:00
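The workaround described in the entry above can be sketched in plain Python. This is an illustration under stated assumptions, not the CUDA implementation: `run_rnn` is a hypothetical stand-in for the cuDNN call, outputs are modelled as one list of values per batch entry, and masking is assumed to mean overwriting with zeros.

```python
def run_with_zero_length_mask(seq_lengths, run_rnn):
    """Clamp zero-length sequences to 1, run the RNN, then mask those outputs.

    run_rnn is a hypothetical callable taking a list of sequence lengths and
    returning one list of output values per batch entry.
    """
    # Track which batch entries originally had a zero-length sequence.
    zero_ids = [i for i, n in enumerate(seq_lengths) if n == 0]

    # Reset zero lengths to 1 so the underlying call accepts the batch.
    patched = [max(n, 1) for n in seq_lengths]
    outputs = run_rnn(patched)

    # Mask the outputs of the tracked entries (assumed here: fill with zeros).
    for b in zero_ids:
        outputs[b] = [0.0] * len(outputs[b])
    return outputs
```

For example, with a fake `run_rnn` that emits `1.0` per time step, lengths `[2, 0, 3]` yield real values for the first and third entries and a zeroed single step for the second.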
Emma Ning
d0d82432f3
Update PyTorch Section for supported onnx version (#1635)
The PyTorch exporter in PyTorch 1.2 now natively supports multiple opsets
2019-08-20 13:56:19 -07:00
Scott McKay
5311c1b2b5
Check return value from CreateFeedsFetchesManager. (#1653)
Also clean up a couple of unused variables.
2019-08-20 12:20:21 -07:00
Changming Sun
7be5695fad
Remove --whole-archive (#1655) 2019-08-20 12:04:10 -07:00
jywu-msft
68d496c7ca
fix bug on windows where ops were always getting dumped. (#1648) 2019-08-20 10:48:29 -07:00
Changming Sun
a1b3c64038
Fix memory leak in mlas unit test (#1654) 2019-08-19 19:53:51 -07:00
Pranav Sharma
377dcf60ac
Update onnx test runner documentation (#1651)
* Mention OrtCreateSessionFromArray in C API doc

* Update perf tool documentation to reflect the new graph optimization enums. Relax constraint for enable_all.

* Update one more doc

* Update onnx test runner documentation

* Add default in the docs
2019-08-19 18:28:09 -07:00
Changming Sun
224dde7ef1
Allow users to disable multithreading (#1647) 2019-08-19 18:12:39 -07:00
Pranav Sharma
6f3a835d38 Update perf tool documentation to reflect the new graph optimization enums. Relax constraint for enable_all. (#1650) 2019-08-19 14:27:33 -07:00
Changming Sun
413730365f
MlasGetMaximumThreadCount: plus 1 to the NumThreads from ORT thread pool (#1646) 2019-08-19 14:15:04 -07:00
jywu-msft
bdc694314c
update MKLML to version which contains fix for thread hang. (#1636)
* update MKLML which has bugfix for thread hang. move PATCH_COMMAND outside BUILD_FOR_NATIVE_MACHINE check.

* MKLML_VERSION 2020.0.20190813 is for windows only.
2019-08-19 10:34:48 -07:00
Lara Haidar
b963e4b5c2 Add uint8 Support for NonZero Op (#1614) 2019-08-19 09:50:48 -07:00
Tracy Sharpe
bc72c2dba7
MLAS: add U8U8 MatMul operation (#1644)
Implement the first round of changes for quantization inside MLAS. This adds a MatMul operation for U8xU8=S32 for x86/x64 processors.
2019-08-18 18:15:48 -07:00
Dmitri Smirnov
fbd790f703
Add AutoML to 3 main builds. (#1631)
Add AutoML to 3 main builds.
  Fix unit tests. Enable copy elision, do not move movable object
  on return by value.
2019-08-16 18:06:16 -07:00
jywu-msft
372b657900 update TRT EP CI's to use latest model.zip (#1637) 2019-08-16 17:44:22 -07:00
Changming Sun
6b89c7ad04
Let mlas use session thread pool (#1609)
1. Let mlas use session thread pool
2. Remove onnxruntime_USE_MLAS cmake option
3. Remove the win32 thread pool code inside mlas

mlas will:

1. use the ort thread pool if one is passed in
2. use openmp if the threadpool parameter is nullptr
3. run single threaded if the threadpool parameter is nullptr and openmp is disabled.
2019-08-16 13:21:15 -07:00
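The three-way fallback listed in the entry above amounts to a short decision function. The sketch below only models that order; the function name, parameters, and returned labels are hypothetical and do not correspond to any mlas API.

```python
def choose_threading_backend(thread_pool, openmp_enabled):
    """Return which threading mode applies, following the order in the commit.

    thread_pool: any object standing in for the ORT thread pool, or None
                 (modelling a nullptr threadpool parameter).
    openmp_enabled: whether the build has OpenMP enabled.
    """
    if thread_pool is not None:
        # A thread pool was passed in, so it takes priority.
        return "ort_thread_pool"
    if openmp_enabled:
        # No pool given; fall back to OpenMP when it is available.
        return "openmp"
    # No pool and no OpenMP: run on the calling thread only.
    return "single_threaded"
```

Note that a passed-in thread pool wins even when OpenMP is enabled, which is how the list in the commit message reads.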
Hariharan Seshadri
44a42a6a98
Fix parsing initial hidden state in RNN (#1626)
* Fix the way initial hidden state is used for reverse direction in RNN

* Add test case

* Updates
2019-08-16 10:12:46 -07:00
shahasad
f9834105aa removed --gen_doc (#1633) 2019-08-16 09:52:36 -07:00
shahasad
c9eb13a638
Copy System.Numerics.Tensors sources from dotnet/corefx into onnxruntime (#1605)
Copy System.Numerics.Tensors sources from dotnet/corefx into onnxruntime
2019-08-15 17:28:47 -07:00
Ashwini Khade
0044be6259
update onnx to latest commit (#1622)
* update onnx to latest commit

* Disable and/or fix failing tests

* disable not yet implemented tests for opset 11

* disable tests

* fix bug in mkldnn fp16 graph check
2019-08-15 17:10:32 -07:00
Hariharan Seshadri
1835640d94
Support int64 for ReduceMax (#1625) 2019-08-15 14:48:59 -07:00
Dmitri Smirnov
17c8fe44e3
Integrate featurizers (#1573)
Added Sample Featurizer and Infrastructure
  Make featurizers and unit tests compile and run with GTest.
  Create definitions for the first featurizer kernel.
  Add new operator domain.
  Create datetime_transformer kernel and build.
  Move OPAQUE types definitions for featurizers kernels out to a separate cc.
  Register them with the type system.
  Provide unit tests for new AutoML DateTimeTransformer kernel.
  Make necessary adjustments to the test infrastructure to make it run
  with new types.
2019-08-15 13:59:59 -07:00
Hariharan Seshadri
7545b795df
Fix incorrect box offset computation in NMS op (#1624)
* More changes

* Fix NMS

* nits
2019-08-15 11:41:10 -07:00
shahasad
0c5d2c998b
Generate documentation from the registered operator kernels (#1395)
- Added python script for generating markdown doc from the registered opkernels. 
- Made some conditional changes in the pybind to expose necessary python API
- Added some missing type-constraints in the op kernel registrations
2019-08-14 18:12:24 -07:00
Pranav Sharma
8d12ce45cf
Use a friendly enum for graph optimization level. (#1586)
* Mention OrtCreateSessionFromArray in C API doc

* review changes

* use enum for graph optimization level

* Use explicit values for enums

* updates...

* Add friendly enum for graph optimization levels in C, C# and Python APIs.

* Fix linux build

* Fix build breakage due to master merge

* PR comments
2019-08-14 17:12:08 -07:00
jywu-msft
24d17f4353
Fix trtlogger segfault. re-enable SoftPlus unit test for TRT. add doc… (#1623)
* Fix trtlogger segfault. re-enable SoftPlus unit test for TRT. add documentation for ORT_TENSORRT* env vars.

* Update TensorRT-ExecutionProvider.md
2019-08-14 16:34:39 -07:00
Hariharan Seshadri
09db1e06b5
Make changes to pipeline template to include missing headers in tars/zips (#1617) 2019-08-14 13:51:29 -07:00
shahasad
a6a5acedda
Cleanup csharp API SessionOptions and RunOptions to be consistent with other APIs (#1570)
- Updated SessionOptions API to use properties instead of setter/getter methods. 
- Added missing APIs. 
- Added RunOptions.
2019-08-14 12:02:02 -07:00
Ke Zhang
bd64ca3019
Kezhan/execute graph refactoring (#1553)
* checking execution provider logic updated.

* fix the logic of copy input and output.

* update

* update

* update

* update

* update

* update

* fix ngraph failure.

* fix comments
2019-08-14 01:07:05 -07:00
Scott McKay
b405482cfa
Remove copy of generator in Multinomial (#1611)
* Remove copy of generator in Multinomial so that different values are generated each time.
Add ability to test
2019-08-14 10:58:54 +10:00
Scott McKay
b5de1324ef
Fix log message truncation on Windows when printf formatting is used. (#1599)
* Fix log message truncation and add unit test. On Windows vsnprintf_s returns -1 when truncating so we need to differentiate that from a real error.
2019-08-14 07:53:45 +10:00
pulkittomar
a50a63aa9e Serialize optimized onnx model (#1470)
* Model serialization

* Removed duplicate symbol

* Minor update

* Review comments

* add tests

* Model serialization

* Removed duplicate symbol

* Minor update

* Merged PR 1106437: Model Serialization in onnxruntime

* Review comments

* Merged PR 1107226: Review comments

Review comments

* add tests

* Fixed merge conflict

* Correct python tests

* InferenceSession Refeed Test

* Replace use of widechar const literal-L

* Fixed failing tests

* Updated comment

* Removed unnecessary session options

* Spell check on comments

* Do not serialize when level 3 optimization specified

* Updated error logs

* Changed log severity to WARN
2019-08-12 18:43:40 -07:00
Scott McKay
8a559d75ae
Minor perf improvements. (#1580)
* Minor perf improvements.

- Cache the vector sizes in IExecutionFrame and NodeIndexInfo to avoid calls to size().
  - 2 instructions instead of 10
- Remove an unnecessary check in IExecutionFrame
  - add a check to the ctor so we guarantee it's unnecessary
- Reserve memory for the vectors in BroadcastIterator
  - saves reallocs if more than one value is added
    - but rare with the mlperf models for multiple values to be added so benefit is limited.
  - slight tweak to the Broadcaster ctor code to make it more readable
2019-08-13 09:05:48 +10:00
Pranav Sharma
a6a4c4c079
Fix perf test executable. (#1598)
* Mention OrtCreateSessionFromArray in C API doc

* Fix perf test executable due to removal of certain C APIs

* fix linux build

* Avoid duplication

* Fix mem leak
2019-08-12 09:49:29 -07:00
AlbertSadovnikov
ce3c8f98dd Fix for CPU random ops seed narrowing conversion. (#1594) 2019-08-12 09:01:13 -07:00
Malik Shahzad Muzaffar
df9b1b8ec8 Include io_win32.h only when building on Windows (#1587)
* Include io_win32.h only when building on Windows

* looks like include order matters
2019-08-12 08:18:42 -07:00
Tomasz Dołbniak
69baf9e800 Update nGraph to v0.22.1 (#1582)
* Update nGraph to 0.21 and adjust the EP

* Share the graph initializers between custom ops

* Update nGraph to 0.22 and exclude Gather entirely

* Enable building on Windows with nGraph v0.21.1-rc.0

* Disable the unsigned input Shrink op tests for nGraph until the next update

* Line-shortening code refactor

* Fix for the master branch merge artifact

* MKLDNN patches adjustment for Windows

* Exclude MatMulInteger for non-const zero points

* Exclude ConvInteger for non-const zero points

* Enable full Cast op support

* Use the v0.22.1 tag

* Skip ConvTranspose_InvalidKernelShape test for ngraph provider

* Create sub-graph ModelProto from fused_node
2019-08-10 17:41:08 -07:00
Ashwini Khade
7be40b2946
put all gemmlowp common code in one place (#1590)
* put all gemmlowp common code in one place

* fix gpu build failures

* minor update
2019-08-10 17:01:07 -07:00
Ke Zhang
59c9d83f35
add int64 support for less op. (#1604) 2019-08-09 17:16:57 -07:00
Wei-Sheng Chin
0187d876cb Implement new LabelEncoder in opset 2 in ML domain (#1393)
* Implement new LabelEncoder in opset 2 in ML domain

* Fix compilation error

* Fix tests

* Include ONNX's fix

* Formatting and addressing a comment

* Address a minor comment
2019-08-09 14:03:58 -07:00