Commit graph

188 commits

Author SHA1 Message Date
Scott McKay
41d55ea274
Update the GraphProto for subgraphs when saving the Graph. (#647)
* Update the GraphProto for subgraphs when saving the Graph. This is required to produce a valid overall Graph if the Graph has been optimized.
2019-10-23 15:14:06 -07:00
Ryan Hill
6fca8b0a94 Move CXX API global into the header (#2228) 2019-10-23 14:15:53 -07:00
edgchen1
856c6cae0a Edgchen1/endian utils (#2181) 2019-10-21 22:28:35 -07:00
Scott McKay
3cda9f717b Relax shape inferencing error handling if model uses an old opset (#2199) 2019-10-21 10:51:22 -07:00
Paul McDaniel
d1159b7008 Adding platform telemetry (#2109) 2019-10-19 18:25:57 -07:00
Ashwini Khade
fc3c168402
Graph Optimizations Doc (#2050)
* Initial draft

* updates per review

* fix link

* plus one more link fix

* small changes to the optimizer documentation

* some more changes

* done

* update C_API with doc link
2019-10-18 08:03:40 -07:00
Changming Sun
13f8b49d58
Fix kernel registry bug (#2137) 2019-10-17 23:10:54 -07:00
Scott McKay
3fcb4ee7d4
Refine optimizers (#1407)
* Refine optimizers

* Address PR comments

* Changes from PR comments and discussion.

* Fixed signed/unsigned mismatch

* Address PR comments

* Address PR comments

* Fix linux build

* Fix issue with mkldnn logic.

* Turn off optimizers by default for operator unit tests.

* Handle edge case of graph with no nodes in partitioner so all execution providers don't need to.

* Comment out change to turn off optimizers for unit tests. Add details on what needs to be done to re-enable.
2019-10-15 14:49:59 -07:00
Adrian Tsai
4090d0d0de
Add DirectML Execution Provider (#2057)
This change adds a new execution provider powered by [DirectML](https://aka.ms/DirectML).

DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning on Windows. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers.

The DirectML execution provider is capable of greatly improving evaluation time of models using commodity GPU hardware, without sacrificing broad hardware support or requiring vendor-specific extensions to be installed.

**Note** that the DML EP code was moved verbatim from the existing WindowsAI project, which is why it doesn't yet conform to the onnxruntime coding style. This is something that can be fixed later; we would like to keep formatting/whitespace changes to a minimum for the time being to make it easier to port fixes from WindowsAI to ORT during this transition.

Summary of changes:
* Initial commit of DML EP files under onnxruntime/core/providers/dml
* Add cmake entries for building the DML EP and for pulling down the DirectML redist using nuget
* Add a submodule dependency on the Windows Implementation Library (WIL)
* Add docs under docs/execution_providers/DirectML-ExecutionProvider.md
* Add support for DML EP to provider tests and perf tests
* Add support for DML EP to fns_candy_style_transfer sample
* Add entries to the C ABI for instantiating the DML EP
2019-10-15 06:13:07 -07:00
Pranav Sharma
91db840b6b
Introduce execution mode enum for clarity and extensibility; Change Python, C and C# APIs accordingly; Removed EnableSequentialExecution, DisableSequentialExecution in favor of the more general SetExecutionModeAPI. (#2098)
* Introduce execution mode for clarity and extensibility; Change Python APIs accordingly; Replace DisableSequentialExecution API with EnableParallelExecution for clarity.

* Fix cuda build

* Modify the test slightly

* Make C and C# APIs consistent with Python.
2019-10-14 09:48:19 -07:00
Scott McKay
eb24617d2e Add ability to get symbolic dimension info for graph inputs and outputs. (#2051)
* Add ability to get symbolic dimension info for graph inputs and outputs.
WIP to get initial feedback.

* Fix linxu build error.
Update C# API and add unit test

* Clarify the two different ways Tensor shape and type info is created. One is from concrete values and one is from a type proto where symbolic dimensions may exist. Doing so allows a change to default to empty strings for the symbolic dimensions if not provided.
2019-10-12 15:46:28 -07:00
Scott McKay
50faab308b
Remove 'Ort' prefix from OrtAddFreeDimensionOverride for consistency. (#2099) 2019-10-12 06:11:29 +10:00
Ryan Hill
e8e33977da
Ryanunderhill/customop dll (#2002)
* Add OrtApiBase
* Add RegisterCustomOpsLibrary API
2019-10-11 11:12:51 -07:00
Scott McKay
bdfff800ea Move access to intra-op threadpool into OpKernelContext. (#2091) 2019-10-11 10:36:20 -07:00
Scott McKay
db0dd09ded
Cleanup some aspects of the Initializer class used by optimizers (#2005)
* Move check on data type outside of the Initializer class as it's specific to Conv processing.
Use references for arguments that can't be null.
2019-10-09 10:37:44 +10:00
Pranav Sharma
ea60469af5
Support seq(tensor), implement 2 sequence ops that use the new type. (#1983)
* Mention OrtCreateSessionFromArray in C API doc

* fix seq of tensors

* changes on 9/30

* All tests passing

* Add SequenceAt op

* Fix shared_lib non_tensor_types test

* Address some PR comments

* Address PR comments

* Add support in python bindings to accept seq(tensor)

* Change data type from vector<Tensor> to TensorSeq

* Change data type from vector<Tensor> to TensorSeq

* Added some documentation

* Added missing test model

* Fix Linux build

* Fix Mac build

* Fix Mac build
2019-10-07 15:35:09 -07:00
Scott McKay
e58827fa62
Add Unique operator. (#1900)
* Add Unique operator.
* Enable onnx tests. Disable one with incorrect expected output and add unit test to validate ORT behavior. Need onnx update to fix (will address that separately but don't want to block this checkin on that change).
2019-10-04 22:11:55 +10:00
Dmitri Smirnov
627f853a44
Downgrade compiler to CentOS 4.8.5 (#1985)
Make onnxruntime CPU build and run on CentOS GCC 4.8.5
2019-10-03 15:40:46 -07:00
Dmitri Smirnov
d1b1cdc5c4
Replace GSL with GSL-LITE submodule and fix up refs (#1920)
Remove gsl subodule and replace with a local copy of gsl-lite
  Refactor for onnxruntime::make_unique
  gsl::span size and index are now size_t
  Remove lambda auto argument type detection.
  Remove constexpr from fail_fast in gsl due to Linux not being happy.
  Comment out std::stream support due to MacOS std lib broken.
  Move make_unique into include/core/common so it is accessible for server builds.
  Relax requirements for onnxruntime/test/providers/cpu/ml/write_scores_test.cc
  due to x86 build.
  Add ONNXRUNTIME_ROOT to Server Lib includes so gsl is recognized
2019-10-01 12:43:29 -07:00
Scott McKay
bd2d6af9ca
Filter out info from non-const initializers during shape inferencing (#1806)
* Don't return shape for non-const initializer in InferenceContextImpl::getInputType
Don't return initializer for non-const initializer in InferenceContextImpl::getInputData
Update graph_utils to support these scenarios
  - fix GetConstantInitializer to make sure a name is for an outer scope value before checking a parent graph, as local name could shadow an outer scope initializer.
2019-09-26 13:44:33 +10:00
jywu-msft
686bd36210
Remove ml_status.h, add StatusCode to pybind exception mappings (#1889)
* initial checkin.

* add onnxruntime status code to ort pybind exception mapping.

* address review feedback.
2019-09-24 11:13:14 -07:00
Pranav Sharma
1a3ded6a7b
Add C API for free dim override, fix missing API mention in InferenceTest.cs, fix confusing print statement in perf_test. (#1884)
* Mention OrtCreateSessionFromArray in C API doc

* Add C API for free dim override

* Add C API for free dim override, fix missing API mention in InferenceTest.cs, fix confusing print statement in perf_test.

* Remaining C#files

* fix c# build

* Run the tests in blame mode. This option is helpful in isolating a problematic test causing the test host to crash.

* fix order
2019-09-23 17:58:20 -07:00
Ryan Hill
5781222456
Ryanunderhill/api interface (#1855)
* Convert ABI to a versioned interface.
* Convert ORT_THROW_ON_ERROR to inline function to fix link errors.
2019-09-20 13:39:11 -07:00
Adrian Tsai
a7beed798e Implement L1 graph transformer for free dimension override (#1825)
* Implement FreeDimensionOverrideTransformer

* Add test

* Fix compiler warnings

* Update comment

* LOGS_DEFAULT

* Merge from master
2019-09-20 10:52:14 -07:00
Dmitri Smirnov
6a9ae65f41
Expose GetOverridableInitializers via Python and C/C++ API (#1878)
Implement GetOverridableInitializers()
 Add unit test for initializer override.
Expose in Python and C/C++ API
2019-09-19 15:43:28 -07:00
Pranav Sharma
a9ce941579
Refine threading control options and move inter op thread pool to session state. (#1841)
Description: Refine threading control options and move inter op thread pool to session state.
Added thread_utils.h/cc to centralize the decision around the thread pool size under various conditions.

Motivation and Context
Currently the thread pool size of the parallel executor is hardcoded to 32 for some reason. This PR makes the options to configure the thread pool sizes clearer.
2019-09-18 22:36:23 -07:00
KeDengMS
80bda77203
Fix symbolic shape inference for faster_rcnn, mask_rcnn, yolov3 (#1867)
* Fix symbolic shape inference for faster_rcnn, mask_rcnn, yolov3

Force merge when --auto_merge, on symbolic dims which sympy cannot simplify
Add symbolic inference for Resize opset 10
Add support for step != 1 in Slice
Add support for computed dim in TopK
Bug fixes in passing symbolic dims from subgraph
Fix an outdate comment in Nuphar provider header
2019-09-18 14:18:32 -07:00
Bowen Bao
8712a523a4
Bump onnx to latest (#1756)
* Bump onnx to latest

Update onnx.in.proto with changes for SparseTensor.

* add temp skip tests

* remove passed tests from skip list

* skip more tests for new ops in opset 11

* skip crashing tests

* update handling of new attribute types sparse tensor and sparse tensors

* advance onnx commit and remove skip cpu_flaky_tests

* temporarily skip yolo3 model test due to resize opset10 shape inference regression

* update proto for onnxruntime server

* advance onnx commit further
2019-09-12 11:46:49 -07:00
Pranav Sharma
f8c3442880
Part 2 of renaming AllocatorInfo to MemoryInfo. (#1804)
* Mention OrtCreateSessionFromArray in C API doc

* Part 2 of renaming AllocatorInfo to MemoryInfo.

* pr comments

* fix comment
2019-09-12 08:19:29 -07:00
Dmitri Smirnov
fe8915863c
Implement C API entry points for creating and fetching non-standard types to OrtValue (#1714)
C/C++ Opage APIs
 Add new virtual interfaces for NonTensorType
 Implement entry points.
 Add shared header for the data container.
 Add export symbols.
 Add serialization/deserialization.
 Implement model with Opaque types.
 Rework opqaue_api_test as a standalone executable.
2019-09-11 14:52:47 -07:00
Scott McKay
3b7f047a49
General performance testing tooling improvements (#1577)
* Miscellaneous updates to help with perf testing
2019-09-11 19:46:59 +10:00
Pranav Sharma
f9d85d654a
Add GetDataTransfer() interface in the EP. (#1773)
* Mention OrtCreateSessionFromArray in C API doc

* Add GetDataTransfer() interface in the EP.

* Check return status of RegisterDataTransfer

* Address PR comments
2019-09-10 14:07:17 -07:00
Scott McKay
98dbdb1e0b
Rework the feed/fetch copy setup so that it can be calculated prior to subgraph execution (#1761)
* Rework the feed/fetch copy setup so that it can be calculated upfront by the control flow nodes. Also simplifies how it all works.
Update the control flow nodes to do the calculation prior to graph execution.
2019-09-10 15:46:00 +10:00
Scott McKay
2e242a4089
Clarify naming of the API involving the RunOptions terminate flag. (#1768)
* Clarify naming of the RunOptions terminate flag.

* Update C# code to use new names.
2019-09-10 08:32:33 +10:00
Ashwini Khade
b2a2326a45
add dequantize and quantize back to contrib ops (#1712) 2019-09-06 08:55:42 -07:00
Pranav Sharma
52fe574fed
Rename OrtAllocatorInfo to OrtMemoryInfo to make it more obvious. (#1758)
* Mention OrtCreateSessionFromArray in C API doc

* Rename OrtAllocatorInfo to OrtMemoryInfo to avoid confusion
2019-09-05 14:20:37 -07:00
KeDengMS
c9240f4e93
Implementation of Nuphar execution provider (#881)
* Implement Nuphar execution provider

Nuphar execution provider is a TVM-based compilation provider. It has shown great speedups for RNN models using Scan.
This PR is mainly for a preview of the shared codegen library for other TVM-based providers.

* Fix submodules

* Fix TVM submodule

* Update Nuphar to latest and resolve confliction

* Remove stale files caused by merge -X theirs

* Revert heap buffer change to not introduce onnxruntime_framework into onnxruntime_perf_test

* Fix bad merge

* Merge from Nuphar

* Fix warning treated as error, revert some unnecessary changes

* Revert some more test changes

* Some more test revert or comments to make review easier
New tests could be added later

* One more revert of unnecessary changes

* More change revert. Test could be added back later.
2019-09-01 23:01:47 -07:00
Changming Sun
81ad48080b
Remove TaskThreadPool (#1713) 2019-08-28 18:00:10 -07:00
Pranav Sharma
4035fe842e
Don't create the default allocator every single time. Rename API accordingly. Expose Session/Run log severity levels. (#1615)
* Mention OrtCreateSessionFromArray in C API doc

* Don't create the default allocator every single time. Rename API accordingly.

* Don't create the default allocator every single time. Rename API accordingly.

* updates...

* updates...

* PR comments

* fix typo in license header

* fix build
2019-08-23 10:33:20 -07:00
Changming Sun
224dde7ef1
Allow user disable multiple threading (#1647) 2019-08-19 18:12:39 -07:00
Changming Sun
6b89c7ad04
Let mlas use session thread pool (#1609)
1.Let mlas use session thread pool
2.Remove onnxruntime_USE_MLAS cmake option
3. Remove the win32 thread pool code inside mlas

mlas will:

1.use ort thread pool if it get passed in
2.use openmp if the threadpool parameter is nullptr
3.run single threaded if the threadpool parameter is nullptr and openmp is disabled.
2019-08-16 13:21:15 -07:00
Dmitri Smirnov
17c8fe44e3
Integrate featurizers (#1573)
Added Sample Featurizer and Infrastructure
  Make featurizers and unit tests compile and run with GTest.
  Create definitions for the first featurizer kernel.
  Add new operator domain.
  Create datetime_transformer kernel and build.
  Move OPAQUE types definitions for featurizers kerneles out to a separate cc.
  Register them with the type system.
 Provide unit tests for new AutoML DateTimeTransformer kernel.
  Make necessary adjustments to the test infrastructure to make it run
  with new types.
2019-08-15 13:59:59 -07:00
shahasad
0c5d2c998b
Generate documentation from the registered operator kernels (#1395)
- Added python script for generating markdown doc from the registered opkernels. 
- Made some conditional changes in the pybind to expose necessary python API
- Added some missing type-constraints in the op kernel registrations
2019-08-14 18:12:24 -07:00
Pranav Sharma
8d12ce45cf
Use a friendly enum for graph optimization level. (#1586)
* Mention OrtCreateSessionFromArray in C API doc

* review changes

* use enum for graph optimization level

* Use explicit values for enums

* updates...

* Add friendly enum for graph optimization levels in C, C# and Python APIs.

* Fix linux build

* Fix build breakage due to master merge

* PR comments
2019-08-14 17:12:08 -07:00
Ke Zhang
bd64ca3019
Kezhan/execute graph refactoring (#1553)
* checking execution provider logic updated.

* fix the logic of copy input and output.

* update

* update

* update

* update

* update

* update

* fix ngraph failure.

* fix comments
2019-08-14 01:07:05 -07:00
pulkittomar
a50a63aa9e Serialize optimized onnx model (#1470)
* Model serialization

* Removed duplicate symbol

* Minor update

* Review comments

* add tests

* Model serialization

* Removed duplicate symbol

* Minor update

* Merged PR 1106437: Model Serialization in onnxruntime

* Review comments

* Merged PR 1107226: Review comments

Review comments

* add tests

* Fixed merge conflict

* Correct python tests

* InferenceSesssion Refeed Test

* Replace use of widechar const literal-L

* Fixed failing tests

* Updated comment

* Removed unnecessary session options

* Spell check on comments

* Do not serialize when level 3 optimization specified

* Updated error logs

* Changed log severity to WARN
2019-08-12 18:43:40 -07:00
Pranav Sharma
a6a4c4c079
Fix perf test executable. (#1598)
* Mention OrtCreateSessionFromArray in C API doc

* Fix perf test executable due to removal of certain C APIs

* fix linux build

* Avoid duplication

* Fix mem leak
2019-08-12 09:49:29 -07:00
stevenlix
1c5b15c2b8
Remove memory copy between TensorRT and CUDA (#1561)
* remove memory copy between CUDA and TRT

* add info to RegisterExecutionProvider input

* use new IDeviceAllocator for trt allocator

* remove SetDefaultInputsMemoryType from TRT EP

* remove onnx-tensorrt 5.0

* add submodule onnx-tensorrt branch 5.1

* remove redundancy

* Update transformer_memcpy.cc

* Update tensorrt_execution_provider.cc

* switch to TensorRT 5.1.5.0

* update python binding

* disable failed test case on TensorRT

* Update activation_op_test.cc

* upgrade to TensorRT container 19.06

* update according to feedback

* add comments

* remove tensorrt allocator and use cuda(gpu) allocator

* update onnx-tensorrt submodule

* change ci build cuda directory name
2019-08-08 19:31:39 -07:00
Scott McKay
6e430c0526
A few performance improvements coming out of ssd_mobilenet and ssd_resnet34 analysis (#1578)
* A few performance improvements:
 - Make the iteration in NonZero more efficient by using a raw pointer and simplifying the increment logic
   - add another unit test to check the new logic works with 3 dimensional tensor
   - gains about 2% for ssd_mobilenet
 - Avoid floating point operations on each iteration on Concat
  - about 0.5% for ssd_mobilenet and ssd_resnet34
 - Put common case first in ExecutionFrame::AllocateAsPerAllocationPlan to avoid unnecessary call to IsSparseTensor
  - about 0.05% for ssd_mobilenet
 - Minor tweak to put some ctors in the TensorShape header so they can be inlined more easily
2019-08-08 07:20:00 +10:00
Pranav Sharma
a443b013dd
Remove unneeded C APIs + some refactoring. (#1555)
* Mention OrtCreateSessionFromArray in C API doc

* c api changes after review (1)

* updates...

* fixes

* Reorder include
2019-08-07 11:05:29 -07:00