Commit graph

128 commits

Author SHA1 Message Date
Ke Zhang
3bf0e364e2
Move CopyTensor out of IExecutionProvider interface. (#1268)
* add ortdevice class

* add data transfer manager for copying tensors.

* update

* add data trasnfer for gpu

* fix constexpr build break.

* update

* remove unnecessary header files.

* remove unnecessary header files.

* add dependency

* add dependency

* add dependency

* add dependency

* fix linux build break.

* update

* fix build break

* fix build break

* fix build break

* update

* update

* update c api.

* update to not use OrtCreateAllocatorInfo

* change to all eps .

* fix linux build break

* remove useless codes.

* update

* move datatransfermanager in session state

* update

* fix cuda build break.

* fix comments

* fix windows GPU build.

* fix comments

* fix build break

* fix comments

* fix test failure

* update

* fix comments

* fix onnx runtime server.

* update

* fix test failure.

* fix comments

* fix comment
2019-07-11 14:49:20 -07:00
Tracy Sharpe
823fa3f39c
Integrate MLAS NCHWc support into ONNX Runtime (#1327)
This change integrates the NCHWc support recently added to MLAS into ONNX Runtime. When using "-o 3" optimizations, then the runtime will do a NCHWc layout optimization pass to convert standard ONNX operators such as Conv/MaxPool to the com.microsoft.nchwc domain with weights and biases reordered for speed.
2019-07-09 20:41:19 -07:00
Changming Sun
27da857b51 Fix an SAL annotation in onnxruntime_c_api.h 2019-07-09 10:14:58 -07:00
Pranav Sharma
e9ce51ead4
Make GetTensorShapeFromTensorShapeProto return TensorShape and not it's internal representation. (#1353) 2019-07-08 11:45:55 -07:00
Scott McKay
9d3b6b3a49
Disallow overriding initializers if IR version < 4 (#1324)
Description:

Disallow overriding an initializer via a graph input if the IR version is < 4. This enforces an implicit assumption that initializers should be treated as constant, and allows constant folding to be done on a model with an older IR version.
Separate constant and overridable initializers so that it's clear which ones constant folding can utilize.
Update Graph to not add all initializers to the graph inputs when the graph is manually created (i.e. not loaded from a GraphProto) and the IR version is >= 4.
Motivation and Context
In order to do constant folding we need to know which initializers can be treated as constant and which are overridable. All initializers were required to have a matching graph input prior to IR version 4, technically making all of them overridable. The intention however was for them to be treated as constants, and this change enforces that intent.

The benefit of doing so is that constant folding will work for models with IR version < 4. The cost is that if someone is actually overriding an initializer they will need to update the IR version of their model to version 4 in order to keep doing so. The belief is that this is a very small subset of usage (e.g. models involving feeding in a truncated sequence) and the cost to update that small subset is warranted by the benefit of constant folding being able to be enabled on all older models without them needing an IR version update.
2019-07-03 18:43:38 +10:00
daquexian
c65489a47f Initial PR for NNAPI execution provider (#1220)
* init

* Update DNNLibrary

* Update DNNLibrary, set compiler flags, it compiles now

* Add more missing flags, add test

* Update DNNLibrary

* Update Compile method, fix allocator and some other bugs

* Update DNNLibrary

* Implement CopyTensor

* Not delete state explicitly since it is managed by unique_ptr

* Add the missing files when SingleUnitTestProjct is ON

* misc changes

* Fix wrong name in provider factory

* Add my own test

* Update the code of add node into graph, and add the missing initializer into graph

* Fix the bug that re-build the graph produces extra output

* Update DNNLibrary

* Transpose nchw (ONNX) -> nhwc (NNAPI)

* Add license

* Add GetSupportedNodes method (implement it later)

* Rename onnxruntime_nnapi_test->onnxruntime_nnapi_squeezenet_test

* Update squeezenet_test.cpp after rebase master

* Remove squeezenet_test.cpp since it is almost same with the c++ sample

* Update DNNLibrary for GetSupportedNodes

* Update GetSupportedNodes

* Revert "Remove squeezenet_test.cpp since it is almost same with the c++ sample"

This reverts commit a97575fd9ff49e50ba1dc8d8154790d8cd86c48d.

* Update DNNLibrary

* Fix multiple outputs bug

* Remove GetKernelRegistry

* Revert "Revert "Remove squeezenet_test.cpp since it is almost same with the c++ sample""

This reverts commit 2a0670e9cbf10ea654111ce39e198a4be0ddd838.

* Set default memory type of NNAPI EP

* Add CPUOutput allocator

* Update DNNLibrary for multiple outputs

* Fix bug of nhwc->nchw

* Remove GetExecutionHandle()
2019-07-02 06:03:29 -07:00
Scott McKay
4d765dc6d0
Return error message from status instead of swallowing it. (#1221)
* Return error message from status instead of swallowing it.

* Return OrtValue* from OpKernelContext::GetOrCreateOutputMLValue

* Add unstaged change.
2019-06-22 06:26:42 +10:00
Changming Sun
766c6b6163
Add an API for retrieve ORT version (#1263)
* Add an API for retrieve ORT version
2019-06-20 15:42:12 -07:00
S. Manohar Karlapalem
8d15ffd8f5 Initial commit for OpenVINO Execution Provider (#935)
* Initial commit for OpenVINO Execution Provider

OpenVINO Execution Provider provides the interface for ONNX Runtime
applications to access Intel's hardware accelerators using Intel's
OpenVINO Toolkit.

* Fixed bug in GetCapability to disable custom ops

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Added OPENVINO ci pipeline

Added new pipeline for openvino provider,
made changes to support the docker build and
onnxruntime build with openvino.

Signed-off-by: Luis Daniel Castellanos <luis.daniel.castellanos@intel.com>

* Enabled all unit tests for OpenVINO EP

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Fixed syntax issue in run_docker_build.sh file

* Added missing default OPENVINO_VERSION

Default value for OPENVINO_VERSION env was
missing causing the build to fail

* Added install Model Optimizer deps step

* Fixed python unit tests and some tests from onnx_backend_test_series

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Fixed indentation bug

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Disabled some of the python backend tests for OpenVINO

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Disabled some model tests

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Remove Duplicate checks for openvino in build.py

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Modified GetCapability for FP16

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Disabled GPU FP32 tests that are not supported

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Convert modelProto to string and use it in compile

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Pass byte-array input args to MO

* Serialized ModelProto passed in-memory to MO

ModelOptimizer python module receives the serialized  ModelProto
in-memory.
Uses appropriate ONNX function to load the serialized bytes.

* Make Py_Finalize compatible with older python versions

Also, remove pFunc unassigned variable possibility.

* Fallback if input dims of Matmul is greater than 2

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* fixup: Device #define syntax

* Updated the documentation

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Enable dynamic dim value

* removed commented out code

* Added Dockerfile for openvino EP

Updated instructions on dockerfiles/README.md file

Signed-off-by: Luis Daniel Castellanos <luis.daniel.castellanos@intel.com>

* Disabled fp16_inception_v1 test

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Code formatting with clang-format

Uses style from the .clang-format file in root directory.

* fixup: docker tag and build error fixes

* Heuristics to automatically detect batching

Distributes slices from batch into parallel infer-request objects.

* Handle disabled tests in GetCapability

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Disabled average pool and max pool if ceil_mode is 1

Also dilations are not supported if they are greater than 1

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Disabled Unsqueeze int32 test

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* changes to fix output results bug

* Disabled a few C++ unit tests for MYRIAD FP16

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Manually revert '9fe162bb Enable dynamic dim value'

Reverts compile time setting of dynamic shape
Reverting manually due to significantly huge auto-revert conflicts.

* Fixed unused variable warning

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Disabled Mul test for GPU_FP16 due to accuracy issue

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* VPU documentation update

* Disabled inception_v1 for MYRIAD and HDDL

*Also disabled few C++ accuracy tests for HDDL

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* updates from upstream

* use the new CustomOpApis for I/O interfacing

* Pass initializers as subgraph meta-def inputs in GetCapability()

Requirement due to API changes introduced with PR# 1019.

* Remove obsolete functions

* Save indexes of graph inputs from fused_node info

Both inputs and initializers are passed as data inputs to the
infer function. To identify only inputs among them, save thier
index info from fused_node in Compile function.

* Documentation changes to enable VPU

* Fix VPU related changes in documentation

* Fix minor changes in documentation

* Fix VPU related changes in documentation

* Use Node.In/OutputDefs() to track graph inputs and outputs.

Don't use graph_viewer's GetInputs() or
GetInputsIncludingInitializers().

* Permit "SAME_UPPER" auto_pad attribute from MaxPool

* Disabled fp16_tiny_yolov2 in onnx model tests

* Updated documentation to include configuration guides for myriad and hddl

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Use 8 Infer requests only for VAD-R

* disable debug prints

* Clang-format source files

* Updated BUILD.md with OpenVINO R5 links

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Disabled same upper python tests

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Update test exclusion syntax

* Change path of install_onnx.sh

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Disable tiny_yolov2 in broken tests

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Revert "Change path of install_onnx.sh"

This reverts commit ba9db165f3be430f2aff1ef413299ed04637196a.
This change is only required for Intel internal CI pipeline until
the settings are matched with the upstream's CI pipeline.

* Added debug statements for debugging CI error

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Add --build_wheel to linux openvino pipeline

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Added -v option to onnx_test_runner for debugging

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Removed path change patch

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Added -c 1  to onnx_test_runner

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Refactor MO python invocation in separate function

Cleans up Model Optimizer python invocation check and conversion
logic. Invokes MO only once in GetCapability() and passes the
IR strings (xml and bin) to the Compiler as meta-def attributes.

* Add comments

* code cleanup and comments

* Code cleanup for GetCapability

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Removed unnecessary files

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Revert "Added -v option to onnx_test_runner for debugging"

This reverts commit d1dd70938a94d648df1a1dbbc2e48d0b97e49ec8.

* Revert "Added debug statements for debugging CI error"

This reverts commit b86d41afed2aa29c3508155d6f9c8d3a7263cc60.

* incorporate Status Code changes

* ComputeFunc returns Status::OK() on success

* Use test names to disable tests for MYRIAD and VAD-R

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Rename local identifiers from CNNNetwork to OpenVINO network

CNNNetwork is an OpenVINO's API class that represents more than
just convolutional neural networks (CNNs). Renaming helps to avoid
confusion that the API's only support CNN type models.

* Added error message if building on windows

* Removed duplicate option in Cmake
* Removed unnecessary parameters in activation_opt_test

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Refactor Map search and access logic for efficiently and cleanliness.

* use C++ style casts

* Use os.path.join for python directory path operations

* use C++ style casts

* EP classes should use onnxruntime namespace

* Clean up fixes from PR comments

* Don't explicitly shutdown Py interpreter

* Remove debug print statements

Prints will be re-enabled later with a logging mechanism with
debug/verbose printing options.

* Decrement ref counts for used pyObjects

* Restore build instructions for other compilers

Content under the "Using other compilers" section has been
accidentally deleted by a previous commit. Restoring back that
content from the latest upstream repo.

* CMake code cleanup

Code clean up, commenting and formatting of CMake code.

* Don't pass the unused device_info parameter to OpenVINOGraph ctor.

* Add support for multiple I/O data types

Adds support for the following tensor data types for graph inputs
and outputs:
1) float
2) float16
3) int32
4) int16
5) int8
6) uint16
7) uint8

* cleanup setup.py module list definition

* Deduce index of input using tracked input index map

Ignores initializers in case they are ordered before inputs.

* Removed debug statement in MO code

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* PR feedback

* Removed per_sample_tolerance for openvino
* Removed unnecessary disabled tests

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Removed debug function

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Disabled tiny_yolo_v2 due to accuracy issues

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Changed the disabled reason for broken tests

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Disabled Reshape with no input

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Python formatting with Autopep8

* Minor fix for MYRIAD devices

* Added zero dimension check

*Removed setting batch size for the network

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Set the threshold to larger value for MNIST

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Removed setting higher threshold in provider_test_utils

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>

* Check for --use_openvino in python wheel setup.py

Add openvino modules to the setup script for building the wheel
package only for --use_openvino a build option.

* Removed nullptr checks for GetNode()

Signed-off-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
2019-06-18 08:58:53 -07:00
Changming Sun
f22d1df04f
Fix a bug that constant folding doesn't check the returned value of OpKernel::Compute (#1243) 2019-06-18 07:45:09 -07:00
Ke Zhang
13d8558404
move environment.h/cc from framework to session project/folder. (#1241) 2019-06-17 18:01:21 -07:00
Ke Zhang
d448f39832
refactor a little bit to not ask every execution provider to give an implementation of getkernalregistry. It's not needed for a runtime-based exp actually. (#1231) 2019-06-17 13:50:07 -07:00
Ryan Hill
38963d81eb
Simplify some CustomOp code (#1206)
test_inference.cc reformatted with clang-format
2019-06-11 17:00:04 -07:00
Scott McKay
87d65389e6
Add ability to change the logging severity of the default logger. (#1165)
Add ability to set the session and run logger severity via SessionOptions and RunOptions
Inherit severity from the next logger up if logger severity isn't specified in SessionOptions or RunOptions
Expose ability to set default logger severity in python bindings.
2019-06-12 08:54:03 +10:00
Ryan Hill
3c3186c761
Convert more C APIs to return OrtStatus (#1194)
* Change SessionOptions APIs to always return a status, for consistency and ease of use (a couple returned 0 or -1 for success/failure)
2019-06-10 18:36:04 -07:00
Changming Sun
d8ac0d64d0
Make C API capable of defining CUDA custom ops (#1178)
Recreate the PR on behalf of Rui Xia, for #779
2019-06-06 13:45:32 -07:00
Ryan Hill
b68bb51dd0
Change SessionOptions APIs to always return a status (#1171)
* Change SessionOptions APIs to always return a status, for consistency and ease of use (a couple returned 0 or -1 for success/failure)
2019-06-06 13:24:24 -07:00
G. Ramalingam
b23ab6a06e
Implementation of sparse tensor (#1121)
* Initial implementation of sparse tensor

* minor cleanup

* minor cleanup (remove empty line)

* simplify template usage in test-case

* address linux build error

* fix constructor order to address compiler warning

* Address PR comments

* handle allocation in optimizer execution frame

* address compiler warning message and PR feedback comment

* address gcc unused warning for protobuf code

* address PR comment
2019-06-06 11:50:38 -07:00
Klein Hu
1a86421aff Create a syslog sink for logging in !Win32 env (#1163)
* Create a syslog sink for logging in !Win32 env

* Move syslog level logic to syslog_sink.c
2019-06-06 16:35:06 +10:00
Ryan Hill
148141dd5f Change Compute function to return a status code instead of an integer. (#1139) 2019-06-04 08:34:32 -07:00
Changming Sun
c18de6817b
Rename MLValue to OrtValue (#1154) 2019-06-03 17:29:55 -07:00
Du Li
05110a6558
Adding custom op ConvTransposeWithDynamicPads. (#638)
* Adding custom op ConvTransposeWithDynamicPads.

* Adding custom op ConvTransposeWithDynamicPads.

* adding cuda kernels

* fix a bug

* fix build issue.

* Integrate PR comments.
2019-05-31 11:48:43 -07:00
Ryan Hill
f9f6818e4c
Add comments and organize the C++ header into the main header plus a separate one of the inline methods. (#1130) 2019-05-30 14:24:25 -07:00
Ke Zhang
ef66395060
remove unnecessary lock and code clean up (#1114)
* refactoring the ep codes.

* remove unnecessary lock.

* fix the comment to claim KernelRegistryManager is not thread safe.

* clarify that APIs to add custom op in inferencesession is not thread safe.
2019-05-29 13:20:38 -07:00
Konstantinos Karanasos
ee6217972b
Fix when rewrite rule gets registered to multiple op types; update constness of rule methods; enable dropout elimination (#1098) 2019-05-24 13:47:55 -07:00
Ryan Hill
9129a652c5
Ryanunderhill/cxx api2 (#1091)
More C++ API improvements and cleanup
Add templates to tensor creation
Add run method that allows preallocated outputs
Simplify CreateTensor<T> to multiply by sizeof(T)
Convert io_types code
Optimize away vector copies in Session::Run
2019-05-24 11:15:51 -07:00
Ryan Hill
3a32b0eb99
Change function kernels to use CustomOp APIs (#1020)
* Change function signature
* Convert compute to use custom op style APIs
* Remove dead CustomOp function
* Use CustomOp API in TensorRT EP
* Switch to new API in ngraph
2019-05-20 14:57:43 -07:00
Changming Sun
2663b9c443
Remove unnecessary casts from OrtValue to MLValue(#1051) 2019-05-17 07:52:59 -07:00
Ke Zhang
a7039601c4
setting input/output for partitioned function subgraph (#1019)
* setting input/output for partitioned function subgraph

* fixed TensorRT related failure

* fix failures for tvm and ngraph

* update
2019-05-17 11:19:28 +08:00
Changming Sun
99556b111d
Make MemPatternPlanner on/off switchable in model weight loading (#989) 2019-05-16 14:39:09 -07:00
Changming Sun
e4225f8234
Combine OrtValue and MLValue into one type (#1043)
* Combine OrtValue and MLValue into one type
2019-05-16 10:22:49 -07:00
Changming Sun
5062be612e
update (#1041) 2019-05-15 16:41:41 -07:00
Changming Sun
e42099480e
Clean up code (#1033)
Some trivial changes for making gcc compiler happy. Won't impact runtime behavior.
2019-05-15 10:00:39 -07:00
Ryan Hill
3408494407
More C++ API improvements and conversions (#998)
* More C++ API improvements and conversions
* Mark more constructors as explicit
* Fix CSharp function name changes
* Change more test cases to use C++ API
2019-05-13 13:56:54 -07:00
Mika Fischer
c0acb8b6c3 Allow loading model from in-memory byte-array (#718) 2019-05-11 05:58:50 +08:00
Ryan Hill
f73ce305e9
C++ wrapper for ABI (#958) 2019-05-03 19:32:46 -07:00
Konstantinos Karanasos
feab3088fb
Conv(Add|Mul|BN)Fusion as rewrite rules (#863)
* Converted ConvAddFusion, ConvMulFusion, and ConvBNFusion to rewrite rules
* Extended graph_utils::RemoveNode
* Introduced RewriteRuleEffect enum
2019-05-01 13:23:29 -07:00
Scott McKay
0ad940027c
Use ConstPointerContainer for Node::ImplicitInputDefs() for better consistency with InputDefs() and OutputDefs(). (#894) 2019-05-01 14:22:28 +10:00
Konstantinos Karanasos
1b7d1f2645
Convert constant folding to a transformer (#866) 2019-04-29 18:12:49 -07:00
Ke Zhang
f39a8d1f59
allow users to set graph inputs and outputs fully. (#905)
* allow users to set graph inputs and outputs fully.

* update

* update the comments of the APIs

* update

* remove commented-out codes.

* fix test failures.

* fix comments.

* adding more check to throw not support exception right now.
2019-04-29 15:58:39 +08:00
ybrnathan
b0a37477db Fix memory corruption issue when CPU->CUDA memcpy is involved (#879) 2019-04-22 20:21:14 -07:00
Yufeng Li
0bf12e9dbf
Add option to enable/disable memory pattern back (#872)
Memory pattern doesn't work for parallel executor by design. Enabling Memory Pattern for parallel executor logs warning and make the perf bad.
Add option to enable/disable memory pattern back.
2019-04-22 13:49:41 -07:00
nivas-x86
a4d7052aeb Add nGraph Execution Provider (#832)
* Add nGraph Execution Provider

* feedback changes 1

* feedback2

* Feedback and upgrade nGraph

* Feedback 4

* Fix CI

* Disable new ops
2019-04-20 17:02:35 -07:00
Konstantinos Karanasos
ada90086f7
More efficient rule-based transformer (#815)
Introduce a quick pre-filtering of rules based on the node op types they are targeting.
The goal is to avoid evaluating all rules for all nodes. Instead, for each node, we will only be evaluating the rules associated with its op type.
2019-04-18 17:10:13 -07:00
shschaefer
ff253631b5
Enable use of session based threadpool. (#854)
* Enable use of session based threadpool.

* Fix build dir issue
2019-04-18 10:20:46 -07:00
Ke Zhang
951c428ee1
Simplify the validation in Run call (#850)
* Simpplify Run()

* remove the lock

* remove a file added wrongly.

* fix tests

* fix c# test
2019-04-18 08:38:17 +08:00
Ke Zhang
41dc3130f5
no need putting initializers (for constant node) into graph inputs. (#665)
* constant node should not be put into graph inputs any more.

* simplify graph input/output set logic.

* refactor comments.

* remove adding initializers as graph inputs when creating graph from scratch.
2019-04-17 07:38:08 +08:00
Ashwini Khade
14d63b5f45
generate transformers bug fix (#838)
* fix graph transformer generation

* add more tests

* cosmetic changes

* more changes per review
2019-04-16 14:10:33 -07:00
Tracy Sharpe
f19d9a4907
Reduce code size of kernel registration (#833)
Some changes that reduce the size of the release onnxruntime.dll by 170KB:

Change the ONNX_OPERATOR_KERNEL macros to not create a unique virtual class per kernel create lambda, but instead use a generic class with the raw function address supplied at BuildCreateKernelInfo time.

Changed the exceution providers to use a table driven approach to calling the BuildCreateKernelInfo functions instead of a massive function with construct/call/delete sequences.

The CreateFunc in data_types.h didn't need to be a std::function, eliminating more lambda virtual classes.

N.B. To accommodate MSVC 14.11 toolchain (used for CUDA builds), the operator+() syntax cannot be used to retrieve the raw function address. The older toolchain can't resolve between cdecl/vectorcall and gives up. An explicit cast is needed to help the compiler along.
2019-04-15 16:39:59 -07:00
Tracy Sharpe
c55e2de593
Status class optimizations (#824)
optimize onnxruntime::common::Status to reduce code size
2019-04-11 21:57:01 -07:00