onnxruntime/docs/execution_providers
sfatimar 7347996942
Openvino ep 2021.2 (#6196)
* Enabling fasterrcnn variant and vehicle detector

* changes for 2021_2 branch

* yolov3_pytorch commit

* fixed braces in basic_backend.cc

* ci information added

* faster rcnn variant and vehicle detector changes were made in 2021.1 and not in 2021.2

* some changes to support unit tests

* disable some tests which are failing

* fix myriad tests for vehicle detector

* Did some cleanup
*cleaned up comments
*Disabled Add_Broadcast_0x1 and Add_Broadcast_1x0
tests on MYRIAD_FP16 backend due to a bug
*cleaned up capability_2021_2.cc file
*Removed extra conditions which were added
for some validation in backend_utils

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* yolov3 pytorch workaround to ensure that the output names are matched

* gemmoptest fixed on myriad

* Fixed MYRIADX CPP Test Failures

*Expand,GatherND,Range,Round op's
are only supported in model

*where op with float input data
types are not supported and fixed

*Scatter and ScatterElements op's with
negative axis are fixed

*Reshape op with 0 dim value are not
supported and fixed

*Disabled InstanceNorm_2 test on MYRIADX

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* make changes to yolov3 pytorch

* Fixed python unit tests
*Fixed failing python tests on vpu,
GPU and CPU

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Fixes POW op failures on GPU_FP16

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Clean up capability_2021_2.cc

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Updated docx for MultiThreading option
*Added extra info on setting the num_of_threads
option using the API and it's actual usage

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* fixed slice and removed extra prints

* Disabled failing python tests

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Minor changes added in capabilty_2021_2

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* made changes to slice to avoid failures

* Disabling FP16 support for GPU_FP32
->Inferencing an FP16 model on GPU_FP32
leads to accuracy mismatches. so, we would
rather use GPU_FP16 to infer an FP16 model
on GPU Device

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* Updated docx for Inferencing a FP16 Model

Signed-off-by: MaajidKhan <n.maajidkhan@gmail.com>

* fix for mask rcnn

* Script for installing openvino from source

* Updated with openvino 2021.2 online installation

* code comment fixes
fixed accuracy mismatch for div

* Update OpenvinoEP-ExecutionProvider.md

updated for 2021.2 branch

* Update README.md

updated dockerfile documentation

* Update BUILD.md

build.md update documentation

* permissiong change of install_openvino.sh

* made changes to align with microsoft onnxruntime changes

* Updated with ov 2021.2.200

Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com>
Co-authored-by: sfatimar <sahar.fatima@intel/com>
Co-authored-by: MaajidKhan <n.maajidkhan@gmail.com>
Co-authored-by: mohdansx <mohdx.ansari@intel.com>
2020-12-23 08:47:22 -08:00
..
images Updated docs: Execution Provider overview (#5328) 2020-10-06 15:01:25 -07:00
ACL-ExecutionProvider.md Updating EP docs with Onnxruntime API calls (#5503) 2020-10-19 12:21:21 -07:00
ArmNN-ExecutionProvider.md Updating EP docs with Onnxruntime API calls (#5503) 2020-10-19 12:21:21 -07:00
DirectML-ExecutionProvider.md Switch to unified DirectML 1.4.0 redistributable (#5794) 2020-11-17 13:42:23 -08:00
DNNL-ExecutionProvider.md Updating EP docs with Onnxruntime API calls (#5503) 2020-10-19 12:21:21 -07:00
MIGraphX-ExecutionProvider.md Updating EP docs with Onnxruntime API calls (#5503) 2020-10-19 12:21:21 -07:00
MKL-DNN-Subgraphs.md Renaming MKL-DNN as DNNL (#2515) 2019-12-03 07:34:23 -08:00
NNAPI-ExecutionProvider.md Updating EP docs with Onnxruntime API calls (#5503) 2020-10-19 12:21:21 -07:00
Nuphar-ExecutionProvider.md Updating EP docs with Onnxruntime API calls (#5503) 2020-10-19 12:21:21 -07:00
OpenVINO-ExecutionProvider.md Openvino ep 2021.2 (#6196) 2020-12-23 08:47:22 -08:00
README.md Updated docs: Execution Provider overview (#5328) 2020-10-06 15:01:25 -07:00
RKNPU-ExecutionProvider.md Updating EP docs with Onnxruntime API calls (#5503) 2020-10-19 12:21:21 -07:00
TensorRT-ExecutionProvider.md Update TensorRT-ExecutionProvider.md (#6161) 2020-12-17 17:10:40 -08:00
Vitis-AI-ExecutionProvider.md [Vitis-AI EP] Fix to enable multi-output subgraphs inside Vitis-AI EP + edit docs (#4171) 2020-06-13 04:56:07 -07:00

Introduction

ONNX Runtime is capable of working with different HW acceleration libraries to execute the ONNX models on the hardware platform. ONNX Runtime supports an extensible framework, called Execution Providers (EP), to integrate with the HW specific libraries. This interface enables flexibility for the AP application developer to deploy their ONNX models in different environments in the cloud and the edge and optimize the execution by taking advantage of the compute capabilities of the platform.

Executing ONNX models across different HW environments

ONNX Runtime works with the execution provider(s) using the GetCapability() interface to allocate specific nodes or sub-graphs for execution by the EP library in supported hardware. The EP libraries that are preinstalled in the execution environment processes and executes the ONNX sub-graph on the hardware. This architecture abstracts out the details of the hardware specific libraries that are essential to optimizing the execution of deep neural networks across hardware platforms like CPU, GPU, FPGA or specialized NPUs.

ONNX Runtime GetCapability()

ONNX Runtime supports many different execution providers today. Some of the EPs are in GA and used in live service. Many are in released in preview to enable developers to develop and customize their application using the different options.

Adding an Execution Provider

Developers of specialized HW acceleration solutions can integrate with ONNX Runtime to execute ONNX models on their stack. To create an EP to interface with ONNX Runtime you must first identify a unique name for the EP. Follow the steps outlined here to integrate your code in the repo.

Building ONNX Runtime package with EPs

The ONNX Runtime package can be built with any combination of the EPs along with the default CPU execution provider. Note that if multiple EPs are combined into the same ONNX Runtime package then all the dependent libraries must be present in the execution environment. The steps for producing the ONNX Runtime package with different EPs is documented here.

APIs for Execution Provider

The same ONNX Runtime API is used across all EPs. This provides the consistent interface for applications to run with different HW acceleration platforms. The APIs to set EP options are available across Python, C/C++/C#, Java and node.js. Note we are updating our API support to get parity across all language binding and will update specifics here.

`get_providers`: Return list of registered execution providers.
`get_provider_options`: Return the registered execution providers' configurations.
`set_providers`: Register the given list of execution providers. The underlying session is re-created. 
    The list of providers is ordered by Priority. For example ['CUDAExecutionProvider', 'CPUExecutionProvider']
    means execute a node using CUDAExecutionProvider if capable, otherwise execute using CPUExecutionProvider.

Using Execution Providers

import onnxruntime as rt

#define the priority order for the execution providers
# prefer CUDA Execution Provider over CPU Execution Provider
EP_list = ['CUDAExecutionProvider', 'CPUExecutionProvider']

# initialize the model.onnx
sess = rt.InferenceSession("model.onnx", providers=EP_list)

# get the outputs metadata as a list of :class:`onnxruntime.NodeArg`
output_name = sess.get_outputs()[0].name

# get the inputs metadata as a list of :class:`onnxruntime.NodeArg`
input_name = sess.get_inputs()[0].name

# inference run using image_data as the input to the model 
detections = sess.run([output_name], {input_name: image_data})[0]

print("Output shape:", detections.shape)

# Process the image to mark the inference points 
image = post.image_postprocess(original_image, input_size, detections)
image = Image.fromarray(image)
image.save("kite-with-objects.jpg")

# Update EP priority to only CPUExecutionProvider
sess.set_providers('CPUExecutionProvider')

cpu_detection = sess.run(...)