mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-18 21:21:17 +00:00

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

Yang Chen d486481455 Correctly handle implicit inputs for fused nodes (#2390)	2019-11-21 10:27:09 -08:00

* Correctly handle implicit inputs for fused nodes

  Previously, nuphar's partitioning function didn't include a node's implicit inputs in the inputs list of MetaDef, and hence a crash was triggered in the onnx graph checker. This commit fixed the issue.

  Furthermore, it also fixed a related issue where we didn't add implicit inputs into graph_inputs_excluding_initializers_ in Graph::SetGraphInputsOutputs. The issue was that graph_inputs_including_initializers_ populated by SetInputs (e.g. called by FunctionImpl::FunctionImpl) may contain implicit inputs which were not any node's initializers in the graph. Because they were not part of any initializers, these implicit inputs couldn't be visited by going through all nodes' inputs. Consequently, they would not be added into graph_inputs_excluding_initializers_.

  We fixed the issue by first copying the populated graph_inputs_including_initializers_ into graph_inputs_excluding_initializers_, which then had both initializers and non-initializers as its initial content. Later, we erase initializers from the list. In this way, we can ensure all implicit inputs remain in graph_inputs_excluding_initializers_.

* refined comments and fixed duplicates

  Address CR by revisiting comments in terms of implicit inputs. Also fixed an issue by skipping duplicates while copying inputs from graph_inputs_including_initializers_.

* address CR

  explain why we need to collect nodes' implicit inputs

* don't rely on pointer values for iterating std::set

  Previously, openvino relied on iterating a set of NodeArg pointers to construct inputs and outputs for a fused graph. It could cause non-determinism. The reason was that although iterating std::set by itself is stable, pointer values of NodeArgs may vary. Consequently, we could end up visiting the set's elements in different orders for different runs of the same test, which resulted in constructing inputs (and outputs) with different orders for the fused graph. For example, for the same test, we may have inputs [A, B] in some runs but inputs [B, A] in others.

  Let's use std::string as the key type to avoid such nondeterminism.

  This commit also added implicit inputs into meta->inputs while returning the capability from the openvino provider.

* Fixed another latent issue in openvino's GetCapability function

  The issue was that we couldn't simply erase fused_inputs and fused_outputs while iterating the nodes. For example, an output NodeArg may have multiple uses, and it's wrong if we erase it from fused_outputs when we encounter only one of its uses as input.
.github	Issue template update (#1339)	2019-07-07 23:38:52 -07:00
cmake	Avoid using the default logger in the graph lib and optimizers (#2361)	2019-11-14 13:23:28 -08:00
csharp	Set ElementType to String type of node metadata, instead of byte[] (#2348)	2019-11-08 14:52:56 -08:00
dockerfiles	[NupharEP] Update notebook and docker image (#2416)	2019-11-18 10:38:14 -08:00
docs	onnxrt server documentation update (#2396)	2019-11-18 15:31:07 -08:00
include/onnxruntime/core	Avoid using the default logger in the graph lib and optimizers (#2361)	2019-11-14 13:23:28 -08:00
onnxruntime	Correctly handle implicit inputs for fused nodes (#2390)	2019-11-21 10:27:09 -08:00
package/rpm	Update version number	2019-10-30 08:13:09 -07:00
samples	Updated links in docs (#2303)	2019-11-03 09:10:56 -08:00
tools	Fix Windows GPU C API packaging pipeline failure (#2440)	2019-11-20 14:00:37 -08:00
.clang-format	Initial bootstrap commit.	2018-11-19 16:48:22 -08:00
.clang-tidy	Add remaining build options and make minor changes in documentation (#39)	2018-11-27 19:59:40 -08:00
.dockerignore	Allow building Docker container based on a different git repo. (#1222)	2019-06-20 09:55:42 -07:00
.gitattributes	Initial bootstrap commit.	2018-11-19 16:48:22 -08:00
.gitignore	Add CUDA Scan operator. (#2403)	2019-11-21 07:59:06 +10:00
.gitmodules	Add DirectML Execution Provider (#2057)	2019-10-15 06:13:07 -07:00
build.amd64.1411.bat	Initial bootstrap commit.	2018-11-19 16:48:22 -08:00
build.bat	Initial bootstrap commit.	2018-11-19 16:48:22 -08:00
BUILD.md	onnxrt server documentation update (#2396)	2019-11-18 15:31:07 -08:00
build.sh	update	2019-01-09 15:49:27 -08:00
cgmanifest.json	Update ONNX to 1.6.1 (#2235)	2019-10-23 13:47:45 -07:00
CODEOWNERS	Fix codeowners file	2018-11-27 23:42:17 -08:00
CONTRIBUTING.md	Miscellaneous fixes (#123)	2018-12-06 22:21:04 -08:00
LICENSE	Initial bootstrap commit.	2018-11-19 16:48:22 -08:00
NuGet.config	Add DirectML Execution Provider (#2057)	2019-10-15 06:13:07 -07:00
ort.wprp	Add Tracelogging for profiling (#1639)	2019-11-11 21:34:10 -08:00
packages.config	Add DirectML Execution Provider (#2057)	2019-10-15 06:13:07 -07:00
README.md	Updated links in docs (#2303)	2019-11-03 09:10:56 -08:00
requirements-dev.txt	Implementation of Nuphar execution provider (#881)	2019-09-01 23:01:47 -07:00
requirements-doc.txt	Update readme.rst for pypi, change documentation style (#1663)	2019-10-19 18:26:34 -07:00
requirements.txt	Initial bootstrap commit.	2018-11-19 16:48:22 -08:00
setup.py	[OpenVINO-EP] Update to latest version: OpenVINO 2019 R3.1 (#2308)	2019-11-05 19:55:46 -08:00
ThirdPartyNotices.txt	Add DirectML Execution Provider (#2057)	2019-10-15 06:13:07 -07:00
VERSION_NUMBER	Update version number	2019-10-30 08:13:09 -07:00

README.md

ONNX Runtime is a performance-focused complete scoring engine for Open Neural Network Exchange (ONNX) models, with an open extensible architecture to continually address the latest developments in AI and Deep Learning. ONNX Runtime stays up to date with the ONNX standard and supports all operators from the ONNX v1.2+ spec with both forwards and backwards compatibility. Please refer to this page for ONNX opset compatibility details.

ONNX is an interoperable format for machine learning models supported by various ML and DNN frameworks and tools. The universal format makes it easier to interoperate between frameworks and maximize the reach of hardware optimization investments.

Key Features

Samples and Tutorials

Setup

Installation
- APIs and Official Builds
- Building from Source

Usage

Getting ONNX Models
Deploying ONNX Runtime
Performance Tuning

More Info

Technical Design Details
Extensibility Options

Data/Telemetry

Contributions and Feedback

License

Key Features

Run any ONNX model

ONNX Runtime provides comprehensive support of the ONNX spec and can be used to run all models based on ONNX v1.2.1 and higher. See version compatibility details here.

Traditional ML support

In addition to DNN models, ONNX Runtime fully supports the ONNX-ML profile of the ONNX spec for traditional ML scenarios.

For the full set of operators and types supported, please see operator documentation

Note: Some operators not supported in the current ONNX version may be available as a Contrib Operator

High Performance

ONNX Runtime supports both CPU and GPU. Using various graph optimizations and accelerators, ONNX Runtime can provide lower latency compared to other runtimes for faster end-to-end customer experiences and minimized machine utilization costs.

Currently ONNX Runtime supports the following accelerators:

MLAS (Microsoft Linear Algebra Subprograms)
NVIDIA CUDA
Intel MKL-ML
Intel MKL-DNN - subgraph optimization
Intel nGraph
NVIDIA TensorRT
Intel OpenVINO
Nuphar Model Compiler
DirectML
ACL (in preview, for ARM Compute Library)

Not all variations are supported in the official release builds, but they can be built from source following these instructions.
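Because a given build may include only some of these accelerators, an application usually has to fall back to whatever is present. The following is a minimal sketch of that fallback logic, not part of ONNX Runtime itself: `pick_providers` and `PREFERRED` are hypothetical names, and the provider name strings are assumptions to check against what your build reports (recent releases of the Python package expose `onnxruntime.get_available_providers()` for this).

```python
# Hypothetical helper: order execution providers by preference.
# The provider name strings are assumptions for illustration; use the names
# that your ONNX Runtime build actually reports.
PREFERRED = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]


def pick_providers(available, preferred=PREFERRED):
    """Return the preferred providers that are available, in preference order."""
    available_set = set(available)
    chosen = [name for name in preferred if name in available_set]
    if not chosen:
        raise RuntimeError("none of the preferred execution providers is available")
    return chosen


if __name__ == "__main__":
    # Example: a build that reports only the CPU provider.
    print(pick_providers(["CPUExecutionProvider"]))
```

Keeping the preference list in one place makes it easy to move the same application between a CPU-only package and a GPU package without code changes.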

We are continuously working to integrate new execution providers for further improvements in latency and efficiency. If you are interested in contributing a new execution provider, please see this page.

Cross Platform

ONNX Runtime is currently available for Linux, Windows, and Mac with Python, C#, C++, and C APIs. Please see API documentation and package installation.
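As an illustration of the Python API, here is a minimal sketch that loads a model and runs it once on a zero-filled input. It assumes the `onnxruntime` and `numpy` packages are installed, that the model's first input is a float32 tensor, and that a model file path is passed on the command line; `concrete_shape` and `run_once` are hypothetical helper names, not part of the library.

```python
def concrete_shape(shape, default=1):
    """Replace dynamic dimensions (None or symbolic names) with a fixed size."""
    return [dim if isinstance(dim, int) else default for dim in shape]


def run_once(model_path):
    """Load a model and run it once on zeros shaped like its first input."""
    # Imported lazily so this file can be imported without the packages installed.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path)
    first_input = session.get_inputs()[0]
    # Assumption: the input is float32; adjust the dtype to match first_input.type.
    data = np.zeros(concrete_shape(first_input.shape), dtype=np.float32)
    # Passing None as the output list requests all model outputs.
    return session.run(None, {first_input.name: data})


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        for output in run_once(sys.argv[1]):
            print(type(output))
```

Models with several inputs need one entry per input in the feed dictionary. Newer GPU-enabled packages may also expect an explicit `providers=[...]` argument when creating the session.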

If you have specific scenarios that are not supported, please share your suggestions and scenario details via GitHub Issues.

Installation

Quick Start: The ONNX-Ecosystem Docker container image is available on Dockerhub and includes ONNX Runtime (CPU, Python), dependencies, tools to convert from various frameworks, and Jupyter notebooks to help get started.

Additional dockerfiles can be found here.

APIs and Official Builds

API Documentation

Python
C
C#
C++
Ruby (external project)

Official Builds

	CPU (MLAS+Eigen)	CPU (MKL-ML)	GPU (CUDA)
Python	pypi: onnxruntime; Windows (x64), Linux (x64), Mac OS X (x64)	--	pypi: onnxruntime-gpu; Windows (x64), Linux (x64)
C#	Nuget: Microsoft.ML.OnnxRuntime; Windows (x64, x86), Linux (x64, x86), Mac OS X (x64)	Nuget: Microsoft.ML.OnnxRuntime.MKLML; Windows (x64), Linux (x64), Mac OS X (x64)	Nuget: Microsoft.ML.OnnxRuntime.Gpu; Windows (x64), Linux (x64)
C/C++ wrapper	Nuget: Microsoft.ML.OnnxRuntime; .zip, .tgz; Windows (x64, x86), Linux (x64, x86), Mac OS X (x64)	Nuget: Microsoft.ML.OnnxRuntime.MKLML; Windows (x64), Linux (x64), Mac OS X (x64)	Nuget: Microsoft.ML.OnnxRuntime.Gpu; .zip, .tgz; Windows (x64), Linux (x64)

System Requirements (pre-requisite dependencies)

ONNX Runtime binaries in the CPU packages use OpenMP and depend on the library being available at runtime on the system.
- For Windows, OpenMP support comes as part of the VC runtime. It is also available as redist packages: vc_redist.x64.exe and vc_redist.x86.exe.
- For Linux, the system must have libgomp.so.1, which can be installed using apt-get install libgomp1.
GPU builds require the CUDA runtime libraries to be installed on the system:
- Version: CUDA 10.0 and cuDNN 7.6
- Older ONNX Runtime releases used CUDA 9.1 and cuDNN 7.1; please refer to prior release notes for more details.
Python binaries are compatible with Python 3.5-3.7. See Python Dev Notes. If using pip to download the Python binaries, run pip install --upgrade pip prior to downloading.
Certain operators make use of system locales. Installation of the English language package and configuration of the en_US.UTF-8 locale are required.
- For Ubuntu, install the language-pack-en package.
- Run the following commands: locale-gen en_US.UTF-8, then update-locale LANG=en_US.UTF-8
- Follow a similar procedure to configure other locales on other platforms.
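
The requirements above can be checked before installing with a short script. This is a best-effort sketch using only the Python standard library; the function names are hypothetical, the OpenMP lookup is only meaningful on Linux, and a `False` result means "not found by this check", not proof that the dependency is missing.

```python
import ctypes.util
import locale
import sys

# Python versions listed as supported above (inclusive range).
SUPPORTED_PYTHON = ((3, 5), (3, 7))


def python_supported(version=None):
    """Return True if the (major, minor) version falls in the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    low, high = SUPPORTED_PYTHON
    return low <= major_minor <= high


def has_openmp_runtime():
    """Best-effort lookup of the GNU OpenMP runtime (libgomp) on Linux."""
    return ctypes.util.find_library("gomp") is not None


def has_en_us_utf8():
    """Try to activate en_US.UTF-8 for LC_CTYPE, then restore the previous setting."""
    previous = locale.setlocale(locale.LC_CTYPE)
    try:
        locale.setlocale(locale.LC_CTYPE, "en_US.UTF-8")
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, previous)


if __name__ == "__main__":
    print("Python version supported:", python_supported())
    print("libgomp found:", has_openmp_runtime())
    print("en_US.UTF-8 locale available:", has_en_us_utf8())
```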

Building from Source

If additional build flavors and/or dockerfiles are needed, please find instructions at Build ONNX Runtime. For production scenarios, it's strongly recommended to build only from an official release branch.

Usage

Getting ONNX Models

The ONNX Model Zoo has popular ready-to-use pre-trained models.
To export or convert a model trained in various frameworks to ONNX, see ONNX Tutorials. Versioning compatibility information can be found under Versioning.
Other services that can be used to create ONNX models include:

Deploying ONNX Runtime

Cloud

ONNX Runtime can be deployed to the cloud for model inferencing using Azure Machine Learning Services. See detailed instructions and sample notebooks.

ONNX Runtime Server (beta) is a hosted application for serving ONNX models using ONNX Runtime, providing a REST API for prediction. Usage details can be found here, and image installation instructions are here.
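
To give a feel for calling such a REST endpoint, the sketch below builds (but does not send) a JSON prediction request using only the Python standard library. The route layout, port, and payload shape are assumptions for illustration and should be confirmed against the server usage documentation; `predict_url` and `build_predict_request` are hypothetical helper names.

```python
import json
import urllib.request


def predict_url(host, port, model_name, version):
    """Build a prediction URL. The route layout is an assumption, not confirmed here."""
    return "http://{}:{}/v1/models/{}/versions/{}:predict".format(
        host, port, model_name, version
    )


def build_predict_request(url, payload):
    """Wrap a JSON-serializable payload in a POST request object (not sent)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder host, port, model name, and payload for illustration only.
    url = predict_url("localhost", 8001, "mymodel", 1)
    request = build_predict_request(url, {"inputs": {}})
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```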

IoT and edge devices

The expanding focus and selection of IoT devices with sensors and consistent signal streams introduces new opportunities to move AI workloads to the edge.

This is particularly important when there are massive volumes of incoming data/signals that may not be efficient or useful to push to the cloud due to storage or latency considerations. Consider: surveillance tapes where 99% of footage is uneventful, or real-time person detection scenarios where immediate action is required. In these scenarios, directly executing model inferencing on the target device is crucial for optimal assistance.

To deploy AI workloads to these edge devices and take advantage of hardware acceleration capabilities on the target device, see these reference implementations.

Local applications

ONNX Runtime packages are published to PyPI and NuGet (see Official Builds) and/or can be built from source for local application development. Find samples here using the C++ API.

On newer Windows 10 devices (1809+), ONNX Runtime is available by default as part of the OS and is accessible via the Windows Machine Learning APIs. Find tutorials here for building a Windows Desktop or UWP application using WinML.

Performance Tuning

ONNX Runtime is open and extensible, supporting a broad set of configurations and execution providers for model acceleration. For performance tuning guidance, please see this page.

To tune performance for ONNX models, the ONNX Go Live tool "OLive" provides an easy-to-use pipeline for converting models to ONNX and optimizing performance for inferencing with ONNX Runtime.

Technical Design Details

Extensibility Options

Data/Telemetry

This project may collect usage data and send it to Microsoft to help improve our products and services. See the privacy statement for more details.

Contribute

We welcome contributions! Please see the contribution guidelines.

Feedback

For any feedback or to report a bug, please file a GitHub Issue.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

MIT License