mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-14 20:48:00 +00:00

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

sfatimar 6d2a30eae3 [OPENVINO-EP] 2021.1 Release (#5431 ) * Cmake changes for 2021.1 * added new ov version 2020.1 for faster rcnn * Added missing defs * equal op modified * changes to incoroporate faster rcnn * backend util.cc * hddl_plugin_config.hpp is depreceated . instead use hddl_config.hpp * changing myriad precision bool to i32 * gather is not enabled for gpu * conv2D and pooltest auto_pad attribute should not be null * negative indices are not valid for scatter op in myriad * non max suppression op only supported in faster rcnn mode * maxpool indices output is not supported * Cleaned redundant code in backends * Added ifdefs for HDDL config * cast output dimensions check topk operator k input it seems only resolved for myriad as it is throwing issues for ask rcnn . need to verify * we are limiting the subgraph size to 3 here * taking care of review comments * Fixed minor bugs * Modified Slice op checks * Added NonZero, Upsample * Removed TopK if it's in the middle of a subgraph * incorporated upsample conditions too * Dockerfile changes for 2021.1 release * dockerfile aptkey update * Minor fixes * ceil condition added again * Fixed few gpu models * Disabled LSTM and yolov3 in ModelTests * python softmax cross entropy tests and negative log likelihood * Update Build.md Updated for openvino 2021.1 * Update OpenVINO-ExecutionProvider.md update openvino execution provider for 2021.1 * Update READMe.md updated new openvino version * Update Dockerfile.openvino added environment variable for DEBIAN Frontend * Fixed myriad models * Fixed gather condition * Fixed mask rcnn model on myriad * Modified Gather condition * set default target of MCR dockerfile to MYRIAD_FP16 * Fixed tinyolov3 on CPU * Update OpenVINO-ExecutionProvider.md update openvino execution provider documentation * Update Dockerfile.openvino Removed environment variable * Update OpenVINO-ExecutionProvider.md update image manipulation networks supported * Update 
onnx_backend_test_series_filters.jsonc removed test_upsample_nearest from cpu test cases * New InternalCI changes for 2021.1 * Full protobuf removed for OpenVINO * Protobuf added * Updated with apt installation for openvino * Revert the testing changes * Reverted testing changes * File permessions are changed to original * Deleted openvino installation and cmake change * Optimized Dockerfile Removed unnecessary cmake installation, numpy * Added missing ifdefs * delete array fix * backend_utils.cc output_shape * Revert "set default target of MCR dockerfile to MYRIAD_FP16" This reverts commit 928d3e2b71e2f589cf51dacd3a133951cf9ca18d. Co-authored-by: suryasidd <surya.siddharth.pemmaraju@intel.com> Co-authored-by: sfatimar <sahar.fatima@intel/com> Co-authored-by: suryasidd <48925384+suryasidd@users.noreply.github.com> Co-authored-by: S. Manohar Karlapalem <manohar.karlapalem@intel.com> Co-authored-by: Aravind <aravindx.gunda@intel.com> Co-authored-by: Aravind Gunda <38353114+gundaarx@users.noreply.github.com>		2020-10-14 15:56:00 -07:00
.github	Update stale.yml with current labels and mark stale items as "stale" (#4831 )	2020-08-18 13:25:57 -07:00
cgmanifests	Update protobuf submodule url (#5477 )	2020-10-14 02:35:38 -07:00
cmake	[OPENVINO-EP] 2021.1 Release (#5431 )	2020-10-14 15:56:00 -07:00
csharp	Add GetProfilingStartTimeNs() to Python/C# APIs (#5280 )	2020-10-14 05:32:43 -07:00
dockerfiles	[OPENVINO-EP] 2021.1 Release (#5431 )	2020-10-14 15:56:00 -07:00
docs	[OPENVINO-EP] 2021.1 Release (#5431 )	2020-10-14 15:56:00 -07:00
include/onnxruntime/core	Add GetProfilingStartTimeNs() to Python/C# APIs (#5280 )	2020-10-14 05:32:43 -07:00
java	javadoc warning fix (#5332 )	2020-10-02 11:52:07 -07:00
nodejs	bump version to 1.5.2 (#5420 )	2020-10-08 16:30:13 -07:00
onnxruntime	[OPENVINO-EP] 2021.1 Release (#5431 )	2020-10-14 15:56:00 -07:00
orttraining	Add CUDA option to run copy in default stream (#5445 )	2020-10-12 22:12:05 -07:00
package/rpm	bump version to 1.5.2 (#5420 )	2020-10-08 16:30:13 -07:00
samples	Fix commands in README.md. (#5459 )	2020-10-12 17:53:09 -07:00
server	[Android NNAPI EP] Remove dependency on external JD/DNNLibrary (#4576 )	2020-07-22 14:08:12 -07:00
tools	[OPENVINO-EP] 2021.1 Release (#5431 )	2020-10-14 15:56:00 -07:00
winml	Fix com ptr refcount (#5404 )	2020-10-08 10:18:38 -07:00
.clang-format	Initial bootstrap commit.	2018-11-19 16:48:22 -08:00
.clang-tidy	Add remaining build options and make minor changes in documentation (#39 )	2018-11-27 19:59:40 -08:00
.dockerignore	Allow building Docker container based on a different git repo. (#1222 )	2019-06-20 09:55:42 -07:00
.flake8	Re-enable PEP8 check in Win CI build (#4075 )	2020-05-30 09:10:05 +10:00
.gitattributes	Initial bootstrap commit.	2018-11-19 16:48:22 -08:00
.gitignore	Build system enhancements (#5012 )	2020-09-02 10:13:26 -07:00
.gitmodules	Update protobuf submodule url (#5477 )	2020-10-14 02:35:38 -07:00
build.amd64.1411.bat	Initial bootstrap commit.	2018-11-19 16:48:22 -08:00
build.bat	Initial bootstrap commit.	2018-11-19 16:48:22 -08:00
BUILD.md	[OPENVINO-EP] 2021.1 Release (#5431 )	2020-10-14 15:56:00 -07:00
build.sh	Add iOS test pipeline and a sample app. (#5298 )	2020-09-29 13:53:11 -07:00
CODEOWNERS	Re-enable CI tests for the new PyTorch frontend (#5017 )	2020-09-04 09:36:24 -07:00
CONTRIBUTING.md	typo in contributing.md (#5340 )	2020-10-01 10:23:08 -07:00
LICENSE	Initial bootstrap commit.	2018-11-19 16:48:22 -08:00
NuGet.config	Add DirectML Execution Provider (#2057 )	2019-10-15 06:13:07 -07:00
ort.wprp	Add Tracelogging for profiling (#1639 )	2019-11-11 21:34:10 -08:00
packages.config	Update DirectML Nuget to 1.3.0 (#5274 )	2020-09-23 22:53:02 -07:00
README.md	Update code snippet in README.md	2020-10-06 17:41:56 -07:00
requirements-dev.txt	Add new PytTrch front-end (#4815 )	2020-08-17 09:45:25 -07:00
requirements-doc.txt	Update readme.rst for pypi, change documentation style (#1663 )	2019-10-19 18:26:34 -07:00
requirements.txt	Remove cerberus from wheel package (#4919 )	2020-08-26 09:00:03 -07:00
setup.py	Add transformers tools to python package (#5090 )	2020-09-10 15:42:15 -07:00
ThirdPartyNotices.txt	Add iOS test pipeline and a sample app. (#5298 )	2020-09-29 13:53:11 -07:00
VERSION_NUMBER	bump version to 1.5.2 (#5420 )	2020-10-08 16:30:13 -07:00

README.md

ONNX Runtime is a cross-platform inferencing and training accelerator compatible with many popular ML/DNN frameworks, including PyTorch, TensorFlow/Keras, scikit-learn, and more. aka.ms/onnxruntime

Many users can benefit from ONNX Runtime, including those looking to:

Improve inference performance for a wide variety of ML models
Reduce time and cost of training large models
Train in Python but deploy into a C#/C++/Java app
Run on different hardware and operating systems
Support models created in several different frameworks

ONNX Runtime inferencing APIs have been stable and production-ready since the 1.0 release in October 2019 and can enable faster customer experiences and lower costs.

The ONNX Runtime training feature was introduced in preview in May 2020. It supports acceleration of PyTorch training on multi-node NVIDIA GPUs for transformer models. Additional updates for this feature are coming soon.

Get Started
- ONNX Runtime Inferencing
- ONNX Runtime Training
Data/Telemetry
Contributions and Feedback
License

Get Started

Frequently Asked Questions

Inferencing: Start

To use ONNX Runtime, refer to the table on aka.ms/onnxruntime for instructions for different build combinations.
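Once a Python package is installed, a first inference call could look roughly like the sketch below. It is not taken from the official instructions: the function name `run_inference` is hypothetical, the model is assumed to have a single input, and the `providers` keyword is assumed to be accepted by the installed release (it can be dropped where it is not).

```python
def run_inference(model_path, input_array):
    """Run one forward pass of an ONNX model on the default CPU provider.

    `input_array` is expected to be a NumPy array whose shape and dtype
    match the model's (single) input. This helper is illustrative only.
    """
    # Imported lazily so this file can be loaded on machines where the
    # onnxruntime package has not been installed yet.
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])

    # Assumption: the model declares exactly one input tensor.
    input_name = session.get_inputs()[0].name

    # Passing None as the output list asks for every model output.
    return session.run(None, {input_name: input_array})
```

A call such as `run_inference("model.onnx", batch)` returns a list with one array per model output; `model.onnx` and `batch` are placeholders for your own model file and input data.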

Compatibility

Supporting models based on the standard ONNX format, the runtime is compatible with PyTorch, scikit-learn, TensorFlow, Keras, and all other frameworks and tools that support the interoperable format.

Getting ONNX models - tutorials

ONNX Runtime is up to date and backwards compatible with all operators (both DNN and traditional ML) from ONNX v1.2.1 onward. (ONNX compatibility details). Newer versions of ONNX Runtime support all models that worked with prior versions, so updates should not break integrations.

Supported operators/types
- Operators not supported in the current ONNX spec may be available as a Contrib Operator
Extensibility: Add a custom operator/kernel

Binaries

Official builds are available on PyPI (Python), NuGet (C#/C/C++), Maven Central (Java), and npm (node.js).

Default CPU Provider (Eigen + MLAS)
GPU Provider - NVIDIA CUDA
GPU Provider - DirectML (Windows)
- On Windows, the DirectML execution provider is recommended for optimal performance and compatibility with a broad set of GPUs.

Dev builds created from the master branch are available for testing newer changes between official releases. Please use these at your own risk. We strongly advise against deploying these to production workloads as support is limited for dev builds.

Repository	Details
PyPI (Python)	If using pip, run `pip install --upgrade pip` prior to downloading. CPU: onnxruntime / ort-nightly (dev); GPU: onnxruntime-gpu / ort-gpu-nightly (dev)
NuGet (C#/C/C++)	CPU: Microsoft.ML.OnnxRuntime / ort-nightly (dev); GPU: Microsoft.ML.OnnxRuntime.Gpu / ort-nightly (dev)
Maven Central (Java)	CPU: com.microsoft.onnxruntime/onnxruntime; GPU: com.microsoft.onnxruntime/onnxruntime_gpu
npm (node.js)	CPU: onnxruntime
Other	Contributed non-official packages (including Homebrew, Linuxbrew, and nixpkgs). These are not maintained by the core ONNX Runtime team and may have limited support; use at your discretion.
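Because the Python row of the table lists several distributions that all provide the same `onnxruntime` module, it can help to check which of them are present in an environment. The sketch below is an assumption-labeled helper, not part of ONNX Runtime: `installed_ort_distributions` is a hypothetical name, and it relies only on the standard-library `importlib.metadata` module (Python 3.8+).

```python
from importlib import metadata  # standard library in Python 3.8+

# Distribution names taken from the Python row of the table above.
ORT_DISTRIBUTIONS = (
    "onnxruntime",
    "onnxruntime-gpu",
    "ort-nightly",
    "ort-gpu-nightly",
)


def installed_ort_distributions():
    """Return {distribution name: version} for each ONNX Runtime wheel found."""
    found = {}
    for name in ORT_DISTRIBUTIONS:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution is simply not installed; keep looking.
            continue
    return found


if __name__ == "__main__":
    print(installed_ort_distributions())
```

An empty dictionary means none of the listed packages is installed; more than one entry suggests the environment may be worth cleaning up before debugging further.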

System Requirements

The following are required for usage of the official published packages.

Visual C++ Runtime (for Windows packages)
- Requires Visual C++ 2019 runtime
System language
- Installation of the English language package and configuring en_US.UTF-8 locale is required, as certain operators make use of system locales.
- For Ubuntu, install language-pack-en package
  - Run the following commands: `locale-gen en_US.UTF-8` and `update-locale LANG=en_US.UTF-8`
  - Follow similar procedure to configure other locales on other platforms.
Default CPU
- ONNX Runtime binaries in the CPU packages use OpenMP and depend on the library being available at runtime in the system.
  - For Windows, OpenMP support comes as part of VC runtime. It is also available as redist packages: vc_redist.x64.exe and vc_redist.x86.exe
  - For Linux, the system must have libgomp.so.1 which can be installed using apt-get install libgomp1.
  - For Mac OS X, the system must have libomp.dylib which can be installed using brew install libomp.
Default GPU (CUDA)
- The default GPU build requires the CUDA runtime libraries to be installed on the system:
  - Version: CUDA 10.2 and cuDNN 8.0.3
- Version dependencies from older ONNX Runtime releases can be found in prior release notes.
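The locale and OpenMP requirements listed above can be checked from Python before installing or importing the CPU package. The following is a minimal sketch using only the standard library; the helper names `has_locale` and `find_openmp_runtime` are hypothetical, the OpenMP lookup covers Linux and macOS only, and `ctypes.util.find_library` may miss libraries installed outside the default search paths.

```python
import ctypes.util
import locale
import sys


def has_locale(name="en_US.UTF-8"):
    """Return True if the named locale can be activated on this system."""
    previous = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        # Always restore whatever was active before the probe.
        locale.setlocale(locale.LC_CTYPE, previous)


def find_openmp_runtime():
    """Return the OpenMP library the loader can locate, or None.

    Looks for libomp on macOS and libgomp elsewhere, following the list above.
    """
    short_name = "omp" if sys.platform == "darwin" else "gomp"
    return ctypes.util.find_library(short_name)


if __name__ == "__main__":
    print("en_US.UTF-8 available:", has_locale())
    print("OpenMP runtime:", find_openmp_runtime())
```

A `False` or `None` result is a hint to revisit the corresponding installation step, not proof that ONNX Runtime will fail to load.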

Build from Source

For production scenarios, it's strongly recommended to build only from an official release branch.

Instructions for additional build flavors

Docker Images

ONNX-Ecosystem: includes ONNX Runtime (CPU, Python), dependencies, tools to convert from various frameworks, and Jupyter notebooks to help get started
Additional dockerfiles

API Documentation

API	Supported Versions	Samples
Python	3.5, 3.6, 3.7, 3.8 (3.8 excludes Win GPU and Linux ARM) Python Dev Notes	Samples
C#		Samples
C++		Samples
C		Samples
WinRT	Windows.AI.MachineLearning	Samples
Java	8+	Samples
Ruby (external project)	2.4-2.7	Samples
Javascript (node.js)	12.x	Samples

Supported Accelerators

Execution Providers

CPU	GPU	IoT/Edge/Mobile	Other
Default CPU - MLAS (Microsoft Linear Algebra Subprograms) + Eigen; Intel DNNL; Intel nGraph; Intel MKL-ML (build option)	NVIDIA CUDA; NVIDIA TensorRT; DirectML; AMD MIGraphX (preview)	Intel OpenVINO; ARM Compute Library (preview); Android Neural Networks API (preview); ARM-NN (preview); Rockchip NPU (preview)	Nuphar Model Compiler (preview); Xilinx Vitis-AI (preview)
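Which execution providers are usable depends on how the installed package was built, so application code often filters a preference list against what is available. The sketch below shows one way to do that; the helper `pick_providers` and the preference order are illustrative assumptions, not a recommendation from the ONNX Runtime team.

```python
# Illustrative preference order (an assumption): accelerators first, CPU last.
PREFERRED_PROVIDERS = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "DmlExecutionProvider",
    "OpenVINOExecutionProvider",
    "CPUExecutionProvider",
)


def pick_providers(available, preferred=PREFERRED_PROVIDERS):
    """Return the preferred providers that are actually available, in order."""
    chosen = [name for name in preferred if name in available]
    if not chosen:
        raise RuntimeError("none of the preferred execution providers is available")
    return chosen


if __name__ == "__main__":
    # Example with a hand-written list; in an application the list would
    # typically come from onnxruntime.get_available_providers().
    print(pick_providers(["CPUExecutionProvider", "CUDAExecutionProvider"]))
```

The resulting list can then be handed to the session when it is created, so the same code runs on machines with and without an accelerator build.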

Deploying ONNX Runtime

Cloud

ONNX Runtime can be deployed to any cloud for model inferencing, including Azure Machine Learning Services.
- Detailed instructions
- AzureML sample notebooks
ONNX Runtime Server (beta) is a hosting application for serving ONNX models using ONNX Runtime, providing a REST API for prediction.
- Usage details
- Image installation instructions

IoT and edge devices

Reference implementations

The growing range of IoT devices with sensors and continuous signal streams creates new opportunities to move AI workloads to the edge. This matters most when the volume of incoming data or signals is too large to push to the cloud efficiently, because of storage or latency constraints. Consider surveillance footage where 99% is uneventful, or real-time person detection where immediate action is required. In these scenarios, running model inferencing directly on the target device is crucial.

Client applications

Install or build the package you need to use in your application. (sample implementations using the C++ API)
On newer Windows 10 devices (1809+), ONNX Runtime is available by default as part of the OS and is accessible via the Windows Machine Learning APIs. (Tutorials for Windows Desktop or UWP app)

Training: Start

The ONNX Runtime training feature enables easy integration with existing PyTorch trainer code to accelerate execution. With a few lines of code, you can add ONNX Runtime into your existing training scripts and start seeing acceleration. The current preview version supports training acceleration for transformer models on NVIDIA GPUs.

ONNX Runtime pre-training sample: This sample is set up to pre-train the BERT-Large model to show how ONNX Runtime training can be used to accelerate training execution.

Train PyTorch model with ONNX Runtime

ONNX Runtime (ORT) can train existing PyTorch models through its optimized backend. For this, we have introduced a Python API for PyTorch, called ORTTrainer, which can be used to switch the training backend for PyTorch models (instances of torch.nn.Module) to orttrainer. This requires some changes in the trainer code, such as replacing the PyTorch optimizer and, optionally, setting flags to enable additional features such as mixed-precision training. Here is a sample code fragment to integrate ONNX Runtime Training in your PyTorch pre-training script:

NOTE: The current API is experimental and expected to see significant changes in the near future. Our goal is to improve the interface to provide a seamless integration with PyTorch training that requires minimal changes in users’ training code.

import torch
...
import onnxruntime
from onnxruntime.training import ORTTrainer, optim

# Model definition
class NeuralNet(torch.nn.Module):
  def __init__(self, input_size, hidden_size, num_classes):
    ...
  def forward(self, data):
    ...

model = NeuralNet(input_size=784, hidden_size=500, num_classes=10)
criterion = torch.nn.functional.cross_entropy
model_description = {'inputs':  [('data', ['in', 'batch_size']),
                                 ('target', ['label_x_batch_size'])],
                     'outputs': [('loss', [], True),
                                 ('output', ['out', 'batch_size'])]}

optimizer_config = optim.AdamConfig(lr=learning_rate)

trainer = ORTTrainer(model,              # model
                     model_description,  # model description
                     optimizer_config,   # optimizer configuration
                     criterion)          # loss function

# Training Loop
total_loss = 0.0  # running sum of the per-step loss
for t in range(1000):
  # forward + backward + weight update
  loss, y_pred = trainer.train_step(input_data, target_labels, learning_rate)
  total_loss += loss.item()
  ...

Build ONNX Runtime Training from source

To use ONNX Runtime training in a custom environment, like on-prem NVIDIA DGX-2 clusters, you can use these build instructions to generate the Python package to integrate into existing trainer code.

Data/Telemetry

This project may collect usage data and send it to Microsoft to help improve our products and services. See the privacy statement for more details.

Contributions and Feedback

We welcome contributions! Please see the contribution guidelines.

For any feedback or to report a bug, please file a GitHub Issue.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

This project is licensed under the MIT License.
