ONNX is an open format for machine learning (ML) models that is supported by various ML and DNN frameworks and tools. This format makes it easier to interoperate between frameworks and to maximize the reach of your hardware optimization investments. Learn more about ONNX on [https://onnx.ai](https://onnx.ai) or view the [Github Repo](https://github.com/onnx/onnx).
ONNX Runtime has an open architecture that is continually evolving to address the newest developments and challenges in AI and Deep Learning. ONNX Runtime stays up to date with the ONNX standard, supporting all ONNX releases with future compatibliity and maintaining backwards compatibility with prior releases.
ONNX Runtime continuously strives to provide top performance for a broad and growing number of usage scenarios in Machine Learning. Our investments focus on:
ONNX Runtime provides comprehensive support of the ONNX spec and can be used to run all models based on ONNX v1.2.1 and higher. See ONNX version release details [here](https://github.com/onnx/onnx/releases).
As of May 2019, ONNX Runtime supports ONNX 1.5 (opset10). See [this table](https://github.com/Microsoft/onnxruntime/blob/master/docs/Versioning.md#version-matrix) for details on ONNX Runtime and ONNX versioning compatibility,
ONNX Runtime supports both CPU and GPU hardware through a variety of execution providers. With a variety of graph optimizations and accelerators, ONNX Runtime often provides lower latency and higher efficiency compared to other runtimes. This provides faster end-to-end customer experiences and lower costs from improved machine utilization.
Currently ONNX Runtime supports CUDA, TensorRT, MLAS (Microsoft Linear Algebra Subprograms), MKL-DNN, MKL-ML, and nGraph for computation acceleration. See more details on available build options [here](https://github.com/Microsoft/onnxruntime/blob/master/BUILD.md).
We are continuously working to integrate new execution providers to provide improvements in latency and efficiency. If you are interested in contributing a new execution provider, please see [this page](docs/AddingExecutionProvider.md).
We have ongoing investments to make ONNX Runtime compatible with more platforms and architectures. If you have specific scenarios that are not currently supported, please share your suggestions via [Github Issues](https://github.com/microsoft/onnxruntime/issues).
Once you have an ONNX model, you can [install the runtime](#Installation) for your machine to try it out. There is also an [ONNX-Ecosystem Docker container](https://github.com/onnx/onnx-docker/tree/master/onnx-ecosystem) available and ready for use with the Python API.
One easy way to deploy the model on the cloud is by using [Azure Machine Learning](https://azure.microsoft.com/en-us/services/machine-learning-service). See [detailed instructions](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-build-deploy-onnx) and [sample notebooks](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml/deployment/onnx).
For an overview of the high level architecture and key decisions in the technical design of ONNX Runtime, see [Engineering Design](docs/HighLevelDesign.md).
ONNX Runtime is built with an extensible design that makes it versatile to support a wide array of models with high performance.