ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Chen Fu 4a4488baae
Release buffers for prepacked tensors (#6820)
Unsolved problems:

1. One test failure was caused by a bug in the cuDNN RNN kernels: they can allocate a buffer and only partially initialize it, and the garbage data near the tail of the buffer caused problems on some hardware. To attack this problem more broadly, should we add code to our allocators so that, during a memory fuzzing test, an allocated buffer is filled with garbage before it is returned to the caller?


2. Prepacking is used more widely than we know. For instance, the cuDNN RNN kernels also cache their weights: they merge several weight tensors into a single buffer and never touch the original weight tensors again. This is the same idea as prepacking, but they don't override the virtual function and never release the original weight tensors, which wastes memory. Some other kernels appear to behave similarly. I wonder how much memory we could save by cleaning those up too.

3. Turning off memory pattern planning does increase memory fragmentation, leading to out-of-memory errors in some training test cases. Perhaps we can revisit the idea of moving the kernel-creation stage earlier, so that during initializer deserialization we skip tracing only for the initializers that will be prepacked.
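
The allocator idea raised in item 1 can be illustrated with a small toy model. This is only a sketch in Python, not ONNX Runtime's allocator code (which is C++); the names `PoisoningAllocator`, `partially_initialized_kernel`, `has_uninitialized_tail`, and the fill value `0xCD` are all hypothetical and chosen for illustration.

```python
POISON_BYTE = 0xCD  # arbitrary non-zero fill pattern (an assumption, not an ORT constant)


class PoisoningAllocator:
    """Toy allocator that can fill new buffers with garbage before returning them."""

    def __init__(self, poison=False):
        self.poison = poison

    def alloc(self, size):
        buf = bytearray(size)  # Python zero-fills this buffer
        if self.poison:
            # Overwrite every byte so a read of unwritten memory sees obvious garbage.
            buf[:] = bytes([POISON_BYTE]) * size
        return buf


def partially_initialized_kernel(allocator, size, used):
    """Mimic a kernel that writes only the first `used` bytes of its buffer."""
    buf = allocator.alloc(size)
    buf[:used] = bytes(used)  # writes zeros to the head, leaves the tail untouched
    return buf


def has_uninitialized_tail(buf):
    """Report whether any poison byte survived, i.e. some byte was never written."""
    return POISON_BYTE in buf
```

With poisoning off, the untouched tail happens to be zero and the bug stays hidden; with poisoning on, the surviving `0xCD` bytes expose it on every run rather than only on some hardware.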
2021-03-10 14:07:20 -08:00
.github Don't mark issues that are marked as enhancement as stale (#6134) 2020-12-14 18:57:40 -08:00
cgmanifests Upgrade TensorRT to v7.2.2 (#6452) 2021-02-18 04:30:47 -08:00
cmake MLAS: quantized GEMM update (#6916) 2021-03-10 09:54:43 -08:00
csharp Fix app packaging in UWP (#6804) 2021-03-04 11:16:25 -08:00
dockerfiles Fix broken link in server usage and remove absolute path from dockerfiles readme (#6926) 2021-03-09 11:54:21 -08:00
docs Fix broken link in server usage and remove absolute path from dockerfiles readme (#6926) 2021-03-09 11:54:21 -08:00
include/onnxruntime/core Enable type reduction in EyeLike, Mod, random.cc CPU kernels. (#6960) 2021-03-10 15:32:56 +10:00
java Fix broken Java API link (#6826) 2021-03-08 11:28:41 -08:00
nodejs Removed BUILD.md from master as source now lives in gh-pages (#6709) 2021-02-19 11:34:21 -08:00
onnxruntime Release buffers for prepacked tensors (#6820) 2021-03-10 14:07:20 -08:00
orttraining Release buffers for prepacked tensors (#6820) 2021-03-10 14:07:20 -08:00
package/rpm Bumping up version to 1.7 (#6736) 2021-02-17 19:07:38 -08:00
samples fixed type to experimental session constructor (#6950) 2021-03-10 10:18:27 -08:00
server Remove nGraph Execution Provider (#5858) 2020-11-19 16:47:55 -08:00
tools Only set _native folder for Microsoft.AI.MachineLearning package (#6939) 2021-03-08 15:27:11 -08:00
winml Minor WinML model test skip name change 2021-02-17 14:27:58 -08:00
.clang-format Initial bootstrap commit. 2018-11-19 16:48:22 -08:00
.clang-tidy Add remaining build options and make minor changes in documentation (#39) 2018-11-27 19:59:40 -08:00
.dockerignore Update dockerfiles (#5929) 2020-11-25 15:38:22 -08:00
.flake8 Add ability to track per operator types in reduced build config. (#6428) 2021-01-29 07:59:51 +10:00
.gitattributes Initial bootstrap commit. 2018-11-19 16:48:22 -08:00
.gitignore Add robust dependency check for Python package (#6436) 2021-02-21 15:11:28 -08:00
.gitmodules Upgrade TensorRT to v7.2.2 (#6452) 2021-02-18 04:30:47 -08:00
build.amd64.1411.bat Initial bootstrap commit. 2018-11-19 16:48:22 -08:00
build.bat Initial bootstrap commit. 2018-11-19 16:48:22 -08:00
build.sh Add iOS test pipeline and a sample app. (#5298) 2020-09-29 13:53:11 -07:00
CODEOWNERS Update code owners for pytorch frontend team (#6329) 2021-02-02 11:09:10 -08:00
CONTRIBUTING.md Removed BUILD.md from master as source now lives in gh-pages (#6709) 2021-02-19 11:34:21 -08:00
LICENSE Remove year from license (#6658) 2021-02-12 00:25:56 -08:00
NuGet.config Delete nuget extra configs (#6477) 2021-01-27 20:25:45 -08:00
ort.wprp Add Tracelogging for profiling (#1639) 2019-11-11 21:34:10 -08:00
packages.config Update DirectML 1.4.1 to 1.4.2 for ORT 1.7 (#6780) 2021-02-23 10:52:10 -08:00
README.md Add direct link to build instructions on readme (#6729) 2021-02-19 10:56:50 -08:00
requirements-dev.txt Add ability to track per operator types in reduced build config. (#6428) 2021-01-29 07:59:51 +10:00
requirements-doc.txt Update readme.rst for pypi, change documentation style (#1663) 2019-10-19 18:26:34 -07:00
requirements.txt Remove cerberus from wheel package (#4919) 2020-08-26 09:00:03 -07:00
setup.py Add Python 3.9 to pypi metadata 2021-02-12 20:00:17 -08:00
ThirdPartyNotices.txt Merge CPU packaging pipelines (#6480) 2021-02-04 08:38:56 -08:00
VERSION_NUMBER Bumping up version to 1.7 (#6736) 2021-02-17 19:07:38 -08:00

ONNX Runtime is a cross-platform inference and training machine-learning accelerator compatible with deep learning frameworks such as PyTorch and TensorFlow/Keras, as well as classical machine learning libraries such as scikit-learn, among others.

ONNX Runtime uses the portable ONNX computation graph format, backed by execution providers optimized for operating systems, drivers and hardware.

Common use cases for ONNX Runtime:

  • Improve inference performance for a wide variety of ML models
  • Reduce time and cost of training large models
  • Train in Python but deploy into a C#/C++/Java app
  • Run with optimized performance on different hardware and operating systems
  • Support models created in several different frameworks
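
As a rough illustration of the inference use cases above, the following sketch loads an exported model with the Python API and runs it once on dummy input. It assumes the `onnxruntime` and `numpy` packages are installed and that the model has a single float32 input; the helper names `concrete_shape` and `run_once` are hypothetical and not part of ONNX Runtime.

```python
def concrete_shape(shape, default=1):
    """Replace symbolic or unknown dimensions (str or None) with a fixed size."""
    return [d if isinstance(d, int) else default for d in shape]


def run_once(model_path):
    """Load an ONNX model and run it once on zero-filled input."""
    # Imported lazily so concrete_shape can be used without these packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path)
    first_input = session.get_inputs()[0]
    # Assumption: the model takes exactly one float32 tensor as input.
    dummy = np.zeros(concrete_shape(first_input.shape), dtype=np.float32)
    # Passing None as the output list asks for every model output.
    return session.run(None, {first_input.name: dummy})
```

Calling `run_once("model.onnx")`, where the path is a placeholder for your own exported model, returns a list with one entry per model output.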

ONNX Runtime inference APIs have been stable and production-ready since the 1.0 release in October 2019, and can enable faster customer experiences and lower costs.

The ONNX Runtime training feature was introduced in preview in May 2020. It supports accelerating PyTorch training of transformer models on multi-node NVIDIA GPUs. Additional updates for this feature are coming soon.

Get Started

http://onnxruntime.ai/

Build Pipeline Status

[Build status badges. Columns: System, CPU, GPU, EPs. Rows: Windows, Linux, Mac, Android, iOS.]

Data/Telemetry

This project may collect usage data and send it to Microsoft to help improve our products and services. See the privacy statement for more details.

Contributions and Feedback

We welcome contributions! Please see the contribution guidelines.

For feature requests or bug reports, please file a GitHub Issue.

For general discussion or questions, please use GitHub Discussions.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

This project is licensed under the MIT License.