Unsolved problems:

1. One test failure was caused by a bug in the cuDNN RNN kernels: they allocate a buffer but only partially initialize it, and the garbage data near the tail of the buffer caused problems on some hardware. To attack this problem more broadly, should we add code to our allocators so that, during a memory fuzzing test, every allocated buffer is filled with garbage before being returned to the caller? (A sketch of this idea follows this list.)
2. Prepacking is used more widely than we realize. For instance, the cuDNN RNN kernels also cache their weights: they pack several weight tensors into a single buffer and never touch the original weight tensors again. This is the same idea as pre-pack, but they do not override the virtual function and never release the original weight tensors, which wastes memory. Some other kernels appear to behave similarly; it would be worth measuring how much memory we could save by cleaning those up as well. (A sketch of the override pattern also follows this list.)
3. Turning off memory pattern planning does increase memory fragmentation, leading to out-of-memory errors in some training test cases. Perhaps we can revisit the idea of moving the kernel-creation stage earlier, so that during initializer deserialization we only skip tracing the initializers that will be prepacked.
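The fuzz-fill idea in problem 1 could look roughly like the following. This is a minimal sketch, assuming a plain wrapper around whatever allocator interface the session uses; `FuzzFillAllocator` and the `fuzzing_enabled` flag are hypothetical names, not existing ONNX Runtime types.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical wrapper, not an existing ONNX Runtime class. When fuzzing is
// enabled, every freshly allocated buffer is filled with a poison pattern so
// that kernels which silently rely on leftover or zero tail bytes fail
// deterministically instead of only on some hardware.
class FuzzFillAllocator {
 public:
  explicit FuzzFillAllocator(bool fuzzing_enabled) : fuzzing_enabled_(fuzzing_enabled) {}

  void* Alloc(size_t size) {
    void* p = ::operator new(size);
    if (fuzzing_enabled_) {
      std::memset(p, 0xCD, size);  // 0xCD is an arbitrary, recognizable poison byte
    }
    return p;
  }

  void Free(void* p) { ::operator delete(p); }

 private:
  bool fuzzing_enabled_;
};
```

For problem 2, kernels that cache packed weights could opt into the existing pre-pack mechanism by overriding the pre-pack virtual function and reporting that the original initializer is no longer needed. The types and the signature below are simplified stand-ins for illustration only; the real `OpKernel::PrePack` signature in ONNX Runtime differs.

```cpp
#include <vector>

// Simplified stand-in for the real tensor type.
struct Tensor {
  std::vector<float> data;
};

class RnnKernelBase {
 public:
  virtual ~RnnKernelBase() = default;

  // Pack the incoming weight into the kernel-owned fused buffer and report
  // is_packed = true so the caller can drop the original initializer instead
  // of keeping two copies of the weights alive.
  virtual void PrePack(const Tensor& weight, int /*input_idx*/, bool& is_packed) {
    packed_weights_.insert(packed_weights_.end(), weight.data.begin(), weight.data.end());
    is_packed = true;
  }

 private:
  std::vector<float> packed_weights_;  // several weight tensors fused into one buffer
};
```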

ONNX Runtime is a cross-platform inference and training machine-learning accelerator compatible with deep learning frameworks such as PyTorch and TensorFlow/Keras, as well as classical machine learning libraries such as scikit-learn, among others.
ONNX Runtime uses the portable ONNX computation graph format, backed by execution providers optimized for operating systems, drivers and hardware.
Common use cases for ONNX Runtime:
- Improve inference performance for a wide variety of ML models
- Reduce time and cost of training large models
- Train in Python but deploy into a C#/C++/Java app
- Run with optimized performance on different hardware and operating systems
- Support models created in several different frameworks
ONNX Runtime's inference APIs have been stable and production-ready since the 1.0 release in October 2019, and can enable faster customer experiences and lower costs.
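For a sense of what the inference API looks like, here is a minimal C++ sketch using the public `onnxruntime_cxx_api.h` header; `model.onnx` and the tensor names `input`/`output` are placeholders for a real model.

```cpp
#include <onnxruntime_cxx_api.h>
#include <cstdint>
#include <vector>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "example");
  Ort::SessionOptions options;
  // On Windows the model path is a wide string; a plain char path works elsewhere.
  Ort::Session session(env, "model.onnx", options);

  // Dummy 1x3 float input; real shapes and names come from the model.
  std::vector<float> input_data{0.0f, 0.0f, 0.0f};
  std::vector<int64_t> input_shape{1, 3};
  Ort::MemoryInfo mem_info = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
  Ort::Value input = Ort::Value::CreateTensor<float>(
      mem_info, input_data.data(), input_data.size(),
      input_shape.data(), input_shape.size());

  const char* input_names[] = {"input"};
  const char* output_names[] = {"output"};
  std::vector<Ort::Value> outputs = session.Run(
      Ort::RunOptions{nullptr}, input_names, &input, 1, output_names, 1);
  return 0;
}
```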
The ONNX Runtime training feature was introduced in preview in May 2020. It supports accelerating PyTorch training of transformer models on multi-node NVIDIA GPUs, and additional updates for this feature are coming soon.
Get Started
- Install
- Inference
- Training
- Documentation
- Samples and Tutorials
- Build Instructions
- Frequently Asked Questions
Build Pipeline Status
| System | CPU | GPU | EPs |
|---|---|---|---|
| Windows | |||
| Linux | |||
| Mac | |||
| Android | |||
| iOS |
Data/Telemetry
This project may collect usage data and send it to Microsoft to help improve our products and services. See the privacy statement for more details.
Contributions and Feedback
We welcome contributions! Please see the contribution guidelines.
For feature requests or bug reports, please file a GitHub Issue.
For general discussion or questions, please use GitHub Discussions.
Code of Conduct
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
License
This project is licensed under the MIT License.