ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Tim Harris 2e09d9921a
"Sticky" allocation of worker threads (#7551)
[ This PR was previously merged as https://github.com//pull/7372, then reverted pending investigation of a lost-wake-up issue seen with ParallelExecutor. The issue was a missing test for new work pushed to a thread concurrently with a worker blocking. The change from 7372 is the addition of: https://github.com/microsoft/onnxruntime/blob/tiharr/dev-sticky-4/include/onnxruntime/core/platform/EigenNonBlockingThreadPool.h#L1473-L1492 ]

Description: This change updates the heuristics used when a thread selects which worker threads to push work to on entering a parallel loop. Previously, worker threads would maintain a best-effort bitmap of "good worker hints" indicating the threads that were likely to be spinning waiting for work. This change uses a simpler heuristic where a thread records which workers ran its previous loop, and then re-submits its next loop to those same workers. The aim is to retain affinity between a thread and a set of workers, and to avoid maintaining the "good worker hints" bitmaps.
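
The "sticky" heuristic described above can be illustrated with a small model. This is a minimal sketch, not the code in EigenNonBlockingThreadPool.h: the class `StickyDispatcher`, its method names, and the round-robin fallback used when there is no usable history are all assumptions made for illustration.

```python
class StickyDispatcher:
    """Hypothetical model of one thread that submits parallel loops to a pool."""

    def __init__(self, num_workers):
        self.num_workers = num_workers
        self.preferred = []  # workers that ran this thread's previous loop
        self._next = 0       # round-robin cursor for the fallback (assumption)

    def choose_workers(self, n):
        """Pick n distinct worker indices, reusing the previous loop's workers first."""
        n = min(n, self.num_workers)
        chosen = list(self.preferred[:n])
        # Top up when the previous loop used fewer workers than are needed now.
        while len(chosen) < n:
            candidate = self._next % self.num_workers
            self._next += 1
            if candidate not in chosen:
                chosen.append(candidate)
        return chosen

    def record_loop(self, ran_on):
        """Remember which workers actually ran the loop that just finished."""
        self.preferred = list(ran_on)


dispatcher = StickyDispatcher(num_workers=8)
first = dispatcher.choose_workers(3)
dispatcher.record_loop(first)
second = dispatcher.choose_workers(3)  # same workers as the previous loop
```

In this model the only per-thread state is the list of previous workers, which mirrors the stated aim of avoiding a shared "good worker hints" bitmap.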

Motivation and Context: Profiling suggested that maintaining the "good worker hints" was taking an unexpected amount of time, particularly on NUMA systems. In addition, when running many concurrent workloads, the hints did not provide a way to retain locality of workers, and hence of data in caches. Tested to confirm no regressions on the microbenchmark (./build/Linux/Release/onnxruntime_benchmark --benchmark_filter=BM_ThreadPoolParallelFor) and on Linux with mobilenet_v1_1.0_224.onnx, comparing p50 and p99 with vs without this change:

1 concurrent:
p50 0.0172s vs 0.0181s
p99 0.0204s vs 0.0216s

2 concurrent:
p50 0.0172s vs 0.0181s
p99 0.0213s vs 0.0221s
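
To put the figures above in relative terms, the following sketch computes the fractional reduction for each pair. It assumes the first value in each pair is "with this change" and the second is "without", following the order of the phrase "with vs without"; the helper name `relative_reduction` is hypothetical.

```python
# (with change, without change) in seconds, copied from the figures above.
results = {
    ("1 concurrent", "p50"): (0.0172, 0.0181),
    ("1 concurrent", "p99"): (0.0204, 0.0216),
    ("2 concurrent", "p50"): (0.0172, 0.0181),
    ("2 concurrent", "p99"): (0.0213, 0.0221),
}


def relative_reduction(with_change, without_change):
    """Fraction by which the time drops when the change is applied."""
    return (without_change - with_change) / without_change


for (workload, percentile), (with_s, without_s) in results.items():
    reduction = relative_reduction(with_s, without_s)
    print(f"{workload} {percentile}: {reduction:.1%} lower with the change")
```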
2021-05-03 18:28:13 +01:00
.github Don't mark issues that are marked as enhancement as stale (#6134) 2020-12-14 18:57:40 -08:00
cgmanifests pick onnx release candidate (#7177) 2021-04-22 23:57:09 -07:00
cmake Check whether nvcc supports -Wstrict-aliasing before adding the flag. (#7509) 2021-05-01 00:14:50 -07:00
csharp Enable Microsoft.Ai.MachineLearning package to work on .NET5 down to 17763 Windows SDK (#7522) 2021-05-01 00:56:36 -07:00
dockerfiles fix for using tensorrt:20.12 base image (#7264) 2021-04-07 08:48:43 -07:00
docs Android package infrastructure (#7430) 2021-04-30 14:23:54 +10:00
include/onnxruntime/core "Sticky" allocation of worker threads (#7551) 2021-05-03 18:28:13 +01:00
java Add static code analyzer to Windows CPU/GPU CI builds and fix the warnings (#7489) 2021-04-29 11:54:57 -07:00
js [js/web] fix pacakge metadata of onnxruntime-web (#7543) 2021-05-02 13:26:07 -07:00
objectivec Initial Objective-C API (#7366) 2021-04-27 10:06:30 -07:00
onnxruntime "Sticky" allocation of worker threads (#7551) 2021-05-03 18:28:13 +01:00
orttraining Improve tol value logging in ORTModule test (#7544) 2021-05-03 09:43:40 -07:00
package/rpm Bumping up version to 1.7 (#6736) 2021-02-17 19:07:38 -08:00
samples Introduce ORTModule training API to ONNX Runtime 2021-03-10 10:48:10 -08:00
server Update ORT server build pipeline (#7030) 2021-03-16 18:02:09 -07:00
tools Enable Microsoft.Ai.MachineLearning package to work on .NET5 down to 17763 Windows SDK (#7522) 2021-05-01 00:56:36 -07:00
winml Use unicode apis for loadlibrary (#7523) 2021-05-03 07:24:40 -07:00
.clang-format
.clang-tidy
.dockerignore Update dockerfiles (#5929) 2020-11-25 15:38:22 -08:00
.flake8 Sync ORTModule branch with master and fix tests (#6526) 2021-02-02 08:59:56 -08:00
.gitattributes
.gitignore Add auto doc gen for ORTModule API during CI build (#7046) 2021-03-22 10:20:33 -07:00
.gitmodules build ONNXRuntime into WebAssembly (#6478) 2021-04-06 16:18:10 -07:00
build.amd64.1411.bat
build.bat
build.sh Add iOS test pipeline and a sample app. (#5298) 2020-09-29 13:53:11 -07:00
CODEOWNERS Update code owners for pytorch frontend team (#6329) 2021-02-02 11:09:10 -08:00
CONTRIBUTING.md Add README for docs (#6626) 2021-03-12 15:14:40 -08:00
LICENSE Remove year from license (#6658) 2021-02-12 00:25:56 -08:00
NuGet.config Sync ORTModule branch with master and fix tests (#6526) 2021-02-02 08:59:56 -08:00
ort.wprp
packages.config Update DirectML version to 1.5.1 and enable ARM/ARM64 builds with DML (#7511) 2021-04-30 00:49:30 -07:00
README.md build ONNXRuntime into WebAssembly (#6478) 2021-04-06 16:18:10 -07:00
requirements-dev.txt Sync ORTModule branch with master and fix tests (#6526) 2021-02-02 08:59:56 -08:00
requirements-doc.txt Add auto doc gen for ORTModule API during CI build (#7046) 2021-03-22 10:20:33 -07:00
requirements-training.txt Add missing Python dependencies for ORT training (#7104) 2021-03-23 18:43:19 -07:00
requirements.txt Quantization calibration refactor (#6893) 2021-03-19 01:09:11 -07:00
setup.py Update DirectML version to 1.5.1 and enable ARM/ARM64 builds with DML (#7511) 2021-04-30 00:49:30 -07:00
ThirdPartyNotices.txt Enable CoreML EP for minimal extended mode (#7266) 2021-04-08 17:45:22 -07:00
VERSION_NUMBER Bumping up version to 1.7 (#6736) 2021-02-17 19:07:38 -08:00

ONNX Runtime is a cross-platform inference and training machine-learning accelerator. It is compatible with deep learning frameworks such as PyTorch and TensorFlow/Keras, as well as classical machine learning libraries such as scikit-learn, and more.

ONNX Runtime uses the portable ONNX computation graph format, backed by execution providers optimized for operating systems, drivers and hardware.
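
As an illustration of how execution providers are selected from the Python API, here is a minimal sketch. The helper names `pick_providers` and `run_model` are hypothetical, and the preference order (CUDA first, then CPU) is an assumption; the model path and the input dictionary must be supplied by the caller.

```python
def pick_providers(available,
                   preferred=("CUDAExecutionProvider", "CPUExecutionProvider")):
    """Return the preferred providers that are available, in preference order."""
    chosen = [p for p in preferred if p in available]
    if not chosen:
        raise RuntimeError("none of the preferred execution providers is available")
    return chosen


def run_model(model_path, inputs):
    """Run one inference pass and return all model outputs.

    `inputs` maps each model input name to a NumPy array.
    """
    # Imported lazily so pick_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    session = ort.InferenceSession(model_path, providers=providers)
    # Passing None as the output list asks for every output of the model.
    return session.run(None, inputs)
```

Listing the CPU provider last gives a fallback on machines where no accelerator-backed provider is present.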

Common use cases for ONNX Runtime:

  • Improve inference performance for a wide variety of ML models
  • Reduce time and cost of training large models
  • Train in Python but deploy into a C#/C++/Java app
  • Run with optimized performance on different hardware and operating systems
  • Support models created in several different frameworks

ONNX Runtime inference APIs have been stable and production-ready since the 1.0 release in October 2019, and can enable faster customer experiences and lower costs.

The ONNX Runtime training feature was introduced in preview in May 2020. This feature supports acceleration of PyTorch training on multi-node NVIDIA GPUs for transformer models. Additional updates for this feature are coming soon.

Get Started

http://onnxruntime.ai/

Build Pipeline Status

Build status badges (images not reproduced here) are shown in a table with the columns System, CPU, GPU, and EPs, covering the following systems: Windows, Linux, Mac, Android, iOS, and WebAssembly.

Data/Telemetry

This project may collect usage data and send it to Microsoft to help improve our products and services. See the privacy statement for more details.

Contributions and Feedback

We welcome contributions! Please see the contribution guidelines.

For feature requests or bug reports, please file a GitHub Issue.

For general discussion or questions, please use GitHub Discussions.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

This project is licensed under the MIT License.