mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-16 21:00:14 +00:00
Description: This PR makes three changes to the ThreadPool class to clean up issues identified during performance analysis and optimization. (1) It uses `_mm_pause` intrinsics in spin loops, which helps avoid consuming pipeline resources while waiting. (2) It reorganizes the spin-then-steal loop for work distribution so that it starts out spinning as intended, rather than starting out trying to steal. (3) It updates the ThreadPool class's API to consistently use static methods for public functions. The PR includes minor doc updates and corresponding changes to test cases.

Motivation and Context: The change helps ensure consistent behavior between the OpenMP and Eigen-based implementations. Unlike the instance methods, the static methods abstract over the different ways in which threading can be implemented: they map onto the OpenMP or Eigen-based implementations when threading is used, and they run work sequentially when threading is not used.
# Notes on Threading in ORT

This document is intended for ORT developers.

ORT can use either OpenMP threads or its own (non-OpenMP) thread pool for execution. Thread pool management is abstracted behind (1) the ThreadPool class in [threadpool.h](https://github.com/microsoft/onnxruntime/blob/master/include/onnxruntime/core/platform/threadpool.h) and (2) the functions in [thread_utils.h](https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/util/thread_utils.h).

When developing an op, please use these abstractions to parallelize your code. These abstractions centralize two things: when OpenMP is enabled, they use OpenMP; when OpenMP is disabled, they run the work sequentially if the thread pool pointer is NULL and schedule the tasks on the thread pool otherwise.
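
To make the dispatch rule above concrete, here is a minimal, self-contained sketch. `MiniPool` and its `RunChunks` helper are hypothetical stand-ins, not the ORT ThreadPool class, and the signature is a simplified assumption. For brevity the sketch starts short-lived `std::thread` workers on each call, which a real thread pool would avoid. The point is only that the build-time and run-time choices live in one place, behind a static method that accepts a possibly-null pool pointer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for a thread pool; NOT the ORT ThreadPool class.
class MiniPool {
 public:
  explicit MiniPool(int num_threads) : num_threads_(std::max(1, num_threads)) {}

  // Static entry point: callers pass a possibly-null pool pointer and never
  // need to know which backend ends up executing the work.
  static void TryParallelFor(
      MiniPool* tp, std::ptrdiff_t total,
      const std::function<void(std::ptrdiff_t, std::ptrdiff_t)>& fn) {
    assert(total >= 0);
    if (total == 0) return;
#ifdef _OPENMP
    // OpenMP build: the pool pointer is ignored and OpenMP schedules the loop.
    (void)tp;
#pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < total; ++i) fn(i, i + 1);
#else
    if (tp == nullptr) {
      fn(0, total);  // No pool available: run the whole range sequentially.
      return;
    }
    tp->RunChunks(total, fn);  // Pool available: split the range across workers.
#endif
  }

 private:
  // Splits [0, total) into at most num_threads_ contiguous chunks and runs
  // each chunk on its own temporary thread (a simplification for this sketch).
  void RunChunks(std::ptrdiff_t total,
                 const std::function<void(std::ptrdiff_t, std::ptrdiff_t)>& fn) {
    const std::ptrdiff_t chunks = std::min<std::ptrdiff_t>(num_threads_, total);
    const std::ptrdiff_t step = (total + chunks - 1) / chunks;
    std::vector<std::thread> workers;
    for (std::ptrdiff_t first = 0; first < total; first += step) {
      const std::ptrdiff_t last = std::min(first + step, total);
      workers.emplace_back([&fn, first, last] { fn(first, last); });
    }
    for (auto& w : workers) w.join();
  }

  int num_threads_;
};
```

A caller would write `MiniPool::TryParallelFor(pool_ptr, n, fn)` in every configuration; passing `nullptr` simply yields sequential execution.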

Examples of these abstractions are listed below ([threadpool.h](https://github.com/microsoft/onnxruntime/blob/master/include/onnxruntime/core/platform/threadpool.h) documents them in more detail):

* TryParallelFor
* TrySimpleParallelFor
* TryBatchParallelFor
* ShouldParallelize
* DegreeOfParallelism

These static methods abstract over the different implementation choices: they can run over the ORT thread pool, run over OpenMP, or run sequentially.

**Please do not write `#ifdef`-guarded `#pragma omp` directives in operator code.**
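
As an illustration of what operator code can look like when it follows this rule, here is a small sketch. `StubPool` is a hypothetical stand-in whose method names echo the list above; its signatures and behavior are simplified assumptions, not the ORT API. To keep the focus on the call site, the stub only partitions the range into blocks and visits them in order rather than dispatching them to real threads. `ReluInPlace` is likewise an invented example operator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the static helpers; NOT the ORT ThreadPool class.
struct StubPool {
  int num_threads;

  // A null pool means "no parallelism available".
  static int DegreeOfParallelism(const StubPool* tp) {
    return tp == nullptr ? 1 : std::max(1, tp->num_threads);
  }

  static bool ShouldParallelize(const StubPool* tp) {
    return DegreeOfParallelism(tp) != 1;
  }

  // Partitions [0, total) into blocks and calls fn once per block. This stub
  // visits the blocks one after another; a real backend would run them
  // concurrently.
  static void TryParallelFor(
      StubPool* tp, std::ptrdiff_t total,
      const std::function<void(std::ptrdiff_t, std::ptrdiff_t)>& fn) {
    if (total <= 0) return;
    const std::ptrdiff_t blocks =
        ShouldParallelize(tp)
            ? std::min<std::ptrdiff_t>(DegreeOfParallelism(tp), total)
            : 1;
    const std::ptrdiff_t step = (total + blocks - 1) / blocks;
    for (std::ptrdiff_t first = 0; first < total; first += step) {
      fn(first, std::min(first + step, total));
    }
  }
};

// Operator-side code: no preprocessor conditionals and no OpenMP pragmas.
// The same source works whether tp is null or points at a pool.
void ReluInPlace(StubPool* tp, std::vector<float>& data) {
  StubPool::TryParallelFor(
      tp, static_cast<std::ptrdiff_t>(data.size()),
      [&data](std::ptrdiff_t first, std::ptrdiff_t last) {
        for (std::ptrdiff_t i = first; i < last; ++i) {
          float& x = data[static_cast<std::size_t>(i)];
          x = std::max(0.0f, x);
        }
      });
}
```

Because the operator touches only the static helper, changing the threading backend requires no edits to `ReluInPlace`.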

For intra-op parallelism, ORT users can use either OpenMP or the ORT thread pool. OpenMP is selected by building ORT with the `--use_openmp` switch. For inter-op parallelism, however, ORT always uses its own thread pool.