onnxruntime/docs
Tim Harris 5e44d25c5a
Support multi-loop parallel sections, use multi-loop sections in GRU (#5602)
This PR updates the ThreadPool API to support multi-loop parallel sections. As with the OpenMP "parallel" construct, this allows per-loop work to be amortized over a series of loops. For ORT, it also promotes locality between successive loops in the sense that iteration X of one loop will tend to run on the same worker thread as iteration X of preceding loops.

The change was developed while optimizing the implementation of a model that performed better with OpenMP. Profiling indicated that OpenMP was providing lower loop entry/exit costs and that, via OpenMP's static scheduling, it was leading to a lower L2 miss rate in the series of parallel loops used in GRU.
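
As a rough illustration of why static scheduling can help locality, the sketch below computes a static block schedule. It is not the ORT or OpenMP implementation, and `StaticRange` is a hypothetical helper introduced only for this example. The mapping depends only on the iteration count and the worker index, so successive loops of the same size give each worker the same iterations.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper (not part of ORT): returns the half-open range
// [begin, end) of iterations that worker `w` runs under a static block
// schedule over `num_workers` workers.
std::pair<std::size_t, std::size_t> StaticRange(std::size_t num_iterations,
                                                std::size_t num_workers,
                                                std::size_t w) {
  assert(num_workers > 0 && w < num_workers);
  const std::size_t base = num_iterations / num_workers;
  const std::size_t extra = num_iterations % num_workers;
  // The first `extra` workers take one additional iteration each.
  const std::size_t begin = w * base + (w < extra ? w : extra);
  const std::size_t end = begin + base + (w < extra ? 1 : 0);
  return {begin, end};
}
```

Calling `StaticRange(n, workers, w)` for two loops with the same `n` yields the same range for worker `w`, which is the affinity property the description refers to.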

The main changes are:

- Addition of ThreadPool::ParallelSection and underlying support in the modified Eigen thread pool.

- In EigenNonBlockingThreadPool.h, refactoring the RunInParallel method to support two variants: one that takes an existing parallel section object created by the caller, and another (used by default) that creates its own parallel section.

- Simplify ThreadPool::LoopCounter (used by worker threads to claim loop iterations), basing it on an ID supplied by the underlying Eigen thread pool for affinity in a series of loops.

- Fix a possible perf issue where a loop with iterations scheduled in batches would have more threads than batches available.

- Use of parallel sections in the GRU operator.

- Additional test cases in threadpool_test.h.

- Additional comments at the top of threadpool.h and EigenNonBlockingThreadPool.h.
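
The batching fix listed above can be pictured with a small sketch: when iterations are claimed in batches, there is no point in running more threads than there are batches. The code is a simplified assumption about the idea, not the code in the PR; `NumBatches` and `ThreadsToUse` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of batches needed to cover the loop,
// computed with ceiling division.
std::size_t NumBatches(std::size_t num_iterations, std::size_t batch_size) {
  assert(batch_size > 0);
  return (num_iterations + batch_size - 1) / batch_size;
}

// Hypothetical helper: never use more threads than there are batches,
// since any extra thread would find no batch left to claim.
std::size_t ThreadsToUse(std::size_t available_threads,
                         std::size_t num_iterations,
                         std::size_t batch_size) {
  return std::min(available_threads, NumBatches(num_iterations, batch_size));
}
```

For example, a loop of 10 iterations in batches of 4 has 3 batches, so at most 3 threads are useful even if 8 are available.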
2020-11-10 12:24:57 +00:00
execution_providers added missing flag ORT_TENSORRT_DUMP_SUBGRAPHS (#5724) 2020-11-06 12:32:12 -08:00
images Updated with image for creating the onnxruntime pkg (#5400) 2020-10-08 08:54:27 -07:00
python bump version to 1.5.2 (#5420) 2020-10-08 16:30:13 -07:00
ABI_Dev_Notes.md Fix some typos. (#3582) 2020-04-18 14:18:05 -07:00
AddingCustomOp.md Revert "Custom Op on GPU (#5620)" 2020-10-30 21:23:51 -07:00
AddingExecutionProvider.md Renaming MKL-DNN as DNNL (#2515) 2019-12-03 07:34:23 -08:00
Android_testing.md Update Android instructions (#3971) 2020-05-19 07:30:45 +10:00
C_API.md Allow sharing of initializers between sessions. (#5092) 2020-09-21 14:09:37 -07:00
C_API_Guidelines.md Add C API Guidelines document (#5686) 2020-11-04 18:50:31 -08:00
cmake_guideline.md Add a doc for cmake (#1524) 2019-08-06 07:51:53 -07:00
Coding_Conventions_and_Standards.md Enable running PEP8 on python scripts using flake8 (#3928) 2020-05-15 07:15:06 +10:00
ContribOperators.md revert contrib op version bump and deprecation of TransposeMatMul (#5424) 2020-10-12 13:02:15 -07:00
CSharp_API.md [C# and Python APIs] Expose knobs to enable/disable platform telemetry collection (#5481) 2020-10-21 10:32:13 -07:00
ExportPyTorchCustomOps.md Add Trilu custom op (#4537) 2020-08-17 14:42:26 -07:00
FAQ.md Add FAQ page (#3324) 2020-05-06 15:43:32 -07:00
How_To_Update_ONNX_Dev_Notes.md CGManifest - add training entries and generate entries for submodules. (#3933) 2020-05-15 13:34:18 -07:00
InferenceHighLevelDesign.md Add docs indicating that the onnxruntime engine from other distributions can be compatible with the WinRT NuGet (#5009) 2020-09-14 21:15:51 -07:00
Java_API.md Java API: Documentation cleanup (#4395) 2020-08-13 12:06:42 -07:00
Model_Test.md Renaming MKL-DNN as DNNL (#2515) 2019-12-03 07:34:23 -08:00
NotesOnThreading.md Support multi-loop parallel sections, use multi-loop sections in GRU (#5602) 2020-11-10 12:24:57 +00:00
ONNX_Runtime_for_Mobile_Platforms.md Add --skip_tests to example command line as the included ops are being reduced. (#5554) 2020-10-22 08:55:42 +10:00
ONNX_Runtime_Graph_Optimizations.md Disable GeluApproximation transformer by default (#3644) 2020-04-24 14:29:40 -07:00
ONNX_Runtime_Perf_Tuning.md Add FAQ page (#3324) 2020-05-06 15:43:32 -07:00
ONNX_Runtime_Server_Usage.md [Doc] ONNX_Runtime_Server_Usage fix proto uri (#5345) 2020-10-19 13:30:58 -07:00
onnxruntime_dependencies.dot Update dependencies graph 2020-04-17 07:38:45 -07:00
onnxruntime_dependencies.png Update dependencies graph 2020-04-17 07:38:45 -07:00
OperatorKernels.md Render Operator documentation as compliant markdown (#3658) 2020-09-02 15:07:50 -07:00
PR_Guidelines.md Add guidelines for writing a good PR. (#3830) 2020-05-05 16:28:21 -07:00
Privacy.md [C# and Python APIs] Expose knobs to enable/disable platform telemetry collection (#5481) 2020-10-21 10:32:13 -07:00
PyOp.md EnrichPyOpUT (#4681) 2020-08-05 14:11:56 -07:00
Python_Dev_Notes.md Changes related to the release binaries requiring Visual C++ 2019 runtime (#3871) 2020-05-12 17:07:06 -07:00
Reduced_Operator_Kernel_build.md Add ability to generate configuration file with required operators. (#5089) 2020-09-09 21:39:17 +10:00
ReleaseManagement.md Updated TPN for OpenMPI and cleanup (#3932) 2020-05-14 11:42:44 -07:00
Roadmap.md Doc updates for 1.5 (#5302) 2020-09-30 09:53:33 -07:00
Server.md Doc Updates for Build (#3976) 2020-05-18 20:08:36 -07:00
Versioning.md bump version to 1.5.2 (#5420) 2020-10-08 16:30:13 -07:00
WinML_principles.md User/alexzak/win ml principles (#5453) 2020-11-04 13:35:40 -08:00
WinRT_API.md Update winrt_api.md to address the 1.4 release (#4946) 2020-08-28 08:05:22 -07:00