onnxruntime/cmake
RandySheriffH 4dfb89b3ad
Implement mutex-free spin lock for task queue (#14834)
Implemented "lock-free" spinlock to save CPU usage on context switching.
The change has been tested on queene service of Ads team, the lock-free
version of ort (40 threads) saves CPU usage on gen8 (128 logical
processors on 8 numa nodes) windows by nearly half, from 65% to 35%.

For 32 cores, the curve is flat:

Anubis, 32 vCPU, windows, hugging face models,
95 percentile E2E latency in ms:

model | mutex(ms) | mutex-free
--- | --- | ---
 alvert_base_v2 | 34.21 | 34.09
 bert_large_uncased | 116.27| 117.84
 bart_base | 72.06 | 71.99
 distilgpt2 | 25.43 | 25.02
 vit_base_patch16_224 | 37.33 | 37.76

Anubis, 32 vCPU win, Linux, 1st party models,
95 percentile E2E latency in ms:

model | mutex(ms) | mutex-free
--- | --- | ---
deepthink_v2 | 24.35 | 22.95
bing_feeds |  36.96 | 36.48
deep_writes |  14.46 | 14.32
keypoints |  9.34 | 7.69
model11 |  1.71 | 1.66
model12 |  1.82 | 1.44
model2 |  4.21 | 3.95
model6 |  1.08 | 1.05
agiencoder |  0.99 | 0.93
geminet_transformer |  5.32 | 5.24

---------

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-05-19 10:12:10 -07:00
..
external [DML EP] Update DirectML version to 1.12.0 (#16011) 2023-05-18 19:37:12 -07:00
patches FlatBuffers fails to compile with gcc13. (#15787) 2023-05-11 11:20:19 -07:00
tensorboard Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
adjust_global_compile_flags.cmake upgrade emsdk to 3.1.37 (#15817) 2023-05-08 16:49:47 -07:00
CMakeLists.txt Implement mutex-free spin lock for task queue (#14834) 2023-05-19 10:12:10 -07:00
CMakeSettings.json
codeconv.runsettings
deps.txt Implement openAI endpoint invoker for nuget (#15797) 2023-05-11 22:04:02 -07:00
EnableVisualStudioCodeAnalysis.props
gdk_toolchain.cmake
Info.plist.in
libonnxruntime.pc.cmake.in
nuget_helpers.cmake
onnxruntime.cmake android package fix (#15999) 2023-05-18 09:21:03 -07:00
onnxruntime_codegen_tvm.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_common.cmake Add clog back to onnxruntime_EXTERNAL_LIBRARIES. (#15363) 2023-04-05 09:11:19 -07:00
onnxruntime_compile_triton_kernel.cmake integrate triton into ort (#15862) 2023-05-17 09:35:28 +08:00
onnxruntime_config.h.in Adust GetVersionString() GetBuildInfoString() signatures and move them to OrtApi (#15921) 2023-05-13 13:45:07 -07:00
onnxruntime_csharp.cmake Refactor training build options (#13964) 2023-01-03 13:28:16 -08:00
onnxruntime_flatbuffers.cmake Rework some external targets to ease building with -DFETCHCONTENT_FULLY_DISCONNECTED=ON (#15323) 2023-04-03 17:45:12 -07:00
onnxruntime_framework.cmake Implement openAI endpoint invoker for nuget (#15797) 2023-05-11 22:04:02 -07:00
onnxruntime_fuzz_test.cmake Fix fuzz test (#14385) 2023-01-22 22:17:43 -08:00
onnxruntime_graph.cmake Create dedicated build for training api (#14136) 2023-01-10 20:58:04 -08:00
onnxruntime_ios.toolchain.cmake
onnxruntime_java.cmake Update build option for training in java to enable_training_api (#15638) 2023-04-24 11:53:08 -07:00
onnxruntime_java_unittests.cmake Update build option for training in java to enable_training_api (#15638) 2023-04-24 11:53:08 -07:00
onnxruntime_kernel_explorer.cmake integrate triton into ort (#15862) 2023-05-17 09:35:28 +08:00
onnxruntime_language_interop_ops.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_mlas.cmake [AMX] Update assembler check (#15501) 2023-04-19 14:16:26 -07:00
onnxruntime_nodejs.cmake [js] upgrade dependencies and enable strict mode (#14930) 2023-03-22 15:05:04 -07:00
onnxruntime_objectivec.cmake Add iOS Swift Package Manager support (#15297) 2023-04-20 16:18:35 +10:00
onnxruntime_opschema_lib.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_optimizer.cmake Optimize SCE loss compute (#15401) 2023-04-13 13:02:12 +08:00
onnxruntime_providers.cmake integrate triton into ort (#15862) 2023-05-17 09:35:28 +08:00
onnxruntime_pyop.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_python.cmake Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776) 2023-05-03 13:08:35 -07:00
onnxruntime_rocm_hipify.cmake [ROCm] add beam search support (#15625) 2023-04-26 17:53:33 +08:00
onnxruntime_session.cmake fix headers for training apis (#14350) 2023-01-19 10:26:53 -08:00
onnxruntime_snpe_provider.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
onnxruntime_training.cmake Create dedicated build for training api (#14136) 2023-01-10 20:58:04 -08:00
onnxruntime_unittests.cmake upgrade emsdk to 3.1.37 (#15817) 2023-05-08 16:49:47 -07:00
onnxruntime_util.cmake Improve dependency management (#13523) 2022-12-01 09:51:59 -08:00
onnxruntime_webassembly.cmake Support WebNN EP (#15698) 2023-05-08 21:25:10 -07:00
precompiled_header.cmake
Sdl.ruleset Add a Github workflow for Prefast (#15763) 2023-05-03 11:42:51 -07:00
set_winapi_family_desktop.h
target_delayload.cmake
uwp_stubs.h Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
wcos_rules_override.cmake
winml.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00
winml_cppwinrt.cmake
winml_sdk_helpers.cmake
winml_unittests.cmake Use target name for flatbuffers (#13991) 2022-12-20 11:44:02 -08:00