onnxruntime/tools/ci_build/github
RandySheriffH 4dfb89b3ad
Implement mutex-free spin lock for task queue (#14834)
Implemented a "lock-free" spinlock to cut the CPU time spent on context
switching. The change was tested on the queene service of the Ads team:
the lock-free version of ort (40 threads) cuts CPU usage on gen8
(128 logical processors on 8 NUMA nodes) Windows by nearly half, from
65% to 35%.
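
To illustrate the general idea, here is a minimal sketch of a spin lock guarding a task queue. It is not the code from this change: `SpinLock`, `TaskQueue` and `RunDemo` are hypothetical names, and the yield-based back-off is an assumption. It only shows how an atomic flag can stand in for a kernel mutex, so that a contended thread retries in user space instead of blocking.

```cpp
#include <atomic>
#include <cassert>
#include <deque>
#include <functional>
#include <mutex>    // std::lock_guard only; no std::mutex is used
#include <thread>
#include <utility>
#include <vector>

// Hypothetical spin lock: an atomic flag replaces the kernel mutex.
class SpinLock {
 public:
  void lock() noexcept {
    // test_and_set returns the previous value, so we own the lock
    // once it returns false.
    while (flag_.test_and_set(std::memory_order_acquire)) {
      std::this_thread::yield();  // assumed back-off policy
    }
  }
  void unlock() noexcept { flag_.clear(std::memory_order_release); }

 private:
  std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical task queue protected by the spin lock above.
class TaskQueue {
 public:
  void Push(std::function<void()> task) {
    std::lock_guard<SpinLock> guard(lock_);  // releases even if push_back throws
    tasks_.push_back(std::move(task));
  }

  // Returns false when the queue is empty.
  bool TryPop(std::function<void()>& task) {
    std::lock_guard<SpinLock> guard(lock_);
    if (tasks_.empty()) return false;
    task = std::move(tasks_.front());
    tasks_.pop_front();
    return true;
  }

 private:
  SpinLock lock_;
  std::deque<std::function<void()>> tasks_;
};

// Fills the queue, drains it from several workers, and returns the
// number of tasks that actually ran.
int RunDemo(int num_workers, int num_tasks) {
  assert(num_workers > 0);
  TaskQueue queue;
  std::atomic<int> executed{0};
  for (int i = 0; i < num_tasks; ++i) {
    queue.Push([&executed] { executed.fetch_add(1, std::memory_order_relaxed); });
  }
  std::vector<std::thread> workers;
  for (int w = 0; w < num_workers; ++w) {
    workers.emplace_back([&queue] {
      std::function<void()> task;
      while (queue.TryPop(task)) task();
    });
  }
  for (auto& t : workers) t.join();
  return executed.load();
}
```

Calling `RunDemo(4, 1000)` runs every queued task exactly once across four workers. A spin lock like this tends to pay off only when the critical section is very short, as it is for a queue push or pop.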

For 32 cores, the curve is flat:

Anubis, 32 vCPU, windows, hugging face models,
95 percentile E2E latency in ms:

model | mutex (ms) | mutex-free (ms)
--- | --- | ---
 albert_base_v2 | 34.21 | 34.09
 bert_large_uncased | 116.27| 117.84
 bart_base | 72.06 | 71.99
 distilgpt2 | 25.43 | 25.02
 vit_base_patch16_224 | 37.33 | 37.76

Anubis, 32 vCPU win, Linux, 1st party models,
95 percentile E2E latency in ms:

model | mutex (ms) | mutex-free (ms)
--- | --- | ---
deepthink_v2 | 24.35 | 22.95
bing_feeds |  36.96 | 36.48
deep_writes |  14.46 | 14.32
keypoints |  9.34 | 7.69
model11 |  1.71 | 1.66
model12 |  1.82 | 1.44
model2 |  4.21 | 3.95
model6 |  1.08 | 1.05
agiencoder |  0.99 | 0.93
geminet_transformer |  5.32 | 5.24

---------

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2023-05-19 10:12:10 -07:00
..
android Creating Nuget and Android packages for Training (#15712) 2023-05-01 12:59:56 -07:00
apple [doc] add LeakyRelu to coreml supported ops (#15944) 2023-05-15 09:46:30 -07:00
azure-pipelines Implement mutex-free spin lock for task queue (#14834) 2023-05-19 10:12:10 -07:00
js Bump ruff in CI (#15533) 2023-04-17 10:11:44 -07:00
linux [ROCm] remove ROCm5.2.3, ROCm5.3, ROCm5.4 from pipeline (#16004) 2023-05-19 10:29:01 +08:00
pai [ROCm] update ROCm/MIGraphX CI to ROCm5.5 (#15905) 2023-05-15 10:28:15 +08:00
windows Change CUDA pipelines to download CUDA SDK in every build job (#15915) 2023-05-17 17:31:51 -07:00
Doxyfile_csharp.cfg Implement Optional Metadata support and C# test support (#15314) 2023-04-11 09:41:59 -07:00