mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-18 21:21:17 +00:00
Move Linux GPU CI pipeline to A10 machines, which are more advanced. Retire onnxruntime-Linux-GPU-T4 machine pool. Disable run_lean_attention test because the new machines do not have enough shared memory.

```
skip loading trt attention kernel fmha_mhca_fp16_128_256_sm86_kernel because no enough shared memory
[E:onnxruntime:, sequential_executor.cc:505 ExecuteKernel] Non-zero status code returned while running MultiHeadAttention node. Name:'MultiHeadAttention_0' Status Message: CUDA error cudaErrorInvalidValue:invalid argument
```

| | | |
|---|---|---|
| android | ||
| apple | ||
| azure-pipelines | ||
| js | ||
| linux | ||
| pai | ||
| windows | ||
| Doxyfile_csharp.cfg | ||