mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-06-19 02:03:52 +00:00
Move Linux GPU CI pipeline to A10 machines which are more advanced. Retire onnxruntime-Linux-GPU-T4 machine pool. Disable run_lean_attention test because the new machines do not have enough shared memory.

```
skip loading trt attention kernel fmha_mhca_fp16_128_256_sm86_kernel because no enough shared memory
[E:onnxruntime:, sequential_executor.cc:505 ExecuteKernel] Non-zero status code returned while running MultiHeadAttention node. Name:'MultiHeadAttention_0' Status Message: CUDA error cudaErrorInvalidValue:invalid argument
```

||
|---|---|---|
| contrib_ops | ||
| quantization | ||
| testdata | ||
| transformers | ||
| helper.py | ||
| onnx_backend_test_series.py | ||
| onnxruntime_test_collective.py | ||
| onnxruntime_test_distributed.py | ||
| onnxruntime_test_engine_wrapper.py | ||
| onnxruntime_test_float8.py | ||
| onnxruntime_test_float8_gemm8.py | ||
| onnxruntime_test_python.py | ||
| onnxruntime_test_python_azure.py | ||
| onnxruntime_test_python_backend.py | ||
| onnxruntime_test_python_backend_mlops.py | ||
| onnxruntime_test_python_cudagraph.py | ||
| onnxruntime_test_python_dmlgraph.py | ||
| onnxruntime_test_python_iobinding.py | ||
| onnxruntime_test_python_keras.py | ||
| onnxruntime_test_python_mlops.py | ||
| onnxruntime_test_python_nested_control_flow_op.py | ||
| onnxruntime_test_python_sparse_matmul.py | ||
| onnxruntime_test_python_symbolic_shape_infer.py | ||
| onnxruntime_test_scatternd.py | ||
| requirements.txt | ||
| test_pytorch_export_contrib_ops.py | ||