Mirror of https://github.com/saymrwulf/onnxruntime.git, synced 2026-06-04 23:59:56 +00:00
### Description

Upgrade cutlass to 3.5 to fix build errors using CUDA 12.4 or 12.5 in Windows

- [x] Upgrade cutlass to 3.5.0.
- [x] Fix flash attention build error with latest cutlass header files and APIs. This fix is provided by @wangyems.
- [x] Update efficient attention to use new cutlass fmha interface.
- [x] Patch cutlass to fix `hrsqrt` not found error for sm < 53.
- [x] Disable TF32 Staged Accumulation to fix blkq4_fp16_gemm_sm80_test build error for cuda 11.8 to 12.3.
- [x] Disable TRT 10 deprecate warnings.

The following are not included in this PR:

* TRT provider replaces the deprecated APIs.
* Fix blkq4_fp16_gemm_sm80_test build error for cuda 12.4 or 12.5. This test is not built by default unless you add `--cmake_extra_defines onnxruntime_ENABLE_CUDA_EP_INTERNAL_TESTS=ON` in build command.

To integrate to rel-1.18.1: Either bring in other changes (like onnx 1.16.1), or generate manifest and upload a new ONNX Runtime Build Time Deps artifact based on rel-1.18.1.

### Motivation and Context

https://github.com/microsoft/onnxruntime/issues/19891
https://github.com/microsoft/onnxruntime/issues/20924
https://github.com/microsoft/onnxruntime/issues/20953

| Name | | |
|---|---|---|
| nodejs/templates | ||
| nuget/templates | ||
| stages | ||
| templates | ||
| triggers | ||
| android-arm64-v8a-QNN-crosscompile-ci-pipeline.yml | ||
| android-x86_64-crosscompile-ci-pipeline.yml | ||
| bigmodels-ci-pipeline.yml | ||
| binary-size-checks-pipeline.yml | ||
| build-perf-test-binaries-pipeline.yml | ||
| c-api-noopenmp-packaging-pipelines.yml | ||
| clean-build-docker-image-cache-pipeline.yml | ||
| cuda-packaging-pipeline.yml | ||
| linux-ci-pipeline.yml | ||
| linux-cpu-aten-pipeline.yml | ||
| linux-cpu-eager-pipeline.yml | ||
| linux-cpu-minimal-build-ci-pipeline.yml | ||
| linux-dnnl-ci-pipeline.yml | ||
| linux-gpu-ci-pipeline.yml | ||
| linux-gpu-tensorrt-ci-pipeline.yml | ||
| linux-gpu-tensorrt-daily-perf-pipeline.yml | ||
| linux-migraphx-ci-pipeline.yml | ||
| linux-openvino-ci-pipeline.yml | ||
| linux-qnn-ci-pipeline.yml | ||
| mac-ci-pipeline.yml | ||
| mac-coreml-ci-pipeline.yml | ||
| mac-ios-ci-pipeline.yml | ||
| mac-ios-packaging-pipeline.yml | ||
| mac-react-native-ci-pipeline.yml | ||
| npm-packaging-pipeline.yml | ||
| nuget-cuda-publishing-pipeline.yml | ||
| orttraining-linux-ci-pipeline.yml | ||
| orttraining-linux-gpu-ci-pipeline.yml | ||
| orttraining-linux-gpu-ortmodule-distributed-test-ci-pipeline.yml | ||
| orttraining-linux-nightly-ortmodule-test-pipeline.yml | ||
| orttraining-mac-ci-pipeline.yml | ||
| orttraining-pai-ci-pipeline.yml | ||
| orttraining-py-packaging-pipeline-cpu.yml | ||
| orttraining-py-packaging-pipeline-cuda.yml | ||
| orttraining-py-packaging-pipeline-cuda12.yml | ||
| orttraining-py-packaging-pipeline-rocm.yml | ||
| post-merge-jobs.yml | ||
| publish-nuget.yml | ||
| py-cuda-package-test-pipeline.yml | ||
| py-cuda-packaging-pipeline.yml | ||
| py-cuda-publishing-pipeline.yml | ||
| py-package-build-pipeline.yml | ||
| py-package-test-pipeline.yml | ||
| py-packaging-pipeline.yml | ||
| qnn-ep-nuget-packaging-pipeline.yml | ||
| web-ci-pipeline.yml | ||
| win-ci-fuzz-testing.yml | ||
| win-ci-pipeline.yml | ||
| win-gpu-ci-pipeline.yml | ||
| win-gpu-reduce-op-ci-pipeline.yml | ||
| win-gpu-tensorrt-ci-pipeline.yml | ||
| win-qnn-arm64-ci-pipeline.yml | ||
| win-qnn-ci-pipeline.yml | ||