onnxruntime/tools
Ye Wang f9af94009b
onboard MoE (#18279)
### Description
1. Introduce the MoE (Mixture of Experts) CUDA op to ORT, based on the FasterTransformer (FT) implementation.
2. Upgrade cutlass to 3.1.0 to avoid some build failures on Windows, and remove the patch file for cutlass 3.0.0.
3. The sharded MoE implementation will come in a separate PR.

Limitation: `__CUDA_ARCH__ >= 700` (compute capability 7.0 or newer).
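
To illustrate the limitation above: `__CUDA_ARCH__` encodes a device's compute capability as `major * 100 + minor * 10`, so 700 corresponds to compute capability 7.0. The sketch below shows that mapping only. The helper names `cuda_arch_value` and `meets_moe_arch_limit` are hypothetical and are not part of ORT.

```python
# Minimal sketch: map a compute capability to the value __CUDA_ARCH__ takes
# and compare it against the limit stated in this PR. Names are hypothetical.

MIN_MOE_ARCH = 700  # from the stated limitation: __CUDA_ARCH__ >= 700


def cuda_arch_value(major: int, minor: int) -> int:
    """Return the __CUDA_ARCH__ value for compute capability major.minor."""
    return major * 100 + minor * 10


def meets_moe_arch_limit(major: int, minor: int) -> bool:
    """True if a device with this compute capability satisfies the limit."""
    return cuda_arch_value(major, minor) >= MIN_MOE_ARCH


if __name__ == "__main__":
    for capability in [(6, 1), (7, 0), (8, 6)]:
        print(capability, meets_moe_arch_limit(*capability))
```

Under this mapping, a compute capability 6.1 device falls below the limit, while 7.0 and 8.6 devices satisfy it.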


2023-11-14 16:48:51 -08:00
..
android_custom_build Update NDK to 26.0.10792818 (#17852) 2023-10-12 14:08:43 -07:00
ci_build onboard MoE (#18279) 2023-11-14 16:48:51 -08:00
doc Disable PERF* rules in ruff to allow better readability (#16834) 2023-07-25 15:38:22 -07:00
nuget Rework/cleanup the C# build infrastructure for nuget packages. (#18127) 2023-11-03 09:05:17 -07:00
perf_view fixed #16873 (#16932) 2023-09-26 09:57:01 -07:00
python Add tool to fix lines > 120 chars. (#18293) 2023-11-09 10:12:57 +10:00
scripts Remove dnf update from docker build scripts (#17551) 2023-09-21 07:33:29 -07:00