CGManifest Files
This directory contains CGManifest (cgmanifest.json) files. See here for details.
cgmanifests/generated/cgmanifest.json
This file contains generated CGManifest entries.
It covers these dependencies:
- git submodules
- dependencies from the Dockerfile tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cuda11
- the entries in ../cmake/deps.txt
If any of these dependencies change, this file should be updated. When updating, please regenerate instead of editing manually.
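After regenerating, it can help to check which git dependencies ended up in the file. The following is a minimal sketch, not part of the repository's tooling. It assumes the common CGManifest layout, with a top-level "Registrations" list whose entries hold a "component" of type "git" carrying "repositoryUrl" and "commitHash". The sample data and the helper name list_git_registrations are hypothetical.

```python
import json

# Hypothetical sample in the assumed CGManifest layout; a real file would be
# read from cgmanifests/generated/cgmanifest.json instead.
SAMPLE_MANIFEST = """
{
  "Version": 1,
  "Registrations": [
    {"component": {"type": "git",
                   "git": {"commitHash": "0123abc",
                           "repositoryUrl": "https://example.com/a.git"}}},
    {"component": {"type": "other",
                   "other": {"name": "b", "version": "1.0"}}}
  ]
}
"""


def list_git_registrations(manifest_text):
    """Return (repositoryUrl, commitHash) pairs for every git component."""
    manifest = json.loads(manifest_text)
    results = []
    for registration in manifest.get("Registrations", []):
        component = registration.get("component", {})
        # Skip non-git components; only git entries carry a commit hash.
        if component.get("type") == "git":
            git_info = component.get("git", {})
            results.append(
                (git_info.get("repositoryUrl"), git_info.get("commitHash"))
            )
    return results


git_entries = list_git_registrations(SAMPLE_MANIFEST)
for url, commit in git_entries:
    print(f"{url} @ {commit}")
```

Comparing this listing before and after regeneration gives a quick view of which pinned commits changed.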
How to Generate
- Change to the repository root directory.
- Ensure the git submodules are checked out and up to date. For example, with:
  $ git submodule update --init --recursive
- Run the generator script:
  $ python cgmanifests/generate_cgmanifest.py --username <xxx> --token <your_access_token>
Supply your GitHub username and access token to the script. If you don't have a token, you can generate one at https://github.com/settings/tokens. The credentials are used to authenticate with the GitHub REST API so that the script does not hit the rate limit.
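To illustrate why the token matters, here is a minimal sketch of how a request to the GitHub REST API can carry a token. It is not taken from generate_cgmanifest.py, whose internals are not shown here. The helper name build_rate_limit_request is hypothetical, and the code only builds the request without sending it.

```python
import urllib.request

# Public GitHub REST API endpoint that reports the caller's current quota.
RATE_LIMIT_URL = "https://api.github.com/rate_limit"


def build_rate_limit_request(token=None):
    """Build (but do not send) a request for the rate-limit endpoint.

    With a token, the request is authenticated and counts against the
    account's quota; without one, it counts against the much smaller
    anonymous quota.
    """
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"token {token}"
    return urllib.request.Request(RATE_LIMIT_URL, headers=headers)


# Placeholder token for illustration only; never hard-code a real one.
request = build_rate_limit_request("<your_access_token>")
print(request.full_url)
print(request.has_header("Authorization"))
```

Passing the resulting object to urllib.request.urlopen would send it. The JSON response reports the remaining quota, which is a simple way to confirm that a token is being accepted before running the generator.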
cgmanifests/cgmanifest.json
This file contains non-generated CGManifest entries. Please edit directly as needed.