mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-30 23:18:20 +00:00
* Improve CUDA kernel performance for Concat. Implement the kernel code instead of using cudaMemcpy in a loop.
* Update the index lookup part for Concat & Split.

| Name | | |
|---|---|---|
| contrib_ops | ||
| core | ||
| python | ||
| server | ||
| test | ||
| __init__.py | ||
| ReformatSource.ps1 | ||
| ReformatSourcePython.bat | ||
| VSCodeCoverage.runsettings | ||