Mirror of https://github.com/saymrwulf/onnxruntime.git, synced 2026-06-29 03:30:52 +00:00
### Description Support loading a model that contains multiple QNN context binaries. ### Motivation and Context A context binary model generated by the QNN EP contains only a single QNN context. Because of the QNN PD memory limitation, a large model (>3.5GB) has to be split into 2 smaller models, and a context binary model is then generated for each. The user can load the smaller context binary models, but doing so requires 2 Ort sessions. Users want to glue the split models into 1 model (with multiple EPContext nodes) so that a single Ort session can do the work. The QNN EP was limited to loading from a single QNN context binary; this PR removes that limitation to unblock this user scenario. --------- Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com> |
||
|---|---|---|
| contrib_ops | ||
| core | ||
| python | ||
| test | ||
| tool/etw | ||
| wasm | ||
| __init__.py | ||
| ReformatSource.ps1 | ||
| ReformatSourcePython.bat | ||
| VSCodeCoverage.runsettings | ||