mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-29 23:06:41 +00:00
Add a configuration option, `max_power_of_two_extend_bytes`, to limit the arena extension size.

### Motivation and Context

In our real-world scenario, we observe that when a model is large enough, the BfcArena grows uncontrollably. As the measurements below show, if a model uses more than 16GB of memory, the BfcArena requests 32GB in total under the `kNextPowerOfTwo` strategy. With the new configuration, the size of each extension is limited. The default maximum extension size is 1GB.

#### Without the new configuration

After loading the model, ORT uses 32G GPU memory.

#### With the new configuration

After loading the model, ORT uses 23G GPU memory.

Co-authored-by: Yuhong Guo <yuhong.gyh@antgroup.com>

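
The effect described in the commit message above can be illustrated with a small simulation. This is only a simplified model written for this page, not the ONNX Runtime implementation: the function `simulate_reserved_bytes`, its parameters, the 1 MiB starting region, and the pooled free-space accounting are all assumptions made for illustration. It assumes each new region is twice the size of the previous one, optionally capped at a maximum extension size.

```python
MIB = 1 << 20
GIB = 1 << 30


def simulate_reserved_bytes(requests, max_extend_bytes=None, initial_region_bytes=MIB):
    """Return total bytes reserved by a toy power-of-two arena (hypothetical model).

    requests: iterable of allocation sizes in bytes (never freed in this model).
    max_extend_bytes: optional cap on how large the next region may grow.
    """
    next_region = initial_region_bytes
    reserved = 0  # total bytes obtained from the device
    free = 0      # unused bytes inside already reserved regions (pooled)
    for size in requests:
        if size <= free:
            free -= size
            continue
        # Extend: start from the planned region size and double it until
        # the request fits, so a single large request is still satisfied.
        region = next_region
        while region < size:
            region *= 2
        reserved += region
        free += region - size
        # Plan the next region: double, unless the cap says otherwise.
        next_region = region * 2
        if max_extend_bytes is not None:
            next_region = min(next_region, max_extend_bytes)
    return reserved


# A model that needs 17 GiB, allocated in uniform 1 MiB pieces (an assumption).
model_requests = [MIB] * (17 * 1024)
uncapped = simulate_reserved_bytes(model_requests)
capped = simulate_reserved_bytes(model_requests, max_extend_bytes=GIB)
print(f"uncapped: {uncapped / GIB:.2f} GiB, capped at 1 GiB: {capped / GIB:.2f} GiB")
```

In this toy model the uncapped arena ends up reserving close to 32 GiB for a 17 GiB workload, while the capped one stays within about 1 GiB of what is needed. The absolute numbers differ from the measurements quoted above because real allocation sizes are not uniform.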
| Directory |
|---|
| common |
| eager |
| framework |
| graph |
| optimizer |
| platform |
| providers |
| session |