onnxruntime/include/onnxruntime/core/framework
Yuhong Guo 04a8f50674
New configuration to limit the arena extension (#15983)
Add a configuration `max_power_of_two_extend_bytes` to limit the arena extension size.


### Motivation and Context
In our production scenario, we observe that when a model is big enough, the
BfcArena extends uncontrollably.
As the following figures show, if a model uses more than 16GB of
memory, the BfcArena requests 32GB of memory in total under the
`kNextPowerOfTwo` strategy. With the new configuration, the size of each
extension is limited. The default maximum extension size is 1GB.
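
To illustrate why capping the extension size helps, here is a minimal sketch of a doubling arena. It is a simplified model, not the BfcArena code: `SimulateReservedBytes` and its parameters are hypothetical names, and it assumes each new region is twice the previous one until the cap is reached.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper, not part of ONNX Runtime.
// Returns the total bytes a doubling arena would reserve to cover
// `demand_bytes`, when each new region is twice the size of the previous one
// but never larger than `max_extend_bytes`.
// Assumption: regions are only added, never freed or reused.
std::uint64_t SimulateReservedBytes(std::uint64_t demand_bytes,
                                    std::uint64_t initial_region_bytes,
                                    std::uint64_t max_extend_bytes) {
  // A zero-sized region would never make progress, so bail out early.
  if (initial_region_bytes == 0 || max_extend_bytes == 0) {
    return 0;
  }

  std::uint64_t reserved = 0;
  std::uint64_t next_region = std::min(initial_region_bytes, max_extend_bytes);

  while (reserved < demand_bytes) {
    reserved += next_region;
    // Double the next extension, but clamp it to the configured maximum.
    next_region = std::min(next_region * 2, max_extend_bytes);
  }
  return reserved;
}
```

In this model, with a 1 MiB initial region and 17 GiB of demand, an effectively unlimited cap reserves 32767 MiB (just under 32 GiB), while a 1 GiB cap reserves 18431 MiB (just under 18 GiB). These numbers come from the simplified model only and are not the measurements shown in the figures below.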

#### Without the new configuration
After loading the model, ORT uses 32G GPU memory.

![image](https://github.com/microsoft/onnxruntime/assets/19584326/42b93c66-b957-4f20-a13b-d34cb390afff)

#### With the new configuration
After loading the model, ORT uses 23G GPU memory.

![image](https://github.com/microsoft/onnxruntime/assets/19584326/5abffeff-9ca3-4187-a262-37fd2764fe1b)

Co-authored-by: Yuhong Guo <yuhong.gyh@antgroup.com>
2023-05-25 02:19:07 -07:00
..
alloc_kind.h
allocator.h New configuration to limit the arena extension (#15983) 2023-05-25 02:19:07 -07:00
buffer_deleter.h Multi-stream execution support (#13495) 2022-12-15 07:39:29 -08:00
customregistry.h
data_types.h Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
data_types_internal.h Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
endian.h
execution_provider.h ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618) 2023-05-01 10:06:00 -07:00
float16.h Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
framework_common.h
framework_provider_common.h Add TRT plugins support using custom ops (#13847) 2023-04-18 20:24:32 -07:00
func_api.h Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
kernel_def_builder.h Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776) 2023-05-03 13:08:35 -07:00
kernel_registry.h Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776) 2023-05-03 13:08:35 -07:00
op_kernel.h ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618) 2023-05-01 10:06:00 -07:00
op_kernel_context.h ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618) 2023-05-01 10:06:00 -07:00
op_kernel_info.h ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618) 2023-05-01 10:06:00 -07:00
op_node_proto_helper.h Switch GSL to MS GSL 4.0.0 (#13416) 2022-10-29 04:15:20 -07:00
ort_value.h Enable Opset11 Sequence Ops on DirectML, and make the CPU implementations agnostic to backend EP (#14442) 2023-02-21 18:08:28 -08:00
ortdevice.h ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618) 2023-05-01 10:06:00 -07:00
ortmemoryinfo.h Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
provider_options.h
provider_options_utils.h
provider_shutdown.h
run_options.h Adding RunOptions synchronization behaviour to C/C++ API (#14088) 2023-02-07 19:59:28 -08:00
sparse_tensor.h Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
stream_handles.h Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
tensor.h Enable Opset11 Sequence Ops on DirectML, and make the CPU implementations agnostic to backend EP (#14442) 2023-02-21 18:08:28 -08:00
tensor_shape.h Switch GSL to MS GSL 4.0.0 (#13416) 2022-10-29 04:15:20 -07:00
to_tensor_proto_element_type.h Consolidate utils::ToTensorProtoElementType, TypeToDataType, and data_types_internal::ToTensorDataType. (#9824) 2022-04-20 12:45:53 -07:00