onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-21 21:52:11 +00:00

Yuhong Guo 04a8f50674 New configuration to limit the arena extension (#15983)	2023-05-25 02:19:07 -07:00

Add a configuration `max_power_of_two_extend_bytes` to limit the arena extension size.

### Motivation and Context

In our real-world scenario, we observed that when a model is large enough, the BfcArena extends uncontrollably. As shown in the following figures, if a model uses more than 16GB of memory, the BfcArena requests 32GB in total under the `kNextPowerOfTwo` strategy. With the new configuration, the extension is limited. The default maximum extension size is 1GB.

#### Without the new configuration

After loading the model, ORT uses 32G of GPU memory.

![image](https://github.com/microsoft/onnxruntime/assets/19584326/42b93c66-b957-4f20-a13b-d34cb390afff)

#### With the new configuration

After loading the model, ORT uses 23G of GPU memory.

![image](https://github.com/microsoft/onnxruntime/assets/19584326/5abffeff-9ca3-4187-a262-37fd2764fe1b)

Co-authored-by: Yuhong Guo <yuhong.gyh@antgroup.com>
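
The effect described in the commit message above can be illustrated with a small, simplified model. This is only a sketch and not ORT's actual `BfcArena` code: the function `ReservedAfterGrowth`, the 1 MiB starting chunk, and the rule "each new chunk is double the previous one, capped at a maximum" are assumptions made for illustration. It is not meant to reproduce the 32G and 23G figures reported above, only to show why an uncapped doubling strategy can reserve close to twice what a model needs while a capped one stays near the requirement.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Illustrative constants (assumptions, not ORT defaults except the 1 GiB cap
// mentioned in the commit message).
constexpr std::uint64_t kMiB = 1024ull * 1024ull;
constexpr std::uint64_t kGiB = 1024ull * kMiB;
constexpr std::uint64_t kNoCap = std::numeric_limits<std::uint64_t>::max();

// Hypothetical model of a power-of-two arena: keep adding chunks until the
// total reserved memory covers `needed` bytes. Each chunk is twice the size
// of the previous one, but never larger than `max_extend`.
std::uint64_t ReservedAfterGrowth(std::uint64_t needed,
                                  std::uint64_t initial_chunk,
                                  std::uint64_t max_extend) {
  std::uint64_t reserved = 0;
  std::uint64_t next = std::min(initial_chunk, max_extend);
  while (reserved < needed) {
    reserved += next;                        // extend the arena by one chunk
    next = std::min(next * 2, max_extend);   // double, subject to the cap
  }
  return reserved;
}
```

In this model, a 17 GiB requirement with no cap (`ReservedAfterGrowth(17 * kGiB, kMiB, kNoCap)`) ends up reserving 32767 MiB, just under 32 GiB, whereas the same requirement with a 1 GiB cap (`ReservedAfterGrowth(17 * kGiB, kMiB, kGiB)`) reserves 18431 MiB, roughly 18 GiB.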
..
alloc_kind.h
allocator.h	New configuration to limit the arena extension (#15983)	2023-05-25 02:19:07 -07:00
buffer_deleter.h	Multi-stream execution support (#13495)	2022-12-15 07:39:29 -08:00
customregistry.h
data_types.h	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
data_types_internal.h	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
endian.h
execution_provider.h	ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618)	2023-05-01 10:06:00 -07:00
float16.h	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
framework_common.h
framework_provider_common.h	Add TRT plugins support using custom ops (#13847)	2023-04-18 20:24:32 -07:00
func_api.h	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
kernel_def_builder.h	Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776)	2023-05-03 13:08:35 -07:00
kernel_registry.h	Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776)	2023-05-03 13:08:35 -07:00
op_kernel.h	ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618)	2023-05-01 10:06:00 -07:00
op_kernel_context.h	ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618)	2023-05-01 10:06:00 -07:00
op_kernel_info.h	ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618)	2023-05-01 10:06:00 -07:00
op_node_proto_helper.h	Switch GSL to MS GSL 4.0.0 (#13416)	2022-10-29 04:15:20 -07:00
ort_value.h	Enable Opset11 Sequence Ops on DirectML, and make the CPU implementations agnostic to backend EP (#14442)	2023-02-21 18:08:28 -08:00
ortdevice.h	ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618)	2023-05-01 10:06:00 -07:00
ortmemoryinfo.h	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
provider_options.h
provider_options_utils.h
provider_shutdown.h
run_options.h	Adding RunOptions synchronization behaviour to C/C++ API (#14088)	2023-02-07 19:59:28 -08:00
sparse_tensor.h	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
stream_handles.h	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
tensor.h	Enable Opset11 Sequence Ops on DirectML, and make the CPU implementations agnostic to backend EP (#14442)	2023-02-21 18:08:28 -08:00
tensor_shape.h	Switch GSL to MS GSL 4.0.0 (#13416)	2022-10-29 04:15:20 -07:00
to_tensor_proto_element_type.h	Consolidate utils::ToTensorProtoElementType, TypeToDataType, and data_types_internal::ToTensorDataType. (#9824)	2022-04-20 12:45:53 -07:00