onnxruntime/include/onnxruntime/core/framework
Edward Chen 04030f64be
Add QNN EP HTP shared memory allocator (#23136)
Adds QNN EP HTP shared memory allocator.

The HTP shared memory allocator (`HtpSharedMemoryAllocator`) calls the
rpcmem shared library (libcdsprpc.so/dll) to allocate and free memory
that can be shared between HTP and CPU.

The allocator can be enabled by setting QNN EP option
`enable_htp_shared_memory_allocator` to `1`.
`QNNExecutionProvider::CreatePreferredAllocators()` will then return an
instance of `HtpSharedMemoryAllocator`.
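
As a rough illustration of enabling the option from the Python API, the
sketch below builds the QNN EP provider options and passes them to a
session. The helper names (`qnn_htp_provider_options`, `create_session`)
and the backend library name used in the usage note are hypothetical, and
the sketch assumes an onnxruntime build that includes the QNN EP.

```python
from typing import Dict


def qnn_htp_provider_options(backend_path: str) -> Dict[str, str]:
    """Build QNN EP provider options with the HTP shared memory allocator enabled.

    `backend_path` is the QNN backend library to load (an assumption here;
    pass whatever HTP backend library your QNN SDK installation provides).
    """
    return {
        "backend_path": backend_path,
        # Option named in the commit message; "1" enables the allocator.
        "enable_htp_shared_memory_allocator": "1",
    }


def create_session(model_path: str, backend_path: str):
    """Create an InferenceSession that runs on the QNN EP (hypothetical helper)."""
    # Imported lazily so the options helper can be used without onnxruntime installed.
    import onnxruntime as ort

    return ort.InferenceSession(
        model_path,
        providers=["QNNExecutionProvider"],
        provider_options=[qnn_htp_provider_options(backend_path)],
    )
```

For example, `create_session("model.onnx", "libQnnHtp.so")` would request
the QNN EP with the allocator enabled; both arguments are placeholders.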

For each QNN context, we also need to register and unregister memory
handles in order to use the HTP shared memory. This memory handle
management is added to `QnnBackendManager`, which also manages the QNN
context handles.

For more information about using HTP shared memory with QNN, see:
https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/htp_shared_buffer_tutorial.html#shared-buffer-tutorial

Limitations:
- HTP shared memory usage is only supported for graph inputs and
outputs. Intermediate values are not supported.
- Each allocation is assigned its own shared memory buffer. The
allocator does not yet support placing multiple allocations in a
single shared memory buffer.

Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
2025-01-14 11:09:50 -08:00
..
alloc_kind.h
allocator.h Add QNN EP HTP shared memory allocator (#23136) 2025-01-14 11:09:50 -08:00
buffer_deleter.h Multi-stream execution support (#13495) 2022-12-15 07:39:29 -08:00
customregistry.h
data_types.h [VitisAI] change all support tensor type from ir 9 to ir 10 (#23204) 2025-01-02 06:45:21 -08:00
data_types_internal.h [CPU EP] Int4 support for QuantizeLinear, DequantizeLinear, and Transpose (#20362) 2024-05-30 18:56:24 -07:00
endian.h
execution_provider.h Add SetEpDynamicOptions and remove workload_type from run/session options (#22282) 2024-10-09 22:54:22 -07:00
float8.h Pre-requisites of upgrading EMSDK (#23347) 2025-01-14 11:07:21 -08:00
float16.h Pre-requisites of upgrading EMSDK (#23347) 2025-01-14 11:07:21 -08:00
framework_common.h
framework_provider_common.h Add TRT plugins support using custom ops (#13847) 2023-04-18 20:24:32 -07:00
func_api.h Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
int4.h Remove core/common/gsl.h (#20894) 2024-07-08 18:09:39 -07:00
kernel_def_builder.h Release backward inputs per static graph ref count (#20804) 2024-06-14 14:33:01 +08:00
kernel_registry.h Reduce default logger usage (#23030) 2024-12-10 12:54:14 +11:00
op_kernel.h Implement pre-packed blobs serialization on disk and their memory mapping on load (#23069) 2024-12-20 10:49:08 -08:00
op_kernel_context.h ExecutionProvider API refactor - replace OrtMemoryInfo with OrtDevice (#15618) 2023-05-01 10:06:00 -07:00
op_kernel_info.h Remove core/common/gsl.h (#20894) 2024-07-08 18:09:39 -07:00
op_node_proto_helper.h Remove core/common/gsl.h (#20894) 2024-07-08 18:09:39 -07:00
ort_value.h Two fixes involving minimal builds (#17000) 2023-08-23 16:01:22 +10:00
ortdevice.h Add QNN EP HTP shared memory allocator (#23136) 2025-01-14 11:09:50 -08:00
ortmemoryinfo.h Add QNN EP HTP shared memory allocator (#23136) 2025-01-14 11:09:50 -08:00
provider_options.h
provider_options_utils.h
provider_shutdown.h
run_options.h Multi-Lora support (#22046) 2024-09-30 15:59:07 -07:00
sparse_tensor.h Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
stream_handles.h Update ruff and clang-format versions (#21479) 2024-07-24 11:50:11 -07:00
tensor.h Remove core/common/gsl.h (#20894) 2024-07-08 18:09:39 -07:00
tensor_shape.h Remove core/common/gsl.h (#20894) 2024-07-08 18:09:39 -07:00
to_tensor_proto_element_type.h [CPU EP] Int4 support for QuantizeLinear, DequantizeLinear, and Transpose (#20362) 2024-05-30 18:56:24 -07:00