onnxruntime/include/onnxruntime/core
Edward Chen 04030f64be
Add QNN EP HTP shared memory allocator (#23136)

The HTP shared memory allocator (`HtpSharedMemoryAllocator`) calls the
rpcmem shared library (libcdsprpc.so/dll) to allocate and free memory
that can be shared between HTP and CPU.

The allocator can be enabled by setting the QNN EP option
`enable_htp_shared_memory_allocator` to `1`.
`QNNExecutionProvider::CreatePreferredAllocators()` will then return an
instance of `HtpSharedMemoryAllocator`.
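
As a rough illustration of enabling the option from the Python API, the sketch below builds the provider options and passes them to a session. The helper names, the model path argument, and the `backend_path` value (`libQnnHtp.so`, or `QnnHtp.dll` on Windows) are assumptions for the example and are not part of this change; only the `enable_htp_shared_memory_allocator` key comes from the description above.

```python
def qnn_htp_provider_options(backend_path="libQnnHtp.so"):
    """Build QNN EP provider options with the HTP shared memory allocator enabled.

    `backend_path` is an assumed value for the example; use "QnnHtp.dll" on Windows.
    """
    return {
        "backend_path": backend_path,
        # Option introduced by this change; provider option values are strings.
        "enable_htp_shared_memory_allocator": "1",
    }


def create_session(model_path, backend_path="libQnnHtp.so"):
    """Hypothetical helper: create a session that runs on the QNN EP."""
    # Imported lazily so the option helper above works without onnxruntime installed.
    import onnxruntime as ort

    return ort.InferenceSession(
        model_path,
        providers=["QNNExecutionProvider"],
        provider_options=[qnn_htp_provider_options(backend_path)],
    )
```

Calling `create_session("model.onnx")` requires an onnxruntime build that includes the QNN EP and a device with an HTP backend.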

For each QNN context, we also need to register and unregister memory
handles in order to use the HTP shared memory. This memory handle
management is added to `QnnBackendManager`, which also manages the QNN
context handles.
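
The bookkeeping described above can be pictured with a toy model. This is a conceptual sketch only, not the `QnnBackendManager` implementation: the class name `MemHandleRegistry`, its methods, and the integer handles are all hypothetical. It assumes that a handle is registered per (context, shared buffer) pair and that every handle for a buffer is unregistered when the buffer is freed.

```python
import itertools


class MemHandleRegistry:
    """Toy model: tracks one memory handle per (context, shared buffer) pair."""

    def __init__(self):
        self._handles = {}                # (context, buffer_addr) -> handle
        self._next_id = itertools.count(1)

    def get_or_register(self, context, buffer_addr):
        """Return the handle for this pair, registering it on first use."""
        key = (context, buffer_addr)
        if key not in self._handles:
            self._handles[key] = next(self._next_id)
        return self._handles[key]

    def unregister_buffer(self, buffer_addr):
        """Drop every handle that refers to a buffer, e.g. when it is freed.

        Returns the number of handles removed.
        """
        stale = [key for key in self._handles if key[1] == buffer_addr]
        for key in stale:
            del self._handles[key]
        return len(stale)


registry = MemHandleRegistry()
first = registry.get_or_register("context_a", 0x1000)
# The same context and buffer reuse the existing handle.
assert registry.get_or_register("context_a", 0x1000) == first
# A different context needs its own handle for the same buffer.
assert registry.get_or_register("context_b", 0x1000) != first
```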

For more information about using HTP shared memory with QNN, see:
https://docs.qualcomm.com/bundle/publicresource/topics/80-63442-50/htp_shared_buffer_tutorial.html#shared-buffer-tutorial

Limitations:
- HTP shared memory usage is only supported for graph inputs and
outputs. Intermediate values are not supported.
- Each allocation is assigned its own shared memory buffer. The
allocator cannot pack multiple allocations into a single shared
memory buffer.

Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
2025-01-14 11:09:50 -08:00
common Bump clang-format from 18.1.8 to 19.1.6 (#23346) 2025-01-14 09:02:04 -08:00
eager Fix typos - 1st Wave (#21278) 2024-07-11 13:35:08 +08:00
framework Add QNN EP HTP shared memory allocator (#23136) 2025-01-14 11:09:50 -08:00
graph Implement pre-packed blobs serialization on disk and their memory mapping on load (#23069) 2024-12-20 10:49:08 -08:00
optimizer Reduce default logger usage (#23030) 2024-12-10 12:54:14 +11:00
platform Bump clang-format from 18.1.8 to 19.1.6 (#23346) 2025-01-14 09:02:04 -08:00
providers [CoreML] support coreml model cache (#23065) 2024-12-31 09:29:41 +08:00
session Add QNN EP HTP shared memory allocator (#23136) 2025-01-14 11:09:50 -08:00