onnxruntime/include
Adrian Lizarraga 84d48b6ad6
[QNN EP] Add provider option to offload graph I/O quantization/dequantization to the CPU EP (#22436)
### Description
Adds the QNN provider option `offload_graph_io_quantization`, which offloads
graph input quantization and graph output dequantization to the CPU EP.
The option is disabled by default to preserve the current behavior.
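
As a rough illustration of how the option might be enabled from the Python API, here is a minimal sketch. It assumes the option takes the string value `"1"` to enable it, following the string-valued convention of other QNN provider options. The helper names `qnn_provider_options` and `create_session`, the default `backend_path` of `QnnHtp.dll`, and the model path are hypothetical and not part of this change.

```python
from typing import Dict


def qnn_provider_options(backend_path: str, offload_io_quant: bool = False) -> Dict[str, str]:
    """Build a provider-options dict for the QNN EP (hypothetical helper).

    Provider option values are passed as strings. The value "1" for
    offload_graph_io_quantization is an assumption based on how other
    QNN EP boolean-style options are set.
    """
    options = {"backend_path": backend_path}
    if offload_io_quant:
        # Ask the QNN EP to leave graph input quantization and graph output
        # dequantization to the CPU EP.
        options["offload_graph_io_quantization"] = "1"
    return options


def create_session(model_path: str, backend_path: str = "QnnHtp.dll", offload_io_quant: bool = True):
    """Create an InferenceSession on the QNN EP (hypothetical helper)."""
    # Imported lazily so the options helper can be used without onnxruntime installed.
    import onnxruntime as ort

    return ort.InferenceSession(
        model_path,
        providers=["QNNExecutionProvider"],
        provider_options=[qnn_provider_options(backend_path, offload_io_quant)],
    )
```

Calling `create_session("model.qdq.onnx")` would require an ONNX Runtime build that includes the QNN EP. Leaving `offload_io_quant` as `False` omits the key entirely, so the EP falls back to its default (disabled) behavior.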


### Motivation and Context
Offloading graph I/O quantization and dequantization to the CPU EP significantly
reduces inference latency for many models.
2024-10-16 15:00:53 -07:00
onnxruntime/core [QNN EP] Add provider option to offload graph I/O quantization/dequantization to the CPU EP (#22436) 2024-10-16 15:00:53 -07:00