onnxruntime/include/onnxruntime/core
Adrian Lizarraga b47e1e64d7
[QNN EP] Make offloading graph input/output quantization (to CPU) the default (#23368)
### Description
Enables the QNN EP provider option `offload_graph_io_quantization` by
default; it was previously disabled by default.
### Motivation and Context
Enabling this option significantly decreases inference latency for many
models.
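Since the option now defaults to enabled, users who want the previous behavior must opt out explicitly. A minimal sketch of how that might look, assuming a build of onnxruntime with the QNN execution provider; the backend path and model filename are placeholders:

```python
# Hypothetical sketch: opting out of the now-default option (#23368).
# "QnnHtp.dll" and "model.onnx" are placeholder values.
qnn_provider_options = {
    "backend_path": "QnnHtp.dll",
    # "1" (enabled) is now the default; set "0" to restore the old behavior.
    "offload_graph_io_quantization": "0",
}

# Creating the session requires a QNN-enabled build and device, so it is
# shown here only as a comment:
# import onnxruntime
# session = onnxruntime.InferenceSession(
#     "model.onnx",
#     providers=[("QNNExecutionProvider", qnn_provider_options)],
# )
```

Leaving the option at its default keeps graph input/output quantization on CPU, which the commit notes significantly decreases inference latency for many models.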
2025-02-04 11:42:46 -08:00
| Directory | Last commit | Date |
|---|---|---|
| common | Add overload of TryParseStringWithClassicLocale() that uses std::from_chars() (#23541) | 2025-01-30 13:55:54 -08:00 |
| eager | Fix typos - 1st Wave (#21278) | 2024-07-11 13:35:08 +08:00 |
| framework | Add QNN EP HTP shared memory allocator (#23136) | 2025-01-14 11:09:50 -08:00 |
| graph | Use onnx_protobuf.h to suppress some GCC warnings (#23453) | 2025-01-21 20:25:12 -08:00 |
| optimizer | Reduce default logger usage (#23030) | 2024-12-10 12:54:14 +11:00 |
| platform | Bump clang-format from 18.1.8 to 19.1.6 (#23346) | 2025-01-14 09:02:04 -08:00 |
| providers | [CoreML] support coreml model cache (#23065) | 2024-12-31 09:29:41 +08:00 |
| session | [QNN EP] Make offloading graph input/output quantization (to CPU) the default (#23368) | 2025-02-04 11:42:46 -08:00 |