onnxruntime/include/onnxruntime/core
Adrian Lizarraga b47e1e64d7
[QNN EP] Make offloading graph input/output quantization (to CPU) the default (#23368)
### Description
Enables the QNN EP provider option `offload_graph_io_quantization` by
default; it was previously disabled by default.
### Motivation and Context
Enabling this option significantly decreases inference latency for many
models.
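Since the option now defaults to enabled, users who want the previous behavior must opt out explicitly. A minimal sketch of how that might look, assuming a build of onnxruntime with the QNN execution provider; the backend path and model filename are placeholders:

```python
# Hypothetical sketch: opting out of the now-default option (#23368).
# "QnnHtp.dll" and "model.onnx" are placeholder values.
qnn_provider_options = {
    "backend_path": "QnnHtp.dll",
    # "1" (enabled) is now the default; set "0" to restore the old behavior.
    "offload_graph_io_quantization": "0",
}

# Creating the session requires a QNN-enabled build and device, so it is
# shown here only as a comment:
# import onnxruntime
# session = onnxruntime.InferenceSession(
#     "model.onnx",
#     providers=[("QNNExecutionProvider", qnn_provider_options)],
# )
```

Leaving the option at its default keeps graph input/output quantization on CPU, which the commit notes significantly decreases inference latency for many models.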
2025-02-04 11:42:46 -08:00
| Directory | Last commit | Date |
|---|---|---|
| common | Add overload of TryParseStringWithClassicLocale() that uses std::from_chars() (#23541) | 2025-01-30 13:55:54 -08:00 |
| eager | Fix typos - 1st Wave (#21278) | 2024-07-11 13:35:08 +08:00 |
| framework | Add QNN EP HTP shared memory allocator (#23136) | 2025-01-14 11:09:50 -08:00 |
| graph | Use onnx_protobuf.h to suppress some GCC warnings (#23453) | 2025-01-21 20:25:12 -08:00 |
| optimizer | Reduce default logger usage (#23030) | 2024-12-10 12:54:14 +11:00 |
| platform | Bump clang-format from 18.1.8 to 19.1.6 (#23346) | 2025-01-14 09:02:04 -08:00 |
| providers | [CoreML] support coreml model cache (#23065) | 2024-12-31 09:29:41 +08:00 |
| session | [QNN EP] Make offloading graph input/output quantization (to CPU) the default (#23368) | 2025-02-04 11:42:46 -08:00 |