onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-18 21:21:17 +00:00

Dmitri Smirnov 2c50b75a26 Functions Ahead Of Time inlining (#17764)	2023-10-23 17:42:20 -07:00

### Description

Inline functions in an EP-aware fashion.

With this PR, models that have been inlined by the ONNX inliner and then optimized and models that have been AOT inlined appear to be visually identical. For tests I used two models. The only difference is the resulting size, because the ONNX inliner removes local function definitions and AOT does not. The difference in size was 2.5 MB for the `HF Mobile` model and ~500K for `HF Bart`.

It seems that the resulting model size affects the load time more than the actual optimizations do. In general, inlined models grow in size very fast and can easily exceed the 2 GB limit.

Q. Should we make AOT optional?

`If` constant folding and the removal of local inlined models will be coming in other PRs.

Some stats:

![image](https://github.com/microsoft/onnxruntime/assets/11303988/fcb4c815-2e06-4574-8d96-5a0a727d1ecf)
basic_types.h
constants.h	Support WebNN EP (#15698)	2023-05-08 21:25:10 -07:00
function.h	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
graph.h	Functions Ahead Of Time inlining (#17764)	2023-10-23 17:42:20 -07:00
graph_nodes.h	Graph transformer to ensure unique DQ nodes for QDQ node units (#15145)	2023-03-31 08:39:43 +10:00
graph_viewer.h	Multi-stream execution support (#13495)	2022-12-15 07:39:29 -08:00
indexed_sub_graph.h	[wasm] upgrade emsdk to 3.1.44 (#17069)	2023-08-10 16:08:36 -07:00
node_arg.h	Remove ORT_ENABLE_RUNTIME_OPTIMIZATION_IN_MINIMAL_BUILD. (#10778)	2022-03-08 16:18:49 -08:00
schema_registry.h	[wasm] upgrade emsdk to 3.1.44 (#17069)	2023-08-10 16:08:36 -07:00