onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-25 22:26:24 +00:00


Adrian Lizarraga efc84a43e8 [QNN EP] Add session option to disable fallback to default CPU EP (#16016)	2023-05-23 17:56:32 -07:00

### Description
Adds the session config option `disable_cpu_ep_fallback` to allow the user to prevent the CPU EP from handling nodes not supported by other execution providers.

```C++
// Graph nodes that are not supported by the execution providers (EPs) explicitly added to the session are
// assigned (i.e., "fallback") to the CPU EP by default.
//
// This option allows the user to disable the fallback of unsupported graph nodes to the CPU EP.
// If this option is set to "1", session creation will fail if the execution providers other than the CPU EP cannot
// fully support all of the nodes in the graph.
//
// It is invalid to set this option and explicitly add the CPU EP to the session. In this case, session creation
// will also fail with an error.
//
// Option values:
// - "0": CPU EP fallback is not disabled. [DEFAULT]
// - "1": CPU EP fallback is disabled.
static const char* const kOrtSessionOptionsDisableCPUEPFallback = "session.disable_cpu_ep_fallback";
```

#### Example use
```C++
#include "core/session/onnxruntime_cxx_api.h"
#include "core/session/onnxruntime_session_options_config_keys.h"

int main(int argc, char** argv) {
  Ort::SessionOptions so;
  so.AddConfigEntry(kOrtSessionOptionsDisableCPUEPFallback, "1");  // Disable fallback to the CPU EP.

  onnxruntime::ProviderOptions options;
#if defined(_WIN32)
  options["backend_path"] = "QnnCpu.dll";
#else
  options["backend_path"] = "libQnnCpu.so";
#endif
  so.AppendExecutionProvider("QNN", options);

  const ORTCHAR_T* ort_model_path = ORT_MODEL_FOLDER "qnn_ep_partial_support.onnx";
  Ort::Session session(*ort_env, ort_model_path, so);  // Throws exception if nodes fallback to CPU
  // ...
```

### Motivation and Context
Makes it easier for application developers to ensure that the entire model runs on specific EPs. This is critical for Qualcomm scenarios. If the compute cannot be offloaded to the NPU, running on CPU is not acceptable. (could be the difference between 90 second inference and 6 seconds inference)

---------

Co-authored-by: Pranav Sharma <prs@microsoft.com>
backend	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
datasets	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
providers/tvm	Bump ruff in CI (#15533)	2023-04-17 10:11:44 -07:00
tools	[ROCm] add hipblaslt into GemmFastGelu TunableOp (#15945)	2023-05-23 11:07:09 +08:00
torch_cpp_extensions	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
training
__init__.py
_ld_preload.py
_pybind_state.py.in
exported_symbols.lst
numpy_helper.h
onnxruntime_collect_build_info.py	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
onnxruntime_inference_collection.py	update with onnx main (#14929)	2023-04-18 08:42:51 -07:00
onnxruntime_pybind.h
onnxruntime_pybind_exceptions.cc	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
onnxruntime_pybind_exceptions.h
onnxruntime_pybind_iobinding.cc	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
onnxruntime_pybind_mlvalue.cc	Enable Opset11 Sequence Ops on DirectML, and make the CPU implementations agnostic to backend EP (#14442)	2023-02-21 18:08:28 -08:00
onnxruntime_pybind_mlvalue.h
onnxruntime_pybind_module.cc	Expose build information in dynamic lib (#15643)	2023-04-28 21:57:31 -07:00
onnxruntime_pybind_ortvalue.cc	Run clang-format in CI (#15524)	2023-04-18 09:26:58 -07:00
onnxruntime_pybind_schema.cc	Fix issues on Windows for Vitis AI (#15810)	2023-05-04 14:42:19 -07:00
onnxruntime_pybind_sparse_tensor.cc
onnxruntime_pybind_state.cc	[QNN EP] Add session option to disable fallback to default CPU EP (#16016)	2023-05-23 17:56:32 -07:00
onnxruntime_pybind_state.h
onnxruntime_pybind_state_common.cc
onnxruntime_pybind_state_common.h	Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776)	2023-05-03 13:08:35 -07:00
onnxruntime_validation.py	Adopt linrtunner as the linting tool - take 2 (#15085)	2023-03-24 15:29:03 -07:00
pybind.def
version_script.lds
version_script_expose_onnx_protobuf.lds