onnxruntime/onnxruntime/python
Adrian Lizarraga efc84a43e8
[QNN EP] Add session option to disable fallback to default CPU EP (#16016)
### Description
Adds the session config option `session.disable_cpu_ep_fallback`, which
lets the user prevent the CPU EP from handling nodes that the other
execution providers do not support.

```C++
// Graph nodes that are not supported by the execution providers (EPs) explicitly added to the session are
// assigned (i.e., "fallback") to the CPU EP by default.
//
// This option allows the user to disable the fallback of unsupported graph nodes to the CPU EP.
// If this option is set to "1", session creation will fail if the execution providers other than the CPU EP cannot
// fully support all of the nodes in the graph.
//
// It is invalid to set this option and explicitly add the CPU EP to the session. In this case, session creation
// will also fail with an error.
//
// Option values:
// - "0": CPU EP fallback is not disabled. [DEFAULT]
// - "1": CPU EP fallback is disabled.
static const char* const kOrtSessionOptionsDisableCPUEPFallback = "session.disable_cpu_ep_fallback";
```
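
The rules in the comment above can be modeled with a small standalone sketch. This is not the ONNX Runtime implementation: `Node`, `CheckCpuFallback`, and the use of an empty `assigned_ep` string to mean "no explicitly added EP supports this node" are hypothetical names and conventions chosen only to illustrate when session creation is expected to fail.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph node after partitioning.
struct Node {
    std::string name;
    std::string assigned_ep;  // Empty: no explicitly added EP supports this node.
};

// Returns an error message if session creation should fail under the rules
// described for session.disable_cpu_ep_fallback, or std::nullopt otherwise.
// Illustrative only; the real checks live inside ONNX Runtime.
std::optional<std::string> CheckCpuFallback(bool disable_cpu_ep_fallback,
                                            const std::vector<std::string>& added_eps,
                                            const std::vector<Node>& nodes) {
    if (!disable_cpu_ep_fallback) {
        // Default ("0"): unsupported nodes are simply assigned to the CPU EP.
        return std::nullopt;
    }
    // Setting the option and explicitly adding the CPU EP is invalid.
    for (const auto& ep : added_eps) {
        if (ep == "CPU") {
            return std::string("CPU EP was explicitly added while fallback is disabled");
        }
    }
    // Every node must be claimed by one of the explicitly added EPs.
    for (const auto& node : nodes) {
        if (node.assigned_ep.empty()) {
            return "node '" + node.name + "' would fall back to the CPU EP";
        }
    }
    return std::nullopt;
}
```

With the option off, a partially supported graph passes the check; with it on, the same graph is rejected, as is any session that also lists the CPU EP explicitly.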

#### Example use
```C++
#include <string>
#include <unordered_map>

#include "core/session/onnxruntime_cxx_api.h"
#include "core/session/onnxruntime_session_options_config_keys.h"

int main(int argc, char** argv) {
    Ort::Env env;
    Ort::SessionOptions so;
    so.AddConfigEntry(kOrtSessionOptionsDisableCPUEPFallback, "1");  // Disable fallback to the CPU EP.

    // Provider options for the QNN EP; backend_path names the QNN backend library to load.
    std::unordered_map<std::string, std::string> options;
#if defined(_WIN32)
    options["backend_path"] = "QnnCpu.dll";
#else
    options["backend_path"] = "libQnnCpu.so";
#endif

    so.AppendExecutionProvider("QNN", options);

    // ORT_MODEL_FOLDER is assumed to be defined by the build (it is a test macro).
    const ORTCHAR_T* ort_model_path = ORT_MODEL_FOLDER "qnn_ep_partial_support.onnx";
    Ort::Session session(env, ort_model_path, so);  // Throws if any node would fall back to the CPU EP.
    // ...
}
```

### Motivation and Context
Makes it easier for application developers to ensure that the entire
model runs on specific EPs. This is critical for Qualcomm scenarios: if
the compute cannot be offloaded to the NPU, running on the CPU is not
acceptable (it could be the difference between a 90-second inference and
a 6-second inference).

---------

Co-authored-by: Pranav Sharma <prs@microsoft.com>
2023-05-23 17:56:32 -07:00
backend Adopt linrtunner as the linting tool - take 2 (#15085) 2023-03-24 15:29:03 -07:00
datasets Adopt linrtunner as the linting tool - take 2 (#15085) 2023-03-24 15:29:03 -07:00
providers/tvm Bump ruff in CI (#15533) 2023-04-17 10:11:44 -07:00
tools [ROCm] add hipblaslt into GemmFastGelu TunableOp (#15945) 2023-05-23 11:07:09 +08:00
torch_cpp_extensions Adopt linrtunner as the linting tool - take 2 (#15085) 2023-03-24 15:29:03 -07:00
training
__init__.py
_ld_preload.py
_pybind_state.py.in
exported_symbols.lst
numpy_helper.h
onnxruntime_collect_build_info.py Adopt linrtunner as the linting tool - take 2 (#15085) 2023-03-24 15:29:03 -07:00
onnxruntime_inference_collection.py update with onnx main (#14929) 2023-04-18 08:42:51 -07:00
onnxruntime_pybind.h
onnxruntime_pybind_exceptions.cc Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
onnxruntime_pybind_exceptions.h
onnxruntime_pybind_iobinding.cc Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
onnxruntime_pybind_mlvalue.cc Enable Opset11 Sequence Ops on DirectML, and make the CPU implementations agnostic to backend EP (#14442) 2023-02-21 18:08:28 -08:00
onnxruntime_pybind_mlvalue.h
onnxruntime_pybind_module.cc Expose build information in dynamic lib (#15643) 2023-04-28 21:57:31 -07:00
onnxruntime_pybind_ortvalue.cc Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
onnxruntime_pybind_schema.cc Fix issues on Windows for Vitis AI (#15810) 2023-05-04 14:42:19 -07:00
onnxruntime_pybind_sparse_tensor.cc
onnxruntime_pybind_state.cc [QNN EP] Add session option to disable fallback to default CPU EP (#16016) 2023-05-23 17:56:32 -07:00
onnxruntime_pybind_state.h
onnxruntime_pybind_state_common.cc
onnxruntime_pybind_state_common.h Remove onnxruntime_PYBIND_EXPORT_OPSCHEMA definition from onnxruntime (#15776) 2023-05-03 13:08:35 -07:00
onnxruntime_validation.py Adopt linrtunner as the linting tool - take 2 (#15085) 2023-03-24 15:29:03 -07:00
pybind.def
version_script.lds
version_script_expose_onnx_protobuf.lds