ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Find a file
Adrian Lizarraga de17d53c50
Custom Op runtime wrapper (#13427)
### Description

Adds the below C APIs to support custom ops that wrap an entire model to
be inferenced with an external runtime. The current SNPE EP is an
example of an EP that could be ported to use a custom op wrapper. Ex:
The custom op stores the serialized SNPE DLC binary as a string
attribute. The SNPE model is built when the kernel is created. The model
is inferenced with SNPE APIs on call to the kernel's compute method.

#### C APIs
| API | Description | Why |
| ---            | ---        | ---  |
| `KernelInfo_GetInputCount` | Gets number of inputs from
`OrtKernelInfo`. | Query I/O characteristics during kernel
creation<sup>1</sup> |
| `KernelInfo_GetOutputCount` | Gets number of outputs from
`OrtKernelInfo`. | Query I/O characteristics during kernel
creation<sup>1</sup> |
| `KernelInfo_GetInputName` | Gets an input's name. | Query I/O
characteristics during kernel creation<sup>1</sup> |
| `KernelInfo_GetOutputName` | Gets an output's name. | Query I/O
characteristics during kernel creation<sup>1</sup> |
| `KernelInfo_GetInputTypeInfo` | Gets the type/shape information for an
input. | Query I/O characteristics during kernel creation<sup>1</sup> |
| `KernelInfo_GetOutputTypeInfo` | Gets the type/shape information for
an output. | Query I/O characteristics during kernel
creation<sup>1</sup> |
| `KernelInfoGetAttribute_tensor` | Get a OrtValue tensor stored as an
attribute in the graph node | Extract serialized models, weights, etc. |
| `GetSessionConfigEntry` | Get a session configuration value | Need to
be able to get session-time configurations from within custom op |
| `HasSessionConfigEntry` | Check if session configuration entry exists.
| Need to be able to get session-time configurations from within custom
op |

#### Why so many KernelInfo APIs?<sup>1</sup>
Similar APIs currently exist for `OrtKernelContext`, but not
`OrtKernelInfo`. Note that `OrtKernelContext` is passed to the custom op
on call to its kernel's compute() function. However, `OrtKernelInfo` is
available on kernel creation, which occurs when the session is created.
Having these APIs available from `OrtKernelInfo` allows an operator to
trade-off computation time for session-creation time, and vice versa.
Operators that must build expensive state may prefer to do it during
session creation time instead of compute-time.

SNPE is an example of an EP that needs to be able to query `KernelInfo`
for the name, type, and shape of inputs and outputs in order to build
the model from the serialized DLC data. This is an expensive operation.
Other providers (e.g., OpenVINO) are able to query i/o info from the
serialized model, so they do not strictly need these APIs. However, the
APIs can still be used to validate the expected I/O characteristics.

Additionally, several of our CPU contrib ops currently use the same
internal version of these KernelInfo APIs (Ex:
[qlinear_softmax](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/contrib_ops/cpu/quantization/qlinear_softmax.cc#L71)).
If custom ops are also meant to be a test bed for future ops, then all
custom ops (not just runtime wrappers) would benefit from the addition
of these public KernelInfo APIs (IMO).

#### Example of usage in a custom OP
From
`onnxruntime/test/testdata/custom_op_openvino_wrapper_library/openvino_wrapper.h`

```c++
struct CustomOpOpenVINO : Ort::CustomOpBase<CustomOpOpenVINO, KernelOpenVINO> {
  explicit CustomOpOpenVINO(Ort::ConstSessionOptions session_options);

  CustomOpOpenVINO(const CustomOpOpenVINO&) = delete;
  CustomOpOpenVINO& operator=(const CustomOpOpenVINO&) = delete;

  void* CreateKernel(const OrtApi& api, const OrtKernelInfo* info) const;

  constexpr const char* GetName() const noexcept {
    return "OpenVINO_Wrapper";
  }

  constexpr const char* GetExecutionProviderType() const noexcept {
    return "CPUExecutionProvider";
  }

  // IMPORTANT: In order to wrap a generic runtime-specific model, the custom operator
  // must have a non-homogeneous variadic input and output.

  constexpr size_t GetInputTypeCount() const noexcept {
    return 1;
  }

  constexpr size_t GetOutputTypeCount() const noexcept {
    return 1;
  }

  constexpr ONNXTensorElementDataType GetInputType(size_t /* index */) const noexcept {
    return ONNX_TENSOR_ELEMENT_DATA_TYPE_UNDEFINED;
  }

  constexpr ONNXTensorElementDataType GetOutputType(size_t /* index */) const noexcept {
    return ONNX_TENSOR_ELEMENT_DATA_TYPE_UNDEFINED;
  }

  constexpr OrtCustomOpInputOutputCharacteristic GetInputCharacteristic(size_t /* index */) const noexcept {
    return INPUT_OUTPUT_VARIADIC;
  }

  constexpr OrtCustomOpInputOutputCharacteristic GetOutputCharacteristic(size_t /* index */) const noexcept {
    return INPUT_OUTPUT_VARIADIC;
  }

  constexpr bool GetVariadicInputHomogeneity() const noexcept {
    return false;  // heterogenous
  }

  constexpr bool GetVariadicOutputHomogeneity() const noexcept {
    return false;  // heterogeneous
  }

  std::vector<std::string> GetSessionConfigKeys() const { return {"device_type"}; }

 private:
  std::unordered_map<std::string, std::string> session_configs_;
};
```

#### How to create a session:
```c++
Ort::Env env;
Ort::SessionOptions session_opts;
Ort::CustomOpConfigs custom_op_configs;

// Create local session config entries for the custom op.
custom_op_configs.AddConfig("OpenVINO_Wrapper", "device_type", "CPU");

// Register custom op library and pass in the custom op configs (optional).
session_opts.RegisterCustomOpsLibrary(lib_name, custom_op_configs);

Ort::Session session(env, model_path.data(), session_opts);
```
### Motivation and Context
Allows creation of simple "wrapper" EPs outside of the main ORT code
base.
2023-01-18 09:09:32 -08:00
.config Update tsaoptions.json: update the email alias (#13448) 2022-10-26 15:56:16 -07:00
.devcontainer
.gdn
.github Delete add-issues-to-project (#14147) 2023-01-11 14:33:37 -08:00
.pipelines [DML EP] Upgrade DML to 1.10.0 (#13796) 2022-11-30 21:32:14 -08:00
.vscode
cgmanifests [CPU] Resize of Opset 18 (#13890) 2023-01-14 08:57:23 +10:00
cmake Custom Op runtime wrapper (#13427) 2023-01-18 09:09:32 -08:00
csharp [CPU] Resize of Opset 18 (#13890) 2023-01-14 08:57:23 +10:00
dockerfiles Openvino ep 2022.3 v4.3 (#14210) 2023-01-11 16:31:26 -08:00
docs [CUDA] Add trt cross attention kernels (#14328) 2023-01-17 17:55:45 -08:00
include/onnxruntime/core Custom Op runtime wrapper (#13427) 2023-01-18 09:09:32 -08:00
java Add Java and Objective-C bindings for RegisterCustomOpsUsingFunction. (#14256) 2023-01-13 09:04:26 -08:00
js add opset18 node test (#14236) 2023-01-19 00:56:57 +08:00
objectivec Add Java and Objective-C bindings for RegisterCustomOpsUsingFunction. (#14256) 2023-01-13 09:04:26 -08:00
onnxruntime Custom Op runtime wrapper (#13427) 2023-01-18 09:09:32 -08:00
orttraining Improved test cases by using paramerters (#14246) 2023-01-13 12:54:23 -08:00
package/rpm Bumping up version number to 1.14.0 on main branch (#13401) 2022-10-21 19:16:44 -04:00
samples
test Multi-stream execution support (#13495) 2022-12-15 07:39:29 -08:00
tools enable ort-extensions in wasm release builds (#14239) 2023-01-17 12:39:13 -08:00
winml Enabling thread pool to be numa-aware (#13778) 2022-12-12 10:33:55 -08:00
.clang-format
.clang-tidy Create clang-tidy CI (#12653) 2022-09-30 08:05:38 -07:00
.dockerignore
.flake8 Remove miscellaneous nuphar configs (#13070) 2022-09-26 13:41:28 -07:00
.gitattributes
.gitignore Ignore more build directories and clangd files (#14154) 2023-01-07 06:58:57 +08:00
.gitmodules Remove unused git submodules (#13830) 2022-12-07 21:59:16 -08:00
build.amd64.1411.bat
build.bat
build.sh
CITATION.cff
CODEOWNERS Add cgmanifest file in codeowner list (#13042) 2022-09-22 18:58:01 -07:00
CONTRIBUTING.md
lgtm.yml Fix lgtm C++ error (#13613) 2022-11-10 10:06:22 -08:00
LICENSE
NuGet.config
ort.wprp
ORT_icon_for_light_bg.png
packages.config [DML EP] Upgrade DML to 1.10.0 (#13796) 2022-11-30 21:32:14 -08:00
pyproject.toml Update pylint config to include valid short names (#13631) 2022-11-14 10:00:25 -08:00
README.md Update resource section in readme (#13724) 2022-11-28 09:42:31 -08:00
requirements-dev.txt
requirements-doc.txt
requirements-training.txt Remove protobuf pin from training requirements (#13695) 2022-11-22 12:27:18 -08:00
requirements.txt.in
SECURITY.md
setup.py Openvino ep 2022.3 v4.3 (#14210) 2023-01-11 16:31:26 -08:00
ThirdPartyNotices.txt [CPU] Resize of Opset 18 (#13890) 2023-01-14 08:57:23 +10:00
VERSION_NUMBER Bumping up version number to 1.14.0 on main branch (#13401) 2022-10-21 19:16:44 -04:00

ONNX Runtime is a cross-platform inference and training machine-learning accelerator.

ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, XGBoost, etc. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators where applicable alongside graph optimizations and transforms. Learn more →

ONNX Runtime training can accelerate the model training time on multi-node NVIDIA GPUs for transformer models with a one-line addition for existing PyTorch training scripts. Learn more →

Get Started & Resources

Build Pipeline Status

System CPU GPU EPs
Windows Build Status Build Status Build Status
Linux Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Mac Build Status
Build Status
Android Build Status
iOS Build Status
WebAssembly Build Status

Data/Telemetry

Windows distributions of this project may collect usage data and send it to Microsoft to help improve our products and services. See the privacy statement for more details.

Contributions and Feedback

We welcome contributions! Please see the contribution guidelines.

For feature requests or bug reports, please file a GitHub Issue.

For general discussion or questions, please use GitHub Discussions.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

This project is licensed under the MIT License.