saymrwulf/onnxruntime: ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-16 18:31:27 +00:00



Adrian Lizarraga de17d53c50 Custom Op runtime wrapper (#13427 )	2023-01-18 09:09:32 -08:00

### Description

Adds the C APIs below to support custom ops that wrap an entire model to be inferenced with an external runtime. The current SNPE EP is an example of an EP that could be ported to use a custom op wrapper.

Ex: The custom op stores the serialized SNPE DLC binary as a string attribute. The SNPE model is built when the kernel is created. The model is inferenced with SNPE APIs on each call to the kernel's compute method.

#### C APIs

| API | Description | Why |
| --- | --- | --- |
| `KernelInfo_GetInputCount` | Gets number of inputs from `OrtKernelInfo`. | Query I/O characteristics during kernel creation<sup>1</sup> |
| `KernelInfo_GetOutputCount` | Gets number of outputs from `OrtKernelInfo`. | Query I/O characteristics during kernel creation<sup>1</sup> |
| `KernelInfo_GetInputName` | Gets an input's name. | Query I/O characteristics during kernel creation<sup>1</sup> |
| `KernelInfo_GetOutputName` | Gets an output's name. | Query I/O characteristics during kernel creation<sup>1</sup> |
| `KernelInfo_GetInputTypeInfo` | Gets the type/shape information for an input. | Query I/O characteristics during kernel creation<sup>1</sup> |
| `KernelInfo_GetOutputTypeInfo` | Gets the type/shape information for an output. | Query I/O characteristics during kernel creation<sup>1</sup> |
| `KernelInfoGetAttribute_tensor` | Gets an OrtValue tensor stored as an attribute in the graph node. | Extract serialized models, weights, etc. |
| `GetSessionConfigEntry` | Gets a session configuration value. | Need to be able to get session-time configurations from within custom op |
| `HasSessionConfigEntry` | Checks if a session configuration entry exists. | Need to be able to get session-time configurations from within custom op |

#### Why so many KernelInfo APIs?<sup>1</sup>

Similar APIs currently exist for `OrtKernelContext`, but not `OrtKernelInfo`. Note that `OrtKernelContext` is passed to the custom op on each call to its kernel's compute() function. However, `OrtKernelInfo` is available on kernel creation, which occurs when the session is created. Having these APIs available from `OrtKernelInfo` allows an operator to trade off computation time for session-creation time, and vice versa. Operators that must build expensive state may prefer to do it at session-creation time instead of compute time.

SNPE is an example of an EP that needs to be able to query `KernelInfo` for the name, type, and shape of inputs and outputs in order to build the model from the serialized DLC data. This is an expensive operation.

Other providers (e.g., OpenVINO) are able to query I/O info from the serialized model, so they do not strictly need these APIs. However, the APIs can still be used to validate the expected I/O characteristics.

Additionally, several of our CPU contrib ops currently use the same internal version of these KernelInfo APIs (Ex: [qlinear_softmax](https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/contrib_ops/cpu/quantization/qlinear_softmax.cc#L71)). If custom ops are also meant to be a test bed for future ops, then all custom ops (not just runtime wrappers) would benefit from the addition of these public KernelInfo APIs (IMO).

#### Example of usage in a custom OP

From `onnxruntime/test/testdata/custom_op_openvino_wrapper_library/openvino_wrapper.h`

```c++
struct CustomOpOpenVINO : Ort::CustomOpBase<CustomOpOpenVINO, KernelOpenVINO> {
  explicit CustomOpOpenVINO(Ort::ConstSessionOptions session_options);

  CustomOpOpenVINO(const CustomOpOpenVINO&) = delete;
  CustomOpOpenVINO& operator=(const CustomOpOpenVINO&) = delete;

  void* CreateKernel(const OrtApi& api, const OrtKernelInfo* info) const;

  constexpr const char* GetName() const noexcept {
    return "OpenVINO_Wrapper";
  }

  constexpr const char* GetExecutionProviderType() const noexcept {
    return "CPUExecutionProvider";
  }

  // IMPORTANT: In order to wrap a generic runtime-specific model, the custom operator
  // must have a non-homogeneous variadic input and output.

  constexpr size_t GetInputTypeCount() const noexcept {
    return 1;
  }

  constexpr size_t GetOutputTypeCount() const noexcept {
    return 1;
  }

  constexpr ONNXTensorElementDataType GetInputType(size_t /* index */) const noexcept {
    return ONNX_TENSOR_ELEMENT_DATA_TYPE_UNDEFINED;
  }

  constexpr ONNXTensorElementDataType GetOutputType(size_t /* index */) const noexcept {
    return ONNX_TENSOR_ELEMENT_DATA_TYPE_UNDEFINED;
  }

  constexpr OrtCustomOpInputOutputCharacteristic GetInputCharacteristic(size_t /* index */) const noexcept {
    return INPUT_OUTPUT_VARIADIC;
  }

  constexpr OrtCustomOpInputOutputCharacteristic GetOutputCharacteristic(size_t /* index */) const noexcept {
    return INPUT_OUTPUT_VARIADIC;
  }

  constexpr bool GetVariadicInputHomogeneity() const noexcept {
    return false;  // heterogeneous
  }

  constexpr bool GetVariadicOutputHomogeneity() const noexcept {
    return false;  // heterogeneous
  }

  std::vector<std::string> GetSessionConfigKeys() const {
    return {"device_type"};
  }

 private:
  std::unordered_map<std::string, std::string> session_configs_;
};
```

#### How to create a session:

```c++
Ort::Env env;
Ort::SessionOptions session_opts;
Ort::CustomOpConfigs custom_op_configs;

// Create local session config entries for the custom op.
custom_op_configs.AddConfig("OpenVINO_Wrapper", "device_type", "CPU");

// Register custom op library and pass in the custom op configs (optional).
session_opts.RegisterCustomOpsLibrary(lib_name, custom_op_configs);

Ort::Session session(env, model_path.data(), session_opts);
```

### Motivation and Context

Allows creation of simple "wrapper" EPs outside of the main ORT code base.

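The trade-off described in the commit message, building expensive state once at kernel creation instead of on every compute call, can be illustrated with a small self-contained sketch. It does not use the ONNX Runtime API: `FakeKernelInfo`, `ExternalModel`, `BuildExternalModel`, `WrapperKernel`, and `DemoBuildsAfterTwoComputeCalls` are all hypothetical stand-ins invented for illustration, and the "expensive" build step is only simulated by a counter.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the metadata a kernel can query at creation time
// (input/output names plus a serialized model stored as a node attribute).
struct FakeKernelInfo {
  std::vector<std::string> input_names;
  std::vector<std::string> output_names;
  std::string serialized_model;
};

// Hypothetical stand-in for a model compiled by an external runtime.
struct ExternalModel {
  std::vector<std::string> input_names;
  std::vector<std::string> output_names;
  std::size_t model_bytes;
};

// Counts how often the (pretend) expensive build step runs.
inline std::size_t& BuildCounter() {
  static std::size_t count = 0;
  return count;
}

// Pretend-expensive step: needs the I/O names and the serialized model.
inline ExternalModel BuildExternalModel(const FakeKernelInfo& info) {
  ++BuildCounter();
  return ExternalModel{info.input_names, info.output_names,
                       info.serialized_model.size()};
}

class WrapperKernel {
 public:
  // Runs once, when the kernel is created (i.e. at session-creation time).
  explicit WrapperKernel(const FakeKernelInfo& info)
      : model_(BuildExternalModel(info)) {}

  // Runs on every inference call and only reuses the prebuilt model.
  // Returns the number of outputs, or 0 if the input count does not match.
  std::size_t Compute(std::size_t num_inputs_provided) const {
    return num_inputs_provided == model_.input_names.size()
               ? model_.output_names.size()
               : 0;
  }

 private:
  ExternalModel model_;
};

// Creates one kernel, calls Compute twice, and reports how many builds ran.
inline std::size_t DemoBuildsAfterTwoComputeCalls() {
  const std::size_t before = BuildCounter();
  WrapperKernel kernel(FakeKernelInfo{{"x", "y"}, {"z"}, "serialized-bytes"});
  kernel.Compute(2);
  kernel.Compute(2);
  return BuildCounter() - before;
}
```

In this sketch the build step runs in the constructor, so repeated `Compute` calls pay no build cost. Moving `BuildExternalModel` into `Compute` would shift that cost from session creation to inference, which is the other side of the trade-off.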
.config	Update tsaoptions.json: update the email alias (#13448 )	2022-10-26 15:56:16 -07:00
.devcontainer	Remove two lines in the Dockerfile for Github Codespace (#12278 )	2022-07-21 20:52:17 -07:00
.gdn
.github	Delete add-issues-to-project (#14147 )	2023-01-11 14:33:37 -08:00
.pipelines	[DML EP] Upgrade DML to 1.10.0 (#13796 )	2022-11-30 21:32:14 -08:00
.vscode	cpplint & Eager mode: refactor and add comments to empty_* functions, general lint cleanup in ort_aten (#12238 )	2022-07-20 11:47:57 -04:00
cgmanifests	[CPU] Resize of Opset 18 (#13890 )	2023-01-14 08:57:23 +10:00
cmake	Custom Op runtime wrapper (#13427 )	2023-01-18 09:09:32 -08:00
csharp	[CPU] Resize of Opset 18 (#13890 )	2023-01-14 08:57:23 +10:00
dockerfiles	Openvino ep 2022.3 v4.3 (#14210 )	2023-01-11 16:31:26 -08:00
docs	[CUDA] Add trt cross attention kernels (#14328 )	2023-01-17 17:55:45 -08:00
include/onnxruntime/core	Custom Op runtime wrapper (#13427 )	2023-01-18 09:09:32 -08:00
java	Add Java and Objective-C bindings for RegisterCustomOpsUsingFunction. (#14256 )	2023-01-13 09:04:26 -08:00
js	add opset18 node test (#14236 )	2023-01-19 00:56:57 +08:00
objectivec	Add Java and Objective-C bindings for RegisterCustomOpsUsingFunction. (#14256 )	2023-01-13 09:04:26 -08:00
onnxruntime	Custom Op runtime wrapper (#13427 )	2023-01-18 09:09:32 -08:00
orttraining	Improved test cases by using paramerters (#14246 )	2023-01-13 12:54:23 -08:00
package/rpm	Bumping up version number to 1.14.0 on main branch (#13401 )	2022-10-21 19:16:44 -04:00
samples	Format all python files under onnxruntime with black and isort (#11324 )	2022-04-26 09:35:16 -07:00
test	Multi-stream execution support (#13495 )	2022-12-15 07:39:29 -08:00
tools	enable ort-extensions in wasm release builds (#14239 )	2023-01-17 12:39:13 -08:00
winml	Enabling thread pool to be numa-aware (#13778 )	2022-12-12 10:33:55 -08:00
.clang-format
.clang-tidy	Create clang-tidy CI (#12653 )	2022-09-30 08:05:38 -07:00
.dockerignore
.flake8	Remove miscellaneous nuphar configs (#13070 )	2022-09-26 13:41:28 -07:00
.gitattributes
.gitignore	Ignore more build directories and clangd files (#14154 )	2023-01-07 06:58:57 +08:00
.gitmodules	Remove unused git submodules (#13830 )	2022-12-07 21:59:16 -08:00
build.amd64.1411.bat
build.bat
build.sh
CITATION.cff
CODEOWNERS	Add cgmanifest file in codeowner list (#13042 )	2022-09-22 18:58:01 -07:00
CONTRIBUTING.md
lgtm.yml	Fix lgtm C++ error (#13613 )	2022-11-10 10:06:22 -08:00
LICENSE
NuGet.config
ort.wprp
ORT_icon_for_light_bg.png
packages.config	[DML EP] Upgrade DML to 1.10.0 (#13796 )	2022-11-30 21:32:14 -08:00
pyproject.toml	Update pylint config to include valid short names (#13631 )	2022-11-14 10:00:25 -08:00
README.md	Update resource section in readme (#13724 )	2022-11-28 09:42:31 -08:00
requirements-dev.txt	Introduce parameterized as a dev dependency (#11364 )	2022-04-26 17:24:39 -07:00
requirements-doc.txt
requirements-training.txt	Remove protobuf pin from training requirements (#13695 )	2022-11-22 12:27:18 -08:00
requirements.txt.in	Add additional python requirements (#11522 )	2022-05-20 16:16:18 -07:00
SECURITY.md	Microsoft mandatory file (#11619 )	2022-05-25 13:56:10 -07:00
setup.py	Openvino ep 2022.3 v4.3 (#14210 )	2023-01-11 16:31:26 -08:00
ThirdPartyNotices.txt	[CPU] Resize of Opset 18 (#13890 )	2023-01-14 08:57:23 +10:00
VERSION_NUMBER	Bumping up version number to 1.14.0 on main branch (#13401 )	2022-10-21 19:16:44 -04:00

README.md

ONNX Runtime is a cross-platform inference and training machine-learning accelerator.

ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, and XGBoost. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and improves performance by using hardware accelerators where applicable alongside graph optimizations and transforms. Learn more →

ONNX Runtime training can reduce training time for transformer models on multi-node NVIDIA GPUs with a one-line addition to existing PyTorch training scripts. Learn more →

Get Started & Resources

General Information: onnxruntime.ai
Usage documentation and tutorials: onnxruntime.ai/docs
YouTube video tutorials: youtube.com/@ONNXRuntime
Upcoming Release Roadmap
Companion sample repositories:
- ONNX Runtime Inferencing: microsoft/onnxruntime-inference-examples
- ONNX Runtime Training: microsoft/onnxruntime-training-examples

Build Pipeline Status

System	CPU	GPU	EPs
Windows
Linux
Mac
Android
iOS
WebAssembly

Data/Telemetry

Windows distributions of this project may collect usage data and send it to Microsoft to help improve our products and services. See the privacy statement for more details.

Contributions and Feedback

We welcome contributions! Please see the contribution guidelines.

For feature requests or bug reports, please file a GitHub Issue.

For general discussion or questions, please use GitHub Discussions.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

This project is licensed under the MIT License.