onnxruntime/docs/onnxruntime_extensions.md

# ONNXRuntime Extensions

ONNXRuntime Extensions is a comprehensive package to extend the capability of the ONNX conversion and inference. Please visit the documentation [onnxruntime-extensions](https://github.com/microsoft/onnxruntime-extensions) to learn more about ONNXRuntime Extensions.

## Custom Operators Supported
onnxruntime-extensions supports many useful custom operators to enhance the text processing capability of ONNXRuntime, which include some widely used **string operators** and popular **tokenizers**. For custom operators supported and how to use them, please check the documentation [custom operators](https://github.com/microsoft/onnxruntime-extensions/blob/main/docs/custom_text_ops.md).

## Build ONNXRuntime with Extensions
We have supported build onnxruntime-extensions as a static library and link it into ONNXRuntime. To enable custom operators from onnxruntime-extensions, you should add argument `--use_extensions`, which will fetch onnxruntime-extensions and build it as static library from https://github.com/microsoft/onnxruntime-extensions **by default**.

If you want to build ONNXRuntime with a pre-pulled onnxruntime-extensions, pass extra argument `--extensions_overridden_path <path-to-onnxruntime-extensions>`.

Note: Please remember to use `--minimal_build custom_ops` when you build minimal runtime with custom operators from onnxruntime-extensions.

### Build with Operators Config
Also, you could pass the **required operators config** file by argument `--include_ops_by_config` to customize the operators you want to build in both onnxruntime and onnxruntime-extensions. Example content of **required_operators.config** are:
```json
# Generated from model/s
# domain;opset;op1,op2...
ai.onnx;12;Add,Cast,Concat,Squeeze
ai.onnx.contrib;1;GPT2Tokenizer,
```

In above operators config, `ai.onnx.contrib` is the domain name of operators in onnxruntime-extensions. We would parse this line to generate required operators in onnxruntime-extensions for build.

### Generate Operators Config
To generate the **required_operators.config** file from model, please follow the guidance [Converting ONNX models to ORT format](https://onnxruntime.ai/docs/reference/ort-format-models.html#convert-onnx-models-to-ort-format).

If your model contains operators from onnxruntime-extensions, please add argument `--custom_op_library` and pass the path to **ortcustomops** shared library built following guidance [share library](https://github.com/microsoft/onnxruntime-extensions#the-share-library-for-non-python).

You could even manually edit the **required_operators.config** if you know the custom operators required and don't want to build the shared library.

### Build and Disable Exceptions
You could add argument `--disable_exceptions` to disable exceptions in both onnxruntime and onnxruntime-extensions.

However, if the custom operators you used in onnxruntime-extensions (such as BlingFireTokenizer) use c++ exceptions, then you will also need to add argument `--enable_wasm_exception_throwing_override` to enable **Emscripten** to link in exception throwing support library. If this argument is not set, Emscripten will throw linking errors.

### Example Build Command
```console
D:\onnxruntime> build.bat --config Release --build_wasm --enable_wasm_threads --enable_wasm_simd --skip_tests --disable_exceptions --disable_wasm_exception_catching --enable_wasm_exception_throwing_override --disable_rtti --use_extensions --parallel --minimal_build custom_ops --include_ops_by_config D:\required_operators.config
```

## E2E Example using Custom Operators
A common NLP task would probably contain several steps, including pre-processing, DL model and post-processing. It would be very efficient and productive to convert the pre/post processing code snippets into ONNX model since ONNX graph is actually a computation graph, and it can represent the most programming code, theoretically.

Here is an E2E NLP example to show the usage of onnxruntime-extensions:
### Create E2E Model
You could use ONNX helper functions to create an ONNX model with custom operators.
```python
import onnx
from onnx import helper

# ...
e2e_nodes = []

# tokenizer node
tokenizer_node = helper.make_node(
    'GPT2Tokenizer', # custom operator supported in onnxruntime-extensions
    inputs=['input_str'],
    outputs=['token_ids', 'attention_mask'],
    vocab=get_file_content(vocab_file),
    merges=get_file_content(merges_file),
    name='gpt2_tokenizer',
    domain='ai.onnx.contrib' # domain of custom operator
)
e2e_nodes.append(tokenizer_node)

# deep learning model
dl_model = onnx.load("dl_model.onnx")
dl_nodes = dl_model.graph.node
e2e_nodes.extend(dl_nodes)

# construct E2E ONNX graph and model
e2e_graph = helper.make_graph(
    e2e_nodes,
    'e2e_graph',
    [input_tensors],
    [output_tensors],
)
# ...
```
For more usage of ONNX helper, please visit the document [Python API Overview](https://github.com/onnx/onnx/blob/main/docs/PythonAPIOverview.md).

### Run E2E Model in Python
```python
import onnxruntime as _ort
from onnxruntime_extensions import get_library_path as _lib_path

so = _ort.SessionOptions()
# register onnxruntime-extensions library
so.register_custom_ops_library(_lib_path())

# run onnxruntime session
sess = _ort.InferenceSession(e2e_model, so)
sess.run(...)
```

### Run E2E Model in JavaScript
To run E2E ONNX model in JavaScript, you need to first [prepare ONNX Runtime WebAssembly artifacts](https://github.com/microsoft/onnxruntime/blob/main/js), include the generated `ort.min.js`, and then load and run the model in JS.
```js
// use an async context to call onnxruntime functions
async function main() {
    try {
        // create a new session and load the e2e model
        const session = await ort.InferenceSession.create('./e2e_model.onnx');

        // prepare inputs
        const tensorA = new ort.Tensor(...);
        const tensorB = new ort.Tensor(...);

        // prepare feeds: use model input names as keys
        const feeds = { a: tensorA, b: tensorB };

        // feed inputs and run
        const results = await session.run(feeds);

        // read from results
        const dataC = results.c.data;
        document.write(`data of result tensor 'c': ${dataC}`);

    } catch (e) {
        document.write(`failed to inference ONNX model: ${e}.`);
    }
}
```
Update submodule onnxruntime-extensions. (#8282) * Update submodule onnxruntime-extensions to latest. * Add document for onnxruntime-extensions. * Update cgmanifest.json for onnxruntime-extensions. * Add example in JavaScript. Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com> 2021-07-13 02:21:11 +00:00			`# ONNXRuntime Extensions`

			`ONNXRuntime Extensions is a comprehensive package to extend the capability of the ONNX conversion and inference. Please visit the documentation [onnxruntime-extensions](https://github.com/microsoft/onnxruntime-extensions) to learn more about ONNXRuntime Extensions.`

			`## Custom Operators Supported`
			`onnxruntime-extensions supports many useful custom operators to enhance the text processing capability of ONNXRuntime, which include some widely used string operators and popular tokenizers. For custom operators supported and how to use them, please check the documentation [custom operators](https://github.com/microsoft/onnxruntime-extensions/blob/main/docs/custom_text_ops.md).`

			`## Build ONNXRuntime with Extensions`
Remove the extensions submodule (#17097) ### Description Remove the onnxruntime-extensions submodule since it now was used via cmake FetchContent ### Motivation and Context The submodule relies on an outdated version of the extensions, and the build instructions should be updated to eliminate any confusion. 2023-08-14 17:16:33 +00:00			We have supported build onnxruntime-extensions as a static library and link it into ONNXRuntime. To enable custom operators from onnxruntime-extensions, you should add argument `--use_extensions`, which will fetch onnxruntime-extensions and build it as static library from https://github.com/microsoft/onnxruntime-extensions by default.
Enable selecting custom ops in onnxruntime-extensions. (#8826) * Enable selecting custom ops in onnxruntime-extensions. * Move cmake_helper.py. * Remove over-indented spaces. * Add doc. * Remove onnxruntime-extensions from git submodules, and user should pass path of onnxruntime-extensions for build. * Modify doc. * Remove argument --enable_onnxruntime_extensions and use --onnxruntime_extensions_path. * Fix build error. * Fix build error. * Use onnxruntime_extensions_path. * support both submodule and external source folders * refinement * Update cgmanifest.json * Support building onnxruntime-extensions from either git submodule or pre-pulled path. * Update doc. * more standard name * update docs * add the copyright header Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com> Co-authored-by: Wenbing Li <wenbingl@outlook.com> Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com> 2021-08-28 04:45:52 +00:00
			If you want to build ONNXRuntime with a pre-pulled onnxruntime-extensions, pass extra argument `--extensions_overridden_path <path-to-onnxruntime-extensions>`.

			Note: Please remember to use `--minimal_build custom_ops` when you build minimal runtime with custom operators from onnxruntime-extensions.

			`### Build with Operators Config`
			Also, you could pass the required operators config file by argument `--include_ops_by_config` to customize the operators you want to build in both onnxruntime and onnxruntime-extensions. Example content of required_operators.config are:
			```json
			`# Generated from model/s`
			`# domain;opset;op1,op2...`
			`ai.onnx;12;Add,Cast,Concat,Squeeze`
			`ai.onnx.contrib;1;GPT2Tokenizer,`
			```

			In above operators config, `ai.onnx.contrib` is the domain name of operators in onnxruntime-extensions. We would parse this line to generate required operators in onnxruntime-extensions for build.

			`### Generate Operators Config`
Fix broken and outdated links in documentation (#14092) ### Description <!-- Describe your changes. --> I fixed some broken links in the C API documentation, but then did a quick pass over all of the links I could find and then fixed those. ### Motivation and Context <!-- - Why is this change required? What problem does it solve? - If it fixes an open issue, please link to the issue here. --> I got some 404's when exploring the documentation and wanted to fix it. 2023-02-23 18:48:04 +00:00			`To generate the required_operators.config file from model, please follow the guidance [Converting ONNX models to ORT format](https://onnxruntime.ai/docs/reference/ort-format-models.html#convert-onnx-models-to-ort-format).`
Enable selecting custom ops in onnxruntime-extensions. (#8826) * Enable selecting custom ops in onnxruntime-extensions. * Move cmake_helper.py. * Remove over-indented spaces. * Add doc. * Remove onnxruntime-extensions from git submodules, and user should pass path of onnxruntime-extensions for build. * Modify doc. * Remove argument --enable_onnxruntime_extensions and use --onnxruntime_extensions_path. * Fix build error. * Fix build error. * Use onnxruntime_extensions_path. * support both submodule and external source folders * refinement * Update cgmanifest.json * Support building onnxruntime-extensions from either git submodule or pre-pulled path. * Update doc. * more standard name * update docs * add the copyright header Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com> Co-authored-by: Wenbing Li <wenbingl@outlook.com> Co-authored-by: Wenbing Li <10278425+wenbingl@users.noreply.github.com> 2021-08-28 04:45:52 +00:00
			If your model contains operators from onnxruntime-extensions, please add argument `--custom_op_library` and pass the path to ortcustomops shared library built following guidance [share library](https://github.com/microsoft/onnxruntime-extensions#the-share-library-for-non-python).

			`You could even manually edit the required_operators.config if you know the custom operators required and don't want to build the shared library.`

			`### Build and Disable Exceptions`
Enable linking in exception throwing support library when build onnxruntime wasm. (#8973) * Enable linking in exception throwing support library when build onnxruntime webassembly containing onnxruntime-extensions. * Add flag in build.py to enable linking exceptions throwing library. * Update onnxruntime-extensions document and bind custom_ops build flag with use_extensions. * Update doc. * Update cgmanifest.json. Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com> 2021-09-10 14:09:16 +00:00			You could add argument `--disable_exceptions` to disable exceptions in both onnxruntime and onnxruntime-extensions.

			However, if the custom operators you used in onnxruntime-extensions (such as BlingFireTokenizer) use c++ exceptions, then you will also need to add argument `--enable_wasm_exception_throwing_override` to enable Emscripten to link in exception throwing support library. If this argument is not set, Emscripten will throw linking errors.

			`### Example Build Command`
			```console
			`D:\onnxruntime> build.bat --config Release --build_wasm --enable_wasm_threads --enable_wasm_simd --skip_tests --disable_exceptions --disable_wasm_exception_catching --enable_wasm_exception_throwing_override --disable_rtti --use_extensions --parallel --minimal_build custom_ops --include_ops_by_config D:\required_operators.config`
			```
Update submodule onnxruntime-extensions. (#8282) * Update submodule onnxruntime-extensions to latest. * Add document for onnxruntime-extensions. * Update cgmanifest.json for onnxruntime-extensions. * Add example in JavaScript. Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com> 2021-07-13 02:21:11 +00:00
			`## E2E Example using Custom Operators`
			`A common NLP task would probably contain several steps, including pre-processing, DL model and post-processing. It would be very efficient and productive to convert the pre/post processing code snippets into ONNX model since ONNX graph is actually a computation graph, and it can represent the most programming code, theoretically.`

			`Here is an E2E NLP example to show the usage of onnxruntime-extensions:`
			`### Create E2E Model`
			`You could use ONNX helper functions to create an ONNX model with custom operators.`
			```python
			`import onnx`
			`from onnx import helper`

			`# ...`
			`e2e_nodes = []`

			`# tokenizer node`
			`tokenizer_node = helper.make_node(`
			`'GPT2Tokenizer', # custom operator supported in onnxruntime-extensions`
			`inputs=['input_str'],`
			`outputs=['token_ids', 'attention_mask'],`
			`vocab=get_file_content(vocab_file),`
			`merges=get_file_content(merges_file),`
			`name='gpt2_tokenizer',`
			`domain='ai.onnx.contrib' # domain of custom operator`
			`)`
			`e2e_nodes.append(tokenizer_node)`

			`# deep learning model`
			`dl_model = onnx.load("dl_model.onnx")`
			`dl_nodes = dl_model.graph.node`
			`e2e_nodes.extend(dl_nodes)`

			`# construct E2E ONNX graph and model`
			`e2e_graph = helper.make_graph(`
			`e2e_nodes,`
			`'e2e_graph',`
			`[input_tensors],`
			`[output_tensors],`
			`)`
			`# ...`
			```
replace 'master' branch ref to 'main' for onnx repo (#12678) 2022-08-30 20:41:42 +00:00			`For more usage of ONNX helper, please visit the document [Python API Overview](https://github.com/onnx/onnx/blob/main/docs/PythonAPIOverview.md).`
Update submodule onnxruntime-extensions. (#8282) * Update submodule onnxruntime-extensions to latest. * Add document for onnxruntime-extensions. * Update cgmanifest.json for onnxruntime-extensions. * Add example in JavaScript. Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com> 2021-07-13 02:21:11 +00:00
			`### Run E2E Model in Python`
			```python
			`import onnxruntime as _ort`
			`from onnxruntime_extensions import get_library_path as _lib_path`

			`so = _ort.SessionOptions()`
			`# register onnxruntime-extensions library`
			`so.register_custom_ops_library(_lib_path())`

			`# run onnxruntime session`
			`sess = _ort.InferenceSession(e2e_model, so)`
			`sess.run(...)`
			```

			`### Run E2E Model in JavaScript`
Replace 'master' branch ref to 'main' in the code (#12547) 2022-08-22 17:48:12 +00:00			To run E2E ONNX model in JavaScript, you need to first [prepare ONNX Runtime WebAssembly artifacts](https://github.com/microsoft/onnxruntime/blob/main/js), include the generated `ort.min.js`, and then load and run the model in JS.
Update submodule onnxruntime-extensions. (#8282) * Update submodule onnxruntime-extensions to latest. * Add document for onnxruntime-extensions. * Update cgmanifest.json for onnxruntime-extensions. * Add example in JavaScript. Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com> 2021-07-13 02:21:11 +00:00			```js
			`// use an async context to call onnxruntime functions`
			`async function main() {`
			`try {`
			`// create a new session and load the e2e model`
			`const session = await ort.InferenceSession.create('./e2e_model.onnx');`

			`// prepare inputs`
			`const tensorA = new ort.Tensor(...);`
			`const tensorB = new ort.Tensor(...);`

			`// prepare feeds: use model input names as keys`
			`const feeds = { a: tensorA, b: tensorB };`

			`// feed inputs and run`
			`const results = await session.run(feeds);`

			`// read from results`
			`const dataC = results.c.data;`
			document.write(`data of result tensor 'c': ${dataC}`);

			`} catch (e) {`
			document.write(`failed to inference ONNX model: ${e}.`);
			`}`
			`}`
Replace 'master' branch ref to 'main' in the code (#12547) 2022-08-22 17:48:12 +00:00			```