mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-30 23:18:20 +00:00
Update conversion script and process to simplify creating ORT format models and a minimal build (#5217)
parent 21a7afb2c6
commit 95b2e31659
4 changed files with 124 additions and 120 deletions
|
|
@ -15,60 +15,28 @@ The directory the ONNX Runtime repository was cloned into is referred to as `<ON
|
|||
|
||||
Once you have cloned the repository, perform the following steps to create a minimal build of ONNX Runtime that is model specific:
|
||||
|
||||
### 1. Create ORT format model
|
||||
### 1. Create ORT format model and configuration file with required operators
|
||||
|
||||
We will use a helper python script to convert an existing ONNX format model into an ORT format model.
|
||||
We will use a helper python script to convert ONNX format models into ORT format models, and to create the configuration file for use with the minimal build.
|
||||
This will require the standard ONNX Runtime python package to be installed.
|
||||
A single model is converted at a time by this script.
|
||||
|
||||
- Install the ONNX Runtime nightly python package from https://test.pypi.org/project/ort-nightly/
|
||||
- e.g. `pip install -i https://test.pypi.org/simple/ ort-nightly`
|
||||
- ensure that any existing ONNX Runtime python package was uninstalled first, or use `-U` with the above command to upgrade an existing package
|
||||
- using the nightly package is temporary until ONNX Runtime version 1.5 is released
|
||||
- Convert the ONNX model to ORT format
|
||||
- `python <ONNX Runtime repository root>/tools/python/convert_onnx_model_to_ort.py <path to .onnx model>`
|
||||
- This script will first optimize the ONNX model and save it with a '.optimized.onnx' file extension
|
||||
- *IMPORTANT* this optimized ONNX model should be used as the input to the minimal build. Do NOT use the original ONNX model for that step.
|
||||
- It will next convert the optimized ONNX model to ORT format and save the file using '.ort' as the file extension.
|
||||
- Copy all the ONNX models you wish to convert and use with the minimal build into a directory
|
||||
- Convert the ONNX models to ORT format
|
||||
- `python <ONNX Runtime repository root>/tools/python/convert_onnx_models_to_ort.py <path to directory containing one or more .onnx models>`
|
||||
- For each ONNX model an ORT format model will be created with '.ort' as the file extension.
|
||||
- A `required_operators.config` configuration file will also be created.
|
||||
|
||||
Example:
|
||||
|
||||
Running `python <ORT repository root>/tools/python/convert_onnx_model_to_ort.py /models/ssd_mobilenet.onnx`
|
||||
- Will create `/models/ssd_mobilenet.optimized.onnx`, which is an ONNX format model that ONNX Runtime has optimized
|
||||
- e.g. constant folding will have run
|
||||
- Will use `/models/ssd_mobilenet.optimized.onnx` to create `/models/ssd_mobilenet.ort`
|
||||
- ssd_mobilenet.ort is the ORT format version of the optimized model.
|
||||
Running `python <ORT repository root>/tools/python/convert_onnx_models_to_ort.py /models` where the '/models' directory contains ModelA.onnx and ModelB.onnx
|
||||
- Will create /models/ModelA.ort and /models/ModelB.ort
|
||||
- Will create /models/required_operators.config
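As a quick check after conversion, the outputs described in the example can be enumerated from Python. This is a minimal sketch: `expected_outputs` is a hypothetical helper, not part of the ONNX Runtime tools, and it only computes the file names the example describes rather than running the converter.

```python
import os
import tempfile


def expected_outputs(model_dir):
    """Return (.ort paths, config path) that the conversion step is described as producing.

    Hypothetical helper for illustration; it does not invoke ONNX Runtime.
    """
    onnx_files = sorted(f for f in os.listdir(model_dir) if f.endswith('.onnx'))
    ort_files = [os.path.join(model_dir, os.path.splitext(f)[0] + '.ort') for f in onnx_files]
    config_file = os.path.join(model_dir, 'required_operators.config')
    return ort_files, config_file


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as model_dir:
        # Empty placeholder files standing in for real ONNX models.
        for name in ('ModelA.onnx', 'ModelB.onnx'):
            open(os.path.join(model_dir, name), 'w').close()
        ort_files, config_file = expected_outputs(model_dir)
        print([os.path.basename(p) for p in ort_files])
        print(os.path.basename(config_file))
```

Comparing this list against the directory contents after running the script is one way to confirm that every model was converted.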
|
||||
|
||||
|
||||
### 2. Setup information to reduce build to minimum set of operator kernels required
|
||||
|
||||
In order to reduce the operator kernels included in the build, the required set must be either inferred from one or more ONNX models, or explicitly specified via configuration.
|
||||
|
||||
To infer, put one or more optimized ONNX models in a directory. The directory will be recursively searched for '.onnx' files.
|
||||
If taking this approach (vs. creating a configuration file), you should only include the optimized ONNX models and not both the original and optimized models, as there may be kernels that were required in the original model that are not required in the optimized model.
|
||||
|
||||
Alternatively a configuration file can be created to specify the set of kernels to include.
|
||||
|
||||
See the documentation on the [Reduced Operator Kernel build](Reduced_Operator_Kernel_build.md) for more information.
|
||||
|
||||
This step can be run prior to building, or as part of the minimal build.
|
||||
|
||||
#### Example usage:
|
||||
|
||||
##### Pre-build
|
||||
|
||||
Place the optimized ONNX model/s (files with '.optimized.onnx' from the 'Create ORT format model' step above) in a directory.
|
||||
|
||||
Run the script to exclude unused kernels using this directory.
|
||||
|
||||
`python <ONNX Runtime repository root>/tools/ci_build/exclude_unused_ops.py --model_path <directory with optimized ONNX model/s>`
|
||||
|
||||
##### When building
|
||||
|
||||
When building as per the below instructions, add `--include_ops_by_model <directory with optimized ONNX model/s>` to the build command.
|
||||
|
||||
|
||||
### 3. Create the minimal build
|
||||
### 2. Create the minimal build
|
||||
|
||||
You will need to build ONNX Runtime from source to reduce the included operator kernels and other aspects of the binary.
|
||||
|
||||
|
|
@ -76,7 +44,10 @@ See [here](https://github.com/microsoft/onnxruntime/blob/master/BUILD.md#start-b
|
|||
|
||||
#### Binary size reduction options:
|
||||
|
||||
The following options can be used to reduce the build size. Enable all options that your scenario allows.
|
||||
The following options can be used to reduce the build size. Enable all options that your scenario allows.
|
||||
- Reduce build to required operator kernels
|
||||
- Add `--include_ops_by_config <config file produced by step 1>` to the build parameters.
|
||||
- See the documentation on the [Reduced Operator Kernel build](Reduced_Operator_Kernel_build.md) for more information. This step can also be done pre-build if needed.
|
||||
|
||||
- Enable minimal build (`--minimal_build`)
|
||||
- A minimal build will ONLY support loading and executing ORT format models.
|
||||
|
|
@ -106,11 +77,11 @@ The `Release` configuration could also be used if you wish to prioritize perform
|
|||
|
||||
##### Windows
|
||||
|
||||
`<ONNX Runtime repository root>\build.bat --config=MinSizeRel --cmake_generator="Visual Studio 16 2019" --build_shared_lib --minimal_build --disable_ml_ops --disable_exceptions`
|
||||
`<ONNX Runtime repository root>\build.bat --config=MinSizeRel --cmake_generator="Visual Studio 16 2019" --build_shared_lib --minimal_build --disable_ml_ops --disable_exceptions --include_ops_by_config <config file produced by step 1>`
|
||||
|
||||
##### Linux
|
||||
|
||||
`<ONNX Runtime repository root>/build.sh --config=MinSizeRel --build_shared_lib --minimal_build --disable_ml_ops --disable_exceptions`
|
||||
`<ONNX Runtime repository root>/build.sh --config=MinSizeRel --build_shared_lib --minimal_build --disable_ml_ops --disable_exceptions --include_ops_by_config <config file produced by step 1>`
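When the build is driven from a script, the flags shown in the commands above can be assembled programmatically. The following is a sketch only: `minimal_build_flags` is a hypothetical helper, and the flags are copied verbatim from the Linux command line rather than taken from any build API.

```python
def minimal_build_flags(ops_config_path):
    """Return the argument list for a model-specific minimal build.

    Hypothetical helper; the flags mirror the documented Linux command.
    """
    return [
        '--config=MinSizeRel',
        '--build_shared_lib',
        '--minimal_build',
        '--disable_ml_ops',
        '--disable_exceptions',
        '--include_ops_by_config', ops_config_path,
    ]


if __name__ == '__main__':
    # Assumed locations, for illustration only.
    cmd = ['./build.sh'] + minimal_build_flags('/models/required_operators.config')
    print(' '.join(cmd))
    # To actually run the build from the repository root:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Keeping the configuration file path as the only parameter reflects that it is the single model-specific input to the build.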
|
||||
|
||||
##### Building ONNX Runtime Python Wheel as part of Minimal build
|
||||
|
||||
|
|
|
|||
|
|
@ -337,11 +337,23 @@ def _create_config_file_with_required_ops(required_operators, model_path, config
|
|||
log.info("Wrote set of required operators to {}".format(output_file))
|
||||
|
||||
|
||||
def exclude_unused_ops(models_path, config_path, ort_root=None, use_cuda=True):
    "Note that this is called directly from build.py"
|
||||
def exclude_unused_ops(models_path, config_path, ort_root=None, use_cuda=True, output_config_path=None):
|
||||
    '''Determine the operators that are used, and either exclude the unused ones or create a configuration file that can be used to do so.
    Note that this is called directly from build.py'''
|
||||
|
||||
required_operators = _extract_ops_from_config(config_path, _extract_ops_from_model(models_path, {}))
|
||||
_exclude_unused_ops_in_providers(required_operators, _get_provider_paths(ort_root, use_cuda))
|
||||
if not models_path and not config_path:
|
||||
log.error('Please specify model_path and/or config_path.')
|
||||
sys.exit(-1)
|
||||
|
||||
if not ort_root:
|
||||
log.info('ort_root was not specified. Inferring ONNX Runtime repository root from location of this script.')
|
||||
|
||||
required_ops = _extract_ops_from_config(config_path, _extract_ops_from_model(models_path, {}))
|
||||
|
||||
if not output_config_path:
|
||||
_exclude_unused_ops_in_providers(required_ops, _get_provider_paths(ort_root, use_cuda))
|
||||
else:
|
||||
_create_config_file_with_required_ops(required_ops, models_path, config_path, output_config_path)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
|
|
@ -379,11 +391,5 @@ if __name__ == "__main__":
|
|||
parser.print_help()
|
||||
sys.exit(-1)
|
||||
|
||||
if not ort_root:
|
||||
log.info('ort_root was not specified. Inferring ORT root from location of this script.')
|
||||
|
||||
if not args.write_combined_config_to:
|
||||
exclude_unused_ops(models_path, config_path, ort_root, use_cuda=True)
|
||||
else:
|
||||
required_ops = _extract_ops_from_config(config_path, _extract_ops_from_model(models_path, {}))
|
||||
_create_config_file_with_required_ops(required_ops, models_path, config_path, args.write_combined_config_to)
|
||||
exclude_unused_ops(models_path, config_path, ort_root, use_cuda=True,
|
||||
output_config_path=args.write_combined_config_to)
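The effect of the new `output_config_path` parameter can be pictured as a two-way branch: with no output path the provider sources are modified, otherwise a configuration file is written and the sources are left alone. The sketch below illustrates only that control flow. `dispatch_required_ops` and the two callables are hypothetical stand-ins, and the dictionary used for `required_ops` is a placeholder, not the structure the real script uses.

```python
def dispatch_required_ops(required_ops, output_config_path, apply_to_sources, write_config):
    """Mirror the branch at the end of exclude_unused_ops (illustrative stand-in)."""
    if not output_config_path:
        # No output file requested: exclude unused kernels in the provider sources.
        apply_to_sources(required_ops)
        return 'sources updated'
    # Output file requested: record the required operators for a later build.
    write_config(required_ops, output_config_path)
    return 'config written'


if __name__ == '__main__':
    calls = []
    ops = {'ai.onnx': ['Add', 'Conv']}  # placeholder shape, assumed for the demo

    def fake_apply(required):
        calls.append(('apply', required))

    def fake_write(required, path):
        calls.append(('write', path))

    print(dispatch_required_ops(ops, None, fake_apply, fake_write))
    print(dispatch_required_ops(ops, 'required_operators.config', fake_apply, fake_write))
    print(calls)
```

Moving this branch into the function is what lets the conversion script import `exclude_unused_ops` and request a configuration file without touching the source tree.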
|
||||
|
|
|
|||
|
|
@@ -1,62 +0,0 @@
#!/usr/bin/env python
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.

import argparse
import onnxruntime as ort
import os
import re


def convert(model: str):

    if not model.endswith('.onnx'):
        raise ValueError("Model filename must end in .onnx.")

    onnx_target_path = re.sub(r'\.onnx$', '.optimized.onnx', model)
    ort_target_path = re.sub(r'\.onnx$', '.ort', model)

    so = ort.SessionOptions()
    so.optimized_model_filepath = onnx_target_path
    so.add_session_config_entry('session.save_model_format', 'ONNX')
    so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_EXTENDED  # Skip NCHWc optimizations

    print("Optimizing ONNX model {} and saving in ONNX format to {}".format(model, onnx_target_path))
    # creating the session will result in the optimized model being saved
    _ = ort.InferenceSession(model, sess_options=so)

    # Second, convert optimized ONNX model to ORT format
    so.optimized_model_filepath = ort_target_path
    so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL  # Convert model as-is so we don't change the kernels in this step # noqa

    so.add_session_config_entry('session.save_model_format', 'ORT')

    print("Converting optimized ONNX model {} to ORT format model {}".format(onnx_target_path, ort_target_path))
    _ = ort.InferenceSession(onnx_target_path, sess_options=so)

    orig_size = os.path.getsize(onnx_target_path)
    new_size = os.path.getsize(ort_target_path)
    print("Serialized {} to {}. Sizes: orig={} new={} diff={} new:old={:.4f}:1.0".format(
        onnx_target_path, ort_target_path, orig_size, new_size, new_size - orig_size, new_size / orig_size))


def parse_args():
    parser = argparse.ArgumentParser(os.path.basename(__file__),
                                     description='''Convert an onnx model -> optimized onnx model -> ORT format model.
                                     Expects a .onnx file as input. Optimized onnx model will be saved in the same
                                     directory with an extension of .optimized.onnx.
                                     An ORT format model will be created from the optimized onnx model.
                                     The optimized onnx model should be used as input to a minimal build so that
                                     any post-optimization kernels are included in the build.'''
                                     )
    parser.add_argument('model', help='Provide path to ONNX model to convert. Must have .onnx extension.')
    return parser.parse_args()


def main():
    args = parse_args()
    convert(args.model)


if __name__ == '__main__':
    main()
89
tools/python/convert_onnx_models_to_ort.py
Normal file
@@ -0,0 +1,89 @@
#!/usr/bin/env python
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.

import argparse
import glob
import os
import re
import sys
import tempfile

import onnxruntime as ort


def create_config_file(optimized_model_path, config_file_path):
    script_path = os.path.dirname(os.path.realpath(__file__))
    ci_build_py_path = os.path.abspath(os.path.join(script_path, '..', 'ci_build'))
    sys.path.append(ci_build_py_path)

    # create config file from all the optimized models
    print("Creating configuration file for operators required by optimized models in {}".format(config_file_path))
    from exclude_unused_ops import exclude_unused_ops  # tools/ci_build/exclude_unused_ops.py
    exclude_unused_ops(optimized_model_path, config_path=None, ort_root=None, output_config_path=config_file_path)


def convert(model_path: str):
    models = glob.glob(os.path.join(model_path, '*.onnx'))

    if len(models) == 0:
        raise ValueError("No .onnx files were found in " + model_path)

    # create temp directory to create optimized onnx format models in. currently we need this to create the
    # config file with required operators. long term we could potentially do this from the ORT format model,
    # however that requires a lot of infrastructure to be able to parse the flatbuffers schema for those files
    with tempfile.TemporaryDirectory() as tmpdirname:
        for model in models:
            model_filename = os.path.basename(model)
            # create .optimized.onnx file in temp dir
            onnx_target_path = os.path.join(tmpdirname, re.sub(r'\.onnx$', '.optimized.onnx', model_filename))
            # create .ort file in same dir as original onnx model
            ort_target_path = re.sub(r'\.onnx$', '.ort', model)

            so = ort.SessionOptions()
            so.optimized_model_filepath = onnx_target_path
            so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_EXTENDED  # Skip NCHWc optimizations

            print("Optimizing ONNX model {}".format(model))
            # creating the session will result in the optimized model being saved
            _ = ort.InferenceSession(model, sess_options=so)

            # Second, convert optimized ONNX model to ORT format
            so.optimized_model_filepath = ort_target_path
            so.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL  # Convert model as-is so we don't change the kernels in this step # noqa
            so.add_session_config_entry('session.save_model_format', 'ORT')

            print("Converting optimized ONNX model to ORT format model {}".format(ort_target_path))
            _ = ort.InferenceSession(onnx_target_path, sess_options=so)

            # orig_size = os.path.getsize(onnx_target_path)
            # new_size = os.path.getsize(ort_target_path)
            # print("Serialized {} to {}. Sizes: orig={} new={} diff={} new:old={:.4f}:1.0".format(
            #     onnx_target_path, ort_target_path, orig_size, new_size, new_size - orig_size, new_size / orig_size))

        # now that all models are converted create the config file before the temp dir is deleted
        create_config_file(tmpdirname, os.path.join(model_path, 'required_operators.config'))


def parse_args():
    parser = argparse.ArgumentParser(
        os.path.basename(__file__),
        description='''Convert the ONNX format model/s in the provided directory to ORT format models.
        All files with a `.onnx` extension will be processed. For each one, an ORT format model will be created in the
        same directory. A configuration file will also be created called `required_operators.config`, and will contain
        the list of required operators for all converted models.
        This configuration file should be used as input to the minimal build'''
    )

    parser.add_argument('model_path', help='Provide path to directory containing ONNX model/s to convert. '
                                           'Files with .onnx extension will be processed.')
    return parser.parse_args()


def main():
    args = parse_args()
    convert(args.model_path)


if __name__ == '__main__':
    main()
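The conversion script derives both output names from the input name by substituting the `.onnx` suffix. A small sketch of that derivation follows, written with the dot escaped so that it matches only a literal `.onnx` ending. `target_paths` is a hypothetical helper introduced for illustration; it does not exist in the script.

```python
import os
import re


def target_paths(model, tmp_dir):
    """Return (optimized ONNX path in tmp_dir, ORT path next to the original model)."""
    model_filename = os.path.basename(model)
    onnx_target_path = os.path.join(tmp_dir, re.sub(r'\.onnx$', '.optimized.onnx', model_filename))
    ort_target_path = re.sub(r'\.onnx$', '.ort', model)
    return onnx_target_path, ort_target_path


if __name__ == '__main__':
    print(target_paths(os.path.join('models', 'ModelA.onnx'), 'tmp'))
    # An unescaped '.' matches any character, so it would also rewrite '_onnx'.
    print(re.sub('.onnx$', '.ort', 'net_onnx'))
    print(re.sub(r'\.onnx$', '.ort', 'net_onnx'))
```

Because the script only processes files found with a `*.onnx` glob, the unescaped form is unlikely to misfire in practice; the escaped pattern simply states the intent exactly.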