[Python] Move symbolic_shape_infer from nuphar to tools (#5162)

* [Python] Move symbolic shape inference from nuphar to tools

* Fix PEP8 ERROR
This commit is contained in:
KeDengMS 2020-09-18 09:31:06 -07:00 committed by GitHub
parent f7c1e51810
commit ce3b67e0cd
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
16 changed files with 59 additions and 44 deletions

View file

@ -148,7 +148,7 @@ sess = onnxruntime.InferenceSession(model_path)
To save runtime JIT cost, Nuphar requires models to have shape inference information from ONNX after model is loaded. Some nodes in ONNX can generate dynamic output tensor shapes from input data value, i.e. ConstantOfShape, Tile, Slice in opset 10, Compress, etc. Those ops may block ONNX shape inference and make the part of graph after such nodes not runnable in Nuphar.
User may use Python script [symbolic_shape_infer.py](../../onnxruntime/core/providers/nuphar/scripts/symbolic_shape_infer.py) to run symbolic shape inference in ONNX model. This script adds output tensor shapes in the model in graph.value_info field, by doing symbolic dimension computation using sympy when there are Shape ops in model. Besides, running symbolic shape inference on ONNX model would make the graph more readable. Note that when using [model_editor.py](../../onnxruntime/core/providers/nuphar/scripts/model_editor.py) to convert models with LSTM/GRU/RNN to Scan, the resulting model may have incomplete shape inference. Running symbolic_shape_infer.py is needed to get the Scan ops in the model to run in Nuphar. Besides, please note that quantization should be the last step, after verified accuracy and performance of the edited floating point model.
User may use Python script [symbolic_shape_infer.py](../../onnxruntime/python/tools/symbolic_shape_infer.py) to run symbolic shape inference in ONNX model. This script adds output tensor shapes in the model in graph.value_info field, by doing symbolic dimension computation using sympy when there are Shape ops in model. Besides, running symbolic shape inference on ONNX model would make the graph more readable. Note that when using [model_editor.py](../../onnxruntime/core/providers/nuphar/scripts/model_editor.py) to convert models with LSTM/GRU/RNN to Scan, the resulting model may have incomplete shape inference. Running symbolic_shape_infer.py is needed to get the Scan ops in the model to run in Nuphar. Besides, please note that quantization should be the last step, after verified accuracy and performance of the edited floating point model.
In addition, user may also manually add shapes to graph.value_info using [onnx.helper.make_tensor_value_info](https://github.com/onnx/onnx/blob/v1.5.0/onnx/helper.py#L290) with model specific knowledge. For example, if you have Hardmax output casted to bool as Compress input condition, then the unknown dimension of the output of Compress is actually 1.

View file

@ -28,14 +28,14 @@ status = session_object.Load(model_file_name);
The C API details are [here](../C_API.md#c-api).
#### Shape Inference for TensorRT Subgraphs
If some operators in the model are not supported by TensorRT, ONNX Runtime will partition the graph and only send supported subgraphs to TensorRT execution provider. Because TensorRT requires that all inputs of the subgraphs have shape specified, ONNX Runtime will throw error if there is no input shape info. In this case please run shape inference for the entire model first by running script [here](https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/providers/nuphar/scripts/symbolic_shape_infer.py).
If some operators in the model are not supported by TensorRT, ONNX Runtime will partition the graph and only send supported subgraphs to TensorRT execution provider. Because TensorRT requires that all inputs of the subgraphs have shape specified, ONNX Runtime will throw error if there is no input shape info. In this case please run shape inference for the entire model first by running script [here](https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/python/tools/symbolic_shape_infer.py).
#### Sample
This example shows how to run Faster R-CNN model on TensorRT execution provider,
First, download Faster R-CNN onnx model from onnx model zoo [here](https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/faster-rcnn).
Second, infer shapes in the model by running shape inference script [here](https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/providers/nuphar/scripts/symbolic_shape_infer.py),
Second, infer shapes in the model by running shape inference script [here](https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/python/tools/symbolic_shape_infer.py),
```
python symbolic_shape_infer.py --input /path/to/onnx/model/model.onnx --output /path/to/onnx/model/new_model.onnx --auto_merge
```

View file

@ -101,7 +101,7 @@
"from onnxruntime.nuphar.model_editor import convert_to_scan_model\n",
"from onnxruntime.nuphar.model_quantizer import convert_matmul_model\n",
"from onnxruntime.nuphar.rnn_benchmark import generate_model, perf_test\n",
"from onnxruntime.nuphar.symbolic_shape_infer import SymbolicShapeInference"
"from onnxruntime.tools.symbolic_shape_infer import SymbolicShapeInference"
]
},
{
@ -122,6 +122,7 @@
"opset = model.opset_import.add()\n",
"opset.domain == 'onnx'\n",
"opset.version = 7 # ONNX opset 7 is required for LSTM op later\n",
"model.ir_version = onnx.IR_VERSION\n",
"\n",
"graph = model.graph\n",
"X = 'input'\n",

View file

@ -14,8 +14,4 @@ Quantize MatMul in model dynamically wrt. input
* rnn_benchmark.py
Benchmark for LSTM/GRU/RNN with model_editor and model_quantizer to show Nuphar's speed up for those models
* symbolic_shape_infer.py
Run symbolic shape inference with sympy. Nuphar relies on shape inference to run efficiently.
Benchmark for LSTM/GRU/RNN with model_editor and model_quantizer to show Nuphar's speed up for those models

View file

@ -7,7 +7,7 @@ from enum import Enum
import numpy as np
import onnx
from .node_factory import NodeFactory, ensure_opset
from .symbolic_shape_infer import SymbolicShapeInference, get_shape_from_type_proto
from ..tools.symbolic_shape_infer import SymbolicShapeInference, get_shape_from_type_proto
# trim outputs of LSTM/GRU/RNN if not used or outputed
def trim_unused_outputs(node, graph):

View file

@ -9,7 +9,7 @@ import numpy as np
import onnx
from onnx import helper, numpy_helper
from .node_factory import NodeFactory, ensure_opset
from .symbolic_shape_infer import SymbolicShapeInference
from ..tools.symbolic_shape_infer import SymbolicShapeInference
class QuantizeConfig:
def __init__(self, signed, reserved_bits, type_bits):

View file

@ -116,7 +116,7 @@ def perf_test(rnn_type, num_threads, input_dim, hidden_dim, bidirectional, layer
with ScopedSetNumThreads(num_threads) as scoped_set_num_threads:
# run Scan model converted from original in Nuphar
from .model_editor import convert_to_scan_model
from .symbolic_shape_infer import SymbolicShapeInference
from ..tools.symbolic_shape_infer import SymbolicShapeInference
scan_model_name = os.path.splitext(model_name)[0] + '_scan.onnx'
convert_to_scan_model(model_name, scan_model_name)
# note that symbolic shape inference is needed because model has symbolic batch dim, thus init_state is ConstantOfShape

View file

@ -48,7 +48,7 @@ def as_scalar(x):
assert len(x) == 1
return x[0]
elif type(x) == np.ndarray:
return np.asscalar(x)
return x.item()
else:
return x
@ -328,20 +328,17 @@ class SymbolicShapeInference:
self.symbolic_dims_[str(new_dim)] = new_dim
def _onnx_infer_single_node(self, node):
# skip onnx shape inference for Scan/Loop
skip_infer = node.op_type in ['Scan', 'Loop']
# skip onnx shape inference for some ops, as they are handled in _infer_*
skip_infer = node.op_type in ['If', 'Loop', 'Scan', 'SplitToSequence', 'ZipMap']
if not skip_infer:
# run single node inference with self.known_vi_ shapes
# note that inference rely on initializer values is not handled
# as we don't copy initializer weights to tmp_graph for inference speed purpose
if node.op_type == 'SplitToSequence':
make_value_info_func = helper.make_sequence_value_info
else:
make_value_info_func = helper.make_tensor_value_info
tmp_graph = helper.make_graph([node],
'tmp',
[self.known_vi_[i] for i in node.input if i],
[make_value_info_func(i, onnx.TensorProto.UNDEFINED, None) for i in node.output])
[helper.make_tensor_value_info(i, onnx.TensorProto.UNDEFINED, None) for i in node.output])
self.tmp_mp_.graph.CopyFrom(tmp_graph)
self.tmp_mp_ = shape_inference.infer_shapes(self.tmp_mp_)
for i_o in range(len(node.output)):
@ -349,6 +346,8 @@ class SymbolicShapeInference:
vi = self.out_mp_.graph.value_info.add()
if not skip_infer:
vi.CopyFrom(self.tmp_mp_.graph.output[i_o])
else:
vi.name = o
self.known_vi_[o] = vi
def _onnx_infer_subgraph(self, node, subgraph, use_node_input=True):
@ -407,7 +406,7 @@ class SymbolicShapeInference:
if len(v.shape) > 1:
new_v = None # ignore value for rank > 1
elif len(v.shape) == 0:
new_v = int(np.asscalar(v))
new_v = int(v.item())
else:
assert len(v.shape) == 1
new_v = [int(vv) for vv in v]

View file

@ -154,7 +154,5 @@ The output of running benchmark:
```
## Dependencies
- This test script uses following script to infer shape in the model for TensorRT execution provider.
https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/providers/nuphar/scripts/symbolic_shape_infer.py
- When inferencing model using CUDA float16, this script following script to convert nodes in model graph from float32 to float16. It also modifies the converting script a little bit to better cover more model graph conversion.
https://github.com/microsoft/onnxconverter-common/blob/master/onnxconverter_common/float16.py

View file

@ -675,7 +675,7 @@ def create_session(model_path, providers, session_options):
new_model_path = model_path[:].replace(".onnx", "_new.onnx")
if not os.path.exists(new_model_path):
subprocess.run("python3 ../symbolic_shape_infer.py --input " + model_path + " --output " + new_model_path + " --auto_merge", shell=True, check=True)
subprocess.run("python3 -m onnxruntime.tools.symbolic_shape_infer --input " + model_path + " --output " + new_model_path + " --auto_merge", shell=True, check=True)
session = onnxruntime.InferenceSession(new_model_path, providers=providers, sess_options=session_options)
return session
except Exception as e:

View file

@ -363,7 +363,7 @@ class TestNuphar(unittest.TestCase):
# run onnx_test_runner to verify results
# use -M to disable memory pattern
onnx_test_runner = os.path.join(cwd, 'onnx_test_runner')
subprocess.run([onnx_test_runner, '-e', 'nuphar', '-M', '-n', 'bidaf', cwd], check=True, cwd=cwd)
subprocess.run([onnx_test_runner, '-e', 'nuphar', '-M', '-c', '1', '-j', '1', '-n', 'bidaf', cwd], check=True, cwd=cwd)
# test AOT on the quantized model
if os.name not in ['nt', 'posix']:
@ -426,7 +426,7 @@ class TestNuphar(unittest.TestCase):
bert_squad_dir = os.path.join(cwd, 'download_sample_10')
bert_squad_model = os.path.join(bert_squad_dir, 'bertsquad10.onnx')
subprocess.run([
sys.executable, '-m', 'onnxruntime.nuphar.symbolic_shape_infer', '--input', bert_squad_model, '--output',
sys.executable, '-m', 'onnxruntime.tools.symbolic_shape_infer', '--input', bert_squad_model, '--output',
bert_squad_model, '--auto_merge', '--int_max=1000000'
],
check=True,
@ -667,22 +667,9 @@ class TestNuphar(unittest.TestCase):
sess = onnxrt.InferenceSession(matmul_model_name)
actual_y = sess.run([], test_inputs)
assert np.allclose(expected_y, actual_y)
assert np.allclose(expected_y, actual_y, atol=1e-7)
print("finished " + matmul_model_name)
def test_symbolic_shape_infer(self):
cwd = os.getcwd()
test_model_dir = os.path.join(cwd, '..', 'models')
for filename in Path(test_model_dir).rglob('*.onnx'):
if filename.name.startswith('.'):
continue # skip some bad model files
subprocess.run([
sys.executable, '-m', 'onnxruntime.nuphar.symbolic_shape_infer', '--input',
str(filename), '--auto_merge', '--int_max=100000', '--guess_output_rank'
],
check=True,
cwd=cwd)
if __name__ == '__main__':
unittest.main()

View file

@ -0,0 +1,27 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.
# -*- coding: UTF-8 -*-
import unittest
import os
from onnxruntime.tools.symbolic_shape_infer import SymbolicShapeInference
import sys
from pathlib import Path
class TestSymbolicShapeInference(unittest.TestCase):
def test_symbolic_shape_infer(self):
cwd = os.getcwd()
test_model_dir = os.path.join(cwd, '..', 'models')
for filename in Path(test_model_dir).rglob('*.onnx'):
if filename.name.startswith('.'):
continue # skip some bad model files
print("Running symbolic shape inference on : " + str(filename))
SymbolicShapeInference.infer_shapes(
input_model=str(filename),
output_model=None,
auto_merge=True,
int_max=100000,
guess_output_rank=True)
if __name__ == '__main__':
unittest.main()

View file

@ -111,6 +111,10 @@ def parse_arguments():
parser.add_argument("--path_to_protoc_exe", help="Path to protoc exe.")
parser.add_argument(
"--fuzz_testing", action='store_true', help="Enable Fuzz testing of the onnxruntime.")
parser.add_argument(
"--enable_symbolic_shape_infer_tests", action='store_true',
help="""When running the Test phase, run symbolic shape inference against
available test data directories.""")
# generate documentaiton
parser.add_argument(
@ -1206,6 +1210,10 @@ def run_onnxruntime_tests(args, source_dir, ctest_path, build_dir, configs):
run_subprocess([sys.executable, 'onnxruntime_test_python.py'], cwd=cwd, dll_path=dll_path)
if args.enable_symbolic_shape_infer_tests:
run_subprocess([sys.executable, 'onnxruntime_test_python_symbolic_shape_infer.py'],
cwd=cwd, dll_path=dll_path)
# For CUDA enabled builds test IOBinding feature
# Limit testing to Windows non-ARM builds for now
iobinding_test = False

View file

@ -38,6 +38,7 @@ jobs:
--build_wheel \
--use_openmp \
--enable_onnx_tests \
--enable_symbolic_shape_infer_tests \
--use_mklml --enable_pybind --build_java --build_nodejs \
--cmake_extra_defines PYTHON_INCLUDE_DIR=/opt/python/cp37-cp37m/include/python3.7m PYTHON_LIBRARY=/usr/lib64/librt.so
workingDirectory: $(Build.SourcesDirectory)

View file

@ -105,10 +105,7 @@ mv /tmp/src/gradle-6.3 /usr/local/gradle
#Don't update 'wheel' to the latest version. see: https://github.com/pypa/auditwheel/issues/102
${PYTHON_EXE} -m pip install -r ${0/%install_deps\.sh/requirements\.txt}
if [ $DEVICE_TYPE = "Normal" ]; then
${PYTHON_EXE} -m pip install sympy==1.1.1
elif [ $DEVICE_TYPE = "gpu" ]; then
${PYTHON_EXE} -m pip install sympy==1.1.1
if [ $DEVICE_TYPE = "gpu" ]; then
if [[ $BUILD_EXTR_PAR = *--enable_training* ]]; then
${PYTHON_EXE} -m pip install --upgrade --pre torch==1.6.0.dev20200610 torchvision==0.7.0.dev20200610 torchtext==0.6.0.dev20200610 -f https://download.pytorch.org/whl/nightly/cu101/torch_nightly.html
${PYTHON_EXE} -m pip install transformers==v2.10.0

View file

@ -6,3 +6,4 @@ setuptools>=41.4.0
wheel
onnx==1.7.0
argparse
sympy==1.1.1