onnxruntime/orttraining/tools/scripts/performance_investigation.py
Justin Chu d834ec895a
Adopt lintrunner as the linting tool - take 2 (#15085)
### Description

`lintrunner` is a linter runner successfully used by pytorch, onnx and
onnx-script. It provides a uniform experience running linters locally
and in CI. It supports all major dev systems: Windows, Linux and macOS.
The checks are enforced by the `Python format` workflow.

This PR adopts `lintrunner` in onnxruntime and fixes ~2000 flake8 errors
in Python code. `lintrunner` now runs all required Python lints,
including `ruff` (replacing `flake8`), `black` and `isort`. Future lints
like `clang-format` can be added.

Most errors are auto-fixed by `ruff` and the fixes should be considered
robust.

Lints that are more complicated to fix are suppressed with `# noqa` for
now and should be fixed in follow-up PRs.

### Notable changes

1. This PR **removed some suboptimal patterns**:

	- `not xxx in` -> `xxx not in` membership checks
	- bare excepts (`except:` -> `except Exception`)
	- unused imports
	
	The follow-up PR will remove:
	
	- `import *`
	- mutable default values in function definitions (`def func(a=[])`)
	- more unused imports
	- unused local variables

2. Use `ruff` to replace `flake8`. `ruff` is much (40x) faster than
flake8 and is more robust. We are using it successfully in onnx and
onnx-script. It also supports auto-fixing many flake8 errors.

3. Removed the legacy flake8 CI flow and updated docs.

4. The added workflow supports SARIF code scanning reports on GitHub.
Example snapshot:
	

![image](https://user-images.githubusercontent.com/11205048/212598953-d60ce8a9-f242-4fa8-8674-8696b704604a.png)

5. Removed `onnxruntime-python-checks-ci-pipeline` as redundant
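
The pattern changes listed under item 1 can be illustrated with a minimal sketch. The helper names below are hypothetical and are not taken from onnxruntime. Each function shows the preferred form, with the replaced pattern noted in a comment along with the lint rule that usually reports it.

```python
# Illustrative sketch only: these helpers are hypothetical, not onnxruntime code.


def is_missing(key, table):
    # Preferred membership test. The replaced form was `not key in table`,
    # which is typically reported as E713.
    return key not in table


def parse_int(text, default=0):
    # `except Exception` instead of a bare `except:` (typically E722), so that
    # KeyboardInterrupt and SystemExit still propagate to the caller.
    try:
        return int(text)
    except Exception:
        return default


def append_item(item, items=None):
    # A mutable default such as `items=[]` (typically B006) is created once and
    # shared across calls; using None and allocating inside avoids that.
    if items is None:
        items = []
    items.append(item)
    return items
```

With the `None` default, two consecutive calls to `append_item` without a second argument each return a fresh single-element list, whereas with `items=[]` the second call would also contain the first call's item.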

### Motivation and Context

Unify the linting experience in CI and locally.

Replacing https://github.com/microsoft/onnxruntime/pull/14306

---------

Signed-off-by: Justin Chu <justinchu@microsoft.com>
2023-03-24 15:29:03 -07:00

90 lines
2.8 KiB
Python

import argparse

import onnx


def process_file(onnx_file):
    """Report ATen, PythonOp, Memcpy and Cast nodes, plus Dropout/Softmax nodes that follow an Add."""
    model = onnx.load(onnx_file)

    # Map from output arg to the producer of the output.
    output_to_node = {}
    for node in model.graph.node:
        for o in node.output:
            output_to_node[o] = node

    aten_ops = []
    python_ops = []
    memcpy_ops = []
    cast_ops = []
    msgs = []

    for node in model.graph.node:
        if "Memcpy" in node.op_type:
            memcpy_ops.append(f"{node.op_type} {node.name}")

        if node.op_type == "Cast":
            cast_ops.append(f"{node.name}")

        if node.op_type == "ATen":
            for attr in node.attribute:
                if attr.name == "operator":
                    aten_ops.append(f"{node.name}: {attr.s.decode('utf-8')}")

        if node.op_type == "PythonOp":
            for attr in node.attribute:
                if attr.name == "name":
                    python_ops.append(f"{node.name}: {attr.s.decode('utf-8')}")

        # Look for stand-alone Dropout node in *_execution_model_<mode>.onnx graph.
        # Examine whether it should be fused with surrounding Add ops into BiasDropout node.
        if node.op_type == "Dropout" and len(node.input) == 1:
            # Graph inputs and initializers have no producer node, so use .get() to avoid a KeyError.
            prev = output_to_node.get(node.input[0])
            if prev is not None and prev.op_type == "Add":
                msgs.append(
                    f"Examine whether {node.name} should be fused with the leading {prev.name} op into BiasDropout node."
                )

        # Look for stand-alone Softmax node in *_execution_model_<mode>.onnx graph.
        # Examine whether it should be fused with the leading Add ops into BiasSoftmax node.
        if node.op_type == "Softmax" and len(node.input) == 1:
            # Same guard as above: the input may not be produced by any node.
            prev = output_to_node.get(node.input[0])
            if prev is not None and prev.op_type == "Add":
                msgs.append(
                    f"Examine whether {node.name} should be fused with the leading {prev.name} op into BiasSoftmax node."
                )

    if aten_ops:
        print("ATen op found:")
        for line in aten_ops:
            print(line)
        print(10 * "-")

    if python_ops:
        print("PythonOp found:")
        for line in python_ops:
            print(line)
        print(10 * "-")

    if memcpy_ops:
        print("Memcpy ops found:")
        for line in memcpy_ops:
            print(line)
        print(10 * "-")

    if cast_ops:
        print("Cast ops found:")
        for line in cast_ops:
            print(line)
        print(10 * "-")

    for line in msgs:
        print(line)


def main():
    # Parse arguments here rather than at import time so the module can be imported safely.
    parser = argparse.ArgumentParser(description="ONNX file analyzer for performance investigation.")
    parser.add_argument("onnx_file", type=str, help="ONNX file to analyze")
    args = parser.parse_args()
    process_file(args.onnx_file)


if __name__ == "__main__":
    main()