mirror of
https://github.com/saymrwulf/onnxruntime.git
synced 2026-05-17 21:10:43 +00:00
Reduce binary size
To reduce the compiled binary size, two build options are available:
- --include_ops_by_model=<path to directory of models>
- --include_ops_by_file=<path to a file>
These options instruct the build to comment out unused operator registrations in the execution provider(s), thereby shrinking the output. Note that you MUST build with --skip_tests, since the excluded ops can cause test failures.
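For example, a reduced-size build might be invoked as follows. The paths are placeholders, and the MinSizeRel configuration is one typical choice for a size-focused build; adjust the flags to your setup:

```shell
# Build with only the ops used by the models under /path/to/models and
# listed in /path/to/ops.txt, skipping tests (required, since excluded
# ops can break the test suite).
./build.sh --config MinSizeRel --skip_tests \
    --include_ops_by_model=/path/to/models \
    --include_ops_by_file=/path/to/ops.txt
```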
include_ops_by_model
With this argument, the compiled binary includes only the operators consumed by the models found in the specified directory.
include_ops_by_file
With this argument, the compiled binary includes only the operators listed in the specified file. The file has the following format:
#domain;opset;op1,op2...
ai.onnx;1;MemcpyToHost,MemcpyFromHost
ai.onnx;11;Gemm
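The format above is simple enough to parse in a few lines of Python. The following is a hypothetical sketch of how a tool might read such a file, not the actual implementation in onnxruntime:

```python
def parse_ops_file(path):
    """Parse lines of the form 'domain;opset;op1,op2,...' into a dict
    mapping (domain, opset) to the set of operators to keep.
    Lines starting with '#' are treated as comments."""
    kept = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            domain, opset, ops = line.split(";")
            kept.setdefault((domain, int(opset)), set()).update(
                op.strip() for op in ops.split(",") if op.strip()
            )
    return kept
```

Given the example file above, this would keep MemcpyToHost and MemcpyFromHost for ai.onnx opset 1 and Gemm for ai.onnx opset 11.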
Usage tips
- By default, the trimming applies only to the CPU execution provider; with --use_cuda it is also applied to the CUDA provider;
- If both arguments are specified, operators referenced by either one will be kept;
- The script is located under tools/ci_build/ and can also be run standalone against the CPU and CUDA providers as:
python exclude_unused_ops.py --model_path d:\ReduceSize\models --file_path d:\ReduceSize\ops.txt --ort_root d:\onnxruntime