onnxruntime/docs/ReduceBinarySize.md
RandySheriffH 3fa73a5b6a
ReduceBinarySize (#4747)
2020-08-21 19:50:13 -07:00

Reduce binary size

To reduce compiled binary size, two options are available:

  • --include_ops_by_model=<path to directory of models>
  • --include_ops_by_file=<path to a file>

With these options, the build comments out the operators in the execution provider(s) that are not needed, thereby reducing the size of the output binary.
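To make the idea of commenting out operators concrete, here is a toy sketch. It is not the actual rewriter: the registration line shape, the `KERNEL` macro name, the `trim_registrations` helper, and the exact-opset matching rule are all simplifying assumptions made for illustration.

```python
import re

# Assumed, simplified shape of a registration line:
#   BuildKernelCreateInfo<MACRO(provider, domain, opset, op)>,
# Real provider sources may use other macro variants that this pattern
# does not cover.
ENTRY = re.compile(
    r"BuildKernelCreateInfo<\w+\(\s*\w+\s*,\s*(\w+)\s*,\s*(\d+)\s*,\s*(\w+)\s*\)>"
)


def trim_registrations(lines, required):
    """Comment out entries whose (domain, opset, op) is not in `required`."""
    out = []
    for line in lines:
        match = ENTRY.search(line)
        if match:
            key = (match.group(1), int(match.group(2)), match.group(3))
            if key not in required:
                # C++-style line comment disables the registration.
                line = "// " + line
        out.append(line)
    return out


# Hypothetical provider source lines; KERNEL is a placeholder macro name.
source = [
    "BuildKernelCreateInfo<KERNEL(kCpuExecutionProvider, kOnnxDomain, 11, Gemm)>,",
    "BuildKernelCreateInfo<KERNEL(kCpuExecutionProvider, kOnnxDomain, 7, Cos)>,",
    "BuildKernelCreateInfo<void>,",
]
required = {("kOnnxDomain", 11, "Gemm")}
trimmed = trim_registrations(source, required)
print("\n".join(trimmed))
```

In this sketch only the `Cos` entry is commented out; lines that do not match the assumed pattern, such as the `void` entry, are left untouched.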

include_ops_by_model

With this argument, the compiled binary includes only the operators consumed by the models in the specified directory.

include_ops_by_file

With this argument, the compiled binary includes only the operators referenced in the specified file. The file has the following format:

#domain;opset;op1,op2...
ai.onnx;1;MemcpyToHost,MemcpyFromHost
ai.onnx;11;Gemm
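As an illustration of the format, the following minimal sketch parses such a file into a mapping from (domain, opset) to a set of operator names. The function name `parse_ops_file` is hypothetical, and treating lines that start with `#` as comments is an assumption based on the sample above; this is not the parser used by the build.

```python
from collections import defaultdict


def parse_ops_file(text):
    """Parse 'domain;opset;op1,op2...' lines into {(domain, opset): {ops}}.

    Blank lines and lines starting with '#' are skipped (assumed to be
    comments). A line without exactly three ';'-separated fields raises
    ValueError.
    """
    required = defaultdict(set)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        domain, opset, ops = (field.strip() for field in line.split(";"))
        required[(domain, int(opset))].update(
            op.strip() for op in ops.split(",") if op.strip()
        )
    return dict(required)


sample = """\
#domain;opset;op1,op2...
ai.onnx;1;MemcpyToHost,MemcpyFromHost
ai.onnx;11;Gemm
"""
required_ops = parse_ops_file(sample)
print(required_ops)
```

Keying on the (domain, opset) pair keeps operators with the same name in different opsets separate, which mirrors the three fields of each line.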

More usage tips

  • By default, trimming is applied only to the CPU execution provider; with --use_cuda, it is also applied to the CUDA execution provider;
  • If both arguments are specified, operators referenced by either one are kept;
  • The script is located under tools/ci_build/ and can also be run on its own to trim the CPU and CUDA providers:
python exclude_unused_ops.py --model_path d:\ReduceSize\models --file_path d:\ReduceSize\ops.txt --ort_root d:\onnxruntime
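The rule for combining the two sources amounts to a set union. A minimal sketch, assuming each source has already been reduced to a mapping from (domain, opset) to a set of operator names (the helper `merge_required_ops` and the sample data are hypothetical):

```python
def merge_required_ops(*sources):
    """Union several {(domain, opset): set_of_ops} mappings into a new dict."""
    merged = {}
    for source in sources:
        for key, ops in source.items():
            # Copy into a fresh set so the inputs are never mutated.
            merged.setdefault(key, set()).update(ops)
    return merged


# Hypothetical inputs: operators found in models vs. operators listed in a file.
from_models = {("ai.onnx", 11): {"Gemm", "Relu"}}
from_file = {
    ("ai.onnx", 1): {"MemcpyToHost"},
    ("ai.onnx", 11): {"Gemm", "Conv"},
}

kept = merge_required_ops(from_models, from_file)
print(kept)
```

An operator is kept if it appears in at least one source, so listing an operator in the file never removes one that a model requires.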