Mirror of https://github.com/saymrwulf/onnxruntime.git
Synced 2026-05-16 21:00:14 +00:00
Adding ARM64 depthwise convolution kernel for symmetric quantization. Motivation and Context: two improvements over the current kernel code: 1. Signed int8-based instructions, so there is no need to extend from 8 bits to 16 bits before multiplication. 2. Unrolled loop with manual software pipelining. Co-authored-by: Chen Fu <fuchen@microsoft.com> |
||
|---|---|---|
| __init__.py | ||
| build_full_ort_and_create_ort_files.sh | ||
| build_minimal_ort_and_run_tests.sh | ||
| build_minimal_ort_android_baseline_and_report_bin_size.sh | ||
| check_build_binary_size.py | ||
| nnapi_minimal_build_minimal_ort_and_run_tests.sh | ||
| readelf_utils.py | ||