onnxruntime/cmake
Chen Fu 2afce4830c
Symmetric QGEMM (#10289)
Adds code for symmetric quantized matrix multiplication (QGEMM). It is used in quantized convolution, where it yields a significant performance gain.

TODO: use symmetric quantized GEMM in other operators.

TODO: address activation buffer overread in custom allocators and tensors supplied by users.

DOT kernel perf test:

Pixel 5a:

Cartoongan	513.539 ms	471.786 ms
Efficient	57.5169 ms	56.4174 ms
Edgetpu	14.6673 ms	13.5959 ms

NEON kernel perf test:

Pixel 3a:

Cartoongan	1423.53 ms	1069.92 ms
Efficient	114.086 ms	107.968 ms
Edgetpu	39.2632 ms	36.9839 ms


Co-authored-by: Chen Fu <fuchen@microsoft.com>
2022-01-24 10:49:04 -08:00
external Reduce number of memory allocations based on a customer profiling case (#10193) 2022-01-24 10:40:46 -08:00
patches Reduce number of memory allocations based on a customer profiling case (#10193) 2022-01-24 10:40:46 -08:00
tensorboard Update compliance tasks in python packaging pipeline and fix some compile warnings (#8471) 2021-07-30 17:16:37 -07:00
CMakeLists.txt Add a build option to create a WebAssembly static library (#10184) 2022-01-18 18:05:04 -08:00
CMakeSettings.json
codeconv.runsettings
EnableVisualStudioCodeAnalysis.props Fix SDL warnings in CPU EP (#9975) 2021-12-19 20:54:29 -08:00
Info.plist.in Enable build dynamic framework for macOS/iOS (#7343) 2021-04-15 16:47:53 -07:00
libonnxruntime.pc.cmake.in cmake: support install target with generated pkg-config file (#7076) 2021-03-22 19:36:31 -07:00
nuget_helpers.cmake
onnxruntime.cmake Amdmigraphx fix build error (#9272) 2022-01-10 15:18:43 -08:00
onnxruntime_codegen.cmake Update manylinux build scripts and GPU CUDA version from 11.0 to 11.1 (#7632) 2021-06-02 23:36:49 -07:00
onnxruntime_common.cmake Reduce number of memory allocations based on a customer profiling case (#10193) 2022-01-24 10:40:46 -08:00
onnxruntime_config.h.in [js/web] update emsdk to v2.0.26 (#8653) 2021-08-26 15:31:34 -07:00
onnxruntime_csharp.cmake Standalone TVM Executor Provider (#10019) 2021-12-15 16:59:20 -08:00
onnxruntime_eager.cmake Abjindal/fix windows ci pipeline (#9883) 2021-11-30 10:33:13 -08:00
onnxruntime_flatbuffers.cmake Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00
onnxruntime_framework.cmake Reduce number of memory allocations based on a customer profiling case (#10193) 2022-01-24 10:40:46 -08:00
onnxruntime_fuzz_test.cmake
onnxruntime_graph.cmake Fix SDL warnings in CPU EP (#9975) 2021-12-19 20:54:29 -08:00
onnxruntime_ios.toolchain.cmake Enable build dynamic framework for macOS/iOS (#7343) 2021-04-15 16:47:53 -07:00
onnxruntime_java.cmake Standalone TVM Executor Provider (#10019) 2021-12-15 16:59:20 -08:00
onnxruntime_java_unittests.cmake [Java] Adds support for DNNL, OpenVINO, TensorRT shared providers and refactors the CUDA shared provider loader (#8013) 2021-07-20 22:33:15 -07:00
onnxruntime_language_interop_ops.cmake Update manylinux build scripts and GPU CUDA version from 11.0 to 11.1 (#7632) 2021-06-02 23:36:49 -07:00
onnxruntime_mlas.cmake Symmetric QGEMM (#10289) 2022-01-24 10:49:04 -08:00
onnxruntime_nodejs.cmake Add Node.js binding support to packaging pipeline (#9577) 2021-11-05 15:29:40 -07:00
onnxruntime_nuphar_extern.cmake Add static code analyzer to Windows CPU/GPU CI builds and fix the warnings (#7489) 2021-04-29 11:54:57 -07:00
onnxruntime_objectivec.cmake [Objective-C API] Add script to assemble pod package files. (#7958) 2021-06-07 19:16:39 -07:00
onnxruntime_opschema_lib.cmake Update compliance tasks in python packaging pipeline and fix some compile warnings (#8471) 2021-07-30 17:16:37 -07:00
onnxruntime_optimizer.cmake [QDQ] Add shared qdq selectors (#10178) 2022-01-11 19:41:45 -08:00
onnxruntime_providers.cmake Reduce number of memory allocations based on a customer profiling case (#10193) 2022-01-24 10:40:46 -08:00
onnxruntime_pyop.cmake Packaging pipeline now builds with PythonOp (aka running autograd.Function) (#8652) 2021-08-17 10:55:13 -07:00
onnxruntime_python.cmake Abjindal/clean eager backend (#10055) 2022-01-19 14:20:09 -08:00
onnxruntime_session.cmake [ROCm] static re-hipify of CUDA EP to ROCm EP, now a shared provider (#8877) 2021-10-14 15:15:51 -07:00
onnxruntime_training.cmake [ROCm] static re-hipify of CUDA EP to ROCm EP, now a shared provider (#8877) 2021-10-14 15:15:51 -07:00
onnxruntime_unittests.cmake Amdmigraphx fix build error (#9272) 2022-01-10 15:18:43 -08:00
onnxruntime_util.cmake Update manylinux build scripts and GPU CUDA version from 11.0 to 11.1 (#7632) 2021-06-02 23:36:49 -07:00
onnxruntime_webassembly.cmake Add a build option to create a WebAssembly static library (#10184) 2022-01-18 18:05:04 -08:00
precompiled_header.cmake Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00
protobuf_function.cmake
Sdl.ruleset Fix SDL warnings in CPU EP (#9975) 2021-12-19 20:54:29 -08:00
set_winapi_family_desktop.h
store_toolchain.cmake
target_delayload.cmake
uwp_stubs.h Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00
wcos_rules_override.cmake
wil.cmake
winml.cmake Enable JoinModels API in WinML+RT Experimental API (#9746) 2021-11-12 16:56:31 -08:00
winml_cppwinrt.cmake Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00
winml_sdk_helpers.cmake
winml_unittests.cmake Clean up optional-lite references (#9534) 2021-10-25 21:05:45 -07:00