onnxruntime/cmake
Yufeng Li da3dd398c5
Kernels for QLinearConv with symmetrically quantized filter (#9323)
Add kernels for QLinearConv with a symmetrically quantized filter, e.g., the filter type is int8 and the filter zero point is 0. This PR includes kernels for AVX2, AVX-VNNI, AVX512 and AVX512-VNNI. Kernels for ARM64 will be added in a following PR.

The kernels use the input buffer directly for pointwise convolution, and an indirection buffer for depthwise and non-group convolution.
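To make the direct versus indirection distinction concrete, here is a minimal Python sketch. It is not the MLAS implementation: the function names, the NHWC batch-1 layout, stride 1, and the use of Python object references as stand-ins for pointers are all assumptions made for illustration. Each output pixel gets a row of references to input pixels (or to one shared padding pixel), so nothing is copied.

```python
# Hypothetical sketch of an indirection buffer; names are illustrative only.
def build_indirection(pixels, height, width, kh, kw, pad, pad_pixel):
    """pixels: flat list of height*width channel vectors (NHWC, batch 1).
    Returns one row per output pixel holding kh*kw references, not copies.
    Assumes stride 1 and dilation 1."""
    out_h = height + 2 * pad - kh + 1
    out_w = width + 2 * pad - kw + 1
    table = []
    for oy in range(out_h):
        for ox in range(out_w):
            taps = []
            for ky in range(kh):
                for kx in range(kw):
                    iy, ix = oy + ky - pad, ox + kx - pad
                    inside = 0 <= iy < height and 0 <= ix < width
                    # Out-of-bounds taps all share a single padding pixel.
                    taps.append(pixels[iy * width + ix] if inside else pad_pixel)
            table.append(taps)
    return table


def conv_from_indirection(table, weights):
    """weights: kh*kw channel vectors for a single output channel."""
    return [
        sum(x * w for tap, wv in zip(taps, weights) for x, w in zip(tap, wv))
        for taps in table
    ]


pixels = [[1], [2], [3], [4]]   # 2x2 input, 1 channel
pad_pixel = [0]                 # in a quantized kernel this would presumably hold the input zero point
table = build_indirection(pixels, 2, 2, 3, 3, 1, pad_pixel)
result = conv_from_indirection(table, [[1]] * 9)

# For a pointwise (1x1) convolution the table is just the input in order,
# which is why such a kernel can read the input buffer directly.
pw_table = build_indirection(pixels, 2, 2, 1, 1, 0, pad_pixel)
```

In this toy case `result` is `[10, 10, 10, 10]`, because every 3x3 window over the padded 2x2 input covers all four pixels.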

The advantages of these new kernels are:

- No need to compute the sum for each pixel of the output image, and the sum/offset of the filter can be combined with the bias.
- With the indirection buffer, im2col returns an array of buffer pointers instead of memcpy'ing the original data. This saves memcpy time and reduces the size of the intermediate buffer needed to hold the im2col transform. In the future, im2col will be computed ahead of time for inputs with a fixed size.
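The first advantage follows from expanding the quantized dot product. The sketch below is a pure-Python illustration under stated assumptions, not the kernel code: the helper names are hypothetical, and scales and requantization are left out so that only the integer accumulation is shown.

```python
# Hypothetical helpers illustrating why a zero filter zero point removes
# the per-pixel input sum. Integer accumulation only; scaling is omitted.
#
# sum((x - x_zp) * (w - w_zp))
#   = sum(x * w) - x_zp * sum(w) - w_zp * sum(x) + n * x_zp * w_zp
#
# With w_zp == 0 the sum(x) term vanishes, and x_zp * sum(w) depends only on
# the filter, so it can be folded into the bias once, ahead of time.

def dot_reference(x_q, w_q, x_zp, w_zp, bias):
    """Direct evaluation with both zero points applied element by element."""
    return bias + sum((x - x_zp) * (w - w_zp) for x, w in zip(x_q, w_q))


def fold_bias(w_q, x_zp, bias):
    """Precompute the filter-only correction (valid when w_zp == 0)."""
    return bias - x_zp * sum(w_q)


def dot_symmetric(x_q, w_q, folded_bias):
    """Inner loop for a symmetric filter: raw products plus the folded bias."""
    return folded_bias + sum(x * w for x, w in zip(x_q, w_q))


x_q = [12, 200, 0, 255, 37]     # uint8 activations (made-up values)
w_q = [-3, 7, 127, -128, 5]     # int8 filter with zero point 0
x_zp, bias = 128, 1000

folded = fold_bias(w_q, x_zp, bias)
assert dot_symmetric(x_q, w_q, folded) == dot_reference(x_q, w_q, x_zp, 0, bias)
```

With a nonzero filter zero point, the `w_zp * sum(x)` term would have to be recomputed for every output pixel, which is the work the symmetric kernels avoid.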
2021-10-18 19:40:18 -07:00
..
external Fix to_dlpack Failure on PyTorch-1.10 (#9151) 2021-09-24 09:48:07 +08:00
patches Sync ORTModule branch with master and fix tests (#6526) 2021-02-02 08:59:56 -08:00
tensorboard Update compliance tasks in python packaging pipeline and fix some compile warnings (#8471) 2021-07-30 17:16:37 -07:00
CMakeLists.txt [js/web] Enable wasm profiling and preserve function names in profiling (#9314) 2021-10-11 22:04:50 -07:00
CMakeSettings.json
codeconv.runsettings
Info.plist.in Enable build dynamic framework for macOS/iOS (#7343) 2021-04-15 16:47:53 -07:00
libonnxruntime.pc.cmake.in cmake: support install target with generated pkg-config file (#7076) 2021-03-22 19:36:31 -07:00
nuget_helpers.cmake Fix nuget build error (#6009) 2020-12-03 09:28:39 -08:00
onnxruntime.cmake CMake file changes for macOS universal2 support (#8953) 2021-09-04 13:30:33 -07:00
onnxruntime_codegen.cmake Update manylinux build scripts and GPU CUDA version from 11.0 to 11.1 (#7632) 2021-06-02 23:36:49 -07:00
onnxruntime_common.cmake Remove cpuinfo from WCOS builds (#9076) 2021-09-16 12:05:47 -07:00
onnxruntime_config.h.in [js/web] update emsdk to v2.0.26 (#8653) 2021-08-26 15:31:34 -07:00
onnxruntime_csharp.cmake Cleanup C# bindings to add EP (#8810) 2021-08-26 13:59:40 +10:00
onnxruntime_eager.cmake Abjindal/eager windows build (#9326) 2021-10-14 12:54:49 -07:00
onnxruntime_flatbuffers.cmake Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00
onnxruntime_framework.cmake Extend node debugging utilities to push tensors and node placement to SQL database (#8672) 2021-08-21 00:40:12 -07:00
onnxruntime_fuzz_test.cmake Merge CPU packaging pipelines (#6480) 2021-02-04 08:38:56 -08:00
onnxruntime_graph.cmake Enable selector action transformer infrastructure in minimal build. (#8804) 2021-08-27 17:16:05 +10:00
onnxruntime_ios.toolchain.cmake Enable build dynamic framework for macOS/iOS (#7343) 2021-04-15 16:47:53 -07:00
onnxruntime_java.cmake CMake file changes for macOS universal2 support (#8953) 2021-09-04 13:30:33 -07:00
onnxruntime_java_unittests.cmake [Java] Adds support for DNNL, OpenVINO, TensorRT shared providers and refactors the CUDA shared provider loader (#8013) 2021-07-20 22:33:15 -07:00
onnxruntime_language_interop_ops.cmake Update manylinux build scripts and GPU CUDA version from 11.0 to 11.1 (#7632) 2021-06-02 23:36:49 -07:00
onnxruntime_mlas.cmake Kernels for QLinearConv with symmetrically quantized filter (#9323) 2021-10-18 19:40:18 -07:00
onnxruntime_nodejs.cmake Specify correct dependency for CI pipeline of nodejs binding (#7717) 2021-05-15 08:56:58 -07:00
onnxruntime_nuphar_extern.cmake Add static code analyzer to Windows CPU/GPU CI builds and fix the warnings (#7489) 2021-04-29 11:54:57 -07:00
onnxruntime_objectivec.cmake [Objective-C API] Add script to assemble pod package files. (#7958) 2021-06-07 19:16:39 -07:00
onnxruntime_opschema_lib.cmake Update compliance tasks in python packaging pipeline and fix some compile warnings (#8471) 2021-07-30 17:16:37 -07:00
onnxruntime_optimizer.cmake Enable selector action transformer infrastructure in minimal build. (#8804) 2021-08-27 17:16:05 +10:00
onnxruntime_providers.cmake [ROCm] static re-hipify of CUDA EP to ROCm EP, now a shared provider (#8877) 2021-10-14 15:15:51 -07:00
onnxruntime_pyop.cmake Packaging pipeline now builds with PythonOp (aka running autograd.Function) (#8652) 2021-08-17 10:55:13 -07:00
onnxruntime_python.cmake [ROCm] static re-hipify of CUDA EP to ROCm EP, now a shared provider (#8877) 2021-10-14 15:15:51 -07:00
onnxruntime_session.cmake [ROCm] static re-hipify of CUDA EP to ROCm EP, now a shared provider (#8877) 2021-10-14 15:15:51 -07:00
onnxruntime_training.cmake [ROCm] static re-hipify of CUDA EP to ROCm EP, now a shared provider (#8877) 2021-10-14 15:15:51 -07:00
onnxruntime_unittests.cmake [ROCm] static re-hipify of CUDA EP to ROCm EP, now a shared provider (#8877) 2021-10-14 15:15:51 -07:00
onnxruntime_util.cmake Update manylinux build scripts and GPU CUDA version from 11.0 to 11.1 (#7632) 2021-06-02 23:36:49 -07:00
onnxruntime_webassembly.cmake [js/web] Enable wasm profiling and preserve function names in profiling (#9314) 2021-10-11 22:04:50 -07:00
precompiled_header.cmake Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00
protobuf_function.cmake Sync ORTModule branch with master and fix tests (#6526) 2021-02-02 08:59:56 -08:00
set_winapi_family_desktop.h
store_toolchain.cmake
target_delayload.cmake
uwp_stubs.h Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00
wcos_rules_override.cmake
wil.cmake
winml.cmake Port ARM64x support (#9230) 2021-10-01 13:06:43 -07:00
winml_cppwinrt.cmake Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00
winml_sdk_helpers.cmake
winml_unittests.cmake Fix Windows Store build (#8753) 2021-08-23 11:19:03 -07:00