Mirror of https://github.com/saymrwulf/onnxruntime.git, synced 2026-05-18 21:21:17 +00:00
* Add QAttention to DNNL EP
  * Add QAttention to DNNL EP (limited support and disable for gpu)
  * update ONEDNN version to 2.4.4
  * bug fix in getcapability
  * add memory debug print
  * Signed-off-by: Wang <zhaoyang.wang@intel.com>
* Address Code Review + MatMulInteger Fix
  * clean up code and add comments
  * fix matmulinteger and add fusion rule to enable initialized vector weight zero points of 0s
  * update DNNL_TAG to v2.5
  * Signed-off-by: Wang <zhaoyang.wang@intel.com>
* Linux Compile Fix + rollback ONEDNN to 2.4.4
  * Signed-off-by: Zhaoyang Wang <zhaoyang.wang@intel.com>
* Fix QAttention Debug build
  * Signed-off-by: Wang <zhaoyang.wang@intel.com>
* Fix QAttention build if USE_DNNL not specified
  * Signed-off-by: George Nash <george.nash@intel.com>

Co-authored-by: Wang <zhaoyang.wang@intel.com>
Co-authored-by: MTC <63478620+jeyblu@users.noreply.github.com>

| Name | | |
|---|---|---|
| .. | ||
| coremltools@523d5e03d8 | ||
| cub@c3cceac115 | ||
| cxxopts@3c73d91c0b | ||
| date@e7e1482087 | ||
| dlpack@2775088798 | ||
| eigen@d10b27fe37 | ||
| emsdk@a3d65c80d3 | ||
| flatbuffers@6df40a2471 | ||
| googlebenchmark@7d0d9061d8 | ||
| googletest@53495a2a7d | ||
| json@db78ac1d77 | ||
| libprotobuf-mutator@7a2ed51a6b | ||
| mimalloc@f412df7a2b | ||
| mp11@21cace4e57 | ||
| nsync@436617053d | ||
| onnx@be76ca7148 | ||
| onnx-tensorrt@1f416bb462 | ||
| onnxruntime-extensions@d4b2aff0c8 | ||
| protobuf@2dc747c574 | ||
| pytorch_cpuinfo@5916273f79 | ||
| re2@4244cd1cb4 | ||
| SafeInt | ||
| tensorboard@373eb09e4c | ||
| tvm@9ec2b92d18 | ||
| wil@e8c599bca6 | ||
| dml.cmake | ||
| dnnl.cmake | ||
| eigen.cmake | ||
| extensions.cmake | ||
| FindNumPy.cmake | ||
| jemalloc.cmake | ||
| mimalloc.cmake | ||
| onnx_minimal.cmake | ||
| pybind11.cmake | ||
| pyxir.cmake | ||
| zlib.cmake | ||