onnxruntime/onnxruntime
Yufeng Li 87d68d8531
matmul integer fusion (#4195)
* Introduce DynamicQuantizeMatMul
It fuses DynamicQuantizeLinear, MatMul, and the cast and multiplier that follow into a single op, so a quantized matmul takes float input and produces float output. The implementation uses an MLAS kernel for this op.
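The arithmetic that such a fused op replaces can be sketched in NumPy. This is only an illustrative reference, not the MLAS kernel or the operator's actual signature: the function names are hypothetical, the quantization formula follows the ONNX definition of DynamicQuantizeLinear (uint8, range extended to include zero), and it assumes the right-hand matrix is already quantized with its own scale and zero point.

```python
import numpy as np


def dynamic_quantize_linear(x):
    """Quantize a float array to uint8 with a scale and zero point computed at run time."""
    qmin, qmax = 0.0, 255.0
    # The quantization range must contain 0 so that 0.0 is exactly representable.
    x_min = min(0.0, float(x.min()))
    x_max = max(0.0, float(x.max()))
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # assumption: all-zero input, avoid dividing by zero
    zero_point = np.clip(np.round(qmin - x_min / scale), qmin, qmax)
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, np.float32(scale), np.uint8(zero_point)


def dynamic_quantize_matmul(a, b_q, b_scale, b_zero_point):
    """Float in, float out: quantize A, integer matmul, cast, then rescale.

    Hypothetical reference for the unfused chain
    DynamicQuantizeLinear -> integer matmul -> cast to float -> multiply by scales.
    """
    a_q, a_scale, a_zp = dynamic_quantize_linear(a)
    # Integer matmul on zero-point-corrected values, accumulated in int32.
    acc = (a_q.astype(np.int32) - np.int32(a_zp)) @ (
        b_q.astype(np.int32) - np.int32(b_zero_point)
    )
    # Cast the accumulator to float and apply the combined scale.
    return acc.astype(np.float32) * (a_scale * b_scale)


# Usage: quantize the weights once, then feed float activations.
rng = np.random.default_rng(0)
a = rng.uniform(-1.0, 1.0, size=(4, 8)).astype(np.float32)
b = rng.uniform(-1.0, 1.0, size=(8, 3)).astype(np.float32)
b_q, b_scale, b_zp = dynamic_quantize_linear(b)
y = dynamic_quantize_matmul(a, b_q, b_scale, b_zp)
```

Fusing these steps avoids materializing the intermediate quantized activation and int32 result as separate graph outputs; the result `y` approximates `a @ b` up to quantization error.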
2020-06-11 21:42:09 -07:00
..
contrib_ops matmul integer fusion (#4195) 2020-06-11 21:42:09 -07:00
core matmul integer fusion (#4195) 2020-06-11 21:42:09 -07:00
featurizers_ops/cpu Initial checkin (#3791) 2020-05-01 14:58:49 -07:00
gsl Fix static analysis warnings found by VC++ (#3530) 2020-04-16 01:46:47 -07:00
python Add past state support in Attention Op for GPT-2 (#4107) 2020-06-11 14:19:55 -07:00
test matmul integer fusion (#4195) 2020-06-11 21:42:09 -07:00
tool/etw Fix static analysis warnings found by VC++ (#3530) 2020-04-16 01:46:47 -07:00
.style.yapf Enable running PEP8 on python scripts using flake8 (#3928) 2020-05-15 07:15:06 +10:00
__init__.py bump up ORT version to 1.3.1 (#4181) 2020-06-10 08:44:03 -07:00
ReformatSource.ps1
ReformatSourcePython.bat Update GPT2 Model Benchmark Script to Support IO Binding (#4088) 2020-06-01 15:07:48 -07:00
VSCodeCoverage.runsettings