ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
raoanag 7d4dba7e16
Disable MatMulIntegerToFloat transformation for FP16 on CPU EP (#18239)
### Description
MatMulIntegerToFloat was updated to support FP16. The nodes involved in
the FP16 transformation use an FP16 "Mul", which the CPU does not
directly support.

For now, the FP16 transformation is supported only for the DML EP. All
FP16 tests are disabled on CPU.
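
To make the transformation concrete, here is a minimal numeric sketch in plain Python. It assumes the usual pattern of an integer matrix multiply followed by a cast to float and a multiply by the combined scale, and compares that with a single fused computation. The function names, the scalar (per-tensor) scales and zero points, and the sample values are illustrative assumptions, not ONNX Runtime code.

```python
def matmul_integer(a, b, a_zp=0, b_zp=0):
    """Integer matmul: subtract zero points, then accumulate in integers."""
    inner, cols = len(b), len(b[0])
    return [
        [sum((row[k] - a_zp) * (b[k][j] - b_zp) for k in range(inner))
         for j in range(cols)]
        for row in a
    ]


def unfused(a, b, a_scale, b_scale, a_zp=0, b_zp=0):
    """Three separate steps: integer matmul, cast to float, multiply by scale."""
    acc = matmul_integer(a, b, a_zp, b_zp)
    as_float = [[float(v) for v in row] for row in acc]    # the "Cast" step
    scale = a_scale * b_scale
    return [[v * scale for v in row] for row in as_float]  # the "Mul" step


def fused(a, b, a_scale, b_scale, a_zp=0, b_zp=0):
    """One step that yields the same float result as unfused()."""
    scale = a_scale * b_scale
    inner, cols = len(b), len(b[0])
    return [
        [scale * sum((row[k] - a_zp) * (b[k][j] - b_zp) for k in range(inner))
         for j in range(cols)]
        for row in a
    ]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]   # hypothetical quantized activations
    b = [[5, 6], [7, 8]]   # hypothetical quantized weights
    print(unfused(a, b, 0.5, 0.25, a_zp=1, b_zp=5))
    print(fused(a, b, 0.5, 0.25, a_zp=1, b_zp=5))
```

Both functions perform the same arithmetic; the fused form simply avoids materializing the intermediate integer and float tensors. In the FP16 case the scale multiply would be carried out in half precision, which is the "Mul" the description refers to.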

Test results without the `-use_dml` build flag:
```
onnxruntime_test_all.exe --gtest_filter="*MatMulIntegerToFloat*"
Note: Google Test filter = *MatMulIntegerToFloat*
[==========] Running 8 tests from 4 test suites.
[----------] Global test environment set-up.
[----------] 1 test from CPU_U8S8_Precision_Tests
[ RUN      ] CPU_U8S8_Precision_Tests.MatMulIntegerToFloat
[       OK ] CPU_U8S8_Precision_Tests.MatMulIntegerToFloat (181 ms)
[----------] 1 test from CPU_U8S8_Precision_Tests (181 ms total)

[----------] 1 test from GraphTransformationTests
[ RUN      ] GraphTransformationTests.MatMulIntegerToFloatTest
[       OK ] GraphTransformationTests.MatMulIntegerToFloatTest (17 ms)
[----------] 1 test from GraphTransformationTests (17 ms total)

[----------] 1 test from QDQTransformerTests
[ RUN      ] QDQTransformerTests.MatMulIntegerToFloat
[       OK ] QDQTransformerTests.MatMulIntegerToFloat (656 ms)
[----------] 1 test from QDQTransformerTests (656 ms total)

[----------] 5 tests from MatMulIntegerToFloat
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_U8X8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_U8X8 (195 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_U8X8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_U8X8 (206 ms)
[ RUN      ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_S8S8
[       OK ] MatMulIntegerToFloat.HasZeroPoint_NoBias_test_S8S8 (107 ms)
[ RUN      ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_S8S8
[       OK ] MatMulIntegerToFloat.NoZeroPoint_HasBias_test_S8S8 (114 ms)
[ RUN      ] MatMulIntegerToFloat.MatMulInteger_With_ZeroPoint
[       OK ] MatMulIntegerToFloat.MatMulInteger_With_ZeroPoint (227 ms)
[----------] 5 tests from MatMulIntegerToFloat (854 ms total)

[----------] Global test environment tear-down
[==========] 8 tests from 4 test suites ran. (1713 ms total)
[  PASSED  ] 8 tests.
memleakdbg:
----- No memory leaks detected -----
```

```
onnxruntime_test_all.exe --gtest_filter="GraphTransformationTests.MatMulIntegerToFloat*"
Note: Google Test filter = GraphTransformationTests.MatMulIntegerToFloat*
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from GraphTransformationTests
[ RUN      ] GraphTransformationTests.MatMulIntegerToFloatTest
[       OK ] GraphTransformationTests.MatMulIntegerToFloatTest (13 ms)
[ RUN      ] GraphTransformationTests.MatMulIntegerToFloat16Test
[       OK ] GraphTransformationTests.MatMulIntegerToFloat16Test (4 ms)
[----------] 2 tests from GraphTransformationTests (20 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (22 ms total)
[  PASSED  ] 2 tests.
memleakdbg:
----- No memory leaks detected -----
```
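
The two runs above differ only in the `--gtest_filter` pattern. As a rough illustration of how such a wildcard pattern narrows the set of tests, the sketch below approximates a single positive pattern with Python's `fnmatchcase`. The helper name `select_tests` and the hand-copied name list are assumptions for illustration; real GoogleTest filters also accept `:`-separated patterns and negative patterns after `-`, which this sketch does not handle, and `fnmatchcase` additionally understands bracket expressions.

```python
from fnmatch import fnmatchcase

# Test names copied by hand from the logs above (a subset).
TEST_NAMES = [
    "CPU_U8S8_Precision_Tests.MatMulIntegerToFloat",
    "GraphTransformationTests.MatMulIntegerToFloatTest",
    "GraphTransformationTests.MatMulIntegerToFloat16Test",
    "QDQTransformerTests.MatMulIntegerToFloat",
    "MatMulIntegerToFloat.MatMulInteger_With_ZeroPoint",
]


def select_tests(names, pattern):
    """Return the names matching one case-sensitive wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    for name in select_tests(TEST_NAMES, "GraphTransformationTests.MatMulIntegerToFloat*"):
        print(name)
```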
2023-11-03 10:05:09 -07:00
.config Update tsaoptions.json: update the email alias (#13448) 2022-10-26 15:56:16 -07:00
.devcontainer Remove two lines in the Dockerfile for Github Codespace (#12278) 2022-07-21 20:52:17 -07:00
.gdn Update win-ci-pipeline.yml: enable xnnpack tests (#16244) 2023-06-14 19:12:42 -07:00
.github Bump actions/deploy-pages from 1 to 2 (#16402) 2023-07-24 16:13:59 -07:00
.pipelines Update DML Preview Package to 13.0-dev4c864f8324cef2ff5c39a5822d6c4de05929306d (#18193) 2023-10-31 17:43:43 -07:00
.vscode cpplint & Eager mode: refactor and add comments to empty_* functions, general lint cleanup in ort_aten (#12238) 2022-07-20 11:47:57 -04:00
cgmanifests [TensorRT EP] TRT 8.6 minor version update (#16475) 2023-06-26 10:44:27 -07:00
cmake Update DML Preview Package to 13.0-dev4c864f8324cef2ff5c39a5822d6c4de05929306d (#18193) 2023-10-31 17:43:43 -07:00
csharp Change DML GPU pool in Windows GPU workflow use Visual Studio 2022 (#16784) 2023-07-23 10:07:21 +08:00
dockerfiles Enable model subgraph execution in OVEP and setting the OpenVINO dll's to the path from the OpenVINO pypi packge in OVEP and fix OVEP windows io buffer sample (#16147) 2023-06-16 19:47:09 -07:00
docs [Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789) 2023-07-21 12:53:41 -07:00
include/onnxruntime/core Cherry-pick b8f373b0ae (#17895) (#17926) 2023-10-12 16:56:40 -07:00
java [java] Fp16 fix for android/react native (#16832) 2023-07-25 12:31:32 -07:00
js Bump word-wrap from 1.2.3 to 1.2.4 in /js/react_native (#16755) 2023-07-22 13:36:49 -07:00
objectivec Objective-C Add Support to Create and Query String ORTValues (#16764) 2023-07-20 17:39:29 -07:00
onnxruntime Disable MatMulIntegerToFloat transformation for FP16 on CPU EP (#18239) 2023-11-03 10:05:09 -07:00
orttraining [DORT] Enable Dynamic Shape in DORT and Use Different InferenceSession's when Inputs Are Not Compatible (#16753) 2023-07-24 16:54:01 -07:00
rust Add rust bindings (#12606) 2023-02-08 14:57:15 -08:00
samples Enable pylint and numpy rules (#15218) 2023-03-27 20:37:53 -07:00
swift/OnnxRuntimeBindingsTests Add iOS Swift Package Manager support (#15297) 2023-04-20 16:18:35 +10:00
tools Update DML Preview Package to 13.0-dev4c864f8324cef2ff5c39a5822d6c4de05929306d (#18193) 2023-10-31 17:43:43 -07:00
winml Fix DML regression from allocator refactor and enable unrounded weight allocations through ORT API 2023-08-06 09:49:39 -07:00
.clang-format Run clang-format in CI (#15524) 2023-04-18 09:26:58 -07:00
.clang-tidy Create clang-tidy CI (#12653) 2022-09-30 08:05:38 -07:00
.dockerignore
.gitattributes
.gitignore remove 'lib/' from .gitignore (#15613) 2023-04-24 18:43:32 -07:00
.gitmodules Update eigen to 3.4 and remove the eigen from git submodule (#15875) 2023-05-11 11:56:59 -07:00
.lintrunner.toml Minimal Build for On-Device Training (#16326) 2023-06-22 12:27:23 -07:00
build.bat Upgrade old Python version in packaging pipeline (#16667) 2023-07-17 08:24:47 -07:00
build.sh Upgrade old Python version in packaging pipeline (#16667) 2023-07-17 08:24:47 -07:00
CITATION.cff
CODEOWNERS Add owners for public facing API files (#15288) 2023-03-30 17:16:15 -07:00
CONTRIBUTING.md Fix link to High Level Design (#11786) 2023-02-28 11:05:54 -08:00
lgtm.yml Fix lgtm C++ error (#13613) 2022-11-10 10:06:22 -08:00
LICENSE
NuGet.config
ort.wprp
ORT_icon_for_light_bg.png
Package.swift Objective-C Add Support to Create and Query String ORTValues (#16764) 2023-07-20 17:39:29 -07:00
packages.config Update DML Preview Package to 13.0-dev4c864f8324cef2ff5c39a5822d6c4de05929306d (#18193) 2023-10-31 17:43:43 -07:00
pyproject.toml [Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789) 2023-07-21 12:53:41 -07:00
README.md add third-party pipeline status to README.md (#16155) 2023-05-31 22:14:39 -07:00
requirements-dev.txt Remove codecov from requirements-dev.txt (#15487) 2023-04-12 18:48:02 -07:00
requirements-doc.txt
requirements-lintrunner.txt [Better Engineering] Bump ruff to 0.0.278 and fix new lint errors (#16789) 2023-07-21 12:53:41 -07:00
requirements-training.txt Remove protobuf pin from training requirements (#13695) 2022-11-22 12:27:18 -08:00
requirements.txt.in Add additional python requirements (#11522) 2022-05-20 16:16:18 -07:00
SECURITY.md Microsoft mandatory file (#11619) 2022-05-25 13:56:10 -07:00
setup.py Triton Codegen for ORTModule (#15831) 2023-07-13 18:17:58 +08:00
ThirdPartyNotices.txt Implement openAI endpoint invoker for nuget (#15797) 2023-05-11 22:04:02 -07:00
VERSION_NUMBER Update VERSION_NUMBER (#15773) 2023-05-03 15:07:34 -07:00

ONNX Runtime is a cross-platform inference and training machine-learning accelerator.

ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, and XGBoost. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators where applicable alongside graph optimizations and transforms. Learn more →

ONNX Runtime training can accelerate model training on multi-node NVIDIA GPUs for transformer models with a one-line addition to existing PyTorch training scripts. Learn more →

Get Started & Resources

Builtin Pipeline Status

Build status badges (Inference and Training) for: Windows, Linux, Mac, Android, iOS, Web, Other.

Third-party Pipeline Status

Build status badges (Inference and Training) for: Linux.

Data/Telemetry

Windows distributions of this project may collect usage data and send it to Microsoft to help improve our products and services. See the privacy statement for more details.

Contributions and Feedback

We welcome contributions! Please see the contribution guidelines.

For feature requests or bug reports, please file a GitHub Issue.

For general discussion or questions, please use GitHub Discussions.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

This project is licensed under the MIT License.