onnxruntime/onnxruntime
Changming Sun 660f39aca5
Perf improvement for Intel MTL CPUs (#19524)
### Description
See the comments in the changed files for more detailed information.

The files onnxruntime/core/platform/windows/hardware_core_enumerator.cc
and onnxruntime/core/platform/windows/hardware_core_enumerator.h were
copied from the WinML source folder in this repo, with minor coding style
changes.

I had an offline discussion with Sheil. We agreed that, given the lack of
a future-proof solution, we can check in this temporary fix first and
rework it later. I will meet with @ivberg to discuss the issue in depth
and look for a long-term solution. Thanks for offering help, @ivberg!

### Motivation and Context
With this change, we see about a 2x perf improvement on some Intel
CPUs.
2024-02-14 18:35:56 -08:00
| Name | Last commit | Date |
| --- | --- | --- |
| .. | | |
| contrib_ops | SimplifiedLayerNormalization Fusion BFloat16 support for Llama-v2 on A100 (#18898) | 2024-02-14 10:05:16 -08:00 |
| core | Perf improvement for Intel MTL CPUs (#19524) | 2024-02-14 18:35:56 -08:00 |
| python | Phi2 script fixes (#19500) | 2024-02-14 10:08:11 -08:00 |
| test | Add BF16 to Sqrt (#19363) | 2024-02-14 18:07:51 -08:00 |
| tool/etw | | |
| wasm | [js/webgpu] Support capture and replay for jsep (#18989) | 2024-01-30 18:28:03 -08:00 |
| __init__.py | [ORT 1.17.0 release] Bump up version to 1.18.0 (#19170) | 2024-01-17 11:18:32 -08:00 |
| ReformatSource.ps1 | | |
| ReformatSourcePython.bat | | |
| VSCodeCoverage.runsettings | | |