This folder contains various custom CMake modules for finding libraries and packages. Details about some of them are listed below.

### `FindOpenMP.cmake`

This is modified from the file included in the CMake 3.13 release, with the following changes:

- Replace `VERSION_GREATER_EQUAL` with `NOT ... VERSION_LESS`, as `VERSION_GREATER_EQUAL` is not supported in CMake 3.5 (our min supported version).

- Update the `separate_arguments` commands to not use `NATIVE_COMMAND`, which is not supported in CMake 3.5 (our min supported version).

- Make it respect the `QUIET` flag so that, when it is set, `try_compile` failures are not reported.

- For `AppleClang` compilers, use `-Xpreprocessor` instead of `-Xclang`, as the latter is not documented.

- For `AppleClang` compilers, an extra flag option is tried, which is `-Xpreprocessor -openmp -I${DIR_OF_omp_h}`, where `${DIR_OF_omp_h}` is obtained using `find_path` on `omp.h` with `brew`'s default include directory as a hint. Without this, the compiler will complain about missing headers, as they are not natively included in Apple's LLVM.

- For non-GNU compilers, whenever we try a candidate OpenMP flag, first try it with directly linking MKL's `libomp` if it has one. Otherwise, we may end up linking two `libomp`s and end up with this nasty error:

  > OMP: Error #15: Initializing libomp.dylib, but found libiomp5.dylib already initialized.
  >
  > OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://openmp.llvm.org/

  See NOTE [ Linking both MKL and OpenMP ] for details.
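
Several of the changes above can be illustrated with a small standalone script. This is a rough sketch, not the code from `FindOpenMP.cmake` itself: the `DEMO_*` variables and the `demo_report` helper are hypothetical names, `/usr/local/include` is only an assumed Homebrew include directory, and `UNIX_COMMAND` is just one possible stand-in for `NATIVE_COMMAND`. The flag spelling is copied from the list above.

```cmake
cmake_minimum_required(VERSION 3.5)

# VERSION_GREATER_EQUAL was added after CMake 3.5, so "a >= b" is
# written as "NOT a < b" instead.
if(NOT CMAKE_VERSION VERSION_LESS "3.5")
  message(STATUS "CMake ${CMAKE_VERSION} is at least 3.5")
endif()

# Split a flag string into a list without NATIVE_COMMAND.
# UNIX_COMMAND is an assumed replacement, chosen for illustration.
set(DEMO_FLAGS "-Xpreprocessor -openmp")
separate_arguments(DEMO_FLAG_LIST UNIX_COMMAND "${DEMO_FLAGS}")

# Hypothetical helper: stay silent when find_package(OpenMP QUIET)
# was used, which sets OpenMP_FIND_QUIETLY.
function(demo_report msg)
  if(NOT OpenMP_FIND_QUIETLY)
    message(STATUS "${msg}")
  endif()
endfunction()

# Look for omp.h, hinting at an assumed Homebrew include directory.
find_path(DEMO_OMP_INCLUDE_DIR
  NAMES omp.h
  HINTS /usr/local/include)

if(DEMO_OMP_INCLUDE_DIR)
  # Build the extra AppleClang candidate described in the list above.
  set(DEMO_OMP_FLAG "-Xpreprocessor -openmp -I${DEMO_OMP_INCLUDE_DIR}")
  demo_report("Candidate flag: ${DEMO_OMP_FLAG}")
else()
  demo_report("omp.h not found; skipping the extra AppleClang candidate")
endif()
```

Saved as, say, `demo.cmake`, the sketch can be run in script mode with `cmake -P demo.cmake`. It only prints what it would try; it does not compile anything or link against MKL.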