pytorch/caffe2
ZhiweiYan-96 9875a834e4 [Intel GPU] oneDNN GPU GEMM support (#117202)
# Motivation

This PR is part of RFC #114848 and is a successor to #116249 and #116019. It depends on the oneDNN compilation added in #116249 and needs some of the runtime support from #116019.

ATen operators such as `addmm` and `baddmm` are defined in `Blas.cpp` under `aten/src/ATen/native/mkldnn/xpu/`.

Alongside the files that provide the core functionality, `BlasImpl.h`, `Utils.h`, and other files provide basic utilities. For instance, `Utils.h` provides common memory descriptor query utilities for `Matmul.h`; these utility functions will also be used by other primitives, such as `convolution`. `BlasImpl.h` is a header that provides helpers for shape info processing in matmul-related operators. It serves not only basic GEMM operators such as `addmm` and `baddmm` but also the fusion operators used by `torch.compile`, such as `linear_pointwise` in #117824.
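
For readers unfamiliar with the GEMM operators named above, the following is a minimal pure-Python sketch of the arithmetic that `addmm` computes, `beta * input + alpha * (mat1 @ mat2)`, together with the kind of shape checking such an operator needs. It is not the code in `Blas.cpp` or `BlasImpl.h`; the function name `addmm_reference` and its nested-list representation are assumptions made for illustration, and unlike the real operator it does not broadcast `input`.

```python
def addmm_reference(inp, mat1, mat2, beta=1.0, alpha=1.0):
    """Illustrative reference for addmm on nested lists (hypothetical helper).

    Computes beta * inp + alpha * (mat1 @ mat2), where mat1 is n x k,
    mat2 is k x m, and inp is n x m. No broadcasting is attempted.
    """
    n, k = len(mat1), len(mat1[0])
    # The inner dimensions of the two matrices must agree.
    if len(mat2) != k:
        raise ValueError("mat1 columns must equal mat2 rows")
    m = len(mat2[0])
    # This sketch requires inp to already have the output shape n x m.
    if len(inp) != n or any(len(row) != m for row in inp):
        raise ValueError("inp must have shape n x m")
    return [
        [
            beta * inp[i][j]
            + alpha * sum(mat1[i][p] * mat2[p][j] for p in range(k))
            for j in range(m)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    bias = [[1, 1], [1, 1]]
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(addmm_reference(bias, a, b))
```

The batched variant applies the same formula independently to each matrix in a batch.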

In the next stage, we will continue to complete the oneDNN support by enabling `matmul fusion` and `convolution`-related code.

Co-authored-by: xiaolil1 <xiaoli.liu@intel.com>
Co-authored-by: lei,zhenyuan <zhenyuan.lei@intel.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117202
Approved by: https://github.com/EikanWang, https://github.com/jgong5, https://github.com/malfet
ghstack dependencies: #117098, #117112
2024-04-17 23:06:38 +00:00
contrib [TorchGen] Use std::optional in generated code (#121454) 2024-03-29 14:11:09 +00:00
core
cuda_rtc
db
distributed
experiments [codemod] Remove unused variables in caffe2/caffe2/experiments/operators/tt_pad_op.h (#120177) 2024-03-19 23:36:52 +00:00
ideep [codemod] Fix some namespace issues in caffe2 (#121847) 2024-04-01 17:45:16 +00:00
image
mobile
mpi
observers
onnx
operators [codemod] Remove unused variables in caffe2/caffe2/operators/softmax_op_cudnn.cc (#121995) 2024-03-19 22:35:58 +00:00
opt [codemod] Fix some namespace issues in caffe2 (#121847) 2024-04-01 17:45:16 +00:00
perfkernels
predictor
proto
python Move doc links to point to main (#121823) 2024-03-15 19:49:37 +00:00
quantization
queue
serialize
sgd
share
test
transforms
utils
video [codemod] Remove unused variables in caffe2/caffe2/video/video_decoder.cc (#122151) 2024-03-19 22:34:17 +00:00
.clang-format
__init__.py
BUILD_MODE.bzl
CMakeLists.txt [Intel GPU] oneDNN GPU GEMM support (#117202) 2024-04-17 23:06:38 +00:00
README.md
release-notes.md
requirements.txt
unexported_symbols.lds
VERSION_NUMBER
version_script.lds

Caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework. Building on the original Caffe, Caffe2 is designed with expression, speed, and modularity in mind.

Questions and Feedback

Please use GitHub issues (https://github.com/pytorch/pytorch/issues) to ask questions, report bugs, and request new features.

Further Resources on Caffe2.ai