mirror of https://github.com/saymrwulf/pytorch.git synced 2026-05-15 21:00:47 +00:00

ZhiweiYan-96 9875a834e4 [Intel GPU] oneDNN GPU GEMM support (#117202)	2024-04-17 23:06:38 +00:00

# Motivation

This PR is part of RFC #114848 and is a successor to #116249 and #116019. It depends on the oneDNN compilation in #116249, and some of the runtime support it needs is in #116019.

ATen operators such as `addmm` and `baddmm` are defined in `Blas.cpp` under `aten/src/ATen/native/mkldnn/xpu/`. Alongside the files that provide the core functionality, `BlasImpl.h`, `Utils.h`, and other files provide basic utilities for them. For instance, `Utils.h` provides common memory descriptor query utilities for `Matmul.h`; these utility functions will also be used by other primitives, such as `convolution`. `BlasImpl.h` is a header file that provides helpers for processing shape information in matmul-related operators. It helps not only basic GEMM operators such as `addmm` and `baddmm` but also the fusion operators used by `torch.compile`, such as `linear_pointwise` in #117824.

In the next stage, we will continue to complete the oneDNN support by enabling `matmul fusion` and `convolution` related code.

Co-authored-by: xiaolil1 <xiaoli.liu@intel.com>
Co-authored-by: lei,zhenyuan <zhenyuan.lei@intel.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117202
Approved by: https://github.com/EikanWang, https://github.com/jgong5, https://github.com/malfet
ghstack dependencies: #117098, #117112
contrib	[TorchGen] Use std::optional in generated code (#121454)	2024-03-29 14:11:09 +00:00
core
cuda_rtc
db
distributed
experiments	[codemod] Remove unused variables in caffe2/caffe2/experiments/operators/tt_pad_op.h (#120177)	2024-03-19 23:36:52 +00:00
ideep	[codemod] Fix some namespace issues in caffe2 (#121847)	2024-04-01 17:45:16 +00:00
image
mobile
mpi
observers
onnx
operators	[codemod] Remove unused variables in caffe2/caffe2/operators/softmax_op_cudnn.cc (#121995)	2024-03-19 22:35:58 +00:00
opt	[codemod] Fix some namespace issues in caffe2 (#121847)	2024-04-01 17:45:16 +00:00
perfkernels
predictor
proto
python	Move doc links to point to main (#121823)	2024-03-15 19:49:37 +00:00
quantization
queue
serialize
sgd
share
test
transforms
utils
video	[codemod] Remove unused variables in caffe2/caffe2/video/video_decoder.cc (#122151)	2024-03-19 22:34:17 +00:00
.clang-format
__init__.py
BUILD_MODE.bzl
CMakeLists.txt	[Intel GPU] oneDNN GPU GEMM support (#117202)	2024-04-17 23:06:38 +00:00
README.md
release-notes.md
requirements.txt
unexported_symbols.lds
VERSION_NUMBER
version_script.lds

README.md

Caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework. Building on the original Caffe, Caffe2 is designed with expression, speed, and modularity in mind.

Questions and Feedback

Please use GitHub issues (https://github.com/pytorch/pytorch/issues) to ask questions, report bugs, and request new features.

Further Resources on Caffe2.ai