pytorch/caffe2
Jongsoo Park 822c8ee143 use acc16 only when n>128 and k>128 in Skylake (#18672)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18672

On Skylake, acc16 is slower when n < 128 or k < 128, so it is used only when both n and k exceed 128.

Reviewed By: jianyuh

Differential Revision: D14700576

fbshipit-source-id: 80ca9f1af4626637eed9c5ca49f95ae744811189
2019-04-01 08:52:28 -07:00
contrib Resubmit PR-18512: Improved onnx export for 3 onnx ops (#18571) 2019-03-28 18:12:49 -07:00
core Adding quantized tensor shape/type info support for caffe2=>glow in caffe2 side (#18621) 2019-03-31 17:42:27 -07:00
cuda_rtc Change ConvPoolOp<Context>::SetOutputSize to ConvPoolOp<Context>::GetOutputSize (#17764) 2019-03-07 18:38:53 -08:00
db
distributed Manual hipify caffe2/distributed and rocm update (no hcc modules support) (#18088) 2019-03-29 11:07:32 -07:00
experiments
ideep Move ideep singleton registration to ATen from C2. (#18335) 2019-04-01 08:00:33 -07:00
image Open registration for c10 thread pool (#17788) 2019-03-08 15:38:41 -08:00
mobile Remove ComputeLibrary submodule 2019-03-16 09:06:42 -07:00
mpi
observers Remove GPU dependency from ProfileObserver (#17592) 2019-03-04 10:00:46 -08:00
onnx Resubmit PR-18512: Improved onnx export for 3 onnx ops (#18571) 2019-03-28 18:12:49 -07:00
operators Adding quantized tensor shape/type info support for caffe2=>glow in caffe2 side (#18621) 2019-03-31 17:42:27 -07:00
opt Adding quantized tensor shape/type info support for caffe2=>glow in caffe2 side (#18621) 2019-03-31 17:42:27 -07:00
perfkernels Move math::Axpy function to elementwise lib (#18316) 2019-03-26 12:19:19 -07:00
predictor add command line option to use hive filler; add README (#17619) 2019-03-01 13:56:15 -08:00
proto Add qtensors in caffe2 protobuf argument (#18486) 2019-03-27 11:16:40 -07:00
python support pre-convert filter format for mkldnn training mode and change 'OptimizeForIdeep' to 'OptimizeForMkldnn' (#15171) 2019-03-29 19:00:48 -07:00
quantization use acc16 only when n>128 and k>128 in Skylake (#18672) 2019-04-01 08:52:28 -07:00
queue Tensor construction codemod(raw_mutable_data) (#16373) 2019-03-29 18:36:46 -07:00
serialize
sgd Optimize MomentumSGDUpdate maximum block size and make it templated 2019-03-22 09:54:25 -07:00
share Change ConvPoolOp<Context>::SetOutputSize to ConvPoolOp<Context>::GetOutputSize (#17764) 2019-03-07 18:38:53 -08:00
test
transforms fix -Wsign-compare warnings for some files inside c2 (#18123) 2019-03-19 10:39:20 -07:00
utils Move math::Axpy function to elementwise lib (#18316) 2019-03-26 12:19:19 -07:00
video Open registration for c10 thread pool (#17788) 2019-03-08 15:38:41 -08:00
.clang-format
__init__.py
CMakeLists.txt Remove nomscheduler (#17693) 2019-03-06 10:48:13 -08:00
README.md
release-notes.md
requirements.txt
VERSION_NUMBER

Caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework. Building on the original Caffe, Caffe2 is designed with expression, speed, and modularity in mind.

Questions and Feedback

Please use GitHub issues (https://github.com/pytorch/pytorch/issues) to ask questions, report bugs, and request new features.

Further Resources on Caffe2.ai