pytorch/torch
Charles David Hernandez 6e1fc5cef8 [quant] added dq->op->q quantization patterns for GELU and softmax ops (#56004)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56004

Added reference pattern support for GELU, softmax, and bmm for int dtypes. For GELU and softmax, this consisted of adding reference patterns to the default node handler for int dtypes. Note that the GELU and softmax patterns are not registered by default, since these ops do not have a proper quantized kernel; registering them would either add unnecessary dequant and quant ops to the network, or simply error. This can be circumvented with custom qconfig usage, as in test_gelu_reference.
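The dq->op->q reference pattern described above can be illustrated with a minimal pure-Python sketch (this is NOT the FX graph mode implementation; the quantize/dequantize/gelu helpers and all scale/zero-point values below are illustrative, simulating int8 affine quantization):

```python
import math

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    # affine quantization to a simulated int8 value, clamped to [qmin, qmax]
    return max(qmin, min(qmax, round(x / scale) + zero_point))

def dequantize(q, scale, zero_point):
    # affine dequantization back to float
    return (q - zero_point) * scale

def gelu(x):
    # exact GELU: 0.5 * x * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def reference_quantized_gelu(q_in, in_scale, in_zp, out_scale, out_zp):
    # dq -> op -> q: dequantize the int input, run the float kernel,
    # then requantize the float output to the output scale/zero-point
    return quantize(gelu(dequantize(q_in, in_scale, in_zp)), out_scale, out_zp)
```

Because the op runs in float between the dequant and quant nodes, no quantized GELU kernel is needed; the cost is the extra dequant/quant pair, which is why the pattern is only worthwhile when explicitly requested via a custom qconfig.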

bmm was added within the binary ops handler, along with some significant changes to how that code is structured. In theory, the reference pattern used for bmm could be applied to other dtypes as well. This was not enabled because of issues relating to line 1323 in quantize.py. In essence, the prepare step does not know whether an op will use a reference pattern or not, so for ops that are supported with one dtype via a reference pattern and with another dtype normally, this has the potential to cause issues. This is difficult to get around without the is_reference flag being available in the prepare step, or the discussed changes around separating
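The same dq->op->q idea applied to a binary op like bmm can be sketched in pure Python (again a conceptual illustration, not the actual binary-op handler; the matrix helpers and scale/zero-point values are made up for the example):

```python
def dequantize_mat(qm, scale, zero_point):
    # elementwise affine dequantization of an integer matrix
    return [[(q - zero_point) * scale for q in row] for row in qm]

def quantize_mat(m, scale, zero_point, qmin=-128, qmax=127):
    # elementwise affine quantization, clamped to the simulated int8 range
    return [[max(qmin, min(qmax, round(x / scale) + zero_point)) for x in row]
            for row in m]

def matmul(a, b):
    # plain float matrix multiply
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def reference_quantized_mm(qa, qb, a_scale, a_zp, b_scale, b_zp,
                           out_scale, out_zp):
    # dq -> op -> q: dequantize both int inputs, run the float matmul,
    # then requantize the float result to the output scale/zero-point
    a = dequantize_mat(qa, a_scale, a_zp)
    b = dequantize_mat(qb, b_scale, b_zp)
    return quantize_mat(matmul(a, b), out_scale, out_zp)
```

Note that both inputs carry their own scale/zero-point, which is the crux of the prepare-step problem described above: whether an input should be observed (and hence quantized) depends on whether the op will later be lowered via a reference pattern, and the prepare step cannot see that decision.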

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_gelu_reference
python test/test_quantization.py TestQuantizeFxOps.test_gelu_normal
python test/test_quantization.py TestQuantizeFxOps.test_softmax_reference
python test/test_quantization.py TestQuantizeFxOps.test_softmax_normal
python test/test_quantization.py TestQuantizeFxOps.test_silu_reference
python test/test_quantization.py TestQuantizeFxOps.test_bmm_int_reference
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestFuseFx
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxModels

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D27818340

fbshipit-source-id: de65be0797035463cd2d1b0e4677d1a87f69143c
2021-04-20 13:26:15 -07:00
_C Fix distributed.test_jit_c10d flaky tests (#56410) 2021-04-20 09:28:27 -07:00
autograd Fix typo in gradcheck.py (#56368) 2021-04-19 15:53:02 -07:00
backends
contrib
csrc [TensorPipe] Use targetDevice in tensorpipe_agent. (#56346) 2021-04-20 11:54:13 -07:00
cuda
distributed Pytorch resolve bug around incorrect rdzv handler resolution (#56386) 2021-04-19 23:50:28 -07:00
distributions
fft
for_onnx
futures
fx Add remaining ToCs to ToC lint (#56487) 2021-04-20 10:28:47 -07:00
jit Add lint for unqualified noqa (#56272) 2021-04-19 13:16:18 -07:00
legacy
lib unified GlooStore and c10d store API (#56222) 2021-04-19 10:57:18 -07:00
linalg
multiprocessing
nn [SPMD] Remove _specify_ddp_gpu_num method (#56425) 2021-04-20 11:17:47 -07:00
onnx
optim [optim] take kw-only argument for functional optim APIs (#56185) 2021-04-15 20:08:04 -07:00
package [package] make GlobGroup a public concept (#56238) 2021-04-16 13:31:48 -07:00
profiler
quantization [quant] added dq->op->q quantization patterns for GELU and softmax ops (#56004) 2021-04-20 13:26:15 -07:00
sparse
special [special] Add i0e (#54409) 2021-04-15 06:06:11 -07:00
testing Separate profiling tests from p2p tests (#56412) 2021-04-20 10:42:00 -07:00
utils [Pytorch] Better error message for bundling inputs a second time (#56086) 2021-04-20 12:28:27 -07:00
__config__.py
__future__.py
__init__.py Update use_deterministic_algorithms docs (#55413) 2021-04-15 04:04:27 -07:00
_appdirs.py
_autograd_functions.py
_classes.py
_deploy.py
_jit_internal.py Additional annotations in fbcode/caffe2/torch/_jit_internal.py (#55855) 2021-04-15 09:47:17 -07:00
_linalg_utils.py
_lobpcg.py
_lowrank.py
_namedtensor_internals.py
_ops.py
_python_dispatcher.py
_six.py
_storage_docs.py
_tensor.py
_tensor_docs.py
_tensor_str.py
_torch_docs.py Fix TestTypeHints.test_doc_examples (#56388) 2021-04-19 15:27:09 -07:00
_utils.py
_utils_internal.py
_VF.py
_vmap_internals.py
abi-check.cpp
CMakeLists.txt Add minidump collection via breakpad (#55647) 2021-04-16 13:05:01 -07:00
custom_class.h
custom_class_detail.h
deploy.h
extension.h
functional.py Add lint for unqualified noqa (#56272) 2021-04-19 13:16:18 -07:00
hub.py Fix torch.hub.load("pytorch/vision") fails to validate the master branch (#56138) 2021-04-20 09:33:25 -07:00
library.h
overrides.py Python API for Vitals (#53238) 2021-04-15 16:06:43 -07:00
py.typed
quasirandom.py
random.py
README.txt
script.h
serialization.py
storage.py
types.py

Note [TH abstraction violation]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TH/THC provide some hpp headers, which are proper C++ headers rather than
C headers.  These headers serve double duty as *internal implementation
detail* headers, whose contents should largely not be used by external
clients.

Ideally, we would not install these headers at all; instead, you should
use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`)
to manipulate these structs.  However, there are a few places
in torch/csrc where we violate this abstraction.  They are marked with
a pointer to this note.  Each of those sites will have to be refactored
when we refactor the guts of THTensor and related structures.