mirror of
https://github.com/saymrwulf/pytorch.git
synced 2026-05-14 20:57:59 +00:00
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56004 added reference pattern support for GELU, softmax and bmm for int dtypes. For GELU and Softmax, this consisted of adding reference patterns to the default node handler for int dtypes. Note GELU and softmax patterns are not registered since they do not have a proper quantized kernel which means they would either add unnecessary dequant and quant ops to the network, or they would simply error. This can be circumvented with custom qconfig usage as in test_gelu_reference bmm was added within binary ops along with some significant changes to how that code is structured. Theoretically the reference pattern used for bmm could be applied to other dtypes. This was not enabled because of issues relating to Line 1323 in quantize.py. In essence, the prepare step does not know whether an op will use a reference pattern or not, so for ops that are supported with one dtype in reference and one dtype normally, this has the potential to cause issues. This is difficult to get aorund with the is_reference flag being available in the prepare step or discussed changes around separating Test Plan: python test/test_quantization.py TestQuantizeFxOps.test_gelu_reference python test/test_quantization.py TestQuantizeFxOps.ttest_gelu_normal python test/test_quantization.py TestQuantizeFxOps.test_softmax_reference python test/test_quantization.py TestQuantizeFxOps.test_softmax_normal python test/test_quantization.py TestQuantizeFxOps.test_silu_reference python test/test_quantization.py TestQuantizeFxOps.test_bmm_int_reference python test/test_quantization.py TestQuantizeFxOps python test/test_quantization.py TestFuseFx python test/test_quantization.py TestQuantizeFx python test/test_quantization.py TestQuantizeFxModels Imported from OSS Reviewed By: raghuramank100 Differential Revision: D27818340 fbshipit-source-id: de65be0797035463cd2d1b0e4677d1a87f69143c |
||
|---|---|---|
| .. | ||
| _C | ||
| autograd | ||
| backends | ||
| contrib | ||
| csrc | ||
| cuda | ||
| distributed | ||
| distributions | ||
| fft | ||
| for_onnx | ||
| futures | ||
| fx | ||
| jit | ||
| legacy | ||
| lib | ||
| linalg | ||
| multiprocessing | ||
| nn | ||
| onnx | ||
| optim | ||
| package | ||
| profiler | ||
| quantization | ||
| sparse | ||
| special | ||
| testing | ||
| utils | ||
| __config__.py | ||
| __future__.py | ||
| __init__.py | ||
| _appdirs.py | ||
| _autograd_functions.py | ||
| _classes.py | ||
| _deploy.py | ||
| _jit_internal.py | ||
| _linalg_utils.py | ||
| _lobpcg.py | ||
| _lowrank.py | ||
| _namedtensor_internals.py | ||
| _ops.py | ||
| _python_dispatcher.py | ||
| _six.py | ||
| _storage_docs.py | ||
| _tensor.py | ||
| _tensor_docs.py | ||
| _tensor_str.py | ||
| _torch_docs.py | ||
| _utils.py | ||
| _utils_internal.py | ||
| _VF.py | ||
| _vmap_internals.py | ||
| abi-check.cpp | ||
| CMakeLists.txt | ||
| custom_class.h | ||
| custom_class_detail.h | ||
| deploy.h | ||
| extension.h | ||
| functional.py | ||
| hub.py | ||
| library.h | ||
| overrides.py | ||
| py.typed | ||
| quasirandom.py | ||
| random.py | ||
| README.txt | ||
| script.h | ||
| serialization.py | ||
| storage.py | ||
| types.py | ||
Note [TH abstraction violation] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ TH/THC provide some hpp headers, which are proper C++ headers rather than C headers. These headers serve double duty as *internal implementation detail* headers, whose contents should largely not be used by external clients. Ideally, we would not install these headers at all; instead, you should use public functions (in headers like `THTensor.h`, NOT `THTensor.hpp`) to manipulate these structs. However, there are a few places in torch/csrc where we violate this abstraction. They are marked with a pointer to this note. Each of those sites will have to be refactored when we refactor the guts of THTensor and related structures.