pytorch/test/quantization
Zafar Takhirov b8584b884e [quant] Quantizable MultiheadAttention (#49866)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49866

- Adds the `torch.nn.quantizable.MultiheadAttention` module

The quantizable version serves as a fully equivalent, drop-in replacement for the `torch.nn.MultiheadAttention` module.
The main difference is that it allows the linear units to be observed after the `prepare` step in the quantization flow.
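
A minimal sketch of the intended eager-mode flow (assumed usage, not taken from this commit): the custom-module mapping swaps the float module for the observable one at `prepare` time, so the internal linear units pick up observers. The `prepare_custom_config_dict` wiring shown here is an assumption; the test plan below exercises the real configuration.
```
import torch
import torch.quantization as tq

class Model(torch.nn.Module):
    def __init__(self, embed_dim=8, num_heads=2):
        super().__init__()
        self.mha = torch.nn.MultiheadAttention(embed_dim, num_heads)

    def forward(self, query, key, value):
        attn_output, attn_weights = self.mha(query, key, value)
        return attn_output

model = Model().eval()
model.qconfig = tq.get_default_qconfig('fbgemm')

# Assumption: map the float module to its observable counterpart so
# `prepare` attaches observers to the linear units inside it.
prepared = tq.prepare(
    model,
    prepare_custom_config_dict={
        'float_to_observed_custom_module_class': {
            torch.nn.MultiheadAttention: torch.nn.quantizable.MultiheadAttention,
        },
    },
)

# Calibrate: run representative data so the observers record ranges.
q = k = v = torch.randn(4, 1, 8)  # (seq_len, batch, embed_dim)
prepared(q, k, v)

# `convert` produces the quantized module, invoking `from_observed`
# on the observed attention (see the note below).
quantized = tq.convert(prepared)
```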

Note: The `from_observed` method (called during `convert`) removes the `bias_k` and `bias_v` parameters and re-registers them as plain attributes.
This avoids the error raised when assigning a quantized tensor to a `torch.nn.Parameter`.

Test Plan:
```
python test/test_quantization.py TestQuantizedOps.test_custom_module_multi_head_attention
```

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D25706179

fbshipit-source-id: e27ab641d8d1eccc64cf9e44343459331f89eea4
2021-02-17 12:36:30 -08:00
serialized
__init__.py
test_backward_compatibility.py
test_bias_correction.py Numeric Suite: Swap with shadow modules only for quantized part of model (#51052) 2021-02-04 11:40:30 -08:00
test_equalize.py
test_fusion_passes.py
test_numeric_suite.py Numeric Suite: Swap with shadow modules only for quantized part of model (#51052) 2021-02-04 11:40:30 -08:00
test_numeric_suite_fx.py reland - ns for fx - stubs of the three APIs (compare weights, activations, activations with shadow) (#52302) 2021-02-16 19:59:32 -08:00
test_qat_module.py [reland][quant][fix] Add bias once in conv_fused (#48593) (#48661) 2020-12-02 10:17:43 -08:00
test_quantize.py quantization: Linear + BatchNorm1d fusion (#50748) 2021-01-20 12:59:02 -08:00
test_quantize_fx.py [quant][graphmode][fx] Enable inception_v3 and googlenet static quant test (#51402) 2021-02-03 14:32:00 -08:00
test_quantize_jit.py split quantization jit op (#51329) 2021-01-29 07:49:53 -08:00
test_quantized_functional.py [reland][quant] Remove nn.quantized.ReLU module and nn.quantized.functional.relu (#47415) (#48038) 2020-11-17 09:52:21 -08:00
test_quantized_module.py [quant] Add reflection padding to conv (#49011) 2021-02-03 21:44:12 -08:00
test_quantized_op.py [quant] Quantizable MultiheadAttention (#49866) 2021-02-17 12:36:30 -08:00
test_quantized_tensor.py
test_workflow_module.py fake_quant cachemask: remove Python bindings (#51878) 2021-02-09 23:27:53 -08:00