Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49866

Adds `torch.nn.quantizable.MultiheadAttention`. The quantizable version can serve as a fully equivalent replacement for the `torch.nn.MultiheadAttention` module. The main difference is that it allows the internal linear units to be observed after the `prepare` step in the quantization flow.

Note: The `from_observed` method (called during `convert`) removes the `bias_k` and `bias_v` parameters and re-attaches them as plain attributes. This avoids the error raised when assigning a quantized tensor to a `torch.nn.Parameter`.

(Note: this ignores all push blocking failures!)

Test Plan:
```
python test/test_quantization.py TestQuantizedOps.test_custom_module_multi_head_attention
```

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D25706179

fbshipit-source-id: e27ab641d8d1eccc64cf9e44343459331f89eea4
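To make the flow described above concrete, here is a minimal sketch of the eager-mode custom-module quantization path, modeled loosely on the test plan. The wrapper class, tensor shapes, and the `fbgemm` qconfig are illustrative assumptions, not part of the PR; the convert-side mapping assumes `from_observed` lives on the quantizable class itself, as the summary describes (later PyTorch versions may route this through a separate quantized class).

```python
import torch
import torch.nn as nn
import torch.nn.quantizable as nnqa
import torch.quantization as tq

embed_dim, num_heads = 64, 8  # illustrative sizes, not from the PR


class AttentionModel(nn.Module):
    """Toy wrapper so `prepare` can swap the child MHA module."""

    def __init__(self):
        super().__init__()
        # add_bias_kv=True so the module carries the `bias_k`/`bias_v`
        # parameters that `from_observed` later re-attaches as attributes.
        self.mha = nn.MultiheadAttention(embed_dim, num_heads, add_bias_kv=True)

    def forward(self, query, key, value):
        attn_output, attn_weights = self.mha(query, key, value)
        return attn_output


model = AttentionModel().eval()
model.qconfig = tq.get_default_qconfig("fbgemm")

# `prepare` swaps nn.MultiheadAttention for the observed quantizable
# version and inserts observers on its internal linear units.
prepare_cfg = {
    "float_to_observed_custom_module_class": {
        nn.MultiheadAttention: nnqa.MultiheadAttention,
    },
}
observed = tq.prepare(model, prepare_custom_config_dict=prepare_cfg)

# Calibrate with representative (seq_len, batch, embed_dim) inputs.
q = k = v = torch.randn(5, 1, embed_dim)
observed(q, k, v)

# `convert` triggers `from_observed`, which quantizes the linear units
# and, per the note above, demotes `bias_k`/`bias_v` from Parameters
# to plain tensor attributes.
convert_cfg = {
    "observed_to_quantized_custom_module_class": {
        nnqa.MultiheadAttention: nnqa.MultiheadAttention,
    },
}
quantized = tq.convert(observed, convert_custom_config_dict=convert_cfg)
print(type(quantized.mha))
```

The custom-module config keeps the attention module a leaf during `prepare`, so its internal linears are observed as a unit rather than being traced into individually.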
| File |
|---|
| serialized/ |
| __init__.py |
| test_backward_compatibility.py |
| test_bias_correction.py |
| test_equalize.py |
| test_fusion_passes.py |
| test_numeric_suite.py |
| test_numeric_suite_fx.py |
| test_qat_module.py |
| test_quantize.py |
| test_quantize_fx.py |
| test_quantize_jit.py |
| test_quantized_functional.py |
| test_quantized_module.py |
| test_quantized_op.py |
| test_quantized_tensor.py |
| test_workflow_module.py |