pytorch/test/quantization
Zafar Takhirov b8584b884e [quant] Quantizable MultiheadAttention (#49866)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49866

- Adds the `torch.nn.quantizable.MultiheadAttention` module

The quantizable version serves as a fully equivalent, drop-in replacement for the `torch.nn.MultiheadAttention` module.
The main difference is that it allows the linear units to be observed after the `prepare` step in the quantization flow.
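
A minimal sketch of the intended eager-mode flow (assumed usage, not taken from this commit): the custom-module mapping swaps the float module for the observable one at `prepare` time, so the internal linear units pick up observers. The `prepare_custom_config_dict` wiring shown here is an assumption; the test plan below exercises the real configuration.
```
import torch
import torch.quantization as tq

class Model(torch.nn.Module):
    def __init__(self, embed_dim=8, num_heads=2):
        super().__init__()
        self.mha = torch.nn.MultiheadAttention(embed_dim, num_heads)

    def forward(self, query, key, value):
        attn_output, attn_weights = self.mha(query, key, value)
        return attn_output

model = Model().eval()
model.qconfig = tq.get_default_qconfig('fbgemm')

# Assumption: map the float module to its observable counterpart so
# `prepare` attaches observers to the linear units inside it.
prepared = tq.prepare(
    model,
    prepare_custom_config_dict={
        'float_to_observed_custom_module_class': {
            torch.nn.MultiheadAttention: torch.nn.quantizable.MultiheadAttention,
        },
    },
)

# Calibrate: run representative data so the observers record ranges.
q = k = v = torch.randn(4, 1, 8)  # (seq_len, batch, embed_dim)
prepared(q, k, v)

# `convert` produces the quantized module, invoking `from_observed`
# on the observed attention (see the note below).
quantized = tq.convert(prepared)
```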

Note: The `from_observed` method (called during `convert`) removes the `bias_k` and `bias_v` parameters and re-registers them as plain attributes.
This avoids the error raised when assigning a quantized tensor to a `torch.nn.Parameter`.

Test Plan:
```
python test/test_quantization.py TestQuantizedOps.test_custom_module_multi_head_attention
```

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D25706179

fbshipit-source-id: e27ab641d8d1eccc64cf9e44343459331f89eea4
2021-02-17 12:36:30 -08:00
serialized
__init__.py
test_backward_compatibility.py
test_bias_correction.py Numeric Suite: Swap with shadow modules only for quantized part of model (#51052) 2021-02-04 11:40:30 -08:00
test_equalize.py
test_fusion_passes.py
test_numeric_suite.py Numeric Suite: Swap with shadow modules only for quantized part of model (#51052) 2021-02-04 11:40:30 -08:00
test_numeric_suite_fx.py reland - ns for fx - stubs of the three APIs (compare weights, activations, activations with shadow) (#52302) 2021-02-16 19:59:32 -08:00
test_qat_module.py [reland][quant][fix] Add bias once in conv_fused (#48593) (#48661) 2020-12-02 10:17:43 -08:00
test_quantize.py quantization: Linear + BatchNorm1d fusion (#50748) 2021-01-20 12:59:02 -08:00
test_quantize_fx.py [quant][graphmode][fx] Enable inception_v3 and googlenet static quant test (#51402) 2021-02-03 14:32:00 -08:00
test_quantize_jit.py split quantization jit op (#51329) 2021-01-29 07:49:53 -08:00
test_quantized_functional.py [reland][quant] Remove nn.quantized.ReLU module and nn.quantized.functional.relu (#47415) (#48038) 2020-11-17 09:52:21 -08:00
test_quantized_module.py [quant] Add reflection padding to conv (#49011) 2021-02-03 21:44:12 -08:00
test_quantized_op.py [quant] Quantizable MultiheadAttention (#49866) 2021-02-17 12:36:30 -08:00
test_quantized_tensor.py
test_workflow_module.py fake_quant cachemask: remove Python bindings (#51878) 2021-02-09 23:27:53 -08:00