Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53166
Context: For FX modules that contain ScriptModules, calling
delattr(module, 'qconfig') throws an AttributeError. We will follow up
with a separate issue/repro to fix this problem.
This PR adds a temporary flag to the convert_fx API to preserve the qconfig attributes on the converted model.
We will remove this flag once we reach a conclusion on calling delattr on ScriptModules.
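A minimal sketch of the intended usage, assuming the flag is a keyword argument of convert_fx; the argument name used below is a guess, see the PR diff for the actual name:
```python
import torch
import torch.nn as nn
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x)

model = Net().eval()
prepared = prepare_fx(model, {"": get_default_qconfig("fbgemm")})
prepared(torch.randn(1, 4))  # calibration
# hypothetical flag name: keep the qconfig attributes on the converted model
quantized = convert_fx(prepared, _remove_qconfig=False)
```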
Test Plan:
python test/test_quantization.py test_preserve_qconfig
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26771518
fbshipit-source-id: 9fd72816576856ffb4aa11f8fde08303d1df10a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52651
Merging them for easier extensions to fp16 and more binary ops
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26600118
fbshipit-source-id: a1816e593cf3065afe87d2e6e44cdace13bf6aeb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52534
Currently linear_dynamic_fp16 has a signature that's tied to fbgemm/qnnpack.
We'll need to produce a pattern equivalent to linear_dynamic_fp16 to support extensions
to other backends.
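As a rough illustration (not the code from this PR), a backend-agnostic reference pattern for dynamic fp16 linear could emulate the weight cast explicitly instead of calling the backend-specific op:
```python
import torch
import torch.nn.functional as F

def reference_linear_dynamic_fp16(x, weight, bias=None):
    # emulate dynamic fp16 quantization of the weight, then run a float linear
    w_fp16 = weight.to(torch.float16)   # "quantize" the weight to fp16
    w_fp32 = w_fp16.to(torch.float32)   # dequantize back to fp32
    return F.linear(x, w_fp32, bias)

out = reference_linear_dynamic_fp16(torch.randn(2, 4), torch.randn(3, 4))
```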
Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_linear_dynamic_fp16
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26557726
fbshipit-source-id: 270c9f781f73c79416a092b7831294cabca84b0c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52414
When the input is not quantized, we'll still quantize cat as requested by the qconfig, even though
it might be slower
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D26503554
fbshipit-source-id: 29d7c136711a12c124791c10ae436b61c1407668
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52179
Rename `debug` to `reference`. We'll use this to produce a reference quantized model
that can serve as a common interface between PyTorch quantized models and backends.
Test Plan:
python test/test_quantization.py TestQuantizeFx
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D26424656
fbshipit-source-id: a0299b023f6ba7d98f5750724c517b0ecb987b35
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52413
TODO: We'll need to add this guard for other ops as well
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_mul_add_fp16_config
Imported from OSS
Reviewed By: supriyar
Differential Revision: D26503348
fbshipit-source-id: 5aaba518742a516cc3521fd5f23f1a264d2973e2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52412
When the input is not quantized, we'll still quantize add/mul
Test Plan: Imported from OSS
Reviewed By: supriyar
Differential Revision: D26503347
fbshipit-source-id: 457b3444c50e5b49b911b04c67684f5eead78ec9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51259
Store the FQN of the module that uses the packed weights (i.e., the quantized op).
In the case of fusion, we update the scope mapping to store the module path of the fused node.
Test Plan:
python test/test_quantization.py test_packed_weight_fused_op
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26117964
fbshipit-source-id: 9d929997baafb1c91063dd9786a451b0040ae461
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51171
Following up on the previous PR, this PR registers the scale and zero_point for quantize_per_tensor as
buffers in the module.
Currently the dtype is still stored as an attribute (not registered as a buffer), since only tensor types can be registered as buffers.
Test Plan:
python test/test_quantization.py test_qparams_buffers
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26092964
fbshipit-source-id: a54d914db7863402f2b5a3ba2c8ce8b27c18b47b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51166
Currently scale and zero_point values are stored as constants in the graph.
This prevents these values from being updated in the graph and also prevents them from being saved
to the state_dict.
After this PR we store scale/zero_point values for quantized ops as buffers in the root module
and create get_attr nodes for them in the graph.
We also use the FQN of the module where the quantized ops are present to name these attributes so
that they can be uniquely identified and mapped to quantized ops.
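A minimal sketch of the idea (the buffer names below are illustrative, not the exact names the pass generates):
```python
import torch

root = torch.nn.Module()
# qparams become buffers on the root module instead of constants baked into the graph,
# so they appear in the state_dict and can be updated after conversion
root.register_buffer("linear_input_scale_0", torch.tensor(0.05))
root.register_buffer("linear_input_zero_point_0", torch.tensor(0, dtype=torch.int64))
print("linear_input_scale_0" in root.state_dict())  # True
# in the FX graph these buffers are read via get_attr nodes, roughly:
#   %scale = get_attr[target=linear_input_scale_0]
#   %zp    = get_attr[target=linear_input_zero_point_0]
#   %xq    = call_function[target=torch.quantize_per_tensor](%x, %scale, %zp, torch.quint8)
```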
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qparams_buffers
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26092965
fbshipit-source-id: b549b2d3dccb45c5d38415ce95a09c26f5bd590b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51086
Previously we only supported getting the scope for call_module nodes and a custom qconfig for call_module.
This PR extends the Scope class to record the scope for all node types.
For the call_function qconfig, if module_name is specified it takes precedence over the function qconfig.
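An illustrative qconfig_dict (the module name "sub" is a placeholder) showing the precedence described above:
```python
import torch
from torch.quantization import default_qconfig

qconfig_dict = {
    "": default_qconfig,
    # function qconfig for all F.linear call_function nodes
    "object_type": [(torch.nn.functional.linear, default_qconfig)],
    # module_name takes precedence: F.linear calls inside submodule "sub" are skipped
    "module_name": [("sub", None)],
}
```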
Test Plan:
python test/test_quantization.py test_qconfig_for_call_func
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D26077602
fbshipit-source-id: 99cdcdedde2280e51812db300e17d4e6d8f477d2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50173
Previously we did not set the qconfig for call_method nodes correctly, since doing so requires knowing
the scope of the node (the module path of the module whose forward graph contains the node). This
PR modifies the QuantizationTracer to record the scope information and build a map from call_method
Node to module path, which is used when we construct qconfig_map.
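For illustration (the module name "sub" and the method are placeholders), a call_method node such as x.chunk executed inside submodule "sub" would now pick up the qconfig assigned to "sub":
```python
from torch.quantization import default_qconfig

qconfig_dict = {
    "": None,  # skip quantization by default
    # with scope recorded for call_method nodes, x.chunk(...) executed inside
    # the submodule named "sub" is matched to this qconfig
    "module_name": [("sub", default_qconfig)],
}
```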
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_qconfig_for_call_method
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25818132
fbshipit-source-id: ee9c5830f324d24d7cf67e5cd2bf1f6e0e46add8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50058
This PR adds support for {input/output}_quantized_idxs for standalone modules.
If input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module will expect float
input and produce float output, and will quantize the input and dequantize the output internally.
If input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module will expect quantized
input and produce quantized output; the input will be quantized in the parent module, and the output will be dequantized
in the parent module as well. This is similar to current quantized modules like nn.quantized.Conv2d.
For more details, please see the test case.
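An illustrative configuration (the module name is a placeholder) for the second case, where the parent module handles quantizing the standalone module's input and dequantizing its output:
```python
standalone_prepare_config_dict = {
    "input_quantized_idxs": [0],
    "output_quantized_idxs": [0],
}
prepare_custom_config_dict = {
    "standalone_module_name": [
        ("submodule", None, standalone_prepare_config_dict),
    ],
}
```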
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25768910
fbshipit-source-id: 96c21a3456cf192c8f1400afa4e86273ee69197b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49899
Per-channel weight observers are not supported for conv transpose yet. This adds an
error message that fails instantly instead of making the user wait until after
calibration/training finishes.
Test Plan:
```
python test/test_quantization.py TestPostTrainingStatic.test_convtranspose_per_channel_fails_early
python test/test_quantization.py TestQuantizeFx.test_convtranspose_per_channel_fails_early
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25717151
fbshipit-source-id: 093e5979030ec185e3e0d56c45d7ce7338bf94b6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49740
1. Separates the module and functional linear test cases.
2. Combines the test case which tests for linear bias observation into
the main linear test case, as requested in
https://github.com/pytorch/pytorch/pull/49628.
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_linear_module
python test/test_quantization.py TestQuantizeFxOps.test_linear_functional
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25681272
fbshipit-source-id: 0ed0ebd5afb8cdb938b530f7dbfbd79798eb9318
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49717
Quantization of `ConvTranspose{n}d` is supported in Eager mode. This PR
adds the support for FX graph mode.
Note: this currently only works with `qnnpack`, because per-channel weights
are not supported by quantized conv transpose. In a future PR we should throw
an error when someone tries to quantize a ConvTranspose model with per-channel
weight observers until this is fixed.
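An illustrative end-to-end sketch under the qnnpack backend (the model and shapes are arbitrary), which uses per-tensor weight quantization and therefore works with quantized ConvTranspose:
```python
import torch
import torch.nn as nn
from torch.quantization import get_default_qconfig
from torch.quantization.quantize_fx import prepare_fx, convert_fx

torch.backends.quantized.engine = "qnnpack"
m = nn.Sequential(nn.ConvTranspose2d(4, 4, 3)).eval()
prepared = prepare_fx(m, {"": get_default_qconfig("qnnpack")})
prepared(torch.randn(1, 4, 8, 8))  # calibration
quantized = convert_fx(prepared)
```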
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_conv_transpose_1d
python test/test_quantization.py TestQuantizeFxOps.test_conv_transpose_2d
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25674636
fbshipit-source-id: b6948156123ed55db77e6337bea10db956215ae6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49754
This PR adds support for {input/output}_quantized_idxs for standalone modules.
If input_quantized_idxs = [] and output_quantized_idxs = [], the standalone module will expect float
input and produce float output, and will quantize the input and dequantize the output internally.
If input_quantized_idxs = [0] and output_quantized_idxs = [0], the standalone module will expect quantized
input and produce quantized output; the input will be quantized in the parent module, and the output will be dequantized
in the parent module as well. This is similar to current quantized modules like nn.quantized.Conv2d.
For more details, please see the test case.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D25684692
fbshipit-source-id: 900360e01c0e35b26fe85f4a887dc1fd6f7bfb66
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49719
We find there are multiple use cases for standalone modules: one use case requires the standalone module
to produce a module that takes a float Tensor as input and outputs a float Tensor, the other needs to
produce a module that takes a quantized Tensor as input and outputs a quantized Tensor.
This is similar to `quantized_input_idxs` and `quantized_output_idxs`, so we want to nest
prepare_custom_config_dict in the standalone module configuration. For maximum flexibility we also
include a qconfig_dict for the standalone module, in case the user needs a special qconfig_dict for
the standalone module in the future.
Changed from
```python
prepare_custom_config_dict = {
    "standalone_module_name": ["standalone_module"],
    "standalone_module_class": [StandaloneModule]
}
```
to
```python
prepare_custom_config_dict = {
    "standalone_module_name": [("standalone_module", qconfig_dict1, prepare_custom_config_dict1)],
    "standalone_module_class": [(StandaloneModule, qconfig_dict2, prepare_custom_config_dict2)]
}
```
The entries in the config are:
1. name/module_class
2. optional qconfig_dict; when it is None, we'll use {"": qconfig} where qconfig is the one from the parent qconfig_dict
3. optional prepare_custom_config_dict; when it is None, we'll use the default value of prepare_custom_config_dict for the prepare API (None)
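For example (the module name is a placeholder), the per-module configs can simply be left as None to fall back to the parent qconfig_dict and the default prepare_custom_config_dict, as described above:
```python
prepare_custom_config_dict = {
    "standalone_module_name": [("standalone_module", None, None)],
}
```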
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_standalone_module
Imported from OSS
Reviewed By: raghuramank100
Differential Revision: D25675704
fbshipit-source-id: 0889f519a3e55a7a677f0e2db4db9a18d87a93d4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49628
Ensures that the linear bias is not observed in an `F.linear` call. This should
be a small speedup in PTQ, and will change numerics (in a good way) for
QAT if someone is using `F.linear`.
Note: the implementation is slightly more verbose compared to conv
because bias is a keyword argument in Linear.
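An illustrative module (names are placeholders) where only the input and weight of the F.linear call are observed, not the bias:
```python
import torch
import torch.nn.functional as F

class LinearFunctional(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(4, 4))
        self.bias = torch.nn.Parameter(torch.randn(4))

    def forward(self, x):
        # after this change, observers are inserted for x and self.weight only;
        # self.bias (passed as the keyword argument) is left unobserved
        return F.linear(x, self.weight, bias=self.bias)
```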
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_linear_functional_bias_not_observed
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25653532
fbshipit-source-id: c93501bf6b55cbe4a11cfdad6f79313483133a39
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49623
Ensures that conv bias is not observed in a `F.conv{n}d` call.
Test Plan: Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25652856
fbshipit-source-id: 884f87be1948d3e049a557d79bec3c90aec34340
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49621
This adds support for configuring the qconfig for a call_method node, e.g. x.chunk; this will help work around
a problem in our internal model.
TODO: since call_method is also specified as a string and we flatten the qconfig, we might need to resolve a namespace conflict between
call_method and module_name.
TODO: Add scope support to set the qconfig for call_method correctly with the original qconfig.
Test Plan: Imported from OSS
Reviewed By: vkuzo
Differential Revision: D25651828
fbshipit-source-id: 82d66b121d37c8274fd481b6a2e9f9b54c5ca73d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49420
Before: if an output was marked as quantized, it could actually end up not
quantized if the previous node was not quantized.
After: if an output is marked as quantized, it will be quantized
regardless of the quantization status of the previous node.
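For illustration, assuming the output is marked as quantized via the output_quantized_idxs key of prepare_custom_config_dict (see the test in the Test Plan for the exact usage):
```python
# output 0 of the model is declared quantized; after this change a quantize op is
# inserted there even if the node producing the output was not itself quantized
prepare_custom_config_dict = {
    "output_quantized_idxs": [0],
}
```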
Test Plan:
```
python test/test_quantization.py TestQuantizeFxOps.test_quant_output_always_observed
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D25566834
fbshipit-source-id: 84755a1605fd3847edd03a7887ab9f635498c05c