Mirror of https://github.com/saymrwulf/pytorch.git, synced 2026-05-14 20:57:59 +00:00
Summary: This diff adds a new operator, wrapped_quantized_linear (torch.ops._quantized.wrapped_quantized_linear). It takes the following input arguments: input (fp32), input_scale, input_zero_point, weight (fp32), weight_scale, weight_zero_point, bias (fp32), output_scale, output_zero_point, and out_channel. It does the following:

1. Use quantize_per_tensor(input, input_scale, input_zero_point) to quantize the input tensor to int8.
2. Use quantized::linear_prepack(weight, weight_scale, weight_zero_point, bias) to pack the weight and bias.
3. Use quantized::linear to perform the int8 quantized linear operation.
4. Dequantize the result back to fp32.

This new op is essentially a wrapper around multiple ops. We do this because torch.export cannot handle models that use the old quantization APIs.

Reviewed By: jerryzh168

Differential Revision: D61377266

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134024

Approved by: https://github.com/houseroad
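The four steps above can be sketched in plain numpy to illustrate the semantics the wrapper composes (quantize input, int8 matmul, requantize, dequantize). This is a hedged reference model, not the actual PyTorch implementation: the function name `wrapped_quantized_linear_ref` and the helpers are hypothetical, prepacking is treated as a no-op, and the real op dispatches to fbgemm/qnnpack kernels.

```python
import numpy as np

def quantize(x, scale, zero_point):
    # Affine quantization to int8: q = clamp(round(x / scale) + zp, -128, 127).
    return np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)

def dequantize(q, scale, zero_point):
    # Inverse affine map back to fp32.
    return (q.astype(np.int32) - zero_point) * scale

def wrapped_quantized_linear_ref(x, x_scale, x_zp,
                                 w, w_scale, w_zp, bias,
                                 out_scale, out_zp):
    """Reference model (assumption, not the real kernel) of the wrapper:
    quantize -> (prepack elided) -> int8 linear -> dequantize."""
    qx = quantize(x, x_scale, x_zp)          # step 1: quantize input
    qw = quantize(w, w_scale, w_zp)          # step 2 stand-in: quantize weight
    # step 3: integer matmul with int32 accumulation, then rescale + bias
    acc = (qx.astype(np.int32) - x_zp) @ (qw.astype(np.int32) - w_zp).T
    y = acc * (x_scale * w_scale) + bias
    qy = quantize(y, out_scale, out_zp)      # requantize to the output scale
    return dequantize(qy, out_scale, out_zp)  # step 4: dequantize to fp32

# Usage: a 1x2 input against a single output channel.
out = wrapped_quantized_linear_ref(
    np.array([[1.0, 2.0]]), 0.1, 0,
    np.array([[0.5, -0.5]]), 0.01, 0,
    np.array([0.1]), 0.01, 0)
```

With these scales the quantization is exact, so the result matches the float linear `1.0*0.5 + 2.0*(-0.5) + 0.1 = -0.4` up to the output quantization step.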
Files:

- experimental/
- __init__.py
- test_backend_config.py
- test_docs.py
- test_quantized_functional.py
- test_quantized_module.py
- test_quantized_op.py
- test_quantized_tensor.py
- test_top_level_apis.py
- test_utils.py
- test_workflow_module.py
- test_workflow_ops.py