torch.nn.qat
This module implements versions of the key nn modules Conv2d() and Linear() that run in FP32 but apply rounding to simulate the effect of INT8 quantization.
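To make the "FP32 with rounding" idea concrete, here is a minimal pure-Python sketch of fake quantization for a single value. The function name `fake_quantize` and the `scale` and `zero_point` values are illustrative assumptions, not part of torch.nn.qat; the modules documented here delegate this step to FakeQuantize modules.

```python
def fake_quantize(x, scale, zero_point=0, qmin=-128, qmax=127):
    """Quantize x to the INT8 grid, then map it straight back to a float."""
    q = round(x / scale) + zero_point      # snap to the nearest integer level
    q = max(qmin, min(qmax, q))            # clamp to the signed 8-bit range
    return (q - zero_point) * scale        # dequantize: the result is still a float


# Hypothetical scale, chosen only for illustration.
scale = 0.1
in_range = fake_quantize(0.26, scale)      # lands on the nearest grid point (3 * scale)
saturated = fake_quantize(100.0, scale)    # clamps at the top of the range (127 * scale)
```

The value never leaves floating point, but it can only take one of 256 levels, which is the error a network sees during quantization-aware training.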
Conv2d
class torch.nn.qat.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', qconfig=None, device=None, dtype=None)
A Conv2d module with a FakeQuantize module attached for the weight, used for quantization-aware training.
This module adopts the same interface as torch.nn.Conv2d; see https://pytorch.org/docs/stable/nn.html?highlight=conv2d#torch.nn.Conv2d for documentation.
It behaves like torch.nn.Conv2d, with FakeQuantize modules initialized to default.
- Variables
Conv2d.weight_fake_quant – fake quant module for the weight
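A short usage sketch for the class above. It assumes a PyTorch build that provides `torch.ao.quantization.get_default_qat_qconfig` and that the "fbgemm" backend configuration is acceptable; the layer sizes and input shape are arbitrary. Although the signature lists `qconfig=None`, the constructor is expected to need a real qconfig in practice, which is worth checking against your installed version.

```python
import torch
import torch.nn.qat as nnqat
from torch.ao.quantization import get_default_qat_qconfig

# Assumption: the default QAT qconfig for the "fbgemm" backend. Any QConfig
# whose weight entry builds a fake-quantize module should work similarly.
qconfig = get_default_qat_qconfig("fbgemm")

conv = nnqat.Conv2d(in_channels=3, out_channels=4, kernel_size=3,
                    padding=1, qconfig=qconfig)

x = torch.randn(1, 3, 8, 8)   # arbitrary FP32 input
out = conv(x)                 # the weight passes through weight_fake_quant first
```

The output stays in FP32 with the usual Conv2d shape; only the weight values used in the convolution have been rounded.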
Linear
class torch.nn.qat.Linear(in_features, out_features, bias=True, qconfig=None, device=None, dtype=None)
A linear module with a FakeQuantize module attached for the weight, used for quantization-aware training.
This module adopts the same interface as torch.nn.Linear; see https://pytorch.org/docs/stable/nn.html#torch.nn.Linear for documentation.
It behaves like torch.nn.Linear, with FakeQuantize modules initialized to default.
- Variables
Linear.weight_fake_quant – fake quant module for the weight
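The sketch below shows how the FP32 parameter and the fake quant module sit side by side on a QAT Linear layer. It assumes `torch.ao.quantization.get_default_qat_qconfig` is available and uses the "fbgemm" backend configuration; the feature sizes and batch size are arbitrary.

```python
import torch
import torch.nn.qat as nnqat
from torch.ao.quantization import get_default_qat_qconfig

qconfig = get_default_qat_qconfig("fbgemm")   # assumed backend choice
linear = nnqat.Linear(in_features=4, out_features=2, qconfig=qconfig)

out = linear(torch.randn(3, 4))               # FP32 in, FP32 out

# weight is the ordinary FP32 parameter; weight_fake_quant is the module
# that is applied to it during the forward pass.
weight_is_param = isinstance(linear.weight, torch.nn.Parameter)
fq_is_module = isinstance(linear.weight_fake_quant, torch.nn.Module)
```

Keeping the unrounded weight as the trainable parameter lets the optimizer make small updates that would otherwise be lost to rounding.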