Large-kernel pooling is decomposed into multiple small-kernel pooling operators and a PadV2 operator is added. #380

mjq2020 · 2024-12-06T09:30:17Z

Situation

After quantizing an object detection model and converting it to TFLite format, I found that each large-kernel pooling operator is decomposed into multiple small-kernel pooling operators, and a PadV2 operator is added.
The following figures show the difference between the TFLite graphs exported by onnx2tf and tinynn:

onnx2tf

tinynn

Reason

The extra operators add overhead to inference time.

Requirement

Can the padding attribute of MaxPool be used so that the generated TFLite model keeps a single pooling operator instead of decomposing it?
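To make the request concrete, here is a minimal framework-free sketch of why an explicit pad followed by an unpadded pooling can, in principle, be folded into one pooling with a padding attribute. The helper names (`pad2d`, `max_pool_valid`, `max_pool_same`) are hypothetical and exist only for this illustration. It assumes stride 1, an odd kernel, and a padding value that never wins the max (negative infinity here, which is what `nn.MaxPool2d` pads with implicitly). Whether a given TFLite kernel or a quantized PadV2 behaves exactly this way is an assumption that would need checking.

```python
NEG_INF = float("-inf")


def pad2d(x, p, value=NEG_INF):
    """Pad a 2D list-of-lists with p rows/columns of `value` on every side."""
    width = len(x[0]) + 2 * p
    top = [[value] * width for _ in range(p)]
    middle = [[value] * p + list(row) + [value] * p for row in x]
    bottom = [[value] * width for _ in range(p)]
    return top + middle + bottom


def max_pool_valid(x, k):
    """Stride-1 max pooling with no padding (VALID-style)."""
    h, w = len(x), len(x[0])
    return [[max(x[i + di][j + dj] for di in range(k) for dj in range(k))
             for j in range(w - k + 1)]
            for i in range(h - k + 1)]


def max_pool_same(x, k):
    """Stride-1 max pooling over in-bounds elements only (SAME-style).

    For odd k this matches nn.MaxPool2d(k, stride=1, padding=k // 2).
    """
    h, w, r = len(x), len(x[0]), k // 2
    return [[max(x[a][b]
                 for a in range(max(0, i - r), min(h, i + r + 1))
                 for b in range(max(0, j - r), min(w, j + r + 1)))
             for j in range(w)]
            for i in range(h)]


# Small test input with negative values, so zero padding would not be safe.
x = [[(7 * i + 3 * j) % 11 - 5 for j in range(6)] for i in range(6)]

explicit = max_pool_valid(pad2d(x, 2), 5)  # explicit pad, then unpadded 5x5 pool
implicit = max_pool_same(x, 5)             # single 5x5 pool with padding folded in

assert explicit == implicit
assert len(implicit) == len(x) and len(implicit[0]) == len(x[0])
```

Both paths produce the same values and preserve the spatial size, which is the property that would let a converter emit one pooling operator instead of a pad plus pooling pair.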

Original code

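# Imports assumed from the MMYOLO/MMDetection code base this class appears to come from.
from typing import Sequence, Union

import torch
import torch.nn as nn
from mmcv.cnn import ConvModule
from mmdet.utils import ConfigType, OptMultiConfig
from mmengine.model import BaseModule
from torch import Tensor


class SPPFBottleneck(BaseModule):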
    """Spatial pyramid pooling - Fast (SPPF) layer for
    YOLOv5, YOLOX and PPYOLOE by Glenn Jocher

    Args:
        in_channels (int): The input channels of this Module.
        out_channels (int): The output channels of this Module.
        kernel_sizes (int, tuple[int]): Sequential or number of kernel
            sizes of pooling layers. Defaults to 5.
        use_conv_first (bool): Whether to use conv before pooling layer.
            In YOLOv5 and YOLOX, this parameter is set to True.
            In PPYOLOE, this parameter is set to False.
            Defaults to True.
        mid_channels_scale (float): Channel multiplier, multiply in_channels
            by this amount to get mid_channels. This parameter is valid only
            when use_conv_first=True. Defaults to 0.5.
        conv_cfg (dict): Config dict for convolution layer. Defaults to None,
            which means using conv2d.
        norm_cfg (dict): Config dict for normalization layer.
            Defaults to dict(type='BN', momentum=0.03, eps=0.001).
        act_cfg (dict): Config dict for activation layer.
            Defaults to dict(type='SiLU', inplace=True).
        init_cfg (dict or list[dict], optional): Initialization config dict.
            Defaults to None.
    """

    def __init__(self,
                 in_channels: int,
                 out_channels: int,
                 kernel_sizes: Union[int, Sequence[int]] = 5,
                 use_conv_first: bool = True,
                 mid_channels_scale: float = 0.5,
                 conv_cfg: ConfigType = None,
                 norm_cfg: ConfigType = dict(
                     type='BN', momentum=0.03, eps=0.001),
                 act_cfg: ConfigType = dict(type='SiLU', inplace=True),
                 init_cfg: OptMultiConfig = None):
        super().__init__(init_cfg)

        if use_conv_first:
            mid_channels = int(in_channels * mid_channels_scale)
            self.conv1 = ConvModule(
                in_channels,
                mid_channels,
                1,
                stride=1,
                conv_cfg=conv_cfg,
                norm_cfg=norm_cfg,
                act_cfg=act_cfg)
        else:
            mid_channels = in_channels
            self.conv1 = None
        self.kernel_sizes = kernel_sizes
        if isinstance(kernel_sizes, int):
            self.poolings = nn.MaxPool2d(
                kernel_size=kernel_sizes, stride=1, padding=kernel_sizes // 2)
            conv2_in_channels = mid_channels * 4
        else:
            self.poolings = nn.ModuleList([
                nn.MaxPool2d(kernel_size=ks, stride=1, padding=ks // 2)
                for ks in kernel_sizes
            ])
            conv2_in_channels = mid_channels * (len(kernel_sizes) + 1)

        self.conv2 = ConvModule(
            conv2_in_channels,
            out_channels,
            1,
            conv_cfg=conv_cfg,
            norm_cfg=norm_cfg,
            act_cfg=act_cfg)

    def forward(self, x: Tensor) -> Tensor:
        """Forward process
        Args:
            x (Tensor): The input tensor.
        """
        if self.conv1:
            x = self.conv1(x)
        if isinstance(self.kernel_sizes, int):
            y1 = self.poolings(x)
            y2 = self.poolings(y1)
            x = torch.cat([x, y1, y2, self.poolings(y2)], dim=1)
        else:
            x = torch.cat(
                [x] + [pooling(x) for pooling in self.poolings], dim=1)
        x = self.conv2(x)
        return x

Code link here

peterjc123 · 2024-12-10T09:42:54Z

Work items:

Skip the generation of padding ops if they are followed by ReLU/ReLU6 ops. (Complicated, but it can be done via transformable ops)
Max pooling fusion (eliminate consecutive pooling ops; still need to figure out which cases can be removed)
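As a rough illustration of the second work item, the sketch below checks the identity that makes fusing consecutive max pools possible: with stride 1 and odd kernels, two back-to-back poolings with kernels k1 and k2 give the same result as one pooling with kernel k1 + k2 - 1. The helpers `max_pool_same` and `fused_kernel` are hypothetical names for this example and are not part of TinyNN. The sketch is pure Python, so it only demonstrates the arithmetic, not how the converter would implement it.

```python
def max_pool_same(x, k):
    """Stride-1 max pooling over in-bounds elements only.

    For odd k this matches nn.MaxPool2d(k, stride=1, padding=k // 2).
    """
    h, w, r = len(x), len(x[0]), k // 2
    return [[max(x[a][b]
                 for a in range(max(0, i - r), min(h, i + r + 1))
                 for b in range(max(0, j - r), min(w, j + r + 1)))
             for j in range(w)]
            for i in range(h)]


def fused_kernel(k1, k2):
    """Kernel size of the single stride-1 pool equivalent to pool(k1) then pool(k2)."""
    return k1 + k2 - 1


# Deterministic 16x16 input, large enough that a 13x13 window is not the whole map.
x = [[(37 * i + 11 * j) % 23 - 11 for j in range(16)] for i in range(16)]

# The cascade used in SPPFBottleneck.forward when kernel_sizes is the int 5.
y1 = max_pool_same(x, 5)
y2 = max_pool_same(y1, 5)
y3 = max_pool_same(y2, 5)

assert fused_kernel(5, 5) == 9
assert y2 == max_pool_same(x, 9)    # two 5x5 pools behave like one 9x9 pool
assert y3 == max_pool_same(x, 13)   # three 5x5 pools behave like one 13x13 pool
```

In the SPPF block the intermediate outputs y1 and y2 are also concatenated, so a fused pool cannot simply replace the cascade there. This is one of the cases that would need to be sorted out before eliminating any op.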

peterjc123 added the enhancement label Dec 10, 2024

mjq2020 mentioned this issue Dec 12, 2024

[quantizer] reduce the number of pad operators #381

Merged

peterjc123 linked a pull request Dec 12, 2024 that will close this issue

[quantizer] reduce the number of pad operators #381

Merged

peterjc123 closed this as completed in #381 Dec 12, 2024
