深度学习论文: Pyramidal Convolution: Rethinking CNN for Visual Recognition及其PyTorch实现
发布时间
阅读量:
阅读量
深度学习论文: Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition 及其 PyTorch 实现
PDF:https://arxiv.org/pdf/2006.11538.pdf
PyTorch: https://github.com/shanglianlm0525/PyTorch-Networks
1 概述
本文提出了一种新型的金字塔卷积(Pyramidal Convolution, PyConv)。PyConv 在一个金字塔式的多级结构中并行地使用多种卷积核处理输入:每一级采用不同大小的卷积核和不同的分组数,从而在多个空间尺度上捕获特征。借助分组卷积,PyConv 在扩大感受野、提取多尺度细节的同时,其计算复杂度和参数量相比标准卷积并不会显著增加。此外,PyConv 非常灵活且易于扩展:金字塔的层数、各级卷积核大小与分组数都可以按任务需要调整,因此在图像分类、语义分割、目标检测等多种视觉任务中均能实现高效的多尺度特征提取。
2 Pyramidal Convolution

def ConvBNReLU(in_channels, out_channels, kernel_size, stride, groups=1):
    """Conv2d -> BatchNorm2d -> ReLU building block.

    Uses ``padding=kernel_size // 2`` so that, for odd kernel sizes, the
    spatial resolution is preserved when ``stride == 1``.

    Args:
        in_channels: number of input channels.
        out_channels: number of output channels.
        kernel_size: square kernel size (expected to be odd for "same" padding).
        stride: convolution stride.
        groups: number of groups for grouped convolution (default 1).

    Returns:
        nn.Sequential applying convolution, batch normalization and ReLU.
    """
    return nn.Sequential(
        # bias=False: the conv bias is redundant here because the following
        # BatchNorm re-centers activations with its own learnable beta.
        nn.Conv2d(in_channels=in_channels, out_channels=out_channels,
                  kernel_size=kernel_size, stride=stride,
                  padding=kernel_size // 2, groups=groups, bias=False),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True)
    )
def Conv1x1BNReLU(in_channels, out_channels, groups=1):
    """1x1 Conv2d -> BatchNorm2d -> ReLU6 block (channel projection).

    Args:
        in_channels: number of input channels.
        out_channels: number of output channels.
        groups: number of groups for grouped convolution (default 1).

    Returns:
        nn.Sequential applying a pointwise convolution, batch normalization
        and ReLU6.  Spatial resolution is unchanged (kernel 1, stride 1).
    """
    return nn.Sequential(
        # bias=False: redundant with the following BatchNorm's beta.
        nn.Conv2d(in_channels=in_channels, out_channels=out_channels,
                  kernel_size=1, stride=1, groups=groups, bias=False),
        nn.BatchNorm2d(out_channels),
        # ReLU6 (clamped at 6) is kept as in the original implementation,
        # mirroring the MobileNet-style activation.
        nn.ReLU6(inplace=True)
    )
class PyConv(nn.Module):
    """Pyramidal convolution: parallel Conv-BN-ReLU branches with different
    kernel sizes and group counts, concatenated along the channel axis.

    Args:
        in_channels: channels of the shared input fed to every branch.
        out_channels: per-branch output channel counts (one per level).
        kernel_sizes: per-branch square kernel sizes.
        groups: per-branch group counts for grouped convolution.
        stride: stride shared by all branches (default 1).
    """

    def __init__(self, in_channels, out_channels, kernel_sizes, groups, stride=1):
        super(PyConv, self).__init__()
        if out_channels is None:
            out_channels = []
        # All three per-level lists must describe the same number of levels.
        assert len(out_channels) == len(kernel_sizes) == len(groups)
        self.pyconv_list = nn.ModuleList([
            ConvBNReLU(in_channels=in_channels, out_channels=oc,
                       kernel_size=ks, stride=stride, groups=g)
            for oc, ks, g in zip(out_channels, kernel_sizes, groups)
        ])

    def forward(self, x):
        # Every pyramid level sees the same input; results are stacked
        # channel-wise.
        return torch.cat([branch(x) for branch in self.pyconv_list], 1)
代码解读
3 应用
3-1 PyConv Networks for Image Classification
将PyConv集成到Bottleneck中,并构建两类结构分别命名为PyConvResNet与PyConvHGResNet

3-2 PyConv Network on Semantic Segmentation
在语义分割任务中,作者设计了 PyConv 解析头(PyConv Parsing Head, PyConvPH)。PyConvPH 包含三个主要组成部分:
- LocalPyConv模块被设计用于提取多尺度细节特征,并聚焦于小目标。
- GlobalPyConv模块被构建以捕获场景的整体信息,并识别和处理大尺寸的目标。
- MergePyConv模块负责整合全局与局部特征。

class LocalPyConv(nn.Module):
    """Local branch of the PyConv parsing head.

    Squeezes the backbone features to 512 channels, applies a 4-level
    pyramidal convolution (kernels 3/5/7/9) to capture multi-scale local
    detail (small objects), then fuses the levels back with a 1x1 conv.

    Args:
        planes: number of input (and output) channels; expected to be
            divisible by 4.
    """

    def __init__(self, planes):
        super(LocalPyConv, self).__init__()
        inplanes = planes // 4
        # 1x1 projection to a fixed 512-channel working width.
        self._reduce = Conv1x1BNReLU(planes, 512)
        # Four pyramid levels, each emitting `inplanes` channels, so the
        # concatenated output has 4 * inplanes channels.
        self._pyConv = PyConv(in_channels=512,
                              out_channels=[inplanes, inplanes, inplanes, inplanes],
                              kernel_sizes=[3, 5, 7, 9],
                              groups=[1, 4, 8, 16])
        # BUG FIX: the fusion conv receives the PyConv output, which has
        # 4 * inplanes channels (== planes when planes % 4 == 0), not a
        # hard-coded 512.  Identical to the original for the default
        # planes=512 configuration, correct for any other width.
        self._combine = Conv1x1BNReLU(4 * inplanes, planes)

    def forward(self, x):
        return self._combine(self._pyConv(self._reduce(x)))
class GlobalPyConv(nn.Module):
    """Global branch of the PyConv parsing head.

    Pools the feature map to a fixed 9x9 grid, applies a 4-level pyramidal
    convolution to capture scene-level context (large objects), fuses the
    levels with a 1x1 conv, and bilinearly upsamples back to the input
    resolution.

    Args:
        planes: number of input channels; expected to be divisible by 4.
    """

    def __init__(self, planes):
        super(GlobalPyConv, self).__init__()
        inplanes = planes // 4
        # Fixed 9x9 pooled grid gives a resolution-independent global view.
        self.global_pool = nn.AdaptiveAvgPool2d(output_size=9)
        self._reduce = Conv1x1BNReLU(planes, 512)
        self._pyConv = PyConv(in_channels=512,
                              out_channels=[inplanes, inplanes, inplanes, inplanes],
                              kernel_sizes=[3, 5, 7, 9],
                              groups=[1, 4, 8, 16])
        # BUG FIX: the fusion conv receives the PyConv output, which has
        # 4 * inplanes channels (== planes when planes % 4 == 0), not a
        # hard-coded 512.  Identical to the original for planes=512.
        self._fuse = Conv1x1BNReLU(4 * inplanes, 512)

    def forward(self, x):
        # x is (N, C, H, W); the original unpacked these as b, c, w, h,
        # which worked only because both names were then used in the same
        # (wrong) order.  Name them correctly here.
        height, width = x.shape[2], x.shape[3]
        x = self._fuse(self._pyConv(self._reduce(self.global_pool(x))))
        # Upsample the 9x9 global features back to the input resolution.
        out = F.interpolate(x, (height, width), align_corners=True, mode='bilinear')
        return out
class MergePyConv(nn.Module):
    """Final stage of the PyConv parsing head.

    Refines the concatenated global+local features with a 3x3 conv,
    upsamples to the target image size, and projects to per-class logits
    with a 1x1 conv.

    Args:
        img_size: output (H, W) to upsample to.
        in_channels: channels of the concatenated input features.
        num_classes: number of segmentation classes.
    """

    def __init__(self, img_size, in_channels, num_classes):
        super(MergePyConv, self).__init__()
        self.img_size = img_size
        self.conv3 = ConvBNReLU(in_channels=in_channels, out_channels=256,
                                kernel_size=3, stride=1)
        # Plain conv (no BN/ReLU): this emits raw class logits.
        self.conv1 = nn.Conv2d(in_channels=256, out_channels=num_classes,
                               kernel_size=1, stride=1, groups=1)

    def forward(self, x):
        refined = self.conv3(x)
        upsampled = F.interpolate(refined, self.img_size,
                                  align_corners=True, mode='bilinear')
        return self.conv1(upsampled)
class PyConvParsingHead(nn.Module):
    """PyConv parsing head (PyConvPH) for semantic segmentation.

    Runs the global and local PyConv branches in parallel on the backbone
    features, concatenates their outputs channel-wise, and merges them into
    per-pixel class logits at the requested image size.

    Args:
        img_size: final output (H, W) of the logits.
        planes: channel width of the backbone features (default 512).
        num_classes: number of segmentation classes (default 150).
    """

    def __init__(self, img_size=(473, 473), planes=512, num_classes=150):
        super(PyConvParsingHead, self).__init__()
        self.globalPyConv = GlobalPyConv(planes=planes)
        self.localPyConv = LocalPyConv(planes=planes)
        # 1024 = 512 (global branch) + 512 (local branch output for the
        # default planes=512) — NOTE(review): hard-coded; verify if planes
        # is ever changed from its default.
        self.mergePyConv = MergePyConv(img_size, 1024, num_classes)

    def forward(self, x):
        global_feats = self.globalPyConv(x)
        local_feats = self.localPyConv(x)
        fused = torch.cat([global_feats, local_feats], dim=1)
        return self.mergePyConv(fused)
代码解读
3-3 PyConv Network on Object Detection

class PyConv4(nn.Module):
    """Four-level pyramidal convolution block (backbone/detection variant).

    Each level gets an equal quarter of the output channels; the four
    branch outputs are concatenated channel-wise.

    Args:
        inplaces: number of input channels.
        places: total number of output channels (split as places//4 x 4).
        kernel_sizes: per-level kernel sizes (default [3, 5, 7, 9]).
        groups: per-level group counts (default [1, 4, 8, 16]).
        stride: stride shared by all levels (default 2).
    """

    def __init__(self, inplaces, places, kernel_sizes=[3, 5, 7, 9],
                 groups=[1, 4, 8, 16], stride=2):
        super(PyConv4, self).__init__()
        assert len(kernel_sizes) == len(groups)
        # Split output channels evenly across the four pyramid levels.
        out_channels = [places // 4, places // 4, places // 4, places // 4]
        self.pyconv_list = nn.ModuleList()
        for idx, ks in enumerate(kernel_sizes):
            self.pyconv_list.append(
                ConvBNReLU(in_channels=inplaces, out_channels=out_channels[idx],
                           kernel_size=ks, stride=stride, groups=groups[idx]))

    def forward(self, x):
        return torch.cat([branch(x) for branch in self.pyconv_list], 1)
class PyConv3(nn.Module):
    """Three-level pyramidal convolution block.

    Channel split is places//4, places//4, places//2 across the three
    levels (largest share to the largest kernel); branch outputs are
    concatenated channel-wise.

    Args:
        inplaces: number of input channels.
        places: total number of output channels.
        kernel_sizes: per-level kernel sizes (default [3, 5, 7]).
        groups: per-level group counts (default [1, 4, 8]).
        stride: stride shared by all levels (default 2).
    """

    def __init__(self, inplaces, places, kernel_sizes=[3, 5, 7],
                 groups=[1, 4, 8], stride=2):
        super(PyConv3, self).__init__()
        assert len(kernel_sizes) == len(groups)
        out_channels = [places // 4, places // 4, places // 2]
        self.pyconv_list = nn.ModuleList()
        for idx, ks in enumerate(kernel_sizes):
            self.pyconv_list.append(
                ConvBNReLU(in_channels=inplaces, out_channels=out_channels[idx],
                           kernel_size=ks, stride=stride, groups=groups[idx]))

    def forward(self, x):
        return torch.cat([branch(x) for branch in self.pyconv_list], 1)
class PyConv2(nn.Module):
    """Two-level pyramidal convolution block.

    Output channels are split evenly between the two levels; branch
    outputs are concatenated channel-wise.

    Args:
        inplaces: number of input channels.
        places: total number of output channels (split as places//2 x 2).
        kernel_sizes: per-level kernel sizes (default [3, 5]).
        groups: per-level group counts (default [1, 4]).
        stride: stride shared by both levels (default 2).
    """

    def __init__(self, inplaces, places, kernel_sizes=[3, 5],
                 groups=[1, 4], stride=2):
        super(PyConv2, self).__init__()
        assert len(kernel_sizes) == len(groups)
        out_channels = [places // 2, places // 2]
        self.pyconv_list = nn.ModuleList()
        for idx, ks in enumerate(kernel_sizes):
            self.pyconv_list.append(
                ConvBNReLU(in_channels=inplaces, out_channels=out_channels[idx],
                           kernel_size=ks, stride=stride, groups=groups[idx]))

    def forward(self, x):
        return torch.cat([branch(x) for branch in self.pyconv_list], 1)
class PyConv1(nn.Module):
    """Degenerate single-level pyramid: a plain 3x3 Conv-BN-ReLU.

    ``kernel_sizes`` and ``groups`` are accepted only for API symmetry
    with PyConv2/3/4; beyond a length sanity check they are ignored and a
    fixed 3x3, groups=1 convolution is used.

    Args:
        inplaces: number of input channels.
        places: number of output channels.
        kernel_sizes: accepted but unused (kept for interface symmetry).
        groups: accepted but unused (kept for interface symmetry).
        stride: convolution stride (default 1).
    """

    def __init__(self, inplaces, places, kernel_sizes, groups, stride=1):
        super(PyConv1, self).__init__()
        assert len(kernel_sizes) == len(groups)
        self.pyconv = ConvBNReLU(in_channels=inplaces, out_channels=places,
                                 kernel_size=3, stride=stride, groups=1)

    def forward(self, x):
        return self.pyconv(x)
代码解读
3-4 PyConv Network on Video Classification

全部评论 (0)
还没有任何评论哟~
