Deep Learning Paper: An Energy and GPU-Computation Efficient Backbone Network for Object Detection and Its PyTorch Implementation


This post summarizes the paper "An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection". The paper proposes a backbone network for real-time object detection that is optimized for both energy consumption and GPU computation. The network aims for high accuracy at low power consumption, which makes it suitable for deployment in resource-constrained environments, and the approach is validated through experiments on standard benchmark datasets. The implementation shown here uses PyTorch. Paper: https://arxiv.org/pdf/1904.09730.pdf. Code: https://github.com/shanglianlm0525/PyTorch-Networks.

1 Overview

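DenseNet reaches performance comparable to ResNet with fewer parameters and less computation, because concatenation preserves information better than addition does. In practice, however, DenseNet runs slowly and places heavy demands on hardware resources.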

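The paper proposes a new method called One-Shot Aggregation (OSA), which changes how the shortcut connections are wired. In DenseNet, the features of each layer are connected to every subsequent layer; with OSA, they are connected only to the final layer. This greatly reduces the total number of connections. Experiments show that, at comparable accuracy, the resulting network is about 2x faster than DenseNet and consumes 1.6x to 4.1x less energy.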

2 Factors of Efficient Network Design

2-1 Memory Access Cost (MAC)

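For a CNN, most of the energy is usually spent on memory operations rather than on computation.

The main factor behind MAC is the intermediate activation memory footprint, which is determined mainly by the number of convolution kernels and the size of the feature maps.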

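The MAC of each convolutional layer is

$MAC = hw(c_{i} + c_{o}) + k^{2}c_{i}c_{o}$

where $h$ and $w$ are the height and width of the feature map, $k$ is the kernel size, and $c_{i}$ and $c_{o}$ are the numbers of input and output channels. Applying the mean inequality to this formula gives

$MAC \geq 2\sqrt{\frac{hwB}{k^{2}}} + \frac{B}{hw}$

where $B = k^{2}hwc_{i}c_{o}$ is the amount of computation. For a fixed $B$, MAC therefore reaches its minimum when the numbers of input and output channels are equal.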
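The equal-channel claim can be checked numerically. The following is a minimal sketch in plain Python, assuming the per-layer cost $hw(c_{i} + c_{o}) + k^{2}c_{i}c_{o}$; the function names `conv_mac` and `mac_lower_bound` and the layer sizes are illustrative assumptions, not taken from the paper. It keeps $c_{i}c_{o}$ (and therefore $B$) fixed and compares several channel splits.

```python
import math


def conv_mac(h, w, k, c_in, c_out):
    """MAC of one conv layer: input and output activations plus the weights."""
    return h * w * (c_in + c_out) + k * k * c_in * c_out


def mac_lower_bound(h, w, k, c_in, c_out):
    """Lower bound 2*sqrt(h*w*B/k^2) + B/(h*w) with B = k^2*h*w*c_in*c_out."""
    b = k * k * h * w * c_in * c_out
    return 2 * math.sqrt(h * w * b / (k * k)) + b / (h * w)


# Assumed example layer: 56x56 feature map, 3x3 kernel.
h, w, k = 56, 56, 3
# Every split has c_in * c_out = 4096, so B is identical for all of them.
splits = [(16, 256), (32, 128), (64, 64), (128, 32), (256, 16)]

for c_in, c_out in splits:
    mac = conv_mac(h, w, k, c_in, c_out)
    bound = mac_lower_bound(h, w, k, c_in, c_out)
    print(f"c_in={c_in:3d} c_out={c_out:3d} MAC={mac} bound={bound:.0f}")

# The split with the smallest MAC among the candidates.
best = min(splits, key=lambda s: conv_mac(h, w, k, *s))
print("lowest MAC at", best)
```

Only the balanced split touches the bound; every unbalanced split pays extra activation traffic for the same amount of computation.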

2-2 GPU-Computation Efficiency

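GPUs are known for parallel computation, and their efficiency improves further when the tensors being processed are large.
Splitting a large convolution into several small ones therefore hurts GPU computation efficiency.
A network built from fewer layers is the better choice.
Using 1x1 convolutions to cut the amount of computation is effective, but it does not help GPU computation efficiency.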
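The last point can be illustrated with a rough operation count. This is a sketch under assumed layer sizes (the 28x28 map, 256 channels, and bottleneck width of 64 are hypothetical, as is the helper name `conv_flops`); it only counts multiply-accumulates and layers, and does not measure GPU time.

```python
def conv_flops(h, w, k, c_in, c_out):
    """Multiply-accumulate count of a k x k conv with stride 1 and 'same' padding."""
    return h * w * k * k * c_in * c_out


h, w = 28, 28   # assumed feature-map size
c = 256         # assumed channels entering and leaving the block
c_mid = 64      # assumed bottleneck width

# One large 3x3 convolution.
direct = [("3x3", conv_flops(h, w, 3, c, c))]

# 1x1 reduction followed by a 3x3 convolution on the narrower tensor.
bottleneck = [
    ("1x1 reduce", conv_flops(h, w, 1, c, c_mid)),
    ("3x3", conv_flops(h, w, 3, c_mid, c)),
]

direct_total = sum(flops for _, flops in direct)
bottleneck_total = sum(flops for _, flops in bottleneck)

print(f"direct:     {len(direct)} layer(s), {direct_total} MACs")
print(f"bottleneck: {len(bottleneck)} layer(s), {bottleneck_total} MACs")
```

The bottleneck version needs far fewer operations, but it runs as two sequential, smaller layers instead of one large one, which is the pattern the points above describe as unfavorable for GPU efficiency.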

3 Network

3-1 One-Shot Aggregation

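The paper designs the OSA (one-shot aggregation) module. Inside the module, every layer has two kinds of connections: one passes through a convolution to the next layer and enlarges the receptive field, and the other goes directly to the output layer, where enough features are aggregated. Replacing the dense block of DenseNet-40 (5 layers, 43 channels) with an OSA module keeps the original performance while lowering MAC by roughly 30%, from 3.7M to 2.5M.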
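To see why aggregating only once lowers MAC, the per-layer cost $hw(c_{i} + c_{o}) + k^{2}c_{i}c_{o}$ can be summed over a dense block and over an OSA block. The sketch below is plain Python under loudly stated assumptions: a 32x32 feature map, a block input of 43 channels, 5 layers of 43 channels each, and an OSA output of 43 channels. The function names are hypothetical, the dense block is given no transition layer (which if anything favors it), and the numbers are not meant to reproduce the paper's 3.7M and 2.5M figures.

```python
def conv_mac(h, w, k, c_in, c_out):
    """MAC of one conv layer: input and output activations plus the weights."""
    return h * w * (c_in + c_out) + k * k * c_in * c_out


def dense_block_mac(h, w, c0, growth, n_layers, k=3):
    """Dense block: each layer reads the block input plus all earlier outputs."""
    total = 0
    c_in = c0
    for _ in range(n_layers):
        total += conv_mac(h, w, k, c_in, growth)
        c_in += growth  # the concatenated input keeps growing
    return total


def osa_block_mac(h, w, c0, width, n_layers, c_out, k=3):
    """OSA block: each layer reads only the previous layer; one 1x1 conv aggregates."""
    total = 0
    c_in = c0
    for _ in range(n_layers):
        total += conv_mac(h, w, k, c_in, width)
        c_in = width  # input width stays constant after the first layer
    # Single aggregation over the block input and every intermediate output.
    total += conv_mac(h, w, 1, c0 + width * n_layers, c_out)
    return total


# Assumed sizes, chosen only for illustration.
dense = dense_block_mac(h=32, w=32, c0=43, growth=43, n_layers=5)
osa = osa_block_mac(h=32, w=32, c0=43, width=43, n_layers=5, c_out=43)
print("dense block MAC:", dense)
print("OSA block MAC:  ", osa)
```

In the dense block the input width grows with every layer, so the channel counts become increasingly unbalanced; in the OSA block every 3x3 layer has equal input and output widths, and the wide concatenation is read only once.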

PyTorch code:


    import torch
    import torch.nn as nn


    class OSA_module(nn.Module):
        """One-Shot Aggregation block.

        Conv3x3BNReLU and Conv1x1BNReLU are defined in the full listing in section 3-2.
        """
        def __init__(self, in_channels, mid_channels, out_channels, block_nums=5):
            super(OSA_module, self).__init__()

            # block_nums stacked 3x3 convs; only the first one changes the channel count
            self._layers = nn.ModuleList()
            self._layers.append(Conv3x3BNReLU(in_channels=in_channels, out_channels=mid_channels, stride=1))
            for _ in range(block_nums - 1):
                self._layers.append(Conv3x3BNReLU(in_channels=mid_channels, out_channels=mid_channels, stride=1))

            # 1x1 conv that fuses the block input and every intermediate output
            self.conv1x1 = Conv1x1BNReLU(in_channels + mid_channels * block_nums, out_channels)

        def forward(self, x):
            outputs = [x]
            for _layer in self._layers:
                x = _layer(x)
                outputs.append(x)
            out = torch.cat(outputs, dim=1)  # aggregate once, at the end of the block
            out = self.conv1x1(out)
            return out

3-2 VoVNet

The VoVNet configurations proposed in the paper.

PyTorch code:


    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # @Time : 2020/6/1 14:40
    # @Author : liumin
    # @File : VoVNet.py

    import torch
    import torch.nn as nn

    __all__ = ['VoVNet', 'vovnet27_slim', 'vovnet39', 'vovnet57']


    def Conv3x3BNReLU(in_channels, out_channels, stride, groups=1):
        # 3x3 conv + BatchNorm + ReLU6; padding=1 keeps the spatial size when stride=1
        return nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=3, stride=stride, padding=1, groups=groups, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU6(inplace=True)
        )


    def Conv3x3BN(in_channels, out_channels, stride, groups):
        return nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=3, stride=stride, padding=1, groups=groups, bias=False),
            nn.BatchNorm2d(out_channels)
        )


    def Conv1x1BNReLU(in_channels, out_channels):
        return nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU6(inplace=True)
        )


    def Conv1x1BN(in_channels, out_channels):
        return nn.Sequential(
            nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(out_channels)
        )


    class OSA_module(nn.Module):
        def __init__(self, in_channels, mid_channels, out_channels, block_nums=5):
            super(OSA_module, self).__init__()

            # block_nums stacked 3x3 convs; only the first one changes the channel count
            self._layers = nn.ModuleList()
            self._layers.append(Conv3x3BNReLU(in_channels=in_channels, out_channels=mid_channels, stride=1))
            for _ in range(block_nums - 1):
                self._layers.append(Conv3x3BNReLU(in_channels=mid_channels, out_channels=mid_channels, stride=1))

            # 1x1 conv that fuses the block input and every intermediate output
            self.conv1x1 = Conv1x1BNReLU(in_channels + mid_channels * block_nums, out_channels)

        def forward(self, x):
            outputs = [x]
            for _layer in self._layers:
                x = _layer(x)
                outputs.append(x)
            out = torch.cat(outputs, dim=1)  # aggregate once, at the end of the block
            out = self.conv1x1(out)
            return out


    class VoVNet(nn.Module):
        def __init__(self, planes, layers, num_classes=2):
            super(VoVNet, self).__init__()

            self.groups = 1
            # Stem: three 3x3 convs, only the first one downsamples
            self.stage1 = nn.Sequential(
                Conv3x3BNReLU(in_channels=3, out_channels=64, stride=2, groups=self.groups),
                Conv3x3BNReLU(in_channels=64, out_channels=64, stride=1, groups=self.groups),
                Conv3x3BNReLU(in_channels=64, out_channels=128, stride=1, groups=self.groups),
            )

            # planes[i] = [in_channels, mid_channels, out_channels]; layers[i] = number of OSA modules
            self.stage2 = self._make_layer(planes[0][0], planes[0][1], planes[0][2], layers[0])
            self.stage3 = self._make_layer(planes[1][0], planes[1][1], planes[1][2], layers[1])
            self.stage4 = self._make_layer(planes[2][0], planes[2][1], planes[2][2], layers[2])
            self.stage5 = self._make_layer(planes[3][0], planes[3][1], planes[3][2], layers[3])

            self.avgpool = nn.AdaptiveAvgPool2d(output_size=1)
            self.flatten = nn.Flatten()
            self.dropout = nn.Dropout(p=0.2)
            self.linear = nn.Linear(in_features=planes[3][2], out_features=num_classes)

        def _make_layer(self, in_channels, mid_channels, out_channels, block_num):
            # Each stage halves the resolution with a max pool, then stacks OSA modules
            layers = []
            layers.append(nn.MaxPool2d(kernel_size=3, stride=2, padding=1))
            for _ in range(block_num):
                layers.append(OSA_module(in_channels=in_channels, mid_channels=mid_channels, out_channels=out_channels))
                in_channels = out_channels
            return nn.Sequential(*layers)

        def init_params(self):
            # Not called automatically; call model.init_params() after construction if desired
            for m in self.modules():
                if isinstance(m, nn.Conv2d):
                    nn.init.kaiming_normal_(m.weight)
                    if m.bias is not None:
                        nn.init.constant_(m.bias, 0)
                elif isinstance(m, nn.BatchNorm2d):
                    nn.init.constant_(m.weight, 1)
                    nn.init.constant_(m.bias, 0)
                elif isinstance(m, nn.Linear):
                    # The original set Linear weights to the constant 1, which makes every
                    # output unit identical; a small normal init is used here instead
                    nn.init.normal_(m.weight, 0, 0.01)
                    nn.init.constant_(m.bias, 0)

        def forward(self, x):
            x = self.stage1(x)
            x = self.stage2(x)
            x = self.stage3(x)
            x = self.stage4(x)
            x = self.stage5(x)
            x = self.avgpool(x)
            x = self.flatten(x)
            x = self.dropout(x)
            out = self.linear(x)
            return out


    def vovnet27_slim(**kwargs):
        planes = [[128, 64, 128],
                  [128, 80, 256],
                  [256, 96, 384],
                  [384, 112, 512]]
        layers = [1, 1, 1, 1]
        # kwargs (for example num_classes) are forwarded to VoVNet
        model = VoVNet(planes, layers, **kwargs)
        return model


    def vovnet39(**kwargs):
        planes = [[128, 128, 256],
                  [256, 160, 512],
                  [512, 192, 768],
                  [768, 224, 1024]]
        layers = [1, 1, 2, 2]
        model = VoVNet(planes, layers, **kwargs)
        return model


    def vovnet57(**kwargs):
        planes = [[128, 128, 256],
                  [256, 160, 512],
                  [512, 192, 768],
                  [768, 224, 1024]]
        layers = [1, 1, 4, 3]
        model = VoVNet(planes, layers, **kwargs)
        return model


    if __name__ == '__main__':
        model = vovnet27_slim()
        print(model)

        x = torch.randn(1, 3, 64, 64)
        out = model(x)
        print(out.shape)
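When the network is used as a detection backbone, it helps to know the shape each stage produces. The sketch below traces those shapes in plain Python without running the model. It assumes the layer settings of the listing above (a stride-2 3x3 stem convolution with padding 1 that ends at 128 channels, and a `MaxPool2d(kernel_size=3, stride=2, padding=1)` at the start of every later stage); the helper names `conv_out`, `osa_concat_channels`, and `trace_vovnet_shapes` are hypothetical and not part of the original code.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a conv or pooling layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def osa_concat_channels(in_channels, mid_channels, block_nums=5):
    """Width of the tensor fed to the 1x1 aggregation conv in one OSA module."""
    return in_channels + mid_channels * block_nums


def trace_vovnet_shapes(input_size, planes):
    """Return (stage name, channels, spatial size) for the output of each stage."""
    shapes = []
    # stage1: only the first of the three stem convs has stride 2
    size = conv_out(input_size, kernel=3, stride=2, padding=1)
    shapes.append(("stage1", 128, size))
    for i, (_, _, c_out) in enumerate(planes, start=2):
        # each later stage starts with a stride-2 max pool; OSA modules keep the size
        size = conv_out(size, kernel=3, stride=2, padding=1)
        shapes.append((f"stage{i}", c_out, size))
    return shapes


# planes of vovnet27_slim, copied from the listing
planes_27_slim = [[128, 64, 128],
                  [128, 80, 256],
                  [256, 96, 384],
                  [384, 112, 512]]

shapes = trace_vovnet_shapes(64, planes_27_slim)
for name, channels, size in shapes:
    print(f"{name}: {channels} x {size} x {size}")

print("stage2 OSA concat width:", osa_concat_channels(128, 64))
```

The total stride is 32, so the 64x64 test input from the listing is reduced to a 2x2 map before global average pooling.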
