
Real-Time Object Detection for Autonomous Vehicles Based on the DeepLabV3 Model


Team Name

Intel启动队

Problem Statement

Combining advanced computer vision techniques with the Intel® AI Analytics Toolkit, this project develops a real-time object detection system for autonomous driving. The team built a deep-learning-based system that detects and classifies objects such as sidewalks, cars, traffic signs, and traffic lights. The system must deliver high recognition accuracy at low latency, and it is designed and optimized to support safe, reliable path planning for autonomous vehicles.

Project Overview

This project uses the DeepLabV3 model and implements semantic segmentation through the following steps:

  1. Data preprocessing
  2. A ResNet module extracts feature representations from the input image
  3. An ASPP module aggregates contextual information to enrich the feature representation
  4. The DeepLabV3 network ties these parts together to produce the semantic segmentation prediction

With this pipeline in place, the system can segment roads, traffic signs, vehicles, and other elements out of video or still images and recognize the surrounding scene, providing technical support for the autonomous driving system.
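As a minimal sketch of the resulting inference flow (the DeepLabV3 class is defined later in this post; the checkpoint path and input file are illustrative assumptions, and input normalization is omitted for brevity):

import torch
from PIL import Image
import torchvision.transforms.functional as TF

# Hypothetical sketch: checkpoint path and image file are placeholders.
model = DeepLabV3(model_id="demo", project_dir="/root/deeplabv3")
model.load_state_dict(torch.load("/root/deeplabv3/training_logs/model_demo/checkpoints/latest.pth"))
model.eval()

img = TF.to_tensor(Image.open("frame.png")).unsqueeze(0)  # (1, 3, H, W)
with torch.no_grad():
    logits = model(img)          # (1, 20, H, W) per-pixel class scores
pred = logits.argmax(dim=1)      # (1, H, W) predicted trainId map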

Related Technologies

The system was built with the Intel® AI Analytics Toolkit and the following key components:

  1. The Intel® Deep Neural Network Library (oneDNN) with the PyTorch framework to speed up training: the library accelerates convolution operations in hardware, effectively shortening the training time of the DeepLabV3 model.
  2. Intel® VTune™ Profiler for performance analysis: it profiles the performance characteristics of each layer and node, helping identify and resolve performance bottlenecks and guiding optimization.

Compared with the traditional CUDA programming model, oneAPI offers the following advantages:

  1. Multi-platform compatibility: oneAPI supports different types of hardware, including Intel CPUs, GPUs, and FPGAs.
  2. Unified interface: it provides a single programming interface, so developers do not need to write separate code for each hardware architecture.
  3. Data parallelism: oneAPI uses a programming model based on data parallelism.
  4. Rich ecosystem: oneAPI provides a feature-rich, easy-to-use ecosystem of components.
  5. Open, common standard: oneAPI is an open, cross-vendor standard architecture.

Using oneAPI, we moved the model and optimizer onto Intel accelerators for training with no manual device management. Their well-supported data-parallel computation and neural-network acceleration significantly improved training efficiency. The concrete steps are as follows:

# Import Intel Extension for PyTorch:
import intel_extension_for_pytorch as ipex

# Create the optimizer before optimization so it can be optimized as well:
optimizer = torch.optim.Adam(params=network.parameters(), lr=0.003)

# Optimize the network and optimizer for Intel hardware; when an optimizer
# is passed, ipex.optimize returns the optimized (model, optimizer) pair,
# which replaces the stock PyTorch versions:
network.train()
network, optimizer = ipex.optimize(model=network, optimizer=optimizer)

This completes an efficient oneAPI-based deep-learning training pipeline.

Team Takeaways

We learned to use oneAPI for performance optimization:

oneAPI offers a rich set of tools and libraries that help improve code performance. One key tool is Intel® VTune™ Profiler, which not only identifies performance bottlenecks in the code but also suggests optimization strategies. These profiling tools markedly improved our development efficiency and the responsiveness of the system.

We learned to use oneAPI for deep-learning model training:

oneAPI includes libraries and toolkits for deep learning and neural networks. While learning how to build, train, and deploy deep-learning architectures in practice, we also studied concrete techniques for improving inference efficiency with the Intel® oneAPI Deep Neural Network Library.

Implementation

Data Preprocessing

The dataset contains 34 classes, but a given scenario or study usually only cares about a subset of them. It is common to select 19 train IDs for training an image semantic segmentation model.

A train ID (trainId) is an identifier designed to simplify data management and training: it maps the 34 classes in the raw labels onto a set of 19 classes. The main goals are to reduce computational complexity and improve model performance.

Here, each such identifier represents one of the 19 classes used for semantic segmentation on the Cityscapes dataset; a conversion sketch follows the mapping below.

Mapping (id -> trainId):
from collections import namedtuple

# Label tuple as defined in the official cityscapesScripts:
Label = namedtuple('Label', ['name', 'id', 'trainId', 'category',
                             'categoryId', 'hasInstances', 'ignoreInEval', 'color'])

labels = [
    #       name                     id    trainId   category            catId     hasInstances   ignoreInEval   color
    Label(  'unlabeled'            ,  0 ,      19 , 'void'            , 0       , False        , True         , (  0,  0,  0) ),
    Label(  'ego vehicle'          ,  1 ,      19 , 'void'            , 0       , False        , True         , (  0,  0,  0) ),
    Label(  'rectification border' ,  2 ,      19 , 'void'            , 0       , False        , True         , (  0,  0,  0) ),
    Label(  'out of roi'           ,  3 ,      19 , 'void'            , 0       , False        , True         , (  0,  0,  0) ),
    Label(  'static'               ,  4 ,      19 , 'void'            , 0       , False        , True         , (  0,  0,  0) ),
    Label(  'dynamic'              ,  5 ,      19 , 'void'            , 0       , False        , True         , (111, 74,  0) ),
    Label(  'ground'               ,  6 ,      19 , 'void'            , 0       , False        , True         , ( 81,  0, 81) ),
    Label(  'road'                 ,  7 ,       0 , 'flat'            , 1       , False        , False        , (128, 64,128) ),
    Label(  'sidewalk'             ,  8 ,       1 , 'flat'            , 1       , False        , False        , (244, 35,232) ),
    Label(  'parking'              ,  9 ,      19 , 'flat'            , 1       , False        , True         , (250,170,160) ),
    Label(  'rail track'           , 10 ,      19 , 'flat'            , 1       , False        , True         , (230,150,140) ),
    Label(  'building'             , 11 ,       2 , 'construction'    , 2       , False        , False        , ( 70, 70, 70) ),
    Label(  'wall'                 , 12 ,       3 , 'construction'    , 2       , False        , False        , (102,102,156) ),
    Label(  'fence'                , 13 ,       4 , 'construction'    , 2       , False        , False        , (190,153,153) ),
    Label(  'guard rail'           , 14 ,      19 , 'construction'    , 2       , False        , True         , (180,165,180) ),
    Label(  'bridge'               , 15 ,      19 , 'construction'    , 2       , False        , True         , (150,100,100) ),
    Label(  'tunnel'               , 16 ,      19 , 'construction'    , 2       , False        , True         , (150,120, 90) ),
    Label(  'pole'                 , 17 ,       5 , 'object'          , 3       , False        , False        , (153,153,153) ),
    Label(  'polegroup'            , 18 ,      19 , 'object'          , 3       , False        , True         , (153,153,153) ),
    Label(  'traffic light'        , 19 ,       6 , 'object'          , 3       , False        , False        , (250,170, 30) ),
    Label(  'traffic sign'         , 20 ,       7 , 'object'          , 3       , False        , False        , (220,220,  0) ),
    Label(  'vegetation'           , 21 ,       8 , 'nature'          , 4       , False        , False        , (107,142, 35) ),
    Label(  'terrain'              , 22 ,       9 , 'nature'          , 4       , False        , False        , (152,251,152) ),
    Label(  'sky'                  , 23 ,      10 , 'sky'             , 5       , False        , False        , ( 70,130,180) ),
    Label(  'person'               , 24 ,      11 , 'human'           , 6       , True         , False        , (220, 20, 60) ),
    Label(  'rider'                , 25 ,      12 , 'human'           , 6       , True         , False        , (255,  0,  0) ),
    Label(  'car'                  , 26 ,      13 , 'vehicle'         , 7       , True         , False        , (  0,  0,142) ),
    Label(  'truck'                , 27 ,      14 , 'vehicle'         , 7       , True         , False        , (  0,  0, 70) ),
    Label(  'bus'                  , 28 ,      15 , 'vehicle'         , 7       , True         , False        , (  0, 60,100) ),
    Label(  'caravan'              , 29 ,      19 , 'vehicle'         , 7       , True         , True         , (  0,  0, 90) ),
    Label(  'trailer'              , 30 ,      19 , 'vehicle'         , 7       , True         , True         , (  0,  0,110) ),
    Label(  'train'                , 31 ,      16 , 'vehicle'         , 7       , True         , False        , (  0, 80,100) ),
    Label(  'motorcycle'           , 32 ,      17 , 'vehicle'         , 7       , True         , False        , (  0,  0,230) ),
    Label(  'bicycle'              , 33 ,      18 , 'vehicle'         , 7       , True         , False        , (119, 11, 32) ),
    Label(  'license plate'        , -1 ,      19 , 'vehicle'         , 7       , False        , True         , (  0,  0,142) ),
]
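As a minimal preprocessing sketch (the post does not show the exact conversion code; the helper below is an assumption), the mapping can be applied to a raw Cityscapes label image with a lookup table:

import numpy as np

# Build a 256-entry lookup table from raw Cityscapes ids to trainIds.
# Raw ids fit in uint8; the 'license plate' id of -1 is skipped, and
# unmapped ids fall back to the void trainId 19.
id_to_trainid = np.full(256, 19, dtype=np.uint8)
for label in labels:
    if label.id >= 0:
        id_to_trainid[label.id] = label.trainId

def convert_to_trainid(label_img):
    # label_img: (H, W) uint8 array of raw ids -> (H, W) trainId map
    return id_to_trainid[label_img]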

Hyperparameter Settings

num_epochs = 400
batch_size = 3
learning_rate = 0.0001
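As a sketch only (the full training script is not included in this post; train_loader is assumed to exist, and DeepLabV3 is the class defined below), these hyperparameters plug into a standard PyTorch loop:

import torch
import torch.nn as nn

num_epochs = 400
batch_size = 3
learning_rate = 0.0001

model = DeepLabV3(model_id="1", project_dir="/root/deeplabv3")
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
criterion = nn.CrossEntropyLoss()  # targets are (H, W) maps of trainIds

for epoch in range(num_epochs):
    for imgs, label_imgs in train_loader:
        logits = model(imgs)                  # (batch_size, 20, H, W)
        loss = criterion(logits, label_imgs)  # label_imgs: (batch_size, H, W)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()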

ResNet Module

Overview

We built the core feature-extraction module on top of a pretrained ResNet, with configurable depth and output stride, to provide strong features for the downstream segmentation task. Defining the network architecture and loading pretrained weights yields a general-purpose ResNet feature extractor that underpins the later stages.

Implementation steps
  1. Choose the base ResNet architecture
  2. Design the layered structure
  3. Load the pretrained weights
  4. Produce the OS16 and OS8 variants (OS = output stride)
  5. Set up the forward (and backward) propagation interface
  6. Expose factory functions for each variant
Code Implementation
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models


def make_layer(block, in_channels, channels, num_blocks, stride=1, dilation=1):
    strides = [stride] + [1] * (num_blocks - 1)

    blocks = []
    for stride in strides:
        blocks.append(block(in_channels=in_channels, channels=channels,
                            stride=stride, dilation=dilation))
        in_channels = block.expansion * channels

    layer = nn.Sequential(*blocks)

    return layer


class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, in_channels, channels, stride=1, dilation=1):
        super(BasicBlock, self).__init__()

        out_channels = self.expansion * channels

        self.conv1 = nn.Conv2d(in_channels, channels, kernel_size=3, stride=stride,
                               padding=dilation, dilation=dilation, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)

        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, stride=1,
                               padding=dilation, dilation=dilation, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

        if (stride != 1) or (in_channels != out_channels):
            conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False)
            bn = nn.BatchNorm2d(out_channels)
            self.downsample = nn.Sequential(conv, bn)
        else:
            self.downsample = nn.Sequential()

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))

        out = out + self.downsample(x)

        out = F.relu(out)

        return out


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, in_channels, channels, stride=1, dilation=1):
        super(Bottleneck, self).__init__()

        out_channels = self.expansion * channels

        self.conv1 = nn.Conv2d(in_channels, channels, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)

        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, stride=stride,
                               padding=dilation, dilation=dilation, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

        self.conv3 = nn.Conv2d(channels, out_channels, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channels)

        if (stride != 1) or (in_channels != out_channels):
            conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False)
            bn = nn.BatchNorm2d(out_channels)
            self.downsample = nn.Sequential(conv, bn)
        else:
            self.downsample = nn.Sequential()

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))  # (shape: (batch_size, channels, h, w))
        out = F.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        out = out + self.downsample(x)
        out = F.relu(out)

        return out


class ResNet_Bottleneck_OS16(nn.Module):
    def __init__(self, num_layers):
        super(ResNet_Bottleneck_OS16, self).__init__()

        if num_layers == 50:
            resnet = models.resnet50()
            # load pretrained model:
            resnet.load_state_dict(torch.load("/root/deeplabv3/pretrained_models/resnet/resnet50-19c8e357.pth"))
            # remove fully connected layer, avg pool and layer5:
            self.resnet = nn.Sequential(*list(resnet.children())[:-3])
            print("pretrained resnet, 50")
        elif num_layers == 101:
            resnet = models.resnet101()
            resnet.load_state_dict(torch.load("/root/deeplabv3/pretrained_models/resnet/resnet101-5d3b4d8f.pth"))
            self.resnet = nn.Sequential(*list(resnet.children())[:-3])
            print("pretrained resnet, 101")
        elif num_layers == 152:
            resnet = models.resnet152()
            # load pretrained model:
            resnet.load_state_dict(torch.load("/root/deeplabv3/pretrained_models/resnet/resnet152-b121ed2d.pth"))
            # remove fully connected layer, avg pool and layer5:
            self.resnet = nn.Sequential(*list(resnet.children())[:-3])
            print("pretrained resnet, 152")
        else:
            raise Exception("num_layers must be in {50, 101, 152}!")

        self.layer5 = make_layer(Bottleneck, in_channels=4 * 256, channels=512, num_blocks=3, stride=1, dilation=2)

    def forward(self, x):
        c4 = self.resnet(x)
        output = self.layer5(c4)
        return output


class ResNet_BasicBlock_OS16(nn.Module):
    def __init__(self, num_layers):
        super(ResNet_BasicBlock_OS16, self).__init__()

        if num_layers == 18:
            resnet = models.resnet18()
            resnet.load_state_dict(torch.load("/root/deeplabv3/pretrained_models/resnet/resnet18-5c106cde.pth"))
            self.resnet = nn.Sequential(*list(resnet.children())[:-3])
            num_blocks = 2
            print("pretrained resnet, 18")
        elif num_layers == 34:
            resnet = models.resnet34()
            resnet.load_state_dict(torch.load("/root/deeplabv3/pretrained_models/resnet/resnet34-333f7ec4.pth"))
            self.resnet = nn.Sequential(*list(resnet.children())[:-3])
            num_blocks = 3
            print("pretrained resnet, 34")
        else:
            raise Exception("num_layers must be in {18, 34}!")

        self.layer5 = make_layer(BasicBlock, in_channels=256, channels=512, num_blocks=num_blocks, stride=1, dilation=2)

    def forward(self, x):
        c4 = self.resnet(x)
        output = self.layer5(c4)
        return output


class ResNet_BasicBlock_OS8(nn.Module):
    def __init__(self, num_layers):
        super(ResNet_BasicBlock_OS8, self).__init__()

        if num_layers == 18:
            resnet = models.resnet18()
            resnet.load_state_dict(torch.load("/root/deeplabv3/pretrained_models/resnet/resnet18-5c106cde.pth"))
            self.resnet = nn.Sequential(*list(resnet.children())[:-4])
            num_blocks_layer_4 = 2
            num_blocks_layer_5 = 2
            print("pretrained resnet, 18")
        elif num_layers == 34:
            resnet = models.resnet34()
            resnet.load_state_dict(torch.load("/root/deeplabv3/pretrained_models/resnet/resnet34-333f7ec4.pth"))
            self.resnet = nn.Sequential(*list(resnet.children())[:-4])
            num_blocks_layer_4 = 6
            num_blocks_layer_5 = 3
            print("pretrained resnet, 34")
        else:
            raise Exception("num_layers must be in {18, 34}!")

        self.layer4 = make_layer(BasicBlock, in_channels=128, channels=256, num_blocks=num_blocks_layer_4, stride=1, dilation=2)

        self.layer5 = make_layer(BasicBlock, in_channels=256, channels=512, num_blocks=num_blocks_layer_5, stride=1, dilation=4)

    def forward(self, x):
        c3 = self.resnet(x)
        output = self.layer4(c3)
        output = self.layer5(output)
        return output


def ResNet18_OS16():
    return ResNet_BasicBlock_OS16(num_layers=18)


def ResNet34_OS16():
    return ResNet_BasicBlock_OS16(num_layers=34)


def ResNet50_OS16():
    return ResNet_Bottleneck_OS16(num_layers=50)


def ResNet101_OS16():
    return ResNet_Bottleneck_OS16(num_layers=101)


def ResNet152_OS16():
    return ResNet_Bottleneck_OS16(num_layers=152)


def ResNet18_OS8():
    return ResNet_BasicBlock_OS8(num_layers=18)


def ResNet34_OS8():
    return ResNet_BasicBlock_OS8(num_layers=34)
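A quick shape check (a sketch; the input size is illustrative, and the pretrained weight files above must be in place) confirms the OS8 backbone's output stride:

import torch

net = ResNet18_OS8()
x = torch.randn(1, 3, 512, 1024)
feat = net(x)
print(feat.shape)  # torch.Size([1, 512, 64, 128]) -- 512 channels at h/8 x w/8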

ASPP Module

Overview

We use an ASPP (Atrous Spatial Pyramid Pooling) module, which applies dilated convolutions at multiple rates together with image-level pooling to capture spatial context at different scales, noticeably improving semantic segmentation accuracy.

Two key modules were developed: ASPP and its bottleneck variant, ASPP_Bottleneck. The ASPP module is designed for the 512-channel feature maps produced by BasicBlock backbones, while the bottleneck variant takes the 4×512-channel feature maps produced by Bottleneck backbones. The bottleneck variant improves model performance without significantly increasing computational cost.

Code Implementation
import torch
import torch.nn as nn
import torch.nn.functional as F


class ASPP(nn.Module):
    def __init__(self, num_classes):
        super(ASPP, self).__init__()

        self.conv_1x1_1 = nn.Conv2d(512, 256, kernel_size=1)
        self.bn_conv_1x1_1 = nn.BatchNorm2d(256)

        self.conv_3x3_1 = nn.Conv2d(512, 256, kernel_size=3, stride=1, padding=6, dilation=6)
        self.bn_conv_3x3_1 = nn.BatchNorm2d(256)

        self.conv_3x3_2 = nn.Conv2d(512, 256, kernel_size=3, stride=1, padding=12, dilation=12)
        self.bn_conv_3x3_2 = nn.BatchNorm2d(256)

        self.conv_3x3_3 = nn.Conv2d(512, 256, kernel_size=3, stride=1, padding=18, dilation=18)
        self.bn_conv_3x3_3 = nn.BatchNorm2d(256)

        self.avg_pool = nn.AdaptiveAvgPool2d(1)

        self.conv_1x1_2 = nn.Conv2d(512, 256, kernel_size=1)
        self.bn_conv_1x1_2 = nn.BatchNorm2d(256)

        self.conv_1x1_3 = nn.Conv2d(1280, 256, kernel_size=1)  # (1280 = 5*256)
        self.bn_conv_1x1_3 = nn.BatchNorm2d(256)

        self.conv_1x1_4 = nn.Conv2d(256, num_classes, kernel_size=1)

    def forward(self, feature_map):
        # (feature_map has shape (batch_size, 512, h/16, w/16)) (assuming self.resnet is ResNet18_OS16 or ResNet34_OS16. If self.resnet instead is ResNet18_OS8 or ResNet34_OS8, it will be (batch_size, 512, h/8, w/8))
        feature_map_h = feature_map.size()[2]  # (== h/16)
        feature_map_w = feature_map.size()[3]  # (== w/16)

        out_1x1 = F.relu(self.bn_conv_1x1_1(self.conv_1x1_1(feature_map)))    # (shape: (batch_size, 256, h/16, w/16))
        out_3x3_1 = F.relu(self.bn_conv_3x3_1(self.conv_3x3_1(feature_map)))  # (shape: (batch_size, 256, h/16, w/16))
        out_3x3_2 = F.relu(self.bn_conv_3x3_2(self.conv_3x3_2(feature_map)))  # (shape: (batch_size, 256, h/16, w/16))
        out_3x3_3 = F.relu(self.bn_conv_3x3_3(self.conv_3x3_3(feature_map)))  # (shape: (batch_size, 256, h/16, w/16))

        out_img = self.avg_pool(feature_map)                            # (shape: (batch_size, 512, 1, 1))
        out_img = F.relu(self.bn_conv_1x1_2(self.conv_1x1_2(out_img)))  # (shape: (batch_size, 256, 1, 1))
        out_img = F.upsample(out_img, size=(feature_map_h, feature_map_w), mode="bilinear")  # (shape: (batch_size, 256, h/16, w/16))

        out = torch.cat([out_1x1, out_3x3_1, out_3x3_2, out_3x3_3, out_img], 1)  # (shape: (batch_size, 1280, h/16, w/16))
        out = F.relu(self.bn_conv_1x1_3(self.conv_1x1_3(out)))  # (shape: (batch_size, 256, h/16, w/16))
        out = self.conv_1x1_4(out)                              # (shape: (batch_size, num_classes, h/16, w/16))

        return out


class ASPP_Bottleneck(nn.Module):
    def __init__(self, num_classes):
        super(ASPP_Bottleneck, self).__init__()

        self.conv_1x1_1 = nn.Conv2d(4*512, 256, kernel_size=1)
        self.bn_conv_1x1_1 = nn.BatchNorm2d(256)

        self.conv_3x3_1 = nn.Conv2d(4*512, 256, kernel_size=3, stride=1, padding=6, dilation=6)
        self.bn_conv_3x3_1 = nn.BatchNorm2d(256)

        self.conv_3x3_2 = nn.Conv2d(4*512, 256, kernel_size=3, stride=1, padding=12, dilation=12)
        self.bn_conv_3x3_2 = nn.BatchNorm2d(256)

        self.conv_3x3_3 = nn.Conv2d(4*512, 256, kernel_size=3, stride=1, padding=18, dilation=18)
        self.bn_conv_3x3_3 = nn.BatchNorm2d(256)

        self.avg_pool = nn.AdaptiveAvgPool2d(1)

        self.conv_1x1_2 = nn.Conv2d(4*512, 256, kernel_size=1)
        self.bn_conv_1x1_2 = nn.BatchNorm2d(256)

        self.conv_1x1_3 = nn.Conv2d(1280, 256, kernel_size=1)  # (1280 = 5*256)
        self.bn_conv_1x1_3 = nn.BatchNorm2d(256)

        self.conv_1x1_4 = nn.Conv2d(256, num_classes, kernel_size=1)

    def forward(self, feature_map):
        # (feature_map has shape (batch_size, 4*512, h/16, w/16))
        feature_map_h = feature_map.size()[2]  # (== h/16)
        feature_map_w = feature_map.size()[3]  # (== w/16)

        out_1x1 = F.relu(self.bn_conv_1x1_1(self.conv_1x1_1(feature_map)))    # (shape: (batch_size, 256, h/16, w/16))
        out_3x3_1 = F.relu(self.bn_conv_3x3_1(self.conv_3x3_1(feature_map)))  # (shape: (batch_size, 256, h/16, w/16))
        out_3x3_2 = F.relu(self.bn_conv_3x3_2(self.conv_3x3_2(feature_map)))  # (shape: (batch_size, 256, h/16, w/16))
        out_3x3_3 = F.relu(self.bn_conv_3x3_3(self.conv_3x3_3(feature_map)))  # (shape: (batch_size, 256, h/16, w/16))

        out_img = self.avg_pool(feature_map)                            # (shape: (batch_size, 4*512, 1, 1))
        out_img = F.relu(self.bn_conv_1x1_2(self.conv_1x1_2(out_img)))  # (shape: (batch_size, 256, 1, 1))
        out_img = F.upsample(out_img, size=(feature_map_h, feature_map_w), mode="bilinear")  # (shape: (batch_size, 256, h/16, w/16))

        out = torch.cat([out_1x1, out_3x3_1, out_3x3_2, out_3x3_3, out_img], 1)  # (shape: (batch_size, 1280, h/16, w/16))
        out = F.relu(self.bn_conv_1x1_3(self.conv_1x1_3(out)))  # (shape: (batch_size, 256, h/16, w/16))
        out = self.conv_1x1_4(out)                              # (shape: (batch_size, num_classes, h/16, w/16))

        return out

This code defines two modules: ASPP and ASPP_Bottleneck.

The ASPP module fuses multi-scale features from the input feature map and is used for the semantic segmentation task. Internally it combines several convolution branches: dilated 3×3 convolutions at different rates extract spatial detail at each scale, while adaptive average pooling produces a global, image-level feature. The branch outputs are concatenated with this global context and passed through additional convolution, normalization, and activation layers to produce the final prediction.

ASPP_Bottleneck follows the same design but is adapted to bottleneck backbones, whose feature maps have 4×512 channels, to strengthen the feature representation. It first applies the multi-scale pyramid of dilated convolutions to its input and uses adaptive average pooling to capture global spatial information; it then fuses the per-scale representations with the global context and applies the same projection and normalization steps to obtain the final prediction.
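A quick shape check for the ASPP module (a sketch; the feature-map size is illustrative):

import torch

aspp = ASPP(num_classes=20)
feature_map = torch.randn(1, 512, 32, 64)  # (batch, 512, h/16, w/16)
out = aspp(feature_map)
print(out.shape)  # torch.Size([1, 20, 32, 64])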

DeepLabV3 Module

Overview

This module implements the main structure and workflow of the DeepLabV3 network:

  1. Define the DeepLabV3 network class and pass in the training-related parameters;
  2. Select a ResNet variant as the backbone to extract feature maps;
  3. Initialize the ASPP module to aggregate context over the feature maps;
  4. Define the forward pass;
  5. Create the directories used to save training state.
Code Implementation
import torch
import torch.nn as nn
import torch.nn.functional as F

import os

from resnet import ResNet18_OS16, ResNet34_OS16, ResNet50_OS16, ResNet101_OS16, ResNet152_OS16, ResNet18_OS8, ResNet34_OS8
from aspp import ASPP, ASPP_Bottleneck


class DeepLabV3(nn.Module):
    def __init__(self, model_id, project_dir):
        super(DeepLabV3, self).__init__()

        self.num_classes = 20  # 19 Cityscapes train classes + 1 void class (trainId 19)

        self.model_id = model_id
        self.project_dir = project_dir
        self.create_model_dirs()

        # pair ASPP with BasicBlock backbones (512-channel output) and
        # ASPP_Bottleneck with Bottleneck backbones (4*512 channels):
        self.resnet = ResNet18_OS8()
        self.aspp = ASPP(num_classes=self.num_classes)

    def forward(self, x):
        h = x.size()[2]
        w = x.size()[3]

        feature_map = self.resnet(x)

        output = self.aspp(feature_map)

        # upsample the per-pixel logits back to the input resolution:
        output = F.upsample(output, size=(h, w), mode="bilinear")

        return output

    def create_model_dirs(self):
        self.logs_dir = self.project_dir + "/training_logs"
        self.model_dir = self.logs_dir + "/model_%s" % self.model_id
        self.checkpoints_dir = self.model_dir + "/checkpoints"
        if not os.path.exists(self.logs_dir):
            os.makedirs(self.logs_dir)
        if not os.path.exists(self.model_dir):
            os.makedirs(self.model_dir)
            os.makedirs(self.checkpoints_dir)
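A usage sketch (dummy input; the model_id and project_dir values are illustrative):

import torch

model = DeepLabV3(model_id="test", project_dir="/root/deeplabv3")
x = torch.randn(1, 3, 512, 1024)
out = model(x)            # (1, 20, 512, 1024) per-pixel class logits
pred = out.argmax(dim=1)  # (1, 512, 1024) predicted trainId map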

Training

We trained for 400 epochs in total, split into three runs covering epochs 0-100, 101-200, and 201-400.
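Staged training like this relies on saving and restoring checkpoints between runs; a hypothetical sketch (the exact script and file names are not shown in this post):

import torch

# End of a run: save the weights into the model's checkpoint directory.
torch.save(model.state_dict(), model.checkpoints_dir + "/model_epoch_100.pth")

# Start of the next run: restore the weights and continue training.
model.load_state_dict(torch.load(model.checkpoints_dir + "/model_epoch_100.pth"))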

Note: the axis labels of the training-loss plots for the 101-200 and 201-400 epoch runs were output incorrectly and have been corrected manually.

Training loss on the training set

Training loss on the validation set

Training results:

Histogram of per-image mean pixel IoU on the test set

Mean pixel IoU: 0.954983232183881

Histogram of per-image pixel accuracy on the test set

Mean pixel accuracy: 0.86769234085083

Histogram of per-image mean prediction time on the test set

Mean prediction time: 0.0030388580219639257 s
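For reference, a sketch of how such per-image metrics can be computed from predicted and ground-truth trainId maps (an assumption; the evaluation script is not shown in this post):

import numpy as np

def pixel_accuracy(pred, gt):
    # fraction of pixels whose predicted trainId matches the ground truth
    return float((pred == gt).mean())

def mean_iou(pred, gt, num_classes=20):
    # mean IoU over classes present in either prediction or ground truth
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))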

Demo
