深度之眼 -- Image Segmentation, Week 3: SegNet


SegNet: A Symmetric Encoder-Decoder Network

- 1. Encoder-decoder and unpooling
- 2. Figure 2 of the SegNet paper
- 3. Test results on the CamVid dataset

1. Encoder-decoder and unpooling

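Encoder: downsamples the input and progressively extracts features from shallow to deep. Shallow feature maps emphasize the local edge information that segmentation relies on, whereas deep feature maps have a larger receptive field, which helps capture global context and detailed semantic information.
Decoder: upsamples the feature maps produced by the encoder back to high resolution, so that the global information can be turned into more precise predictions. The decoder network maps the low-resolution encoder feature maps to feature maps at the full input resolution, which are then used for pixel-wise classification, recovering the segmentation precision of the shallow layers.
In general terms, encoding converts the input sequence into a fixed-length vector, and decoding converts that vector into an output sequence. In segmentation, the decoder gradually restores spatial information in order to capture sharper object boundaries.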
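The downsample-then-upsample pattern described above can be sketched with a toy network. This is only an illustration, not SegNet itself: the class name `TinyEncoderDecoder`, the channel widths, and the input size are assumptions made for this example, and the decoder uses plain nearest-neighbour upsampling.

```python
import torch
import torch.nn as nn


class TinyEncoderDecoder(nn.Module):
    """Hypothetical two-stage encoder-decoder, for illustration only."""

    def __init__(self, num_classes=12):
        super().__init__()
        # Encoder: each stage halves the spatial resolution.
        self.enc1 = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2))
        self.enc2 = nn.Sequential(
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(inplace=True), nn.MaxPool2d(2))
        # Decoder: each stage doubles the spatial resolution.
        self.dec2 = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(inplace=True))
        self.dec1 = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(inplace=True))
        # 1x1 convolution that scores every pixel for every class.
        self.classifier = nn.Conv2d(16, num_classes, 1)

    def forward(self, x):
        low = self.enc2(self.enc1(x))      # low-resolution, deep features
        high = self.dec1(self.dec2(low))   # back at the input resolution
        return low, self.classifier(high)


model = TinyEncoderDecoder()
image = torch.randn(1, 3, 64, 96)
low, logits = model(image)
print(low.shape)     # torch.Size([1, 32, 16, 24])
print(logits.shape)  # torch.Size([1, 12, 64, 96])
```

The encoder output is four times smaller in each spatial dimension, while the final logits have one score per class at every input pixel.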

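Upsampling
Upsampling refers to any technique that increases the resolution of an image; deconvolution and unpooling are both forms of upsampling. Most methods rely on interpolation: new elements are inserted between the existing pixels using a suitable interpolation algorithm. The simplest approach is resampling with interpolation: the input image is rescaled to the desired size, and the values of the new pixels are computed with a method such as bilinear interpolation.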
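As a small illustration of interpolation-based upsampling, the sketch below enlarges a 2x2 map to 4x4 with `torch.nn.functional.interpolate`. The input values are made up for the example, and `align_corners=True` is chosen here only so that the corner values are preserved exactly.

```python
import torch
import torch.nn.functional as F

# A single-channel 2x2 "image" in (N, C, H, W) layout.
x = torch.tensor([[[[1.0, 2.0],
                    [3.0, 4.0]]]])

# Nearest-neighbour upsampling copies each value into a 2x2 block.
nearest = F.interpolate(x, scale_factor=2, mode="nearest")

# Bilinear upsampling interpolates between neighbouring pixels.
bilinear = F.interpolate(x, size=(4, 4), mode="bilinear", align_corners=True)

print(nearest[0, 0])
print(bilinear[0, 0])
```

With nearest-neighbour mode the output contains only the four original values, each repeated; with bilinear mode the values change gradually from 1 to 4 across the map.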
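Unpooling
Unpooling is the term commonly used in CNNs for the inverse of max pooling. The idea originates from the paper "Visualizing and Understanding Convolutional Networks".
Upsampling and unpooling differ as follows:
1. Upsampling does not use the position information from max pooling; it simply copies values to enlarge the feature map.
2. Unpooling records the position of each maximum during max pooling and uses that information to enlarge the feature map: each value is placed back at its recorded position, and every other position is filled with 0.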
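The difference between the two operations can be checked on a small example. The sketch below uses PyTorch's `nn.MaxPool2d` (with `return_indices=True`) and `nn.MaxUnpool2d`; the 4x4 input values are arbitrary and chosen only so that the maxima sit at different positions inside each window.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.tensor([[[[1., 9., 2., 3.],
                    [4., 5., 8., 6.],
                    [7., 2., 1., 0.],
                    [3., 6., 4., 5.]]]])

pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

# Max pooling keeps the maximum of each 2x2 window and remembers where it was.
pooled, indices = pool(x)

# Unpooling puts each maximum back at its recorded position; the rest is 0.
unpooled = unpool(pooled, indices, output_size=x.size())

# Plain upsampling ignores the positions and copies each value four times.
upsampled = F.interpolate(pooled, scale_factor=2, mode="nearest")

print(pooled[0, 0])     # [[9, 8], [7, 5]]
print(indices[0, 0])    # flat positions in the 4x4 map: [[1, 6], [8, 15]]
print(unpooled[0, 0])
print(upsampled[0, 0])
```

The unpooled map is mostly zeros but keeps every maximum at its original location, whereas the upsampled map is dense but has lost the location information.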

2. Figure 2 of the SegNet paper

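SegNet builds on FCN: it is a semantic segmentation network obtained by modifying VGG-16, and it uses an encoder-decoder design.
Illustration of the SegNet architecture: it is a symmetric fully convolutional network whose encoder consists of the feature-extraction layers of a pretrained VGG16 and whose decoder mirrors the encoder. In the figure, the green pooling layers and the red upsampling layers separate the stages. The left half applies "same" convolutions to extract high-dimensional features and uses pooling to shrink the feature maps. The right half applies upsampling (unpooling) to enlarge the feature maps, followed by convolutions that turn the sparse unpooled maps into dense ones. Finally, a softmax layer produces per-pixel class probabilities, and the class with the highest probability is the prediction for each pixel.
Advantages: a single forward pass through the trained model yields smooth pixel-level predictions. The decoder uses the pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling, which removes the need to learn the upsampling. The upsampled feature maps are sparse, so they are then convolved with trainable filters to produce dense feature maps. SegNet is also more efficient, because it stores only the max-pooling indices of the feature maps and reuses them in the decoder network to achieve good performance.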
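The "sparse after unpooling, dense after convolution" step can be observed directly. The following is a minimal sketch under assumed settings: random positive features stand in for real encoder outputs, and the convolution is randomly initialized rather than trained.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Strictly positive stand-in features, so every pooled maximum is non-zero.
features = torch.rand(1, 8, 16, 16) + 0.1

pool = nn.MaxPool2d(2, 2, return_indices=True)
unpool = nn.MaxUnpool2d(2, 2)
conv = nn.Conv2d(8, 8, kernel_size=3, padding=1)  # stands in for the trainable decoder filters

pooled, indices = pool(features)
sparse = unpool(pooled, indices, output_size=features.size())
dense = conv(sparse)

sparse_fraction = (sparse != 0).float().mean().item()
dense_fraction = (dense != 0).float().mean().item()
print(f"non-zero fraction after unpooling:   {sparse_fraction:.2f}")
print(f"non-zero fraction after convolution: {dense_fraction:.2f}")
```

With 2x2 pooling, exactly one value in four survives unpooling; the convolution then spreads those values over their neighbourhoods. Note that `indices` has the same shape as `pooled`, so only one position per pooling window has to be carried from the encoder to the decoder.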

3. Test results on the CamVid dataset

The code below builds SegNet on top of the pretrained vgg16_bn network:


    import torch
    import torch.nn as nn
    from torchvision import models

    # VGG16 with batch normalization, pretrained on ImageNet.
    # Newer torchvision releases replace `pretrained=True` with the `weights=` argument.
    vgg16_pretrained = models.vgg16_bn(pretrained=True)


    def decoder(input_channel, output_channel, num=3):
        """Build one decoder stage made of `num` (2 or 3) Conv-BN-ReLU blocks."""
        if num not in (2, 3):
            raise ValueError("num must be 2 or 3")
        layers = []
        in_channel = input_channel
        for _ in range(num):
            layers += [
                nn.Conv2d(in_channel, output_channel, 3, padding=1),
                nn.BatchNorm2d(output_channel),
                nn.ReLU(inplace=True),
            ]
            in_channel = output_channel  # only the first conv changes the channel count
        return nn.Sequential(*layers)
    
    
    class VGG16_deconv(torch.nn.Module):
        """SegNet: pretrained VGG16-BN encoder plus a max-unpooling decoder."""

        def __init__(self):
            super(VGG16_deconv, self).__init__()

            # Positions of the five max-pooling layers in vgg16_bn.features.
            pool_list = [6, 13, 23, 33, 43]
            for index in pool_list:
                vgg16_pretrained.features[index].return_indices = True  # make the pooling layer return its indices

            self.encoder1 = vgg16_pretrained.features[:6]
            self.pool1 = vgg16_pretrained.features[6]

            self.encoder2 = vgg16_pretrained.features[7:13]
            self.pool2 = vgg16_pretrained.features[13]

            self.encoder3 = vgg16_pretrained.features[14:23]
            self.pool3 = vgg16_pretrained.features[23]

            self.encoder4 = vgg16_pretrained.features[24:33]
            self.pool4 = vgg16_pretrained.features[33]

            self.encoder5 = vgg16_pretrained.features[34:43]
            self.pool5 = vgg16_pretrained.features[43]

            self.decoder5 = decoder(512, 512)
            self.unpool5 = nn.MaxUnpool2d(2, 2)

            self.decoder4 = decoder(512, 256)
            self.unpool4 = nn.MaxUnpool2d(2, 2)

            self.decoder3 = decoder(256, 128)
            self.unpool3 = nn.MaxUnpool2d(2, 2)

            self.decoder2 = decoder(128, 64, 2)
            self.unpool2 = nn.MaxUnpool2d(2, 2)

            self.decoder1 = decoder(64, 64, 3)
            self.unpool1 = nn.MaxUnpool2d(2, 2)
            self.decoder0 = nn.Conv2d(64, 12, 1)  # 1x1 convolution producing scores for 12 classes

        def forward(self, x):  # 3, 352, 480
            encoder1 = self.encoder1(x)  # 64, 352, 480
            output_size1 = encoder1.size()  # 64, 352, 480
            pool1, indices1 = self.pool1(encoder1)  # 64, 176, 240   the pooling layer returns the pooled feature map and the indices

            encoder2 = self.encoder2(pool1)  # 128, 176, 240
            output_size2 = encoder2.size()  # 128, 176, 240
            pool2, indices2 = self.pool2(encoder2)  # 128, 88, 120

            encoder3 = self.encoder3(pool2)  # 256, 88, 120
            output_size3 = encoder3.size()  # 256, 88, 120
            pool3, indices3 = self.pool3(encoder3)  # 256, 44, 60

            encoder4 = self.encoder4(pool3)  # 512, 44, 60
            output_size4 = encoder4.size()  # 512, 44, 60
            pool4, indices4 = self.pool4(encoder4)  # 512, 22, 30

            encoder5 = self.encoder5(pool4)  # 512, 22, 30
            output_size5 = encoder5.size()  # 512, 22, 30
            pool5, indices5 = self.pool5(encoder5)  # 512, 11, 15

            unpool5 = self.unpool5(input=pool5, indices=indices5, output_size=output_size5)  # 512, 22, 30   unpooling takes (feature map, indices, output size)
            decoder5 = self.decoder5(unpool5)  # 512, 22, 30

            unpool4 = self.unpool4(input=decoder5, indices=indices4, output_size=output_size4)  # 512, 44, 60
            decoder4 = self.decoder4(unpool4)  # 256, 44, 60

            unpool3 = self.unpool3(input=decoder4, indices=indices3, output_size=output_size3)  # 256, 88, 120
            decoder3 = self.decoder3(unpool3)  # 128, 88, 120

            unpool2 = self.unpool2(input=decoder3, indices=indices2, output_size=output_size2)  # 128, 176, 240
            decoder2 = self.decoder2(unpool2)  # 64, 176, 240

            unpool1 = self.unpool1(input=decoder2, indices=indices1, output_size=output_size1)  # 64, 352, 480
            decoder1 = self.decoder1(unpool1)  # 64, 352, 480
            x = self.decoder0(decoder1)  # 12, 352, 480

            return x
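The shape comments in `forward` follow from five rounds of 2x2, stride-2 max pooling, each of which halves the height and width and rounds down. The helper below (`pooled_sizes`, a hypothetical name used only for this sketch) reproduces that arithmetic in plain Python.

```python
def pooled_sizes(height, width, stages=5):
    """Spatial size after each 2x2, stride-2 max pooling (floor division)."""
    sizes = [(height, width)]
    for _ in range(stages):
        height, width = height // 2, width // 2
        sizes.append((height, width))
    return sizes


print(pooled_sizes(352, 480))
# [(352, 480), (176, 240), (88, 120), (44, 60), (22, 30), (11, 15)]

# When the size is not divisible by 32, pooling rounds down, so doubling the
# pooled size no longer gives back the encoder size (22 * 2 = 44, not 45).
print(pooled_sizes(360, 480)[3:])
# [(45, 60), (22, 30), (11, 15)]
```

This is presumably why the code records `encoderN.size()` and passes it as `output_size` to each unpooling layer: the decoder then restores exactly the encoder's size even when a dimension was odd before pooling.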

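In the experiment, the SegNet model (with the pretrained VGG16 as its feature extractor) was trained on the CamVid dataset for 50 epochs. The loss reached its minimum at epoch 43.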
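The original training script is not shown, so the following is only a sketch of what one such training loop might look like. Everything specific here is an assumption: a single 1x1 convolution stands in for the SegNet model, random tensors stand in for CamVid batches, and the loss, optimizer, learning rate, and epoch count are chosen for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_classes = 12  # matches the 12 output channels of decoder0 above

# Hypothetical stand-in for the segmentation network: a single 1x1 convolution.
model = nn.Conv2d(3, num_classes, kernel_size=1)
criterion = nn.CrossEntropyLoss()  # applies softmax internally, averaged over pixels
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Random stand-ins for a batch of images and per-pixel class labels.
images = torch.randn(2, 3, 32, 48)
labels = torch.randint(0, num_classes, (2, 32, 48))

losses = []
for epoch in range(5):
    optimizer.zero_grad()
    logits = model(images)            # (N, 12, H, W)
    loss = criterion(logits, labels)  # labels are (N, H, W) class indices
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Epoch with the lowest loss, analogous to picking the best checkpoint.
best_epoch = min(range(len(losses)), key=losses.__getitem__)

# Per-pixel prediction: the class with the highest score.
prediction = model(images).argmax(dim=1)  # (N, H, W)
print(best_epoch, prediction.shape)
```

Tracking the per-epoch loss in a list makes it easy to identify the epoch at which it was lowest and to keep the corresponding weights.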
The test results are shown in the figure:
