Paper notes: "From Big to Small: Multi-Scale Local Planar Guidance for Monocular Depth Estimation"

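Reference code: bts

1. Overview

Summary: Estimating depth from a single 2D image is an ill-posed problem with many valid solutions. To address this, the paper adds implicit constraints at multiple decoder stages, guiding the decoder toward features suited to depth estimation and thereby improving the final depth prediction. The implicit constraint is the LPG (local planar guidance) layer. The layer and its supporting operations are fairly complex to implement, which makes direct deployment difficult, but the network is a good choice as a teacher for guiding a student network.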

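Extracting the outputs of the LPG layers gives the results in the second row of the figure below:
[Figure omitted]
At $stride=8$ the output captures the overall structure of the whole image; as the stride decreases, progressively more detail appears. These intermediate outputs are not supervised explicitly: supervision is applied only to the depth obtained by combining the LPG outputs of the different strides. LPG can therefore be viewed as an implicit constraint on the components of the depth at different strides.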

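2. Method

2.1 Network architecture

The depth estimation network is a U-shaped (encoder-decoder) network, shown in the figure below. The main contribution is the decoder, detailed on the right of the figure.
[Figure omitted]
After the backbone outputs its features, the paper uses shortcut + upsample to obtain the $stride=8$ feature map, then applies an ASPP module to enlarge the receptive field, with dilation rates $r=[3, 6, 12, 18, 24]$. The resulting features then pass through the LPG layer for each stride, and the LPG outputs are combined to produce the final depth.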

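2.2 LPG layer

The structure of the LPG layer proposed in the paper is shown in the figure below:
[Figure omitted]
The input is the $stride=k$ feature map (spatial size $\frac{H}{k}$). A reduction operation encodes it into a 3-channel feature map (the input part of the figure above). From these channels the parameters of the local plane are built: the two angles $(\theta,\phi)$ described in the paper plus a plane distance (the middle part of the figure above). See: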

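    # pytorch/bts.py#L110
    import math
    import torch

    def forward(self, net):
        # Reduce the input features to 3 channels: (theta, phi, dist).
        net = self.reduc.forward(net)
        if not self.is_final:
            # Polar angle in (0, pi/3), azimuth in (0, 2*pi), distance in (0, max_depth).
            theta = self.sigmoid(net[:, 0, :, :]) * math.pi / 3
            phi = self.sigmoid(net[:, 1, :, :]) * math.pi * 2
            dist = self.sigmoid(net[:, 2, :, :]) * self.max_depth
            # Unit normal (n1, n2, n3) from spherical coordinates, plus distance n4.
            n1 = torch.mul(torch.sin(theta), torch.cos(phi)).unsqueeze(1)
            n2 = torch.mul(torch.sin(theta), torch.sin(phi)).unsqueeze(1)
            n3 = torch.cos(theta).unsqueeze(1)
            n4 = dist.unsqueeze(1)
            # Stack the four plane coefficients along the channel dimension.
            net = torch.cat([n1, n2, n3, n4], dim=1)

        return net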

This implements Eq. 2 of the paper:
$n_1=\sin(\theta)\cos(\phi),\ n_2=\sin(\theta)\sin(\phi),\ n_3=\cos(\theta)$
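As a quick check of Eq. 2, the following minimal sketch maps two raw values to angles and then to a normal vector, using only the Python standard library. The function name `plane_normal`, the scalar inputs, and the full-circle range for $\phi$ are assumptions made for illustration; the actual layer operates on tensors.

```python
import math

def plane_normal(theta_logit, phi_logit):
    """Map two raw values to a unit normal (n1, n2, n3) via Eq. 2."""
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    theta = sigmoid(theta_logit) * math.pi / 3  # polar angle in (0, pi/3)
    phi = sigmoid(phi_logit) * 2.0 * math.pi    # azimuth; full-circle range is an assumption
    n1 = math.sin(theta) * math.cos(phi)
    n2 = math.sin(theta) * math.sin(phi)
    n3 = math.cos(theta)
    return n1, n2, n3

# Arbitrary example inputs.
n1, n2, n3 = plane_normal(0.3, -1.2)
norm = math.sqrt(n1 * n1 + n2 * n2 + n3 * n3)
```

Because the normal is written in spherical coordinates it always has unit length, and limiting $\theta$ to below $\pi/3$ keeps $n_3$ above 0.5, so the plane never becomes parallel to the viewing direction.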
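The next step is to bring the local plane representation to the resolution of the original image. Because the feature map at the current stride is smaller than the original image, the plane coefficients are replicated with torch.repeat_interleave() and then evaluated at every position of the corresponding block (of size $k \times k$). See: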

    # pytorch/bts.py#L32
    def forward(self, plane_eq, focal):
        # Replicate each plane's coefficients over its upratio x upratio patch.
        # (focal is not used in this snippet.)
        plane_eq_expanded = torch.repeat_interleave(plane_eq, int(self.upratio), 2)
        plane_eq_expanded = torch.repeat_interleave(plane_eq_expanded, int(self.upratio), 3)
        n1 = plane_eq_expanded[:, 0, :, :]
        n2 = plane_eq_expanded[:, 1, :, :]
        n3 = plane_eq_expanded[:, 2, :, :]
        n4 = plane_eq_expanded[:, 3, :, :]

        # Horizontal offset of each pixel within its patch, centered and normalized.
        u = self.u.repeat(plane_eq.size(0), plane_eq.size(2) * int(self.upratio), plane_eq.size(3)).cuda()
        u = (u - (self.upratio - 1) * 0.5) / self.upratio

        # Vertical offset of each pixel within its patch, centered and normalized.
        v = self.v.repeat(plane_eq.size(0), plane_eq.size(2), plane_eq.size(3) * int(self.upratio)).cuda()
        v = (v - (self.upratio - 1) * 0.5) / self.upratio

        # Eq. 1: depth cue from the local plane at each pixel.
        return n4 / (n1 * u + n2 * v + n3)

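This corresponds to Eq. 1 of the paper, where $(u_i, v_i)$ are the normalized coordinates of pixel $i$ within its $k \times k$ patch:
$\bar{c}_i=\frac{n_4}{n_1u_i+n_2v_i+n_3}$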
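To make Eq. 1 concrete, here is a minimal pure-Python sketch that evaluates one plane over a single $k \times k$ patch, using the same centered, normalized offsets as the code above. The function name `expand_plane` and the example coefficients are hypothetical; the real layer does this for every plane at once on tensors.

```python
import math

def expand_plane(n1, n2, n3, n4, k):
    """Evaluate c = n4 / (n1*u + n2*v + n3) at every pixel of a k x k patch."""
    patch = []
    for row in range(k):
        v = (row - (k - 1) * 0.5) / k  # normalized vertical offset from the patch center
        out_row = []
        for col in range(k):
            u = (col - (k - 1) * 0.5) / k  # normalized horizontal offset
            out_row.append(n4 / (n1 * u + n2 * v + n3))
        patch.append(out_row)
    return patch

# A fronto-parallel plane (normal along the viewing axis) gives constant depth.
flat = expand_plane(0.0, 0.0, 1.0, 5.0, k=4)

# A plane tilted around the vertical axis gives depth that varies across columns.
tilted = expand_plane(0.2, 0.0, math.sqrt(1.0 - 0.2 ** 2), 5.0, k=4)
```

With $n_1=n_2=0$ every pixel of the patch receives the same value $n_4/n_3$; any tilt in the normal produces a smooth gradient inside the patch, which is how a coarse stride can still express sub-patch variation.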
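This operation is applied at $stride=[1,2,4,8]$, giving a representation of the depth map at each stage. These representations are concatenated and passed through convolution layers to produce the final depth map, which the paper describes as:
$\bar{d}=f(W_1\bar{c}^{1\times 1}+W_2\bar{c}^{2\times 2}+W_3\bar{c}^{4\times 4}+W_4\bar{c}^{8\times 8})$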

Ablation study of the modules described above:
[Figure omitted]

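2.3 Loss function

The loss is the scale-invariant error computed in log space:
$D(g)=\frac{1}{T}\sum_ig_i^2-\lambda(\frac{1}{T}\sum_ig_i)^2$
where $g_i=\log\bar{d}_i-\log d_i$ is the log-depth error at pixel $i$ between the prediction $\bar{d}_i$ and the ground truth $d_i$, $T$ is the number of pixels with valid ground truth, and $\lambda$ controls how strongly the loss focuses on the variance of the depth prediction error. The training loss is:
$L=\alpha\sqrt{D(g)}$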
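The loss above can be sketched in a few lines of plain Python. This is an illustrative sketch on flat lists, not the reference implementation: the function name `silog_loss`, the default values `lam=0.85` and `alpha=10.0`, and the rule of skipping pixels with non-positive ground truth are assumptions of this example.

```python
import math

def silog_loss(pred, gt, lam=0.85, alpha=10.0):
    """Compute L = alpha * sqrt(D(g)) with g_i = log(pred_i) - log(gt_i)."""
    # Keep only pixels with valid (positive) ground truth; this rule is an assumption.
    g = [math.log(p) - math.log(t) for p, t in zip(pred, gt) if t > 0]
    T = len(g)
    mean_sq = sum(x * x for x in g) / T
    mean = sum(g) / T
    d = mean_sq - lam * mean * mean
    # Guard against tiny negative values caused by floating-point rounding.
    return alpha * math.sqrt(max(d, 0.0))

gt = [1.0, 2.0, 4.0]
perfect = silog_loss(gt, gt)                     # identical prediction
scaled = [2.0 * t for t in gt]                   # prediction off by a global factor of 2
loss_default = silog_loss(scaled, gt)            # lam = 0.85 still penalizes the scale error
loss_invariant = silog_loss(scaled, gt, lam=1.0) # lam = 1 ignores a global scale factor
```

With $\lambda=1$ the term $D(g)$ is exactly the variance of $g$, so a prediction that is wrong only by a constant factor has near-zero loss; with $\lambda<1$ part of the mean error is also penalized.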
3. Experimental results
