[paper] Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture
Paper:
Authors:
David Eigen: Department of Computer Science, Courant Institute, New York University
Rob Fergus: Facebook AI Research
Abstract:
A single multi-scale convolutional network architecture handles depth estimation, surface normal estimation, and semantic segmentation.
The method refines predictions through a sequence of scales, capturing fine detail directly from the image without resorting to superpixel segmentation.
1. Introduction
In this paper, three tasks (depth estimation, surface normal estimation, and semantic segmentation) are addressed by a single unified framework.
Several advantages of a general pixel-map prediction model:
First, new applications can be developed quickly, with most of the work lying in defining an appropriate training set and loss function.
Second, a single architecture simplifies the implementation of systems that need multiple output modalities.
Third, a significant portion of the computation can be shared among modalities, which makes the system more efficient.
2. Related Work
Typically, prior systems use ConvNets either to detect localized features or to produce descriptor representations for discrete proposal regions; by contrast, this network uses both local and global views of the image to directly generate several kinds of pixel-level outputs.
3. Model Architecture


Scale 1: Full-Image View
The first scale predicts a coarse, spatially varying set of features over the whole image using a full-image field of view; this global view is obtained through two fully connected layers at the top of the stack.
Scale 2: Predictions
The second scale produces predictions at a mid-level resolution, combining a more detailed but narrower view of the image with the full-image information from the coarse scale, whose output feature map is fed in as additional channels.
Scale 3: Higher Resolution
The third scale refines the predictions to a higher resolution, again aligning finer-stride image features with the previous scale's output.
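A minimal sketch of the three-scale structure in PyTorch. The channel counts, kernel sizes, and the use of bilinear upsampling are illustrative assumptions rather than the paper's exact configuration, and the class name MultiScaleNet and all layer names are hypothetical. The structural point it shows is that each later scale convolves the image again at a finer stride and receives the previous scale's output as extra input channels.

    # Sketch of a three-scale prediction network (sizes are illustrative, not the paper's).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiScaleNet(nn.Module):
        def __init__(self, out_channels=1):
            super().__init__()
            # Scale 1: coarse, full-image view ending in two fully connected layers.
            self.s1_conv = nn.Sequential(
                nn.Conv2d(3, 64, 5, stride=4, padding=2), nn.ReLU(),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d((8, 10)),
            )
            self.s1_fc = nn.Sequential(
                nn.Flatten(),
                nn.Linear(128 * 8 * 10, 1024), nn.ReLU(),
                nn.Linear(1024, 64 * 15 * 20),      # reshaped to a 64-channel coarse map
            )
            # Scale 2: mid-resolution prediction from image features + Scale 1 map.
            self.s2_img = nn.Conv2d(3, 96, 9, stride=2, padding=4)
            self.s2_pred = nn.Sequential(
                nn.Conv2d(96 + 64, 64, 5, padding=2), nn.ReLU(),
                nn.Conv2d(64, out_channels, 5, padding=2),
            )
            # Scale 3: higher-resolution refinement from image features + Scale 2 output.
            self.s3_img = nn.Conv2d(3, 64, 9, stride=1, padding=4)
            self.s3_pred = nn.Sequential(
                nn.Conv2d(64 + out_channels, 64, 5, padding=2), nn.ReLU(),
                nn.Conv2d(64, out_channels, 5, padding=2),
            )

        def forward(self, x):
            coarse = self.s1_fc(self.s1_conv(x)).view(-1, 64, 15, 20)
            f2 = F.relu(self.s2_img(x))
            coarse_up = F.interpolate(coarse, size=f2.shape[2:], mode="bilinear",
                                      align_corners=False)
            mid = self.s2_pred(torch.cat([f2, coarse_up], dim=1))
            f3 = F.relu(self.s3_img(x))
            mid_up = F.interpolate(mid, size=f3.shape[2:], mode="bilinear",
                                   align_corners=False)
            fine = self.s3_pred(torch.cat([f3, mid_up], dim=1))
            return coarse, mid, fine

In the paper, the Scale 1 convolutional stack is initialized from an ImageNet-pretrained network; that detail is omitted here for brevity.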
4. Tasks
We apply the same architecture to each of the three tasks examined: depth estimation, surface normal estimation, and semantic segmentation. Each task is defined only by its target data and loss function.
4.1 Depth
LOSS:
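The depth loss from the paper is a scale-invariant error on log depths plus a first-order gradient matching term. Writing d_i = log y_i - log y_i^* for the difference between predicted and ground-truth log depth at pixel i, and n for the number of valid pixels:

    L_{depth}(y, y^*) = \frac{1}{n}\sum_i d_i^2 \;-\; \frac{1}{2n^2}\Big(\sum_i d_i\Big)^2 \;+\; \frac{1}{n}\sum_i \big[(\nabla_x d_i)^2 + (\nabla_y d_i)^2\big]

The first two terms make the error partly invariant to the overall scale of the scene; the gradient term encourages the local structure of the prediction to match the ground truth.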

4.2 Surface Normals
To predict surface normal vectors, the output is changed from a single channel to three channels, giving the x, y, and z components of the normal at every pixel.
LOSS:
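The normals loss compares predicted and ground-truth normals with a dot product, i.e. the negated cosine of the angle between them, averaged over valid pixels:

    L_{normals}(N, N^*) = -\frac{1}{n}\sum_i n_i \cdot n_i^*

where n_i and n_i^* are the predicted and ground-truth normal vectors at pixel i, each normalized to unit length.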

4.3 Semantic Labels
In the context of semantic segmentation, we employ a per-pixel softmax classifier to estimate the class label for each individual pixel.
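The loss for this task is the per-pixel cross-entropy over the softmax output. With C_{i,c} = e^{z_{i,c}} / \sum_{c'} e^{z_{i,c'}} the softmax probability of class c at pixel i and C^* the one-hot ground-truth labels:

    L_{semantic}(C, C^*) = -\frac{1}{n}\sum_i C_i^* \cdot \log C_i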


5. Training
5.1 Training Procedure
We train the model in two phases using stochastic gradient descent (SGD): in the first phase, Scales 1 and 2 are trained jointly; in the second phase, their parameters are fixed and Scale 3 is trained.
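A sketch of the two-phase schedule, reusing the hypothetical MultiScaleNet from the architecture sketch above; the learning rate and momentum values are placeholders:

    # Two-phase training: Scales 1 and 2 jointly, then Scale 3 with the rest frozen.
    import torch

    model = MultiScaleNet(out_channels=1)  # hypothetical model from the sketch above

    # Phase 1: optimize only the Scale 1 and Scale 2 parameters.
    phase1_params = (list(model.s1_conv.parameters()) + list(model.s1_fc.parameters())
                     + list(model.s2_img.parameters()) + list(model.s2_pred.parameters()))
    opt1 = torch.optim.SGD(phase1_params, lr=0.001, momentum=0.9)

    # Phase 2: freeze Scales 1 and 2, then train Scale 3 alone.
    for p in phase1_params:
        p.requires_grad_(False)
    phase2_params = list(model.s3_img.parameters()) + list(model.s3_pred.parameters())
    opt2 = torch.optim.SGD(phase2_params, lr=0.001, momentum=0.9)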
5.2 Data Augmentation
The system applies random scaling, in-plane rotations, translations, color adjustments, flipping, and contrast enhancements.
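A minimal augmentation sketch for an (image, depth) training pair. The transform set and ranges are illustrative and the function name augment is hypothetical; the one paper-specific detail reproduced here is that when the image is scaled by a factor s, the depth values are divided by s so the implied geometry stays consistent.

    # Illustrative augmentation for an (image, depth) pair; ranges are not the paper's.
    import random
    import torch
    import torch.nn.functional as F

    def augment(img, depth, scale_range=(1.0, 1.5)):
        # img: (3, H, W), depth: (1, H, W)
        s = random.uniform(*scale_range)
        H, W = img.shape[1:]
        img = F.interpolate(img[None], scale_factor=s, mode="bilinear",
                            align_corners=False)[0]
        depth = F.interpolate(depth[None], scale_factor=s, mode="nearest")[0] / s
        # Random horizontal flip.
        if random.random() < 0.5:
            img, depth = img.flip(-1), depth.flip(-1)
        # Random crop back to the original size.
        top = random.randint(0, img.shape[1] - H)
        left = random.randint(0, img.shape[2] - W)
        return img[:, top:top+H, left:left+W], depth[:, top:top+H, left:left+W]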
5.3 Combining Depth and Normals
Depth and normals are predicted jointly by a single network that shares much of its computation between the two tasks, with task-specific outputs; the two losses are simply combined during training. This sharing is what yields the efficiency gain mentioned in the introduction.
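A minimal sketch of such a combined objective, assuming the depth and normals losses defined above and an (assumed) equal weighting of the two terms; the function names are hypothetical:

    # Joint depth + normals objective; equal weighting of the two terms is an assumption.
    import torch
    import torch.nn.functional as F

    def depth_loss(pred_log_d, gt_log_d):
        # Scale-invariant log-depth error plus first-order gradient matching.
        d = pred_log_d - gt_log_d                      # (B, 1, H, W) log-depth difference
        n = d.numel()
        gx = d[..., :, 1:] - d[..., :, :-1]            # horizontal finite differences
        gy = d[..., 1:, :] - d[..., :-1, :]            # vertical finite differences
        return (d.pow(2).mean()
                - 0.5 * d.sum().pow(2) / (n * n)
                + gx.pow(2).mean() + gy.pow(2).mean())

    def normals_loss(pred_n, gt_n):
        # Negative mean dot product between unit normals (gt_n assumed unit length).
        pred_n = F.normalize(pred_n, dim=1)            # (B, 3, H, W) -> unit vectors
        return -(pred_n * gt_n).sum(dim=1).mean()

    def joint_loss(pred_log_d, gt_log_d, pred_n, gt_n):
        return depth_loss(pred_log_d, gt_log_d) + normals_loss(pred_n, gt_n)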
6. Performance Experiments
6.1 Depth


6.2 Surface Normals


6.3 Semantic Labels
6.3.1 NYU Depth


6.3.2 SIFT Flow
SIFT Flow: scene parsing on a dataset built from dense correspondences across different scenes.
Because the class distribution is heavily skewed, we weight each pixel by α_c = median_freq / freq(c), where freq(c) is the pixel frequency of class c (counted over the images in which c appears) and median_freq is the median of these frequencies; results are reported both with and without this reweighting.
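A sketch of how these per-class weights could be computed; the function name median_freq_weights and the data layout (a list of integer label maps, with negative values treated as unlabeled) are assumptions:

    # Median-frequency class weights: median_freq / freq(c) for each class c.
    import numpy as np

    def median_freq_weights(labels, num_classes):
        pixel_count = np.zeros(num_classes)    # pixels of class c over all images
        image_pixels = np.zeros(num_classes)   # total pixels of images containing c
        for lab in labels:
            for c in np.unique(lab):
                if c < 0 or c >= num_classes:
                    continue                   # skip unlabeled / out-of-range entries
                pixel_count[c] += np.sum(lab == c)
                image_pixels[c] += lab.size
        freq = pixel_count / np.maximum(image_pixels, 1)
        median = np.median(freq[freq > 0])
        return np.where(freq > 0, median / np.maximum(freq, 1e-12), 0.0)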


6.3.3 Pascal VOC


7. Probe Experiments
7.1 Contributions of Scales

7.2 Effect of Depth and Normals Inputs
How important are the depth and normals channels, relative to RGB, for the semantic labeling task?
What happens if the network's own predicted depth and normals are substituted for the ground-truth inputs?

