
[paper] 00035 Synthetic Depth-of-Field with a Single-Camera Mobile Phone


Google Research

The paper's title highlights two points:

1. Synthetic Depth-of-Field

2. Single-Camera Mobile Phone

Key Words:

A shallow depth of field blurs both the foreground and the background, leaving only a narrow region of the image in sharp focus.

dual-pixel:

Dual-pixel technology effectively splits each pixel into two distinct imaging regions: each pixel contains two photodiodes placed side by side beneath a single microlens.

alpha matte:

A matte is a layer (or any of its channels) that specifies the transparent areas of that layer or another layer.
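
The matte drives standard alpha compositing. A minimal NumPy sketch (the function name is illustrative, not from the paper):

```python
import numpy as np

def composite(foreground, background, alpha):
    """Alpha-compositing: C = alpha * F + (1 - alpha) * B, per pixel.

    `alpha` is a single-channel matte in [0, 1]: 1 keeps the foreground,
    0 keeps the background, and fractional values blend the two.
    """
    a = alpha[..., None]  # broadcast the matte over the color channels
    return a * foreground + (1.0 - a) * background
```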

Bayer plane:

One of the single-color planes (R, Gr, Gb, or B) that make up the Bayer mosaic of a raw sensor image.

Background:

A cell phone camera essentially produces all-in-focus images.

AIM:

The system is introduced as an innovative tool for generating images with shallow depth-of-field effects. It is capable of computationally generating these images using both an affordable, portable mobile camera and an intuitive, user-friendly interface that requires only one button to operate.

Result:

Our system is capable of processing a 5.4-megapixel image within just four seconds on a mobile device. It operates entirely without user intervention and offers exceptional reliability, making it accessible to users without specialized knowledge.

1. Introduction

Some methods: a) two cameras

b) time-of-flight or structured-light direct depth sensor

c) Lens Blur

Our method:

Our system combines two distinct technologies in such a way that it can still operate when only one of them is available. A neural network is used to segment people and the objects they hold. In addition, on devices equipped with dual-pixel (DP) auto-focus hardware, we use the sensor to capture a tiny light field with a baseline of roughly 1 millimeter.

The first:

The second:

This shutter technology is designed for imaging in high-dynamic-range and low-light conditions, and is widely used in mobile camera devices.

Robert Anderson, David Gallup, Jonathan T. Barron, Janne Kontkanen, Noah Snavely, Carlos Hernández, Sameer Agarwal, and Steven M. Seitz. 2016. Jump: Virtual reality video. SIGGRAPH Asia 2016.

Jonathan T. Barron, Andrew Adams, Yi-Chang Shih, and Carlos Hernández. 2015. Fast bilateral-space stereo for synthetic defocus. CVPR 2015.

Johannes Kopf, Michael F. Cohen, Dani Lischinski, and Matt Uyttendaele. 2007. Joint bilateral upsampling. SIGGRAPH 2007.

We present a calibration procedure.

summary:

Our rendering technique splits the scene into several layers by depth, turns each pixel into a translucent disk whose size depends on its depth, and then composites the layers with weights derived from their depth values.
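
The layer blending step can be sketched with the standard back-to-front "over" operator (a simplified stand-in for the paper's depth-weighted scheme; names are illustrative):

```python
import numpy as np

def composite_layers(layers):
    """Back-to-front 'over' compositing of depth layers.

    `layers` is a list of (color, alpha) pairs ordered far to near;
    color is (H, W, 3), alpha is (H, W). Each (already blurred) layer
    is blended over the result accumulated so far.
    """
    color0, alpha0 = layers[0]
    out = color0 * alpha0[..., None]
    for color, alpha in layers[1:]:
        a = alpha[..., None]
        out = color * a + out * (1.0 - a)
    return out
```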

Another problem:

A typical mobile camera's wide-angle field of view is poorly suited to portraits: it forces photographers to stand close to their subjects, which distorts the perspective of faces.

2. Related work

--- Carlos Hernández. 2014. Lens Blur in the new Google Camera app. Google Research Blog: http://research.googleblog.com/2014/04/lens-blur-in-new-google-camera-app.html

---etc.

3. Person segmentation

Our contributions consist of: (a) employing training and data collection methods to develop a fast and precise segmentation model that can be deployed on a mobile device, and (b) implementing edge-aware filtering to enhance the upsampling of masks generated by neural networks.

3.1 Data Collection

Selecting a comprehensive range of poses, discarding images deemed inadequate for training, correcting inaccurate polygon masks, and so on.

Over a nine-month period, each time we improved the training data we saw a corresponding increase in the quality of our defocused portraits.

3.2 Training

The network accepts a four-channel image with dimensions of 256 × 256 pixels. Three channels correspond to an RGB image that has been resized and padded while maintaining its aspect ratio. The fourth channel represents the face's location through a posterior map of an isotropic Gaussian distribution centered around the face detection box, which has a standard deviation of 21 pixels and is normalized to one at its mean position.
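
The fourth channel can be reproduced directly from this description; a sketch assuming a pixel-coordinate face box (function name is illustrative):

```python
import numpy as np

def face_prior(h, w, face_box, sigma=21.0):
    """Fourth input channel: an isotropic Gaussian centered on the face
    detection box, with sigma = 21 px and value 1 at its mean.

    face_box = (x0, y0, x1, y1) in pixel coordinates.
    """
    cx = (face_box[0] + face_box[2]) / 2.0
    cy = (face_box[1] + face_box[3]) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))
```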

3.3 Inference

At inference time, we are given a color image and the facial landmarks produced by a face detector. For each detected face, the model predicts a segmentation mask covering the corresponding person.

3.4 Edge-Aware Filtering of a Segmentation Mask

Previous work has observed that mask boundaries frequently align with image edges, so we adopt a similar strategy: an edge-aware filtering step that increases the resolution of the low-resolution mask M(x) predicted by the neural network.
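
A brute-force sketch of such edge-aware upsampling, in the spirit of joint bilateral upsampling [Kopf et al. 2007] (this simplified, unoptimized version is illustrative, not the paper's exact filter):

```python
import numpy as np

def joint_bilateral_upsample(mask_lo, guide_hi, sigma_s=1.0, sigma_r=0.1, r=2):
    """Upsample a low-res mask so its boundaries follow edges in a
    high-res grayscale guide image (values in [0, 1]).

    Each high-res output pixel is a weighted average of nearby low-res
    mask samples; the weights combine spatial closeness (sigma_s, in
    low-res pixels) with guide-image similarity (sigma_r).
    """
    H, W = guide_hi.shape
    h, w = mask_lo.shape
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            ly, lx = y * h / H, x * w / W  # position in low-res coords
            num = den = 0.0
            for j in range(int(ly) - r, int(ly) + r + 1):
                for i in range(int(lx) - r, int(lx) + r + 1):
                    if not (0 <= j < h and 0 <= i < w):
                        continue
                    # guide value at this sample's high-res position
                    gy = min(int(j * H / h), H - 1)
                    gx = min(int(i * W / w), W - 1)
                    ws = np.exp(-((ly - j) ** 2 + (lx - i) ** 2)
                                / (2 * sigma_s ** 2))
                    wr = np.exp(-(guide_hi[y, x] - guide_hi[gy, gx]) ** 2
                                / (2 * sigma_r ** 2))
                    num += ws * wr * mask_lo[j, i]
                    den += ws * wr
            out[y, x] = num / den
    return out
```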

3.5 Accuracy and Efficiency

The model's cost, measured with the TensorFlow Model Benchmark Tool, is 3.07 GFLOPs, significantly lower than the 607 GFLOPs of PortraitFCN+ and the 3160 GFLOPs of Mask R-CNN.

4. Depth from dual-pixel camera

Dual-pixel (DP) auto-focus hardware works by splitting each pixel in half: the left half of each pixel integrates light over the right half of the aperture, and vice versa.

This system is typically employed for autofocus, where it is often referred to as phase-detection auto-focus.

Some techniques can compute depth but require more than two views.

Example:

--- Edward H. Adelson and John Y. A. Wang. 1992. Single lens stereo with a plenoptic camera. TPAMI (1992).

We build upon the stereo work of Barron et al.

--- Jonathan T. Barron, Andrew Adams, Yi-Chang Shih, and Carlos Hernández. 2015. Fast bilateral-space stereo for synthetic defocus. CVPR 2015.

We therefore build upon the stereo work of Barron et al. [2015] and the edge-aware optimization of Anderson et al. [2016] to create a stereo algorithm that is tractable at high resolutions and well suited to defocus: its output closely follows the edges of the input image.

4.1 Computing Disparity

For denoising, we maintain a circular buffer of the last nine frames of raw and DP data captured by the camera.

We compute disparity by taking each non-overlapping 8×8 tile in the first view and searching over a range of −3 to +3 pixels in the corresponding location of the second view, at DP resolution.
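
The tile matching step can be sketched as a brute-force SSD search (illustrative single-channel version, not the paper's optimized implementation):

```python
import numpy as np

def tile_disparity(left, right, tile=8, search=3):
    """For each non-overlapping `tile` x `tile` block of `left`, find the
    horizontal shift in [-search, search] that minimizes the sum of
    squared differences (SSD) against `right`.
    """
    h, w = left.shape
    disp = np.zeros((h // tile, w // tile), dtype=int)
    for ty in range(h // tile):
        for tx in range(w // tile):
            y0, x0 = ty * tile, tx * tile
            patch = left[y0:y0 + tile, x0:x0 + tile]
            best, best_d = np.inf, 0
            for d in range(-search, search + 1):
                xs = x0 + d
                if xs < 0 or xs + tile > w:
                    continue  # shifted window falls outside the image
                ssd = np.sum((patch - right[y0:y0 + tile, xs:xs + tile]) ** 2)
                if ssd < best:
                    best, best_d = ssd, d
            disp[ty, tx] = best_d
    return disp
```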

Several heuristics gauge confidence: the magnitude of the SSD loss, the horizontal gradient energy within each tile, the presence of a close secondary minimum, and the agreement of the disparity with adjacent tiles.

4.2 Imaging Model and Calibration

The equation has two important consequences. First, disparity depends on the focus distance z and is zero when the depth D equals it (D = z). Second, disparity is an affine function of inverse depth, with coefficients that are constant across the image.
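
In code, such an affine-in-inverse-depth model looks like this (the scale `beta` is a hypothetical stand-in for the coefficients the paper's calibration recovers):

```python
def dp_disparity(depth, focus_z, beta=30.0):
    """Affine model in inverse depth: d(D) = beta * (1/D - 1/z).

    Disparity is exactly zero at the focus distance z; `beta` bundles
    the baseline and optics into one illustrative scale factor.
    """
    return beta * (1.0 / depth - 1.0 / focus_z)
```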

4.3 Combining Disparity and Segmentation

4.4 Edge-Aware Filtering of Disparity

We use the bilateral solver [Barron and Poole 2016] to turn the noisy disparities into a smooth, edge-aware disparity map that is well suited to shallow depth-of-field rendering.

5 RENDERING

5.1 Precomputing the blur parameters

5.2 Applying the blur

One obvious solution is to simply reexpress the scatter as a gather.
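
For a spatially constant blur radius, the scatter and gather formulations coincide; a brute-force sketch of the gather form (illustrative, grayscale only):

```python
import numpy as np

def disk_blur_gather(image, radius):
    """Constant-radius disk blur expressed as a gather: each output
    pixel averages every input pixel whose blur disk covers it. With a
    spatially constant radius this equals scattering each input pixel
    as a uniform disk.
    """
    h, w = image.shape
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            acc = cnt = 0.0
            for j in range(max(0, y - radius), min(h, y + radius + 1)):
                for i in range(max(0, x - radius), min(w, x + radius + 1)):
                    if (j - y) ** 2 + (i - x) ** 2 <= radius ** 2:
                        acc += image[j, i]
                        cnt += 1.0
            out[y, x] = acc / cnt
    return out
```

With a depth-varying radius the two formulations differ, which is why the paper's renderer handles occlusion and layer ordering explicitly.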

5.3 Producing the final image

Final image with synthetic noise

6 RESULTS

Three pipelines:

1. DP + Segmentation

2. DP only

3. Segmentation only

7 DISCUSSION AND FUTURE WORK
