Neural scene representation and rendering
DeepMind!
S. M. Ali Eslami, Danilo Jimenez Rezende, Frederic Besse, Fabio Viola, Ari S. Morcos, Marta Garnelo, Avraham Ruderman, Andrei A. Rusu, Ivo Danihelka, Karol Gregor, David P. Reichert, Lars Buesing, Theophane Weber, Oriol Vinyals, Dan Rosenbaum, Neil Rabinowitz, Helen King, Chloe Hillier, Matt Botvinick, Daan Wierstra, Koray Kavukcuoglu, Demis Hassabis
Abstract
Scene representation: converting visual sensory data into concise descriptions
Neural networks excel at this task when provided with large labeled datasets, but labeling data is difficult and costly.
Solution: the Generative Query Network (GQN), with which machines learn to represent scenes using only their own sensors.
Input: images of a scene taken from different viewpoints
Output: the appearance of that scene from previously unobserved viewpoints
In light of David Marr's computational theory of vision, I personally think this is an effective route toward truly realizing human-like vision.
Introduction
Today's deep learning demands large amounts of labeled data, yet animals in nature and human infants learn without anywhere near this much supervision; instead, they learn representations of objects directly. We should therefore design intelligent systems that acquire and process data on their own. Based on this idea, the paper proposes the GQN.
GQN comprises two networks: a representation network f and a generation network g. The input is a set of scene images from different viewpoints \{(x_i, v_i)\}. f takes these observations as input and outputs a neural scene representation r that encodes the scene's content. g takes r and a query viewpoint as input and outputs the scene image from that arbitrarily specified viewpoint v^q (I find this quite interesting). The two networks are trained jointly, end to end; a minimal sketch follows below.
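To make the two-network structure concrete, here is a minimal PyTorch sketch of the forward pass. The layer shapes and choices are my own illustrative assumptions, not the paper's actual model (which uses a deeper "tower" representation network and a recurrent, latent-variable ConvDRAW-style generator). The summation of per-view representations and the 7-dim viewpoint parameterization (position plus sin/cos of yaw and pitch) do follow the paper.

```python
# Hypothetical GQN-style forward pass: shapes and layers are illustrative only.
import torch
import torch.nn as nn

class RepresentationNet(nn.Module):
    """f: encodes one (image x_i, viewpoint v_i) pair into a feature map."""
    def __init__(self, v_dim=7, r_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
        )
        # the viewpoint vector is broadcast and concatenated onto the feature map
        self.head = nn.Conv2d(128 + v_dim, r_dim, 3, padding=1)

    def forward(self, x, v):
        h = self.conv(x)                                        # (B, 128, 16, 16)
        v = v[:, :, None, None].expand(-1, -1, h.shape[2], h.shape[3])
        return self.head(torch.cat([h, v], dim=1))              # (B, r_dim, 16, 16)

class GeneratorNet(nn.Module):
    """g: renders an image from the scene representation r and query v^q."""
    def __init__(self, v_dim=7, r_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(r_dim + v_dim, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),       # pixels in [0, 1]
        )

    def forward(self, r, v_q):
        v_q = v_q[:, :, None, None].expand(-1, -1, r.shape[2], r.shape[3])
        return self.net(torch.cat([r, v_q], dim=1))

# Per-view representations are aggregated by summation (as in the paper),
# then the generator renders the scene from the query viewpoint v^q.
f, g = RepresentationNet(), GeneratorNet()
images = torch.rand(2, 3, 3, 64, 64)   # (batch, n_views, C, H, W)
views  = torch.rand(2, 3, 7)           # viewpoints v_i
r = sum(f(images[:, i], views[:, i]) for i in range(images.shape[1]))
v_q = torch.rand(2, 7)                 # query viewpoint v^q
x_q = g(r, v_q)                        # predicted image, shape (2, 3, 64, 64)
```

Because r is a simple sum over views, the representation is order-invariant and accepts any number of observations, which is what lets the model be queried after seeing only a few images of a scene.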

