Multi-View 3D Object Detection Network for Autonomous Driving
Motivation:
CC
CC
CC
CC
CC
CC
Network overview:
The network consists of two parts: a 3D proposal network that generates 3D object proposals, and a region-based fusion network that fuses multi-view features.

CC
CC
CC
CC
CC
CC
3D Point Cloud Representation
CC
CC
CC
CC
Bird’s Eye View Representation

CC
CC
CC
CC
CC
Front View Representation

We project the point cloud onto a cylinder plane to generate a dense front view map. Given a 3D point p = (x, y, z), its coordinates p_fv = (r, c) in the front view map can be computed using

CC
CC
CC
CC
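
To make the front view projection concrete, here is a minimal sketch. It assumes the usual cylindrical mapping c = floor(atan2(y, x) / Δθ) and r = floor(atan2(z, sqrt(x² + y²)) / Δφ), where Δθ and Δφ are the horizontal and vertical angular resolutions of the laser beams. The function name `front_view_coords` and its parameter names are illustrative, not taken from the paper.

```python
import math


def front_view_coords(x, y, z, d_theta, d_phi):
    """Map a 3D point to (row, col) on a cylindrical front view map.

    Assumed mapping (angles in radians):
        c = floor(atan2(y, x) / d_theta)
        r = floor(atan2(z, sqrt(x^2 + y^2)) / d_phi)
    d_theta, d_phi: horizontal / vertical angular resolution.
    """
    # Azimuth angle around the vertical axis gives the column index.
    c = math.floor(math.atan2(y, x) / d_theta)
    # Elevation angle above the ground plane gives the row index.
    r = math.floor(math.atan2(z, math.hypot(x, y)) / d_phi)
    return r, c


# A point straight ahead on the sensor's horizontal plane maps to the origin.
print(front_view_coords(1.0, 0.0, 0.0, 0.1, 0.1))
```

The indices can be negative (points to one side of or below the sensor), so a real pipeline would add an offset before writing into an image array.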
3D Proposal Network

CC
CC

CC
CC
CC
CC
Region-based Fusion Network
We design a region-based fusion network to effectively combine features from multiple views, and to jointly classify object proposals and perform oriented 3D box regression.

- Multi-View ROI Pooling
We apply ROI pooling to each view to obtain feature vectors of the same length. Given a 3D proposal p_3D, we obtain the ROI on each view using

CC
CC

[CC] This yields fixed-length feature vectors.
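
The two steps above (project a 3D proposal onto a view, then pool the ROI to a fixed size) can be sketched for the bird's eye view as follows. This is an illustration under stated assumptions, not the paper's implementation: the proposal is taken to be an axis-aligned box (x, y, z, l, w, h) with l along x and w along y, and the grid origin and 0.1 m resolution are assumed defaults. The front view and image ROIs would need the cylindrical projection and camera calibration, which are omitted here. `bev_roi` and `roi_max_pool` are hypothetical helper names.

```python
def bev_roi(proposal, x_min=0.0, y_min=-40.0, res=0.1):
    """Project an axis-aligned 3D box onto the bird's eye view grid.

    proposal: (x, y, z, l, w, h), centre and size in metres (assumed layout).
    Returns (x0, y0, x1, y1) in grid cells along the x and y axes.
    """
    x, y, _z, l, w, _h = proposal

    def to_cell(v, v_min):
        # round() avoids off-by-one errors from float division.
        return round((v - v_min) / res)

    return (to_cell(x - l / 2, x_min), to_cell(y - w / 2, y_min),
            to_cell(x + l / 2, x_min), to_cell(y + w / 2, y_min))


def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool an ROI (c0, r0, c1, r1), end-exclusive, to out_h x out_w.

    feature_map: 2D list indexed [row][col]. The ROI is assumed to lie inside
    the map and to be at least out_h x out_w cells.
    """
    c0, r0, c1, r1 = roi
    h, w = r1 - r0, c1 - c0
    pooled = []
    for i in range(out_h):
        rs = r0 + (i * h) // out_h
        re = max(r0 + ((i + 1) * h) // out_h, rs + 1)
        row = []
        for j in range(out_w):
            cs = c0 + (j * w) // out_w
            ce = max(c0 + ((j + 1) * w) // out_w, cs + 1)
            row.append(max(feature_map[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled


fm = [[r * 4 + c for c in range(4)] for r in range(4)]
# ROIs of different sizes pool to the same 2 x 2 output.
print(roi_max_pool(fm, (0, 0, 4, 4)))
print(roi_max_pool(fm, (1, 0, 4, 2)))
```

Because every ROI is pooled to the same out_h x out_w grid regardless of its size, the per-view features can be flattened into vectors of equal length before fusion.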
- Deep Fusion
We propose a deep fusion approach that fuses multi-modal, multi-view features hierarchically.

Consider a network with L layers, where {H_l, l = 1, ..., L} are feature transformation functions and ⊕ is a join operation. For deep fusion we use the element-wise mean as the join operation, because it is more flexible when combined with drop-path training.

CC
CC
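
As a rough illustration of deep fusion with an element-wise mean join, the sketch below assumes the recurrence f_0 = f_BV ⊕ f_FV ⊕ f_RGB and f_l = H_l^BV(f_{l-1}) ⊕ H_l^FV(f_{l-1}) ⊕ H_l^RGB(f_{l-1}). The per-view transformations here are toy stand-ins (plain scaling); in a real network they would be learned layers. `mean_join`, `deep_fusion`, and `scale` are hypothetical names.

```python
def mean_join(*features):
    """Element-wise mean of equal-length feature vectors (the join ⊕)."""
    n = len(features)
    return [sum(vals) / n for vals in zip(*features)]


def deep_fusion(f_bv, f_fv, f_rgb, layers):
    """Fuse three view features through a stack of per-view transformations.

    layers: list of (H_bv, H_fv, H_rgb) callables, one triple per layer.
    Each layer applies the three branches to the shared fused feature and
    joins the results again, so the views interact at every depth.
    """
    fused = mean_join(f_bv, f_fv, f_rgb)  # f_0
    for h_bv, h_fv, h_rgb in layers:
        fused = mean_join(h_bv(fused), h_fv(fused), h_rgb(fused))  # f_l
    return fused


def scale(k):
    """Toy stand-in for a learned layer: multiply every element by k."""
    return lambda f: [k * v for v in f]


layers = [(scale(1), scale(2), scale(3))] * 2
print(deep_fusion([1, 2], [3, 4], [5, 6], layers))
```

One reason a mean join pairs well with drop-path training: if a branch is dropped, the mean can be taken over the remaining branches without changing the feature dimension, which a concatenation join would not allow.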
- Oriented 3D Box Regression
CC
CC
CC
- Network Regularization
CC
CC
CC
Discussion

