
[Paper Notes] Pose Flow: Efficient Online Pose Tracking


Paper: https://arxiv.org/abs/1802.00977

GitHub implementation: https://github.com/MVIG-SJTU/AlphaPose/tree/master/PoseFlow


Abstract:

First, we design an online optimization framework to build the association of cross-frame poses and form pose flows (PF-Builder). Second, a novel pose flow non-maximum suppression (PF-NMS) is designed to robustly reduce redundant pose flows and re-link temporal disjoint ones. Extensive experiments show that our method significantly outperforms best reported results on two standard Pose Tracking datasets ([12] and [8]) by 13 mAP / 25 MOTA and 6 mAP / 3 MOTA respectively. Moreover, in the case of working on detected poses in individual frames, the extra computation of pose tracker is very minor, guaranteeing online 10FPS tracking. Our source codes are made publicly available.


1 Introduction

Like multi-person pose estimation, multi-person pose tracking methods fall into two main categories: top-down and bottom-up.

Top-down methods, also known as two steps scheme, first detect human proposals in every frame, estimate keypoints within each box independently, and then track human boxes over the entire video in terms of similarity between pairs of boxes in adjacent frames, and that is the reason why it is also referred to as Detect-and-Track method [8]. By contrast, bottom-up methods, also known as jointing scheme, first generate a set of joint detection candidates in every frame, construct the spatio-temporal graph, and then solve an integer linear program to partition this graph into sub-graphs that correspond to plausible human pose trajectories of each person.


In other words:

1. Top-down: in each frame, detect human proposals → estimate keypoints within each bounding box → treat each box as a node and compute the similarity between nodes in adjacent frames to track across the whole video (a bipartite matching problem);

2. Bottom-up: in each frame, generate joint detection candidates → build a spatio-temporal graph → solve an integer linear program to partition the graph into sub-graphs → each sub-graph corresponds to one person's pose trajectory.
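The cross-frame association in the top-down pipeline can be illustrated with a minimal sketch: greedily match boxes between adjacent frames by descending IoU. The box format, threshold, and greedy strategy here are illustrative assumptions, not the paper's actual similarity metric or optimization.

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def match_boxes(prev_boxes, curr_boxes, thresh=0.3):
    """Greedily match boxes across adjacent frames by descending IoU.

    Returns a dict mapping indices in prev_boxes to indices in curr_boxes.
    """
    pairs = sorted(
        ((iou(p, c), i, j)
         for i, p in enumerate(prev_boxes)
         for j, c in enumerate(curr_boxes)),
        reverse=True)
    used_p, used_c, matches = set(), set(), {}
    for s, i, j in pairs:
        # Pairs arrive in descending score order; skip weak or taken ones.
        if s < thresh or i in used_p or j in used_c:
            continue
        matches[i] = j
        used_p.add(i)
        used_c.add(j)
    return matches
```

Chaining such per-frame matches over the whole video yields the box trajectories that top-down trackers operate on.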

Currently, top-down methods substantially outperform bottom-up ones in both accuracy (mAP and MOTA) and tracking speed, because bottom-up methods exploit only second-order body-part dependencies and thus lose the global view of the pose, which leads directly to ambiguous keypoint assignment.

Due to frame degeneration (e.g. blurring caused by fast motion), truncation, or occlusion, pose estimation in an individual frame can be unreliable. To address this, detected instances are associated across frames so they can share temporal information and reduce uncertainty.

The method mainly exploits spatio-temporal information to build pose flows, which significantly improves the pose tracking task.

Our Proposed Approach

A pose flow is a set of poses indicating the same person instance across different frames.

Our framework includes two steps: Pose Flow Building and Pose Flow NMS. First, we build pose flow by maximizing overall confidence along the temporal sequence. Second, we reduce redundant pose flows and relink disjoint pose flows by Pose Flow NMS.


Figure 2: Overall Pipeline: 1) Pose Estimator. 2) Pose Flow Builder. 3) Pose Flow NMS. First, we estimate multi-person poses. Second, we build pose flows by maximizing overall confidence and purify them by Pose Flow NMS. Finally, reasonable multi-pose trajectories can be obtained.

3 Preliminary

3.1 Pose Metrics

Intra-Frame Pose Distance: measures the similarity between two poses within the same frame.

Inter-Frame Pose Distance: measures the similarity between two poses in adjacent frames.
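A minimal sketch of the idea behind both metrics, using a simplified distance (mean normalized Euclidean distance between corresponding keypoints). This is an illustrative assumption: the paper's actual definitions use soft-matching functions weighted by keypoint confidence scores.

```python
import math

def pose_distance(pose_a, pose_b, scale=1.0):
    """Mean normalized distance between corresponding keypoints.

    pose_a, pose_b: lists of (x, y) keypoints of equal length.
    scale: a normalization factor, e.g. the bounding-box diagonal,
    so that the distance is roughly invariant to person size.
    Higher value = less similar poses.
    """
    assert len(pose_a) == len(pose_b)
    total = sum(math.hypot(ax - bx, ay - by)
                for (ax, ay), (bx, by) in zip(pose_a, pose_b))
    return total / (len(pose_a) * scale)
```

The same function works for both metrics: within one frame it compares two candidate detections, and between adjacent frames it compares a tracked pose with a new detection.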

3.2 Multi-Person Pose Estimation

Keypoints are obtained with AlphaPose, which is Faster R-CNN plus an improved SPPE.

3.3 Pose Flow Building

Pose flows are built by associating poses that indicate the same person across frames; the cross-frame correspondences come from DeepMatching or ORB matching.
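The association step can be sketched as a greedy frame-by-frame linker. This is an illustrative simplification: the paper instead maximizes overall confidence over a temporal window, and `similarity` here stands for any inter-frame pose similarity (higher = more alike).

```python
def build_pose_flows(frames, similarity, thresh=0.5):
    """Link poses across frames into pose flows.

    frames: list of frames, each a list of poses.
    similarity: function (pose_a, pose_b) -> score.
    Returns a list of flows; each flow is a list of (frame_idx, pose_idx).
    """
    if not frames:
        return []
    flows = [[(0, i)] for i in range(len(frames[0]))]
    for t in range(len(frames) - 1):
        taken = set()
        for flow in flows:
            ft, fi = flow[-1]
            if ft != t:
                continue  # flow already terminated in an earlier frame
            best_j, best_s = None, thresh
            for j, cand in enumerate(frames[t + 1]):
                if j in taken:
                    continue
                s = similarity(frames[t][fi], cand)
                if s > best_s:
                    best_j, best_s = j, s
            if best_j is not None:
                flow.append((t + 1, best_j))
                taken.add(best_j)
        # Poses in frame t+1 that were not linked start new flows.
        for j in range(len(frames[t + 1])):
            if j not in taken:
                flows.append([(t + 1, j)])
    return flows
```

A flow that finds no sufficiently similar pose in the next frame simply stops growing, which is where PF-NMS later steps in to relink temporally disjoint flows of the same person.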

3.4 Pose Flow NMS

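A minimal sketch of the PF-NMS idea, assuming a generic `flow_similarity` function and threshold (both are illustrative; the paper's actual flow distance and merging scheme differ). The highest-confidence flow is repeatedly picked as the reference, and flows that overlap it strongly are suppressed as redundant.

```python
def pose_flow_nms(flows, confidences, flow_similarity, thresh=0.5):
    """Suppress redundant pose flows, NMS-style.

    flows: list of pose flows.
    confidences: one confidence score per flow.
    flow_similarity: function (flow_a, flow_b) -> score in [0, 1].
    Returns indices of the flows kept after suppression.
    """
    order = sorted(range(len(flows)),
                   key=lambda i: confidences[i], reverse=True)
    kept, suppressed = [], set()
    for i in order:
        if i in suppressed:
            continue
        kept.append(i)  # i is the current highest-confidence reference
        for j in order:
            if j == i or j in suppressed:
                continue
            if flow_similarity(flows[i], flows[j]) > thresh:
                suppressed.add(j)
    return kept
```

In the paper, suppressed flows are merged into the reference rather than discarded, which also relinks temporally disjoint fragments of the same person; the sketch above only shows the selection logic.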
