[Deep Learning Paper Notes][Human Pose Estimation] DeepPose: Human Pose Estimation via Deep Neural Networks
DeepPose: An Innovative System for Human Pose Estimation with Deep Neural Networks
Toshev, Alexander, and Christian Szegedy. "DeepPose: Human pose estimation via deep neural networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014. (Cited by 271)
1 Motivation
We describe a person's pose by the locations of K body joints and use a CNN-based regression model to predict the joint coordinates.
2 Method
2.1 Dataset
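The dataset is

D = \{(X^{(i)}, y^{(i)})\}_{i=1}^{n},

where X^{(i)} is the i-th image and

y^{(i)} = (y^{(i)}_1, ..., y^{(i)}_K)

is its pose vector. Each y^{(i)}_k \in R^2 contains the height and width coordinates of the k-th joint.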
2.2 Normalization
The joint coordinates are given in absolute image coordinates, so it helps to normalize them with respect to a box B that bounds the human body or a part of it. The box is defined by its center b_C, height B_H, and width B_W.
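The normalized pose vector is

N(y^{(i)}; B) = (N(y^{(i)}_1; B), ..., N(y^{(i)}_K; B)), where N(y^{(i)}_k; B) = diag(1/B_H, 1/B_W) (y^{(i)}_k - b_C).
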
The normalized image X^(i) is a crop of the image by the bounding box B.
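
The normalization can be illustrated with a small sketch. It assumes joints are stored as (height, width) pairs, matching the coordinate order used in Section 2.1, and that an image is a plain list of rows. The function names (normalize_joint, denormalize_joint, crop) and the clamping of the crop to the image bounds are choices made for this illustration, not details taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (height coordinate, width coordinate)


def normalize_joint(y: Point, center: Point, box_h: float, box_w: float) -> Point:
    """Translate a joint by the box center and scale by the box size."""
    return ((y[0] - center[0]) / box_h, (y[1] - center[1]) / box_w)


def denormalize_joint(y_norm: Point, center: Point, box_h: float, box_w: float) -> Point:
    """Inverse of normalize_joint: map back to absolute image coordinates."""
    return (y_norm[0] * box_h + center[0], y_norm[1] * box_w + center[1])


def normalize_pose(pose: Sequence[Point], center: Point,
                   box_h: float, box_w: float) -> List[Point]:
    """Normalize every joint of a pose with respect to the same box."""
    return [normalize_joint(y, center, box_h, box_w) for y in pose]


def crop(image: Sequence[Sequence[float]], center: Point,
         box_h: float, box_w: float) -> List[List[float]]:
    """Crop a row-major image to the box (assumption: clamp to image bounds)."""
    top = max(0, int(round(center[0] - box_h / 2)))
    left = max(0, int(round(center[1] - box_w / 2)))
    bottom = min(len(image), top + int(box_h))
    right = min(len(image[0]), left + int(box_w))
    return [list(row[left:right]) for row in image[top:bottom]]
```

With this convention, a joint at the box center maps to (0, 0), and denormalize_joint converts a prediction made in box coordinates back to image coordinates.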
2.3 Architecture
As shown in Figure 1, the architecture follows AlexNet's design. The input is a normalized image, and the output is the normalized pose vector, that is, 2K real values giving the normalized coordinates of the K joints.
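
To make the output format concrete, the sketch below splits a flat network output of length 2K into K joint pairs. The interleaved (height, width) layout and the function name unpack_pose are assumptions made for illustration. The notes do not specify how the values are ordered in the output layer.

```python
from typing import List, Sequence, Tuple


def unpack_pose(output: Sequence[float], num_joints: int) -> List[Tuple[float, float]]:
    """Split a flat regression output into K (height, width) pairs.

    Assumed layout: [h_1, w_1, h_2, w_2, ..., h_K, w_K].
    """
    if len(output) != 2 * num_joints:
        raise ValueError("expected 2 * num_joints values")
    return [(output[2 * k], output[2 * k + 1]) for k in range(num_joints)]
```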

2.4 Loss
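An l_2 loss is applied to each joint: the squared Euclidean distance between the predicted and ground-truth normalized coordinates, summed over the K joints.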
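
A minimal sketch of this per-joint loss for a single training example follows. It assumes poses are lists of (height, width) pairs in normalized coordinates and that the per-joint terms are summed rather than averaged. The function name l2_pose_loss is hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def l2_pose_loss(pred: Sequence[Point], target: Sequence[Point]) -> float:
    """Sum over joints of the squared Euclidean distance between
    predicted and ground-truth normalized joint coordinates."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of joints")
    return sum((p[0] - t[0]) ** 2 + (p[1] - t[1]) ** 2
               for p, t in zip(pred, target))
```

Because the loss is computed on normalized coordinates, an error of the same pixel size costs less for a person in a large box than for one in a small box.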
3 Cascade of Pose Regressors

4 Result
See Fig.

