[FGD] Focal and Global Knowledge Distillation for Detectors (CVPR 2022)

1. Motivation

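The authors point out that, in object detection, the teacher's and the student's features differ considerably across regions, for example between foreground and background.

In this paper, we point out that in object detection, the features of the teacher and student vary greatly in different areas, especially in the foreground and background.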

If every region is distilled in the same way, this uneven difference across the feature map makes the distillation less effective.

If we distill them equally, the uneven differences between feature maps will negatively affect the distillation.

The paper therefore proposes FGD, which is split into focal distillation and global distillation.

Thus, we propose Focal and Global Distillation (FGD).
Focal distillation separates the foreground and background, forcing the student to focus on the teacher’s critical pixels and channels.
Global distillation rebuilds the relation between different pixels and transfers it from teachers to students, compensating for missing global information in focal distillation.

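Figure 1 shows that the student network's attention map responds more strongly to the foreground than to the background, which indicates that distillation is also affected by foreground-background imbalance.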

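Table 1 shows that, when the authors decouple the foreground (fg) and background (bg) features, distilling the two together actually gives the worst result (38.9). This motivates focal distillation, which picks out the critical pixels and channels.

For global feature extraction, the paper uses GcBlock.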

2. Contribution

We present that the pixels and channels that teacher and student pay attention to are quite different. If we distill the pixels and channels without distinguishing them, it will result in a trivial improvement.

We propose focal and global distillation, which enables the student not only to focus on the teacher’s critical pixels and channels, but also to learn the relation between pixels.

We verify the effectiveness of our method on various detectors via extensive experiments on the COCO [21], including one-stage, two-stage, anchor-free methods, achieving state-of-the-art performance.

3. Method

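The authors first give an example: ordinary feature distillation uses the following loss:
[Equation image not preserved]

Here $f$ is an adaptation layer that reshapes $F^S$ to the same dimension as $F^T$.

However, this approach distills all parts equally and does not capture the global relations between pixels.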

However, such methods treat all the parts equally and lack the distillation of the global relations between different pixels.
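The baseline loss discussed above can be sketched in a few lines. This is a minimal sketch rather than the authors' code: it assumes the usual mean-squared feature-mimicking form, and it models the adaptation layer $f$ as a plain channel-mixing matrix (a 1x1 convolution without bias). The names `baseline_feature_loss` and `adapt` are made up for illustration.

```python
import numpy as np

def baseline_feature_loss(feat_t, feat_s, adapt):
    """Mean squared error between teacher features and adapted student features.

    feat_t: teacher feature map, shape (C_t, H, W)
    feat_s: student feature map, shape (C_s, H, W)
    adapt:  hypothetical adaptation layer f, a (C_t, C_s) matrix (1x1 conv, no bias)
    """
    # Mix the student's channels into the teacher's channel space.
    feat_s_adapted = np.einsum("tc,chw->thw", adapt, feat_s)
    # Every channel and every pixel contributes with the same weight.
    return float(np.mean((feat_t - feat_s_adapted) ** 2))
```

Note that every position gets the same weight here, which is exactly the property the focal and global terms are meant to change.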

3.1. Focal Distillation

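First, a binary mask separates foreground from background: it is 1 for foreground pixels and 0 for background pixels.
[Equation image not preserved]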
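As a rough illustration of the binary mask, the sketch below marks every pixel covered by a ground-truth box as foreground. The box format (`x1, y1, x2, y2`, inclusive, already in feature-map coordinates) and the name `foreground_mask` are assumptions made for the example.

```python
import numpy as np

def foreground_mask(boxes, height, width):
    """Binary mask: 1 inside any ground-truth box, 0 elsewhere.

    boxes: iterable of (x1, y1, x2, y2), inclusive, in feature-map coordinates
           (an assumed convention for this sketch).
    """
    mask = np.zeros((height, width), dtype=np.float32)
    for x1, y1, x2, y2 in boxes:
        # Rows index y, columns index x.
        mask[y1:y2 + 1, x1:x2 + 1] = 1.0
    return mask
```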

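Further, to treat small and large objects equally regardless of their ground-truth (GT) box area, and to balance the ratio of foreground to background, the authors introduce a scale mask $S$:
[Equation image not preserved]

If a pixel belongs to different targets, we choose the smallest box to calculate $S$ (an additional constraint).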
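The scale mask can be sketched as follows. Since the formula image is not available here, the exact form is an assumption: foreground pixels get the inverse area of their box, background pixels get the inverse of the background pixel count, and overlapping boxes are resolved in favour of the smallest one, as the quoted sentence says. The box convention and the name `scale_mask` are hypothetical.

```python
import numpy as np

def scale_mask(boxes, height, width):
    """Scale mask S under the assumed form:
    1 / (box area) inside a box, 1 / (number of background pixels) elsewhere.

    boxes: iterable of (x1, y1, x2, y2), inclusive, in feature-map coordinates.
    """
    def area(box):
        x1, y1, x2, y2 = box
        return (x2 - x1 + 1) * (y2 - y1 + 1)

    is_fg = np.zeros((height, width), dtype=bool)
    scale = np.zeros((height, width), dtype=np.float64)
    # Paint large boxes first so that smaller boxes overwrite them on overlap.
    for box in sorted(boxes, key=area, reverse=True):
        x1, y1, x2, y2 = box
        scale[y1:y2 + 1, x1:x2 + 1] = 1.0 / area(box)
        is_fg[y1:y2 + 1, x1:x2 + 1] = True
    n_bg = int((~is_fg).sum())
    if n_bg > 0:
        scale[~is_fg] = 1.0 / n_bg
    return scale
```

With this weighting, each box contributes a total weight of about 1 regardless of its size, and so does the whole background.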

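The spatial and channel attention are computed as follows.

$G^S$ can be read as an $H \times W \times 1$ attention map and $G^C$ as a $1 \times 1 \times C$ attention map.
[Equation image not preserved]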

The attention masks are therefore defined as:
[Equation image not preserved]
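The two attention steps can be sketched together. The formula images are missing here, so the sketch assumes the form I recall from the paper: $G^S$ is the mean absolute activation over channels, $G^C$ is the mean absolute activation over pixels, and each is turned into a mask with a temperature softmax rescaled by the number of entries. The function names and the `temperature` argument are illustrative.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_masks(feat, temperature):
    """Spatial and channel attention masks for one feature map.

    feat: array of shape (C, H, W)
    Returns (a_s, a_c) with shapes (H, W) and (C,).
    """
    c, h, w = feat.shape
    g_s = np.abs(feat).mean(axis=0)       # G^S, shape (H, W)
    g_c = np.abs(feat).mean(axis=(1, 2))  # G^C, shape (C,)
    # Rescale so that a uniform feature map yields a mask of all ones.
    a_s = h * w * softmax(g_s.reshape(-1) / temperature).reshape(h, w)
    a_c = c * softmax(g_c / temperature)
    return a_s, a_c
```

A lower temperature sharpens the masks, concentrating weight on the strongest pixels and channels.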

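The feature loss is defined below. Its two terms are computed over the foreground and the background respectively and are balanced by two hyper-parameters. During training, $A^S$ and $A^C$ are both taken from the teacher model.
[Equation image not preserved]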
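Putting the masks together, the feature loss might look like the sketch below. It assumes the squared difference at each position is weighted by the scale mask and by the teacher's spatial and channel attention, then summed separately over foreground and background. The names `alpha` and `beta` for the two balancing hyper-parameters are assumptions, and the student features are taken to be already passed through the adaptation layer.

```python
import numpy as np

def focal_feature_loss(feat_t, feat_s, mask, scale, a_s, a_c, alpha, beta):
    """Weighted feature loss split into foreground and background terms.

    feat_t, feat_s: teacher / adapted student features, shape (C, H, W)
    mask, scale:    binary mask and scale mask, shape (H, W)
    a_s, a_c:       teacher attention masks, shapes (H, W) and (C,)
    alpha, beta:    assumed names for the fg / bg balancing weights
    """
    # Per-position weight, broadcast to (C, H, W).
    weight = scale[None, :, :] * a_s[None, :, :] * a_c[:, None, None]
    sq_diff = (feat_t - feat_s) ** 2
    fg_term = (mask[None, :, :] * weight * sq_diff).sum()
    bg_term = ((1.0 - mask[None, :, :]) * weight * sq_diff).sum()
    return float(alpha * fg_term + beta * bg_term)
```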

Attention loss:
Besides, we use attention loss $L_{at}$ to force the student detector to mimic the spatial and channel attention mask of the teacher detector (L1 loss).
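A minimal sketch of the attention loss, assuming it is an L1 distance between the teacher's and the student's masks, added over the spatial and channel parts and scaled by a weight here called `gamma` (an assumed name). The mean reduction is a choice made for this example; a sum reduction would differ only by a constant factor.

```python
import numpy as np

def attention_loss(a_s_teacher, a_c_teacher, a_s_student, a_c_student, gamma):
    """L1 distance between teacher and student attention masks."""
    spatial = np.abs(a_s_teacher - a_s_student).mean()
    channel = np.abs(a_c_teacher - a_c_student).mean()
    return float(gamma * (spatial + channel))
```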

3.2 Global loss

As shown in Fig. 4, we utilize GcBlock [2] to capture the global relation information in a single image and force the student detector to learn the relation from the teacher detector.
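To make the global term concrete, here is a simplified GcBlock-style relation module and the loss built on it. This is a sketch under several assumptions, not the reference implementation: attention pooling over all pixels produces one context vector, a bottleneck transform (linear, layer norm without learned affine parameters, ReLU, linear) maps it back to the channel dimension, and the result is added to every pixel. The parameter names `w_k`, `w_v1`, `w_v2`, the weight `lam`, and the use of separate parameters for teacher and student are all hypothetical.

```python
import numpy as np

def gc_block(feat, w_k, w_v1, w_v2, eps=1e-5):
    """Simplified GcBlock-style relation module.

    feat: (C, H, W); w_k: (C,); w_v1: (C_r, C); w_v2: (C, C_r)
    """
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)          # (C, N) with N = H * W
    logits = w_k @ flat                    # one score per pixel, shape (N,)
    attn = np.exp(logits - logits.max())
    attn = attn / attn.sum()               # softmax over pixels
    context = flat @ attn                  # attention-pooled context, shape (C,)
    hidden = w_v1 @ context                # bottleneck, shape (C_r,)
    hidden = (hidden - hidden.mean()) / np.sqrt(hidden.var() + eps)  # layer norm
    hidden = np.maximum(hidden, 0.0)       # ReLU
    delta = w_v2 @ hidden                  # back to C channels
    return feat + delta[:, None, None]     # same global context added to every pixel

def global_loss(feat_t, feat_s, params_t, params_s, lam):
    """Squared difference between teacher and student relation features."""
    r_t = gc_block(feat_t, *params_t)
    r_s = gc_block(feat_s, *params_s)
    return float(lam * ((r_t - r_s) ** 2).sum())
```

Because the context vector pools over the whole feature map, this term carries information about relations between distant pixels that the per-pixel focal term cannot see.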

The total loss of the student model:
[Equation image not preserved]
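Based on the terms introduced in this section, the total loss can be written as below. This is a reconstruction under assumptions: $L_{original}$ stands for the student detector's own detection loss, $L_{focal}$ groups the feature loss and the attention loss, and the balancing hyper-parameters are taken to be folded into each term.

$$
L = L_{original} + L_{focal} + L_{global}, \qquad L_{focal} = L_{fea} + L_{at}
$$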

4. Experiments

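The paper adopts a method from General Instance Distillation for Object Detection (CVPR 2021), the inherit strategy: when the student and the teacher have the same head structure, the student model is initialized with the teacher's weights.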
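The inherit strategy amounts to copying matching parameters before training starts. A minimal sketch, assuming weights are stored as name-to-array dictionaries and that the shared parts can be identified by name prefix; the prefixes `neck.` and `bbox_head.` and the function name are hypothetical, not taken from the paper's code.

```python
import numpy as np

def inherit_weights(student_state, teacher_state, prefixes=("neck.", "bbox_head.")):
    """Copy teacher parameters into the student where name and shape match.

    Returns the new student state and the list of inherited parameter names.
    """
    new_state = dict(student_state)
    inherited = []
    for name, teacher_value in teacher_state.items():
        if not name.startswith(tuple(prefixes)):
            continue  # e.g. backbone weights differ between teacher and student
        student_value = student_state.get(name)
        if student_value is not None and student_value.shape == teacher_value.shape:
            new_state[name] = teacher_value.copy()
            inherited.append(name)
    return new_state, inherited
```

Parameters whose shapes differ are left at the student's own initialization, so the sketch degrades gracefully when only part of the structure is shared.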

4.1 Main results

[Images: main results, not preserved]

4.2 Ablation

4.2.1 Sensitivity study of different losses

4.2.2 Sensitivity study of focal distillation

4.2.3 Sensitivity study of global distillation

4.2.4 Sensitivity study of T

4.2.5 Sensitivity study of hyper-parameters
