[CVPR 2020] Distillation Series (1): Regularizing Class-wise Predictions via Self-knowledge Distillation

CVPR 2020

Paper:
Code:
Main problem:
Main idea:
Key features:
Implementation:
- Notation:
- CS-KD:
Experimental results:

Paper:

https://arxiv.org/abs/2003.13964

Code:

https://github.com/alinlab/cs-kd

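Main problem:

When a neural network is trained on its own, the conventional cross-entropy loss can lead to overfitting, and it ignores how similar the training samples are to one another.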

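Main idea:

The authors propose exploiting the similarity between samples and add an extra self-distillation loss term $D_{self}$ (written $\mathcal{L}_{\mathrm{cls}}$ in the formulas below). This term constrains the predictive distributions of samples from the same class to agree with each other, so that predictions become consistent within each class and overfitting is mitigated.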

Key features:

Self-distillation + sample similarity

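Implementation:

Notation:

Given an input $\mathbf{x}$, model parameters $\theta$, and a temperature $T$, the posterior predictive probability that the network assigns to class $y$ is:

$$
P(y|\mathbf{x},\theta,T)=\frac{e^{f_y(\mathbf{x},\theta)/T}}{\sum_{i=1}^C e^{f_i(\mathbf{x},\theta)/T}}
$$

Here $f_i(\mathbf{x},\theta)$ is the logit for class $i$ and $C$ is the number of classes. The numerator is the exponentiated, temperature-scaled logit of class $y$, and the denominator sums the same quantity over all classes.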
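As an illustration of the formula above, here is a minimal pure-Python sketch of a temperature-scaled softmax. The function name `softmax_with_temperature` and the example logits are assumptions made for this note; they are not taken from the paper or its repository.

```python
import math

def softmax_with_temperature(logits, T=1.0):
    """Return P(y | x, theta, T) for a list of logits f_i(x, theta)."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for C = 3 classes (illustrative values only).
logits = [2.0, 1.0, 0.1]
p_sharp = softmax_with_temperature(logits, T=1.0)
p_soft = softmax_with_temperature(logits, T=4.0)
```

A larger $T$ flattens the distribution, so `p_soft` is closer to uniform than `p_sharp`.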

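CS-KD:

Consider two samples $\mathbf{x}$ and $\mathbf{x}'$ drawn from the same class. The class-wise loss is defined as the Kullback-Leibler divergence between their predictive distributions:

$$
\mathcal{L}_{\mathrm{cls}}(\mathbf{x}, \mathbf{x}'; \theta, T) := \mathrm{KL}\big( P(y|\mathbf{x}', \tilde{\theta}, T) \,\big\|\, P(y|\mathbf{x}, \theta, T) \big)
$$

Here $\tilde{\theta}$ is a fixed copy of $\theta$: the prediction on $\mathbf{x}'$ serves as the target, and no gradient is propagated through it.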
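The class-wise term can be sketched numerically as follows. This is a pure-Python illustration under stated assumptions, not the authors' implementation: the helper names `softmax_T`, `kl_divergence`, and `class_wise_loss` and the example logits are hypothetical. Plain Python has no automatic differentiation, so the "fixed copy" $\tilde{\theta}$ is only indicated by a comment; in an autodiff framework the target distribution would be excluded from gradient computation.

```python
import math

def softmax_T(logits, T):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / T for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i), skipping terms with p_i = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def class_wise_loss(logits_x, logits_x_prime, T):
    # Target: prediction on x' (same class as x), treated as a constant.
    target = softmax_T(logits_x_prime, T)
    # Prediction on x, the side that would receive gradients.
    pred = softmax_T(logits_x, T)
    return kl_divergence(target, pred)

# Hypothetical logits for two samples of the same class.
logits_x = [2.0, 1.0, 0.1]
logits_x_prime = [1.5, 1.2, 0.3]
loss_same = class_wise_loss(logits_x, logits_x, T=4.0)
loss_diff = class_wise_loss(logits_x, logits_x_prime, T=4.0)
```

The loss is zero when both samples yield identical predictions and positive otherwise, which is what pushes same-class predictions toward each other.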

This term forces samples of the same class to produce similar predictive distributions. The final distillation loss is:

$$
\begin{aligned} \mathcal{L}_{\mathrm{CS}-\mathrm{KD}}\left(\mathbf{x}, \mathbf{x}^{\prime}, y ; \theta, T\right) &:=\mathcal{L}_{\mathrm{CE}}(\mathbf{x}, y ; \theta) \\ &+\lambda_{\mathrm{cls}} \cdot T^{2} \cdot \mathcal{L}_{\mathrm{cls}}\left(\mathbf{x}, \mathbf{x}^{\prime} ; \theta, T\right) \end{aligned}
$$

where $\lambda_{\mathrm{cls}}$ is the weight of the class-wise regularization term.

## Experimental results:

![Experimental results figure](https://ad.itadn.com/c/weblog/blog-img/images/2024-12-27/9tz8jITFWXL5bZw2P0uxJDrRVMgC.png)

![Experimental results figure](https://ad.itadn.com/c/weblog/blog-img/images/2024-12-27/hoRdYPLBewnsybKS6WTZaqVAJCc9.png)

![Experimental results figure](https://ad.itadn.com/c/weblog/blog-img/images/2024-12-27/yXrmqC0kVa7wQxWhjliKLoYnPFAS.png)

![Experimental results figure](https://ad.itadn.com/c/weblog/blog-img/images/2024-12-27/46sn35wvWS9t0zeXukLp2FRE8Ury.png)
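Returning to the combined objective $\mathcal{L}_{\mathrm{CS-KD}}$ defined above, the following pure-Python sketch shows how the cross-entropy term and the scaled class-wise term might be added for a single pair of same-class samples. The function name `cs_kd_loss`, the default values of `T` and `lambda_cls`, and the logits are placeholders chosen for illustration; they are not settings reported in this note.

```python
import math

def softmax_T(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / T for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cs_kd_loss(logits_x, logits_x_prime, y, T=4.0, lambda_cls=1.0):
    """L_CE(x, y) + lambda_cls * T^2 * KL(P(.|x', T) || P(.|x, T))."""
    # Cross-entropy on x with the true label y (temperature 1).
    ce = -math.log(softmax_T(logits_x, 1.0)[y])
    # Class-wise term: prediction on x' is the (constant) target.
    target = softmax_T(logits_x_prime, T)
    pred = softmax_T(logits_x, T)
    kl = sum(t * math.log(t / p) for t, p in zip(target, pred))
    return ce + lambda_cls * (T ** 2) * kl

# Hypothetical logits for two samples that share the label y = 0.
logits_x = [2.0, 1.0, 0.1]
logits_x_prime = [1.5, 1.2, 0.3]
total = cs_kd_loss(logits_x, logits_x_prime, y=0)
ce_only = cs_kd_loss(logits_x, logits_x_prime, y=0, lambda_cls=0.0)
```

Setting `lambda_cls=0.0` recovers plain cross-entropy, so `total - ce_only` isolates the contribution of the class-wise regularizer.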
