【GCN-RS】Region or Global? A Principle for Negative Sampling in Graph-based Recommendation (TKDE‘22)

Region or Global? A Principle for Negative Sampling in Graph-based Recommendation (TKDE’22)

Intermediate region

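In GCN-RS, negative samples should be drawn from the region at an intermediate distance from the user. Items that are too close are usually the ones already aggregated into the user's representation, and items that are too far away carry little information, so items at a moderate distance can be treated as informative hard negatives.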

Define the items that are k hops away from the user as the intermediate region $R_{med}$. $R_{med}$ can be obtained with a layer-by-layer breadth-first search.
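The layer-by-layer search can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `("u", id)` / `("i", id)` node encoding and the helper names `build_adj` and `intermediate_region` are assumptions made for the example, and "k hops" is read here as a shortest-path distance of exactly k in the bipartite user-item graph.

```python
from collections import deque


def build_adj(interactions):
    """Build an undirected adjacency map from (user_id, item_id) pairs.

    Users are encoded as ("u", id) and items as ("i", id); this encoding
    is a hypothetical choice for the sketch.
    """
    adj = {}
    for user_id, item_id in interactions:
        u, i = ("u", user_id), ("i", item_id)
        adj.setdefault(u, []).append(i)
        adj.setdefault(i, []).append(u)
    return adj


def intermediate_region(adj, user, k):
    """Return the item nodes whose shortest-path distance from `user` is exactly k."""
    dist = {user: 0}
    queue = deque([user])
    region = set()
    while queue:
        node = queue.popleft()
        d = dist[node]
        if d == k:
            if node[0] == "i":
                region.add(node)
            continue  # stop expanding once hop k is reached
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = d + 1
                queue.append(nb)
    return region


# Toy graph: u1-{i1,i2}, u2-{i2,i3}, u3-{i3,i4}
adj = build_adj([(1, 1), (1, 2), (2, 2), (2, 3), (3, 3), (3, 4)])
region_3hop = intermediate_region(adj, ("u", 1), 3)
```

In a bipartite graph, items only appear at odd hop distances from a user, so k = 1 returns the user's own interacted items and k = 3 is the first hop that can contain unseen items.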

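M negative samples are selected from the intermediate region $R_{med}$ to form the negative candidate set $C_{u}$, and two strategies are then applied to $C_{u}$ to obtain hard negatives.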
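As a rough sketch of building the candidate set: the notes do not say how the M items are chosen, so uniform sampling without replacement is an assumption here, and `sample_candidates` is a hypothetical helper name.

```python
import random


def sample_candidates(region, m, rng=None):
    """Pick up to m items from the intermediate region as the candidate set.

    Uniform sampling without replacement is an assumption of this sketch.
    """
    rng = rng or random.Random()
    items = sorted(region)  # fixed order so a seeded rng is reproducible
    if len(items) <= m:
        return items
    return rng.sample(items, m)


region = {10, 11, 12, 13, 14, 15}
candidates = sample_candidates(region, m=3, rng=random.Random(0))
```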

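The authors claim that the intermediate region $R_{med}$ is much smaller than the full item set, but in my own statistics on the datasets, even at just 3 hops the vast majority of $R_{med}$ cover almost the entire dataset.
[Figure: statistics of the $R_{med}$ size on the datasets]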

Strategy 1: positive-assisted sampling

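Similar to MixUp, the hard negative for a user depends not only on the user $u$ but also on the user's positive item $v$. For a positive pair $(u,v)$, the probability that a negative $v_{n}^{p}$ in $C_{u}$ is sampled is: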
$p_{n}\left(v_{n}^{p} \mid(u, v)\right)=\frac{\sigma\left(\alpha\left(\mathbf{e}_{u}^{*} \cdot \mathbf{e}_{v_{n}^{p}}^{*}\right)+(1-\alpha)\left(\mathbf{e}_{v}^{*} \cdot \mathbf{e}_{v_{n}^{p}}^{*}\right)\right)}{\sum_{v_{i} \in \mathcal{C}_{u}} \sigma\left(\alpha\left(\mathbf{e}_{u}^{*} \cdot \mathbf{e}_{v_{i}}^{*}\right)+(1-\alpha)\left(\mathbf{e}_{v}^{*} \cdot \mathbf{e}_{v_{i}}^{*}\right)\right)}$
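where $\alpha$ is a hyperparameter that balances the influence of the user and the item. Intuitively, the closer an item in $C_{u}$ is to the user $u$ and the positive item $v$ in embedding space (the larger its inner products with them), the more likely it is to be sampled as a negative. Sampling k negatives from $p_n$ gives the negative set: $\mathcal{P}_{k}=\left\{v_{n}^{p}\right\}$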
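The sampling distribution $p_n$ above can be illustrated with a small pure-Python sketch. The function names are hypothetical, embeddings are plain lists of floats, and drawing the k negatives with replacement is an assumption, since the notes do not specify it.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def positive_assisted_probs(e_u, e_v, cand_embs, alpha):
    """p_n for every candidate: sigmoid of the alpha-weighted user and
    positive-item inner products, normalized over the candidate set."""
    scores = [
        sigmoid(alpha * dot(e_u, e_c) + (1 - alpha) * dot(e_v, e_c))
        for e_c in cand_embs
    ]
    total = sum(scores)
    return [s / total for s in scores]


def sample_negatives(probs, k, rng):
    """Draw k candidate indices from p_n (with replacement: an assumption)."""
    return rng.choices(range(len(probs)), weights=probs, k=k)


e_u, e_v = [1.0, 0.0], [0.0, 1.0]
cand_embs = [[1.0, 1.0], [-1.0, -1.0], [0.0, 0.0]]
probs = positive_assisted_probs(e_u, e_v, cand_embs, alpha=0.5)
picked = sample_negatives(probs, k=2, rng=random.Random(0))
```

With these toy embeddings, the first candidate has the largest inner product with both $u$ and $v$, so it receives the highest probability, and the second candidate the lowest.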

10% pseudo-labels.

10~20% hard negatives.

Strategy 2: exposed but not clicked

Do not pick many of these: exposed-but-unclicked items carry strong bias, so only the **"hardest one"** among them is selected:
$v_{n}^{e}=\underset{v_{i} \in \mathcal{M}_{u}}{\operatorname{argmax}} \sigma\left(\beta\left(\mathbf{e}_{u}^{*} \cdot \mathbf{e}_{v_{i}}^{*}\right)\right)$

$\beta=\left\{\begin{array}{lr} 1, & \text { if } v_{e} \text { not in } \mathcal{C}_{u} \\ \text { number of exposed items, } & \text { if } v_{e} \text { in } \mathcal{C}_{u} \end{array}\right.$

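$\mathcal{M}_{u}$ is the set of exposed-but-unclicked items. $\beta$ accumulates the score, but nothing is counted for an item that is not in the intermediate region.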
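A small sketch of this selection follows. The definition of $\beta$ is terse, so the code reads it as the item's own exposure count when the item is in $C_u$ and 1 otherwise; that reading, the `exposed` dict layout, and the function name `hardest_exposed` are all assumptions of this example.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hardest_exposed(e_u, exposed, candidate_ids, item_embs):
    """Return the exposed-but-unclicked item with the largest score.

    exposed: dict item_id -> exposure count (assumed reading of beta).
    candidate_ids: set of item ids in C_u.
    """
    best_item, best_score = None, float("-inf")
    for item, count in exposed.items():
        beta = count if item in candidate_ids else 1  # no counting outside C_u
        score = sigmoid(beta * dot(e_u, item_embs[item]))
        if score > best_score:
            best_item, best_score = item, score
    return best_item


e_u = [1.0, 0.0]
item_embs = {"a": [0.5, 0.0], "b": [0.4, 0.0], "c": [0.3, 0.0]}
exposed = {"a": 1, "b": 3, "c": 5}
chosen = hardest_exposed(e_u, exposed, {"b"}, item_embs)
```

Here item `b` wins because its exposure count scales its score, while `c` is outside the candidate set and is scored with $\beta = 1$ despite being exposed more often.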

Negative sample fusion

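Because GCN-RS essentially propagates messages iteratively over the embeddings of the user-item graph, the negatives from the two strategies are merged in embedding space, where k is the number of negative samples: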
$\begin{gathered} \mathbf{e}_{v_{n}}^{*}=\underset{v_{n}^{p} \in \mathcal{P}_{k}}{\operatorname{merge}}\left(\mathbf{e}_{v_{n}^{e}}^{*}, \mathbf{e}_{v_{n}^{p}}^{*}\right) \\ \operatorname{merge}\left(\mathbf{e}_{v_{n}^{e}}, \mathbf{e}_{v_{n}^{p}}\right)=\frac{1}{k} \cdot \mathbf{e}_{v_{n}^{e}}+\left(1-\frac{1}{k}\right) \cdot \mathbf{e}_{v_{n}^{p}} \end{gathered}$
So in the end, one negative sample is synthesized for each positive sample, and the pair is trained with a margin hinge loss:
$\mathcal{L}=\max \left(0, \mathbf{e}_{u}^{*} \cdot \mathbf{e}_{v_{n}}^{*}-\mathbf{e}_{u}^{*} \cdot \mathbf{e}_{v}^{*}+\gamma\right)$
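The merge rule and the loss can be checked numerically with a short sketch. It covers only the pairwise `merge` of one exposure negative with one positive-assisted negative; how the k embeddings in $\mathcal{P}_k$ are reduced is not spelled out in these notes and is left out. The function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def merge(e_exposed, e_pos_assisted, k):
    """(1/k) * e_exposed + (1 - 1/k) * e_pos_assisted, element-wise."""
    w = 1.0 / k
    return [w * a + (1.0 - w) * b for a, b in zip(e_exposed, e_pos_assisted)]


def hinge_loss(e_u, e_v, e_neg, gamma):
    """max(0, e_u . e_neg - e_u . e_v + gamma)."""
    return max(0.0, dot(e_u, e_neg) - dot(e_u, e_v) + gamma)


e_neg = merge([1.0, 0.0], [0.0, 1.0], k=4)          # [0.25, 0.75]
loss_hard = hinge_loss([1.0, 1.0], [1.0, 0.0], e_neg, gamma=0.5)
loss_easy = hinge_loss([1.0, 1.0], [2.0, 2.0], e_neg, gamma=0.5)
```

The loss is positive only while the synthesized negative scores within the margin $\gamma$ of the positive item; once the positive score exceeds the negative score by at least $\gamma$, the sample contributes nothing.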
