[NLP Paper Reading] Distributed Representations of Sentences and Documents


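Introduction to the CBOW and Skip-gram models. Before discussing the paper itself, we first need to review the CBOW and Skip-gram models that Mikolov proposed in another paper.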

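The CBOW model can be viewed as a three-layer structure:
Input layer (input): the input is Context(w), the word vectors of the words surrounding w. The context is limited by a window of C words on each side of w, 2C words in total.
Projection layer (projection): sums the 2C input word vectors.
Output layer (output): unlike earlier probabilistic language models, the output layer is a binary tree, specifically a Huffman tree built from word frequencies. Every word that occurs in the corpus is a leaf, so there are |V| leaves, where V is the vocabulary. Given Context(w), the target word w has a unique position in the tree, determined by its Huffman code, that is, by the path from the root to its leaf. Each branch along that path can be treated as a binary classification whose outcome (0 or 1) has its own probability. P(w|Context(w)) is therefore the product of the branch probabilities along the path from the root to the leaf, and the model is trained to maximize this probability.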
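
The path-product idea above can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions, not the word2vec implementation: the toy frequencies, the random vectors, the helper names (`build_huffman_paths`, `cbow_probability`), and the convention that bit 1 receives probability sigmoid(score) are all made up for the example.

```python
import heapq
import itertools
import math
import random


def build_huffman_paths(freqs):
    """Build a Huffman tree; return {word: [(inner_node_id, bit), ...]} from root to leaf."""
    tie = itertools.count()  # unique tie-breaker so the heap never compares node payloads
    heap = [(f, next(tie), ("leaf", w)) for w, f in freqs.items()]
    heapq.heapify(heap)
    inner_ids = itertools.count()
    while len(heap) > 1:
        f0, _, left = heapq.heappop(heap)
        f1, _, right = heapq.heappop(heap)
        node = ("inner", next(inner_ids), left, right)
        heapq.heappush(heap, (f0 + f1, next(tie), node))

    paths = {}

    def walk(node, path):
        if node[0] == "leaf":
            paths[node[1]] = path
            return
        _, node_id, left, right = node
        walk(left, path + [(node_id, 0)])
        walk(right, path + [(node_id, 1)])

    walk(heap[0][2], [])
    return paths


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cbow_probability(word, context, word_vecs, node_vecs, paths):
    """P(word | context) as the product of binary decisions along the Huffman path."""
    dim = len(node_vecs[0])
    # Projection layer: sum the context word vectors.
    x = [sum(word_vecs[c][k] for c in context) for k in range(dim)]
    prob = 1.0
    for node_id, bit in paths[word]:
        p_one = sigmoid(sum(a * b for a, b in zip(x, node_vecs[node_id])))
        prob *= p_one if bit == 1 else 1.0 - p_one  # assumed convention: bit 1 -> sigmoid
    return prob


# Toy data (hypothetical): four words, 3-dimensional random vectors.
freqs = {"the": 5, "cat": 2, "sat": 2, "mat": 1}
paths = build_huffman_paths(freqs)
rng = random.Random(0)
word_vecs = {w: [rng.uniform(-1, 1) for _ in range(3)] for w in freqs}
node_vecs = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(len(freqs) - 1)]

# The leaf probabilities form a proper distribution: they sum to 1 up to rounding.
total = sum(cbow_probability(w, ["cat", "sat"], word_vecs, node_vecs, paths) for w in freqs)
print(total)
```

Because each inner node splits its probability mass between its two children, the leaf probabilities always sum to 1 without computing a normalizer over the whole vocabulary, and frequent words sit closer to the root so their paths are shorter.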

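Computing the product of probabilities and its partial derivatives directly is cumbersome, so in practice one takes the logarithm, which turns the product into a sum.
In short, the CBOW model takes the context Context(w) as input and predicts the target word w.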
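
To make the logarithm step concrete, the objective can be written out as below. The notation is an assumption introduced for this sketch and does not come from the post: \mathcal{C} is the training corpus, x_w is the sum of the context vectors of w, l^w is the number of branches on the path to w, d_j^w \in \{0, 1\} is the j-th bit of the Huffman code of w, and \theta_j^w is the parameter vector of the j-th inner node on that path. Which bit is assigned \sigma(\cdot) is a convention; the opposite choice works equally well.

```latex
P(w \mid \mathrm{Context}(w))
  = \prod_{j=1}^{l^w}
    \sigma\!\left(x_w^\top \theta_j^w\right)^{d_j^w}
    \left(1 - \sigma\!\left(x_w^\top \theta_j^w\right)\right)^{1 - d_j^w}

\mathcal{L}
  = \sum_{w \in \mathcal{C}} \log P(w \mid \mathrm{Context}(w))
  = \sum_{w \in \mathcal{C}} \sum_{j=1}^{l^w}
    \left[ d_j^w \log \sigma\!\left(x_w^\top \theta_j^w\right)
         + \left(1 - d_j^w\right) \log\!\left(1 - \sigma\!\left(x_w^\top \theta_j^w\right)\right) \right]
```

Each summand depends on a single inner node, so its gradient with respect to \theta_j^w and x_w is easy to compute term by term.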
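The Skip-gram model is the mirror image of CBOW, and structurally it can also be viewed as a three-layer network:
Input layer (input): takes the single word w as input.
Projection layer (projection): an identity mapping, so it does no real work.
Output layer (output): as in CBOW, the output layer is a Huffman tree. P(Context(w)|w) is the product of the probabilities of the 2C context words, and the model is trained to maximize this probability.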
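
The opposite prediction directions of the two models are easiest to see in the training examples each one generates. The sketch below is illustrative only; the function names and the toy sentence are assumptions, and the window is simply truncated at the sentence boundaries.

```python
def context_of(tokens, i, C):
    """Up to C words on each side of position i (truncated at the sentence edges)."""
    return tokens[max(0, i - C):i] + tokens[i + 1:i + 1 + C]


def cbow_examples(tokens, C):
    """CBOW: one example per position, (context words, target word)."""
    return [(context_of(tokens, i, C), w)
            for i, w in enumerate(tokens) if context_of(tokens, i, C)]


def skipgram_examples(tokens, C):
    """Skip-gram: one example per (center word, context word) pair."""
    return [(w, c) for i, w in enumerate(tokens) for c in context_of(tokens, i, C)]


tokens = ["the", "cat", "sat", "on", "the", "mat"]
cbow = cbow_examples(tokens, C=1)
skipgram = skipgram_examples(tokens, C=1)
for example in cbow[:3]:
    print("CBOW     ", example)
for example in skipgram[:3]:
    print("Skip-gram", example)
```

CBOW sees the whole context at once and makes one prediction per position, while Skip-gram makes one prediction per context word, which is why its probability is a product over the 2C context words.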

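Note that a Paragraph Vector is not restricted to paragraphs in the literal sense. Depending on the task, it can represent a phrase, a sentence, or a whole document. In the paper the authors again propose two models: PV-DM (the Distributed Memory version of Paragraph Vector) and PV-DBOW (the Distributed Bag of Words version of Paragraph Vector).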
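Compared with CBOW, PV-DM adds a Paragraph id to the input, which identifies the paragraph that the input context comes from. For example, when training sentence vectors, each sentence in the corpus gets its own Paragraph id, and that id maps to the sentence's vector. Note that the Paragraph id stays the same while the window slides within one paragraph.
The authors point out that, after training, the vector for a new, unseen paragraph is obtained by adding a column to the matrix D and running gradient descent on D alone, with all other parameters held fixed. Once this converges, the result is the Paragraph Vector of the new paragraph.
In my view, PV-DM takes word order into account to some extent, but not fully.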
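
The inference step for an unseen paragraph can be sketched as follows. This is a simplified illustration, not the authors' code. It assumes a full softmax instead of hierarchical softmax, averages the paragraph vector with the context word vectors, omits the bias term, and uses full-batch gradient descent. The names `infer_paragraph_vector`, `W` (word vectors), and `U` (softmax weights) and the random toy parameters are hypothetical stand-ins for an already trained model.

```python
import math
import random


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def infer_paragraph_vector(tokens, W, U, window=2, steps=100, lr=0.1):
    """Fit a vector d for an unseen paragraph; W and U are read but never modified."""
    vocab = list(U)
    dim = len(U[vocab[0]])
    d = [0.0] * dim          # the new column of D
    n = window + 1           # paragraph vector + `window` context words
    losses = []
    for _ in range(steps):
        loss, grad = 0.0, [0.0] * dim
        for i in range(window, len(tokens)):
            context = tokens[i - window:i]
            # Hidden vector: average of the paragraph vector and the context word vectors.
            h = [(d[k] + sum(W[c][k] for c in context)) / n for k in range(dim)]
            probs = softmax([sum(a * b for a, b in zip(h, U[v])) for v in vocab])
            loss -= math.log(probs[vocab.index(tokens[i])])
            # d(loss)/d(d) = (1/n) * sum_v (p_v - y_v) * U_v
            for p, v in zip(probs, vocab):
                err = p - (1.0 if v == tokens[i] else 0.0)
                for k in range(dim):
                    grad[k] += err * U[v][k] / n
        losses.append(loss)
        d = [dk - lr * gk for dk, gk in zip(d, grad)]  # only d is updated
    return d, losses


# Stand-in for a trained model (random values, for illustration only).
rng = random.Random(0)
vocab = ["the", "cat", "sat", "on", "mat"]
W = {w: [rng.uniform(-0.5, 0.5) for _ in range(4)] for w in vocab}
U = {w: [rng.uniform(-0.5, 0.5) for _ in range(4)] for w in vocab}
W_before = {w: list(v) for w, v in W.items()}
U_before = {w: list(v) for w, v in U.items()}

new_paragraph = ["the", "cat", "sat", "on", "the", "mat"]
d_new, losses = infer_paragraph_vector(new_paragraph, W, U)
print(losses[0], losses[-1])
```

The same `d` is used for every window of the paragraph, which mirrors the remark above that the Paragraph id does not change while the window slides.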
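The PV-DBOW model uses only the Paragraph id as input and predicts the probability distribution of the words in that paragraph. It is straightforward, so I will not go into detail.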
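
As a rough sketch of what PV-DBOW trains on, the snippet below builds (Paragraph id, word) classification examples by sampling a text window from a paragraph and then a word from that window. The function name, the toy paragraphs, and the uniform sampling are assumptions made for illustration.

```python
import random


def pv_dbow_examples(paragraphs, window, n_samples, seed=0):
    """Sample (paragraph_id, word) pairs: the id is the input, the word is the label."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n_samples):
        pid = rng.randrange(len(paragraphs))
        tokens = paragraphs[pid]
        # Pick a window start, then a random word inside that window.
        start = rng.randrange(max(1, len(tokens) - window + 1))
        word = rng.choice(tokens[start:start + window])
        examples.append((pid, word))
    return examples


paragraphs = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["dogs", "chase", "cats"],
]
examples = pv_dbow_examples(paragraphs, window=3, n_samples=8)
print(examples)
```

No context words appear in the input, so word order inside the window is ignored, which is what the "bag of words" in the name refers to.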

I may not have explained this very clearly; discussion is welcome.
