Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond


Model Overview

Encoder-Decoder RNN with Attention and Large Vocabulary Trick

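The encoder is a bidirectional GRU. The decoder is a unidirectional GRU with an attention mechanism over the source hidden states and a softmax layer over the target vocabulary to generate words. The large vocabulary trick (LVT) restricts the decoder vocabulary of each mini-batch to the words that appear in the source documents of that batch. To cover common words, the most frequent words in the target dictionary are then added until the vocabulary reaches a fixed maximum size. Shrinking the softmax layer in this way avoids computing over a very large output space, which speeds up training and convergence. The trick suits summarization well, because a large share of the words in a summary already appear in the source document.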
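
The vocabulary construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function name `build_batch_vocab`, the `max_size` parameter, and the choice to keep every source word even if that alone exceeds the cap are all hypothetical.

```python
from collections import Counter


def build_batch_vocab(batch_sources, target_freqs, max_size):
    """Build the decoder vocabulary for one mini-batch (illustrative sketch).

    batch_sources: list of tokenized source documents in the batch.
    target_freqs:  Counter of word frequencies on the target side.
    max_size:      cap on the decoder vocabulary size (hypothetical name).
    """
    vocab, seen = [], set()
    # Step 1: every word that occurs in the batch's source documents.
    for doc in batch_sources:
        for word in doc:
            if word not in seen:
                seen.add(word)
                vocab.append(word)
    # Step 2: top up with the most frequent target words until the cap.
    # Assumption: source words are never dropped, even if they exceed the cap.
    for word, _ in target_freqs.most_common():
        if len(vocab) >= max_size:
            break
        if word not in seen:
            seen.add(word)
            vocab.append(word)
    return vocab


batch = [["the", "cat", "sat"], ["a", "dog", "sat"]]
freqs = Counter({"the": 10, "a": 8, "of": 7, "in": 5, "cat": 1})
vocab = build_batch_vocab(batch, freqs, max_size=7)
print(vocab)  # ['the', 'cat', 'sat', 'a', 'dog', 'of', 'in']
```

The softmax for this batch would then range over 7 entries instead of the full target dictionary, which is where the training speed-up comes from.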

Capturing Keywords using Feature-rich Encoder

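Identifying the key concepts and entities in a document is one of the main challenges in summarization. To address it, the encoder builds a look-up based embedding matrix for each tag type, analogous to word embeddings. On the source side, the embeddings of all the tags of a word are looked up and concatenated into one long vector (as shown in the corresponding figure of the paper). On the target side, only word-based embeddings are used.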

Each POS tag, NER tag, and discretized TF and IDF value is represented by an embedding vector. These embedding vectors are concatenated with the word-based embeddings and fed into the encoder as input.
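
A minimal sketch of how such an encoder input could be assembled is shown below. It assumes small random look-up tables and hand-picked bin boundaries for the TF and IDF values; the helper names (`make_table`, `bucketize`, `encode_token`), the dimensions, and the boundaries are illustrative choices, not values from the paper.

```python
import random


def make_table(keys, dim, rng):
    """Create a look-up table mapping each key to a random `dim`-d vector."""
    return {k: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for k in keys}


def bucketize(value, boundaries):
    """Map a continuous value to a bin index (number of boundaries passed)."""
    return sum(value >= b for b in boundaries)


def encode_token(token, tables, tf_bounds, idf_bounds):
    """Concatenate word, POS, NER, TF-bin and IDF-bin embeddings."""
    vec = []
    vec += tables["word"][token["word"]]
    vec += tables["pos"][token["pos"]]
    vec += tables["ner"][token["ner"]]
    vec += tables["tf"][bucketize(token["tf"], tf_bounds)]
    vec += tables["idf"][bucketize(token["idf"], idf_bounds)]
    return vec


rng = random.Random(0)
tables = {
    "word": make_table(["Paris", "is"], 4, rng),
    "pos": make_table(["NNP", "VBZ"], 2, rng),
    "ner": make_table(["LOC", "O"], 2, rng),
    "tf": make_table(range(3), 1, rng),   # 2 boundaries -> 3 bins
    "idf": make_table(range(3), 1, rng),
}
token = {"word": "Paris", "pos": "NNP", "ner": "LOC", "tf": 0.02, "idf": 3.2}
vec = encode_token(token, tables, tf_bounds=[0.01, 0.05], idf_bounds=[1.0, 3.0])
print(len(vec))  # 10 = 4 (word) + 2 (POS) + 2 (NER) + 1 (TF) + 1 (IDF)
```

In a real model the tables would be trainable parameters and the resulting vector would be the per-token input to the bidirectional GRU.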

Modeling Rare/Unseen Words using Switching Generator-Pointer

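Out-of-vocabulary (OOV) words, that is, rare or unseen words, can be handled simply with a switching generator-pointer, which lets the model point directly to the relevant position in the source document.
The decoder is equipped with a switch that selects between two modes at each time step: when the switch is on ('G'), the decoder produces a word from the target vocabulary through the usual softmax layer; when it is off ('P'), the decoder instead generates a pointer to a word position in the source, and the word at that position is copied into the summary.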


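The switch probability is computed in much the same way as the generation probability in the Get To The Point paper: both use a sigmoid function as a soft decision. The attention distribution over word positions in the document is used as the distribution from which the pointer is sampled.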
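
One decoding step of this switch can be sketched as follows. The sketch takes the switch logit, the softmax output, and the attention weights as given inputs, and it uses greedy choices with a 0.5 threshold; the function name `decode_step`, the threshold, and the toy numbers are assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_step(switch_logit, vocab_probs, attention, source_tokens, vocab):
    """Choose between generating ('G') and pointing ('P') for one time step.

    switch_logit:  pre-sigmoid score of the switch (assumed to be computed
                   elsewhere from the decoder state and context vector).
    vocab_probs:   softmax distribution over the decoder vocabulary.
    attention:     attention distribution over source word positions.
    """
    p_switch = sigmoid(switch_logit)
    if p_switch >= 0.5:  # switch on: emit the most probable vocabulary word
        idx = max(range(len(vocab_probs)), key=vocab_probs.__getitem__)
        return "G", vocab[idx]
    # switch off: copy the source word with the highest attention weight
    pos = max(range(len(attention)), key=attention.__getitem__)
    return "P", source_tokens[pos]


source = ["obama", "visited", "kyoto"]
vocab = ["the", "president", "<unk>"]
gen = decode_step(2.0, [0.1, 0.7, 0.2], [0.2, 0.5, 0.3], source, vocab)
ptr = decode_step(-2.0, [0.1, 0.7, 0.2], [0.1, 0.1, 0.8], source, vocab)
print(gen)  # ('G', 'president')
print(ptr)  # ('P', 'kyoto')
```

The second call shows the point of the mechanism: "kyoto" is not in the decoder vocabulary, yet it can still appear in the summary because it is copied from the source.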

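The objective function is:

$$\log P(\mathbf{y} \mid \mathbf{x}) = \sum_i \Big( g_i \log\{P(y_i \mid \mathbf{y}_{-i}, \mathbf{x})\, P(s_i)\} + (1 - g_i) \log\{P(p(i) \mid \mathbf{y}_{-i}, \mathbf{x})\, (1 - P(s_i))\} \Big)$$

Here $\mathbf{x}$ is the source document, $\mathbf{y}$ is the summary, $P(s_i)$ is the probability that the switch is on at position $i$, and $p(i)$ is the source position the pointer should select. $g_i$ is an indicator function: $g_i = 0$ when the word at position $i$ of the summary is OOV with respect to the decoder vocabulary, and $g_i = 1$ otherwise.

During training, the model is given explicit pointer information whenever a summary word is not in the target vocabulary (an OOV word).
When the OOV summary word occurs at multiple positions in the source document, the tie is broken in favor of its first occurrence.
Training optimizes this conditional log-likelihood, with additional regularization penalties.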
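
To make the role of $g_i$ concrete, the sketch below computes the negative log-likelihood of a short summary: positions with $g_i = 1$ contribute the log of the word probability times the switch-on probability, and positions with $g_i = 0$ contribute the log of the pointer probability times the switch-off probability. The dictionary keys, the helper names, and the toy probabilities are hypothetical, and the regularization penalties mentioned in the text are left out.

```python
import math


def pointer_target(word, source_tokens):
    """Return the first source position of `word` (ties go to first occurrence)."""
    return source_tokens.index(word)


def switch_pointer_nll(steps):
    """Negative log-likelihood over summary positions (no regularization).

    Each step is a dict with:
      g         : 1 if the summary word is in the decoder vocabulary, else 0
      p_switch  : probability that the switch is on (generate)
      p_word    : probability of the gold word under the softmax (when g == 1)
      p_pointer : attention probability of the gold source position (when g == 0)
    """
    total = 0.0
    for step in steps:
        if step["g"] == 1:
            total += math.log(step["p_word"] * step["p_switch"])
        else:
            total += math.log(step["p_pointer"] * (1.0 - step["p_switch"]))
    return -total


source = ["kyoto", "is", "near", "kyoto", "station"]
first_pos = pointer_target("kyoto", source)  # first occurrence, index 0

steps = [
    {"g": 1, "p_word": 0.5, "p_switch": 0.8},     # in-vocabulary word
    {"g": 0, "p_pointer": 0.25, "p_switch": 0.2},  # OOV word, copied
]
loss = switch_pointer_nll(steps)
print(first_pos, round(loss, 4))
```

With these numbers the loss equals $-\log(0.4 \times 0.2) = -\log 0.08$, since the two positions contribute $\log(0.5 \times 0.8)$ and $\log(0.25 \times 0.8)$.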

Capturing Hierarchical Document Structure with Hierarchical Attention

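When the source document is very long, the model must identify not only the keywords but also the key sentences from which the summary can be drawn. The paper captures this with two bidirectional RNNs on the source side, one at the word level and one at the sentence level, with attention operating at both levels simultaneously. The word-level attention is re-weighted by the attention of the corresponding sentence and then re-normalized so that it sums to 1.

The re-scaled attention is then used to compute the attention-weighted context vector that feeds into the hidden state of the decoder. In addition, positional embeddings are concatenated to the hidden state of the sentence-level RNN to model the positional importance of sentences in the document.
The architecture thus jointly models the key sentences and the keywords within them; Figure 3 shows the overall model.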
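
The re-weighting of word-level attention by sentence-level attention can be sketched as below. The function name `rescale_attention` and the toy attention values are illustrative assumptions; the sketch only covers the re-scaling and normalization, not the RNNs that produce the two attention distributions.

```python
def rescale_attention(word_attn, sent_attn, sent_of_word):
    """Re-weight word-level attention by sentence-level attention.

    word_attn:    attention weight of each source word position.
    sent_attn:    attention weight of each source sentence.
    sent_of_word: sentence index that each word position belongs to.
    Returns a distribution over word positions that sums to 1.
    """
    weighted = [w * sent_attn[s] for w, s in zip(word_attn, sent_of_word)]
    total = sum(weighted)
    return [x / total for x in weighted]


word_attn = [0.4, 0.1, 0.3, 0.2]   # four words
sent_attn = [0.25, 0.75]           # two sentences
sent_of_word = [0, 0, 1, 1]        # first two words in sentence 0
rescaled = rescale_attention(word_attn, sent_attn, sent_of_word)
print([round(a, 2) for a in rescaled])  # [0.2, 0.05, 0.45, 0.3]
```

In this example the first word has the highest word-level weight, but after re-scaling the third word dominates because it sits in the sentence with the higher sentence-level attention.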
