Speech and Natural Language Processing (Resource Tutorial)


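This text introduces a number of toolkits for natural language processing and machine learning, together with their application areas. They include speech recognition (e.g. CMU Sphinx, HTK, Kaldi), machine translation (e.g. Moses), part-of-speech tagging and related text analysis (e.g. Stanford CoreNLP), and neural and word-vector models (e.g. gensim). It also lists libraries for signal processing and speech synthesis (e.g. ISSE, HTS). Together these resources range from basic libraries to complete modeling toolkits.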

Speech and Natural Language Processing
#######################################
.. image:: https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg
   :alt: Awesome
   :target: https://github.com/sindresorhus/awesome

A curated list of speech and natural language processing resources. Other lists can be found in this list <https://github.com/bayandin/awesome-awesomeness>_. If you want to contribute to this list (please do), send me a pull request. All sub-categories are listed in alphabetical order.

Finite State Toolkits and Regular Expressions

AT&T FSM Library <http://www2.research.att.com/~fsmtools/fsm/>_ The AT&T FSM Library is a set of general-purpose Unix software tools for building, combining, optimizing, and searching weighted finite-state acceptors and transducers.

Carmel <https://github.com/graehl/carmel>_ Finite-state toolkit with Expectation-Maximization (EM) and Bayesian training for FSTs and context-free parse trees.

Categorial semiring [specific URL] An implementation of the categorial semiring described in Sproat et al. (2014) [specific URL].

dk.brics.automaton <http://www.brics.dk/automaton/>_ A Java package providing efficient implementations of finite automata and regular expressions.

Fare <https://github.com/moodmosaic/Fare>_ A finite automata and regular expression engine for the .NET framework, implemented in C#.

Foma <https://code.google.com/p/foma/>_ Finite-state compiler and C library

fsa <http:>_ Toolkit used in RWTH ASR engine

fsm2.0 Thomas Hanneforth's fsm 2.0 library, written in C++. It includes several useful operations, such as three-way composition.

fstrain <https://github.com/markusdr/fstrain>_ A toolkit for training finite-state models.

jopenfst <https://github.com/steveash/jopenfst>_ A Java port of the C++ OpenFst library; originally forked from the CMU Sphinx project.

Kleene A high-level programming language for building finite-state automata, built on the OpenFst library.

MIT FST Toolkit <https://www.google.com/>_ No longer actively maintained, but it offers some unique commands that are not found in other toolkits.

MoMs-for-StochasticLanguages <https://github.com/ICML14MoMCompare/MoMs-for-StochasticLanguages>_ Spectral and other training algorithms for weighted finite-state automata (WFSAs).

Shortest path for PDT <https://github.com/kho/openfst>_ An implementation of shortest-path search for pushdown transducers (PDTs) in OpenFst.

Noam <https://github.com/izuzak/noam>_ A JavaScript library for working with automata and with grammars for regular and context-free languages. It also includes visualizations built with viz.js <https://github.com/mdaines/viz.js>_.

OpenFst A library for constructing, combining, optimizing, and searching weighted finite-state transducers (FSTs).

openfst-utils A handy collection of utilities for OpenFst, including an implementation of the categorial semiring.

openlat <https://github.com/benob/openlat>_ A toolkit for manipulating word lattices, built on OpenFst. It can import and export HTK lattices.

PyFst <https://github.com/vchahun/pyfst>_ Python interface to OpenFst

SFST - Stuttgart Finite State Transducer Tools <http://www.ims.uni-stuttgart.de/tcl/SOFTWARE/SFST.html>_
SFST is a toolbox for building morphological analyzers and other tools based on finite-state transducer technology.

Treba <https://code.google.com/p/treba/>_ A basic command-line tool for training, decoding, and calculating with weighted (probabilistic) finite-state automata (PFSA) and hidden Markov models (HMMs).

Many of the toolkits in the machine translation section also contain interesting graph structures and semiring operations.
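
As a rough illustration of the semiring operations these toolkits have in common, the sketch below computes shortest distances over a tiny weighted acceptor in the tropical semiring (plus is ``min``, times is ``+``). The ``Semiring`` class, the ``shortest_distance`` function, and the example arcs are hypothetical names invented for this sketch; they are not the API of OpenFst or of any other toolkit listed here, and the code assumes an acyclic machine whose states are already numbered in topological order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Semiring:
    """A semiring given by its two operations and their identities."""
    plus: Callable[[float, float], float]
    times: Callable[[float, float], float]
    zero: float  # identity for plus, annihilator for times
    one: float   # identity for times


# Tropical semiring: weights along a path add up, alternatives take the minimum.
TROPICAL = Semiring(plus=min, times=lambda a, b: a + b,
                    zero=float("inf"), one=0.0)

# Hypothetical acyclic automaton: state -> list of (next_state, weight).
# Assumption: states are numbered in topological order.
ARCS: Dict[int, List[Tuple[int, float]]] = {
    0: [(1, 1.0), (2, 4.0)],
    1: [(2, 2.0), (3, 6.0)],
    2: [(3, 1.0)],
    3: [],
}


def shortest_distance(arcs, start, semiring):
    """Return the semiring distance from `start` to every state."""
    dist = {state: semiring.zero for state in arcs}
    dist[start] = semiring.one
    for state in sorted(arcs):  # topological order by assumption
        for nxt, weight in arcs[state]:
            candidate = semiring.times(dist[state], weight)
            dist[nxt] = semiring.plus(dist[nxt], candidate)
    return dist


DIST = shortest_distance(ARCS, 0, TROPICAL)
```

Under the same assumptions, passing a semiring whose plus is addition and whose times is multiplication would sum the weights of all paths instead of selecting the best one.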

Language Modelling Toolkits

Bayesian Recurrent Neural Network for Language Modeling <http://chien.cm.nctu.edu.tw/bayesian-recurrent-neural-network-for-language-modeling/>_ A C/C++ implementation of the Bayesian recurrent neural network for language modeling (BRNNLM).

Berkeley LM <http://code.google.com/p/berkeleylm/>_

Bigfatlm <https://github.com/jhclark/bigfatlm>_
Provides Hadoop-based training of Kneser-Ney language models, written in Java.

CSLM (link) An open-source software system for building continuous space language models.

DALM <https://github.com/jnory/DALM>_ Double array language model.

KenLM
Kenneth Heafield's language modeling toolkit, which is fast and memory-efficient.

lwlm <http://chasen.org/~daiti-m/dist/lwlm/>_ An exact, full Bayesian implementation of the Latent Words Language Model (Deschacht and Moens, 2009).

Maximum Entropy Modeling <http://homepages.inf.ed.ac.uk/lzhang10/maxent.html>_ Le Zhang's page, with a rich collection of links related to MaxEnt models.

Maximum entropy language models: SRILM extension <http://www.phon.ioc.ee/dokuwiki/doku.php?id=people:tanel:srilm-me.en>_
"This patch enhances the SRILM toolkit by introducing a feature for training and applying maximum entropy (MaxEnt) language models. Currently, only n-gram-based features are supported in this implementation."

mitlm <https://code.google.com/p/gitmlm/>_ My personal favorite language modeling toolkit. It is very fast and appears to give slightly better results.

MSRLM <http://research.microsoft.com/en-us/downloads/78e26f9c-fc9a-44bb-80a7-69324c62df8c/default.aspx>_
A scalable language modeling toolkit that can build language models from very large amounts of data. It supports variants of modified absolute discounting and Kneser-Ney smoothing.
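
For readers unfamiliar with the smoothing methods named above, here is a minimal sketch of an interpolated Kneser-Ney bigram model with a single fixed discount. It is the textbook formulation, not MSRLM's implementation; the function name ``train_kneser_ney_bigram``, the discount value 0.75, and the toy sentence are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_kneser_ney_bigram(tokens, discount=0.75):
    """Return prob(word, context) for an interpolated Kneser-Ney bigram model."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    context_total = Counter()        # c(v): how often v is followed by anything
    followers = defaultdict(set)     # distinct words seen after v
    predecessors = defaultdict(set)  # distinct words seen before w
    for (v, w), count in bigrams.items():
        context_total[v] += count
        followers[v].add(w)
        predecessors[w].add(v)
    bigram_types = len(bigrams)

    def prob(word, context):
        # Continuation probability: share of bigram types that end in `word`.
        p_cont = len(predecessors.get(word, ())) / bigram_types
        total = context_total[context]
        if total == 0:
            return p_cont  # unseen context: rely on the lower-order estimate
        discounted = max(bigrams[(context, word)] - discount, 0.0) / total
        backoff_weight = discount * len(followers[context]) / total
        return discounted + backoff_weight * p_cont

    return prob


# Toy data, for illustration only.
TOKENS = "the cat sat on the mat the cat ate".split()
PROB = train_kneser_ney_bigram(TOKENS)
VOCAB = sorted(set(TOKENS))
TOTAL = sum(PROB(word, "the") for word in VOCAB)
```

The mass removed from seen bigrams by the discount is exactly what the back-off weight redistributes, so the probabilities for a seen context sum to one over the vocabulary.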

OpenGrm <http://opengrm.org>_ A language modeling toolkit for use with OpenFst.

cpyp <https://github.com/redpony/cpyp>_ A C++ library for modeling with Pitman-Yor processes.

RandLM <http://sourceforge.net/projects/randlm/>_ A toolkit for building randomized language models based on Bloom filters.
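
The Bloom filter idea behind such models can be sketched in a few lines. The ``BloomFilter`` class below is a hypothetical, simplified illustration that only answers set-membership queries for n-grams; it does not show how RandLM itself stores counts or probabilities, and the bit-array size and number of hash functions are arbitrary choices.

```python
import hashlib


class BloomFilter:
    """Approximate set: no false negatives, a small chance of false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# Toy n-grams, for illustration only.
NGRAMS = ["the cat", "cat sat", "sat on", "on the", "the mat"]
FILTER = BloomFilter()
for ngram in NGRAMS:
    FILTER.add(ngram)
```

Every stored n-gram is always reported as present; an n-gram that was never stored is usually, but not always, reported as absent, which is the memory-for-accuracy trade that randomized language models make.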

RNNLM <http://www.fit.vutbr.cz/~imikolov/rnnlm/>_ Recurrent neural network language model (RNNLM) toolkit.

refr <http://code.google.com/p/refr>_ The reranking framework used at the Johns Hopkins workshop on confusion language modeling.

rwthlm A toolkit for training neural network language models, including feedforward, recurrent, and long short-term memory (LSTM) architectures. Developed by Martin Sundermeyer.

SRILM <http://www.speech.sri.com/projects/srilm/>_ A very widely used toolkit in the speech community. The source code is freely available for non-commercial use only; commercial use requires a license.

Speech Recognition

AaltoASR <https://github.com/aalto-speech>_ Speech recognition tools hosted on GitHub, with support for multiple languages and dialects.

An open-source framework for concurrent speech processing, hosted on GitHub.

An open-source toolkit designed to support both static and dynamic decoder implementations.

kaldi-nnet-dur-model A neural network phone duration model for the Kaldi speech recognition toolkit, described in an Interspeech paper.

CMU Sphinx <http://cmusphinx.sourceforge.net/>_ An open-source speech recognition toolkit from Carnegie Mellon University.

HTK <http://htk.eng.cam.ac.uk/>_ The Hidden Markov Model Toolkit (HTK) is a portable toolkit for building and manipulating hidden Markov models.

Juicer <https://github.com/idiap/juicer>_ A WFST-based decoder for automatic speech recognition (ASR).

Julius <http://julius.sourceforge.jp/en_index.php>_ A high-performance, two-pass large vocabulary continuous speech recognition (LVCSR) decoder for speech-related researchers and developers.

Kaldi <http://kaldi.org/>_ A modern open-source toolkit led by Dan Povey, featuring many state-of-the-art techniques.

OpenDcd <http://opendcd.org/>_ An open-source WFST-based speech recognition decoder.

Phonetisaurus <https://code.google.com/p/phonetisaurus/>_
Josef Novak's highly efficient WFST-based phoneticizer. The site also offers tutorials and slides.

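Sail Align <https://github.com/nassosoassos/sail_align>_ An open-source software toolkit for robust long speech-text alignment. It uses an adaptive, iterative speech recognition and alignment scheme that can handle very long (and possibly noisy) audio signals and is robust to transcription errors. It is developed mainly as a Perl library, but its functionality also depends on...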

SCARF: A Segmental CRF Toolkit for Speech Recognition <http://research.microsoft.com/en-us/projects/scarf/>_
SCARF is a toolkit for doing speech recognition with segmental conditional random fields.

trainc <https://code.google.com/p/trainc/>_
David Rybach and Michael Riley's tool for the direct construction of context-dependency transducers, from their Interspeech best paper.

RASR <http://www-i6.informatik.rwth-aachen.de/rwth-asr/>_ RWTH ASR, the speech recognition system of RWTH Aachen University.

Signal Processing

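An Interactive Source Separation Editor <http://isse.sourceforge.net/>_
ISSE is an open-source, free, cross-platform audio editing tool that lets the user separate sound sources by painting on time-frequency visualizations of the audio signal.

Bob <https://github.com/idiap/bob>_
Bob is a free signal processing and machine learning toolbox, originally developed by the Biometrics group at the Idiap Research Institute.

Matlab Audio Processing Examples <http://www.ee.columbia.edu/~dpwe/resources/matlab/>_
A collection of Matlab example code.

SAcC - Subband Autocorrelation Classification Pitch Tracker <http://labrosa.ee.columbia.edu/projects/SAcC/>_
SAcC performs noise-robust pitch tracking by classifying the autocorrelations of a set of subbands with an MLP neural network.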

Text-to-Speech

HTS <http://hts.sp.nitech.ac.jp/>_ HMM-based speech synthesis system.

RusPhonetizer <https://github.com/wilpert/RusPhonetizer>_ Grammar rules and dictionaries for the phonetic transcription of Russian.

Speech Data

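cmudict A free pronouncing dictionary, released by the CMU Sphinx project on GitHub.

LibriSpeech ASR corpus A corpus of approximately 1,000 hours of 16 kHz read English speech, prepared by Vassil Panayotov with the assistance of Daniel Povey. The data is derived from audiobooks recorded for the LibriVox project and has been carefully segmented and aligned.

TED-LIUM Corpus A corpus built from the audio of TED talks and the transcriptions available on the TED website.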
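
A pronouncing dictionary such as cmudict is typically consumed as a word-to-phones lookup table. The sketch below assumes a plain-text layout with one entry per line (a headword followed by whitespace-separated phone symbols), alternate pronunciations marked with a numeric suffix such as ``(2)``, and comment lines starting with ``;;;``. That layout, the ``parse_pronunciations`` name, and the sample entries are assumptions for illustration; check the layout of the actual dictionary file before relying on this.

```python
import re
from collections import defaultdict

# Matches an optional variant marker such as "(2)" at the end of the headword.
_VARIANT = re.compile(r"\(\d+\)$")


def parse_pronunciations(lines):
    """Map each word to a list of pronunciations (lists of phone symbols)."""
    lexicon = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comments
        headword, *phones = line.split()
        word = _VARIANT.sub("", headword).lower()
        lexicon[word].append(phones)
    return dict(lexicon)


# Illustrative entries written for this sketch, not copied from the real file.
SAMPLE = [
    ";;; sample entries",
    "hello HH AH0 L OW1",
    "hello(2) HH EH0 L OW1",
    "speech S P IY1 CH",
]
LEXICON = parse_pronunciations(SAMPLE)
```

Keeping every variant in a list, rather than overwriting earlier entries, preserves alternate pronunciations for later use, for example when building a recognition lexicon.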

Machine Translation

Berkeley Aligner <https://code.google.com/p/berkeleyaligner/>_
“…a package for word alignment that incorporates cutting-edge advancements in unsupervised word alignment techniques.”

cdec <https://github.com/redpony/cdec>_
A decoder, aligner, and model optimizer for statistical machine translation and other structured prediction models based mostly on context-free formalisms.

Jane <http://www-i6.informatik.rwth-aachen.de/jane/>_
Jane is RWTH's open-source statistical machine translation toolkit. It implements state-of-the-art methods for both phrase-based and hierarchical phrase-based translation.

Joshua <http://joshua-decoder.org/>_
A hierarchical and syntax-based machine translation decoder written in Java.

Moses <http://www.statmt.org/moses/>_ A well-known open-source toolkit for machine translation.

alignment-with-openfst <https://github.com/ldmt-muri/alignment-with-openfst>_ An alignment system built on OpenFst.

zmert <http://cs.jhu.edu/~ozaidan/zmert/>_ An open-source Java implementation of minimum error rate training (MERT) by Omar F. Zaidan <http://www.cs.jhu.edu/~ozaidan/>_.

Machine Learning

BIDData <https://github.com/BIDData>_ BIDMat is designed for supporting large-scale exploratory data analysis. Its companion library, BIDMach, provides a machine learning interface.

libFM: Factorization Machine Library <http://libfm.org/>_

Sofia-ML is a tool developed by Google researchers that implements efficient incremental methods applicable to classification, regression, and ranking tasks.

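Spearmint A package for performing Bayesian optimization. It implements the algorithms from the paper "Practical Bayesian Optimization of Machine Learning Algorithms" by Jasper Snoek, Hugo Larochelle, and Ryan P. Adams (Advances in Neural Information Processing Systems, 2012).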

Deep Learning

convnet-benchmarks <https://github.com/soumith/convnet-benchmarks>_ Benchmarks of various convolutional network implementations.

Caffe [link] A deep learning framework with cuDNN support and a choice of backends.

cuDNN <https://developer.nvidia.com/cudnn>_ Nvidia's GPU library of deep learning primitives, described in this paper <http://arxiv.org/pdf/1410.0759.pdf>_. Torch 7 support is available here <https://github.com/soumith/cudnn.torch>_, and Python wrappers are available here <https://github.com/hannes-brt/cudnn-python-wrappers>_.

CURRENNT <http://sourceforge.net/projects/currennt/>_ The Munich open-source CUDA recurrent neural network toolkit, described in this paper <http://www.mmk.ei.tum.de/publ/pdf/14/14wen7.pdf>_.

gensim Radim Rehurek's Python topic modeling toolkit, which includes an implementation of word2vec. It is easy to install and straightforward to use.

GloVe <http://www.socher.org/index.php/Main/GloveGlobalVectorsForWordRepresentation>_ Global vectors for word representation.

GroundHog <https://github.com/lisa-groundhog/GroundHog>_ A neural network based machine translation toolkit.

KALDI LSTM A C++ implementation of long short-term memory (LSTM) networks in the Kaldi framework, for tasks such as automatic speech recognition and language modeling.

OxLM <https://github.com/pauldb89/OxLM>_ Oxford University's neural language modeling toolkit, described in the paper by Baltescu, Blunsom, and Hoang <https://ufal.mff.cuni.cz/pbml/102/art-baltescu-blunsom-hoang.pdf>_.

Neural Probabilistic Language Model Toolkit (NPLM) <http://nlg.isi.edu/software/nplm/>_ A toolkit for training and using neural language models in the style of Bengio (2003). It is fast even with large vocabularies (a million words or more): a model can be trained on large amounts of text in about a week and queried in about 40 microseconds, which makes it usable inside a machine translation decoder.

RNNLM2WFST A toolkit, hosted on GitHub, for converting recurrent neural network language models (RNNLMs) to weighted finite-state transducers (WFSTs).

ViennaCL An open-source library for computations on many-core architectures (GPUs, MIC) and multi-core CPUs.

Natural Language Processing

BLLIP reranking parser <https://github.com/BLLIP/bllip-parser>_ A statistical natural language parser consisting of a generative constituent parser (first stage) and a discriminative maximum entropy reranker (second stage).

Apache OpenNLP <http://opennlp.apache.org/>_ A machine learning based toolkit for natural language processing tasks.

SEAL <https://github.com/TeamCohen/SEAL>_ A set expander for any language, described in this paper <http://www.cs.cmu.edu/~wcohen/postscript/icdm-2007.pdf>_.

Stanford CoreNLP <http://nlp.stanford.edu/software/corenlp.shtml>_ A suite of Java-based natural language analysis tools.

Applications

Cloud ASR using PyKaldi <https://github.com/UFAL-DSG/cloud-asr>_ A software platform and online service for automatic speech recognition, built on PyKaldi.

Other Tools

GraphViz.sty <https://github.com/mprentice/GraphViz-sty>_
A handy LaTeX package for embedding the dot language directly in LaTeX documents. It is useful for fine-tuning the small, colorful weighted finite-state transducer (WFST) figures used in papers and presentations.

Blogs

- Between 1 and 0_ by William Hartmann
- CMU Sphinx blog_ CMU Sphinx
- Language Log_
- Natural Language Processing and Text Analytics blog_
- Natural Language Processing Blog_ by Hal Daumé III
- Spoken Language Processing blog "Some thoughts on Spoken Language Processing, with tangents on Natural Language Processing, Machine Learning, and Signal Processing thrown in for good measure."

Books

Deep Learning: Methods and Applications <http://research.microsoft.com/pubs/209355/DeepLearning-NowPublishing-Vol7-SIG-039.pdf>_ By Li Deng and Dong Yu

Foundations of Data Science <http://www.cs.cornell.edu/jeh/NOSOLUTIONS90413.pdf>_ Draft by John Hopcroft and Ravindran Kannan

Introduction to Matrix Methods and Applications <http://stanford.edu/class/ee103/mma.pdf>_ (Working title) By S. Boyd and L. Vandenberghe
