【PaperReading】AN ATTENTION-BASED NEURAL NETWORK APPROACH FOR SINGLE CHANNEL SPEECH ENHANCEMENT
A novel attention-based neural network architecture is proposed for single-channel speech enhancement.
What’s the main claim? Key idea?
The paper applies an attention mechanism within LSTM-RNNs to speech enhancement. Taking the noisy spectrum as input, the model consists of a bidirectional LSTM encoder, an attention module, and a speech generation component, and outputs the enhanced spectrum.
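The core of the attention module can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: it assumes simple dot-product scoring between the generation component's state (the query) and the BiLSTM encoder outputs, whereas the paper may use a learned scorer; the function names and dimensions are hypothetical.

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_context(enc_outputs, query):
    # enc_outputs: list of T hidden vectors (each of length H) from the BiLSTM encoder
    # query: current state of the generation component (length H)
    # dot-product scoring is an assumption made for this sketch
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in enc_outputs]
    weights = softmax(scores)  # attention weights over the T input frames
    H = len(query)
    # context vector: attention-weighted sum of encoder outputs
    context = [sum(w * h[d] for w, h in zip(weights, enc_outputs)) for d in range(H)]
    return context, weights
```

The weights form a distribution over input frames, which is what lets the model "focus" on speech-dominated regions of the noisy spectrum.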
Is there code available? Data?
No code
Data: clean speech (about 24.5 hours) randomly selected from a multi-speaker speech corpus, plus noise files taken from the Musan corpus.
Is the idea neat? Is it counter-intuitive?
Using attention for speech enhancement is intuitive: humans can focus on the important speech components in an audio stream while ignoring irrelevant parts such as noise and interference, and can shift that focus dynamically over time.
Is the experimentation good? Manual tuning?
Compared against OM-LSA and an LSTM baseline without the attention mechanism.
Loss function: mean square error (MSE).
Initial learning rate: 0.0005.
Evaluation criteria: PESQ and STOI.
Is it useful to my work e.g. product dev?
I have recently been trying to use attention mechanisms for speech enhancement, but applying them directly gave poor results, so I am exploring more promising alternatives.
