Single-channel speech enhancement via time-frequency dictionary learning

HUANG Jianjun; ZHANG Xiongwei; ZHANG Yafei; ZOU Xia

doi:10.15949/j.cnki.0371-0025.2012.05.010

Abstract

Earlier speech enhancement algorithms degrade sharply in non-stationary noise. To address this, a new single-channel speech enhancement algorithm based on time-frequency dictionary learning is proposed. First, time-frequency dictionary learning is used to model prior information about the spectral structure of the noise, and this model is incorporated into the convolutive nonnegative matrix factorization framework. Then, with the noise time-frequency dictionary held fixed, multiplicative iterative update rules are derived for the time-varying gains and the speech time-frequency dictionary. Finally, these rules are used to update the time-varying gain coefficients of both speech and noise, together with the speech time-frequency dictionary. The magnitude spectrum of the speech is reconstructed by convolving the learned speech time-frequency dictionary with its time-varying gains, and binary time-frequency masking is applied to remove noise interference. Experimental results show that the proposed algorithm achieves better results on several speech quality measures. In non-stationary noise and at low signal-to-noise ratios, it removes noise more effectively than multiband spectral subtraction and the non-negative sparse coding based denoising algorithm, and the enhanced speech is of better quality.
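The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' derivation. It assumes a Kullback-Leibler cost with the standard multiplicative updates for convolutive nonnegative matrix factorization, a noise dictionary `W_noise` learned beforehand from noise-only data, and a binary mask that keeps a time-frequency bin when the speech estimate exceeds the noise estimate. The names `shift`, `reconstruct`, `enhance`, `n_speech`, and `n_iter` are hypothetical and chosen only for illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def shift(X, t):
    """Shift the columns of X right by t (left if t < 0), zero-filling."""
    out = np.zeros_like(X)
    if t == 0:
        out[:] = X
    elif t > 0:
        out[:, t:] = X[:, :-t]
    else:
        out[:, :t] = X[:, -t:]
    return out


def reconstruct(W, H):
    """Convolutive model: sum over lags t of W[t] @ shift(H, t).

    W has shape (T, F, R): T lags, F frequency bins, R dictionary atoms.
    H has shape (R, N): time-varying gains over N frames.
    """
    return sum(W[t] @ shift(H, t) for t in range(W.shape[0]))


def enhance(V, W_noise, n_speech=4, n_iter=100, seed=0):
    """Factor the noisy magnitude spectrogram V with W_noise held fixed.

    Assumed cost: KL divergence (an assumption, not taken from the paper).
    Returns the masked spectrogram, the binary mask, and the speech estimate.
    """
    rng = np.random.default_rng(seed)
    T, F, _ = W_noise.shape
    N = V.shape[1]
    W_speech = rng.random((T, F, n_speech)) + EPS
    H = rng.random((n_speech + W_noise.shape[2], N)) + EPS
    ones = np.ones_like(V)

    for _ in range(n_iter):
        W = np.concatenate([W_speech, W_noise], axis=2)

        # Update the speech and noise gains jointly across all lags.
        ratio = V / (reconstruct(W, H) + EPS)
        num = sum(W[t].T @ shift(ratio, -t) for t in range(T))
        den = sum(W[t].T @ shift(ones, -t) for t in range(T))
        H *= num / (den + EPS)

        # Update only the speech dictionary; the noise dictionary stays fixed.
        ratio = V / (reconstruct(W, H) + EPS)
        for t in range(T):
            Hs = shift(H[:n_speech], t)
            W_speech[t] *= (ratio @ Hs.T) / (ones @ Hs.T + EPS)

    # Reconstruct speech and noise magnitudes by convolution with their gains.
    S_hat = reconstruct(W_speech, H[:n_speech])
    N_hat = reconstruct(W_noise, H[n_speech:])

    # Assumed masking rule: keep bins dominated by the speech estimate.
    mask = S_hat > N_hat
    return mask * V, mask, S_hat
```

In this sketch, `V` would be the magnitude of a short-time Fourier transform of the noisy signal, and the masked magnitude would be combined with the noisy phase before inversion. Both steps are left out here, and the paper's actual cost function, sparsity terms, and masking threshold may differ.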
