采用性别相关的深度神经网络及非负矩阵分解模型用于单通道语音增强

Single-channel speech enhancement based on gender-related deep neural networks and non-negative matrix factorization models

摘要: 为了从带噪信号中得到纯净的语音信号,提出了一种采用性别相关模型的单通道语音增强算法。具体而言,在训练阶段,分别训练了与性别相关的深度神经网络-非负矩阵分解模型用于估计非负矩阵分解中的权重参数;在测试阶段,提出了一种基于非负矩阵分解和组稀疏惩罚的算法用于判断测试语音中说话人的性别信息,然后再采用对应的模型估计权重,并结合已训练好的字典进行语音增强。实验结果表明所提算法在噪声抑制量及语音质量上,均优于一些基于非负矩阵分解的算法和基于深度神经网络的算法。

Abstract: In order to obtain the clean speech from the noisy signal, a single-channel speech enhancement algorithm based on gender-related models is proposed. Specifically, in the training stage, Deep Neural Networks(DNN) and Nonnegative Matrix Factorization(NMF) are employed to train two gender-related DNN-NMF models using the genderspecific training data. In the test stage, an algorithm based on NMF and group sparsity penalty is proposed to identify the gender information of the speaker in the test signal. Then the corresponding DNN-NMF model is used to estimate the activations for speech enhancement. Experimental results show that the proposed algorithm performs better in suppressing the noises without decreasing the speech quality compared with other NMF-based and DNN-based methods.