Direction-guided speech enhancement method of the spatial mixture model
Abstract: A speech enhancement algorithm is proposed in which the direction of arrival (DOA) is used to initialize a spatial mixture probabilistic model. The DOA of the source is first estimated by sound source localization; the relative transfer function (RTF) is then computed from the DOA and used to construct the spatial covariance matrix that initializes the model. It is shown that when the RTF is the principal eigenvector of the speech covariance matrix among the model parameters, the probability distribution of the spatial mixture probabilistic model attains its maximum, which makes the expectation-maximization (EM) algorithm converge more readily to the desired mask values. Experiments were carried out on a self-built simulated two-channel dataset and on the two-channel data of the CHiME-4 dataset. Compared with the same system without DOA information, introducing the DOA into speech enhancement reduces the word error rate (WER) of the speech recognition system by up to 3.79% and raises the signal-to-distortion ratio (SDR) by up to 2.00 dB, confirming that the spatial mixture probabilistic model enhances speech more effectively when combined with the DOA.
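The initialization chain described in the abstract (DOA, then relative transfer function, then spatial covariance matrix) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Everything specific here is an assumption made for illustration: a far-field plane-wave model, a two-microphone linear array with 5 cm spacing, a DOA measured from the array axis, a 343 m/s speed of sound, a rank-1 covariance with diagonal loading `eps`, and the hypothetical function names `rtf_from_doa` and `init_spatial_covariance`. The abstract does not state the array geometry, the DOA estimator, or the exact form of the mixture model.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the paper


def rtf_from_doa(doa_deg, mic_positions, freqs, c=SPEED_OF_SOUND):
    """Far-field relative transfer function (RTF) with microphone 0 as reference.

    doa_deg       : DOA in degrees, measured from the array axis (assumed convention)
    mic_positions : (M,) microphone coordinates along the array axis, in metres
    freqs         : (F,) frequency-bin centres in Hz
    returns       : (F, M) complex array whose first column is all ones
    """
    mic_positions = np.asarray(mic_positions, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    # Arrival delay of a plane wave at each microphone relative to microphone 0.
    offsets = mic_positions - mic_positions[0]
    delays = -offsets * np.cos(np.deg2rad(doa_deg)) / c
    return np.exp(-2j * np.pi * freqs[:, None] * delays[None, :])


def init_spatial_covariance(rtf, eps=1e-2):
    """Rank-1 spatial covariance d d^H per frequency, plus eps * I (assumed loading)."""
    num_mics = rtf.shape[1]
    outer = rtf[:, :, None] * rtf[:, None, :].conj()
    return outer + eps * np.eye(num_mics)[None, :, :]


# Hypothetical setup: 16 kHz sampling, 512-point STFT, source at 60 degrees.
freqs = np.fft.rfftfreq(512, d=1.0 / 16000)
d = rtf_from_doa(60.0, [0.0, 0.05], freqs)
R = init_spatial_covariance(d)

# By construction, the principal eigenvector of each R[f] is parallel to d[f].
eigvals, eigvecs = np.linalg.eigh(R[10])  # eigenvalues in ascending order
principal = eigvecs[:, -1]
alignment = np.abs(np.vdot(principal, d[10])) / np.linalg.norm(d[10])
```

In this sketch `alignment` measures how closely the principal eigenvector of the initial covariance matches the RTF, which is the condition the abstract identifies as maximizing the model's probability distribution. In a full system, `R` would serve as the starting value of the speech component's spatial covariance before EM iterations refine it.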