Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

Effect of ideal ratio mask using different early and late reverberation partition methods on speech recognition performance

  • Abstract: In real-world environments, noise and reverberation degrade the performance of speech recognition systems. Reverberation in an enclosed space comprises three parts: the direct sound, early reflections, and late reverberation, each of which affects speech recognition systems differently. We study different methods of partitioning early reflections and late reverberation. Taking the early reflections as the target speech, we compute the corresponding ideal ratio masks and evaluate their effect on speech recognition performance. On this basis, we estimate the ideal ratio masks with a bidirectional long short-term memory network (BLSTM) and test their effect on speech recognition performance. The experimental results show that, with Abel's method for partitioning early reflections and late reverberation, the ideal ratio mask reduces the word error rate by about 2.8%. The BLSTM-based estimator underestimates the ideal ratio mask and fails to effectively improve the performance of the speech recognition system.
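The idea of treating early reflections as the target and late reverberation as interference can be illustrated with a minimal sketch. It is not the authors' implementation: the fixed 50 ms boundary stands in for the partition methods compared in the paper (Abel's method is not reproduced here), the square-root power-ratio form of the mask is one common definition and is assumed, additive noise is left out, and all function names, the synthetic impulse response, and the STFT settings are hypothetical.

```python
import numpy as np

def split_rir(rir, fs, boundary_ms=50.0):
    """Split a room impulse response into early and late parts.

    The boundary is placed a fixed time after the direct-sound peak.
    The 50 ms default is an assumed placeholder, not a value from the paper.
    """
    peak = int(np.argmax(np.abs(rir)))
    cut = peak + int(round(boundary_ms * 1e-3 * fs))
    early = rir.copy()
    early[cut:] = 0.0   # keep direct sound and early reflections
    late = rir.copy()
    late[:cut] = 0.0    # keep only the late tail
    return early, late

def stft_mag(x, n_fft=256, hop=128):
    """Magnitude spectrogram (frames x bins) using a Hann window."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def ideal_ratio_mask(target_mag, interference_mag, eps=1e-12):
    """Assumed IRM definition: sqrt(target power / (target + interference power))."""
    t2 = target_mag ** 2
    i2 = interference_mag ** 2
    return np.sqrt(t2 / (t2 + i2 + eps))

# Synthetic stand-ins: white noise for dry speech, decaying noise for the RIR.
rng = np.random.default_rng(0)
fs = 16000
speech = rng.standard_normal(fs)
t = np.arange(int(0.3 * fs)) / fs
rir = 0.1 * rng.standard_normal(t.size) * np.exp(-t / 0.05)
rir[0] = 1.0  # direct sound

early_rir, late_rir = split_rir(rir, fs)
early_sig = np.convolve(speech, early_rir)[:len(speech)]  # target component
late_sig = np.convolve(speech, late_rir)[:len(speech)]    # interference component

irm = ideal_ratio_mask(stft_mag(early_sig), stft_mag(late_sig))
```

Moving `boundary_ms` changes how much energy counts as target, which is the kind of partition choice the abstract says affects the resulting mask. In a recognition pipeline the mask would be applied to the reverberant spectrogram before feature extraction.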

     
