Noisy Chinese digit string speech recognition based on tone modeling
Abstract: This paper investigates the use of tone information to improve Chinese digit string speech recognition in noise. To handle the discontinuity of tone features, a hidden Markov model based on multi-space probability distributions (MSD-HMM) is used for tone modeling. The effect of noise on tone feature extraction is briefly analyzed, and the feasibility of exploiting tone information in noisy digit string recognition is argued. Experiments show that, compared with a method that uses no tone information, the proposed method reduces the digit error rate by an average of 17.2% (relative) on test data with SNRs from 5 dB to 20 dB. These results indicate that tone information and the proposed MSD-HMM tone model are effective for improving noisy Chinese digit string speech recognition.
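To illustrate how a multi-space probability distribution can score a discontinuous tone feature, the following is a minimal sketch, not the authors' implementation. It assumes the tone feature is one scalar per frame (for example a log-F0 value) that is undefined in unvoiced frames, and that each HMM state has one zero-dimensional "unvoiced" space and one one-dimensional Gaussian "voiced" space. The class name `MsdState`, its fields, and the numeric values are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class MsdState:
    """One HMM state with a two-space MSD output distribution (illustrative)."""
    w_voiced: float  # weight of the 1-D continuous (voiced) space, in [0, 1]
    mean: float      # Gaussian mean of the voiced space
    var: float       # Gaussian variance of the voiced space (must be > 0)

    @property
    def w_unvoiced(self) -> float:
        # The space weights sum to one, so the 0-D space takes the remainder.
        return 1.0 - self.w_voiced

    def emission_prob(self, obs: Optional[float]) -> float:
        """Return b(o) for one frame; obs=None marks a frame with no tone value."""
        if obs is None:
            # The zero-dimensional space contributes only its weight.
            return self.w_unvoiced
        # Voiced frame: space weight times a 1-D Gaussian density.
        density = math.exp(-0.5 * (obs - self.mean) ** 2 / self.var)
        density /= math.sqrt(2.0 * math.pi * self.var)
        return self.w_voiced * density


# Hypothetical parameter values, chosen only for illustration.
state = MsdState(w_voiced=0.9, mean=5.0, var=0.04)
p_unvoiced = state.emission_prob(None)  # frame without a tone value
p_voiced = state.emission_prob(5.0)     # frame whose value equals the mean
```

Because voiced and unvoiced frames are scored by the same state distribution, the tone stream needs no interpolation across unvoiced regions, which is the property the abstract relies on when it cites discontinuous tone features.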