A speaking rate adaptation algorithm for Putonghua continuous speech recognition

Abstract: In continuous speech, speaking rate varies widely across speakers and across speaking contexts. Deviation from the normal speaking rate often causes recognition errors and degrades the performance of LVCSR (Large Vocabulary Continuous Speech Recognition) systems. Because a change in speaking rate lengthens or shortens the durations of speech units synchronously, the durations of neighboring speech units are strongly correlated. To exploit this correlation, a speaking rate adaptation algorithm is proposed within the framework of the DDBHMM (Duration Distribution Based HMM). Experiments on connected digit strings and on large vocabulary continuous speech recognition show that the algorithm is effective.