Tone model integration based on discriminative weight training for Putonghua speech recognition
Abstract: A discriminative method is proposed for integrating tone information into a large vocabulary continuous speech recognition system. Model-dependent probability weights are trained discriminatively under the minimum phone error criterion. These weights scale the probabilities of the conventional hidden Markov models based on spectral features and of the tone models based on tonal features, so that recognition accuracy improves by adjusting how much each model contributes. The weight update equations are derived using the extended Baum-Welch algorithm. Several schemes for combining model weights are evaluated, and smoothing between weights is introduced to counter overfitting in weight training. The method is evaluated on two large vocabulary continuous speech recognition tasks: tonal syllable output and Chinese character output. Experimental results show that, compared with global model weights, the discriminatively trained model weights yield relative error rate reductions of 9.5% and 4.7% on the two tasks, respectively, which is attributed to a better interpolation of the given models. This demonstrates that discriminatively trained model weights are effective for tone model integration.
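The weighting idea summarized in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: that the weights act as exponents on the probabilities (that is, as multipliers on log-probabilities), and that smoothing is a linear interpolation between a model-dependent weight and the global weight. The function names, the parameter `tau`, and the numeric values are hypothetical and are not taken from the paper.

```python
def combined_log_score(log_p_hmm, log_p_tone, w_hmm, w_tone):
    """Weighted sum of HMM and tone-model log-probabilities.

    Assumption: weights scale log-probabilities (log-linear combination).
    """
    return w_hmm * log_p_hmm + w_tone * log_p_tone


def smooth_weight(model_weight, global_weight, tau):
    """Interpolate a model-dependent weight toward the global weight.

    Assumption: linear interpolation with tau in [0, 1];
    tau = 0 keeps the model-dependent weight, tau = 1 falls back
    to the global weight.
    """
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return (1.0 - tau) * model_weight + tau * global_weight


# Hypothetical values, for illustration only.
tone_weight = smooth_weight(model_weight=2.0, global_weight=1.0, tau=0.25)
score = combined_log_score(log_p_hmm=-10.0, log_p_tone=-2.0,
                           w_hmm=1.0, w_tone=tone_weight)
print(tone_weight, score)
```

In this sketch a larger `tau` pulls sparsely trained model-dependent weights back toward the global weight, which is one plausible way to limit overfitting; the smoothing actually used in the paper may differ.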