Triphone Models for Mandarin Speech Recognition Based on Decision Trees
Abstract: Context-dependent acoustic models based on decision trees have been studied and applied extensively in Western-language speech recognition, English in particular. In Mandarin speech recognition, by contrast, diphone models have been more popular and triphones have received little attention. This paper uses decision trees to build context-dependent Mandarin models, namely triphone models, and discusses the key problems that decision-tree modeling must solve: (1) choosing the basic modeling unit set, (2) designing the phone class (question) set, (3) choosing the evaluation function, (4) choosing the stopping criterion, and (5) building the decision trees and generating the triphone models. The paper focuses on comparing the performance of two different modeling unit sets, proposes some general guidelines for designing the phone class set together with a statistical analysis of the class set we designed, and analyzes how well the triphone models cover the speech corpus. Experiments show that a continuous speech recognition system built on decision-tree-based triphone acoustic models reduces the error rate by 24.7% compared with a system built on diphone acoustic models.
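The abstract does not say which evaluation function or stopping criterion the paper adopts, so the following is only a minimal sketch of how problems (2) through (5) typically fit together. It assumes a commonly used choice: the log-likelihood gain of splitting a pooled single-Gaussian cluster, with splitting stopped when the gain or the occupancy falls below a threshold. The one-dimensional features, the phone classes `L_Nasal`, `L_Bilabial` and `R_Nasal`, the thresholds, and all numbers are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Stats:
    """Sufficient statistics of 1-D frames seen for one triphone (toy assumption)."""
    n: float = 0.0   # occupancy (frame count)
    s: float = 0.0   # sum of values
    ss: float = 0.0  # sum of squared values

    def __add__(self, other):
        return Stats(self.n + other.n, self.s + other.s, self.ss + other.ss)


def stats_from(samples):
    return Stats(len(samples), sum(samples), sum(x * x for x in samples))


def log_likelihood(st, var_floor=1e-4):
    """Log-likelihood of the pooled data under its own ML Gaussian."""
    if st.n <= 0:
        return 0.0
    mean = st.s / st.n
    var = max(st.ss / st.n - mean * mean, var_floor)
    return -0.5 * st.n * (math.log(2 * math.pi * var) + 1.0)


def pool(triphones, stats):
    total = Stats()
    for t in triphones:
        total = total + stats[t]
    return total


def best_split(triphones, stats, questions, min_occ):
    """Evaluation function: pick the question with the largest likelihood gain."""
    parent_ll = log_likelihood(pool(triphones, stats))
    best = None
    for name, (side, phones) in questions.items():
        idx = 0 if side == "L" else 2  # triphone = (left, center, right)
        yes = [t for t in triphones if t[idx] in phones]
        no = [t for t in triphones if t[idx] not in phones]
        if not yes or not no:
            continue
        sy, sn = pool(yes, stats), pool(no, stats)
        if sy.n < min_occ or sn.n < min_occ:
            continue  # occupancy part of the stopping criterion
        gain = log_likelihood(sy) + log_likelihood(sn) - parent_ll
        if best is None or gain > best[0]:
            best = (gain, name, yes, no)
    return best


def build_tree(triphones, stats, questions, min_gain, min_occ):
    """Greedy top-down splitting; each leaf is one tied cluster of triphones."""
    split = best_split(triphones, stats, questions, min_occ)
    if split is None or split[0] < min_gain:  # gain part of the stopping criterion
        return {"leaf": sorted(triphones)}
    _, name, yes, no = split
    return {
        "question": name,
        "yes": build_tree(yes, stats, questions, min_gain, min_occ),
        "no": build_tree(no, stats, questions, min_gain, min_occ),
    }


# Hypothetical question set: (context side, phone class).
questions = {
    "L_Nasal": ("L", {"m", "n"}),
    "L_Bilabial": ("L", {"m", "b"}),
    "R_Nasal": ("R", {"n"}),
}

# Toy statistics: nasal left contexts cluster near 0, stop left contexts near 5.
stats = {
    ("m", "a", "n"): stats_from([0.0, 0.2, -0.1, 0.1] * 5),
    ("n", "a", "n"): stats_from([0.1, -0.2, 0.0, 0.3] * 5),
    ("b", "a", "n"): stats_from([5.0, 5.2, 4.9, 5.1] * 5),
    ("d", "a", "n"): stats_from([5.1, 4.8, 5.0, 5.3] * 5),
}

tree = build_tree(list(stats), stats, questions, min_gain=10.0, min_occ=10.0)
print(tree)
```

In this toy setup the root is split by the left-nasal question, and both children become leaves because no further question clears the gain threshold. A triphone that never occurs in training can still be assigned a model by answering the same questions down the tree, which is the usual reason for preferring tree-based tying when corpus coverage of triphones is incomplete.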