Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

Study on the Integration of Speech and Language Processing in Recognition of Chinese Continuous Speech

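  • Summary: This paper proposes a Chinese continuous speech recognition method in which speech processing and language processing are integrated frame-synchronously. The method incorporates language processing, based on a CFG language model and a top-down parser, into the One Pass Viterbi speech recognition algorithm controlled by a finite state automaton, so that speech and language processing are integrated frame by frame. To combine the word prediction of frame-synchronous parsing with the speech recognition process, the paper proposes a top-down parsing method similar to Earley's algorithm, together with a method for dynamically expanding and building the finite state automaton within the One Pass Viterbi algorithm. HMMs of 60 phoneme units and 8 tone units served as the basic recognition models in the experiments. The results show that, for a recognition system on a task with a perplexity of 27.3, the proposed method reaches an average recognition rate of 94.4% on 1070 sentences uttered by 10 speakers, about 8% higher than the conventional hierarchical speech-language integration approach based on word spotting followed by parsing of the resulting word lattice.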

     

    Abstract: This paper presents a method of Chinese continuous speech recognition that integrates speech and language processing through a frame-synchronous parsing algorithm. The language processing involved employs a context-free grammar and a top-down parser. The speech processing involved uses the One Pass Viterbi algorithm controlled by a finite state automaton. In the evaluation experiments, 60 phoneme HMMs and 8 tone HMMs were used. With the proposed algorithm, an average sentence recognition rate of 94.4% was obtained for 1070 utterances by ten speakers on a task of perplexity 27.3, an improvement of about 8% over a conventional hierarchical system based on word spotting and lattice parsing.
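
To illustrate how top-down parsing with a context-free grammar can predict which words may come next, the sketch below uses a standard textbook Earley recognizer. It is not the parser proposed in the paper, it does not model the frame-synchronous coupling with the Viterbi search, and the toy grammar, the function names (`close`, `predict_next_words`) and the vocabulary are hypothetical. The grammar is assumed to contain no empty productions.

```python
from collections import namedtuple

# An Earley item: rule lhs -> rhs, dot position, and the chart index where it started.
Item = namedtuple("Item", "lhs rhs dot origin")

# Hypothetical toy grammar (assumption, not from the paper); no empty productions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["i"], ["you"]],
    "VP": [["V", "OBJ"]],
    "V": [["want"], ["buy"]],
    "OBJ": [["tea"], ["a", "ticket"]],
}

def next_symbol(item):
    """Symbol right after the dot, or None if the item is complete."""
    return item.rhs[item.dot] if item.dot < len(item.rhs) else None

def close(chart, k, grammar):
    """Apply predictor and completer steps to chart[k] until it stops growing."""
    agenda = list(chart[k])
    while agenda:
        item = agenda.pop()
        sym = next_symbol(item)
        if sym is None:
            # Completer: advance every item that was waiting for item.lhs.
            new = [Item(p.lhs, p.rhs, p.dot + 1, p.origin)
                   for p in list(chart[item.origin]) if next_symbol(p) == item.lhs]
        elif sym in grammar:
            # Predictor: expand the nonterminal top-down.
            new = [Item(sym, tuple(rhs), 0, k) for rhs in grammar[sym]]
        else:
            new = []  # terminal after the dot: left for the scanner
        for n in new:
            if n not in chart[k]:
                chart[k].add(n)
                agenda.append(n)

def predict_next_words(grammar, start, prefix):
    """Return the set of terminals the grammar allows after the given word prefix."""
    chart = [set() for _ in range(len(prefix) + 1)]
    chart[0] = {Item(start, tuple(rhs), 0, 0) for rhs in grammar[start]}
    close(chart, 0, grammar)
    for k, word in enumerate(prefix):
        # Scanner: move the dot over the recognized word.
        chart[k + 1] = {Item(i.lhs, i.rhs, i.dot + 1, i.origin)
                        for i in chart[k] if next_symbol(i) == word}
        close(chart, k + 1, grammar)
    symbols = {next_symbol(i) for i in chart[-1]}
    return {s for s in symbols if s is not None and s not in grammar}

if __name__ == "__main__":
    print(sorted(predict_next_words(GRAMMAR, "S", ["i", "want"])))
```

In a recognizer of the kind the abstract describes, a predicted word set like this would restrict which word models the acoustic search is allowed to extend; how the paper realizes that restriction through dynamic expansion of the finite state automaton is not shown here.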

     
