Abstract:
In this paper, we define initial-final-like units for Chinese speech recognition, based on an analysis of the pronunciations of Chinese syllables, and determine the acoustic units used in a Chinese speech recognition system with whole syllables. We examine the detection consistency of these units and the robustness of speech recognition systems constructed with them. By applying these units in a speaker-independent Chinese speech recognition system, we show that they possess very good detection consistency and that the recognition system built on them is very robust. Using a large number of utterances, we also compile statistics on the lengths of initial-like units, which are obtained with the segmentation method given in the paper. Based on these statistics, we present a pre-selection algorithm that can reduce the computation by up to half with almost 100 percent pre-selection accuracy.