A PRELIMINARY STUDY OF SYNTHETIC CHINESE SPEECH WITH UNLIMITED VOCABULARY
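Summary: This paper presents a simulation system that synthesizes an unlimited Chinese vocabulary in the frequency domain, using diphones as its sound elements. It has the following features:
1) Chinese pinyin text can be entered directly from the keyboard, with no separate phonetic transcription step;
2) Prosody (including duration, amplitude, intonation, and pauses) is easy to control; in particular, the intonation contour of Chinese can be controlled automatically according to the duration of each character's pronunciation;
3) Seven parallel "time-varying digital filters" serve as the "terminal analog" of the vocal tract. During digital filtering, the impulse response of the vocal tract is computed directly for voiced sounds; for unvoiced sounds, a modulation process replaces the filtering;
4) Precomputed data tables shorten the synthesis time.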
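Feature 3 can be illustrated with a minimal Python sketch. It is one possible reading of the description, not the authors' implementation. For voiced sound, each parallel branch is treated as a two-pole resonator whose impulse response is written in closed form as a damped sinusoid, and the summed response is repeated at every pitch pulse. For unvoiced sound, random noise is multiplied by a sinusoidal carrier in place of filtering. The sampling rate, the formant values, the function names, and the use of three branches instead of seven are all assumptions made for brevity.

```python
import math
import random

FS = 10_000  # sampling rate in Hz (assumed; not stated in the abstract)

# Illustrative (frequency Hz, bandwidth Hz, amplitude) triples; hypothetical values.
# The paper uses seven parallel branches; three are shown here for brevity.
FORMANTS = [(700.0, 80.0, 1.0), (1200.0, 90.0, 0.5), (2600.0, 120.0, 0.25)]


def resonator_impulse_response(freq_hz, bandwidth_hz, amplitude, n_samples, fs=FS):
    """Closed-form impulse response of one resonator: a damped sinusoid."""
    return [
        amplitude
        * math.exp(-math.pi * bandwidth_hz * n / fs)
        * math.sin(2.0 * math.pi * freq_hz * n / fs)
        for n in range(n_samples)
    ]


def voiced_frame(formants, f0_hz, n_samples, fs=FS):
    """Sum the parallel branches, then overlap-add the result at each pitch pulse."""
    period = max(1, round(fs / f0_hz))
    branches = [
        resonator_impulse_response(f, b, a, n_samples, fs) for f, b, a in formants
    ]
    vocal_tract = [sum(values) for values in zip(*branches)]
    out = [0.0] * n_samples
    for start in range(0, n_samples, period):
        for n in range(n_samples - start):
            out[start + n] += vocal_tract[n]
    return out


def unvoiced_frame(center_hz, amplitude, n_samples, fs=FS, seed=0):
    """Noise modulated by a sinusoidal carrier (assumed form of the modulation)."""
    rng = random.Random(seed)
    return [
        amplitude * rng.uniform(-1.0, 1.0) * math.sin(2.0 * math.pi * center_hz * n / fs)
        for n in range(n_samples)
    ]


voiced = voiced_frame(FORMANTS, f0_hz=120.0, n_samples=400)
unvoiced = unvoiced_frame(center_hz=4000.0, amplitude=0.3, n_samples=400)
```

Because the voiced response is evaluated from a formula and never from a recursive filter, its values could be tabulated in advance, which is in the spirit of the precomputed tables mentioned in feature 4.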
The system was used to synthesize an experimental passage of about 400 Chinese characters, which was then given as a dictation test to 10 Chinese listeners. Preliminary results show an average sentence intelligibility of 90%, with a maximum of 97%.
Abstract: This paper reports on a simulation system that synthesizes an unlimited Chinese vocabulary using diphones as sound elements, based on the principle of the formant vocoder. Chinese text spelled in pinyin can be used directly as input, without special transcription for pronunciation, and the output is standard Chinese speech (Mandarin). The paper introduces general methods for speech synthesis, such as the processing of the excitation source (voiced and unvoiced) and the terminal analog of the vocal system. It also describes an approximate method for producing the four tones of Chinese words and a segment library of about 640 diphones for synthesizing an unlimited Chinese vocabulary. Both are designed around features of the Chinese language, for example its many homophones, the assimilation of some phonemes, and the role of pitch contour in distinguishing meaning.
A text of about 400 Chinese characters was synthesized on the simulation system and given as a dictation test to 10 Chinese listeners. The average sentence intelligibility was nearly 90%, and the best score was 97%.
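The abstract mentions an approximate method for the four tones and a contour that adapts to each syllable's duration, but gives no details. The Python sketch below shows one common approximation, offered only as an illustration: each tone is reduced to a few pitch targets on the conventional five-level scale (55, 35, 214, 51) and linearly interpolated over the syllable's duration. The pitch range, the frame size, the tone-digit pinyin notation, and the function names are assumptions, not the paper's method.

```python
# Pitch targets per tone on the five-level scale (1 = lowest, 5 = highest).
CHAO_TARGETS = {1: (5, 5), 2: (3, 5), 3: (2, 1, 4), 4: (5, 1)}


def level_to_hz(level, low_hz=100.0, high_hz=200.0):
    """Map a level in 1..5 linearly onto an assumed speaker pitch range."""
    return low_hz + (level - 1) * (high_hz - low_hz) / 4.0


def tone_contour(tone, duration_ms, frame_ms=10.0, low_hz=100.0, high_hz=200.0):
    """Piecewise-linear pitch contour stretched to the syllable's duration."""
    targets = [level_to_hz(level, low_hz, high_hz) for level in CHAO_TARGETS[tone]]
    n_frames = max(2, round(duration_ms / frame_ms))
    contour = []
    for i in range(n_frames):
        # Position along the target sequence, from 0 to len(targets) - 1.
        pos = i / (n_frames - 1) * (len(targets) - 1)
        left = min(int(pos), len(targets) - 2)
        frac = pos - left
        contour.append(targets[left] + frac * (targets[left + 1] - targets[left]))
    return contour


def split_pinyin(syllable):
    """Split a tone-numbered pinyin syllable such as 'ma3' into ('ma', 3)."""
    return syllable[:-1], int(syllable[-1])


base, tone = split_pinyin("ma3")
contour = tone_contour(tone, duration_ms=300.0)
```

A longer syllable simply yields more frames along the same target shape, which is one simple way to tie the contour to duration as the summary describes.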