Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

Corpus design for a continuous speech database of Chinese (汉语连续语音数据库的语料设计)

The text design for continuous speech database of standard Chinese

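  • Abstract (translated from the Chinese): A high-quality speech recognition or speech synthesis system needs the support of a high-quality continuous speech database whose design is guided by phonetic and linguistic knowledge and is scientifically sound, concise, and effective. At the present stage, a Chinese speech database should be limited to the segmental aspects of read speech. To describe sound changes in running speech, the following speech units are considered: (1) syllables without tone (401). (2) Inter-syllabic diphones (415). (3) Inter-syllabic triphones (3035), obtained by rule-based generalization from 37 basic phones, using the results of research on formant transitions between syllables. (4) The final-initial structures of all inter-syllabic transition segments, merged by the same method as the triphones, 781 in total. To add varied prosodic structures, and with the post-processing stage of speech recognition systems in mind, the corpus also includes the 17 basic sentence types of Chinese. The raw material was taken from the 1993 and 1994 issues of “人民日报” (People's Daily), “百家报刊精选”, several television scripts, and dictionary word lists. From this material, 2185 sentences and 388 phrases were selected as the reading text; they cover 99.8% of the toneless syllables, 100% of the diphones, 99.6% of the triphones, and all 17 sentence types.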

     

    Abstract: Well-developed continuous speech recognition systems need a high-quality, scientifically designed, succinct, and valid continuous speech database. At the first stage, the database should be limited mainly to read speech. To describe the very complex variation in continuous speech, we propose the following speech units: (1) 401 syllables without tone. (2) 415 inter-syllabic diphones. (3) 3035 inter-syllabic triphones. (4) 781 inter-syllabic final-initial structures. We also include 17 sentence patterns to cover prosodic phenomena. Using an automatic method guided by the above phonetic rules, 2185 sentences and 388 phrases were collected from a large corpus (recent years of “People's Daily” and other sources) as the read text of a continuous speech recognition database of Standard Chinese. This set of sentences covers 99.8% of the syllables without tone, 100% of the inter-syllabic diphones, 99.6% of the inter-syllabic triphones, and 100% of the sentence patterns.
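The abstract does not say how the automatic selection works. As an illustration only, one common way to pick a small reading text that covers a target unit inventory is greedy set cover: repeatedly take the candidate sentence that adds the most units not yet covered. The sketch below makes several assumptions that are not from the paper: the function names are hypothetical, sentences are given as lists of toneless pinyin syllables, and an "inter-syllabic unit" is approximated by a pair of adjacent syllables rather than by the phone-level diphones and triphones the paper defines.

```python
from typing import Dict, List, Set, Tuple

Unit = Tuple[str, str]  # assumption: a unit is a pair of adjacent syllables


def inter_syllabic_units(syllables: List[str]) -> Set[Unit]:
    """Return the set of adjacent syllable pairs in one sentence."""
    return set(zip(syllables, syllables[1:]))


def greedy_select(
    candidates: Dict[str, List[str]], target: Set[Unit]
) -> Tuple[List[str], float]:
    """Greedily choose sentences until no candidate adds a new target unit.

    Returns the chosen sentence ids and the fraction of `target` covered.
    """
    units = {sid: inter_syllabic_units(s) & target for sid, s in candidates.items()}
    uncovered = set(target)
    chosen: List[str] = []
    while uncovered:
        # Sorting the ids makes tie-breaking deterministic.
        best = max(sorted(units), key=lambda sid: len(units[sid] & uncovered))
        gain = units[best] & uncovered
        if not gain:
            break  # the remaining units occur in no candidate sentence
        chosen.append(best)
        uncovered -= gain
    coverage = 1.0 if not target else 1.0 - len(uncovered) / len(target)
    return chosen, coverage


# Toy data, invented for illustration.
candidates = {
    "s1": ["ni", "hao", "ma"],
    "s2": ["ni", "hao"],
    "s3": ["hao", "ma", "ni"],
}
target = {("ni", "hao"), ("hao", "ma"), ("ma", "ni"), ("ma", "hao")}
chosen, coverage = greedy_select(candidates, target)
print(chosen, coverage)
```

In this toy run one target unit never occurs in any candidate, so coverage stays below 100%, which is the same kind of situation as the 99.6% triphone coverage reported above. A fuller version would track several inventories at once (syllables, diphones, triphones, sentence patterns) and weight them.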

     
