汉语二字调图样分析及其在合成语言中的应用
SOME STATISTICAL PATTERNS OF CHINESE TONE FOR DISYLLABLE AND ITS APPLICATION ON SYNTHESIS OF SPEECH BY RULE
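Summary: When people speak Chinese (Putonghua), the absolute pitch values vary from speaker to speaker, but the pitch contour of each tone must follow a definite pattern if it is to serve as a distinctive feature in perception. This is an important characteristic of the Chinese language itself.
A great deal of research has been done on the analysis of Chinese tone patterns. Most of it, however, is limited to qualitative analysis or to finding approximate mathematical expressions, so the earlier results are somewhat difficult to apply directly to the engineering design of speech synthesis by rule. This paper presents 15 sets of statistical results on Chinese disyllabic tone patterns. Each pattern can be described simply by a time-normalized function P(t), which is inversely proportional to the pitch period T0(t) and directly proportional to T0max, the reciprocal of the lowest fundamental frequency in the syllable.
Based on our analysis data, the paper also reports several other characteristics of Chinese disyllabic tones:
(1) The beginning and end of every tone always show transitional states, an "onset bend" and a "falling tail", which occupy about 10%-15% of the whole tone pattern. This paper uses a multidimensional array q(15,2,n) to describe these transitional states;
(2) The third tone on the first syllable of a disyllabic word does not show the rise in its final portion. Chinese linguists and phoneticians have called this phenomenon the "half third tone";
(3) The tone duration of the second syllable of a disyllabic word is slightly shorter than that of the first, about 66% of the first syllable's tone duration.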
The tone pattern P(t) described above can be used directly for synthesizing Chinese speech by rule and helps improve the naturalness of the synthesized speech.
Abstract: Although absolute pitch values differ from speaker to speaker, the pitch contours corresponding to a particular Chinese tone must share a similar pattern, which serves as a distinctive feature in speech perception. This is an important feature of the Chinese language itself.
A great deal of work has been done on the pitch patterns of Chinese tones, but most of it has been only qualitative or oversimplified, so it is somewhat difficult to apply those results directly to the design of speech synthesis by rule. This paper presents 15 new statistical results on the tone patterns of Chinese disyllables. Each pattern can be formulated simply by a time-normalized function P(t) that is directly proportional to T0max and inversely proportional to T0(t), where T0max is the reciprocal of the lowest fundamental frequency in the syllable and T0(t) is the pitch period at time t.
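As an illustration of the definition of P(t), the following minimal Python sketch computes a pattern from one syllable's pitch-period track. It assumes a proportionality constant of 1, so that P(t) = T0max / T0(t), which is the same as F0(t) divided by the lowest F0 in the syllable. The function name and the sample values are hypothetical and are not taken from the paper's data.

```python
def normalized_tone_pattern(pitch_periods_ms):
    """Return P(t) = T0max / T0(t) for one syllable's pitch-period track.

    Assumption: the proportionality constant is 1, so P(t) >= 1 everywhere
    and P(t) == 1 at the point of lowest fundamental frequency.
    """
    if not pitch_periods_ms or min(pitch_periods_ms) <= 0:
        raise ValueError("pitch periods must be a non-empty list of positive values")
    # The longest period is the reciprocal of the lowest fundamental frequency.
    t0_max = max(pitch_periods_ms)
    return [t0_max / t0 for t0 in pitch_periods_ms]


# Hypothetical falling contour: periods lengthen from 5 ms (200 Hz) to 10 ms (100 Hz).
periods = [5.0, 6.25, 8.0, 10.0]
pattern = normalized_tone_pattern(periods)
print(pattern)  # [2.0, 1.6, 1.25, 1.0]
```

Because the pattern is a ratio of periods, it carries no unit and does not depend on the speaker's absolute pitch, which is what lets one pattern stand for a tone across speakers.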
The pattern P(t) can be used directly for the synthesis of Chinese speech by rule, and it helps improve the naturalness of the synthesized speech.
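One way such a pattern might drive a rule-based synthesizer is sketched below in Python. The sketch assumes that P(t) = F0(t) / F0min, so a target contour in Hz is recovered as P(t) times a chosen lowest frequency, and it assumes linear interpolation to stretch the pattern over the syllable's frames. The function name, the pattern values, and the frame counts are hypothetical; only the 66% duration ratio for the second syllable comes from the Chinese summary.

```python
def f0_contour(pattern, f0_min_hz, n_frames):
    """Stretch a normalized pattern P over n_frames and scale it to Hz."""
    if len(pattern) < 2 or n_frames < 2:
        raise ValueError("need at least two pattern points and two frames")
    contour = []
    for i in range(n_frames):
        # Fractional index into the pattern for frame i.
        pos = i * (len(pattern) - 1) / (n_frames - 1)
        lo = int(pos)
        hi = min(lo + 1, len(pattern) - 1)
        frac = pos - lo
        # Linear interpolation between neighbouring pattern points (an assumption).
        p = pattern[lo] * (1 - frac) + pattern[hi] * frac
        contour.append(p * f0_min_hz)  # F0(t) = P(t) * F0min
    return contour


# Hypothetical disyllable: a level first tone followed by a falling second tone.
first_frames = 50
second_frames = round(first_frames * 0.66)  # second syllable is about 66% as long
first = f0_contour([1.5, 1.5, 1.5], 100.0, first_frames)
second = f0_contour([2.0, 1.5, 1.0], 100.0, second_frames)
print(len(first), len(second))
```

Using the same lowest frequency for both syllables is a simplification made here for brevity; the abstract defines T0max per syllable, so a fuller treatment would pass a separate value for each.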