A study on automatic prediction of sentential stress in natural-style Chinese speech

Abstract: Stress is an important parameter for prosody processing in speech synthesis. This paper analyzes how neutral-tone (unstressed) syllables and heavily stressed syllables differ from normally stressed syllables in their acoustic parameters, including pitch (F0), syllable duration, intensity, and pause length. It also examines the relation between duration and pitch, and between the Third Tone (T3) and pitch. Three stress prediction models based on artificial neural networks (ANN), namely an acoustic model, a linguistic model, and a mixed model, are built to automatically classify Chinese sentential stress into neutral tone, normal stress, and heavy stress. The results show that the mixed model outperforms the other two. In addition, to account for the diversity of manual stress labeling, an evaluation method based on a support ratio is proposed.