Abstract:
Stress is an important parameter for prosody processing in speech synthesis. This paper compares the acoustic features of unstressed and stressed syllables with those of normally stressed syllables, including pitch, syllable duration, intensity, and pause length. The relations between duration and pitch, and between the Third Tone (T3) and pitch, are also studied. Three stress prediction models based on artificial neural networks (ANN) are proposed to predict Chinese sentential stress: an acoustic model, a linguistic model, and a mixed model. The results show that the mixed model performs better than the other two. To address the variability of manual labeling, an additional evaluation method, the support ratio, is proposed.