Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

A parametric model for voice conversion

Abstract: Building on the source-filter model, a hybrid parametric model for voice conversion is constructed using statistical learning. It consists of three parts: a voiced acoustic model, an unvoiced acoustic model, and a prosody compensation model. The voiced acoustic model, based on linear prediction analysis and mel-cepstral analysis, characterizes the resonance properties of the speaker's vocal tract. The unvoiced acoustic model, based on linear prediction analysis and noise-source analysis, captures how the speaker produces unvoiced sounds. The prosody compensation model, trained by statistical learning, describes the distributions of pitch, energy, and duration. A voice conversion algorithm built on this hybrid model is proposed and applied to the conversion of Mandarin syllables. Experiments show that modeling voiced speech, unvoiced speech, and prosody separately improves the clarity and intelligibility of the converted speech and significantly reduces the perceptual distance between the converted and target speech. Formal listening tests also show that the converted speech carries the prosodic features of the target speaker.
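Both acoustic models in the abstract rest on linear prediction analysis, which fits an all-pole filter to a speech frame in the source-filter sense. The abstract does not say which estimation procedure is used, so the following is only a minimal sketch under the assumption of the standard autocorrelation method solved with the Levinson-Durbin recursion. The function name `lpc_autocorrelation` and the test filter are hypothetical and are not taken from the paper.

```python
import numpy as np


def lpc_autocorrelation(frame, order):
    """Estimate all-pole filter coefficients for one frame (illustrative).

    Assumption: autocorrelation method + Levinson-Durbin recursion.
    Returns (a, err), where a[0] == 1 and the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order, and err is the
    residual (excitation) energy.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation of the frame at lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])

    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or degenerate frame: stop early
        # Reflection coefficient for model order i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        # Update the lower-order coefficients, then append the new one.
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


if __name__ == "__main__":
    # Hypothetical test signal: impulse response of a known all-pole filter
    # with A(z) = 1 - 1.3 z^-1 + 0.4 z^-2 (poles at 0.8 and 0.5).
    h = np.zeros(400)
    h[0], h[1] = 1.0, 1.3
    for t in range(2, len(h)):
        h[t] = 1.3 * h[t - 1] - 0.4 * h[t - 2]
    coeffs, residual = lpc_autocorrelation(h, order=2)
    # The estimate should be close to [1, -1.3, 0.4].
    print(coeffs, residual)
```

In a source-filter reading, the coefficients `a` would describe the vocal-tract filter, while the residual would be treated as the excitation: quasi-periodic for voiced frames and noise-like for unvoiced frames. How the paper maps these parameters between speakers is not stated in the abstract.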

     
