An algorithm for voice conversion with limited corpus

谷东; 简志华

doi:10.15949/j.cnki.0371-0025.2018.05.018


An algorithm for voice conversion with limited corpus

Abstract

To address the case where only a small amount of target-speaker speech is available, this paper proposes a voice conversion algorithm based on a unified tensor dictionary. First, N speakers are randomly selected from the speech corpus as the base speakers of the tensor dictionary. Their parallel utterances are then aligned with multi-sequence dynamic time warping, which yields N two-dimensional basic dictionaries that together constitute the tensor dictionary. In the conversion stage, the speech dictionaries of the source and target speakers are each constructed as a linear combination of the N basic dictionaries, estimated from the two speakers' speech, and conversion is carried out with these dictionaries. Experimental results show that when the number of base speakers reaches 14, the algorithm needs only a very small amount of target-speaker speech to achieve conversion performance comparable to that of the traditional method based on non-negative matrix factorization (NMF), which greatly facilitates the application of voice conversion systems.
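The conversion stage described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the array shapes (a tensor dictionary of N basic dictionaries, each with F feature dimensions and K aligned atoms), and the Euclidean multiplicative update used to estimate the activations are all assumptions made for illustration. The combination weights of each speaker are taken as given here, whereas the paper estimates them from the speakers' speech.

```python
import numpy as np

def speaker_dictionary(tensor_dict, weights):
    """Combine N basic dictionaries (N, F, K) into one speaker dictionary (F, K).

    `weights` has shape (N,); in this sketch it is assumed to be known.
    """
    return np.tensordot(weights, tensor_dict, axes=1)

def estimate_activations(X, D, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations H (K, T) such that X is close to D @ H.

    The dictionary D stays fixed; only H is updated, using the standard
    multiplicative update for the Euclidean NMF objective (an assumption,
    the paper may use a different objective).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((D.shape[1], X.shape[1])) + eps
    DtX = D.T @ X
    DtD = D.T @ D
    for _ in range(n_iter):
        H *= DtX / (DtD @ H + eps)
    return H

def convert(X_src, tensor_dict, w_src, w_tgt, n_iter=200):
    """Convert source features X_src (F, T) to the target speaker.

    Activations are estimated on the source dictionary and then applied
    to the target dictionary, which shares the same aligned atoms.
    """
    D_src = speaker_dictionary(tensor_dict, w_src)
    D_tgt = speaker_dictionary(tensor_dict, w_tgt)
    H = estimate_activations(X_src, D_src, n_iter=n_iter)
    return D_tgt @ H
```

Because the N basic dictionaries are time-aligned, the k-th atom of every speaker dictionary corresponds to the same speech segment, which is what allows activations estimated on the source dictionary to be reused with the target dictionary.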
