A modified algorithm for voice conversion using compressed sensing
Abstract: This paper proposes a voice conversion algorithm that uses compressed sensing to exploit the inter-frame information in speech. The vector formed by concatenating the line spectral pair (LSP) parameters of several consecutive speech frames is sparse in the discrete cosine transform (DCT) domain. Compressed sensing is therefore used to compress this vector into a short vector, which then serves as the feature vector for training the conversion function. Experimental results show that, when an appropriate number of speech frames is chosen, the proposed algorithm improves performance by 3.21% on average over the conventional conversion algorithm based on weighted frequency warping. This indicates that making full and effective use of the correlation between frames keeps the inter-frame acoustic properties of the converted speech more stable, which in turn improves the performance of the voice conversion system.
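The feature extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frame-by-frame stacking order, the Gaussian measurement matrix, the compressed length `m = 16`, the LSP order of 10, and the synthetic frames produced by the hypothetical helper `toy_lsp_frames` are all assumptions made only for illustration. The sketch stacks several consecutive LSP vectors, measures how concentrated the stacked vector is in the DCT domain, and forms the short vector y = Phi x.

```python
import math
import random


def stack_frames(lsp_frames, start, num_frames):
    """Concatenate the LSP vectors of num_frames consecutive frames."""
    stacked = []
    for frame in lsp_frames[start:start + num_frames]:
        stacked.extend(frame)
    return stacked


def dct_ii(x):
    """Orthonormal DCT-II, written out directly (O(n^2)) for clarity."""
    n = len(x)
    coeffs = []
    for k in range(n):
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


def top_k_energy_fraction(coeffs, k):
    """Share of total energy held by the k largest-magnitude coefficients."""
    energies = sorted((c * c for c in coeffs), reverse=True)
    return sum(energies[:k]) / sum(energies)


def gaussian_measurement_matrix(m, n, seed=0):
    """Assumed measurement matrix: i.i.d. Gaussian entries with variance 1/m."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) / math.sqrt(m) for _ in range(n)] for _ in range(m)]


def compress(phi, x):
    """Compressed-sensing measurement y = Phi x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def toy_lsp_frames(num_frames, order=10):
    """Hypothetical data: slowly drifting LSP-like frames, ascending in (0, pi)."""
    frames = []
    for t in range(num_frames):
        drift = 0.02 * math.sin(0.3 * t)
        frames.append([math.pi * (i + 1) / (order + 1) + drift for i in range(order)])
    return frames


frames = toy_lsp_frames(num_frames=20)
x = stack_frames(frames, start=0, num_frames=4)    # 4 frames x 10 LSPs = 40 values
phi = gaussian_measurement_matrix(m=16, n=len(x))  # m = 16 is an arbitrary choice
y = compress(phi, x)                               # short feature vector
print(len(x), len(y), round(top_k_energy_fraction(dct_ii(x), 8), 3))
```

In this sketch the short vector `y` stands in for the compressed feature that would be passed to conversion-function training. The DCT is used only to inspect sparsity, since the measurement itself is applied to the stacked vector directly.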