Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

ZHOU Jian, LUO Xiangyu, WANG Huabin, ZHENG Wenming, TAO Liang. Diverse style oriented many-to-many emotional voice conversion[J]. ACTA ACUSTICA, 2024, 49(6): 1297-1303. DOI: 10.12395/0371-0025.2023192

Diverse style oriented many-to-many emotional voice conversion

  • Existing emotional voice conversion methods based on generative adversarial networks (GANs) suffer from insufficient emotional separation and limited diversity in emotional expression. To address both issues, this paper proposes a many-to-many emotional voice conversion method oriented toward diverse styles. The method builds on a GAN with a dual-generator structure and applies a consistency loss to the latent representations of the different generators, which keeps speech content and speaker characteristics consistent and thereby improves the similarity between the emotion of the converted speech and the target emotion. In addition, an emotion mapping network and an emotion feature encoder supply the generators with diverse emotional representations within the same emotion category. Experimental results show that the proposed method produces speech whose emotion is closer to the target emotion and whose emotional styles are more varied.
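
The consistency loss on latent representations mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formula, so the mean absolute (L1) difference used here is an assumption, and the function name `latent_consistency_loss` and the sample vectors are hypothetical placeholders rather than the authors' implementation.

```python
from typing import Sequence


def latent_consistency_loss(z_a: Sequence[float], z_b: Sequence[float]) -> float:
    """Mean absolute difference between two latent vectors.

    Assumption: an L1 penalty is used purely for illustration; the paper's
    actual loss may differ.
    """
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same length")
    if len(z_a) == 0:
        raise ValueError("latent vectors must be non-empty")
    return sum(abs(a - b) for a, b in zip(z_a, z_b)) / len(z_a)


# Hypothetical latent codes from the two generators for the same utterance.
z_gen1 = [0.2, -1.0, 0.5]
z_gen2 = [0.1, -0.8, 0.9]
loss = latent_consistency_loss(z_gen1, z_gen2)
```

Under this assumption, the loss is zero when both generators encode the utterance identically and grows as their latent codes drift apart, which is the behavior a consistency term on content and speaker characteristics would penalize.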