Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

LIU Zongming, WANG Li, LI Junfeng, ZHANG Pengyuan. Mispronunciation detection and diagnosis with acoustic pronunciation model aided modeling[J]. ACTA ACUSTICA, 2023, 48(1): 264-273. DOI: 10.15949/j.cnki.0371-0025.2023.01.020

Mispronunciation detection and diagnosis with acoustic pronunciation model aided modeling

  • Expert-annotated data for Mispronunciation Detection and Diagnosis (MDD) are scarce. To model pronunciation regularities efficiently from limited data and thereby aid MDD systems, this paper proposes an acoustic pronunciation model that integrates acoustic and textual information, giving a more theoretically complete account of how mispronunciations are generated. Because different parts of this generation process are acoustically correlated, the model shares its acoustic encoder parameters with the phoneme recognition model, and the two are optimized jointly through multi-task learning. An acoustic confidence masking-prediction training approach is further proposed to strengthen the correlation between the two tasks and improve the efficiency of aided modeling. Experiments show that the acoustic pronunciation model effectively captures mispronunciation regularities. With its aid in phoneme recognition modeling, the MDD system improves by 4.9%, 9.5%, and 14.0% in mispronunciation detection, diagnosis, and phoneme recognition, respectively. The masking-prediction training method improves the efficiency of aided modeling, and both the masking parameters and the multi-task learning parameters affect its effectiveness.
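The parameter sharing and joint optimization described in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: `SharedEncoder`, `TaskModel`, `joint_loss`, and `task_weight` are illustrative, the encoder is a toy linear projection rather than a neural network, and the interpolated form of the combined loss is an assumption about how the multi-task learning parameter might enter.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SharedEncoder:
    """Toy stand-in for the acoustic encoder: one weight per input feature."""
    weights: List[float] = field(default_factory=lambda: [0.5, -0.25, 1.0])

    def encode(self, frame: List[float]) -> float:
        # Project one acoustic frame to a scalar representation.
        return sum(w * x for w, x in zip(self.weights, frame))


@dataclass
class TaskModel:
    """A task-specific model that reuses an encoder instead of owning a copy."""
    name: str
    encoder: SharedEncoder


def joint_loss(loss_recognition: float, loss_pronunciation: float,
               task_weight: float) -> float:
    """Weighted sum of the two task losses (interpolation form is an assumption)."""
    if not 0.0 <= task_weight <= 1.0:
        raise ValueError("task_weight must lie in [0, 1]")
    return task_weight * loss_recognition + (1.0 - task_weight) * loss_pronunciation


# Both task models hold a reference to the same encoder object.
encoder = SharedEncoder()
recognizer = TaskModel("phoneme_recognition", encoder)
pronunciation = TaskModel("acoustic_pronunciation", encoder)

# An update to the shared encoder is visible to both tasks.
encoder.weights[0] = 2.0

# Combine example loss values for the two tasks into one training objective.
total = joint_loss(2.0, 4.0, task_weight=0.25)
```

In a real system the two losses would come from the phoneme recognition and acoustic pronunciation outputs, and gradients of the combined objective would update the shared encoder; the sketch only shows the sharing and weighting structure.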
