Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

TAO Huawei, ZHANG Xinran, LIANG Ruiyu, ZHA Cheng, ZHAO Li, WANG Qingyun. Improved discriminative completed local binary pattern for speech emotion recognition[J]. ACTA ACUSTICA, 2016, 41(6): 905-912. DOI: 10.15949/j.cnki.0371-0025.2016.06.017

Improved discriminative completed local binary pattern for speech emotion recognition

  • To study the relationship between speech emotion and the speech spectrum, a new feature for speech emotion recognition is proposed, called the improved discriminative completed local binary pattern (IDisCLBP_SER). First, CLBP_M and CLBP_S statistical histograms are computed from the gray-scale spectrogram image using the completed local binary pattern (CLBP) algorithm. Next, these histograms are fed into a discriminative feature learning model, which is trained to obtain a global dominant pattern set. Finally, the global dominant pattern set is used to process the CLBP_S and CLBP_M histograms, and the processed histograms are concatenated to form the IDisCLBP_SER feature. Experiments on the EMO-DB database and a Chinese emotional speech database show that the recall rate of IDisCLBP_SER is at least 8% higher than that of Texture Image Information (TII) and, on average, more than 4% higher than that of speech spectrum features. In addition, when IDisCLBP_SER is fused with acoustic features, the recall rates of the fused features are 1% to 4% higher than those of the acoustic features alone.
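The three steps in the abstract (CLBP histograms, a learned global dominant pattern set, and concatenation of the filtered histograms) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The 8-neighbour, radius-1 neighbourhood, the `coverage` parameter, and all function names are assumptions made for illustration. The dominant-pattern step is a simplified stand-in (per-sample coverage, intersection within each class, union across classes) for the discriminative feature learning model that the abstract mentions but does not specify.

```python
import numpy as np

# 8 neighbours at radius 1 on the pixel grid. This is an assumption:
# the abstract does not state which neighbourhood is used.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def clbp_codes(gray):
    """Per-pixel CLBP_S and CLBP_M codes for the interior of a 2-D gray image."""
    g = np.asarray(gray, dtype=np.float64)
    h, w = g.shape
    center = g[1:-1, 1:-1]
    # Local differences d_p = g_p - g_c, one layer per neighbour.
    diffs = np.stack([g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx] - center
                      for dy, dx in OFFSETS])
    mags = np.abs(diffs)
    threshold = mags.mean()  # CLBP_M thresholds magnitudes at their global mean
    weights = (2 ** np.arange(len(OFFSETS))).reshape(-1, 1, 1)
    s_codes = ((diffs >= 0) * weights).sum(axis=0)       # sign component
    m_codes = ((mags >= threshold) * weights).sum(axis=0)  # magnitude component
    return s_codes, m_codes


def clbp_histograms(gray):
    """Statistical histograms (256 bins each) of the CLBP_S and CLBP_M codes."""
    s_codes, m_codes = clbp_codes(gray)
    n_bins = 2 ** len(OFFSETS)
    hist_s = np.bincount(s_codes.ravel(), minlength=n_bins)
    hist_m = np.bincount(m_codes.ravel(), minlength=n_bins)
    return hist_s, hist_m


def dominant_patterns(hist, coverage=0.9):
    """Smallest set of bins whose counts reach `coverage` of the histogram mass."""
    order = np.argsort(hist)[::-1]
    cum = np.cumsum(hist[order]) / hist.sum()
    k = int(np.searchsorted(cum, coverage)) + 1
    return set(order[:k].tolist())


def global_dominant_set(hists_by_class, coverage=0.9):
    """Simplified stand-in for the learning step: intersect the dominant sets
    within each class, then take the union across classes.
    Each class is assumed to contain at least one histogram."""
    global_set = set()
    for hists in hists_by_class.values():
        global_set |= set.intersection(
            *(dominant_patterns(hist, coverage) for hist in hists))
    return sorted(global_set)


def idisclbp_feature(gray, dom_s, dom_m):
    """Keep only the dominant bins of each histogram and concatenate them."""
    hist_s, hist_m = clbp_histograms(gray)
    return np.concatenate([hist_s[dom_s], hist_m[dom_m]])
```

In use, `global_dominant_set` would be called once for the CLBP_S histograms and once for the CLBP_M histograms of the training spectrograms, grouped by emotion label. The two resulting index lists are then passed to `idisclbp_feature` for every training and test spectrogram, so that all feature vectors share the same length.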
