Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

Whispered speaker identification based on multiband demodulation analysis and instantaneous frequency estimation

  • Abstract: To improve the robustness of whispered speaker identification, a whispered-speech feature called instantaneous frequency estimation (IFE) is proposed, based on the amplitude modulation-frequency modulation (AM-FM) model of the speech signal. Following the formant modulation theory of speech production, the instantaneous envelope and frequency of the speech are extracted by multiband demodulation analysis (MDA). The IFE feature is then obtained as a weighted estimate over envelope amplitude and frequency, and is used to describe the frequency structure of the speech. The feature is applied to whispered speaker identification and compared with conventional Mel-frequency cepstral coefficients (MFCC). Experimental results show that, as the number of test speakers increases, IFE performs slightly better than MFCC; when the test channel changes, IFE effectively improves robustness relative to MFCC.
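The abstract gives no formulas, so the following is only a minimal sketch of the general idea of demodulating one band and forming an amplitude-weighted frequency estimate. It rests on several assumptions that are not taken from the paper: a Gabor bandpass filter for band isolation, the Teager-Kaiser energy operator with the DESA-2 algorithm for demodulation, and squared-envelope weighting for the frequency estimate. All function names and parameter values (`gabor_bandpass`, `desa2`, `alpha=1000.0`, and so on) are hypothetical.

```python
import math

def teager(x):
    """Teager-Kaiser energy: psi[x](n) = x(n)^2 - x(n-1) * x(n+1)."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def gabor_bandpass(x, fc, fs, alpha=1000.0):
    """Isolate one band with a Gabor filter h(k) = exp(-(a k)^2) cos(wc k).

    alpha (Hz) sets the bandwidth; this value is an assumption. Only the
    fully overlapped ('valid') part of the convolution is returned.
    """
    a = alpha / fs
    wc = 2.0 * math.pi * fc / fs
    half = int(math.ceil(3.0 / a))  # truncate the Gaussian at about 3 / a samples
    ks = range(-half, half + 1)
    taps = [math.exp(-(a * k) ** 2) * math.cos(wc * k) for k in ks]
    return [sum(h * x[n - k] for h, k in zip(taps, ks))
            for n in range(half, len(x) - half)]

def desa2(x, fs):
    """DESA-2 demodulation: instantaneous frequency (Hz) and envelope.

    Valid for band frequencies below fs / 4.
    """
    psi_x = teager(x)                                        # index i <-> sample i + 1
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]  # symmetric difference
    psi_y = teager(y)                                        # index j <-> sample j + 2
    freqs, amps = [], []
    for j, py in enumerate(psi_y):
        px = psi_x[j + 1]
        if px <= 0.0 or py <= 0.0:
            continue  # skip samples where the energy estimate is unusable
        c = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))
        freqs.append(0.5 * math.acos(c) * fs / (2.0 * math.pi))
        amps.append(2.0 * px / math.sqrt(py))
    return freqs, amps

def weighted_frequency(freqs, amps):
    """Squared-envelope-weighted mean of the instantaneous frequency (assumed weighting)."""
    den = sum(a * a for a in amps)
    return sum(a * a * f for f, a in zip(freqs, amps)) / den

def estimate_band_frequency(x, fc, fs):
    """Bandpass around fc, demodulate, and return one weighted frequency value."""
    freqs, amps = desa2(gabor_bandpass(x, fc, fs), fs)
    return weighted_frequency(freqs, amps)

if __name__ == "__main__":
    fs = 16000
    # Synthetic two-tone test signal standing in for two speech resonances.
    x = [math.sin(2 * math.pi * 1000 * n / fs) + 0.5 * math.sin(2 * math.pi * 2500 * n / fs)
         for n in range(800)]
    for fc in (1000.0, 2500.0):
        print(fc, estimate_band_frequency(x, fc, fs))
```

In a full feature extractor, one such weighted estimate would presumably be computed per band and per short-time frame and the values stacked into a feature vector; the band layout, frame length, and weighting actually used for IFE are not stated in the abstract.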

     
