Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

Citation: JIANG Wenjian, WEI Gang. A new feature extraction method for noisy speech recognition based on masking model[J]. ACTA ACUSTICA, 2001, 26(6): 516-520. DOI: 10.15949/j.cnki.0371-0025.2001.06.007

A new feature extraction method for noisy speech recognition based on masking model

    Abstract: The performance of conventional speech recognition systems degrades seriously in noisy environments. This paper presents a new speech feature extraction method based on the masking properties of the human auditory system: the masking characteristics of the noisy speech are computed first, and Mel-frequency cepstral coefficients (MFCC) are then derived from them. The method is evaluated on a recognition task using the English digits 0 to 9 from the TIMIT database. Several types of noise from the NOISEX-92 database are added to the original speech to simulate noisy speech at different signal-to-noise ratios (SNR). At 0 dB SNR, recognition accuracy increases by an average of 152% over classical MFCC across three kinds of noise. The experimental results show that the new feature significantly improves recognition accuracy in noisy environments.
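
    The abstract states that noise is added to clean speech to simulate a given SNR but does not say how. The following is a minimal Python sketch of one common way to do this, not the authors' procedure. It assumes the SNR is defined over the average power of the whole utterance and that the noise signal is not silent; the function names `mix_at_snr` and `measured_snr_db` are hypothetical.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the requested SNR in dB.

    Assumption: SNR is the ratio of average speech power to average noise
    power over the whole utterance (the abstract does not specify this).
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat the noise if it is shorter than the speech, then trim to length.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)

    # Choose gain g so that 10*log10(speech_power / (g**2 * noise_power)) == snr_db.
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise


def measured_snr_db(speech, noisy):
    """Return the SNR in dB of `noisy` relative to the clean `speech`."""
    speech = np.asarray(speech, dtype=np.float64)
    residual = np.asarray(noisy, dtype=np.float64) - speech
    return 10.0 * np.log10(np.mean(speech ** 2) / np.mean(residual ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    clean = np.sin(2.0 * np.pi * 440.0 * t)   # stand-in for a speech signal
    babble = rng.standard_normal(4000)        # stand-in for a noise recording
    noisy = mix_at_snr(clean, babble, snr_db=0.0)
    print(round(measured_snr_db(clean, noisy), 6))
```

    At 0 dB the scaled noise carries the same average power as the speech, which is the hardest condition the abstract reports.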

     
