A compensation approach for noisy Lombard and Loud speech in speech recognition under extremely adverse environments
Abstract: This paper studies the Lombard and Loud effects that strong noise induces in speech recognition, and proposes a unified approach that compensates the training data jointly for additive noise and for the Lombard and Loud effects. For additive noise, spectral addition is applied to the training data in the spectral domain; the idea is derived by reversing spectral subtraction. For Lombard and Loud speech, the training data are compensated on the basis of hidden Markov model (HMM) state labeling. This compensation jointly takes into account the variation of the Mel frequency cepstrum coefficients (MFCC) across the different states of different acoustic units, and the changes in duration and relative duration of different acoustic units under the various kinds of speech variation. Because the compensation is data-based and multi-mode, the models adapt automatically to multiple noise conditions and speech variations. The approach is highly robust in extremely noisy environments and does not degrade recognition performance in a normal environment or with a normal speaking style. Moreover, since the compensation is carried out in the training phase, it does not increase the computational complexity of recognition.
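The relation between spectral addition and spectral subtraction can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives no formulas, so the use of power spectra, a single averaged noise estimate, the scaling factor `alpha`, and the function names `spectral_addition` and `spectral_subtraction` are all assumptions made for illustration.

```python
import numpy as np


def spectral_addition(clean_power, noise_power, alpha=1.0):
    """Add a scaled noise estimate to every frame of clean training speech.

    clean_power: array of shape (frames, bins), power spectrum of clean speech.
    noise_power: array of shape (bins,), averaged noise power spectrum.
    alpha: hypothetical scaling factor that sets the simulated noise level.
    """
    clean_power = np.asarray(clean_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    # Broadcast the noise estimate across all frames.
    return clean_power + alpha * noise_power[np.newaxis, :]


def spectral_subtraction(noisy_power, noise_power, alpha=1.0, floor=0.0):
    """Conventional counterpart: remove the noise estimate from noisy speech.

    Values that would fall below `floor` are clipped to `floor`.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    return np.maximum(noisy_power - alpha * noise_power[np.newaxis, :], floor)


# Toy data (made up for illustration): 2 frames, 2 frequency bins.
clean = np.array([[1.0, 2.0], [3.0, 4.0]])
noise = np.array([0.5, 1.0])

noisy_training = spectral_addition(clean, noise)
recovered = spectral_subtraction(noisy_training, noise)
```

In this sketch the noise is added to the training data, so the models are trained on spectra that resemble the noisy test condition and nothing has to be subtracted at recognition time. That matches the abstract's point that the compensation adds no cost to recognition.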