Channel and speaker adaptation techniques for robust speech recognition
Abstract: Robustness is one of the key issues that determine whether speech recognition technology can be applied and deployed in practice. Acoustic mismatches between the training and testing environments of an HMM-based speech recognizer often cause severe degradation in recognition performance. The mismatch arises mainly from three sources: noise (additive and convolutional), changes in the channel through which the speech signal is transmitted, and differences between speakers (such as vocal tract shape and manner of pronunciation). This paper addresses changes in the recording channel and variations among speakers. Three robust adaptation methods are studied, improved, and compared: adaptation via VQ codebook (prototype) modification, adaptation via HMM parameter modification, and a canonical correlation based compensation method (CCBC for short) that applies a spectral transformation. Experimental results show that all three techniques make our speaker-independent recognition system more robust to channels and speakers. Among the three, CCBC performs best: it can compensate for the mismatch between training and testing conditions caused simultaneously by all three distortion sources (noise, channel differences, and speaker variation), and is therefore suitable as a unified adaptation approach.