Microphone array processing via joint wideband angle-of-arrival estimation and speech feature enhancement
Abstract: This paper discusses a processing method for a small-size microphone array, aimed at improving automatic speech recognition (ASR) performance in an indoor reverberant environment. The method first applies Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT) to compute the directions of arrival of wideband speech signals and performs time-delay compensation accordingly. Array filtering is then considered jointly with the hidden Markov model (HMM) based recognition procedure: the recognition output is fed back to the front-end array processing to optimize the array filter coefficients. Unlike conventional array processing, which aims to improve the quality of the signal waveform, this approach adjusts the array filter coefficients to reduce the mismatch between the features to be recognized and the trained model, thereby directly increasing the likelihood of the correct transcription for a selected vocabulary. Experimental results show that the scheme effectively reduces the ASR error rate for isolated-word, limited-vocabulary recognition in a meeting-room environment and outperforms conventional beamforming. Designing the array filter with global optimization further improves performance compared with a local optimization algorithm.
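To make the direction-of-arrival step concrete, the following is a minimal sketch of least-squares ESPRIT in its standard narrowband form, as might be applied to a single frequency bin. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say how ESPRIT is extended to wideband speech, and the uniform linear array, half-wavelength spacing, function names (`esprit_doa`, `simulate_snapshots`), and simulated sources are all hypothetical.

```python
import numpy as np


def esprit_doa(snapshots, num_sources, spacing_over_wavelength=0.5):
    """Estimate arrival angles (radians, sorted) with least-squares ESPRIT.

    Assumes a uniform linear array; `snapshots` is a complex array of
    shape (num_sensors, num_snapshots) for one narrowband frequency bin.
    """
    num_snapshots = snapshots.shape[1]
    # Sample spatial covariance matrix.
    covariance = snapshots @ snapshots.conj().T / num_snapshots
    # eigh returns eigenvalues in ascending order, so the signal subspace
    # is spanned by the last `num_sources` eigenvectors.
    _, eigvecs = np.linalg.eigh(covariance)
    signal_subspace = eigvecs[:, -num_sources:]
    # Two overlapping subarrays: sensors 0..M-2 and sensors 1..M-1.
    upper = signal_subspace[:-1]
    lower = signal_subspace[1:]
    # Rotational invariance: lower ~= upper @ psi, solved in least squares.
    psi = np.linalg.lstsq(upper, lower, rcond=None)[0]
    # Eigenvalues of psi carry the inter-sensor phase shift of each source.
    phases = np.angle(np.linalg.eigvals(psi))
    sines = phases / (2.0 * np.pi * spacing_over_wavelength)
    return np.sort(np.arcsin(np.clip(sines, -1.0, 1.0)))


def simulate_snapshots(angles_deg, num_sensors=4, num_snapshots=400,
                       spacing_over_wavelength=0.5, noise_std=0.01, seed=0):
    """Hypothetical test data: uncorrelated sources plus weak sensor noise."""
    rng = np.random.default_rng(seed)

    def complex_normal(shape):
        return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

    angles = np.deg2rad(np.asarray(angles_deg, dtype=float))
    sensor_index = np.arange(num_sensors)[:, None]
    # Steering matrix with the same phase convention used in esprit_doa.
    steering = np.exp(2j * np.pi * spacing_over_wavelength
                      * sensor_index * np.sin(angles)[None, :])
    sources = complex_normal((angles.size, num_snapshots))
    noise = noise_std * complex_normal((num_sensors, num_snapshots))
    return steering @ sources + noise


# Usage: two simulated sources at -20 and 35 degrees on a 4-sensor array.
estimated_deg = np.rad2deg(
    esprit_doa(simulate_snapshots([-20.0, 35.0]), num_sources=2))
print(estimated_deg)
```

Under a far-field assumption, an estimated angle θ corresponds to a delay of d·sin(θ)/c between adjacent sensors (spacing d, sound speed c), which is the quantity a time-delay compensation stage would remove before array filtering.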