Gaussian mixture model compensation method using a shift factor for speaker recognition
Abstract: The performance of GMM-based text-independent speaker recognition systems degrades as the amount of training speech available for the target speaker decreases. A model compensation method is proposed to address this problem. The method first constructs a low-dimensional affine space, called the shift space, in which the adaptation from the UBM (universal background model) to each speaker model with sufficient training data can be represented by a shift factor. When the training data of the target speaker is insufficient, the shift factor is first learned from the GMM components that are less sensitive to the amount of training data, and is then used to compensate the parameters of the components that are more sensitive to it. On the same training and evaluation sets, the proposed method achieves a relative reduction of about 7% in EER (equal error rate) compared with the baseline system.
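The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not state: only the component means are compensated, the shift space is estimated by PCA over the mean shifts of well-trained speakers, the shift factor is estimated by ordinary least squares, and the set of components that are insensitive to the amount of training data is supplied as a boolean mask. The names `fit_shift_space`, `compensate`, and `reliable` are hypothetical.

```python
import numpy as np

def fit_shift_space(shifts, rank):
    """Estimate an affine shift space (assumption: plain PCA).

    shifts: (n_speakers, C*D) array, each row a flattened
            (speaker means - UBM means) for a well-trained speaker.
    Returns the mean shift (C*D,) and a basis of shape (C*D, rank).
    """
    mean_shift = shifts.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered shifts.
    _, _, vt = np.linalg.svd(shifts - mean_shift, full_matrices=False)
    return mean_shift, vt[:rank].T

def compensate(ubm_means, target_means, reliable, mean_shift, basis):
    """Compensate the unreliable components of a poorly trained model.

    ubm_means, target_means: (C, D) component means.
    reliable: (C,) boolean mask of components assumed to be insensitive
              to the amount of training data (hypothetical input).
    Returns the compensated means (C, D) and the shift factor (rank,).
    """
    n_comp, dim = ubm_means.shape
    # Expand the per-component mask to the rows of the flattened shift.
    rows = np.repeat(reliable, dim)
    shift = (target_means - ubm_means).reshape(-1)
    # Least-squares estimate of the shift factor from reliable rows only.
    x, *_ = np.linalg.lstsq(
        basis[rows], shift[rows] - mean_shift[rows], rcond=None
    )
    # Predict the full shift and apply it to the unreliable components.
    full_shift = (mean_shift + basis @ x).reshape(n_comp, dim)
    out = target_means.copy()
    out[~reliable] = ubm_means[~reliable] + full_shift[~reliable]
    return out, x
```

In this sketch the reliable components keep their adapted means unchanged, and only the remaining components are replaced by the UBM mean plus the shift predicted from the shift factor. The number of reliable rows should be at least the rank of the shift space for the least-squares problem to be well determined.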