Deep nonlinear metric learning for speaker verification
Abstract: Nonlinear metric learning (NML) is applied to speaker verification, and a speaker verification method based on a deep independent subspace analysis (ISA) network is proposed. Unlike traditional linear metric learning methods, the proposed method uses a deep ISA network to learn an explicit nonlinear mapping from the original speaker space to an optimized subspace. The similarity between the i-vectors of two utterances is then computed in that subspace to obtain better speaker verification performance. The proposed method is evaluated on the NIST SRE 2008 dataset. Compared with the traditional i-vector model with cosine distance scoring, linear discriminant analysis (LDA), and probabilistic linear discriminant analysis (PLDA), the proposed method decreases the equal error rate (EER) by 11.02%, 6.40%, and 4.57%, respectively.
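The pipeline summarized in the abstract (map each i-vector through a nonlinear network, score a pair in the mapped space, report EER) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes the commonly used form of an ISA layer (linear filters followed by square-root pooling of squared responses within each subspace), assumes cosine similarity as the score in the mapped space, and estimates EER with a simple threshold sweep. The function names `isa_layer`, `cosine_score`, and `equal_error_rate`, the weight matrix `W`, and `group_size` are hypothetical; the abstract does not specify the network depth, training objective, or scoring function.

```python
import numpy as np


def isa_layer(x, W, group_size):
    """Forward pass of one ISA-style layer (assumed form, not the paper's exact network).

    x: input vector, shape (d,)
    W: filter matrix, shape (n_filters, d); n_filters must be divisible by group_size
    Returns one pooled activation per subspace, shape (n_filters // group_size,).
    """
    z = W @ x                        # linear filter responses
    z = z.reshape(-1, group_size)    # consecutive filters form one subspace
    return np.sqrt(np.sum(z ** 2, axis=1))  # square-root pooling per subspace


def cosine_score(a, b):
    """Cosine similarity between two mapped vectors (assumed scoring function)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def equal_error_rate(target_scores, nontarget_scores):
    """Estimate EER by sweeping the threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where false-acceptance and false-rejection rates are closest.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        frr = np.mean(target < t)       # target trials wrongly rejected
        far = np.mean(nontarget >= t)   # non-target trials wrongly accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return float(eer)
```

In use, both i-vectors of a trial would be passed through the same (stacked) mapping before `cosine_score` is called, and `equal_error_rate` would be computed over the scores of all target and non-target trials in the evaluation set.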