A Phonetic Tied-Mixture HMM Based on Fuzzy Clustering Analysis

Abstract: To reduce the number of acoustic-model parameters in speech recognition and to improve the robustness of parameter training, a phonetic tied-mixture HMM (PTM) based on revaluing fuzzy clustering, called FPTM, is proposed. Building on the phonetic decision-tree structure, the FPTM Gaussian codebook is obtained by fuzzy clustering of the Gaussians of the initial triphone models that belong to the same root node of the tree. Fuzzy clustering is then applied to the model covariances so that they can be shared as well. Recognition experiments show that, compared with a conventional PTM with a similar number of Gaussians, the FPTM reduces the number of Gaussian weights by 77.59% while raising the recognition rate by 7.92%. Compared with a triphone model with a similar number of parameters, the covariance-shared FPTM lowers the error rate by 3.01%.
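
The codebook construction described above can be illustrated with a minimal sketch. The abstract does not give the update rules of the revaluing fuzzy clustering, so the code below substitutes standard fuzzy c-means and clusters only the Gaussian mean vectors, ignoring covariances and mixture weights. The function name `fuzzy_c_means`, the fuzzifier `m`, the iteration count, and the toy `means` data are all assumptions made for illustration and are not taken from the paper.

```python
import math


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50):
    """Standard fuzzy c-means (a stand-in, not the paper's revaluing method).

    points: list of equal-length vectors, e.g. Gaussian means under one
    decision-tree root node. Assumes n_clusters <= len(points).
    Returns (centers, memberships); memberships[i][k] is the degree to
    which points[i] belongs to cluster k, and each row sums to 1.
    """
    dim = len(points[0])
    # Deterministic initialisation: take evenly spaced input points as centers.
    step = len(points) // n_clusters
    centers = [list(points[k * step]) for k in range(n_clusters)]
    memberships = []
    for _ in range(n_iter):
        # Membership update from the distances to the current centers.
        memberships = []
        for x in points:
            dists = [math.dist(x, c) for c in centers]
            if min(dists) == 0.0:
                # The point coincides with a center: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [h / sum(hits) for h in hits]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                row = [v / sum(inv) for v in inv]
            memberships.append(row)
        # Center update: membership-weighted mean of all points.
        for k in range(n_clusters):
            w = [row[k] ** m for row in memberships]
            total = sum(w)
            if total > 0.0:
                centers[k] = [
                    sum(wi * x[j] for wi, x in zip(w, points)) / total
                    for j in range(dim)
                ]
    return centers, memberships


# Toy data: six 2-D "Gaussian means" forming two well-separated groups.
means = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
         [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers, memberships = fuzzy_c_means(means, n_clusters=2)
```

In this sketch each resulting center plays the role of one codebook Gaussian, and the soft memberships indicate how strongly each original Gaussian is associated with it. A fuller treatment would also merge covariances and derive the tied-mixture weights, which the abstract mentions but does not specify.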