Initial/final acoustic model based on separating nasal coda in Chinese Putonghua speech recognition

SHAO Jian; ZHAO Qingwei; YAN Yonghong

doi:10.15949/j.cnki.0371-0025.2010.05.021

SHAO Jian, ZHAO Qingwei, YAN Yonghong. Initial/final acoustic model based on separating nasal coda in Chinese Putonghua speech recognition[J]. ACTA ACUSTICA, 2010, 35(5): 587-592. DOI: 10.15949/j.cnki.0371-0025.2010.05.021

Abstract

This paper studies the selection of acoustic modeling units for Chinese Putonghua spontaneous speech recognition. Under three-state HMMs, the two most popular modeling units, namely extended initial/final (XIF) units and phoneme units, each have their own advantages and drawbacks. On the one hand, because pronunciation variation is severe in spontaneous speech, coarse-grained XIF units are preferred, since they can absorb many kinds of pronunciation variation. On the other hand, because the three-state structure has limited discriminative ability for complex modeling units, fine-grained phoneme units are preferred. Based on theoretical results from experimental phonetics and on a duration analysis of XIF units, we propose an XIF model in which the nasal coda is separated. Experiments on a Chinese Putonghua spontaneous speech recognition task show that the proposed method outperforms both XIF modeling and phoneme-based modeling, reducing the character error rate by 2.23% and 9.45%, respectively.
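
As a rough illustration of what separating the nasal coda could mean at the level of the pronunciation lexicon, the sketch below splits a toneless Pinyin final into a vowel body and a nasal coda unit. The function names, the toneless Pinyin notation, and the treatment of "n" and "ng" as the only codas are assumptions made for this example; the abstract does not specify the paper's actual unit inventory or how tones and context are handled.

```python
from typing import List

# Hypothetical illustration only: the paper's real unit set is not given in the abstract.
# "ng" is checked before "n" so that "ang" becomes "a" + "ng", not "ag" + "n".
NASAL_CODAS = ("ng", "n")


def split_nasal_coda(final: str) -> List[str]:
    """Return [body, coda] if the final ends in a nasal coda, otherwise [final]."""
    for coda in NASAL_CODAS:
        # Require a non-empty vowel body in front of the coda.
        if final.endswith(coda) and len(final) > len(coda):
            return [final[: -len(coda)], coda]
    return [final]


def syllable_to_units(initial: str, final: str) -> List[str]:
    """Map one syllable (initial may be empty) to a sequence of modeling units."""
    units = [initial] if initial else []
    units.extend(split_nasal_coda(final))
    return units


if __name__ == "__main__":
    for initial, final in [("zh", "ong"), ("", "an"), ("h", "ai")]:
        print(initial + final, "->", syllable_to_units(initial, final))
```

Under this assumed scheme, a final with a nasal coda is modeled by two HMMs instead of one, while finals without a nasal coda remain single XIF-style units.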
