Time-frequency speech presence probability estimation based on sequential hidden Markov model for speech enhancement

XU Chundong; XIA Risheng; YING Dongwen; LI Junfeng

doi:10.15949/j.cnki.0371-0025.2014.05.017

XU Chundong, XIA Risheng, YING Dongwen, LI Junfeng. Time-frequency speech presence probability estimation based on sequential hidden Markov model for speech enhancement[J]. ACTA ACUSTICA, 2014, 39(5): 647-654. DOI: 10.15949/j.cnki.0371-0025.2014.05.017

Abstract

Speech presence probability (SPP) estimation is a challenging problem in speech enhancement. Traditional SPP estimators are largely heuristic: they are not unified under a theoretical framework, so optimal estimation cannot be guaranteed. We present a sequential hidden Markov model (SHMM) that describes the log-power sequence as a dynamic process that transitions between speech and noise states. The emission probability of each state is modeled by a Gaussian function, and the SPP is given by the posterior probability of the speech state given the observed log-power sequence. To meet real-time requirements, SHMM parameter estimation is simplified to a first-order recursive process in which the model parameter set is updated frame by frame according to the maximum likelihood criterion. A comparison of several modeling methods shows the superiority of the SHMM in modeling temporal correlation. Speech enhancement experiments confirm that the constrained SHMM outperforms the conventional improved minima controlled recursive averaging (IMCRA) method in terms of segmental SNR and log-spectral distortion.
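
The idea summarized in the abstract can be illustrated with a minimal sketch for a single frequency bin. This is not the authors' implementation: the abstract does not give the exact recursion or the constraint used, so the function name `sequential_spp`, the self-transition probability `stay`, the smoothing factor `alpha`, the initial Gaussian parameters, and the posterior-weighted update rule are all assumptions chosen for illustration. The sketch filters a log-power sequence with a two-state (noise, speech) model with Gaussian emissions and updates the means and variances with a first-order recursion after every frame.

```python
import math


def gaussian_pdf(x, mean, var):
    """Gaussian density used as the emission probability of one state."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def sequential_spp(log_power, stay=0.9, alpha=0.95,
                   noise_init=(0.0, 1.0), speech_init=(5.0, 4.0)):
    """Return a per-frame speech presence probability for one frequency bin.

    State 0 is noise, state 1 is speech. All defaults are illustrative
    assumptions, not values taken from the paper.
    """
    means = [noise_init[0], speech_init[0]]
    variances = [noise_init[1], speech_init[1]]
    belief = [0.5, 0.5]  # filtered state posterior from the previous frame
    spp = []
    for x in log_power:
        # Prediction step: propagate the previous posterior through a
        # symmetric two-state transition matrix.
        predicted = [
            stay * belief[0] + (1.0 - stay) * belief[1],
            (1.0 - stay) * belief[0] + stay * belief[1],
        ]
        # Update step: weight by the Gaussian emission likelihoods.
        # The tiny constant guards against division by zero on underflow.
        joint = [
            predicted[k] * gaussian_pdf(x, means[k], variances[k]) + 1e-300
            for k in range(2)
        ]
        total = joint[0] + joint[1]
        belief = [joint[0] / total, joint[1] / total]
        spp.append(belief[1])
        # First-order recursive parameter update, weighted by the state
        # posterior (an assumed online approximation of the ML update).
        for k in range(2):
            rate = (1.0 - alpha) * belief[k]
            means[k] += rate * (x - means[k])
            variances[k] += rate * ((x - means[k]) ** 2 - variances[k])
            variances[k] = max(variances[k], 1e-3)  # keep the variance positive
    return spp
```

Calling `sequential_spp` on a sequence with low log-power frames followed by high log-power frames should give probabilities near 0 for the former and near 1 for the latter. A full time-frequency estimator would run one such recursion per frequency bin.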
