Time-frequency speech presence probability estimation based on sequential hidden Markov model for speech enhancement
Abstract
Speech presence probability (SPP) estimation is a challenging problem in speech enhancement. Traditional SPP estimators are largely heuristic: they are not unified within a theoretical framework and therefore cannot guarantee optimal estimation. We present a sequential hidden Markov model (SHMM) that describes the log-power sequence as a dynamic process that transitions between speech and noise states. The emission probability of each state is modeled by a Gaussian function, and the SPP is given by the posterior probability of the speech state given the observed log-power sequence. To meet real-time requirements, SHMM parameter estimation is simplified to a first-order recursive process in which the model parameter set is updated frame by frame according to the maximum-likelihood criterion. A comparison of several modeling methods shows the superiority of the SHMM in modeling temporal correlation. Speech enhancement experiments confirm that the constrained SHMM outperforms the conventional improved minima controlled recursive averaging (IMCRA) method in terms of segmental SNR and log-spectral distortion.
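To make the idea concrete, the following is a minimal sketch of a two-state (noise/speech) hidden Markov model with Gaussian emissions on a log-power sequence, evaluated one frame at a time for a single frequency bin. It is not the update rule derived in the paper: the class name `TwoStateSPP`, the fixed transition probability `stay`, the forgetting factor `alpha`, the initial means and variances, and the posterior-weighted recursive updates are all assumptions chosen for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


class TwoStateSPP:
    """Illustrative online two-state HMM (0 = noise, 1 = speech).

    All default values below are assumed, not taken from the paper.
    """

    def __init__(self, noise_mean=-5.0, speech_mean=5.0, var=4.0,
                 stay=0.9, alpha=0.95):
        self.mean = [noise_mean, speech_mean]
        self.var = [var, var]
        # Fixed transition matrix: trans[i][j] = P(state j | previous state i).
        self.trans = [[stay, 1.0 - stay], [1.0 - stay, stay]]
        self.post = [0.5, 0.5]   # filtered state posterior from the last frame
        self.alpha = alpha       # forgetting factor of the recursive update

    def step(self, x):
        """Consume one log-power observation and return the SPP for that frame."""
        # Prediction: propagate the previous posterior through the transitions.
        prior = [sum(self.post[i] * self.trans[i][j] for i in range(2))
                 for j in range(2)]
        # Correction: weight each state by its Gaussian emission likelihood.
        joint = [prior[j] * gaussian_pdf(x, self.mean[j], self.var[j])
                 for j in range(2)]
        total = sum(joint)
        # Fall back to the prediction if both likelihoods underflow to zero.
        self.post = [p / total for p in joint] if total > 0.0 else prior
        # First-order recursive, posterior-weighted update of mean and variance.
        for j in range(2):
            w = (1.0 - self.alpha) * self.post[j]
            delta = x - self.mean[j]
            self.mean[j] += w * delta
            self.var[j] = max((1.0 - w) * self.var[j] + w * delta * delta, 1e-3)
        return self.post[1]


# Usage: twenty noise-like frames followed by twenty speech-like frames.
model = TwoStateSPP()
frames = [-5.0] * 20 + [5.0] * 20
spp = [model.step(x) for x in frames]
```

In this sketch the SPP is the filtered posterior of the speech state, so it depends on the previous frame through the transition matrix as well as on the current observation; this is the temporal correlation that a frame-independent estimator ignores.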