Blind speech source separation via nonlinear time-frequency masking
Abstract
A blind speech source separation method based on nonlinear time-frequency masking is proposed for the underdetermined convolutive mixture model. It exploits the approximate W-disjoint orthogonality (W-DO) of independent speech signals in the time-frequency domain. First, the mixture signals observed at multiple microphones are normalized in the time-frequency domain so that they are independent of frequency. A dynamic clustering algorithm is then developed to obtain the active source information in each time-frequency slot, and a nonlinear function of the deflection angle from the cluster center is selected as the time-frequency mask, which yields the blind separation of the mixed speech signals. The method not only overcomes the frequency permutation problem encountered in most classic frequency-domain blind separation techniques, but also suppresses the spatial direction diffusion of the separation matrix. Simulation results demonstrate that the proposed method outperforms the typical BLUES method, improving the signal-to-noise ratio gain (SNRG) by 1.58 dB on average.
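As an illustration only, the masking step can be sketched as follows. The abstract does not specify the nonlinear function, so the logistic form below and the parameters theta_t (angle threshold) and g (slope) are assumptions, as are all function names. The sketch takes frequency-normalized observation vectors and cluster centers as given and computes, for each time-frequency slot, a soft mask weight that decays with the deflection angle from each center.

```python
import numpy as np

def deflection_angles(X, centroids):
    """Angle (radians) between each observation vector and each cluster center.

    X:         (n_slots, n_mics) complex, one normalized vector per T-F slot
    centroids: (n_src, n_mics) complex cluster centers
    returns:   (n_src, n_slots) angles in [0, pi/2]
    """
    # Re-normalize to unit length so that |c^H x| is the cosine of the angle.
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    C = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    cos_theta = np.abs(C.conj() @ X.T)
    return np.arccos(np.clip(cos_theta, 0.0, 1.0))

def nonlinear_mask(theta, theta_t=np.pi / 8, g=20.0):
    """Assumed logistic mask: close to 1 near the center, close to 0 beyond theta_t."""
    return 1.0 / (1.0 + np.exp(g * (theta - theta_t)))

# Toy example: two microphones, two slots, each aligned with one center.
X = np.array([[1.0 + 0j, 0.0], [0.0, 1.0 + 0j]])
centroids = np.array([[1.0 + 0j, 0.0], [0.0, 1.0 + 0j]])
weights = nonlinear_mask(deflection_angles(X, centroids))
```

In this toy case each slot receives a weight near 1 for the center it is aligned with and a weight near 0 for the other, so multiplying the mixture spectrogram by one row of weights would keep only the slots attributed to that source.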