Abstract:
The accuracy of noise estimation directly affects the quality of speech enhancement algorithms. To improve the noise suppression of current speech enhancement algorithms that rely on noise estimation, and to effectively solve the associated unconstrained optimization problem, a time-frequency mask algorithm based on a DNN (Deep Neural Network) combined with convex optimization is proposed for monaural speech enhancement. First, the power spectra of the noisy speech are extracted as the input of the DNN. Second, the inter-channel correlation factor between noise and speech is taken as the training target of the DNN. Then, the objective function of the convex optimization is constructed using the correlation factor obtained from the DNN model. Finally, a new hybrid conjugate gradient method, based on the DNN combined with convex optimization, is used to iteratively refine the initial mask, and the final updated mask is used to obtain the enhanced speech. Simulation results show that, under different background noises at low SNR, the obtained ratio mask yields better LSD (Log Spectral Distance), PESQ (Perceptual Evaluation of Speech Quality), STOI (Short-Time Objective Intelligibility) and segSNR (segmental Signal-to-Noise Ratio) scores than conventional methods, improving the overall speech quality while effectively suppressing noise.
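
As a rough illustration of the general time-frequency masking idea only, the sketch below applies a textbook ideal ratio mask to the power spectra of a synthetic noisy signal and compares the log spectral distance before and after masking. It is not the proposed method: the mask here is computed from known clean and noise spectra instead of a DNN-estimated correlation factor, no conjugate gradient refinement is performed, and the helper names (`power_spectrogram`, `ratio_mask`, `log_spectral_distance`), the frame settings, and the test signals are all assumptions made for this example.

```python
import numpy as np

def power_spectrogram(x, frame_len=256, hop=128):
    """Return (frames, bins) power spectra of a 1-D signal using a Hann window."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def ratio_mask(speech_power, noise_power, eps=1e-12):
    """Textbook ideal ratio mask (assumption: not the abstract's DNN-based mask)."""
    return np.sqrt(speech_power / (speech_power + noise_power + eps))

def log_spectral_distance(ref_power, est_power, eps=1e-12):
    """Mean over frames of the RMS log-power difference, in dB."""
    diff = 10.0 * np.log10((ref_power + eps) / (est_power + eps))
    return float(np.mean(np.sqrt(np.mean(diff ** 2, axis=1))))

# Synthetic stand-ins (assumptions): a 440 Hz tone for speech, white noise for noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
speech = np.sin(2 * np.pi * 440.0 * t)
noise = 0.5 * rng.standard_normal(t.shape)
noisy = speech + noise

S = power_spectrogram(speech)
N = power_spectrogram(noise)
Y = power_spectrogram(noisy)

mask = ratio_mask(S, N)
# The mask scales magnitudes, so it is squared when applied to power spectra.
enhanced_power = (mask ** 2) * Y

lsd_noisy = log_spectral_distance(S, Y)
lsd_enhanced = log_spectral_distance(S, enhanced_power)
print(f"LSD noisy: {lsd_noisy:.2f} dB, LSD enhanced: {lsd_enhanced:.2f} dB")
```

In a full system the enhanced magnitudes would be combined with the noisy phase and inverted back to a waveform; that step is omitted here to keep the sketch short.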