Bone-conducted speech enhancement using WaveNet fused with phase information
Abstract
Existing bone-conducted speech enhancement algorithms focus mainly on enhancing the speech magnitude and synthesize the waveform with a mismatched phase, which degrades speech quality. To solve this problem, a WaveNet model based on phase information fusion is proposed to generate the enhanced waveform. The proposed method builds on the bandwidth-extended WaveNet and combines the phase information of the bone-conducted speech with the magnitude of the enhanced speech as conditional features. The waveform is generated under these fused feature conditions, so that the phase information is effectively utilized. Two phase representations, the group delay spectrum and the instantaneous frequency deviation spectrum, are compared in experiments. The results show that the phase information of bone-conducted speech effectively complements the original magnitude condition and improves enhancement performance, whether the two are fused by concatenation or by convolution. The best result is obtained by fusing the group delay spectrum by concatenation. Compared with the original bone-conducted speech, the mean opinion score (MOS) is improved by 54.3%.
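The two phase representations and the concatenation-based fusion named above can be illustrated with a minimal NumPy sketch. It is not the implementation evaluated in this work: the frame length, hop size, and all function names are hypothetical, the group delay and instantaneous frequency deviation follow commonly used definitions (wrapped phase differences along frequency and along time, respectively), and the log-magnitude of the input signal merely stands in for the enhanced magnitude, which would come from a separate enhancement model.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Naive Hann-windowed STFT; returns shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def wrap(p):
    """Wrap phase values to the principal interval around zero."""
    return (p + np.pi) % (2 * np.pi) - np.pi

def group_delay(phase):
    """Negative wrapped phase difference along the frequency axis."""
    gd = -wrap(np.diff(phase, axis=1))
    return np.pad(gd, ((0, 0), (0, 1)), mode="edge")  # keep the input shape

def inst_freq_deviation(phase, n_fft=512, hop=128):
    """Wrapped phase difference along time, minus the advance expected
    at each bin's centre frequency."""
    k = np.arange(phase.shape[1])
    expected = 2 * np.pi * hop * k / n_fft
    ifd = wrap(np.diff(phase, axis=0) - expected)
    return np.pad(ifd, ((1, 0), (0, 0)), mode="edge")  # keep the input shape

def fuse_by_concatenation(log_mag, phase_feat):
    """Stack magnitude and phase features along the feature axis."""
    return np.concatenate([log_mag, phase_feat], axis=1)

# Assumption: a synthetic tone stands in for bone-conducted speech.
sr = 16000
t = np.arange(sr) / sr
bc_speech = np.sin(2 * np.pi * 440 * t)

spec = stft(bc_speech)
phase = np.angle(spec)
log_mag = np.log(np.abs(spec) + 1e-8)  # placeholder for the enhanced magnitude

cond_gd = fuse_by_concatenation(log_mag, group_delay(phase))
cond_ifd = fuse_by_concatenation(log_mag, inst_freq_deviation(phase))
```

Each row of `cond_gd` or `cond_ifd` is one frame of conditioning features; in practice such frame-level features would still need to be upsampled to the waveform sample rate before conditioning a WaveNet-style model.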