Convolutional recurrent network-based complex stereophonic acoustic echo cancellation with a two-stage approach
Abstract
We propose a two-stage Convolutional Recurrent Network (CRN) that addresses the Stereophonic Acoustic Echo Cancellation (SAEC) problem using complex spectral input features. The proposed algorithm does not require decorrelating the far-end signals, which solves the non-uniqueness problem of adaptive filter-based SAEC while preserving stereo sound quality and spatial perception. It handles the SAEC problem in two stages. In the first stage, a CRN model estimates the echo signal from the microphone signal and the far-end signals. In the second stage, another CRN model estimates the near-end speech from the microphone signal and the echo estimate produced by the first stage. Using the estimated echo signal as a priori information improves the model's ability to discriminate between echo and near-end signals, which benefits the estimation of the near-end signal. The input features and training targets of the network are the complex spectra of the signals, which allows the phase information of the near-end speech to be recovered. Experimental results show that the SAEC algorithm based on the proposed two-stage CRN model performs significantly better than both traditional algorithms and a single-stage CRN model, in terms of echo suppression during single-talk periods and speech quality during double-talk periods.
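The two-stage data flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: `echo_model` and `near_end_model` are hypothetical callables standing in for the trained CRNs, the frame length and hop size are assumed values, and stacking real and imaginary parts as separate channels is assumed here as one common way to feed complex spectra to a network.

```python
import numpy as np

def stft(x, frame_len=320, hop=160):
    """Complex STFT with a Hann window; returns shape (frames, bins).
    frame_len and hop are assumed values, not taken from the paper."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=-1)

def to_features(spec):
    """Stack real and imaginary parts as two channels: (2, frames, bins)."""
    return np.stack([spec.real, spec.imag])

def from_features(feat):
    """Inverse of to_features: rebuild the complex spectrum."""
    return feat[0] + 1j * feat[1]

def two_stage_saec(mic, far_left, far_right, echo_model, near_end_model):
    """Run the two-stage pipeline on time-domain signals of equal length."""
    mic_f = to_features(stft(mic))
    far_f = np.concatenate([to_features(stft(far_left)),
                            to_features(stft(far_right))])
    # Stage 1: estimate the echo from microphone and far-end spectra.
    echo_f = echo_model(np.concatenate([mic_f, far_f]))       # (6,T,F) -> (2,T,F)
    # Stage 2: estimate near-end speech from microphone and echo estimate.
    near_f = near_end_model(np.concatenate([mic_f, echo_f]))  # (4,T,F) -> (2,T,F)
    return from_features(echo_f), from_features(near_f)

# Toy stand-ins for the trained CRNs (hypothetical, for illustration only):
# stage 1 returns the left far-end channel as the "echo", and
# stage 2 subtracts the echo estimate from the microphone spectrum.
toy_echo_model = lambda x: x[2:4]
toy_near_end_model = lambda x: x[0:2] - x[2:4]

rng = np.random.default_rng(0)
near = rng.standard_normal(1600)
far_left = rng.standard_normal(1600)
far_right = rng.standard_normal(1600)
mic = near + far_left  # toy echo path: the left far-end signal, unchanged

echo_est, near_est = two_stage_saec(mic, far_left, far_right,
                                    toy_echo_model, toy_near_end_model)
```

Because both stages operate on complex spectra, the second-stage output carries phase as well as magnitude, so a time-domain near-end signal could be obtained with an inverse STFT without borrowing the microphone phase.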