Abstract:
To address the speech degradation caused by narrowband, multipath underwater acoustic channels and by noise at the source, this paper proposes a semantic-guided system for underwater communication of noisy speech that is efficient and robust in noisy environments. First, a speech-denoising network is introduced to reduce the impact of source noise. The system then transmits the semantic information of the speech, which enables extreme compression of the speech signal. To counter the damage that channel noise inflicts on the transmitted data, a semantic correction module is proposed that corrects the degraded semantic features at the receiver, thereby improving the intelligibility of the reconstructed speech. In addition, a degradation-informed model, combined with a multi-stage training strategy, enhances the robustness and generalization of the proposed method in underwater acoustic channels. Finally, a proposed multi-receptive-field squeeze-and-excitation fusion module is employed in the speech reconstruction process to capture global semantic information within the semantic features, further improving the intelligibility of the reconstructed speech. Simulation results demonstrate that the proposed method reconstructs speech well under very low bit-rate compression, in complex noise environments, and across various channel conditions.
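
As a rough illustration of what a multi-receptive-field squeeze-and-excitation fusion could look like, the following sketch processes a small feature map (channels by time steps) in plain Python. It is not the paper's module: the function names `smooth` and `mrf_se_fuse`, the kernel sizes (1, 3, 5), the use of fixed moving averages in place of learned convolutions, the summation of branches, and the identity excitation weights are all assumptions made for illustration only.

```python
import math


def smooth(channel, k):
    """Moving average over time with odd window k, zero-padded to keep length.

    Stands in for a learned 1-D convolution with receptive field k (assumption).
    """
    half = k // 2
    n = len(channel)
    out = []
    for t in range(n):
        window = [channel[i] for i in range(t - half, t + half + 1) if 0 <= i < n]
        out.append(sum(window) / k)
    return out


def identity(n):
    """n-by-n identity matrix, used as placeholder excitation weights."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mrf_se_fuse(features, kernel_sizes=(1, 3, 5), w1=None, w2=None):
    """Hypothetical multi-receptive-field squeeze-and-excitation fusion.

    features: list of C channels, each a list of T floats.
    Returns (fused, gates): the gated feature map and the per-channel gates.
    """
    c = len(features)
    w1 = identity(c) if w1 is None else w1
    w2 = identity(c) if w2 is None else w2

    # One branch per receptive field; branches are summed channel by channel.
    branches = [[smooth(ch, k) for ch in features] for k in kernel_sizes]
    merged = [
        [sum(branch[ci][t] for branch in branches) for t in range(len(features[ci]))]
        for ci in range(c)
    ]

    # Squeeze: global average over time gives one summary value per channel.
    squeezed = [sum(ch) / len(ch) for ch in merged]

    # Excitation: linear layer + ReLU, then linear layer + sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]

    # Fusion: rescale every channel of the merged map by its global gate.
    fused = [[g * v for v in ch] for g, ch in zip(gates, merged)]
    return fused, gates
```

Because each gate is computed from a time-averaged summary of the whole sequence, every time step of a channel is rescaled using global context, which is the general idea behind squeeze-and-excitation. In a trained model the branch filters and the weights `w1` and `w2` would be learned rather than fixed.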