Room impulse response reconstruction with an improved time-domain U-Net network
Abstract: Predicting room impulse responses (RIRs) at additional positions from measurements made with a small number of microphones reduces the need for dense spatial sampling. This paper proposes a data-driven RIR prediction method based on an improved time-domain U-Net network, in which the basic building blocks of the network are optimized to strengthen its feature-learning ability. In the proposed method, the control region is divided into square blocks. The RIRs at the four vertices of each square are fed to the proposed U-Net network, and the target output is the RIR at the center of the square. Once trained to convergence, the network can predict RIRs at additional positions. Experiments on a dataset of recorded RIRs covering multiple room scenarios show that the proposed method outperforms the conventional singular value decomposition method in RIR reconstruction performance, and that the proposed U-Net network outperforms the original U-Net, CRN, and DCCRN networks. In a practical sound zone control application, using the RIRs predicted by the proposed method yields higher acoustic contrast between the bright and dark zones with fewer microphones, which further confirms the practical value of the method.
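The square-block scheme described in the abstract can be illustrated with a minimal sketch of how training pairs might be assembled. The abstract does not specify the data layout, so the following are assumptions for illustration only: the RIRs are measured on a regular rectangular grid and stored as an array of shape (nx, ny, T), and each square spans two grid steps so that its center coincides with a measured position. The function name `make_square_pairs` is hypothetical, and the network itself is not shown.

```python
import numpy as np


def make_square_pairs(rirs):
    """Build (four-vertex input, center target) pairs from gridded RIRs.

    rirs: array of shape (nx, ny, T), one RIR of length T per grid point
          (assumed layout, not taken from the paper).
    Returns inputs of shape (n, 4, T) and targets of shape (n, T).
    """
    nx, ny, _ = rirs.shape
    inputs, targets = [], []
    for i in range(nx - 2):
        for j in range(ny - 2):
            # Four vertices of a square whose side is two grid steps.
            corners = np.stack([
                rirs[i, j],
                rirs[i + 2, j],
                rirs[i, j + 2],
                rirs[i + 2, j + 2],
            ])
            inputs.append(corners)
            # The measured RIR at the center of that square is the target.
            targets.append(rirs[i + 1, j + 1])
    return np.stack(inputs), np.stack(targets)


# Synthetic stand-in for measured RIRs: a 5 x 5 grid, 256 samples each.
rng = np.random.default_rng(0)
grid = rng.standard_normal((5, 5, 256))
x, y = make_square_pairs(grid)
print(x.shape, y.shape)
```

Under these assumptions, a network trained on such pairs could then be applied to squares of one grid step, whose centers were never measured, which is one way the prediction of additional positions could be realized.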