Speech intelligibility of Chinese time-reversed speech (汉语音段反转言语的可懂度研究)
Abstract: This study examined how frame length affects the intelligibility of Chinese time-reversed speech, using a psychoacoustic experiment with time-reversal windows of different frame lengths. Intelligibility was high for frame lengths below 64 ms, decreased gradually as the frame length increased from 64 to 203 ms, and fell to approximately zero above 203 ms. At a frame length of 8 ms, intelligibility was reduced by distortion of the Chinese tones. Analysis of the modulation spectra of the original speech and the corresponding time-reversed speech showed that the amount of modulation-spectrum distortion is closely related to intelligibility. This distortion can therefore be measured by the normalized correlation between the narrow-band envelopes of the original speech and those of the time-reversed speech. Objective values computed with the speech-based speech transmission index (STI) method correlated significantly with the measured intelligibility (r = 0.876, p < 0.01). The results indicate that speech intelligibility is related to narrow-band envelopes, and that the intelligibility of time-reversed speech depends closely on how well the narrow-band envelopes of the original speech are preserved.
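The two operations named in the abstract, reversing speech within fixed-length frames and comparing envelopes by normalized correlation, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the 20 ms moving-average envelope, the synthetic amplitude-modulated tone used in place of speech, and the particular normalized-correlation formula are all assumptions made for illustration. In particular, the sketch uses a single broadband envelope, whereas the study works with narrow-band envelopes from a filter bank that the abstract does not specify.

```python
import numpy as np


def reverse_segments(signal, fs, frame_ms):
    """Split `signal` into consecutive frames of `frame_ms` milliseconds and
    reverse each frame in time; the order of the frames is unchanged."""
    signal = np.asarray(signal, dtype=float)
    frame_len = max(1, int(round(fs * frame_ms / 1000.0)))
    out = np.empty_like(signal)
    for start in range(0, len(signal), frame_len):
        stop = min(start + frame_len, len(signal))  # last frame may be short
        out[start:stop] = signal[start:stop][::-1]
    return out


def envelope(signal, fs, smooth_ms=20.0):
    """Crude amplitude envelope: full-wave rectification followed by a
    moving average (an assumption, not the paper's narrow-band filter bank)."""
    win = max(1, int(round(fs * smooth_ms / 1000.0)))
    kernel = np.ones(win) / win
    return np.convolve(np.abs(signal), kernel, mode="same")


def normalized_correlation(x, y):
    """Assumed definition: sum(x*y) / sqrt(sum(x^2) * sum(y^2))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    denom = np.sqrt(np.sum(x * x) * np.sum(y * y))
    return float(np.sum(x * y) / denom) if denom > 0 else 0.0


# Hypothetical test signal: a 500 Hz tone with 4 Hz amplitude modulation.
fs = 16000
t = np.arange(fs) / fs
x = (1.0 + np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 500 * t)

for frame_ms in (8, 64, 203):
    y = reverse_segments(x, fs, frame_ms)
    rho = normalized_correlation(envelope(x, fs), envelope(y, fs))
    print(f"frame length {frame_ms:>3} ms: envelope correlation {rho:.3f}")
```

To move closer to the measure described in the abstract, one would apply `envelope` and `normalized_correlation` separately to each band of a band-pass filter bank and then combine the per-band values. How the bands are defined and combined is not stated here and would have to be taken from the full paper.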