Indexed in EI / SCOPUS / CSCD

Chinese Core Journal

HU Jinbo, CAO Yin, WU Ming, YANG Feiran, YANG Jun. Loss function design for sound event localization and detection based on multi-task learning[J]. ACTA ACUSTICA, 2025, 50(2): 338-345. DOI: 10.12395/0371-0025.2024361

Loss function design for sound event localization and detection based on multi-task learning

  • The track-wise multi-task learning approach is effective at detecting overlapping sound sources in sound event localization and detection. However, as the number of predicted event classes increases, track-wise multi-task networks often produce sparse outputs, which leads to missed detections of sound events. To address this issue, this paper introduces an aggregated loss function that reformulates the multi-task learning framework as a single-task learning problem by coupling the activity of each sound event with its Cartesian direction-of-arrival vector. Furthermore, considering the characteristics of the track-wise output format, auxiliary duplicated targets are introduced to optimize the system outputs by replicating events from active tracks into inactive ones. Experimental results on a large-scale synthetic test set with 170 event classes demonstrate that the proposed method significantly improves sound event detection performance, effectively reduces the missed-detection rate, and achieves substantial improvements in localization and trajectory tracking. Experimental results on a real-scene dataset further demonstrate the effectiveness of the proposed method.
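The two ideas in the abstract, coupling event activity with the Cartesian direction-of-arrival vector and duplicating active-track targets into inactive tracks, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact loss, so the mean squared error, the 0.5 norm threshold, the round-robin duplication order, and all function names below are assumptions made for illustration, and any track-permutation handling is omitted.

```python
import math

ZERO = (0.0, 0.0, 0.0)  # target vector of an inactive track


def coupled_target(active, azimuth_deg=0.0, elevation_deg=0.0):
    """Return a unit Cartesian DOA vector if the event is active, else zero.

    Activity is encoded in the vector norm (1 = active, 0 = inactive),
    so a single regression target covers both detection and localization.
    """
    if not active:
        return ZERO
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))


def duplicate_into_inactive(tracks):
    """Fill inactive tracks with copies of the active tracks' targets.

    Assumed round-robin order; the paper may assign duplicates differently.
    """
    active = [t for t in tracks if t != ZERO]
    if not active:
        return list(tracks)  # nothing to duplicate
    out, k = [], 0
    for t in tracks:
        if t != ZERO:
            out.append(t)
        else:
            out.append(active[k % len(active)])
            k += 1
    return out


def mse_loss(pred_tracks, target_tracks):
    """Mean squared error over all tracks and vector components (assumed)."""
    sq = [(p - q) ** 2
          for pv, tv in zip(pred_tracks, target_tracks)
          for p, q in zip(pv, tv)]
    return sum(sq) / len(sq)


def is_active(vec, threshold=0.5):
    """Decode activity from the vector norm (threshold is an assumption)."""
    return math.sqrt(sum(c * c for c in vec)) > threshold


# One class, one frame, three tracks, a single active event at 90 degrees azimuth.
tracks = [coupled_target(True, 90.0, 0.0), ZERO, ZERO]
targets = duplicate_into_inactive(tracks)
loss = mse_loss(targets, targets)  # a perfect prediction gives zero loss
```

With duplicated targets, every track is trained toward a non-zero vector whenever the class is active, which is one plausible reading of how the method counteracts sparse outputs.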
