
Showing 1–2 of 2 results for author: Mu, D

  1. arXiv:2406.08771  [pdf, other]

    cs.SD cs.AI eess.AS

    MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection

    Authors: Da Mu, Zhicheng Zhang, Haobo Yue

    Abstract: Sound Event Localization and Detection (SELD) involves detecting and localizing sound events using multichannel sound recordings. Previously proposed Event-Independent Network V2 (EINV2) has achieved outstanding performance on SELD. However, it still faces challenges in effectively extracting features across spectral, spatial, and temporal domains. This paper proposes a three-stage network structu…

    Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  2. arXiv:2401.04976  [pdf, other]

    eess.AS cs.SD

    Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection

    Authors: Haobo Yue, Zhicheng Zhang, Da Mu, Yonghao Dang, Jianqin Yin, ** Tang

    Abstract: Recently, 2D convolution has been found unqualified in sound event detection (SED). It enforces translation equivariance on sound events along frequency axis, which is not a shift-invariant dimension. To address this issue, dynamic convolution is used to model the frequency dependency of sound events. In this paper, we proposed the first full-dynamic method named full-frequency dynamic convo…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: 6 pages, 4 figures, submitted to ICME2024