Showing 1–3 of 3 results for author: Azcarreta, J

  1. arXiv:2407.04879

    cs.SD eess.AS

    All Neural Low-latency Directional Speech Extraction

    Authors: Ashutosh Pandey, Sanha Lee, Juan Azcarreta, Daniel Wong, Buye Xu

    Abstract: We introduce a novel all neural model for low-latency directional speech extraction. The model uses direction of arrival (DOA) embeddings from a predefined spatial grid, which are transformed and fused into a recurrent neural network based speech extraction model. This process enables the model to effectively extract speech from a specified DOA. Unlike previous methods that relied on hand-crafted…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Accepted for publication at INTERSPEECH 2024

  2. arXiv:2010.13648

    eess.AS cs.SD

    Improving Sound Event Detection Metrics: Insights from DCASE 2020

    Authors: Giacomo Ferroni, Nicolas Turpault, Juan Azcarreta, Francesco Tuveri, Romain Serizel, Çagdaş Bilen, Sacha Krstulović

    Abstract: The ranking of sound event detection (SED) systems may be biased by assumptions inherent to evaluation criteria and to the choice of an operating point. This paper compares conventional event-based and segment-based criteria against the Polyphonic Sound Detection Score (PSDS)'s intersection-based criterion, over a selection of systems from DCASE 2020 Challenge Task 4. It shows that, by relying on…

    Submitted 26 October, 2020; originally announced October 2020.

  3. arXiv:1910.08440

    eess.AS cs.SD

    A Framework for the Robust Evaluation of Sound Event Detection

    Authors: Cagdas Bilen, Giacomo Ferroni, Francesco Tuveri, Juan Azcarreta, Sacha Krstulovic

    Abstract: This work defines a new framework for performance evaluation of polyphonic sound event detection (SED) systems, which overcomes the limitations of the conventional collar-based event decisions, event F-scores and event error rates. The proposed framework introduces a definition of event detection that is more robust against labelling subjectivity. It also resorts to polyphonic receiver operating c…

    Submitted 14 February, 2020; v1 submitted 18 October, 2019; originally announced October 2019.

    Comments: Accepted to ICASSP 2020