Showing 1–4 of 4 results for author: Heller, L M

Searching in archive eess.
  1. arXiv:2403.17529  [pdf, other]

    cs.SD eess.AS

    Detection of Deepfake Environmental Audio

    Authors: Hafsa Ouajdi, Oussama Hadder, Modan Tailleur, Mathieu Lagrange, Laurie M. Heller

    Abstract: With the ever-rising quality of deep generative models, it is increasingly important to be able to discern whether the audio data at hand have been recorded or synthesized. Although the detection of fake speech signals has been studied extensively, this is not the case for the detection of fake environmental audio. We propose a simple and efficient pipeline for detecting fake environmental sound…

    Submitted 13 June, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

  2. arXiv:2403.17508  [pdf, other]

    cs.SD eess.AS

    Correlation of Fréchet Audio Distance With Human Perception of Environmental Audio Is Embedding Dependant

    Authors: Modan Tailleur, Junwon Lee, Mathieu Lagrange, Keunwoo Choi, Laurie M. Heller, Keisuke Imoto, Yuki Okamoto

    Abstract: This paper explores whether considering alternative domain-specific embeddings to calculate the Fréchet Audio Distance (FAD) metric can help the FAD to correlate better with perceptual ratings of environmental sounds. We used embeddings from VGGish, PANNs, MS-CLAP, L-CLAP, and MERT, which are tailored for either music or environmental sound evaluation. The FAD scores were calculated for sounds fro…

    Submitted 26 March, 2024; originally announced March 2024.

  3. arXiv:2302.09719  [pdf, ps, other]

    eess.AS cs.SD

    Synergy between human and machine approaches to sound/scene recognition and processing: An overview of ICASSP special session

    Authors: Laurie M. Heller, Benjamin Elizalde, Bhiksha Raj, Soham Deshmukh

    Abstract: Machine Listening, as usually formalized, attempts to perform a task that is, from our perspective, fundamentally human-performable, and performed by humans. Current automated models of Machine Listening vary from purely data-driven approaches to approaches imitating human systems. In recent years, the most promising approaches have been hybrid in that they have used data-driven approaches informe…

    Submitted 23 February, 2023; v1 submitted 19 February, 2023; originally announced February 2023.

    Comments: 4 pages. Summary of Special Session planned for 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). https://2023.ieeeicassp.org/ Second version has corrected the spelling of an author's name.

  4. arXiv:2104.12693  [pdf, other]

    cs.SD eess.AS

    Identifying Actions for Sound Event Classification

    Authors: Benjamin Elizalde, Radu Revutchi, Samarjit Das, Bhiksha Raj, Ian Lane, Laurie M. Heller

    Abstract: In Psychology, actions are paramount for humans to identify sound events. In Machine Learning (ML), action recognition achieves high accuracy; however, it has not been asked whether identifying actions can benefit Sound Event Classification (SEC), as opposed to mapping the audio directly to a sound event. Therefore, we propose a new Psychology-inspired approach for SEC that includes identification…

    Submitted 5 August, 2021; v1 submitted 26 April, 2021; originally announced April 2021.