Showing 1–10 of 10 results for author: Salvi, D

Searching in archive cs.
  1. arXiv:2404.10989  [pdf, other]

    cs.CV cs.LG cs.MM cs.SD eess.AS

    FairSSD: Understanding Bias in Synthetic Speech Detectors

    Authors: Amit Kumar Singh Yadav, Kratika Bhagtani, Davide Salvi, Paolo Bestagini, Edward J. Delp

    Abstract: Methods that can generate synthetic speech which is perceptually indistinguishable from speech recorded by a human speaker, are easily available. Several incidents report misuse of synthetic speech generated from these methods to commit fraud. To counter such misuse, many methods have been proposed to detect synthetic speech. Some of these detectors are more interpretable, can generalize to detect…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Accepted at CVPR 2024 (WMF)

  2. arXiv:2402.05567  [pdf, other]

    cs.SD cs.MM eess.AS

    Listening Between the Lines: Synthetic Speech Detection Disregarding Verbal Content

    Authors: Davide Salvi, Temesgen Semu Balcha, Paolo Bestagini, Stefano Tubaro

    Abstract: Recent advancements in synthetic speech generation have led to the creation of forged audio data that are almost indistinguishable from real speech. This phenomenon poses a new challenge for the multimedia forensics community, as the misuse of synthetic media can potentially cause adverse consequences. Several methods have been proposed in the literature to mitigate potential risks and detect synt…

    Submitted 8 February, 2024; originally announced February 2024.

  3. arXiv:2307.15555  [pdf, other]

    cs.SD cs.CL cs.CR eess.AS

    All-for-One and One-For-All: Deep learning-based feature fusion for Synthetic Speech Detection

    Authors: Daniele Mari, Davide Salvi, Paolo Bestagini, Simone Milani

    Abstract: Recent advances in deep learning and computer vision have made the synthesis and counterfeiting of multimedia content more accessible than ever, leading to possible threats and dangers from malicious users. In the audio field, we are witnessing the growth of speech deepfake generation techniques, which solicit the development of synthetic speech detection algorithms to counter possible mischievous…

    Submitted 28 July, 2023; originally announced July 2023.

    Comments: Accepted at ECML-PKDD 2023 Workshop "Deep Learning and Multimedia Forensics. Combating fake media and misinformation"

  4. arXiv:2210.17222  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    Combining Automatic Speaker Verification and Prosody Analysis for Synthetic Speech Detection

    Authors: Luigi Attorresi, Davide Salvi, Clara Borrelli, Paolo Bestagini, Stefano Tubaro

    Abstract: The rapid spread of media content synthesis technology and the potentially damaging impact of audio and video deepfakes on people's lives have raised the need to implement systems able to detect these forgeries automatically. In this work we present a novel approach for synthetic speech detection, exploiting the combination of two high-level semantic properties of the human voice. On one side, we…

    Submitted 31 October, 2022; originally announced October 2022.

  5. arXiv:2209.08000  [pdf, other]

    cs.MM

    TIMIT-TTS: a Text-to-Speech Dataset for Multimodal Synthetic Media Detection

    Authors: Davide Salvi, Brian Hosler, Paolo Bestagini, Matthew C. Stamm, Stefano Tubaro

    Abstract: With the rapid development of deep learning techniques, the generation and counterfeiting of multimedia material are becoming increasingly straightforward to perform. At the same time, sharing fake content on the web has become so simple that malicious users can create unpleasant situations with minimal effort. Also, forged media are getting more and more complex, with manipulated videos that are…

    Submitted 16 September, 2022; originally announced September 2022.

  6. arXiv:2102.07133  [pdf, other]

    cs.SD cs.LG eess.AS

    Parametric Optimization of Violin Top Plates using Machine Learning

    Authors: Davide Salvi, Sebastian Gonzalez, Fabio Antonacci, Augusto Sarti

    Abstract: We recently developed a neural network that receives as input the geometrical and mechanical parameters that define a violin top plate and gives as output its first ten eigenfrequencies computed in free boundary conditions. In this manuscript, we use the network to optimize several error functions, with the goal of analyzing the relationship between the eigenspectrum problem for violin top plates…

    Submitted 18 February, 2021; v1 submitted 14 February, 2021; originally announced February 2021.

  7. arXiv:2102.04254  [pdf, other]

    cs.CE cs.AI cs.LG cs.SD eess.AS

    A Data-Driven Approach to Violin Making

    Authors: Sebastian Gonzalez, Davide Salvi, Daniel Baeza, Fabio Antonacci, Augusto Sarti

    Abstract: Of all the characteristics of a violin, those that concern its shape are probably the most important ones, as the violin maker has complete control over them. Contemporary violin making, however, is still based more on tradition than understanding, and a definitive scientific study of the specific relations that exist between shape and vibrational properties is yet to come and sorely missed. In th…

    Submitted 2 February, 2021; originally announced February 2021.

  8. arXiv:1408.6418  [pdf]

    cs.CV cs.CL cs.IR

    Video In Sentences Out

    Authors: Andrei Barbu, Alexander Bridge, Zachary Burchill, Dan Coroian, Sven Dickinson, Sanja Fidler, Aaron Michaux, Sam Mussman, Siddharth Narayanaswamy, Dhaval Salvi, Lara Schmidt, Jiangnan Shangguan, Jeffrey Mark Siskind, Jarrell Waggoner, Song Wang, Jinlian Wei, Yifan Yin, Zhiqi Zhang

    Abstract: We present a system that produces sentential descriptions of video: who did what to whom, and where and how they did it. Action class is rendered as a verb, participant objects as noun phrases, properties of those objects as adjectival modifiers in those noun phrases, spatial relations between those participants as prepositional phrases, and characteristics of the event as prepositional-phrase adj…

    Submitted 9 August, 2014; originally announced August 2014.

    Comments: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

    Report number: UAI-P-2012-PG-102-112

  9. arXiv:1204.3616  [pdf, other]

    cs.CV cs.AI

    Large-Scale Automatic Labeling of Video Events with Verbs Based on Event-Participant Interaction

    Authors: Andrei Barbu, Alexander Bridge, Dan Coroian, Sven Dickinson, Sam Mussman, Siddharth Narayanaswamy, Dhaval Salvi, Lara Schmidt, Jiangnan Shangguan, Jeffrey Mark Siskind, Jarrell Waggoner, Song Wang, Jinlian Wei, Yifan Yin, Zhiqi Zhang

    Abstract: We present an approach to labeling short video clips with English verbs as event descriptions. A key distinguishing aspect of this work is that it labels videos with verbs that describe the spatiotemporal interaction between event participants, humans and objects interacting with each other, abstracting away all object-class information and fine-grained image characteristics, and relying solely on…

    Submitted 16 April, 2012; originally announced April 2012.

  10. arXiv:1204.2742  [pdf, other]

    cs.CV cs.AI

    Video In Sentences Out

    Authors: Andrei Barbu, Alexander Bridge, Zachary Burchill, Dan Coroian, Sven Dickinson, Sanja Fidler, Aaron Michaux, Sam Mussman, Siddharth Narayanaswamy, Dhaval Salvi, Lara Schmidt, Jiangnan Shangguan, Jeffrey Mark Siskind, Jarrell Waggoner, Song Wang, Jinlian Wei, Yifan Yin, Zhiqi Zhang

    Abstract: We present a system that produces sentential descriptions of video: who did what to whom, and where and how they did it. Action class is rendered as a verb, participant objects as noun phrases, properties of those objects as adjectival modifiers in those noun phrases, spatial relations between those participants as prepositional phrases, and characteristics of the event as prepositional-phrase adju…

    Submitted 12 April, 2012; originally announced April 2012.