Showing 1–21 of 21 results for author: Delp, E

  1. arXiv:2404.10989  [pdf, other]

    cs.CV cs.LG cs.MM cs.SD eess.AS

    FairSSD: Understanding Bias in Synthetic Speech Detectors

    Authors: Amit Kumar Singh Yadav, Kratika Bhagtani, Davide Salvi, Paolo Bestagini, Edward J. Delp

    Abstract: Methods that can generate synthetic speech that is perceptually indistinguishable from speech recorded by a human speaker are easily available. Several incidents report misuse of synthetic speech generated from these methods to commit fraud. To counter such misuse, many methods have been proposed to detect synthetic speech. Some of these detectors are more interpretable, can generalize to detect…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Accepted at CVPR 2024 (WMF)

  2. arXiv:2402.14205  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS eess.SP

    Compression Robust Synthetic Speech Detection Using Patched Spectrogram Transformer

    Authors: Amit Kumar Singh Yadav, Ziyue Xiang, Kratika Bhagtani, Paolo Bestagini, Stefano Tubaro, Edward J. Delp

    Abstract: Many deep learning synthetic speech generation tools are readily available. The use of synthetic speech has caused financial fraud, impersonation of people, and misinformation to spread. For this reason, forensic methods that can detect synthetic speech have been proposed. Existing methods often overfit on one dataset and their performance reduces substantially in practical scenarios such as detect…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted as long oral paper at ICMLA 2023

  3. End-to-end Evaluation of Practical Video Analytics Systems for Face Detection and Recognition

    Authors: Praneet Singh, Edward J. Delp, Amy R. Reibman

    Abstract: Practical video analytics systems that are deployed in bandwidth constrained environments like autonomous vehicles perform computer vision tasks such as face detection and recognition. In an end-to-end face analytics system, inputs are first compressed using popular video codecs like HEVC and then passed onto modules that perform face detection, alignment, and recognition sequentially. Typically,…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted to Autonomous Vehicles and Machines 2023 Conference, IS&T Electronic Imaging (EI) Symposium

    Journal ref: Electronic Imaging, 2023, pp 111-1 - 111-6

  4. arXiv:2304.03323  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    DSVAE: Interpretable Disentangled Representation for Synthetic Speech Detection

    Authors: Amit Kumar Singh Yadav, Kratika Bhagtani, Ziyue Xiang, Paolo Bestagini, Stefano Tubaro, Edward J. Delp

    Abstract: Tools to generate high quality synthetic speech signal that is perceptually indistinguishable from speech recorded from human speakers are easily available. Several approaches have been proposed for detecting synthetic speech. Many of these approaches use deep learning methods as a black box without providing reasoning for the decisions they make. This limits the interpretability of these approach…

    Submitted 28 July, 2023; v1 submitted 6 April, 2023; originally announced April 2023.

  5. arXiv:2301.09702  [pdf, other]

    eess.IV cs.CV cs.LG

    Illumination Variation Correction Using Image Synthesis For Unsupervised Domain Adaptive Person Re-Identification

    Authors: Jiaqi Guo, Amy R. Reibman, Edward J. Delp

    Abstract: Unsupervised domain adaptive (UDA) person re-identification (re-ID) aims to learn identity information from labeled images in source domains and apply it to unlabeled images in a target domain. One major issue with many unsupervised re-identification methods is that they do not perform well relative to large domain variations such as illumination, viewpoint, and occlusions. In this paper, we propo…

    Submitted 14 November, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

    Comments: 10 pages, 5 figures, 5 tables

  6. arXiv:2210.07546  [pdf, other]

    cs.SD cs.CV eess.AS

    Transformer-Based Speech Synthesizer Attribution in an Open Set Scenario

    Authors: Emily R. Bartusiak, Edward J. Delp

    Abstract: Speech synthesis methods can create realistic-sounding speech, which may be used for fraud, spoofing, and misinformation campaigns. Forensic methods that detect synthesized speech are important for protection against such attacks. Forensic attribution methods provide even more information about the nature of synthesized speech signals because they identify the specific speech synthesis method (i.e…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: Accepted to the 2022 IEEE International Conference on Machine Learning and Applications

    Journal ref: IEEE International Conference on Machine Learning and Applications, pp. 1-8, December 2022, Nassau, The Bahamas

  7. arXiv:2205.03947  [pdf, other]

    cs.CV eess.IV

    High-Resolution UAV Image Generation for Sorghum Panicle Detection

    Authors: Enyu Cai, Zhankun Luo, Sriram Baireddy, Jiaqi Guo, Changye Yang, Edward J. Delp

    Abstract: The number of panicles (or heads) of Sorghum plants is an important phenotypic trait for plant development and grain yield estimation. The use of Unmanned Aerial Vehicles (UAVs) enables the capability of collecting and analyzing Sorghum images on a large scale. Deep learning can provide methods for estimating phenotypic traits from UAV images but requires a large amount of labeled data. The lack o…

    Submitted 8 May, 2022; originally announced May 2022.

  8. arXiv:2205.01806  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS

    Frequency Domain-Based Detection of Generated Audio

    Authors: Emily R. Bartusiak, Edward J. Delp

    Abstract: Attackers may manipulate audio with the intent of presenting falsified reports, changing an opinion of a public figure, and winning influence and power. The prevalence of inauthentic multimedia continues to rise, so it is imperative to develop a set of tools that determines the legitimacy of media. We present a method that analyzes audio signals to determine whether they contain real human voices…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: Accepted to the 2021 Media Watermarking, Security, and Forensics Conference, IS&T Electronic Imaging Symposium (EI)

    Journal ref: Proceedings of the Media Watermarking, Security, and Forensics Conference, IS&T Electronic Imaging Symposium, pp 273-1 - 273-7, January 2021, Burlingame, CA

  9. arXiv:2205.01805  [pdf, other]

    cs.CV cs.LG eess.IV

    Splicing Detection and Localization In Satellite Imagery Using Conditional GANs

    Authors: Emily R. Bartusiak, Sri Kalyan Yarlagadda, David Güera, Paolo Bestagini, Stefano Tubaro, Fengqing M. Zhu, Edward J. Delp

    Abstract: The widespread availability of image editing tools and improvements in image processing techniques allow image manipulation to be very easy. Oftentimes, easy-to-use yet sophisticated image manipulation tools yield distortions/changes imperceptible to the human observer. Distribution of forged images can have drastic ramifications, especially when coupled with the speed and vastness of the Interne…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: Accepted to the 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)

    Journal ref: IEEE Conference on Multimedia Information Processing and Retrieval, pp. 91-96, March 2019, San Jose, CA

  10. arXiv:2205.01800  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS

    Synthesized Speech Detection Using Convolutional Transformer-Based Spectrogram Analysis

    Authors: Emily R. Bartusiak, Edward J. Delp

    Abstract: Synthesized speech is common today due to the prevalence of virtual assistants, easy-to-use tools for generating and modifying speech signals, and remote work practices. Synthesized speech can also be used for nefarious purposes, including creating a purported speech signal and attributing it to someone who did not speak the content of the signal. We need methods to detect if a speech signal is sy…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: Accepted to the 2021 IEEE Asilomar Conference on Signals, Systems, and Computers

    Journal ref: IEEE Asilomar Conference on Signals, Systems, and Computers, pp. 1426-1430, October 2021, Asilomar, CA

  11. Forensic Analysis and Localization of Multiply Compressed MP3 Audio Using Transformers

    Authors: Ziyue Xiang, Paolo Bestagini, Stefano Tubaro, Edward J. Delp

    Abstract: Audio signals are often stored and transmitted in compressed formats. Among the many available audio compression schemes, MPEG-1 Audio Layer III (MP3) is very popular and widely used. Since MP3 is lossy, it leaves characteristic traces in the compressed audio which can be used forensically to expose the past history of an audio file. In this paper, we consider the scenario of audio signal manipulat…

    Submitted 28 April, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

  12. arXiv:2107.07308  [pdf, ps, other]

    eess.IV

    Panicle Counting in UAV Images For Estimating Flowering Time in Sorghum

    Authors: Enyu Cai, Sriram Baireddy, Changye Yang, Melba Crawford, Edward J. Delp

    Abstract: Flowering time (time to flower after planting) is important for estimating plant development and grain yield for many crops including sorghum. Flowering time of sorghum can be approximated by counting the number of panicles (clusters of grains on a branch) across multiple dates. Traditional manual methods for panicle counting are time-consuming and tedious. In this paper, we propose a method for e…

    Submitted 15 July, 2021; originally announced July 2021.

  13. arXiv:2106.15753  [pdf, ps, other]

    eess.IV cs.CV

    RCNN-SliceNet: A Slice and Cluster Approach for Nuclei Centroid Detection in Three-Dimensional Fluorescence Microscopy Images

    Authors: Liming Wu, Shuo Han, Alain Chen, Paul Salama, Kenneth W. Dunn, Edward J. Delp

    Abstract: Robust and accurate nuclei centroid detection is important for the understanding of biological structures in fluorescence microscopy images. Existing automated nuclei localization methods face three main challenges: (1) Most object detection methods work only on 2D images and are difficult to extend to 3D volumes; (2) Segmentation-based models can be used on 3D volumes but it is computational e…

    Submitted 4 November, 2021; v1 submitted 29 June, 2021; originally announced June 2021.

  14. arXiv:2105.06373  [pdf, other]

    eess.IV

    Manipulation Detection in Satellite Images Using Vision Transformer

    Authors: János Horváth, Sriram Baireddy, Hanxiang Hao, Daniel Mas Montserrat, Edward J. Delp

    Abstract: A growing number of commercial satellite companies provide easily accessible satellite imagery. Overhead imagery is used by numerous industries including agriculture, forestry, natural disaster analysis, and meteorology. Satellite images, just as any other images, can be tampered with using image manipulation tools. Manipulation detection methods created for images captured by "consumer cameras" tend to…

    Submitted 13 May, 2021; originally announced May 2021.

  15. arXiv:2004.12441  [pdf, other]

    eess.IV

    Manipulation Detection in Satellite Images Using Deep Belief Networks

    Authors: János Horváth, Daniel Mas Montserrat, Hanxiang Hao, Edward J. Delp

    Abstract: Satellite images are more accessible with the increase of commercial satellites being orbited. These images are used in a wide range of applications including agricultural management, meteorological prediction, damage assessment from natural disasters, and cartography. Image manipulation tools including both manual editing tools and automated techniques can be easily used to tamper and modify sate…

    Submitted 26 April, 2020; originally announced April 2020.

  16. arXiv:2004.12027  [pdf, other]

    cs.CV eess.IV

    Deepfakes Detection with Automatic Face Weighting

    Authors: Daniel Mas Montserrat, Hanxiang Hao, S. K. Yarlagadda, Sriram Baireddy, Ruiting Shao, János Horváth, Emily Bartusiak, Justin Yang, David Güera, Fengqing Zhu, Edward J. Delp

    Abstract: Altered and manipulated multimedia is increasingly present and widely distributed via social media platforms. Advanced video manipulation tools enable the generation of highly realistic-looking altered multimedia. While many methods have been presented to detect manipulations, most of them fail when evaluated with data outside of the datasets used in research environments. In order to address this…

    Submitted 4 May, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

  17. Center-Extraction-Based Three Dimensional Nuclei Instance Segmentation of Fluorescence Microscopy Images

    Authors: David Joon Ho, Shuo Han, Chichen Fu, Paul Salama, Kenneth W. Dunn, Edward J. Delp

    Abstract: Fluorescence microscopy is an essential tool for the analysis of 3D subcellular structures in tissue. An important step in the characterization of tissue involves nuclei segmentation. In this paper, a two-stage method for segmentation of nuclei using convolutional neural networks (CNNs) is described. In particular, since creating labeled volumes manually for training purposes is not practical due…

    Submitted 12 September, 2019; originally announced September 2019.

    Comments: Presented at the IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI 2019)

  18. arXiv:1906.11979  [pdf, other]

    cs.CV cs.CR cs.LG eess.IV

    A Utility-Preserving GAN for Face Obscuration

    Authors: Hanxiang Hao, David Güera, Amy R. Reibman, Edward J. Delp

    Abstract: From TV news to Google StreetView, face obscuration has been used for privacy protection. Due to recent advances in the field of deep learning, obscuration methods such as Gaussian blurring and pixelation are not guaranteed to conceal identity. In this paper, we propose a utility-preserving generative model, UP-GAN, that is able to provide an effective face obscuration, while preserving facial uti…

    Submitted 27 June, 2019; originally announced June 2019.

    Comments: 6 pages, 5 figures, presented at the ICML 2019 Workshop on Synthetic Realities: Deep Learning for Detecting AudioVisual Fakes

  19. arXiv:1803.04061  [pdf, ps, other]

    eess.IV cs.MM

    Multi-Reference Video Coding Using Stillness Detection

    Authors: Di Chen, Zoe Liu, Yaowu Xu, Fengqing Zhu, Edward Delp

    Abstract: Encoders of the AOM/AV1 codec consider an input video sequence as a succession of frames grouped in Golden-Frame (GF) groups. The coding structure of a GF group is fixed with a given GF group size. In the current AOM/AV1 encoder, video frames are coded using a hierarchical, multilayer coding structure within one GF group. It has been observed that the use of a multilayer coding structure may result in wor…

    Submitted 11 March, 2018; originally announced March 2018.

    Comments: 4 pages, 3 figures, IS&T Electronic Imaging on Visual Information Processing and Communication Conference. (2018)

  20. Tubule segmentation of fluorescence microscopy images based on convolutional neural networks with inhomogeneity correction

    Authors: Soonam Lee, Chichen Fu, Paul Salama, Kenneth W. Dunn, Edward J. Delp

    Abstract: Fluorescence microscopy has become a widely used tool for studying various biological structures of in vivo tissue or cells. However, quantitative analysis of these biological structures remains a challenge due to their complexity which is exacerbated by distortions caused by lens aberrations and light scattering. Moreover, manual quantification of such image volumes is an intractable and error-pr…

    Submitted 10 February, 2018; originally announced February 2018.

    Comments: IS&T International Symposium on Electronic Imaging 2018

  21. arXiv:1801.07198  [pdf, ps, other]

    cs.CV cs.LG eess.IV

    Three Dimensional Fluorescence Microscopy Image Synthesis and Segmentation

    Authors: Chichen Fu, Soonam Lee, David Joon Ho, Shuo Han, Paul Salama, Kenneth W. Dunn, Edward J. Delp

    Abstract: Advances in fluorescence microscopy enable acquisition of 3D image volumes with better image quality and deeper penetration into tissue. Segmentation is a required step to characterize and analyze biological structures in the images and recent 3D segmentation using deep learning has achieved promising results. One issue is that deep learning techniques require a large set of groundtruth data which…

    Submitted 20 April, 2018; v1 submitted 22 January, 2018; originally announced January 2018.

    Comments: Accepted by CVPR Workshop on Computer Vision for Microscopy Image Analysis (CVMI)