arXiv:2403.19593

Frame by Familiar Frame: Understanding Replication in Video Diffusion Models

Authors: Aimon Rahman, Malsha V. Perera, Vishal M. Patel

Abstract: Building on the momentum of image generation diffusion models, there is an increasing interest in video-based diffusion models. However, video generation poses greater challenges due to its higher-dimensional nature, the scarcity of training data, and the complex spatiotemporal relationships involved. Image generation models, due to their extensive data requirements, have already strained computational resources to their limits. There have been instances of these models reproducing elements from the training samples, leading to concerns and even legal disputes over sample replication. Video diffusion models, which operate with even more constrained datasets and are tasked with generating both spatial and temporal content, may be more prone to replicating samples from their training sets. Compounding the issue, these models are often evaluated using metrics that inadvertently reward replication. In our paper, we present a systematic investigation into the phenomenon of sample replication in video diffusion models. We scrutinize various recent diffusion models for video synthesis, assessing their tendency to replicate spatial and temporal content in both unconditional and conditional generation scenarios. Our study identifies strategies that are less likely to lead to replication. Furthermore, we propose new evaluation strategies that take replication into account, offering a more accurate measure of a model's ability to generate the original content.

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2305.06402

Analyzing Bias in Diffusion-based Face Generation Models

Authors: Malsha V. Perera, Vishal M. Patel

Abstract: Diffusion models are becoming increasingly popular in synthetic data generation and image editing applications. However, these models can amplify existing biases and propagate them to downstream applications. Therefore, it is crucial to understand the sources of bias in their outputs. In this paper, we investigate the presence of bias in diffusion-based face generation models with respect to attributes such as gender, race, and age. Moreover, we examine how dataset size affects the attribute composition and perceptual quality of both diffusion and Generative Adversarial Network (GAN) based face generation models across various attribute classes. Our findings suggest that diffusion models tend to worsen distribution bias in the training data for various attributes, which is heavily influenced by the size of the dataset. Conversely, GAN models trained on balanced datasets with a larger number of samples show less bias across different attributes.

Submitted 10 May, 2023; originally announced May 2023.

arXiv:2206.04514

doi 10.1109/LGRS.2023.3270799

SAR Despeckling using a Denoising Diffusion Probabilistic Model

Authors: Malsha V. Perera, Nithin Gopalakrishnan Nair, Wele Gedara Chaminda Bandara, Vishal M. Patel

Abstract: Speckle is a multiplicative noise that affects all coherent imaging modalities, including Synthetic Aperture Radar (SAR) images. The presence of speckle degrades the image quality and adversely affects the performance of SAR image understanding applications such as automatic target recognition and change detection. Thus, SAR despeckling is an important problem in remote sensing. In this paper, we introduce SAR-DDPM, a denoising diffusion probabilistic model for SAR despeckling. The proposed method comprises a Markov chain that transforms clean images to white Gaussian noise by repeatedly adding random noise. The despeckled image is recovered by a reverse process which iteratively predicts the added noise using a noise predictor which is conditioned on the speckled image. In addition, we propose a new inference strategy based on cycle spinning to improve the despeckling performance. Our experiments on both synthetic and real SAR images demonstrate that the proposed method achieves significant improvements in both quantitative and qualitative results over the state-of-the-art despeckling methods.

Submitted 9 June, 2022; originally announced June 2022.

Comments: Our code is available at https://github.com/malshaV/SAR_DDPM

arXiv:2205.15906

SAR Despeckling Using Overcomplete Convolutional Networks

Authors: Malsha V. Perera, Wele Gedara Chaminda Bandara, Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: Synthetic Aperture Radar (SAR) despeckling is an important problem in remote sensing as speckle degrades SAR images, affecting downstream tasks like detection and segmentation. Recent studies show that convolutional neural networks (CNNs) outperform classical despeckling methods. Traditional CNNs try to increase the receptive field size as the network goes deeper, thus extracting global features. However, speckle is relatively small, and increasing the receptive field does not help in extracting speckle features. This study employs an overcomplete CNN architecture to focus on learning low-level features by restricting the receptive field. The proposed network consists of an overcomplete branch to focus on the local structures and an undercomplete branch that focuses on the global structures. We show that the proposed network improves despeckling performance compared to recent despeckling methods on synthetic and real SAR images.

Submitted 31 May, 2022; originally announced May 2022.

Comments: Accepted to International Geoscience and Remote Sensing Symposium (IGARSS), 2022. Our code is available at https://github.com/malshaV/sar_overcomplete

arXiv:2201.09355

Transformer-based SAR Image Despeckling

Authors: Malsha V. Perera, Wele Gedara Chaminda Bandara, Jeya Maria Jose Valanarasu, Vishal M. Patel

Abstract: Synthetic Aperture Radar (SAR) images are usually degraded by a multiplicative noise known as speckle, which makes processing and interpretation of SAR images difficult. In this paper, we introduce a transformer-based network for SAR image despeckling. The proposed despeckling network comprises a transformer-based encoder which allows the network to learn global dependencies between different image regions, aiding in better despeckling. The network is trained end-to-end with synthetically generated speckled images using a composite loss function. Experiments show that the proposed method achieves significant improvements over traditional and convolutional neural network-based despeckling methods on both synthetic and real SAR images.

Submitted 23 January, 2022; originally announced January 2022.

Comments: Submitted to International Geoscience and Remote Sensing Symposium (IGARSS), 2022. Our code is available at https://github.com/malshaV/sar_transformer

arXiv:2010.13268

A Joint Convolutional and Spatial Quad-Directional LSTM Network for Phase Unwrapping

Authors: Malsha V. Perera, Ashwin De Silva

Abstract: Phase unwrapping is a classical ill-posed problem which aims to recover the true phase from wrapped phase. In this paper, we introduce a novel Convolutional Neural Network (CNN) that incorporates a Spatial Quad-Directional Long Short Term Memory (SQD-LSTM) for phase unwrapping, by formulating it as a regression problem. Incorporating SQD-LSTM can circumvent the typical CNNs' inherent difficulty of learning global spatial dependencies, which are vital when recovering the true phase. Furthermore, we employ a problem-specific composite loss function to train this network. The proposed network is found to perform better than the existing methods under severe noise conditions (Normalized Root Mean Square Error of 1.3% at SNR = 0 dB) while requiring significantly less computational time (0.054 s). The network also does not require a large-scale dataset during training, thus making it ideal for applications with limited data that require fast and accurate phase unwrapping.

Submitted 25 October, 2020; originally announced October 2020.

Comments: Under Review

arXiv:2009.02575

doi 10.1109/SMC42975.2020.9283285

Low-cost Active Dry-Contact Surface EMG Sensor for Bionic Arms

Authors: Asma M. Naim, Kithmin Wickramasinghe, Ashwin De Silva, Malsha V. Perera, Thilina Dulantha Lalitharatne, Simon L. Kappel

Abstract: Surface electromyography (sEMG) is a popular bio-signal used for controlling prostheses and finger gesture recognition mechanisms. Myoelectric prostheses are costly, and most commercially available sEMG acquisition systems are not suitable for real-time gesture recognition. In this paper, a method of acquiring sEMG signals using novel low-cost, active, dry-contact, flexible sensors has been proposed. Since the active sEMG sensor was developed to be used along with a bionic arm, the sensor was tested for its ability to acquire sEMG signals that could be used for real-time classification of five selected gestures. In a study of 4 subjects, the average classification accuracy for real-time gesture classification using the active sEMG sensor system was 85%. The common-mode rejection ratio of the sensor was measured at 59 dB, and thus the sensor's performance was not substantially limited by its active circuitry. The proposed sensors can be interfaced with a variety of amplifiers to perform fully wearable sEMG acquisition. This satisfies the need for a low-cost sEMG acquisition system for prostheses.

Submitted 9 September, 2020; v1 submitted 5 September, 2020; originally announced September 2020.

Comments: Paper accepted to IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2020

Showing 1–7 of 7 results for author: Perera, M V