Search | arXiv e-print repository

Few-shot Image Generation via Adaptation-Aware Kernel Modulation

Authors: Yunqing Zhao, Keshigeyan Chandrasegaran, Milad Abdollahzadeh, Ngai-Man Cheung

Abstract: Few-shot image generation (FSIG) aims to learn to generate new and diverse samples given an extremely limited number of samples from a domain, e.g., 10 training samples. Recent work has addressed the problem using a transfer learning approach, leveraging a GAN pretrained on a large-scale source domain dataset and adapting that model to the target domain based on very limited target domain samples. Central to recent FSIG methods are knowledge preserving criteria, which aim to select a subset of the source model's knowledge to be preserved into the adapted model. However, a major limitation of existing methods is that their knowledge preserving criteria consider only the source domain/source task, and they fail to consider the target domain/adaptation task in selecting the source model's knowledge, casting doubt on their suitability for setups of different proximity between source and target domain. Our work makes two contributions. As our first contribution, we re-visit recent FSIG works and their experiments. Our important finding is that, under setups in which the assumption of close proximity between source and target domains is relaxed, existing state-of-the-art (SOTA) methods which consider only the source domain/source task in knowledge preserving perform no better than a baseline fine-tuning method. To address the limitation of existing methods, as our second contribution, we propose Adaptation-Aware kernel Modulation (AdAM) to address general FSIG of different source-target domain proximity. Extensive experimental results show that the proposed method consistently achieves SOTA performance across source/target domains of different proximity, including challenging setups when source and target domains are further apart. Project Page: https://yunqing-me.github.io/AdAM/

Submitted 9 May, 2023; v1 submitted 29 October, 2022; originally announced October 2022.

Comments: The Thirty-Sixth Annual Conference on Neural Information Processing Systems (NeurIPS 2022), 14 pages

arXiv:2208.00352 [pdf]

Neural Correlates of Face Familiarity Perception

Authors: Evan Ehrenberg, Kleovoulos Leo Tsourides, Hossein Nejati, Ngai-Man Cheung, Pawan Sinha

Abstract: In the domain of face recognition, there exists a puzzling timing discrepancy between results from macaque neurophysiology on the one hand and human electrophysiology on the other. Single unit recordings in macaques have demonstrated face identity specific responses in extra-striate visual cortex within 100 milliseconds of stimulus onset. In EEG and MEG experiments with humans, however, a consistent distinction between neural activity corresponding to unfamiliar and familiar faces has been reported to emerge around 250 ms. This points to the possibility that there may be a hitherto undiscovered early correlate of face familiarity perception in human electrophysiological traces. We report here a successful search for such a correlate in dense MEG recordings using pattern classification techniques. Our analyses reveal markers of face familiarity as early as 85 ms after stimulus onset. Low-level attributes of the images, such as luminance and color distributions, are unable to account for this early emerging response difference. These results help reconcile human and macaque data, and provide clues regarding neural mechanisms underlying familiar face perception.

Submitted 30 July, 2022; originally announced August 2022.

arXiv:2205.03805 [pdf, other]

A Closer Look at Few-shot Image Generation

Authors: Yunqing Zhao, Henghui Ding, Houjing Huang, Ngai-Man Cheung

Abstract: Modern GANs excel at generating high quality and diverse images. However, when transferring pretrained GANs to small target data (e.g., 10-shot), the generator tends to replicate the training samples. Several methods have been proposed to address this few-shot image generation task, but there is a lack of effort to analyze them under a unified framework. As our first contribution, we propose a framework to analyze existing methods during the adaptation. Our analysis discovers that while some methods have a disproportionate focus on diversity preservation, which impedes quality improvement, all methods achieve similar quality after convergence. Therefore, the better methods are those that can slow down diversity degradation. Furthermore, our analysis reveals that there is still plenty of room to further slow down diversity degradation. Informed by our analysis and to slow down the diversity degradation of the target generator during adaptation, our second contribution proposes to apply mutual information (MI) maximization to retain the source domain's rich multi-level diversity information in the target domain generator. We propose to perform MI maximization by contrastive loss (CL), leveraging the generator and discriminator as two feature encoders to extract different multi-level features for computing CL. We refer to our method as Dual Contrastive Learning (DCL). Extensive experiments on several public datasets show that, while leading to a slower diversity-degrading generator during adaptation, our proposed DCL brings visually pleasant quality and state-of-the-art quantitative performance. Project Page: yunqing-me.github.io/A-Closer-Look-at-FSIG.

Submitted 15 April, 2023; v1 submitted 8 May, 2022; originally announced May 2022.

Comments: 12 figures, 4 tables, The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR) 2022

arXiv:2103.17195 [pdf, other]

A Closer Look at Fourier Spectrum Discrepancies for CNN-generated Images Detection

Authors: Keshigeyan Chandrasegaran, Ngoc-Trung Tran, Ngai-Man Cheung

Abstract: CNN-based generative modelling has evolved to produce synthetic images indistinguishable from real images in the RGB pixel space. Recent works have observed that CNN-generated images share a systematic shortcoming in replicating high frequency Fourier spectrum decay attributes. Furthermore, these works have successfully exploited this systematic shortcoming to detect CNN-generated images, reporting up to 99% accuracy across multiple state-of-the-art GAN models. In this work, we investigate the validity of assertions claiming that CNN-generated images are unable to achieve high frequency spectral decay consistency. We meticulously construct a counterexample space of high frequency spectral decay consistent CNN-generated images emerging from our handcrafted experiments using DCGAN, LSGAN, WGAN-GP and StarGAN, where we empirically show that this frequency discrepancy can be avoided by a minor architecture change in the last upsampling operation. We subsequently use images from this counterexample space to successfully bypass the recently proposed forensics detector which leverages high frequency Fourier spectrum decay attributes for CNN-generated image detection. Through this study, we show that high frequency Fourier spectrum decay discrepancies are not inherent characteristics of existing CNN-based generative models (contrary to the belief of some existing work), and such features are not robust for performing synthetic image detection. Our results prompt re-thinking of using high frequency Fourier spectrum decay attributes for CNN-generated image detection. Code and models are available at https://keshik6.github.io/Fourier-Discrepancies-CNN-Detection/

Submitted 31 March, 2021; originally announced March 2021.

Comments: CVPR 2021 Oral

arXiv:2006.05338 [pdf, other]

doi 10.1109/TIP.2021.3049346

On Data Augmentation for GAN Training

Authors: Ngoc-Trung Tran, Viet-Hung Tran, Ngoc-Bao Nguyen, Trung-Kien Nguyen, Ngai-Man Cheung

Abstract: Recent successes in Generative Adversarial Networks (GAN) have affirmed the importance of using more data in GAN training. Yet it is expensive to collect data in many domains such as medical applications. Data Augmentation (DA) has been applied in these applications. In this work, we first argue that the classical DA approach could mislead the generator to learn the distribution of the augmented data, which could be different from that of the original data. We then propose a principled framework, termed Data Augmentation Optimized for GAN (DAG), to enable the use of augmented data in GAN training to improve the learning of the original distribution. We provide theoretical analysis to show that using our proposed DAG aligns with the original GAN in minimizing the Jensen-Shannon (JS) divergence between the original distribution and model distribution. Importantly, the proposed DAG effectively leverages the augmented data to improve the learning of discriminator and generator. We conduct experiments to apply DAG to different GAN models: unconditional GAN, conditional GAN, self-supervised GAN and CycleGAN using datasets of natural images and medical images. The results show that DAG achieves consistent and considerable improvements across these models. Furthermore, when DAG is used in some GAN models, the system establishes state-of-the-art Frechet Inception Distance (FID) scores. Our code is available.

Submitted 31 December, 2020; v1 submitted 9 June, 2020; originally announced June 2020.

Comments: Accepted in IEEE Transactions on Image Processing

arXiv:1909.06536 [pdf]

Optimized Routing and Spectrum Assignment for Video Communication over an Elastic Optical Network

Authors: Hamed Alizadeh Ghazijahani, Hadi Seyedarabi, Javad Musevi Niya, Ngai-Man Cheung

Abstract: Elastic optical networks (EONs) efficiently utilize spectral resources for optical fiber communication by allocating the minimum necessary bandwidth to client demands. On the other hand, network traffic has been continuously increasing due to the wide penetration of video streaming services, so the efficient and cost-effective use of available bandwidth plays an important role in improving service provisioning. In this work, we formulate and solve an optimization problem to perform routing and spectrum assignment (RSA) in EONs with a focus on video streaming. In this formulation, EON and video constraints such as spectrum fragmentation and received video quality are considered jointly. To this end, we utilize a machine learning (ML) technique to estimate the video quality versus channel state. The proposed algorithm is evaluated over two benchmark fiber-optic networks, namely NSFNET and US-backbone, using numerical simulations based on random traffic models. The results reveal that the mean optical signal-to-noise ratio (OSNR) for video content data at the receiver is remarkably higher than for non-video data, while the blocking ratio is the same for both data types.

Submitted 14 September, 2019; originally announced September 2019.

arXiv:1811.08609 [pdf, other]

On Sparse Graph Fourier Transform

Authors: Seyed Hamid Safavi, Manas Khatua, Ngai-Man Cheung, Farah Torkamani-Azar

Abstract: In this paper, we propose a new regression-based algorithm to compute Graph Fourier Transform (GFT). Our algorithm allows different regularizations to be included when computing the GFT analysis components, so that the resulting components can be tuned for a specific task. We propose using the lasso penalty in our proposed framework to obtain analysis components with sparse loadings. We show that the components from this proposed sparse GFT can identify and select correlated signal sources into sub-graphs, and perform frequency analysis locally within these sub-graphs of correlated sources. Using real network traffic datasets, we demonstrate that sparse GFT can achieve outstanding performance in an anomaly detection task.

Submitted 21 November, 2018; originally announced November 2018.

Comments: Presented at 3rd Graph Signal Processing Workshop - GSP 18

arXiv:1801.02303 [pdf, other]

Joint Estimation of Low-Rank Components and Connectivity Graph in High-Dimensional Graph Signals: Application to Brain Imaging

Authors: Rui Liu, Hossein Nejati, Ngai-Man Cheung

Abstract: This paper presents a graph signal processing algorithm to uncover the intrinsic low-rank components and the underlying graph of a high-dimensional, graph-smooth and grossly-corrupted dataset. In our problem formulation, we assume that the perturbation on the low-rank components is sparse and the signal is smooth on the graph. We propose an algorithm to estimate the low-rank components with the help of the graph and refine the graph with better estimated low-rank components. We propose to perform the low-rank estimation and graph refinement jointly so that low-rank estimation can benefit from the refined graph, and graph refinement can leverage the improved low-rank estimation. We propose to address the problem with an alternating optimization. Moreover, we perform a mathematical analysis to understand and quantify the impact of the inexact graph on the low-rank estimation, justifying our scheme with graph refinement as an integrated step in estimating low-rank components. We perform extensive experiments on the proposed algorithm and compare with state-of-the-art low-rank estimation and graph learning techniques. Our experiments use synthetic data and real brain imaging (MEG) data that is recorded when subjects are presented with different categories of visual stimuli. We observe that our proposed algorithm is competitive in estimating the low-rank components, adequately capturing the intrinsic task-related information in the reduced dimensional representation, and leading to better performance in a classification task. Furthermore, we notice that the active brain regions for visual activity indicated by our estimated graph are consistent with neuroscientific findings.

Submitted 7 January, 2018; originally announced January 2018.

Showing 1–8 of 8 results for author: Cheung, N