Understanding Hallucinations in Diffusion Models through Mode Interpolation

Authors: Sumukh K Aithal, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter

Abstract: Colloquially speaking, image generation models based upon diffusion processes are frequently said to exhibit "hallucinations," samples that could never occur in the training data. But where do such hallucinations come from? In this paper, we study a particular failure mode in diffusion models, which we term mode interpolation. Specifically, we find that diffusion models smoothly "interpolate" between nearby data modes in the training set, to generate samples that are completely outside the support of the original training distribution; this phenomenon leads diffusion models to generate artifacts that never existed in real data (i.e., hallucinations). We systematically study the reasons for, and the manifestation of, this phenomenon. Through experiments on 1D and 2D Gaussians, we show how a discontinuous loss landscape in the diffusion model's decoder leads to a region where any smooth approximation will cause such hallucinations. Through experiments on artificial datasets with various shapes, we show how hallucination leads to the generation of combinations of shapes that never existed. Finally, we show that diffusion models in fact know when they go out of support and hallucinate. This is captured by the high variance in the trajectory of the generated sample during the final few steps of the backward sampling process. Using a simple metric to capture this variance, we can remove over 95% of hallucinations at generation time while retaining 96% of in-support samples. We conclude our exploration by showing the implications of such hallucination (and its removal) on the collapse (and stabilization) of recursive training on synthetic data with experiments on MNIST and 2D Gaussians datasets. We release our code at https://github.com/locuslab/diffusion-model-hallucination.

Submitted 13 June, 2024; originally announced June 2024.
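
The abstract above describes filtering hallucinations with "a simple metric" that captures the variance of a sample's trajectory over the final backward sampling steps. The following is a minimal sketch of that idea only, not the authors' implementation: the function names, the use of per-step predictions stored as plain lists, the window size `last_k`, and the `threshold` value are all assumptions made for illustration.

```python
from statistics import pvariance
from typing import Sequence


def trajectory_variance(x0_preds: Sequence[Sequence[float]], last_k: int) -> float:
    """Mean per-dimension variance over the last `last_k` reverse steps.

    x0_preds[t][d] is dimension d of the (hypothetical) per-step prediction
    at reverse step t, ordered from the first reverse step to the last.
    """
    tail = x0_preds[-last_k:]
    dims = len(tail[0])
    per_dim = [pvariance([step[d] for step in tail]) for d in range(dims)]
    return sum(per_dim) / dims


def keep_sample(x0_preds: Sequence[Sequence[float]],
                last_k: int = 4,
                threshold: float = 0.01) -> bool:
    """Keep a sample only if its late-trajectory variance is small.

    Both `last_k` and `threshold` are illustrative values, not taken
    from the paper.
    """
    return trajectory_variance(x0_preds, last_k) <= threshold


# Toy trajectories: one that has settled, one that still oscillates.
stable = [[1.0, 1.0] for _ in range(10)]
unstable = [[(-1.0) ** t, (-1.0) ** t] for t in range(10)]
```

Under these assumptions, `keep_sample(stable)` accepts the settled trajectory and `keep_sample(unstable)` rejects the oscillating one. In practice the threshold would need to be tuned on data where in-support and out-of-support samples can be told apart.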

arXiv:2212.13827 [pdf, other]

Escaping Saddle Points for Effective Generalization on Class-Imbalanced Data

Authors: Harsh Rangwani, Sumukh K Aithal, Mayank Mishra, R. Venkatesh Babu

Abstract: Real-world datasets exhibit imbalances of varying types and degrees. Several techniques based on re-weighting and margin adjustment of loss are often used to enhance the performance of neural networks, particularly on minority classes. In this work, we analyze the class-imbalanced learning problem by examining the loss landscape of neural networks trained with re-weighting and margin-based techniques. Specifically, we examine the spectral density of the Hessian of the class-wise loss, through which we observe that the network weights converge to a saddle point in the loss landscapes of minority classes. Following this observation, we also find that optimization methods designed to escape from saddle points can be effectively used to improve generalization on minority classes. We further theoretically and empirically demonstrate that Sharpness-Aware Minimization (SAM), a recent technique that encourages convergence to flat minima, can be effectively used to escape saddle points for minority classes. Using SAM results in a 6.2% increase in accuracy on the minority classes over the state-of-the-art Vector Scaling Loss, leading to an overall average increase of 4% across imbalanced datasets. The code is available at: https://github.com/val-iisc/Saddle-LongTail.

Submitted 28 December, 2022; originally announced December 2022.

Comments: NeurIPS 2022. Code: https://github.com/val-iisc/Saddle-LongTail
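
For readers unfamiliar with Sharpness-Aware Minimization, which the abstract above relies on, the following is a minimal sketch of a single SAM update on a toy quadratic loss. It is not the code from the linked repository: the function names, the loss, and the values of `lr` and `rho` are assumptions chosen for illustration. The step first moves the weights a distance `rho` along the normalized gradient, then applies the gradient measured at that perturbed point to the original weights.

```python
import math
from typing import Callable, List

Vector = List[float]


def sam_step(w: Vector,
             grad_fn: Callable[[Vector], Vector],
             lr: float = 0.1,
             rho: float = 0.05) -> Vector:
    """One SAM update: ascend to a nearby worst-case point, then descend."""
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    # Perturbation of length rho along the gradient (zero if the gradient vanishes).
    eps = [rho * gi / norm for gi in g] if norm > 0 else [0.0] * len(w)
    w_adv = [wi + ei for wi, ei in zip(w, eps)]
    # The gradient at the perturbed point drives the update of the original weights.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


def quadratic_grad(w: Vector) -> Vector:
    """Gradient of the toy loss 0.5 * ||w||^2 (an illustrative choice)."""
    return list(w)


w_next = sam_step([3.0, 4.0], quadratic_grad)
```

With these toy settings the gradient norm is 5, so the perturbation is (0.03, 0.04) and the updated weights are approximately (2.697, 3.596).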

arXiv:2206.08213 [pdf, other]

A Closer Look at Smoothness in Domain Adversarial Training

Authors: Harsh Rangwani, Sumukh K Aithal, Mayank Mishra, Arihant Jain, R. Venkatesh Babu

Abstract: Domain adversarial training has been ubiquitous for achieving invariant representations and is used widely for various domain adaptation tasks. In recent times, methods converging to smooth optima have shown improved generalization for supervised learning tasks like classification. In this work, we analyze the effect of smoothness-enhancing formulations on domain adversarial training, the objective of which is a combination of task loss (e.g., classification, regression, etc.) and adversarial terms. We find that converging to smooth minima with respect to (w.r.t.) task loss stabilizes the adversarial training, leading to better performance on the target domain. In contrast to task loss, our analysis shows that converging to smooth minima w.r.t. adversarial loss leads to sub-optimal generalization on the target domain. Based on the analysis, we introduce the Smooth Domain Adversarial Training (SDAT) procedure, which effectively enhances the performance of existing domain adversarial methods for both classification and object detection tasks. Our analysis also provides insight into the extensive usage of SGD over Adam in the community for domain adversarial training.

Submitted 16 June, 2022; originally announced June 2022.

Comments: ICML 2022. Code: https://github.com/val-iisc/SDAT

arXiv:2109.08901 [pdf, other]

S$^3$VAADA: Submodular Subset Selection for Virtual Adversarial Active Domain Adaptation

Authors: Harsh Rangwani, Arihant Jain, Sumukh K Aithal, R. Venkatesh Babu

Abstract: Unsupervised domain adaptation (DA) methods have focused on achieving maximal performance through aligning features from source and target domains without using labeled data in the target domain. In real-world scenarios, however, it might be feasible to get labels for a small proportion of target data. In these scenarios, it is important to select maximally informative samples to label and find an effective way to combine them with the existing knowledge from source data. Towards achieving this, we propose S$^3$VAADA, which i) introduces a novel submodular criterion to select a maximally informative subset to label and ii) enhances a cluster-based DA procedure through novel improvements to effectively utilize all the available data for improving generalization on the target. Our approach consistently outperforms the competing state-of-the-art approaches on datasets with varying degrees of domain shifts.

Submitted 18 September, 2021; originally announced September 2021.

Comments: ICCV 2021. Project page: http://sites.google.com/iisc.ac.in/s3vaada-iccv2021

arXiv:1912.05017 [pdf]

A Comprehensive Review On Various State Of Art Techniques For Eye Blink Detection

Authors: Sannidhan MS, Sunil Kumar Aithal, Abhir Bhandary

Abstract: Computer Vision is considered to be one of the most important areas in research and has focused on developing many applications that have proved to be useful for both research and societal benefits. Today we are witnessing many road mishaps that happen just because of a lack of concentration while driving. To help avoid this kind of disaster in day-to-day life, many technologies focus on keeping track of the vehicle driver's concentration. One such technology uses the method of eye blink detection to find out the concentration level of the driver. With the advent of many cost-effective high-end camera devices, it has become more efficient and cheaper to use eye blink detection for keeping track of the concentration level of the driver. Hence this paper presents an exhaustive review of the implementations of various eye blink detection algorithms. The detection system has also extended its application to various other fields such as drowsiness detection, fatigue detection, and expression detection.

Submitted 26 November, 2019; originally announced December 2019.

Comments: International Journal of Current Engineering and Scientific Research (IJCESR) 2018

Showing 1–5 of 5 results for author: Aithal, S K