
At the edge of a generative cultural precipice

Abstract: Since NFTs and large generative models (such as DALLE2 and Stable Diffusion) have been publicly available, artists have seen their jobs threatened and stolen. While artists depend on sharing their art on online platforms such as Deviantart, Pixiv, and Artstation, many slowed down sharing their work or downright removed their past work therein, especially if these platforms fail to provide certain guarantees regarding the copyright of their uploaded work. Text-to-image (T2I) generative models are trained using human-produced content to better guide the style and themes they can produce. Still, if the trend continues where data found online is generated by a machine instead of a human, this will have vast repercussions in culture. Inspired by recent work in generative models, we wish to tell a cautionary tale and ask what will happen to the visual arts if generative models continue on the path to be (eventually) trained solely on generated content.

Submitted 30 April, 2024; originally announced June 2024.

Comments: Accepted at the CVPR Fourth Workshop on Ethical Considerations in Creative applications of Computer Vision

arXiv:2309.06086 [pdf, other]

Plasticity-Optimized Complementary Networks for Unsupervised Continual Learning

Authors: Alex Gomez-Villa, Bartlomiej Twardowski, Kai Wang, Joost van de Weijer

Abstract: Continuous unsupervised representation learning (CURL) research has greatly benefited from improvements in self-supervised learning (SSL) techniques. As a result, existing CURL methods using SSL can learn high-quality representations without any labels, but with a notable performance drop when learning on a many-tasks data stream. We hypothesize that this is caused by the regularization losses that are imposed to prevent forgetting, leading to a suboptimal plasticity-stability trade-off: they either do not adapt fully to the incoming data (low plasticity), or incur significant forgetting when allowed to fully adapt to a new SSL pretext-task (low stability). In this work, we propose to train an expert network that is relieved of the duty of keeping the previous knowledge and can focus on performing optimally on the new tasks (optimizing plasticity). In the second phase, we combine this new knowledge with the previous network in an adaptation-retrospection phase to avoid forgetting and initialize a new expert with the knowledge of the old network. We perform several experiments showing that our proposed approach outperforms other CURL exemplar-free methods in few- and many-task split settings. Furthermore, we show how to adapt our approach to semi-supervised continual learning (Semi-SCL) and show that we surpass the accuracy of other exemplar-free Semi-SCL methods and reach the results of some others that use exemplars.

Submitted 12 September, 2023; originally announced September 2023.

Comments: Accepted at WACV2024

arXiv:2202.07993 [pdf, other]

Planckian Jitter: countering the color-crippling effects of color jitter on self-supervised training

Authors: Simone Zini, Alex Gomez-Villa, Marco Buzzelli, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

Abstract: Several recent works on self-supervised learning are trained by mapping different augmentations of the same image to the same feature representation. The data augmentations used are of crucial importance to the quality of learned feature representations. In this paper, we analyze how the color jitter traditionally used in data augmentation negatively impacts the quality of the color features in learned feature representations. To address this problem, we propose a more realistic, physics-based color data augmentation - which we call Planckian Jitter - that creates realistic variations in chromaticity and produces a model robust to illumination changes that can be commonly observed in real life, while maintaining the ability to discriminate image content based on color information. Experiments confirm that such a representation is complementary to the representations learned with the currently-used color jitter augmentation and that a simple concatenation leads to significant performance gains on a wide range of downstream datasets. In addition, we present a color sensitivity analysis that documents the impact of different training methods on model neurons and shows that the performance of the learned features is robust with respect to illuminant variations.

Submitted 2 February, 2023; v1 submitted 16 February, 2022; originally announced February 2022.

Comments: Accepted at Eleventh International Conference on Learning Representations (ICLR 2023)

arXiv:2112.15022 [pdf, other]

Continually Learning Self-Supervised Representations with Projected Functional Regularization

Authors: Alex Gomez-Villa, Bartlomiej Twardowski, Lu Yu, Andrew D. Bagdanov, Joost van de Weijer

Abstract: Recent self-supervised learning methods are able to learn high-quality image representations and are closing the gap with supervised approaches. However, these methods are unable to acquire new knowledge incrementally -- they are, in fact, mostly used only as a pre-training phase over IID data. In this work we investigate self-supervised methods in continual learning regimes without any replay mechanism. We show that naive functional regularization, also known as feature distillation, leads to lower plasticity and limits continual learning performance. Instead, we propose Projected Functional Regularization in which a separate temporal projection network ensures that the newly learned feature space preserves information of the previous one, while at the same time allowing for the learning of new features. This prevents forgetting while maintaining the plasticity of the learner. Comparison with other incremental learning approaches applied to self-supervision demonstrates that our method obtains competitive performance in different scenarios and on multiple datasets.

Submitted 2 May, 2022; v1 submitted 30 December, 2021; originally announced December 2021.

Comments: Accepted at Workshop on Continual Learning in Computer Vision (CVPR 2022)

arXiv:1912.01643 [pdf, other]

Visual Illusions Also Deceive Convolutional Neural Networks: Analysis and Implications

Authors: A. Gomez-Villa, A. Martín, J. Vazquez-Corral, M. Bertalmío, J. Malo

Abstract: Visual illusions allow researchers to devise and test new models of visual perception. Here we show that artificial neural networks trained for basic visual tasks in natural images are deceived by brightness and color illusions, having a response that is qualitatively very similar to the human achromatic and chromatic contrast sensitivity functions, and consistent with natural image statistics. We also show that, while these artificial networks are deceived by illusions, their response might be significantly different to that of humans. Our results suggest that low-level illusions appear in any system that has to perform basic visual tasks in natural environments, in line with error minimization explanations of visual function, and they also imply a word of caution on using artificial networks to study human vision, as previously suggested in other contexts in the vision science literature.

Submitted 3 December, 2019; originally announced December 2019.

arXiv:1911.09599 [pdf, other]

Synthesizing Visual Illusions Using Generative Adversarial Networks

Authors: Alexander Gomez-Villa, Adrian Martín, Javier Vazquez-Corral, Jesús Malo, Marcelo Bertalmío

Abstract: Visual illusions are a very useful tool for vision scientists, because they allow them to better probe the limits, thresholds and errors of the visual system. In this work we introduce the first ever framework to generate novel visual illusions with an artificial neural network (ANN). It takes the form of a generative adversarial network, with a generator of visual illusion candidates and two discriminator modules, one for the inducer background and another that decides whether or not the candidate is indeed an illusion. The generality of the model is exemplified by synthesizing illusions of different types, and validated with psychophysical experiments that corroborate that the outputs of our ANN are indeed visual illusions to human observers. Apart from synthesizing new visual illusions, which may help vision researchers, the proposed model has the potential to open new ways to study the similarities and differences between ANN and human visual perception.

Submitted 21 November, 2019; originally announced November 2019.

arXiv:1811.10565 [pdf, other]

Convolutional Neural Networks Deceived by Visual Illusions

Authors: Alexander Gomez-Villa, Adrián Martín, Javier Vazquez-Corral, Marcelo Bertalmío

Abstract: Visual illusions teach us that what we see is not always what is represented in the physical world. Their special nature makes them a fascinating tool to test and validate any new vision model proposed. In general, current vision models are based on the concatenation of linear convolutions and non-linear operations. In this paper we get inspiration from the similarity of this structure with the operations present in Convolutional Neural Networks (CNNs). This motivated us to study if CNNs trained for low-level visual tasks are deceived by visual illusions. In particular, we show that CNNs trained for image denoising, image deblurring, and computational color constancy are able to replicate the human response to visual illusions, and that the extent of this replication varies with respect to variation in architecture and spatial pattern size. We believe that this CNN behaviour appears as a by-product of the training for the low level vision tasks of denoising, color constancy or deblurring. Our work opens a new bridge between human perception and CNNs: in order to obtain CNNs that better replicate human behaviour, we may need to start aiming for them to better replicate visual illusions.

Submitted 26 November, 2018; originally announced November 2018.

Showing 1–7 of 7 results for author: Gomez-Villa, A