arXiv:2010.15040

Training Generative Adversarial Networks by Solving Ordinary Differential Equations

Authors: Chongli Qin, Yan Wu, Jost Tobias Springenberg, Andrew Brock, Jeff Donahue, Timothy P. Lillicrap, Pushmeet Kohli

Abstract: The instability of Generative Adversarial Network (GAN) training has frequently been attributed to gradient descent. Consequently, recent methods have aimed to tailor the models and training procedures to stabilise the discrete updates. In contrast, we study the continuous-time dynamics induced by GAN training. Both theory and toy experiments suggest that these dynamics are in fact surprisingly stable. From this perspective, we hypothesise that instabilities in training GANs arise from the integration error in discretising the continuous dynamics. We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training - when combined with a regulariser that controls the integration error. Our approach represents a radical departure from previous methods which typically use adaptive optimisation and stabilisation techniques that constrain the functional space (e.g. Spectral Normalisation). Evaluation on CIFAR-10 and ImageNet shows that our method outperforms several strong baselines, demonstrating its efficacy.

Submitted 28 November, 2020; v1 submitted 28 October, 2020; originally announced October 2020.
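
The integration-error hypothesis in the abstract above can be illustrated with a minimal sketch. It is not the paper's experiment: it assumes a hypothetical two-parameter bilinear game, min over x and max over y of x*y, whose continuous-time dynamics rotate around the equilibrium at the origin at a constant distance. The function names (`vector_field`, `euler_step`, `rk4_step`, `final_distance`), the step size, and the step count are illustrative choices, and the regulariser mentioned in the abstract is omitted.

```python
import math


def vector_field(state):
    """Simultaneous gradient flow for the toy game min_x max_y x*y."""
    x, y = state
    return (-y, x)  # dx/dt = -d(xy)/dx, dy/dt = +d(xy)/dy


def euler_step(state, h):
    """One explicit Euler step, i.e. plain simultaneous gradient descent/ascent."""
    dx, dy = vector_field(state)
    return (state[0] + h * dx, state[1] + h * dy)


def rk4_step(state, h):
    """One classical fourth-order Runge-Kutta step on the same vector field."""
    def shifted(s, k, scale):
        return (s[0] + scale * k[0], s[1] + scale * k[1])

    k1 = vector_field(state)
    k2 = vector_field(shifted(state, k1, h / 2))
    k3 = vector_field(shifted(state, k2, h / 2))
    k4 = vector_field(shifted(state, k3, h))
    return tuple(
        s + h / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def final_distance(step, h=0.1, n_steps=1000, start=(1.0, 0.0)):
    """Distance from the equilibrium (0, 0) after integrating with `step`."""
    state = start
    for _ in range(n_steps):
        state = step(state, h)
    return math.hypot(*state)


# The exact flow keeps the distance at 1.0; any drift is integration error.
euler_distance = final_distance(euler_step)
rk4_distance = final_distance(rk4_step)
print(f"Euler: {euler_distance:.3f}  RK4: {rk4_distance:.6f}")
```

In this toy setting the Euler iterate spirals away from the equilibrium while the Runge-Kutta iterate stays close to the exact orbit, so the divergence comes from the discretisation rather than from the underlying dynamics.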

arXiv:1912.00953

LOGAN: Latent Optimisation for Generative Adversarial Networks

Authors: Yan Wu, Jeff Donahue, David Balduzzi, Karen Simonyan, Timothy Lillicrap

Abstract: Training generative adversarial networks requires balancing of delicate adversarial dynamics. Even with careful tuning, training may diverge or end up in a bad equilibrium with dropped modes. In this work, we improve CS-GAN with natural gradient-based latent optimisation and show that it improves adversarial dynamics by enhancing interactions between the discriminator and the generator. Our experiments demonstrate that latent optimisation can significantly improve GAN training, obtaining state-of-the-art performance for the ImageNet ($128 \times 128$) dataset. Our model achieves an Inception Score (IS) of $148$ and a Fréchet Inception Distance (FID) of $3.4$, an improvement of $17\%$ and $32\%$ in IS and FID respectively, compared with the baseline BigGAN-deep model with the same architecture and number of parameters.

Submitted 1 July, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

Comments: Improved writing, added new analysis and evaluation

arXiv:1907.06571

Adversarial Video Generation on Complex Datasets

Authors: Aidan Clark, Jeff Donahue, Karen Simonyan

Abstract: Generative models of natural images have progressed towards high fidelity samples by the strong leveraging of scale. We attempt to carry this success to the field of video modeling by showing that large Generative Adversarial Networks trained on the complex Kinetics-600 dataset are able to produce video samples of substantially higher complexity and fidelity than previous work. Our proposed model, Dual Video Discriminator GAN (DVD-GAN), scales to longer and higher resolution videos by leveraging a computationally efficient decomposition of its discriminator. We evaluate on the related tasks of video synthesis and video prediction, and achieve new state-of-the-art Fréchet Inception Distance for prediction for Kinetics-600, as well as state-of-the-art Inception Score for synthesis on the UCF-101 dataset, alongside establishing a strong baseline for synthesis on Kinetics-600.

Submitted 25 September, 2019; v1 submitted 15 July, 2019; originally announced July 2019.

arXiv:1907.02544

Large Scale Adversarial Representation Learning

Authors: Jeff Donahue, Karen Simonyan

Abstract: Adversarially trained generative models (GANs) have recently achieved compelling image synthesis results. But despite early successes in using GANs for unsupervised representation learning, they have since been superseded by approaches based on self-supervision. In this work we show that progress in image generation quality translates to substantially improved representation learning performance. Our approach, BigBiGAN, builds upon the state-of-the-art BigGAN model, extending it to representation learning by adding an encoder and modifying the discriminator. We extensively evaluate the representation learning and generation capabilities of these BigBiGAN models, demonstrating that these generation-based models achieve the state of the art in unsupervised representation learning on ImageNet, as well as in unconditional image generation. Pretrained BigBiGAN models -- including image generators and encoders -- are available on TensorFlow Hub (https://tfhub.dev/s?publisher=deepmind&q=bigbigan).

Submitted 5 November, 2019; v1 submitted 4 July, 2019; originally announced July 2019.

Comments: 32 pages. In proceedings of NeurIPS 2019. This is the camera-ready version of the paper, with supplementary material included as appendices

arXiv:1809.11096

Large Scale GAN Training for High Fidelity Natural Image Synthesis

Authors: Andrew Brock, Jeff Donahue, Karen Simonyan

Abstract: Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal. To this end, we train Generative Adversarial Networks at the largest scale yet attempted, and study the instabilities specific to such scale. We find that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the Generator's input. Our modifications lead to models which set the new state of the art in class-conditional image synthesis. When trained on ImageNet at 128x128 resolution, our models (BigGANs) achieve an Inception Score (IS) of 166.5 and Frechet Inception Distance (FID) of 7.4, improving over the previous best IS of 52.52 and FID of 18.6.

Submitted 25 February, 2019; v1 submitted 28 September, 2018; originally announced September 2018.
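
The "truncation trick" in the abstract above is described only as reducing the variance of the generator's input. One way to sketch that idea, assuming a standard-normal latent prior and resampling of any component whose magnitude exceeds a threshold, is shown below. The helper name `truncated_normal`, the threshold of 0.5, and the sample size are hypothetical choices for illustration, not values taken from the listing, and no generator is involved.

```python
import random
import statistics


def truncated_normal(dim, threshold, rng):
    """Draw `dim` values from N(0, 1), resampling any with |value| > threshold."""
    values = []
    while len(values) < dim:
        v = rng.gauss(0.0, 1.0)
        if abs(v) <= threshold:  # keep only draws inside [-threshold, threshold]
            values.append(v)
    return values


rng = random.Random(0)  # fixed seed so the comparison is repeatable

# Untruncated latent components: variance close to 1.
full = [rng.gauss(0.0, 1.0) for _ in range(10000)]

# Truncated latent components: bounded magnitude, much smaller variance.
truncated = truncated_normal(10000, threshold=0.5, rng=rng)

print("variance without truncation:", statistics.pvariance(full))
print("variance with truncation:   ", statistics.pvariance(truncated))
```

A smaller threshold concentrates the latent inputs near zero, which is the lever the abstract describes for trading sample variety against fidelity.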

arXiv:1605.09782

Adversarial Feature Learning

Authors: Jeff Donahue, Philipp Krähenbühl, Trevor Darrell

Abstract: The ability of the Generative Adversarial Networks (GANs) framework to learn generative models mapping from simple latent distributions to arbitrarily complex data distributions has been demonstrated empirically, with compelling results showing that the latent space of such generators captures semantic variation in the data distribution. Intuitively, models trained to predict these semantic latent representations given data may serve as useful feature representations for auxiliary problems where semantics are relevant. However, in their existing form, GANs have no means of learning the inverse mapping -- projecting data back into the latent space. We propose Bidirectional Generative Adversarial Networks (BiGANs) as a means of learning this inverse mapping, and demonstrate that the resulting learned feature representation is useful for auxiliary supervised discrimination tasks, competitive with contemporary approaches to unsupervised and self-supervised feature learning.

Submitted 3 April, 2017; v1 submitted 31 May, 2016; originally announced May 2016.

Comments: Published as a conference paper at ICLR 2017. Changelog: (v7) Table 2 results improved 1-2% due to averaging predictions over 10 crops at test time, as done in Noroozi & Favaro; Table 3 VOC classification results slightly improved due to minor bugfix. (See v6 changelog for previous versions.)

arXiv:1308.4200

Towards Adapting ImageNet to Reality: Scalable Domain Adaptation with Implicit Low-rank Transformations

Authors: Erik Rodner, Judy Hoffman, Jeff Donahue, Trevor Darrell, Kate Saenko

Abstract: Images seen during test time are often not from the same distribution as images used for learning. This problem, known as domain shift, occurs when training classifiers from object-centric internet image databases and trying to apply them directly to scene understanding tasks. The consequence is often severe performance degradation and is one of the major barriers for the application of classifiers in real-world systems. In this paper, we show how to learn transform-based domain adaptation classifiers in a scalable manner. The key idea is to exploit an implicit rank constraint, originated from a max-margin domain adaptation formulation, to make optimization tractable. Experiments show that the transformation between domains can be very efficiently learned from data and easily applied to new categories. This begins to bridge the gap between large-scale internet image collections and object images captured in everyday life environments.

Submitted 19 August, 2013; originally announced August 2013.

Showing 1–7 of 7 results for author: Donahue, J