-
Least $k$th-Order and Rényi Generative Adversarial Networks
Authors:
Himesh Bhatia,
William Paul,
Fady Alajaji,
Bahman Gharesifard,
Philippe Burlina
Abstract:
We investigate the use of parametrized families of information-theoretic measures to generalize the loss functions of generative adversarial networks (GANs) with the objective of improving performance. A new generator loss function, called least $k$th-order GAN (L$k$GAN), is first introduced, generalizing the least squares GANs (LSGANs) by using a $k$th order absolute error distortion measure with $k \geq 1$ (which recovers the LSGAN loss function when $k=2$). It is shown that minimizing this generalized loss function under an (unconstrained) optimal discriminator is equivalent to minimizing the $k$th-order Pearson-Vajda divergence. Another novel GAN generator loss function is next proposed in terms of Rényi cross-entropy functionals with order $α>0$, $α\neq 1$. It is demonstrated that this Rényi-centric generalized loss function, which provably reduces to the original GAN loss function as $α\to1$, preserves the equilibrium point satisfied by the original GAN based on the Jensen-Rényi divergence, a natural extension of the Jensen-Shannon divergence.
Experimental results indicate that the proposed loss functions, applied to the MNIST and CelebA datasets, under both DCGAN and StyleGAN architectures, confer performance benefits by virtue of the extra degrees of freedom provided by the parameters $k$ and $α$, respectively. More specifically, experiments show improvements with regard to the quality of the generated images as measured by the Fréchet Inception Distance (FID) score and training stability. While it was applied to GANs in this study, the proposed approach is generic and can be used in other applications of information theory to deep learning, e.g., the issues of fairness or privacy in artificial intelligence.
Submitted 11 March, 2021; v1 submitted 3 June, 2020;
originally announced June 2020.
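The $k$th-order absolute error generator loss described in this abstract can be illustrated with a minimal sketch. It assumes the loss is the batch mean of $|D(G(z)) - c|^k$, where the values passed in are discriminator outputs on generated samples. The function name, the target value `c`, and the absence of any constant scaling factor are assumptions made for illustration and are not taken from the paper.

```python
def lk_generator_loss(d_fake, k=2.0, c=1.0):
    """Mean k-th order absolute error between discriminator outputs and a target.

    d_fake: iterable of discriminator outputs D(G(z)) for a batch (hypothetical input).
    k:      order of the distortion measure; the abstract requires k >= 1.
    c:      target value the generator pushes D(G(z)) toward (assumed, not from the paper).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    values = list(d_fake)
    if not values:
        raise ValueError("d_fake must contain at least one value")
    # With k = 2 this is a plain mean squared error, i.e. a least-squares loss
    # up to whatever constant factor a given LSGAN formulation uses.
    return sum(abs(v - c) ** k for v in values) / len(values)


if __name__ == "__main__":
    batch = [0.5, 1.0, 0.0]
    for order in (1.0, 2.0, 3.0):
        print(order, lk_generator_loss(batch, k=order))
```

Larger values of `k` penalize outputs far from the target more heavily relative to outputs close to it, which is the extra degree of freedom the abstract refers to.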
-
Hierarchical Variational Imitation Learning of Control Programs
Authors:
Roy Fox,
Richard Shin,
William Paul,
Yitian Zou,
Dawn Song,
Ken Goldberg,
Pieter Abbeel,
Ion Stoica
Abstract:
Autonomous agents can learn by imitating teacher demonstrations of the intended behavior. Hierarchical control policies are ubiquitously useful for such learning, having the potential to break down structured tasks into simpler sub-tasks, thereby improving data efficiency and generalization. In this paper, we propose a variational inference method for imitation learning of a control policy represented by parametrized hierarchical procedures (PHP), a program-like structure in which procedures can invoke sub-procedures to perform sub-tasks. Our method discovers the hierarchical structure in a dataset of observation-action traces of teacher demonstrations, by learning an approximate posterior distribution over the latent sequence of procedure calls and terminations. Samples from this learned distribution then guide the training of the hierarchical control policy. We identify and demonstrate a novel benefit of variational inference in the context of hierarchical imitation learning: in decomposing the policy into simpler procedures, inference can leverage acausal information that is unused by other methods. Training PHP with variational inference outperforms LSTM baselines in terms of data efficiency and generalization, requiring less than half as much data to achieve a 24% error rate in executing the bubble sort algorithm, and to achieve no error in executing Karel programs.
Submitted 29 December, 2019;
originally announced December 2019.
-
Collective dynamics of pedestrians in a non-panic evacuation scenario
Authors:
Juan Cruz Moreno,
M. Leticia Rubio Puzzo,
Wolfgang Paul
Abstract:
We present a study of pedestrian motion along a corridor in a non-panic regime (e.g., schools, hospitals or airports). Such situations have been discussed so far within the Social Force Model (SFM). We propose enriching this model with interactions based on the velocity of the particles and with some randomness, both of which we introduce using the ideas of the Vicsek Model (VM). This new model makes it possible to introduce fluctuations for a given average speed and geometry, and, because the alignment interactions are modulated by an external control parameter (the noise $η$), it also makes it possible to introduce phase transitions between ordered and disordered states. We have compared simulations of pedestrian motion along a corridor using (a) the VM with two boundary conditions (periodic and bouncing back) and with or without a desired direction of motion, (b) the SFM, and (c) the new model SFM+VM. The study of steady-state configurations in the VM with confined geometry shows the expected bands perpendicular to the direction of motion, while in the SFM and SFM+VM particles order in stripes of a given width $w$ along the direction of motion. The results in the SFM+VM case show that $w(t)\simeq t^α$ has a diffusive-like behavior at low noise $η$ (dynamic exponent $α\approx 1/2$), while it is sub-diffusive at high values of external noise ($α\approx 1/4$). We observe the order-disorder transition in the VM with both boundary conditions, but, as expected, imposing a desired direction of motion inhibits the existence of disorder. For the SFM+VM case we find a maximum of the susceptibility as a function of noise strength that increases with system size, indicative of an order-disorder transition in the whole range of densities and speeds studied. From our results we conclude that the new SFM+VM model is well suited to describing non-panic evacuation with diverse degrees of disorder.
Submitted 4 December, 2019;
originally announced December 2019.
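The Vicsek-style alignment with noise $η$ mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard Vicsek update with periodic boundaries only, not the authors' SFM+VM model: each particle adopts the mean heading of its neighbors within a radius, perturbed by uniform noise in $[-η/2, η/2]$. The function names, parameter values, and noise convention are assumptions.

```python
import math
import random


def vicsek_step(pos, theta, box, speed, radius, eta, rng):
    """One update of a 2D Vicsek model in a periodic square box of side `box`."""
    n = len(pos)
    new_theta = []
    for i in range(n):
        sx = sy = 0.0
        for j in range(n):
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            # Minimum-image convention for the periodic boundary.
            dx -= box * round(dx / box)
            dy -= box * round(dy / box)
            if dx * dx + dy * dy <= radius * radius:
                # Particle i counts as its own neighbor (distance zero).
                sx += math.cos(theta[j])
                sy += math.sin(theta[j])
        mean_heading = math.atan2(sy, sx)
        # Uniform angular noise in [-eta/2, eta/2] (assumed convention).
        new_theta.append(mean_heading + eta * (rng.random() - 0.5))
    new_pos = [
        ((x + speed * math.cos(t)) % box, (y + speed * math.sin(t)) % box)
        for (x, y), t in zip(pos, new_theta)
    ]
    return new_pos, new_theta


def order_parameter(theta):
    """Magnitude of the mean unit velocity: near 1 when aligned, near 0 when disordered."""
    n = len(theta)
    return math.hypot(sum(math.cos(t) for t in theta) / n,
                      sum(math.sin(t) for t in theta) / n)


if __name__ == "__main__":
    rng = random.Random(0)
    box, n = 10.0, 100
    pos = [(rng.uniform(0, box), rng.uniform(0, box)) for _ in range(n)]
    theta = [rng.uniform(-math.pi, math.pi) for _ in range(n)]
    for _ in range(50):
        pos, theta = vicsek_step(pos, theta, box, speed=0.3, radius=1.0, eta=0.5, rng=rng)
    print("order parameter:", order_parameter(theta))
```

Sweeping `eta` and recording `order_parameter` is the usual way to look for the transition between ordered and disordered states that the abstract describes.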
-
Ray: A Distributed Framework for Emerging AI Applications
Authors:
Philipp Moritz,
Robert Nishihara,
Stephanie Wang,
Alexey Tumanov,
Richard Liaw,
Eric Liang,
Melih Elibol,
Zongheng Yang,
William Paul,
Michael I. Jordan,
Ion Stoica
Abstract:
The next generation of AI applications will continuously interact with the environment and learn from these interactions. These applications impose new and demanding systems requirements, both in terms of performance and flexibility. In this paper, we consider these requirements and present Ray, a distributed system to address them. Ray implements a unified interface that can express both task-parallel and actor-based computations, supported by a single dynamic execution engine. To meet the performance requirements, Ray employs a distributed scheduler and a distributed and fault-tolerant store to manage the system's control state. In our experiments, we demonstrate scaling beyond 1.8 million tasks per second and better performance than existing specialized systems for several challenging reinforcement learning applications.
Submitted 29 September, 2018; v1 submitted 15 December, 2017;
originally announced December 2017.
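The distinction this abstract draws between task-parallel and actor-based computations can be illustrated with a small sketch. It deliberately uses only the Python standard library and is not Ray's API: stateless tasks are submitted to a thread pool, while a hypothetical `CounterActor` keeps private state and serializes its method calls on a single worker thread. All names here are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """A stateless task: its result depends only on its argument."""
    return x * x


class CounterActor:
    """A stateful worker whose method calls run one at a time, in submission order."""

    def __init__(self):
        self._count = 0
        # A single worker thread acts as the actor's mailbox.
        self._mailbox = ThreadPoolExecutor(max_workers=1)

    def _increment(self, amount):
        self._count += amount
        return self._count

    def increment(self, amount=1):
        # Returns a future immediately; the call is queued behind earlier calls.
        return self._mailbox.submit(self._increment, amount)

    def shutdown(self):
        self._mailbox.shutdown(wait=True)


def run_demo():
    # Task-parallel part: independent calls may run concurrently.
    with ThreadPoolExecutor(max_workers=4) as pool:
        task_futures = [pool.submit(square, i) for i in range(5)]
        squares = [f.result() for f in task_futures]

    # Actor part: calls share state, so they are processed sequentially.
    actor = CounterActor()
    actor_futures = [actor.increment() for _ in range(3)]
    counts = [f.result() for f in actor_futures]
    actor.shutdown()
    return squares, counts


if __name__ == "__main__":
    print(run_demo())
```

Both kinds of call return futures, which is the sense in which a single interface can cover stateless tasks and stateful actors; distribution across machines, scheduling, and fault tolerance are outside the scope of this sketch.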