-
Taming the Tail in Class-Conditional GANs: Knowledge Sharing via Unconditional Training at Lower Resolutions
Authors:
Saeed Khorram,
Mingqi Jiang,
Mohamad Shahbazi,
Mohamad H. Danesh,
Li Fuxin
Abstract:
Despite extensive research on training generative adversarial networks (GANs) with limited training data, learning to generate images from long-tailed training distributions remains fairly unexplored. In the presence of imbalanced multi-class training data, GANs tend to favor classes with more samples, leading to the generation of low-quality and less diverse samples in tail classes. In this study…
▽ More
Despite extensive research on training generative adversarial networks (GANs) with limited training data, learning to generate images from long-tailed training distributions remains fairly unexplored. In the presence of imbalanced multi-class training data, GANs tend to favor classes with more samples, leading to the generation of low-quality and less diverse samples in tail classes. In this study, we aim to improve the training of class-conditional GANs with long-tailed data. We propose a straightforward yet effective method for knowledge sharing, allowing tail classes to borrow from the rich information from classes with more abundant training data. More concretely, we propose modifications to existing class-conditional GAN architectures to ensure that the lower-resolution layers of the generator are trained entirely unconditionally while reserving class-conditional generation for the higher-resolution layers. Experiments on several long-tail benchmarks and GAN architectures demonstrate a significant improvement over existing methods in both the diversity and fidelity of the generated images. The code is available at https://github.com/khorrams/utlo.
△ Less
Submitted 16 June, 2024; v1 submitted 26 February, 2024;
originally announced February 2024.
-
Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning
Authors:
Shengyi Huang,
Quentin Gallouédec,
Florian Felten,
Antonin Raffin,
Rousslan Fernand Julien Dossa,
Yanxiao Zhao,
Ryan Sullivan,
Viktor Makoviychuk,
Denys Makoviichuk,
Mohamad H. Danesh,
Cyril Roumégous,
Jiayi Weng,
Chufan Chen,
Md Masudur Rahman,
João G. M. Araújo,
Guorui Quan,
Daniel Tan,
Timo Klein,
Rujikorn Charakorn,
Mark Towers,
Yann Berthelot,
Kinal Mehta,
Dipam Chakraborty,
Arjun KG,
Valentin Charraut
, et al. (8 additional authors not shown)
Abstract:
In many Reinforcement Learning (RL) papers, learning curves are useful indicators to measure the effectiveness of RL algorithms. However, the complete raw data of the learning curves are rarely available. As a result, it is usually necessary to reproduce the experiments from scratch, which can be time-consuming and error-prone. We present Open RL Benchmark, a set of fully tracked RL experiments, i…
▽ More
In many Reinforcement Learning (RL) papers, learning curves are useful indicators to measure the effectiveness of RL algorithms. However, the complete raw data of the learning curves are rarely available. As a result, it is usually necessary to reproduce the experiments from scratch, which can be time-consuming and error-prone. We present Open RL Benchmark, a set of fully tracked RL experiments, including not only the usual data such as episodic return, but also all algorithm-specific and system metrics. Open RL Benchmark is community-driven: anyone can download, use, and contribute to the data. At the time of writing, more than 25,000 runs have been tracked, for a cumulative duration of more than 8 years. Open RL Benchmark covers a wide range of RL libraries and reference implementations. Special care is taken to ensure that each experiment is precisely reproducible by providing not only the full parameters, but also the versions of the dependencies used to generate it. In addition, Open RL Benchmark comes with a command-line interface (CLI) for easy fetching and generating figures to present the results. In this document, we include two case studies to demonstrate the usefulness of Open RL Benchmark in practice. To the best of our knowledge, Open RL Benchmark is the first RL benchmark of its kind, and the authors hope that it will improve and facilitate the work of researchers in the field.
△ Less
Submitted 5 February, 2024;
originally announced February 2024.
-
Contextual Pre-planning on Reward Machine Abstractions for Enhanced Transfer in Deep Reinforcement Learning
Authors:
Guy Azran,
Mohamad H. Danesh,
Stefano V. Albrecht,
Sarah Keren
Abstract:
Recent studies show that deep reinforcement learning (DRL) agents tend to overfit to the task on which they were trained and fail to adapt to minor environment changes. To expedite learning when transferring to unseen tasks, we propose a novel approach to representing the current task using reward machines (RMs), state machine abstractions that induce subtasks based on the current task's rewards a…
▽ More
Recent studies show that deep reinforcement learning (DRL) agents tend to overfit to the task on which they were trained and fail to adapt to minor environment changes. To expedite learning when transferring to unseen tasks, we propose a novel approach to representing the current task using reward machines (RMs), state machine abstractions that induce subtasks based on the current task's rewards and dynamics. Our method provides agents with symbolic representations of optimal transitions from their current abstract state and rewards them for achieving these transitions. These representations are shared across tasks, allowing agents to exploit knowledge of previously encountered symbols and transitions, thus enhancing transfer. Empirical results show that our representations improve sample efficiency and few-shot transfer in a variety of domains.
△ Less
Submitted 20 February, 2024; v1 submitted 11 July, 2023;
originally announced July 2023.
-
LEADER: Learning Attention over Driving Behaviors for Planning under Uncertainty
Authors:
Mohamad H. Danesh,
Panpan Cai,
David Hsu
Abstract:
Uncertainty on human behaviors poses a significant challenge to autonomous driving in crowded urban environments. The partially observable Markov decision processes (POMDPs) offer a principled framework for planning under uncertainty, often leveraging Monte Carlo sampling to achieve online performance for complex tasks. However, sampling also raises safety concerns by potentially missing critical…
▽ More
Uncertainty on human behaviors poses a significant challenge to autonomous driving in crowded urban environments. The partially observable Markov decision processes (POMDPs) offer a principled framework for planning under uncertainty, often leveraging Monte Carlo sampling to achieve online performance for complex tasks. However, sampling also raises safety concerns by potentially missing critical events. To address this, we propose a new algorithm, LEarning Attention over Driving bEhavioRs (LEADER), that learns to attend to critical human behaviors during planning. LEADER learns a neural network generator to provide attention over human behaviors in real-time situations. It integrates the attention into a belief-space planner, using importance sampling to bias reasoning towards critical events. To train the algorithm, we let the attention generator and the planner form a min-max game. By solving the min-max game, LEADER learns to perform risk-aware planning without human labeling.
△ Less
Submitted 29 October, 2022; v1 submitted 23 September, 2022;
originally announced September 2022.
-
Out-of-Distribution Dynamics Detection: RL-Relevant Benchmarks and Results
Authors:
Mohamad H Danesh,
Alan Fern
Abstract:
We study the problem of out-of-distribution dynamics (OODD) detection, which involves detecting when the dynamics of a temporal process change compared to the training-distribution dynamics. This is relevant to applications in control, reinforcement learning (RL), and multi-variate time-series, where changes to test time dynamics can impact the performance of learning controllers/predictors in unk…
▽ More
We study the problem of out-of-distribution dynamics (OODD) detection, which involves detecting when the dynamics of a temporal process change compared to the training-distribution dynamics. This is relevant to applications in control, reinforcement learning (RL), and multi-variate time-series, where changes to test time dynamics can impact the performance of learning controllers/predictors in unknown ways. This problem is particularly important in the context of deep RL, where learned controllers often overfit to the training environment. Currently, however, there is a lack of established OODD benchmarks for the types of environments commonly used in RL research. Our first contribution is to design a set of OODD benchmarks derived from common RL environments with varying types and intensities of OODD. Our second contribution is to design a strong OODD baseline approach based on recurrent implicit quantile network (RIQN), which monitors autoregressive prediction errors for OODD detection. In addition to RIQN, we introduce and test three other simpler baselines. Our final contribution is to evaluate our baseline approaches on the benchmarks to provide results for future comparison.
△ Less
Submitted 24 May, 2022; v1 submitted 11 July, 2021;
originally announced July 2021.
-
Stochastic Block-ADMM for Training Deep Networks
Authors:
Saeed Khorram,
Xiao Fu,
Mohamad H. Danesh,
Zhongang Qi,
Li Fuxin
Abstract:
In this paper, we propose Stochastic Block-ADMM as an approach to train deep neural networks in batch and online settings. Our method works by splitting neural networks into an arbitrary number of blocks and utilizes auxiliary variables to connect these blocks while optimizing with stochastic gradient descent. This allows training deep networks with non-differentiable constraints where conventiona…
▽ More
In this paper, we propose Stochastic Block-ADMM as an approach to train deep neural networks in batch and online settings. Our method works by splitting neural networks into an arbitrary number of blocks and utilizes auxiliary variables to connect these blocks while optimizing with stochastic gradient descent. This allows training deep networks with non-differentiable constraints where conventional backpropagation is not applicable. An application of this is supervised feature disentangling, where our proposed DeepFacto inserts a non-negative matrix factorization (NMF) layer into the network. Since backpropagation only needs to be performed within each block, our approach alleviates vanishing gradients and provides potentials for parallelization. We prove the convergence of our proposed method and justify its capabilities through experiments in supervised and weakly-supervised settings.
△ Less
Submitted 1 May, 2021;
originally announced May 2021.
-
Reducing Neural Network Parameter Initialization Into an SMT Problem
Authors:
Mohamad H. Danesh
Abstract:
Training a neural network (NN) depends on multiple factors, including but not limited to the initial weights. In this paper, we focus on initializing deep NN parameters such that it performs better, comparing to random or zero initialization. We do this by reducing the process of initialization into an SMT solver. Previous works consider certain activation functions on small NNs, however the studi…
▽ More
Training a neural network (NN) depends on multiple factors, including but not limited to the initial weights. In this paper, we focus on initializing deep NN parameters such that it performs better, comparing to random or zero initialization. We do this by reducing the process of initialization into an SMT solver. Previous works consider certain activation functions on small NNs, however the studied NN is a deep network with different activation functions. Our experiments show that the proposed approach for parameter initialization achieves better performance comparing to randomly initialized networks.
△ Less
Submitted 9 November, 2020; v1 submitted 2 November, 2020;
originally announced November 2020.
-
Re-understanding Finite-State Representations of Recurrent Policy Networks
Authors:
Mohamad H. Danesh,
Anurag Koul,
Alan Fern,
Saeed Khorram
Abstract:
We introduce an approach for understanding control policies represented as recurrent neural networks. Recent work has approached this problem by transforming such recurrent policy networks into finite-state machines (FSM) and then analyzing the equivalent minimized FSM. While this led to interesting insights, the minimization process can obscure a deeper understanding of a machine's operation by m…
▽ More
We introduce an approach for understanding control policies represented as recurrent neural networks. Recent work has approached this problem by transforming such recurrent policy networks into finite-state machines (FSM) and then analyzing the equivalent minimized FSM. While this led to interesting insights, the minimization process can obscure a deeper understanding of a machine's operation by merging states that are semantically distinct. To address this issue, we introduce an analysis approach that starts with an unminimized FSM and applies more-interpretable reductions that preserve the key decision points of the policy. We also contribute an attention tool to attain a deeper understanding of the role of observations in the decisions. Our case studies on 7 Atari games and 3 control benchmarks demonstrate that the approach can reveal insights that have not been previously noticed.
△ Less
Submitted 11 July, 2021; v1 submitted 5 June, 2020;
originally announced June 2020.