-
Contextualized Hybrid Ensemble Q-learning: Learning Fast with Control Priors
Authors:
Emma Cramer,
Bernd Frauenknecht,
Ramil Sabirov,
Sebastian Trimpe
Abstract:
Combining Reinforcement Learning (RL) with a prior controller can yield the best out of two worlds: RL can solve complex nonlinear problems, while the control prior ensures safer exploration and speeds up training. Prior work largely blends both components with a fixed weight, neglecting that the RL agent's performance varies with the training progress and across regions in the state space. Therefore, we advocate for an adaptive strategy that dynamically adjusts the weighting based on the RL agent's current capabilities. We propose a new adaptive hybrid RL algorithm, Contextualized Hybrid Ensemble Q-learning (CHEQ). CHEQ combines three key ingredients: (i) a time-invariant formulation of the adaptive hybrid RL problem treating the adaptive weight as a context variable, (ii) a weight adaption mechanism based on the parametric uncertainty of a critic ensemble, and (iii) ensemble-based acceleration for data-efficient RL. Evaluating CHEQ on a car racing task reveals substantially stronger data efficiency, exploration safety, and transferability to unknown scenarios than state-of-the-art adaptive hybrid RL methods.
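As a rough illustration of ingredient (ii), the sketch below blends an RL action with a prior-controller action using a weight derived from the disagreement of a critic ensemble. The abstract does not specify the mapping from uncertainty to weight; the clipped linear map, the function names, and the parameters `u_low`, `u_high`, `w_min`, and `w_max` are all assumptions made for this example and should not be read as the authors' implementation.

```python
from statistics import pstdev


def ensemble_uncertainty(q_values):
    """Spread of the ensemble's Q-estimates for one state-action pair.

    Assumption: the population standard deviation across ensemble members
    serves as the proxy for parametric uncertainty.
    """
    return pstdev(q_values)


def uncertainty_to_weight(u, u_low, u_high, w_min=0.2, w_max=1.0):
    """Map uncertainty to the RL weight (hypothetical clipped linear map).

    Low uncertainty gives the RL agent more authority (w_max);
    high uncertainty shifts authority to the prior controller (w_min).
    """
    if u_high <= u_low:
        raise ValueError("u_high must be greater than u_low")
    frac = (u - u_low) / (u_high - u_low)
    frac = min(1.0, max(0.0, frac))  # clip to [0, 1]
    return w_max - frac * (w_max - w_min)


def blend_actions(a_rl, a_prior, w):
    """Convex combination of the RL action and the prior controller's action."""
    return [w * r + (1.0 - w) * p for r, p in zip(a_rl, a_prior)]


# Example usage with made-up numbers.
q_estimates = [10.2, 9.8, 10.1, 10.4, 9.9]  # one value per critic
u = ensemble_uncertainty(q_estimates)
w = uncertainty_to_weight(u, u_low=0.1, u_high=0.5)
action = blend_actions(a_rl=[0.3, -0.1], a_prior=[0.0, 0.2], w=w)
```

In the formulation described in the abstract, the weight is additionally treated as a context variable of the agent, which this sketch leaves out.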
Submitted 1 July, 2024; v1 submitted 28 June, 2024;
originally announced June 2024.
-
Trust the Model Where It Trusts Itself -- Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption
Authors:
Bernd Frauenknecht,
Artur Eisele,
Devdutt Subhasish,
Friedrich Solowjow,
Sebastian Trimpe
Abstract:
Dyna-style model-based reinforcement learning (MBRL) combines model-free agents with predictive transition models through model-based rollouts. This combination raises a critical question: 'When to trust your model?'; i.e., which rollout length results in the model providing useful data? Janner et al. (2019) address this question by gradually increasing rollout lengths throughout the training. While theoretically tempting, uniform model accuracy is a fallacy that collapses at the latest when extrapolating. Instead, we propose asking the question 'Where to trust your model?'. Using inherent model uncertainty to consider local accuracy, we obtain the Model-Based Actor-Critic with Uncertainty-Aware Rollout Adaption (MACURA) algorithm. We propose an easy-to-tune rollout mechanism and demonstrate substantial improvements in data efficiency and performance compared to state-of-the-art deep MBRL methods on the MuJoCo benchmark.
Submitted 21 June, 2024; v1 submitted 29 May, 2024;
originally announced May 2024.
-
Data-efficient Deep Reinforcement Learning for Vehicle Trajectory Control
Authors:
Bernd Frauenknecht,
Tobias Ehlgen,
Sebastian Trimpe
Abstract:
Advanced vehicle control is a fundamental building block in the development of autonomous driving systems. Reinforcement learning (RL) promises to achieve control performance superior to classical approaches while keeping computational demands low during deployment. However, standard RL approaches like soft actor-critic (SAC) require extensive amounts of training data to be collected and are thus impractical for real-world application. To address this issue, we apply recently developed data-efficient deep RL methods to vehicle trajectory control. Our investigation focuses on three methods, so far unexplored for vehicle control: randomized ensemble double Q-learning (REDQ), probabilistic ensembles with trajectory sampling and model predictive path integral optimizer (PETS-MPPI), and model-based policy optimization (MBPO). We find that in the case of trajectory control, the standard model-based RL formulation used in approaches like PETS-MPPI and MBPO is not suitable. We therefore propose a new formulation that splits dynamics prediction and vehicle localization. Our benchmark study on the CARLA simulator reveals that the three identified data-efficient deep RL approaches learn control strategies on a par with or better than SAC, yet reduce the required number of environment interactions by more than one order of magnitude.
Submitted 30 November, 2023;
originally announced November 2023.
-
Glioma subtype classification from histopathological images using in-domain and out-of-domain transfer learning: An experimental study
Authors:
Vladimir Despotovic,
Sang-Yoon Kim,
Ann-Christin Hau,
Aliaksandra Kakoichankava,
Gilbert Georg Klamminger,
Felix Bruno Kleine Borgmann,
Katrin B. M. Frauenknecht,
Michel Mittelbronn,
Petr V. Nazarov
Abstract:
In this paper, we provide a comprehensive comparison of various transfer learning strategies and deep learning architectures for computer-aided classification of adult-type diffuse gliomas. We evaluate the generalizability of out-of-domain ImageNet representations for a target domain of histopathological images, and study the impact of in-domain adaptation using self-supervised and multi-task learning approaches for pretraining the models on medium-to-large-scale datasets of histopathological images. Furthermore, we propose a semi-supervised learning approach in which the fine-tuned models are used to predict the labels of unannotated regions of the whole slide images (WSI). The models are subsequently retrained using the ground-truth labels and the weak labels determined in the previous step, providing superior performance in comparison to standard in-domain transfer learning, with a balanced accuracy of 96.91% and an F1-score of 97.07%, while minimizing the pathologist's annotation effort. Finally, we provide a visualization tool working at the WSI level that generates heatmaps highlighting tumor areas, thus providing insights to pathologists concerning the most informative parts of the WSI.
Submitted 29 September, 2023;
originally announced September 2023.