Stable Online and Offline Reinforcement Learning for Antibody CDRH3 Design

Authors: Yannick Vogt, Mehdi Naouar, Maria Kalweit, Christoph Cornelius Miething, Justus Duyster, Roland Mertelsmann, Gabriel Kalweit, Joschka Boedecker

Abstract: The field of antibody-based therapeutics has grown significantly in recent years, with targeted antibodies emerging as a potentially effective approach to personalized therapies. Such therapies could be particularly beneficial for complex, highly individual diseases such as cancer. However, progress in this field is often constrained by the extensive search space of amino acid sequences that form the foundation of antibody design. In this study, we introduce a novel reinforcement learning method specifically tailored to address the unique challenges of this domain. We demonstrate that our method can learn the design of high-affinity antibodies against multiple targets in silico, utilizing either online interaction or offline datasets. To the best of our knowledge, our approach is the first of its kind and outperforms existing methods on all tested antigens in the Absolut! database.

Submitted 29 November, 2023; originally announced January 2024.

arXiv:2312.00671

CellMixer: Annotation-free Semantic Cell Segmentation of Heterogeneous Cell Populations

Authors: Mehdi Naouar, Gabriel Kalweit, Anusha Klett, Yannick Vogt, Paula Silvestrini, Diana Laura Infante Ramirez, Roland Mertelsmann, Joschka Boedecker, Maria Kalweit

Abstract: In recent years, several unsupervised cell segmentation methods have been presented, trying to omit the requirement of laborious pixel-level annotations for the training of a cell segmentation model. Most if not all of these methods handle the instance segmentation task by focusing on the detection of different cell instances ignoring their type. While such models prove adequate for certain tasks, like cell counting, other applications require the identification of each cell's type. In this paper, we present CellMixer, an innovative annotation-free approach for the semantic segmentation of heterogeneous cell populations. Our augmentation-based method enables the training of a segmentation model from image-level labels of homogeneous cell populations. Our results show that CellMixer can achieve competitive segmentation performance across multiple cell types and imaging modalities, demonstrating the method's scalability and potential for broader applications in medical imaging, cellular biology, and diagnostics.

Submitted 1 December, 2023; originally announced December 2023.

Comments: Medical Imaging Meets NeurIPS 2023

arXiv:2311.13870

Multi-intention Inverse Q-learning for Interpretable Behavior Representation

Authors: Hao Zhu, Brice De La Crompe, Gabriel Kalweit, Artur Schneider, Maria Kalweit, Ilka Diester, Joschka Boedecker

Abstract: In advancing the understanding of natural decision-making processes, inverse reinforcement learning (IRL) methods have proven instrumental in reconstructing animal's intentions underlying complex behaviors. Given the recent development of a continuous-time multi-intention IRL framework, there has been persistent inquiry into inferring discrete time-varying rewards with IRL. To address this challenge, we introduce the class of hierarchical inverse Q-learning (HIQL) algorithms. Through an unsupervised learning process, HIQL divides expert trajectories into multiple intention segments, and solves the IRL problem independently for each. Applying HIQL to simulated experiments and several real animal behavior datasets, our approach outperforms current benchmarks in behavior prediction and produces interpretable reward functions. Our results suggest that the intention transition dynamics underlying complex decision-making behavior is better modeled by a step function instead of a smoothly varying function. This advancement holds promise for neuroscience and cognitive science, contributing to a deeper understanding of decision-making and uncovering underlying brain mechanisms.

Submitted 19 June, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

arXiv:2303.16533

Robust Tumor Detection from Coarse Annotations via Multi-Magnification Ensembles

Authors: Mehdi Naouar, Gabriel Kalweit, Ignacio Mastroleo, Philipp Poxleitner, Marc Metzger, Joschka Boedecker, Maria Kalweit

Abstract: Cancer detection and classification from gigapixel whole slide images of stained tissue specimens has recently experienced enormous progress in computational histopathology. The limitation of available pixel-wise annotated scans shifted the focus from tumor localization to global slide-level classification on the basis of (weakly-supervised) multiple-instance learning despite the clinical importance of local cancer detection. However, the worse performance of these techniques in comparison to fully supervised methods has limited their usage until now for diagnostic interventions in domains of life-threatening diseases such as cancer. In this work, we put the focus back on tumor localization in form of a patch-level classification task and take up the setting of so-called coarse annotations, which provide greater training supervision while remaining feasible from a clinical standpoint. To this end, we present a novel ensemble method that not only significantly improves the detection accuracy of metastasis on the open CAMELYON16 data set of sentinel lymph nodes of breast cancer patients, but also considerably increases its robustness against noise while training on coarse annotations. Our experiments show that better results can be achieved with our technique making it clinically feasible to use for cancer diagnosis and opening a new avenue for translational and clinical research.

Submitted 29 March, 2023; originally announced March 2023.

arXiv:2209.14229

Process-guidance improves predictive performance of neural networks for carbon turnover in ecosystems

Authors: Marieke Wesselkamp, Niklas Moser, Maria Kalweit, Joschka Boedecker, Carsten F. Dormann

Abstract: Despite deep-learning being state-of-the-art for data-driven model predictions, it has not yet found frequent application in ecology. Given the low sample size typical in many environmental research fields, the default choice for the modelling of ecosystems and its functions remain process-based models. The process understanding coded in these models complements the sparse data and neural networks can detect hidden dynamics even in noisy data. Embedding the process model in the neural network adds information to learn from, improving interpretability and predictive performance of the combined model towards the data-only neural networks and the mechanism-only process model. At the example of carbon fluxes in forest ecosystems, we compare different approaches of guiding a neural network towards process model theory. Evaluation of the results under four classical prediction scenarios supports decision-making on the appropriate choice of a process-guided neural network.

Submitted 28 September, 2022; originally announced September 2022.

arXiv:2204.04733

NeuRL: Closed-form Inverse Reinforcement Learning for Neural Decoding

Authors: Gabriel Kalweit, Maria Kalweit, Mansour Alyahyay, Zoe Jaeckel, Florian Steenbergen, Stefanie Hardung, Thomas Brox, Ilka Diester, Joschka Boedecker

Abstract: Current neural decoding methods typically aim at explaining behavior based on neural activity via supervised learning. However, since generally there is a strong connection between learning of subjects and their expectations on long-term rewards, we propose NeuRL, an inverse reinforcement learning approach that (1) extracts an intrinsic reward function from collected trajectories of a subject in closed form, (2) maps neural signals to this intrinsic reward to account for long-term dependencies in the behavior and (3) predicts the simulated behavior for unseen neural signals by extracting Q-values and the corresponding Boltzmann policy based on the intrinsic reward values for these unseen neural signals. We show that NeuRL leads to better generalization and improved decoding performance compared to supervised approaches. We study the behavior of rats in a response-preparation task and evaluate the performance of NeuRL within simulated inhibition and per-trial behavior prediction. By assigning clear functional roles to defined neuronal populations our approach offers a new interpretation tool for complex neuronal data with testable predictions. In per-trial behavior prediction, our approach furthermore improves accuracy by up to 15% compared to traditional methods.

Submitted 10 April, 2022; originally announced April 2022.

arXiv:2010.11278

Deep Surrogate Q-Learning for Autonomous Driving

Authors: Maria Kalweit, Gabriel Kalweit, Moritz Werling, Joschka Boedecker

Abstract: Challenging problems of deep reinforcement learning systems with regard to the application on real systems are their adaptivity to changing environments and their efficiency w.r.t. computational resources and data. In the application of learning lane-change behavior for autonomous driving, agents have to deal with a varying number of surrounding vehicles. Furthermore, the number of required transitions imposes a bottleneck, since test drivers cannot perform an arbitrary amount of lane changes in the real world. In the off-policy setting, additional information on solving the task can be gained by observing actions from others. While in the classical RL setup this knowledge remains unused, we use other drivers as surrogates to learn the agent's value function more efficiently. We propose Surrogate Q-learning that deals with the aforementioned problems and reduces the required driving time drastically. We further propose an efficient implementation based on a permutation-equivariant deep neural network architecture of the Q-function to estimate action-values for a variable number of vehicles in sensor range. We show that the architecture leads to a novel replay sampling technique we call Scene-centric Experience Replay and evaluate the performance of Surrogate Q-learning and Scene-centric Experience Replay in the open traffic simulator SUMO. Additionally, we show that our methods enhance real-world applicability of RL systems by learning policies on the real highD dataset.

Submitted 17 February, 2022; v1 submitted 21 October, 2020; originally announced October 2020.

Comments: Accepted at ICRA 2022

Showing 1–7 of 7 results for author: Kalweit, M