FPGA-Based Neural Thrust Controller for UAVs

Authors: Sharif Azem, David Scheunert, Mengguang Li, Jonas Gehrunger, Kai Cui, Christian Hochberger, Heinz Koeppl

Abstract: The advent of unmanned aerial vehicles (UAVs) has improved a variety of fields by providing a versatile, cost-effective and accessible platform for implementing state-of-the-art algorithms. To accomplish a broader range of tasks, there is a growing need for enhanced on-board computing to cope with increasing complexity and dynamic environmental conditions. Recent advances have seen the application of Deep Neural Networks (DNNs), particularly in combination with Reinforcement Learning (RL), to improve the adaptability and performance of UAVs, especially in unknown environments. However, the computational requirements of DNNs pose a challenge to the limited computing resources available on many UAVs. This work explores the use of Field Programmable Gate Arrays (FPGAs) as a viable solution to this challenge, offering flexibility, high performance, energy and time efficiency. We propose a novel hardware board equipped with an Artix-7 FPGA for a popular open-source micro-UAV platform. We successfully validate its functionality by implementing an RL-based low-level controller using real-world experiments.

Submitted 28 March, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

arXiv:2403.15221 [pdf, other]

Mutual Information of a class of Poisson-type Channels using Markov Renewal Theory

Authors: Maximilian Gehri, Nicolai Engelmann, Heinz Koeppl

Abstract: The mutual information (MI) of Poisson-type channels has been linked to a filtering problem since the 70s, but its evaluation for specific continuous-time, discrete-state systems remains a demanding task. As an advantage, Markov renewal processes (MrP) retain their renewal property under state space filtering. This offers a way to solve the filtering problem analytically for small systems. We consider a class of communication systems $X \to Y$ that can be derived from an MrP by a custom filtering procedure. For the subclasses, where (i) $Y$ is a renewal process or (ii) $(X,Y)$ belongs to a class of MrPs, we provide an evolution equation for finite transmission duration $T>0$ and limit theorems for $T \to \infty$ that facilitate simulation-free evaluation of the MI $\mathbb{I}(X_{[0,T]}; Y_{[0,T]})$ and its associated mutual information rate (MIR). In other cases, simulation cost is reduced to the marginal system $(X,Y)$ or $Y$. We show that systems with an additional $X$-modulating level $C$, which statically chooses between different processes $X_{[0,T]}(c)$, can naturally be included in our framework, thereby giving an expression for $\mathbb{I}(C; Y_{[0,T]})$. Our primary contribution is to apply the results of classical (Markov renewal) filtering theory in a novel manner to the problem of exactly computing the MI/MIR. The theoretical framework is showcased in an application to bacterial gene expression, where filtering is analytically tractable.

Submitted 15 June, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

Comments: 5 main pages, 1 main figure, 5 appendix pages, 3 appendix figures, Accepted at ISIT 2024 conference

arXiv:2403.00044 [pdf, other]

Scaling up Dynamic Edge Partition Models via Stochastic Gradient MCMC

Authors: Sikun Yang, Heinz Koeppl

Abstract: The edge partition model (EPM) is a generative model for extracting an overlapping community structure from static graph-structured data. In the EPM, the gamma process (GaP) prior is adopted to infer the appropriate number of latent communities, and each vertex is endowed with a gamma-distributed positive memberships vector. Despite having many attractive properties, inference in the EPM is typically performed using Markov chain Monte Carlo (MCMC) methods that prevent it from being applied to massive network data. In this paper, we generalize the EPM to account for dynamic environments by representing each vertex with a positive memberships vector constructed using a Dirichlet prior specification, and capturing the time-evolving behaviour of vertices via a Dirichlet Markov chain construction. A simple-to-implement Gibbs sampler is proposed to perform posterior computation using a Negative-Binomial augmentation technique. For large network data, we propose a stochastic gradient Markov chain Monte Carlo (SG-MCMC) algorithm for scalable inference in the proposed model. The experimental results show that the novel methods achieve competitive performance in terms of link prediction, while being much faster.

Submitted 29 February, 2024; originally announced March 2024.

arXiv:2402.18995 [pdf, other]

Negative-Binomial Randomized Gamma Markov Processes for Heterogeneous Overdispersed Count Time Series

Authors: Rui Huang, Sikun Yang, Heinz Koeppl

Abstract: Modeling count-valued time series has been receiving increasing attention since count time series naturally arise in physical and social domains. Poisson gamma dynamical systems (PGDSs) are newly-developed methods, which can well capture the expressive latent transition structure and bursty dynamics behind count sequences. In particular, PGDSs demonstrate superior performance in terms of data imputation and prediction, compared with canonical linear dynamical system (LDS) based methods. Despite these advantages, PGDS cannot capture the heterogeneous overdispersed behaviours of the underlying dynamic processes. To mitigate this defect, we propose a negative-binomial-randomized gamma Markov process, which not only significantly improves the predictive performance of the proposed dynamical system, but also facilitates the fast convergence of the inference algorithm. Moreover, we develop methods to estimate both factor-structured and graph-structured transition dynamics, which enable us to infer more explainable latent structure, compared with PGDSs. Finally, we demonstrate the explainable latent structure learned by the proposed method, and show its superior performance in imputing missing data and forecasting future observations, compared with the related models.

Submitted 29 February, 2024; originally announced February 2024.

arXiv:2402.16297 [pdf, other]

A Poisson-Gamma Dynamic Factor Model with Time-Varying Transition Dynamics

Authors: Jiahao Wang, Sikun Yang, Heinz Koeppl, Xiuzhen Cheng, Pengfei Hu, Guoming Zhang

Abstract: Probabilistic approaches for handling count-valued time sequences have attracted considerable research attention because of their ability to infer explainable latent structures and to estimate uncertainties, and are thus especially suitable for dealing with \emph{noisy} and \emph{incomplete} count data. Among these models, Poisson-Gamma Dynamical Systems (PGDSs) have proven effective in capturing the evolving dynamics underlying observed count sequences. However, the state-of-the-art PGDS still fails to capture the \emph{time-varying} transition dynamics that are commonly observed in real-world count time sequences. To close this gap, a non-stationary PGDS is proposed to allow the underlying transition matrices to evolve over time, and the evolving transition matrices are modeled by carefully designed Dirichlet Markov chains. Leveraging Dirichlet-Multinomial-Beta data augmentation techniques, a fully-conjugate and efficient Gibbs sampler is developed to perform posterior simulation. Experiments show that, in comparison with related models, the proposed non-stationary PGDS achieves improved predictive performance due to its capacity to learn non-stationary dependency structure captured by the time-evolving transition matrices.

Submitted 23 May, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

arXiv:2402.07735 [pdf, other]

Graph Structure Inference with BAM: Introducing the Bilinear Attention Mechanism

Authors: Philipp Froehlich, Heinz Koeppl

Abstract: In statistics and machine learning, detecting dependencies in datasets is a central challenge. We propose a novel neural network model for supervised graph structure learning, i.e., the process of learning a mapping between observational data and their underlying dependence structure. The model is trained with variably shaped and coupled simulated input data and requires only a single forward pass through the trained network for inference. By leveraging structural equation models and employing randomly generated multivariate Chebyshev polynomials for the simulation of training data, our method demonstrates robust generalizability across both linear and various types of non-linear dependencies. We introduce a novel bilinear attention mechanism (BAM) for explicit processing of dependency information, which operates on the level of covariance matrices of transformed data and respects the geometry of the manifold of symmetric positive definite matrices. Empirical evaluation demonstrates the robustness of our method in detecting a wide range of dependencies, excelling in undirected graph estimation and proving competitive in completed partially directed acyclic graph estimation through a novel two-step approach.

Submitted 13 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

arXiv:2402.01477 [pdf, other]

A Modular Aerial System Based on Homogeneous Quadrotors with Fault-Tolerant Control

Authors: Mengguang Li, Kai Cui, Heinz Koeppl

Abstract: The standard quadrotor is one of the most popular and widely used aerial vehicles of recent decades, offering great maneuverability with mechanical simplicity. However, the under-actuation characteristic limits its applications, especially when it comes to generating a desired wrench with six degrees of freedom (DOF). Therefore, existing work often compromises between mechanical complexity and the controllable DOF of the aerial system. To take advantage of the mechanical simplicity of a standard quadrotor, we propose a modular aerial system, IdentiQuad, that combines only homogeneous quadrotor-based modules. Each IdentiQuad can be operated alone like a standard quadrotor, but at the same time allows task-specific assembly, increasing the controllable DOF of the system. Each module is interchangeable within its assembly. We also propose a general controller for different configurations of assemblies, capable of tolerating rotor failures and balancing the energy consumption of each module. The functionality and robustness of the system and its controller are validated using physics-based simulations for different assembly configurations.

Submitted 21 March, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: ICRA2024

arXiv:2402.01431 [pdf, other]

Approximate Control for Continuous-Time POMDPs

Authors: Yannick Eich, Bastian Alt, Heinz Koeppl

Abstract: This work proposes a decision-making framework for partially observable systems in continuous time with discrete state and action spaces. As optimal decision-making becomes intractable for large state spaces we employ approximation methods for the filtering and the control problem that scale well with an increasing number of states. Specifically, we approximate the high-dimensional filtering distribution by projecting it onto a parametric family of distributions, and integrate it into a control heuristic based on the fully observable system to obtain a scalable policy. We demonstrate the effectiveness of our approach on several partially observed systems, including queueing systems and chemical reaction networks.

Submitted 29 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: To be published in AISTATS 2024

arXiv:2401.12686 [pdf, other]

Learning Mean Field Games on Sparse Graphs: A Hybrid Graphex Approach

Authors: Christian Fabian, Kai Cui, Heinz Koeppl

Abstract: Learning the behavior of large agent populations is an important task for numerous research areas. Although the field of multi-agent reinforcement learning (MARL) has made significant progress towards solving these systems, solutions for many agents often remain computationally infeasible and lack theoretical guarantees. Mean Field Games (MFGs) address both of these issues and can be extended to Graphon MFGs (GMFGs) to include network structures between agents. Despite their merits, the real-world applicability of GMFGs is limited by the fact that graphons only capture dense graphs. Since most empirically observed networks show some degree of sparsity, such as power law graphs, the GMFG framework is insufficient for capturing these network topologies. Thus, we introduce the novel concept of Graphex MFGs (GXMFGs) which builds on the graph theoretical concept of graphexes. Graphexes are the limiting objects to sparse graph sequences that also have other desirable features such as the small world property. Learning equilibria in these games is challenging due to the rich and sparse structure of the underlying graphs. To tackle these challenges, we design a new learning algorithm tailored to the GXMFG setup. This hybrid graphex learning approach leverages that the system mainly consists of a highly connected core and a sparse periphery. After defining the system and providing a theoretical analysis, we state our learning approach and demonstrate its learning capabilities on both synthetic graphs and real-world networks. This comparison shows that our GXMFG learning algorithm successfully extends MFGs to a highly relevant class of hard, realistic learning problems that are not accurately addressed by current MARL and MFG methods.

Submitted 23 February, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

Comments: accepted at ICLR 2024

arXiv:2312.12977 [pdf, other]

Collaborative Optimization of the Age of Information under Partial Observability

Authors: Anam Tahir, Kai Cui, Bastian Alt, Amr Rizk, Heinz Koeppl

Abstract: The significance of the freshness of sensor and control data at the receiver side, often referred to as Age of Information (AoI), is fundamentally constrained by contention for limited network resources. Evidently, network congestion is detrimental for AoI, where this congestion is partly self-induced by the sensor transmission process in addition to the contention from other transmitting sensors. In this work, we devise a decentralized AoI-minimizing transmission policy for a number of sensor agents sharing capacity-limited, non-FIFO duplex channels that introduce random delays in communication with a common receiver. By implementing the same policy, however with no explicit inter-agent communication, the agents minimize the expected AoI in this partially observable system. We cater to the partial observability due to random channel delays by designing a bootstrap particle filter that independently maintains a belief over the AoI of each agent. We also leverage mean-field control approximations and reinforcement learning to derive scalable and optimal solutions for minimizing the expected AoI collaboratively.

Submitted 20 December, 2023; originally announced December 2023.

arXiv:2312.12973 [pdf, other]

Sparse Mean Field Load Balancing in Large Localized Queueing Systems

Authors: Anam Tahir, Kai Cui, Heinz Koeppl

Abstract: Scalable load balancing algorithms are of great interest in cloud networks and data centers, necessitating the use of tractable techniques to compute optimal load balancing policies for good performance. However, most existing scalable techniques, especially asymptotically scaling methods based on mean field theory, have not been able to model large queueing networks with strong locality. Meanwhile, general multi-agent reinforcement learning techniques can be hard to scale and usually lack a theoretical foundation. In this work, we address this challenge by leveraging recent advances in sparse mean field theory to learn a near-optimal load balancing policy in sparsely connected queueing networks in a tractable manner, which may be preferable to global approaches in terms of wireless communication overhead. Importantly, we obtain a general load balancing framework for a large class of sparse bounded-degree wireless topologies. By formulating a novel mean field control problem in the context of graphs with bounded degree, we reduce the otherwise difficult multi-agent problem to a single-agent problem. Theoretically, the approach is justified by approximation guarantees. Empirically, the proposed methodology performs well on several realistic and scalable wireless network topologies as compared to a number of well-known load balancing heuristics and existing scalable multi-agent reinforcement learning methods.

Submitted 22 March, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

arXiv:2312.10787 [pdf, other]

Learning Discrete-Time Major-Minor Mean Field Games

Authors: Kai Cui, Gökçe Dayanıklı, Mathieu Laurière, Matthieu Geist, Olivier Pietquin, Heinz Koeppl

Abstract: Recent techniques based on Mean Field Games (MFGs) allow the scalable analysis of multi-player games with many similar, rational agents. However, standard MFGs remain limited to homogeneous players that weakly influence each other, and cannot model major players that strongly influence other players, severely limiting the class of problems that can be handled. We propose a novel discrete-time version of major-minor MFGs (M3FGs), along with a learning algorithm based on fictitious play and partitioning the probability simplex. Importantly, M3FGs generalize MFGs with common noise and can handle not only random exogenous environment states but also major players. A key challenge is that the mean field is stochastic and not deterministic as in standard MFGs. Our theoretical investigation verifies both the M3FG model and its algorithmic solution, showing firstly the well-posedness of the M3FG model starting from a finite game of interest, and secondly convergence and approximation guarantees of the fictitious play algorithm. Then, we empirically verify the obtained theoretical results, ablating some of the theoretical assumptions made, and show successful equilibrium learning in three example problems. Overall, we establish a learning framework for a novel and broad class of tractable games.

Submitted 17 December, 2023; originally announced December 2023.

Comments: Accepted to AAAI 2024

arXiv:2311.14770 [pdf, other]

Learning to Cooperate and Communicate Over Imperfect Channels

Authors: Jannis Weil, Gizem Ekinci, Heinz Koeppl, Tobias Meuser

Abstract: Information exchange in multi-agent systems improves the cooperation among agents, especially in partially observable settings. In the real world, communication is often carried out over imperfect channels. This requires agents to handle uncertainty due to potential information loss. In this paper, we consider a cooperative multi-agent system where the agents act and exchange information in a decentralized manner using a limited and unreliable channel. To cope with such channel constraints, we propose a novel communication approach based on independent Q-learning. Our method allows agents to dynamically adapt how much information to share by sending messages of different sizes, depending on their local observations and the channel's properties. In addition to this message size selection, agents learn to encode and decode messages to improve their jointly trained policies. We show that our approach outperforms approaches without adaptive capabilities in a novel cooperative digit-prediction environment and discuss its limitations in the traffic junction environment.

Submitted 24 November, 2023; originally announced November 2023.

arXiv:2311.08298 [pdf, other]

A Survey of Confidence Estimation and Calibration in Large Language Models

Authors: Jiahui Geng, Fengyu Cai, Yuxia Wang, Heinz Koeppl, Preslav Nakov, Iryna Gurevych

Abstract: Large language models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks in various domains. Despite their impressive performance, they can be unreliable due to factual errors in their generations. Assessing their confidence and calibrating them across different tasks can help mitigate risks and enable LLMs to produce better generations. There has been a lot of recent research aiming to address this, but there has been no comprehensive overview to organize it and outline the main lessons learned. The present survey aims to bridge this gap. In particular, we outline the challenges and we summarize recent technical advancements for LLM confidence estimation and calibration. We further discuss their applications and suggest promising directions for future work.

Submitted 25 March, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

Comments: 16 pages, 1 page, 1 table

arXiv:2310.02726 [pdf, other]

Optimal Collaborative Transportation for Under-Capacitated Vehicle Routing Problems using Aerial Drone Swarms

Authors: Akash Kopparam Sreedhara, Deepesh Padala, Shashank Mahesh, Kai Cui, Mengguang Li, Heinz Koeppl

Abstract: Swarms of aerial drones have recently been considered for last-mile deliveries in urban logistics or automated construction. At the same time, collaborative transportation of payloads by multiple drones is another important area of recent research. However, efficient coordination algorithms for collaborative transportation of many payloads by many drones remain to be considered. In this work, we formulate the collaborative transportation of payloads by a swarm of drones as a novel, under-capacitated generalization of vehicle routing problems (VRP), which may also be of separate interest. In contrast to standard VRP and capacitated VRP, we must additionally consider waiting times for payloads lifted cooperatively by multiple drones, and the corresponding coordination. Algorithmically, we provide a solution encoding that avoids deadlocks and formulate an appropriate alternating minimization scheme to solve the problem. On the hardware side, we integrate our algorithms with collision avoidance and drone controllers. The approach and the impact of the system integration are successfully verified empirically, both on a swarm of real nano-quadcopters and for large swarms in simulation. Overall, we provide a framework for collaborative transportation with aerial drone swarms, that uses only as many drones as necessary for the transportation of any single payload.

Submitted 4 October, 2023; originally announced October 2023.

arXiv:2309.15604 [pdf, other]

Entropic Matching for Expectation Propagation of Markov Jump Processes

Authors: Bastian Alt, Heinz Koeppl

Abstract: This paper addresses the problem of statistical inference for latent continuous-time stochastic processes, which is often intractable, particularly for discrete state space processes described by Markov jump processes. To overcome this issue, we propose a new tractable inference scheme based on an entropic matching framework that can be embedded into the well-known expectation propagation algorithm. We demonstrate the effectiveness of our method by providing closed-form results for a simple family of approximate distributions and apply it to the general class of chemical reaction networks, which are a crucial tool for modeling in systems biology. Moreover, we derive closed-form expressions for point estimation of the underlying parameters using an approximate expectation maximization procedure. We evaluate the performance of our method on various chemical reaction network instantiations, including a stochastic Lotka-Volterra example, and discuss its limitations and potential for future improvements. Our proposed approach provides a promising direction for addressing complex continuous-time Bayesian inference problems.

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2308.12116 [pdf, other]

doi 10.1109/ICCVW60793.2023.00426

The TYC Dataset for Understanding Instance-Level Semantics and Motions of Cells in Microstructures

Authors: Christoph Reich, Tim Prangemeier, Heinz Koeppl

Abstract: Segmenting cells and tracking their motion over time is a common task in biomedical applications. However, predicting accurate instance-wise segmentation and cell motions from microscopy imagery remains a challenging task. Using microstructured environments for analyzing single cells in a constant flow of media adds additional complexity. While large-scale labeled microscopy datasets are available, we are not aware of any large-scale dataset, including both cells and microstructures. In this paper, we introduce the trapped yeast cell (TYC) dataset, a novel dataset for understanding instance-level semantics and motions of cells in microstructures. We release $105$ densely annotated high-resolution brightfield microscopy images, including about $19$k instance masks. We also release $261$ curated video clips composed of $1293$ high-resolution microscopy images to facilitate unsupervised understanding of cell motions and morphology. TYC offers ten times more instance annotations than the previously largest dataset, including cells and microstructures. Our effort also exceeds previous attempts in terms of microstructure variability, resolution, complexity, and capturing device (microscopy) variability. We facilitate a unified comparison on our novel dataset by introducing a standardized evaluation strategy. TYC and evaluation code are publicly available under CC BY 4.0 license.

Submitted 23 August, 2023; originally announced August 2023.

Comments: Accepted at ICCV 2023 Workshop on BioImage Computing. Project page (with links to the dataset and code): https://christophreich1996.github.io/tyc_dataset/

arXiv:2307.06175 [pdf, other]

Learning Decentralized Partially Observable Mean Field Control for Artificial Collective Behavior

Authors: Kai Cui, Sascha Hauck, Christian Fabian, Heinz Koeppl

Abstract: Recent reinforcement learning (RL) methods have achieved success in various domains. However, multi-agent RL (MARL) remains a challenge in terms of decentralization, partial observability and scalability to many agents. Meanwhile, collective behavior requires resolution of the aforementioned challenges, and remains of importance to many state-of-the-art applications such as active matter physics, self-organizing systems, opinion dynamics, and biological or robotic swarms. Here, MARL via mean field control (MFC) offers a potential solution to scalability, but fails to consider decentralized and partially observable systems. In this paper, we enable decentralized behavior of agents under partial information by proposing novel models for decentralized partially observable MFC (Dec-POMFC), a broad class of problems with permutation-invariant agents allowing for reduction to tractable single-agent Markov decision processes (MDP) with single-agent RL solution. We provide rigorous theoretical results, including a dynamic programming principle, together with optimality guarantees for Dec-POMFC solutions applied to finite swarms of interest. Algorithmically, we propose Dec-POMFC-based policy gradient methods for MARL via centralized training and decentralized execution, together with policy gradient approximation guarantees. In addition, we improve upon state-of-the-art histogram-based MFC by kernel methods, which is of separate interest also for fully observable MFC. We evaluate numerically on representative collective behavior tasks such as adapted Kuramoto and Vicsek swarming models, being on par with state-of-the-art MARL. Overall, our framework takes a step towards RL-based engineering of artificial collective behavior via MFC.

Submitted 22 February, 2024; v1 submitted 12 July, 2023; originally announced July 2023.

Comments: Accepted to ICLR 2024

arXiv:2307.02988 [pdf, other]

UAV Swarms for Joint Data Ferrying and Dynamic Cell Coverage via Optimal Transport Descent and Quadratic Assignment

Authors: Kai Cui, Lars Baumgärtner, Burak Yilmaz, Mengguang Li, Christian Fabian, Benjamin Becker, Lin Xiang, Maximilian Bauer, Heinz Koeppl

Abstract: Both data ferrying with disruption-tolerant networking (DTN) and mobile cellular base stations constitute important techniques for UAV-aided communication in situations of crises where standard communication infrastructure is unavailable. For optimal use of a limited number of UAVs, we propose providing both DTN and a cellular base station on each UAV. Here, DTN is used for large amounts of low-priority data, while capacity-constrained cell coverage remains reserved for emergency calls or command and control. We optimize cell coverage via a novel optimal transport-based formulation using alternating minimization, while for data ferrying we periodically deliver data between dynamic clusters by solving quadratic assignment problems. In our evaluation, we consider different scenarios with varying mobility models and a wide range of flight patterns. Overall, we tractably achieve optimal cell coverage under quality-of-service costs with DTN-based data ferrying, enabling large-scale deployment of UAV swarms for crisis communication.

Submitted 6 July, 2023; originally announced July 2023.

Comments: Accepted to IEEE LCN 2023 as full paper, pre-final version

arXiv:2304.07597 [pdf, other]

doi 10.1109/EMBC40787.2023.10340268

An Instance Segmentation Dataset of Yeast Cells in Microstructures

Authors: Christoph Reich, Tim Prangemeier, André O. Françani, Heinz Koeppl

Abstract: Extracting single-cell information from microscopy data requires accurate instance-wise segmentations. Obtaining pixel-wise segmentations from microscopy imagery remains a challenging task, especially with the added complexity of microstructured environments. This paper presents a novel dataset for segmenting yeast cells in microstructures. We offer pixel-wise instance segmentation labels for both cells and trap microstructures. In total, we release 493 densely annotated microscopy images. To facilitate a unified comparison between novel segmentation algorithms, we propose a standardized evaluation strategy for our dataset. The aim of the dataset and evaluation strategy is to facilitate the development of new cell segmentation approaches. The dataset is publicly available at https://christophreich1996.github.io/yeast_in_microstructures_dataset/ .

Submitted 30 December, 2023; v1 submitted 15 April, 2023; originally announced April 2023.

Comments: IEEE EMBC 2023, Christoph Reich and Tim Prangemeier - both authors contributed equally

arXiv:2303.16698 [pdf, other]

Probabilistic inverse optimal control for non-linear partially observable systems disentangles perceptual uncertainty and behavioral costs

Authors: Dominik Straub, Matthias Schultheis, Heinz Koeppl, Constantin A. Rothkopf

Abstract: Inverse optimal control can be used to characterize behavior in sequential decision-making tasks. Most existing work, however, is limited to fully observable or linear systems, or requires the action signals to be known. Here, we introduce a probabilistic approach to inverse optimal control for partially observable stochastic non-linear systems with unobserved action signals, which unifies previous approaches to inverse optimal control with maximum causal entropy formulations. Using an explicit model of the noise characteristics of the sensory and motor systems of the agent in conjunction with local linearization techniques, we derive an approximate likelihood function for the model parameters, which can be computed within a single forward pass. We present quantitative evaluations on stochastic and partially observable versions of two classic control tasks and two human behavioral tasks. Importantly, we show that our method can disentangle perceptual factors and behavioral costs despite the fact that epistemic and pragmatic actions are intertwined in sequential decision-making under uncertainty, such as in active sensing and active learning. The proposed method has broad applicability, ranging from imitation learning to sensorimotor neuroscience.

Submitted 30 October, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

arXiv:2303.10665 [pdf, other]

Major-Minor Mean Field Multi-Agent Reinforcement Learning

Authors: Kai Cui, Christian Fabian, Anam Tahir, Heinz Koeppl

Abstract: Multi-agent reinforcement learning (MARL) remains difficult to scale to many agents. Recent MARL using Mean Field Control (MFC) provides a tractable and rigorous approach to otherwise difficult cooperative MARL. However, the strict MFC assumption of many independent, weakly-interacting agents is too inflexible in practice. We generalize MFC to instead simultaneously model many similar and few complex agents -- as Major-Minor Mean Field Control (M3FC). Theoretically, we give approximation results for finite agent control, and verify the sufficiency of stationary policies for optimality together with a dynamic programming principle. Algorithmically, we propose Major-Minor Mean Field MARL (M3FMARL) for finite agent systems instead of the limiting system. The algorithm is shown to approximate the policy gradient of the underlying M3FC MDP. Finally, we demonstrate its capabilities experimentally in various scenarios. We observe a strong performance in comparison to state-of-the-art policy gradient MARL methods.

Submitted 7 May, 2024; v1 submitted 19 March, 2023; originally announced March 2023.

Comments: Accepted to ICML 2024

arXiv:2210.09058 [pdf, other]

Forward-Backward Latent State Inference for Hidden Continuous-Time semi-Markov Chains

Authors: Nicolai Engelmann, Heinz Koeppl

Abstract: Hidden semi-Markov Models (HSMM's) - while broadly in use - are restricted to a discrete and uniform time grid. They are thus not well suited to explain often irregularly spaced discrete event data from continuous-time phenomena. We show that non-sampling-based latent state inference used in HSMM's can be generalized to latent Continuous-Time semi-Markov Chains (CTSMC's). We formulate integro-differential forward and backward equations adjusted to the observation likelihood and introduce an exact integral equation for the Bayesian posterior marginals and a scalable Viterbi-type algorithm for posterior path estimates. The presented equations can be efficiently solved using well-known numerical methods. As a practical tool, variable-step HSMM's are introduced. We evaluate our approaches in latent state inference scenarios in comparison to classical HSMM's.

Submitted 17 October, 2022; originally announced October 2022.

Comments: 10 content pages, 2 figures, to be published at NeurIPS 2022

arXiv:2210.09021 [pdf, other]

doi 10.1117/12.2624609

Histopathological Image Classification based on Self-Supervised Vision Transformer and Weak Labels

Authors: Ahmet Gokberk Gul, Oezdemir Cetin, Christoph Reich, Tim Prangemeier, Nadine Flinner, Heinz Koeppl

Abstract: Whole Slide Image (WSI) analysis is a powerful method to facilitate the diagnosis of cancer in tissue samples. Automating this diagnosis poses various issues, most notably caused by the immense image resolution and limited annotations. WSIs commonly exhibit resolutions of 100Kx100K pixels. Annotating cancerous areas in WSIs on the pixel level is prohibitively labor-intensive and requires a high level of expert knowledge. Multiple instance learning (MIL) alleviates the need for expensive pixel-level annotations. In MIL, learning is performed on slide-level labels, in which a pathologist provides information about whether a slide includes cancerous tissue. Here, we propose Self-ViT-MIL, a novel approach for classifying and localizing cancerous areas based on slide-level annotations, eliminating the need for pixel-wise annotated training data. Self-ViT-MIL is pre-trained in a self-supervised setting to learn rich feature representation without relying on any labels. The recent Vision Transformer (ViT) architecture builds the feature extractor of Self-ViT-MIL. For localizing cancerous regions, a MIL aggregator with global attention is utilized. To the best of our knowledge, Self-ViT-MIL is the first approach to introduce self-supervised ViTs in MIL-based WSI analysis tasks. We showcase the effectiveness of our approach on the common Camelyon16 dataset. Self-ViT-MIL surpasses existing state-of-the-art MIL-based approaches in terms of accuracy and area under the curve (AUC).

Submitted 17 April, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

Journal ref: Proc. SPIE 12039, Medical Imaging 2022: Digital and Computational Pathology, 120391O (4 April 2022)

arXiv:2209.13413 [pdf, other]

Reinforcement Learning with Non-Exponential Discounting

Authors: Matthias Schultheis, Constantin A. Rothkopf, Heinz Koeppl

Abstract: Commonly in reinforcement learning (RL), rewards are discounted over time using an exponential function to model time preference, thereby bounding the expected long-term reward. In contrast, in economics and psychology, it has been shown that humans often adopt a hyperbolic discounting scheme, which is optimal when a specific task termination time distribution is assumed. In this work, we propose a theory for continuous-time model-based reinforcement learning generalized to arbitrary discount functions. This formulation covers the case in which there is a non-exponential random termination time. We derive a Hamilton-Jacobi-Bellman (HJB) equation characterizing the optimal policy and describe how it can be solved using a collocation method, which uses deep learning for function approximation. Further, we show how the inverse RL problem can be approached, in which one tries to recover properties of the discount function given decision data. We validate the applicability of our proposed approach on two simulated problems. Our approach opens the way for the analysis of human discounting in sequential decision-making tasks.

Submitted 7 December, 2022; v1 submitted 27 September, 2022; originally announced September 2022.

Comments: 22 pages, 3 figures, published at 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

arXiv:2209.07420 [pdf, other]

Scalable Task-Driven Robotic Swarm Control via Collision Avoidance and Learning Mean-Field Control

Authors: Kai Cui, Mengguang Li, Christian Fabian, Heinz Koeppl

Abstract: In recent years, reinforcement learning and its multi-agent analogue have achieved great success in solving various complex control problems. However, multi-agent reinforcement learning remains challenging both in its theoretical analysis and empirical design of algorithms, especially for large swarms of embodied robotic agents where a definitive toolchain remains part of active research. We use emerging state-of-the-art mean-field control techniques in order to convert many-agent swarm control into more classical single-agent control of distributions. This allows profiting from advances in single-agent reinforcement learning at the cost of assuming weak interaction between agents. However, the mean-field model is violated by the nature of real systems with embodied, physically colliding agents. Thus, we combine collision avoidance and learning of mean-field control into a unified framework for tractably designing intelligent robotic swarm behavior. On the theoretical side, we provide novel approximation guarantees for general mean-field control both in continuous spaces and with collision avoidance. On the practical side, we show that our approach outperforms multi-agent reinforcement learning and allows for decentralized open-loop application while avoiding collisions, both in simulation and real UAV swarms. Overall, we propose a framework for the design of swarm behavior that is both mathematically well-founded and practically useful, enabling the solution of otherwise intractable swarm problems.

Submitted 9 February, 2023; v1 submitted 15 September, 2022; originally announced September 2022.

Comments: Accepted to the 40th IEEE Conference on Robotics and Automation (ICRA)

arXiv:2209.03887 [pdf, other]

Mean Field Games on Weighted and Directed Graphs via Colored Digraphons

Authors: Christian Fabian, Kai Cui, Heinz Koeppl

Abstract: The field of multi-agent reinforcement learning (MARL) has made considerable progress towards controlling challenging multi-agent systems by employing various learning methods. Numerous of these approaches focus on empirical and algorithmic aspects of the MARL problems and lack a rigorous theoretical foundation. Graphon mean field games (GMFGs) on the other hand provide a scalable and mathematically well-founded approach to learning problems that involve a large number of connected agents. In standard GMFGs, the connections between agents are undirected, unweighted and invariant over time. Our paper introduces colored digraphon mean field games (CDMFGs) which allow for weighted and directed links between agents that are also adaptive over time. Thus, CDMFGs are able to model more complex connections than standard GMFGs. Besides a rigorous theoretical analysis including both existence and convergence guarantees, we provide a learning scheme and illustrate our findings with an epidemics model and a model of the systemic risk in financial markets.

Submitted 8 September, 2022; originally announced September 2022.

arXiv:2209.03880 [pdf, other]

Learning Sparse Graphon Mean Field Games

Authors: Christian Fabian, Kai Cui, Heinz Koeppl

Abstract: Although the field of multi-agent reinforcement learning (MARL) has made considerable progress in the last years, solving systems with a large number of agents remains a hard challenge. Graphon mean field games (GMFGs) enable the scalable analysis of MARL problems that are otherwise intractable. By the mathematical structure of graphons, this approach is limited to dense graphs which are insufficient to describe many real-world networks such as power law graphs. Our paper introduces a novel formulation of GMFGs, called LPGMFGs, which leverages the graph theoretical concept of $L^p$ graphons and provides a machine learning tool to efficiently and accurately approximate solutions for sparse network problems. This especially includes power law networks which are empirically observed in various application areas and cannot be captured by standard graphons. We derive theoretical existence and convergence guarantees and give empirical examples that demonstrate the accuracy of our learning approach for systems with many agents. Furthermore, we extend the Online Mirror Descent (OMD) learning algorithm to our setup to accelerate learning speed, empirically show its capabilities, and conduct a theoretical analysis using the novel concept of smoothed step graphons. In general, we provide a scalable, mathematically well-founded machine learning approach to a large class of otherwise intractable problems of great relevance in numerous research fields.

Submitted 13 March, 2023; v1 submitted 8 September, 2022; originally announced September 2022.

Comments: accepted for publication at the International Conference on Artificial Intelligence and Statistics (AISTATS) 2023; code available at: https://github.com/ChrFabian/Learning_sparse_GMFGs

arXiv:2209.03859 [pdf, other]

A Survey on Large-Population Systems and Scalable Multi-Agent Reinforcement Learning

Authors: Kai Cui, Anam Tahir, Gizem Ekinci, Ahmed Elshamanhory, Yannick Eich, Mengguang Li, Heinz Koeppl

Abstract: The analysis and control of large-population systems is of great interest to diverse areas of research and engineering, ranging from epidemiology over robotic swarms to economics and finance. An increasingly popular and effective approach to realizing sequential decision-making in multi-agent systems is through multi-agent reinforcement learning, as it allows for an automatic and model-free analysis of highly complex systems. However, the key issue of scalability complicates the design of control and reinforcement learning algorithms particularly in systems with large populations of agents. While reinforcement learning has found resounding empirical success in many scenarios with few agents, problems with many agents quickly become intractable and necessitate special consideration. In this survey, we will shed light on current approaches to tractably understanding and analyzing large-population systems, both through multi-agent reinforcement learning and through adjacent areas of research such as mean-field games, collective intelligence, or complex network theory. These classically independent subject areas offer a variety of approaches to understanding or modeling large-population systems, which may be of great use for the formulation of tractable MARL algorithms in the future. Finally, we survey potential areas of application for large-scale control and identify fruitful future applications of learning algorithms in practical systems. We hope that our survey could provide insight and future directions to junior and senior researchers in theoretical and applied sciences alike.

Submitted 8 September, 2022; originally announced September 2022.

arXiv:2209.03854 [pdf, other]

Optimal Offloading Strategies for Edge-Computing via Mean-Field Games and Control

Authors: Kai Cui, Mustafa Burak Yilmaz, Anam Tahir, Anja Klein, Heinz Koeppl

Abstract: The optimal offloading of tasks in heterogeneous edge-computing scenarios is of great practical interest, both in the selfish and fully cooperative setting. In practice, such systems are typically very large, rendering exact solutions in terms of cooperative optima or Nash equilibria intractable. For this purpose, we adopt a general mean-field formulation in order to solve the competitive and cooperative offloading problems in the limit of infinitely large systems. We give theoretical guarantees for the approximation properties of the limiting solution and solve the resulting mean-field problems numerically. Furthermore, we verify our solutions numerically and find that our approximations are accurate for systems with dozens of edge devices. As a result, we obtain a tractable approach to the design of offloading strategies in large edge-computing scenarios with many users.

Submitted 8 September, 2022; originally announced September 2022.

Comments: Accepted to GLOBECOM 2022

arXiv:2208.13621 [pdf, other]

Decentralized Coordination in Partially Observable Queueing Networks

Authors: Jiekai Jia, Anam Tahir, Heinz Koeppl

Abstract: We consider communication in a fully cooperative multi-agent system, where the agents have partial observation of the environment and must act jointly to maximize the overall reward. We have a discrete-time queueing network where agents route packets to queues based only on the partial information of the current queue lengths. The queues have limited buffer capacity, so packet drops happen when they are sent to a full queue. In this work, we implemented a communication channel for the agents to share their information in order to reduce the packet drop rate. For efficient information sharing we use an attention-based communication model, called ATVC, to select informative messages from other agents. The agents then infer the state of queues using a combination of the variational auto-encoder, VAE, and product-of-experts, PoE, model. Ultimately, the agents learn what they need to communicate and with whom, instead of communicating all the time with everyone. We also show empirically that ATVC is able to infer the true state of the queues and leads to a policy which outperforms existing baselines.

Submitted 29 August, 2022; originally announced August 2022.

Comments: Accepted at IEEE Global Communications Conference 2022

arXiv:2208.04777 [pdf, other]

Learning Mean-Field Control for Delayed Information Load Balancing in Large Queuing Systems

Authors: Anam Tahir, Kai Cui, Heinz Koeppl

Abstract: Recent years have seen a great increase in the capacity and parallel processing power of data centers and cloud services. To fully utilize the said distributed systems, optimal load balancing for parallel queuing architectures must be realized. Existing state-of-the-art solutions fail to consider the effect of communication delays on the behaviour of very large systems with many clients. In this work, we consider a multi-agent load balancing system, with delayed information, consisting of many clients (load balancers) and many parallel queues. In order to obtain a tractable solution, we model this system as a mean-field control problem with enlarged state-action space in discrete time through exact discretization. Subsequently, we apply policy gradient reinforcement learning algorithms to find an optimal load balancing solution. Here, the discrete-time system model incorporates a synchronization delay under which the queue state information is synchronously broadcasted and updated at all clients. We then provide theoretical performance guarantees for our methodology in large systems. Finally, using experiments, we prove that our approach is not only scalable but also shows good performance when compared to the state-of-the-art power-of-d variant of the Join-the-Shortest-Queue (JSQ) and other policies in the presence of synchronization delays.

Submitted 9 August, 2022; originally announced August 2022.

Comments: 11 pages, 6 figures. Accepted in the 51st International Conference on Parallel Processing (ICPP'22)

arXiv:2205.08803 [pdf, other]

Markov Chain Monte Carlo for Continuous-Time Switching Dynamical Systems

Authors: Lukas Köhs, Bastian Alt, Heinz Koeppl

Abstract: Switching dynamical systems are an expressive model class for the analysis of time-series data. As in many fields within the natural and engineering sciences, the systems under study typically evolve continuously in time, it is natural to consider continuous-time model formulations consisting of switching stochastic differential equations governed by an underlying Markov jump process. Inference in these types of models is however notoriously difficult, and tractable computational schemes are rare. In this work, we propose a novel inference algorithm utilizing a Markov Chain Monte Carlo approach. The presented Gibbs sampler allows to efficiently obtain samples from the exact continuous-time posterior processes. Our framework naturally enables Bayesian parameter estimation, and we also include an estimate for the diffusion covariance, which is oftentimes assumed fixed in stochastic differential equation models. We evaluate our framework under the modeling assumption and compare it against an existing variational inference approach.

Submitted 18 May, 2022; originally announced May 2022.

Comments: Accepted at ICML 2022

arXiv:2205.07011 [pdf, other]

ACID: A Low Dimensional Characterization of Markov-Modulated and Self-Exciting Counting Processes

Authors: Mark Sinzger-D'Angelo, Heinz Koeppl

Abstract: The conditional intensity (CI) of a counting process $Y_t$ is based on the minimal knowledge $\mathcal{F}_t^Y$, i.e., on the observation of $Y_t$ alone. Prominently, the mutual information rate of a signal and its Poisson channel output is a difference functional between the CI and the intensity that has full knowledge about the input. While the CI of Markov-modulated Poisson processes evolves according to Snyder's filter, self-exciting processes, e.g., Hawkes processes, specify the CI via the history of $Y_t$. The emergence of the CI as a self-contained stochastic process prompts us to bring its statistical ensemble into focus. We investigate the asymptotic conditional intensity distribution (ACID) and emphasize its rich information content. We assume the case in which the CI is determined from a sufficient statistic that progresses as a Markov process. We present a simulation-free method to compute the ACID when the dimension of the sufficient statistic is low. The method is made possible by introducing a backward recurrence time parametrization, which has the advantage to align all probability inflow in a boundary condition for the master equation. Case studies illustrate the usage of ACID for three primary examples: 1) the Poisson channels with binary Markovian input (as an example of a Markov-modulated Poisson process), 2) the standard Hawkes process with exponential kernel (as an example of a self-exciting counting process) and 3) the Gamma filter (as an example of an approximate filter to a Markov-modulated Poisson process).

Submitted 14 May, 2022; originally announced May 2022.

arXiv:2203.16223 [pdf, other]

doi 10.1063/5.0093758

Hypergraphon Mean Field Games

Authors: Kai Cui, Wasiur R. KhudaBukhsh, Heinz Koeppl

Abstract: We propose an approach to modelling large-scale multi-agent dynamical systems allowing interactions among more than just pairs of agents using the theory of mean field games and the notion of hypergraphons, which are obtained as limits of large hypergraphs. To the best of our knowledge, ours is the first work on mean field games on hypergraphs. Together with an extension to a multi-layer setup, we obtain limiting descriptions for large systems of non-linear, weakly-interacting dynamical agents. On the theoretical side, we prove the well-foundedness of the resulting hypergraphon mean field game, showing both existence and approximate Nash properties. On the applied side, we extend numerical and learning algorithms to compute the hypergraphon mean field equilibria. To verify our approach empirically, we consider a social rumor spreading model, where we give agents intrinsic motivation to spread rumors to unaware agents, and an epidemics control problem.

Submitted 27 October, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

Comments: The following article has been accepted by Chaos

arXiv:2203.09590 [pdf, other]

ECOLA: Enhanced Temporal Knowledge Embeddings with Contextualized Language Representations

Authors: Zhen Han, Ruotong Liao, Jindong Gu, Yao Zhang, Zifeng Ding, Yujia Gu, Heinz Köppl, Hinrich Schütze, Volker Tresp

Abstract: Since conventional knowledge embedding models cannot take full advantage of the abundant textual information, there have been extensive research efforts in enhancing knowledge embedding using texts. However, existing enhancement approaches cannot apply to temporal knowledge graphs (tKGs), which contain time-dependent event knowledge with complex temporal dynamics. Specifically, existing enhancement approaches often assume knowledge embedding is time-independent. In contrast, the entity embedding in tKG models usually evolves, which poses the challenge of aligning temporally relevant texts with entities. To this end, we propose to study enhancing temporal knowledge embedding with textual data in this paper. As an approach to this task, we propose Enhanced Temporal Knowledge Embeddings with Contextualized Language Representations (ECOLA), which takes the temporal aspect into account and injects textual information into temporal knowledge embedding. To evaluate ECOLA, we introduce three new datasets for training and evaluating ECOLA. Extensive experiments show that ECOLA significantly enhances temporal KG embedding models with up to 287% relative improvements regarding Hits@1 on the link prediction task. The code and models are publicly available on https://anonymous.4open.science/r/ECOLA.

Submitted 4 May, 2023; v1 submitted 17 March, 2022; originally announced March 2022.

Comments: accepted to Findings of the ACL 2023

arXiv:2202.10352 [pdf, other]

Optimal Decision Making in Active Queue Management

Authors: Sounak Kar, Bastian Alt, Heinz Koeppl, Amr Rizk

Abstract: Active Queue Management (AQM) aims to prevent bufferbloat and serial drops in router and switch FIFO packet buffers that usually employ drop-tail queueing. AQM describes methods to send proactive feedback to TCP flow sources to regulate their rate using selective packet drops or markings. Traditionally, AQM policies relied on heuristics to approximately provide Quality of Service (QoS) such as a target delay for a given flow. These heuristics are usually based on simple network and TCP control models together with the monitored buffer filling. A primary drawback of these heuristics is that their way of accounting flow characteristics into the feedback mechanism and the corresponding effect on the state of congestion are not well understood. In this work, we show that taking a probabilistic model for the flow rates and the dequeueing pattern, a Semi-Markov Decision Process (SMDP) can be formulated to obtain an optimal packet-dropping policy. This policy-based AQM, named PAQMAN, takes into account a steady-state model of TCP and a target delay for the flows. Additionally, we present an inference algorithm that builds on TCP congestion control in order to calibrate the model parameters governing underlying network conditions. Using simulation, we show that the prescribed AQM yields comparable throughput to state-of-the-art AQM algorithms while reducing delays significantly.

Submitted 22 April, 2023; v1 submitted 21 February, 2022; originally announced February 2022.

arXiv:2202.00919 [pdf, other]

Dynamic Time Slot Allocation Algorithm for Quadcopter Swarms

Authors: Sharif Azem, Anam Tahir, Heinz Koeppl

Abstract: A swarm of quadcopters can perform cooperative tasks, such as monitoring of a large area, more efficiently than a single one. However, to be able to successfully work together, the quadcopters must be aware of the position of the other swarm members, especially to avoid collisions. A quadcopter can share its own position by transmitting it via radio waves and in order to allow multiple quadcopters to communicate effectively, a decentralized channel access protocol is essential. We propose a new dynamic channel access protocol, called Dynamic time slot allocation (DTSA), where the quadcopters share the total channel access time in a non-periodic and decentralized manner. Quadcopters with higher communication demands occupy more time slots than less active ones. Our dynamic approach allows the agents to adapt to changing swarm situations and therefore to act efficiently, as compared to the state-of-the-art periodic channel access protocol, time division multiple access (TDMA). Along with simulations, we also do experiments using real Crazyflie quadcopters to show the improved performance of DTSA as compared to TDMA.

Submitted 2 February, 2022; originally announced February 2022.

Comments: Accepted in Robocom 2022 in conjunction with IEEE CCNC 2022

arXiv:2112.01280 [pdf, other]

Learning Graphon Mean Field Games and Approximate Nash Equilibria

Authors: Kai Cui, Heinz Koeppl

Abstract: Recent advances at the intersection of dense large graph limits and mean field games have begun to enable the scalable analysis of a broad class of dynamical sequential games with large numbers of agents. So far, results have been largely limited to graphon mean field systems with continuous-time diffusive or jump dynamics, typically without control and with little focus on computational methods. We propose a novel discrete-time formulation for graphon mean field games as the limit of non-linear dense graph Markov games with weak interaction. On the theoretical side, we give extensive and rigorous existence and approximation properties of the graphon mean field solution in sufficiently large systems. On the practical side, we provide general learning schemes for graphon mean field equilibria by either introducing agent equivalence classes or reformulating the graphon mean field system as a classical mean field system. By repeatedly finding a regularized optimal control solution and its generated mean field, we successfully obtain plausible approximate Nash equilibria in otherwise infeasible large dense graph games with many agents. Empirically, we are able to demonstrate on a number of examples that the finite-agent behavior comes increasingly close to the mean field behavior for our computed equilibria as the graph or system size grows, verifying our theory. More generally, we successfully apply policy gradient reinforcement learning in conjunction with sequential Monte Carlo methods.

Submitted 18 February, 2022; v1 submitted 29 November, 2021; originally announced December 2021.

Comments: Accepted to the Tenth International Conference on Learning Representations (ICLR); Fixed some typos

arXiv:2110.10640 [pdf, other]

OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data

Authors: Christoph Reich, Tim Prangemeier, Özdemir Cetin, Heinz Koeppl

Abstract: Convolutional neural networks (CNNs) are the current state-of-the-art meta-algorithm for volumetric segmentation of medical data, for example, to localize COVID-19 infected tissue on computer tomography scans or the detection of tumour volumes in magnetic resonance imaging. A key limitation of 3D CNNs on voxelised data is that the memory consumption grows cubically with the training data resolution. Occupancy networks (O-Nets) are an alternative for which the data is represented continuously in a function space and 3D shapes are learned as a continuous decision boundary. While O-Nets are significantly more memory efficient than 3D CNNs, they are limited to simple shapes, are relatively slow at inference, and have not yet been adapted for 3D semantic segmentation of medical data. Here, we propose Occupancy Networks for Semantic Segmentation (OSS-Nets) to accurately and memory-efficiently segment 3D medical data. We build upon the original O-Net with modifications for increased expressiveness leading to improved segmentation performance comparable to 3D CNNs, as well as modifications for faster inference. We leverage local observations to represent complex shapes and prior encoder predictions to expedite inference. We showcase OSS-Net's performance on 3D brain tumour and liver segmentation against a function space baseline (O-Net), a performance baseline (3D residual U-Net), and an efficiency baseline (2D residual U-Net). OSS-Net yields segmentation results similar to the performance baseline and superior to the function space and efficiency baselines. In terms of memory efficiency, OSS-Net consumes comparable amounts of memory as the function space baseline, somewhat more memory than the efficiency baseline and significantly less than the performance baseline. As such, OSS-Net enables memory-efficient and accurate 3D semantic segmentation that can scale to high resolutions.

Submitted 20 October, 2021; originally announced October 2021.

Comments: BMVC 2021 (accepted), https://github.com/ChristophReich1996/OSS-Net (code)

arXiv:2109.14492 [pdf, other]

Variational Inference for Continuous-Time Switching Dynamical Systems

Authors: Lukas Köhs, Bastian Alt, Heinz Koeppl

Abstract: Switching dynamical systems provide a powerful, interpretable modeling framework for inference in time-series data in, e.g., the natural sciences or engineering applications. Since many areas, such as biology or discrete-event systems, are naturally described in continuous time, we present a model based on a Markov jump process modulating a subordinated diffusion process. We provide the exact evolution equations for the prior and posterior marginal densities, the direct solutions of which are however computationally intractable. Therefore, we develop a new continuous-time variational inference algorithm, combining a Gaussian process approximation on the diffusion level with posterior inference for Markov jump processes. By minimizing the path-wise Kullback-Leibler divergence we obtain (i) Bayesian latent state estimates for arbitrary points on the real axis and (ii) point estimates of unknown system parameters, utilizing variational expectation maximization. We extensively evaluate our algorithm under the model assumption and for real-world examples.

Submitted 29 September, 2021; originally announced September 2021.

Comments: 34 pages, 6 figures, to be published in NeurIPS 2021

arXiv:2109.08548 [pdf, other]

Load Balancing in Compute Clusters with Delayed Feedback

Authors: Anam Tahir, Bastian Alt, Amr Rizk, Heinz Koeppl

Abstract: Load balancing arises as a fundamental problem, underlying the dimensioning and operation of many computing and communication systems, such as job routing in data center clusters, multipath communication, Big Data and queueing systems. In essence, the decision-making agent maps each arriving job to one of the possibly heterogeneous servers while aiming at an optimization goal such as load balancing, low average delay or low loss rate. One main difficulty in finding optimal load balancing policies here is that the agent only partially observes the impact of its decisions, e.g., through the delayed acknowledgements of the served jobs. In this paper, we provide a partially observable (PO) model that captures the load balancing decisions in parallel buffered systems under limited information of delayed acknowledgements. We present a simulation model for this PO system to find a load balancing policy in real-time using a scalable Monte Carlo tree search algorithm. We numerically show that the resulting policy outperforms other limited information load balancing strategies such as variants of Join-the-Most-Observations and has comparable performance to full information strategies like Join-the-Shortest-Queue, Join-the-Shortest-Queue(d) and Shortest-Expected-Delay. Finally, we show that our approach can optimise the real-time parallel processing by using network data provided by Kaggle.

Submitted 11 October, 2022; v1 submitted 17 September, 2021; originally announced September 2021.

Comments: Accepted at IEEE Transactions on Computers 2022

arXiv:2106.08285 [pdf, other]

doi 10.1007/978-3-030-87237-3_46

Multi-StyleGAN: Towards Image-Based Simulation of Time-Lapse Live-Cell Microscopy

Authors: Christoph Reich, Tim Prangemeier, Christian Wildner, Heinz Koeppl

Abstract: Time-lapse fluorescent microscopy (TLFM) combined with predictive mathematical modelling is a powerful tool to study the inherently dynamic processes of life on the single-cell level. Such experiments are costly, complex and labour intensive. A complementary approach, and a step towards in silico experimentation, is to synthesise the imagery itself. Here, we propose Multi-StyleGAN as a descriptive approach to simulate time-lapse fluorescence microscopy imagery of living cells, based on a past experiment. This novel generative adversarial network synthesises a multi-domain sequence of consecutive timesteps. We showcase Multi-StyleGAN on imagery of multiple live yeast cells in microstructured environments and train on a dataset recorded in our laboratory. The simulation captures underlying biophysical factors and time dependencies, such as cell morphology, growth, physical interactions, as well as the intensity of a fluorescent reporter protein. An immediate application is to generate additional training and validation data for feature extraction algorithms or to aid and expedite development of advanced experimental techniques such as online monitoring or control of cells. Code and dataset are available at https://git.rwth-aachen.de/bcs/projects/tp/multi-stylegan.

Submitted 24 September, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

Comments: revised -- accepted to MICCAI 2021 (doi.org/10.1007/978-3-030-87237-3_46) (Tim Prangemeier and Christoph Reich --- both authors contributed equally)

arXiv:2105.14742 [pdf, other]

doi 10.1088/1742-5468/ac3908

Active Learning of Continuous-time Bayesian Networks through Interventions

Authors: Dominik Linzner, Heinz Koeppl

Abstract: We consider the problem of learning structures and parameters of Continuous-time Bayesian Networks (CTBNs) from time-course data under minimal experimental resources. In practice, the cost of generating experimental data poses a bottleneck, especially in the natural and social sciences. A popular approach to overcome this is Bayesian optimal experimental design (BOED). However, BOED becomes infeasible in high-dimensional settings, as it involves integration over all possible experimental outcomes. We propose a novel criterion for experimental design based on a variational approximation of the expected information gain. We show that for CTBNs, a semi-analytical expression for this criterion can be calculated for structure and parameter learning. By doing so, we can replace sampling over experimental outcomes by solving the CTBN master equation, for which scalable approximations exist. This alleviates the computational burden of sampling possible experimental outcomes in high dimensions. We employ this framework in order to recommend interventional sequences. In this context, we extend the CTBN model to conditional CTBNs in order to incorporate interventions. We demonstrate the performance of our criterion on synthetic and real-world data.

Submitted 11 June, 2021; v1 submitted 31 May, 2021; originally announced May 2021.

Comments: Accepted at ICML2021

arXiv:2104.14912 [pdf, other]

Nearest-Neighbor-based Collision Avoidance for Quadrotors via Reinforcement Learning

Authors: Ramzi Ourari, Kai Cui, Ahmed Elshamanhory, Heinz Koeppl

Abstract: Collision avoidance algorithms are of central interest to many drone applications. In particular, decentralized approaches may be the key to enabling robust drone swarm solutions in cases where centralized communication becomes computationally prohibitive. In this work, we draw biological inspiration from flocks of starlings (Sturnus vulgaris) and apply the insight to end-to-end learned decentralized collision avoidance. More specifically, we propose a new, scalable observation model following a biomimetic nearest-neighbor information constraint that leads to fast learning and good collision avoidance behavior. By proposing a general reinforcement learning approach, we obtain an end-to-end learning-based approach to integrating collision avoidance with arbitrary tasks such as package collection and formation change. To validate the generality of this approach, we successfully apply our methodology through motion models of medium complexity, modeling momentum and nonetheless allowing direct application to real world quadrotors in conjunction with a standard PID controller. In contrast to prior works, we find that in our sufficiently rich motion model, nearest-neighbor information is indeed enough to learn effective collision avoidance behavior. Our learned policies are tested in simulation and subsequently transferred to real-world drones to validate their real-world applicability.

Submitted 18 February, 2022; v1 submitted 30 April, 2021; originally announced April 2021.

Comments: Accepted to the 39th IEEE Conference on Robotics and Automation (ICRA). Fixed some typos

arXiv:2104.14900 [pdf, other]

Discrete-Time Mean Field Control with Environment States

Authors: Kai Cui, Anam Tahir, Mark Sinzger, Heinz Koeppl

Abstract: Multi-agent reinforcement learning methods have shown remarkable potential in solving complex multi-agent problems but mostly lack theoretical guarantees. Recently, mean field control and mean field games have been established as a tractable solution for large-scale multi-agent problems with many agents. In this work, driven by a motivating scheduling problem, we consider a discrete-time mean field control model with common environment states. We rigorously establish approximate optimality as the number of agents grows in the finite agent case and find that a dynamic programming principle holds, resulting in the existence of an optimal stationary policy. As exact solutions are difficult in general due to the resulting continuous action space of the limiting mean field Markov decision process, we apply established deep reinforcement learning methods to solve the associated mean field control problem. The performance of the learned mean field control policy is compared to typical multi-agent reinforcement learning approaches and is found to converge to the mean field performance for sufficiently many agents, verifying the obtained theoretical results and reaching competitive solutions.

Submitted 17 December, 2021; v1 submitted 30 April, 2021; originally announced April 2021.

Comments: Accepted to the 60th IEEE Conference on Decision and Control (CDC 2021); Added reference and fixed typo

arXiv:2103.00988 [pdf, other]

Moment-Based Variational Inference for Stochastic Differential Equations

Authors: Christian Wildner, Heinz Koeppl

Abstract: Existing deterministic variational inference approaches for diffusion processes use simple proposals and target the marginal density of the posterior. We construct the variational process as a controlled version of the prior process and approximate the posterior by a set of moment functions. In combination with moment closure, the smoothing problem is reduced to a deterministic optimal control problem. Exploiting the path-wise Fisher information, we propose an optimization procedure that corresponds to a natural gradient descent in the variational parameters. Our approach allows for richer variational approximations that extend to state-dependent diffusion terms. The classical Gaussian process approximation is recovered as a special case.

Submitted 1 March, 2021; originally announced March 2021.

Comments: Appearing in Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS) 2021, San Diego, California, USA. PMLR: Volume 130

arXiv:2102.01585 [pdf, other]

Approximately Solving Mean Field Games via Entropy-Regularized Deep Reinforcement Learning

Authors: Kai Cui, Heinz Koeppl

Abstract: The recent mean field game (MFG) formalism facilitates otherwise intractable computation of approximate Nash equilibria in many-agent settings. In this paper, we consider discrete-time finite MFGs subject to finite-horizon objectives. We show that all discrete-time finite MFGs with non-constant fixed point operators fail to be contractive as typically assumed in existing MFG literature, barring convergence via fixed point iteration. Instead, we incorporate entropy-regularization and Boltzmann policies into the fixed point iteration. As a result, we obtain provable convergence to approximate fixed points where existing methods fail, and reach the original goal of approximate Nash equilibria. All proposed methods are evaluated with respect to their exploitability, on both instructive examples with tractable exact solutions and high-dimensional problems where exact methods become intractable. In high-dimensional scenarios, we apply established deep reinforcement learning methods and empirically combine fictitious play with our approximations.

Submitted 8 July, 2022; v1 submitted 2 February, 2021; originally announced February 2021.

Comments: Accepted to the 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021); v2: Fixed B.6 and some typos

arXiv:2011.09763 [pdf, other]

doi 10.1109/BIBM49941.2020.9313305

Attention-Based Transformers for Instance Segmentation of Cells in Microstructures

Authors: Tim Prangemeier, Christoph Reich, Heinz Koeppl

Abstract: Detecting and segmenting object instances is a common task in biomedical applications. Examples range from detecting lesions on functional magnetic resonance images, to the detection of tumours in histopathological images and extracting quantitative single-cell information from microscopy imagery, where cell segmentation is a major bottleneck. Attention-based transformers are state-of-the-art in a range of deep learning fields. They have recently been proposed for segmentation tasks where they are beginning to outperform other methods. We present a novel attention-based cell detection transformer (Cell-DETR) for direct end-to-end instance segmentation. While the segmentation performance is on par with a state-of-the-art instance segmentation method, Cell-DETR is simpler and faster. We showcase the method's contribution in the typical use case of segmenting yeast in microstructured environments, commonly employed in systems or synthetic biology. For the specific use case, the proposed method surpasses the state-of-the-art tools for semantic segmentation and additionally predicts the individual object instances. The fast and accurate instance segmentation performance increases the experimental information yield for a posteriori data processing and makes online monitoring of experiments and closed-loop optimal experimental design feasible.

Submitted 20 November, 2020; v1 submitted 19 November, 2020; originally announced November 2020.

Comments: IEEE BIBM 2020 (accepted)

arXiv:2011.08062 [pdf, other]

doi 10.1109/CIBCB48159.2020.9277693

Multiclass Yeast Segmentation in Microstructured Environments with Deep Learning

Authors: Tim Prangemeier, Christian Wildner, André O. Françani, Christoph Reich, Heinz Koeppl

Abstract: Cell segmentation is a major bottleneck in extracting quantitative single-cell information from microscopy data. The challenge is exacerbated in the setting of microstructured environments. While deep learning approaches have proven useful for general cell segmentation tasks, existing segmentation tools for the yeast-microstructure setting rely on traditional machine learning approaches. Here we present convolutional neural networks trained for multiclass segmentation of individual yeast cells and for discerning these from cell-similar microstructures. We give an overview of the datasets recorded for training, validating and testing the networks, as well as a typical use-case. We showcase the method's contribution to segmenting yeast in microstructured environments with a typical synthetic biology application in mind. The models achieve robust segmentation results, outperforming the previous state-of-the-art in both accuracy and speed. The combination of fast and accurate segmentation is not only beneficial for a posteriori data processing, it also makes online monitoring of thousands of trapped cells or closed-loop optimal experimental design feasible from an image processing perspective.

Submitted 19 November, 2020; v1 submitted 16 November, 2020; originally announced November 2020.

Comments: IEEE CIBCB 2020 (accepted)

Showing 1–50 of 74 results for author: Koeppl, H