Search | arXiv e-print repository

Multi-Target Pursuit by a Decentralized Heterogeneous UAV Swarm using Deep Multi-Agent Reinforcement Learning

Authors: Maryam Kouzeghar, Youngbin Song, Malika Meghjani, Roland Bouffanais

Abstract: Multi-agent pursuit-evasion tasks involving intelligent targets are notoriously challenging coordination problems. In this paper, we investigate new ways to learn such coordinated behaviors of unmanned aerial vehicles (UAVs) aimed at keeping track of multiple evasive targets. Within a Multi-Agent Reinforcement Learning (MARL) framework, we specifically propose a variant of the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) method. Our approach addresses multi-target pursuit-evasion scenarios within non-stationary and unknown environments with random obstacles. In addition, given the critical role played by collective exploration in terms of detecting possible targets, we implement heterogeneous roles for the pursuers for enhanced exploratory actions balanced by exploitation (i.e. tracking) of previously identified targets. Our proposed role-based MADDPG algorithm is not only able to track multiple targets, but is also able to explore for possible targets by means of the proposed Voronoi-based rewarding policy. We implemented, tested and validated our approach in a simulation environment prior to deploying a real-world multi-robot system comprising Crazyflie drones. Our results demonstrate that a multi-agent pursuit team has the ability to learn highly efficient coordinated control policies in terms of target tracking and exploration even when confronted with multiple fast evasive targets in complex environments.

Submitted 3 March, 2023; originally announced March 2023.

Comments: Accepted for ICRA 2023

arXiv:2301.10703 [pdf, ps, other]

A Sequential Deep Learning Algorithm for Sampled Mixed-integer Optimisation Problems

Authors: Mohammadreza Chamanbaz, Roland Bouffanais

Abstract: Mixed-integer optimisation problems can be computationally challenging. Here, we introduce and analyse two efficient algorithms with a specific sequential design that are aimed at dealing with sampled problems within this class. At each iteration step of both algorithms, we first test the feasibility of a given test solution for each and every constraint associated with the sampled optimisation at hand, while also identifying those constraints that are violated. Subsequently, an optimisation problem is constructed with a constraint set consisting of the current basis (namely, the smallest set of constraints that fully specifies the current test solution) as well as constraints related to a limited number of the identified violating samples. We show that both algorithms exhibit finite-time convergence towards the optimal solution. Algorithm 2 features a neural network classifier that notably improves the computational performance compared to Algorithm 1. We quantitatively establish these algorithms' efficacy through three numerical tests: robust optimal power flow, robust unit commitment, and robust random mixed-integer linear program.

Submitted 5 March, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

arXiv:2301.10692 [pdf, other]

Effect of Swarm Density on Collective Tracking Performance

Authors: Hian Lee Kwa, Julien Philippot, Roland Bouffanais

Abstract: How does the size of a swarm affect its collective action? Despite being arguably a key parameter, no systematic and satisfactory guiding principles exist to select the number of units required for a given task and environment. Even when limited by practical considerations, system designers should endeavor to identify what a reasonable swarm size should be. Here, we show that this fundamental question is closely linked to that of selecting an appropriate swarm density. Our analysis of the influence of density on the collective performance of a target tracking task reveals different 'phases' corresponding to markedly distinct group dynamics. We identify a 'transition' phase, in which a complex emergent collective response arises. Interestingly, the collective dynamics within this transition phase exhibit a clear trade-off between exploratory actions and exploitative ones. We show that at any density, the exploration-exploitation balance can be adjusted to maximize the system's performance through various means, such as by changing the level of connectivity between agents. While the density is the primary factor to be considered, it should not be the sole one to be accounted for when sizing the system. Due to the inherent finite-size effects present in physical systems, we establish that the number of constituents primarily affects system-level properties such as exploitation in the transition phase. These results illustrate that instead of learning and optimizing a swarm's behavior for a specific set of task parameters, further work should instead concentrate on learning to be adaptive, thereby endowing the swarm with the highly desirable feature of being able to operate effectively over a wide range of circumstances.

Submitted 25 January, 2023; originally announced January 2023.

arXiv:2207.13523 [pdf, other]

Adapting the Exploration-Exploitation Balance in Heterogeneous Swarms: Tracking Evasive Targets

Authors: Hian Lee Kwa, Victor Babineau, Julien Philippot, Roland Bouffanais

Abstract: There has been growing interest in the use of multi-robot systems in various tasks and scenarios. The main attractiveness of such systems is their flexibility, robustness, and scalability. An often overlooked yet promising feature is system modularity, which offers the possibility to harness agent specialization, while also enabling system-level upgrades. However, altering the agents' capacities can change the exploration-exploitation balance required to maximize the system's performance. Here, we study the effect of a swarm's heterogeneity on its exploration-exploitation balance while tracking multiple fast-moving evasive targets under the Cooperative Multi-Robot Observation of Multiple Moving Targets framework. To this end, we use a decentralized search and tracking strategy with adjustable levels of exploration and exploitation. By indirectly tuning the balance, we first confirm the presence of an optimal balance between these two key competing actions. Next, by substituting slower moving agents with faster ones, we show that the system exhibits a performance improvement without any modifications to the original strategy. In addition, owing to the additional amount of exploitation carried out by the faster agents, we demonstrate that a heterogeneous system's performance can be further improved by reducing an agent's level of connectivity, to favor the conduct of exploratory actions. Furthermore, in studying the influence of the density of swarming agents, we show that the addition of faster agents can counterbalance a reduction in the overall number of agents while maintaining the level of tracking performance. Finally, we explore the challenges of using differentiated strategies to take advantage of the heterogeneous nature of the swarm.

Submitted 27 July, 2022; originally announced July 2022.

Comments: To appear in Artificial Life

arXiv:2108.07122 [pdf, other]

doi 10.1162/isal_a_00376

Tracking Multiple Fast Targets With Swarms: Interplay Between Social Interaction and Agent Memory

Authors: Hian Lee Kwa, Jabez Leong Kit, Roland Bouffanais

Abstract: The task of searching for and tracking multiple targets is a challenging one. However, most works in this area do not consider evasive targets that move faster than the agents comprising the multi-robot system. This is due to the assumption that the movement patterns of such targets, combined with their excessive speed, would make the task nearly impossible to accomplish. In this work, we show that this is not the case and we propose a decentralized search and tracking strategy in which the level of exploration and exploitation carried out by the swarm is adjustable. By tuning a swarm's exploration and exploitation dynamics, we demonstrate that there exists an optimal balance between the level of exploration and exploitation performed. This optimum maximizes the swarm's tracking performance and changes depending on the number of targets and the targets' movement profiles. We also show that the use of agent-based memory is critical in enabling the tracking of an evasive target. The obtained simulation results are validated through experimental tests with a decentralized swarm of six robots tracking a virtual fast-moving target.

Submitted 16 August, 2021; originally announced August 2021.

Journal ref: Proceedings of the ALIFE 2021: The 2021 Conference on Artificial Life. ALIFE 2021: The 2021 Conference on Artificial Life. Online. (pp. 62). ASME

arXiv:2012.11641 [pdf, other]

doi 10.1109/IEEECONF38699.2020.9389128

Multi-Agent Reinforcement Learning for Dynamic Ocean Monitoring by a Swarm of Buoys

Authors: Maryam Kouzehgar, Malika Meghjani, Roland Bouffanais

Abstract: The autonomous marine environmental monitoring problem traditionally encompasses an area coverage problem which can only be effectively carried out by a multi-robot system. In this paper, we focus on robotic swarms that are typically operated and controlled by means of simple swarming behaviors obtained from a subtle, yet ad hoc combination of bio-inspired strategies. We propose a novel and structured approach for area coverage using multi-agent reinforcement learning (MARL) which effectively deals with the non-stationarity of environmental features. Specifically, we propose two dynamic area coverage approaches: (1) swarm-based MARL, and (2) coverage-range-based MARL. The former is trained using the multi-agent deep deterministic policy gradient (MADDPG) approach, whereas a modified version of MADDPG is introduced for the latter with a reward function that intrinsically leads to a collective behavior. Both methods are tested and validated with regions of different geometric shapes but equal surface area (square vs. rectangle), yielding acceptable area coverage and benefiting from the structured learning in non-stationary environments. Both approaches are advantageous compared to a naïve swarming method. However, coverage-range-based MARL outperforms the swarm-based MARL with stronger convergence features in learning criteria and higher spreading of agents for area coverage.

Submitted 21 December, 2020; originally announced December 2020.

Comments: Accepted for Publication at IEEE/MTS OCEANS 2020

Journal ref: in Proceedings of Global Oceans 2020: Singapore-US Gulf Coast (pp. 1-8). IEEE

arXiv:2008.00696 [pdf, other]

doi 10.1109/IEEECONF38699.2020.9389145

Heterogeneous Swarms for Maritime Dynamic Target Search and Tracking

Authors: Hian Lee Kwa, Grgur Tokić, Roland Bouffanais, Dick K. P. Yue

Abstract: Current strategies employed for maritime target search and tracking are primarily based on the use of agents following a predetermined path to perform a systematic sweep of a search area. Recently, dynamic Particle Swarm Optimization (PSO) algorithms have been used together with swarming multi-robot systems (MRS), giving search and tracking solutions the added properties of robustness, scalability, and flexibility. Swarming MRS also give the end-user the opportunity to incrementally upgrade the robotic system, inevitably leading to the use of heterogeneous swarming MRS. However, such systems have not been well studied and incorporating upgraded agents into a swarm may result in degraded mission performances. In this paper, we propose a PSO-based strategy using a topological k-nearest neighbor graph with tunable exploration and exploitation dynamics with an adaptive repulsion parameter. This strategy is implemented within a simulated swarm of 50 agents with varying proportions of fast agents tracking a target represented by a fictitious binary function. Through these simulations, we are able to demonstrate an increase in the swarm's collective response level and target tracking performance by substituting in a proportion of fast buoys.

Submitted 3 August, 2020; originally announced August 2020.

Comments: Accepted for IEEE/MTS OCEANS 2020, Singapore

Journal ref: IEEE/MTS Global Oceans 2020: Singapore - U.S. Gulf Coast, October 5-30, 2020, online, pp. 1-8

arXiv:2005.05063 [pdf, other]

doi 10.1038/s41598-020-75697-z

Spatial super-spreaders and super-susceptibles in human movement networks

Authors: Wei Chien Benny Chin, Roland Bouffanais

Abstract: As lockdowns and stay-at-home orders start to be lifted across the globe, governments are struggling to establish effective and practical guidelines to reopen their economies. In dense urban environments with people returning to work and public transportation resuming full capacity, enforcing strict social distancing measures will be extremely challenging, if not practically impossible. Governments are thus paying close attention to particular locations that may become the next cluster of disease spreading. Indeed, certain places, like some people, can be "super-spreaders." Is a bustling train station in a central business district more or less susceptible and vulnerable as compared to teeming bus interchanges in the suburbs? Here, we propose a quantitative and systematic framework to identify spatial super-spreaders and the novel concept of super-susceptibles, i.e. respectively, places most likely to contribute to disease spread or to people contracting it. Our proposed data-analytic framework is based on the daily-aggregated ridership data of public transport in Singapore. By constructing the directed and weighted human movement networks and integrating human flow intensity with two neighborhood diversity metrics, we are able to pinpoint super-spreader and super-susceptible locations. Our results reveal that most super-spreaders are also super-susceptibles and that counterintuitively, busy peripheral bus interchanges are riskier places than crowded central train stations. Our analysis is based on data from Singapore, but can be readily adapted and extended for any other major urban center. It therefore serves as a useful framework for devising targeted and cost-effective preventive measures for urban planning and epidemiological preparedness.

Submitted 11 May, 2020; originally announced May 2020.

Comments: 19 pages, 10 figures

Journal ref: Scientific reports 10 (2020) 18642

arXiv:1910.01303 [pdf, ps, other]

doi 10.1109/MTS.2019.2948437

From Senseless Swarms to Smart Mobs: Tuning Networks for Prosocial Behaviour

Authors: Sun Sun Lim, Roland Bouffanais

Abstract: Social media have been seen to accelerate the spread of negative content such as disinformation and hate speech, often unleashing reckless herd mentality within networks, further aggravated by malicious entities using bots for amplification. So far, the response to this emerging global crisis has centred around social media platform companies making reactive moves that appear to have greater symbolic value than practical utility. These include taking down patently objectionable content or manually deactivating the accounts of bad actors, while leaving vast troves of negative content to circulate and perpetuate within social networks. Governments worldwide have thus sought to intervene using regulatory tools, with countries such as France, Germany and Singapore introducing laws to compel technology companies to take down or correct erroneous and harmful content. However, the relentless pace of technological progress enfeebles regulatory measures that seem fated for obsolescence.

Submitted 3 October, 2019; v1 submitted 3 October, 2019; originally announced October 2019.

Comments: To appear in IEEE Technology and Society Magazine

Journal ref: IEEE Technology and Society Magazine, (38):4, pp. 17-19, 2019

arXiv:1908.05822 [pdf, other]

doi 10.1109/MRS.2019.8901058

Decentralized Multi-Floor Exploration by a Swarm of Miniature Robots Teaming with Wall-Climbing Units

Authors: Jabez L. Kit, Audelia G. Dharmawan, David Mateo, Shaohui Foong, Gim Song Soh, Roland Bouffanais, Kristin L. Wood

Abstract: In this paper, we consider the problem of collectively exploring unknown and dynamic environments with a decentralized heterogeneous multi-robot system consisting of multiple units of two variants of a miniature robot. The first variant, a wheeled ground unit, is at the core of a swarm of floor-mapping robots exhibiting scalability, robustness and flexibility. These properties are systematically tested and quantitatively evaluated in unstructured and dynamic environments, in the absence of any supporting infrastructure. The results of repeated sets of experiments show a consistent performance for all three features, as well as the possibility to inject units into the system while it is operating. Several units of the second variant, a wheg-based wall-climbing unit, are used to support the swarm of mapping robots when simultaneously exploring multiple floors by expanding the distributed communication channel necessary for the coordinated behavior among platforms. Although the occupancy-grid maps obtained can be large, they are fully distributed. Not a single robotic unit possesses the overall map, which is not required by our cooperative path-planning strategy.

Submitted 15 August, 2019; originally announced August 2019.

Comments: Accepted for publication in IEEE-MRS 2019, Rutgers University, New Brunswick (NJ), USA

Journal ref: MRS 2019, IEEE International Symposium on Multi-Robot and Multi-Agent Systems, August 22-23, 2019, New Brunswick, NJ, pp. 195-201

arXiv:1907.04691 [pdf, other]

doi 10.1016/j.ifacol.2017.08.763

Randomized Constraints Consensus for Distributed Robust Mixed-Integer Programming

Authors: Mohammadreza Chamanbaz, Giuseppe Notarstefano, Francesco Sasso, Roland Bouffanais

Abstract: In this paper, we consider a network of processors aiming at cooperatively solving mixed-integer convex programs subject to uncertainty. Each node only knows a common cost function and its local uncertain constraint set. We propose a randomized, distributed algorithm working under asynchronous, unreliable and directed communication. The algorithm is based on a local computation and communication paradigm. At each communication round, nodes perform two updates: (i) a verification in which they check, in a randomized fashion, the robust feasibility of a candidate optimal point, and (ii) an optimization step in which they exchange their candidate basis (the minimal set of constraints defining a solution) with neighbors and locally solve an optimization problem. As the main result, we show that processors can stop the algorithm after a finite number of communication rounds (either because verification has been successful for a sufficient number of rounds or because a given threshold has been reached), so that candidate optimal solutions are consensual. The common solution is proven to be, with high confidence, feasible and hence optimal for the entire set of uncertainty except a subset having an arbitrarily small probability measure. We show the effectiveness of the proposed distributed algorithm using two examples: a random, uncertain mixed-integer linear program and a distributed localization in wireless sensor networks. The distributed algorithm is implemented on a multi-core platform in which the nodes communicate asynchronously.

Submitted 9 July, 2019; originally announced July 2019.

Comments: Submitted for publication. arXiv admin note: text overlap with arXiv:1706.00488

Journal ref: IFAC 2017, 20th IFAC World Congress, July 9-14, Toulouse, France, IFAC PapersOnLine (50), 4973-497, 2017

arXiv:1811.08318 [pdf, ps, other]

doi 10.1177/1059712318818568

Self-Organizing Maps for Storage and Transfer of Knowledge in Reinforcement Learning

Authors: Thommen George Karimpanal, Roland Bouffanais

Abstract: The idea of reusing or transferring information from previously learned tasks (source tasks) for the learning of new tasks (target tasks) has the potential to significantly improve the sample efficiency of a reinforcement learning agent. In this work, we describe a novel approach for reusing previously acquired knowledge by using it to guide the exploration of an agent while it learns new tasks. In order to do so, we employ a variant of the growing self-organizing map algorithm, which is trained using a measure of similarity that is defined directly in the space of the vectorized representations of the value functions. In addition to enabling transfer across tasks, the resulting map is simultaneously used to enable the efficient storage of previously acquired task knowledge in an adaptive and scalable manner. We empirically validate our approach in a simulated navigation environment, and also demonstrate its utility through simple experiments using a mobile micro-robotics platform. In addition, we demonstrate the scalability of this approach, and analytically examine its relation to the proposed network growth mechanism. Further, we briefly discuss some of the possible improvements and extensions to this approach, as well as its relevance to real world scenarios in the context of continual learning.

Submitted 18 November, 2018; originally announced November 2018.

Comments: 35 pages, 11 figures, Accepted in the journal Adaptive Behavior. arXiv admin note: substantial text overlap with arXiv:1807.07530

Journal ref: Adaptive Behavior 27 (2018) 111-126

arXiv:1810.05818 [pdf, other]

doi 10.1109/UEMCON.2018.8796753

A Decentralized Mobile Computing Network for Multi-Robot Systems Operations

Authors: Jabez Leong Kit, David Mateo, Roland Bouffanais

Abstract: Collective animal behaviors are paradigmatic examples of fully decentralized operations involving complex collective computations such as collective turns in flocks of birds or collective harvesting by ants. These systems offer a unique source of inspiration for the development of fault-tolerant and self-healing multi-robot systems capable of operating in dynamic environments. Specifically, swarm robotics emerged and is significantly growing on these premises. However, to date, most swarm robotics systems reported in the literature involve basic computational tasks: averages and other algebraic operations. In this paper, we introduce a novel Collective computing framework based on the swarming paradigm, which exhibits the key innate features of swarms: robustness, scalability and flexibility. Unlike Edge computing, the proposed Collective computing framework is truly decentralized and does not require user intervention or additional servers to sustain its operations. This Collective computing framework is applied to the complex task of collective mapping, in which multiple robots aim to cooperatively map a large area. Our results confirm the effectiveness of the cooperative strategy, its robustness to the loss of multiple units, as well as its scalability. Furthermore, the topology of the interconnecting network is found to greatly influence the performance of the collective action.

Submitted 13 October, 2018; originally announced October 2018.

Comments: Accepted for Publication in Proc. 9th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference

Journal ref: UEMCON 9 (2018) 309-314

arXiv:1808.10617 [pdf, other]

doi 10.1109/OCEANS.2018.8604642

Gradual Collective Upgrade of a Swarm of Autonomous Buoys for Dynamic Ocean Monitoring

Authors: Francesco Vallegra, David Mateo, Grgur Tokić, Roland Bouffanais, Dick K. P. Yue

Abstract: Swarms of autonomous surface vehicles equipped with environmental sensors and decentralized communications bring a new wave of attractive possibilities for the monitoring of dynamic features in oceans and other waterbodies. However, a key challenge in swarm robotics design is the efficient collective operation of heterogeneous systems. We present both theoretical analysis and field experiments on the responsiveness in dynamic area coverage of a collective of 22 autonomous buoys, where 4 units are upgraded to a new design that allows them to move 80% faster than the rest. This system is able to react on timescales on the order of a minute to changes in areas on the order of a few thousand square meters. We have observed that this partial upgrade of the system significantly increases its average responsiveness, without necessarily improving the spatial uniformity of the deployment. These experiments show that the autonomous buoy designs and the cooperative control rule described in this work provide an efficient, flexible, and scalable solution for the pervasive and persistent monitoring of water environments.

Submitted 31 August, 2018; originally announced August 2018.

Comments: Proceedings of the OCEANS 2018 conference

Journal ref: OCEANS 2018 MTS/IEEE Charleston, Charleston, S.C., 2018, p. 1-7

arXiv:1807.07530 [pdf, other]

Self-Organizing Maps as a Storage and Transfer Mechanism in Reinforcement Learning

Authors: Thommen George Karimpanal, Roland Bouffanais

Abstract: The idea of reusing information from previously learned tasks (source tasks) for the learning of new tasks (target tasks) has the potential to significantly improve the sample efficiency of reinforcement learning agents. In this work, we describe an approach to concisely store and represent learned task knowledge, and reuse it by allowing it to guide the exploration of an agent while it learns new tasks. In order to do so, we use a measure of similarity that is defined directly in the space of parameterized representations of the value functions. This similarity measure is also used as a basis for a variant of the growing self-organizing map algorithm, which is simultaneously used to enable the storage of previously acquired task knowledge in an adaptive and scalable manner. We empirically validate our approach in a simulated navigation environment and discuss possible extensions to this approach along with potential applications where it could be particularly useful.

Submitted 19 July, 2018; originally announced July 2018.

Comments: 7 pages, 7 figures, presented at ALA Workshop, FAIM, Stockholm, 2018

arXiv:1807.04631 [pdf, other]

doi 10.1126/sciadv.aau0999

Optimal Network Topology for Effective Collective Response

Authors: David Mateo, Nikolaj Horsevad, Vahid Hassani, Mohammadreza Chamanbaz, Roland Bouffanais

Abstract: Natural, social, and artificial multi-agent systems usually operate in dynamic environments, where the ability to respond to changing circumstances is a crucial feature. An effective collective response requires suitable information transfer among agents, and thus is critically dependent on the agents' interaction network. In order to investigate the influence of the network topology on collective response, we consider an archetypal model of distributed decision-making---the leader-follower linear consensus---and study the collective capacity of the system to follow a dynamic driving signal (the "leader") for a range of topologies and system sizes. The analysis reveals a nontrivial relationship between optimal topology and frequency of the driving signal. Interestingly, the response is optimal when each individual interacts with a certain number of agents which decreases monotonically with the frequency and, for large enough systems, is independent of the size of the system. This phenomenology is investigated in experiments of collective motion using a swarm of land robots. The emergent collective response to both a slow- and a fast-changing leader is measured and analyzed for a range of interaction topologies. These results have far-reaching practical implications for the design and understanding of distributed systems, since they highlight that a dynamic rewiring of the interaction network is paramount to the effective collective operations of multi-agent systems at different time-scales.

Submitted 21 December, 2018; v1 submitted 10 July, 2018; originally announced July 2018.

Journal ref: Science Advances 5 (2019) eaau0999

arXiv:1801.02874 [pdf, other]

doi 10.1140/epjds/s13688-018-0161-9

Are the different layers of a social network conveying the same information?

Authors: Ajaykumar Manivannan, W. Quin Yow, Roland Bouffanais, Alain Barrat

Abstract: Comprehensive and quantitative investigations of social theories and phenomena increasingly benefit from the vast breadth of data describing human social relations, which is now available within the realm of computational social science. Such data are, however, typically proxies for one of the many interaction layers composing social networks, which can be defined in many ways and are typically composed of communication of various types (e.g., phone calls, face-to-face communication, etc.). As a result, many studies focus on one single layer, corresponding to the data at hand. Several studies have, however, shown that these layers are not interchangeable, despite the presence of a certain level of correlations between them. Here, we investigate whether different layers of interactions among individuals lead to similar conclusions with respect to the presence of homophily patterns in a population---homophily represents one of the most widely studied phenomena in social networks. To this aim, we consider a dataset describing interactions and links of various nature in a population of Asian students with diverse nationalities, first language and gender. We study homophily patterns, as well as their temporal evolutions in each layer of the social network. To facilitate our analysis, we put forward a general method to assess whether the homophily patterns observed in one layer inform us about patterns in another layer. For instance, our study reveals that three network layers---cell phone communications, questionnaires about friendship, and trust relations---lead to similar and consistent results despite some minor discrepancies. The homophily patterns of the co-presence network layer, however, do not yield any meaningful information about other network layers.

Submitted 9 January, 2018; originally announced January 2018.

Journal ref: EPJ Data Science 7 (2018) 34

arXiv:1705.10834 [pdf, other]

doi 10.3389/fnbot.2018.00032

Experience Replay Using Transition Sequences

Authors: Thommen George Karimpanal, Roland Bouffanais

Abstract: Experience replay is one of the most commonly used approaches to improve the sample efficiency of reinforcement learning algorithms. In this work, we propose an approach to select and replay sequences of transitions in order to accelerate the learning of a reinforcement learning agent in an off-policy setting. In addition to selecting appropriate sequences, we also artificially construct transition sequences using information gathered from previous agent-environment interactions. These sequences, when replayed, allow value function information to trickle down to larger sections of the state/state-action space, thereby making the most of the agent's experience. We demonstrate our approach on modified versions of standard reinforcement learning tasks such as the mountain car and puddle world problems and empirically show that it enables better learning of value functions as compared to other forms of experience replay. Further, we briefly discuss some of the possible extensions to this work, as well as applications and situations where this approach could be particularly useful.

Submitted 12 September, 2019; v1 submitted 30 May, 2017; originally announced May 2017.

Comments: 23 pages, 6 figures

Journal ref: Frontiers in Neurorobotics 12 (2018) 32

arXiv:1705.04010 [pdf, other]

doi 10.3389/frobt.2017.00012

Swarm-Enabling Technology for Multi-Robot Systems

Authors: Mohammadreza Chamanbaz, David Mateo, Brandon M. Zoss, Grgur Tokić, Erik Wilhelm, Roland Bouffanais, Dick K. P. Yue

Abstract: Swarm robotics has experienced a rapid expansion in recent years, primarily fueled by specialized multi-robot systems developed to achieve dedicated collective actions. These specialized platforms are in general designed with swarming considerations at the front and center. Key hardware and software elements required for swarming are often deeply embedded and integrated with the particular system. However, given the noticeable increase in the number of low-cost mobile robots readily available, practitioners and hobbyists may start to consider assembling full-fledged swarms by minimally retrofitting such mobile platforms with a swarm-enabling technology. Here, we report one possible embodiment of such a technology designed to enable the assembly and the study of swarming in a range of general-purpose robotic systems. This is achieved by combining a modular and transferable software toolbox with a hardware suite composed of a collection of low-cost and off-the-shelf components. The developed technology can be ported to a relatively vast range of robotic platforms with minimal changes and high levels of scalability. This swarm-enabling technology has successfully been implemented on two distinct distributed multi-robot systems, a swarm of mobile marine buoys and a team of commercial terrestrial robots. We have tested the effectiveness of both of these distributed robotic systems in performing collective exploration and search scenarios, as well as other classical cooperative behaviors. Experimental results on different swarm behaviors are reported for the two platforms in uncontrolled environments and without any supporting infrastructure. The design of the associated software library allows for a seamless switch to other cooperative behaviors, and also offers the possibility to simulate newly designed collective behaviors prior to their implementation onto the platforms.

Submitted 11 May, 2017; originally announced May 2017.

Journal ref: Frontiers in Robotics and AI 4 (2017) 12

arXiv:0709.1024 [pdf, ps, other]

Computational performance of a parallelized high-order spectral and mortar element toolbox

Authors: Roland Bouffanais, Vincent Keller, Ralf Gruber, Michel O. Deville

Abstract: In this paper, a comprehensive performance review of an MPI-based high-order spectral and mortar element method C++ toolbox is presented. The focus is put on the performance evaluation of several aspects with a particular emphasis on the parallel efficiency. The performance evaluation is analyzed and compared to predictions given by a heuristic model, the so-called Gamma model. A tailor-made CFD computation benchmark case is introduced and used to carry out this review, stressing the particular interest for commodity clusters. Conclusions are drawn from this extensive series of analyses and modeling leading to specific recommendations concerning such toolbox development and parallel implementation.

Submitted 7 September, 2007; originally announced September 2007.

Comments: Preprint submitted for publication to Parallel Computing

arXiv:0709.0355 [pdf, ps, other]

doi 10.1016/j.apnum.2007.04.009

Solution of moving-boundary problems by the spectral element method

Authors: Nicolas Bodard, Roland Bouffanais, Michel O. Deville

Abstract: This paper describes a novel numerical model aiming at solving moving-boundary problems such as free-surface flows or fluid-structure interaction. This model uses a moving-grid technique to solve the Navier--Stokes equations expressed in the arbitrary Lagrangian--Eulerian kinematics. The discretization in space is based on the spectral element method. The coupling of the fluid equations and the moving-grid equations is essentially done through the conditions on the moving boundaries. Two- and three-dimensional simulations are presented: translation and rotation of a cylinder in a fluid, and large-amplitude sloshing in a rectangular tank. The accuracy and robustness of the present numerical model are studied and discussed.

Submitted 4 September, 2007; originally announced September 2007.

Comments: Applied Numerical Mathematics, In Press, 2008

Journal ref: Applied Numerical Mathematics 58 (2008) 968-984

Showing 1–21 of 21 results for author: Bouffanais, R