Learning to flock through reinforcement
Authors:
Mihir Durve,
Fernando Peruani,
Antonio Celani
Abstract:
Flocks of birds, schools of fish, and insect swarms are examples of coordinated group motion that arises spontaneously from the actions of many individuals. Here, we study flocking behavior from the viewpoint of multi-agent reinforcement learning. In this setting, a learning agent tries to keep contact with the group, using the velocity of its neighbors as sensory input. Each learning individual pursues this goal by exerting limited control on its own direction of motion. By means of standard reinforcement learning algorithms we show that i) a learning agent exposed to a group of teachers, i.e. hard-wired flocking agents, learns to follow them, and ii) in the absence of teachers, a group of independently learning agents evolves towards a state where each agent knows how to flock. In both scenarios, the emergent policy (or navigation strategy) corresponds to the polar velocity alignment mechanism of the well-known Vicsek model. These results a) show that such velocity alignment may have naturally evolved as an adaptive behavior that aims at minimizing the rate of neighbor loss, and b) prove that this alignment not only favors (local) polar order, but also corresponds to the best policy/strategy for keeping group cohesion when the sensory input is limited to the velocity of neighboring agents. In short, to stay together, steer together.
Submitted 5 November, 2019;
originally announced November 2019.
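The polar velocity alignment rule of the Vicsek model, which the abstract above identifies as the emergent policy, can be illustrated with a minimal sketch. This is not the authors' learning algorithm. It is only the textbook alignment step, in which an agent adopts the mean direction of its own and its neighbors' velocities. The function name `vicsek_heading` and the deterministic `noise` argument are assumptions made for illustration.

```python
import math


def vicsek_heading(own_heading, neighbor_headings, noise=0.0):
    """Return the heading (radians) after one polar-alignment step.

    Hypothetical helper: the agent adopts the direction of the summed
    unit velocity vectors of itself and its neighbors, plus an optional
    angular perturbation supplied by the caller.
    """
    headings = [own_heading, *neighbor_headings]
    # Sum unit velocity vectors; the argument of the sum is the mean direction.
    sum_x = sum(math.cos(h) for h in headings)
    sum_y = sum(math.sin(h) for h in headings)
    return math.atan2(sum_y, sum_x) + noise


# An agent heading along +x with one neighbor heading along +y
# turns to the bisecting direction.
new_heading = vicsek_heading(0.0, [math.pi / 2])
```

Averaging unit vectors rather than raw angles avoids the wrap-around problem at ±π. In the stochastic Vicsek model the perturbation is drawn at random at every step; here it is passed in explicitly so the example stays deterministic.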
Directedness of information flow in mobile phone communication networks
Authors:
Fernando Peruani,
Lionel Tabourier
Abstract:
Without having direct access to the information that is being exchanged, traces of information flow can be obtained by looking at temporal sequences of user interactions. These sequences can be represented as causality trees whose statistics result from a complex interplay between the topology of the underlying (social) network and the time correlations among the communications. Here, we study causality trees in mobile-phone data, which can be represented as a dynamical directed network. This representation of the data reveals the existence of super-spreaders and super-receivers. We show that the tree statistics, and thus the information spreading process, are extremely sensitive to the in-out degree correlation exhibited by the users. We also learn that a given piece of information, e.g., a rumor, would require users to retransmit it for more than 30 hours in order to cover a macroscopic fraction of the system. Our analysis indicates that topological node-node correlations of the underlying social network, while allowing the existence of information loops, also promote information spreading. Temporal correlations, and therefore causality effects, are only visible as local phenomena and over short time scales. These results are obtained through a combination of theory and data analysis techniques.
Submitted 1 February, 2013;
originally announced February 2013.
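The in-out degree correlation mentioned in the abstract above can be made concrete with a short sketch. It assumes, purely for illustration, that the call data is a list of `(caller, callee)` pairs, that degrees count distinct contacts, and that the correlation is measured with a Pearson coefficient. The abstract does not state the authors' exact estimator, and the function name `in_out_degree_correlation` is hypothetical.

```python
import math
from collections import Counter


def in_out_degree_correlation(calls):
    """Pearson correlation between users' in-degree and out-degree.

    `calls` is an iterable of (caller, callee) pairs (assumed format).
    Repeated calls between the same ordered pair count once.
    """
    edges = set(calls)
    out_deg = Counter(src for src, _ in edges)
    in_deg = Counter(dst for _, dst in edges)
    users = sorted(set(out_deg) | set(in_deg))
    xs = [in_deg[u] for u in users]   # Counter returns 0 for missing users
    ys = [out_deg[u] for u in users]
    n = len(users)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either degree sequence is constant.
    return cov / math.sqrt(var_x * var_y)


# A pure super-spreader: one user calls three others and receives no calls.
star = [("a", "b"), ("a", "c"), ("a", "d")]
r_star = in_out_degree_correlation(star)
```

In this toy star network the user with the highest out-degree has the lowest in-degree, so the coefficient is strongly negative. A network in which every call is reciprocated gives a coefficient of one.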