-
Inference through innovation processes tested in the authorship attribution task
Authors:
Giulio Tani Raffaelli,
Margherita Lalli,
Francesca Tria
Abstract:
Urn models for innovation capture fundamental empirical laws shared by several real-world processes. The so-called urn model with triggering includes, as particular cases, the urn representation of the two-parameter Poisson-Dirichlet process and the Dirichlet process, seminal in Bayesian non-parametric inference. In this work, we leverage this connection to introduce a general approach for quantifying closeness between symbolic sequences and test it within the framework of the authorship attribution problem. The method demonstrates high accuracy when compared to other related methods in different scenarios, featuring a substantial gain in computational efficiency and theoretical transparency. Beyond the practical convenience, this work demonstrates how the recently established connection between urn models and non-parametric Bayesian inference can pave the way for designing more efficient inference methods. In particular, the hybrid approach that we propose allows us to relax the exchangeability hypothesis, which can be particularly relevant for systems exhibiting complex correlation patterns and non-stationary dynamics.
Submitted 5 July, 2024; v1 submitted 8 June, 2023;
originally announced June 2023.
-
Taylor's law in innovation processes
Authors:
F. Tria,
I. Crimaldi,
G. Aletti,
V. D. P. Servedio
Abstract:
Taylor's law quantifies the scaling properties of the fluctuations of the number of innovations occurring in open systems.
Urn based modelling schemes have already proven to be effective in modelling this complex behaviour.
Here, we present analytical estimations of Taylor's law exponents in such models, by leveraging their representation in terms of triangular urn models.
We also highlight the correspondence of these models with Poisson-Dirichlet processes and demonstrate how a non-trivial Taylor's law exponent is a kind of universal feature in systems related to human activities.
We base this result on the analysis of four collections of data generated by human activity: (i) written language (from a Gutenberg corpus); (ii) an online music website (Last.fm); (iii) Twitter hashtags; (iv) an on-line collaborative tagging system (Del.icio.us).
While Taylor's law observed in the last two datasets agrees with the plain model predictions, we need to introduce a generalization to fully characterize the behaviour of the first two datasets, where temporal correlations are possibly more relevant.
We suggest that Taylor's law is a fundamental complement to Zipf's and Heaps' laws in unveiling the complex dynamical processes underlying the evolution of systems featuring innovation.
Submitted 1 May, 2020;
originally announced May 2020.
-
The exploration of the Adjacent Possible explains the emergence and evolution of social networks
Authors:
Enrico Ubaldi,
Raffaella Burioni,
Vittorio Loreto,
Francesca Tria
Abstract:
The interactions among human beings represent the backbone of our societies. How people interact, establish new connections, and allocate their activities among these links can reveal a lot of our social organization. Despite focused attention by very diverse scientific communities, we still lack a first-principles modeling framework able to account for the birth and evolution of social networks. Here, we tackle this problem by looking at social interactions as a way to explore a very peculiar space, namely the adjacent possible space, i.e., the set of individuals we can meet at any given point in time during our lifetime. We leverage a recent mathematical formalization of the adjacent possible space to propose a first-principles theory of social exploration based on simple microscopic rules defining how people get in touch and interact. The new theory predicts both microscopic and macroscopic features of social networks. The most striking feature captured on the microscopic side is the probability for an individual, with already $k$ connections, to acquire a new acquaintance. On the macroscopic side, the model reproduces the main static and dynamic features of social networks: the broad distribution of degree and activities, the average clustering coefficient and the innovation rate at the global and local level. The theory is borne out in three diverse real-world social networks: the network of mentions between Twitter users, the network of co-authorship of the American Physical Society and a mobile-phone-call network.
Submitted 7 July, 2020; v1 submitted 2 March, 2020;
originally announced March 2020.
-
Zipf's, Heaps' and Taylor's laws are determined by the expansion into the adjacent possible
Authors:
Francesca Tria,
Vittorio Loreto,
Vito D. P. Servedio
Abstract:
Zipf's, Heaps' and Taylor's laws are ubiquitous in many different systems where innovation processes are at play. Together, they represent a compelling set of stylized facts regarding the overall statistics, the innovation rate and the scaling of fluctuations for systems as diverse as written texts and cities, ecological systems and stock markets. Many modeling schemes have been proposed in the literature to explain those laws, but only recently a modeling framework has been introduced that accounts for the emergence of those laws without deducing the emergence of one of the laws from the others or without ad hoc assumptions. This modeling framework is based on the concept of adjacent possible space and its key feature of being dynamically restructured while its boundaries get explored, i.e., conditional to the occurrence of novel events. Here, we illustrate this approach and show how this simple modelling framework, instantiated through a modified Polya's urn model, is able to reproduce Zipf's, Heaps' and Taylor's laws within a unique self-consistent scheme. In addition, the same modelling scheme embraces other less common evolutionary laws (Hoppe's model and Dirichlet processes) as particular cases.
Submitted 30 September, 2018;
originally announced November 2018.
-
Dynamics on expanding spaces: modeling the emergence of novelties
Authors:
Vittorio Loreto,
Vito D. P. Servedio,
Steven H. Strogatz,
Francesca Tria
Abstract:
Novelties are part of our daily lives. We constantly adopt new technologies, conceive new ideas, meet new people, experiment with new situations. Occasionally, we as individuals, in a complicated cognitive and sometimes fortuitous process, come up with something that is not only new to us, but to our entire society so that what is a personal novelty can turn into an innovation at a global level. Innovations occur throughout social, biological and technological systems and, though we perceive them as a very natural ingredient of our human experience, little is known about the processes determining their emergence. Still the statistical occurrence of innovations shows striking regularities that represent a starting point to get a deeper insight in the whole phenomenology. This paper represents a small step in that direction, focusing on reviewing the scientific attempts to effectively model the emergence of the new and its regularities, with an emphasis on more recent contributions: from the plain Simon's model tracing back to the 1950s, to the newest model of Polya's urn with triggering of one novelty by another. What seems to be key in the successful modelling schemes proposed so far is the idea of looking at evolution as a path in a complex space, physical, conceptual, biological, technological, whose structure and topology get continuously reshaped and expanded by the occurrence of the new. Mathematically it is very interesting to look at the consequences of the interplay between the "actual" and the "possible" and this is the aim of this short review.
Submitted 4 January, 2017;
originally announced January 2017.
-
Maximum entropy models capture melodic styles
Authors:
Jason Sakellariou,
Francesca Tria,
Vittorio Loreto,
François Pachet
Abstract:
We introduce a Maximum Entropy model able to capture the statistics of melodies in music. The model can be used to generate new melodies that emulate the style of the musical corpus which was used to train it. Instead of using the $n-$body interactions of $(n-1)-$order Markov models, traditionally used in automatic music generation, we use a $k-$nearest neighbour model with pairwise interactions only. In that way, we keep the number of parameters low and avoid over-fitting problems typical of Markov models. We show that long-range musical phrases don't need to be explicitly enforced using high-order Markov interactions, but can instead emerge from multiple, competing, pairwise interactions. We validate our Maximum Entropy model by contrasting how much the generated sequences capture the style of the original corpus without plagiarizing it. To this end we use a data-compression approach to discriminate the levels of borrowing and innovation featured by the artificial sequences. The results show that our modelling scheme outperforms both fixed-order and variable-order Markov models. This shows that, despite being based only on pairwise interactions, this Maximum Entropy scheme opens the possibility to generate musically sensible alterations of the original phrases, providing a way to generate innovation.
Submitted 11 October, 2016;
originally announced October 2016.
-
Opinion dynamics: models, extensions and external effects
Authors:
Alina Sîrbu,
Vittorio Loreto,
Vito D. P. Servedio,
Francesca Tria
Abstract:
Recently, social phenomena have received a lot of attention not only from social scientists, but also from physicists, mathematicians and computer scientists, in the emerging interdisciplinary field of complex system science. Opinion dynamics is one of the processes studied, since opinions are the drivers of human behaviour, and play a crucial role in many global challenges that our complex world and societies are facing: global financial crises, global pandemics, growth of cities, urbanisation and migration patterns, and last but not least important, climate change and environmental sustainability and protection. Opinion formation is a complex process affected by the interplay of different elements, including the individual predisposition, the influence of positive and negative peer interaction (social networks playing a crucial role in this respect), the information each individual is exposed to, and many others. Several models inspired from those in use in physics have been developed to encompass many of these elements, and to allow for the identification of the mechanisms involved in the opinion formation process and the understanding of their role, with the practical aim of simulating opinion formation and spreading under various conditions. These modelling schemes range from binary simple models such as the voter model, to multi-dimensional continuous approaches. Here, we provide a review of recent methods, focusing on models employing both peer interaction and external information, and emphasising the role that less studied mechanisms, such as disagreement, have in driving the opinion dynamics. [...]
Submitted 20 May, 2016;
originally announced May 2016.
-
On the emergence of syntactic structures: quantifying and modelling duality of patterning
Authors:
Vittorio Loreto,
Pietro Gravino,
Vito D. P. Servedio,
Francesca Tria
Abstract:
The complex organization of syntax in hierarchical structures is one of the core design features of human language. Duality of patterning refers for instance to the organization of the meaningful elements in a language at two distinct levels: a combinatorial level where meaningless forms are combined into meaningful forms and a compositional level where meaningful forms are composed into larger lexical units. The question remains wide open regarding how such a structure could have emerged. Furthermore a clear mathematical framework to quantify this phenomenon is still lacking. The aim of this paper is that of addressing these two aspects in a self-consistent way. First, we introduce suitable measures to quantify the level of combinatoriality and compositionality in a language, and present a framework to estimate these observables in human natural languages. Second, we show that the theoretical predictions of a multi-agents modeling scheme, namely the Blending Game, are in surprisingly good agreement with empirical data. In the Blending Game a population of individuals plays language games aiming at success in communication. It is remarkable that the two sides of duality of patterning emerge simultaneously as a consequence of a pure cultural dynamics in a simulated environment that contains meaningful relations, provided a simple constraint on message transmission fidelity is also considered.
Submitted 11 February, 2016;
originally announced February 2016.
-
Participatory Patterns in an International Air Quality Monitoring Initiative
Authors:
Alina Sîrbu,
Martin Becker,
Saverio Caminiti,
Bernard De Baets,
Bart Elen,
Louise Francis,
Pietro Gravino,
Andreas Hotho,
Stefano Ingarra,
Vittorio Loreto,
Andrea Molino,
Juergen Mueller,
Jan Peters,
Ferdinando Ricchiuti,
Fabio Saracino,
Vito D. P. Servedio,
Gerd Stumme,
Jan Theunis,
Francesca Tria,
Joris Van den Bossche
Abstract:
The issue of sustainability is at the top of the political and societal agenda, being considered of extreme importance and urgency. Human individual action impacts the environment both locally (e.g., local air/water quality, noise disturbance) and globally (e.g., climate change, resource use). Urban environments represent a crucial example, with an increasing realization that the most effective way of producing a change is involving the citizens themselves in monitoring campaigns (a citizen science bottom-up approach). This is possible by developing novel technologies and IT infrastructures enabling large citizen participation. Here, in the wider framework of one of the first such projects, we show results from an international competition where citizens were involved in mobile air pollution monitoring using low cost sensing devices, combined with a web-based game to monitor perceived levels of pollution. Measures of shift in perceptions over the course of the campaign are provided, together with insights into participatory patterns emerging from this study. Interesting effects related to inertia and to direct involvement in measurement activities rather than indirect information exposure are also highlighted, indicating that direct involvement can enhance learning and environmental awareness. In the future, this could result in better adoption of policies towards decreasing pollution.
Submitted 26 March, 2015;
originally announced March 2015.
-
General three-state model with biased population replacement: Analytical solution and application to language dynamics
Authors:
Francesca Colaiori,
Claudio Castellano,
Christine F. Cuskley,
Vittorio Loreto,
Martina Pugliese,
Francesca Tria
Abstract:
Empirical evidence shows that the rate of irregular usage of English verbs exhibits discontinuity as a function of their frequency: the most frequent verbs tend to be totally irregular. We aim to qualitatively understand the origin of this feature by studying simple agent--based models of language dynamics, where each agent adopts an inflectional state for a verb and may change it upon interaction with other agents. At the same time, agents are replaced at some rate by new agents adopting the regular form. In models with only two inflectional states (regular and irregular), we observe that either all verbs regularize irrespective of their frequency, or a continuous transition occurs between a low frequency state where the lemma becomes fully regular, and a high frequency one where both forms coexist. Introducing a third (mixed) state, wherein agents may use either form, we find that a third, qualitatively different behavior may emerge, namely, a discontinuous transition in frequency. We introduce and solve analytically a very general class of three--state models that allows us to fully understand these behaviors in a unified framework. Realistic sets of interaction rules, including the well-known Naming Game (NG) model, result in a discontinuous transition, in agreement with recent empirical findings. We also point out that the distinction between speaker and hearer in the interaction has no effect on the collective behavior. The results for the general three--state model, although discussed in terms of language dynamics, are widely applicable.
Submitted 13 January, 2015; v1 submitted 18 November, 2014;
originally announced November 2014.
-
Internal and external dynamics in language: Evidence from verb regularity in a historical corpus of English
Authors:
Christine F. Cuskley,
Martina Pugliese,
Claudio Castellano,
Francesca Colaiori,
Vittorio Loreto,
Francesca Tria
Abstract:
Human languages are rule governed, but almost invariably these rules have exceptions in the form of irregularities. Since rules in language are efficient and productive, the persistence of irregularity is an anomaly. How does irregularity linger in the face of internal (endogenous) and external (exogenous) pressures to conform to a rule? Here we address this problem by taking a detailed look at simple past tense verbs in the Corpus of Historical American English. The data show that the language is open, with many new verbs entering. At the same time, existing verbs might tend to regularize or irregularize as a consequence of internal dynamics, but overall, the amount of irregularity sustained by the language stays roughly constant over time. Despite continuous vocabulary growth, and presumably, an attendant increase in expressive power, there is no corresponding growth in irregularity. We analyze the set of irregulars, showing they may adhere to a set of minority rules, allowing for increased stability of irregularity over time. These findings contribute to the debate on how language systems become rule governed, and how and why they sustain exceptions to rules, providing insight into the interplay between the emergence and maintenance of rules and exceptions in language.
Submitted 12 August, 2014;
originally announced August 2014.
-
XTribe: a web-based social computation platform
Authors:
Saverio Caminiti,
Claudio Cicali,
Pietro Gravino,
Vittorio Loreto,
Vito D. P. Servedio,
Alina Sîrbu,
Francesca Tria
Abstract:
In the last few years the Web has progressively acquired the status of an infrastructure for social computation that allows researchers to coordinate the cognitive abilities of human agents in on-line communities so as to steer the collective user activity towards predefined goals. This general trend is also triggering the adoption of web-games as a very interesting laboratory to run experiments in the social sciences and whenever the contribution of human beings is crucially required for research purposes. Nowadays, while the number of on-line users has been steadily growing, there is still a need for systematization in the approach to the web as a laboratory. In this paper we present Experimental Tribe (XTribe in short), a novel general purpose web-based platform for web-gaming and social computation. Ready to use and already operational, XTribe aims at drastically reducing the effort required to develop and run web experiments. XTribe has been designed to speed up the implementation of those general aspects of web experiments that are independent of the specific experiment content. For example, XTribe takes care of user management by handling their registration and profiles and, in case of multi-player games, it provides the necessary user grouping functionalities. XTribe also provides communication facilities to easily achieve both bidirectional and asynchronous communication. From a practical point of view, researchers are left with the only task of designing and implementing the game interface and logic of their experiment, on which they maintain full control. Moreover, XTribe acts as a repository of different scientific experiments, thus realizing a sort of showcase that stimulates users' curiosity, enhances their participation, and helps researchers in recruiting volunteers.
Submitted 18 January, 2014;
originally announced January 2014.
-
The dynamics of correlated novelties
Authors:
F. Tria,
V. Loreto,
V. D. P. Servedio,
S. H. Strogatz
Abstract:
One new thing often leads to another. Such correlated novelties are a familiar part of daily life. They are also thought to be fundamental to the evolution of biological systems, human society, and technology. By opening new possibilities, one novelty can pave the way for others in a process that Kauffman has called "expanding the adjacent possible". The dynamics of correlated novelties, however, have yet to be quantified empirically or modeled mathematically. Here we propose a simple mathematical model that mimics the process of exploring a physical, biological or conceptual space that enlarges whenever a novelty occurs. The model, a generalization of Polya's urn, predicts statistical laws for the rate at which novelties happen (analogous to Heaps' law) and for the probability distribution on the space explored (analogous to Zipf's law), as well as signatures of the hypothesized process by which one novelty sets the stage for another. We test these predictions on four data sets of human activity: the edit events of Wikipedia pages, the emergence of tags in annotation systems, the sequence of words in texts, and listening to new songs in online music catalogues. By quantifying the dynamics of correlated novelties, our results provide a starting point for a deeper understanding of the ever-expanding adjacent possible and its role in biological, linguistic, cultural, and technological evolution.
Submitted 7 October, 2013;
originally announced October 2013.
-
Dynamical correlations in the escape strategy of Influenza A virus
Authors:
Lorenzo Taggi,
Francesca Colaiori,
Vittorio Loreto,
Francesca Tria
Abstract:
The evolutionary dynamics of human Influenza A virus presents a challenging theoretical problem. An extremely high mutation rate allows the virus to escape, at each epidemic season, the host immune protection elicited by previous infections. At the same time, at each given epidemic season a single quasi-species, that is a set of closely related strains, is observed. A non-trivial relation between the genetic (i.e., at the sequence level) and the antigenic (i.e., related to the host immune response) distances can shed light on this puzzle. In this paper we introduce a model in which, in accordance with experimental observations, a simple interaction rule based on spatial correlations among point mutations dynamically defines an immunity space in the space of sequences. We investigate the static and dynamic structure of this space and we discuss how it affects the dynamics of the virus-host interaction. Interestingly, we observe a staggered time structure in the virus evolution as in the real Influenza evolutionary dynamics.
Submitted 15 May, 2013;
originally announced May 2013.
-
Size and structure of an epistatic space
Authors:
Lorenzo Taggi,
Francesca Colaiori,
Vittorio Loreto,
Francesca Tria
Abstract:
We provide quantitative estimates on the size and the structure of the epistatic space defined in the main article, "Dynamical Correlations in the escape strategy of Influenza A virus", EPL 101 68003.
Submitted 17 May, 2015; v1 submitted 24 March, 2013;
originally announced March 2013.
-
Cohesion, consensus and extreme information in opinion dynamics
Authors:
Alina Sîrbu,
Vittorio Loreto,
Vito D. P. Servedio,
Francesca Tria
Abstract:
Opinion formation is an important element of social dynamics. It has been widely studied in the last years with tools from physics, mathematics and computer science. Here, a continuous model of opinion dynamics for multiple possible choices is analysed. Its main features are the inclusion of disagreement and possibility of modulating information, both from one and multiple sources. The interest is in identifying the effect of the initial cohesion of the population, the interplay between cohesion and information extremism, and the effect of using multiple sources of information that can influence the system. Final consensus, especially with external information, depends highly on these factors, as numerical simulations show. When no information is present, consensus or segregation is determined by the initial cohesion of the population. Interestingly, when only one source of information is present, consensus can be obtained, in general, only when this is extremely mild, i.e. there is not a single opinion strongly promoted, or in the special case of a large initial cohesion and low information exposure. On the contrary, when multiple information sources are allowed, consensus can emerge with an information source even when this is not extremely mild, i.e. it carries a strong message, for a large range of initial conditions.
Submitted 20 February, 2013;
originally announced February 2013.
-
Opinion dynamics with disagreement and modulated information
Authors:
Alina Sîrbu,
Vittorio Loreto,
Vito D. P. Servedio,
Francesca Tria
Abstract:
Opinion dynamics concerns social processes through which populations or groups of individuals agree or disagree on specific issues. As such, modelling opinion dynamics represents an important research area that has been progressively acquiring relevance in many different domains. Existing approaches have mostly represented opinions through discrete binary or continuous variables by exploring a whole panoply of cases: e.g. independence, noise, external effects, multiple issues. In most of these cases the crucial ingredient is an attractive dynamics through which similar or similar enough agents get closer. Only rarely the possibility of explicit disagreement has been taken into account (i.e., the possibility for a repulsive interaction among individuals' opinions), and mostly for discrete or 1-dimensional opinions, through the introduction of additional model parameters. Here we introduce a new model of opinion formation, which focuses on the interplay between the possibility of explicit disagreement, modulated in a self-consistent way by the existing opinions' overlaps between the interacting individuals, and the effect of external information on the system. Opinions are modelled as a vector of continuous variables related to multiple possible choices for an issue. Information can be modulated to account for promoting multiple possible choices. Numerical results show that extreme information results in segregation and has a limited effect on the population, while milder messages have better success and a cohesion effect. Additionally, the initial condition plays an important role, with the population forming one or multiple clusters based on the initial average similarity between individuals, with a transition point depending on the number of opinion choices.
Submitted 1 December, 2012;
originally announced December 2012.
-
On the accuracy of language trees
Authors:
Simone Pompei,
Vittorio Loreto,
Francesca Tria
Abstract:
Historical linguistics aims at inferring the most likely language phylogenetic tree starting from information concerning the evolutionary relatedness of languages. The available information typically consists of lists of homologous (lexical, phonological, syntactic) features or characters for many different languages.
From this perspective the reconstruction of language trees is an example of inverse problems: starting from present, incomplete and often noisy, information, one aims at inferring the most likely past evolutionary history. A fundamental issue in inverse problems is the evaluation of the inference made. A standard way of dealing with this question is to generate data with artificial models in order to have full access to the evolutionary process one is going to infer. This procedure presents an intrinsic limitation: when dealing with real data sets, one typically does not know which model of evolution is the most suitable for them. A possible way out is to compare algorithmic inference with expert classifications. This is the point of view we take here by conducting a thorough survey of the accuracy of reconstruction methods as compared with the Ethnologue expert classifications. We focus in particular on state-of-the-art distance-based methods for phylogeny reconstruction using worldwide linguistic databases.
In order to assess the accuracy of the inferred trees we introduce and characterize two generalizations of standard definitions of distances between trees. Based on these scores we quantify the relative performances of the distance-based algorithms considered. Further we quantify how the completeness and the coverage of the available databases affect the accuracy of the reconstruction. Finally we draw some conclusions about where the accuracy of the reconstructions in historical linguistics stands and about the leading directions to improve it.
Submitted 21 March, 2011;
originally announced March 2011.
-
Aging in language dynamics
Authors:
Animesh Mukherjee,
Francesca Tria,
Andrea Baronchelli,
Andrea Puglisi,
Vittorio Loreto
Abstract:
Human languages evolve continuously, and a puzzling problem is how to reconcile the apparent robustness of most of the deep linguistic structures we use with the evidence that they undergo possibly slow, yet ceaseless, changes. Is the state in which we observe languages today closer to what would be a dynamical attractor with statistically stationary properties or rather closer to a non-steady state slowly evolving in time? Here we address this question in the framework of the emergence of shared linguistic categories in a population of individuals interacting through language games. The observed emerging asymptotic categorization, which has been previously tested - with success - against experimental data from human languages, corresponds to a metastable state where global shifts are always possible but progressively more unlikely and the response properties depend on the age of the system. This aging mechanism exhibits striking quantitative analogies to what is observed in the statistical mechanics of glassy systems. We argue that this can be a general scenario in language dynamics where shared linguistic conventions would not emerge as attractors, but rather as metastable states.
Submitted 14 January, 2011;
originally announced January 2011.
-
A fast no-rejection algorithm for the Category Game
Authors:
Francesca Tria,
Animesh Mukherjee,
Andrea Baronchelli,
Andrea Puglisi,
Vittorio Loreto
Abstract:
The Category Game is a multi-agent model that accounts for the emergence of shared categorization patterns in a population of interacting individuals. In the framework of the model, linguistic categories appear as long-lived consensus states that are constantly reshaped and re-negotiated by the communicating individuals. It is therefore crucial to investigate the long-time behavior to gain a clear understanding of the dynamics. However, it turns out that the evolution of the emerging category system is so slow, already for small populations, that such an analysis has remained so far impossible. Here, we introduce a fast no-rejection algorithm for the Category Game that disentangles the physical simulation time from the CPU time, thus opening the way for thorough analysis of the model. We verify that the new algorithm is equivalent to the old one in terms of the emerging phenomenology and we quantify the CPU performances of the two algorithms, pointing out the neat advantages offered by the no-rejection one. This technical advance has already opened the way to new investigations of the model, thus helping to shed light on the fundamental issue of categorization.
Submitted 16 December, 2010;
originally announced December 2010.
-
A Stochastic Local Search algorithm for distance-based phylogeny reconstruction
Authors:
F. Tria,
E. Caglioti,
V. Loreto,
A. Pagnani
Abstract:
In many interesting cases the reconstruction of a correct phylogeny is blurred by high mutation rates and/or horizontal transfer events. As a consequence a divergence arises between the true evolutionary distances and the differences between pairs of taxa as inferred from available data, making the phylogenetic reconstruction a challenging problem. Mathematically this divergence translates into a loss of additivity of the actual distances between taxa. In distance-based reconstruction methods, two properties of additive distances were extensively exploited as antagonist criteria to drive phylogeny reconstruction: on the one hand a local property of quartets, i.e., sets of four taxa in a tree, the four-points condition; on the other hand a recently proposed formula that allows one to write the tree length as a function of the distances between taxa, Pauplin's formula. Here we introduce a new reconstruction scheme that exploits in a unified framework both the four-points condition and Pauplin's formula. We propose, in particular, a new general class of distance-based Stochastic Local Search algorithms, which reduces in a limit case to the minimization of Pauplin's length. When tested on artificially generated phylogenies our Stochastic Big-Quartet Swapping algorithmic scheme significantly outperforms state-of-the-art distance-based algorithms in cases of deviation from additivity due to a high rate of back mutations. A significant improvement is also observed with respect to the state-of-the-art algorithms in the case of a high rate of horizontal transfer.
Submitted 4 February, 2010;
originally announced February 2010.
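The four-points condition mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: it only checks the standard textbook statement that, for every quartet of taxa, the two largest of the three pairwise distance sums coincide when the distances are additive. The function name, the tolerance, and the toy matrices are assumptions made for illustration.

```python
from itertools import combinations


def satisfies_four_point(d, tol=1e-9):
    """Return True if the symmetric distance matrix d meets the four-points condition."""
    n = len(d)
    for i, j, k, l in combinations(range(n), 4):
        # The three ways of pairing up the four taxa of a quartet.
        sums = sorted([d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k]])
        # Additive (tree-like) distances: the two largest sums must be equal.
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Hypothetical toy data: distances read off the tree ((A,B),(C,D)) with unit
# leaf edges and an internal edge of length 1.
additive = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]

# Same matrix with one distance inflated, mimicking a loss of additivity.
perturbed = [row[:] for row in additive]
perturbed[0][3] = perturbed[3][0] = 4

print(satisfies_four_point(additive))
print(satisfies_four_point(perturbed))
```

The check visits every quartet, so its cost grows as the fourth power of the number of taxa; it is meant only to make the definition concrete.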
-
Classification and sparse-signature extraction from gene-expression data
Authors:
Andrea Pagnani,
Francesca Tria,
Martin Weigt
Abstract:
In this work we suggest a statistical mechanics approach to the classification of high-dimensional data according to a binary label. We propose an algorithm whose aim is twofold: first, it learns a classifier from a relatively small number of data; second, it extracts a sparse signature, i.e., a lower-dimensional subspace carrying the information needed for the classification. In particular the second part of the task is NP-hard, therefore we propose a statistical-mechanics based message-passing approach. The resulting algorithm is first tested on artificial data to prove its validity, but also to elucidate possible limitations.
As an important application, we consider the classification of gene-expression data measured in various types of cancer tissues. We find that, despite the currently low quantity and quality of available data (the number of available samples is much smaller than the number of measured genes, limiting thus strongly the predictive capacities), the algorithm performs slightly better than many state-of-the-art approaches in bioinformatics.
Submitted 21 July, 2009;
originally announced July 2009.
-
Aligning graphs and finding substructures by a cavity approach
Authors:
S. Bradde,
A. Braunstein,
H. Mahmoudi,
F. Tria,
M. Weigt,
R. Zecchina
Abstract:
We introduce a new distributed algorithm for aligning graphs or finding substructures within a given graph. It is based on the cavity method and is used to study the maximum-clique and the graph-alignment problems in random graphs. The algorithm allows one to analyze large graphs and may find applications in fields such as computational biology. As a proof of concept we use our algorithm to align the similarity graphs of two interacting protein families involved in bacterial signal transduction, and to predict actually interacting protein partners between these families.
Submitted 1 April, 2010; v1 submitted 12 May, 2009;
originally announced May 2009.
-
A minimal stochastic model for influenza evolution
Authors:
Francesca Tria,
Michael Laessig,
Luca Peliti,
Silvio Franz
Abstract:
We introduce and discuss a minimal individual-based model for influenza dynamics. The model takes into account the effects of specific immunization against viral strains, but also infectivity randomness and the presence of a short-lived strain transcending immunity recently suggested in the literature. We show by simulations that the resulting model exhibits substitution of viral strains along the years, but that their divergence remains bounded. We also show that dropping any of these features results in a drastically different behavior, leading either to the extinction of the disease, to the proliferation of the viral strains, or to their divergence.
Submitted 18 May, 2005;
originally announced May 2005.
-
A note on the Guerra and Talagrand theorems for Mean Field Spin Glasses: the simple case of spherical models
Authors:
Silvio Franz,
Francesca Tria
Abstract:
The aim of this paper is to discuss the main ideas of the Talagrand proof of the Parisi Ansatz for the free-energy of Mean Field Spin Glasses with a physicist's approach. We consider the case of the spherical $p$-spin model, which has the following advantages: 1) the Parisi Ansatz takes the simple "one step replica symmetry breaking" form, 2) the replica free-energy as a function of the order parameters is simple enough to allow for numerical maximization with arbitrary precision. We present the essential ideas of the proof, we stress its connections with the theory of effective potentials for glassy systems, and we reduce the technically more difficult part of Talagrand's analysis to an explicit evaluation of the solution of a variational problem.
Submitted 20 July, 2005; v1 submitted 1 April, 2005;
originally announced April 2005.
-
Evolutionary games and quasispecies
Authors:
M. Laessig,
L. Peliti,
F. Tria
Abstract:
We discuss a population of sequences subject to mutations and frequency-dependent selection, where the fitness of a sequence depends on the composition of the entire population. This type of dynamics is crucial to understand the evolution of genomic regulation. Mathematically, it takes the form of a reaction-diffusion problem that is nonlinear in the population state. In our model system, the fitness is determined by a simple mathematical game, the hawk-dove game. The stationary population distribution is found to be a quasispecies with properties different from those which hold in fixed fitness landscapes.
Submitted 10 February, 2003; v1 submitted 4 September, 2002;
originally announced September 2002.
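As a rough illustration of the frequency-dependent selection described in the abstract above, the following sketch computes the expected payoffs in the classic hawk-dove game. It is not the paper's sequence model (it has no mutations and no sequence space); the resource value `v`, the fight cost `c`, and all function names are assumptions taken from the standard textbook formulation of the game.

```python
def hawk_dove_payoffs(v, c):
    """Textbook payoff to the row strategy against the column strategy."""
    return {
        ("H", "H"): (v - c) / 2,  # hawks split the resource and pay the fight cost
        ("H", "D"): v,            # hawk takes everything from a dove
        ("D", "H"): 0.0,          # dove retreats and gets nothing
        ("D", "D"): v / 2,        # doves share the resource
    }


def strategy_fitness(x_hawk, v, c):
    """Expected payoff of hawks and doves when a fraction x_hawk plays hawk."""
    p = hawk_dove_payoffs(v, c)
    f_hawk = x_hawk * p[("H", "H")] + (1 - x_hawk) * p[("H", "D")]
    f_dove = x_hawk * p[("D", "H")] + (1 - x_hawk) * p[("D", "D")]
    return f_hawk, f_dove


# Hypothetical parameters with v < c, so neither pure strategy dominates.
v, c = 2.0, 4.0
x_star = v / c  # mixed equilibrium fraction of hawks in the textbook game

f_hawk, f_dove = strategy_fitness(x_star, v, c)
print(x_star, f_hawk, f_dove)
```

Because each strategy's payoff depends on the fraction of hawks, fitness here is a function of the population composition rather than a fixed landscape; at the hawk fraction `v / c` the two payoffs are equal.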
-
Spin glasses on Bethe Lattices for large coordination number
Authors:
Giorgio Parisi,
Francesca Tria
Abstract:
We study spin glasses on random lattices with finite connectivity. In the infinite connectivity limit they reduce to the Sherrington-Kirkpatrick model. In this paper we investigate the expansion around the high connectivity limit. Within the replica symmetry breaking scheme at two steps, we compute the free energy at the first order in the expansion in inverse powers of the average connectivity (z), both for the fixed connectivity and for the fluctuating connectivity random lattices. It is well known that the coefficient of the 1/z correction for the free energy is divergent at low temperatures if computed in the one step approximation. We find that this annoying divergence becomes much smaller if computed in the framework of the more accurate two steps breaking. Comparing the temperature dependence of the coefficients of this divergence in the replica symmetric, one step and two steps replica symmetry breaking, we conclude that this divergence is an artefact due to the use of a finite number of steps of replica symmetry breaking. The 1/z expansion is well defined also in the zero temperature limit.
Submitted 4 July, 2002;
originally announced July 2002.