Statistical signatures of abstraction in deep neural networks

Authors: Carlo Orientale Caputo, Matteo Marsili

Abstract: We study how abstract representations emerge in a Deep Belief Network (DBN) trained on benchmark datasets. Our analysis targets the principles of learning in the early stages of information processing, starting from the "primordial soup" of the under-sampling regime. As the data is processed by deeper and deeper layers, features are detected and removed, transferring more and more "context-invariant" information to deeper layers. We show that the representation approaches a universal model -- the Hierarchical Feature Model (HFM) -- determined by the principle of maximal relevance. Relevance quantifies the uncertainty on the model of the data, thus suggesting that "meaning" -- i.e. syntactic information -- is that part of the data which is not yet captured by a model. Our analysis shows that shallow layers are well described by pairwise Ising models, which provide a representation of the data in terms of generic, low order features. We also show that plasticity increases with depth, in a similar way as it does in the brain. These findings suggest that DBNs are capable of extracting a hierarchy of features from the data which is consistent with the principle of maximal relevance.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 18 pages, 5 figures

arXiv:2311.17166 [pdf, other]

Is stochastic thermodynamics the key to understanding the energy costs of computation?

Authors: David Wolpert, Jan Korbel, Christopher Lynn, Farita Tasnim, Joshua Grochow, Gülce Kardeş, James Aimone, Vijay Balasubramanian, Eric de Giuli, David Doty, Nahuel Freitas, Matteo Marsili, Thomas E. Ouldridge, Andrea Richa, Paul Riechers, Édgar Roldán, Brenda Rubenstein, Zoltan Toroczkai, Joseph Paradiso

Abstract: The relationship between the thermodynamic and computational characteristics of dynamical physical systems has been a major theoretical interest since at least the 19th century, and has been of increasing practical importance as the energetic cost of digital devices has exploded over the last half century. One of the most important thermodynamic features of real-world computers is that they operate very far from thermal equilibrium, in finite time, with many quickly (co-)evolving degrees of freedom. Such computers also must almost always obey multiple physical constraints on how they work. For example, all modern digital computers are periodic processes, governed by a global clock. Another example is that many computers are modular, hierarchical systems, with strong restrictions on the connectivity of their subsystems. These properties hold both for naturally occurring computers, like brains or Eukaryotic cells, as well as digital systems. These features of real-world computers are absent in 20th century analyses of the thermodynamics of computational processes, which focused on quasi-statically slow processes. However, the field of stochastic thermodynamics has been developed in the last few decades - and it provides the formal tools for analyzing systems that have exactly these features of real-world computers.
We argue here that these tools, together with other tools currently being developed in stochastic thermodynamics, may help us understand at a far deeper level just how the fundamental physical properties of dynamic systems are related to the computation that they perform.

Submitted 30 November, 2023; v1 submitted 28 November, 2023; originally announced November 2023.

Comments: Typo fix

arXiv:2210.13179 [pdf, other]

A simple probabilistic neural network for machine understanding

Authors: Rongrong Xie, Matteo Marsili

Abstract: We discuss probabilistic neural networks with a fixed internal representation as models for machine understanding. Here understanding is intended as mapping data to an already existing representation which encodes an {\em a priori} organisation of the feature space. We derive the internal representation by requiring that it satisfies the principles of maximal relevance and of maximal ignorance about how different features are combined. We show that, when hidden units are binary variables, these two principles identify a unique model -- the Hierarchical Feature Model (HFM) -- which is fully solvable and provides a natural interpretation in terms of features. We argue that learning machines with this architecture enjoy a number of interesting properties, like the continuity of the representation with respect to changes in parameters and data, the possibility to control the level of compression and the ability to support functions that go beyond generalisation. We explore the behaviour of the model with extensive numerical experiments and argue that models where the internal representation is fixed reproduce a learning modality which is qualitatively different from that of traditional models such as Restricted Boltzmann Machines.

Submitted 6 December, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

Comments: 34 pages, 9 figures. Accepted in JSTAT

arXiv:2202.00339 [pdf, other]

doi 10.1016/j.physrep.2022.03.001

Quantifying Relevance in Learning and Inference

Authors: Matteo Marsili, Yasser Roudi

Abstract: Learning is a distinctive feature of intelligent behaviour. High-throughput experimental data and Big Data promise to open new windows on complex systems such as cells, the brain or our societies. Yet, the puzzling success of Artificial Intelligence and Machine Learning shows that we still have a poor conceptual understanding of learning. These applications push statistical inference into uncharted territories where data is high-dimensional and scarce, and prior information on "true" models is scant if not totally absent. Here we review recent progress on understanding learning, based on the notion of "relevance". The relevance, as we define it here, quantifies the amount of information that a dataset or the internal representation of a learning machine contains on the generative model of the data. This allows us to define maximally informative samples, on one hand, and optimal learning machines on the other. These are ideal limits of samples and of machines, that contain the maximal amount of information about the unknown generative process, at a given resolution (or level of compression). Both ideal limits exhibit critical features in the statistical sense: Maximally informative samples are characterised by a power-law frequency distribution (statistical criticality) and optimal learning machines by an anomalously large susceptibility. The trade-off between resolution (i.e. compression) and relevance distinguishes the regime of noisy representations from that of lossy compression.
These are separated by a special point characterised by Zipf's law statistics. This identifies samples obeying Zipf's law as the most compressed loss-less representations that are optimal in the sense of maximal relevance. Criticality in optimal learning machines manifests in an exponential degeneracy of energy levels, that leads to unusual thermodynamic properties.

Submitted 1 February, 2022; originally announced February 2022.

Comments: review article, 63 pages, 14 figures

arXiv:2112.09420 [pdf, other]

doi 10.1088/1742-5468/ac7794

A random energy approach to deep learning

Authors: Rongrong Xie, Matteo Marsili

Abstract: We study a generic ensemble of deep belief networks which is parametrized by the distribution of energy levels of the hidden states of each layer. We show that, within a random energy approach, statistical dependence can propagate from the visible to deep layers only if each layer is tuned close to the critical point during learning. As a consequence, efficiently trained learning machines are characterised by a broad distribution of energy levels. The analysis of Deep Belief Networks and Restricted Boltzmann Machines on different datasets confirms these conclusions.

Submitted 17 December, 2021; originally announced December 2021.

Comments: 16 pages, 4 figures

arXiv:2008.00520 [pdf, other]

Statistical Inference of Minimally Complex Models

Authors: Clélia de Mulatier, Paolo P. Mazza, Matteo Marsili

Abstract: Finding the model that best describes a high dimensional dataset is a daunting task. For binary data, we show that this becomes feasible when restricting the search to a family of simple models, that we call Minimally Complex Models (MCMs). These are spin models, with interactions of arbitrary order, that are composed of independent components of minimal complexity (Beretta et al., 2018). They tend to be simple in information theoretic terms, which means that they are well-fitted to specific types of data, and are therefore easy to falsify. We show that Bayesian model selection restricted to these models is computationally feasible and has many other advantages. First, their evidence, which trades off goodness-of-fit against model complexity, can be computed easily without any parameter fitting. This allows selecting the best MCM among all, even though the number of models is astronomically large. Furthermore, MCMs can be inferred and sampled from without any computational effort. Finally, model selection among MCMs is invariant with respect to changes in the representation of the data. MCMs portray the structure of dependencies among variables in a simple way, as illustrated in several examples, and thus provide robust predictions on dependencies in the data. MCMs contain interactions of any order between variables, and thus may reveal the presence of interactions of order higher than pairwise.

Submitted 27 September, 2021; v1 submitted 2 August, 2020; originally announced August 2020.

Comments: 18 pages, 10 figures

arXiv:2006.06928 [pdf, other]

doi 10.1145/3383583.3398527

Characterising authors on the extent of their paper acceptance: A case study of the Journal of High Energy Physics

Authors: Rima Hazra, Aryan, Hardik Aggarwal, Matteo Marsili, Animesh Mukherjee

Abstract: New researchers are usually very curious about the recipe that could accelerate the chances of their paper getting accepted in a reputed forum (journal/conference). In search of such a recipe, we investigate the profile and peer review text of authors whose papers almost always get accepted at a venue (Journal of High Energy Physics in our current work). We find authors with high acceptance rate are likely to have a high number of citations, high $h$-index, higher number of collaborators etc. We notice that they receive relatively lengthy and positive reviews for their papers. In addition, we also construct three networks -- co-reviewer, co-citation and collaboration network -- and study the network-centric features and intra- and inter-category edge interactions. We find that the authors with high acceptance rate are more `central' in these networks; the volume of intra- and inter-category interactions is also drastically different for the authors with high acceptance rate compared to the other authors. Finally, using the above set of features, we train standard machine learning models (random forest, XGBoost) and obtain very high class-wise precision and recall. In a follow-up discussion we also narrate how, apart from the author characteristics, the peer-review system might itself have a role in propelling the distinction among the different categories, which could lead to potential discrimination and unfairness and calls for further investigation by the system admins.

Submitted 11 June, 2020; originally announced June 2020.

Comments: Accepted in JCDL'2020

arXiv:2006.04544 [pdf, other]

doi 10.1088/1742-5468/abacb3

Optimal Work Extraction and the Minimum Description Length Principle

Authors: Léo Touzo, Matteo Marsili, Neri Merhav, Édgar Roldán

Abstract: We discuss work extraction from classical information engines (e.g., Szilárd) with $N$-particles, $q$ partitions, and initial arbitrary non-equilibrium states. In particular, we focus on their {\em optimal} behaviour, which includes the measurement of a set of quantities $Φ$ with a feedback protocol that extracts the maximal average amount of work. We show that the optimal non-equilibrium state to which the engine should be driven before the measurement is given by the normalised maximum-likelihood probability distribution of a statistical model that admits $Φ$ as sufficient statistics. Furthermore, we show that the minimax universal code redundancy $\mathcal{R}^*$ associated to this model, provides an upper bound to the work that the demon can extract on average from the cycle, in units of $k_{\rm B}T$. We also find that, in the limit of $N$ large, the maximum average extracted work cannot exceed $H[Φ]/2$, i.e. one half times the Shannon entropy of the measurement. Our results establish a connection between optimal work extraction in stochastic thermodynamics and optimal universal data compression, providing design principles for optimal information engines. In particular, they suggest that: (i) optimal coding is thermodynamically efficient, and (ii) it is essential to drive the system into a critical state in order to achieve optimal performance.

Submitted 30 July, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

Comments: 26 pages, 5 figures. To appear in JSTAT

MSC Class: 68P30; 68Q32 ACM Class: H.1.1; G.3

Journal ref: J. Stat. Mech. (2020) 093403

arXiv:1911.01968 [pdf]

Thermodynamic Computing

Authors: Tom Conte, Erik DeBenedictis, Natesh Ganesh, Todd Hylton, John Paul Strachan, R. Stanley Williams, Alexander Alemi, Lee Altenberg, Gavin Crooks, James Crutchfield, Lidia del Rio, Josh Deutsch, Michael DeWeese, Khari Douglas, Massimiliano Esposito, Michael Frank, Robert Fry, Peter Harsha, Mark Hill, Christopher Kello, Jeff Krichmar, Suhas Kumar, Shih-Chii Liu, Seth Lloyd, Matteo Marsili, et al. (14 additional authors not shown)

Abstract: The hardware and software foundations laid in the first half of the 20th Century enabled the computing technologies that have transformed the world, but these foundations are now under siege. The current computing paradigm, which is the foundation of much of the current standards of living that we now enjoy, faces fundamental limitations that are evident from several perspectives. In terms of hardware, devices have become so small that we are struggling to eliminate the effects of thermodynamic fluctuations, which are unavoidable at the nanometer scale. In terms of software, our ability to imagine and program effective computational abstractions and implementations is clearly challenged in complex domains. In terms of systems, currently five percent of the power generated in the US is used to run computing systems - this astonishing figure is neither ecologically sustainable nor economically scalable. Economically, the cost of building next-generation semiconductor fabrication plants has soared past $10 billion. All of these difficulties - device scaling, software complexity, adaptability, energy consumption, and fabrication economics - indicate that the current computing paradigm has matured and that continued improvements along this path will be limited. If technological progress is to continue and corresponding social and economic benefits are to continue to accrue, computing must become much more capable, energy efficient, and affordable.
We propose that progress in computing can continue under a united, physically grounded, computational paradigm centered on thermodynamics. Herein we propose a research agenda to extend these thermodynamic foundations into complex, non-equilibrium, self-organizing systems and apply them holistically to future computing systems that will harness nature's innate computational capacity. We call this type of computing "Thermodynamic Computing" or TC.

Submitted 14 November, 2019; v1 submitted 5 November, 2019; originally announced November 2019.

Comments: A Computing Community Consortium (CCC) workshop report, 36 pages

Report number: ccc2019report_6

arXiv:1909.12792 [pdf, other]

Maximal Relevance and Optimal Learning Machines

Authors: O Duranthon, M Marsili, R Xie

Abstract: We show that the mutual information between the representation of a learning machine and the hidden features that it extracts from data is bounded from below by the relevance, which is the entropy of the model's energy distribution. Models with maximal relevance -- that we call Optimal Learning Machines (OLM) -- are hence expected to extract maximally informative representations. We explore this principle in a range of models. For fully connected Ising models we show that {\em i)} OLM are characterised by inhomogeneous distributions of couplings, and that {\em ii)} their learning performance is affected by sub-extensive features that are elusive to a thermodynamic treatment. On specific learning tasks, we find that likelihood maximisation is achieved by models with maximal relevance. Training of Restricted Boltzmann Machines on the MNIST benchmark shows that learning is associated with a broadening of the spectrum of energy levels and that the internal representation of the hidden layer approaches the maximal relevance that can be achieved in a finite dataset. Finally, we discuss a Gaussian learning machine that clarifies that learning hidden features is conceptually different from parameter estimation.

Submitted 27 January, 2021; v1 submitted 27 September, 2019; originally announced September 2019.

Comments: 28 pages, 9 figures

arXiv:1903.00386 [pdf, other]

On the complexity of logistic regression models

Authors: Nicola Bulso, Matteo Marsili, Yasser Roudi

Abstract: We investigate the complexity of logistic regression models which is defined by counting the number of indistinguishable distributions that the model can represent (Balasubramanian, 1997). We find that the complexity of logistic models with binary inputs does not only depend on the number of parameters but also on the distribution of inputs in a non-trivial way which standard treatments of complexity do not address. In particular, we observe that correlations among inputs induce effective dependencies among parameters thus constraining the model and, consequently, reducing its complexity. We derive simple relations for the upper and lower bounds of the complexity. Furthermore, we show analytically that, defining the model parameters on a finite support rather than the entire axis, decreases the complexity in a manner that critically depends on the size of the domain. Based on our findings, we propose a novel model selection criterion which takes into account the entropy of the input distribution. We test our proposal on the problem of selecting the input variables of a logistic regression model in a Bayesian Model Selection framework. In our numerical tests, we find that, while the reconstruction errors of standard model selection approaches (AIC, BIC, $\ell_1$ regularization) strongly depend on the sparsity of the ground truth, the reconstruction error of our method is always close to the minimum in all conditions of sparsity, data size and strength of input correlations.
Finally, we observe that, when considering categorical instead of binary inputs, in a simple and mathematically tractable case, the contribution of the alphabet size to the complexity is very small compared to that of parameter space dimension. We further explore the issue by analysing the dataset of the "13 keys to the White House" which is a method for forecasting the outcomes of US presidential elections.

Submitted 1 March, 2019; originally announced March 2019.

Comments: 29 pages, 6 figures, The supplementary material is an ancillary file and can be downloaded from a link on the right

arXiv:1710.11324 [pdf, other]

Resolution and Relevance Trade-offs in Deep Learning

Authors: Juyong Song, Matteo Marsili, Junghyo Jo

Abstract: Deep learning has been successfully applied to various tasks, but its underlying mechanism remains unclear. Neural networks associate similar inputs in the visible layer to the same state of hidden variables in deep layers. The fraction of inputs that are associated to the same state is a natural measure of similarity and is simply related to the cost in bits required to represent these inputs. The degeneracy of states with the same information cost provides instead a natural measure of noise and is simply related to the entropy of the frequency of states, that we call relevance. Representations with minimal noise, at a given level of similarity (resolution), are those that maximise the relevance. A signature of such efficient representations is that frequency distributions follow power laws. We show, in extensive numerical experiments, that deep neural networks extract a hierarchy of efficient representations from data, because they i) achieve low levels of noise (i.e. high relevance) and ii) exhibit power law distributions. We also find that the layer that is most efficient to reliably generate patterns of training data is the one for which relevance and resolution are traded at the same price, which implies that frequency distribution follows Zipf's law.
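
The two quantities named in this abstract can be illustrated with a minimal sketch. It assumes the empirical estimators commonly used in this line of work, which the abstract itself does not spell out: with $k_s$ the number of times state $s$ appears among $N$ samples and $m_k$ the number of distinct states appearing exactly $k$ times, resolution is taken as $H[s] = -\sum_s (k_s/N)\log_2(k_s/N)$ and relevance as $H[k] = -\sum_k (k m_k/N)\log_2(k m_k/N)$. The function name `resolution_and_relevance` is hypothetical and not taken from the paper.

```python
from collections import Counter
from math import log2


def resolution_and_relevance(states):
    """Return (H[s], H[k]) in bits for a sample of hashable states.

    Assumed estimators (not stated in the abstract):
      H[s] = -sum_s (k_s/N) log2(k_s/N)        resolution
      H[k] = -sum_k (k*m_k/N) log2(k*m_k/N)    relevance
    """
    n = len(states)
    if n == 0:
        raise ValueError("sample must not be empty")

    # k_s: how many times each state was observed
    counts = Counter(states)
    # m_k: how many distinct states were observed exactly k times
    degeneracy = Counter(counts.values())

    resolution = -sum((k / n) * log2(k / n) for k in counts.values())
    relevance = -sum(
        (k * m / n) * log2(k * m / n) for k, m in degeneracy.items()
    )
    return resolution, relevance


if __name__ == "__main__":
    # Toy sample: two states seen twice, two states seen once
    h_s, h_k = resolution_and_relevance(["a", "a", "b", "b", "c", "d"])
    print(f"resolution H[s] = {h_s:.4f} bits, relevance H[k] = {h_k:.4f} bits")
```

Under these assumed definitions the relevance never exceeds the resolution, and a sample in which every state is distinct has maximal resolution but zero relevance, since the frequency carries no information.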

Submitted 19 March, 2018; v1 submitted 31 October, 2017; originally announced October 2017.

Comments: 13 pages, 3 figures

arXiv:1705.01089 [pdf, other]

Influence of Reviewer Interaction Network on Long-term Citations: A Case Study of the Scientific Peer-Review System of the Journal of High Energy Physics

Authors: Sandipan Sikdar, Matteo Marsili, Niloy Ganguly, Animesh Mukherjee

Abstract: A `peer-review system' in the context of judging research contributions, is one of the prime steps undertaken to ensure the quality of the submissions received; a significant portion of the publishing budget is spent towards successful completion of the peer-review by the publication houses. Nevertheless, the scientific community is largely reaching a consensus that the peer-review system, although indispensable, is nonetheless flawed. A very pertinent question therefore is "could this system be improved?". In this paper, we attempt to present an answer to this question by considering a massive dataset of around $29k$ papers with roughly $70k$ distinct review reports together consisting of $12m$ lines of review text from the Journal of High Energy Physics (JHEP) between 1997 and 2015. Specifically, we introduce a novel \textit{reviewer-reviewer interaction network} (an edge exists between two reviewers if they were assigned by the same editor) and show that surprisingly the simple structural properties of this network such as degree, clustering coefficient, centrality (closeness, betweenness etc.) serve as strong predictors of the long-term citations (i.e., the overall scientific impact) of a submitted paper. These features, when plugged in a regression model, alone achieve a high $R^2$ of 0.79 and a low $RMSE$ of 0.496 in predicting the long-term citations.
In addition, we also design a set of supporting features built from the basic characteristics of the submitted papers, the authors and the referees (e.g., the popularity of the submitting author, the acceptance rate history of a referee, the linguistic properties laden in the text of the review reports etc.), which further results in overall improvement with $R^2$ of 0.81 and $RMSE$ of 0.46.

Submitted 2 May, 2017; originally announced May 2017.

arXiv:1608.04875 [pdf, ps, other]

doi 10.1145/2983323.2983675

Anomalies in the peer-review system: A case study of the journal of High Energy Physics

Authors: Sandipan Sikdar, Matteo Marsili, Niloy Ganguly, Animesh Mukherjee

Abstract: The peer-review system has long been relied upon for bringing quality research to the notice of the scientific community and also preventing flawed research from entering into the literature. The need for the peer-review system has often been debated as in numerous cases it has failed in its task and in most of these cases editors and the reviewers were thought to be responsible for not being able to correctly judge the quality of the work. This raises a question "Can the peer-review system be improved?" Since editors and reviewers are the most important pillars of a reviewing system, we in this work, attempt to address a related question - given the editing/reviewing history of the editors or reviewers "can we identify the under-performing ones?", with citations received by the edited/reviewed papers being used as proxy for quantifying performance. We term such reviewers and editors as anomalous and we believe identifying and removing them shall improve the performance of the peer-review system. Using a massive dataset of Journal of High Energy Physics (JHEP) consisting of 29k papers submitted between 1997 and 2015 with 95 editors and 4035 reviewers and their review history, we identify several factors which point to anomalous behavior of referees and editors. In fact the anomalous editors and reviewers account for 26.8% and 14.5% of the total editors and reviewers respectively and for most of these anomalous reviewers the performance degrades alarmingly over time.

Submitted 17 August, 2016; originally announced August 2016.

Comments: 25th ACM International Conference on Information and Knowledge Management (CIKM 2016)

arXiv:1306.3830 [pdf, other]

doi 10.3390/e15083031

What do leaders know?

Authors: Giacomo Livan, Matteo Marsili

Abstract: The ability of a society to make the right decisions on relevant matters relies on its capability to properly aggregate the noisy information spread across the individuals it is made of. In this paper we study the information aggregation performance of a stylized model of a society whose most influential individuals - the leaders - are highly connected among themselves and uninformed. Agents update their state of knowledge in a Bayesian manner by listening to their neighbors. We find analytical and numerical evidence of a transition, as a function of the noise level in the information initially available to agents, from a regime where information is correctly aggregated to one where the population reaches consensus on the wrong outcome with finite probability. Furthermore, information aggregation depends in a non-trivial manner on the relative size of the clique of leaders, with the limit of a vanishingly small clique being singular.

Submitted 23 July, 2013; v1 submitted 17 June, 2013; originally announced June 2013.

Comments: 10 pages, 3 figures

Journal ref: Entropy, 15(8), 3031-3044 (2013)

arXiv:1207.6416 [pdf, ps, other]

doi 10.1007/s10955-013-0693-0

The Social Climbing Game

Authors: Marco Bardoscia, Giancarlo De Luca, Giacomo Livan, Matteo Marsili, Claudio J. Tessone

Abstract: The structure of a society depends, to some extent, on the incentives of the individuals it is composed of. We study a stylized model of this interplay, which suggests that the more individuals aim at climbing the social hierarchy, the stronger society's hierarchy becomes. Such a dependence is sharp, in the sense that a persistent hierarchical order emerges abruptly when the preference for social status gets larger than a threshold. This phase transition has its origin in the fact that the presence of a well defined hierarchy allows agents to climb it, thus reinforcing it, whereas in a "disordered" society it is harder for agents to find out whom they should connect to in order to become more central. Interestingly, a social order emerges when agents strive harder to climb society and it results in a state of reduced social mobility, as a consequence of ergodicity breaking, where climbing is more difficult.

Submitted 19 January, 2013; v1 submitted 26 July, 2012; originally announced July 2012.

Comments: 14 pages, 9 figures

Journal ref: Journal of Statistical Physics 151 (2013), pp. 440-457

arXiv:1104.2026 [pdf, other]

doi 10.1073/pnas.1105757109

Collaboration in Social Networks

Authors: Luca Dall'Asta, Matteo Marsili, Paolo Pin

Abstract: The very notion of social network implies that linked individuals interact repeatedly with each other. This allows them not only to learn successful strategies and adapt to them, but also to condition their own behavior on the behavior of others, in a strategic forward looking manner. Game theory of repeated games shows that these circumstances are conducive to the emergence of collaboration in simple games of two players. We investigate the extension of this concept to the case where players are engaged in a local contribution game and show that rationality and credibility of threats identify a class of Nash equilibria -- that we call "collaborative equilibria" -- that have a precise interpretation in terms of sub-graphs of the social network. For large network games, the number of such equilibria is exponentially large in the number of players. When incentives to defect are small, equilibria are supported by local structures whereas when incentives exceed a threshold they acquire a non-local nature, which requires a "critical mass" of more than a given fraction of the players to collaborate. Therefore, when incentives are high, an individual deviation typically causes the collapse of collaboration across the whole system. At the same time, higher incentives to defect typically support equilibria with a higher density of collaborators. The resulting picture conforms with several results in sociology and in the experimental literature on game theory, such as the prevalence of collaboration in denser groups and in the structural hubs of sparse networks.

Submitted 11 April, 2011; originally announced April 2011.

arXiv:0808.0584 [pdf, ps, other]

doi 10.1103/PhysRevE.79.015101

Congestion phenomena on complex networks

Authors: Daniele De Martino, Luca Dall'Asta, Ginestra Bianconi, Matteo Marsili

Abstract: We define a minimal model of traffic flows in complex networks containing the most relevant features of real routing schemes, i.e. a trade-off strategy between topological-based and traffic-based routing. The resulting collective behavior, obtained analytically for the ensemble of uncorrelated networks, is physically very rich and reproduces results recently observed in traffic simulations on scale-free networks. We find that traffic control is useless in homogeneous graphs but may improve global performance in inhomogeneous networks, enlarging the free-flow region in parameter space. Traffic control also introduces non-linear effects and, beyond a critical strength, may trigger the appearance of a congested phase in a discontinuous manner.

Submitted 5 August, 2008; originally announced August 2008.

Comments: 4 pages, 4 figures, submitted to PRL

arXiv:0807.1458 [pdf, ps, other]

doi 10.1016/j.physa.2006.07.017

Theory of Rumour Spreading in Complex Social Networks

Authors: Maziar Nekovee, Y. Moreno, G. Bianconi, M. Marsili

Abstract: We introduce a general stochastic model for the spread of rumours, and derive mean-field equations that describe the dynamics of the model on complex social networks (in particular those mediated by the Internet). We use analytical and numerical solutions of these equations to examine the threshold behavior and dynamics of the model on several models of such networks: random graphs, uncorrelated scale-free networks and scale-free networks with assortative degree correlations. We show that in both homogeneous networks and random graphs the model exhibits a critical threshold in the rumour spreading rate below which a rumour cannot propagate in the system. In the case of scale-free networks, on the other hand, this threshold becomes vanishingly small in the limit of infinite system size. We find that the initial rate at which a rumour spreads is much higher in scale-free networks than in random graphs, and that the rate at which the spreading proceeds on scale-free networks is further increased when assortative degree correlations are introduced. The impact of degree correlations on the final fraction of nodes that ever hears a rumour, however, depends on the interplay between network topology and the rumour spreading rate. Our results show that scale-free social networks are prone to the spreading of rumours, just as they are to the spreading of infections. They are relevant to the spreading dynamics of chain emails, viral advertising and large-scale information dissemination algorithms on the Internet.

Submitted 9 July, 2008; originally announced July 2008.

Journal ref: Physica A, Vol 374, 457 (2007)

Showing 1–19 of 19 results for author: Marsili, M