-
Modeling Self-Propagating Malware with Epidemiological Models
Authors:
Alesia Chernikova,
Nicolò Gozzi,
Simona Boboila,
Nicola Perra,
Tina Eliassi-Rad,
Alina Oprea
Abstract:
Self-propagating malware (SPM) has recently resulted in large financial losses and high social impact, with well-known campaigns such as WannaCry and Colonial Pipeline being able to propagate rapidly on the Internet and cause service disruptions. To date, the propagation behavior of SPM is still not well understood, resulting in the difficulty of defending against these cyber threats. To address this gap, in this paper we perform a comprehensive analysis of a newly proposed epidemiological model for SPM propagation, Susceptible-Infected-Infected Dormant-Recovered (SIIDR). We perform a theoretical analysis of the stability of the SIIDR model and derive its basic reproduction number by representing it as a system of Ordinary Differential Equations with continuous time. We obtain access to 15 WannaCry attack traces generated under various conditions, derive the model's transition rates, and show that SIIDR fits the real data best. We find that the SIIDR model outperforms more established compartmental models from epidemiology, such as SI, SIS, and SIR, at modeling SPM propagation.
Submitted 3 August, 2023; v1 submitted 5 August, 2022;
originally announced August 2022.
-
The Concept of Decentralization Through Time and Disciplines: A Quantitative Exploration
Authors:
Gabriele Di Bona,
Alberto Bracci,
Nicola Perra,
Vito Latora,
Andrea Baronchelli
Abstract:
Decentralization is a pervasive concept found across disciplines, including Economics, Political Science, and Computer Science, where it is used in distinct yet interrelated ways. Here, we develop and publicly release a general pipeline to investigate the scholarly history of the term, analysing 425,144 academic publications that refer to (de)centralization. We find that the fraction of papers on the topic has been exponentially increasing since the 1950s. In 2021, 1 author in 154 mentioned (de)centralization in the title or abstract of an article. Using both semantic information and citation patterns, we cluster papers in fields and characterize the knowledge flows between them. Our analysis reveals that the topic has independently emerged in the different fields, with small cross-disciplinary contamination. Moreover, we show how Blockchain has become the most influential field about 10 years ago, while Governance dominated before the 1990s. In summary, our findings provide a quantitative assessment of the evolution of a key yet elusive concept, which has undergone cycles of rise and fall within different fields. Our pipeline offers a powerful tool to analyze the evolution of any scholarly term in the academic literature, providing insights into the interplay between collective and independent discoveries in science.
Submitted 6 October, 2023; v1 submitted 28 July, 2022;
originally announced July 2022.
-
The structure of segregation in co-authorship networks and its impact on scientific production
Authors:
Ana Maria Jaramillo,
Hywel T. P. Williams,
Nicola Perra,
Ronaldo Menezes
Abstract:
Co-authorship networks, where nodes represent authors and edges represent co-authorship relations, are key to understanding the production and diffusion of knowledge in academia. Social constructs, biases (implicit and explicit), and constraints (e.g. spatial, temporal) affect who works with whom and cause co-authorship networks to organise into tight communities with different levels of segregation. We aim to look at aspects of the co-authorship network structure that lead to segregation and its impact on scientific production. We measure segregation using the Spectral Segregation Index (SSI) and find 4 ordered segregation categories: completely segregated, highly segregated, moderately segregated and non-segregated communities. We direct our attention to the non-segregated and highly segregated communities, quantifying and comparing their structural topologies and k-core positions. When considering communities of both categories (controlling for size), our results show no differences in density and clustering but substantial variability in core position. Larger non-segregated communities are more likely to occupy cores near the network nucleus, while the highly segregated ones tend to be closer to the network periphery. Finally, we analyse differences in citations gained by researchers within communities showing different segregation categories. Researchers in highly segregated communities get more citations from their community members in middle cores and gain more citations per publication in middle/periphery cores. Those in non-segregated communities get more citations per publication in the nucleus. To our knowledge, this work is the first to characterise community segregation in co-authorship networks and investigate the relationship between community segregation and author citations.
Submitted 3 May, 2023; v1 submitted 20 July, 2022;
originally announced July 2022.
-
Modeling Teams Performance Using Deep Representational Learning on Graphs
Authors:
Francesco Carli,
Pietro Foini,
Nicolò Gozzi,
Nicola Perra,
Rossano Schifanella
Abstract:
The large majority of human activities require collaborations within and across formal or informal teams. Our understanding of how the collaborative efforts spent by teams relate to their performance is still a matter of debate. Teamwork results in a highly interconnected ecosystem of potentially overlapping components where tasks are performed in interaction with team members and across other teams. To tackle this problem, we propose a graph neural network model designed to predict a team's performance while identifying the drivers that determine such an outcome. In particular, the model is based on three architectural channels: topological, centrality, and contextual, which capture different factors potentially shaping teams' success. We endow the model with two attention mechanisms to boost model performance and allow interpretability. A first mechanism allows pinpointing key members inside the team. A second mechanism allows us to quantify the contributions of the three driver effects in determining the outcome performance. We test model performance on a wide range of domains outperforming most of the classical and neural baselines considered. Moreover, we include synthetic datasets specifically designed to validate how the model disentangles the intended properties on which our model vastly outperforms baselines.
Submitted 29 June, 2022;
originally announced June 2022.
-
Cyber Network Resilience against Self-Propagating Malware Attacks
Authors:
Alesia Chernikova,
Nicolò Gozzi,
Simona Boboila,
Priyanka Angadi,
John Loughner,
Matthew Wilden,
Nicola Perra,
Tina Eliassi-Rad,
Alina Oprea
Abstract:
Self-propagating malware (SPM) has led to huge financial losses, major data breaches, and widespread service disruptions in recent years. In this paper, we explore the problem of developing cyber resilient systems capable of mitigating the spread of SPM attacks. We begin with an in-depth study of a well-known self-propagating malware, WannaCry, and present a compartmental model called SIIDR that accurately captures the behavior observed in real-world attack traces. Next, we investigate ten cyber defense techniques, including existing edge and node hardening strategies, as well as newly developed methods based on reconfiguring network communication (NodeSplit) and isolating communities. We evaluate all defense strategies in detail using six real-world communication graphs collected from a large retail network and compare their performance across a wide range of attacks and network topologies. We show that several of these defenses are able to efficiently reduce the spread of SPM attacks modeled with SIIDR. For instance, given a strong attack that infects 97% of nodes when no defense is employed, strategically securing a small number of nodes (0.08%) reduces the infection footprint in one of the networks down to 1%.
Submitted 8 October, 2022; v1 submitted 27 June, 2022;
originally announced June 2022.
-
The adoption of non-pharmaceutical interventions and the role of digital infrastructure during the COVID-19 Pandemic in Colombia, Ecuador, and El Salvador
Authors:
Nicolò Gozzi,
Niccolò Comini,
Nicola Perra
Abstract:
Adherence to the non-pharmaceutical interventions (NPIs) put in place to mitigate the spreading of infectious diseases is a multifaceted problem. Socio-demographic, socio-economic, and epidemiological factors can influence the perceived susceptibility and risk which are known to affect behavior. Furthermore, the adoption of NPIs is dependent upon the barriers, real or perceived, associated with their implementation. We study the determinants of NPIs adherence during the first wave of the COVID-19 Pandemic in Colombia, Ecuador, and El Salvador. Analyses are performed at the level of municipalities and include socio-economic, socio-demographic, and epidemiological indicators. Furthermore, by leveraging a unique dataset comprising tens of millions of internet Speedtest measurements from Ookla, we investigate the quality of the digital infrastructure as a possible barrier to adoption. We use publicly available data provided by Meta capturing aggregated mobility changes as a proxy of adherence to NPIs. Across the three countries considered, we find a significant correlation between mobility drops and digital infrastructure quality. The relationship remains significant after controlling for several factors including socio-economic status, population size, and reported COVID-19 cases. This finding suggests that municipalities with better connectivity were able to afford higher mobility reductions. The link between mobility drops and digital infrastructure quality is stronger at the peak of NPIs stringency. We also find that mobility reductions were more pronounced in larger, denser, and wealthier municipalities. Our work provides new insights on the significance of access to digital tools as an additional factor influencing the ability to follow social distancing guidelines during a health emergency.
Submitted 24 February, 2022;
originally announced February 2022.
-
Macroscopic properties of buyer-seller networks in online marketplaces
Authors:
Alberto Bracci,
Jörn Boehnke,
Abeer ElBahrawy,
Nicola Perra,
Alexander Teytelboym,
Andrea Baronchelli
Abstract:
Online marketplaces are the main engines of legal and illegal e-commerce, yet their empirical properties are poorly understood due to the absence of large-scale data. We analyze two comprehensive datasets containing 245M transactions (16B USD) that took place on online marketplaces between 2010 and 2021, covering 28 dark web marketplaces, i.e., unregulated markets whose main currency is Bitcoin, and 144 product markets of one popular regulated e-commerce platform. We show that transactions in online marketplaces exhibit strikingly similar patterns despite significant differences in language, lifetimes, products, regulation, and technology. Specifically, we find remarkable regularities in the distributions of transaction amounts, number of transactions, inter-event times and time between first and last transactions. We show that buyer behavior is affected by the memory of past interactions and use this insight to propose a model of network formation reproducing our main empirical observations. Our findings have implications for understanding market power on online marketplaces as well as inter-marketplace competition, and provide empirical foundation for theoretical economic models of online marketplaces.
Submitted 11 April, 2022; v1 submitted 16 December, 2021;
originally announced December 2021.
-
Finding Patient Zero: Learning Contagion Source with Graph Neural Networks
Authors:
Chintan Shah,
Nima Dehmamy,
Nicola Perra,
Matteo Chinazzi,
Albert-László Barabási,
Alessandro Vespignani,
Rose Yu
Abstract:
Locating the source of an epidemic, or patient zero (P0), can provide critical insights into the infection's transmission course and allow efficient resource allocation. Existing methods use graph-theoretic centrality measures and expensive message-passing algorithms, requiring knowledge of the underlying dynamics and its parameters. In this paper, we revisit this problem using graph neural networks (GNNs) to learn P0. We establish a theoretical limit for the identification of P0 in a class of epidemic models. We evaluate our method against different epidemic models on both synthetic and a real-world contact network considering a disease with history and characteristics of COVID-19. We observe that GNNs can identify P0 close to the theoretical bound on accuracy, without explicit input of dynamics or its parameters. In addition, GNN is over 100 times faster than classic methods for inference on arbitrary graph topologies. Our theoretical bound also shows that the epidemic is like a ticking clock, emphasizing the importance of early contact-tracing. We find a maximum time after which accurate recovery of the source becomes impossible, regardless of the algorithm used.
Submitted 27 June, 2020; v1 submitted 21 June, 2020;
originally announced June 2020.
-
Collective response to the media coverage of COVID-19 Pandemic on Reddit and Wikipedia
Authors:
Nicolò Gozzi,
Michele Tizzani,
Michele Starnini,
Fabio Ciulla,
Daniela Paolotti,
André Panisson,
Nicola Perra
Abstract:
The exposure and consumption of information during epidemic outbreaks may alter risk perception, trigger behavioural changes, and ultimately affect the evolution of the disease. It is thus of the utmost importance to map information dissemination by mainstream media outlets and public response. However, our understanding of this exposure-response dynamic during the COVID-19 pandemic is still limited. In this paper, we provide a characterization of media coverage and online collective attention to the COVID-19 pandemic in four countries: Italy, United Kingdom, United States, and Canada. For this purpose, we collect a heterogeneous dataset including 227,768 online news articles and 13,448 YouTube videos published by mainstream media, 107,898 user posts and 3,829,309 comments on the social media platform Reddit, and 278,456,892 views of COVID-19 related Wikipedia pages. Our results show that public attention, quantified as user activity on Reddit and active searches on Wikipedia pages, is mainly driven by media coverage and declines rapidly, while news exposure and COVID-19 incidence remain high. Furthermore, by using an unsupervised, dynamical topic modeling approach, we show that while the attention dedicated to different topics by media and online users is in good accordance, interesting deviations emerge in their temporal patterns. Overall, our findings offer an additional key to interpret public perception/response to the current global health emergency and raise questions about the effects of attention saturation on collective awareness, risk perception and thus on tendencies towards behavioural changes.
Submitted 8 June, 2020;
originally announced June 2020.
-
Towards a data-driven characterization of behavioral changes induced by the seasonal flu
Authors:
Nicolò Gozzi,
Daniela Perrotta,
Daniela Paolotti,
Nicola Perra
Abstract:
In this work, we aim to determine the main factors driving behavioral change during the seasonal flu. To this end, we analyze a unique dataset comprised of 599 surveys completed by 434 Italian users of Influweb, a Web platform for participatory surveillance, during the 2017-18 and 2018-19 seasons. The data provide socio-demographic information, level of concerns about the flu, past experience with illnesses, and the type of behavioral changes implemented by each participant. We describe each response with a set of features and divide them in three target categories. These describe those that report i) no (26 %), ii) only moderately (36 %), iii) significant (38 %) changes in behaviors. In these settings, we adopt machine learning algorithms to investigate the extent to which target variables can be predicted by looking only at the set of features. Notably, $66\%$ of the samples in the category describing more significant changes in behaviors are correctly classified through Gradient Boosted Trees. Furthermore, we investigate the importance of each feature in the classification task and uncover complex relationships between individuals' characteristics and their attitude towards behavioral change. We find that intensity, recency of past illnesses, perceived susceptibility to and perceived severity of an infection are the most significant features in the classification task. Interestingly, the last two match the theoretical constructs suggested by the Health-Belief Model. Overall, the research contributes to the small set of empirical studies devoted to the data-driven characterization of behavioral changes induced by infectious diseases.
Submitted 3 February, 2020;
originally announced February 2020.
-
Explore with caution: mapping the evolution of scientific interest in Physics
Authors:
Alberto Aleta,
Sandro Meloni,
Nicola Perra,
Yamir Moreno
Abstract:
In the book The Essential Tension Thomas Kuhn described the conflict between tradition and innovation in scientific research, i.e., the desire to explore new promising areas, counterposed to the need to capitalize on the work done in the past. While it is true that along their careers many scientists probably felt this tension, only few works have tried to quantify it. Here, we address this question by analyzing a large-scale dataset, containing all the papers published by the American Physical Society (APS) in more than 25 years, which allows for a better understanding of scientists' careers evolution in Physics. We employ the Physics and Astronomy Classification Scheme (PACS) present in each paper to map the scientific interests of 181,397 authors and their evolution along the years. Our results indeed confirm the existence of the 'essential tension' with scientists balancing between exploring the boundaries of their area and exploiting previous work. In particular, we found that although the majority of physicists change the topics of their research, they stay within the same broader area thus exploring with caution new scientific endeavors. Furthermore, we quantify the flows of authors moving between different subfields and pinpoint which areas are more likely to attract or donate researchers to the other ones. Overall, our results depict a very distinctive portrait of the evolution of research interests in Physics and can help in designing specific policies for the future.
Submitted 12 April, 2019;
originally announced April 2019.
-
Modelling Opinion Dynamics in the Age of Algorithmic Personalisation
Authors:
Nicola Perra,
Luis E C Rocha
Abstract:
Modern technology has drastically changed the way we interact and consume information. For example, online social platforms allow for seamless communication exchanges at an unprecedented scale. However, we are still bounded by cognitive and temporal constraints. Our attention is limited and extremely valuable. Algorithmic personalisation has become a standard approach to tackle the information overload problem. As a result, the exposure to our friends' opinions and our perception about important issues might be distorted. However, the effects of algorithmic gatekeeping on our hyper-connected society are poorly understood. Here, we devise an opinion dynamics model where individuals are connected through a social network and adopt opinions as a function of the view points they are exposed to. We apply various filtering algorithms that select the opinions shown to users i) at random ii) considering time ordering or iii) their current beliefs. Furthermore, we investigate the interplay between such mechanisms and crucial features of real networks. We found that algorithmic filtering might influence opinions' share and distributions, especially in case information is biased towards the current opinion of each user. These effects are reinforced in networks featuring topological and spatial correlations where echo chambers and polarisation emerge. Conversely, heterogeneity in connectivity patterns reduces such tendency. We consider also a scenario where one opinion, through nudging, is centrally pushed to all users. Interestingly, even minimal nudging is able to change the status quo moving it towards the desired view point. Our findings suggest that simple filtering algorithms might be powerful tools to regulate opinion dynamics taking place on social networks.
Submitted 8 November, 2018;
originally announced November 2018.
-
Epidemic spreading on time-varying multiplex networks
Authors:
Quan-Hui Liu,
Xinyue Xiong,
Qian Zhang,
Nicola Perra
Abstract:
Social interactions are stratified in multiple contexts and are subject to complex temporal dynamics. The systematic study of these two features of social systems has started only very recently mainly thanks to the development of multiplex and time-varying networks. However, these two advancements have progressed almost in parallel with very little overlap. Thus, the interplay between multiplexity and the temporal nature of connectivity patterns is poorly understood. Here, we aim to tackle this limitation by introducing a time-varying model of multiplex networks. We are interested in characterizing how these two properties affect contagion processes. To this end, we study SIS epidemic models unfolding at a time-scale comparable to the evolution of the multiplex network. We study both analytically and numerically the epidemic threshold as a function of the overlap between, and the features of, each layer. We found that the overlap between layers significantly reduces the epidemic threshold especially when the temporal activation patterns of overlapping nodes are positively correlated. Furthermore, when the average connectivity across layers is very different, the contagion dynamics are driven by the features of the more densely connected layer. Here, the epidemic threshold is equivalent to that of a single layered graph and the impact of the disease, in the layer driving the contagion, is independent of the overlap. However, this is not the case in the other layers where the spreading dynamics are sharply influenced by it. The results presented provide another step towards the characterization of the properties of real networks and their effects on contagion phenomena.
Submitted 11 August, 2018;
originally announced August 2018.
-
Epidemic Spreading on Activity-Driven Networks with Attractiveness
Authors:
Iacopo Pozzana,
Kaiyuan Sun,
Nicola Perra
Abstract:
We study SIS epidemic spreading processes unfolding on a recent generalisation of the activity-driven modelling framework. In this model of time-varying networks each node is described by two variables: activity and attractiveness. The first describes the propensity to form connections. The second defines the propensity to attract them. We derive analytically the epidemic threshold considering the timescale driving the evolution of contacts and the contagion as comparable. The solutions are general and hold for any joint distribution of activity and attractiveness. The theoretical picture is confirmed via large-scale numerical simulations performed considering heterogeneous distributions and different correlations between the two variables. We find that heterogeneous distributions of attractiveness alter the contagion process. In particular, when the two variables are uncorrelated or positively correlated, heterogeneous attractiveness facilitates the spreading. On the contrary, negative correlations between activity and attractiveness hamper the spreading. The results presented contribute to the understanding of the dynamical properties of time-varying networks and their effects on contagion phenomena unfolding on their fabric.
Submitted 11 September, 2017; v1 submitted 7 March, 2017;
originally announced March 2017.
-
Random walks on activity-driven networks with attractiveness
Authors:
Laura Alessandretti,
Kaiyuan Sun,
Andrea Baronchelli,
Nicola Perra
Abstract:
Virtually all real-world networks are dynamical entities. In social networks, the propensity of nodes to engage in social interactions (activity) and their chances to be selected by active nodes (attractiveness) are heterogeneously distributed. Here, we present a time-varying network model where each node and the dynamical formation of ties are characterised by these two features. We study how these properties affect random walk processes unfolding on the network when the time scales describing the process and the network evolution are comparable. We derive analytical solutions for the stationary state and the mean first passage time of the process and we study cases informed by empirical observations of social networks. Our work shows that previously disregarded properties of real social systems, such as heterogeneous distributions of activity and attractiveness as well as the correlations between them, substantially affect the dynamical process unfolding on the network.
Submitted 12 June, 2017; v1 submitted 23 January, 2017;
originally announced January 2017.
-
Statistical physics of vaccination
Authors:
Zhen Wang,
Chris T. Bauch,
Samit Bhattacharyya,
Alberto d'Onofrio,
Piero Manfredi,
Matjaz Perc,
Nicola Perra,
Marcel Salathé,
Dawei Zhao
Abstract:
Historically, infectious diseases caused considerable damage to human societies, and they continue to do so today. To help reduce their impact, mathematical models of disease transmission have been studied to help understand disease dynamics and inform prevention strategies. Vaccination - one of the most important preventive measures of modern times - is of great interest both theoretically and empirically. And in contrast to traditional approaches, recent research increasingly explores the pivotal implications of individual behavior and heterogeneous contact patterns in populations. Our report reviews the developmental arc of theoretical epidemiology with emphasis on vaccination, as it led from classical models assuming homogeneously mixing (mean-field) populations and ignoring human behavior, to recent models that account for behavioral feedback and/or population spatial/social structure. Many of the methods used originated in statistical physics, such as lattice and network models, and their associated analytical frameworks. Similarly, the feedback loop between vaccinating behavior and disease propagation forms a coupled nonlinear system with analogs in physics. We also review the new paradigm of digital epidemiology, wherein sources of digital data such as online social media are mined for high-resolution information on epidemiologically relevant individual behavior. Armed with the tools and concepts of statistical physics, and further assisted by new sources of digital data, models that capture nonlinear interactions between behavior and disease dynamics offer a novel way of modeling real-world phenomena, and can help improve health outcomes. We conclude the review by discussing open problems in the field and promising directions for future research.
Submitted 17 November, 2016; v1 submitted 31 August, 2016;
originally announced August 2016.
-
Burstiness and tie reinforcement in time varying social networks
Authors:
Enrico Ubaldi,
Alessandro Vezzani,
Marton Karsai,
Nicola Perra,
Raffaella Burioni
Abstract:
We introduce a time-varying network model accounting for burstiness and tie reinforcement observed in social networks. The analytical solution indicates a non-trivial phase diagram determined by the competition of the leading terms of the two processes. We test our results against numerical simulations, and compare the analytical predictions with an empirical dataset, finding good agreement between them. The presented framework can be used to classify the dynamical features of real social networks and to gather new insights about the effects of social dynamics on ongoing spreading processes.
Submitted 29 July, 2016;
originally announced July 2016.
-
The dynamic of information-driven coordination phenomena: a transfer entropy analysis
Authors:
Javier Borge-Holthoefer,
Nicola Perra,
Bruno Gonçalves,
Sandra González-Bailón,
Alex Arenas,
Yamir Moreno,
Alessandro Vespignani
Abstract:
Data from social media are providing unprecedented opportunities to investigate the processes that rule the dynamics of collective social phenomena. Here, we consider an information theoretical approach to define and measure the temporal and structural signatures typical of collective social events as they arise and gain prominence. We use the symbolic transfer entropy analysis of micro-blogging time series to extract directed networks of influence among geolocalized sub-units in social systems. This methodology captures the emergence of system-level dynamics close to the onset of socially relevant collective phenomena. The framework is validated against a detailed empirical analysis of five case studies. In particular, we identify a change in the characteristic time-scale of the information transfer that flags the onset of information-driven collective phenomena. Furthermore, our approach identifies an order-disorder transition in the directed network of influence between social sub-units. In the absence of a clear exogenous driving, social collective phenomena can be represented as endogenously-driven structural transitions of the information transfer network. This study provides results that can help define models and predictive algorithms for the analysis of societal events based on open source data.
Submitted 22 July, 2015;
originally announced July 2015.
-
Attention on Weak Ties in Social and Communication Networks
Authors:
Lilian Weng,
Márton Karsai,
Nicola Perra,
Filippo Menczer,
Alessandro Flammini
Abstract:
Granovetter's weak tie theory of social networks is built around two central hypotheses. The first states that strong social ties carry the large majority of interaction events; the second maintains that weak social ties, although less active, are often relevant for the exchange of especially important information (e.g., about potential new jobs in Granovetter's work). While several empirical studies have provided support for the first hypothesis, the second has been the object of far less scrutiny. A possible reason is that it involves notions relative to the nature and importance of the information that are hard to quantify and measure, especially in large scale studies. Here, we search for empirical validation of both Granovetter's hypotheses. We find clear empirical support for the first. We also provide empirical evidence and a quantitative interpretation for the second. We show that attention, measured as the fraction of interactions devoted to a particular social connection, is high on weak ties --- possibly reflecting the postulated informational purposes of such ties --- but also on very strong ties. Data from online social media and mobile communication reveal network-dependent mixtures of these two effects on the basis of a platform's typical usage. Our results establish a clear relationship between attention, importance, and strength of social links, and could lead to improved algorithms to prioritize social media content.
Submitted 31 August, 2017; v1 submitted 10 May, 2015;
originally announced May 2015.
-
Committed activists and the reshaping of status-quo social consensus
Authors:
Dina Mistry,
Qian Zhang,
Nicola Perra,
Andrea Baronchelli
Abstract:
The role of committed minorities in shaping public opinion has been recently addressed with the help of multi-agent models. However, previous studies focused on homogeneous populations where zealots stand out only for their stubbornness. Here, we consider the more general case in which individuals are characterized by different propensities to communicate. In particular, we correlate commitment with a higher tendency to push an opinion, acknowledging the fact that individuals with unwavering dedication to a cause are also more active in their attempts to promote their message. We show that these activists are not only more efficient in spreading their message but that their efforts require an order of magnitude fewer individuals than a randomly selected committed minority to bring the population over to a new consensus. Finally, we address the role of communities, showing that partisan divisions in the society can make it harder for committed individuals to flip the status-quo social consensus.
Submitted 30 October, 2015; v1 submitted 8 May, 2015;
originally announced May 2015.
-
Contrasting Effects of Strong Ties on SIR and SIS Processes in Temporal Networks
Authors:
Kaiyuan Sun,
Andrea Baronchelli,
Nicola Perra
Abstract:
Most real networks are characterized by connectivity patterns that evolve in time following complex, non-Markovian, dynamics. Here we investigate the impact of this ubiquitous feature by studying the Susceptible-Infected-Recovered (SIR) and Susceptible-Infected-Susceptible (SIS) epidemic models on activity driven networks with and without memory (i.e., Markovian and non-Markovian). We show that while memory inhibits the spreading process in SIR models, where the epidemic threshold is moved to larger values, it has the opposite effect in the case of the SIS, where the threshold is lowered. The heterogeneity in tie strengths, and the frequent repetition of connections that it entails, allows in fact less virulent SIS-like diseases to survive in tightly connected local clusters that serve as reservoirs for the virus. We validate this picture by evaluating the threshold of both processes in a real temporal network. Our findings confirm the important role played by non-Markovian network dynamics on dynamical processes.
Submitted 23 June, 2015; v1 submitted 3 April, 2014;
originally announced April 2014.
-
The role of endogenous and exogenous mechanisms in the formation of R&D networks
Authors:
Mario Vincenzo Tomasello,
Nicola Perra,
Claudio Juan Tessone,
Márton Karsai,
Frank Schweitzer
Abstract:
We develop an agent-based model of strategic link formation in Research and Development (R&D) networks. Empirical evidence has shown that the growth of these networks is driven by mechanisms which are both endogenous to the system (that is, depending on existing alliance patterns) and exogenous (that is, driven by an exploratory search for newcomer firms). Extant research to date has not investigated both mechanisms simultaneously in a comparative manner. To overcome this limitation, we develop a general modeling framework to shed light on the relative importance of these two mechanisms. We test our model against a comprehensive dataset, listing cross-country and cross-sectoral R&D alliances from 1984 to 2009. Our results show that by fitting only three macroscopic properties of the network topology, this framework is able to reproduce a number of micro-level measures, including the distributions of degree, local clustering, path length and component size, and the emergence of network clusters. Furthermore, by estimating the link probabilities towards newcomers and established firms from the data, we find that endogenous mechanisms are predominant over the exogenous ones in the network formation, thus quantifying the importance of existing structures in selecting partner firms.
Submitted 15 July, 2014; v1 submitted 17 March, 2014;
originally announced March 2014.
-
Controlling Contagion Processes in Time-Varying Networks
Authors:
Suyu Liu,
Nicola Perra,
Marton Karsai,
Alessandro Vespignani
Abstract:
The vast majority of strategies aimed at controlling contagion processes on networks consider the connectivity pattern of the system as either quenched or annealed. However, in the real world many networks are highly dynamical and evolve in time concurrently to the contagion process. Here, we derive an analytical framework for the study of control strategies specifically devised for time-varying networks. We consider the removal/immunization of individual nodes according to their activity in the network and develop a block variable mean-field approach that allows the derivation of the equations describing the evolution of the contagion process concurrently to the network dynamic. We derive the critical immunization threshold and assess the effectiveness of the control strategies. Finally, we validate the theoretical picture by simulating numerically the information spreading process and control strategies in both synthetic networks and a large-scale, real-world mobile telephone call dataset.
Submitted 26 September, 2013;
originally announced September 2013.
-
Time varying networks and the weakness of strong ties
Authors:
Márton Karsai,
Nicola Perra,
Alessandro Vespignani
Abstract:
In most social and information systems the activity of agents generates rapidly evolving time-varying networks. The temporal variation in networks' connectivity patterns and the ongoing dynamic processes are usually coupled in ways that still challenge our mathematical or computational modelling. Here we analyse a mobile call dataset and find a simple statistical law that characterizes the temporal evolution of users' egocentric networks. We encode this observation in a reinforcement process defining a time-varying network model that exhibits the emergence of strong and weak ties. We study the effect of time-varying and heterogeneous interactions on the classic rumour spreading model in both synthetic and real-world networks. We observe that strong ties severely inhibit information diffusion by confining the spreading process among agents with recurrent communication patterns. This provides the counterintuitive evidence that strong ties may have a negative role in the spreading of information across networks.
Submitted 17 February, 2014; v1 submitted 24 March, 2013;
originally announced March 2013.
-
Characterizing scientific production and consumption in Physics
Authors:
Qian Zhang,
Nicola Perra,
Bruno Goncalves,
Fabio Ciulla,
Alessandro Vespignani
Abstract:
We analyze the entire publication database of the American Physical Society, generating longitudinal (50 years) citation networks geolocalized at the level of single urban areas. We define the knowledge diffusion proxy and scientific production ranking algorithms to capture the spatio-temporal dynamics of Physics knowledge worldwide. By using the knowledge diffusion proxy we identify the key cities in the production and consumption of knowledge in Physics as a function of time. The results from the scientific production ranking algorithm allow us to characterize the top cities for scholarly research in Physics. Although we focus on a single dataset concerning a specific field, the methodology presented here opens the path to comparative studies of the dynamics of knowledge across disciplines and research areas.
Submitted 26 February, 2013;
originally announced February 2013.
-
The Role of Information Diffusion in the Evolution of Social Networks
Authors:
Lilian Weng,
Jacob Ratkiewicz,
Nicola Perra,
Bruno Gonçalves,
Carlos Castillo,
Francesco Bonchi,
Rossano Schifanella,
Filippo Menczer,
Alessandro Flammini
Abstract:
Every day millions of users are connected through online social networks, generating a rich trove of data that allows us to study the mechanisms behind human interactions. Triadic closure has been treated as the major mechanism for creating social links: if Alice follows Bob and Bob follows Charlie, Alice will follow Charlie. Here we present an analysis of longitudinal micro-blogging data, revealing a more nuanced view of the strategies employed by users when expanding their social circles. While the network structure affects the spread of information among users, the network is in turn shaped by this communication activity. This suggests a link creation mechanism whereby Alice is more likely to follow Charlie after seeing many messages by Charlie. We characterize users with a set of parameters associated with different link creation strategies, estimated by a Maximum-Likelihood approach. Triadic closure does have a strong effect on link formation, but shortcuts based on traffic are another key factor in interpreting network evolution. However, individual strategies for following other users are highly heterogeneous. Link creation behaviors can be summarized by classifying users in different categories with distinct structural and behavioral characteristics. Users who are popular, active, and influential tend to create traffic-based shortcuts, making the information diffusion process more efficient in the network.
Submitted 20 June, 2013; v1 submitted 25 February, 2013;
originally announced February 2013.
-
The Twitter of Babel: Mapping World Languages through Microblogging Platforms
Authors:
Delia Mocanu,
Andrea Baronchelli,
Bruno Gonçalves,
Nicola Perra,
Alessandro Vespignani
Abstract:
Large scale analysis and statistics of socio-technical systems that just a few short years ago would have required the use of consistent economic and human resources can nowadays be conveniently performed by mining the enormous amount of digital data produced by human activities. Although a characterization of several aspects of our societies is emerging from the data revolution, a number of questions concerning the reliability and the biases inherent to the big data "proxies" of social life are still open. Here, we survey worldwide linguistic indicators and trends through the analysis of a large-scale dataset of microblogging posts. We show that available data allow for the study of language geography at scales ranging from country-level aggregation to specific city neighborhoods. The high resolution and coverage of the data allow us to investigate different indicators such as the linguistic homogeneity of different countries, the touristic seasonal patterns within countries and the geographical distribution of different languages in multilingual regions. This work highlights the potential of geolocalized studies of open data sources to improve current analysis and develop indicators for major social phenomena in specific communities.
Submitted 20 December, 2012;
originally announced December 2012.
-
Quantifying the effect of temporal resolution on time-varying networks
Authors:
Bruno Ribeiro,
Nicola Perra,
Andrea Baronchelli
Abstract:
Time-varying networks describe a wide array of systems whose constituents and interactions evolve over time. They are defined by an ordered stream of interactions between nodes, yet they are often represented in terms of a sequence of static networks, each aggregating all edges and nodes present in a time interval of size Δt. In this work we quantify the impact of an arbitrary Δt on the description of a dynamical process taking place upon a time-varying network. We focus on the elementary random walk, and put forth a simple mathematical framework that well describes the behavior observed on real datasets. The analytical description of the bias introduced by time integrating techniques represents a step forward in the correct characterization of dynamical processes on time-varying graphs.
Submitted 22 October, 2013; v1 submitted 29 November, 2012;
originally announced November 2012.
-
Contagion dynamics in time-varying metapopulation networks
Authors:
Suyu Liu,
Andrea Baronchelli,
Nicola Perra
Abstract:
The metapopulation framework is adopted in a wide array of disciplines to describe systems of well separated yet connected subpopulations. The subgroups or patches are often represented as nodes in a network whose links represent the migration routes among them. The connections have been so far mostly considered as static, but in general evolve in time. Here we address this case by investigating simple contagion processes on time-varying metapopulation networks. We focus on the SIR process and determine analytically the mobility threshold for the onset of an epidemic spreading in the framework of activity-driven network models. We find profound differences from the case of static networks. The threshold is entirely described by the dynamical parameters defining the average number of instantaneously migrating individuals and does not depend on the properties of the static network representation. Remarkably, the diffusion and contagion processes are slower in time-varying graphs than in their aggregated static counterparts, the mobility threshold being even two orders of magnitude larger in the first case. The presented results confirm the importance of considering the time-varying nature of complex networks.
Submitted 26 March, 2013; v1 submitted 9 October, 2012;
originally announced October 2012.
-
Beating the news using Social Media: the case study of American Idol
Authors:
Fabio Ciulla,
Delia Mocanu,
Andrea Baronchelli,
Bruno Gonçalves,
Nicola Perra,
Alessandro Vespignani
Abstract:
We present a contribution to the debate on the predictability of social events using big data analytics. We focus on the elimination of contestants in the American Idol TV shows as an example of a well defined electoral phenomenon that each week draws millions of votes in the USA. We provide evidence that Twitter activity during the time span defined by the TV show airing and the voting period following it correlates with the contestants' ranking and allows the anticipation of the voting outcome. Furthermore, the fraction of Tweets that contain geolocation information allows us to map the fanbase of each contestant, both within the US and abroad, showing that strong regional polarizations occur. Although American Idol voting is just a minimal and simplified version of complex societal phenomena such as political elections, this work shows that the volume of information available in online systems permits the real time gathering of quantitative indicators anticipating the future unfolding of opinion formation events.
Submitted 23 May, 2012; v1 submitted 20 May, 2012;
originally announced May 2012.
-
Activity driven modeling of time varying networks
Authors:
Nicola Perra,
Bruno Gonçalves,
Romualdo Pastor-Satorras,
Alessandro Vespignani
Abstract:
Network modeling plays a critical role in identifying statistical regularities and structural principles common to many systems. The large majority of recent modeling approaches are connectivity driven. The structural patterns of the network are at the basis of the mechanisms ruling the network formation. Connectivity driven models necessarily provide a time-aggregated representation that may fail to describe the instantaneous and fluctuating dynamics of many networks. We address this challenge by defining the activity potential, a time invariant function characterizing the agents' interactions and constructing an activity driven model capable of encoding the instantaneous time description of the network dynamics. The model provides an explanation of structural features such as the presence of hubs, which simply originate from the heterogeneous activity of agents. Within this framework, highly dynamical networks can be described analytically, allowing a quantitative discussion of the biases induced by the time-aggregated representations in the analysis of dynamical processes.
Submitted 26 June, 2012; v1 submitted 23 March, 2012;
originally announced March 2012.
-
Validation of Dunbar's number in Twitter conversations
Authors:
Bruno Goncalves,
Nicola Perra,
Alessandro Vespignani
Abstract:
Modern society's increasing dependency on online tools for both work and recreation opens up unique opportunities for the study of social interactions. A large survey of online exchanges or conversations on Twitter, collected across six months and involving 1.7 million individuals, is presented here. We test the theoretical cognitive limit on the number of stable social relationships known as Dunbar's number. We find that users can entertain a maximum of 100-200 stable relationships, in support of Dunbar's prediction. The "economy of attention" is limited in the online world by cognitive and biological constraints as predicted by Dunbar's theory. Inspired by this empirical evidence we propose a simple dynamical mechanism, based on finite priority queuing and time resources, that reproduces the observed social behavior.
Submitted 28 May, 2011; v1 submitted 25 May, 2011;
originally announced May 2011.
-
Schroedinger-like PageRank equation and localization in the WWW
Authors:
Nicola Perra,
Vinko Zlatic,
Alessandro Chessa,
Claudio Conti,
Debora Donato,
Guido Caldarelli
Abstract:
The World Wide Web is one of the most important communication systems we use in our everyday life. Despite its central role, the growth and the development of the WWW is not controlled by any central authority. This situation has created a huge ensemble of connections whose complexity can be fruitfully described and quantified by network theory. One important application that allows one to sort out the information present in these connections is given by the PageRank algorithm. Computation of this quantity is usually made iteratively with a large use of computational time. In this paper we show that the PageRank can be expressed in terms of a wave function obeying a Schroedinger-like equation. In particular, the topological disorder given by the imbalance of outgoing and ingoing links between pages induces wave function and potential structuring. This allows one to directly localize the pages with the largest score. Through this new representation we can now compute the PageRank without iterative techniques. For most of the cases of interest our method is faster than the original one. Our results also clarify the role of topology in the diffusion of information within complex networks. The whole approach opens the possibility of novel techniques inspired by quantum physics for the analysis of the WWW properties.
Submitted 28 July, 2008;
originally announced July 2008.