-
CoDiNG -- Naming Game with Continuous Latent State of Agents
Authors:
Mateusz Nurek,
Joanna Kołaczek,
Radosław Michalski,
Bolesław K. Szymański,
Omar Lizardo
Abstract:
Understanding the mechanisms behind opinion formation is crucial for gaining insight into the processes that shape political beliefs, cultural attitudes, consumer choices, and social movements. This work aims to explore a nuanced model that captures the intricacies of real-world opinion dynamics by synthesizing principles from cognitive science and employing social network analysis. The proposed model is a hybrid continuous-discrete extension of the well-known Naming Game opinion model. The added latent continuous layer of opinion strength follows cognitive processes in the human brain, akin to memory imprints. The discrete layer allows for the conversion of intrinsic continuous opinion into discrete form, which often occurs when we publicly verbalize our opinions. We evaluated our model using real data as ground truth and demonstrated that the proposed mechanism outperforms the classic Naming Game model in many cases, reflecting that our model is closer to the real process of opinion formation.
Submitted 27 June, 2024;
originally announced June 2024.
-
Limits of Large Language Models in Debating Humans
Authors:
James Flamino,
Mohammed Shahid Modi,
Boleslaw K. Szymanski,
Brendan Cross,
Colton Mikolajczyk
Abstract:
Large Language Models (LLMs) have shown remarkable promise in their ability to interact proficiently with humans. Consequently, their potential use as artificial confederates and surrogates in sociological experiments involving conversation is an exciting prospect. But how viable is this idea? This paper endeavors to test the limits of current-day LLMs with a pre-registered study integrating real people with LLM agents acting as people. The study focuses on debate-based opinion consensus formation in three environments: humans only, agents and humans, and agents only. Our goal is to understand how LLM agents influence humans, and how capable they are of debating like humans. We find that LLMs can blend in and facilitate human productivity but are less convincing in debate, with their behavior ultimately deviating from that of humans. We elucidate these primary failings and anticipate that LLMs must evolve further before being viable debaters.
Submitted 5 February, 2024;
originally announced February 2024.
-
Analyzing Trendy Twitter Hashtags in the 2022 French Election
Authors:
Aamir Mandviwalla,
Lake Yin,
Boleslaw K. Szymanski
Abstract:
Regressions trained to predict the future activity of social media users need rich features for accurate predictions. Many advanced models exist to generate such features; however, the time complexities of their computations are often prohibitive when they run on enormous data-sets. Some studies have shown that simple semantic network features can be rich enough to use for regressions without requiring complex computations. We propose a method for using semantic networks as user-level features for machine learning tasks. We conducted an experiment using a semantic network of 1037 Twitter hashtags from a corpus of 3.7 million tweets related to the 2022 French presidential election. A bipartite graph is formed where hashtags are nodes and weighted edges connect the hashtags, reflecting the number of Twitter users that interacted with both hashtags. The graph is then transformed into a maximum-spanning tree with the most popular hashtag as its root node to construct a hierarchy amongst the hashtags. We then provide a vector feature for each user based on this tree. To validate the usefulness of our semantic feature, we performed a regression experiment to predict the response rate of each user with six emotions, such as anger, enjoyment, or disgust. Our semantic feature performs well in the regression, with most emotions having $R^2$ above 0.5. These results suggest that our semantic feature could be considered for use in further experiments predicting social media response on big data-sets.
Submitted 28 February, 2024; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Dynamics of Ideological Biases of Social Media Users
Authors:
Mohammed Shahid Modi,
James Flamino,
Boleslaw K. Szymanski
Abstract:
For centuries, humanity has perfected skills of interpersonal interaction and evolved patterns that enable people to detect lies and deceptive behavior of others in face-to-face settings. Unprecedented growth of people's access to mobile phones and social media raises an important question: How does this new technology influence people's interactions and support the use of traditional patterns? In this paper, we answer this question for homophily-driven patterns in social media. In our previous studies, we found that, on a university campus, changes in student opinions were driven by the desire to hold popular opinions. Here, we demonstrate that the evolution of online platform-wide opinion groups is driven by the same desire. We focus on two social media platforms, Twitter and Parler, on which we tracked the political biases of their users. On Parler, an initially stable group of right-biased users evolved into a permanent right-leaning echo chamber dominating weaker, transient groups of members with opposing political biases. In contrast, on Twitter, the initial presence of two large opposing bias groups led to the evolution of a bimodal bias distribution, with a high degree of polarization. We capture the movement of users from the initial to final bias groups during the tracking period. We also show that user choices are influenced by side-effects of homophily. Users entering the platform attempt to find a sufficiently large group whose members hold a political bias sufficiently close to the new user's bias. If successful, they stabilize their bias and become permanent members of the group. Otherwise, they leave the platform. We believe that the dynamics of users uncovered in this paper create a foundation for technical solutions supporting social groups on social media and socially aware networks.
Submitted 27 September, 2023;
originally announced September 2023.
-
Generalized dimension reduction approach for heterogeneous networked systems with time-delay
Authors:
Cheng Ma,
Gyorgy Korniss,
Boleslaw K. Szymanski,
Jianxi Gao
Abstract:
Networks of interconnected agents are essential to study complex networked systems' state evolution, stability, resilience, and control. Nevertheless, the high dimensionality and nonlinear dynamics are vital factors preventing us from theoretically analyzing them. Recently, dimension-reduction approaches have reduced the system's size by mapping the original system to a one-dimensional system such that only one effective representative can capture its macroscopic dynamics. However, these approaches fail dramatically as the network becomes heterogeneous and has multiple community structures. Here, we bridge the gap by developing a generalized dimension reduction approach, which enables us to map the original system to an $m$-dimensional system that consists of $m$ interacting components. Notably, by validating it on various dynamical models, we show that this approach accurately predicts the original system state and the tipping point, if any. Furthermore, the numerical results demonstrate that this approach approximates the system evolution and identifies the critical points for complex networks with time delay.
Submitted 22 August, 2023;
originally announced August 2023.
-
Divide-and-rule policy in the Naming Game
Authors:
Cheng Ma,
Gyorgy Korniss,
Boleslaw K. Szymanski
Abstract:
The Naming Game is a classic model for studying the emergence and evolution of language in a population. In this paper, we consider the Naming Game with multiple committed opinions and investigate the dynamics of the game on a complete graph with an arbitrarily large population. The homogeneous mixing condition enables us to use mean-field theory to analyze the opinion evolution of the system. However, when the number of opinions increases, the number of variables describing the system grows exponentially. We focus on a special scenario where the largest group of committed agents competes with a motley of committed groups, each of which is significantly smaller than the largest one, while the majority of uncommitted agents initially hold one unique opinion. We choose this scenario for two reasons. The first is that it has arisen many times in different societies, while the second is that its complexity can be reduced by merging all agents of small committed groups into a single committed group. We show that the phase transition occurs when the group of the largest committed fraction dominates the system, and the threshold for the size of the dominant group at which this transition occurs depends on the size of the committed group of the unified category. Further, we derive the general formula for the multi-opinion evolution using a recursive approach. Finally, we use agent-based simulations to reveal the opinion evolution in random graphs. Our results provide insights into the conditions under which the dominant opinion emerges in a population and the factors that influence this process.
Submitted 28 June, 2023;
originally announced June 2023.
-
Modeling Memory Imprints Induced by Interactions in Social Networks
Authors:
James Flamino,
Ross DeVito,
Omar Lizardo,
Boleslaw K. Szymanski
Abstract:
Memory imprints of the significance of relationships are constantly evolving. They are boosted by social interactions among people involved in relationships, and decay between such events, causing the relationships to change. Despite the importance of the evolution of relationships in social networks, there is little work exploring how interactions over extended periods correlate with people's memory imprints of relationship importance. In this paper, we represent memory dynamics by adapting a well-known cognitive science model. Using two unique longitudinal datasets, we fit the model's parameters to maximize agreement of the memory imprints of relationship strengths of a node predicted from call detail records with the ground-truth list of relationships of this node ordered by their strength. We find that this model, trained on one population, predicts well not only on this population but also on a different one, suggesting the universality of memory imprints of social interactions among unrelated individuals. This paper lays the foundation for studying the modeling of social interactions as memory imprints, and its potential use as an unobtrusive tool for early detection of individuals with memory malfunctions.
Submitted 31 January, 2023; v1 submitted 6 October, 2022;
originally announced October 2022.
-
Resource-Mediated Consensus Formation
Authors:
Omar Malik,
James Flamino,
Boleslaw K. Szymanski
Abstract:
In social sciences, simulating opinion dynamics to study the interplay between homophily and influence, and the subsequent formation of echo chambers, is of great importance. As such, in this paper we investigate echo chambers by implementing a unique social game in which we spawn a large number of agents, each assigned one of the two opinions on an issue and a finite amount of influence in the form of a game currency. Agents attempt to hold the majority opinion at the end of the game, to obtain a reward also paid in the game currency. At the beginning of each round, an agent is selected at random and referred to as the speaker. A second agent is selected within the radius of speaker influence (which is a set subset of the speaker's neighbors) to interact with the speaker as a listener. In this interaction, the speaker proposes a payoff in the game currency from their personal influence budget to persuade the listener to hold the speaker's opinion in future rounds, until the listener is chosen as a listener again. The listener can either accept or reject this payoff. The listener's choice is informed only by their estimate of the global majority opinion through a limited view of the opinions of their neighboring agents. We show that the influence game leads to the formation of "echo chambers," or homogeneous clusters of opinions. We also investigate various scenarios to disrupt the creation of such echo chambers, including the introduction of resource disparity between agents with different opinions, initially preferentially assigning opinions to agents, and the introduction of committed agents, who never change their initial opinion.
Submitted 14 June, 2022;
originally announced June 2022.
-
Temporal Network Epistemology: on Reaching Consensus in Real World Setting
Authors:
Radosław Michalski,
Damian Serwata,
Mateusz Nurek,
Boleslaw K. Szymanski,
Przemysław Kazienko,
Tao Jia
Abstract:
This work develops the concept of a temporal network epistemology model enabling the simulation of the learning process in dynamic networks. The results of the research, conducted on the temporal social network generated using the CogSNet model and on static topologies as a reference, indicate a significant influence of the network's temporal dynamics on the outcome and flow of the learning process. It has been shown that not only are the dynamics of reaching consensus different compared to baseline models, but also that previously unobserved phenomena appear, such as uninformed agents or different consensus states for disconnected components. It has also been observed that sometimes a change of the network structure alone can contribute to reaching consensus. The introduced approach and the experimental results can be used to better understand how human communities collectively solve complex problems at the scientific level, and to inquire into the correctness of less complex but common and equally important beliefs spreading across entire societies.
Submitted 10 March, 2022;
originally announced March 2022.
-
Shifting Polarization and Twitter News Influencers between two U.S. Presidential Elections
Authors:
James Flamino,
Alessandro Galeazzi,
Stuart Feldman,
Michael W. Macy,
Brendan Cross,
Zhenkun Zhou,
Matteo Serafino,
Alexandre Bovet,
Hernan A. Makse,
Boleslaw K. Szymanski
Abstract:
Social media are decentralized, interactive, and transformative, empowering users to produce and spread information to influence others. This has changed the dynamics of political communication that were previously dominated by traditional corporate news media. Having hundreds of millions of tweets collected over the 2016 and 2020 U.S. presidential elections gave us a unique opportunity to measure the change in polarization and the diffusion of political information. We analyze the diffusion of political information among Twitter users and investigate the change of polarization between these elections and how this change affected the composition and polarization of influencers and their retweeters. We identify "influencers" by their ability to spread information and classify them into those affiliated with a media organization, a political organization, or unaffiliated. Most of the top influencers were affiliated with media organizations during both elections. We found a clear increase from 2016 to 2020 in polarization among influencers and among those whom they influence. Moreover, 75% of the top influencers in 2020 were not present in 2016, demonstrating that such status is difficult to retain. Between 2016 and 2020, 10% of influencers affiliated with media were replaced by center- or right-orientated influencers affiliated with political organizations and unaffiliated influencers.
Submitted 3 November, 2021;
originally announced November 2021.
-
Become a better you: correlation between the change of research direction and the change of scientific performance
Authors:
Xiaoyao Yu,
Boleslaw K. Szymanski,
Tao Jia
Abstract:
It is important to explore how scientists decide their research agenda and the corresponding consequences, as their decisions collectively shape contemporary science. There are studies focusing on the overall performance of individuals with different problem-choosing strategies. Here we ask a slightly different but relatively unexplored question: how is a scientist's change of research agenda associated with her change of scientific performance? Using publication records of over 14,000 authors in physics, we quantitatively measure the extent of research direction change and the performance change of individuals. We identify a strong positive correlation between the direction change and impact change. Scientists with a larger direction change not only are more likely to produce works with increased scientific impact compared to their past ones, but also have a higher growth rate of scientific impact. On the other hand, the direction change is not associated with productivity change. Those who stay in familiar topics do not publish faster than those who venture out and establish themselves in a new field. The gauge of research direction in this work is uncorrelated with the diversity of research agenda and the switching probability among topics, capturing the evolution of individual careers from a new point of view. Though the finding is inevitably affected by survival bias, it sheds light on a range of problems in the career development of individual scientists.
Submitted 2 July, 2021;
originally announced July 2021.
-
Optimizing Edge Sets in Networks to Produce Ground Truth Communities Based on Modularity
Authors:
Daniel Kosmas,
John E. Mitchell,
Thomas C. Sharkey,
Boleslaw K. Szymanski
Abstract:
We consider two new problems regarding the impact of edge addition or removal on the modularity of partitions (or community structures) in a network. The first problem seeks to add edges to enforce that a desired partition is the partition that maximizes modularity. The second problem seeks to find the sparsest representation of a network that has the same partition with maximum modularity as the original network. We present integer programming formulations, a row generation algorithm, and heuristic algorithms to solve these problems. Further, we demonstrate a counter-intuitive behavior of modularity that makes the development of heuristics for general networks difficult. We then present results on a selection of social and illicit networks from the literature.
Submitted 3 March, 2022; v1 submitted 16 March, 2021;
originally announced March 2021.
-
An Algorithm for Reconstructing the Orphan Stream Progenitor with MilkyWay@home Volunteer Computing
Authors:
Siddhartha Shelton,
Heidi Jo Newberg,
Jake Weiss,
Jacob S. Bauer,
Matthew Arsenault,
Larry Widrow,
Clayton Rayment,
Travis Desell,
Roland Judd,
Malik Magdon-Ismail,
Eric Mendelsohn,
Matthew Newby,
Colin Rice,
Boleslaw K. Szymanski,
Jeffery M. Thompson,
Carlos Varela,
Benjamin Willett,
Steve Ulin,
Lee Newberg
Abstract:
We have developed a method for estimating the properties of the progenitor dwarf galaxy from the tidal stream of stars that were ripped from it as it fell into the Milky Way. In particular, we show that the mass and radial profile of a progenitor dwarf galaxy evolved along the orbit of the Orphan Stream, including the stellar and dark matter components, can be reconstructed from the distribution of stars in the tidal stream it produced. We use MilkyWay@home, a PetaFLOPS-scale distributed supercomputer, to optimize our dwarf galaxy parameters until we arrive at best-fit parameters. The algorithm fits the dark matter mass, dark matter radius, stellar mass, radial profile of stars, and orbital time. The parameters are recovered even though the dark matter component extends well past the half light radius of the dwarf galaxy progenitor, proving that we are able to extract information about the dark matter halos of dwarf galaxies from the tidal debris. Our simulations assumed that the Milky Way potential, dwarf galaxy orbit, and the form of the density model for the dwarf galaxy were known exactly; more work is required to evaluate the sources of systematic error in fitting real data. This method can be used to estimate the dark matter content in dwarf galaxies without the assumption of virial equilibrium that is required to estimate the mass using line-of-sight velocities. This demonstration is a first step towards building an infrastructure that will fit the Milky Way potential using multiple tidal streams.
Submitted 14 February, 2021;
originally announced February 2021.
-
A Machine Learning Approach to Predicting Continuous Tie Strengths
Authors:
James Flamino,
Ross DeVito,
Boleslaw K. Szymanski,
Omar Lizardo
Abstract:
Relationships between people constantly evolve, altering interpersonal behavior and defining social groups. Relationships between nodes in social networks can be represented by a tie strength, often empirically assessed using surveys. While this is effective for taking static snapshots of relationships, such methods are difficult to scale to dynamic networks. In this paper, we propose a system that allows for the continuous approximation of relationships as they evolve over time. We evaluate this system using the NetSense study, which provides comprehensive communication records of students at the University of Notre Dame over the course of four years. These records are complemented by semesterly ego network surveys, which provide discrete samples over time of each participant's true social tie strength with others. We develop a pair of powerful machine learning models (complemented by a suite of baselines extracted from past works) that learn from these surveys to interpret the communications records as signals. These signals represent dynamic tie strengths, accurately recording the evolution of relationships between the individuals in our social networks. With these evolving tie values, we are able to make several empirically derived observations which we compare to past works.
Submitted 23 January, 2021;
originally announced January 2021.
-
Learning Parameters for Balanced Index Influence Maximization
Authors:
Manqing Ma,
Gyorgy Korniss,
Boleslaw K. Szymanski
Abstract:
Influence maximization is the task of finding the smallest set of nodes whose activation in a social network can trigger an activation cascade that reaches the targeted network coverage, where threshold rules determine the outcome of influence. This problem is NP-hard and it has generated a significant amount of recent research on finding efficient heuristics. We focus on a Balance Index (BI) algorithm that relies on three parameters to tune its performance to the given network structure. We propose using a supervised machine-learning approach for such tuning. We select the most influential graph features for the parameter tuning. Then, using random-walk-based graph-sampling, we create small snapshots from the given synthetic and large-scale real-world networks. Using exhaustive search, we find for these snapshots high-accuracy values of the BI parameters to use as a ground truth. Then, we train our machine-learning model on the snapshots and apply this model to the real-world network to find the best BI parameters. We apply these parameters to the sampled real-world network to measure the quality of the sets of initiators found this way. We use various real-world networks to validate our approach against other heuristics.
Submitted 14 December, 2020;
originally announced December 2020.
-
Optimizing sensors placement in complex networks for localization of hidden signal source: A review
Authors:
Robert Paluch,
Łukasz G. Gajewski,
Janusz A. Hołyst,
Boleslaw K. Szymanski
Abstract:
As the world becomes more and more interconnected, our everyday objects become part of the Internet of Things, and our lives get more and more mirrored in virtual reality, where every piece of information, including misinformation, fake news and malware, can spread very fast and practically anonymously. To suppress such uncontrolled spread, efficient computer systems and algorithms capable of tracking down such malicious information spread have to be developed. Currently, the most effective methods for source localization are based on sensors which provide the times at which they detect the spread. We investigate the problem of the optimal placement of such sensors in complex networks and propose a new graph measure, called Collective Betweenness, which we compare against four other metrics. Extensive numerical tests are performed on different types of complex networks over wide ranges of sensor densities and signal stochasticities. In these tests, we discovered a clear difference in the comparative performance of the investigated optimal placement methods between real or scale-free synthetic networks and narrow degree distribution networks. The former have a clear region for any given method's dominance, in contrast to the latter, where the performance maps are less homogeneous. We find that while choosing the best method is very network and spread dependent, there are two methods that consistently stand out. High Variance Observers seem to do very well for spread with low stochasticity, whereas Collective Betweenness, introduced in this paper, thrives when the spread is highly unpredictable.
Submitted 3 December, 2020;
originally announced December 2020.
-
Universality of noise-induced resilience restoration in spatially-extended ecological systems
Authors:
Cheng Ma,
Gyorgy Korniss,
Boleslaw K. Szymanski,
Jianxi Gao
Abstract:
Many systems may switch to an undesired state due to internal failures or external perturbations, of which critical transitions toward degraded ecosystem states are a prominent example. Resilience restoration focuses on the ability of spatially-extended systems and the required time to recover to their desired states under stochastic environmental conditions. While mean-field approaches may guide recovery strategies by indicating the conditions needed to destabilize undesired states, these approaches do not accurately capture the transition process toward the desired state of spatially-extended systems in stochastic environments. The difficulty is rooted in the lack of mathematical tools to analyze systems with high dimensionality, nonlinearity, and stochastic effects. We bridge this gap by developing new mathematical tools that employ nucleation theory in spatially-embedded systems to advance resilience restoration. We examine our approach on systems following mutualistic dynamics and diffusion models, finding that systems may exhibit single-cluster or multi-cluster phases depending on their sizes and noise strengths, and also construct a new scaling law governing the restoration time for arbitrary system size and noise strength in two-dimensional systems. This approach is not limited to ecosystems and has applications in various dynamical systems, from biology to infrastructural systems.
Submitted 9 September, 2021; v1 submitted 23 November, 2020;
originally announced November 2020.
-
Heuristic assessment of the economic effects of pandemic control
Authors:
Xiang Niu,
Christopher Brissette,
Chunheng Jiang,
Jianxi Gao,
Gyorgy Korniss,
Boleslaw K. Szymanski
Abstract:
Data-driven risk networks describe many complex system dynamics arising in fields such as epidemiology and ecology. They lack explicit dynamics and have multiple sources of cost, both of which are beyond the current scope of traditional control theory. We construct the global risk network by combining the consensus of experts from the World Economic Forum with risk activation data to define its topology and interactions. Many of these risks, including extreme weather, pose significant economic costs when active. We introduce a method for converting network interaction data into continuous dynamics to which we apply optimal control. We contribute the first method for constructing and controlling risk network dynamics based on empirically collected data. We identify seven risks commonly used by governments to control COVID-19 spread and show that many alternative driver risk sets exist with potentially lower cost of control.
Submitted 27 October, 2020;
originally announced October 2020.
-
Network resilience
Authors:
Xueming Liu,
Daqing Li,
Manqing Ma,
Boleslaw K. Szymanski,
H Eugene Stanley,
Jianxi Gao
Abstract:
Many systems on our planet are known to shift abruptly and irreversibly from one state to another when they are forced across a "tipping point," such as mass extinctions in ecological networks, cascading failures in infrastructure systems, and social convention changes in human and animal networks. Such a regime shift demonstrates a system's resilience that characterizes the ability of a system to adjust its activity to retain its basic functionality in the face of internal disturbances or external environmental changes. In the past 50 years, attention was almost exclusively given to low dimensional systems and calibration of their resilience functions and indicators of early warning signals without considerations for the interactions between the components. Only in recent years, taking advantage of network theory and lavish real data sets, network scientists have directed their interest to the real-world complex networked multidimensional systems and their resilience function and early warning indicators. This report is devoted to a comprehensive review of resilience function and regime shift of complex systems in different domains, such as ecology, biology, social systems and infrastructure. We cover the related research about empirical observations, experimental studies, mathematical modeling, and theoretical analysis. We also discuss some ambiguous definitions, such as robustness, resilience, and stability.
Submitted 9 April, 2022; v1 submitted 26 July, 2020;
originally announced July 2020.
-
The Paradox of Information Access: Growing Isolation in the Age of Sharing
Authors:
Tarek Abdelzaher,
Heng Ji,
Jinyang Li,
Chaoqi Yang,
John Dellaverson,
Lixia Zhang,
Chao Xu,
Boleslaw K. Szymanski
Abstract:
Modern online media, such as Twitter, Instagram, and YouTube, enable anyone to become an information producer and to offer online content for potentially global consumption. By increasing the amount of globally accessible real-time information, today's ubiquitous producers contribute to a world, where an individual consumes vanishingly smaller fractions of all produced content. In general, consumers preferentially select information that closely matches their individual views and values. The bias inherent in such selection is further magnified by today's information curation services that maximize user engagement (and thus service revenue) by filtering new content in accordance with observed consumer preferences. Consequently, individuals get exposed to increasingly narrower bands of the ideology spectrum. Societies get fragmented into increasingly ideologically isolated enclaves. These enclaves (or echo-chambers) then become vulnerable to misinformation spread, which in turn further magnifies polarization and bias. We call this dynamic the paradox of information access; a growing ideological fragmentation in the age of sharing. This article describes the technical, economic, and socio-cognitive contributors to this paradox, and explores research directions towards its mitigation.
Submitted 4 April, 2020;
originally announced April 2020.
-
The Paradox of Information Access: On Modeling Social-Media-Induced Polarization
Authors:
Chao Xu,
Jinyang Li,
Tarek Abdelzaher,
Heng Ji,
Boleslaw K. Szymanski,
John Dellaverson
Abstract:
The paper develops a stochastic model of drift in human beliefs that shows that today's sheer volume of accessible information, combined with consumers' confirmation bias and natural preference to more outlying content, necessarily lead to increased polarization. The model explains the paradox of growing ideological fragmentation in the age of increased sharing. As social media, search engines, and other real-time information sharing outlets purport to facilitate access to information, a need for content filtering arises due to the ensuing information overload. In general, consumers select information that matches their individual views and values. The bias inherent in such selection is echoed by today's information curation services that maximize user engagement by filtering new content in accordance with observed consumer preferences. Consequently, individuals get exposed to increasingly narrower bands of the ideology spectrum, thus fragmenting society into increasingly ideologically isolated enclaves. We call this dynamic the paradox of information access. The model also suggests the disproportionate damage attainable with a small infusion of well-positioned misinformation. The paper describes the modeling methodology, and evaluates modeling results for different population sizes and parameter settings.
Submitted 15 January, 2021; v1 submitted 2 April, 2020;
originally announced April 2020.
-
On community structure in complex networks: challenges and opportunities
Authors:
Hocine Cherifi,
Gergely Palla,
Boleslaw K. Szymanski,
Xiaoyan Lu
Abstract:
Community structure is one of the most relevant features encountered in numerous real-world applications of networked systems. Despite the tremendous effort of scientists working on this subject over the past few decades to characterize, model, and analyze communities, more investigations are needed to better understand the impact of community structure and its dynamics on networked systems. Here, we first focus on generative models of communities in complex networks and their role in developing a strong foundation for community detection algorithms. We discuss modularity and the use of modularity maximization as the basis for community detection. Then, we overview the Stochastic Block Model, its different variants, and inference of community structures from such models. Next, we focus on time evolving networks, where existing nodes and links can disappear and/or new nodes and links may be introduced. The extraction of communities under such circumstances poses an interesting and non-trivial problem that has gained considerable interest over the last decade. We briefly discuss considerable advances made in this field recently. Finally, we focus on immunization strategies essential for targeting the influential spreaders of epidemics in modular networks. Their main goal is to select and immunize a small proportion of individuals from the whole network to control the diffusion process. Various strategies have emerged over the years suggesting different ways to immunize nodes in networks with overlapping and non-overlapping community structure. We first discuss stochastic strategies that require little or no information about the network topology at the expense of their performance. Then, we introduce deterministic strategies that have proven to be very efficient in controlling the epidemic outbreaks, but require complete knowledge of the network.
Submitted 6 November, 2019; v1 submitted 13 August, 2019;
originally announced August 2019.
-
Supervised Learning of the Global Risk Network Activation from Media Event Reports
Authors:
Xiang Niu,
Gyorgy Korniss,
Boleslaw K. Szymanski
Abstract:
The World Economic Forum (WEF) publishes annual reports on global risks which have a high impact on the world's economy. Currently, many researchers analyze the modeling and evolution of risks. However, few studies focus on validation of the global risk networks published by the WEF. In this paper, we first create a risk knowledge graph from the annotated risk events crawled from Wikipedia. Then, we compare the relational dependencies of risks in the WEF and Wikipedia networks, and find that they share over 50% of their edges. Moreover, the edges unique to each network signify the different perspectives of the experts and the public on global risks. To reduce the cost of manual annotation of events triggering risk activation, we build an auto-detection tool which filters out over 80% of media-reported events unrelated to the global risks. In the process of filtering, our tool also continuously learns keywords relevant to global risks from the event sentences. Using locations of events extracted from the risk knowledge graph, we find characteristics of geographical distributions of the categories of global risks.
Submitted 6 August, 2019; v1 submitted 31 July, 2019;
originally announced August 2019.
-
Modeling competitive evolution of multiple languages
Authors:
Zejie Zhou,
Boleslaw K. Szymanski,
Jianxi Gao
Abstract:
Increasing evidence demonstrates that in many places language coexistence has become ubiquitous and essential for supporting language and cultural diversity and associated with its financial and economic benefits. The competitive evolution among multiple languages determines the evolution outcome, either coexistence, decline, or extinction. Here, we extend the Abrams-Strogatz model of language competition to multiple languages and then validate it by analyzing the behavioral transitions of language usage over the recent several decades in Singapore and Hong Kong. In each case, we estimate from data the model parameters that measure each language utility for its speakers and the strength of two biases, the majority preference for their language, and the minority aversion to it. The values of these two biases decide which language is the fastest growing in the competition and what would be the stable state of the system. We also study the system convergence time to stable states and discover the existence of tipping points with multiple attractors. Moreover, the critical slowdown of convergence to the stable fractions of language users appears near and peaks at the tipping points, signaling when the system approaches them. Our analysis furthers our understanding of multiple language evolution and the role of tipping points in behavioral transitions. These insights may help to protect languages from extinction and retain the language and cultural diversity.
Submitted 16 July, 2019;
originally announced July 2019.
-
The evolution of polarization in the legislative branch of government
Authors:
Xiaoyan Lu,
Jianxi Gao,
Boleslaw K. Szymanski
Abstract:
The polarization of political opinions among members of the U.S. legislative chambers measured by their voting records is greater today than it was thirty years ago. Previous research efforts to find causes of such increase have suggested diverse contributors, like growth of online media, echo chamber effects, media biases, or disinformation propagation. Yet, we lack theoretic tools to understand, quantify, and predict the emergence of high political polarization among voters and their legislators. Here, we analyze millions of roll-call votes cast in the U.S. Congress over the past six decades. Our analysis reveals the critical change of polarization patterns that started at the end of 1980's. In earlier decades, polarization within each Congress tended to decrease with time. In contrast, in the recent decades, the polarization has been likely to grow within each term. To shed light on the reasons for this change, we introduce here a formal model for competitive dynamics to quantify the evolution of polarization patterns in the legislative branch of the U.S. government. Our model represents dynamics of polarization, enabling us to successfully predict the direction of polarization changes in 28 out of 30 U.S. Congresses elected in the past six decades. From the evolution of polarization level as measured by the Rice index, our model extracts a hidden parameter - polarization utility which determines the convergence point of the polarization evolution. The increase in the polarization utility implied by the model strongly correlates with two current trends: growing polarization of voters and increasing influence of election campaign funders. Two largest peaks of the model's polarization utility correlate with significant political or legislative changes happening at the same time.
Submitted 20 April, 2019;
originally announced April 2019.
-
A partial knowledge of friends of friends speeds social search
Authors:
Amr Elsisy,
Boleslaw K. Szymanski,
Jasmine A. Plum,
Miao Qi,
Alex Pentland
Abstract:
Milgram empirically showed that people knowing only connections to their friends could locate any person in the U.S. in a few steps. Later research showed that social network topology enables a node aware of its full routing to find an arbitrary target in even fewer steps. Yet, the success of people in forwarding efficiently knowing only personal connections is still not fully explained. To study this problem, we emulate it on a real location-based social network, Gowalla. It provides explicit information about friends and temporal locations of each user useful for studies of human mobility. Here, we use it to conduct a massive computational experiment to establish new necessary and sufficient conditions for achieving social search efficiency. The results demonstrate that only the distribution of friendship edges and the partial knowledge of friends of friends are essential and sufficient for the efficiency of social search. Surprisingly, the efficiency of the search using the original distribution of friendship edges is not dependent on how the nodes are distributed into space. Moreover, the effect of using a limited knowledge that each node possesses about friends of its friends is strongly nonlinear. We show that gains of such use grow statistically significantly only when this knowledge is limited to a small fraction of friends of friends.
Submitted 20 August, 2021; v1 submitted 13 April, 2019;
originally announced April 2019.
-
Predicting complex user behavior from CDR based social networks
Authors:
Casey Doyle,
Zala Herga,
Stephen Dipple,
Boleslaw K. Szymanski,
Gyorgy Korniss,
Dunja Mladenic
Abstract:
Call Detail Record (CDR) datasets provide enough information about personal interactions to support building and analyzing detailed empirical social networks. We take one such dataset and describe the various ways of using it to create a true social network in spite of the highly noisy data source. We use the resulting network to predict each individual's likelihood to default on payments for the network services, a complex behavior that involves a combination of social, economic, and legal considerations. We use a large number of features extracted from the network to build a model for predicting which users will default. By analyzing the relative contributions of features, we choose their best performing subsets ranging in size from small to medium. Features based on the number of close ties maintained by a user performed better than those derived from user's geographical location. The paper contributions include systematic impact analysis that the number of calls cutoff has on the properties of the network derived from CDR, and a methodology for building complex behavior models by creating very large sets of diverse features and systematically choosing those which perform best for the final model.
Submitted 10 June, 2019; v1 submitted 29 March, 2019;
originally announced March 2019.
-
Regularized Stochastic Block Model for robust community detection in complex networks
Authors:
Xiaoyan Lu,
Boleslaw K. Szymanski
Abstract:
The stochastic block model is able to generate different network partitions, ranging from traditional assortative communities to disassortative structures. Since the degree-corrected stochastic block model does not specify which mixing pattern is desired, the inference algorithms, which discover the most likely partition of the network's nodes, are likely to get trapped in the local optima of the log-likelihood. Here we introduce a new model constraining nodes' internal degree ratios in the objective function to stabilize the inference of block models from the observed network data. Given the regularized model, inference algorithms such as Markov chain Monte Carlo reliably find assortative or disassortative structure as directed by the value of a single parameter. We show experimentally that the inference of our proposed model quickly converges to the desired assortative or disassortative partition, while the inference of the degree-corrected stochastic block model often gets trapped at inferior locally optimal partitions when the traditional assortative community structure is not strong in the observed networks.
Submitted 27 March, 2019;
originally announced March 2019.
-
Asymptotic resolution bounds of generalized modularity and multi-scale community detection
Authors:
Xiaoyan Lu,
Brendan Cross,
Boleslaw K. Szymanski
Abstract:
The maximization of generalized modularity performs well on networks in which the members of all communities are statistically indistinguishable from each other. However, there is no theory bounding the maximization performance in more realistic networks where edges are heterogeneously distributed within and between communities. Using the random graph properties, we establish asymptotic theoretical bounds on the resolution parameter for which the generalized modularity maximization performs well. From this new perspective on the random graph model, we find the resolution limit of modularity maximization can be explained in a surprisingly simple and straightforward way. Given a network produced by the stochastic block models, the communities for which the resolution parameter is larger than their densities are likely to be spread among multiple clusters, while communities for which the resolution parameter is smaller than their background inter-community edge density will be merged into one large component. Therefore, no suitable resolution parameter exists when the intra-community edge density in a subgraph is lower than the inter-community edge density in some other subgraph. For such networks, we propose a progressive agglomerative heuristic algorithm to detect practically significant communities at multiple scales.
Submitted 15 April, 2020; v1 submitted 12 February, 2019;
originally announced February 2019.
-
Scalable prediction of global online media news virality
Authors:
Xiaoyan Lu,
Boleslaw K. Szymanski
Abstract:
News reports shape the public perception of the critical social, political and economic events around the world. Yet, the way in which emergent phenomena are reported in the news makes the early prediction of such phenomena a challenging task. We propose a scalable community-based probabilistic framework to model the spreading of news about events in online media. Our approach exploits the latent community structure in the global news media and uses the affiliation of the early adopters with a variety of communities to identify the events widely reported in the news at the early stage of their spread. The time complexity of our approach is linear in the number of news reports. It is also amenable to efficient parallelization. To demonstrate these features, the inference algorithm is parallelized for the message passing paradigm and tested on the RPI Advanced Multiprocessing Optimized System (AMOS), one of the fastest Blue Gene/Q supercomputers in the world. Thanks to the community-level features of the early adopters, the model gains an improvement of 20% in the early detection of the most massively reported events compared to the feature-based machine learning algorithm. Its parallelization scheme achieves orders of magnitude speedup.
Submitted 22 December, 2018;
originally announced December 2018.
-
Opinion Formation Threshold Estimates from Different Combinations of Social Media Data-Types
Authors:
Derrik E. Asher,
Justine Caylor,
Casey Doyle,
Alexis R. Neigel,
Gyorgy Korniss,
Boleslaw K. Szymanski
Abstract:
Passive consumption of a quantifiable amount of social media information related to a topic can cause individuals to form opinions. If a substantial amount of these individuals are motivated to take action from their recently established opinions, a movement or public opinion shift can be induced independent of the information's veracity. Given that social media is ubiquitous in modern society, it is imperative that we understand the threshold at which social media data results in opinion formation. The present study estimates population opinion formation thresholds by querying 2222 participants about the number of various social media data-types (i.e., images, videos, and/or messages) that they would need to passively consume to form opinions. Opinion formation is assessed across three dimensions, 1) data-type(s), 2) context, and 3) source. This work provides a theoretical basis for estimating the amount of data needed to influence a population through social media information.
Submitted 2 October, 2018;
originally announced October 2018.
-
Evolution of Threats in the Global Risk Network
Authors:
Xiang Niu,
Alaa Moussawi,
Gyorgy Korniss,
Boleslaw K. Szymanski
Abstract:
With a steadily growing population and rapid advancements in technology, the global economy is increasing in size and complexity. This growth exacerbates global vulnerabilities and may lead to unforeseen consequences such as global pandemics fueled by air travel, cyberspace attacks, and cascading failures caused by the weakest link in a supply chain. Hence, a quantitative understanding of the mechanisms driving global network vulnerabilities is urgently needed. Developing methods for efficiently monitoring evolution of the global economy is essential to such understanding. Each year the World Economic Forum publishes an authoritative report on the state of the global economy and identifies risks that are likely to be active, impactful or contagious. Using a Cascading Alternating Renewal Process approach to model the dynamics of the global risk network, we are able to answer critical questions regarding the evolution of this network. To fully trace the evolution of the network we analyze the asymptotic state of risks (risk levels which would be reached in the long term if the risks were left unabated) given a snapshot in time; this elucidates the various challenges faced by the world community at each point in time. We also investigate the influence exerted by each risk on others. Results presented here are obtained through either quantitative analysis or computational simulations.
Submitted 22 September, 2018;
originally announced September 2018.
-
Probing Limits of Information Spread with Sequential Seeding
Authors:
Jaroslaw Jankowski,
Boleslaw K. Szymanski,
Przemyslaw Kazienko,
Radoslaw Michalski,
Piotr Brodka
Abstract:
We consider here information spread which propagates with certain probability from nodes just activated to their not yet activated neighbors. Diffusion cascades can be triggered by activation of even a small set of nodes. Such activation is commonly performed in a single stage. A novel approach based on sequential seeding is analyzed here resulting in three fundamental contributions. First, we propose a coordinated execution of randomized choices to enable precise comparison of different algorithms in general. We apply it here when the newly activated nodes at each stage of spreading attempt to activate their neighbors. Then, we present a formal proof that sequential seeding delivers at least as large coverage as the single stage seeding does. Moreover, we also show that, under modest assumptions, sequential seeding achieves coverage provably better than the single stage based approach using the same number of seeds and node ranking. Finally, we present experimental results showing how single stage and sequential approaches on directed and undirected graphs compare to the well-known greedy approach to provide the objective measure of the sequential seeding benefits. Surprisingly, applying sequential seeding to a simple degree-based selection leads to higher coverage than achieved by the computationally expensive greedy approach currently considered to be the best heuristic.
Submitted 18 September, 2018;
originally announced September 2018.
-
Social Networks through the Prism of Cognition
Authors:
Radosław Michalski,
Bolesław K. Szymański,
Przemysław Kazienko,
Christian Lebiere,
Omar Lizardo,
Marcin Kulisiewicz
Abstract:
Human relations are driven by social events: people interact, exchange information, share knowledge and emotions, and gather news from mass media. These events leave traces in human memory, the strength of which depends on cognitive factors such as emotions or attention span. Each trace continuously weakens over time unless another related event activity strengthens it. Here, we introduce a novel cognition-driven social network (CogSNet) model that accounts for cognitive aspects of social perception. The model explicitly represents each social interaction as a trace in human memory with its corresponding dynamics. The strength of the trace is the only measure of the influence that the interactions had on a person. For validation, we apply our model to NetSense data on social interactions among university students. The results show that CogSNet significantly improves the quality of modeling of human interactions in social networks.
Submitted 22 January, 2021; v1 submitted 12 June, 2018;
originally announced June 2018.
-
Influence Maximization for Fixed Heterogeneous Thresholds
Authors:
Panagiotis D. Karampourniotis,
Boleslaw K. Szymanski,
Gyorgy Korniss
Abstract:
Influence Maximization is an NP-hard problem of selecting the optimal set of influencers in a network. Here, we propose two new approaches to influence maximization based on two very different metrics. The first metric, termed Balanced Index (BI), is fast to compute and assigns top values to two kinds of nodes: those with high resistance to adoption, and those with large out-degree. This is done by linearly combining three properties of a node: its degree, susceptibility to new opinions, and the impact its activation will have on its neighborhood. Controlling the weights between those three terms has a huge impact on performance. The second metric, termed Group Performance Index (GPI), measures performance of each node as an initiator when it is part of a randomly selected initiator set. In each such selection, the score assigned to each teammate is inversely proportional to the number of initiators causing the desired spread. These two metrics are applicable to various cascade models; here we test them on the Linear Threshold Model with fixed and known thresholds. Furthermore, we study the impact of network degree assortativity and threshold distribution on the cascade size for metrics including ours. The results demonstrate our two metrics deliver strong performance for influence maximization.
Submitted 7 March, 2018;
originally announced March 2018.
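For readers unfamiliar with the cascade model named in the abstract above, the following is a minimal sketch of a Linear Threshold Model cascade with fixed, known thresholds. It is not the authors' code: the function name `linear_threshold_cascade`, the equal weighting of in-neighbors, and the toy path graph are assumptions made for illustration, and neither the Balanced Index nor the Group Performance Index is implemented here.

```python
def linear_threshold_cascade(in_neighbors, thresholds, seeds):
    """Return the set of active nodes once the cascade stops.

    in_neighbors: dict mapping node -> list of nodes that influence it
                  (assumption: every in-neighbor carries equal weight).
    thresholds:   dict mapping node -> fixed threshold in [0, 1].
    seeds:        iterable of initially active nodes (the initiators).
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in in_neighbors.items():
            if node in active or not nbrs:
                continue
            # Fraction of this node's in-neighbors that are already active.
            active_fraction = sum(1 for n in nbrs if n in active) / len(nbrs)
            if active_fraction >= thresholds[node]:
                active.add(node)
                changed = True
    return active


# Toy example (hypothetical data): an undirected path 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(linear_threshold_cascade(path, {n: 0.5 for n in path}, {0}))
print(linear_threshold_cascade(path, {n: 0.6 for n in path}, {0}))
```

Because activation is monotone (an active node never deactivates), the order in which nodes are visited within a sweep does not change the final active set, only the number of sweeps needed to reach it.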
-
The Age of Social Sensing
Authors:
Dong Wang,
Boleslaw K. Szymanski,
Tarek Abdelzaher,
Heng Ji,
Lance Kaplan
Abstract:
Online social media, such as Twitter and Instagram, democratized information broadcast, allowing anyone to share information about themselves and their surroundings at an unprecedented scale. The large volume of information thus posted on these media offers a new lens into the physical world through the eyes of the social network. The exploitation of this lens to inspect aspects of world state has recently been termed social sensing. The power of manipulating reality via the use (or intentional misuse) of social media opened concerns with issues ranging from radicalization by terror propaganda to potential manipulation of elections in mature democracies. Many important challenges and open research questions arise in this emerging field that aims to better understand how information can be extracted from the medium and what properties characterize the extracted information and the world it represents. Addressing the above challenges requires multi-disciplinary research at the intersection of computer science and social sciences that combines cyber-physical computing, sociology, sensor networks, social networks, cognition, data mining, estimation theory, data fusion, information theory, linguistics, machine learning, behavioral economics, and possibly others. This paper surveys important directions in social sensing, identifies current research challenges, and outlines avenues for future research.
Submitted 27 January, 2018;
originally announced January 2018.
-
Entropy Measures of Human Communication Dynamics
Authors:
Marcin Kulisiewicz,
Przemysław Kazienko,
Bolesław K. Szymański,
Radosław Michalski
Abstract:
Human communication is commonly represented as a temporal social network, and evaluated in terms of its uniqueness. We propose a set of new entropy-based measures for human communication dynamics represented within the temporal social network as event sequences. Using real world datasets and random interaction series of different types we find that real human contact events always significantly differ from random ones. This human distinctiveness increases over time and by means of the proposed entropy measures, we can observe sociological processes that take place within dynamic communities.
Submitted 14 January, 2018;
originally announced January 2018.
-
Evolution of the Global Risk Network Mean-Field Stability Point
Authors:
Xiang Niu,
Alaa Moussawi,
Noemi Derzsy,
Xin Lin,
Gyorgy Korniss,
Boleslaw K. Szymanski
Abstract:
With a steadily growing human population and rapid advancements in technology, the global human network is increasing in size and connection density. This growth exacerbates networked global threats and can lead to unexpected consequences such as global epidemics mediated by air travel, threats in cyberspace, global governance, etc. A quantitative understanding of the mechanisms guiding this global network is necessary for proper operation and maintenance of the global infrastructure. Each year the World Economic Forum publishes an authoritative report on global risks, and applying this data to a CARP model, we answer critical questions such as how the network evolves over time. In the evolution, we compare not the current states of the global risk network at different time points, but its steady state at those points, which would be reached if the risk were left unabated. Looking at the steady states shows more drastically the differences in the challenges to the global economy and stability the world community had faced at each point in time. Finally, we investigate the influence between risks in the global network, using a method successful in distinguishing between correlation and causation. All results presented in the paper were obtained using detailed mathematical analysis with simulations to support our findings.
Submitted 15 October, 2017;
originally announced October 2017.
-
Influence of Personal Preferences on Link Dynamics in Social Networks
Authors:
Ashwin Bahulkar,
Boleslaw K. Szymanski,
Nitesh Chawla,
Omar Lizardo,
Kevin Chan
Abstract:
We study a unique network dataset including periodic surveys and electronic logs of dyadic contacts via smartphones. The participants were a sample of freshmen entering university in Fall 2011. Their opinions on a variety of political and social issues and lists of activities on campus were regularly recorded at the beginning and end of each semester for the first three years of study. We identify a behavioral network defined by call and text data, and a cognitive network based on friendship nominations in ego-network surveys. Both networks are limited to study participants. Since a wide range of attributes on each node were collected in self-reports, we refer to these networks as attribute-rich networks. We study whether student preferences for certain attributes of friends can predict formation and dissolution of edges in both networks. We introduce a method for computing student preferences for different attributes which we use to predict link formation and dissolution. We then rank these attributes according to their importance for making predictions. We find that personal preferences, in particular political views, and preferences for common activities help predict link formation and dissolution in both the behavioral and cognitive networks.
Submitted 21 September, 2017;
originally announced September 2017.
-
Quantifying patterns of research interest evolution
Authors:
Tao Jia,
Dashun Wang,
Boleslaw K. Szymanski
Abstract:
Our quantitative understanding of how scientists choose and shift their research focus over time is highly consequential, because it affects the ways in which scientists are trained, science is funded, knowledge is organized and discovered, and excellence is recognized and rewarded. Despite extensive investigations of various factors that influence a scientist's choice of research topics, quantitative assessments of mechanisms that give rise to macroscopic patterns characterizing research interest evolution of individual scientists remain limited. Here we perform a large-scale analysis of publication records, finding that research interest change follows a reproducible pattern characterized by an exponential distribution. We identify three fundamental features responsible for the observed exponential distribution, which arise from a subtle interplay between exploitation and exploration in research interest evolution. We develop a random walk based model, allowing us to accurately reproduce the empirical observations. This work presents a quantitative analysis of macroscopic patterns governing research interest change, discovering a high degree of regularity underlying scientific research and individual careers.
Submitted 11 September, 2017;
originally announced September 2017.
-
Limits of Risk Predictability in a Cascading Alternating Renewal Process Model
Authors:
Xin Lin,
Alaa Moussawi,
Gyorgy Korniss,
Jonathan Z. Bakdash,
Boleslaw K. Szymanski
Abstract:
Most risk analysis models systematically underestimate the probability and impact of catastrophic events (e.g., economic crises, natural disasters, and terrorism) by not taking into account interconnectivity and interdependence of risks. To address this weakness, we propose the Cascading Alternating Renewal Process (CARP) to forecast interconnected global risks. However, assessments of the model's prediction precision are limited by lack of sufficient ground truth data. Here, we establish prediction precision as a function of input data size by using alternative long ground truth data generated by simulations of the CARP model with known parameters. We illustrate the approach on a model of fires in artificial cities assembled from basic city blocks with diverse housing. The results confirm that parameter recovery variance exhibits power law decay as a function of the length of available ground truth data. Using CARP, we also demonstrate estimation using a disparate dataset that also has dependencies: real-world prediction precision for the global risk model based on the World Economic Forum Global Risk Report. We conclude that the CARP model is an efficient method for predicting catastrophic cascading events with potential applications to emerging local and global interconnected risks.
Submitted 21 June, 2017;
originally announced June 2017.
-
Limits of Predictability of Cascading Overload Failures in Spatially-Embedded Networks with Distributed Flows
Authors:
Alaa Moussawi,
Noemi Derzsy,
Xin Lin,
Boleslaw K. Szymanski,
Gyorgy Korniss
Abstract:
Cascading failures are a critical vulnerability of complex information or infrastructure networks. Here we investigate the properties of load-based cascading failures in real and synthetic spatially-embedded network structures, and propose mitigation strategies to reduce the severity of damages caused by such failures. We introduce a stochastic method for optimal heterogeneous distribution of resources (node capacities) subject to a fixed total cost. Additionally, we design and compare the performance of networks with N-stable and (N-1)-stable network-capacity allocations by triggering cascades using various real-world node-attack and node-failure scenarios. We show that failure mitigation through increased node protection can be effectively achieved against single node failures. However, mitigating against multiple node failures is much more difficult due to the combinatorial increase in possible failures. We analyze the robustness of the system with increasing protection, and find that a critical tolerance exists at which the system undergoes a phase transition, and above which the network almost completely survives an attack. Moreover, we show that cascade-size distributions measured in this region exhibit a power-law decay. Finally, we find a strong correlation between cascade sizes induced by individual nodes and sets of nodes. We also show that network topology alone is a weak factor in determining the progression of cascading failures.
Submitted 14 June, 2017;
originally announced June 2017.
-
Adaptive Modularity Maximization via Edge Weighting Scheme
Authors:
Xiaoyan Lu,
Konstantin Kuzmin,
Mingming Chen,
Boleslaw K. Szymanski
Abstract:
Modularity maximization is one of the state-of-the-art methods for community detection that has gained popularity in the last decade. Yet it suffers from the resolution limit problem by preferring under certain conditions large communities over small ones. To solve this problem, we propose to expand the meaning of the edges that are currently used to indicate propensity of nodes for sharing the same community. In our approach this is the role of edges with positive weights while edges with negative weights indicate aversion for putting their end-nodes into one community. We also present a novel regression model which assigns weights to the edges of a graph according to their local topological features to enhance the accuracy of modularity maximization algorithms. We construct artificial graphs based on the parameters sampled from a given unweighted network and train the regression model on ground truth communities of these artificial graphs in a supervised fashion. The extraction of local topological edge features can be done in linear time, making this process efficient. Experimental results on real and synthetic networks show that the state-of-the-art community detection algorithms improve their performance significantly by finding communities in the weighted graphs produced by our model.
Submitted 7 October, 2017; v1 submitted 13 May, 2017;
originally announced May 2017.
-
A Robust Asynchronous Newton Method for Massive Scale Computing Systems
Authors:
Travis Desell,
Malik Magdon-Ismail,
Heidi Newberg,
Lee A. Newberg,
Boleslaw K. Szymanski,
Carlos A. Varela
Abstract:
Volunteer computing grids offer super-computing levels of computing power at the relatively low cost of operating a server. In previous work, the authors have shown that it is possible to take traditionally iterative evolutionary algorithms and execute them on volunteer computing grids by performing them asynchronously. The asynchronous implementations dramatically increase scalability and decrease the time taken to converge to a solution. Iterative and asynchronous optimization algorithms implemented using MPI on clusters and supercomputers, and BOINC on volunteer computing grids have been packaged together in a framework for generic distributed optimization (FGDO). This paper presents a new extension to FGDO for an asynchronous Newton method (ANM) for local optimization. ANM is resilient to heterogeneous, faulty and unreliable computing nodes and is extremely scalable. Preliminary results show that it can converge to a local optimum significantly faster than conjugate gradient descent does.
Submitted 30 December, 2016;
originally announced February 2017.
-
Assortative Mating: Encounter-Network Topology and the Evolution of Attractiveness
Authors:
S. Dipple,
T. Jia,
T. Caraco,
G. Korniss,
B. K. Szymanski
Abstract:
We model a social-encounter network where linked nodes match for reproduction in a manner depending probabilistically on each node's attractiveness. The developed model reveals that increasing either the network's mean degree or the "choosiness" exercised during pair-formation increases the strength of positive assortative mating. That is, we note that attractiveness is correlated among mated nodes. Their total number also increases with mean degree and selectivity during pair-formation. By iterating over model mapping of parents onto offspring across generations, we study the evolution of attractiveness. Selection mediated by exclusion from reproduction increases mean attractiveness, but is rapidly balanced by skew in the offspring distribution of highly attractive mated pairs.
Submitted 29 November, 2016;
originally announced December 2016.
-
Analysis of Link Formation, Persistence and Dissolution in NetSense Data
Authors:
Ashwin Bahulkar,
Boleslaw K. Szymanski,
Omar Lizardo,
Yuxiao Dong,
Yang Yang,
Nitesh V. Chawla
Abstract:
We study a unique behavioral network data set (based on periodic surveys and on electronic logs of dyadic contact via smartphones) collected at the University of Notre Dame. The participants are a sample of members of the entering class of freshmen in the fall of 2011 whose opinions on a wide variety of political and social issues and activities on campus were regularly recorded, at the beginning and end of each semester, for the first three years of their residence on campus. We create a communication activity network implied by call and text data, and a friendship network based on surveys. Both networks are limited to students participating in the NetSense surveys. We aim at finding student traits and activities on which agreements correlate well with formation and persistence of links while disagreements are highly correlated with non-existence or dissolution of links in the two social networks that we created. Using statistical analysis and machine learning, we observe several traits and activities displaying such correlations, thus being of potential use to predict social network evolution.
Submitted 2 November, 2016;
originally announced November 2016.
-
Supporting novel biomedical research via multilayer collaboration networks
Authors:
Konstantin Kuzmin,
Xiaoyan Lu,
Partha Sarathi Mukherjee,
Juntao Zhuang,
Chris Gaiteri,
Boleslaw K Szymanski
Abstract:
The value of research containing novel combinations of molecules can be seen in many innovative and award-winning research programs. Despite calls to use innovative approaches to address common diseases, an increasing majority of research funding goes toward "safe" incremental research. Counteracting this trend by nurturing novel and potentially transformative scientific research is challenging; it must be supported in competition with established research programs. Therefore, we propose a tool that helps to resolve the tension between safe but fundable research vs. high-risk but potentially transformational research. It does this by identifying hidden overlapping interest around novel molecular research topics. Specifically, it identifies paths of molecular interactions that connect research topics and hypotheses that would not typically be associated, as the basis for scientific collaboration. Because these collaborations are related to the scientists' present trajectory, they are low risk and can be initiated rapidly. Unlike most incremental steps, these collaborations have the potential for leaps in understanding, as they reposition research for novel disease applications. We demonstrate the use of this tool to identify scientists who could contribute to understanding the cellular role of genes with novel associations with Alzheimer's disease, which have not been thoroughly characterized, in part due to the funding emphasis on established research.
Submitted 28 October, 2016;
originally announced October 2016.
-
Analysis of the high dimensional naming game with committed minorities
Authors:
William Pickering,
Boleslaw K. Szymanski,
Chjan Lim
Abstract:
The naming game has become an archetype for linguistic evolution and mathematical social behavioral analysis. In the model presented here, there are $N$ individuals and $K$ words. Our contribution is developing a robust method that handles the case when $K = O(N)$. The initial condition plays a crucial role in the ordering of the system. We find that the system with high Shannon entropy has a higher consensus time and a lower critical fraction of zealots compared to low-entropy states. We also show that the critical number of committed agents decreases with the number of opinions and grows with the community size for each word. These results complement earlier conclusions that diversity of opinion is essential for evolution; without it, the system stagnates in the status quo [S. A. Marvel et al., Phys. Rev. Lett. 109, 118702 (2012)]. In contrast, our results suggest that committed minorities can more easily conquer highly diverse systems, showing them to be inherently unstable.
Submitted 27 May, 2016; v1 submitted 10 December, 2015;
originally announced December 2015.
-
Parallel Toolkit for Measuring the Quality of Network Community Structure
Authors:
Mingming Chen,
Sisi Liu,
Boleslaw K. Szymanski
Abstract:
Many networks display community structure which identifies groups of nodes within which connections are denser than between them. Detecting and characterizing such community structure, which is known as community detection, is one of the fundamental issues in the study of network systems. It has received considerable attention in recent years. Numerous techniques have been developed for both efficient and effective community detection. Among them, the most efficient algorithm is the label propagation algorithm whose computational complexity is O(|E|). Although it is linear in the number of edges, the running time is still too long for very large networks, creating the need for parallel community detection. Also, computing community quality metrics for community structure is computationally expensive both with and without ground truth. However, to date we are not aware of any effort to introduce parallelism for this problem. In this paper, we provide a parallel toolkit to calculate the values of such metrics. We evaluate the parallel algorithms on both distributed-memory and shared-memory machines. The experimental results show that they yield a significant performance gain over sequential execution in terms of total running time, speedup, and efficiency.
Submitted 20 July, 2015;
originally announced July 2015.
-
Extension of Modularity Density for Overlapping Community Structure
Authors:
Mingming Chen,
Konstantin Kuzmin,
Boleslaw K. Szymanski
Abstract:
Modularity is widely used to effectively measure the strength of the disjoint community structure found by community detection algorithms. Although several overlapping extensions of modularity were proposed to measure the quality of overlapping community structure, there is a lack of systematic comparison of different extensions. To fill this gap, we overview overlapping extensions of modularity to select the best. In addition, we extend the Modularity Density metric to enable its usage for overlapping communities. The experimental results on four real networks using overlapping extensions of modularity, overlapping modularity density, and six other community quality metrics show that the best results are obtained when the product of the belonging coefficients of two nodes is used as the belonging function. Moreover, our experiments indicate that overlapping modularity density is a better measure of the quality of overlapping community structure than other metrics considered.
Submitted 16 July, 2015;
originally announced July 2015.
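As background for the abstract above, the following is a minimal sketch of the standard (disjoint, unweighted) modularity score that the overlapping extensions generalize. It does not implement Modularity Density or any overlapping extension from the paper; the function name `modularity`, the edge-list input format, and the toy graph are assumptions made for illustration.

```python
def modularity(edges, community):
    """Standard modularity of a disjoint partition of an undirected graph.

    edges:     list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping node -> community label.

    Uses the per-community form Q = sum_c [ L_c / m - (d_c / (2 m))^2 ],
    where m is the number of edges, L_c the number of edges inside
    community c, and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")
    internal_edges = {}
    degree_sum = {}
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal_edges[cu] = internal_edges.get(cu, 0) + 1
    return sum(
        internal_edges.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Toy example (hypothetical data): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
by_triangle = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(edges, by_triangle))  # 5/14, roughly 0.357
```

Placing all six nodes in one community gives a score of 0, which is why the split into two triangles is preferred under this measure.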