-
Methods of quantifying specialized knowledge and network rewiring
Authors:
Sirui Wang,
Michael Macy,
Victor Nee
Abstract:
Technological innovations are a major driver of economic development that depend on the exchange of knowledge and ideas among those with unique but complementary specialized knowledge and knowhow. However, measurement of specialized knowledge embedded in technologists, scientists and entrepreneurs in the knowledge economy presents an empirical challenge as both the exchange of knowledge and knowledge itself remain difficult to observe. We develop novel measures of specialized knowledge using a unique dataset of longitudinal records of participation at technology-focused meetup events in two regional knowledge economies. Our measures of specialized knowledge can be further used to quantify the extent of knowledge spillover and network rewiring and uncover underlying social mechanisms that contribute to the development of increasingly complex and differentiated networks in maturing knowledge economies. We apply these methods in the context of the rapid morphogenesis of emerging regional technology economies in New York City and Los Angeles.
Submitted 25 September, 2023;
originally announced September 2023.
-
Trustworthiness Evaluations of Search Results: The Impact of Rank and Misinformation
Authors:
Sterling Williams-Ceci,
Michael Macy,
Mor Naaman
Abstract:
Users rely on search engines for information in critical contexts, such as public health emergencies. Understanding how users evaluate the trustworthiness of search results is therefore essential. Research has identified rank and the presence of misinformation as factors impacting perceptions and click behavior in search. Here, we elaborate on these findings by measuring the effects of rank and misinformation, as well as warning banners, on the perceived trustworthiness of individual results in search. We conducted three online experiments (N=3196) using Covid-19-related queries to address this question. We show that although higher-ranked results are clicked more often, they are not more trusted. We also show that misinformation did not change trust in accurate results below it. However, a warning about unreliable sources backfired, decreasing trust in accurate information but not misinformation. This work addresses concerns about how people evaluate information in search, and illustrates the dangers of generic prevention approaches.
Submitted 19 September, 2023;
originally announced September 2023.
-
Demographic Confounding Causes Extreme Instances of Lifestyle Politics on Facebook
Authors:
Alexander Ruch,
Yujia Zhang,
Michael Macy
Abstract:
Lifestyle politics emerge when activities that have no substantive relevance to ideology become politically aligned and polarized. Homophily and social influence are able to generate these fault lines on their own; however, social identities from demographics may serve as coordinating mechanisms through which lifestyle politics are mobilized and spread. Using a dataset of 137,661,886 observations from 299,327 Facebook interests aggregated across users of different racial/ethnic, education, age, gender, and income demographics, we find that the most extreme instances of lifestyle politics are those which are highly confounded by demographics such as race/ethnicity (e.g., Black artists and performers). After adjusting political alignment for demographic effects, lifestyle politics decreased by 27.36% toward the political "center" and demographically confounded interests were no longer among the most polarized interests. Instead, after demographic deconfounding, we found that the most liberal interests included electric cars, Planned Parenthood, and liberal satire while the most conservative interests included the Republican Party and conservative commentators. We validate our measures of political alignment and lifestyle politics using the General Social Survey and find similar demographic entanglements with lifestyle politics existed before social media such as Facebook were ubiquitous, giving us strong confidence that our results are not due to echo chambers or filter bubbles. Likewise, since demographic characteristics exist prior to ideological values, we argue that the demographic confounding we observe is causally responsible for the extreme instances of lifestyle politics that we find among the aggregated interests. We conclude our paper by relating our results to Simpson's paradox, cultural omnivorousness, and network autocorrelation.
Submitted 17 January, 2022;
originally announced January 2022.
-
Shifting Polarization and Twitter News Influencers between two U.S. Presidential Elections
Authors:
James Flamino,
Alessandro Galezzi,
Stuart Feldman,
Michael W. Macy,
Brendan Cross,
Zhenkun Zhou,
Matteo Serafino,
Alexandre Bovet,
Hernan A. Makse,
Boleslaw K. Szymanski
Abstract:
Social media are decentralized, interactive, and transformative, empowering users to produce and spread information to influence others. This has changed the dynamics of political communication that were previously dominated by traditional corporate news media. Having hundreds of millions of tweets collected over the 2016 and 2020 U.S. presidential elections gave us a unique opportunity to measure the change in polarization and the diffusion of political information. We analyze the diffusion of political information among Twitter users and investigate the change of polarization between these elections and how this change affected the composition and polarization of influencers and their retweeters. We identify "influencers" by their ability to spread information and classify them into those affiliated with a media organization, a political organization, or unaffiliated. Most of the top influencers were affiliated with media organizations during both elections. We found a clear increase from 2016 to 2020 in polarization among influencers and among those whom they influence. Moreover, 75% of the top influencers in 2020 were not present in 2016, demonstrating that such status is difficult to retain. Between 2016 and 2020, 10% of influencers affiliated with media were replaced by center- or right-orientated influencers affiliated with political organizations and unaffiliated influencers.
Submitted 3 November, 2021;
originally announced November 2021.
-
Going beyond accuracy: estimating homophily in social networks using predictions
Authors:
George Berry,
Antonio Sirianni,
Ingmar Weber,
Jisun An,
Michael Macy
Abstract:
In online social networks, it is common to use predictions of node categories to estimate measures of homophily and other relational properties. However, online social network data often lacks basic demographic information about the nodes. Researchers must rely on predicted node attributes to estimate measures of homophily, but little is known about the validity of these measures. We show that estimating homophily in a network can be viewed as a dyadic prediction problem, and that homophily estimates are unbiased when dyad-level residuals sum to zero in the network. Node-level prediction models, such as the use of names to classify ethnicity or gender, do not generally have this property and can introduce large biases into homophily estimates. Bias occurs due to error autocorrelation along dyads. Importantly, node-level classification performance is not a reliable indicator of estimation accuracy for homophily. We compare estimation strategies that make predictions at the node and dyad levels, evaluating performance in different settings. We propose a novel "ego-alter" modeling approach that outperforms standard node and dyad classification strategies. While this paper focuses on homophily, results generalize to other relational measures which aggregate predictions along the dyads in a network. We conclude with suggestions for research designs to study homophily in online networks. Code for this paper is available at https://github.com/georgeberry/autocorr.
Submitted 29 January, 2020;
originally announced January 2020.
-
Estimating group properties in online social networks with a classifier
Authors:
George Berry,
Antonio Sirianni,
Nathan High,
Agrippa Kellum,
Ingmar Weber,
Michael Macy
Abstract:
We consider the problem of obtaining unbiased estimates of group properties in social networks when using a classifier for node labels. Inference for this problem is complicated by two factors: the network is not known and must be crawled, and even high-performance classifiers provide biased estimates of group proportions. We propose and evaluate AdjustedWalk for addressing this problem. This is a three-step procedure which entails: 1) walking the graph starting from an arbitrary node; 2) learning a classifier on the nodes in the walk; and 3) applying a post-hoc adjustment to classification labels. The walk step provides the information necessary to make inferences over the nodes and edges, while the adjustment step corrects for classifier bias in estimating group proportions. This process provides de-biased estimates at the cost of additional variance. We evaluate AdjustedWalk on four tasks: the proportion of nodes belonging to a minority group, the proportion of the minority group among high-degree nodes, the proportion of within-group edges, and Coleman's homophily index. Simulated and empirical graphs show that this procedure performs well compared to optimal baselines in a variety of circumstances, while indicating that variance increases can be large for low-recall classifiers.
Submitted 24 July, 2018;
originally announced July 2018.
-
Cultural Values and Cross-cultural Video Consumption on YouTube
Authors:
Minsu Park,
Jaram Park,
Young Min Baek,
Michael Macy
Abstract:
Video-sharing social media like YouTube provide access to diverse cultural products from all over the world, making it possible to test theories that the Web facilitates global cultural convergence. Drawing on a daily listing of YouTube's most popular videos across 58 countries, we investigate the consumption of popular videos in countries that differ in cultural values, language, gross domestic product, and Internet penetration rate. Although online social media facilitate global access to cultural products, we find this technological capability does not result in universal cultural convergence. Instead, consumption of popular videos in culturally different countries appears to be constrained by cultural values. Cross-cultural convergence is more advanced in cosmopolitan countries with cultural values that favor individualism and power inequality.
Submitted 17 May, 2017; v1 submitted 8 May, 2017;
originally announced May 2017.
-
Psychological and Personality Profiles of Political Extremists
Authors:
Meysam Alizadeh,
Ingmar Weber,
Claudio Cioffi-Revilla,
Santo Fortunato,
Michael Macy
Abstract:
Global recruitment into radical Islamic movements has spurred renewed interest in the appeal of political extremism. Is the appeal a rational response to material conditions or is it the expression of psychological and personality disorders associated with aggressive behavior, intolerance, conspiratorial imagination, and paranoia? Empirical answers using surveys have been limited by lack of access to extremist groups, while field studies have lacked psychological measures and failed to compare extremists with contrast groups. We revisit the debate over the appeal of extremism in the U.S. context by comparing publicly available Twitter messages written by over 355,000 political extremist followers with messages written by non-extremist U.S. users. Analysis of text-based psychological indicators supports the moral foundation theory which identifies emotion as a critical factor in determining political orientation of individuals. Extremist followers also differ from others in four of the Big Five personality traits.
Submitted 1 April, 2017;
originally announced April 2017.
-
Automated Hate Speech Detection and the Problem of Offensive Language
Authors:
Thomas Davidson,
Dana Warmsley,
Michael Macy,
Ingmar Weber
Abstract:
A key challenge for automatic hate-speech detection on social media is the separation of hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech, and previous work using supervised learning has failed to distinguish between the two categories. We used a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords. We use crowd-sourcing to label a sample of these tweets into three categories: those containing hate speech, only offensive language, and those with neither. We train a multi-class classifier to distinguish between these different categories. Close analysis of the predictions and the errors shows when we can reliably separate hate speech from other offensive language and when this differentiation is more difficult. We find that racist and homophobic tweets are more likely to be classified as hate speech but that sexist tweets are generally classified as offensive. Tweets without explicit hate keywords are also more difficult to classify.
Submitted 11 March, 2017;
originally announced March 2017.
-
The Opacity Problem in Social Contagion
Authors:
George Berry,
Christopher J. Cameron,
Patrick Park,
Michael W. Macy
Abstract:
Fads, product adoption, mobs, rumors, memes, and emergent norms are diverse social contagions that have been modeled as network cascades. Empirical study of these cascades is vulnerable to what we describe as the "opacity problem": the inability to observe the critical level of peer influence required to trigger an individual's behavioral change. Even with maximal information, network cascades reveal intervals that bound critical levels of peer exposure, rather than critical values themselves. Existing practice uses interval maxima, which systematically over-estimates the social influence required for behavioral change. Simulations reveal that the over-estimation is likely common and large in magnitude. This is confirmed by an empirical study of hashtag cascades among 3.2 million Twitter users: one in five hashtag adoptions suffers critical value uncertainty due to the opacity problem. Different assumptions about these intervals lead to qualitatively different conclusions about the role of peer reinforcement in diffusion. We introduce a solution that combines identifying tightly bounded intervals with predicting uncertain critical values using node-level information.
Submitted 19 November, 2018; v1 submitted 8 February, 2017;
originally announced February 2017.
-
Bots as Virtual Confederates: Design and Ethics
Authors:
Peter M Krafft,
Michael Macy,
Alex Pentland
Abstract:
The use of bots as virtual confederates in online field experiments holds extreme promise as a new methodological tool in computational social science. However, this potential tool comes with inherent ethical challenges. Informed consent can be difficult to obtain in many cases, and the use of confederates necessarily implies the use of deception. In this work we outline a design space for bots as virtual confederates, and we propose a set of guidelines for meeting the status quo for ethical experimentation. We draw upon examples from prior work in the CSCW community and the broader social science literature for illustration. While a handful of prior researchers have used bots in online experimentation, our work is meant to inspire future work in this area and raise awareness of the associated ethical issues.
Submitted 1 November, 2016;
originally announced November 2016.
-
Editorial: Statistical Mechanics and Social Sciences
Authors:
Santo Fortunato,
Michael Macy,
Sidney Redner
Abstract:
This editorial opens the special issues that the Journal of Statistical Physics has dedicated to the growing field of statistical physics modeling of social dynamics. The issues include contributions from physicists and social scientists, with the goal of fostering a better communication between these two communities.
Submitted 20 April, 2013; v1 submitted 3 April, 2013;
originally announced April 2013.
-
The Mesh of Civilizations and International Email Flows
Authors:
Bogdan State,
Patrick Park,
Ingmar Weber,
Yelena Mejova,
Michael Macy
Abstract:
In The Clash of Civilizations, Samuel Huntington argued that the primary axis of global conflict was no longer ideological or economic but cultural and religious, and that this division would characterize the "battle lines of the future." In contrast to the "top down" approach in previous research focused on the relations among nation states, we focused on the flows of interpersonal communication as a bottom-up view of international alignments. To that end, we mapped the locations of the world's countries in global email networks to see if we could detect cultural fault lines. Using IP-geolocation on a worldwide anonymized dataset obtained from a large Internet company, we constructed a global email network. In computing email flows we employ a novel rescaling procedure to account for differences due to uneven adoption of a particular Internet service across the world. Our analysis shows that email flows are consistent with Huntington's thesis. In addition to location in Huntington's "civilizations," our results also attest to the importance of both cultural and economic factors in the patterning of inter-country communication ties.
Submitted 10 March, 2013; v1 submitted 28 February, 2013;
originally announced March 2013.