-
The diminishing state of shared reality on US television news
Authors:
Homa Hosseinmardi,
Samuel Wolken,
David M. Rothschild,
Duncan J. Watts
Abstract:
The potential for a large, diverse population to coexist peacefully is thought to depend on the existence of a "shared reality": a public sphere in which participants are exposed to similar facts about similar topics. A generation ago, broadcast television news was widely considered to serve this function; however, since the rise of cable news in the 1990s, critics and scholars have worried that the corresponding fragmentation and segregation of audiences along partisan lines has caused this shared reality to be lost. Here we examine this concern using a unique combination of data sets tracking the production (since 2012) and consumption (since 2016) of television news content on the three largest cable and broadcast networks respectively. With regard to production, we find strong evidence for the "loss of shared reality hypothesis": while broadcast continues to cover similar topics with similar language, cable news networks have become increasingly distinct, both from broadcast news and each other, diverging both in terms of content and language. With regard to consumption, we find more mixed evidence: while broadcast news has indeed declined in popularity, it remains the dominant source of news for roughly 50% more Americans than does cable; moreover, its decline, while somewhat attributable to cable, appears driven more by a shift away from news consumption altogether than a growth in cable consumption. We conclude that shared reality on US television news is indeed diminishing, but is more robust than previously thought and is declining for somewhat different reasons.
Submitted 28 October, 2023;
originally announced October 2023.
-
Causally estimating the effect of YouTube's recommender system using counterfactual bots
Authors:
Homa Hosseinmardi,
Amir Ghasemian,
Miguel Rivera-Lanas,
Manoel Horta Ribeiro,
Robert West,
Duncan J. Watts
Abstract:
In recent years, critics of online platforms have raised concerns about the ability of recommendation algorithms to amplify problematic content, with potentially radicalizing consequences. However, attempts to evaluate the effect of recommenders have suffered from a lack of appropriate counterfactuals (what a user would have viewed in the absence of algorithmic recommendations) and hence cannot disentangle the effects of the algorithm from a user's intentions. Here we propose a method that we call "counterfactual bots" to causally estimate the role of algorithmic recommendations on the consumption of highly partisan content. By comparing bots that replicate real users' consumption patterns with "counterfactual" bots that follow rule-based trajectories, we show that, on average, relying exclusively on the recommender results in less partisan consumption, where the effect is most pronounced for heavy partisan consumers. Following a similar method, we also show that if partisan consumers switch to moderate content, YouTube's sidebar recommender "forgets" their partisan preference within roughly 30 videos regardless of their prior history, while homepage recommendations shift more gradually towards moderate content. Overall, our findings indicate that, at least since the algorithm changes that YouTube implemented in 2019, individual consumption patterns mostly reflect individual preferences, where algorithmic recommendations play, if anything, a moderating role.
Submitted 1 December, 2023; v1 submitted 20 August, 2023;
originally announced August 2023.
-
Examining the consumption of radical content on YouTube
Authors:
Homa Hosseinmardi,
Amir Ghasemian,
Aaron Clauset,
Markus Mobius,
David M. Rothschild,
Duncan J. Watts
Abstract:
Although it is under-studied relative to other social media platforms, YouTube is arguably the largest and most engaging online media consumption platform in the world. Recently, YouTube's scale has fueled concerns that YouTube users are being radicalized via a combination of biased recommendations and ostensibly apolitical anti-woke channels, both of which have been claimed to direct attention to radical political content. Here we test this hypothesis using a representative panel of more than 300,000 Americans and their individual-level browsing behavior, on and off YouTube, from January 2016 through December 2019. Using a labeled set of political news channels, we find that news consumption on YouTube is dominated by mainstream and largely centrist sources. Consumers of far-right content, while more engaged than average, represent a small and stable percentage of news consumers. However, consumption of anti-woke content, defined in terms of its opposition to progressive intellectual and political agendas, grew steadily in popularity and is correlated with consumption of far-right content off-platform. We find no evidence that engagement with far-right content is caused by YouTube recommendations systematically, nor do we find clear evidence that anti-woke channels serve as a gateway to the far right. Rather, consumption of political content on YouTube appears to reflect individual preferences that extend across the web as a whole.
Submitted 14 February, 2022; v1 submitted 25 November, 2020;
originally announced November 2020.
-
Learning Behavioral Representations from Wearable Sensors
Authors:
Nazgol Tavabi,
Homa Hosseinmardi,
Jennifer L. Villatte,
Andrés Abeliuk,
Shrikanth Narayanan,
Emilio Ferrara,
Kristina Lerman
Abstract:
Continuous collection of physiological data from wearable sensors enables temporal characterization of individual behaviors. Understanding the relation between an individual's behavioral patterns and psychological states can help identify strategies to improve quality of life. One challenge in analyzing physiological data is extracting the underlying behavioral states from the temporal sensor signals and interpreting them. Here, we use a non-parametric Bayesian approach to model sensor data from multiple people and discover the dynamic behaviors they share. We apply this method to data collected from sensors worn by a population of hospital workers and show that the learned states can cluster participants into meaningful groups and better predict their cognitive and psychological states. This method offers a way to learn interpretable compact behavioral representations from multivariate sensor signals.
Submitted 4 July, 2020; v1 submitted 16 November, 2019;
originally announced November 2019.
-
Stacking Models for Nearly Optimal Link Prediction in Complex Networks
Authors:
Amir Ghasemian,
Homa Hosseinmardi,
Aram Galstyan,
Edoardo M. Airoldi,
Aaron Clauset
Abstract:
Most real-world networks are incompletely observed. Algorithms that can accurately predict which links are missing can dramatically speed up the collection of network data and improve the validity of network models. Many algorithms now exist for predicting missing links, given a partially observed network, but it has remained unknown whether a single best predictor exists, how link predictability varies across methods and networks from different domains, and how close to optimality current methods are. We answer these questions by systematically evaluating 203 individual link predictor algorithms, representing three popular families of methods, applied to a large corpus of 548 structurally diverse networks from six scientific domains. We first show that individual algorithms exhibit a broad diversity of prediction errors, such that no one predictor or family is best, or worst, across all realistic inputs. We then exploit this diversity via meta-learning to construct a series of "stacked" models that combine predictors into a single algorithm. Applied to a broad range of synthetic networks, for which we may analytically calculate optimal performance, these stacked models achieve optimal or nearly optimal levels of accuracy. Applied to real-world networks, stacked models are also superior, but their accuracy varies strongly by domain, suggesting that link prediction may be fundamentally easier in social networks than in biological or technological networks. These results indicate that the state of the art for link prediction comes from combining individual algorithms, which achieves nearly optimal predictions. We close with a brief discussion of limitations and opportunities for further improvement of these results.
Submitted 17 September, 2019;
originally announced September 2019.
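The stacking idea summarized in the abstract above (combine several individual link predictors through a meta-learner) can be illustrated with a minimal sketch. Everything in it is an assumption chosen for illustration rather than the authors' setup: the toy network, the three topological predictors (common neighbors, Jaccard, preferential attachment), the hand-picked negative pairs, and the small logistic-regression meta-learner trained by gradient descent.

```python
import math
from itertools import combinations

def build_adjacency(edges):
    """Return a dict mapping each node to the set of its neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def pair_features(adj, u, v):
    """Scores from three simple predictors for the candidate link (u, v)."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    common = len(nu & nv)
    union = len(nu | nv)
    jaccard = common / union if union else 0.0
    return [float(common), jaccard, float(len(nu) * len(nv))]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit the meta-learner by batch gradient descent; last weight is the bias."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1])
            for j, xj in enumerate(xi):
                grad[j] += (p - yi) * xj
            grad[-1] += p - yi
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

# Hypothetical toy network: two 4-cliques joined by a single bridge edge.
true_edges = set(combinations(range(0, 4), 2)) | set(combinations(range(4, 8), 2)) | {(3, 4)}
held_out = [(0, 1), (5, 6)]            # true links hidden from the observed network
observed = sorted(e for e in true_edges if e not in held_out)
adj = build_adjacency(observed)
non_edges = [(0, 5), (1, 6), (2, 7)]   # pairs that are not links in the toy network

# Train the meta-learner on held-out links (label 1) versus non-links (label 0).
X = [pair_features(adj, u, v) for u, v in held_out + non_edges]
y = [1] * len(held_out) + [0] * len(non_edges)
weights = train_logistic(X, y)

def stacked_score(u, v):
    """Combined probability-like score for the candidate link (u, v)."""
    x = pair_features(adj, u, v)
    return sigmoid(sum(wj * xj for wj, xj in zip(weights, x)) + weights[-1])

# Rank every unobserved pair by its stacked score.
candidates = [(u, v) for u, v in combinations(range(8), 2) if v not in adj[u]]
ranked = sorted(candidates, key=lambda e: stacked_score(*e), reverse=True)
print(ranked[:3])
```

In this sketch the meta-learner only sees the outputs of the individual predictors, so adding another predictor means appending one more value in `pair_features`. A realistic evaluation would use far larger networks and separate training and test splits.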
-
Discovering Hidden Structure in High Dimensional Human Behavioral Data via Tensor Factorization
Authors:
Homa Hosseinmardi,
Hsien-Te Kao,
Kristina Lerman,
Emilio Ferrara
Abstract:
In recent years, the rapid growth in technology has increased the opportunity for longitudinal human behavioral studies. Rich multimodal data from wearables like Fitbit, online social networks, mobile phones, etc., can be collected in natural environments. Uncovering the underlying low-dimensional structure of noisy multi-way data in an unsupervised setting is a challenging problem. Tensor factorization has been successful in extracting the interconnected low-dimensional descriptions of multi-way data. In this paper, we apply non-negative tensor factorization to a real-world wearable sensor dataset, StudentLife, to find latent temporal factors and groups of similar individuals. Metadata is available for the semester schedule, as well as the individuals' performance and personality. We demonstrate that non-negative tensor factorization can successfully discover clusters of individuals who exhibit higher academic performance, as well as those who frequently engage in leisure activities. The recovered latent temporal patterns associated with these groups are validated against ground truth data to demonstrate the accuracy of our framework.
Submitted 21 May, 2019;
originally announced May 2019.
-
Tensor Embedding: A Supervised Framework for Human Behavioral Data Mining and Prediction
Authors:
Homa Hosseinmardi,
Amir Ghasemian,
Shrikanth Narayanan,
Kristina Lerman,
Emilio Ferrara
Abstract:
Today's densely instrumented world offers tremendous opportunities for continuous acquisition and analysis of multimodal sensor data providing temporal characterization of an individual's behaviors. Is it possible to efficiently couple such rich sensor data with predictive modeling techniques to provide contextual and insightful assessments of individual performance and wellbeing? Prediction of different aspects of human behavior from these noisy, incomplete, and heterogeneous bio-behavioral temporal data is a challenging problem, beyond unsupervised discovery of latent structures. We propose a Supervised Tensor Embedding (STE) algorithm for high-dimensional multimodal data with joint decomposition of input and target variables. Furthermore, we show that feature selection helps to reduce contamination in the prediction and increase performance. The efficiency of the methods was tested on two different real-world datasets.
Submitted 31 August, 2018;
originally announced August 2018.
-
Capturing Edge Attributes via Network Embedding
Authors:
Palash Goyal,
Homa Hosseinmardi,
Emilio Ferrara,
Aram Galstyan
Abstract:
Network embedding, which aims to learn low-dimensional representations of nodes, has been used for various graph-related tasks including visualization, link prediction and node classification. Most existing embedding methods rely solely on network structure. However, in practice we often have auxiliary information about the nodes and/or their interactions, e.g., content of scientific papers in co-authorship networks, or topics of communication in Twitter mention networks. Here we propose a novel embedding method that uses both network structure and edge attributes to learn better network representations. Our method jointly minimizes the reconstruction error for higher-order node neighborhood, social roles and edge attributes using a deep architecture that can adequately capture highly non-linear interactions. We demonstrate the efficacy of our model over existing state-of-the-art methods on a variety of real-world networks including collaboration networks and social networks. We also observe that using edge attributes to inform network embedding yields better performance in downstream tasks such as link prediction and node classification.
Submitted 22 May, 2018; v1 submitted 8 May, 2018;
originally announced May 2018.
-
Evaluating Overfit and Underfit in Models of Network Community Structure
Authors:
Amir Ghasemian,
Homa Hosseinmardi,
Aaron Clauset
Abstract:
A common data mining task on networks is community detection, which seeks an unsupervised decomposition of a network into structural groups based on statistical regularities in the network's connectivity. Although many methods exist, the No Free Lunch theorem for community detection implies that each makes some kind of tradeoff, and no algorithm can be optimal on all inputs. Thus, different algorithms will over or underfit on different inputs, finding more, fewer, or just different communities than is optimal, and evaluation methods that use a metadata partition as a ground truth will produce misleading conclusions about general accuracy. Here, we present a broad evaluation of over and underfitting in community detection, comparing the behavior of 16 state-of-the-art community detection algorithms on a novel and structurally diverse corpus of 406 real-world networks. We find that (i) algorithms vary widely both in the number of communities they find and in their corresponding composition, given the same input, (ii) algorithms can be clustered into distinct high-level groups based on similarities of their outputs on real-world networks, and (iii) these differences induce wide variation in accuracy on link prediction and link description tasks. We introduce a new diagnostic for evaluating overfitting and underfitting in practice, and use it to roughly divide community detection methods into general and specialized learning algorithms. Across methods and inputs, Bayesian techniques based on the stochastic block model and a minimum description length approach to regularization represent the best general learning approach, but can be outperformed under specific circumstances. These results introduce both a theoretically principled approach to evaluate over and underfitting in models of network community structure and a realistic benchmark by which new methods may be evaluated and compared.
Submitted 16 April, 2019; v1 submitted 28 February, 2018;
originally announced February 2018.
-
Investigating Factors Influencing the Latency of Cyberbullying Detection
Authors:
Rahat Ibn Rafiq,
Homa Hosseinmardi,
Richard Han,
Qin Lv,
Shivakant Mishra
Abstract:
Cyberbullying in online social networks has become a critical problem, especially among teenagers who are social networks' prolific users. As a result, researchers have focused on identifying distinguishing features of cyberbullying and developing techniques to automatically detect cyberbullying incidents. While this research has resulted in developing highly accurate classifiers, two key practical issues related to identifying cyberbullying have largely been ignored, namely scalability of cyberbullying detection services and timeliness of raising alerts whenever a cyberbullying incident is suspected.
These two issues are the subject of this paper. We propose a multi-stage cyberbullying detection solution that drastically reduces the classification time and the time to raise cyberbullying alerts. The proposed solution is highly scalable, does not sacrifice accuracy for scalability, and is highly responsive in raising alerts. The solution comprises three novel components: an initial predictor, a multilevel priority scheduler, and an incremental classification mechanism. We have implemented this solution and utilized data obtained from the Vine online social network to demonstrate the utility of each of these components via a detailed performance evaluation. We show that our complete solution is significantly more scalable and responsive than the current state of the art.
Submitted 16 November, 2016;
originally announced November 2016.
-
Prediction of Cyberbullying Incidents on the Instagram Social Network
Authors:
Homa Hosseinmardi,
Sabrina Arredondo Mattson,
Rahat Ibn Rafiq,
Richard Han,
Qin Lv,
Shivakant Mishra
Abstract:
Cyberbullying is a growing problem affecting more than half of all American teens. The main goal of this paper is to investigate fundamentally new approaches to understand and automatically detect and predict incidents of cyberbullying in Instagram, a media-based mobile social network. In this work, we have collected a sample data set consisting of Instagram images and their associated comments. We then designed a labeling study and employed human contributors at the crowd-sourced CrowdFlower website to label these media sessions for cyberbullying. A detailed analysis of the labeled data is then presented, including a study of relationships between cyberbullying and a host of features such as cyberaggression, profanity, social graph features, temporal commenting behavior, linguistic content, and image content. Using the labeled data, we further design and evaluate the performance of classifiers to automatically detect and predict incidents of cyberbullying and cyberaggression.
Submitted 25 August, 2015;
originally announced August 2015.
-
Detection of Cyberbullying Incidents on the Instagram Social Network
Authors:
Homa Hosseinmardi,
Sabrina Arredondo Mattson,
Rahat Ibn Rafiq,
Richard Han,
Qin Lv,
Shivakant Mishra
Abstract:
Cyberbullying is a growing problem affecting more than half of all American teens. The main goal of this paper is to investigate fundamentally new approaches to understand and automatically detect incidents of cyberbullying over images in Instagram, a media-based mobile social network. To this end, we have collected a sample Instagram data set consisting of images and their associated comments, and designed a labeling study for cyberbullying as well as image content using human labelers at the crowd-sourced Crowdflower Web site. An analysis of the labeled data is then presented, including a study of correlations between different features and cyberbullying as well as cyberaggression. Using the labeled data, we further design and evaluate the accuracy of a classifier to automatically detect incidents of cyberbullying.
Submitted 12 March, 2015;
originally announced March 2015.
-
A Comparison of Common Users across Instagram and Ask.fm to Better Understand Cyberbullying
Authors:
Homa Hosseinmardi,
Rahat Ibn Rafiq,
Shaosong Li,
Zhili Yang,
Richard Han,
Shivakant Mishra,
Qin Lv
Abstract:
This paper examines users who are common to two popular online social networks, Instagram and Ask.fm, that are often used for cyberbullying. An analysis of the negativity and positivity of word usage in posts by common users of these two social networks is performed. These results are normalized in comparison to a sample of typical users in both networks. We also examine the posting activity of common user profiles and consider its correlation with negativity. Within the Ask.fm social network, which allows anonymous posts, the relationship between anonymity and negativity is further explored.
Submitted 22 October, 2014; v1 submitted 21 August, 2014;
originally announced August 2014.
-
Towards Understanding Cyberbullying Behavior in a Semi-Anonymous Social Network
Authors:
Homa Hosseinmardi,
Amir Ghasemianlangroodi,
Richard Han,
Qin Lv,
Shivakant Mishra
Abstract:
Cyberbullying has emerged as an important and growing social problem, wherein people use online social networks and mobile phones to bully victims with offensive text, images, audio and video on a 24/7 basis. This paper studies negative user behavior in the Ask.fm social network, a popular new site that has led to many cases of cyberbullying, some leading to suicidal behavior. We examine the occurrence of negative words in Ask.fm's question+answer profiles along with the social network of likes of questions+answers. We also examine properties of users with cutting behavior in this social network.
Submitted 21 August, 2014; v1 submitted 15 April, 2014;
originally announced April 2014.