-
Latent Hatred: A Benchmark for Understanding Implicit Hate Speech
Authors:
Mai ElSherief,
Caleb Ziems,
David Muchlinski,
Vaishnavi Anupindi,
Jordyn Seybolt,
Munmun De Choudhury,
Diyi Yang
Abstract:
Hate speech has grown significantly on social media, causing serious consequences for victims of all demographics. Despite much attention being paid to characterize and detect discriminatory speech, most work has focused on explicit or overt hate speech, failing to address a more pervasive form based on coded or indirect language. To fill this gap, this work introduces a theoretically-justified taxonomy of implicit hate speech and a benchmark corpus with fine-grained labels for each message and its implication. We present systematic analyses of our dataset using contemporary baselines to detect and explain implicit hate speech, and we discuss key features that challenge existing models. This dataset will continue to serve as a useful benchmark for understanding this multifaceted issue.
Submitted 11 September, 2021;
originally announced September 2021.
-
Lifelong Learning of Hate Speech Classification on Social Media
Authors:
Jing Qian,
Hong Wang,
Mai ElSherief,
Xifeng Yan
Abstract:
Existing work on automated hate speech classification assumes that the dataset is fixed and the classes are pre-defined. However, the amount of data in social media increases every day, and the hot topics change rapidly, requiring the classifiers to be able to continuously adapt to new data without forgetting the previously learned knowledge. This ability, referred to as lifelong learning, is crucial for the real-world application of hate speech classifiers in social media. In this work, we propose lifelong learning of hate speech classification on social media. To alleviate catastrophic forgetting, we propose to use Variational Representation Learning (VRL) along with a memory module based on LB-SOINN (Load-Balancing Self-Organizing Incremental Neural Network). Experimentally, we show that combining variational representation learning and the LB-SOINN memory module achieves better performance than the commonly-used lifelong learning techniques.
Submitted 5 June, 2021;
originally announced June 2021.
-
Measuring and Characterizing Hate Speech on News Websites
Authors:
Savvas Zannettou,
Mai ElSherief,
Elizabeth Belding,
Shirin Nilizadeh,
Gianluca Stringhini
Abstract:
The Web has become the main source for news acquisition. At the same time, news discussion has become more social: users can post comments on news articles or discuss news articles on other platforms like Reddit. These features empower and enable discussions among the users; however, they also act as the medium for the dissemination of toxic discourse and hate speech. The research community lacks a general understanding of what type of content attracts hateful discourse and the possible effects of social networks on the commenting activity on news articles. In this work, we perform a large-scale quantitative analysis of 125M comments posted on 412K news articles over the course of 19 months. We analyze the content of the collected articles and their comments using temporal analysis, user-based analysis, and linguistic analysis, to shed light on what elements attract hateful comments on news articles. We also investigate commenting activity when an article is posted on either 4chan's Politically Incorrect board (/pol/) or six selected subreddits. We find statistically significant increases in hateful commenting activity around real-world divisive events like the "Unite the Right" rally in Charlottesville and political events like the second and third 2016 US presidential debates. Also, we find that articles that attract a substantial number of hateful comments have different linguistic characteristics when compared to articles that do not attract hateful comments. Furthermore, we observe that the posting of a news article on either /pol/ or the six subreddits is correlated with an increase of (hateful) commenting activity on the news article.
Submitted 16 May, 2020;
originally announced May 2020.
-
Towards Understanding Gender Bias in Relation Extraction
Authors:
Andrew Gaut,
Tony Sun,
Shirlyn Tang,
Yuxin Huang,
Jing Qian,
Mai ElSherief,
Jieyu Zhao,
Diba Mirza,
Elizabeth Belding,
Kai-Wei Chang,
William Yang Wang
Abstract:
Recent developments in Neural Relation Extraction (NRE) have made significant strides towards Automated Knowledge Base Construction (AKBC). While much attention has been dedicated towards improvements in accuracy, there have been no attempts in the literature to our knowledge to evaluate social biases in NRE systems. We create WikiGenderBias, a distantly supervised dataset with a human-annotated test set. WikiGenderBias has sentences specifically curated to analyze gender bias in relation extraction systems. We use WikiGenderBias to evaluate systems for bias, find that NRE systems exhibit gender-biased predictions, and lay groundwork for future evaluation of bias in NRE. We also analyze how name anonymization, hard debiasing for word embeddings, and counterfactual data augmentation affect gender bias in predictions and performance.
Submitted 8 August, 2020; v1 submitted 9 November, 2019;
originally announced November 2019.
-
Mitigating Gender Bias in Natural Language Processing: Literature Review
Authors:
Tony Sun,
Andrew Gaut,
Shirlyn Tang,
Yuxin Huang,
Mai ElSherief,
Jieyu Zhao,
Diba Mirza,
Elizabeth Belding,
Kai-Wei Chang,
William Yang Wang
Abstract:
As Natural Language Processing (NLP) and Machine Learning (ML) tools rise in popularity, it becomes increasingly vital to recognize the role they play in shaping societal biases and stereotypes. Although NLP models have shown success in modeling various applications, they propagate and may even amplify gender bias found in text corpora. While the study of bias in artificial intelligence is not new, methods to mitigate gender bias in NLP are relatively nascent. In this paper, we review contemporary studies on recognizing and mitigating gender bias in NLP. We discuss gender bias based on four forms of representation bias and analyze methods recognizing gender bias. Furthermore, we discuss the advantages and drawbacks of existing gender debiasing methods. Finally, we discuss future studies for recognizing and mitigating gender bias in NLP.
Submitted 21 June, 2019;
originally announced June 2019.
-
Learning to Decipher Hate Symbols
Authors:
Jing Qian,
Mai ElSherief,
Elizabeth Belding,
William Yang Wang
Abstract:
Existing computational models to understand hate speech typically frame the problem as a simple classification task, bypassing the understanding of hate symbols (e.g., 14 words, kigy) and their secret connotations. In this paper, we propose a novel task of deciphering hate symbols. To do this, we leverage the Urban Dictionary and collect a new, symbol-rich Twitter corpus of hate speech. We investigate neural network latent context models for deciphering hate symbols. More specifically, we study Sequence-to-Sequence models and show how they are able to crack the ciphers based on context. Furthermore, we propose a novel Variational Decipher and show how it can generalize better to unseen hate symbols in a more challenging testing setting.
Submitted 4 April, 2019;
originally announced April 2019.
-
Hierarchical CVAE for Fine-Grained Hate Speech Classification
Authors:
Jing Qian,
Mai ElSherief,
Elizabeth Belding,
William Yang Wang
Abstract:
Existing work on automated hate speech detection typically focuses on binary classification or on differentiating among a small set of categories. In this paper, we propose a novel method on a fine-grained hate speech classification task, which focuses on differentiating among 40 hate groups of 13 different hate group categories. We first explore the Conditional Variational Autoencoder (CVAE) as a discriminative model and then extend it to a hierarchical architecture to utilize the additional hate category information for more accurate prediction. Experimentally, we show that incorporating the hate category information for training can significantly improve the classification performance and our proposed model outperforms commonly-used discriminative models.
Submitted 31 August, 2018;
originally announced September 2018.
-
Peer to Peer Hate: Hate Speech Instigators and Their Targets
Authors:
Mai ElSherief,
Shirin Nilizadeh,
Dana Nguyen,
Giovanni Vigna,
Elizabeth Belding
Abstract:
While social media has become an empowering agent to individual voices and freedom of expression, it also facilitates anti-social behaviors including online harassment, cyberbullying, and hate speech. In this paper, we present the first comparative study of hate speech instigators and target users on Twitter. Through a multi-step classification process, we curate a comprehensive hate speech dataset capturing various types of hate. We study the distinctive characteristics of hate instigators and targets in terms of their profile self-presentation, activities, and online visibility. We find that hate instigators target more popular and high profile Twitter users, and that participating in hate speech can result in greater online visibility. We conduct a personality analysis of hate instigators and targets and show that both groups have eccentric personality facets that differ from the general Twitter population. Our results advance the state of the art of understanding online hate speech engagement.
Submitted 12 April, 2018;
originally announced April 2018.
-
Hate Lingo: A Target-based Linguistic Analysis of Hate Speech in Social Media
Authors:
Mai ElSherief,
Vivek Kulkarni,
Dana Nguyen,
William Yang Wang,
Elizabeth Belding
Abstract:
While social media empowers freedom of expression and individual voices, it also enables anti-social behavior, online harassment, cyberbullying, and hate speech. In this paper, we deepen our understanding of online hate speech by focusing on a largely neglected but crucial aspect of hate speech -- its target: either "directed" towards a specific person or entity, or "generalized" towards a group of people sharing a common protected characteristic. We perform the first linguistic and psycholinguistic analysis of these two forms of hate speech and reveal the presence of interesting markers that distinguish these types of hate speech. Our analysis reveals that Directed hate speech, in addition to being more personal and directed, is more informal, angrier, and often explicitly attacks the target (via name calling) with fewer analytic words and more words suggesting authority and influence. Generalized hate speech, on the other hand, is dominated by religious hate and is characterized by the use of lethal words such as murder, exterminate, and kill, and quantity words such as million and many. Altogether, our work provides a data-driven analysis of the nuances of online hate speech that enables not only a deepened understanding of hate speech and its social implications but also its detection.
Submitted 11 April, 2018;
originally announced April 2018.
-
Leveraging Intra-User and Inter-User Representation Learning for Automated Hate Speech Detection
Authors:
Jing Qian,
Mai ElSherief,
Elizabeth M. Belding,
William Yang Wang
Abstract:
Hate speech detection is a critical, yet challenging problem in Natural Language Processing (NLP). Despite the existence of numerous studies dedicated to the development of NLP hate speech detection approaches, the accuracy is still poor. The central problem is that social media posts are short and noisy, and most existing hate speech detection solutions take each post as an isolated input instance, which is likely to yield high false positive and negative rates. In this paper, we radically improve automated hate speech detection by presenting a novel model that leverages intra-user and inter-user representation learning for robust hate speech detection on Twitter. In addition to the target Tweet, we collect and analyze the user's historical posts to model intra-user Tweet representations. To suppress the noise in a single Tweet, we also model the similar Tweets posted by all other users with reinforced inter-user representation learning techniques. Experimentally, we show that leveraging these two representations can significantly improve the f-score of a strong bidirectional LSTM baseline model by 10.1%.
Submitted 13 September, 2018; v1 submitted 9 April, 2018;
originally announced April 2018.
-
An Information-theoretic Model for Knowledge Sharing in Opportunistic Social Networks
Authors:
Mai ElSherief,
Tamer ElBatt,
Ahmed Zahran,
Ahmed Helmy
Abstract:
In this paper we establish fundamental limits on the performance of knowledge sharing in opportunistic social networks. In particular, we introduce a novel information-theoretic model to characterize the performance limits of knowledge sharing policies. Towards this objective, we first introduce the notions of knowledge gain and its upper bound, knowledge gain limit, per user. Second, we characterize these quantities for a number of network topologies and sharing policies. This work constitutes a first step towards defining and characterizing the performance limits and trade-offs associated with knowledge sharing in opportunistic social networks. Finally, we present numerical results characterizing the cumulative knowledge gain over time and its upper bound, using publicly available smartphone data. The results confirm the key role of the proposed model to motivate future research in this ripe area of research as well as new knowledge sharing policies.
Submitted 12 May, 2015;
originally announced May 2015.