One to rule them all: Towards Joint Indic Language Hate Speech Detection

Bhatia, Mehar; Bhotia, Tenzin Singhay; Agarwal, Akshat; Ramesh, Prakash; Gupta, Shubham; Shridhar, Kumar; Laumann, Felix; Dash, Ayushman

arXiv:2109.13711 (cs)

[Submitted on 28 Sep 2021]

Abstract: This paper is a contribution to the Hate Speech and Offensive Content Identification in Indo-European Languages (HASOC) 2021 shared task. Social media today is a hotbed of toxic and hateful conversations in various languages. Recent news reports have shown that current models struggle to automatically identify hate posted in minority languages. Efficiently curbing hate speech is therefore a critical challenge and a problem of interest. We present a multilingual architecture that uses state-of-the-art transformer language models to jointly learn hate and offensive speech detection across three languages, namely English, Hindi, and Marathi. On the provided testing corpora, we achieve Macro F1 scores of 0.7996, 0.7748, and 0.8651 for sub-task 1A, and 0.6268 and 0.5603 for the fine-grained classification of sub-task 1B. These results show the efficacy of exploiting a multilingual training scheme.
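
The abstract reports its results as Macro F1, which is conventionally the unweighted mean of the per-class F1 scores, so that rare classes count as much as frequent ones. The sketch below illustrates that standard definition only; it is not the shared task's official scorer, and the function name `macro_f1`, the label strings, and the toy predictions are hypothetical values chosen for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    # Score every label that occurs in either the gold or predicted labels.
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold_label, pred_label in zip(y_true, y_pred):
        if gold_label == pred_label:
            tp[gold_label] += 1
        else:
            fp[pred_label] += 1  # predicted this class wrongly
            fn[gold_label] += 1  # missed the true class
    per_class = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical toy data for a binary task (label names are assumptions).
gold = ["HOF", "HOF", "NOT", "NOT", "NOT", "HOF"]
pred = ["HOF", "NOT", "NOT", "NOT", "HOF", "HOF"]
score = macro_f1(gold, pred)
print(round(score, 4))
```

Because each class contributes equally to the average, a model that ignores a minority class is penalized more heavily under this metric than under plain accuracy.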

Comments:	submitted to FIRE 2021 in the HASOC-FIRE shared task on hate speech and offensive language detection
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2109.13711 [cs.CL]
	(or arXiv:2109.13711v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2109.13711

Submission history

From: Mehar Bhatia
[v1] Tue, 28 Sep 2021 13:30:00 UTC (1,854 KB)
