Search | arXiv e-print repository

Z-AGI Labs at ClimateActivism 2024: Stance and Hate Event Detection on Social Media

Authors: Nikhil Narayan, Mrutyunjay Biswal

Abstract: In the digital realm, rich data serves as a crucial source of insights into the complexities of social, political, and economic landscapes. Addressing the growing need for high-quality information on events and the imperative to combat hate speech, this research led to the establishment of the Shared Task on Climate Activism Stance and Hate Event Detection at CASE 2024. Focused on climate activists contending with hate speech on social media, our study contributes to hate speech identification from tweets. Analyzing three sub-tasks - Hate Speech Detection (Sub-task A), Targets of Hate Speech Identification (Sub-task B), and Stance Detection (Sub-task C) - Team Z-AGI Labs evaluated various models, including LSTM, Xgboost, and LGBM based on Tf-Idf. Results unveiled intriguing variations, with Catboost excelling in Subtask-B (F1: 0.5604) and Subtask-C (F1: 0.7081), while LGBM emerged as the top-performing model for Subtask-A (F1: 0.8684). This research provides valuable insights into the suitability of classical machine learning models for climate hate speech and stance detection, aiding informed model selection for robust mechanisms.

Submitted 14 April, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Comments: Authors weren't supposed to upload given organisational agreements

arXiv:2312.05671 [pdf, other]

Hate Speech and Offensive Content Detection in Indo-Aryan Languages: A Battle of LSTM and Transformers

Authors: Nikhil Narayan, Mrutyunjay Biswal, Pramod Goyal, Abhranta Panigrahi

Abstract: Social media platforms serve as accessible outlets for individuals to express their thoughts and experiences, resulting in an influx of user-generated data spanning all age groups. While these platforms enable free expression, they also present significant challenges, including the proliferation of hate speech and offensive content. Such objectionable language disrupts objective discourse and can lead to radicalization of debates, ultimately threatening democratic values. Consequently, organizations have taken steps to monitor and curb abusive behavior, necessitating automated methods for identifying suspicious posts. This paper contributes to Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages (HASOC) 2023 shared tasks track. We, team Z-AGI Labs, conduct a comprehensive comparative analysis of hate speech classification across five distinct languages: Bengali, Assamese, Bodo, Sinhala, and Gujarati. Our study encompasses a wide range of pre-trained models, including Bert variants, XLM-R, and LSTM models, to assess their performance in identifying hate speech across these languages. Results reveal intriguing variations in model performance. Notably, Bert Base Multilingual Cased emerges as a strong performer across languages, achieving an F1 score of 0.67027 for Bengali and 0.70525 for Assamese. At the same time, it significantly outperforms other models with an impressive F1 score of 0.83009 for Bodo. In Sinhala, XLM-R stands out with an F1 score of 0.83493, whereas for Gujarati, a custom LSTM-based model outshined with an F1 score of 0.76601. This study offers valuable insights into the suitability of various pre-trained models for hate speech detection in multilingual settings. By considering the nuances of each, our research contributes to an informed model selection for building robust hate speech detection systems.

Submitted 9 December, 2023; originally announced December 2023.

Comments: 14 pages, 3 figures. Accepted Working Notes at HASOC-FIRE 2023, to be published in CEUR Working Notes of FIRE

arXiv:2202.02349 [pdf, other]

Analysis of Independent Learning in Network Agents: A Packet Forwarding Use Case

Authors: Abu Saleh Md Tayeen, Milan Biswal, Abderrahmen Mtibaa, Satyajayant Misra

Abstract: Multi-Agent Reinforcement Learning (MARL) is nowadays widely used to solve real-world and complex decisions in various domains. While MARL can be categorized into independent and cooperative approaches, we consider the independent approach as a simple, more scalable, and less costly method for large-scale distributed systems, such as network packet forwarding. In this paper, we quantitatively and qualitatively assess the benefits of leveraging such independent agents learning approach, in particular IQL-based algorithm, for packet forwarding in computer networking, using the Named Data Networking (NDN) architecture as a driving example. We put multiple IQL-based forwarding strategies (IDQF) to the test and compare their performances against very basic forwarding schemes and simple topologies/traffic models to highlight major challenges and issues. We discuss the main issues related to the poor performance of IDQF and quantify the impact of these issues on isolation when training and testing the IDQF models under different model tuning parameters and network topologies/characteristics.

Submitted 4 February, 2022; originally announced February 2022.

Comments: 6 pages, 4 figures

arXiv:2109.05666 [pdf, other]

AMI-FML: A Privacy-Preserving Federated Machine Learning Framework for AMI

Authors: Milan Biswal, Abu Saleh Md Tayeen, Satyajayant Misra

Abstract: Machine learning (ML) based smart meter data analytics is very promising for energy management and demand-response applications in the advanced metering infrastructure (AMI). A key challenge in developing distributed ML applications for AMI is to preserve user privacy while allowing active end-users participation. This paper addresses this challenge and proposes a privacy-preserving federated learning framework for ML applications in the AMI. We consider each smart meter as a federated edge device hosting an ML application that exchanges information with a central aggregator or a data concentrator, periodically. Instead of transferring the raw data sensed by the smart meters, the ML model weights are transferred to the aggregator to preserve privacy. The aggregator processes these parameters to devise a robust ML model that can be substituted at each edge device. We also discuss strategies to enhance privacy and improve communication efficiency while sharing the ML model parameters, suited for relatively slow network connections in the AMI. We demonstrate the proposed framework on a use case federated ML (FML) application that improves short-term load forecasting (STLF). We use a long short-term memory (LSTM) recurrent neural network (RNN) model for STLF. In our architecture, we assume that there is an aggregator connected to a group of smart meters. The aggregator uses the learned model gradients received from the federated smart meters to generate an aggregate, robust RNN model which improves the forecasting accuracy for individual and aggregated STLF. Our results indicate that with FML, forecasting accuracy is increased while preserving the data privacy of the end-users.

Submitted 15 December, 2021; v1 submitted 12 September, 2021; originally announced September 2021.

Comments: 7 pages

Showing 1–4 of 4 results for author: Biswal, M