Showing 1–17 of 17 results for author: Naseem, U

Searching in archive cs.

  1. arXiv:2404.13282  [pdf, other]

    cs.CV cs.MM

    Wills Aligner: A Robust Multi-Subject Brain Representation Learner

    Authors: Guangyin Bao, Zixuan Gong, Qi Zhang, Jialei Zhou, Wei Fan, Kun Yi, Usman Naseem, Liang Hu, Duoqian Miao

    Abstract: Decoding visual information from human brain activity has seen remarkable advancements in recent research. However, due to the significant variability in cortical parcellation and cognition patterns across subjects, current approaches personalize deep models for each subject, constraining the practicality of this technology in real-world contexts. To tackle the challenges, we introduce Wills Alig…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: 15 pages

  2. arXiv:2402.14834  [pdf, other]

    cs.CL cs.AI cs.IR

    MSynFD: Multi-hop Syntax aware Fake News Detection

    Authors: Liang Xiao, Qi Zhang, Chongyang Shi, Shoujin Wang, Usman Naseem, Liang Hu

    Abstract: The proliferation of social media platforms has fueled the rapid dissemination of fake news, posing threats to our real-life society. Existing methods use multimodal data or contextual information to enhance the detection of fake news by analyzing news content and/or its social context. However, these methods often overlook essential textual news content (articles) and heavily rely on sequential m…

    Submitted 19 June, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: 10 pages

  3. arXiv:2402.10772  [pdf, other]

    cs.CL

    Enhancing ESG Impact Type Identification through Early Fusion and Multilingual Models

    Authors: Hariram Veeramani, Surendrabikram Thapa, Usman Naseem

    Abstract: In the evolving landscape of Environmental, Social, and Corporate Governance (ESG) impact assessment, the ML-ESG-2 shared task proposes identifying ESG impact types. To address this challenge, we present a comprehensive system leveraging ensemble learning techniques, capitalizing on early and late fusion approaches. Our approach employs four distinct models: mBERT, FlauBERT-base, ALBERT-base-v2, a…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted to FinNLP workshop at IJCNLP-ACL 2023

  4. arXiv:2402.05128  [pdf, other]

    cs.CL cs.AI

    Enhancing Textbook Question Answering Task with Large Language Models and Retrieval Augmented Generation

    Authors: Hessa Abdulrahman Alawwad, Areej Alhothali, Usman Naseem, Ali Alkhathlan, Amani Jamal

    Abstract: Textbook question answering (TQA) is a challenging task in artificial intelligence due to the complex nature of context and multimodal data. Although previous research has significantly improved performance on the task, some limitations remain, including the models' weak reasoning and inability to capture contextual information in lengthy contexts. The introduction of large language models (LLMs) has…

    Submitted 14 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  5. arXiv:2312.09693  [pdf, other]

    cs.AI

    Prompting Large Language Models for Topic Modeling

    Authors: Han Wang, Nirmalendu Prakash, Nguyen Khoi Hoang, Ming Shan Hee, Usman Naseem, Roy Ka-Wei Lee

    Abstract: Topic modeling is a widely used technique for revealing underlying thematic structures within textual data. However, existing models have certain limitations, particularly when dealing with short text datasets that lack co-occurring words. Moreover, these models often neglect sentence-level semantics, focusing primarily on token-level semantics. In this paper, we propose PromptTopic, a novel topic…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: 6 pages, 3 figures, IEEE International Conference on Big Data

    ACM Class: I.2.7

  6. arXiv:2306.14764  [pdf]

    cs.CL

    Uncovering Political Hate Speech During Indian Election Campaign: A New Low-Resource Dataset and Baselines

    Authors: Farhan Ahmad Jafri, Mohammad Aman Siddiqui, Surendrabikram Thapa, Kritesh Rauniyar, Usman Naseem, Imran Razzak

    Abstract: The detection of hate speech in political discourse is a critical issue, and this becomes even more challenging in low-resource languages. To address this issue, we introduce a new dataset named IEHate, which contains 11,457 manually annotated Hindi tweets related to the Indian Assembly Election Campaign from November 1, 2021, to March 9, 2022. We performed a detailed analysis of the dataset, focu…

    Submitted 27 June, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

    Comments: Accepted to ICWSM Workshop (MEDIATE)

  7. arXiv:2305.13685  [pdf, other]

    cs.CL

    Causal Intervention for Abstractive Related Work Generation

    Authors: Jiachang Liu, Qi Zhang, Chongyang Shi, Usman Naseem, Shoujin Wang, Ivor Tsang

    Abstract: Abstractive related work generation has attracted increasing attention in generating coherent related work that better helps readers grasp the background in the current research. However, most existing abstractive models ignore the inherent causality of related work generation, leading to low quality of generated related work and spurious correlations that affect the models' generalizability. In t…

    Submitted 23 May, 2023; originally announced May 2023.

  8. arXiv:2304.13895  [pdf, other]

    cs.SI cs.AI

    Rumor Detection with Hierarchical Representation on Bipartite Adhoc Event Trees

    Authors: Qi Zhang, Yayi Yang, Chongyang Shi, An Lao, Liang Hu, Shoujin Wang, Usman Naseem

    Abstract: The rapid growth of social media has caused tremendous effects on information propagation, raising extreme challenges in detecting rumors. Existing rumor detection methods typically exploit the reposting propagation of a rumor candidate for detection by regarding all reposts to a rumor candidate as a temporal sequence and learning semantics representations of the repost sequence. However, extracti…

    Submitted 26 April, 2023; originally announced April 2023.

  9. arXiv:2301.11004  [pdf, other]

    cs.CL

    NLP as a Lens for Causal Analysis and Perception Mining to Infer Mental Health on Social Media

    Authors: Muskan Garg, Chandni Saxena, Usman Naseem, Bonnie J Dorr

    Abstract: Interactions among humans on social media often convey intentions behind their actions, yielding a psychological language resource for Mental Health Analysis (MHA) of online users. The success of Computational Intelligence Techniques (CIT) for inferring mental illness from such social media resources points to NLP as a lens for causal analysis and perception mining. However, we argue that more con…

    Submitted 22 August, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

  10. arXiv:2207.03938  [pdf]

    cs.IR cs.CL cs.CY

    An Approach to Ensure Fairness in News Articles

    Authors: Shaina Raza, Deepak John Reji, Dora D. Liu, Syed Raza Bashir, Usman Naseem

    Abstract: Recommender systems, information retrieval, and other information access systems present unique challenges for examining and applying concepts of fairness and bias mitigation in unstructured text. This paper introduces Dbias, which is a Python package to ensure fairness in news articles. Dbias is a trained Machine Learning (ML) pipeline that can take a text (e.g., a paragraph or news story) and de…

    Submitted 8 July, 2022; originally announced July 2022.

    Comments: Accepted in KDD 2022 Workshop on Data Science and Artificial Intelligence for Responsible Recommendations (DS4RRS)

  11. arXiv:2204.04521  [pdf, other]

    cs.CL

    Benchmarking for Public Health Surveillance tasks on Social Media with a Domain-Specific Pretrained Language Model

    Authors: Usman Naseem, Byoung Chan Lee, Matloob Khushi, Jinman Kim, Adam G. Dunn

    Abstract: User-generated text on social media enables health workers to keep track of information, identify possible outbreaks, forecast disease trends, monitor emergency cases, and ascertain disease awareness and response to official health correspondence. This exchange of health information on social media has been regarded as an attempt to enhance public health surveillance (PHS). Despite its potential…

    Submitted 9 April, 2022; originally announced April 2022.

    Comments: Accepted @ ACL2022 Workshop: The First Workshop on Efficient Benchmarking in NLP

  12. arXiv:2202.02824  [pdf]

    cs.CY cs.DB

    A Summary of COVID-19 Datasets

    Authors: Syed Raza Bashir, Shaina Raza, Vidhi Thakkar, Usman Naseem

    Abstract: This research presents a review of the main datasets that have been developed for COVID-19 research. We hope this collection will continue to bring together members of the computing community, biomedical experts, and policymakers in the pursuit of effective COVID-19 treatments and management policies. Many organizations, such as the World Health Organization (WHO), Johns Hopkins, National Institutes of Health…

    Submitted 27 July, 2022; v1 submitted 6 February, 2022; originally announced February 2022.

    Comments: Accepted in CAIML 2022: International Conference on Artificial Intelligence and Machine Learning

  13. arXiv:2107.04374  [pdf, other]

    cs.CL

    Benchmarking for Biomedical Natural Language Processing Tasks with a Domain Specific ALBERT

    Authors: Usman Naseem, Adam G. Dunn, Matloob Khushi, Jinman Kim

    Abstract: The availability of biomedical text data and advances in natural language processing (NLP) have made new applications in biomedical NLP possible. Language models trained or fine-tuned using domain-specific corpora can outperform general models, but work to date in biomedical NLP has been limited in terms of corpora and tasks. We present BioALBERT, a domain-specific adaptation of A Lite Bidirection…

    Submitted 9 July, 2021; originally announced July 2021.

  14. arXiv:2106.09589  [pdf, other]

    cs.CL

    Classifying vaccine sentiment tweets by modelling domain-specific representation and commonsense knowledge into context-aware attentive GRU

    Authors: Usman Naseem, Matloob Khushi, Jinman Kim, Adam G. Dunn

    Abstract: Vaccines are an important public health measure, but vaccine hesitancy and refusal can create clusters of low vaccine coverage and reduce the effectiveness of vaccination programs. Social media provides an opportunity to estimate emerging risks to vaccine acceptance by including geographical location and detailing vaccine-related concerns. Methods for classifying social media posts, such as vaccin…

    Submitted 17 June, 2021; originally announced June 2021.

    Comments: Accepted in International Joint Conference on Neural Networks (IJCNN) 2021

  15. arXiv:2103.16388  [pdf]

    q-fin.ST cs.LG

    Text Mining of Stocktwits Data for Predicting Stock Prices

    Authors: Mukul Jaggi, Priyanka Mandal, Shreya Narang, Usman Naseem, Matloob Khushi

    Abstract: Stock price prediction can be made more efficient by considering the price fluctuations and understanding the sentiments of people. A limited number of models understand financial jargon or have labelled datasets concerning stock price change. To overcome this challenge, we introduced FinALBERT, an ALBERT based model trained to handle financial domain text classification tasks by labelling Stocktw…

    Submitted 12 March, 2021; originally announced March 2021.

    Journal ref: Appl. Syst. Innov. 2021, 4, 13

  16. arXiv:2010.15036  [pdf, other]

    cs.CL

    A Comprehensive Survey on Word Representation Models: From Classical to State-Of-The-Art Word Representation Language Models

    Authors: Usman Naseem, Imran Razzak, Shah Khalid Khan, Mukesh Prasad

    Abstract: Word representation has always been an important research area in the history of natural language processing (NLP). Understanding such complex text data is imperative, given that it is rich in information and can be used widely across various applications. In this survey, we explore different word representation models and their power of expression, from the classical to modern-day state-of-the-art…

    Submitted 28 October, 2020; originally announced October 2020.

    Journal ref: ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP) 2020

  17. arXiv:2009.09223  [pdf, other]

    cs.CL

    BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition

    Authors: Usman Naseem, Matloob Khushi, Vinay Reddy, Sakthivel Rajendran, Imran Razzak, Jinman Kim

    Abstract: In recent years, with the growing amount of biomedical documents, coupled with advancement in natural language processing algorithms, the research on biomedical named entity recognition (BioNER) has increased exponentially. However, BioNER research is challenging as NER in the biomedical domain is: (i) often restricted due to limited amount of training data, (ii) an entity can refer to multiple t…

    Submitted 19 September, 2020; originally announced September 2020.

    Comments: 7 pages