Showing 1–48 of 48 results for author: Mamidi, R

  1. arXiv:2407.02978  [pdf, other]

    cs.CL cs.AI

    Mast Kalandar at SemEval-2024 Task 8: On the Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text

    Authors: Jainit Sushil Bafna, Hardik Mittal, Suyash Sethia, Manish Shrivastava, Radhika Mamidi

    Abstract: Large Language Models (LLMs) have showcased impressive abilities in generating fluent responses to diverse user queries. However, concerns regarding the potential misuse of such texts in journalism, educational, and academic contexts have surfaced. SemEval 2024 introduces the task of Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, aiming to develop automat…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: SemEval-2024

  2. arXiv:2406.07441  [pdf, other]

    cs.DC math.NA

    GPU Accelerated Implicit Kinetic Meshfree Method based on Modified LU-SGS

    Authors: Mayuri Verma, Anil Nemili, Nischay Ram Mamidi

    Abstract: This report presents the GPU acceleration of implicit kinetic meshfree methods using modified LU-SGS algorithms. The meshfree scheme is based on the least squares kinetic upwind method (LSKUM). In the existing matrix-free LU-SGS approaches for kinetic meshfree methods, the products of split flux Jacobians and increments in conserved vectors are approximated by increments in the split fluxes. In ou…

    Submitted 11 June, 2024; originally announced June 2024.

  3. arXiv:2405.19701  [pdf, other]

    cs.CL cs.AI

    Significance of Chain of Thought in Gender Bias Mitigation for English-Dravidian Machine Translation

    Authors: Lavanya Prahallad, Radhika Mamidi

    Abstract: Gender bias in machine translation (MT) systems poses a significant challenge to achieving accurate and inclusive translations. This paper examines gender bias in machine translation systems for languages such as Telugu and Kannada from the Dravidian family, analyzing how gender inflections affect translation accuracy and neutrality using Google Translate and ChatGPT. It finds that while plu…

    Submitted 3 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: 6 pages

  4. arXiv:2403.13287  [pdf, ps, other]

    cs.DC

    Regent based parallel meshfree LSKUM solver for heterogenous HPC platforms

    Authors: Sanath Salil, Nischay Ram Mamidi, Anil Nemili, Elliott Slaughter

    Abstract: Regent is an implicitly parallel programming language that allows the development of a single codebase for heterogeneous platforms targeting CPUs and GPUs. This paper presents the development of a parallel meshfree solver in Regent for two-dimensional inviscid compressible flows. The meshfree solver is based on the least squares kinetic upwind method. Example codes are presented to show the differ…

    Submitted 19 March, 2024; originally announced March 2024.

  5. arXiv:2403.12244  [pdf, other]

    cs.CL

    Zero-Shot Multi-task Hallucination Detection

    Authors: Patanjali Bhamidipati, Advaith Malladi, Manish Shrivastava, Radhika Mamidi

    Abstract: In recent studies, the extensive utilization of large language models has underscored the importance of robust evaluation methodologies for assessing text generation quality and relevance to specific tasks. This has revealed a prevalent issue known as hallucination, an emergent condition in the model where generated text lacks faithfulness to the source and deviates from the evaluation criteria. I…

    Submitted 18 March, 2024; originally announced March 2024.

  6. arXiv:2402.15873  [pdf, ps, other]

    cs.CL

    SemEval-2024 Task 8: Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection

    Authors: Ayan Datta, Aryan Chandramania, Radhika Mamidi

    Abstract: This document contains the details of the authors' submission to the proceedings of SemEval 2024's Task 8: Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection Subtask A (monolingual) and B. Detection of machine-generated text is becoming an increasingly important task, with the advent of large language models (LLMs). In this paper, we lay out how using weighted…

    Submitted 9 April, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

  7. arXiv:2212.12937  [pdf, other]

    cs.CL cs.LG

    GAE-ISumm: Unsupervised Graph-Based Summarization of Indian Languages

    Authors: Lakshmi Sireesha Vakada, Anudeep Ch, Mounika Marreddy, Subba Reddy Oota, Radhika Mamidi

    Abstract: Document summarization aims to create a precise and coherent summary of a text document. Many deep learning summarization models are developed mainly for English, often requiring a large training corpus and efficient pre-trained language models and tools. However, English summarization models for low-resource Indian languages are often limited by rich morphological variation, syntax, and semantic…

    Submitted 25 December, 2022; originally announced December 2022.

    Comments: 9 pages, 7 figures

  8. arXiv:2211.13815  [pdf, ps, other]

    cs.CL

    Using Selective Masking as a Bridge between Pre-training and Fine-tuning

    Authors: Tanish Lad, Himanshu Maheshwari, Shreyas Kottukkal, Radhika Mamidi

    Abstract: Pre-training a language model and then fine-tuning it for downstream tasks has demonstrated state-of-the-art results for various NLP tasks. Pre-training is usually independent of the downstream task, and previous works have shown that this pre-training alone might not be sufficient to capture the task-specific nuances. We propose a way to tailor a pre-trained BERT model for the downstream task via…

    Submitted 24 November, 2022; originally announced November 2022.

    Comments: ENLSP Workshop, NeurIPS 2022

  9. arXiv:2206.07318  [pdf, other]

    cs.CL

    CMNEROne at SemEval-2022 Task 11: Code-Mixed Named Entity Recognition by leveraging multilingual data

    Authors: Suman Dowlagar, Radhika Mamidi

    Abstract: Identifying named entities is, in general, a practical and challenging task in the field of Natural Language Processing. Named Entity Recognition on the code-mixed text is further challenging due to the linguistic complexity resulting from the nature of the mixing. This paper addresses the submission of team CMNEROne to the SEMEVAL 2022 shared task 11 MultiCoNER. The Code-mixed NER task aimed to i…

    Submitted 15 June, 2022; originally announced June 2022.

    Comments: SemEval 2022 Task 11: MultiCoNER Multilingual Complex Named Entity Recognition, NAACL, 2022

  10. arXiv:2206.03354  [pdf, other]

    cs.CL cs.CV

    cViL: Cross-Lingual Training of Vision-Language Models using Knowledge Distillation

    Authors: Kshitij Gupta, Devansh Gautam, Radhika Mamidi

    Abstract: Vision-and-language tasks are gaining popularity in the research community, but the focus is still mainly on English. We propose a pipeline that utilizes English-only vision-language models to train a monolingual model for a target language. We propose to extend OSCAR+, a model which leverages object tags as anchor points for learning image-text alignments, to train on visual question answering da…

    Submitted 9 June, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: Accepted at ICPR 2022; 9 pages

  11. arXiv:2205.02937  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Detection of Propaganda Techniques in Visuo-Lingual Metaphor in Memes

    Authors: Sunil Gundapu, Radhika Mamidi

    Abstract: The exponential rise of social media networks has allowed the production, distribution, and consumption of data at a phenomenal rate. Moreover, the social media revolution has brought a unique phenomenon to social media platforms called Internet memes. Internet memes are one of the most popular contents used on social media, and they can be in the form of images with a witty, catchy, or satirical…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: Paper accepted at 2nd International Conference on Machine Learning Techniques and Data Science (MLDS 2021)

  12. arXiv:2205.01204  [pdf, other]

    cs.CL cs.AI cs.LG

    Multi-Task Text Classification using Graph Convolutional Networks for Large-Scale Low Resource Language

    Authors: Mounika Marreddy, Subba Reddy Oota, Lakshmi Sireesha Vakada, Venkata Charan Chinni, Radhika Mamidi

    Abstract: Graph Convolutional Networks (GCN) have achieved state-of-art results on single text classification tasks like sentiment analysis, emotion detection, etc. However, the performance is achieved by testing and reporting on resource-rich languages like English. Applying GCN for multi-task text classification is an unexplored area. Moreover, training a GCN or adopting an English GCN for Indian language…

    Submitted 2 May, 2022; originally announced May 2022.

    Comments: 9 pages, 6 figures

  13. arXiv:2204.04347  [pdf, other]

    cs.CL cs.AI

    On the Importance of Karaka Framework in Multi-modal Grounding

    Authors: Sai Kiran Gorthi, Radhika Mamidi

    Abstract: Computational Paninian Grammar model helps in decoding a natural language expression as a series of modifier-modified relations and therefore facilitates in identifying dependency relations closer to language (context) semantics compared to the usual Stanford dependency relations. However, the importance of this CPG dependency scheme has not been studied in the context of multi-modal vision and la…

    Submitted 8 April, 2022; originally announced April 2022.

  14. arXiv:2108.07031  [pdf, other]

    cs.PL cs.PF physics.comp-ph

    On the performance of GPU accelerated q-LSKUM based meshfree solvers in Fortran, C++, Python, and Julia

    Authors: Nischay Ram Mamidi, Kumar Prasun, Dhruv Saxena, Anil Nemili, Bharatkumar Sharma, S. M. Deshpande

    Abstract: This report presents a comprehensive analysis of the performance of GPU accelerated meshfree CFD solvers for two-dimensional compressible flows in Fortran, C++, Python, and Julia. The programming model CUDA is used to develop the GPU codes. The meshfree solver is based on the least squares kinetic upwind method with entropy variables (q-LSKUM). To assess the computational efficiency of the GPU sol…

    Submitted 16 August, 2021; originally announced August 2021.

    Comments: 42 pages, 3 figures

    ACM Class: D.3.0; J.2

  15. arXiv:2106.00250  [pdf, other]

    cs.CL cs.CV

    ViTA: Visual-Linguistic Translation by Aligning Object Tags

    Authors: Kshitij Gupta, Devansh Gautam, Radhika Mamidi

    Abstract: Multimodal Machine Translation (MMT) enriches the source text with visual information for translation. It has gained popularity in recent years, and several pipelines have been proposed in the same direction. Yet, the task lacks quality datasets to illustrate the contribution of visual modality in the translation systems. In this paper, we propose our system under the team name Volta for the Multi…

    Submitted 28 June, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

    Comments: 7 pages, accepted at WAT-2021 co-located with ACL-IJCNLP 2021

  16. arXiv:2106.00240  [pdf, other]

    cs.CL cs.CV

    Volta at SemEval-2021 Task 6: Towards Detecting Persuasive Texts and Images using Textual and Multimodal Ensemble

    Authors: Kshitij Gupta, Devansh Gautam, Radhika Mamidi

    Abstract: Memes are one of the most popular types of content used to spread information online. They can influence a large number of people through rhetorical and psychological techniques. The task, Detection of Persuasion Techniques in Texts and Images, is to detect these persuasive techniques in memes. It consists of three subtasks: (A) Multi-label classification using textual content, (B) Multi-label cla…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: 7 pages, accepted at SemEval-2021 co-located with ACL-IJCNLP 2021

  17. Detection of Fake Users in SMPs Using NLP and Graph Embeddings

    Authors: Manojit Chakraborty, Shubham Das, Radhika Mamidi

    Abstract: Social Media Platforms (SMPs) like Facebook, Twitter, Instagram etc. have large user base all around the world that generates huge amount of data every second. This includes a lot of posts by fake and spam users, typically used by many organisations around the globe to have competitive edge over others. In this work, we aim at detecting such user accounts in Twitter using a novel approach. We show…

    Submitted 27 April, 2021; originally announced April 2021.

    Comments: 5 pages, 3 figures

  18. arXiv:2103.00536  [pdf, other]

    cs.CL cs.HC cs.LG

    Towards Conversational Humor Analysis and Design

    Authors: Tanishq Chaudhary, Mayank Goel, Radhika Mamidi

    Abstract: Well-defined jokes can be divided neatly into a setup and a punchline. While most works on humor today talk about a joke as a whole, the idea of generating punchlines to a setup has applications in conversational humor, where funny remarks usually occur with a non-funny context. Thus, this paper is based around two core concepts: Classification and the Generation of a punchline from a particular s…

    Submitted 28 February, 2021; originally announced March 2021.

  19. arXiv:2102.12179  [pdf, other]

    cs.CL

    Multichannel LSTM-CNN for Telugu Technical Domain Identification

    Authors: Sunil Gundapu, Radhika Mamidi

    Abstract: With the instantaneous growth of text information, retrieving domain-oriented information from the text data has a broad range of applications in Information Retrieval and Natural language Processing. Thematic keywords give a compressed representation of the text. Usually, Domain Identification plays a significant role in Machine Translation, Text Summarization, Question Answering, Information Ext…

    Submitted 24 February, 2021; originally announced February 2021.

    Comments: Paper accepted in The seventeenth International Conference on Natural Language Processing (ICON-2020)

  20. arXiv:2102.12082  [pdf, other]

    cs.CL

    Hopeful_Men@LT-EDI-EACL2021: Hope Speech Detection Using Indic Transliteration and Transformers

    Authors: Ishan Sanjeev Upadhyay, Nikhil E, Anshul Wadhawan, Radhika Mamidi

    Abstract: This paper aims to describe the approach we used to detect hope speech in the HopeEDI dataset. We experimented with two approaches. In the first approach, we used contextual embeddings to train classifiers using logistic regression, random forest, SVM, and LSTM based models. The second approach involved using a majority voting ensemble of 11 models which were obtained by fine-tuning pre-trained tra…

    Submitted 24 February, 2021; v1 submitted 24 February, 2021; originally announced February 2021.

  21. arXiv:2102.09990  [pdf, other]

    cs.CL

    Analyzing Curriculum Learning for Sentiment Analysis along Task Difficulty, Pacing and Visualization Axes

    Authors: Anvesh Rao Vijjini, Kaveri Anuranjana, Radhika Mamidi

    Abstract: While Curriculum Learning (CL) has recently gained traction in Natural Language Processing tasks, it is still not adequately analyzed. Previous works only show their effectiveness but fall short of fully explaining and interpreting the internal workings. In this paper, we analyze curriculum learning in sentiment analysis along multiple axes. Some of these axes have been proposed by earlier works that ne…

    Submitted 2 March, 2021; v1 submitted 19 February, 2021; originally announced February 2021.

    Comments: Accepted for presentation at WASSA 2021 at EACL

  22. arXiv:2101.09015  [pdf, ps, other]

    cs.CL

    Unsupervised Technical Domain Terms Extraction using Term Extractor

    Authors: Suman Dowlagar, Radhika Mamidi

    Abstract: Terminology extraction, also known as term extraction, is a subtask of information extraction. The goal of terminology extraction is to extract relevant words or phrases from a given corpus automatically. This paper focuses on the unsupervised automated domain term extraction method that considers chunking, preprocessing, and ranking domain-specific terms using relevance and cohesion functions for…

    Submitted 22 January, 2021; originally announced January 2021.

  23. arXiv:2101.09012  [pdf, other]

    cs.CL

    Multilingual Pre-Trained Transformers and Convolutional NN Classification Models for Technical Domain Identification

    Authors: Suman Dowlagar, Radhika Mamidi

    Abstract: In this paper, we present a transfer learning system to perform technical domain identification on multilingual text data. We have submitted two runs, one uses the transformer model BERT, and the other uses XLM-ROBERTa with the CNN model for text classification. These models allowed us to identify the domain of the given sentences for the ICON 2020 shared Task, TechDOfication: Technical Domain Ide…

    Submitted 22 January, 2021; originally announced January 2021.

  24. arXiv:2101.09009  [pdf, other]

    cs.CL

    Does a Hybrid Neural Network based Feature Selection Model Improve Text Classification?

    Authors: Suman Dowlagar, Radhika Mamidi

    Abstract: Text classification is a fundamental problem in the field of natural language processing. Text classification mainly focuses on giving more importance to all the relevant features that help classify the textual data. Apart from these, the text can have redundant or highly correlated features. These features increase the complexity of the classification algorithm. Thus, many dimensionality reductio…

    Submitted 22 January, 2021; originally announced January 2021.

  25. arXiv:2101.09007  [pdf, other]

    cs.CL

    HASOCOne@FIRE-HASOC2020: Using BERT and Multilingual BERT models for Hate Speech Detection

    Authors: Suman Dowlagar, Radhika Mamidi

    Abstract: Hateful and Toxic content has become a significant concern in today's world due to an exponential rise in social media. The increase in hate speech and harmful content motivated researchers to dedicate substantial efforts to the challenging direction of hateful content identification. In this task, we propose an approach to automatically classify hate speech and offensive content. We have used the…

    Submitted 22 January, 2021; originally announced January 2021.

  26. arXiv:2101.09004  [pdf, other]

    cs.CL cs.LG

    CMSAOne@Dravidian-CodeMix-FIRE2020: A Meta Embedding and Transformer model for Code-Mixed Sentiment Analysis on Social Media Text

    Authors: Suman Dowlagar, Radhika Mamidi

    Abstract: Code-mixing (CM) is a frequently observed phenomenon that uses multiple languages in an utterance or sentence. CM is mostly practiced on various social media platforms and in informal conversations. Sentiment analysis (SA) is a fundamental step in NLP and is well studied in the monolingual text. Code-mixing adds a challenge to sentiment analysis due to its non-standard representations. This paper p…

    Submitted 22 January, 2021; originally announced January 2021.

    Comments: FIRE 2020: Forum for Information Retrieval Evaluation, December 16-20, 2020, Hyderabad, India

  27. arXiv:2101.00180  [pdf, other]

    cs.CL

    Transformer based Automatic COVID-19 Fake News Detection System

    Authors: Sunil Gundapu, Radhika Mamidi

    Abstract: Recent rapid technological advancements in online social networks such as Twitter have led to a great incline in spreading false information and fake news. Misinformation is especially prevalent in the ongoing coronavirus disease (COVID-19) pandemic, leading to individuals accepting bogus and potentially deleterious claims and articles. Quick detection of fake news can reduce the spread of panic a…

    Submitted 21 January, 2021; v1 submitted 1 January, 2021; originally announced January 2021.

    Comments: First Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situation, 12 pages

  28. arXiv:2010.04482  [pdf, other]

    cs.CL

    Word Level Language Identification in English Telugu Code Mixed Data

    Authors: Sunil Gundapu, Radhika Mamidi

    Abstract: In a multilingual or sociolingual configuration Intra-sentential Code Switching (ICS) or Code Mixing (CM) is frequently observed nowadays. In the world, most of the people know more than one language. CM usage is especially apparent in social media platforms. Moreover, ICS is particularly significant in the context of technology, health, and law where conveying the upcoming developments are diffic…

    Submitted 9 October, 2020; originally announced October 2020.

    Comments: 7 pages, 3 figures

  29. arXiv:2010.04470  [pdf, other]

    cs.CV

    gundapusunil at SemEval-2020 Task 8: Multimodal Memotion Analysis

    Authors: Sunil Gundapu, Radhika Mamidi

    Abstract: Recent technological advancements in the Internet and Social media usage have resulted in the evolution of faster and efficient platforms of communication. These platforms include visual, textual and speech mediums and have brought a unique social phenomenon called Internet memes. Internet memes are in the form of images with witty, catchy, or sarcastic text descriptions. In this paper, we present…

    Submitted 9 October, 2020; originally announced October 2020.

    Comments: 8 pages, 4 figures

  30. arXiv:2010.04395  [pdf, other]

    cs.CL

    gundapusunil at SemEval-2020 Task 9: Syntactic Semantic LSTM Architecture for SENTIment Analysis of Code-MIXed Data

    Authors: Sunil Gundapu, Radhika Mamidi

    Abstract: The phenomenon of mixing the vocabulary and syntax of multiple languages within the same utterance is called Code-Mixing. This is more evident in multilingual societies. In this paper, we have developed a system for SemEval 2020: Task 9 on Sentiment Analysis for Code-Mixed Social Media Text. Our system first generates two types of embeddings for the social media text. In those, the first one is ch…

    Submitted 9 October, 2020; originally announced October 2020.

    Comments: 6 pages, 2 figures

  31. arXiv:2006.01222  [pdf, other]

    cs.CL cs.SI

    BERT-based Ensembles for Modeling Disclosure and Support in Conversational Social Media Text

    Authors: Tanvi Dadu, Kartikey Pant, Radhika Mamidi

    Abstract: There is a growing interest in understanding how humans initiate and hold conversations. The affective understanding of conversations focuses on the problem of how speakers use emotions to react to a situation and to each other. In the CL-Aff Shared Task, the organizers released Get it #OffMyChest dataset, which contains Reddit comments from casual and confessional conversations, labeled for their…

    Submitted 1 June, 2020; originally announced June 2020.

    Comments: Accepted at the Affective Content workshop held at AAAI 2020 as the Best System Paper

  32. arXiv:2005.04749  [pdf, other]

    cs.CL

    A SentiWordNet Strategy for Curriculum Learning in Sentiment Analysis

    Authors: Vijjini Anvesh Rao, Kaveri Anuranjana, Radhika Mamidi

    Abstract: Curriculum Learning (CL) is the idea that learning on a training set sequenced or ordered in a manner where samples range from easy to difficult, results in an increment in performance over otherwise random ordering. The idea parallels cognitive science's theory of how human brains learn, and that learning a difficult task can be made easier by phrasing it as a sequence of easy to difficult tasks.…

    Submitted 21 July, 2020; v1 submitted 10 May, 2020; originally announced May 2020.

    Comments: Accepted Short Paper at 25th International Conference on Applications of Natural Language to Information Systems, June 2020, DFKI Saarbrücken, Germany

  33. Towards Detection of Subjective Bias using Contextualized Word Embeddings

    Authors: Tanvi Dadu, Kartikey Pant, Radhika Mamidi

    Abstract: Subjective bias detection is critical for applications like propaganda detection, content recommendation, sentiment analysis, and bias neutralization. This bias is introduced in natural language via inflammatory words and phrases, casting doubt over facts, and presupposing the truth. In this work, we perform comprehensive experiments for detecting subjective bias using BERT-based models on the Wik…

    Submitted 16 February, 2020; originally announced February 2020.

    Comments: To appear in Companion Proceedings of the Web Conference 2020 (WWW '20 Companion)

  34. arXiv:1911.10704  [pdf]

    cs.CL

    Conversational implicatures in English dialogue: Annotated dataset

    Authors: Elizabeth Jasmi George, Radhika Mamidi

    Abstract: Human dialogue often contains utterances having meanings entirely different from the sentences used and are clearly understood by the interlocutors. But in human-computer interactions, the machine fails to understand the implicated meaning unless it is trained with a dataset containing the implicated meaning of an utterance along with the utterance and the context in which it is uttered. In lingui…

    Submitted 24 November, 2019; originally announced November 2019.

    Comments: 8 Pages, NLP'19 Short paper

  35. arXiv:1911.09994  [pdf, other]

    cs.CL cs.AI

    Anaphora Resolution in Dialogue Systems for South Asian Languages

    Authors: Vinay Annam, Nikhil Koditala, Radhika Mamidi

    Abstract: Anaphora resolution is a challenging task which has been the interest of NLP researchers for a long time. Traditional resolution techniques like eliminative constraints and weighted preferences were successful in many languages. However, they are ineffective in free word order languages like most South Asian languages. Heuristic and rule-based techniques were typical in these languages, which are co…

    Submitted 22 November, 2019; originally announced November 2019.

  36. arXiv:1910.08294  [pdf]

    cs.CL cs.AI

    Towards Computing Inferences from English News Headlines

    Authors: Elizabeth Jasmi George, Radhika Mamidi

    Abstract: Newspapers are a popular form of written discourse, read by many people, thanks to the novelty of the information provided by the news content in it. A headline is the most widely read part of any newspaper due to its appearance in a bigger font and sometimes in colour print. In this paper, we suggest and implement a method for computing inferences from English news headlines, excluding the inform…

    Submitted 18 October, 2019; originally announced October 2019.

    Comments: PACLING 2019 Long paper, 15 pages

  37. SmokEng: Towards Fine-grained Classification of Tobacco-related Social Media Text

    Authors: Kartikey Pant, Venkata Himakar Yanamandra, Alok Debnath, Radhika Mamidi

    Abstract: Contemporary datasets on tobacco consumption focus on one of two topics, either public health mentions and disease surveillance, or sentiment analysis on topical tobacco products and services. However, two primary considerations are not accounted for, the language of the demographic affected and a combination of the topics mentioned above in a fine-grained classification mechanism. In this paper,…

    Submitted 12 October, 2019; originally announced October 2019.

    Comments: Accepted at the Workshop on Noisy User-generated Text (W-NUT) at EMNLP-IJCNLP 2019

  38. arXiv:1906.08570  [pdf, other]

    cs.CL

    Hindi Question Generation Using Dependency Structures

    Authors: Kaveri Anuranjana, Vijjini Anvesh Rao, Radhika Mamidi

    Abstract: Hindi question answering systems suffer from a lack of data. To address the same, this paper presents an approach towards automatic question generation. We present a rule-based system for question generation in Hindi by formalizing question transformation methods based on karaka-dependency theory. We use a Hindi dependency parser to mark the karaka roles and use IndoWordNet a Hindi ontology to det…

    Submitted 20 June, 2019; originally announced June 2019.

  39. arXiv:1904.00762  [pdf, other]

    cs.IR cs.CL cs.LG stat.ML

    Affect in Tweets Using Experts Model

    Authors: Subba Reddy Oota, Adithya Avvaru, Mounika Marreddy, Radhika Mamidi

    Abstract: Estimating the intensity of emotion has gained significance as modern textual inputs in potential applications like social media, e-retail markets, psychology, advertisements etc., carry a lot of emotions, feelings, expressions along with its meaning. However, the approaches of traditional sentiment analysis primarily focuses on classifying the sentiment in general (positive or negative) or at an…

    Submitted 20 March, 2019; originally announced April 2019.

    Comments: 10 pages, 6 figures, The 32nd Pacific Asia Conference on Language, Information and Computation (PACLIC 32)

  40. arXiv:1807.03004  [pdf, other]

    cs.CL

    Towards Enhancing Lexical Resource and Using Sense-annotations of OntoSenseNet for Sentiment Analysis

    Authors: Sreekavitha Parupalli, Vijjini Anvesh Rao, Radhika Mamidi

    Abstract: This paper illustrates the interface of the tool we developed for crowd sourcing and we explain the annotation procedure in detail. Our tool is named as 'Parupalli Padajaalam' which means web of words by Parupalli. The aim of this tool is to populate the OntoSenseNet, sentiment polarity annotated Telugu resource. Recent works have shown the importance of word-level annotations on sentiment analysi…

    Submitted 25 July, 2018; v1 submitted 9 July, 2018; originally announced July 2018.

    Comments: Accepted at 3rd Workshop on Semantic Deep Learning (SemDeep-3) at The 27th International Conference on Computational Linguistics, COLING (August 2018) in Santa Fe, New Mexico, USA

  41. arXiv:1807.01679  [pdf, other]

    cs.CL

    BCSAT : A Benchmark Corpus for Sentiment Analysis in Telugu Using Word-level Annotations

    Authors: Sreekavitha Parupalli, Vijjini Anvesh Rao, Radhika Mamidi

    Abstract: The presented work aims at generating a systematically annotated corpus that can support the enhancement of sentiment analysis tasks in Telugu using word-level sentiment annotations. From OntoSenseNet, we extracted 11,000 adjectives, 253 adverbs, 8483 verbs and sentiment annotation is being done by language experts. We discuss the methodology followed for the polarity annotations and validate the…

    Submitted 4 July, 2018; originally announced July 2018.

    Comments: Accepted as Long Paper at Student Research Workshop in 56th Annual Meeting of the Association for Computational Linguistics, ACL-2018

  42. arXiv:1807.01677  [pdf, other]

    cs.CL

    Towards Automation of Sense-type Identification of Verbs in OntoSenseNet(Telugu)

    Authors: Sreekavitha Parupalli, Vijjini Anvesh Rao, Radhika Mamidi

    Abstract: In this paper, we discuss the enrichment of a manually developed resource of Telugu lexicon, OntoSenseNet. OntoSenseNet is a ontological sense annotated lexicon that marks each verb of Telugu with a primary and a secondary sense. The area of research is relatively recent but has a large scope of development. We provide an introductory work to enrich the OntoSenseNet to promote further research in…

    Submitted 4 July, 2018; originally announced July 2018.

    Comments: Accepted Long Oral Paper at 6th International Workshop on Natural Language Processing for Social Media (SocialNLP) at 56th Annual Meeting of the Association for Computational Linguistics, ACL

  43. arXiv:1806.04535  [pdf, other]

    cs.CL cs.AI

    Automatic Target Recovery for Hindi-English Code Mixed Puns

    Authors: Srishti Aggarwal, Kritik Mathur, Radhika Mamidi

    Abstract: In order for our computer systems to be more human-like, with a higher emotional quotient, they need to be able to process and understand intrinsic human language phenomena like humour. In this paper, we consider a subtype of humour - puns, which are a common type of wordplay-based jokes. In particular, we consider code-mixed puns which have become increasingly mainstream on social media, in infor…

    Submitted 11 June, 2018; originally announced June 2018.

  44. arXiv:1806.03821  [pdf, other]

    cs.CL

    Addition of Code Mixed Features to Enhance the Sentiment Prediction of Song Lyrics

    Authors: Gangula Rama Rohit Reddy, Radhika Mamidi

    Abstract: Sentiment analysis, also called opinion mining, is the field of study that analyzes people's opinions, sentiments, attitudes and emotions. Songs are important to sentiment analysis since the songs and mood are mutually dependent on each other. Based on the selected song it becomes easy to find the mood of the listener, in future it can be used for recommendation. The song lyric is a rich source of…

    Submitted 11 June, 2018; originally announced June 2018.

  45. arXiv:1805.11372  [pdf, other]

    cs.CV

    "How to rate a video game?" - A prediction system for video games based on multimodal information

    Authors: Vishal Batchu, Varshit Battu, Murali Krishna Reddy, Radhika Mamidi

    Abstract: Video games have become an integral part of most people's lives in recent times. This led to an abundance of data related to video games being shared online. However, this comes with issues such as incorrect ratings, reviews or anything that is being shared. Recommendation systems are powerful tools that help users by providing them with meaningful recommendations. A straightforward approach would…

    Submitted 29 May, 2018; originally announced May 2018.

    Comments: ICPRAI-18

  46. arXiv:1804.05398  [pdf]

    cs.CL

    Context and Humor: Understanding Amul advertisements of India

    Authors: Radhika Mamidi

    Abstract: Contextual knowledge is the most important element in understanding language. By contextual knowledge we mean both general knowledge and discourse knowledge i.e. knowledge of the situational context, background knowledge and the co-textual context [10]. In this paper, we will discuss the importance of contextual knowledge in understanding the humor present in the cartoon based Amul advertisements…

    Submitted 15 April, 2018; originally announced April 2018.

    Comments: Presented at Workshop in Designing Humour in Human-Computer Interaction (HUMIC 2017). September 26th 2017, Mumbai, India. In conjunction with INTERACT 2017

  47. arXiv:1605.07366  [pdf]

    cs.CL

    Experiments in Linear Template Combination using Genetic Algorithms

    Authors: Nikhilesh Bhatnagar, Radhika Mamidi

    Abstract: Natural Language Generation systems typically have two parts - strategic ('what to say') and tactical ('how to say'). We present our experiments in building an unsupervised corpus-driven template based tactical NLG system. We consider templates as a sequence of words containing gaps. Our idea is based on the observation that templates are grammatical locally (within their textual span). We posit t…

    Submitted 24 May, 2016; originally announced May 2016.

    Comments: 6 pages

    MSC Class: 68T50

  48. arXiv:1604.03136  [pdf, other]

    cs.CL

    Shallow Parsing Pipeline for Hindi-English Code-Mixed Social Media Text

    Authors: Arnav Sharma, Sakshi Gupta, Raveesh Motlani, Piyush Bansal, Manish Srivastava, Radhika Mamidi, Dipti M. Sharma

    Abstract: In this study, the problem of shallow parsing of Hindi-English code-mixed social media text (CSMT) has been addressed. We have annotated the data, developed a language identifier, a normalizer, a part-of-speech tagger and a shallow parser. To the best of our knowledge, we are the first to attempt shallow parsing on CSMT. The pipeline developed has been made available to the research community with…

    Submitted 11 April, 2016; originally announced April 2016.