Showing 1–48 of 48 results for author: Majumder, N

  1. arXiv:2406.15487  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Improving Text-To-Audio Models with Synthetic Captions

    Authors: Zhifeng Kong, Sang-gil Lee, Deepanway Ghosal, Navonil Majumder, Ambuj Mehrish, Rafael Valle, Soujanya Poria, Bryan Catanzaro

    Abstract: It is an open challenge to obtain high quality training data, especially captions, for text-to-audio models. Although prior methods have leveraged text-only language models to augment and improve captions, such methods have limitations related to scale and coherence between audio and captions. In this work, we propose an audio captioning pipeline that uses an audio language model…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2406.15193  [pdf, other]

    cs.CL

    Reward Steering with Evolutionary Heuristics for Decoding-time Alignment

    Authors: Chia-Yu Hung, Navonil Majumder, Ambuj Mehrish, Soujanya Poria

    Abstract: The widespread applicability and increasing omnipresence of LLMs have instigated a need to align LLM responses to user and stakeholder preferences. Many preference optimization approaches have been proposed that fine-tune LLM parameters to achieve good alignment. However, such parameter tuning is known to interfere with model performance on many tasks. Moreover, keeping up with shifting user prefe…

    Submitted 25 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

  3. arXiv:2404.09956  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization

    Authors: Navonil Majumder, Chia-Yu Hung, Deepanway Ghosal, Wei-Ning Hsu, Rada Mihalcea, Soujanya Poria

    Abstract: Generative multimodal content is increasingly prevalent in much of the content creation arena, as it has the potential to allow artists and media personnel to create pre-production mockups by quickly bringing their ideas to life. The generation of audio from text prompts is an important aspect of such processes in the music and film industry. Many of the recent diffusion-based text-to-audio models…

    Submitted 16 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: https://github.com/declare-lab/tango

  4. arXiv:2401.09395  [pdf, other]

    cs.CL

    Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions

    Authors: Pengfei Hong, Navonil Majumder, Deepanway Ghosal, Somak Aditya, Rada Mihalcea, Soujanya Poria

    Abstract: Recent advancements in Large Language Models (LLMs) have showcased striking results on existing logical reasoning benchmarks, with some models even surpassing human performance. However, the true depth of their competencies and robustness in reasoning tasks remains an open question. To this end, in this paper, we focus on two popular reasoning tasks: arithmetic reasoning and code generation. Parti…

    Submitted 27 June, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

  5. arXiv:2311.08355  [pdf, other]

    eess.AS

    Mustango: Toward Controllable Text-to-Music Generation

    Authors: Jan Melechovsky, Zixun Guo, Deepanway Ghosal, Navonil Majumder, Dorien Herremans, Soujanya Poria

    Abstract: The quality of the text-to-music models has reached new heights due to recent advancements in diffusion models. The controllability of various musical aspects, however, has barely been explored. In this paper, we propose Mustango: a music-domain-knowledge-inspired text-to-music system based on diffusion. Mustango aims to control the generated music, not only with general text captions, but with mo…

    Submitted 3 June, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: NAACL 2024

  6. arXiv:2310.20159  [pdf, other]

    cs.CV cs.AI

    Language Guided Visual Question Answering: Elevate Your Multimodal Language Model Using Knowledge-Enriched Prompts

    Authors: Deepanway Ghosal, Navonil Majumder, Roy Ka-Wei Lee, Rada Mihalcea, Soujanya Poria

    Abstract: Visual question answering (VQA) is the task of answering questions about an image. The task assumes an understanding of both the image and the question to provide a natural language answer. VQA has gained popularity in recent years due to its potential applications in a wide range of fields, including robotics, education, and healthcare. In this paper, we focus on knowledge-augmented VQA, where an…

    Submitted 30 October, 2023; originally announced October 2023.

  7. arXiv:2307.02053  [pdf, other]

    cs.CL

    Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning

    Authors: Deepanway Ghosal, Yew Ken Chia, Navonil Majumder, Soujanya Poria

    Abstract: Recently, the release of INSTRUCTEVAL has provided valuable insights into the performance of large language models (LLMs) that utilize encoder-decoder or decoder-only architecture. Interestingly, despite being introduced four years ago, T5-based LLMs, such as FLAN-T5, continue to outperform the latest decoder-based LLMs, such as LLAMA and VICUNA, on tasks that require general problem-solving skill…

    Submitted 5 July, 2023; originally announced July 2023.

  8. arXiv:2305.18028  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation

    Authors: Ambuj Mehrish, Abhinav Ramesh Kashyap, Li Yingting, Navonil Majumder, Soujanya Poria

    Abstract: There are significant challenges for speaker adaptation in text-to-speech for languages that are not widely spoken or for speakers with accents or dialects that are not well-represented in the training data. To address this issue, we propose the use of the "mixture of adapters" method. This approach involves adding multiple adapters within a backbone-model layer to learn the unique characteristics…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: Interspeech 2023

  9. arXiv:2305.12301  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Sentence Embedder Guided Utterance Encoder (SEGUE) for Spoken Language Understanding

    Authors: Yi Xuan Tan, Navonil Majumder, Soujanya Poria

    Abstract: The pre-trained speech encoder wav2vec 2.0 performs very well on various spoken language understanding (SLU) tasks. However, on many tasks, it trails behind text encoders with textual input. To improve the understanding capability of SLU encoders, various studies have used knowledge distillation to transfer knowledge from natural language understanding (NLU) encoders. We use a very simple method o…

    Submitted 20 May, 2023; originally announced May 2023.

    Comments: Interspeech 2023

  10. arXiv:2305.00359  [pdf, other]

    eess.AS

    A Review of Deep Learning Techniques for Speech Processing

    Authors: Ambuj Mehrish, Navonil Majumder, Rishabh Bhardwaj, Rada Mihalcea, Soujanya Poria

    Abstract: The field of speech processing has undergone a transformative shift with the advent of deep learning. The use of multiple processing layers has enabled the creation of models capable of extracting intricate features from speech data. This development has paved the way for unparalleled advancements in speech recognition, text-to-speech synthesis, automatic speech recognition, and emotion recognitio…

    Submitted 30 May, 2023; v1 submitted 29 April, 2023; originally announced May 2023.

  11. arXiv:2304.13731  [pdf, other]

    eess.AS cs.AI cs.CL cs.SD

    Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model

    Authors: Deepanway Ghosal, Navonil Majumder, Ambuj Mehrish, Soujanya Poria

    Abstract: The immense scale of the recent large language models (LLM) allows many interesting properties, such as instruction- and chain-of-thought-based fine-tuning, that have significantly improved zero- and few-shot performance in many natural language processing (NLP) tasks. Inspired by such successes, we adopt such an instruction-tuned LLM Flan-T5 as the text encoder for text-to-audio (TTA) generation…

    Submitted 29 May, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

    Comments: https://github.com/declare-lab/tango

  12. arXiv:2303.03267  [pdf, other]

    cs.CL cs.SD eess.AS

    Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding

    Authors: Yingting Li, Ambuj Mehrish, Shuai Zhao, Rishabh Bhardwaj, Amir Zadeh, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: Fine-tuning is widely used as the default algorithm for transfer learning from pre-trained models. Parameter inefficiency can however arise when, during transfer learning, all the parameters of a large pre-trained model need to be updated for individual downstream tasks. As the number of parameters grows, fine-tuning is prone to overfitting and catastrophic forgetting. In addition, full fine-tunin…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: ICASSP 2023

  13. arXiv:2210.16495  [pdf, other]

    cs.CL cs.AI cs.LG

    Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering

    Authors: Deepanway Ghosal, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: We propose a simple refactoring of multi-choice question answering (MCQA) tasks as a series of binary classifications. The MCQA task is generally performed by scoring each (question, answer) pair normalized over all the pairs, and then selecting the answer from the pair that yields the highest score. For n answer choices, this is equivalent to an n-class classification setup where only one class (t…

    Submitted 29 October, 2022; originally announced October 2022.

  14. arXiv:2210.02890  [pdf, other]

    cs.CL

    Multiview Contextual Commonsense Inference: A New Dataset and Task

    Authors: Siqi Shen, Deepanway Ghosal, Navonil Majumder, Henry Lim, Rada Mihalcea, Soujanya Poria

    Abstract: Contextual commonsense inference is the task of generating various types of explanations around the events in a dyadic dialogue, including cause, motivation, emotional reaction, and others. Producing a coherent and non-trivial explanation requires awareness of the dialogue's structure and of how an event is grounded in the context. In this work, we create CICEROv2, a dataset consisting of 8,351 in…

    Submitted 2 November, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

  15. arXiv:2209.13101  [pdf, other]

    cs.CL

    WikiDes: A Wikipedia-Based Dataset for Generating Short Descriptions from Paragraphs

    Authors: Hoang Thang Ta, Abu Bakar Siddiqur Rahman, Navonil Majumder, Amir Hussain, Lotfollah Najjar, Newton Howard, Soujanya Poria, Alexander Gelbukh

    Abstract: As free online encyclopedias with massive volumes of content, Wikipedia and Wikidata are key to many Natural Language Processing (NLP) tasks, such as information retrieval, knowledge base building, machine translation, text classification, and text summarization. In this paper, we introduce WikiDes, a novel dataset to generate short descriptions of Wikipedia articles for the problem of text summar…

    Submitted 26 September, 2022; originally announced September 2022.

    Comments: 27 pages, 8 figures, 15 tables

  16. arXiv:2203.13926  [pdf, other]

    cs.CL cs.AI

    CICERO: A Dataset for Contextualized Commonsense Inference in Dialogues

    Authors: Deepanway Ghosal, Siqi Shen, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: This paper addresses the problem of dialogue reasoning with contextualized commonsense inference. We curate CICERO, a dataset of dyadic conversations with five types of utterance-level reasoning-based inferences: cause, subsequent event, prerequisite, motivation, and emotional reaction. The dataset contains 53,105 of such inferences from 5,672 dialogues. We use this dataset to solve relevant gener…

    Submitted 6 April, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

    Comments: ACL 2022

  17. arXiv:2109.02247  [pdf, other]

    cs.CL cs.AI

    STaCK: Sentence Ordering with Temporal Commonsense Knowledge

    Authors: Deepanway Ghosal, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: Sentence order prediction is the task of finding the correct order of sentences in a randomly ordered document. Correctly ordering the sentences requires an understanding of coherence with respect to the chronological sequence of events described in the text. Document-level contextual understanding and commonsense knowledge centered around these events are often essential in uncovering this cohere…

    Submitted 6 September, 2021; originally announced September 2021.

    Comments: Accepted as a full paper at EMNLP 2021

  18. arXiv:2108.09689  [pdf, other]

    cs.CL

    Improving Distantly Supervised Relation Extraction with Self-Ensemble Noise Filtering

    Authors: Tapas Nayak, Navonil Majumder, Soujanya Poria

    Abstract: Distantly supervised models are very popular for relation extraction since we can obtain a large amount of training data using the distant supervision method without human annotation. In distant supervision, a sentence is considered as a source of a tuple if the sentence contains both entities of the tuple. However, this condition is too permissive and does not guarantee the presence of relevant r…

    Submitted 22 August, 2021; originally announced August 2021.

    Comments: Accepted in RANLP 2021. arXiv admin note: substantial text overlap with arXiv:2104.01799, arXiv:2103.16929

  19. arXiv:2108.06107  [pdf, other]

    cs.CL cs.AI

    Aspect Sentiment Triplet Extraction Using Reinforcement Learning

    Authors: Samson Yu Bai Jian, Tapas Nayak, Navonil Majumder, Soujanya Poria

    Abstract: Aspect Sentiment Triplet Extraction (ASTE) is the task of extracting triplets of aspect terms, their associated sentiments, and the opinion terms that provide evidence for the expressed sentiments. Previous approaches to ASTE usually simultaneously extract all three components or first identify the aspect and opinion terms, then pair them up to predict their sentiment polarities. In this work, we…

    Submitted 13 August, 2021; originally announced August 2021.

    Comments: CIKM 2021

  20. arXiv:2108.01260  [pdf, other]

    cs.CL

    M2H2: A Multimodal Multiparty Hindi Dataset For Humor Recognition in Conversations

    Authors: Dushyant Singh Chauhan, Gopendra Vikram Singh, Navonil Majumder, Amir Zadeh, Asif Ekbal, Pushpak Bhattacharyya, Louis-Philippe Morency, Soujanya Poria

    Abstract: Humor recognition in conversations is a challenging task that has recently gained popularity due to its importance in dialogue understanding, including in multimodal settings (i.e., text, acoustics, and visual). The few existing datasets for humor are mostly in English. However, due to the tremendous growth in multilingual content, there is a great demand to build models and systems that support m…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: ICMI 2021

  21. arXiv:2106.11791  [pdf, other]

    cs.CL cs.AI

    Exemplars-guided Empathetic Response Generation Controlled by the Elements of Human Communication

    Authors: Navonil Majumder, Deepanway Ghosal, Devamanyu Hazarika, Alexander Gelbukh, Rada Mihalcea, Soujanya Poria

    Abstract: The majority of existing methods for empathetic response generation rely on the emotion of the context to generate empathetic responses. However, empathy is much more than generating responses with an appropriate emotion. It also often entails subtle expressions of understanding and personal resonance with the situation of the other interlocutor. Unfortunately, such qualities are difficult to quan…

    Submitted 4 August, 2021; v1 submitted 22 June, 2021; originally announced June 2021.

  22. arXiv:2106.01269  [pdf, other]

    cs.CL

    More Identifiable yet Equally Performant Transformers for Text Classification

    Authors: Rishabh Bhardwaj, Navonil Majumder, Soujanya Poria, Eduard Hovy

    Abstract: Interpretability is an important aspect of the trustworthiness of a model's predictions. Transformer's predictions are widely explained by the attention weights, i.e., a probability distribution generated at its self-attention unit (head). Current empirical studies provide shreds of evidence that attention weights are not explanations by proving that they are not unique. A recent study showed theo…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: ACL 2021

    Journal ref: ACL 2021

  23. arXiv:2106.00510  [pdf, other]

    cs.CL cs.AI cs.LG

    CIDER: Commonsense Inference for Dialogue Explanation and Reasoning

    Authors: Deepanway Ghosal, Pengfei Hong, Siqi Shen, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: Commonsense inference to understand and explain human language is a fundamental research problem in natural language processing. Explaining human conversations poses a great challenge as it requires contextual understanding, planning, inference, and several aspects of reasoning including causal, temporal, and commonsense reasoning. In this work, we introduce CIDER -- a manually curated dataset tha…

    Submitted 29 June, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

    Comments: SIGDIAL 2021

  24. Deep Neural Approaches to Relation Triplets Extraction: A Comprehensive Survey

    Authors: Tapas Nayak, Navonil Majumder, Pawan Goyal, Soujanya Poria

    Abstract: Recently, with the advances made in continuous representation of words (word embeddings) and deep neural architectures, many research works are published in the area of relation extraction and it is very difficult to keep track of so many papers. To help future research, we present a comprehensive review of the recently published research works in relation extraction. We mostly focus on relation e…

    Submitted 31 March, 2021; originally announced March 2021.

    Comments: A survey paper for relation extraction. Cogn Comput (2021)

  25. arXiv:2012.11820  [pdf, other]

    cs.CL

    Recognizing Emotion Cause in Conversations

    Authors: Soujanya Poria, Navonil Majumder, Devamanyu Hazarika, Deepanway Ghosal, Rishabh Bhardwaj, Samson Yu Bai Jian, Pengfei Hong, Romila Ghosh, Abhinaba Roy, Niyati Chhaya, Alexander Gelbukh, Rada Mihalcea

    Abstract: We address the problem of recognizing emotion cause in conversations, define two novel sub-tasks of this problem, and provide a corresponding dialogue-level dataset, along with strong Transformer-based baselines. The dataset is available at https://github.com/declare-lab/RECCON. Introduction: Recognizing the cause behind emotions in text is a fundamental yet under-explored area of research in NL…

    Submitted 28 July, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

    Comments: https://github.com/declare-lab/RECCON, Accepted at Cognitive Computation

  26. arXiv:2012.06236  [pdf, other]

    cs.CL

    Improving Zero Shot Learning Baselines with Commonsense Knowledge

    Authors: Abhinaba Roy, Deepanway Ghosal, Erik Cambria, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: Zero shot learning -- the problem of training and testing on a completely disjoint set of classes -- relies greatly on its ability to transfer knowledge from train classes to test classes. Traditionally semantic embeddings consisting of human defined attributes (HA) or distributed word embeddings (DWE) are used to facilitate this transfer by improving the association between visual and semantic em…

    Submitted 11 December, 2020; originally announced December 2020.

  27. arXiv:2011.09954  [pdf, other]

    cs.CL cs.LG

    Persuasive Dialogue Understanding: the Baselines and Negative Results

    Authors: Hui Chen, Deepanway Ghosal, Navonil Majumder, Amir Hussain, Soujanya Poria

    Abstract: Persuasion aims at forming one's opinion and action via a series of persuasive messages containing persuader's strategies. Due to its potential application in persuasive dialogue systems, the task of persuasive strategy recognition has gained much attention lately. Previous methods on user intent recognition in dialogue systems adopt recurrent neural network (RNN) or convolutional neural network (…

    Submitted 22 November, 2020; v1 submitted 19 November, 2020; originally announced November 2020.

    Comments: 12 pages, 5 figures

  28. arXiv:2010.02795  [pdf, other]

    cs.CL

    COSMIC: COmmonSense knowledge for eMotion Identification in Conversations

    Authors: Deepanway Ghosal, Navonil Majumder, Alexander Gelbukh, Rada Mihalcea, Soujanya Poria

    Abstract: In this paper, we address the task of utterance level emotion recognition in conversations using commonsense knowledge. We propose COSMIC, a new framework that incorporates different elements of commonsense such as mental states, events, and causal relations, and build upon them to learn interactions between interlocutors participating in a conversation. Current state-of-the-art methods often enco…

    Submitted 6 October, 2020; originally announced October 2020.

  29. arXiv:2010.01454  [pdf, other]

    cs.CL

    MIME: MIMicking Emotions for Empathetic Response Generation

    Authors: Navonil Majumder, Pengfei Hong, Shanshan Peng, Jiankun Lu, Deepanway Ghosal, Alexander Gelbukh, Rada Mihalcea, Soujanya Poria

    Abstract: Current approaches to empathetic response generation view the set of emotions expressed in the input text as a flat structure, where all the emotions are treated uniformly. We argue that empathetic responses often mimic the emotion of the user to a varying degree, depending on its positivity or negativity and content. We show that the consideration of this polarity-based emotion clusters and emoti…

    Submitted 3 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  30. arXiv:2009.13902  [pdf, other]

    cs.CL

    Utterance-level Dialogue Understanding: An Empirical Study

    Authors: Deepanway Ghosal, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: The recent abundance of conversational data on the Web and elsewhere calls for effective NLP systems for dialog understanding. Complete utterance-level understanding often requires context understanding, defined by nearby utterances. In recent years, a number of approaches have been proposed for various utterance-level dialogue understanding tasks. Most of these approaches account for the context…

    Submitted 22 October, 2020; v1 submitted 29 September, 2020; originally announced September 2020.

  31. arXiv:2009.05092  [pdf, other]

    cs.CL

    Dialogue Relation Extraction with Document-level Heterogeneous Graph Attention Networks

    Authors: Hui Chen, Pengfei Hong, Wei Han, Navonil Majumder, Soujanya Poria

    Abstract: Dialogue relation extraction (DRE) aims to detect the relation between two entities mentioned in a multi-party dialogue. It plays an important role in constructing knowledge graphs from conversational data increasingly abundant on the internet and facilitating intelligent dialogue system development. The prior methods of DRE do not meaningfully leverage speaker information-they just prepend the ut…

    Submitted 20 June, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

  32. arXiv:2009.05021  [pdf, other]

    cs.CL

    Investigating Gender Bias in BERT

    Authors: Rishabh Bhardwaj, Navonil Majumder, Soujanya Poria

    Abstract: Contextual language models (CLMs) have pushed the NLP benchmarks to a new height. It has become a new norm to utilize CLM provided word embeddings in downstream tasks such as text classification. However, unless addressed, CLMs are prone to learn intrinsic gender-bias in the dataset. As a result, predictions of downstream NLP models can vary noticeably by varying gender words, such as replacing "h…

    Submitted 10 September, 2020; originally announced September 2020.

  33. arXiv:2008.05575  [pdf]

    cs.LG cs.NE

    Comprehensive forecasting based analysis using stacked stateless and stateful Gated Recurrent Unit models

    Authors: Swayamjit Saha, Niladri Majumder, Devansh Sangani

    Abstract: Photovoltaic power is a renewable source of energy which is highly used in industries. In economically struggling countries it can be a potential source of electric energy as other non-renewable resources are already exhausting. Now if installation of a photovoltaic cell in a region is done prior to research, it may not provide the desired energy output required for running that region. Hence fore…

    Submitted 14 August, 2020; v1 submitted 12 August, 2020; originally announced August 2020.

    Comments: 12 pages, 2 figures, 6 tables; typos corrected, references added

  34. arXiv:2005.06607  [pdf, other]

    cs.CL

    Improving Aspect-Level Sentiment Analysis with Aspect Extraction

    Authors: Navonil Majumder, Rishabh Bhardwaj, Soujanya Poria, Amir Zadeh, Alexander Gelbukh, Amir Hussain, Louis-Philippe Morency

    Abstract: Aspect-based sentiment analysis (ABSA), a popular research area in NLP, has two distinct parts -- aspect extraction (AE) and labeling the aspects with sentiment polarity (ALSA). Although distinct, these two tasks are highly correlated. The work primarily hypothesizes that transferring knowledge from a pre-trained AE model can benefit the performance of ALSA models. Based on this hypothesis, word emb…

    Submitted 3 May, 2020; originally announced May 2020.

  35. arXiv:2005.00791  [pdf, other]

    cs.CL

    KinGDOM: Knowledge-Guided DOMain adaptation for sentiment analysis

    Authors: Deepanway Ghosal, Devamanyu Hazarika, Abhinaba Roy, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: Cross-domain sentiment analysis has received significant attention in recent years, prompted by the need to combat the domain gap between different applications that make use of sentiment analysis. In this paper, we take a novel perspective on this task by exploring the role of external commonsense knowledge. We introduce a new framework, KinGDOM, which utilizes the ConceptNet knowledge graph to e…

    Submitted 11 May, 2020; v1 submitted 2 May, 2020; originally announced May 2020.

  36. arXiv:2005.00357  [pdf, other]

    cs.CL cs.IR

    Beneath the Tip of the Iceberg: Current Challenges and New Directions in Sentiment Analysis Research

    Authors: Soujanya Poria, Devamanyu Hazarika, Navonil Majumder, Rada Mihalcea

    Abstract: Sentiment analysis as a field has come a long way since it was first introduced as a task nearly 20 years ago. It has widespread commercial applications in various domains like marketing, risk management, market research, and politics, to name a few. Given its saturation in specific subtasks -- such as sentiment polarity classification -- and datasets, there is an underlying perception that this f…

    Submitted 16 November, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

    Comments: Published in the IEEE Transactions on Affective Computing (TAFFC)

  37. arXiv:1908.11540  [pdf, other]

    cs.CL cs.LG

    DialogueGCN: A Graph Convolutional Neural Network for Emotion Recognition in Conversation

    Authors: Deepanway Ghosal, Navonil Majumder, Soujanya Poria, Niyati Chhaya, Alexander Gelbukh

    Abstract: Emotion recognition in conversation (ERC) has received much attention, lately, from researchers due to its potential widespread applications in diverse areas, such as health-care, education, and human resources. In this paper, we present Dialogue Graph Convolutional Network (DialogueGCN), a graph neural network based approach to ERC. We leverage self and inter-speaker dependency of the interlocuto…

    Submitted 30 August, 2019; originally announced August 2019.

    Comments: Accepted at EMNLP 2019

  38. arXiv:1908.06008  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Variational Fusion for Multimodal Sentiment Analysis

    Authors: Navonil Majumder, Soujanya Poria, Gangeshwar Krishnamurthy, Niyati Chhaya, Rada Mihalcea, Alexander Gelbukh

    Abstract: Multimodal fusion is considered a key step in multimodal tasks such as sentiment analysis, emotion detection, question answering, and others. Most of the recent work on multimodal fusion does not guarantee the fidelity of the multimodal representation with respect to the unimodal representations. In this paper, we propose a variational autoencoder-based approach for modality fusion that minimizes…

    Submitted 13 August, 2019; originally announced August 2019.

  39. Recent Trends in Deep Learning Based Personality Detection

    Authors: Yash Mehta, Navonil Majumder, Alexander Gelbukh, Erik Cambria

    Abstract: Recently, the automatic prediction of personality traits has received a lot of attention. Specifically, personality trait prediction from multimodal data has emerged as a hot topic within the field of affective computing. In this paper, we review significant machine learning models which have been employed for personality detection, with an emphasis on deep learning-based methods. This review pape…

    Submitted 27 August, 2019; v1 submitted 7 August, 2019; originally announced August 2019.

    Journal ref: Artif Intell Rev 53 (2020) 2313-2339

  40. arXiv:1905.02947  [pdf, other]

    cs.CL cs.AI

    Emotion Recognition in Conversation: Research Challenges, Datasets, and Recent Advances

    Authors: Soujanya Poria, Navonil Majumder, Rada Mihalcea, Eduard Hovy

    Abstract: Emotion is intrinsic to humans and consequently emotion understanding is a key part of human-like artificial intelligence (AI). Emotion recognition in conversation (ERC) is becoming increasingly popular as a new research frontier in natural language processing (NLP) due to its ability to mine opinions from the plethora of publicly available conversational data in platforms such as Facebook, Youtub…

    Submitted 8 May, 2019; originally announced May 2019.

  41. arXiv:1901.08014  [pdf, other]

    cs.CL

    Sentiment and Sarcasm Classification with Multitask Learning

    Authors: Navonil Majumder, Soujanya Poria, Haiyun Peng, Niyati Chhaya, Erik Cambria, Alexander Gelbukh

    Abstract: Sentiment classification and sarcasm detection are both important natural language processing (NLP) tasks. Sentiment is always coupled with sarcasm where intensive emotion is expressed. Nevertheless, most literature considers them as two separate tasks. We argue that knowledge in sarcasm detection can also be beneficial to sentiment classification and vice versa. We show that these two tasks are c…

    Submitted 8 March, 2019; v1 submitted 23 January, 2019; originally announced January 2019.

    Journal ref: IEEE Intelligent Systems 34(3) (2019)

  42. arXiv:1811.00405  [pdf, other]

    cs.CL

    DialogueRNN: An Attentive RNN for Emotion Detection in Conversations

    Authors: Navonil Majumder, Soujanya Poria, Devamanyu Hazarika, Rada Mihalcea, Alexander Gelbukh, Erik Cambria

    Abstract: Emotion detection in conversations is a necessary step for a number of applications, including opinion mining over chat history, social media threads, debates, argumentation mining, understanding consumer feedback in live conversations, etc. Currently, systems do not treat the parties in the conversation individually by adapting to the speaker of each utterance. In this paper, we describe a new me…

    Submitted 25 May, 2019; v1 submitted 1 November, 2018; originally announced November 2018.

    Comments: AAAI 2019

  43. arXiv:1810.02508  [pdf, other]

    cs.CL

    MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations

    Authors: Soujanya Poria, Devamanyu Hazarika, Navonil Majumder, Gautam Naik, Erik Cambria, Rada Mihalcea

    Abstract: Emotion recognition in conversations is a challenging task that has recently gained popularity due to its potential applications. Until now, however, a large-scale multimodal multi-party emotional conversational database containing more than two speakers per dialogue was missing. Thus, we propose the Multimodal EmotionLines Dataset (MELD), an extension and enhancement of EmotionLines. MELD contain…

    Submitted 4 June, 2019; v1 submitted 4 October, 2018; originally announced October 2018.

    Comments: https://affective-meld.github.io

  44. arXiv:1806.06228  [pdf, other]

    cs.CL cs.CV

    Multimodal Sentiment Analysis using Hierarchical Fusion with Context Modeling

    Authors: N. Majumder, D. Hazarika, A. Gelbukh, E. Cambria, S. Poria

    Abstract: Multimodal sentiment analysis is a very actively growing field of research. A promising area of opportunity in this field is to improve the multimodal fusion mechanism. We present a novel feature fusion strategy that proceeds in a hierarchical fashion, first fusing the modalities two in two and only then fusing all three modalities. On multimodal sentiment analysis of individual utterances, our st…

    Submitted 16 June, 2018; originally announced June 2018.

    Comments: Accepted for publication at Knowledge Based Systems

  45. arXiv:1803.07427  [pdf, other]

    cs.CL cs.CV cs.IR

    Multimodal Sentiment Analysis: Addressing Key Issues and Setting up the Baselines

    Authors: Soujanya Poria, Navonil Majumder, Devamanyu Hazarika, Erik Cambria, Alexander Gelbukh, Amir Hussain

    Abstract: We compile baselines, along with dataset split, for multimodal sentiment analysis. In this paper, we explore three different deep-learning based architectures for multimodal sentiment classification, each improving upon the previous. Further, we evaluate these architectures with multiple datasets with fixed train/test partition. We also discuss some major issues, frequently ignored in multimodal s…

    Submitted 11 February, 2019; v1 submitted 18 March, 2018; originally announced March 2018.

    Comments: IEEE Intelligent Systems. arXiv admin note: substantial text overlap with arXiv:1707.09538

  46. arXiv:1803.00344  [pdf, other]

    cs.CL cs.AI cs.CV

    A Deep Learning Approach for Multimodal Deception Detection

    Authors: Gangeshwar Krishnamurthy, Navonil Majumder, Soujanya Poria, Erik Cambria

    Abstract: Automatic deception detection is an important task that has gained momentum in computational linguistics due to its potential applications. In this paper, we propose a simple yet tough to beat multi-modal neural model for deception detection. By combining features from different modalities such as video, audio, and text along with Micro-Expression features, we show that detecting deception in real…

    Submitted 1 March, 2018; originally announced March 2018.

    Comments: Accepted at the 19th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing), 2018

  47. arXiv:1512.06730  [pdf, ps, other]

    cs.IT cs.CV cs.LG stat.ML

    Multilinear Subspace Clustering

    Authors: Eric Kernfeld, Nathan Majumder, Shuchin Aeron, Misha Kilmer

    Abstract: In this paper we present a new model and an algorithm for unsupervised clustering of 2-D data such as images. We assume that the data comes from a union of multilinear subspaces (UOMS) model, which is a specific structured case of the much studied union of subspaces (UOS) model. For segmentation under this model, we develop Multilinear Subspace Clustering (MSC) algorithm and evaluate its performan…

    Submitted 21 December, 2015; originally announced December 2015.

  48. arXiv:1109.3317  [pdf]

    cs.CV

    Design of an Optical Character Recognition System for Camera-based Handheld Devices

    Authors: Ayatullah Faruk Mollah, Nabamita Majumder, Subhadip Basu, Mita Nasipuri

    Abstract: This paper presents a complete Optical Character Recognition (OCR) system for camera captured image/graphics embedded textual documents for handheld devices. At first, text regions are extracted and skew corrected. Then, these regions are binarized and segmented into lines and characters. Characters are passed into the recognition module. Experimenting with a set of 100 business card images, captu…

    Submitted 15 September, 2011; originally announced September 2011.

    Journal ref: Int'l J. of Computer Science Issues, Vol. 8, Issue 4, pp. 283-289, July 2011