Showing 1–15 of 15 results for author: Sahu, G

Searching in archive cs.
  1. arXiv:2311.11462  [pdf, other]

    cs.CL cs.AI

    LLM aided semi-supervision for Extractive Dialog Summarization

    Authors: Nishant Mishra, Gaurav Sahu, Iacer Calixto, Ameen Abu-Hanna, Issam H. Laradji

    Abstract: Generating high-quality summaries for chat dialogs often requires large labeled datasets. We propose a method to efficiently use unlabeled data for extractive summarization of customer-agent dialogs. In our method, we frame summarization as a question-answering problem and use state-of-the-art large language models (LLMs) to generate pseudo-labels for a dialog. We then use these pseudo-labels to f…

    Submitted 23 November, 2023; v1 submitted 19 November, 2023; originally announced November 2023.

    Comments: to be published in EMNLP Findings

  2. arXiv:2311.09559  [pdf, other]

    cs.CL cs.AI

    Prompt-based Pseudo-labeling Strategy for Sample-Efficient Semi-Supervised Extractive Summarization

    Authors: Gaurav Sahu, Olga Vechtomova, Issam H. Laradji

    Abstract: Semi-supervised learning (SSL) is a widely used technique in scenarios where labeled data is scarce and unlabeled data is abundant. While SSL is popular for image and text classification, it is relatively underexplored for the task of extractive text summarization. Standard SSL methods follow a teacher-student paradigm to first train a classification model and then use the classifier's confidence…

    Submitted 1 July, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: 8 pages, 6 figures, 3 tables

  3. arXiv:2310.14192  [pdf, other]

    cs.CL cs.AI

    PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation

    Authors: Gaurav Sahu, Olga Vechtomova, Dzmitry Bahdanau, Issam H. Laradji

    Abstract: Data augmentation is a widely used technique to address the problem of text classification when there is a limited amount of training data. Recent work often tackles this problem using large language models (LLMs) like GPT3 that can generate new examples given already available ones. In this work, we propose a method to generate more helpful augmented data by utilizing the LLM's abilities to follo…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 (Long paper)

  4. arXiv:2307.09312  [pdf, other]

    cs.CL cs.LG cs.MM cs.SI

    Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media

    Authors: Liam Hebert, Gaurav Sahu, Yuxuan Guo, Nanda Kishore Sreenivas, Lukasz Golab, Robin Cohen

    Abstract: We present the Multi-Modal Discussion Transformer (mDT), a novel method for detecting hate speech in online social networks such as Reddit discussions. In contrast to traditional comment-only methods, our approach to labelling a comment as hate speech involves a holistic analysis of text and images grounded in the discussion context. This is done by leveraging graph transformers to capture the cont…

    Submitted 22 February, 2024; v1 submitted 18 July, 2023; originally announced July 2023.

    Comments: Accepted to AAAI 2024 (AI for Social Impact Track)

  5. arXiv:2212.09947  [pdf, other]

    cs.CL cs.AI cs.LG

    Future Sight: Dynamic Story Generation with Large Pretrained Language Models

    Authors: Brian D. Zimmerman, Gaurav Sahu, Olga Vechtomova

    Abstract: Recent advances in deep learning research, such as transformers, have bolstered the ability for automated agents to generate creative texts similar to those that a human would write. By default, transformer decoders can only generate new text with respect to previously generated text. The output distribution of candidate tokens at any position is conditioned on previously selected tokens using a s…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: 9 pages, 1 figure, 4 tables

  6. arXiv:2210.15638  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG cs.MM eess.AS

    LyricJam Sonic: A Generative System for Real-Time Composition and Musical Improvisation

    Authors: Olga Vechtomova, Gaurav Sahu

    Abstract: Electronic music artists and sound designers have unique workflow practices that necessitate specialized approaches for developing music information retrieval and creativity support tools. Furthermore, electronic music instruments, such as modular synthesizers, have near-infinite possibilities for sound creation and can be combined to create unique and complex audio paths. The process of discoveri…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: 15 pages, 9 figures, 2 tables

  7. arXiv:2204.01959  [pdf, other]

    cs.CL cs.AI

    Data Augmentation for Intent Classification with Off-the-shelf Large Language Models

    Authors: Gaurav Sahu, Pau Rodriguez, Issam H. Laradji, Parmida Atighehchian, David Vazquez, Dzmitry Bahdanau

    Abstract: Data augmentation is a widely employed technique to alleviate the problem of data scarcity. In this work, we propose a prompting-based approach to generate labelled training data for intent classification with off-the-shelf language models (LMs) such as GPT-3. An advantage of this method is that no task-specific LM-fine-tuning for data generation is required; hence the method requires no hyper-par…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: Accepted to 4th Workshop on NLP for Conversational AI, ACL 2022

  8. arXiv:2111.06440  [pdf, other]

    cs.SI cs.AI

    Personalized multi-faceted trust modeling to determine trust links in social media and its potential for misinformation management

    Authors: Alexandre Parmentier, Robin Cohen, Xueguang Ma, Gaurav Sahu, Queenie Chen

    Abstract: In this paper, we present an approach for predicting trust links between peers in social media, one that is grounded in the artificial intelligence area of multiagent trust modeling. In particular, we propose a data-driven multi-faceted trust modeling which incorporates many distinct features for a comprehensive analysis. We focus on demonstrating how clustering of similar users enables a critical…

    Submitted 11 November, 2021; originally announced November 2021.

    Comments: 28 pages

  9. arXiv:2106.01960  [pdf, other]

    cs.SD cs.AI cs.CL cs.LG eess.AS

    LyricJam: A system for generating lyrics for live instrumental music

    Authors: Olga Vechtomova, Gaurav Sahu, Dhruv Kumar

    Abstract: We describe a real-time system that receives a live audio stream from a jam session and generates lyric lines that are congruent with the live music being played. Two novel approaches are proposed to align the learned latent spaces of audio and text representations that allow the system to generate novel lyric lines matching live instrumental music. One approach is based on adversarial alignment o…

    Submitted 3 June, 2021; originally announced June 2021.

    Comments: Accepted to International Conference on Computational Creativity (ICCC) 2021 [Oral]

  10. arXiv:2105.01129  [pdf, other]

    cs.AI cs.CL cs.CV cs.MA

    Towards A Multi-agent System for Online Hate Speech Detection

    Authors: Gaurav Sahu, Robin Cohen, Olga Vechtomova

    Abstract: This paper envisions a multi-agent system for detecting the presence of hate speech in online social media platforms such as Twitter and Facebook. We introduce a novel framework employing deep learning techniques to coordinate the channels of textual and image processing. Our experimental results aim to demonstrate the effectiveness of our methods for classifying online content, training the prop…

    Submitted 3 May, 2021; originally announced May 2021.

    Comments: Accepted to the 2nd International Workshop on Autonomous Agents for Social Good (AASG), AAMAS, 2021

  11. arXiv:2009.14375  [pdf, other]

    cs.CL

    Generation of lyrics lines conditioned on music audio clips

    Authors: Olga Vechtomova, Gaurav Sahu, Dhruv Kumar

    Abstract: We present a system for generating novel lyrics lines conditioned on music audio. A bimodal neural network model learns to generate lines conditioned on any given short audio clip. The model consists of a spectrogram variational autoencoder (VAE) and a text VAE. Both automatic and human evaluations demonstrate effectiveness of our model in generating lines that have an emotional impact matching a…

    Submitted 29 September, 2020; originally announced September 2020.

    Comments: Accepted to First Workshop on NLP for Music and Audio (NLP4MusA) at ISMIR 2020

  12. arXiv:1911.03821  [pdf, other]

    cs.CL cs.CV cs.LG eess.AS

    Adaptive Fusion Techniques for Multimodal Data

    Authors: Gaurav Sahu, Olga Vechtomova

    Abstract: Effective fusion of data from multiple modalities, such as video, speech, and text, is challenging due to the heterogeneous nature of multimodal data. In this paper, we propose adaptive fusion techniques that aim to model context from different modalities effectively. Instead of defining a deterministic fusion operation, such as concatenation, for the network, we let the network decide "how" to co…

    Submitted 26 January, 2021; v1 submitted 9 November, 2019; originally announced November 2019.

    Comments: Camera-ready version for EACL 2021

  13. arXiv:1911.03817  [pdf, other]

    cs.CL

    Adversarial Learning on the Latent Space for Diverse Dialog Generation

    Authors: Kashif Khan, Gaurav Sahu, Vikash Balasubramanian, Lili Mou, Olga Vechtomova

    Abstract: Generating relevant responses in a dialog is challenging, and requires not only proper modeling of context in the conversation but also being able to generate fluent sentences during inference. In this paper, we propose a two-step framework based on generative adversarial nets for generating conditioned responses. Our model first learns a meaningful representation of sentences by autoencoding and…

    Submitted 3 November, 2020; v1 submitted 9 November, 2019; originally announced November 2019.

    Comments: Accepted to COLING 2020

  14. arXiv:1904.06022  [pdf, other]

    cs.LG cs.CL stat.ML

    Multimodal Speech Emotion Recognition and Ambiguity Resolution

    Authors: Gaurav Sahu

    Abstract: Identifying emotion from speech is a non-trivial task pertaining to the ambiguous definition of emotion itself. In this work, we adopt a feature-engineering based approach to tackle the task of speech emotion recognition. Formalizing our problem as a multi-class classification problem, we compare the performance of two categories of models. For both, we extract eight hand-crafted features from the…

    Submitted 11 April, 2019; originally announced April 2019.

    Comments: 9 pages

  15. arXiv:1809.01446  [pdf, other]

    cs.CL

    Free as in Free Word Order: An Energy Based Model for Word Segmentation and Morphological Tagging in Sanskrit

    Authors: Amrith Krishna, Bishal Santra, Sasi Prasanth Bandaru, Gaurav Sahu, Vishnu Dutt Sharma, Pavankumar Satuluri, Pawan Goyal

    Abstract: The configurational information in sentences of a free word order language such as Sanskrit is of limited use. Thus, the context of the entire sentence will be desirable even for basic processing tasks such as word segmentation. We propose a structured prediction framework that jointly solves the word segmentation and morphological tagging tasks in Sanskrit. We build an energy based model where we…

    Submitted 25 October, 2018; v1 submitted 5 September, 2018; originally announced September 2018.

    Comments: version 2: Corrected typo in Table 1, page 7 | Accepted in EMNLP 2018. Supplementary material can be found at http://cse.iitkgp.ac.in/~amrithk/1080_supp.pdf