Showing 1–50 of 692 results for author: Ghosh, S

Searching in archive cs.
  1. arXiv:2407.00317  [pdf, other]

    cs.IR stat.AP

    Towards Statistically Significant Taxonomy Aware Co-location Pattern Detection

    Authors: Subhankar Ghosh, Arun Sharma, Jayant Gupta, Shashi Shekhar

    Abstract: Given a collection of Boolean spatial feature types, their instances, a neighborhood relation (e.g., proximity), and a hierarchical taxonomy of the feature types, the goal is to find the subsets of feature types or their parents whose spatial interaction is statistically significant. This problem is for taxonomy-reliant applications such as ecology (e.g., finding new symbiotic relationships across…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: Accepted in The 16th Conference on Spatial Information Theory (COSIT) 2024

    ACM Class: E.m; H.3.3; I.5; J.4; J.4

  2. arXiv:2406.17957  [pdf, other]

    cs.SD cs.AI eess.AS

    Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment

    Authors: Paarth Neekhara, Shehzeen Hussain, Subhankar Ghosh, Jason Li, Rafael Valle, Rohan Badlani, Boris Ginsburg

    Abstract: Large Language Model (LLM) based text-to-speech (TTS) systems have demonstrated remarkable capabilities in handling large speech datasets and generating natural speech for new speakers. However, LLM-based TTS models are not robust as the generated output can contain repeating words, missing words and mis-aligned speech (referred to as hallucinations or attention errors), especially when the text c…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Published as a conference paper at INTERSPEECH 2024

  3. arXiv:2406.11768  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

    Authors: Sreyan Ghosh, Sonal Kumar, Ashish Seth, Chandra Kiran Reddy Evuru, Utkarsh Tyagi, S Sakshi, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha

    Abstract: Perceiving and understanding non-speech sounds and non-verbal speech is essential to making decisions that help us interact with our surroundings. In this paper, we propose GAMA, a novel General-purpose Large Audio-Language Model (LALM) with Advanced Audio Understanding and Complex Reasoning Abilities. We build GAMA by integrating an LLM with multiple types of audio representations, including feat…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Project Website: https://sreyan88.github.io/gamaaudio/

  4. arXiv:2406.11704  [pdf, other]

    cs.CL cs.AI cs.LG

    Nemotron-4 340B Technical Report

    Authors: Nvidia, Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek, et al. (58 additional authors not shown)

    Abstract: We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distribution, modification, and use of the models and its outputs. These models perform competitively to open access models on a wide range of evaluation be…

    Submitted 17 June, 2024; originally announced June 2024.

  5. arXiv:2406.06818  [pdf, other]

    cs.LG

    Conformal Prediction for Class-wise Coverage via Augmented Label Rank Calibration

    Authors: Yuanjie Shi, Subhankar Ghosh, Taha Belkhouja, Janardhan Rao Doppa, Yan Yan

    Abstract: Conformal prediction (CP) is an emerging uncertainty quantification framework that allows us to construct a prediction set to cover the true label with a pre-specified marginal or conditional probability. Although the valid coverage guarantee has been extensively studied for classification problems, CP often produces large prediction sets which may not be practically useful. This issue is exacerba…

    Submitted 10 June, 2024; originally announced June 2024.

  6. arXiv:2406.05348  [pdf, other]

    cs.CL cs.AI cs.IR

    Toward Reliable Ad-hoc Scientific Information Extraction: A Case Study on Two Materials Datasets

    Authors: Satanu Ghosh, Neal R. Brodnik, Carolina Frey, Collin Holgate, Tresa M. Pollock, Samantha Daly, Samuel Carton

    Abstract: We explore the ability of GPT-4 to perform ad-hoc schema based information extraction from scientific literature. We assess specifically whether it can, with a basic prompting approach, replicate two existing material science datasets, given the manuscripts from which they were originally manually extracted. We employ materials scientists to perform a detailed manual error analysis to assess where…

    Submitted 8 June, 2024; originally announced June 2024.

  7. arXiv:2406.04432  [pdf, other]

    eess.AS cs.AI cs.CL

    LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition

    Authors: Sreyan Ghosh, Sonal Kumar, Ashish Seth, Purva Chiniya, Utkarsh Tyagi, Ramani Duraiswami, Dinesh Manocha

    Abstract: Visual cues, like lip motion, have been shown to improve the performance of Automatic Speech Recognition (ASR) systems in noisy environments. We propose LipGER (Lip Motion aided Generative Error Correction), a novel framework for leveraging visual cues for noise-robust ASR. Instead of learning the cross-modal correlation between the audio and visual modalities, we make an LLM learn the task of vis…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: InterSpeech 2024. Code and Data: https://github.com/Sreyan88/LipGER

  8. arXiv:2406.04370  [pdf, other]

    cs.CL cs.AI cs.LG

    Large Language Model Confidence Estimation via Black-Box Access

    Authors: Tejaswini Pedapati, Amit Dhurandhar, Soumya Ghosh, Soham Dan, Prasanna Sattigeri

    Abstract: Estimating uncertainty or confidence in the responses of a model can be significant in evaluating trust not only in the responses, but also in the model as a whole. In this paper, we explore the problem of estimating confidence for responses of large language models (LLMs) with simply black-box or query access to them. We propose a simple and extensible framework where, we engineer novel features…

    Submitted 31 May, 2024; originally announced June 2024.

  9. arXiv:2406.04286  [pdf, other]

    cs.CL cs.AI

    ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions

    Authors: Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, C. K. Evuru, S Ramaneswaran, S Sakshi, Dinesh Manocha

    Abstract: We present ABEX, a novel and effective generative data augmentation methodology for low-resource Natural Language Understanding (NLU) tasks. ABEX is based on ABstract-and-EXpand, a novel paradigm for generating diverse forms of an input document -- we first convert a document into its concise, abstract description and then generate new documents based on expanding the resultant abstraction. To lea…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Main Conference. Code and data: https://github.com/Sreyan88/ABEX

  10. arXiv:2406.04245  [pdf, ps, other]

    quant-ph cs.LG

    Online learning of a panoply of quantum objects

    Authors: Akshay Bansal, Ian George, Soumik Ghosh, Jamie Sikora, Alice Zheng

    Abstract: In many quantum tasks, there is an unknown quantum object that one wishes to learn. An online strategy for this task involves adaptively refining a hypothesis to reproduce such an object or its measurement statistics. A common evaluation metric for such a strategy is its regret, or roughly the accumulated errors in hypothesis statistics. We prove a sublinear regret bound for learning over general…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 34 pages. Comments welcome

  11. arXiv:2406.03747  [pdf, other]

    cs.CV cs.AI cs.LG

    Instance Segmentation and Teeth Classification in Panoramic X-rays

    Authors: Devichand Budagam, Ayush Kumar, Sayan Ghosh, Anuj Shrivastav, Azamat Zhanatuly Imanbayev, Iskander Rafailovich Akhmetov, Dmitrii Kaplun, Sergey Antonov, Artem Rychenkov, Gleb Cyganov, Aleksandr Sinitca

    Abstract: Teeth segmentation and recognition are critical in various dental applications and dental diagnosis. Automatic and accurate segmentation approaches have been made possible by integrating deep learning models. Although teeth segmentation has been studied in the past, only some techniques were able to effectively classify and segment teeth simultaneously. This article offers a pipeline of two deep l…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Submitted to Expert Systems with Applications Journal

  12. arXiv:2406.00451  [pdf, other]

    cs.RO

    Task Planning for Object Rearrangement in Multi-room Environments

    Authors: Karan Mirakhor, Sourav Ghosh, Dipanjan Das, Brojeshwar Bhowmick

    Abstract: Object rearrangement in a multi-room setup should produce a reasonable plan that reduces the agent's overall travel and the number of steps. Recent state-of-the-art methods fail to produce such plans because they rely on explicit exploration for discovering unseen objects due to partial observability and a heuristic planner to sequence the actions for rearrangement. This paper proposes a novel hie…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: Accepted in AAAI 2024 as oral paper

  13. arXiv:2405.18746  [pdf, other]

    quant-ph cs.LG

    STIQ: Safeguarding Training and Inferencing of Quantum Neural Networks from Untrusted Cloud

    Authors: Satwik Kundu, Swaroop Ghosh

    Abstract: The high expenses imposed by current quantum cloud providers, coupled with the escalating need for quantum resources, may incentivize the emergence of cheaper cloud-based quantum services from potentially untrusted providers. Deploying or hosting quantum models, such as Quantum Neural Networks (QNNs), on these untrusted platforms introduces a myriad of security concerns, with the most critical one…

    Submitted 29 May, 2024; originally announced May 2024.

  14. arXiv:2405.17258  [pdf, other]

    cs.LG cs.AI

    Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning

    Authors: Runqian Wang, Soumya Ghosh, David Cox, Diego Antognini, Aude Oliva, Rogerio Feris, Leonid Karlinsky

    Abstract: Low-rank adapters (LoRA) and their variants are popular parameter-efficient fine-tuning (PEFT) techniques that closely match full model fine-tune performance while requiring only a small number of additional parameters. These additional LoRA parameters are specific to the base model being adapted. When the base model needs to be deprecated and replaced with a new one, all the associated LoRA modul…

    Submitted 27 May, 2024; originally announced May 2024.

  15. arXiv:2405.15683  [pdf, other]

    cs.CV cs.AI cs.CL

    VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap

    Authors: Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Utkarsh Tyagi, Oriol Nieto, Zeyu Jin, Dinesh Manocha

    Abstract: Recent interest in Large Vision-Language Models (LVLMs) for practical applications is moderated by the significant challenge of hallucination or the inconsistency between the factual information and the generated text. In this paper, we first perform an in-depth analysis of hallucinations and discover several novel insights about how and when LVLMs hallucinate. From our analysis, we show that: (1)…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Preprint. Under review. Code will be released on paper acceptance

  16. arXiv:2405.14776  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci cs.LG

    Kinetics of orbital ordering in cooperative Jahn-Teller models: Machine-learning enabled large-scale simulations

    Authors: Supriyo Ghosh, Sheng Zhang, Chen Cheng, Gia-Wei Chern

    Abstract: We present a scalable machine learning (ML) force-field model for the adiabatic dynamics of cooperative Jahn-Teller (JT) systems. Large scale dynamical simulations of the JT model also shed light on the orbital ordering dynamics in colossal magnetoresistance manganites. The JT effect in these materials describes the distortion of local oxygen octahedra driven by a coupling to the orbital degrees o…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 17 pages, 11 figures

  17. arXiv:2405.14707  [pdf]

    cs.AI

    Artificial Intelligence (AI) in Legal Data Mining

    Authors: Aniket Deroy, Naksatra Kumar Bailung, Kripabandhu Ghosh, Saptarshi Ghosh, Abhijnan Chakraborty

    Abstract: Despite the availability of vast amounts of data, legal data is often unstructured, making it difficult even for law practitioners to ingest and comprehend the same. It is important to organise the legal information in a way that is useful for practitioners and downstream automation tasks. The word ontology was used by Greek philosophers to discuss concepts of existence, being, becoming and realit…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Book name-Technology and Analytics for Law and Justice, Page no-273-297, Chapter no-14

  18. arXiv:2405.13140  [pdf, ps, other]

    math.ST cs.LG math.PR

    On Convergence of the Alternating Directions SGHMC Algorithm

    Authors: Soumyadip Ghosh, Yingdong Lu, Tomasz Nowicki

    Abstract: We study convergence rates of Hamiltonian Monte Carlo (HMC) algorithms with leapfrog integration under mild conditions on stochastic gradient oracle for the target distribution (SGHMC). Our method extends standard HMC by allowing the use of general auxiliary distributions, which is achieved by a novel procedure of Alternating Directions. The convergence analysis is based on the investigations of…

    Submitted 26 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

  19. arXiv:2405.12255  [pdf, other]

    eess.IV cs.CV

    Mammo-CLIP: A Vision Language Foundation Model to Enhance Data Efficiency and Robustness in Mammography

    Authors: Shantanu Ghosh, Clare B. Poynton, Shyam Visweswaran, Kayhan Batmanghelich

    Abstract: The lack of large and diverse training data on Computer-Aided Diagnosis (CAD) in breast cancer detection has been one of the concerns that impedes the adoption of the system. Recently, pre-training with large-scale image text datasets via Vision-Language models (VLM) (e.g., CLIP) partially addresses the issue of robustness and data efficiency in computer vision (CV). This paper proposes Mammo-CLIP,…

    Submitted 22 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: MICCAI 2024, early accept, top 11%

  20. arXiv:2405.09570  [pdf, other]

    eess.SP cs.LG cs.SD eess.AS

    FunnelNet: An End-to-End Deep Learning Framework to Monitor Digital Heart Murmur in Real-Time

    Authors: Md Jobayer, Md. Mehedi Hasan Shawon, Md Rakibul Hasan, Shreya Ghosh, Tom Gedeon, Md Zakir Hossain

    Abstract: Objective: Heart murmurs are abnormal sounds caused by turbulent blood flow within the heart. Several diagnostic methods are available to detect heart murmurs and their severity, such as cardiac auscultation, echocardiography, phonocardiogram (PCG), etc. However, these methods have limitations, including extensive training and experience among healthcare providers, cost and accessibility of echoca…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 8-page main paper and 4-page supplementary material

  21. arXiv:2405.06671  [pdf, other]

    cs.CL cs.CE cs.LG

    Parameter-Efficient Instruction Tuning of Large Language Models For Extreme Financial Numeral Labelling

    Authors: Subhendu Khatuya, Rajdeep Mukherjee, Akash Ghosh, Manjunath Hegde, Koustuv Dasgupta, Niloy Ganguly, Saptarshi Ghosh, Pawan Goyal

    Abstract: We study the problem of automatically annotating relevant numerals (GAAP metrics) occurring in the financial documents with their corresponding XBRL tags. Different from prior works, we investigate the feasibility of solving this extreme classification problem using a generative paradigm through instruction tuning of Large Language Models (LLMs). To this end, we leverage metric metadata informatio…

    Submitted 15 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: This work has been accepted to appear at North American Chapter of the Association for Computational Linguistics (NAACL), 2024

  22. arXiv:2405.06669  [pdf, other]

    cs.CL cs.CE cs.IR cs.LG

    Instruction-Guided Bullet Point Summarization of Long Financial Earnings Call Transcripts

    Authors: Subhendu Khatuya, Koushiki Sinha, Niloy Ganguly, Saptarshi Ghosh, Pawan Goyal

    Abstract: While automatic summarization techniques have made significant advancements, their primary focus has been on summarizing short news articles or documents that have clear structural patterns like scientific articles or government reports. There has not been much exploration into developing efficient methods for summarizing financial documents, which often contain complex facts and figures. Here, we…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: Accepted in SIGIR 2024

  23. arXiv:2405.06355  [pdf, other]

    eess.SY cs.RO

    Switched Vector Field-based Guidance for General Reference Path Following in Planar Environment

    Authors: Subham Basak, Satadal Ghosh

    Abstract: Reference path following is a key component in the functioning of almost all engineered autonomous agents. Among several path following guidance methods in existing literature, vector-field-based guidance approach has got wide attention because of its simplicity and guarantee of stability under a broad class of scenarios. However, the usage of same cross-track-error-dependent structure of desired…

    Submitted 10 May, 2024; originally announced May 2024.

  24. arXiv:2405.06119  [pdf, other]

    cs.LG math.NA

    Gradient Flow Based Phase-Field Modeling Using Separable Neural Networks

    Authors: Revanth Mattey, Susanta Ghosh

    Abstract: The $L^2$ gradient flow of the Ginzburg-Landau free energy functional leads to the Allen-Cahn equation that is widely used for modeling phase separation. Machine learning methods for solving the Allen-Cahn equation in its strong form suffer from inaccuracies in collocation techniques, errors in computing higher-order spatial derivatives through automatic differentiation, and the large system size…

    Submitted 9 May, 2024; originally announced May 2024.

  25. arXiv:2405.05511  [pdf, other]

    quant-ph cs.ET

    Investigating impact of bit-flip errors in control electronics on quantum computation

    Authors: Subrata Das, Avimita Chatterjee, Swaroop Ghosh

    Abstract: In this paper, we investigate the impact of bit flip errors in FPGA memories in control electronics on quantum computing systems. FPGA memories are integral in storing the amplitude and phase information pulse envelopes, which are essential for generating quantum gate pulses. However, these memories can incur faults due to physical and environmental stressors such as electromagnetic interference,…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 9 pages, 9 figures, conference

  26. arXiv:2405.03725  [pdf, other]

    cs.NE cs.AI cs.LG

    Deep Oscillatory Neural Network

    Authors: Nurani Rajagopal Rohan, Vigneswaran C, Sayan Ghosh, Kishore Rajendran, Gaurav A, V Srinivasa Chakravarthy

    Abstract: We propose a novel, brain-inspired deep neural network model known as the Deep Oscillatory Neural Network (DONN). Deep neural networks like the Recurrent Neural Networks indeed possess sequence processing capabilities but the internal states of the network are not designed to exhibit brain-like oscillatory activity. With this motivation, the DONN is designed to have oscillatory internal dynamics.…

    Submitted 6 May, 2024; originally announced May 2024.

  27. arXiv:2405.01737  [pdf, other]

    stat.ML cs.LG stat.CO

    Sample-efficient neural likelihood-free Bayesian inference of implicit HMMs

    Authors: Sanmitra Ghosh, Paul J. Birrell, Daniela De Angelis

    Abstract: Likelihood-free inference methods based on neural conditional density estimation were shown to drastically reduce the simulation burden in comparison to classical methods such as ABC. When applied in the context of any latent variable model, such as a Hidden Markov model (HMM), these methods are designed to only estimate the parameters, rather than the joint distribution of the parameters and the…

    Submitted 2 May, 2024; originally announced May 2024.

  28. What's in the Flow? Exploiting Temporal Motion Cues for Unsupervised Generic Event Boundary Detection

    Authors: Sourabh Vasant Gothe, Vibhav Agarwal, Sourav Ghosh, Jayesh Rajkumar Vachhani, Pranay Kashyap, Barath Raj Kandur Raja

    Abstract: Generic Event Boundary Detection (GEBD) task aims to recognize generic, taxonomy-free boundaries that segment a video into meaningful events. Current methods typically involve a neural model trained on a large volume of data, demanding substantial computational power and storage space. We explore two pivotal questions pertaining to GEBD: Can non-parametric algorithms outperform unsupervised neural…

    Submitted 15 February, 2024; originally announced April 2024.

    Comments: Accepted in WACV-2024. Supplementary at https://openaccess.thecvf.com/content/WACV2024/supplemental/Gothe_Whats_in_the_WACV_2024_supplemental.pdf

    Journal ref: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 2024, pp. 6926-6935

  29. arXiv:2404.16156  [pdf, other]

    quant-ph cs.AR cs.CR cs.LG

    Guardians of the Quantum GAN

    Authors: Archisman Ghosh, Debarshi Kundu, Avimita Chatterjee, Swaroop Ghosh

    Abstract: Quantum Generative Adversarial Networks (qGANs) are at the forefront of image-generating quantum machine learning models. To accommodate the growing demand for Noisy Intermediate-Scale Quantum (NISQ) devices to train and infer quantum machine learning models, the number of third-party vendors offering quantum hardware as a service is expected to rise. This expansion introduces the risk of untruste…

    Submitted 15 May, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

    Comments: 11 pages, 10 figures

  30. arXiv:2404.15610  [pdf, other]

    cs.HC

    Revealing Aspects of Hawai'i Tourism Using Situated Augmented Reality

    Authors: Karen Abe, Jules Park, Samir Ghosh

    Abstract: In this position paper, we present a process artifact that aims to bring awareness to historical context, contemporary issues, and identity harm inflicted by tourism in Hawaii. First, we introduce the historical background and how the work is informed by the positionality of the authors. We discuss how related augmented reality work can inform strategy for building augmented reality experiences th…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Presented at CHI 2024 (arXiv:2404.05889)

    Report number: ARSJ/2024/08

  31. arXiv:2404.10290  [pdf, other]

    eess.IV cs.CV

    NeuroMorphix: A Novel Brain MRI Asymmetry-specific Feature Construction Approach For Seizure Recurrence Prediction

    Authors: Soumen Ghosh, Viktor Vegh, Shahrzad Moinian, Hamed Moradi, Alice-Ann Sullivan, John Phamnguyen, David Reutens

    Abstract: Seizure recurrence is an important concern after an initial unprovoked seizure; without drug treatment, it occurs within 2 years in 40-50% of cases. The decision to treat currently relies on predictors of seizure recurrence risk that are inaccurate, resulting in unnecessary, possibly harmful, treatment in some patients and potentially preventable seizures in others. Because of the link between bra…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: This work has been submitted to the IEEE TMI for possible publication

  32. arXiv:2404.05993  [pdf, other]

    cs.LG cs.CL cs.CY

    AEGIS: Online Adaptive AI Content Safety Moderation with Ensemble of LLM Experts

    Authors: Shaona Ghosh, Prasoon Varshney, Erick Galinkin, Christopher Parisien

    Abstract: As Large Language Models (LLMs) and generative AI become more widespread, the content safety risks associated with their use also increase. We find a notable deficiency in high-quality content safety datasets and benchmarks that comprehensively cover a wide range of critical safety areas. To address this, we define a broad content safety risk taxonomy, comprising 13 critical risk and 9 sparse risk…

    Submitted 8 April, 2024; originally announced April 2024.

  33. arXiv:2404.03820  [pdf, other]

    cs.CL

    CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues

    Authors: Makesh Narsimhan Sreedhar, Traian Rebedea, Shaona Ghosh, Jiaqi Zeng, Christopher Parisien

    Abstract: Recent advancements in instruction-tuning datasets have predominantly focused on specific tasks like mathematical or logical reasoning. There has been a notable gap in data designed for aligning language models to maintain topic relevance in conversations - a critical aspect for deploying chatbots to production. We introduce the CantTalkAboutThis dataset to help language models remain focused on t…

    Submitted 21 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  34. arXiv:2404.03662  [pdf, other]

    cs.NI cs.AI

    X-lifecycle Learning for Cloud Incident Management using LLMs

    Authors: Drishti Goel, Fiza Husain, Aditya Singh, Supriyo Ghosh, Anjaly Parayil, Chetan Bansal, Xuchao Zhang, Saravan Rajmohan

    Abstract: Incident management for large cloud services is a complex and tedious process and requires significant amount of manual efforts from on-call engineers (OCEs). OCEs typically leverage data from different stages of the software development lifecycle [SDLC] (e.g., codes, configuration, monitor data, service properties, service dependencies, trouble-shooting documents, etc.) to generate insights for d…

    Submitted 15 February, 2024; originally announced April 2024.

  35. arXiv:2404.01669  [pdf, other]

    cs.SI cs.CY physics.soc-ph

    How COVID-19 has Impacted the Anti-Vaccine Discourse: A Large-Scale Twitter Study Spanning Pre-COVID and Post-COVID Era

    Authors: Soham Poddar, Rajdeep Mukherjee, Subhendu Khatuya, Niloy Ganguly, Saptarshi Ghosh

    Abstract: The debate around vaccines has been going on for decades, but the COVID-19 pandemic showed how crucial it is to understand and mitigate anti-vaccine sentiments. While the pandemic may be over, it is still important to understand how the pandemic affected the anti-vaccine discourse, and whether the arguments against non-COVID vaccines (e.g., Flu, MMR, IPV, HPV vaccines) have also changed due to the…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: This work has been accepted to appear at the 18th International AAAI Conference on Web and Social Media (ICWSM), 2024

  36. arXiv:2404.00419  [pdf, other]

    cs.CV cs.CL

    Do Vision-Language Models Understand Compound Nouns?

    Authors: Sonal Kumar, Sreyan Ghosh, S Sakshi, Utkarsh Tyagi, Dinesh Manocha

    Abstract: Open-vocabulary vision-language models (VLMs) like CLIP, trained using contrastive loss, have emerged as a promising new paradigm for text-to-image retrieval. However, do VLMs understand compound nouns (CNs) (e.g., lab coat) as well as they understand nouns (e.g., lab)? We curate Compun, a novel benchmark with 400 unique and commonly used CNs, to evaluate the effectiveness of VLMs in interpreting…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024 Main Conference

  37. arXiv:2404.00415  [pdf, other]

    cs.CL

    CoDa: Constrained Generation based Data Augmentation for Low-Resource NLP

    Authors: Chandra Kiran Reddy Evuru, Sreyan Ghosh, Sonal Kumar, Ramaneswaran S, Utkarsh Tyagi, Dinesh Manocha

    Abstract: We present CoDa (Constrained Generation based Data Augmentation), a controllable, effective, and training-free data augmentation technique for low-resource (data-scarce) NLP. Our approach is based on prompting off-the-shelf instruction-following Large Language Models (LLMs) for generating text that satisfies a set of constraints. Precisely, we extract a set of simple constraints from every instanc…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024 Findings

  38. arXiv:2404.00122  [pdf, other]

    cs.CV eess.IV

    AgileFormer: Spatially Agile Transformer UNet for Medical Image Segmentation

    Authors: Peijie Qiu, Jin Yang, Sayantan Kumar, Soumyendu Sekhar Ghosh, Aristeidis Sotiras

    Abstract: In the past decades, deep neural networks, particularly convolutional neural networks, have achieved state-of-the-art performance in a variety of medical image segmentation tasks. Recently, the introduction of the vision transformer (ViT) has significantly altered the landscape of deep segmentation models. There has been a growing focus on ViTs, driven by their excellent performance and scalabilit…

    Submitted 29 March, 2024; originally announced April 2024.

  39. arXiv:2403.20317  [pdf, other]

    cs.CV

    Convolutional Prompting meets Language Models for Continual Learning

    Authors: Anurag Roy, Riddhiman Moulick, Vinay K. Verma, Saptarshi Ghosh, Abir Das

    Abstract: Continual Learning (CL) enables machine learning models to learn from continuously shifting new training data in absence of data from old tasks. Recently, pretrained vision transformers combined with prompt tuning have shown promise for overcoming catastrophic forgetting in CL. These approaches rely on a pool of learnable prompts which can be inefficient in sharing knowledge across tasks leading t…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: CVPR 2024 Camera Ready

  40. arXiv:2403.19822  [pdf, other]

    cs.CL cs.AI

    Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition

    Authors: Yash Jain, David Chan, Pranav Dheram, Aparna Khare, Olabanji Shonibare, Venkatesh Ravichandran, Shalini Ghosh

    Abstract: Recent advances in machine learning have demonstrated that multi-modal pre-training can improve automatic speech recognition (ASR) performance compared to randomly initialized models, even when models are fine-tuned on uni-modal tasks. Existing multi-modal pre-training methods for the ASR task have primarily focused on single-stage pre-training where a single unsupervised task is used for pre-trai…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Accepted in LREC-COLING 2024 - The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation

  41. arXiv:2403.19317  [pdf, other]

    cs.CL

    Beyond Borders: Investigating Cross-Jurisdiction Transfer in Legal Case Summarization

    Authors: T. Y. S. S Santosh, Vatsal Venkatkrishna, Saptarshi Ghosh, Matthias Grabmair

    Abstract: Legal professionals face the challenge of managing an overwhelming volume of lengthy judgments, making automated legal case summarization crucial. However, prior approaches mainly focused on training and evaluating these models within the same jurisdiction. In this study, we explore the cross-jurisdictional generalizability of legal case summarization models. Specifically, we explore how to effecti…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Accepted to NAACL 2024

  42. arXiv:2403.18639  [pdf, other

    cs.DC cs.LG

    Dependency Aware Incident Linking in Large Cloud Systems

    Authors: Supriyo Ghosh, Karish Grover, Jimmy Wong, Chetan Bansal, Rakesh Namineni, Mohit Verma, Saravan Rajmohan

    Abstract: Despite significant reliability efforts, large-scale cloud services inevitably experience production incidents that can significantly impact service availability and customer's satisfaction. Worse, in many cases one incident can lead to multiple downstream failures due to cascading effects that creates several related incidents across different dependent services. Often time On-call Engineers (OCE…

    Submitted 5 February, 2024; originally announced March 2024.

  43. arXiv:2403.18623  [pdf, other

    cs.CY cs.HC cs.IR

    Antitrust, Amazon, and Algorithmic Auditing

    Authors: Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, Jens Frankenreiter, Stefan Bechtold, Krishna P. Gummadi

    Abstract: In digital markets, antitrust law and special regulations aim to ensure that markets remain competitive despite the dominating role that digital platforms play today in everyone's life. Unlike traditional markets, market participant behavior is easily observable in these markets. We present a series of empirical investigations into the extent to which Amazon engages in practices that are typically…

    Submitted 25 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: The paper has been accepted to appear at Journal of Institutional and Theoretical Economics (JITE) 2024

  44. arXiv:2403.15402  [pdf

    cs.CY

    This Class Isn't Designed For Me: Recognizing Ableist Trends In Design Education, And Redesigning For An Inclusive And Sustainable Future

    Authors: Sourojit Ghosh, Sarah Coppola

    Abstract: Traditional and currently-prevalent pedagogies of design perpetuate ableist and exclusionary notions of what it means to be a designer. In this paper, we trace such historically exclusionary norms of design education, and highlight modern-day instances from our own experiences as design educators in such epistemologies. Towards imagining a more inclusive and sustainable future of design education,…

    Submitted 19 February, 2024; originally announced March 2024.

    Comments: Upcoming Publication, Design Research Society 2024

  45. arXiv:2403.14459  [pdf, other

    cs.CL cs.AI

    Multi-Level Explanations for Generative Language Models

    Authors: Lucas Monteiro Paes, Dennis Wei, Hyo Jin Do, Hendrik Strobelt, Ronny Luss, Amit Dhurandhar, Manish Nagireddy, Karthikeyan Natesan Ramamurthy, Prasanna Sattigeri, Werner Geyer, Soumya Ghosh

    Abstract: Perturbation-based explanation methods such as LIME and SHAP are commonly applied to text classification. This work focuses on their extension to generative language models. To address the challenges of text as output and long text inputs, we propose a general framework called MExGen that can be instantiated with different attribution algorithms. To handle text output, we introduce the notion of s…

    Submitted 21 March, 2024; originally announced March 2024.

  46. arXiv:2403.12979  [pdf, other

    quant-ph cs.ET cs.LG

    AltGraph: Redesigning Quantum Circuits Using Generative Graph Models for Efficient Optimization

    Authors: Collin Beaudoin, Koustubh Phalak, Swaroop Ghosh

    Abstract: Quantum circuit transformation aims to produce equivalent circuits while optimizing for various aspects such as circuit depth, gate count, and compatibility with modern Noisy Intermediate Scale Quantum (NISQ) devices. There are two techniques for circuit transformation. The first is a rule-based approach that greedily cancels out pairs of gates that equate to the identity unitary operation. Rule-b…

    Submitted 21 March, 2024; v1 submitted 23 February, 2024; originally announced March 2024.

  47. arXiv:2403.10944  [pdf, other

    cs.HC cs.AI

    Human Centered AI for Indian Legal Text Analytics

    Authors: Sudipto Ghosh, Devanshu Verma, Balaji Ganesan, Purnima Bindal, Vikas Kumar, Vasudha Bhatnagar

    Abstract: Legal research is a crucial task in the practice of law. It requires intense human effort and intellectual prudence to research a legal case and prepare arguments. Recent boom in generative AI has not translated to proportionate rise in impactful legal applications, because of low trustworthiness and the scarcity of specialized datasets for training Large Language Models (LLMs). This position…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: 7 pages, 7 figures

  48. arXiv:2403.10776  [pdf, other

    cs.HC cs.AI cs.CY cs.LG

    From Melting Pots to Misrepresentations: Exploring Harms in Generative AI

    Authors: Sanjana Gautam, Pranav Narayanan Venkit, Sourojit Ghosh

    Abstract: With the widespread adoption of advanced generative models such as Gemini and GPT, there has been a notable increase in the incorporation of such models into sociotechnical systems, categorized under AI-as-a-Service (AIaaS). Despite their versatility across diverse sectors, concerns persist regarding discriminatory tendencies within these models, particularly favoring selected `majority' demograph…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: In CHI 2024: Generative AI and HCI workshop (GenAICHI 24)

  49. arXiv:2403.09702  [pdf, other

    cs.CL

    Generator-Guided Crowd Reaction Assessment

    Authors: Sohom Ghosh, Chung-Chi Chen, Sudip Kumar Naskar

    Abstract: In the realm of social media, understanding and predicting post reach is a significant challenge. This paper presents a Crowd Reaction AssessMent (CReAM) task designed to estimate if a given social media post will receive more reaction than another, a particularly essential task for digital marketers and content writers. We introduce the Crowd Reaction Estimation Dataset (CRED), consisting of pair…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted for publication in The ACM Web Conference WWW'24 Companion Short Papers Track, May 13 to 17 2024, Singapore, DOI 10.1145/3589335.3651512

    ACM Class: I.2.7

  50. arXiv:2403.09026  [pdf, other

    cs.AR cs.NE

    FlexNN: A Dataflow-aware Flexible Deep Learning Accelerator for Energy-Efficient Edge Devices

    Authors: Arnab Raha, Deepak A. Mathaikutty, Soumendu K. Ghosh, Shamik Kundu

    Abstract: This paper introduces FlexNN, a Flexible Neural Network accelerator, which adopts agile design principles to enable versatile dataflows, enhancing energy efficiency. Unlike conventional convolutional neural network accelerator architectures that adhere to fixed dataflows (such as input, weight, output, or row stationary) for transferring activations and weights between storage and compute units, o…

    Submitted 11 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: Version 1. Work started in 2019