Showing 1–50 of 328 results for author: Gupta, R

Searching in archive cs.

  1. arXiv:2406.13636  [pdf, other]

    cs.RO cs.LG

    Contrast Sets for Evaluating Language-Guided Robot Policies

    Authors: Abrar Anwar, Rohan Gupta, Jesse Thomason

    Abstract: Robot evaluations in language-guided, real world settings are time-consuming and often sample only a small space of potential instructions across complex scenes. In this work, we introduce contrast sets for robotics as an approach to make small, but specific, perturbations to otherwise independent, identically distributed (i.i.d.) test instances. We investigate the relationship between experimente…

    Submitted 19 June, 2024; originally announced June 2024.

  2. arXiv:2406.05276  [pdf, other]

    cs.LG

    VTrans: Accelerating Transformer Compression with Variational Information Bottleneck based Pruning

    Authors: Oshin Dutta, Ritvik Gupta, Sumeet Agarwal

    Abstract: In recent years, there has been a growing emphasis on compressing large pre-trained transformer models for resource-constrained devices. However, traditional pruning methods often leave the embedding layer untouched, leading to model over-parameterization. Additionally, they require extensive compression time with large datasets to maintain performance in pruned models. To address these challenges…

    Submitted 11 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

  3. arXiv:2405.18780  [pdf, other]

    cs.AI cs.LG

    Quantitative Certification of Bias in Large Language Models

    Authors: Isha Chaudhary, Qian Hu, Manoj Kumar, Morteza Ziyadi, Rahul Gupta, Gagandeep Singh

    Abstract: Large Language Models (LLMs) can produce responses that exhibit social biases and support stereotypes. However, conventional benchmarking is insufficient to thoroughly evaluate LLM bias, as it can not scale to large sets of prompts and provides no guarantees. Therefore, we propose a novel certification framework QuaCer-B (Quantitative Certification of Bias) that provides formal guarantees on obtai…

    Submitted 29 May, 2024; originally announced May 2024.

  4. arXiv:2405.12167  [pdf, other]

    cs.CY

    Open-Source Assessments of AI Capabilities: The Proliferation of AI Analysis Tools, Replicating Competitor Models, and the Zhousidun Dataset

    Authors: Ritwik Gupta, Leah Walker, Eli Glickman, Raine Koizumi, Sarthak Bhatnagar, Andrew W. Reddie

    Abstract: The integration of artificial intelligence (AI) into military capabilities has become a norm for major military power across the globe. Understanding how these AI models operate is essential for maintaining strategic advantages and ensuring security. This paper demonstrates an open-source methodology for analyzing military AI models through a detailed examination of the Zhousidun dataset, a Chines…

    Submitted 24 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

  5. arXiv:2405.03873  [pdf, other]

    cs.AI cs.HC

    Investigating Personalized Driving Behaviors in Dilemma Zones: Analysis and Prediction of Stop-or-Go Decisions

    Authors: Ziye Qin, Siyan Li, Guoyuan Wu, Matthew J. Barth, Amr Abdelraouf, Rohit Gupta, Kyungtae Han

    Abstract: Dilemma zones at signalized intersections present a commonly occurring but unsolved challenge for both drivers and traffic operators. Onsets of the yellow lights prompt varied responses from different drivers: some may brake abruptly, compromising the ride comfort, while others may accelerate, increasing the risk of red-light violations and potential safety hazards. Such diversity in drivers' stop…

    Submitted 6 May, 2024; originally announced May 2024.

  6. arXiv:2405.01734  [pdf, other]

    cs.CV cs.AI

    Diabetic Retinopathy Detection Using Quantum Transfer Learning

    Authors: Ankush Jain, Rinav Gupta, Jai Singhal

    Abstract: Diabetic Retinopathy (DR), a prevalent complication in diabetes patients, can lead to vision impairment due to lesions formed on the retina. Detecting DR at an advanced stage often results in irreversible blindness. The traditional process of diagnosing DR through retina fundus images by ophthalmologists is not only time-intensive but also expensive. While classical transfer learning models have b…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 14 pages, 12 figures and 5 tables

  7. arXiv:2405.01488  [pdf, other]

    cs.LG stat.ML

    Digital Twin Generators for Disease Modeling

    Authors: Nameyeh Alam, Jake Basilico, Daniele Bertolini, Satish Casie Chetty, Heather D'Angelo, Ryan Douglas, Charles K. Fisher, Franklin Fuller, Melissa Gomes, Rishabh Gupta, Alex Lang, Anton Loukianov, Rachel Mak-McCully, Cary Murray, Hanalei Pham, Susanna Qiao, Elena Ryapolova-Webb, Aaron Smith, Dimitri Theoharatos, Anil Tolwani, Eric W. Tramel, Anna Vidovszky, Judy Viduya, Jonathan R. Walsh

    Abstract: A patient's digital twin is a computational model that describes the evolution of their health over time. Digital twins have the potential to revolutionize medicine by enabling individual-level computer simulations of human health, which can be used to conduct more efficient clinical trials or to recommend personalized treatment options. Due to the overwhelming complexity of human biology, machine…

    Submitted 2 May, 2024; originally announced May 2024.

  8. arXiv:2404.17983  [pdf, other]

    cs.SD cs.CL eess.AS

    TI-ASU: Toward Robust Automatic Speech Understanding through Text-to-speech Imputation Against Missing Speech Modality

    Authors: Tiantian Feng, Xuan Shi, Rahul Gupta, Shrikanth S. Narayanan

    Abstract: Automatic Speech Understanding (ASU) aims at human-like speech interpretation, providing nuanced intent, emotion, sentiment, and content understanding from speech and language (text) content conveyed in speech. Typically, training a robust ASU model relies heavily on acquiring large-scale, high-quality speech and associated transcriptions. However, it is often challenging to collect or use speech…

    Submitted 27 April, 2024; originally announced April 2024.

  9. arXiv:2404.11181  [pdf, other]

    cs.LG cs.AI cs.RO

    KI-GAN: Knowledge-Informed Generative Adversarial Networks for Enhanced Multi-Vehicle Trajectory Forecasting at Signalized Intersections

    Authors: Chuheng Wei, Guoyuan Wu, Matthew J. Barth, Amr Abdelraouf, Rohit Gupta, Kyungtae Han

    Abstract: Reliable prediction of vehicle trajectories at signalized intersections is crucial to urban traffic management and autonomous driving systems. However, it presents unique challenges, due to the complex roadway layout at intersections, involvement of traffic signal controls, and interactions among different types of road users. To address these issues, we present in this paper a novel model called…

    Submitted 19 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: 2024 CVPR AICity Workshop

  10. arXiv:2404.08940  [pdf, other]

    cs.IR cs.CL cs.LG

    Introducing Super RAGs in Mistral 8x7B-v1

    Authors: Ayush Thakur, Raghav Gupta

    Abstract: The relentless pursuit of enhancing Large Language Models (LLMs) has led to the advent of Super Retrieval-Augmented Generation (Super RAGs), a novel approach designed to elevate the performance of LLMs by integrating external knowledge sources with minimal structural modifications. This paper presents the integration of Super RAGs into the Mistral 8x7B v1, a state-of-the-art LLM, and examines the…

    Submitted 13 April, 2024; originally announced April 2024.

  11. arXiv:2404.08704  [pdf, other]

    cs.CL cs.AI

    MM-PhyQA: Multimodal Physics Question-Answering With Multi-Image CoT Prompting

    Authors: Avinash Anand, Janak Kapuriya, Apoorv Singh, Jay Saraf, Naman Lal, Astha Verma, Rushali Gupta, Rajiv Shah

    Abstract: While Large Language Models (LLMs) can achieve human-level performance in various tasks, they continue to face challenges when it comes to effectively tackling multi-step physics reasoning tasks. To identify the shortcomings of existing models and facilitate further research in this area, we curated a novel dataset, MM-PhyQA, which comprises well-constructed, high school-level multimodal physics pr…

    Submitted 11 April, 2024; originally announced April 2024.

  12. arXiv:2404.03245  [pdf, other]

    cs.ET cs.OS

    Memory Sharing with CXL: Hardware and Software Design Approaches

    Authors: Sunita Jain, Nagaradhesh Yeleswarapu, Hasan Al Maruf, Rita Gupta

    Abstract: Compute Express Link (CXL) is a rapidly emerging coherent interconnect standard that provides opportunities for memory pooling and sharing. Memory sharing is a well-established software feature that improves memory utilization by avoiding unnecessary data movement. In this paper, we discuss multiple approaches to enable memory sharing with different generations of CXL protocol (i.e., CXL 2.0 and C…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Presented at the 3rd Workshop on Heterogeneous Composable and Disaggregated Systems (HCDS 2024)

  13. arXiv:2404.02323  [pdf, other]

    cs.CL

    Toward Informal Language Processing: Knowledge of Slang in Large Language Models

    Authors: Zhewei Sun, Qian Hu, Rahul Gupta, Richard Zemel, Yang Xu

    Abstract: Recent advancement in large language models (LLMs) has offered a strong potential for natural language systems to process informal language. A representative form of informal language is slang, used commonly in daily conversations and online social media. To date, slang has not been comprehensively evaluated in LLMs due partly to the absence of a carefully designed and publicly accessible benchmar…

    Submitted 12 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024 main conference

  14. arXiv:2403.17270  [pdf, other]

    cs.RO cs.HC

    Human Stress Response and Perceived Safety during Encounters with Quadruped Robots

    Authors: Ryan Gupta, Hyonyoung Shin, Emily Norman, Keri K. Stephens, Nanshu Lu, Luis Sentis

    Abstract: Despite the rise of mobile robot deployments in home and work settings, perceived safety of users and bystanders is understudied in the human-robot interaction (HRI) literature. To address this, we present a study designed to identify elements of a human-robot encounter that correlate with observed stress response. Stress is a key component of perceived safety and is strongly associated with human…

    Submitted 6 June, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: 8 pages, 7 figs, 5 tables

  15. arXiv:2403.17255  [pdf, other]

    eess.IV cs.CV

    Decoding the visual attention of pathologists to reveal their level of expertise

    Authors: Souradeep Chakraborty, Dana Perez, Paul Friedman, Natallia Sheuka, Constantin Friedman, Oksana Yaskiv, Rajarsi Gupta, Gregory J. Zelinsky, Joel H. Saltz, Dimitris Samaras

    Abstract: We present a method for classifying the expertise of a pathologist based on how they allocated their attention during a cancer reading. We engage this decoding task by developing a novel method for predicting the attention of pathologists as they read whole-slide Images (WSIs) of prostate and make cancer grade classifications. Our ground truth measure of a pathologists' attention is the x, y and z…

    Submitted 25 March, 2024; originally announced March 2024.

  16. arXiv:2403.15170  [pdf, other]

    cs.LG cs.AI eess.SP

    Exploring the Task-agnostic Trait of Self-supervised Learning in the Context of Detecting Mental Disorders

    Authors: Rohan Kumar Gupta, Rohit Sinha

    Abstract: Self-supervised learning (SSL) has been investigated to generate task-agnostic representations across various domains. However, such investigation has not been conducted for detecting multiple mental disorders. The rationale behind the existence of a task-agnostic representation lies in the overlapping symptoms among multiple mental disorders. Consequently, the behavioural data collected for menta…

    Submitted 22 March, 2024; originally announced March 2024.

  17. arXiv:2403.14742  [pdf, other]

    astro-ph.IM astro-ph.HE cs.LG

    A Classifier-Based Approach to Multi-Class Anomaly Detection for Astronomical Transients

    Authors: Rithwik Gupta, Daniel Muthukrishna, Michelle Lochner

    Abstract: Automating real-time anomaly detection is essential for identifying rare transients in the era of large-scale astronomical surveys. Modern survey telescopes are generating tens of thousands of alerts per night, and future telescopes, such as the Vera C. Rubin Observatory, are projected to increase this number dramatically. Currently, most anomaly detection algorithms for astronomical transients re…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 16 pages, 14 figures, 1 table, submitted to MNRAS

  18. arXiv:2403.10581  [pdf, other]

    q-bio.QM cs.AI cs.CL cs.LG eess.SP

    Large Language Model-informed ECG Dual Attention Network for Heart Failure Risk Prediction

    Authors: Chen Chen, Lei Li, Marcel Beetz, Abhirup Banerjee, Ramneek Gupta, Vicente Grau

    Abstract: Heart failure (HF) poses a significant public health challenge, with a rising global mortality rate. Early detection and prevention of HF could significantly reduce its impact. We introduce a novel methodology for predicting HF risk using 12-lead electrocardiograms (ECGs). We present a novel, lightweight dual-attention ECG network designed to capture complex ECG features essential for early HF ris…

    Submitted 22 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: Under journal revision

  19. arXiv:2403.05982  [pdf]

    cs.CL

    Enhanced Auto Language Prediction with Dictionary Capsule -- A Novel Approach

    Authors: Pinni Venkata Abhiram, Ananya Rathore, Abhir Mirikar, Hari Krishna S, Sheena Christabel Pravin, Vishwanath Kamath Pethri, Manjunath Lokanath Belgod, Reetika Gupta, K Muthukumaran

    Abstract: The paper presents a novel Auto Language Prediction Dictionary Capsule (ALPDC) framework for language prediction and machine translation. The model uses a combination of neural networks and symbolic representations to predict the language of a given input text and then translate it to a target language using pre-built dictionaries. This research work also aims to translate the text of various lang…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: 21 Pages

  20. arXiv:2403.05882  [pdf, other]

    cs.LG

    DiffRed: Dimensionality Reduction guided by stable rank

    Authors: Prarabdh Shukla, Gagan Raj Gupta, Kunal Dutta

    Abstract: In this work, we propose a novel dimensionality reduction technique, DiffRed, which first projects the data matrix, A, along first $k_1$ principal components and the residual matrix $A^{*}$ (left after subtracting its $k_1$-rank approximation) along $k_2$ Gaussian random vectors. We evaluate M1, the distortion of mean-squared pair-wise distance, and Stress, the normalized value of RMS of distortio…

    Submitted 9 March, 2024; originally announced March 2024.

  21. arXiv:2403.01927  [pdf, other]

    q-bio.GN cs.CV q-bio.QM q-bio.TO

    Advancing Gene Selection in Oncology: A Fusion of Deep Learning and Sparsity for Precision Gene Selection

    Authors: Akhila Krishna, Ravi Kant Gupta, Pranav Jeevan, Amit Sethi

    Abstract: Gene selection plays a pivotal role in oncology research for improving outcome prediction accuracy and facilitating cost-effective genomic profiling for cancer patients. This paper introduces two gene selection strategies for deep learning-based survival prediction models. The first strategy uses a sparsity-inducing method while the second one uses importance based gene selection for identifying r…

    Submitted 4 March, 2024; originally announced March 2024.

  22. arXiv:2403.01915  [pdf, other]

    cs.CV cs.AI

    xT: Nested Tokenization for Larger Context in Large Images

    Authors: Ritwik Gupta, Shufan Li, Tyler Zhu, Jitendra Malik, Trevor Darrell, Karttikeya Mangalam

    Abstract: Modern computer vision pipelines handle large images in one of two sub-optimal ways: down-sampling or cropping. These two methods incur significant losses in the amount of information and context present in an image. There are many downstream applications in which global context matters as much as high frequency details, such as in real-world satellite imagery; in such cases researchers have to ma…

    Submitted 4 March, 2024; originally announced March 2024.

  23. arXiv:2403.01615  [pdf, other]

    cs.LG cs.DC

    Partial Federated Learning

    Authors: Tiantian Feng, Anil Ramakrishna, Jimit Majmudar, Charith Peris, Jixuan Wang, Clement Chung, Richard Zemel, Morteza Ziyadi, Rahul Gupta

    Abstract: Federated Learning (FL) is a popular algorithm to train machine learning models on user data constrained to edge devices (for example, mobile phones) due to privacy concerns. Typically, FL is trained with the assumption that no part of the user data can be egressed from the edge. However, in many production settings, specific data-modalities/meta-data are limited to be on device while others are n…

    Submitted 3 March, 2024; originally announced March 2024.

  24. arXiv:2402.18128  [pdf, other]

    cs.CV cs.LG

    Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization

    Authors: Han Guo, Ramtin Hosseini, Ruiyi Zhang, Sai Ashish Somayajula, Ranak Roy Chowdhury, Rajesh K. Gupta, Pengtao Xie

    Abstract: Masked Autoencoder (MAE) is a notable method for self-supervised pretraining in visual representation learning. It operates by randomly masking image patches and reconstructing these masked patches using the unmasked ones. A key limitation of MAE lies in its disregard for the varying informativeness of different patches, as it uniformly selects patches to mask. To overcome this, some approaches pr…

    Submitted 28 February, 2024; originally announced February 2024.

  25. arXiv:2402.17447  [pdf, other]

    cs.CL cs.AI cs.IR

    Deep Learning Based Named Entity Recognition Models for Recipes

    Authors: Mansi Goel, Ayush Agarwal, Shubham Agrawal, Janak Kapuriya, Akhil Vamshi Konam, Rishabh Gupta, Shrey Rastogi, Niharika, Ganesh Bagler

    Abstract: Food touches our lives through various endeavors, including flavor, nourishment, health, and sustainability. Recipes are cultural capsules transmitted across generations via unstructured text. Automated protocols for recognizing named entities, the building blocks of recipe text, are of immense value for various applications ranging from information extraction to novel recipe generation. Named ent…

    Submitted 6 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: 13 pages, 6 main figures and 2 in appendices, and 3 main tables; Accepted for publication in LREC-COLING 2024

  26. arXiv:2402.03957  [pdf, other]

    cs.CL

    Sparse Graph Representations for Procedural Instructional Documents

    Authors: Shruti Singh, Rishabh Gupta

    Abstract: Computation of document similarity is a critical task in various NLP domains that has applications in deduplication, matching, and recommendation. Traditional approaches for document similarity computation include learning representations of documents and employing a similarity or a distance function over the embeddings. However, pairwise similarities and differences are not efficiently captured b…

    Submitted 6 February, 2024; originally announced February 2024.

  27. arXiv:2402.01801  [pdf, other]

    cs.LG cs.AI cs.CL

    Large Language Models for Time Series: A Survey

    Authors: Xiyuan Zhang, Ranak Roy Chowdhury, Rajesh K. Gupta, Jingbo Shang

    Abstract: Large Language Models (LLMs) have seen significant use in domains such as natural language processing and computer vision. Going beyond text, image and graphics, LLMs present a significant potential for analysis of time series data, benefiting domains such as climate, IoT, healthcare, traffic, audio and finance. This survey paper provides an in-depth exploration and a detailed taxonomy of the vari…

    Submitted 6 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: GitHub repository: https://github.com/xiyuanzh/awesome-llm-time-series

  28. arXiv:2401.15288  [pdf]

    cs.CV cs.MM cs.NI

    STAC: Leveraging Spatio-Temporal Data Associations For Efficient Cross-Camera Streaming and Analytics

    Authors: Volodymyr Vakhniuk, Ayush Sarkar, Ragini Gupta

    Abstract: We propose an efficient cross-cameras surveillance system called STAC that leverages spatio-temporal associations between multiple cameras to provide real-time analytics and inference under constrained network environments. STAC is built using the proposed omni-scale feature learning people reidentification (reid) algorithm that allows accurate detection, tracking and re-identification of people…

    Submitted 26 January, 2024; originally announced January 2024.

    ACM Class: I.4.2; I.4.0; C.2.2; C.2.0

  29. arXiv:2401.10931  [pdf, other]

    q-fin.ST cs.CR cs.LG

    Forecasting Cryptocurrency Staking Rewards

    Authors: Sauren Gupta, Apoorva Hathi Katharaki, Yifan Xu, Bhaskar Krishnamachari, Rajarshi Gupta

    Abstract: This research explores a relatively unexplored area of predicting cryptocurrency staking rewards, offering potential insights to researchers and investors. We investigate two predictive methodologies: a) a straightforward sliding-window average, and b) linear regression models predicated on historical data. The findings reveal that ETH staking rewards can be forecasted with an RMSE within 0.7% and…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 9 pages, 18 figures

  30. arXiv:2401.09937  [pdf, other]

    cs.CY cs.HC

    From Cash to Cashless: UPI's Impact on Spending Behavior among Indian Users

    Authors: Harshal Dev, Raj Gupta, Dhruv Kumar

    Abstract: The emergence of digital payment systems has transformed how individuals conduct financial transactions, offering convenience, security, and efficiency. One groundbreaking innovation making waves in the Indian financial landscape is the Unified Payments Interface (UPI). Existing work has explored how digital payments benefit a country's economy and GDP. However, our study explores how the introduc…

    Submitted 7 May, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: Accepted to ACM CHI 2024 - Late Breaking Work Track

  31. arXiv:2312.17254  [pdf, other]

    cs.CL

    Faithful Model Evaluation for Model-Based Metrics

    Authors: Palash Goyal, Qian Hu, Rahul Gupta

    Abstract: Statistical significance testing is used in natural language processing (NLP) to determine whether the results of a study or experiment are likely to be due to chance or if they reflect a genuine relationship. A key step in significance testing is the estimation of confidence interval which is a function of sample variance. Sample variance calculation is straightforward when evaluating against gro…

    Submitted 19 December, 2023; originally announced December 2023.

  32. arXiv:2312.15010  [pdf, other]

    cs.CV

    SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology

    Authors: Saarthak Kapse, Pushpak Pati, Srijan Das, Jingwei Zhang, Chao Chen, Maria Vakalopoulou, Joel Saltz, Dimitris Samaras, Rajarsi R. Gupta, Prateek Prasanna

    Abstract: Introducing interpretability and reasoning into Multiple Instance Learning (MIL) methods for Whole Slide Image (WSI) analysis is challenging, given the complexity of gigapixel slides. Traditionally, MIL interpretability is limited to identifying salient regions deemed pertinent for downstream tasks, offering little insight to the end-user (pathologist) regarding the rationale behind these selectio…

    Submitted 18 May, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

  33. arXiv:2312.11779  [pdf, other]

    cs.CL cs.AI cs.LG

    Tokenization Matters: Navigating Data-Scarce Tokenization for Gender Inclusive Language Technologies

    Authors: Anaelia Ovalle, Ninareh Mehrabi, Palash Goyal, Jwala Dhamala, Kai-Wei Chang, Richard Zemel, Aram Galstyan, Yuval Pinter, Rahul Gupta

    Abstract: Gender-inclusive NLP research has documented the harmful limitations of gender binary-centric large language models (LLM), such as the inability to correctly use gender-diverse English neopronouns (e.g., xe, zir, fae). While data scarcity is a known culprit, the precise mechanisms through which scarcity affects this behavior remain underexplored. We discover LLM misgendering is significantly influ…

    Submitted 6 April, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: Accepted to NAACL 2024 findings

  34. arXiv:2312.08366  [pdf, other]

    cs.CV

    See, Say, and Segment: Teaching LMMs to Overcome False Premises

    Authors: Tsung-Han Wu, Giscard Biamby, David Chan, Lisa Dunlap, Ritwik Gupta, Xudong Wang, Joseph E. Gonzalez, Trevor Darrell

    Abstract: Current open-source Large Multimodal Models (LMMs) excel at tasks such as open-vocabulary language grounding and segmentation but can suffer under false premises when queries imply the existence of something that is not actually present in the image. We observe that existing methods that fine-tune an LMM to segment images significantly degrade their ability to reliably determine ("see") if an obje…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: Project Page: https://see-say-segment.github.io

  35. arXiv:2312.04372  [pdf, other]

    cs.CL cs.AI

    LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

    Authors: Yunsheng Ma, Can Cui, Xu Cao, Wenqian Ye, Peiran Liu, Juanwu Lu, Amr Abdelraouf, Rohit Gupta, Kyungtae Han, Aniket Bera, James M. Rehg, Ziran Wang

    Abstract: Autonomous driving (AD) has made significant strides in recent years. However, existing frameworks struggle to interpret and execute spontaneous user instructions, such as "overtake the car ahead." Large Language Models (LLMs) have demonstrated impressive reasoning capabilities showing potential to bridge this gap. In this paper, we present LaMPilot, a novel framework that integrates LLMs into AD…

    Submitted 4 April, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: CVPR 2024

  36. arXiv:2311.09473  [pdf, other]

    cs.AI cs.CL

    JAB: Joint Adversarial Prompting and Belief Augmentation

    Authors: Ninareh Mehrabi, Palash Goyal, Anil Ramakrishna, Jwala Dhamala, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta

    Abstract: With the recent surge of language models in different applications, attention to safety and robustness of these models has gained significant importance. Here we introduce a joint framework in which we simultaneously probe and improve the robustness of a black-box target model via adversarial prompting and belief augmentation using iterative feedback loops. This framework utilizes an automated red…

    Submitted 15 November, 2023; originally announced November 2023.

  37. arXiv:2311.06968  [pdf, other]

    cs.LG cs.AI eess.SP stat.ML

    Physics-Informed Data Denoising for Real-Life Sensing Systems

    Authors: Xiyuan Zhang, Xiaohan Fu, Diyan Teng, Chengyu Dong, Keerthivasan Vijayakumar, Jiayun Zhang, Ranak Roy Chowdhury, Junsheng Han, Dezhi Hong, Rashmi Kulkarni, Jingbo Shang, Rajesh Gupta

    Abstract: Sensors measuring real-life physical processes are ubiquitous in today's interconnected world. These sensors inherently bear noise that often adversely affects performance and reliability of the systems they support. Classic filtering-based approaches introduce strong assumptions on the time or frequency characteristics of sensory measurements, while learning-based denoising approaches typically r…

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: SenSys 2023

  38. arXiv:2311.05284  [pdf, other]

    cs.DC

    Challenges and Opportunities in the Co-design of Convolutions and RISC-V Vector Processors

    Authors: Sonia Rani Gupta, Nikela Papadopoulou, Miquel Pericàs

    Abstract: The RISC-V "V" extension introduces vector processing to the RISC-V architecture. Unlike most SIMD extensions, it supports long vectors which can result in significant improvement of multiple applications. In this paper, we present our ongoing research to implement and optimize a vectorized Winograd algorithm used in convolutional layers on RISC-V Vector (RISC-VV) processors. Our study identifies e…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: To appear at the Second International workshop on RISC-V for HPC, co-located with SC 2023

  39. arXiv:2311.04978  [pdf, other]

    cs.CL

    On the steerability of large language models toward data-driven personas

    Authors: Junyi Li, Ninareh Mehrabi, Charith Peris, Palash Goyal, Kai-Wei Chang, Aram Galstyan, Richard Zemel, Rahul Gupta

    Abstract: Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented. Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs, that can be leveraged to produce multiple perspectives and to reflect the diverse opinions. Moving beyond the traditional reliance on demographics like a…

    Submitted 2 April, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

  40. arXiv:2311.02399  [pdf, ps, other]

    cs.LG cs.DC

    Entropy Aware Training for Fast and Accurate Distributed GNN

    Authors: Dhruv Deshmukh, Gagan Raj Gupta, Manisha Chawla, Vishwesh Jatala, Anirban Haldar

    Abstract: Several distributed frameworks have been developed to scale Graph Neural Networks (GNNs) on billion-size graphs. On several benchmarks, we observe that the graph partitions generated by these frameworks have heterogeneous data distributions and class imbalance, affecting convergence, and resulting in lower performance than centralized implementations. We holistically address these challenges and d…

    Submitted 4 November, 2023; originally announced November 2023.

    Comments: 8 pages, 3 figures, 5 tables, accepted at ICDM'23

    ACM Class: I.5.1; I.5.2

  41. arXiv:2311.02010  [pdf, other]

    cs.CY

    A cast of thousands: How the IDEAS Productivity project has advanced software productivity and sustainability

    Authors: Lois Curfman McInnes, Michael Heroux, David E. Bernholdt, Anshu Dubey, Elsa Gonsiorowski, Rinku Gupta, Osni Marques, J. David Moulton, Hai Ah Nam, Boyana Norris, Elaine M. Raybourn, Jim Willenbring, Ann Almgren, Ross Bartlett, Kita Cranfill, Stephen Fickas, Don Frederick, William Godoy, Patricia Grubel, Rebecca Hartman-Baker, Axel Huebl, Rose Lynch, Addi Malviya Thakur, Reed Milewicz, Mark C. Miller, et al. (9 additional authors not shown)

    Abstract: Computational and data-enabled science and engineering are revolutionizing advances throughout science and society, at all scales of computing. For example, teams in the U.S. DOE Exascale Computing Project have been tackling new frontiers in modeling, simulation, and analysis by exploiting unprecedented exascale computing capabilities-building an advanced software ecosystem that supports next-gene…

    Submitted 16 February, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: 12 pages, 1 figure

  42. arXiv:2310.16673  [pdf, other]

    cs.SE cs.AI cs.IR

    Exploring Large Language Models for Code Explanation

    Authors: Paheli Bhattacharya, Manojit Chakraborty, Kartheek N S N Palepu, Vikas Pandey, Ishan Dindorkar, Rakesh Rajpurohit, Rishabh Gupta

    Abstract: Automating code documentation through explanatory text can prove highly beneficial in code understanding. Large Language Models (LLMs) have made remarkable strides in Natural Language Processing, especially within software engineering tasks such as code generation and code summarization. This study specifically delves into the task of generating natural-language summaries for code snippets, using…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: Accepted at the Forum for Information Retrieval Evaluation 2023 (IRSE Track)

    ACM Class: D.2.3; I.7

  43. arXiv:2310.16639  [pdf, other]

    cs.CV cs.LG

    Driving through the Concept Gridlock: Unraveling Explainability Bottlenecks in Automated Driving

    Authors: Jessica Echterhoff, An Yan, Kyungtae Han, Amr Abdelraouf, Rohit Gupta, Julian McAuley

    Abstract: Concept bottleneck models have been successfully used for explainable machine learning by encoding information within the model with a set of human-defined concepts. In the context of human-assisted or autonomous driving, explainability models can help user acceptance and understanding of decisions made by the autonomous vehicle, which can be used to rationalize and explain driver or vehicle behav…

    Submitted 26 October, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

  44. arXiv:2310.15054  [pdf, other]

    cs.LG

    Coordinated Replay Sample Selection for Continual Federated Learning

    Authors: Jack Good, Jimit Majmudar, Christophe Dupuy, Jixuan Wang, Charith Peris, Clement Chung, Richard Zemel, Rahul Gupta

    Abstract: Continual Federated Learning (CFL) combines Federated Learning (FL), the decentralized learning of a central model on a number of client devices that may not communicate their data, and Continual Learning (CL), the learning of a model from a continual stream of data without keeping the entire history. In CL, the main challenge is \textit{forgetting} what was learned from past data. While replay-ba…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 7 pages, 6 figures, accepted to EMNLP (industry track)

  45. arXiv:2310.14542  [pdf, other]

    cs.CL

    Evaluating Large Language Models on Controlled Generation Tasks

    Authors: Jiao Sun, Yufei Tian, Wangchunshu Zhou, Nan Xu, Qian Hu, Rahul Gupta, John Frederick Wieting, Nanyun Peng, Xuezhe Ma

    Abstract: While recent studies have looked into the abilities of large language models in various benchmark tasks, including question generation, reading comprehension, multilingual and etc, there have been few studies looking into the controllability of large language models on generation tasks. We present an extensive analysis of various benchmarks including a sentence planning benchmark with different gr…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023

  46. arXiv:2310.14162  [pdf, other]

    cs.CV cs.AI

    Augmenting End-to-End Steering Angle Prediction with CAN Bus Data

    Authors: Rohan Gupta

    Abstract: In recent years, end to end steering prediction for autonomous vehicles has become a major area of research. The primary method for achieving end to end steering was to use computer vision models on a live feed of video data. However, to further increase accuracy, many companies have added data from light detection and ranging (LiDAR) and or radar sensors through sensor fusion. However, the additi…

    Submitted 21 October, 2023; originally announced October 2023.

    Comments: 5 pages

  47. arXiv:2310.04572  [pdf, other]

    cs.RO

    LIVE: Lidar Informed Visual Search for Multiple Objects with Multiple Robots

    Authors: Ryan Gupta, Minkyu Kim, Juliana T Rodriguez, Kyle Morgenstein, Luis Sentis

    Abstract: This paper introduces LIVE: Lidar Informed Visual Search focused on the problem of multi-robot (MR) planning and execution for robust visual detection of multiple objects. We perform extensive real-world experiments with a two-robot team in an indoor apartment setting. LIVE acts as a perception module that detects unmapped obstacles, or Short Term Features (STFs), in Lidar observations. STFs are f…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: 4 pages + references; 6 figures

  48. arXiv:2310.03346  [pdf, other]

    cs.CV

    Combining Datasets with Different Label Sets for Improved Nucleus Segmentation and Classification

    Authors: Amruta Parulekar, Utkarsh Kanwat, Ravi Kant Gupta, Medha Chippa, Thomas Jacob, Tripti Bameta, Swapnil Rane, Amit Sethi

    Abstract: Segmentation and classification of cell nuclei in histopathology images using deep neural networks (DNNs) can save pathologists' time for diagnosing various diseases, including cancers, by automating cell counting and morphometric assessments. It is now well-known that the accuracy of DNNs increases with the sizes of annotated datasets available for training. Although multiple datasets of histopat…

    Submitted 5 October, 2023; originally announced October 2023.

  49. arXiv:2310.03185  [pdf, other]

    cs.CR cs.AI

    Misusing Tools in Large Language Models With Visual Adversarial Examples

    Authors: Xiaohan Fu, Zihan Wang, Shuheng Li, Rajesh K. Gupta, Niloofar Mireshghallah, Taylor Berg-Kirkpatrick, Earlence Fernandes

    Abstract: Large Language Models (LLMs) are being enhanced with the ability to use tools and to process multiple modalities. These new capabilities bring new benefits and also new security risks. In this work, we show that an attacker can use visual adversarial examples to cause attacker-desired tool usage. For example, the attacker could cause a victim LLM to delete calendar events, leak private conversatio…

    Submitted 4 October, 2023; originally announced October 2023.

  50. arXiv:2309.17172  [pdf, other]

    cs.CV

    Domain-Adaptive Learning: Unsupervised Adaptation for Histology Images with Improved Loss Function Combination

    Authors: Ravi Kant Gupta, Shounak Das, Amit Sethi

    Abstract: This paper presents a novel approach for unsupervised domain adaptation (UDA) targeting H&E stained histology images. Existing adversarial domain adaptation methods may not effectively align different domains of multimodal distributions associated with classification problems. The objective is to enhance domain alignment and reduce domain shifts between these domains by leveraging their unique cha…

    Submitted 29 September, 2023; originally announced September 2023.