Showing 1–50 of 232 results for author: Gupta, V

Searching in archive cs.
  1. arXiv:2406.19470  [pdf, other]

    cs.CL

    Changing Answer Order Can Decrease MMLU Accuracy

    Authors: Vipul Gupta, David Pantoja, Candace Ross, Adina Williams, Megan Ung

    Abstract: As large language models (LLMs) have grown in prevalence, particular benchmarks have become essential for the evaluation of these models and for understanding model capabilities. Most commonly, we use test accuracy averaged across multiple subtasks in order to rank models on leaderboards, to determine which model is best for our purposes. In this paper, we investigate the robustness of the accurac…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Short paper, 9 pages

  2. arXiv:2406.19237  [pdf, other]

    cs.CL cs.CV cs.IR cs.LG

    FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts

    Authors: Shubhankar Singh, Purvi Chaurasia, Yerram Varun, Pranshu Pandya, Vatsal Gupta, Vivek Gupta, Dan Roth

    Abstract: Existing benchmarks for visual question answering lack in visual grounding and complexity, particularly in evaluating spatial reasoning skills. We introduce FlowVQA, a novel benchmark aimed at assessing the capabilities of visual question-answering multimodal language models in reasoning with flowcharts as visual contexts. FlowVQA comprises 2,272 carefully generated and human-verified flowchart im…

    Submitted 28 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: Accepted in ACL 2024 (Findings), 21 pages, 7 figures, 9 Tables

  3. arXiv:2406.16964  [pdf, other]

    cs.LG cs.AI

    Are Language Models Actually Useful for Time Series Forecasting?

    Authors: Mingtian Tan, Mike A. Merrill, Vinayak Gupta, Tim Althoff, Thomas Hartvigsen

    Abstract: Large language models (LLMs) are being applied to time series tasks, particularly time series forecasting. However, are language models actually useful for time series? After a series of ablation studies on three recent and popular LLM-based time series forecasting methods, we find that removing the LLM component or replacing it with a basic attention layer does not degrade the forecasting results…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 25 pages, 8 figures and 20 tables

  4. arXiv:2406.16253  [pdf, other]

    cs.CL

    LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

    Authors: Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo, et al. (15 additional authors not shown)

    Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as th…

    Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  5. arXiv:2406.10889  [pdf, other]

    cs.CV cs.AI cs.LG

    VELOCITI: Can Video-Language Models Bind Semantic Concepts through Time?

    Authors: Darshana Saravanan, Darshan Singh, Varun Gupta, Zeeshan Khan, Vineet Gandhi, Makarand Tapaswi

    Abstract: Compositionality is a fundamental aspect of vision-language understanding and is especially required for videos since they contain multiple entities (e.g. persons, actions, and scenes) interacting dynamically over time. Existing benchmarks focus primarily on perception capabilities. However, they do not study binding, the ability of a model to associate entities through appropriate relationships.…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 26 pages, 17 figures, 3 tables

  6. arXiv:2406.10085  [pdf, other]

    cs.CL

    Enhancing Question Answering on Charts Through Effective Pre-training Tasks

    Authors: Ashim Gupta, Vivek Gupta, Shuo Zhang, Yujie He, Ning Zhang, Shalin Shah

    Abstract: To completely understand a document, the use of textual information is not enough. Understanding visual cues, such as layouts and charts, is also required. While the current state-of-the-art approaches for document understanding (both OCR-based and OCR-free) work well, a thorough analysis of their capabilities and limitations has not yet been performed. Therefore, in this work, we address the li…

    Submitted 14 June, 2024; originally announced June 2024.

  7. arXiv:2406.00968  [pdf, other]

    cs.RO cs.HC

    Evaluating MEDIRL: A Replication and Ablation Study of Maximum Entropy Deep Inverse Reinforcement Learning for Human Social Navigation

    Authors: Vinay Gupta, Nihal Gunukula

    Abstract: In this study, we enhance the Maximum Entropy Deep Inverse Reinforcement Learning (MEDIRL) framework, targeting its application in human-robot interaction (HRI) for modeling pedestrian behavior in crowded environments. Our work is grounded in the pioneering research by Fahad, Chen, and Guo, and aims to elevate MEDIRL's efficacy in real-world HRI settings. We replicated the original MEDIRL model an…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 14 pages, 13 figures

  8. arXiv:2405.16752  [pdf, other]

    cs.LG cs.AI

    Model Ensembling for Constrained Optimization

    Authors: Ira Globus-Harris, Varun Gupta, Michael Kearns, Aaron Roth

    Abstract: There is a long history in machine learning of model ensembling, beginning with boosting and bagging and continuing to the present day. Much of this history has focused on combining models for classification and regression, but recently there is interest in more complex settings such as ensembling policies in reinforcement learning. Strong connections have also emerged between ensembling and multi…

    Submitted 26 May, 2024; originally announced May 2024.

  9. arXiv:2405.15046  [pdf, other]

    math.CO cs.DM

    On the minimum spectral radius of connected graphs of given order and size

    Authors: Sebastian M. Cioabă, Vishal Gupta, Celso Marques

    Abstract: In this paper, we study a question of Hong from 1993 related to the minimum spectral radii of the adjacency matrices of connected graphs of given order and size. Hong asked if it is true that among all connected graphs of given number of vertices $n$ and number of edges $e$, the graphs having minimum spectral radius (the minimizer graphs) must be almost regular, meaning that the difference between…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 19 pages, 6 figures

    MSC Class: 05C50; 15A18

  10. Fair Evaluation of Federated Learning Algorithms for Automated Breast Density Classification: The Results of the 2022 ACR-NCI-NVIDIA Federated Learning Challenge

    Authors: Kendall Schmidt, Benjamin Bearce, Ken Chang, Laura Coombs, Keyvan Farahani, Marawan Elbatele, Kaouther Mouhebe, Robert Marti, Ruipeng Zhang, Yao Zhang, Yanfeng Wang, Yaojun Hu, Haochao Ying, Yuyang Xu, Conrad Testagrose, Mutlu Demirer, Vikash Gupta, Ünal Akünal, Markus Bujotzek, Klaus H. Maier-Hein, Yi Qin, Xiaomeng Li, Jayashree Kalpathy-Cramer, Holger R. Roth

    Abstract: The correct interpretation of breast density is important in the assessment of breast cancer risk. AI has been shown capable of accurately predicting breast density; however, due to the differences in imaging characteristics across mammography systems, models built using data from one system do not generalize well to other systems. Though federated learning (FL) has emerged as a way to improve the…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 16 pages, 9 figures

    Journal ref: Medical Image Analysis Volume 95, July 2024, 103206

  11. arXiv:2405.00908  [pdf]

    cs.CV cs.AI cs.LG

    Transformer-Based Self-Supervised Learning for Histopathological Classification of Ischemic Stroke Clot Origin

    Authors: K. Yeh, M. S. Jabal, V. Gupta, D. F. Kallmes, W. Brinjikji, B. S. Erdal

    Abstract: Background and Purpose: Identifying the thromboembolism source in ischemic stroke is crucial for treatment and secondary prevention yet is often undetermined. This study describes a self-supervised deep learning approach in digital pathology of emboli for classifying ischemic stroke clot origin from histopathological images. Methods: The dataset included whole slide images (WSI) from the STRIP AI…

    Submitted 1 May, 2024; originally announced May 2024.

  12. arXiv:2404.11757  [pdf, other]

    cs.CL

    Language Models Still Struggle to Zero-shot Reason about Time Series

    Authors: Mike A. Merrill, Mingtian Tan, Vinayak Gupta, Tom Hartvigsen, Tim Althoff

    Abstract: Time series are critical for decision-making in fields like finance and healthcare. Their importance has driven a recent influx of works passing time series into language models, leading to non-trivial forecasting on some datasets. But it remains unknown whether non-trivial forecasting implies that language models can reason about time series. To address this gap, we generate a first-of-its-kind e…

    Submitted 17 April, 2024; originally announced April 2024.

  13. Improvement in Semantic Address Matching using Natural Language Processing

    Authors: Vansh Gupta, Mohit Gupta, Jai Garg, Nitesh Garg

    Abstract: Address matching is an important task for many businesses, especially delivery and takeout companies, as it helps them retrieve a given address from their data warehouse. Existing solutions use string similarity and edit-distance algorithms to find similar addresses in the address database, but these algorithms do not work effectively with redundant, unstructured, or incomplet…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 5 pages, 7 tables, 2021 2nd International Conference for Emerging Technology (INCET)

    Journal ref: 2021 2nd International Conference for Emerging Technology (INCET), Belagavi, India, 2021, pp. 1-5

  14. Designing an Intelligent Parcel Management System using IoT & Machine Learning

    Authors: Mohit Gupta, Nitesh Garg, Jai Garg, Vansh Gupta, Devraj Gautam

    Abstract: Parcel delivery is a critical activity in railways. More importantly, each parcel must be thoroughly checked and sorted according to its destination address. We require an efficient and robust IoT system capable of doing all of these tasks with great precision and minimal human interaction. In this paper, we present a fully fledged solution using IoT and machine learning to assist trains i…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 6 pages, 6 figures, 2022 IEEE IAS Global Conference on Emerging Technologies (GlobConET)

    Journal ref: 2022 IEEE IAS Global Conference on Emerging Technologies (GlobConET), Arad, Romania, 2022, pp. 751-756

  15. arXiv:2404.07461  [pdf, other]

    cs.CL cs.AI

    "Confidently Nonsensical?": A Critical Survey on the Perspectives and Challenges of 'Hallucinations' in NLP

    Authors: Pranav Narayanan Venkit, Tatiana Chakravorti, Vipul Gupta, Heidi Biggs, Mukund Srinath, Koustava Goswami, Sarah Rajtmajer, Shomir Wilson

    Abstract: We investigate how hallucination in large language models (LLM) is characterized in peer-reviewed literature using a critical examination of 103 publications across NLP research. Through a comprehensive review of sociological and technological literature, we identify a lack of agreement with the term 'hallucination.' Additionally, we conduct a survey with 171 practitioners from the field of NLP an…

    Submitted 10 April, 2024; originally announced April 2024.

  16. arXiv:2404.06751  [pdf, other]

    cs.CY

    Leveraging open-source models for legal language modeling and analysis: a case study on the Indian constitution

    Authors: Vikhyath Gupta, Srinivasa Rao P

    Abstract: In recent years, the use of open-source models has gained immense popularity in various fields, including legal language modelling and analysis. These models have proven to be highly effective in tasks such as summarizing legal documents, extracting key information, and even predicting case outcomes. This has revolutionized the legal industry, enabling lawyers, researchers, and policymakers to qui…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 10 pages, 3 figures

  17. arXiv:2403.04007  [pdf, other]

    cs.LG math.OC

    Sampling-based Safe Reinforcement Learning for Nonlinear Dynamical Systems

    Authors: Wesley A. Suttle, Vipul K. Sharma, Krishna C. Kosaraju, S. Sivaranjani, Ji Liu, Vijay Gupta, Brian M. Sadler

    Abstract: We develop provably safe and convergent reinforcement learning (RL) algorithms for control of nonlinear dynamical systems, bridging the gap between the hard safety guarantees of control theory and the convergence guarantees of RL theory. Recent advances at the intersection of control and RL follow a two-stage, safety filter approach to enforcing hard safety constraints: model-free RL is used to le…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 20 pages, 7 figures

  18. arXiv:2402.17108  [pdf, ps, other]

    cs.GT cs.DS cs.LG

    Repeated Contracting with Multiple Non-Myopic Agents: Policy Regret and Limited Liability

    Authors: Natalie Collina, Varun Gupta, Aaron Roth

    Abstract: We study a repeated contracting setting in which a Principal adaptively chooses amongst $k$ Agents at each of $T$ rounds. The Agents are non-myopic, and so a mechanism for the Principal induces a $T$-round extensive form game amongst the Agents. We give several results aimed at understanding an under-explored aspect of contract theory -- the game induced when choosing an Agent to contract with. Fi…

    Submitted 26 February, 2024; originally announced February 2024.

  19. arXiv:2402.11755  [pdf, other]

    cs.LG cs.CL cs.CR cs.PL

    SPML: A DSL for Defending Language Models Against Prompt Attacks

    Authors: Reshabh K Sharma, Vinayak Gupta, Dan Grossman

    Abstract: Large language models (LLMs) have profoundly transformed natural language applications, with a growing reliance on instruction-based definitions for designing chatbots. However, post-deployment the chatbot definitions are fixed and are vulnerable to attacks by malicious users, emphasizing the need to prevent unethical applications and financial losses. Existing studies explore user prompts' impact…

    Submitted 18 February, 2024; originally announced February 2024.

  20. arXiv:2402.11194  [pdf, other]

    cs.CL

    Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering

    Authors: Pragya Srivastava, Manuj Malik, Vivek Gupta, Tanuja Ganu, Dan Roth

    Abstract: Large Language Models (LLMs) excel in natural language understanding, but their capability for complex mathematical reasoning with an amalgamation of structured tables and unstructured text is uncertain. This study explores LLMs' mathematical reasoning on four financial tabular question-answering datasets: TATQA, FinQA, ConvFinQA, and Multihiertt. Through extensive experiments with various models…

    Submitted 29 February, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

    Comments: 25 pages, 17 figures

  21. arXiv:2402.09658  [pdf]

    eess.IV cs.CV

    Towards Precision Cardiovascular Analysis in Zebrafish: The ZACAF Paradigm

    Authors: Amir Mohammad Naderi, Jennifer G. Casey, Mao-Hsiang Huang, Rachelle Victorio, David Y. Chiang, Calum MacRae, Hung Cao, Vandana A. Gupta

    Abstract: Quantifying cardiovascular parameters like ejection fraction in zebrafish as a host of biological investigations has been extensively studied. Since current manual monitoring techniques are time-consuming and fallible, several image processing frameworks have been proposed to automate the process. Most of these works rely on supervised deep-learning architectures. However, supervised methods tend…

    Submitted 14 February, 2024; originally announced February 2024.

  22. arXiv:2402.08747  [pdf, other]

    cs.GT eess.SY

    Rationality of Learning Algorithms in Repeated Normal-Form Games

    Authors: Shivam Bajaj, Pranoy Das, Yevgeniy Vorobeychik, Vijay Gupta

    Abstract: Many learning algorithms are known to converge to an equilibrium for specific classes of games if the same learning algorithm is adopted by all agents. However, when the agents are self-interested, a natural question is whether agents have a strong incentive to adopt an alternative learning algorithm that yields them greater individual utility. We capture such incentives as an algorithm's rational…

    Submitted 13 February, 2024; originally announced February 2024.

  23. arXiv:2402.04632  [pdf, other]

    cs.CV cs.GR

    GSN: Generalisable Segmentation in Neural Radiance Field

    Authors: Vinayak Gupta, Rahul Goel, Sirikonda Dhawal, P. J. Narayanan

    Abstract: Traditional Radiance Field (RF) representations capture details of a specific scene and must be trained afresh on each scene. Semantic feature fields have been added to RFs to facilitate several segmentation tasks. Generalised RF representations learn the principles of view interpolation. A generalised RF can render new views of an unknown and untrained scene, given a few views. We present a way t…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: Accepted at the Main Technical Track of AAAI 2024

  24. arXiv:2402.04146  [pdf, other]

    stat.ML cs.LG

    Interpretable Multi-Source Data Fusion Through Latent Variable Gaussian Process

    Authors: Sandipp Krishnan Ravi, Yigitcan Comlek, Wei Chen, Arjun Pathak, Vipul Gupta, Rajnikant Umretiya, Andrew Hoffman, Ghanshyam Pilania, Piyush Pandita, Sayan Ghosh, Nathaniel Mckeever, Liping Wang

    Abstract: With the advent of artificial intelligence (AI) and machine learning (ML), various domains of the science and engineering communities have leveraged data-driven surrogates to model complex systems from numerous sources of information (data). This proliferation has led to a significant reduction in the cost and time involved in the development of superior systems designed to perform specific functionalities. A high…

    Submitted 16 February, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: 27 pages, 9 figures, 3 supplementary figures, 2 supplementary tables

  25. arXiv:2402.03256  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Learning Best-in-Class Policies for the Predict-then-Optimize Framework

    Authors: Michael Huang, Vishal Gupta

    Abstract: We propose a novel family of decision-aware surrogate losses, called Perturbation Gradient (PG) losses, for the predict-then-optimize framework. These losses directly approximate the downstream decision loss and can be optimized using off-the-shelf gradient-based methods. Importantly, unlike existing surrogate losses, the approximation error of our PG losses vanishes as the number of samples grows…

    Submitted 8 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  26. arXiv:2402.00093  [pdf, other]

    cs.SE cs.LG

    ChIRAAG: ChatGPT Informed Rapid and Automated Assertion Generation

    Authors: Bhabesh Mali, Karthik Maddala, Vatsal Gupta, Sweeya Reddy, Chandan Karfa, Ramesh Karri

    Abstract: System Verilog Assertion (SVA) formulation, a critical yet complex task, is a prerequisite in the Assertion Based Verification (ABV) process. Traditionally, SVA formulation involves expert-driven interpretation of specifications, which is time-consuming and prone to human error. Recently, LLM-informed automatic assertion generation is gaining interest. We designed a novel framework called ChIRAAG…

    Submitted 28 June, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: 4 pages, 2 figures and 2 tables

  27. arXiv:2401.14521  [pdf]

    cs.LG cs.AI

    Towards Interpretable Physical-Conceptual Catchment-Scale Hydrological Modeling using the Mass-Conserving-Perceptron

    Authors: Yuan-Heng Wang, Hoshin V. Gupta

    Abstract: We investigate the applicability of machine learning technologies to the development of parsimonious, interpretable, catchment-scale hydrologic models using directed-graph architectures based on the mass-conserving perceptron (MCP) as the fundamental computational unit. Here, we focus on architectural complexity (depth) at a single location, rather than universal applicability (breadth) across lar…

    Submitted 22 May, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: 65 pages, 8 Figures, 4 Tables, 1 Supplementary Material

  28. arXiv:2401.01258  [pdf, other]

    math.OC cs.LG eess.SY

    Towards Model-Free LQR Control over Rate-Limited Channels

    Authors: Aritra Mitra, Lintao Ye, Vijay Gupta

    Abstract: Given the success of model-free methods for control design in many problem settings, it is natural to ask how things will change if realistic communication channels are utilized for the transmission of gradients or policies. While the resulting problem has analogies with the formulations studied under the rubric of networked control systems, the rich literature in that area has typically assumed t…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: 24 pages

  29. arXiv:2311.15194  [pdf, other]

    cs.LG cs.AI

    Understanding the Countably Infinite: Neural Network Models of the Successor Function and its Acquisition

    Authors: Vima Gupta, Sashank Varma

    Abstract: As children enter elementary school, their understanding of the ordinal structure of numbers transitions from a memorized count list of the first 50-100 numbers to knowing the successor function and understanding the countably infinite. We investigate this developmental change in two neural network models that learn the successor function on the pairs (N, N+1) for N in (0, 98). The first uses a on…

    Submitted 21 May, 2024; v1 submitted 26 November, 2023; originally announced November 2023.

    Comments: 6 pages, 11 figures

  30. arXiv:2311.14570  [pdf]

    cs.AI physics.med-ph

    RAISE -- Radiology AI Safety, an End-to-end lifecycle approach

    Authors: M. Jorge Cardoso, Julia Moosbauer, Tessa S. Cook, B. Selnur Erdal, Brad Genereaux, Vikash Gupta, Bennett A. Landman, Tiarna Lee, Parashkev Nachev, Elanchezhian Somasundaram, Ronald M. Summers, Khaled Younis, Sebastien Ourselin, Franz MJ Pfister

    Abstract: The integration of AI into radiology introduces opportunities for improved clinical care provision and efficiency but it demands a meticulous approach to mitigate potential risks as with any other new technology. Beginning with rigorous pre-deployment evaluation and validation, the focus should be on ensuring models meet the highest standards of safety, effectiveness and efficacy for their intende…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: 14 pages, 3 figures

  31. arXiv:2311.10840  [pdf]

    cs.AI

    Integration and Implementation Strategies for AI Algorithm Deployment with Smart Routing Rules and Workflow Management

    Authors: Barbaros Selnur Erdal, Vikash Gupta, Mutlu Demirer, Kim H. Fair, Richard D. White, Jeff Blair, Barbara Deichert, Laurie Lafleur, Ming Melvin Qin, David Bericat, Brad Genereaux

    Abstract: This paper reviews the challenges hindering the widespread adoption of artificial intelligence (AI) solutions in the healthcare industry, focusing on computer vision applications for medical imaging, and how interoperability and enterprise-grade scalability can be used to address these challenges. The complex nature of healthcare workflows, intricacies in managing large and secure medical imaging…

    Submitted 21 November, 2023; v1 submitted 17 November, 2023; originally announced November 2023.

    Comments: 13 pages, 6 figures

    ACM Class: I.2.m

  32. arXiv:2311.10085  [pdf, other]

    cs.LG cs.CL math.OC

    A Computationally Efficient Sparsified Online Newton Method

    Authors: Fnu Devvrit, Sai Surya Duvvuri, Rohan Anil, Vineet Gupta, Cho-Jui Hsieh, Inderjit Dhillon

    Abstract: Second-order methods hold significant promise for enhancing the convergence of deep neural network training; however, their large memory and computational demands have limited their practicality. Thus there is a need for scalable second-order methods that can efficiently train large models. In this paper, we introduce the Sparsified Online Newton (SONew) method, a memory-efficient second-order alg…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 30 pages. First two authors contributed equally. Accepted at NeurIPS 2023

  33. arXiv:2311.08662  [pdf, other]

    cs.CL cs.AI cs.IR

    Multi-Set Inoculation: Assessing Model Robustness Across Multiple Challenge Sets

    Authors: Vatsal Gupta, Pranshu Pandya, Tushar Kataria, Vivek Gupta, Dan Roth

    Abstract: Language models, given their black-box nature, often exhibit sensitivity to input perturbations, leading to trust issues due to hallucinations. To bolster trust, it's essential to understand these models' failure modes and devise strategies to enhance their performance. In this study, we propose a framework to study the effect of input perturbations on language models of different scales, from pre…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: 13 pages, 2 Figures, 12 Tables

  34. arXiv:2311.08002  [pdf, other]

    cs.CL cs.AI cs.IR

    TempTabQA: Temporal Question Answering for Semi-Structured Tables

    Authors: Vivek Gupta, Pranshu Kandoi, Mahek Bhavesh Vora, Shuo Zhang, Yujie He, Ridho Reinanda, Vivek Srikumar

    Abstract: Semi-structured data, such as Infobox tables, often include temporal information about entities, either implicitly or explicitly. Can current NLP systems reason about such information in semi-structured tables? To tackle this question, we introduce the task of temporal question answering on semi-structured tables. We present a dataset, TempTabQA, which comprises 11,454 question-answer pairs extrac…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023 (Main), 23 Figures, 32 Tables

  35. arXiv:2311.07453  [pdf, other]

    cs.CL cs.CV

    ChartCheck: Explainable Fact-Checking over Real-World Chart Images

    Authors: Mubashara Akhtar, Nikesh Subedi, Vivek Gupta, Sahar Tahmasebi, Oana Cocarascu, Elena Simperl

    Abstract: Whilst fact verification has attracted substantial interest in the natural language processing community, verifying misinforming statements against data visualizations such as charts has so far been overlooked. Charts are commonly used in the real-world to summarize and communicate key information, but they can also be easily misused to spread misinformation and promote certain agendas. In this pa…

    Submitted 16 February, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

  36. arXiv:2311.02216  [pdf, other]

    cs.CL cs.LG

    Exploring the Numerical Reasoning Capabilities of Language Models: A Comprehensive Analysis on Tabular Data

    Authors: Mubashara Akhtar, Abhilash Shankarampeta, Vivek Gupta, Arpit Patil, Oana Cocarascu, Elena Simperl

    Abstract: Numbers are crucial for various real-world domains such as finance, economics, and science. Thus, understanding and reasoning with numbers are essential skills for language models to solve different tasks. While different numerical benchmarks have been introduced in recent years, they are limited to specific numerical aspects mostly. In this paper, we propose a hierarchical taxonomy for numerical…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: Accepted at EMNLP 2023 (Findings)

  37. arXiv:2310.12318  [pdf, other]

    cs.CL cs.AI cs.CY cs.HC

    The Sentiment Problem: A Critical Survey towards Deconstructing Sentiment Analysis

    Authors: Pranav Narayanan Venkit, Mukund Srinath, Sanjana Gautam, Saranya Venkatraman, Vipul Gupta, Rebecca J. Passonneau, Shomir Wilson

    Abstract: We conduct an inquiry into the sociotechnical aspects of sentiment analysis (SA) by critically examining 189 peer-reviewed papers on their applications, models, and datasets. Our investigation stems from the recognition that SA has become an integral component of diverse sociotechnical systems, exerting influence on both social and technical users. By delving into sociological and technological li…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: This paper has been accepted and will appear at the EMNLP 2023 Main Conference

  38. arXiv:2310.12174  [pdf, other]

    physics.soc-ph cs.CE eess.SY

    A Traffic Control Framework for Uncrewed Aircraft Systems

    Authors: Ananay Vikram Gupta, Aaditya Prakash Kattekola, Ansh Vikram Gupta, Dacharla Venkata Abhiram, Kamesh Namuduri, Ravichandran Subramanian

    Abstract: The exponential growth of Advanced Air Mobility (AAM) services demands assurances of safety in the airspace. This research presents a Traffic Control Framework (TCF) for developing digital flight rules for Uncrewed Aircraft Systems (UAS) flying in designated air corridors. The proposed TCF helps model, deploy, and test UAS control agents, regardless of their hardware configurations. This paper investigates…

    Submitted 15 October, 2023; originally announced October 2023.

    Comments: 6 pages, 7 figures

  39. arXiv:2310.08644  [pdf]

    cs.LG cs.AI

    A Mass-Conserving-Perceptron for Machine Learning-Based Modeling of Geoscientific Systems

    Authors: Yuan-Heng Wang, Hoshin V. Gupta

    Abstract: Although decades of effort have been devoted to building Physical-Conceptual (PC) models for predicting the time-series evolution of geoscientific systems, recent work shows that Machine Learning (ML) based Gated Recurrent Neural Network technology can be used to develop models that are much more accurate. However, the difficulty of extracting physical understanding from ML-based models complicate…

    Submitted 12 May, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: 68 pages, 7 figures in the main text, 10 figures, and 10 tables in the supplementary materials

  40. arXiv:2309.13019  [pdf]

    cs.CE

    Differential Evolution Algorithm Based Hyperparameter Selection of Gated Recurrent Unit for Electrical Load Forecasting

    Authors: Anuvab Sen, Vedica Gupta, Chi Tang

    Abstract: Accurate load forecasting remains a formidable challenge in numerous sectors, given the intricate dynamics of dynamic power systems, which often defy conventional statistical models. As a response, time-series methodologies like ARIMA and sophisticated deep learning techniques such as Artificial Neural Networks (ANN) and Long Short-Term Memory (LSTM) networks have demonstrated their mettle by achi…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: 3 figures, 2 tables, Presented @ 3rd Annual BRIC Symposium, 2023 @McMaster University, Hamilton, Canada. arXiv admin note: substantial text overlap with arXiv:2307.15299

    Journal ref: Proceedings of 3rd Annual BRIC Symposium, 2023 McMaster University, Hamilton, Canada

  41. arXiv:2308.12539  [pdf, other]

    cs.CL cs.AI cs.LG

    CALM : A Multi-task Benchmark for Comprehensive Assessment of Language Model Bias

    Authors: Vipul Gupta, Pranav Narayanan Venkit, Hugo Laurençon, Shomir Wilson, Rebecca J. Passonneau

    Abstract: As language models (LMs) become increasingly powerful and widely used, it is important to quantify them for sociodemographic bias with potential for harm. Prior measures of bias are sensitive to perturbations in the templates designed to compare performance across social groups, due to factors such as low diversity or limited number of templates. Also, most previous work considers only one NLP tas…

    Submitted 23 January, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

  42. arXiv:2308.09138  [pdf, other]

    cs.CL cs.AI cs.CY

    Semantic Consistency for Assuring Reliability of Large Language Models

    Authors: Harsh Raj, Vipul Gupta, Domenic Rosati, Subhabrata Majumdar

    Abstract: Large Language Models (LLMs) exhibit remarkable fluency and competence across various natural language tasks. However, recent research has highlighted their sensitivity to variations in input prompts. To deploy LLMs in a safe and reliable manner, it is crucial for their outputs to be consistent when prompted with expressions that carry the same meaning or intent. While some existing work has explo…

    Submitted 17 August, 2023; originally announced August 2023.

  43. Turning hazardous volatile matter compounds into fuel by catalytic steam reforming: An evolutionary machine learning approach

    Authors: Alireza Shafizadeh, Hossein Shahbeik, Mohammad Hossein Nadian, Vijai Kumar Gupta, Abdul-Sattar Nizami, Su Shiung Lam, Wanxi Peng, Junting Pan, Meisam Tabatabaei, Mortaza Aghbashlo

    Abstract: Chemical and biomass processing systems release volatile matter compounds into the environment daily. Catalytic reforming can convert these compounds into valuable fuels, but developing stable and efficient catalysts is challenging. Machine learning can handle complex relationships in big data and optimize reaction conditions, making it an effective solution for addressing the mentioned issues. Th…

    Submitted 25 July, 2023; originally announced August 2023.

  44. arXiv:2307.10305  [pdf, other]

    cs.CV cs.LG

    Tapestry of Time and Actions: Modeling Human Activity Sequences using Temporal Point Process Flows

    Authors: Vinayak Gupta, Srikanta Bedathur

    Abstract: Human beings always engage in a vast range of activities and tasks that demonstrate their ability to adapt to different scenarios. Any human activity can be represented as a temporal sequence of actions performed to achieve a certain goal. Unlike the time series datasets extracted from electronics or machines, these action sequences are highly disparate in their nature -- the time to finish a sequ…

    Submitted 13 July, 2023; originally announced July 2023.

    Comments: Extended version of Gupta and Bedathur [arXiv:2206.05291] (SIGKDD 2022). Under review in a journal

  45. arXiv:2307.09613  [pdf, other]

    cs.LG cs.IR

    Retrieving Continuous Time Event Sequences using Neural Temporal Point Processes with Learnable Hashing

    Authors: Vinayak Gupta, Srikanta Bedathur, Abir De

    Abstract: Temporal sequences have become pervasive in various real-world applications. Consequently, the volume of data generated in the form of continuous time-event sequence(s) or CTES(s) has increased exponentially in the past few years. Thus, a significant fraction of the ongoing research on CTES datasets involves designing models to address downstream tasks such as next-event prediction, long-term fore…

    Submitted 13 July, 2023; originally announced July 2023.

    Comments: Extended version of Gupta et al. [arXiv:2202.11485] (AAAI 2022). Under review in a journal

  46. arXiv:2307.08152  [pdf]

    cs.CL

    The Potential and Pitfalls of using a Large Language Model such as ChatGPT or GPT-4 as a Clinical Assistant

    Authors: Jingqing Zhang, Kai Sun, Akshay Jagadeesh, Mahta Ghahfarokhi, Deepa Gupta, Ashok Gupta, Vibhor Gupta, Yike Guo

    Abstract: Recent studies have demonstrated promising performance of ChatGPT and GPT-4 on several medical domain tasks. However, none have assessed its performance using a large-scale real-world electronic health record database, nor have evaluated its utility in providing clinical diagnostic assistance for patients across a full range of disease presentation. We performed two analyses using ChatGPT and GPT-…

    Submitted 16 July, 2023; originally announced July 2023.

    Comments: This manuscript is pre-print and in peer review. Supplementary materials will be published later

  47. arXiv:2307.03313  [pdf, other]

    cs.CL cs.CY cs.IR

    InfoSync: Information Synchronization across Multilingual Semi-structured Tables

    Authors: Siddharth Khincha, Chelsi Jain, Vivek Gupta, Tushar Kataria, Shuo Zhang

    Abstract: Information Synchronization of semi-structured data across languages is challenging. For instance, Wikipedia tables in one language should be synchronized across languages. To address this problem, we introduce a new dataset InfoSync and a two-step method for tabular synchronization. InfoSync contains 100K entity-centric tables (Wikipedia Infoboxes) across 14 languages, of which a subset (3.5K pa…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: 22 pages, 7 figures, 20 tables, ACL 2023 (Toronto, Canada)

  48. arXiv:2306.10854  [pdf, other]

    cs.LG cs.HC

    Performance of data-driven inner speech decoding with same-task EEG-fMRI data fusion and bimodal models

    Authors: Holly Wilson, Scott Wellington, Foteini Simistira Liwicki, Vibha Gupta, Rajkumar Saini, Kanjar De, Nosheen Abid, Sumit Rakesh, Johan Eriksson, Oliver Watts, Xi Chen, Mohammad Golbabaee, Michael J. Proulx, Marcus Liwicki, Eamonn O'Neill, Benjamin Metcalfe

    Abstract: Decoding inner speech from the brain signal via hybridisation of fMRI and EEG data is explored to investigate the performance benefits over unimodal models. Two different bimodal fusion approaches are examined: concatenation of probability vectors output from unimodal fMRI and EEG machine learning models, and data fusion with feature engineering. Same task inner speech data are recorded from four…

    Submitted 19 June, 2023; originally announced June 2023.

  49. arXiv:2306.08158  [pdf, other]

    cs.CL cs.AI cs.LG

    Sociodemographic Bias in Language Models: A Survey and Forward Path

    Authors: Vipul Gupta, Pranav Narayanan Venkit, Shomir Wilson, Rebecca J. Passonneau

    Abstract: This paper presents a comprehensive survey of work on sociodemographic bias in language models (LMs). Sociodemographic biases embedded within language models can have harmful effects when deployed in real-world settings. We systematically organize the existing literature into three main areas: types of bias, quantifying bias, and debiasing techniques. We also track the evolution of investigations…

    Submitted 1 March, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: 23 pages, 3 figures

  50. arXiv:2306.06543  [pdf, other]

    cs.RO cs.AI cs.LG

    MANER: Multi-Agent Neural Rearrangement Planning of Objects in Cluttered Environments

    Authors: Vivek Gupta, Praphpreet Dhir, Jeegn Dani, Ahmed H. Qureshi

    Abstract: Object rearrangement is a fundamental problem in robotics with various practical applications ranging from managing warehouses to cleaning and organizing home kitchens. While existing research has primarily focused on single-agent solutions, real-world scenarios often require multiple robots to work together on rearrangement tasks. This paper proposes a comprehensive learning-based framework for m…

    Submitted 4 November, 2023; v1 submitted 10 June, 2023; originally announced June 2023.

    Comments: The videos and supplementary material are available at https://sites.google.com/view/maner-supplementary

    Journal ref: Published in IEEE Robotics and Automation Letters, vol. 8, no. 12, pp. 8295-8302, Dec. 2023