Showing 1–50 of 639 results for author: Sharma, A

Searching in archive cs.
  1. arXiv:2407.01222  [pdf, other]

    cs.RO

    Deep Learning Models for Flapping Fin Unmanned Underwater Vehicle Control System Gait Optimization

    Authors: Brian Zhou, Kamal Viswanath, Jason Geder, Alisha Sharma, Julian Lee

    Abstract: The last few decades have led to the rise of research focused on propulsion and control systems for bio-inspired unmanned underwater vehicles (UUVs), which provide more maneuverable alternatives to traditional UUVs in underwater missions. Recent work has explored the use of time-series neural network surrogate models to predict thrust and power from vehicle design and fin kinematics. We develop a…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 28 pages, 20 figures. arXiv admin note: text overlap with arXiv:2310.14135

  2. arXiv:2407.00317  [pdf, other]

    cs.IR stat.AP

    Towards Statistically Significant Taxonomy Aware Co-location Pattern Detection

    Authors: Subhankar Ghosh, Arun Sharma, Jayant Gupta, Shashi Shekhar

    Abstract: Given a collection of Boolean spatial feature types, their instances, a neighborhood relation (e.g., proximity), and a hierarchical taxonomy of the feature types, the goal is to find the subsets of feature types or their parents whose spatial interaction is statistically significant. This problem is for taxonomy-reliant applications such as ecology (e.g., finding new symbiotic relationships across…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: Accepted in The 16th Conference on Spatial Information Theory (COSIT) 2024

    ACM Class: E.m; H.3.3; I.5; J.4

  3. arXiv:2407.00246  [pdf, other]

    cs.CR cs.SE

    SBOM.EXE: Countering Dynamic Code Injection based on Software Bill of Materials in Java

    Authors: Aman Sharma, Martin Wittlinger, Benoit Baudry, Martin Monperrus

    Abstract: Software supply chain attacks have become a significant threat as software development increasingly relies on contributions from multiple, often unverified sources. The code from unverified sources does not pose a threat until it is executed. Log4Shell is a recent example of a supply chain attack that processed a malicious input at runtime, leading to remote code execution. It exploited the dynami…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: 17 pages, 3 figures, 5 tables, 8 listings

  4. arXiv:2406.19881  [pdf, other]

    cs.CR cs.LG

    Attention Meets UAVs: A Comprehensive Evaluation of DDoS Detection in Low-Cost UAVs

    Authors: Ashish Sharma, SVSLN Surya Suhas Vaddhiparthy, Sai Usha Goparaju, Deepak Gangadharan, Harikumar Kandath

    Abstract: This paper explores the critical issue of enhancing cybersecurity measures for low-cost, Wi-Fi-based Unmanned Aerial Vehicles (UAVs) against Distributed Denial of Service (DDoS) attacks. In the current work, we have explored three variants of DDoS attacks, namely Transmission Control Protocol (TCP), Internet Control Message Protocol (ICMP), and TCP + ICMP flooding attacks, and developed a detectio…

    Submitted 28 June, 2024; originally announced June 2024.

  5. arXiv:2406.19092  [pdf, ps, other]

    cs.LG

    Adaptive Stochastic Weight Averaging

    Authors: Caglar Demir, Arnab Sharma, Axel-Cyrille Ngonga Ngomo

    Abstract: Ensemble models often improve generalization performances in challenging tasks. Yet, traditional techniques based on prediction averaging incur three well-known disadvantages: the computational overhead of training multiple models, increased latency, and memory requirements at test time. To address these issues, the Stochastic Weight Averaging (SWA) technique maintains a running average of model p…

    Submitted 27 June, 2024; originally announced June 2024.

  6. arXiv:2406.18602  [pdf]

    stat.AP cs.LG stat.CO

    Multi-level Phenotypic Models of Cardiovascular Disease and Obstructive Sleep Apnea Comorbidities: A Longitudinal Wisconsin Sleep Cohort Study

    Authors: Duy Nguyen, Ca Hoang, Phat K. Huynh, Tien Truong, Dang Nguyen, Abhay Sharma, Trung Q. Le

    Abstract: Cardiovascular diseases (CVDs) are notably prevalent among patients with obstructive sleep apnea (OSA), posing unique challenges in predicting CVD progression due to the intricate interactions of comorbidities. Traditional models typically lack the necessary dynamic and longitudinal scope to accurately forecast CVD trajectories in OSA patients. This study introduces a novel multi-level phenotypic…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 30 pages, 5 figures, 5 tables

  7. arXiv:2406.16851  [pdf, other]

    cs.CL cs.AI cs.CV

    Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts

    Authors: Aditya Sharma, Michael Saxon, William Yang Wang

    Abstract: We present LoCoVQA, a dynamic benchmark generator for evaluating long-context extractive reasoning in vision language models (VLMs). LoCoVQA augments test examples for mathematical reasoning, VQA, and character recognition tasks with increasingly long visual contexts composed of both in-distribution and out-of-distribution distractor images. Across these tasks, a diverse set of VLMs rapidly lose…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Under review

  8. arXiv:2406.15809  [pdf, other]

    cs.CL cs.LG

    LaMSUM: A Novel Framework for Extractive Summarization of User Generated Content using LLMs

    Authors: Garima Chhikara, Anurag Sharma, V. Gurucharan, Kripabandhu Ghosh, Abhijnan Chakraborty

    Abstract: Large Language Models (LLMs) have demonstrated impressive performance across a wide range of NLP tasks, including summarization. Inherently LLMs produce abstractive summaries, and the task of achieving extractive summaries through LLMs still remains largely unexplored. To bridge this gap, in this work, we propose a novel framework LaMSUM to generate extractive summaries through LLMs for large user…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Under review

  9. arXiv:2406.14169  [pdf, other]

    cs.IR cs.LG

    Optimizing Novelty of Top-k Recommendations using Large Language Models and Reinforcement Learning

    Authors: Amit Sharma, Hua Li, Xue Li, Jian Jiao

    Abstract: Given an input query, a recommendation model is trained using user feedback data (e.g., click data) to output a ranked list of items. In real-world systems, besides accuracy, an important consideration for a new model is novelty of its top-k recommendations w.r.t. an existing deployed model. However, novelty of top-k items is a difficult goal to optimize a model for, since it involves a non-differ…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Accepted at KDD 2024

  10. arXiv:2406.13831  [pdf, other]

    cs.DB

    A Comprehensive Overview of GPU Accelerated Databases

    Authors: Harshit Sharma, Anmol Sharma

    Abstract: Over the past decade, the landscape of data analytics has seen a notable shift towards heterogeneous architectures, particularly the integration of GPUs to enhance overall performance. In the realm of in-memory analytics, which often grapples with memory bandwidth constraints, the adoption of GPUs has proven advantageous, thanks to their superior bandwidth capabilities. The parallel processing pro…

    Submitted 19 June, 2024; originally announced June 2024.

  11. arXiv:2406.10504  [pdf, other]

    cs.AI cs.CL cs.LG

    Task Facet Learning: A Structured Approach to Prompt Optimization

    Authors: Gurusha Juneja, Nagarajan Natarajan, Hua Li, Jian Jiao, Amit Sharma

    Abstract: Given a task in the form of a basic description and its training examples, prompt optimization is the problem of synthesizing the given information into a text prompt for a large language model (LLM). Humans solve this problem by also considering the different facets that define a task (e.g., counter-examples, explanations, analogies) and including them in the prompt. However, it is unclear whethe…

    Submitted 15 June, 2024; originally announced June 2024.

  12. arXiv:2406.07893  [pdf]

    quant-ph cs.NE

    Parameter Estimation in Quantum Metrology Technique for Time Series Prediction

    Authors: Vaidik A Sharma, N. Madurai Meenachi, B. Venkatraman

    Abstract: The paper investigates the techniques of quantum computation in metrological predictions, with a particular emphasis on enhancing prediction potential through variational parameter estimation. The applicability of quantum simulations and quantum metrology techniques for modelling complex physical systems and achieving high-resolution measurements is proposed. The impacts of various parameter dist…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: conference. arXiv admin note: substantial text overlap with arXiv:2406.05767

  13. arXiv:2406.07153  [pdf, other]

    cs.HC

    EEG classification for visual brain decoding with spatio-temporal and transformer based paradigms

    Authors: Akanksha Sharma, Jyoti Nigam, Abhishek Rathore, Arnav Bhavsar

    Abstract: In this work, we delve into the EEG classification task in the domain of visual brain decoding via two frameworks, involving two different learning paradigms. Considering the spatio-temporal nature of EEG data, one of our frameworks is based on a CNN-BiLSTM model. The other involves a CNN-Transformer architecture which inherently involves the more versatile attention based learning paradigm. In bo…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: The paper has been submitted at ICPR 2024. It contains 15 pages with 7 images

  14. arXiv:2406.04805  [pdf, other]

    cs.CR cs.LG

    GENIE: Watermarking Graph Neural Networks for Link Prediction

    Authors: Venkata Sai Pranav Bachina, Ankit Gangwal, Aaryan Ajay Sharma, Charu Sharma

    Abstract: Graph Neural Networks (GNNs) have advanced the field of machine learning by utilizing graph-structured data, which is ubiquitous in the real world. GNNs have applications in various fields, ranging from social network analysis to drug discovery. GNN training is strenuous, requiring significant computational resources and human expertise. It makes a trained GNN an indispensable Intellectual Propert…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 20 pages, 12 figures

  15. arXiv:2406.04136  [pdf, other]

    cs.CL cs.AI cs.LG

    Legal Judgment Reimagined: PredEx and the Rise of Intelligent AI Interpretation in Indian Courts

    Authors: Shubham Kumar Nigam, Anurag Sharma, Danush Khanna, Noel Shallum, Kripabandhu Ghosh, Arnab Bhattacharya

    Abstract: In the era of Large Language Models (LLMs), predicting judicial outcomes poses significant challenges due to the complexity of legal proceedings and the scarcity of expert-annotated datasets. Addressing this, we introduce Prediction with Explanation (PredEx), the largest expert-annotated dataset for legal judgment prediction and explanation in the Indian context, featuri…

    Submitted 6 June, 2024; originally announced June 2024.

  16. arXiv:2406.01947  [pdf, other]

    cs.RO cs.LG

    Data-Driven Approaches for Thrust Prediction in Underwater Flapping Fin Propulsion Systems

    Authors: Julian Lee, Kamal Viswanath, Alisha Sharma, Jason Geder, Ravi Ramamurti, Marius D. Pruessner

    Abstract: Flapping-fin underwater vehicle propulsion systems provide an alternative to propeller-driven systems in situations that involve a constrained environment or require high maneuverability. Testing new configurations through experiments or high-fidelity simulations is an expensive process, slowing development of new systems. This is especially true when introducing new fin geometries. In thi…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 9 pages, 11 figures, AAAI 2021 Fall Series Symposium on Science-Guided AI

  17. arXiv:2405.20247  [pdf, other]

    cs.AI cs.CV cs.LG cs.SE

    KerasCV and KerasNLP: Vision and Language Power-Ups

    Authors: Matthew Watson, Divyashree Shivakumar Sreepathihalli, Francois Chollet, Martin Gorner, Kiranbir Sodhia, Ramesh Sampath, Tirth Patel, Haifeng Jin, Neel Kovelamudi, Gabriel Rasskin, Samaneh Saadat, Luke Wood, Chen Qian, Jonathan Bischof, Ian Stenbit, Abheesht Sharma, Anshuman Mishra

    Abstract: We present the Keras domain packages KerasCV and KerasNLP, extensions of the Keras API for Computer Vision and Natural Language Processing workflows, capable of running on either JAX, TensorFlow, or PyTorch. These domain packages are designed to enable fast experimentation, with a focus on ease-of-use and performance. We adopt a modular, layered design: at the library's lowest level of abstraction…

    Submitted 5 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: Submitted to Journal of Machine Learning Open Source Software

    ACM Class: I.2.5; I.2.7; I.2.10

  18. arXiv:2405.20110  [pdf]

    cs.RO cond-mat.mtrl-sci physics.ins-det

    Autonomous programmable microscopic electronic lablets optimized with digital control

    Authors: Thomas Maeke, John McCaskill, Dominic Funke, Pierre Mayr, Abhishek Sharma, Uwe Tangen, Jürgen Oehm

    Abstract: Lablets are autonomous microscopic particles with programmable CMOS electronics that can control electrokinetic phenomena and electrochemical reactions in solution via actuator and sensor microelectrodes. In this paper, we describe the design and fabrication of optimized singulated lablets (CMOS3) with dimensions 140x140x50 micrometers carrying an integrated coplanar encapsulated supercapacitor as…

    Submitted 16 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: This article was originally submitted (2016) for review as one of a number of preprints as supporting information for the final review of the EU MICREAgents Project # 318671 (2012-2016). Here it is presented in slightly revised form. Version v2 contains a reference to the Verilog source code on GitHub

    ACM Class: I.2.9; B.7.0; J.2; J.3; J.7; H.0

  19. arXiv:2405.19328  [pdf, other]

    cs.MA

    Normative Modules: A Generative Agent Architecture for Learning Norms that Supports Multi-Agent Cooperation

    Authors: Atrisha Sarkar, Andrei Ioan Muresanu, Carter Blair, Aaryam Sharma, Rakshit S Trivedi, Gillian K Hadfield

    Abstract: Generative agents, which implement behaviors using a large language model (LLM) to interpret and evaluate an environment, have demonstrated the capacity to solve complex tasks across many social and technological domains. However, when these agents interact with other agents and humans in the presence of social structures such as existing norms, fostering cooperation between them is a fundamental chall…

    Submitted 29 May, 2024; originally announced May 2024.

  20. arXiv:2405.15608  [pdf]

    cs.RO cond-mat.mtrl-sci

    Design and fabrication of autonomous electronic lablets for chemical control

    Authors: John S. McCaskill, Thomas Maeke, Dominic Funke, Pierre Mayr, Abhishek Sharma, Patrick F. Wagler, Jürgen Oehm

    Abstract: Lablets are autonomous microscopic particles with programmable CMOS electronics that can control electrokinetic phenomena and electrochemical reactions in solution via actuator and sensor microelectrodes. The lablets are designed to be rechargeable using an integrated supercapacitor, and to allow docking to one another or to a smart surface for interchange of energy, electronic information and che…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: This article was originally submitted (2016) for review as supporting information for the final review of the EU MICREAgents Project # 318671 (2012-2016). Here it is presented in slightly revised form

    ACM Class: I.2.9; B.7.0; J.2; J.3; J.7; H.0

  21. arXiv:2405.14876  [pdf, other]

    cs.CV cs.AI

    Precise and Robust Sidewalk Detection: Leveraging Ensemble Learning to Surpass LLM Limitations in Urban Environments

    Authors: Ibne Farabi Shihab, Benjir Islam Alvee, Sudesh Ramesh Bhagat, Anuj Sharma

    Abstract: This study aims to compare the effectiveness of a robust ensemble model with the state-of-the-art ONE-PEACE Large Language Model (LLM) for accurate detection of sidewalks. Accurate sidewalk detection is crucial in improving road safety and urban planning. The study evaluated the model's performance on Cityscapes, Ade20k, and the Boston Dataset. The results showed that the ensemble model performed…

    Submitted 1 April, 2024; originally announced May 2024.

  22. arXiv:2405.11139  [pdf, other]

    cs.RO cs.AI cs.LG

    RuleFuser: Injecting Rules in Evidential Networks for Robust Out-of-Distribution Trajectory Prediction

    Authors: Jay Patrikar, Sushant Veer, Apoorva Sharma, Marco Pavone, Sebastian Scherer

    Abstract: Modern neural trajectory predictors in autonomous driving are developed using imitation learning (IL) from driving logs. Although IL benefits from its ability to glean nuanced and multi-modal human driving behaviors from large datasets, the resulting predictors often struggle with out-of-distribution (OOD) scenarios and with traffic rule compliance. On the other hand, classical rule-based predicto…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 9 pages, 3 figures

  23. arXiv:2405.09247  [pdf, other]

    cs.CV cs.LG

    Graph Neural Network based Handwritten Trajectories Recognition

    Authors: Anuj Sharma, Sukhdeep Singh, S Ratna

    Abstract: Graph neural networks have proved to be an efficient machine learning technique in real-life applications. Handwriting recognition is one such useful area, where both offline and online handwriting recognition are required. The chain code as a feature extraction technique has shown significant results in the literature, and we have been able to use chain codes with graph neu…

    Submitted 15 May, 2024; originally announced May 2024.

  24. arXiv:2405.08717  [pdf, other]

    cs.CV cs.AI

    How Much You Ate? Food Portion Estimation on Spoons

    Authors: Aaryam Sharma, Chris Czarnecki, Yuhao Chen, Pengcheng Xi, Linlin Xu, Alexander Wong

    Abstract: Monitoring dietary intake is a crucial aspect of promoting healthy living. In recent years, advances in computer vision technology have facilitated dietary intake monitoring through the use of images and depth cameras. However, the current state-of-the-art image-based food portion estimation algorithms assume that users take images of their meals one or two times, which can be inconvenient and fai…

    Submitted 11 May, 2024; originally announced May 2024.

  25. arXiv:2405.07242  [pdf, other]

    quant-ph cs.IT

    Fault-Tolerant Quantum LDPC Encoders

    Authors: Abhi Kumar Sharma, Shayan Srinivasa Garani

    Abstract: We propose fault-tolerant encoders for quantum low-density parity check (LDPC) codes. By grouping qubits within a quantum code over contiguous blocks and applying preshared entanglement across these blocks, we show how transversal implementation can be realized. The proposed encoder reduces the error propagation while using multi-qubit gates and is applicable for both entanglement-unassisted and e…

    Submitted 12 May, 2024; originally announced May 2024.

  26. arXiv:2404.15350  [pdf, other]

    eess.SP cs.HC cs.LG

    Evaluating Fast Adaptability of Neural Networks for Brain-Computer Interface

    Authors: Anupam Sharma, Krishna Miyapuram

    Abstract: Electroencephalography (EEG) classification is a versatile and portable technique for building non-invasive Brain-computer Interfaces (BCI). However, the classifiers that decode cognitive states from EEG brain data perform poorly when tested on newer domains, such as tasks or individuals absent during model training. Researchers have recently used complex strategies like Model-agnostic meta-learni…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: Accepted in IJCNN 2024

  27. arXiv:2404.14367  [pdf, other]

    cs.LG

    Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

    Authors: Fahim Tajwar, Anikait Singh, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar

    Abstract: Learning from preference labels plays a crucial role in fine-tuning large language models. There are several distinct approaches for preference fine-tuning, including supervised learning, on-policy reinforcement learning (RL), and contrastive learning. Different methods come with different implementation tradeoffs and performance differences, and existing empirical findings present different concl…

    Submitted 2 June, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: International Conference on Machine Learning (ICML), 2024

  28. arXiv:2404.14062  [pdf, other]

    cs.CV cs.LG

    GatedLexiconNet: A Comprehensive End-to-End Handwritten Paragraph Text Recognition System

    Authors: Lalita Kumari, Sukhdeep Singh, Vaibhav Varish Singh Rathore, Anuj Sharma

    Abstract: The Handwritten Text Recognition problem has been a challenge for researchers for the last few decades, especially in the domain of computer vision, a subdomain of pattern recognition. Variability of texts amongst writers, cursiveness, and different font styles of handwritten texts with degradation of historical text images make it a challenging problem. Recognizing scanned document images in neur…

    Submitted 22 April, 2024; originally announced April 2024.

  29. arXiv:2404.12258  [pdf, ps, other]

    cs.CV

    DeepLocalization: Using change point detection for Temporal Action Localization

    Authors: Mohammed Shaiqur Rahman, Ibne Farabi Shihab, Lynna Chu, Anuj Sharma

    Abstract: In this study, we introduce DeepLocalization, an innovative framework devised for the real-time localization of actions tailored explicitly for monitoring driver behavior. Utilizing the power of advanced deep learning methodologies, our objective is to tackle the critical issue of distracted driving, a significant factor contributing to road accidents. Our strategy employs a dual approach: leveragi…

    Submitted 18 April, 2024; originally announced April 2024.

  30. arXiv:2404.09432  [pdf, other]

    cs.CV cs.AI cs.LG

    The 8th AI City Challenge

    Authors: Shuo Wang, David C. Anastasiu, Zheng Tang, Ming-Ching Chang, Yue Yao, Liang Zheng, Mohammed Shaiqur Rahman, Meenakshi S. Arya, Anuj Sharma, Pranamesh Chakraborty, Sanjita Prajapati, Quan Kong, Norimasa Kobori, Munkhjargal Gochoo, Munkh-Erdene Otgonbold, Fady Alnajjar, Ganzorig Batnasan, Ping-Yang Chen, Jun-Wei Hsieh, Xunlei Wu, Sameer Satish Pusegaonkar, Yizhou Wang, Sujit Biswas, Rama Chellappa

    Abstract: The eighth AI City Challenge highlighted the convergence of computer vision and artificial intelligence in areas like retail, warehouse settings, and Intelligent Traffic Systems (ITS), presenting significant research opportunities. The 2024 edition featured five tracks, attracting unprecedented interest from 726 teams in 47 countries and regions. Track 1 dealt with multi-target multi-camera (MTMC)…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: Summary of the 8th AI City Challenge Workshop in conjunction with CVPR 2024

  31. arXiv:2404.08850  [pdf]

    cs.AI cs.CE cs.LG

    Assessing Economic Viability: A Comparative Analysis of Total Cost of Ownership for Domain-Adapted Large Language Models versus State-of-the-art Counterparts in Chip Design Coding Assistance

    Authors: Amit Sharma, Teodor-Dumitru Ene, Kishor Kunal, Mingjie Liu, Zafar Hasan, Haoxing Ren

    Abstract: This paper presents a comparative analysis of total cost of ownership (TCO) and performance between domain-adapted large language models (LLM) and state-of-the-art (SoTA) LLMs, with a particular emphasis on tasks related to coding assistance for chip design. We examine the TCO and performance metrics of a domain-adaptive LLM, ChipNeMo, against two leading LLMs, Claude 3 Opus and ChatGPT-4 Turbo,…

    Submitted 28 May, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: Paper accepted in IEEE-ACM conference: 2024 IEEE LLM-Aided Design Workshop (LAD)

  32. arXiv:2404.08011  [pdf, other]

    cs.CV cs.LG

    An inclusive review on deep learning techniques and their scope in handwriting recognition

    Authors: Sukhdeep Singh, Sudhir Rohilla, Anuj Sharma

    Abstract: Deep learning denotes a category of machine learning algorithms that can combine raw inputs into intermediate feature layers. These deep learning algorithms have demonstrated great results in different fields. Deep learning has notably achieved human-level performance across a number of domains in computer vision and pattern recognition. For t…

    Submitted 10 April, 2024; originally announced April 2024.

  33. arXiv:2404.05981  [pdf, other]

    cs.LG cs.CV

    A Lightweight Measure of Classification Difficulty from Application Dataset Characteristics

    Authors: Bryan Bo Cao, Abhinav Sharma, Lawrence O'Gorman, Michael Coss, Shubham Jain

    Abstract: Despite accuracy and computation benchmarks being widely available to help choose among neural network models, these are usually trained on datasets with many classes, and do not give a precise idea of performance for applications of few (< 10) classes. The conventional procedure to predict performance is to train and test repeatedly on the different models and dataset variations of interest. Howe…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 13 pages, 3 figures

    MSC Class: 65D19

  34. arXiv:2404.04251  [pdf, other]

    cs.CV cs.AI cs.CL

    Who Evaluates the Evaluations? Objectively Scoring Text-to-Image Prompt Coherence Metrics with T2IScoreScore (TS2)

    Authors: Michael Saxon, Fatima Jahara, Mahsa Khoshnoodi, Yujie Lu, Aditya Sharma, William Yang Wang

    Abstract: With advances in the quality of text-to-image (T2I) models has come interest in benchmarking their prompt faithfulness: the semantic coherence of generated images to the prompts they were conditioned on. A variety of T2I faithfulness metrics have been proposed, leveraging advances in cross-modal embeddings and vision-language models (VLMs). However, these metrics are not rigorously compared and ben…

    Submitted 22 May, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

    Comments: 10 pages main, 12 pages appendices, 13 figures, 3 tables

  35. arXiv:2404.03683  [pdf, other]

    cs.LG cs.AI cs.CL

    Stream of Search (SoS): Learning to Search in Language

    Authors: Kanishk Gandhi, Denise Lee, Gabriel Grand, Muxin Liu, Winson Cheng, Archit Sharma, Noah D. Goodman

    Abstract: Language models are rarely shown fruitful mistakes while training. They then struggle to look beyond the next token, suffering from a snowballing of errors and struggling to predict the consequence of their actions several steps ahead. In this paper, we show how language models can be taught to search by representing the process of search in language, as a flattened string -- a stream of search (S…

    Submitted 1 April, 2024; originally announced April 2024.

  36. arXiv:2404.03646  [pdf, other]

    cs.CL

    Locating and Editing Factual Associations in Mamba

    Authors: Arnab Sen Sharma, David Atkinson, David Bau

    Abstract: We investigate the mechanisms of factual recall in the Mamba state space model. Our work is inspired by previous findings in autoregressive transformer language models suggesting that their knowledge recall is localized to particular modules at specific token locations; we therefore ask whether factual recall in Mamba can be similarly localized. To investigate this, we conduct four lines of experi…

    Submitted 4 April, 2024; originally announced April 2024.

  37. arXiv:2404.03307  [pdf, other]

    cs.RO eess.SY

    Bi-level Trajectory Optimization on Uneven Terrains with Differentiable Wheel-Terrain Interaction Model

    Authors: Amith Manoharan, Aditya Sharma, Himani Belsare, Kaustab Pal, K. Madhava Krishna, Arun Kumar Singh

    Abstract: Navigation of wheeled vehicles on uneven terrain necessitates going beyond the 2D approaches for trajectory planning. Specifically, it is essential to incorporate the full 6dof variation of vehicle pose and its associated stability cost in the planning process. To this end, most recent works aim to learn a neural network model to predict the vehicle evolution. However, such approaches are data-int…

    Submitted 11 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: 8 pages, 7 figures, submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

  38. arXiv:2404.01704  [pdf, other]

    cs.NI

    Application of S-band for Protection in Multi-band Flexible-Grid Optical Networks

    Authors: Varsha Lohani, Anjali Sharma, Yatindra Nath Singh

    Abstract: The core network is experiencing bandwidth capacity constraints as internet traffic grows. As a result, the notion of a Multi-band flexible-grid optical network was established to increase the lifespan of an optical core network. In this paper, we use the C+L band for working traffic transmission and the S-band for protection against failure. Furthermore, we compare the proposed method with the ex…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: First Draft

  39. arXiv:2404.01585  [pdf, other]

    cs.DB cs.PF

    FLEXIS: FLEXible Frequent Subgraph Mining using Maximal Independent Sets

    Authors: Akshit Sharma, Sam Reinher, Dinesh Mehta, Bo Wu

    Abstract: Frequent Subgraph Mining (FSM) is the process of identifying common subgraph patterns that surpass a predefined frequency threshold. While FSM is widely applicable in fields like bioinformatics, chemical analysis, and social network anomaly detection, its execution remains time-consuming and complex. This complexity stems from the need to recognize high-frequency subgraphs and ascertain if they ex…

    Submitted 1 April, 2024; originally announced April 2024.

  40. arXiv:2404.00172  [pdf, other]

    cs.CV cs.AI cs.LG

    Universal Bovine Identification via Depth Data and Deep Metric Learning

    Authors: Asheesh Sharma, Lucy Randewich, William Andrew, Sion Hannuna, Neill Campbell, Siobhan Mullan, Andrew W. Dowsey, Melvyn Smith, Mark Hansen, Tilo Burghardt

    Abstract: This paper proposes and evaluates, for the first time, a top-down (dorsal view), depth-only deep learning system for accurately identifying individual cattle and provides associated code, datasets, and training weights for immediate reproducibility. An increase in herd size skews the cow-to-human ratio at the farm and makes the manual monitoring of individuals more challenging. Therefore, real-tim…

    Submitted 29 March, 2024; originally announced April 2024.

    Comments: LaTeX, 38 pages, 14 figures, 3 tables

  41. arXiv:2403.18144  [pdf, other]

    cs.CR cs.CV

    Leak and Learn: An Attacker's Cookbook to Train Using Leaked Data from Federated Learning

    Authors: Joshua C. Zhao, Ahaan Dabholkar, Atul Sharma, Saurabh Bagchi

    Abstract: Federated learning is a decentralized learning paradigm introduced to preserve privacy of client data. Despite this, prior work has shown that an attacker at the server can still reconstruct the private training data using only the client updates. These attacks are known as data reconstruction attacks and fall into two major categories: gradient inversion (GI) and linear layer leakage attacks (LLL…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  42. arXiv:2403.17541  [pdf, other]

    cs.CV cs.GR

    WordRobe: Text-Guided Generation of Textured 3D Garments

    Authors: Astitva Srivastava, Pranav Manu, Amit Raj, Varun Jampani, Avinash Sharma

    Abstract: In this paper, we tackle a new and challenging problem of text-driven generation of 3D garments with high-quality textures. We propose "WordRobe", a novel framework for the generation of unposed & textured 3D garment meshes from user-friendly text prompts. We achieve this by first learning a latent representation of 3D garments using a novel coarse-to-fine training strategy and a loss for latent d…

    Submitted 26 March, 2024; originally announced March 2024.

  43. arXiv:2403.15238  [pdf]

    eess.IV cs.CV stat.ME

    WEEP: A method for spatial interpretation of weakly supervised CNN models in computational pathology

    Authors: Abhinav Sharma, Bojing Liu, Mattias Rantalainen

    Abstract: Deep learning enables the modelling of high-resolution histopathology whole-slide images (WSI). Weakly supervised learning of tile-level data is typically applied for tasks where labels only exist on the patient or WSI level (e.g. patient outcomes or histological grading). In this context, there is a need for improved spatial interpretability of predictions from such models. We propose a novel met…

    Submitted 8 April, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  44. arXiv:2403.15077  [pdf, other]

    cs.LG

    GTAGCN: Generalized Topology Adaptive Graph Convolutional Networks

    Authors: Sukhdeep Singh, Anuj Sharma, Vinod Kumar Chauhan

    Abstract: Graph Neural Networks (GNN) have emerged as a popular and standard approach for learning from graph-structured data. The literature on GNN highlights the potential of this evolving research area and its widespread adoption in real-life applications. However, most of the approaches are either new in concept or derived from specific techniques. Therefore, the potential of more than one approach in h…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 2 figures, 3 tables and 26 pages

  45. arXiv:2403.12945  [pdf, other]

    cs.RO

    DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

    Authors: Alexander Khazatsky, Karl Pertsch, Suraj Nair, Ashwin Balakrishna, Sudeep Dasari, Siddharth Karamcheti, Soroush Nasiriany, Mohan Kumar Srirama, Lawrence Yunliang Chen, Kirsty Ellis, Peter David Fagan, Joey Hejna, Masha Itkina, Marion Lepert, Yecheng Jason Ma, Patrick Tree Miller, Jimmy Wu, Suneel Belkhale, Shivin Dass, Huy Ha, Arhan Jain, Abraham Lee, Youngwoon Lee, Marius Memmel, Sungjae Park , et al. (74 additional authors not shown)

    Abstract: The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a resu…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Project website: https://droid-dataset.github.io/

  46. arXiv:2403.12910  [pdf, other]

    cs.RO cs.AI cs.LG

    Yell At Your Robot: Improving On-the-Fly from Language Corrections

    Authors: Lucy Xiaoyang Shi, Zheyuan Hu, Tony Z. Zhao, Archit Sharma, Karl Pertsch, Jianlan Luo, Sergey Levine, Chelsea Finn

    Abstract: Hierarchical policies that combine language and low-level control have been shown to perform impressively long-horizon robotic tasks, by leveraging either zero-shot high-level planners like pretrained language and vision-language models (LLMs/VLMs) or models trained on annotated robotic demonstrations. However, for complex and dexterous skills, attaining high success rates on long-horizon tasks st…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Project website: https://yay-robot.github.io/

  47. arXiv:2403.12596  [pdf, other]

    cs.CL

    Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs

    Authors: Victor Carbune, Hassan Mansoor, Fangyu Liu, Rahul Aralikatte, Gilles Baechler, Jindong Chen, Abhanshu Sharma

    Abstract: Vision-language models (VLMs) are achieving increasingly strong performance on multimodal tasks. However, reasoning capabilities remain limited particularly for smaller VLMs, while those of large-language models (LLMs) have seen numerous improvements. We propose a technique to transfer capabilities from LLMs to VLMs. On the recently introduced ChartQA, our method obtains state-of-the-art performan…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Findings of NAACL 2024

  48. arXiv:2403.11169  [pdf, other]

    cs.CL cs.AI

    Correcting misinformation on social media with a large language model

    Authors: Xinyi Zhou, Ashish Sharma, Amy X. Zhang, Tim Althoff

    Abstract: Real-world misinformation can be partially correct and even factual but misleading. It undermines public trust in science and democracy, particularly on social media, where it can spread rapidly. High-quality and timely correction of misinformation that identifies and explains its (in)accuracies has been shown to effectively reduce false beliefs. Despite the wide acceptance of manual correction, i…

    Submitted 30 April, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: 53 pages

  49. arXiv:2403.10912  [pdf]

    cs.CV cs.LG

    Automatic location detection based on deep learning

    Authors: Anjali Karangiya, Anirudh Sharma, Divax Shah, Kartavya Badgujar, Dr. Chintan Thacker, Dainik Dave

    Abstract: The proliferation of digital images and the advancements in deep learning have paved the way for innovative solutions in various domains, especially in the field of image classification. Our project presents an in-depth study and implementation of an image classification system specifically tailored to identify and classify images of Indian cities. Drawing from an extensive dataset, our model clas…

    Submitted 16 March, 2024; originally announced March 2024.

  50. arXiv:2403.07911  [pdf]

    cs.CY cs.AI

    Standing on FURM ground -- A framework for evaluating Fair, Useful, and Reliable AI Models in healthcare systems

    Authors: Alison Callahan, Duncan McElfresh, Juan M. Banda, Gabrielle Bunney, Danton Char, Jonathan Chen, Conor K. Corbin, Debadutta Dash, Norman L. Downing, Sneha S. Jain, Nikesh Kotecha, Jonathan Masterson, Michelle M. Mello, Keith Morse, Srikar Nallan, Abby Pandya, Anurang Revri, Aditya Sharma, Christopher Sharp, Rahul Thapa, Michael Wornow, Alaa Youssef, Michael A. Pfeffer, Nigam H. Shah

    Abstract: The impact of using artificial intelligence (AI) to guide patient care or operational processes is an interplay of the AI model's output, the decision-making protocol based on that output, and the capacity of the stakeholders involved to take the necessary subsequent action. Estimating the effects of this interplay before deployment, and studying it in real time afterwards, are essential to bridge…

    Submitted 14 March, 2024; v1 submitted 26 February, 2024; originally announced March 2024.