Skip to main content

Showing 1–50 of 278 results for author: Agarwal, S

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.12313  [pdf

    cs.DB

    A framework for develo** a knowledge management platform

    Authors: Marie Lisandra Zepeda Mendoza, Sonali Agarwal, James A. Blackshaw, Vanesa Bol, Audrey Fazzi, Filippo Fiorini, Amy Louise Foreman, Nancy George, Brett R. Johnson, Brian Martin, Dave McComb, Euphemia Mutasa-Gottgens, Helen Parkinson, Martin Romacker, Rolf Russell, Valérien Ségard, Shawn Zheng Kai Tan, Wei Kheng Teh, F. P. Winstanley, Benedict Wong, Adrian M. Smith

    Abstract: Knowledge management (KM) involves collecting, organizing, storing, and disseminating information to improve decision-making, innovation, and performance. Implementing KM at scale has become essential for organizations to effectively leverage vast accessible data. This paper is a compilation of concepts that emerged from KM workshops hosted by EMBL-EBI, attended by SMEs and industry. We provide gu… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 18 pages, 1 figure

  2. arXiv:2406.05276  [pdf, other

    cs.LG

    VTrans: Accelerating Transformer Compression with Variational Information Bottleneck based Pruning

    Authors: Oshin Dutta, Ritvik Gupta, Sumeet Agarwal

    Abstract: In recent years, there has been a growing emphasis on compressing large pre-trained transformer models for resource-constrained devices. However, traditional pruning methods often leave the embedding layer untouched, leading to model over-parameterization. Additionally, they require extensive compression time with large datasets to maintain performance in pruned models. To address these challenges… ▽ More

    Submitted 11 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

  3. arXiv:2406.03142  [pdf, ps, other

    cs.LG

    On the Power of Randomization in Fair Classification and Representation

    Authors: Sushant Agarwal, Amit Deshpande

    Abstract: Fair classification and fair representation learning are two important problems in supervised and unsupervised fair machine learning, respectively. Fair classification asks for a classifier that maximizes accuracy on a given data distribution subject to fairness constraints. Fair representation maps a given data distribution over the original feature space to a distribution over a new representati… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Appeared in ACM FAccT 2022

  4. arXiv:2405.20405  [pdf, other

    cs.DS cs.CR cs.IT cs.LG stat.ML

    Private Mean Estimation with Person-Level Differential Privacy

    Authors: Sushant Agarwal, Gautam Kamath, Mahbod Majid, Argyris Mouzakis, Rose Silver, Jonathan Ullman

    Abstract: We study differentially private (DP) mean estimation in the case where each person holds multiple samples. Commonly referred to as the "user-level" setting, DP here requires the usual notion of distributional stability when all of a person's datapoints can be modified. Informally, if $n$ people each have $m$ samples from an unknown $d$-dimensional distribution with bounded $k$-th moments, we show… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 67 pages, 3 figures

  5. arXiv:2405.15152  [pdf, other

    cs.CL cs.AI

    Machine Unlearning in Large Language Models

    Authors: Saaketh Koundinya Gundavarapu, Shreya Agarwal, Arushi Arora, Chandana Thimmalapura Jagadeeshaiah

    Abstract: Machine unlearning, a novel area within artificial intelligence, focuses on addressing the challenge of selectively forgetting or reducing undesirable knowledge or behaviors in machine learning models, particularly in the context of large language models (LLMs). This paper introduces a methodology to align LLMs, such as Open Pre-trained Transformer Language Models, with ethical, privacy, and safet… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 10 pages

  6. arXiv:2405.12433  [pdf, other

    cs.AI

    LLM+Reasoning+Planning for supporting incomplete user queries in presence of APIs

    Authors: Sudhir Agarwal, Anu Sreepathy, David H. Alonso, Prarit Lamba

    Abstract: Recent availability of Large Language Models (LLMs) has led to the development of numerous LLM-based approaches aimed at providing natural language interfaces for various end-user tasks. These end-user tasks in turn can typically be accomplished by orchestrating a given set of APIs. In practice, natural language task requests (user queries) are often incomplete, i.e., they may not contain all the… ▽ More

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 9 pages main content, 2 pages references, 12 pages appendix, 5 figures

  7. arXiv:2405.11346  [pdf

    cs.AI

    Decision support system for Forest fire management using Ontology with Big Data and LLMs

    Authors: Ritesh Chandra, Shashi Shekhar Kumar, Rushil Patra, Sonali Agarwal

    Abstract: Forests are crucial for ecological balance, but wildfires, a major cause of forest loss, pose significant risks. Fire weather indices, which assess wildfire risk and predict resource demands, are vital. With the rise of sensor networks in fields like healthcare and environmental monitoring, semantic sensor networks are increasingly used to gather climatic data such as wind speed, temperature, and… ▽ More

    Submitted 18 May, 2024; originally announced May 2024.

  8. arXiv:2405.11215  [pdf, other

    cs.CL cs.CY

    MemeMQA: Multimodal Question Answering for Memes via Rationale-Based Inferencing

    Authors: Siddhant Agarwal, Shivam Sharma, Preslav Nakov, Tanmoy Chakraborty

    Abstract: Memes have evolved as a prevalent medium for diverse communication, ranging from humour to propaganda. With the rising popularity of image-focused content, there is a growing need to explore its potential harm from different aspects. Previous studies have analyzed memes in closed settings - detecting harm, applying semantic labels, and offering natural language explanations. To extend this researc… ▽ More

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: The paper has been accepted in ACL'24 (Findings)

  9. arXiv:2405.08015  [pdf, other

    cs.LG cs.AI

    A Methodology-Oriented Study of Catastrophic Forgetting in Incremental Deep Neural Networks

    Authors: Ashutosh Kumar, Sonali Agarwal, D Jude Hemanth

    Abstract: Human being and different species of animals having the skills to gather, transferring knowledge, processing, fine-tune and generating information throughout their lifetime. The ability of learning throughout their lifespan is referred as continuous learning which is using neurocognition mechanism. Consequently, in real world computational system of incremental learning autonomous agents also need… ▽ More

    Submitted 11 May, 2024; originally announced May 2024.

  10. arXiv:2405.07284  [pdf

    cs.CV cs.AI

    Zero Shot Context-Based Object Segmentation using SLIP (SAM+CLIP)

    Authors: Saaketh Koundinya Gundavarapu, Arushi Arora, Shreya Agarwal

    Abstract: We present SLIP (SAM+CLIP), an enhanced architecture for zero-shot object segmentation. SLIP combines the Segment Anything Model (SAM) \cite{kirillov2023segment} with the Contrastive Language-Image Pretraining (CLIP) \cite{radford2021learning}. By incorporating text prompts into SAM using CLIP, SLIP enables object segmentation without prior training on specific classes or categories. We fine-tune… ▽ More

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: 5 pages, 3 figures

  11. arXiv:2405.05658  [pdf

    eess.IV cs.CV

    Artificial intelligence for abnormality detection in high volume neuroimaging: a systematic review and meta-analysis

    Authors: Siddharth Agarwal, David A. Wood, Mariusz Grzeda, Chandhini Suresh, Munaib Din, James Cole, Marc Modat, Thomas C Booth

    Abstract: Purpose: Most studies evaluating artificial intelligence (AI) models that detect abnormalities in neuroimaging are either tested on unrepresentative patient cohorts or are insufficiently well-validated, leading to poor generalisability to real-world tasks. The aim was to determine the diagnostic test accuracy and summarise the evidence supporting the use of AI models performing first-line, high-vo… ▽ More

    Submitted 9 May, 2024; originally announced May 2024.

  12. arXiv:2405.05647  [pdf

    cs.CV

    Letter to the Editor: What are the legal and ethical considerations of submitting radiology reports to ChatGPT?

    Authors: Siddharth Agarwal, David Wood, Robin Carpenter, Yiran Wei, Marc Modat, Thomas C Booth

    Abstract: This letter critically examines the recent article by Infante et al. assessing the utility of large language models (LLMs) like GPT-4, Perplexity, and Bard in identifying urgent findings in emergency radiology reports. While acknowledging the potential of LLMs in generating labels for computer vision, concerns are raised about the ethical implications of using patient data without explicit approva… ▽ More

    Submitted 9 May, 2024; originally announced May 2024.

  13. arXiv:2405.03113  [pdf, other

    cs.RO cs.AI

    Robot Air Hockey: A Manipulation Testbed for Robot Learning with Reinforcement Learning

    Authors: Caleb Chuck, Carl Qi, Michael J. Munje, Shuozhe Li, Max Rudolph, Chang Shi, Siddhant Agarwal, Harshit Sikchi, Abhinav Peri, Sarthak Dayal, Evan Kuo, Kavan Mehta, Anthony Wang, Peter Stone, Amy Zhang, Scott Niekum

    Abstract: Reinforcement Learning is a promising tool for learning complex policies even in fast-moving and object-interactive domains where human teleoperation or hard-coded policies might fail. To effectively reflect this challenging category of tasks, we introduce a dynamic, interactive RL testbed based on robot air hockey. By augmenting air hockey with a large family of tasks ranging from easy tasks like… ▽ More

    Submitted 5 May, 2024; originally announced May 2024.

  14. arXiv:2405.02782  [pdf

    cs.CV

    A self-supervised text-vision framework for automated brain abnormality detection

    Authors: David A. Wood, Emily Guilhem, Sina Kafiabadi, Ayisha Al Busaidi, Kishan Dissanayake, Ahmed Hammam, Nina Mansoor, Matthew Townend, Siddharth Agarwal, Yiran Wei, Asif Mazumder, Gareth J. Barker, Peter Sasieni, Sebastien Ourselin, James H. Cole, Thomas C. Booth

    Abstract: Artificial neural networks trained on large, expert-labelled datasets are considered state-of-the-art for a range of medical image recognition tasks. However, categorically labelled datasets are time-consuming to generate and constrain classification to a pre-defined, fixed set of classes. For neuroradiological applications in particular, this represents a barrier to clinical adoption. To address… ▽ More

    Submitted 11 June, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

    Comments: Under Review

  15. Reliable Student: Addressing Noise in Semi-Supervised 3D Object Detection

    Authors: Farzad Nozarian, Shashank Agarwal, Farzaneh Rezaeianaran, Danish Shahzad, Atanas Poibrenski, Christian Müller, Philipp Slusallek

    Abstract: Semi-supervised 3D object detection can benefit from the promising pseudo-labeling technique when labeled data is limited. However, recent approaches have overlooked the impact of noisy pseudo-labels during training, despite efforts to enhance pseudo-label quality through confidence-based filtering. In this paper, we examine the impact of noisy pseudo-labels on IoU-based target assignment and prop… ▽ More

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: Accepted at CVPR Workshop L3D-IVU 2023. Code: https://github.com/fnozarian/ReliableStudent

  16. arXiv:2404.16710  [pdf, other

    cs.CL cs.AI cs.LG

    LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

    Authors: Mostafa Elhoushi, Akshat Shrivastava, Diana Liskovich, Basil Hosmer, Bram Wasti, Liangzhen Lai, Anas Mahmoud, Bilge Acun, Saurabh Agarwal, Ahmed Roman, Ahmed A Aly, Beidi Chen, Carole-Jean Wu

    Abstract: We present LayerSkip, an end-to-end solution to speed-up inference of large language models (LLMs). First, during training we apply layer dropout, with low dropout rates for earlier layers and higher dropout rates for later layers, and an early exit loss where all transformer layers share the same exit. Second, during inference, we show that this training recipe increases the accuracy of early exi… ▽ More

    Submitted 29 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: Code open sourcing is in progress

  17. arXiv:2404.04392  [pdf, other

    cs.CR cs.AI

    Increased LLM Vulnerabilities from Fine-tuning and Quantization

    Authors: Divyanshu Kumar, Anurakt Kumar, Sahil Agarwal, Prashanth Harshangi

    Abstract: Large Language Models (LLMs) have become very popular and have found use cases in many domains, such as chatbots, auto-task completion agents, and much more. However, LLMs are vulnerable to different types of attacks, such as jailbreaking, prompt injection attacks, and privacy leakage attacks. Foundational LLMs undergo adversarial and alignment training to learn not to generate malicious and toxic… ▽ More

    Submitted 5 April, 2024; originally announced April 2024.

  18. arXiv:2404.02912  [pdf, ps, other

    cs.CC cs.AI

    Probabilistic Generating Circuits -- Demystified

    Authors: Sanyam Agarwal, Markus Bläser

    Abstract: Zhang et al. (ICML 2021, PLMR 139, pp. 12447-1245) introduced probabilistic generating circuits (PGCs) as a probabilistic model to unify probabilistic circuits (PCs) and determinantal point processes (DPPs). At a first glance, PGCs store a distribution in a very different way, they compute the probability generating polynomial instead of the probability mass function and it seems that this is the… ▽ More

    Submitted 4 March, 2024; originally announced April 2024.

  19. arXiv:2403.14701  [pdf

    cs.CY cs.DB

    Rule based Complex Event Processing for an Air Quality Monitoring System in Smart City

    Authors: Shashi Shekhar Kumar, Ritesh Chandra, Sonali Agarwal

    Abstract: In recent years, smart city-based development has gained momentum due to its versatile nature in architecture and planning for the systematic habitation of human beings. According to World Health Organization (WHO) report, air pollution causes serious respiratory diseases. Hence, it becomes necessary to real-time monitoring of air quality to minimize effect by taking time-bound decisions by the st… ▽ More

    Submitted 16 March, 2024; originally announced March 2024.

  20. arXiv:2403.09914  [pdf, other

    cs.CV

    ProMark: Proactive Diffusion Watermarking for Causal Attribution

    Authors: Vishal Asnani, John Collomosse, Tu Bui, Xiaoming Liu, Shruti Agarwal

    Abstract: Generative AI (GenAI) is transforming creative workflows through the capability to synthesize and manipulate images via high-level prompts. Yet creatives are not well supported to receive recognition or reward for the use of their content in GenAI training. To this end, we propose ProMark, a causal attribution technique to attribute a synthetically generated image to its training data concepts lik… ▽ More

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  21. arXiv:2403.08058  [pdf, other

    cs.LG cs.CL

    CHAI: Clustered Head Attention for Efficient LLM Inference

    Authors: Saurabh Agarwal, Bilge Acun, Basil Hosmer, Mostafa Elhoushi, Ye** Lee, Shivaram Venkataraman, Dimitris Papailiopoulos, Carole-Jean Wu

    Abstract: Large Language Models (LLMs) with hundreds of billions of parameters have transformed the field of machine learning. However, serving these models at inference time is both compute and memory intensive, where a single request can require multiple GPUs and tens of Gigabytes of memory. Multi-Head Attention is one of the key components of LLMs, which can account for over 50% of LLMs memory and comput… ▽ More

    Submitted 27 April, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  22. arXiv:2403.08043  [pdf, other

    cs.CL

    Authorship Style Transfer with Policy Optimization

    Authors: Shuai Liu, Shantanu Agarwal, Jonathan May

    Abstract: Authorship style transfer aims to rewrite a given text into a specified target while preserving the original meaning in the source. Existing approaches rely on the availability of a large number of target style exemplars for model training. However, these overlook cases where a limited number of target style examples are available. The development of parameter-efficient transfer learning technique… ▽ More

    Submitted 12 March, 2024; originally announced March 2024.

  23. arXiv:2403.06938  [pdf, other

    cs.AR

    TCAM-SSD: A Framework for Search-Based Computing in Solid-State Drives

    Authors: Ryan Wong, Nikita Kim, Kevin Higgs, Sapan Agarwal, Engin Ipek, Saugata Ghose, Ben Feinberg

    Abstract: As the amount of data produced in society continues to grow at an exponential rate, modern applications are incurring significant performance and energy penalties due to high data movement between the CPU and memory/storage. While processing in main memory can alleviate these penalties, it is becoming increasingly difficult to keep large datasets entirely in main memory. This has led to a recent p… ▽ More

    Submitted 11 March, 2024; originally announced March 2024.

  24. arXiv:2403.05530  [pdf, other

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  25. arXiv:2403.05513  [pdf, other

    cs.RO

    A Detection and Filtering Framework for Collaborative Localization

    Authors: Thirumalaesh Ashokkumar, Katherine A Skinner, Siddarth Agarwal, Ankit Vora, Ashutosh Bhown

    Abstract: Increasingly, autonomous vehicles (AVs) are becoming a reality, such as the Advanced Driver Assistance Systems (ADAS) in vehicles that assist drivers in driving and parking functions with vehicles today. The localization problem for AVs relies primarily on multiple sensors, including cameras, LiDARs, and radars. Manufacturing, installing, calibrating, and maintaining these sensors can be very expe… ▽ More

    Submitted 8 March, 2024; originally announced March 2024.

  26. arXiv:2403.04160  [pdf, other

    cs.IR cs.AI

    Improving Retrieval in Theme-specific Applications using a Corpus Topical Taxonomy

    Authors: SeongKu Kang, Shivam Agarwal, Bowen **, Dongha Lee, Hwanjo Yu, Jiawei Han

    Abstract: Document retrieval has greatly benefited from the advancements of large-scale pre-trained language models (PLMs). However, their effectiveness is often limited in theme-specific applications for specialized areas or industries, due to unique terminologies, incomplete contexts of user queries, and specialized search intents. To capture the theme-specific information and improve retrieval, we propos… ▽ More

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: TheWebConf'24

  27. arXiv:2403.02682  [pdf, other

    cs.LG eess.SP

    Time Weaver: A Conditional Time Series Generation Model

    Authors: Sai Shankar Narasimhan, Shubhankar Agarwal, Oguzhan Akcin, Sujay Sanghavi, Sandeep Chinchali

    Abstract: Imagine generating a city's electricity demand pattern based on weather, the presence of an electric vehicle, and location, which could be used for capacity planning during a winter freeze. Such real-world time series are often enriched with paired heterogeneous contextual metadata (weather, location, etc.). Current approaches to time series generation often ignore this paired metadata, and its he… ▽ More

    Submitted 5 March, 2024; originally announced March 2024.

  28. arXiv:2402.18434  [pdf, other

    cs.LG cs.IR

    Graph Regularized Encoder Training for Extreme Classification

    Authors: Anshul Mittal, Shikhar Mohan, Deepak Saini, Suchith C. Prabhu, Jain jiao, Sumeet Agarwal, Soumen Chakrabarti, Purushottam Kar, Manik Varma

    Abstract: Deep extreme classification (XC) aims to train an encoder architecture and an accompanying classifier architecture to tag a data point with the most relevant subset of labels from a very large universe of labels. XC applications in ranking, recommendation and tagging routinely encounter tail labels for which the amount of training data is exceedingly small. Graph convolutional networks (GCN) prese… ▽ More

    Submitted 28 February, 2024; originally announced February 2024.

  29. arXiv:2402.06608  [pdf, other

    cs.CL cs.AI

    TIC: Translate-Infer-Compile for accurate "text to plan" using LLMs and Logical Representations

    Authors: Sudhir Agarwal, Anu Sreepathy

    Abstract: We study the problem of generating plans for given natural language planning task requests. On one hand, LLMs excel at natural language processing but do not perform well on planning. On the other hand, classical planning tools excel at planning tasks but require input in a structured language such as the Planning Domain Definition Language (PDDL). We leverage the strengths of both the techniques… ▽ More

    Submitted 28 June, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: 20 pages (7 main + 2 references + 11 appendix), 4 figures, 2 tables

  30. arXiv:2402.05398  [pdf, other

    cs.CV

    On the Effect of Image Resolution on Semantic Segmentation

    Authors: Ritambhara Singh, Abhishek Jain, Pietro Perona, Shivani Agarwal, Junfeng Yang

    Abstract: High-resolution semantic segmentation requires substantial computational resources. Traditional approaches in the field typically downscale the input images before processing and then upscale the low-resolution outputs back to their original dimensions. While this strategy effectively identifies broad regions, it often misses finer details. In this study, we demonstrate that a streamlined model ca… ▽ More

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2209.08667 by other authors

  31. arXiv:2402.01829  [pdf, other

    q-bio.BM cs.CL cs.LG

    Predicting ATP binding sites in protein sequences using Deep Learning and Natural Language Processing

    Authors: Shreyas V, Swati Agarwal

    Abstract: Predicting ATP-Protein Binding sites in genes is of great significance in the field of Biology and Medicine. The majority of research in this field has been conducted through time- and resource-intensive 'wet experiments' in laboratories. Over the years, researchers have been investigating computational methods computational methods to accomplish the same goals, utilising the strength of advanced… ▽ More

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: Published at 3rd Annual AAAI Workshop on AI to Accelerate Science and Engineering (AI2ASE)

  32. arXiv:2402.01788  [pdf, other

    cs.CL cs.AI cs.IR

    LitLLM: A Toolkit for Scientific Literature Review

    Authors: Shubham Agarwal, Issam H. Laradji, Laurent Charlin, Christopher Pal

    Abstract: Conducting literature reviews for scientific papers is essential for understanding research, its limitations, and building on existing work. It is a tedious task which makes an automatic literature review generator appealing. Unfortunately, many existing works that generate such reviews using Large Language Models (LLMs) have significant limitations. They tend to hallucinate-generate non-actual in… ▽ More

    Submitted 1 February, 2024; originally announced February 2024.

  33. arXiv:2402.01528  [pdf, other

    cs.LG cs.CL

    Decoding Speculative Decoding

    Authors: Minghao Yan, Saurabh Agarwal, Shivaram Venkataraman

    Abstract: Speculative Decoding is a widely used technique to speed up inference for Large Language Models (LLMs) without sacrificing quality. When performing inference, speculative decoding uses a smaller draft model to generate speculative tokens and then uses the target LLM to verify those draft tokens. The speedup provided by speculative decoding heavily depends on the choice of the draft model. In this… ▽ More

    Submitted 26 April, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  34. arXiv:2402.01055  [pdf, other

    cs.LG stat.ML

    Multiclass Learning from Noisy Labels for Non-decomposable Performance Measures

    Authors: Mingyuan Zhang, Shivani Agarwal

    Abstract: There has been much interest in recent years in learning good classifiers from data with noisy labels. Most work on learning from noisy labels has focused on standard loss-based performance measures. However, many machine learning problems require using non-decomposable performance measures which cannot be expressed as the expectation or sum of a loss on individual examples; these include for exam… ▽ More

    Submitted 23 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  35. arXiv:2401.04855  [pdf, other

    cs.RO cs.LG

    LPAC: Learnable Perception-Action-Communication Loops with Applications to Coverage Control

    Authors: Saurav Agarwal, Ramya Muthukrishnan, Walker Gosrich, Vijay Kumar, Alejandro Ribeiro

    Abstract: Coverage control is the problem of navigating a robot swarm to collaboratively monitor features or a phenomenon of interest not known a priori. The problem is challenging in decentralized settings with robots that have limited communication and sensing capabilities. We propose a learnable Perception-Action-Communication (LPAC) architecture for the problem, wherein a convolution neural network (CNN… ▽ More

    Submitted 8 February, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

  36. arXiv:2401.01596  [pdf, other

    cs.AI cs.CL

    MedSumm: A Multimodal Approach to Summarizing Code-Mixed Hindi-English Clinical Queries

    Authors: Akash Ghosh, Arkadeep Acharya, Prince Jha, Aniket Gaudgaul, Rajdeep Majumdar, Sriparna Saha, Aman Chadha, Raghav Jain, Setu Sinha, Shivani Agarwal

    Abstract: In the healthcare domain, summarizing medical questions posed by patients is critical for improving doctor-patient interactions and medical decision-making. Although medical data has grown in complexity and quantity, the current body of research in this domain has primarily concentrated on text-based methods, overlooking the integration of visual cues. Also prior works in the area of medical quest… ▽ More

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: ECIR 2024

  37. arXiv:2401.00728  [pdf, other

    eess.IV cs.CV cs.LG

    MultiFusionNet: Multilayer Multimodal Fusion of Deep Neural Networks for Chest X-Ray Image Classification

    Authors: Saurabh Agarwal, K. V. Arya, Yogesh Kumar Meena

    Abstract: Chest X-ray imaging is a critical diagnostic tool for identifying pulmonary diseases. However, manual interpretation of these images is time-consuming and error-prone. Automated systems utilizing convolutional neural networks (CNNs) have shown promise in improving the accuracy and efficiency of chest X-ray image classification. While previous work has mainly focused on using feature maps from the… ▽ More

    Submitted 1 January, 2024; originally announced January 2024.

    Comments: 19 pages

  38. arXiv:2312.12621  [pdf, other

    cs.DC

    Blox: A Modular Toolkit for Deep Learning Schedulers

    Authors: Saurabh Agarwal, Amar Phanishayee, Shivaram Venkataraman

    Abstract: Deep Learning (DL) workloads have rapidly increased in popularity in enterprise clusters and several new cluster schedulers have been proposed in recent years to support these workloads. With rapidly evolving DL workloads, it is challenging to quickly prototype and compare scheduling policies across workloads. Further, as prior systems target different aspects of scheduling (resource allocation, p… ▽ More

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: To be presented at Eurosys'24

  39. arXiv:2312.11556  [pdf, other

    cs.CV cs.AI cs.CL

    StarVector: Generating Scalable Vector Graphics Code from Images

    Authors: Juan A. Rodriguez, Shubham Agarwal, Issam H. Laradji, Pau Rodriguez, David Vazquez, Christopher Pal, Marco Pedersoli

    Abstract: Scalable Vector Graphics (SVGs) have become integral in modern image rendering applications due to their infinite scalability in resolution, versatile usability, and editing capabilities. SVGs are particularly popular in the fields of web development and graphic design. Existing approaches for SVG modeling using deep learning often struggle with generating complex SVGs and are restricted to simple… ▽ More

    Submitted 17 December, 2023; originally announced December 2023.

  40. arXiv:2312.10540  [pdf, other

    cs.CV cs.GR

    VecFusion: Vector Font Generation with Diffusion

    Authors: Vikas Thamizharasan, Difan Liu, Shantanu Agarwal, Matthew Fisher, Michael Gharbi, Oliver Wang, Alec Jacobson, Evangelos Kalogerakis

    Abstract: We present VecFusion, a new neural architecture that can generate vector fonts with varying topological structures and precise control point positions. Our approach is a cascaded diffusion model which consists of a raster diffusion model followed by a vector diffusion model. The raster model generates low-resolution, rasterized fonts with auxiliary control point information, capturing the global s… ▽ More

    Submitted 21 May, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

  41. arXiv:2312.04429  [pdf, other

    cs.CV

    Approximate Caching for Efficiently Serving Diffusion Models

    Authors: Shubham Agarwal, Subrata Mitra, Sarthak Chakraborty, Srikrishna Karanam, Koyel Mukherjee, Shiv Saini

    Abstract: Text-to-image generation using diffusion models has seen explosive popularity owing to their ability in producing high quality images adhering to text prompts. However, production-grade diffusion model serving is a resource intensive task that not only require high-end GPUs which are expensive but also incurs considerable latency. In this paper, we introduce a technique called approximate-caching… ▽ More

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted at NSDI'24

  42. arXiv:2312.00434  [pdf, other

    cs.LG cs.AI cs.CY

    PEFTDebias : Capturing debiasing information using PEFTs

    Authors: Sumit Agarwal, Aditya Srikanth Veerubhotla, Srijan Bansal

    Abstract: The increasing use of foundation models highlights the urgent need to address and eliminate implicit biases present in them that arise during pretraining. In this paper, we introduce PEFTDebias, a novel approach that employs parameter-efficient fine-tuning (PEFT) to mitigate the biases within foundation models. PEFTDebias consists of two main phases: an upstream phase for acquiring debiasing param… ▽ More

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: EMNLP 2023

  43. arXiv:2311.18297  [pdf, other

    cs.CV cs.AI

    TrustMark: Universal Watermarking for Arbitrary Resolution Images

    Authors: Tu Bui, Shruti Agarwal, John Collomosse

    Abstract: Imperceptible digital watermarking is important in copyright protection, misinformation prevention, and responsible generative AI. We propose TrustMark - a GAN-based watermarking method with novel design in architecture and spatio-spectra losses to balance the trade-off between watermarked image quality with the watermark recovery accuracy. Our model is trained with robustness in mind, withstandin… ▽ More

    Submitted 30 November, 2023; originally announced November 2023.

  44. arXiv:2311.11045  [pdf, other

    cs.AI

    Orca 2: Teaching Small Language Models How to Reason

    Authors: Arindam Mitra, Luciano Del Corro, Shweti Mahajan, Andres Codas, Clarisse Simoes, Sahaj Agarwal, Xuxi Chen, Anastasia Razdaibiedina, Erik Jones, Kriti Aggarwal, Hamid Palangi, Guoqing Zheng, Corby Rosset, Hamed Khanpour, Ahmed Awadallah

    Abstract: Orca 1 learns from rich signals, such as explanation traces, allowing it to outperform conventional instruction-tuned models on benchmarks like BigBench Hard and AGIEval. In Orca 2, we continue exploring how improved training signals can enhance smaller LMs' reasoning abilities. Research on training small LMs has often relied on imitation learning to replicate the output of more capable models. We… ▽ More

    Submitted 21 November, 2023; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: Added url to model weights fixed typo in Author name

  45. arXiv:2311.10933  [pdf, other

    cs.AI cs.CL cs.CV cs.HC

    Representing visual classification as a linear combination of words

    Authors: Shobhit Agarwal, Yevgeniy R. Semenov, William Lotter

    Abstract: Explainability is a longstanding challenge in deep learning, especially in high-stakes domains like healthcare. Common explainability methods highlight image regions that drive an AI model's decision. Humans, however, heavily rely on language to convey explanations of not only "where" but "what". Additionally, most explainability approaches focus on explaining individual AI predictions, rather tha… ▽ More

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: To be published in the Proceedings of the 3rd Machine Learning for Health symposium, Proceedings of Machine Learning Research (PMLR)

  46. arXiv:2311.07595  [pdf

    cs.AI cs.LG

    A Decision Support System for Liver Diseases Prediction: Integrating Batch Processing, Rule-Based Event Detection and SPARQL Query

    Authors: Ritesh Chandra, Sadhana Tiwari, Satyam Rastogi, Sonali Agarwal

    Abstract: Liver diseases pose a significant global health burden, impacting a substantial number of individuals and exerting substantial economic and social consequences. Rising liver problems are considered a fatal disease in many countries, such as Egypt, Molda, etc. The objective of this study is to construct a predictive model for liver illness using Basic Formal Ontology (BFO) and detection rules deriv… ▽ More

    Submitted 10 November, 2023; originally announced November 2023.

  47. arXiv:2311.06145  [pdf, other

    cs.CV cs.CY

    An Evaluation of Forensic Facial Recognition

    Authors: Justin Norman, Shruti Agarwal, Hany Farid

    Abstract: Recent advances in machine learning and computer vision have led to reported facial recognition accuracies surpassing human performance. We question if these systems will translate to real-world forensic scenarios in which a potentially low-resolution, low-quality, partially-occluded image is compared against a standard facial database. We describe the construction of a large-scale synthetic facia… ▽ More

    Submitted 10 November, 2023; originally announced November 2023.

  48. arXiv:2311.02535   

    cs.CV

    TokenMotion: Motion-Guided Vision Transformer for Video Camouflaged Object Detection Via Learnable Token Selection

    Authors: Zifan Yu, Erfan Bank Tavakoli, Meida Chen, Suya You, Raghuveer Rao, Sanjeev Agarwal, Fengbo Ren

    Abstract: The area of Video Camouflaged Object Detection (VCOD) presents unique challenges in the field of computer vision due to texture similarities between target objects and their surroundings, as well as irregular motion patterns caused by both objects and camera movement. In this paper, we introduce TokenMotion (TMNet), which employs a transformer-based model to enhance VCOD by extracting motion-guide… ▽ More

    Submitted 1 February, 2024; v1 submitted 4 November, 2023; originally announced November 2023.

    Comments: Revising Needed

  49. arXiv:2310.13259  [pdf

    eess.IV cs.CV

    Domain-specific optimization and diverse evaluation of self-supervised models for histopathology

    Authors: Jeremy Lai, Faruk Ahmed, Supriya Vijay, Tiam Jaroensri, Jessica Loo, Saurabh Vyawahare, Saloni Agarwal, Fayaz Jamil, Yossi Matias, Greg S. Corrado, Dale R. Webster, Jonathan Krause, Yun Liu, Po-Hsuan Cameron Chen, Ellery Wulczyn, David F. Steiner

    Abstract: Task-specific deep learning models in histopathology offer promising opportunities for improving diagnosis, clinical research, and precision medicine. However, development of such models is often limited by availability of high-quality data. Foundation models in histopathology that learn general representations across a wide range of tissue types, diagnoses, and magnifications offer the potential… ▽ More

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 4 main tables, 3 main figures, additional supplemental tables and figures

  50. arXiv:2310.06794  [pdf, other

    cs.LG cs.AI cs.RO

    $f$-Policy Gradients: A General Framework for Goal Conditioned RL using $f$-Divergences

    Authors: Siddhant Agarwal, Ishan Durugkar, Peter Stone, Amy Zhang

    Abstract: Goal-Conditioned Reinforcement Learning (RL) problems often have access to sparse rewards where the agent receives a reward signal only when it has achieved the goal, making policy optimization a difficult problem. Several works augment this sparse reward with a learned dense reward function, but this can lead to sub-optimal policies if the reward is misaligned. Moreover, recent works have demonst… ▽ More

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted at NeurIPS 2023