Showing 1–50 of 495 results for author: Dhruv

Searching in archive cs.
  1. arXiv:2407.01725  [pdf, other]

    cs.CL cs.AI cs.LG

    DiscoveryBench: Towards Data-Driven Discovery with Large Language Models

    Authors: Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal, Bhavana Dalvi Mishra, Abhijeetsingh Meena, Aryan Prakhar, Tirth Vora, Tushar Khot, Ashish Sabharwal, Peter Clark

    Abstract: Can the rapid advances in code generation, function calling, and data analysis using large language models (LLMs) help automate the search and verification of hypotheses purely from a set of provided datasets? To evaluate this question, we present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery. The benchmark is designed to systemat…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Website: https://github.com/allenai/discoverybench

  2. arXiv:2407.00677  [pdf, ps, other]

    cs.IT

    Combinatorial Multi-Access Coded Caching with Private Caches

    Authors: Dhruv Pratap Singh, Anjana A. Mahesh, B. Sundar Rajan

    Abstract: We consider a variant of the coded caching problem where users connect to two types of caches, called private and access caches. The problem setting consists of a server with a library of files and a set of access caches. Each user, equipped with a private cache, connects to a distinct $r-$subset of the access caches. The server populates both types of caches with files in uncoded format. For this…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 13 pages and 6 figures

  3. arXiv:2406.17576  [pdf, other]

    cs.CR cs.AI cs.LG

    Leveraging Reinforcement Learning in Red Teaming for Advanced Ransomware Attack Simulations

    Authors: Cheng Wang, Christopher Redino, Ryan Clark, Abdul Rahman, Sal Aguinaga, Sathvik Murli, Dhruv Nandakumar, Roland Rao, Lanxiao Huang, Daniel Radke, Edward Bowen

    Abstract: Ransomware presents a significant and increasing threat to individuals and organizations by encrypting their systems and not releasing them until a large fee has been extracted. To bolster preparedness against potential attacks, organizations commonly conduct red teaming exercises, which involve simulated attacks to assess existing security measures. This paper proposes a novel approach utilizing…

    Submitted 25 June, 2024; originally announced June 2024.

  4. arXiv:2406.12998  [pdf, other]

    eess.AS cs.AI cs.CL cs.SD

    Articulatory Encodec: Vocal Tract Kinematics as a Codec for Speech

    Authors: Cheol Jun Cho, Peter Wu, Tejas S. Prabhune, Dhruv Agarwal, Gopala K. Anumanchipalli

    Abstract: Vocal tract articulation is a natural, grounded control space of speech production. The spatiotemporal coordination of articulators combined with the vocal source shapes intelligible speech sounds to enable effective spoken communication. Based on this physiological grounding of speech, we propose a new framework of neural encoding-decoding of speech -- articulatory encodec. The articulatory encod…

    Submitted 18 June, 2024; originally announced June 2024.

  5. arXiv:2406.09366  [pdf, other]

    cs.LG cs.CV q-bio.NC

    Towards an Improved Understanding and Utilization of Maximum Manifold Capacity Representations

    Authors: Rylan Schaeffer, Victor Lecomte, Dhruv Bhandarkar Pai, Andres Carranza, Berivan Isik, Alyssa Unell, Mikail Khona, Thomas Yerxa, Yann LeCun, SueYeon Chung, Andrey Gromov, Ravid Shwartz-Ziv, Sanmi Koyejo

    Abstract: Maximum Manifold Capacity Representations (MMCR) is a recent multi-view self-supervised learning (MVSSL) method that matches or surpasses other leading MVSSL methods. MMCR is intriguing because it does not fit neatly into any of the commonplace MVSSL lineages, instead originating from a statistical mechanical perspective on the linear separability of data manifolds. In this paper, we seek to impro…

    Submitted 13 June, 2024; originally announced June 2024.

  6. arXiv:2406.01799  [pdf, other]

    cs.LG math.OC stat.ML

    Online Control in Population Dynamics

    Authors: Noah Golowich, Elad Hazan, Zhou Lu, Dhruv Rohatgi, Y. Jennifer Sun

    Abstract: The study of population dynamics originated with early sociological works but has since extended into many fields, including biology, epidemiology, evolutionary game theory, and economics. Most studies on population dynamics focus on the problem of prediction rather than control. Existing mathematical models for control in population dynamics are often restricted to specific, noise-free dynamics,…

    Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  7. arXiv:2405.20254  [pdf, other]

    cs.HC cs.CY

    Conversational Agents to Facilitate Deliberation on Harmful Content in WhatsApp Groups

    Authors: Dhruv Agarwal, Farhana Shahid, Aditya Vashistha

    Abstract: WhatsApp groups have become a hotbed for the propagation of harmful content including misinformation, hate speech, polarizing content, and rumors, especially in Global South countries. Given the platform's end-to-end encryption, moderation responsibilities lie on group admins and members, who rarely contest such content. Another approach is fact-checking, which is unscalable, and can only contest…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: To appear at CSCW 2024

  8. arXiv:2405.16406  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    SpinQuant: LLM quantization with learned rotations

    Authors: Zechun Liu, Changsheng Zhao, Igor Fedorov, Bilge Soran, Dhruv Choudhary, Raghuraman Krishnamoorthi, Vikas Chandra, Yuandong Tian, Tijmen Blankevoort

    Abstract: Post-training quantization (PTQ) techniques applied to weights, activations, and the KV cache greatly reduce memory usage, latency, and power consumption of Large Language Models (LLMs), but may lead to large quantization errors when outliers are present. Recent findings suggest that rotating activation or weight matrices helps remove outliers and benefits quantization. In this work, we identify a…

    Submitted 28 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  9. arXiv:2405.11487  [pdf, other]

    cs.CV

    "Previously on ..." From Recaps to Story Summarization

    Authors: Aditya Kumar Singh, Dhruv Srivastava, Makarand Tapaswi

    Abstract: We introduce multimodal story summarization by leveraging TV episode recaps - short video sequences interweaving key story moments from previous episodes to bring viewers up to speed. We propose PlotSnap, a dataset featuring two crime thriller TV shows with rich recaps and long episodes of 40 minutes. Story summarization labels are unlocked by matching recap shots to corresponding sub-stories in t…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: CVPR 2024; Project page: https://katha-ai.github.io/projects/recap-story-summ/

  10. arXiv:2405.10391  [pdf, other]

    cs.RO cs.AI eess.IV

    Vision Transformers for End-to-End Vision-Based Quadrotor Obstacle Avoidance

    Authors: Anish Bhattacharya, Nishanth Rao, Dhruv Parikh, Pratik Kunapuli, Nikolai Matni, Vijay Kumar

    Abstract: We demonstrate the capabilities of an attention-based end-to-end approach for high-speed quadrotor obstacle avoidance in dense, cluttered environments, with comparison to various state-of-the-art architectures. Quadrotor unmanned aerial vehicles (UAVs) have tremendous maneuverability when flown fast; however, as flight speed increases, traditional vision-based navigation via independent mapping, p…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 8 pages, 10 figures, 3 tables

  11. arXiv:2405.05852  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.RO stat.ML

    Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control

    Authors: Gunshi Gupta, Karmesh Yadav, Yarin Gal, Dhruv Batra, Zsolt Kira, Cong Lu, Tim G. J. Rudner

    Abstract: Embodied AI agents require a fine-grained understanding of the physical world mediated through visual and language inputs. Such capabilities are difficult to learn solely from task-specific data. This has led to the emergence of pre-trained vision-language models as a tool for transferring representations learned from internet-scale data to downstream tasks and new domains. However, commonly used…

    Submitted 9 May, 2024; originally announced May 2024.

  12. arXiv:2405.05033  [pdf, other]

    cs.CE cs.LG stat.ML

    Multi-fidelity Hamiltonian Monte Carlo

    Authors: Dhruv V. Patel, Jonghyun Lee, Matthew W. Farthing, Peter K. Kitanidis, Eric F. Darve

    Abstract: Numerous applications in biology, statistics, science, and engineering require generating samples from high-dimensional probability distributions. In recent years, the Hamiltonian Monte Carlo (HMC) method has emerged as a state-of-the-art Markov chain Monte Carlo technique, exploiting the shape of such high-dimensional target distributions to efficiently generate samples. Despite its impressive em…

    Submitted 8 May, 2024; originally announced May 2024.

  13. arXiv:2405.01394  [pdf, other]

    cs.AI

    Analysis of a Modular Autonomous Driving Architecture: The Top Submission to CARLA Leaderboard 2.0 Challenge

    Authors: Weize Zhang, Mohammed Elmahgiubi, Kasra Rezaee, Behzad Khamidehi, Hamidreza Mirkhani, Fazel Arasteh, Chunlin Li, Muhammad Ahsan Kaleem, Eduardo R. Corral-Soto, Dhruv Sharma, Tongtong Cao

    Abstract: In this paper we present the architecture of the Kyber-E2E submission to the map track of CARLA Leaderboard 2.0 Autonomous Driving (AD) challenge 2023, which achieved first place. We employed a modular architecture for our solution consists of five main components: sensing, localization, perception, tracking/prediction, and planning/control. Our solution leverages state-of-the-art language-assiste…

    Submitted 21 March, 2024; originally announced May 2024.

  14. arXiv:2404.11819  [pdf, other]

    cs.CV

    Utilizing Adversarial Examples for Bias Mitigation and Accuracy Enhancement

    Authors: Pushkar Shukla, Dhruv Srikanth, Lee Cohen, Matthew Turk

    Abstract: We propose a novel approach to mitigate biases in computer vision models by utilizing counterfactual generation and fine-tuning. While counterfactuals have been used to analyze and address biases in DNN models, the counterfactuals themselves are often generated from biased generative models, which can introduce additional biases or spurious correlations. To address this issue, we propose using adv…

    Submitted 27 June, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  15. arXiv:2404.11325  [pdf, ps, other]

    cs.CR cs.DS

    On Learning Parities with Dependent Noise

    Authors: Noah Golowich, Ankur Moitra, Dhruv Rohatgi

    Abstract: In this expository note we show that the learning parities with noise (LPN) assumption is robust to weak dependencies in the noise distribution of small batches of samples. This provides a partial converse to the linearization technique of [AG11]. The material in this note is drawn from a recent work by the authors [GMR24], where the robustness guarantee was a key component in a cryptographic sepa…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: This note draws heavily from arXiv:2404.03774

  16. arXiv:2404.08513  [pdf, other]

    cs.LG cs.AI

    Adversarial Imitation Learning via Boosting

    Authors: Jonathan D. Chang, Dhruv Sreenivas, Yingbing Huang, Kianté Brantley, Wen Sun

    Abstract: Adversarial imitation learning (AIL) has stood out as a dominant framework across various imitation learning (IL) applications, with Discriminator Actor Critic (DAC) (Kostrikov et al., 2019) demonstrating the effectiveness of off-policy learning algorithms in improving sample efficiency and scalability to higher-dimensional observations. Despite DAC's empirical success, the original AIL objective…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 19 pages, 7 figures, 4 tables, 3 algorithms, ICLR 2024

  17. arXiv:2404.06723  [pdf, other]

    cs.LG cs.CL

    Global Contrastive Training for Multimodal Electronic Health Records with Language Supervision

    Authors: Yingbo Ma, Suraj Kolla, Zhenhong Hu, Dhruv Kaliraman, Victoria Nolan, Ziyuan Guan, Yuanfang Ren, Brooke Armfield, Tezcan Ozrazgat-Baslanti, Jeremy A. Balch, Tyler J. Loftus, Parisa Rashidi, Azra Bihorac, Benjamin Shickel

    Abstract: Modern electronic health records (EHRs) hold immense promise in tracking personalized patient health trajectories through sequential deep learning, owing to their extensive breadth, scale, and temporal granularity. Nonetheless, how to effectively leverage multiple modalities from EHRs poses significant challenges, given its complex characteristics such as high dimensionality, multimodality, sparsi…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 12 pages, 3 figures. arXiv admin note: text overlap with arXiv:2403.04012

  18. arXiv:2404.06609  [pdf, other]

    cs.AI cs.RO

    GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation

    Authors: Mukul Khanna, Ram Ramrakhya, Gunjan Chhablani, Sriram Yenamandra, Theophile Gervet, Matthew Chang, Zsolt Kira, Devendra Singh Chaplot, Dhruv Batra, Roozbeh Mottaghi

    Abstract: The Embodied AI community has made significant strides in visual navigation tasks, exploring targets from 3D coordinates, objects, language descriptions, and images. However, these navigation models often handle only a single input modality as the target. With the progress achieved so far, it is time to move towards universal navigation models capable of handling various goal types, enabling more…

    Submitted 9 April, 2024; originally announced April 2024.

  19. arXiv:2404.04603  [pdf, ps, other]

    cs.HC cs.CY

    Analyzing LLM Usage in an Advanced Computing Class in India

    Authors: Chaitanya Arora, Utkarsh Venaik, Pavit Singh, Sahil Goyal, Jatin Tyagi, Shyama Goel, Ujjwal Singhal, Dhruv Kumar

    Abstract: This paper investigates the usage patterns of undergraduate and graduate students when engaging with large language models (LLMs) to tackle programming assignments in the context of advanced computing courses. Existing work predominantly focuses on the influence of LLMs in introductory programming contexts. Additionally, there is a scarcity of studies analyzing actual conversations between student…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: Under review: 12 pages

  20. arXiv:2404.04527  [pdf, other]

    cs.CV cs.AI cs.AR cs.DC

    VTR: An Optimized Vision Transformer for SAR ATR Acceleration on FPGA

    Authors: Sachini Wickramasinghe, Dhruv Parikh, Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna, Carl Busart

    Abstract: Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) is a key technique used in military applications like remote-sensing image recognition. Vision Transformers (ViTs) are the current state-of-the-art in various computer vision applications, outperforming their CNN counterparts. However, using ViTs for SAR ATR applications is challenging due to (1) standard ViTs require extensive trai…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: SPIE DCS 2024

  21. arXiv:2404.03774  [pdf, other]

    cs.LG cs.CC cs.CR cs.DS

    Exploration is Harder than Prediction: Cryptographically Separating Reinforcement Learning from Supervised Learning

    Authors: Noah Golowich, Ankur Moitra, Dhruv Rohatgi

    Abstract: Supervised learning is often computationally easy in practice. But to what extent does this mean that other modes of learning, such as reinforcement learning (RL), ought to be computationally easy by extension? In this work we show the first cryptographic separation between RL and supervised learning, by exhibiting a class of block MDPs and associated decoding functions where reward-free explorati…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: 112 pages, 3 figures

  22. arXiv:2404.01413  [pdf, other]

    cs.LG cs.AI cs.CL cs.ET stat.ML

    Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data

    Authors: Matthias Gerstgrasser, Rylan Schaeffer, Apratim Dey, Rafael Rafailov, Henry Sleight, John Hughes, Tomasz Korbak, Rajashree Agrawal, Dhruv Pai, Andrey Gromov, Daniel A. Roberts, Diyi Yang, David L. Donoho, Sanmi Koyejo

    Abstract: The proliferation of generative models, combined with pretraining on web-scale data, raises a timely question: what happens when these models are trained on their own generated outputs? Recent investigations into model-data feedback loops proposed that such loops would lead to a phenomenon termed model collapse, under which performance progressively degrades with each model-data feedback iteration…

    Submitted 29 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  23. arXiv:2403.14047  [pdf, other]

    cs.DC cs.AR cs.CV

    Accelerating ViT Inference on FPGA through Static and Dynamic Pruning

    Authors: Dhruv Parikh, Shouyi Li, Bingyi Zhang, Rajgopal Kannan, Carl Busart, Viktor Prasanna

    Abstract: Vision Transformers (ViTs) have achieved state-of-the-art accuracy on various computer vision tasks. However, their high computational complexity prevents them from being applied to many real-world applications. Weight and token pruning are two well-known methods for reducing complexity: weight pruning reduces the model size and associated computational demands, while token pruning further dynamic…

    Submitted 12 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: FCCM 2024

  24. arXiv:2403.12173  [pdf, other]

    cs.CL cs.AI cs.IR

    TnT-LLM: Text Mining at Scale with Large Language Models

    Authors: Mengting Wan, Tara Safavi, Sujay Kumar Jauhar, Yujin Kim, Scott Counts, Jennifer Neville, Siddharth Suri, Chirag Shah, Ryen W White, Longqi Yang, Reid Andersen, Georg Buscher, Dhruv Joshi, Nagu Rangan

    Abstract: Transforming unstructured text into structured and meaningful forms, organized by useful category labels, is a fundamental step in text mining for downstream analysis and application. However, most existing methods for producing label taxonomies and building text-based label classifiers still rely heavily on domain expertise and manual curation, making the process expensive and time-consuming. Thi…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 9 pages main content, 8 pages references and appendix

  25. AI coach for badminton

    Authors: Dhruv Toshniwal, Arpit Patil, Nancy Vachhani

    Abstract: In the competitive realm of sports, optimal performance necessitates rigorous management of nutrition and physical conditioning. Specifically, in badminton, the agility and precision required make it an ideal candidate for motion analysis through video analytics. This study leverages advanced neural network methodologies to dissect video footage of badminton matches, aiming to extract detailed ins…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 7 pages, 11 figures. https://ieeexplore.ieee.org/document/9825164

    Journal ref: 2022 3rd International Conference for Emerging Technology (INCET), Belgaum, India, 2022, pp. 1-7

  26. arXiv:2403.08283  [pdf, other]

    cs.CV

    Optimized Detection and Classification on GTRSB: Advancing Traffic Sign Recognition with Convolutional Neural Networks

    Authors: Dhruv Toshniwal, Saurabh Loya, Anuj Khot, Yash Marda

    Abstract: In the rapidly evolving landscape of transportation, the proliferation of automobiles has made road traffic more complex, necessitating advanced vision-assisted technologies for enhanced safety and navigation. These technologies are imperative for providing critical traffic sign information, influencing driver behavior, and supporting vehicle control, especially for drivers with disabilities and i…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 8 pages, 8 figures, 1 table

  27. arXiv:2403.08195  [pdf, ps, other]

    quant-ph cs.CC

    Efficiently verifiable quantum advantage on near-term analog quantum simulators

    Authors: Zhenning Liu, Dhruv Devulapalli, Dominik Hangleiter, Yi-Kai Liu, Alicia J. Kollár, Alexey V. Gorshkov, Andrew M. Childs

    Abstract: Existing schemes for demonstrating quantum computational advantage are subject to various practical restrictions, including the hardness of verification and challenges in experimental implementation. Meanwhile, analog quantum simulators have been realized in many experiments to study novel physics. In this work, we propose a quantum advantage protocol based on single-step Feynman-Kitaev verificati…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 20 pages, 6 figures

  28. arXiv:2403.04012  [pdf, other]

    cs.LG

    Temporal Cross-Attention for Dynamic Embedding and Tokenization of Multimodal Electronic Health Records

    Authors: Yingbo Ma, Suraj Kolla, Dhruv Kaliraman, Victoria Nolan, Zhenhong Hu, Ziyuan Guan, Yuanfang Ren, Brooke Armfield, Tezcan Ozrazgat-Baslanti, Tyler J. Loftus, Parisa Rashidi, Azra Bihorac, Benjamin Shickel

    Abstract: The breadth, scale, and temporal granularity of modern electronic health records (EHR) systems offers great potential for estimating personalized and contextual patient health trajectories using sequential deep learning. However, learning useful representations of EHR data is challenging due to its high dimensionality, sparsity, multimodality, irregular and variable-specific recording frequency, a…

    Submitted 1 April, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

    Comments: ICLR 2024 Workshop on Learning From Time Series for Health. 10 pages, 3 figures

  29. arXiv:2403.00991  [pdf, other]

    cs.RO cs.CV cs.LG

    SELFI: Autonomous Self-Improvement with Reinforcement Learning for Social Navigation

    Authors: Noriaki Hirose, Dhruv Shah, Kyle Stachowicz, Ajay Sridhar, Sergey Levine

    Abstract: Autonomous self-improving robots that interact and improve with experience are key to the real-world deployment of robotic systems. In this paper, we propose an online learning method, SELFI, that leverages online robot experience to rapidly fine-tune pre-trained control policies efficiently. SELFI applies online model-free reinforcement learning on top of offline model-based learning to bring out…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 11 pages, 13 figures, 2 tables

  30. arXiv:2402.19432  [pdf, other]

    cs.RO

    Pushing the Limits of Cross-Embodiment Learning for Manipulation and Navigation

    Authors: Jonathan Yang, Catherine Glossop, Arjun Bhorkar, Dhruv Shah, Quan Vuong, Chelsea Finn, Dorsa Sadigh, Sergey Levine

    Abstract: Recent years in robotics and imitation learning have shown remarkable progress in training large-scale foundation models by leveraging data across a multitude of embodiments. The success of such policies might lead us to wonder: just how diverse can the robots in the training set be while still facilitating positive transfer? In this work, we study this question in the context of heterogeneous emb…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 16 pages, 9 figures

    MSC Class: 68T40 ACM Class: I.2.9

  31. arXiv:2402.16472  [pdf, other]

    cs.CL cs.AI

    mEdIT: Multilingual Text Editing via Instruction Tuning

    Authors: Vipul Raheja, Dimitris Alikaniotis, Vivek Kulkarni, Bashar Alhafni, Dhruv Kumar

    Abstract: We introduce mEdIT, a multi-lingual extension to CoEdIT -- the recent state-of-the-art text editing models for writing assistance. mEdIT models are trained by fine-tuning multi-lingual large, pre-trained language models (LLMs) via instruction tuning. They are designed to take instructions from the user specifying the attributes of the desired text in the form of natural language instructions, such…

    Submitted 17 April, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted to NAACL 2024 (Main). 23 pages, 8 tables, 11 figures

    ACM Class: I.2.7

  32. arXiv:2402.15409  [pdf, other]

    stat.ML cs.CC cs.DS cs.LG math.ST

    Lasso with Latents: Efficient Estimation, Covariate Rescaling, and Computational-Statistical Gaps

    Authors: Jonathan Kelner, Frederic Koehler, Raghu Meka, Dhruv Rohatgi

    Abstract: It is well-known that the statistical performance of Lasso can suffer significantly when the covariates of interest have strong correlations. In particular, the prediction error of Lasso becomes much worse than computationally inefficient alternatives like Best Subset Selection. Due to a large conjectured computational-statistical tradeoff in the problem of sparse linear regression, it may be impo…

    Submitted 23 February, 2024; originally announced February 2024.

  33. arXiv:2402.13610  [pdf, other]

    cs.CL cs.AI cs.LG

    Data-driven Discovery with Large Generative Models

    Authors: Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal, Sanchaita Hazra, Ashish Sabharwal, Peter Clark

    Abstract: With the accumulation of data at an unprecedented rate, its potential to fuel scientific discovery is growing exponentially. This position paper urges the Machine Learning (ML) community to exploit the capabilities of large generative models (LGMs) to develop automated systems for end-to-end data-driven discovery -- a paradigm encompassing the search and verification of hypotheses purely from a se…

    Submitted 21 February, 2024; originally announced February 2024.

  34. arXiv:2402.10202  [pdf, other]

    cs.LG

    Bridging Associative Memory and Probabilistic Modeling

    Authors: Rylan Schaeffer, Nika Zahedi, Mikail Khona, Dhruv Pai, Sang Truong, Yilun Du, Mitchell Ostrow, Sarthak Chandra, Andres Carranza, Ila Rani Fiete, Andrey Gromov, Sanmi Koyejo

    Abstract: Associative memory and probabilistic modeling are two fundamental topics in artificial intelligence. The first studies recurrent neural networks designed to denoise, complete and retrieve data, whereas the second studies learning and sampling from probability distributions. Based on the observation that associative memory's energy functions can be seen as probabilistic modeling's negative log like…

    Submitted 13 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  35. arXiv:2402.10051  [pdf, other]

    cs.AI cs.CL

    SwissNYF: Tool Grounded LLM Agents for Black Box Setting

    Authors: Somnath Sendhil Kumar, Dhruv Jain, Eshaan Agarwal, Raunak Pandey

    Abstract: While Large Language Models (LLMs) have demonstrated enhanced capabilities in function-calling, these advancements primarily rely on accessing the functions' responses. This methodology is practical for simpler APIs but faces scalability issues with irreversible APIs that significantly impact the system, such as a database deletion API. Similarly, processes requiring extensive time for each API ca…

    Submitted 15 February, 2024; originally announced February 2024.

  36. arXiv:2402.09811  [pdf, other]

    cs.CV

    TEXTRON: Weakly Supervised Multilingual Text Detection through Data Programming

    Authors: Dhruv Kudale, Badri Vishal Kasuba, Venkatapathy Subramanian, Parag Chaudhuri, Ganesh Ramakrishnan

    Abstract: Several recent deep learning (DL) based techniques perform considerably well on image-based multilingual text detection. However, their performance relies heavily on the availability and quality of training data. There are numerous types of page-level document images consisting of information in several modalities, languages, fonts, and layouts. This makes text detection a challenging problem in t…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: Accepted at the WACV 2024 Conference

  37. Discovering Command and Control (C2) Channels on Tor and Public Networks Using Reinforcement Learning

    Authors: Cheng Wang, Christopher Redino, Abdul Rahman, Ryan Clark, Daniel Radke, Tyler Cody, Dhruv Nandakumar, Edward Bowen

    Abstract: Command and control (C2) channels are an essential component of many types of cyber attacks, as they enable attackers to remotely control their malware-infected machines and execute harmful actions, such as propagating malicious code across networks, exfiltrating confidential data, or initiating distributed denial of service (DDoS) attacks. Identifying these C2 channels is therefore crucial in hel…

    Submitted 14 February, 2024; originally announced February 2024.

  38. arXiv:2402.07118  [pdf, other]

    cs.HC cs.AI cs.LG eess.IV eess.SP

    Next-Generation Teleophthalmology: AI-enabled Quality Assessment Aiding Remote Smartphone-based Consultation

    Authors: Dhruv Srikanth, Jayang Gurung, N Satya Deepika, Vineet Joshi, Pravin Vaddavalli, Soumya Jana

    Abstract: Blindness and other eye diseases are a global health concern, particularly in low- and middle-income countries like India. In this regard, during the COVID-19 pandemic, teleophthalmology became a lifeline, and the Grabi attachment for smartphone-based eye imaging gained in use. However, quality of user-captured image often remained inadequate, requiring clinician vetting and delays. In this backdr…

    Submitted 11 February, 2024; originally announced February 2024.

    Comments: 4 pages, Submitted to IEEE EMBC 2024

  39. arXiv:2402.04914  [pdf, other]

    cs.CL

    Personalized Text Generation with Fine-Grained Linguistic Control

    Authors: Bashar Alhafni, Vivek Kulkarni, Dhruv Kumar, Vipul Raheja

    Abstract: As the text generation capabilities of large language models become increasingly prominent, recent studies have focused on controlling particular aspects of the generated text to make it more personalized. However, most research on controllable text generation focuses on controlling the content or modeling specific high-level/coarse-grained attributes that reflect authors' writing styles, such as…

    Submitted 7 February, 2024; originally announced February 2024.

  40. arXiv:2402.01687  [pdf, ps, other]

    cs.CY cs.HC cs.LG

    "Which LLM should I use?": Evaluating LLMs for tasks performed by Undergraduate Computer Science Students

    Authors: Vibhor Agarwal, Madhav Krishan Garg, Sahiti Dharmavaram, Dhruv Kumar

    Abstract: This study evaluates the effectiveness of various large language models (LLMs) in performing tasks common among undergraduate computer science students. Although a number of research studies in the computing education community have explored the possibility of using LLMs for a variety of tasks, there is a lack of comprehensive research comparing different LLMs and evaluating which LLMs are most ef…

    Submitted 3 April, 2024; v1 submitted 22 January, 2024; originally announced February 2024.

    Comments: Under review

  41. arXiv:2401.15605  [pdf, other]

    cs.HC cs.CY

    AI as a Medical Ally: Evaluating ChatGPT's Usage and Impact in Indian Healthcare

    Authors: Aryaman Raina, Prateek Mishra, Harshit Goyal, Dhruv Kumar

    Abstract: This study investigates the integration and impact of Large Language Models (LLMs), like ChatGPT, in India's healthcare sector. Our research employs a dual approach, engaging both general users and medical professionals through surveys and interviews respectively. Our findings reveal that healthcare professionals value ChatGPT in medical education and preliminary clinical settings, but exercise ca…

    Submitted 28 January, 2024; originally announced January 2024.

    Comments: Under review

  42. arXiv:2401.15595  [pdf, other

    cs.HC cs.CY

    Comuniqa : Exploring Large Language Models for improving speaking skills

    Authors: Manas Mhasakar, Shikhar Sharma, Apurv Mehra, Utkarsh Venaik, Ujjwal Singhal, Dhruv Kumar, Kashish Mittal

    Abstract: In this paper, we investigate the potential of Large Language Models (LLMs) to improve English speaking skills. This is particularly relevant in countries like India, where English is crucial for academic, professional, and personal communication but remains a non-native language for many. Traditional methods for enhancing speaking skills often rely on human experts, which can be limited in terms…

    Submitted 14 May, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

    Comments: Accepted at 7th ACM SIGCAS/SIGCHI Conference of Computing and Sustainable Societies : ACM COMPASS 2024

  43. arXiv:2401.15589  [pdf, other

    cs.HC cs.CY

    OpineBot: Class Feedback Reimagined Using a Conversational LLM

    Authors: Henansh Tanwar, Kunal Shrivastva, Rahul Singh, Dhruv Kumar

    Abstract: Conventional class feedback systems often fall short, relying on static, unengaging surveys offering little incentive for student participation. To address this, we present OpineBot, a novel system employing large language models (LLMs) to conduct personalized, conversational class feedback via chatbot interface. We assessed OpineBot's effectiveness in a user study with 20 students from an Indian…

    Submitted 28 January, 2024; originally announced January 2024.

    Comments: Under review

  44. arXiv:2401.11095  [pdf, other

    cs.HC cs.SD eess.AS

    SoundShift: Exploring Sound Manipulations for Accessible Mixed-Reality Awareness

    Authors: Ruei-Che Chang, Chia-Sheng Hung, Bing-Yu Chen, Dhruv Jain, Anhong Guo

    Abstract: Mixed-reality (MR) soundscapes blend real-world sound with virtual audio from hearing devices, presenting intricate auditory information that is hard to discern and differentiate. This is particularly challenging for blind or visually impaired individuals, who rely on sounds and descriptions in their everyday lives. To understand how complex audio information is consumed, we analyzed online forum…

    Submitted 26 May, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

    Comments: DIS 2024

  45. arXiv:2401.09937  [pdf, other

    cs.CY cs.HC

    From Cash to Cashless: UPI's Impact on Spending Behavior among Indian Users

    Authors: Harshal Dev, Raj Gupta, Dhruv Kumar

    Abstract: The emergence of digital payment systems has transformed how individuals conduct financial transactions, offering convenience, security, and efficiency. One groundbreaking innovation making waves in the Indian financial landscape is the Unified Payments Interface (UPI). Existing work has explored how digital payments benefit a country's economy and GDP. However, our study explores how the introduc…

    Submitted 7 May, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: Accepted to ACM CHI 2024 - Late Breaking Work Track

  46. arXiv:2401.08091  [pdf, other

    cs.HC

    'One Style Does Not Regulate All': Moderation Practices in Public and Private WhatsApp Groups

    Authors: Farhana Shahid, Dhruv Agarwal, Aditya Vashistha

    Abstract: WhatsApp is the largest social media platform in the Global South and is a virulent force in global misinformation and political propaganda. Due to end-to-end encryption WhatsApp can barely review any content and this often pushes the responsibility of moderation towards group admins. Yet, little is known about how WhatsApp group admins manage their groups, what factors and values influence modera…

    Submitted 15 January, 2024; originally announced January 2024.

  47. arXiv:2401.07770  [pdf, other

    cs.CV

    Seeing the Unseen: Visual Common Sense for Semantic Placement

    Authors: Ram Ramrakhya, Aniruddha Kembhavi, Dhruv Batra, Zsolt Kira, Kuo-Hao Zeng, Luca Weihs

    Abstract: Computer vision tasks typically involve describing what is present in an image (e.g. classification, detection, segmentation, and captioning). We study a visual common sense task that requires understanding what is not present. Specifically, given an image (e.g. of a living room) and name of an object ("cushion"), a vision system is asked to predict semantically-meaningful regions (masks or boundi…

    Submitted 15 January, 2024; originally announced January 2024.

  48. arXiv:2401.04120  [pdf, other

    cs.HC cs.AI cs.CL

    Generation Z's Ability to Discriminate Between AI-generated and Human-Authored Text on Discord

    Authors: Dhruv Ramu, Rishab Jain, Aditya Jain

    Abstract: The growing popularity of generative artificial intelligence (AI) chatbots such as ChatGPT is having transformative effects on social media. As the prevalence of AI-generated content grows, concerns have been raised regarding privacy and misinformation online. Among social media platforms, Discord enables AI integrations -- making their primarily "Generation Z" userbase particularly exposed to AI-…

    Submitted 31 December, 2023; originally announced January 2024.

  49. arXiv:2312.17601  [pdf

    cs.SE cs.AI

    The Tyranny of Possibilities in the Design of Task-Oriented LLM Systems: A Scoping Survey

    Authors: Dhruv Dhamani, Mary Lou Maher

    Abstract: This scoping survey focuses on our current understanding of the design space for task-oriented LLM systems and elaborates on definitions and relationships among the available design parameters. The paper begins by defining a minimal task-oriented LLM system and exploring the design space of such systems through a thought experiment contemplating the performance of diverse LLM system configurations…

    Submitted 29 December, 2023; originally announced December 2023.

    Comments: 18 pages, 6 figures. Work-in-progress draft published to gather feedback. Please reach out with comments, if any

  50. arXiv:2312.16733  [pdf, other

    cs.DC cs.LG

    SuperServe: Fine-Grained Inference Serving for Unpredictable Workloads

    Authors: Alind Khare, Dhruv Garg, Sukrit Kalra, Snigdha Grandhi, Ion Stoica, Alexey Tumanov

    Abstract: The increasing deployment of ML models on the critical path of production applications in both datacenter and the edge requires ML inference serving systems to serve these models under unpredictable and bursty request arrival rates. Serving models under such conditions requires these systems to strike a careful balance between the latency and accuracy requirements of the application and the overal…

    Submitted 27 December, 2023; originally announced December 2023.