Showing 1–50 of 1,773 results for author: Krishna

Searching in archive cs.
  1. arXiv:2407.03093  [pdf, other]

    cs.SE cs.AI cs.CR cs.LG

    Revisiting the Performance of Deep Learning-Based Vulnerability Detection on Realistic Datasets

    Authors: Partha Chakraborty, Krishna Kanth Arumugam, Mahmoud Alfadel, Meiyappan Nagappan, Shane McIntosh

    Abstract: The impact of software vulnerabilities on everyday software systems is significant. Despite deep learning models being proposed for vulnerability detection, their reliability is questionable. Prior evaluations show high recall/F1 scores of up to 99%, but these models underperform in practical scenarios, particularly when assessed on entire codebases rather than just the fixing commit. This paper i…

    Submitted 3 July, 2024; originally announced July 2024.

    ACM Class: D.2; I.2

    Journal ref: 10.1109/TSE.2024.3423712

  2. arXiv:2407.02960  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    ObfuscaTune: Obfuscated Offsite Fine-tuning and Inference of Proprietary LLMs on Private Datasets

    Authors: Ahmed Frikha, Nassim Walha, Ricardo Mendes, Krishna Kanth Nakka, Xue Jiang, Xuebing Zhou

    Abstract: This work addresses the timely yet underexplored problem of performing inference and finetuning of a proprietary LLM owned by a model provider entity on the confidential/private data of another data owner entity, in a way that ensures the confidentiality of both the model and the data. Hereby, the finetuning is conducted offsite, i.e., on the computation infrastructure of a third-party cloud provi…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Preprint

  3. arXiv:2407.02956  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization

    Authors: Ahmed Frikha, Nassim Walha, Krishna Kanth Nakka, Ricardo Mendes, Xue Jiang, Xuebing Zhou

    Abstract: In this work, we address the problem of text anonymization where the goal is to prevent adversaries from correctly inferring private attributes of the author, while keeping the text utility, i.e., meaning and semantics. We propose IncogniText, a technique that anonymizes the text to mislead a potential adversary into predicting a wrong private attribute value. Our empirical evaluation shows a redu…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Preprint

  4. arXiv:2407.02943  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG

    PII-Compass: Guiding LLM training data extraction prompts towards the target PII via grounding

    Authors: Krishna Kanth Nakka, Ahmed Frikha, Ricardo Mendes, Xue Jiang, Xuebing Zhou

    Abstract: The latest and most impactful advances in large models stem from their increased size. Unfortunately, this translates into an improved memorization capacity, raising data privacy concerns. Specifically, it has been shown that models can output personal identifiable information (PII) contained in their training data. However, reported PII extraction performance varies widely, and there is no conse…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted at ACL 2024

  5. arXiv:2407.02543  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    Towards the Next Frontier in Speech Representation Learning Using Disentanglement

    Authors: Varun Krishna, Sriram Ganapathy

    Abstract: The popular frameworks for self-supervised learning of speech representations have largely focused on frame-level masked prediction of speech regions. While this has shown promising downstream task performance for speech recognition and related tasks, this has largely ignored factors of speech that are encoded at coarser level, like characteristics of the speaker or channel that remain consistent…

    Submitted 2 July, 2024; originally announced July 2024.

  6. arXiv:2407.02013  [pdf, other]

    cs.LG

    DiGRAF: Diffeomorphic Graph-Adaptive Activation Function

    Authors: Krishna Sri Ipsit Mantri, Xinzhi Wang, Carola-Bibiane Schönlieb, Bruno Ribeiro, Beatrice Bevilacqua, Moshe Eliasof

    Abstract: In this paper, we propose a novel activation function tailored specifically for graph data in Graph Neural Networks (GNNs). Motivated by the need for graph-adaptive and flexible activation functions, we introduce DiGRAF, leveraging Continuous Piecewise-Affine Based (CPAB) transformations, which we augment with an additional GNN to learn a graph-adaptive diffeomorphic activation function in an end-…

    Submitted 2 July, 2024; originally announced July 2024.

  7. arXiv:2407.01732  [pdf, other]

    cs.CY cs.HC cs.IR

    Investigating Nudges toward Related Sellers on E-commerce Marketplaces: A Case Study on Amazon

    Authors: Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, Krishna P. Gummadi

    Abstract: E-commerce marketplaces provide business opportunities to millions of sellers worldwide. Some of these sellers have special relationships with the marketplace by virtue of using their subsidiary services (e.g., fulfillment and/or shipping services provided by the marketplace) -- we refer to such sellers collectively as Related Sellers. When multiple sellers offer to sell the same product, the mark…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: This work has been accepted for presentation at the ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW) 2024. It will appear in Proceedings of the ACM on Human-Computer Interaction

  8. arXiv:2407.00167  [pdf, other]

    cs.CL cs.AI cs.ET cs.HC cs.SI

    Can GPT-4 Help Detect Quit Vaping Intentions? An Exploration of Automatic Data Annotation Approach

    Authors: Sai Krishna Revanth Vuruma, Dezhi Wu, Saborny Sen Gupta, Lucas Aust, Valerie Lookingbill, Wyatt Bellamy, Yang Ren, Erin Kasson, Li-Shiun Chen, Patricia Cavazos-Rehg, Dian Hu, Ming Huang

    Abstract: In recent years, the United States has witnessed a significant surge in the popularity of vaping or e-cigarette use, leading to a notable rise in cases of e-cigarette and vaping use-associated lung injury (EVALI) that caused hospitalizations and fatalities during the EVALI outbreak in 2019, highlighting the urgency to comprehend vaping behaviors and develop effective strategies for cessation. Due…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: Accepted for the AI Applications in Public Health and Social Services workshop at the 22nd International Conference on Artificial Intelligence in Medicine (AIME 2024)

  9. arXiv:2406.19954  [pdf, other]

    cs.CL cs.HC cs.SD eess.AS

    BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5

    Authors: Zhehuai Chen, He Huang, Oleksii Hrinchuk, Krishna C. Puvvada, Nithin Rao Koluguri, Piotr Żelasko, Jagadeesh Balam, Boris Ginsburg

    Abstract: Incorporating speech understanding capabilities into pretrained large-language models has become a vital research direction (SpeechLLM). The previous architectures can be categorized as: i) GPT-style, prepend speech prompts to the text prompts as a sequence of LLM inputs like a decoder-only model; ii) T5-style, introduce speech cross-attention to each layer of the pretrained LLMs. We propose BESTO…

    Submitted 28 June, 2024; originally announced June 2024.

    MSC Class: 68T10 ACM Class: I.2.7

  10. arXiv:2406.19738  [pdf, other]

    quant-ph cs.AI cs.LG

    Classical Bandit Algorithms for Entanglement Detection in Parameterized Qubit States

    Authors: Bharati. K, Vikesh Siddhu, Krishna Jagannathan

    Abstract: Entanglement is a key resource for a wide range of tasks in quantum information and computing. Thus, verifying availability of this quantum resource is essential. Extensive research on entanglement detection has led to no-go theorems (Lu et al. [Phys. Rev. Lett., 116, 230501 (2016)]) that highlight the need for full state tomography (FST) in the absence of adaptive or joint measurements. Recent ad…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 20 pages, 5 figures

  11. arXiv:2406.19674  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Less is More: Accurate Speech Recognition & Translation without Web-Scale Data

    Authors: Krishna C. Puvvada, Piotr Żelasko, He Huang, Oleksii Hrinchuk, Nithin Rao Koluguri, Kunal Dhawan, Somshubra Majumdar, Elena Rastorgueva, Zhehuai Chen, Vitaly Lavrukhin, Jagadeesh Balam, Boris Ginsburg

    Abstract: Recent advances in speech recognition and translation rely on hundreds of thousands of hours of Internet speech data. We argue that state-of-the-art accuracy can be reached without relying on web-scale data. Canary - multilingual ASR and speech translation model, outperforms current state-of-the-art models - Whisper, OWSM, and Seamless-M4T on English, French, Spanish, and German languages, while b…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: Accepted at Interspeech-2024

  12. arXiv:2406.19580  [pdf, other]

    cs.AR cs.LG

    FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models

    Authors: Saeed Rashidi, William Won, Sudarshan Srinivasan, Puneet Gupta, Tushar Krishna

    Abstract: Distributed Deep Neural Network (DNN) training is a technique to reduce the training overhead by distributing the training tasks into multiple accelerators, according to a parallelization strategy. However, high-performance compute and interconnects are needed for maximum speed-up and linear scaling of the system. Wafer-scale systems are a promising technology that allows for tightly integrating h…

    Submitted 27 June, 2024; originally announced June 2024.

  13. arXiv:2406.18915  [pdf, other]

    cs.RO cs.CV

    Manipulate-Anything: Automating Real-World Robots using Vision-Language Models

    Authors: Jiafei Duan, Wentao Yuan, Wilbert Pumacay, Yi Ru Wang, Kiana Ehsani, Dieter Fox, Ranjay Krishna

    Abstract: Large-scale endeavors like RT-1 and widespread community efforts such as Open-X-Embodiment have contributed to growing the scale of robot demonstration data. However, there is still an opportunity to improve the quality, quantity, and diversity of robot demonstration data. Although vision-language models have been shown to automatically generate demonstration data, their utility has been limited t…

    Submitted 27 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: Project page: https://robot-ma.github.io/

  14. arXiv:2406.17968  [pdf, other]

    cs.IR cs.AI cs.LG stat.ML

    Efficient Document Ranking with Learnable Late Interactions

    Authors: Ziwei Ji, Himanshu Jain, Andreas Veit, Sashank J. Reddi, Sadeep Jayasumana, Ankit Singh Rawat, Aditya Krishna Menon, Felix Yu, Sanjiv Kumar

    Abstract: Cross-Encoder (CE) and Dual-Encoder (DE) models are two fundamental approaches for query-document relevance in information retrieval. To predict relevance, CE models use joint query-document embeddings, while DE models maintain factorized query and document embeddings; usually, the former has higher quality while the latter benefits from lower latency. Recently, late-interaction models have been p…

    Submitted 25 June, 2024; originally announced June 2024.

  15. arXiv:2406.17774  [pdf, other]

    cs.CV cs.GR

    Fast and Uncertainty-Aware SVBRDF Recovery from Multi-View Capture using Frequency Domain Analysis

    Authors: Ruben Wiersma, Julien Philip, Miloš Hašan, Krishna Mullia, Fujun Luan, Elmar Eisemann, Valentin Deschaintre

    Abstract: Relightable object acquisition is a key challenge in simplifying digital asset creation. Complete reconstruction of an object typically requires capturing hundreds to thousands of photographs under controlled illumination, with specialized equipment. The recent progress in differentiable rendering improved the quality and accessibility of inverse rendering optimization. Nevertheless, under uncontr…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Project page: https://brdf-uncertainty.github.io

  16. arXiv:2406.17377  [pdf, other]

    cs.CL

    A Three-Pronged Approach to Cross-Lingual Adaptation with Multilingual LLMs

    Authors: Vaibhav Singh, Amrith Krishna, Karthika NJ, Ganesh Ramakrishnan

    Abstract: Low-resource languages, by their very definition, tend to be underrepresented in the pre-training corpora of Large Language Models. In this work, we investigate three low-resource cross-lingual approaches that enable an LLM to adapt to tasks in previously unseen languages. Llama-2 is an LLM where Indic languages, among many other language families, contribute to less than $0.005\%$ of the total $2$ tr…

    Submitted 25 June, 2024; originally announced June 2024.

  17. arXiv:2406.16008  [pdf, other]

    cs.CL cs.AI cs.LG

    Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization

    Authors: Cheng-Yu Hsieh, Yung-Sung Chuang, Chun-Liang Li, Zifeng Wang, Long T. Le, Abhishek Kumar, James Glass, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna, Tomas Pfister

    Abstract: Large language models (LLMs), even when specifically trained to process long input contexts, struggle to capture relevant information located in the middle of their input. This phenomenon has been known as the lost-in-the-middle problem. In this work, we make three contributions. First, we set out to understand the factors that cause this phenomenon. In doing so, we establish a connection between…

    Submitted 3 July, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

    Comments: ACL Findings 2024

  18. arXiv:2406.14517  [pdf, other]

    cs.LG cs.AI cs.CL cs.CR

    PostMark: A Robust Blackbox Watermark for Large Language Models

    Authors: Yapei Chang, Kalpesh Krishna, Amir Houmansadr, John Wieting, Mohit Iyyer

    Abstract: The most effective techniques to detect LLM-generated text rely on inserting a detectable signature -- or watermark -- during the model's decoding process. Most existing watermarking methods require access to the underlying LLM's logits, which LLM API providers are loath to share due to fears of model distillation. As such, these watermarks must be implemented independently by each LLM provider. I…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: preprint; 18 pages, 5 figures

  19. arXiv:2406.14458  [pdf, other]

    cs.LG cs.AI cs.IT eess.SP

    Centimeter Positioning Accuracy using AI/ML for 6G Applications

    Authors: Sai Prasanth Kotturi, Radha Krishna Ganti

    Abstract: This research looks at using AI/ML to achieve centimeter-level user positioning in 6G applications such as the Industrial Internet of Things (IIoT). Initial results show that our AI/ML-based method can estimate user positions with an accuracy of 17 cm in an indoor factory environment. In this proposal, we highlight our approaches and future directions.

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 2 Pages, 2 Figures, ICMLCN Conference, Stockholm, Sweden

  20. arXiv:2406.13868  [pdf, other]

    cs.LG cs.AI

    SDQ: Sparse Decomposed Quantization for LLM Inference

    Authors: Geonhwa Jeong, Po-An Tsai, Stephen W. Keckler, Tushar Krishna

    Abstract: Recently, large language models (LLMs) have shown surprising performance in task-specific workloads as well as general tasks with the given prompts. However, to achieve unprecedented performance, recent LLMs use billions to trillions of parameters, which hinder the wide adaptation of those models due to their extremely large compute and memory requirements. To resolve the issue, various model comp…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Preprint

  21. arXiv:2406.13129  [pdf, other]

    cs.CV cs.LG

    M3T: Multi-Modal Medical Transformer to bridge Clinical Context with Visual Insights for Retinal Image Medical Description Generation

    Authors: Nagur Shareef Shaik, Teja Krishna Cherukuri, Dong Hye Ye

    Abstract: Automated retinal image medical description generation is crucial for streamlining medical diagnosis and treatment planning. Existing challenges include the reliance on learned retinal image representations, difficulties in handling multiple imaging modalities, and the lack of clinical context in visual representations. Addressing these issues, we propose the Multi-Modal Medical Transformer (M3T),…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted for presentation at the IEEE International Conference on Image Processing (ICIP 2024)

  22. arXiv:2406.13126  [pdf, other]

    cs.CV cs.LG

    Guided Context Gating: Learning to leverage salient lesions in retinal fundus images

    Authors: Teja Krishna Cherukuri, Nagur Shareef Shaik, Dong Hye Ye

    Abstract: Effectively representing medical images, especially retinal images, presents a considerable challenge due to variations in appearance, size, and contextual information of pathological signs called lesions. Precise discrimination of these lesions is crucial for diagnosing vision-threatening issues such as diabetic retinopathy. While visual attention-based neural networks have been introduced to lea…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted for presentation at the IEEE International Conference on Image Processing (ICIP 2024)

  23. arXiv:2406.12997  [pdf, other]

    cs.CL

    Suitability of CCA for Generating Latent State/ Variables in Multi-View Textual Data

    Authors: Akanksha Mehndiratta, Krishna Asawa

    Abstract: The probabilistic interpretation of Canonical Correlation Analysis (CCA) for learning low-dimensional real vectors, called latent variables, has been exploited immensely in various fields. This study takes a step further by demonstrating the potential of CCA in discovering a latent state that captures the contextual information within the textual data under a two-view setting. The interpretatio…

    Submitted 18 June, 2024; originally announced June 2024.

  24. arXiv:2406.12818  [pdf, other]

    econ.TH cs.SI

    Optimal Bailouts in Diversified Financial Networks

    Authors: Krishna Dasaratha, Santosh Venkatesh, Rakesh Vohra

    Abstract: Widespread default involves substantial deadweight costs which could be countered by injecting capital into failing firms. Injections have positive spillovers that can trigger a repayment cascade. But which firms should a regulator bail out so as to minimize the total injection of capital while ensuring solvency of all firms? While the problem is, in general, NP-hard, for a wide range of networks t…

    Submitted 18 June, 2024; originally announced June 2024.

  25. arXiv:2406.12683  [pdf, other]

    cs.CV cs.LG

    Spatial Sequence Attention Network for Schizophrenia Classification from Structural Brain MR Images

    Authors: Nagur Shareef Shaik, Teja Krishna Cherukuri, Vince Calhoun, Dong Hye Ye

    Abstract: Schizophrenia is a debilitating, chronic mental disorder that significantly impacts an individual's cognitive abilities, behavior, and social interactions. It is characterized by subtle morphological changes in the brain, particularly in the gray matter. These changes are often imperceptible through manual observation, demanding an automated approach to diagnosis. This study introduces a deep lear…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted for the 21st IEEE International Symposium on Biomedical Imaging (ISBI 2024)

  26. arXiv:2406.12336  [pdf, other]

    cs.CL cs.LG

    A Compass for Navigating the World of Sentence Embeddings for the Telecom Domain

    Authors: Sujoy Roychowdhury, Sumit Soman, H. G. Ranjani, Vansh Chhabra, Neeraj Gunda, Subhadip Bandyopadhyay, Sai Krishna Bala

    Abstract: A plethora of sentence embedding models makes it challenging to choose one, especially for domains such as telecom, rich with specialized vocabulary. We evaluate multiple embeddings obtained from publicly available models and their domain-adapted variants, on both point retrieval accuracies as well as their (95\%) confidence intervals. We establish a systematic method to obtain thresholds for simi…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 10 pages, 3 figures, 4 tables

    MSC Class: 68T50 ACM Class: I.2.7

  27. arXiv:2406.11930  [pdf, other]

    cs.SE cs.AI cs.CL

    A Critical Study of What Code-LLMs (Do Not) Learn

    Authors: Abhinav Anand, Shweta Verma, Krishna Narasimhan, Mira Mezini

    Abstract: Large Language Models trained on code corpora (code-LLMs) have demonstrated impressive performance in various coding assistance tasks. However, despite their increased size and training dataset, code-LLMs still have limitations such as suggesting codes with syntactic errors, variable misuse etc. Some studies argue that code-LLMs perform well on coding tasks because they use self-attention and hidd…

    Submitted 17 June, 2024; originally announced June 2024.

  28. arXiv:2406.11877  [pdf]

    physics.ao-ph cs.LG

    Solar Power Prediction Using Satellite Data in Different Parts of Nepal

    Authors: Raj Krishna Nepal, Bibek Khanal, Vibek Ghimire, Kismat Neupane, Atul Pokharel, Kshitij Niraula, Baburam Tiwari, Nawaraj Bhattarai, Khem N. Poudyal, Nawaraj Karki, Mohan B Dangi, John Biden

    Abstract: Due to the unavailability of solar irradiance data for many potential sites of Nepal, the paper proposes predicting solar irradiance based on alternative meteorological parameters. The study focuses on five distinct regions in Nepal and utilizes a dataset spanning almost ten years, obtained from CERES SYN1deg and MERRA-2. Machine learning models such as Random Forest, XGBoost, K-Nearest Neighbors,…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 20 pages, 12 figures, 5 tables

  29. arXiv:2406.11775  [pdf, other]

    cs.CV cs.AI

    Task Me Anything

    Authors: Jieyu Zhang, Weikai Huang, Zixian Ma, Oscar Michel, Dong He, Tanmay Gupta, Wei-Chiu Ma, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna

    Abstract: Benchmarks for large multimodal language models (MLMs) now serve to simultaneously assess the general capabilities of models instead of evaluating for a specific capability. As a result, when a developer wants to identify which models to use for their application, they are overwhelmed by the number of benchmarks and remain uncertain about which benchmark's results are most reflective of their spec…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: website: https://www.task-me-anything.org

  30. arXiv:2406.11488  [pdf, other]

    cs.FL

    Reversible Transducers over Infinite Words

    Authors: Luc Dartois, Paul Gastin, Loïc Germerie Guizouarn, R. Govind, Shankaranarayanan Krishna

    Abstract: Deterministic two-way transducers capture the class of regular functions. The efficiency of composing two-way transducers has a direct implication in algorithmic problems related to reactive synthesis, where transformation specifications are converted into equivalent transducers. These specifications are presented in a modular way, and composing the resultant machines simulates the full specificat…

    Submitted 28 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  31. arXiv:2406.10721  [pdf, other]

    cs.RO cs.AI cs.CV

    RoboPoint: A Vision-Language Model for Spatial Affordance Prediction for Robotics

    Authors: Wentao Yuan, Jiafei Duan, Valts Blukis, Wilbert Pumacay, Ranjay Krishna, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox

    Abstract: From rearranging objects on a table to putting groceries into shelves, robots must plan precise action points to perform tasks accurately and reliably. In spite of the recent adoption of vision language models (VLMs) to control robot behavior, VLMs struggle to precisely articulate robot actions using language. We introduce an automatic synthetic data generation pipeline that instruction-tunes VLMs…

    Submitted 15 June, 2024; originally announced June 2024.

  32. arXiv:2406.09617  [pdf, other]

    cs.CL cs.HC eess.AS

    Multimodal Large Language Models with Fusion Low Rank Adaptation for Device Directed Speech Detection

    Authors: Shruti Palaskar, Oggi Rudovic, Sameer Dharur, Florian Pesce, Gautam Krishna, Aswin Sivaraman, Jack Berkowitz, Ahmed Hussen Abdelaziz, Saurabh Adya, Ahmed Tewfik

    Abstract: Although Large Language Models (LLMs) have shown promise for human-like conversations, they are primarily pre-trained on text data. Incorporating audio or video improves performance, but collecting large-scale multimodal data and pre-training multimodal LLMs is challenging. To this end, we propose a Fusion Low Rank Adaptation (FLoRA) technique that efficiently adapts a pre-trained unimodal LLM to…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted at Interspeech 2024

  33. arXiv:2406.09403  [pdf, other]

    cs.CV cs.CL

    Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models

    Authors: Yushi Hu, Weijia Shi, Xingyu Fu, Dan Roth, Mari Ostendorf, Luke Zettlemoyer, Noah A Smith, Ranjay Krishna

    Abstract: Humans draw to facilitate reasoning: we draw auxiliary lines when solving geometry problems; we mark and circle when reasoning on maps; we use sketches to amplify our ideas and relieve our limited-capacity working memory. However, such actions are missing in current multimodal language models (LMs). Current chain-of-thought and tool-use paradigms only use text as intermediate reasoning steps. In t…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 26 pages

  34. arXiv:2406.09264  [pdf, other]

    cs.HC cs.AI cs.CL

    Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions

    Authors: Hua Shen, Tiffany Knearem, Reshmi Ghosh, Kenan Alkiek, Kundan Krishna, Yachuan Liu, Ziqiao Ma, Savvas Petridis, Yi-Hao Peng, Li Qiwei, Sushrita Rakshit, Chenglei Si, Yutong Xie, Jeffrey P. Bigham, Frank Bentley, Joyce Chai, Zachary Lipton, Qiaozhu Mei, Rada Mihalcea, Michael Terry, Diyi Yang, Meredith Ringel Morris, Paul Resnick, David Jurgens

    Abstract: Recent advancements in general-purpose AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment. However, the lack of clarified definitions and scopes of human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve th…

    Submitted 17 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: 56 pages

  35. arXiv:2406.07892  [pdf, ps, other]

    cs.LG cs.AI

    Finite Time Analysis of Temporal Difference Learning for Mean-Variance in a Discounted MDP

    Authors: Tejaram Sangadi, L. A. Prashanth, Krishna Jagannathan

    Abstract: Motivated by risk-sensitive reinforcement learning scenarios, we consider the problem of policy evaluation for variance in a discounted reward Markov decision process (MDP). For this problem, a temporal difference (TD) type learning algorithm with linear function approximation (LFA) exists in the literature, though only asymptotic guarantees are available for this algorithm. We derive finite sampl…

    Submitted 12 June, 2024; originally announced June 2024.

  36. arXiv:2406.07332  [pdf, other]

    cs.CV

    Minimizing Energy Costs in Deep Learning Model Training: The Gaussian Sampling Approach

    Authors: Challapalli Phanindra Revanth, Sumohana S. Channappayya, C Krishna Mohan

    Abstract: Computing the loss gradient via backpropagation consumes considerable energy during deep learning (DL) model training. In this paper, we propose a novel approach to efficiently compute DL models' gradients to mitigate the substantial energy overhead associated with backpropagation. Exploiting the over-parameterized nature of DL models and the smoothness of their loss landscapes, we propose a metho…

    Submitted 11 June, 2024; originally announced June 2024.

  37. arXiv:2406.07246  [pdf, other]

    cs.LG

    Marginalization Consistent Mixture of Separable Flows for Probabilistic Irregular Time Series Forecasting

    Authors: Vijaya Krishna Yalavarthi, Randolf Scholz, Kiran Madhusudhanan, Stefan Born, Lars Schmidt-Thieme

    Abstract: Probabilistic forecasting models for joint distributions of targets in irregular time series are a heavily under-researched area in machine learning with, to the best of our knowledge, only three models researched so far: GPR, the Gaussian Process Regression model~\citep{Durichen2015.Multitask}, TACTiS, the Transformer-Attentional Copulas for Time Series~\cite{Drouin2022.Tactis, ashok2024tactis} a…

    Submitted 11 June, 2024; originally announced June 2024.

  38. arXiv:2406.05184  [pdf, other]

    cs.CV

    The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better

    Authors: Scott Geng, Cheng-Yu Hsieh, Vivek Ramanujan, Matthew Wallingford, Chun-Liang Li, Pang Wei Koh, Ranjay Krishna

    Abstract: Generative text-to-image models enable us to synthesize unlimited amounts of images in a controllable manner, spurring many recent efforts to train vision models with synthetic data. However, every synthetic image ultimately originates from the upstream data used to train the generator. What additional value does the intermediate generator provide over directly training on relevant parts of the up…

    Submitted 3 July, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: Correspondence to sgeng at cs dot washington dot edu. RK and PWK equally advised the project

  39. arXiv:2406.04548  [pdf, other]

    cs.LG cs.IR cs.SI

    GNNAnatomy: Systematic Generation and Evaluation of Multi-Level Explanations for Graph Neural Networks

    Authors: Hsiao-Ying Lu, Yiran Li, Ujwal Pratap Krishna Kaluvakolanu Thyagarajan, Kwan-Liu Ma

    Abstract: Graph Neural Networks (GNNs) have proven highly effective in various machine learning (ML) tasks involving graphs, such as node/graph classification and link prediction. However, explaining the decisions made by GNNs poses challenges because of the aggregated relational information based on graph structure, leading to complex data transformations. Existing methods for explaining GNNs often face li…

    Submitted 6 June, 2024; originally announced June 2024.

  40. arXiv:2406.01698  [pdf, other]

    cs.AR cs.AI cs.DC cs.LG

    Demystifying Platform Requirements for Diverse LLM Inference Use Cases

    Authors: Abhimanyu Bambhaniya, Ritik Raj, Geonhwa Jeong, Souvik Kundu, Sudarshan Srinivasan, Midhilesh Elavazhagan, Madhu Kumar, Tushar Krishna

    Abstract: Large language models (LLMs) have shown remarkable performance across a wide range of applications, often outperforming human experts. However, deploying these parameter-heavy models efficiently for diverse inference use cases requires carefully designed hardware platforms with ample computing, memory, and network resources. With LLM deployment scenarios and models evolving at breakneck speed, the…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 12 Pages, https://github.com/abhibambhaniya/GenZ-LLM-Analyzer

  41. arXiv:2406.00060  [pdf, other]

    cs.CL cs.LG

    Cascade-Aware Training of Language Models

    Authors: Congchao Wang, Sean Augenstein, Keith Rush, Wittawat Jitkrittum, Harikrishna Narasimhan, Ankit Singh Rawat, Aditya Krishna Menon, Alec Go

    Abstract: Reducing serving cost and latency is a fundamental concern for the deployment of language models (LMs) in business applications. To address this, cascades of LMs offer an effective solution that conditionally employ smaller models for simpler queries. Cascaded systems are typically built with independently trained models, neglecting the advantages of considering inference-time interactions of the…

    Submitted 29 May, 2024; originally announced June 2024.

    Comments: 22 pages, 13 figures

  42. arXiv:2405.20933  [pdf, ps, other]

    cs.LG stat.ML

    Concentration Bounds for Optimized Certainty Equivalent Risk Estimation

    Authors: Ayon Ghosh, L. A. Prashanth, Krishna Jagannathan

    Abstract: We consider the problem of estimating the Optimized Certainty Equivalent (OCE) risk from independent and identically distributed (i.i.d.) samples. For the classic sample average approximation (SAA) of OCE, we derive mean-squared error as well as concentration bounds (assuming sub-Gaussianity). Further, we analyze an efficient stochastic approximation-based OCE estimator, and derive finite sample b…

    Submitted 31 May, 2024; originally announced May 2024.

  43. arXiv:2405.20654  [pdf, other]

    cs.CL cs.IR

    Passage-specific Prompt Tuning for Passage Reranking in Question Answering with Large Language Models

    Authors: Xuyang Wu, Zhiyuan Peng, Krishna Sravanthi Rajanala Sai, Hsin-Tai Wu, Yi Fang

    Abstract: Effective passage retrieval and reranking methods have been widely utilized to identify suitable candidates in open-domain question answering tasks. Recent studies have resorted to LLMs for reranking the retrieved passages by the log-likelihood of the question conditioned on each passage. Although these methods have demonstrated promising results, the performance is notably sensitive to the human-…

    Submitted 20 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

    Comments: Accepted at Gen-IR@SIGIR24

  44. arXiv:2405.20457  [pdf, other]

    cs.SI cs.CY cs.HC

    Online network topology shapes personal narratives and hashtag generation

    Authors: J. Hunter Priniski, Bryce Linford, Sai Krishna, Fred Morstatter, Jeff Brantingham, Hongjing Lu

    Abstract: While narratives have shaped cognition and cultures for centuries, digital media and online social networks have introduced new narrative phenomena. With increased narrative agency, networked groups of individuals can directly contribute and steer narratives that center our collective discussions of politics, science, and morality. We report the results of an online network experiment on narrative…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Will be published in the 2024 Proceedings of the Cognitive Science Society

  45. arXiv:2405.19597  [pdf, other]

    cs.LG cs.AI cs.CL

    SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors

    Authors: Vijay Lingam, Atula Tejaswi, Aditya Vavre, Aneesh Shetty, Gautham Krishna Gudur, Joydeep Ghosh, Alex Dimakis, Eunsol Choi, Aleksandar Bojchevski, Sujay Sanghavi

    Abstract: Popular parameter-efficient fine-tuning (PEFT) methods, such as LoRA and its variants, freeze pre-trained model weights \(W\) and inject learnable matrices \(ΔW\). These \(ΔW\) matrices are structured for efficient parameterization, often using techniques like low-rank approximations or scaling vectors. However, these methods typically show a performance gap compared to full fine-tuning. Although…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 17 pages, 5 figures, 14 tables

  46. arXiv:2405.19261  [pdf, other]

    cs.CL cs.AI cs.LG

    Faster Cascades via Speculative Decoding

    Authors: Harikrishna Narasimhan, Wittawat Jitkrittum, Ankit Singh Rawat, Seungyeon Kim, Neha Gupta, Aditya Krishna Menon, Sanjiv Kumar

    Abstract: Cascades and speculative decoding are two common approaches to improving language models' inference efficiency. Both approaches involve interleaving models of different sizes, but via fundamentally distinct mechanisms: cascades employ a deferral rule that invokes the larger model only for "hard" inputs, while speculative decoding uses speculative execution to primarily invoke the larger model in p…

    Submitted 29 May, 2024; originally announced May 2024.

  47. arXiv:2405.18400  [pdf, other]

    cs.CL cs.LG

    Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass

    Authors: Ethan Shen, Alan Fan, Sarah M. Pratt, Jae Sung Park, Matthew Wallingford, Sham M. Kakade, Ari Holtzman, Ranjay Krishna, Ali Farhadi, Aditya Kusupati

    Abstract: Many applications today provide users with multiple auto-complete drafts as they type, including GitHub's code completion, Gmail's smart compose, and Apple's messaging auto-suggestions. Under the hood, language models support this by running an autoregressive inference pass to provide a draft. Consequently, providing $k$ drafts to the user requires running an expensive language model $k$ times. To…

    Submitted 24 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: 22 pages, 15 figures

  48. arXiv:2405.17731  [pdf, other]

    cs.DB

    Evaluating NoSQL Databases for OLAP Workloads: A Benchmarking Study of MongoDB, Redis, Kudu and ArangoDB

    Authors: Rishi Kesav Mohan, Risheek Rakshit Sukumar Kanmani, Krishna Anandan Ganesan, Nisha Ramasubramanian

    Abstract: In the era of big data, conventional RDBMS models have become impractical for handling colossal workloads. Consequently, NoSQL databases have emerged as the preferred storage solutions for executing processing-intensive Online Analytical Processing (OLAP) tasks. Within the realm of NoSQL databases, various classifications exist based on their data storage mechanisms, making it challenging to selec…

    Submitted 27 May, 2024; originally announced May 2024.

  49. arXiv:2405.17309  [pdf, other]

    cs.LG cs.NI

    Survey of Graph Neural Network for Internet of Things and NextG Networks

    Authors: Sabarish Krishna Moorthy, Jithin Jagannath

    Abstract: The exponential increase in Internet of Things (IoT) devices coupled with 6G pushing towards higher data rates and connected devices has sparked a surge in data. Consequently, harnessing the full potential of data-driven machine learning has become one of the important thrusts. In addition to the advancement in wireless technology, it is important to efficiently use the resources available and mee…

    Submitted 27 May, 2024; originally announced May 2024.

  50. arXiv:2405.16915  [pdf, other]

    cs.CV cs.LG

    Multilingual Diversity Improves Vision-Language Representations

    Authors: Thao Nguyen, Matthew Wallingford, Sebastin Santy, Wei-Chiu Ma, Sewoong Oh, Ludwig Schmidt, Pang Wei Koh, Ranjay Krishna

    Abstract: Massive web-crawled image-text datasets lay the foundation for recent progress in multimodal learning. These datasets are designed with the goal of training a model to do well on standard computer vision benchmarks, many of which, however, have been shown to be English-centric (e.g., ImageNet). Consequently, existing data curation techniques gravitate towards using predominantly English image-text…

    Submitted 27 May, 2024; originally announced May 2024.