Showing 1–50 of 1,049 results for author: Kumar, A

Searching in archive cs.
  1. arXiv:2407.01306  [pdf, other]

    cs.LG cs.CR

    Unveiling the Unseen: Exploring Whitebox Membership Inference through the Lens of Explainability

    Authors: Chenxi Li, Abhinav Kumar, Zhen Guo, Jie Hou, Reza Tourani

    Abstract: The increasing prominence of deep learning applications and reliance on personalized data underscore the urgent need to address privacy vulnerabilities, particularly Membership Inference Attacks (MIAs). Despite numerous MIA studies, significant knowledge gaps persist, particularly regarding the impact of hidden features (in isolation) on attack efficacy and insufficient justification for the root…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 20 pages, 10 figures, 4 tables

  2. arXiv:2407.00866  [pdf, other]

    cs.LG

    Silver Linings in the Shadows: Harnessing Membership Inference for Machine Unlearning

    Authors: Nexhi Sula, Abhinav Kumar, Jie Hou, Han Wang, Reza Tourani

    Abstract: With the continued advancement and widespread adoption of machine learning (ML) models across various domains, ensuring user privacy and data security has become a paramount concern. In compliance with data privacy regulations, such as GDPR, a secure machine learning framework should not only grant users the right to request the removal of their contributed data used for model training but also fa…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 17 pages, 14 figures, 6 tables

  3. arXiv:2407.00774  [pdf, other]

    quant-ph cs.LG

    Advantages of quantum support vector machine in cross-domain classification of quantum states

    Authors: Diksha Sharma, Vivek Balasaheb Sabale, Parvinder Singh, Atul Kumar

    Abstract: In this study, we use cross-domain classification using quantum machine learning for quantum advantages to address the entanglement versus separability paradigm. We further demonstrate the efficient classification of Bell diagonal states into zero and non-zero discord classes. The inherited structure of quantum states and its relation with a particular class of quantum states are exploited to intu…

    Submitted 30 June, 2024; originally announced July 2024.

  4. arXiv:2407.00537  [pdf, other]

    eess.IV cs.CV cs.LG

    Accelerating Longitudinal MRI using Prior Informed Latent Diffusion

    Authors: Yonatan Urman, Zachary Shah, Ashwin Kumar, Bruno P. Soares, Kawin Setsompop

    Abstract: MRI is a widely used ionization-free soft-tissue imaging modality, often employed repeatedly over a patient's lifetime. However, prolonged scanning durations, among other issues, can limit availability and accessibility. In this work, we aim to substantially reduce scan times by leveraging prior scans of the same patient. These prior scans typically contain considerable shared information with the…

    Submitted 29 June, 2024; originally announced July 2024.

  5. arXiv:2407.00071  [pdf, other]

    cs.AI cs.CL cs.ET cs.LG

    Combinatorial Reasoning: Selecting Reasons in Generative AI Pipelines via Combinatorial Optimization

    Authors: Mert Esencan, Tarun Advaith Kumar, Ata Akbari Asanjan, P. Aaron Lott, Masoud Mohseni, Can Unlu, Davide Venturelli, Alan Ho

    Abstract: Recent Large Language Models (LLMs) have demonstrated impressive capabilities at tasks that require human intelligence and are a significant step towards human-like artificial intelligence (AI). Yet the performance of LLMs at reasoning tasks have been subpar and the reasoning capability of LLMs is a matter of significant debate. While it has been shown that the choice of the prompting technique to…

    Submitted 19 June, 2024; originally announced July 2024.

    Comments: 13 pages, 3 figures

  6. arXiv:2406.17304  [pdf, other]

    cs.CL

    Leveraging LLMs for Dialogue Quality Measurement

    Authors: Jinghan Jia, Abi Komma, Timothy Leffel, Xujun Peng, Ajay Nagesh, Tamer Soliman, Aram Galstyan, Anoop Kumar

    Abstract: In task-oriented conversational AI evaluation, unsupervised methods poorly correlate with human judgments, and supervised approaches lack generalization. Recent advances in large language models (LLMs) show robust zeroshot and few-shot capabilities across NLP tasks. This paper explores using LLMs for automated dialogue quality evaluation, experimenting with various configurations on public and pro…

    Submitted 25 June, 2024; originally announced June 2024.

  7. arXiv:2406.16008  [pdf, other]

    cs.CL cs.AI cs.LG

    Found in the Middle: Calibrating Positional Attention Bias Improves Long Context Utilization

    Authors: Cheng-Yu Hsieh, Yung-Sung Chuang, Chun-Liang Li, Zifeng Wang, Long T. Le, Abhishek Kumar, James Glass, Alexander Ratner, Chen-Yu Lee, Ranjay Krishna, Tomas Pfister

    Abstract: Large language models (LLMs), even when specifically trained to process long input contexts, struggle to capture relevant information located in the middle of their input. This phenomenon has been known as the lost-in-the-middle problem. In this work, we make three contributions. First, we set out to understand the factors that cause this phenomenon. In doing so, we establish a connection between…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: ACL Findings 2024

  8. arXiv:2406.15649  [pdf, other]

    cs.CV

    Efficient Human Pose Estimation: Leveraging Advanced Techniques with MediaPipe

    Authors: Sandeep Singh Sengar, Abhishek Kumar, Owen Singh

    Abstract: This study presents significant enhancements in human pose estimation using the MediaPipe framework. The research focuses on improving accuracy, computational efficiency, and real-time processing capabilities by comprehensively optimising the underlying algorithms. Novel modifications are introduced that substantially enhance pose estimation accuracy across challenging scenarios, such as dynamic m…

    Submitted 21 June, 2024; originally announced June 2024.

  9. arXiv:2406.15646  [pdf, other]

    cs.CV

    VigilEye -- Artificial Intelligence-based Real-time Driver Drowsiness Detection

    Authors: Sandeep Singh Sengar, Aswin Kumar, Owen Singh

    Abstract: This study presents a novel driver drowsiness detection system that combines deep learning techniques with the OpenCV framework. The system utilises facial landmarks extracted from the driver's face as input to Convolutional Neural Networks trained to recognise drowsiness patterns. The integration of OpenCV enables real-time video processing, making the system suitable for practical implementation…

    Submitted 21 June, 2024; originally announced June 2024.

  10. arXiv:2406.15565  [pdf, other]

    cs.CV cs.LG

    Unseen Object Reasoning with Shared Appearance Cues

    Authors: Paridhi Singh, Arun Kumar

    Abstract: This paper introduces an innovative approach to open world recognition (OWR), where we leverage knowledge acquired from known objects to address the recognition of previously unseen objects. The traditional method of object modeling relies on supervised learning with strict closed-set assumptions, presupposing that objects encountered during inference are already known at the training phase. Howev…

    Submitted 21 June, 2024; originally announced June 2024.

  11. arXiv:2406.14532  [pdf, other]

    cs.LG cs.CL

    RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold

    Authors: Amrith Setlur, Saurabh Garg, Xinyang Geng, Naman Garg, Virginia Smith, Aviral Kumar

    Abstract: Training on model-generated synthetic data is a promising approach for finetuning LLMs, but it remains unclear when it helps or hurts. In this paper, we investigate this question for math reasoning via an empirical study, followed by building a conceptual understanding of our observations. First, we find that while the typical approach of finetuning a model on synthetic correct or positive problem…

    Submitted 20 June, 2024; originally announced June 2024.

  12. arXiv:2406.13236  [pdf, other]

    cs.CL cs.AI

    Data Contamination Can Cross Language Barriers

    Authors: Feng Yao, Yufan Zhuang, Zihao Sun, Sunan Xu, Animesh Kumar, Jingbo Shang

    Abstract: The opacity in developing large language models (LLMs) is raising growing concerns about the potential contamination of public benchmarks in the pre-training data. Existing contamination detection methods are typically based on the text overlap between training and evaluation data, which can be too superficial to reflect deeper forms of contamination. In this paper, we first present a cross-lingua…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 12 pages, 5 figures

  13. arXiv:2406.12644  [pdf, other]

    cs.CL cs.AI

    Hierarchical Prompting Taxonomy: A Universal Evaluation Framework for Large Language Models

    Authors: Devichand Budagam, Sankalp KJ, Ashutosh Kumar, Vinija Jain, Aman Chadha

    Abstract: Assessing the effectiveness of large language models (LLMs) in addressing diverse tasks is essential for comprehending their strengths and weaknesses. Conventional evaluation techniques typically apply a single prompting strategy uniformly across datasets, not considering the varying degrees of task complexity. We introduce the Hierarchical Prompting Taxonomy (HPT), a taxonomy that employs a Hiera…

    Submitted 27 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  14. arXiv:2406.11925  [pdf, other]

    cs.SE cs.AI cs.CL

    DocCGen: Document-based Controlled Code Generation

    Authors: Sameer Pimparkhede, Mehant Kammakomati, Srikanth G. Tamilselvam, Prince Kumar, Ashok Pon Kumar, Pushpak Bhattacharyya

    Abstract: Recent developments show that Large Language Models (LLMs) produce state-of-the-art performance on natural language (NL) to code generation for resource-rich general-purpose languages like C++, Java, and Python. However, their practical usage for structured domain-specific languages (DSLs) such as YAML, JSON is limited due to domain-specific schema, grammar, and customizations generally unseen by…

    Submitted 17 June, 2024; originally announced June 2024.

  15. arXiv:2406.11896  [pdf, other]

    cs.LG

    DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

    Authors: Hao Bai, Yifei Zhou, Mert Cemri, Jiayi Pan, Alane Suhr, Sergey Levine, Aviral Kumar

    Abstract: Training corpuses for vision language models (VLMs) typically lack sufficient amounts of decision-centric data. This renders off-the-shelf VLMs sub-optimal for decision-making tasks such as in-the-wild device control through graphical user interfaces (GUIs). While training with static demonstrations has shown some promise, we show that such methods fall short for controlling real GUIs due to their…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 11 pages of main text, 28 pages in total

  16. arXiv:2406.11619  [pdf, other]

    eess.AS cs.LG

    AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling

    Authors: Vahid Ahmadi Kalkhorani, Cheng Yu, Anurag Kumar, Ke Tan, Buye Xu, DeLiang Wang

    Abstract: Adding visual cues to audio-based speech separation can improve separation performance. This paper introduces AV-CrossNet, an audiovisual (AV) system for speech enhancement, target speaker extraction, and multi-talker speaker separation. AV-CrossNet is extended from the CrossNet architecture, which is a recently proposed network that performs complex spectral mapping for speech separation by lever…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 10 pages, 4 Figures, and 4 Tables

  17. arXiv:2406.10935  [pdf, other]

    cs.CV

    Pick-or-Mix: Dynamic Channel Sampling for ConvNets

    Authors: Ashish Kumar, Daneul Kim, Jaesik Park, Laxmidhar Behera

    Abstract: Channel pruning approaches for convolutional neural networks (ConvNets) deactivate the channels, statically or dynamically, and require special implementation. In addition, channel squeezing in representative ConvNets is carried out via 1x1 convolutions which dominates a large portion of computations and network parameters. Given these challenges, we propose an effective multi-purpose module for d…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: Published in Computer Vision and Pattern Recognition (CVPR 2024)

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

  18. arXiv:2406.10764  [pdf, other]

    cs.CL

    GNOME: Generating Negotiations through Open-Domain Mapping of Exchanges

    Authors: Darshan Deshpande, Shambhavi Sinha, Anirudh Ravi Kumar, Debaditya Pal, Jonathan May

    Abstract: Language Models have previously shown strong negotiation capabilities in closed domains where the negotiation strategy prediction scope is constrained to a specific setup. In this paper, we first show that these models are not generalizable beyond their original training domain despite their wide-scale pretraining. Following this, we propose an automated framework called GNOME, which processes exi…

    Submitted 15 June, 2024; originally announced June 2024.

  19. arXiv:2406.09329  [pdf, other]

    cs.LG cs.AI

    Is Value Learning Really the Main Bottleneck in Offline RL?

    Authors: Seohong Park, Kevin Frans, Sergey Levine, Aviral Kumar

    Abstract: While imitation learning requires access to high-quality data, offline reinforcement learning (RL) should, in principle, perform similarly or better with substantially lower data quality by using a value function. However, current results indicate that offline RL often performs worse than imitation learning, and it is often unclear what holds back the performance of offline RL. Motivated by this o…

    Submitted 13 June, 2024; originally announced June 2024.

  20. arXiv:2406.06533  [pdf, other]

    cs.AR cs.AI

    Pragmatic Formal Verification Methodology for Clock Domain Crossing (CDC)

    Authors: Aman Kumar, Muhammad Ul Haque Khan, Bijitendra Mittra

    Abstract: Modern System-on-Chip (SoC) designs are becoming more and more complex due to the technology upscaling. SoC designs often operate on multiple asynchronous clock domains, further adding to the complexity of the overall design. To make the devices power efficient, designers take a Globally-Asynchronous Locally-Synchronous (GALS) approach that creates multiple asynchronous domains. These Clock Domain…

    Submitted 20 April, 2024; originally announced June 2024.

    Comments: Published in DVCon Europe 2023

  21. arXiv:2406.06512  [pdf, other]

    cs.CV cs.AI

    Merlin: A Vision Language Foundation Model for 3D Computed Tomography

    Authors: Louis Blankemeier, Joseph Paul Cohen, Ashwin Kumar, Dave Van Veen, Syed Jamal Safdar Gardezi, Magdalini Paschali, Zhihong Chen, Jean-Benoit Delbrouck, Eduardo Reis, Cesar Truyts, Christian Bluethgen, Malte Engmann Kjeldskov Jensen, Sophie Ostmeier, Maya Varma, Jeya Maria Jose Valanarasu, Zhongnan Fang, Zepeng Huo, Zaid Nabulsi, Diego Ardila, Wei-Hung Weng, Edson Amaro Junior, Neera Ahuja, Jason Fries, Nigam H. Shah, Andrew Johnston, et al. (6 additional authors not shown)

    Abstract: Over 85 million computed tomography (CT) scans are performed annually in the US, of which approximately one quarter focus on the abdomen. Given the current radiologist shortage, there is a large impetus to use artificial intelligence to alleviate the burden of interpreting these complex imaging studies. Prior state-of-the-art approaches for automated medical image interpretation leverage vision la…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 18 pages, 7 figures

  22. arXiv:2406.04744  [pdf, other]

    cs.CL

    CRAG -- Comprehensive RAG Benchmark

    Authors: Xiao Yang, Kai Sun, Hao Xin, Yushi Sun, Nikita Bhalla, Xiangsen Chen, Sajal Choudhary, Rongze Daniel Gui, Ziran Will Jiang, Ziyu Jiang, Lingkun Kong, Brian Moran, Jiaqi Wang, Yifan Ethan Xu, An Yan, Chenyu Yang, Eting Yuan, Hanwen Zha, Nan Tang, Lei Chen, Nicolas Scheffer, Yue Liu, Nirav Shah, Rakesh Wanga, Anuj Kumar, et al. (2 additional authors not shown)

    Abstract: Retrieval-Augmented Generation (RAG) has recently emerged as a promising solution to alleviate Large Language Model (LLM)'s deficiency in lack of knowledge. Existing RAG datasets, however, do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks. To bridge this gap, we introduce the Comprehensive RAG Benchmark (CRAG), a factual question answering bench…

    Submitted 7 June, 2024; originally announced June 2024.

  23. arXiv:2406.04660  [pdf, other]

    eess.AS cs.SD

    URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement

    Authors: Wangyou Zhang, Robin Scheibler, Kohei Saijo, Samuele Cornell, Chenda Li, Zhaoheng Ni, Anurag Kumar, Jan Pirklbauer, Marvin Sach, Shinji Watanabe, Tim Fingscheidt, Yanmin Qian

    Abstract: The last decade has witnessed significant advancements in deep learning-based speech enhancement (SE). However, most existing SE research has limitations on the coverage of SE sub-tasks, data diversity and amount, and evaluation metrics. To fill this gap and promote research toward universal SE, we establish a new SE challenge, named URGENT, to focus on the universality, robustness, and generaliza…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 6 pages, 3 figures, 3 tables. Accepted by Interspeech 2024. An extended version of the accepted manuscript with appendix

  24. arXiv:2406.04413  [pdf, other]

    cs.CV cs.AI

    Efficient 3D-Aware Facial Image Editing via Attribute-Specific Prompt Learning

    Authors: Amandeep Kumar, Muhammad Awais, Sanath Narayan, Hisham Cholakkal, Salman Khan, Rao Muhammad Anwer

    Abstract: Drawing upon StyleGAN's expressivity and disentangled latent space, existing 2D approaches employ textual prompting to edit facial images with different attributes. In contrast, 3D-aware approaches that generate faces at different target poses require attribute-specific classifiers, learning separate model weights for each attribute, and are not scalable for novel attributes. In this work, we prop…

    Submitted 6 June, 2024; originally announced June 2024.

  25. arXiv:2406.03747  [pdf, other]

    cs.CV cs.AI cs.LG

    Instance Segmentation and Teeth Classification in Panoramic X-rays

    Authors: Devichand Budagam, Ayush Kumar, Sayan Ghosh, Anuj Shrivastav, Azamat Zhanatuly Imanbayev, Iskander Rafailovich Akhmetov, Dmitrii Kaplun, Sergey Antonov, Artem Rychenkov, Gleb Cyganov, Aleksandr Sinitca

    Abstract: Teeth segmentation and recognition are critical in various dental applications and dental diagnosis. Automatic and accurate segmentation approaches have been made possible by integrating deep learning models. Although teeth segmentation has been studied in the past, only some techniques were able to effectively classify and segment teeth simultaneously. This article offers a pipeline of two deep l…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: submitted to Expert Systems with Applications Journal

  26. arXiv:2406.00724  [pdf, other]

    cs.HC cs.RO

    Exploring Child-Robot Interaction in Individual and Group settings in India

    Authors: Gayathri Manikutty, Sai Ankith Potapragada, Devasena Pasupuleti, Mahesh S. Unnithan, Arjun Venugopal, Pranav Prabha, Arunav H., Vyshnavi Anil Kumar, Rthuraj P. R., Rao R Bhavani

    Abstract: This study evaluates the effectiveness of child-robot interactions with the HaKsh-E social robot in India, examining both individual and group interaction settings. The research centers on game-based interactions designed to teach hand hygiene to children aged 7-11. Utilizing video analysis, rubric assessments, and post-study questionnaires, the study gathered data from 36 participants. Findings i…

    Submitted 4 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

    Comments: 6 pages, 6 figures, Accepted for presentation at ICRAS 2024 (https://www.icras.org/)

  27. arXiv:2406.00071  [pdf]

    astro-ph.IM astro-ph.SR cs.LG

    Optimizing Photometric Light Curve Analysis: Evaluating Scipy's Minimize Function for Eclipse Mapping of Cataclysmic Variables

    Authors: Anoop Kumar, Madan Mohan Tito Ayyalasomayajula, Dheerendra Panwar, Yeshwanth Vasa

    Abstract: With a particular focus on Scipy's minimize function the eclipse mapping method is thoroughly researched and implemented utilizing Python and essential libraries. Many optimization techniques are used, including Sequential Least Squares Programming (SLSQP), Nelder-Mead, and Conjugate Gradient (CG). However, for the purpose of examining photometric light curves these methods seek to solve the maxim…

    Submitted 30 May, 2024; originally announced June 2024.

  28. arXiv:2406.00010  [pdf, other]

    cs.IR cs.CL

    EnterpriseEM: Fine-tuned Embeddings for Enterprise Semantic Search

    Authors: Kamalkumar Rathinasamy, Jayarama Nettar, Amit Kumar, Vishal Manchanda, Arun Vijayakumar, Ayush Kataria, Venkateshprasanna Manjunath, Chidambaram GS, Jaskirat Singh Sodhi, Shoeb Shaikh, Wasim Akhtar Khan, Prashant Singh, Tanishq Dattatray Ige, Vipin Tiwari, Rajab Ali Mondal, Harshini K, S Reka, Chetana Amancharla, Faiz ur Rahman, Harikrishnan P A, Indraneel Saha, Bhavya Tiwary, Navin Shankar Patel, Pradeep T S, Balaji A J, et al. (2 additional authors not shown)

    Abstract: Enterprises grapple with the significant challenge of managing proprietary unstructured data, hindering efficient information retrieval. This has led to the emergence of AI-driven information retrieval solutions, designed to adeptly extract relevant insights to address employee inquiries. These solutions often leverage pre-trained embedding models and generative models as foundational components.…

    Submitted 18 May, 2024; originally announced June 2024.

    ACM Class: I.2.7

  29. arXiv:2405.20755  [pdf]

    cs.CL

    Improving code-mixed hate detection by native sample mixing: A case study for Hindi-English code-mixed scenario

    Authors: Debajyoti Mazumder, Aakash Kumar, Jasabanta Patro

    Abstract: Hate detection has long been a challenging task for the NLP community. The task becomes complex in a code-mixed environment because the models must understand the context and the hate expressed through language alteration. Compared to the monolingual setup, we see very less work on code-mixed hate as large-scale annotated hate corpora are unavailable to make the study. To overcome this bottleneck,…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: Generated from XeLaTeX

  30. arXiv:2405.20402  [pdf, other]

    eess.AS cs.SD eess.SP

    Cross-Talk Reduction

    Authors: Zhong-Qiu Wang, Anurag Kumar, Shinji Watanabe

    Abstract: While far-field multi-talker mixtures are recorded, each speaker can wear a close-talk microphone so that close-talk mixtures can be recorded at the same time. Although each close-talk mixture has a high signal-to-noise ratio (SNR) of the wearer, it has a very limited range of applications, as it also contains significant cross-talk speech by other speakers and is not clean enough. In this context…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: in International Joint Conference on Artificial Intelligence (IJCAI), 2024

  31. arXiv:2405.19815  [pdf, other]

    cs.AI cs.LG

    Efficient Stimuli Generation using Reinforcement Learning in Design Verification

    Authors: Deepak Narayan Gadde, Thomas Nalapat, Aman Kumar, Djones Lettnin, Wolfgang Kunz, Sebastian Simon

    Abstract: The increasing design complexity of System-on-Chips (SoCs) has led to significant verification challenges, particularly in meeting coverage targets within a timely manner. At present, coverage closure is heavily dependent on constrained random and coverage driven verification methodologies where the randomized stimuli are bounded to verify certain scenarios and to reach coverage goals. This proces…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted for publication at the 20th International Conference on Synthesis, Modeling, Analysis and Simulation Methods, and Applications to Circuit Design (SMACD'24), Jul 2-5 2024, Volos, Greece

  32. arXiv:2405.18304  [pdf, other]

    cs.CV

    Multi-modal Generation via Cross-Modal In-Context Learning

    Authors: Amandeep Kumar, Muzammal Naseer, Sanath Narayan, Rao Muhammad Anwer, Salman Khan, Hisham Cholakkal

    Abstract: In this work, we study the problem of generating novel images from complex multimodal prompt sequences. While existing methods achieve promising results for text-to-image generation, they often struggle to capture fine-grained details from lengthy prompts and maintain contextual coherence within prompt sequences. Moreover, they often result in misaligned image generation for prompt sequences featu…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Technical Report

  33. arXiv:2405.17401  [pdf, other]

    cs.LG cs.CV stat.ML

    RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control

    Authors: Litu Rout, Yujia Chen, Nataniel Ruiz, Abhishek Kumar, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu

    Abstract: We propose Reference-Based Modulation (RB-Modulation), a new plug-and-play solution for training-free personalization of diffusion models. Existing training-free approaches exhibit difficulties in (a) style extraction from reference images in the absence of additional style or content text descriptions, (b) unwanted content leakage from reference style images, and (c) effective composition of styl…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Preprint. Under review

  34. arXiv:2405.16282  [pdf, other]

    cs.CL cs.AI cs.LG

    Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models

    Authors: Abhishek Kumar, Robert Morabito, Sanzhar Umbet, Jad Kabbara, Ali Emami

    Abstract: As the use of Large Language Models (LLMs) becomes more widespread, understanding their self-evaluation of confidence in generated responses becomes increasingly important as it is integral to the reliability of the output of these models. We introduce the concept of Confidence-Probability Alignment, that connects an LLM's internal confidence, quantified by token probabilities, to the confidence c…

    Submitted 15 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

    Comments: 9 pages (excluding references), accepted to ACL 2024 Main Conference

  35. arXiv:2405.14555  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Subtle Biases Need Subtler Measures: Dual Metrics for Evaluating Representative and Affinity Bias in Large Language Models

    Authors: Abhishek Kumar, Sarfaroz Yunusov, Ali Emami

    Abstract: Research on Large Language Models (LLMs) has often neglected subtle biases that, although less apparent, can significantly influence the models' outputs toward particular social narratives. This study addresses two such biases within LLMs: representative bias, which denotes a tendency of LLMs to generate outputs that mirror the experiences of certain identity groups, and affinity bias, reflecting…

    Submitted 3 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: 9 pages (excluding references), accepted to ACL 2024 Main Conference

  36. arXiv:2405.09288  [pdf, other]

    cs.CV

    DeCoDEx: Confounder Detector Guidance for Improved Diffusion-based Counterfactual Explanations

    Authors: Nima Fathi, Amar Kumar, Brennan Nichyporuk, Mohammad Havaei, Tal Arbel

    Abstract: Deep learning classifiers are prone to latching onto dominant confounders present in a dataset rather than on the causal markers associated with the target class, leading to poor generalization and biased predictions. Although explainability via counterfactual image generation has been successful at exposing the problem, bias mitigation strategies that permit accurate explainability in the presenc…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted to Medical Imaging with Deep Learning (MIDL) 2024

  37. arXiv:2405.08015  [pdf, other]

    cs.LG cs.AI

    A Methodology-Oriented Study of Catastrophic Forgetting in Incremental Deep Neural Networks

    Authors: Ashutosh Kumar, Sonali Agarwal, D Jude Hemanth

    Abstract: Human being and different species of animals having the skills to gather, transferring knowledge, processing, fine-tune and generating information throughout their lifetime. The ability of learning throughout their lifespan is referred as continuous learning which is using neurocognition mechanism. Consequently, in real world computational system of incremental learning autonomous agents also need…

    Submitted 11 May, 2024; originally announced May 2024.

  38. arXiv:2405.07079  [pdf, other]

    cs.SE

    Host-Based Allocators for Device Memory

    Authors: Oren Bell, Ashwin Kumar, Chris Gill

    Abstract: Memory allocation is a fairly mature field of computer science. However, we challenge a prevailing assumption in the literature over the last 50 years which, if reconsidered, necessitates a fundamental reevaluation of many classical memory management algorithms. We pose a model where the allocation algorithm runs on host memory but allocates device memory and so incur the following constraint: the…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figures

  39. arXiv:2405.05243  [pdf, other]

    quant-ph cs.LG physics.comp-ph

    Deep learning-based variational autoencoder for classification of quantum and classical states of light

    Authors: Mahesh Bhupati, Abhishek Mall, Anshuman Kumar, Pankaj K. Jha

    Abstract: Advancements in optical quantum technologies have been enabled by the generation, manipulation, and characterization of light, with identification based on its photon statistics. However, characterizing light and its sources through single photon measurements often requires efficient detectors and longer measurement times to obtain high-quality photon statistics. Here we introduce a deep learning-…

    Submitted 8 May, 2024; originally announced May 2024.

  40. arXiv:2405.03948  [pdf, other]

    cs.IR cs.HC

    The Fault in Our Recommendations: On the Perils of Optimizing the Measurable

    Authors: Omar Besbes, Yash Kanoria, Akshit Kumar

    Abstract: Recommendation systems are widespread, and through customized recommendations, promise to match users with options they will like. To that end, data on engagement is collected and used. Most recommendation systems are ranking-based, where they rank and recommend items based on their predicted engagement. However, the engagement signals are often only a crude proxy for utility, as data on the latte…

    Submitted 6 May, 2024; originally announced May 2024.

  41. arXiv:2405.03005  [pdf, other]

    cs.LG cs.AI

    Safe Reinforcement Learning with Learned Non-Markovian Safety Constraints

    Authors: Siow Meng Low, Akshat Kumar

    Abstract: In safe Reinforcement Learning (RL), safety cost is typically defined as a function dependent on the immediate state and actions. In practice, safety constraints can often be non-Markovian due to the insufficient fidelity of state representation, and safety cost may not be known. We therefore address a general setting where safety labels (e.g., safe or unsafe) are associated with state-action traj…

    Submitted 5 May, 2024; originally announced May 2024.

  42. arXiv:2405.01572  [pdf, other]

    cs.SE cs.AI cs.AR

    A Semi-Formal Verification Methodology for Efficient Configuration Coverage of Highly Configurable Digital Designs

    Authors: Aman Kumar, Sebastian Simon

    Abstract: Nowadays, a majority of System-on-Chips (SoCs) make use of Intellectual Property (IP) in order to shorten development cycles. When such IPs are developed, one of the main focuses lies in the high configurability of the design. This flexibility on the design side introduces the challenge of covering a huge state space of IP configurations on the verification side to ensure the functional correctnes…

    Submitted 20 April, 2024; originally announced May 2024.

    Comments: Published in DVCon U.S. 2021

  43. arXiv:2405.01040  [pdf, other]

    cs.CV cs.CL eess.IV

    Few Shot Class Incremental Learning using Vision-Language models

    Authors: Anurag Kumar, Chinmay Bharti, Saikat Dutta, Srikrishna Karanam, Biplab Banerjee

    Abstract: Recent advancements in deep learning have demonstrated remarkable performance comparable to human capabilities across various supervised computer vision tasks. However, the prevalent assumption of having an extensive pool of training data encompassing all classes prior to model training often diverges from real-world scenarios, where limited data availability for novel classes is the norm. The cha…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: under review at Pattern Recognition Letters

  44. arXiv:2405.00130  [pdf, other]

    eess.IV cs.CV cs.LG

    A Flexible 2.5D Medical Image Segmentation Approach with In-Slice and Cross-Slice Attention

    Authors: Amarjeet Kumar, Hongxu Jiang, Muhammad Imran, Cyndi Valdes, Gabriela Leon, Dahyun Kang, Parvathi Nataraj, Yuyin Zhou, Michael D. Weiss, Wei Shao

    Abstract: Deep learning has become the de facto method for medical image segmentation, with 3D segmentation models excelling in capturing complex 3D structures and 2D models offering high computational efficiency. However, segmenting 2.5D images, which have high in-plane but low through-plane resolution, is a relatively unexplored challenge. While applying 2D models to individual slices of a 2.5D image is f…

    Submitted 30 April, 2024; originally announced May 2024.

  45. arXiv:2404.18270  [pdf, other]

    cs.AI cs.LO

    Pragmatic Formal Verification of Sequential Error Detection and Correction Codes (ECCs) used in Safety-Critical Design

    Authors: Aman Kumar

    Abstract: Error Detection and Correction Codes (ECCs) are often used in digital designs to protect data integrity. Especially in safety-critical systems such as automotive electronics, ECCs are widely used and the verification of such complex logic becomes more critical considering the ISO 26262 safety standards. Exhaustive verification of ECC using formal methods has been a challenge given the high number…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: Published in DVCon U.S. 2023

  46. arXiv:2404.16896  [pdf, other]

    cs.GR cs.LG

    A Neural-Network-Based Approach for Loose-Fitting Clothing

    Authors: Yongxu Jin, Dalton Omens, Zhenglin Geng, Joseph Teran, Abishek Kumar, Kenji Tashiro, Ronald Fedkiw

    Abstract: Since loose-fitting clothing contains dynamic modes that have proven to be difficult to predict via neural networks, we first illustrate how to coarsely approximate these modes with a real-time numerical algorithm specifically designed to mimic the most important ballistic features of a classical numerical simulation. Although there is some flexibility in the choice of the numerical algorithm used…

    Submitted 25 April, 2024; originally announced April 2024.

  47. arXiv:2404.16893  [pdf, other]

    cs.LG cs.AI cs.RO

    Automatic AI controller that can drive with confidence: steering vehicle with uncertainty knowledge

    Authors: Neha Kumari, Sumit Kumar, Sneha Priya, Ayush Kumar, Akash Fogla

    Abstract: In safety-critical systems that interface with the real world, the role of uncertainty in decision-making is pivotal, particularly in the context of machine learning models. For the secure functioning of Cyber-Physical Systems (CPS), it is imperative to manage such uncertainty adeptly. In this research, we focus on the development of a vehicle's lateral control system using a machine learning fram…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2303.08187

  48. arXiv:2404.16060  [pdf]

    cs.HC physics.ed-ph physics.optics

    Pocket Schlieren: a background oriented schlieren imaging platform on a smartphone

    Authors: Diganta Rabha, Vimod Kumar, Akshay Kumar, Dinesh Saini, Manish Kumar

    Abstract: Background-oriented schlieren (BOS) is a powerful technique for flow visualization. Nevertheless, the widespread dissemination of BOS is impeded by its dependence on scientific cameras, computing hardware, and dedicated analysis software. In this work, we aim to democratize BOS by providing a smartphone based scientific tool called "Pocket Schlieren". Pocket Schlieren enables users to directly cap…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 24 pages, 6 figures, 4 Supplementary figures

  49. arXiv:2404.15371  [pdf, other]

    eess.SP cs.AI

    Efficient Verification of a RADAR SoC Using Formal and Simulation-Based Methods

    Authors: Aman Kumar, Mark Litterick, Samuele Candido

    Abstract: As the demand for Internet of Things (IoT) and Human-to-Machine Interaction (HMI) increases, modern System-on-Chips (SoCs) offering such solutions are becoming increasingly complex. This intricate design poses significant challenges for verification, particularly when time-to-market is a crucial factor for consumer electronics products. This paper presents a case study based on our work to verify…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: Published in DVCon Europe 2023

  50. arXiv:2404.14372  [pdf, other]

    cs.CL cs.AI

    Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph

    Authors: Xiaochen Kev Gao, Feng Yao, Kewen Zhao, Beilei He, Animesh Kumar, Vish Krishnan, Jingbo Shang

    Abstract: Model scaling is becoming the default choice for many language tasks due to the success of large language models (LLMs). However, it can fall short in specific scenarios where simple customized methods excel. In this paper, we delve into the patent approval prediction task and unveil that simple domain-specific graph methods outperform enlarging the model, using the intrinsic dependencies within…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 17 Pages, Under Review