
Showing 1–50 of 451 results for author: Jain, A

Searching in archive cs.
  1. arXiv:2406.15680  [pdf, other]

    cs.GT econ.TH

    Calibrated Forecasting and Persuasion

    Authors: Atulya Jain, Vianney Perchet

    Abstract: How should an expert send forecasts to maximize her utility subject to passing a calibration test? We consider a dynamic game where an expert sends probabilistic forecasts to a decision maker. The decision maker uses a calibration test based on past outcomes to verify the expert's forecasts. We characterize the optimal forecasting strategy by reducing the dynamic game to a static persuasion proble…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: The conference version of this work has been accepted to the Twenty-Fifth ACM Conference on Economics and Computation (EC'24)

  2. arXiv:2406.13031  [pdf, other]

    cs.CV

    A machine learning pipeline for automated insect monitoring

    Authors: Aditya Jain, Fagner Cunha, Michael Bunsen, Léonard Pasi, Anna Viklund, Maxim Larrivée, David Rolnick

    Abstract: Climate change and other anthropogenic factors have led to a catastrophic decline in insects, endangering both biodiversity and the ecosystem services on which human society depends. Data on insect abundance, however, remains woefully inadequate. Camera traps, conventionally used for monitoring terrestrial vertebrates, are now being modified for insects, especially moths. We describe a complete, o…

    Submitted 18 June, 2024; originally announced June 2024.

    Journal ref: NeurIPS 2023 Workshop on Tackling Climate Change with Machine Learning

  3. arXiv:2406.12452  [pdf, other]

    cs.CV cs.AI cs.LG

    Insect Identification in the Wild: The AMI Dataset

    Authors: Aditya Jain, Fagner Cunha, Michael James Bunsen, Juan Sebastián Cañas, Léonard Pasi, Nathan Pinoy, Flemming Helsing, JoAnne Russo, Marc Botham, Michael Sabourin, Jonathan Fréchette, Alexandre Anctil, Yacksecari Lopez, Eduardo Navarro, Filonila Perez Pimentel, Ana Cecilia Zamora, José Alejandro Ramirez Silva, Jonathan Gagnon, Tom August, Kim Bjerge, Alba Gomez Segura, Marc Bélisle, Yves Basset, Kent P. McFarland, David Roy, et al. (3 additional authors not shown)

    Abstract: Insects represent half of all global biodiversity, yet many of the world's insects are disappearing, with severe implications for ecosystems and agriculture. Despite this crisis, data on insect diversity and abundance remain woefully inadequate, due to the scarcity of human experts and the lack of scalable tools for monitoring. Ecologists have started to adopt camera traps to record and study inse…

    Submitted 18 June, 2024; originally announced June 2024.

  4. arXiv:2406.11500  [pdf, other]

    eess.SP cs.HC

    ESI-GAL: EEG Source Imaging-based Kinematics Parameter Estimation for Grasp and Lift Task

    Authors: Anant Jain, Lalan Kumar

    Abstract: Objective: Electroencephalogram (EEG) signals-based motor kinematics prediction (MKP) has been an active area of research to develop brain-computer interface (BCI) systems such as exosuits, prostheses, and rehabilitation devices. However, EEG source imaging (ESI) based kinematics prediction is sparsely explored in the literature. Approach: In this study, pre-movement EEG features are utilized to p…

    Submitted 18 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  5. arXiv:2406.08431  [pdf, other]

    cs.CV cs.AI cs.CR cs.LG

    Diffusion Soup: Model Merging for Text-to-Image Diffusion Models

    Authors: Benjamin Biggs, Arjun Seshadri, Yang Zou, Achin Jain, Aditya Golatkar, Yusheng Xie, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto

    Abstract: We present Diffusion Soup, a compartmentalization method for Text-to-Image Generation that averages the weights of diffusion models trained on sharded data. By construction, our approach enables training-free continual learning and unlearning with no additional memory or inference costs, since models corresponding to data shards can be added or removed by re-averaging. We show that Diffusion Soup…

    Submitted 12 June, 2024; originally announced June 2024.

  6. arXiv:2406.06808  [pdf, ps, other]

    cs.DS cs.LG

    Fast White-Box Adversarial Streaming Without a Random Oracle

    Authors: Ying Feng, Aayush Jain, David P. Woodruff

    Abstract: Recently, the question of adversarially robust streaming, where the stream is allowed to depend on the randomness of the streaming algorithm, has gained a lot of attention. In this work, we consider a strong white-box adversarial model (Ajtai et al. PODS 2022), in which the adversary has access to all past random coins and the parameters used by the streaming algorithm. We focus on the sparse reco…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  7. arXiv:2406.00287  [pdf, other]

    cs.CV cs.AI

    GenPalm: Contactless Palmprint Generation with Diffusion Models

    Authors: Steven A. Grosz, Anil K. Jain

    Abstract: The scarcity of large-scale palmprint databases poses a significant bottleneck to advancements in contactless palmprint recognition. To address this, researchers have turned to synthetic data generation. While Generative Adversarial Networks (GANs) have been widely used, they suffer from instability and mode collapse. Recently, diffusion probabilistic models have emerged as a promising alternative…

    Submitted 31 May, 2024; originally announced June 2024.

  8. arXiv:2406.00237  [pdf, other]

    eess.IV cs.CV cs.LG

    A Comparative Study of CNN, ResNet, and Vision Transformers for Multi-Classification of Chest Diseases

    Authors: Ananya Jain, Aviral Bhardwaj, Kaushik Murali, Isha Surani

    Abstract: Large language models, notably utilizing Transformer architectures, have emerged as powerful tools due to their scalability and ability to process large amounts of data. Dosovitskiy et al. expanded this architecture to introduce Vision Transformers (ViT), extending its applicability to image processing tasks. Motivated by this advancement, we fine-tuned two variants of ViT models, one pre-trained…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: 8 pages, 6 figures

  9. arXiv:2405.18296  [pdf, other]

    cs.LG cond-mat.dis-nn stat.ML

    Bias in Motion: Theoretical Insights into the Dynamics of Bias in SGD Training

    Authors: Anchit Jain, Rozhin Nobahari, Aristide Baratin, Stefano Sarao Mannelli

    Abstract: Machine learning systems often acquire biases by leveraging undesired features in the data, impacting accuracy variably across different sub-populations. Current understanding of bias formation mostly focuses on the initial and final stages of learning, leaving a gap in knowledge regarding the transient dynamics. To address this gap, this paper explores the evolution of bias in a teacher-student s…

    Submitted 28 May, 2024; originally announced May 2024.

  10. arXiv:2405.15282  [pdf, other]

    cs.LG cs.AI

    Prompt Tuning Strikes Back: Customizing Foundation Models with Low-Rank Prompt Adaptation

    Authors: Abhinav Jain, Swarat Chaudhuri, Thomas Reps, Chris Jermaine

    Abstract: Parameter-Efficient Fine-Tuning (PEFT) has become the standard for customising Foundation Models (FMs) to user-specific downstream tasks. However, typical PEFT methods require storing multiple task-specific adapters, creating scalability issues as these adapters must be housed and run at the FM server. Traditional prompt tuning offers a potential solution by customising them through task-specific…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 11 pages, 4 figures, 3 tables

  11. arXiv:2405.13930  [pdf, other]

    cond-mat.mtrl-sci cs.RO cs.SE

    AlabOS: A Python-based Reconfigurable Workflow Management Framework for Autonomous Laboratories

    Authors: Yuxing Fei, Bernardus Rendy, Rishi Kumar, Olympia Dartsi, Hrushikesh P. Sahasrabuddhe, Matthew J. McDermott, Zheren Wang, Nathan J. Szymanski, Lauren N. Walters, David Milsted, Yan Zeng, Anubhav Jain, Gerbrand Ceder

    Abstract: The recent advent of autonomous laboratories, coupled with algorithms for high-throughput screening and active learning, promises to accelerate materials discovery and innovation. As these autonomous systems grow in complexity, the demand for robust and efficient workflow management software becomes increasingly critical. In this paper, we introduce AlabOS, a general-purpose software framework for…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 30 pages, 5 figures

  12. arXiv:2405.12842  [pdf, other]

    cs.RO cs.CV

    SmartFlow: Robotic Process Automation using LLMs

    Authors: Arushi Jain, Shubham Paliwal, Monika Sharma, Lovekesh Vig, Gautam Shroff

    Abstract: Robotic Process Automation (RPA) systems face challenges in handling complex processes and diverse screen layouts that require advanced human-like decision-making capabilities. These systems typically rely on pixel-level encoding through drag-and-drop or automation frameworks such as Selenium to create navigation workflows, rather than visual understanding of screen elements. In this context, we p…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 32nd ACM International Conference on Information and Knowledge Management

  13. arXiv:2405.12742  [pdf, other]

    cs.CV

    Multi-Subject Personalization

    Authors: Arushi Jain, Shubham Paliwal, Monika Sharma, Vikram Jamwal, Lovekesh Vig

    Abstract: Creative story illustration requires a consistent interplay of multiple characters or objects. However, conventional text-to-image models face significant challenges while producing images featuring multiple personalized subjects. For example, they distort the subject rendering, or the text descriptions fail to render coherent subject interactions. We present Multi-Subject Personalization (MSP) to…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 2023 Conference on Neural Information Processing Systems

  14. arXiv:2405.12531  [pdf, other]

    cs.CV cs.LG

    CustomText: Customized Textual Image Generation using Diffusion Models

    Authors: Shubham Paliwal, Arushi Jain, Monika Sharma, Vikram Jamwal, Lovekesh Vig

    Abstract: Textual image generation spans diverse fields like advertising, education, product packaging, social media, information visualization, and branding. Despite recent strides in language-guided image synthesis using diffusion models, current models excel in image generation but struggle with accurate text rendering and offer limited control over font attributes. In this paper, we aim to enhance the s…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Accepted by AI for Content Creation (AI4CC) workshop at CVPR 2024

  15. arXiv:2405.07838  [pdf, other]

    cs.LG cs.AI

    Adaptive Exploration for Data-Efficient General Value Function Evaluations

    Authors: Arushi Jain, Josiah P. Hanna, Doina Precup

    Abstract: General Value Functions (GVFs) (Sutton et al, 2011) are an established way to represent predictive knowledge in reinforcement learning. Each GVF computes the expected return for a given policy, based on a unique pseudo-reward. Multiple GVFs can be estimated in parallel using off-policy learning from a single stream of data, often sourced from a fixed behavior policy or pre-collected dataset. This…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 20 pages, 9 figures, Under Review

  16. arXiv:2405.07417  [pdf, other]

    cs.SI eess.SP

    Identifying Hate Speech Peddlers in Online Platforms. A Bayesian Social Learning Approach for Large Language Model Driven Decision-Makers

    Authors: Adit Jain, Vikram Krishnamurthy

    Abstract: This paper studies the problem of autonomous agents performing Bayesian social learning for sequential detection when the observations of the state belong to a high-dimensional space and are expensive to analyze. Specifically, when the observations are textual, the Bayesian agent can use a large language model (LLM) as a map to get a low-dimensional private observation. The agent performs Bayesian…

    Submitted 12 May, 2024; originally announced May 2024.

  17. arXiv:2405.07415  [pdf, ps, other]

    cs.LG eess.SY

    Structured Reinforcement Learning for Incentivized Stochastic Covert Optimization

    Authors: Adit Jain, Vikram Krishnamurthy

    Abstract: This paper studies how a stochastic gradient algorithm (SG) can be controlled to hide the estimate of the local stationary point from an eavesdropper. Such problems are of significant interest in distributed optimization settings like federated learning and inventory management. A learner queries a stochastic oracle and incentivizes the oracle to obtain noisy gradient measurements and perform SG.…

    Submitted 12 May, 2024; originally announced May 2024.

  18. arXiv:2405.06989  [pdf, other]

    cs.RO eess.SY

    Stabilizing Circular Motion Within Nonconcentric Circular Boundary: A Mobius Transformation-Based Approach

    Authors: Shubham Singh, Anoop Jain

    Abstract: Nonuniform motion constraints are ubiquitous in robotic applications. Geofencing control is one such paradigm where the motion of a robot must be constrained within a predefined boundary. This paper addresses the problem of stabilizing a unicycle robot around a desired circular orbit while confining its motion within a nonconcentric external circular boundary. Our solution approach relies on the c…

    Submitted 11 May, 2024; originally announced May 2024.

  19. arXiv:2405.04030  [pdf, other]

    cs.RO

    Uncovering implementable dormant pruning decisions from three different stakeholder perspectives

    Authors: Deanna Flynn, Abhinav Jain, Heather Knight, Cristina G. Wilson, Cindy Grimm

    Abstract: Dormant pruning, or the removal of unproductive portions of a tree while a tree is not actively growing, is an important orchard task to help maintain yield, requiring years to build expertise. Because of long training periods and an increasing labor shortage in agricultural jobs, pruning could benefit from robotic automation. However, to program robots to prune branches, we first need to understa…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 36 pages; 21 figures

  20. arXiv:2405.01734  [pdf, other]

    cs.CV cs.AI

    Diabetic Retinopathy Detection Using Quantum Transfer Learning

    Authors: Ankush Jain, Rinav Gupta, Jai Singhal

    Abstract: Diabetic Retinopathy (DR), a prevalent complication in diabetes patients, can lead to vision impairment due to lesions formed on the retina. Detecting DR at an advanced stage often results in irreversible blindness. The traditional process of diagnosing DR through retina fundus images by ophthalmologists is not only time-intensive but also expensive. While classical transfer learning models have b…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 14 pages, 12 figures and 5 tables

  21. arXiv:2404.18890  [pdf, other]

    cs.CV

    Hide and Seek: How Does Watermarking Impact Face Recognition?

    Authors: Yuguang Yao, Steven Grosz, Sijia Liu, Anil Jain

    Abstract: The recent progress in generative models has revolutionized the synthesis of highly realistic images, including face images. This technological development has undoubtedly helped face recognition, such as training data augmentation for higher recognition accuracy and data privacy. However, it has also introduced novel challenges concerning the responsible use and proper attribution of computer gen…

    Submitted 29 April, 2024; originally announced April 2024.

  22. arXiv:2404.17627  [pdf, other]

    cs.MA cs.CE cs.ET eess.SY

    Impact of Traffic-Following on Order of Autonomous Airspace Operations

    Authors: Anahita Jain, Husni R. Idris, John-Paul Clarke

    Abstract: In this paper, we investigate the dynamic emergence of traffic order in a distributed multi-agent system, aiming to minimize inefficiencies that stem from unnecessary structural impositions. We introduce a methodology for developing a dynamically-updating traffic pattern map of the airspace by leveraging information about the consistency and frequency of flow directions used by current as well as…

    Submitted 3 June, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

  23. arXiv:2404.17390  [pdf, other]

    cs.HC cs.AI

    How Could AI Support Design Education? A Study Across Fields Fuels Situating Analytics

    Authors: Ajit Jain, Andruid Kerne, Hannah Fowler, Jinsil Seo, Galen Newman, Nic Lupfer, Aaron Perrine

    Abstract: We use the process and findings from a case study of design educators' practices of assessment and feedback to fuel theorizing about how to make AI useful in service of human experience. We build on Suchman's theory of situated actions. We perform a qualitative study of 11 educators in 5 fields, who teach design processes situated in project-based learning contexts. Through qualitative data gather…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: 31 pages, 3 figures, Submitted to ACM

    ACM Class: H.5.2

  24. arXiv:2404.13791  [pdf, other]

    cs.CV cs.AI

    Universal Fingerprint Generation: Controllable Diffusion Model with Multimodal Conditions

    Authors: Steven A. Grosz, Anil K. Jain

    Abstract: The utilization of synthetic data for fingerprint recognition has garnered increased attention due to its potential to alleviate privacy concerns surrounding sensitive biometric data. However, current methods for generating fingerprints have limitations in creating impressions of the same finger with useful intra-class variations. To tackle this challenge, we present GenPrint, a framework to produ…

    Submitted 21 April, 2024; originally announced April 2024.

  25. arXiv:2404.06645  [pdf, other]

    cs.RO cs.AI

    GenCHiP: Generating Robot Policy Code for High-Precision and Contact-Rich Manipulation Tasks

    Authors: Kaylee Burns, Ajinkya Jain, Keegan Go, Fei Xia, Michael Stark, Stefan Schaal, Karol Hausman

    Abstract: Large Language Models (LLMs) have been successful at generating robot policy code, but so far these results have been limited to high-level tasks that do not require precise movement. It is an open question how well such approaches work for tasks that require reasoning over contact forces and working within tight success tolerances. We find that, with the right action space, LLMs are capable of su…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 14 pages, 12 figures

    ACM Class: I.2.9

  26. arXiv:2404.05870  [pdf, other]

    cs.RO

    CoBT: Collaborative Programming of Behaviour Trees from One Demonstration for Robot Manipulation

    Authors: Aayush Jain, Philip Long, Valeria Villani, John D. Kelleher, Maria Chiara Leva

    Abstract: Mass customization and shorter manufacturing cycles are becoming more important among small and medium-sized companies. However, classical industrial robots struggle to cope with product variation and dynamic environments. In this paper, we present CoBT, a collaborative programming by demonstration framework for generating reactive and modular behavior trees. CoBT relies on a single demonstration…

    Submitted 10 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted for presentation at IEEE ICRA 2024

  27. arXiv:2404.05417  [pdf, other]

    cs.HC cs.AI cs.CY

    Indexing Analytics to Instances: How Integrating a Dashboard can Support Design Education

    Authors: Ajit Jain, Andruid Kerne, Nic Lupfer, Gabriel Britain, Aaron Perrine, Yoonsuck Choe, John Keyser, Ruihong Huang, Jinsil Seo, Annie Sungkajun, Robert Lightfoot, Timothy McGuire

    Abstract: We investigate how to use AI-based analytics to support design education. The analytics at hand measure multiscale design, that is, students' use of space and scale to visually and conceptually organize their design work. With the goal of making the analytics intelligible to instructors, we developed a research artifact integrating a design analytics dashboard with design instances, and the design…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 22 pages, 4 figures, Submitted to ACM DIS

    ACM Class: H.5.2

  28. arXiv:2403.14852  [pdf, other]

    cs.CV

    KeyPoint Relative Position Encoding for Face Recognition

    Authors: Minchul Kim, Yiyang Su, Feng Liu, Anil Jain, Xiaoming Liu

    Abstract: In this paper, we address the challenge of making ViT models more robust to unseen affine transformations. Such robustness becomes useful in various recognition tasks such as face recognition when image alignment failures occur. We propose a novel method called KP-RPE, which leverages key points (e.g., facial landmarks) to make ViT more resilient to scale, translation, and pose variations. We begin…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: To appear in CVPR2024

  29. arXiv:2403.12945  [pdf, other]

    cs.RO

    DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset

    Authors: Alexander Khazatsky, Karl Pertsch, Suraj Nair, Ashwin Balakrishna, Sudeep Dasari, Siddharth Karamcheti, Soroush Nasiriany, Mohan Kumar Srirama, Lawrence Yunliang Chen, Kirsty Ellis, Peter David Fagan, Joey Hejna, Masha Itkina, Marion Lepert, Yecheng Jason Ma, Patrick Tree Miller, Jimmy Wu, Suneel Belkhale, Shivin Dass, Huy Ha, Arhan Jain, Abraham Lee, Youngwoon Lee, Marius Memmel, Sungjae Park, et al. (74 additional authors not shown)

    Abstract: The creation of large, diverse, high-quality robot manipulation datasets is an important stepping stone on the path toward more capable and robust robotic manipulation policies. However, creating such datasets is challenging: collecting robot manipulation data in diverse environments poses logistical and safety challenges and requires substantial investments in hardware and human labour. As a resu…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Project website: https://droid-dataset.github.io/

  30. arXiv:2403.12267  [pdf, other]

    cs.CV cs.LG

    Data-Efficient Contrastive Language-Image Pretraining: Prioritizing Data Quality over Quantity

    Authors: Siddharth Joshi, Arnav Jain, Ali Payani, Baharan Mirzasoleiman

    Abstract: Contrastive Language-Image Pre-training (CLIP) on large-scale image-caption datasets learns representations that can achieve remarkable zero-shot generalization. However, such models require a massive amount of pre-training data. Improving the quality of the pre-training data has been shown to be much more effective in improving CLIP's performance than increasing its volume. Nevertheless, finding…

    Submitted 19 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: AISTATS 2024, Code: https://github.com/BigML-CS-UCLA/clipcov-data-efficient-clip

  31. arXiv:2403.12047  [pdf, other]

    cs.CV

    Alpha-wolves and Alpha-mammals: Exploring Dictionary Attacks on Iris Recognition Systems

    Authors: Sudipta Banerjee, Anubhav Jain, Zehua Jiang, Nasir Memon, Julian Togelius, Arun Ross

    Abstract: A dictionary attack in a biometric system entails the use of a small number of strategically generated images or templates to successfully match with a large number of identities, thereby compromising security. We focus on dictionary attacks at the template level, specifically the IrisCodes used in iris recognition systems. We present a hitherto unknown vulnerability wherein we mix IrisCodes usin…

    Submitted 20 November, 2023; originally announced March 2024.

    Comments: 8 pages, 5 figures, 13 tables, Workshop on Manipulation, Adversarial, and Presentation Attacks in Biometrics, Winter Conference on Applications of Computer Vision

  32. arXiv:2403.10955  [pdf, other]

    cs.RO

    Agonist-Antagonist Pouch Motors: Bidirectional Soft Actuators Enhanced by Thermally Responsive Peltier Elements

    Authors: Trevor Exley, Rashmi Wijesundara, Nathan Tan, Akshay Sunkara, Xinyu He, Shuopu Wang, Bonnie Chan, Aditya Jain, Luis Espinosa, Amir Jafari

    Abstract: In this study, we introduce a novel Mylar-based pouch motor design that leverages the reversible actuation capabilities of Peltier junctions to enable agonist-antagonist muscle mimicry in soft robotics. Addressing the limitations of traditional silicone-based materials, such as leakage and phase-change fluid degradation, our pouch motors filled with Novec 7000 provide a durable and leak-proof solu…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: submitted to IROS 2024, 7 pages, 9 figures

  33. arXiv:2403.09611  [pdf, other]

    cs.CV cs.CL cs.LG

    MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

    Authors: Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Sam Dodge, Bowen Zhang, Philipp Dufter, Dhruti Shah, Xianzhi Du, Futang Peng, Floris Weers, Anton Belyi, Haotian Zhang, Karanjeet Singh, Doug Kang, Ankur Jain, Hongyu Hè, Max Schwarzer, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman, et al. (7 additional authors not shown)

    Abstract: In this work, we discuss building performant Multimodal Large Language Models (MLLMs). In particular, we study the importance of various architecture components and data choices. Through careful and comprehensive ablations of the image encoder, the vision language connector, and various pre-training data choices, we identified several crucial design lessons. For example, we demonstrate that for la…

    Submitted 18 April, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  34. arXiv:2403.03998  [pdf, other]

    cs.CR

    OpenVPN is Open to VPN Fingerprinting

    Authors: Diwen Xue, Reethika Ramesh, Arham Jain, Michalis Kallitsis, J. Alex Halderman, Jedidiah R. Crandall, Roya Ensafi

    Abstract: VPN adoption has seen steady growth over the past decade due to increased public awareness of privacy and surveillance threats. In response, certain governments are attempting to restrict VPN access by identifying connections using "dual use" DPI technology. To investigate the potential for VPN blocking, we develop mechanisms for accurately fingerprinting connections using OpenVPN, the most popula…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: In: USENIX Security Symposium 2022 (USENIX Security '22)

    Journal ref: 31st USENIX Security Symposium (USENIX Security 22). 2022

  35. arXiv:2403.02709  [pdf, other]

    cs.RO

    RT-Sketch: Goal-Conditioned Imitation Learning from Hand-Drawn Sketches

    Authors: Priya Sundaresan, Quan Vuong, Jiayuan Gu, Peng Xu, Ted Xiao, Sean Kirmani, Tianhe Yu, Michael Stark, Ajinkya Jain, Karol Hausman, Dorsa Sadigh, Jeannette Bohg, Stefan Schaal

    Abstract: Natural language and images are commonly used as goal representations in goal-conditioned imitation learning (IL). However, natural language can be ambiguous and images can be over-specified. In this work, we propose hand-drawn sketches as a modality for goal specification in visual imitation learning. Sketches are easy for users to provide on the fly like language, but similar to images they can…

    Submitted 5 March, 2024; originally announced March 2024.

  36. arXiv:2403.01410  [pdf, other]

    cs.RO

    Barrier Functions Inspired Reward Shaping for Reinforcement Learning

    Authors: Nilaksh Nilaksh, Abhishek Ranjan, Shreenabh Agrawal, Aayush Jain, Pushpak Jagtap, Shishir Kolathaya

    Abstract: Reinforcement Learning (RL) has progressed from simple control tasks to complex real-world challenges with large state spaces. While RL excels in these tasks, training time remains a limitation. Reward shaping is a popular solution, but existing methods often rely on value functions, which face scalability issues. This paper presents a novel safety-oriented reward-shaping framework inspired by bar…

    Submitted 1 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: 7 pages, 10 figures, Accepted as contributed paper at ICRA 2024

    ACM Class: I.2.9

  37. arXiv:2403.01248  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code

    Authors: Ziniu Hu, Ahmet Iscen, Aashi Jain, Thomas Kipf, Yisong Yue, David A. Ross, Cordelia Schmid, Alireza Fathi

    Abstract: This paper introduces SceneCraft, a Large Language Model (LLM) Agent converting text descriptions into Blender-executable Python scripts which render complex scenes with up to a hundred 3D assets. This process requires complex spatial planning and arrangement. We tackle these challenges through a combination of advanced abstraction, strategic planning, and library learning. SceneCraft first models…

    Submitted 2 March, 2024; originally announced March 2024.

  38. arXiv:2402.17101  [pdf, other]

    cs.CV cs.AI

    T-HITL Effectively Addresses Problematic Associations in Image Generation and Maintains Overall Visual Quality

    Authors: Susan Epstein, Li Chen, Alessandro Vecchiato, Ankit Jain

    Abstract: Generative AI image models may inadvertently generate problematic representations of people. Past research has noted that millions of users engage daily across the world with these models and that the models, including through problematic representations of people, have the potential to compound and accelerate real-world discrimination and other harms (Bianchi et al, 2023). In this paper, we focus…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: 11 pages, 8 figures

    MSC Class: I.I.2 ACM Class: I.2.1

  39. arXiv:2402.07851  [pdf, other]

    cs.LG

    Comparing skill of historical rainfall data based monsoon rainfall prediction in India with NCEP-NWP forecasts

    Authors: Apoorva Narula, Aastha Jain, Jatin Batra, Sandeep Juneja

    Abstract: In this draft we consider the problem of forecasting rainfall across India during the four monsoon months, one day as well as three days in advance. We train neural networks using historical daily gridded precipitation data for India obtained from IMD for the time period 1901-2022, at a spatial resolution of $1^{\circ} \times 1^{\circ}$. This is compared with the numerical weather prediction (N…

    Submitted 12 February, 2024; originally announced February 2024.

  40. arXiv:2402.06559  [pdf, other]

    cs.LG cs.AI cs.CL cs.RO

    Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous Driving and Zero-Shot Instruction Following

    Authors: Brian Yang, Huangyuan Su, Nikolaos Gkanatsios, Tsung-Wei Ke, Ayush Jain, Jeff Schneider, Katerina Fragkiadaki

    Abstract: Diffusion models excel at modeling complex and multimodal trajectory distributions for decision-making and control. Reward-gradient guided denoising has been recently proposed to generate trajectories that maximize both a differentiable reward function and the likelihood under the data distribution captured by a diffusion model. Reward-gradient guided denoising requires a differentiable reward fun…

    Submitted 9 February, 2024; originally announced February 2024.

  41. arXiv:2402.05602  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers

    Authors: Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi, Maximilian Dreyer, Aakriti Jain, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek

    Abstract: Large Language Models are prone to biased predictions and hallucinations, underlining the paramount importance of understanding their model-internal reasoning process. However, achieving faithful attributions for the entirety of a black-box transformer model and maintaining computational efficiency is an unsolved challenge. By extending the Layer-wise Relevance Propagation attribution method to ha…

    Submitted 10 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  42. arXiv:2402.05398  [pdf, other]

    cs.CV

    On the Effect of Image Resolution on Semantic Segmentation

    Authors: Ritambhara Singh, Abhishek Jain, Pietro Perona, Shivani Agarwal, Junfeng Yang

    Abstract: High-resolution semantic segmentation requires substantial computational resources. Traditional approaches in the field typically downscale the input images before processing and then upscale the low-resolution outputs back to their original dimensions. While this strategy effectively identifies broad regions, it often misses finer details. In this study, we demonstrate that a streamlined model ca…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2209.08667 by other authors

  43. arXiv:2402.04376  [pdf, other]

    cs.LG cs.AI stat.ML

    Scaling laws for learning with real and surrogate data

    Authors: Ayush Jain, Andrea Montanari, Eren Sasoglu

    Abstract: Collecting large quantities of high-quality data can be prohibitively expensive or impractical, and a bottleneck in machine learning. One may instead augment a small set of $n$ data points from the target distribution with data from more accessible sources, e.g. data collected under different circumstances or synthesized by generative models. We refer to such data as `surrogate data.' We introduce…

    Submitted 28 June, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: Added new experiments

  44. arXiv:2402.03633  [pdf, ps, other]

    cs.CR cs.CC cs.IT

    Lossy Cryptography from Code-Based Assumptions

    Authors: Quang Dao, Aayush Jain

    Abstract: Over the past few decades, we have seen a proliferation of advanced cryptographic primitives with lossy or homomorphic properties built from various assumptions such as Quadratic Residuosity, Decisional Diffie-Hellman, and Learning with Errors. These primitives imply hard problems in the complexity class $SZK$ (statistical zero-knowledge); as a consequence, they can only be based on assumptions th…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 37 pages, 3 figures

    MSC Class: 94A60; ACM Class: E.3

  45. arXiv:2402.01045  [pdf, other]

    cs.LG cs.CE

    LatticeGraphNet: A two-scale graph neural operator for simulating lattice structures

    Authors: Ayush Jain, Ehsan Haghighat, Sai Nelaturi

    Abstract: This study introduces a two-scale Graph Neural Operator (GNO), namely, LatticeGraphNet (LGN), designed as a surrogate model for costly nonlinear finite-element simulations of three-dimensional latticed parts and structures. LGN has two networks: LGN-i, learning the reduced dynamics of lattices, and LGN-ii, learning the mapping from the reduced representation onto the tetrahedral mesh. LGN can pred…

    Submitted 1 February, 2024; originally announced February 2024.

  46. arXiv:2401.16803  [pdf, other]

    cs.SD cs.LG eess.AS

    PBSCSR: The Piano Bootleg Score Composer Style Recognition Dataset

    Authors: Arhan Jain, Alec Bunn, Austin Pham, TJ Tsai

    Abstract: This article motivates, describes, and presents the PBSCSR dataset for studying composer style recognition of piano sheet music. Our overarching goal was to create a dataset for studying composer style recognition that is "as accessible as MNIST and as challenging as ImageNet". To achieve this goal, we use a previously proposed feature representation of sheet music called a bootleg score, which en…

    Submitted 7 February, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 15 pages, 4 figures

  47. arXiv:2401.14497  [pdf, other]

    cs.CV cs.LG

    Investigating the Quality of DermaMNIST and Fitzpatrick17k Dermatological Image Datasets

    Authors: Kumar Abhishek, Aditi Jain, Ghassan Hamarneh

    Abstract: The remarkable progress of deep learning in dermatological tasks has brought us closer to achieving diagnostic accuracies comparable to those of human experts. However, while large datasets play a crucial role in the development of reliable deep neural network models, the quality of data therein and their correct usage are of paramount importance. Several factors can impact data quality, such as t…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: 36 pages, 8 figures, 3 tables

  48. arXiv:2401.09756  [pdf, other]

    cs.LG cs.AI

    Explaining Drift using Shapley Values

    Authors: Narayanan U. Edakunni, Utkarsh Tekriwal, Anukriti Jain

    Abstract: Machine learning models often deteriorate in their performance when they are used to predict the outcomes over data on which they were not trained. These scenarios can often arise in the real world when the distribution of data changes gradually or abruptly due to major events like a pandemic. There have been many attempts in machine learning research to come up with techniques that are resilient to s…

    Submitted 18 January, 2024; originally announced January 2024.

  49. arXiv:2401.08111  [pdf, other]

    cs.CV

    Mobile Contactless Palmprint Recognition: Use of Multiscale, Multimodel Embeddings

    Authors: Steven A. Grosz, Akash Godbole, Anil K. Jain

    Abstract: Contactless palmprints are comprised of both global and local discriminative features. Most prior work focuses on extracting global features or local features alone for palmprint matching, whereas this research introduces a novel framework that combines global and local features for enhanced palmprint matching accuracy. Leveraging recent advancements in deep learning, this study integrates a visio…

    Submitted 15 January, 2024; originally announced January 2024.

  50. arXiv:2401.04198  [pdf, other]

    cs.LG cs.AI

    Curiosity & Entropy Driven Unsupervised RL in Multiple Environments

    Authors: Shaurya Dewan, Anisha Jain, Zoe LaLena, Lifan Yu

    Abstract: The authors of 'Unsupervised Reinforcement Learning in Multiple environments' propose a method, alpha-MEPOL, to tackle unsupervised RL across multiple environments. They pre-train a task-agnostic exploration policy using interactions from an entire environment class and then fine-tune this policy for various tasks using supervision. We expanded upon this work, with the goal of improving performanc…

    Submitted 8 January, 2024; originally announced January 2024.