Showing 1–50 of 172 results for author: Sharma, M

Searching in archive cs.
  1. arXiv:2406.14541  [pdf, other]

    cs.LG

    Are LLMs Naturally Good at Synthetic Tabular Data Generation?

    Authors: Shengzhe Xu, Cho-Ting Lee, Mandar Sharma, Raquib Bin Yousuf, Nikhil Muralidhar, Naren Ramakrishnan

    Abstract: Large language models (LLMs) have demonstrated their prowess in generating synthetic text and images; however, their potential for generating tabular data -- arguably the most common data type in business and scientific applications -- is largely underexplored. This paper demonstrates that LLMs, used as-is, or after traditional fine-tuning, are severely inadequate as synthetic table generators. Du…

    Submitted 21 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2406.14005  [pdf, other]

    cs.CL cs.AI cs.LG

    Information Guided Regularization for Fine-tuning Language Models

    Authors: Mandar Sharma, Nikhil Muralidhar, Shengzhe Xu, Raquib Bin Yousuf, Naren Ramakrishnan

    Abstract: The pretraining-fine-tuning paradigm has been the de facto strategy for transfer learning in modern language modeling. With the understanding that task adaptation in LMs is often a function of parameters shared across tasks, we argue that a more surgical approach to regularization needs to exist for smoother transfer learning. Towards this end, we investigate how the pretraining loss landscape is…

    Submitted 21 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2406.08627  [pdf, other]

    cs.LG cs.CL

    Time-MMD: A New Multi-Domain Multimodal Dataset for Time Series Analysis

    Authors: Haoxin Liu, Shangqing Xu, Zhiyuan Zhao, Lingkai Kong, Harshavardhan Kamarthi, Aditya B. Sasanur, Megha Sharma, Jiaming Cui, Qingsong Wen, Chao Zhang, B. Aditya Prakash

    Abstract: Time series data are ubiquitous across a wide range of real-world domains. While real-world time series analysis (TSA) requires human experts to integrate numerical series data with multimodal domain-specific knowledge, most existing TSA models rely solely on numerical data, overlooking the significance of information beyond numerical series. This oversight is due to the untapped potential of text…

    Submitted 12 June, 2024; originally announced June 2024.

  4. arXiv:2406.08063  [pdf, other]

    cs.CV

    MWIRSTD: A MWIR Small Target Detection Dataset

    Authors: Nikhil Kumar, Avinash Upadhyay, Shreya Sharma, Manoj Sharma, Pravendra Singh

    Abstract: This paper presents a novel mid-wave infrared (MWIR) small target detection dataset (MWIRSTD) comprising 14 video sequences containing approximately 1053 images with annotated targets of three distinct classes of small objects. Captured using cooled MWIR imagers, the dataset offers a unique opportunity for researchers to develop and evaluate state-of-the-art methods for small object detection in r…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted in ICIP2024

  5. arXiv:2406.02018  [pdf, other]

    cs.CL cs.AI cs.HC

    Why Would You Suggest That? Human Trust in Language Model Responses

    Authors: Manasi Sharma, Ho Chit Siu, Rohan Paleja, Jaime D. Peña

    Abstract: The emergence of Large Language Models (LLMs) has revealed a growing need for human-AI collaboration, especially in creative decision-making scenarios where trust and reliance are paramount. Through human studies and model evaluations on the open-ended News Headline Generation task from the LaMP benchmark, we analyze how the framing and presence of explanations affect user trust and model performa…

    Submitted 4 June, 2024; originally announced June 2024.

  6. arXiv:2405.12842  [pdf, other]

    cs.RO cs.CV

    SmartFlow: Robotic Process Automation using LLMs

    Authors: Arushi Jain, Shubham Paliwal, Monika Sharma, Lovekesh Vig, Gautam Shroff

    Abstract: Robotic Process Automation (RPA) systems face challenges in handling complex processes and diverse screen layouts that require advanced human-like decision-making capabilities. These systems typically rely on pixel-level encoding through drag-and-drop or automation frameworks such as Selenium to create navigation workflows, rather than visual understanding of screen elements. In this context, we p…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 32nd ACM International Conference on Information and Knowledge Management

  7. arXiv:2405.12742  [pdf, other]

    cs.CV

    Multi-Subject Personalization

    Authors: Arushi Jain, Shubham Paliwal, Monika Sharma, Vikram Jamwal, Lovekesh Vig

    Abstract: Creative story illustration requires a consistent interplay of multiple characters or objects. However, conventional text-to-image models face significant challenges while producing images featuring multiple personalized subjects. For example, they distort the subject rendering, or the text descriptions fail to render coherent subject interactions. We present Multi-Subject Personalization (MSP) to…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 2023 Conference on Neural Information Processing Systems

  8. arXiv:2405.12531  [pdf, other]

    cs.CV cs.LG

    CustomText: Customized Textual Image Generation using Diffusion Models

    Authors: Shubham Paliwal, Arushi Jain, Monika Sharma, Vikram Jamwal, Lovekesh Vig

    Abstract: Textual image generation spans diverse fields like advertising, education, product packaging, social media, information visualization, and branding. Despite recent strides in language-guided image synthesis using diffusion models, current models excel in image generation but struggle with accurate text rendering and offer limited control over font attributes. In this paper, we aim to enhance the s…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Accepted by AI for Content Creation (AI4CC) workshop at CVPR 2024

  9. arXiv:2405.04829  [pdf, other]

    cs.CL

    Fine-tuning Pre-trained Named Entity Recognition Models For Indian Languages

    Authors: Sankalp Bahad, Pruthwik Mishra, Karunesh Arora, Rakesh Chandra Balabantaray, Dipti Misra Sharma, Parameswari Krishnamurthy

    Abstract: Named Entity Recognition (NER) is a useful component in Natural Language Processing (NLP) applications. It is used in various tasks such as Machine Translation, Summarization, Information Retrieval, and Question-Answering systems. The research on NER is centered around English and some other major languages, whereas limited attention has been given to Indian languages. We analyze the challenges an…

    Submitted 10 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: 8 pages, accepted in NAACL-SRW, 2024

  10. arXiv:2404.10212  [pdf, other]

    cs.CV

    LWIRPOSE: A novel LWIR Thermal Image Dataset and Benchmark

    Authors: Avinash Upadhyay, Bhipanshu Dhupar, Manoj Sharma, Ankit Shukla, Ajith Abraham

    Abstract: Human pose estimation faces hurdles in real-world applications due to factors like lighting changes, occlusions, and cluttered environments. We introduce a unique RGB-Thermal Nearly Paired and Annotated 2D Pose Dataset, comprising over 2,400 high-quality LWIR (thermal) images. Each image is meticulously annotated with 2D human poses, offering a valuable resource for researchers and practitioners.…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Submitted in ICIP2024

  11. arXiv:2404.02512  [pdf, other]

    cs.CL

    Towards Large Language Model driven Reference-less Translation Evaluation for English and Indian Languages

    Authors: Vandan Mujadia, Pruthwik Mishra, Arafat Ahsan, Dipti Misra Sharma

    Abstract: With the primary focus on evaluating the effectiveness of large language models for automatic reference-less translation assessment, this work presents our experiments on mimicking human direct assessment to evaluate the quality of translations in English and Indian languages. We constructed a translation evaluation task where we performed zero-shot learning, in-context example-driven learning, an…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: text overlap with arXiv:2311.09216

  12. arXiv:2404.01536  [pdf, other]

    cs.CL cs.AI cs.LG

    Laying Anchors: Semantically Priming Numerals in Language Modeling

    Authors: Mandar Sharma, Rutuja Murlidhar Taware, Pravesh Koirala, Nikhil Muralidhar, Naren Ramakrishnan

    Abstract: Off-the-shelf pre-trained language models have become the de facto standard in NLP pipelines for a multitude of downstream tasks. However, the inability of these models to properly encode numerals limits their performance on tasks requiring numeric comprehension. We introduce strategies to semantically prime numerals in any corpus by generating anchors governed by the distribution of numerals in s…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted to the findings of NAACL 2024

  13. arXiv:2403.17199  [pdf, other]

    cs.CL

    Extracting Social Support and Social Isolation Information from Clinical Psychiatry Notes: Comparing a Rule-based NLP System and a Large Language Model

    Authors: Braja Gopal Patra, Lauren A. Lepow, Praneet Kasi Reddy Jagadeesh Kumar, Veer Vekaria, Mohit Manoj Sharma, Prakash Adekkanattu, Brian Fennessy, Gavin Hynes, Isotta Landi, Jorge A. Sanchez-Ruiz, Euijung Ryu, Joanna M. Biernacka, Girish N. Nadkarni, Ardesheer Talati, Myrna Weissman, Mark Olfson, J. John Mann, Alexander W. Charney, Jyotishman Pathak

    Abstract: Background: Social support (SS) and social isolation (SI) are social determinants of health (SDOH) associated with psychiatric outcomes. In electronic health records (EHRs), individual-level SS/SI is typically documented as narrative clinical notes rather than structured coded data. Natural language processing (NLP) algorithms can automate the otherwise labor-intensive process of data extraction.…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 2 figures, 3 tables

  14. arXiv:2403.09227  [pdf, other]

    cs.RO cs.AI

    BEHAVIOR-1K: A Human-Centered, Embodied AI Benchmark with 1,000 Everyday Activities and Realistic Simulation

    Authors: Chengshu Li, Ruohan Zhang, Josiah Wong, Cem Gokmen, Sanjana Srivastava, Roberto Martín-Martín, Chen Wang, Gabrael Levine, Wensi Ai, Benjamin Martinez, Hang Yin, Michael Lingelbach, Minjune Hwang, Ayano Hiranaka, Sujay Garlanka, Arman Aydin, Sharon Lee, Jiankai Sun, Mona Anvari, Manasi Sharma, Dhruva Bansal, Samuel Hunter, Kyu-Young Kim, Alan Lou, Caleb R Matthews, et al. (10 additional authors not shown)

    Abstract: We present BEHAVIOR-1K, a comprehensive simulation benchmark for human-centered robotics. BEHAVIOR-1K includes two components, guided and motivated by the results of an extensive survey on "what do you want robots to do for you?". The first is the definition of 1,000 everyday activities, grounded in 50 scenes (houses, gardens, restaurants, offices, etc.) with more than 9,000 objects annotated with…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: A preliminary version was published at 6th Conference on Robot Learning (CoRL 2022)

  15. arXiv:2402.19371  [pdf]

    cs.CL cs.AI cs.IR

    OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models

    Authors: Jenish Maharjan, Anurag Garikipati, Navan Preet Singh, Leo Cyrus, Mayank Sharma, Madalina Ciobanu, Gina Barnes, Rahul Thapa, Qingqing Mao, Ritankar Das

    Abstract: LLMs have become increasingly capable at accomplishing a range of specialized-tasks and can be utilized to expand equitable access to medical knowledge. Most medical LLMs have involved extensive fine-tuning, leveraging specialized medical data and significant, thus costly, amounts of computational power. Many of the top performing LLMs are proprietary and their access is limited to very few resear…

    Submitted 29 February, 2024; originally announced February 2024.

  16. arXiv:2402.16863  [pdf]

    cs.NE cs.AI

    Quantum Inspired Chaotic Salp Swarm Optimization for Dynamic Optimization

    Authors: Sanjai Pathak, Ashish Mani, Mayank Sharma, Amlan Chatterjee

    Abstract: Many real-world problems are dynamic optimization problems that are unknown beforehand. In practice, unpredictable events such as the arrival of new jobs, due date changes, and reservation cancellations, changes in parameters or constraints make the search environment dynamic. Many algorithms are designed to deal with stationary optimization problems, but these algorithms do not face dynamic optim…

    Submitted 20 January, 2024; originally announced February 2024.

    Comments: 14 pages, 2 figures, 1 algorithm

  17. arXiv:2401.14502  [pdf, other]

    cs.RO cs.CV cs.LG

    MResT: Multi-Resolution Sensing for Real-Time Control with Vision-Language Models

    Authors: Saumya Saxena, Mohit Sharma, Oliver Kroemer

    Abstract: Leveraging sensing modalities across diverse spatial and temporal resolutions can improve performance of robotic manipulation tasks. Multi-spatial resolution sensing provides hierarchical information captured at different spatial scales and enables both coarse and precise motions. Simultaneously multi-temporal resolution sensing enables the agent to exhibit high reactivity and real-time control. I…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: CoRL'23, Project website: http://tinyurl.com/multi-res-realtime-control

  18. arXiv:2401.08014  [pdf, other]

    cs.CV cs.AI

    Convolutional Neural Network Compression via Dynamic Parameter Rank Pruning

    Authors: Manish Sharma, Jamison Heard, Eli Saber, Panos P. Markopoulos

    Abstract: While Convolutional Neural Networks (CNNs) excel at learning complex latent-space representations, their over-parameterization can lead to overfitting and reduced performance, particularly with limited data. This, alongside their high computational and memory demands, limits the applicability of CNNs for edge deployment. Low-rank matrix approximation has emerged as a promising approach to reduce C…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: 11 pages, 6 figures

  19. arXiv:2401.05566  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG cs.SE

    Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

    Authors: Evan Hubinger, Carson Denison, Jesse Mu, Mike Lambert, Meg Tong, Monte MacDiarmid, Tamera Lanham, Daniel M. Ziegler, Tim Maxwell, Newton Cheng, Adam Jermyn, Amanda Askell, Ansh Radhakrishnan, Cem Anil, David Duvenaud, Deep Ganguli, Fazl Barez, Jack Clark, Kamal Ndousse, Kshitij Sachan, Michael Sellitto, Mrinank Sharma, Nova DasSarma, Roger Grosse, Shauna Kravec, et al. (14 additional authors not shown)

    Abstract: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept exa…

    Submitted 17 January, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: updated to add missing acknowledgements

  20. arXiv:2312.16124  [pdf, other]

    cs.LG physics.chem-ph q-bio.QM

    Olfactory Label Prediction on Aroma-Chemical Pairs

    Authors: Laura Sisson, Aryan Amit Barsainyan, Mrityunjay Sharma, Ritesh Kumar

    Abstract: The application of deep learning techniques on aroma-chemicals has resulted in models more accurate than human experts at predicting olfactory qualities. However, public research in this domain has been limited to predicting the qualities of single molecules, whereas in industry applications, perfumers and food scientists are often concerned with blends of many molecules. In this paper, we apply b…

    Submitted 5 June, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

  21. arXiv:2312.14542  [pdf, other]

    cs.CL

    Automatic Data Retrieval for Cross Lingual Summarization

    Authors: Nikhilesh Bhatnagar, Ashok Urlana, Vandan Mujadia, Pruthwik Mishra, Dipti Misra Sharma

    Abstract: Cross-lingual summarization involves the summarization of text written in one language to a different one. There is a body of research addressing cross-lingual summarization from English to other European languages. In this work, we aim to perform cross-lingual summarization from English to Hindi. We propose pairing up the coverage of newsworthy events in textual and video format can prove to be h…

    Submitted 22 December, 2023; originally announced December 2023.

    Comments: 6 pages, 6 tables, 2 figures, conference: ICON 2023

  22. arXiv:2312.11395  [pdf, other]

    cs.CL cs.AI

    Verb Categorisation for Hindi Word Problem Solving

    Authors: Harshita Sharma, Pruthwik Mishra, Dipti Misra Sharma

    Abstract: Word problem Solving is a challenging NLP task that deals with solving mathematical problems described in natural language. Recently, there has been renewed interest in developing word problem solvers for Indian languages. As part of this paper, we have built a Hindi arithmetic word problem solver which makes use of verbs. Additionally, we have created verb categorization data for Hindi. Verbs are…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: 16 pages, 17 figures, ICON 2023 Conference

    ACM Class: I.2.7

  23. arXiv:2312.10396  [pdf, ps, other]

    cs.LG cs.AI

    How Far Can Fairness Constraints Help Recover From Biased Data?

    Authors: Mohit Sharma, Amit Deshpande

    Abstract: A general belief in fair classification is that fairness constraints incur a trade-off with accuracy, which biased data may worsen. Contrary to this belief, Blum & Stangl (2019) show that fair classification with equal opportunity constraints even on extremely biased data can recover optimally accurate and fair classifiers on the original data distribution. Their result is interesting because it d…

    Submitted 1 June, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

    Comments: Accepted for publication at ICML 2024

  24. arXiv:2312.04832  [pdf]

    cs.SE

    Exposing Algorithmic Discrimination and Its Consequences in Modern Society: Insights from a Scoping Study

    Authors: Ramandeep Singh Dehal, Mehak Sharma, Ronnie de Souza Santos

    Abstract: Algorithmic discrimination is a condition that arises when data-driven software unfairly treats users based on attributes like ethnicity, race, gender, sexual orientation, religion, age, disability, or other personal characteristics. Nowadays, as machine learning gains popularity, cases of algorithmic discrimination are increasingly being reported in several contexts. This study delves into variou…

    Submitted 16 January, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

  25. arXiv:2312.01054  [pdf, other]

    cs.RO cs.AI cs.CL

    Exploring and Improving the Spatial Reasoning Abilities of Large Language Models

    Authors: Manasi Sharma

    Abstract: Large Language Models (LLMs) represent formidable tools for sequence modeling, boasting an innate capacity for general pattern recognition. Nevertheless, their broader spatial reasoning capabilities, especially applied to numerical trajectory data, remain insufficiently explored. In this paper, we investigate the out-of-the-box performance of ChatGPT-3.5, ChatGPT-4 and Llama 2 7B models when confr…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: Published in NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following

  26. arXiv:2311.09216  [pdf, other]

    cs.CL cs.AI

    Assessing Translation capabilities of Large Language Models involving English and Indian Languages

    Authors: Vandan Mujadia, Ashok Urlana, Yash Bhaskar, Penumalla Aditya Pavani, Kukkapalli Shravya, Parameswari Krishnamurthy, Dipti Misra Sharma

    Abstract: Generative Large Language Models (LLMs) have achieved remarkable advancements in various NLP tasks. In this work, our aim is to explore the multilingual capabilities of large language models by using machine translation as a task involving English and 22 Indian languages. We first investigate the translation capabilities of raw large language models, followed by exploring the in-context learning c…

    Submitted 15 November, 2023; originally announced November 2023.

  27. arXiv:2310.13548  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Towards Understanding Sycophancy in Language Models

    Authors: Mrinank Sharma, Meg Tong, Tomasz Korbak, David Duvenaud, Amanda Askell, Samuel R. Bowman, Newton Cheng, Esin Durmus, Zac Hatfield-Dodds, Scott R. Johnston, Shauna Kravec, Timothy Maxwell, Sam McCandlish, Kamal Ndousse, Oliver Rausch, Nicholas Schiefer, Da Yan, Miranda Zhang, Ethan Perez

    Abstract: Human feedback is commonly utilized to finetune AI assistants. But human feedback may also encourage model responses that match user beliefs over truthful ones, a behaviour known as sycophancy. We investigate the prevalence of sycophancy in models whose finetuning procedure made use of human feedback, and the potential role of human preference judgments in such behavior. We first demonstrate that…

    Submitted 27 October, 2023; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: 32 pages, 20 figures

    ACM Class: I.2.6

  28. arXiv:2310.08864  [pdf, other]

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (267 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method…

    Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  29. arXiv:2310.08043  [pdf, other]

    cs.AI

    Understanding and Controlling a Maze-Solving Policy Network

    Authors: Ulisse Mini, Peli Grietzer, Mrinank Sharma, Austin Meek, Monte MacDiarmid, Alexander Matt Turner

    Abstract: To understand the goals and goal representations of AI systems, we carefully study a pretrained reinforcement learning policy that solves mazes by navigating to a range of target squares. We find this network pursues multiple context-dependent goals, and we further identify circuits within the network that correspond to one of these goals. In particular, we identified eleven channels that track th…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: 46 pages

  30. arXiv:2309.16592  [pdf, other]

    cs.CV cs.LG

    Tensor Factorization for Leveraging Cross-Modal Knowledge in Data-Constrained Infrared Object Detection

    Authors: Manish Sharma, Moitreya Chatterjee, Kuan-Chuan Peng, Suhas Lohit, Michael Jones

    Abstract: The primary bottleneck towards obtaining good recognition performance in IR images is the lack of sufficient labeled training data, owing to the cost of acquiring such data. Realizing that object detection methods for the RGB modality are quite robust (at least for some commonplace classes, like person, car, etc.), thanks to the giant training sets that exist, in this work we seek to leverage cues…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: Accepted to ICCV 2023, LIMIT Workshop. The first two authors contributed equally

  31. arXiv:2309.12320  [pdf]

    cs.CY cs.AI

    Use Scenarios & Practical Examples of AI Use in Education

    Authors: Dara Cassidy, Yann-Aël Le Borgne, Francisco Bellas, Riina Vuorikari, Elise Rondin, Madhumalti Sharma, Jessica Niewint-Gori, Johanna Gröpler, Anne Gilleran, Lidija Kralj

    Abstract: This report presents a set of use scenarios based on existing resources that teachers can use as inspiration to create their own, with the aim of introducing artificial intelligence (AI) at different pre-university levels, and with different goals. The Artificial Intelligence Education field (AIEd) is very active, with new resources and tools arising continuously. Those included in this document h…

    Submitted 25 July, 2023; originally announced September 2023.

    Comments: Developed within the AI in Education working group of the European Digital Education Hub

  32. arXiv:2309.01918  [pdf, other]

    cs.RO cs.LG

    RoboAgent: Generalization and Efficiency in Robot Manipulation via Semantic Augmentations and Action Chunking

    Authors: Homanga Bharadhwaj, Jay Vakil, Mohit Sharma, Abhinav Gupta, Shubham Tulsiani, Vikash Kumar

    Abstract: The grand aim of having a single robot that can manipulate arbitrary objects in diverse settings is at odds with the paucity of robotics datasets. Acquiring and growing such datasets is strenuous due to manual efforts, operational costs, and safety challenges. A path toward such an universal agent would require a structured framework capable of wide generalization but trained within a reasonable d…

    Submitted 4 September, 2023; originally announced September 2023.

  33. arXiv:2305.08246  [pdf, other]

    cs.CL cs.AI cs.LG

    Learning Non-linguistic Skills without Sacrificing Linguistic Proficiency

    Authors: Mandar Sharma, Nikhil Muralidhar, Naren Ramakrishnan

    Abstract: The field of Math-NLP has witnessed significant growth in recent years, motivated by the desire to expand LLM performance to the learning of non-linguistic notions (numerals, and subsequently, arithmetic reasoning). However, non-linguistic skill injection typically comes at a cost for LLMs: it leads to catastrophic forgetting of core linguistic skills, a consequence that often remains unaddressed…

    Submitted 14 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023's main conference

  34. arXiv:2305.06052  [pdf, other]

    cs.CV

    Post-training Model Quantization Using GANs for Synthetic Data Generation

    Authors: Athanasios Masouris, Mansi Sharma, Adrian Boguszewski, Alexander Kozlov, Zhuo Wu, Raymond Lo

    Abstract: Quantization is a widely adopted technique for deep neural networks to reduce the memory and computational resources required. However, when quantized, most models would need a suitable calibration process to keep their performance intact, which requires data from the target domain, such as a fraction of the dataset used in model training and model validation (i.e. calibration dataset). In this…

    Submitted 10 May, 2023; originally announced May 2023.

  35. arXiv:2304.06600  [pdf, other]

    cs.LG cs.CV cs.RO

    Lossless Adaptation of Pretrained Vision Models For Robotic Manipulation

    Authors: Mohit Sharma, Claudio Fantacci, Yuxiang Zhou, Skanda Koppula, Nicolas Heess, Jon Scholz, Yusuf Aytar

    Abstract: Recent works have shown that large models pretrained on common visual learning tasks can provide useful representations for a wide range of specialized perception problems, as well as a variety of robotic manipulation tasks. While prior work on robotic manipulation has predominantly used frozen pretrained features, we demonstrate that in robotics this approach can fail to reach optimal performance…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: ICLR'23, Project page see https://sites.google.com/view/robo-adapters/

  36. arXiv:2304.01762  [pdf, other]

    cs.LG cs.AI stat.ML

    Incorporating Unlabelled Data into Bayesian Neural Networks

    Authors: Mrinank Sharma, Tom Rainforth, Yee Whye Teh, Vincent Fortuin

    Abstract: Conventional Bayesian Neural Networks (BNNs) cannot leverage unlabelled data to improve their predictions. To overcome this limitation, we introduce Self-Supervised Bayesian Neural Networks, which use unlabelled data to learn improved prior predictive distributions by maximising an evidence lower bound during an unsupervised pre-training step. With a novel methodology developed to better understan…

    Submitted 19 May, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

  37. arXiv:2304.00763  [pdf]

    cs.CV

    BOLLWM: A real-world dataset for bollworm pest monitoring from cotton fields in India

    Authors: Jerome White, Chandan Agrawal, Anmol Ojha, Apoorv Agnihotri, Makkunda Sharma, Jigar Doshi

    Abstract: This paper presents a dataset of agricultural pest images captured over five years by thousands of small holder farmers and farming extension workers across India. The dataset has been used to support a mobile application that relies on artificial intelligence to assist farmers with pest management decisions. Creation came from a mix of organized data collection, and from mobile application usage…

    Submitted 3 April, 2023; originally announced April 2023.

    Journal ref: ICLR 2023 workshop on Practical Machine Learning for Developing Countries

  38. arXiv:2302.05906  [pdf, other]

    cs.LG cs.AI

    On Comparing Fair Classifiers under Data Bias

    Authors: Mohit Sharma, Amit Deshpande, Rajiv Ratn Shah

    Abstract: In this paper, we consider a theoretical model for injecting data bias, namely, under-representation and label bias (Blum & Stangl, 2019). We empirically study the effect of varying data biases on the accuracy and fairness of fair classifiers. Through extensive experiments on both synthetic and real-world datasets (e.g., Adult, German Credit, Bank Marketing, COMPAS), we empirically audit pre-, in-…

    Submitted 10 December, 2023; v1 submitted 12 February, 2023; originally announced February 2023.

    Comments: Accepted as a Spotlight Presentation at Algorithmic Fairness through the Lens of Time, Neurips 2023 Workshop

  39. Continuous Scatterplot Operators for Bivariate Analysis and Study of Electronic Transitions

    Authors: Mohit Sharma, Talha Bin Masood, Signe S. Thygesen, Mathieu Linares, Ingrid Hotz, Vijay Natarajan

    Abstract: Electronic transitions in molecules due to the absorption or emission of light is a complex quantum mechanical process. Their study plays an important role in the design of novel materials. A common yet challenging task in the study is to determine the nature of electronic transitions, namely which subgroups of the molecule are involved in the transition by donating or accepting electrons, followe…

    Submitted 1 February, 2023; originally announced February 2023.

  40. arXiv:2301.11315  [pdf, other]

    cs.CV cs.AI

    Evaluate underdiagnosis and overdiagnosis bias of deep learning model on primary open-angle glaucoma diagnosis in under-served patient populations

    Authors: Mingquan Lin, Yuyun Xiao, Bojian Hou, Tingyi Wanyan, Mohit Manoj Sharma, Zhangyang Wang, Fei Wang, Sarah Van Tassel, Yifan Peng

    Abstract: In the United States, primary open-angle glaucoma (POAG) is the leading cause of blindness, especially among African American and Hispanic individuals. Deep learning has been widely used to detect POAG using fundus images as its performance is comparable to or even surpasses diagnosis by clinicians. However, human bias in clinical diagnosis may be reflected and amplified in the widely-used deep le…

    Submitted 29 January, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

    Comments: 9 pages, 2 figures, Accepted by AMIA 2023 Informatics Summit

    Journal ref: AMIA 2023 Informatics Summit

  41. arXiv:2211.06291  [pdf, other]

    cs.LG cs.AI stat.ML

    Do Bayesian Neural Networks Need To Be Fully Stochastic?

    Authors: Mrinank Sharma, Sebastian Farquhar, Eric Nalisnick, Tom Rainforth

    Abstract: We investigate the benefit of treating all the parameters in a Bayesian neural network stochastically and find compelling theoretical and empirical evidence that this standard construction may be unnecessary. To this end, we prove that expressive predictive distributions require only small amounts of stochasticity. In particular, partially stochastic networks with only $n$ stochastic biases are un…

    Submitted 20 February, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: Published at AISTATS2023 (Oral)

  42. arXiv:2211.02098  [pdf, other]

    cs.CL cs.AI cs.LG

    Overcoming Barriers to Skill Injection in Language Modeling: Case Study in Arithmetic

    Authors: Mandar Sharma, Nikhil Muralidhar, Naren Ramakrishnan

    Abstract: Through their transfer learning abilities, highly-parameterized large pre-trained language models have dominated the NLP landscape for a multitude of downstream language tasks. Though linguistically proficient, the inability of these models to incorporate the learning of non-linguistic entities (numerals and arithmetic reasoning) limits their usage for tasks that require numeric comprehension or s…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2022: Math-AI Workshop

  43. arXiv:2210.15374  [pdf, other]

    cs.CV

    2T-UNET: A Two-Tower UNet with Depth Clues for Robust Stereo Depth Estimation

    Authors: Rohit Choudhary, Mansi Sharma, Rithvik Anil

    Abstract: Stereo correspondence matching is an essential part of the multi-step stereo depth estimation process. This paper revisits the depth estimation problem, avoiding the explicit stereo matching step using a simple two-tower convolutional neural network. The proposed algorithm is entitled as 2T-UNet. The idea behind 2T-UNet is to replace cost volume construction with twin convolution towers. These tow…

    Submitted 27 October, 2022; originally announced October 2022.

  44. arXiv:2210.15362  [pdf, other]

    cs.CV

    A Novel Approach for Neuromorphic Vision Data Compression based on Deep Belief Network

    Authors: Sally Khaidem, Mansi Sharma, Abhipraay Nevatia

    Abstract: A neuromorphic camera is an image sensor that emulates the human eyes capturing only changes in local brightness levels. They are widely known as event cameras, silicon retinas or dynamic vision sensors (DVS). DVS records asynchronous per-pixel brightness changes, resulting in a stream of events that encode the brightness change's time, location, and polarity. DVS consumes little power and can cap…

    Submitted 27 October, 2022; originally announced October 2022.

  45. arXiv:2210.12215  [pdf, other]

    cs.CL

    Gui at MixMT 2022 : English-Hinglish: An MT approach for translation of code mixed data

    Authors: Akshat Gahoi, Jayant Duneja, Anshul Padhi, Shivam Mangale, Saransh Rajput, Tanvi Kamble, Dipti Misra Sharma, Vasudeva Varma

    Abstract: Code-mixed machine translation has become an important task in multilingual communities and extending the task of machine translation to code mixed data has become a common task for these languages. In the shared tasks of WMT 2022, we try to tackle the same for both English + Hindi to Hinglish and Hinglish to English. The first task dealt with both Roman and Devanagari script as we had monolingual…

    Submitted 21 October, 2022; originally announced October 2022.

  46. arXiv:2210.01447  [pdf, other]

    cs.CV eess.IV

    A Novel Light Field Coding Scheme Based on Deep Belief Network & Weighted Binary Images for Additive Layered Displays

    Authors: Sally Khaidem, Mansi Sharma

    Abstract: Light-field displays create an immersive experience by providing binocular depth sensation and motion parallax. Stacking light attenuating layers is one approach to implement a light field display with a broader depth of field, wide viewing angles and high resolution. Due to the transparent holographic optical element (HOE) layers, additive layered displays can be integrated into augmented reality…

    Submitted 21 April, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: The paper is under consideration at Pattern Recognition Letters

  47. Jacobi Set Driven Search for Flexible Fiber Surface Extraction

    Authors: Mohit Sharma, Vijay Natarajan

    Abstract: Isosurfaces are an important tool for analysis and visualization of univariate scalar fields. Earlier works have demonstrated the presence of interesting isosurfaces at isovalues close to critical values. This motivated the development of efficient methods for computing individual components of isosurfaces restricted to a region of interest. Generalization of isosurfaces to fiber surfaces and crit…

    Submitted 13 August, 2022; originally announced August 2022.

  48. arXiv:2208.06359  [pdf]

    cs.LG cs.CV

    A Case for Rejection in Low Resource ML Deployment

    Authors: Jerome White, Pulkit Madaan, Nikhil Shenoy, Apoorv Agnihotri, Makkunda Sharma, Jigar Doshi

    Abstract: Building reliable AI decision support systems requires a robust set of data on which to train models; both with respect to quantity and diversity. Obtaining such datasets can be difficult in resource limited settings, or for applications in early stages of deployment. Sample rejection is one way to work around this challenge, however much of the existing work in this area is ill-suited for such sc…

    Submitted 15 August, 2022; v1 submitted 12 August, 2022; originally announced August 2022.

    Journal ref: NeurIPS 2022 workshop on Challenges In Deploying And Monitoring Machine Learning Systems

  49. arXiv:2207.12571  [pdf, other]

    cs.CL

    Innovations in Neural Data-to-text Generation: A Survey

    Authors: Mandar Sharma, Ajay Gogineni, Naren Ramakrishnan

    Abstract: The neural boom that has sparked natural language processing (NLP) research through the last decade has similarly led to significant innovations in data-to-text generation (DTG). This survey offers a consolidated view into the neural DTG paradigm with a structured examination of the approaches, benchmark datasets, and evaluation protocols. This survey draws boundaries separating DTG from the rest…

    Submitted 1 April, 2024; v1 submitted 25 July, 2022; originally announced July 2022.

    Comments: Accepted to ACM Transactions on Intelligent Systems and Technology 2024

  50. arXiv:2206.11095  [pdf, other]

    cs.CV eess.IV

    A High Resolution Multi-exposure Stereoscopic Image & Video Database of Natural Scenes

    Authors: Rohit Choudhary, Mansi Sharma, Aditya Wadaskar

    Abstract: Immersive displays such as VR headsets, AR glasses, Multiview displays, Free point televisions have emerged as a new class of display technologies in recent years, offering a better visual experience and viewer engagement as compared to conventional displays. With the evolution of 3D video and display technologies, the consumer market for High Dynamic Range (HDR) cameras and displays is quickly gr…

    Submitted 22 June, 2022; originally announced June 2022.