Showing 1–15 of 15 results for author: Anand, N

  1. arXiv:2403.17306  [pdf, other]

    cs.AI

    Visual Hallucination: Definition, Quantification, and Prescriptive Remediations

    Authors: Anku Rani, Vipula Rawte, Harshad Sharma, Neeraj Anand, Krishnav Rajbangshi, Amit Sheth, Amitava Das

    Abstract: The troubling rise of hallucination presents perhaps the most significant impediment to the advancement of responsible AI. In recent times, considerable research has focused on detecting and mitigating hallucination in Large Language Models (LLMs). However, it's worth noting that hallucination is also quite prevalent in Vision-Language models (VLMs). In this paper, we offer a fine-grained discours…

    Submitted 30 March, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  2. arXiv:2401.01867  [pdf, other]

    cs.LG

    Dataset Difficulty and the Role of Inductive Bias

    Authors: Devin Kwok, Nikhil Anand, Jonathan Frankle, Gintare Karolina Dziugaite, David Rolnick

    Abstract: Motivated by the goals of dataset pruning and defect identification, a growing body of methods have been developed to score individual examples within a dataset. These methods, which we call "example difficulty scores", are typically used to rank or categorize examples, but the consistency of rankings between different training runs, scoring methods, and model architectures is generally unknown. T…

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: 10 pages, 6 figures

  3. arXiv:2312.11669  [pdf, other]

    cs.LG cs.AI

    Prediction and Control in Continual Reinforcement Learning

    Authors: Nishanth Anand, Doina Precup

    Abstract: Temporal difference (TD) learning is often used to update the estimate of the value function which is used by RL agents to extract useful policies. In this paper, we focus on value function estimation in continual reinforcement learning. We propose to decompose the value function into two components which update at different timescales: a permanent value function, which holds general knowledge tha…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: Published at the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  4. arXiv:2311.16302  [pdf, other]

    cs.LG cs.CL

    Comprehensive Benchmarking of Entropy and Margin Based Scoring Metrics for Data Selection

    Authors: Anusha Sabbineni, Nikhil Anand, Maria Minakova

    Abstract: While data selection methods have been studied extensively in active learning, data pruning, and data augmentation settings, there is little evidence for the efficacy of these methods in industry scale settings, particularly in low-resource languages. Our work presents ways of assessing prospective training examples in those settings for their "usefulness" or "difficulty". We also demonstrate how…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: Accepted to Efficient Natural Language and Speech Processing (ENLSP-III) workshop at NeurIPS '23

  5. arXiv:2311.16298  [pdf, other]

    cs.LG cs.CL

    Influence Scores at Scale for Efficient Language Data Sampling

    Authors: Nikhil Anand, Joshua Tan, Maria Minakova

    Abstract: Modern ML systems ingest data aggregated from diverse sources, such as synthetic, human-annotated, and live customer traffic. Understanding which examples are important to the performance of a learning algorithm is crucial for efficient model training. Recently, a growing body of literature has given rise to various "influence scores," which use training artifacts such as model confidence…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: Accepted at EMNLP '23

  6. arXiv:2306.08845  [pdf, other]

    cs.SD cs.AI eess.AS

    Unsupervised speech intelligibility assessment with utterance level alignment distance between teacher and learner Wav2Vec-2.0 representations

    Authors: Nayan Anand, Meenakshi Sirigiraju, Chiranjeevi Yarra

    Abstract: Speech intelligibility is crucial in language learning for effective communication. Thus, to develop computer-assisted language learning systems, automatic speech intelligibility detection (SID) is necessary. Most of the works have assessed the intelligibility in a supervised manner considering manual annotations, which requires cost and time; hence scalability is limited. To overcome these, this…

    Submitted 15 June, 2023; originally announced June 2023.

  7. arXiv:2304.08243  [pdf, ps, other]

    cs.CL cs.LG

    Stochastic Code Generation

    Authors: Swapnil Sharma, Nikita Anand, Kranthi Kiran G. V

    Abstract: Large language models pre-trained for code generation can generate high-quality short code but often struggle with generating coherent long code and understanding higher-level or system-level specifications. This issue is also observed in language modeling for long text generation, and one proposed solution is the use of a latent stochastic process. This approach involves generating a document pla…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: 6 pages, 3 figures

  8. arXiv:2304.06861  [pdf, other]

    cs.CL cs.CY cs.LG

    Evaluation of Social Biases in Recent Large Pre-Trained Models

    Authors: Swapnil Sharma, Nikita Anand, Kranthi Kiran G. V., Alind Jain

    Abstract: Large pre-trained language models are widely used in the community. These models are usually trained on unmoderated and unfiltered data from open sources like the Internet. Due to this, biases that we see in platforms online which are a reflection of those in society are in turn captured and learned by these models. These models are deployed in applications that affect millions of people and their…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: 7 pages, 4 Tables

  9. arXiv:2211.06739  [pdf, other]

    cs.CV

    Partial Binarization of Neural Networks for Budget-Aware Efficient Learning

    Authors: Udbhav Bamba, Neeraj Anand, Saksham Aggarwal, Dilip K. Prasad, Deepak K. Gupta

    Abstract: Binarization is a powerful compression technique for neural networks, significantly reducing FLOPs, but often results in a significant drop in model performance. To address this issue, partial binarization techniques have been developed, but a systematic approach to mixing binary and full-precision parameters in a single network is still lacking. In this paper, we propose a controlled approach to…

    Submitted 8 November, 2023; v1 submitted 12 November, 2022; originally announced November 2022.

    Comments: Accepted at WACV 2023 Conference

  10. arXiv:2205.15019  [pdf, other]

    q-bio.QM cs.AI

    Protein Structure and Sequence Generation with Equivariant Denoising Diffusion Probabilistic Models

    Authors: Namrata Anand, Tudor Achim

    Abstract: Proteins are macromolecules that mediate a significant fraction of the cellular processes that underlie life. An important task in bioengineering is designing proteins with specific 3D structures and chemical properties which enable targeted functions. To this end, we introduce a generative model of both protein structure and sequence that can operate at significantly larger scales than previous m…

    Submitted 26 May, 2022; originally announced May 2022.

  11. arXiv:2106.06508  [pdf, other]

    cs.LG cs.AI

    Preferential Temporal Difference Learning

    Authors: Nishanth Anand, Doina Precup

    Abstract: Temporal-Difference (TD) learning is a general and very useful tool for estimating the value function of a given policy, which in turn is required to find good policies. Generally speaking, TD learning updates states whenever they are visited. When the agent lands in a state, its value can be used to compute the TD-error, which is then propagated to other states. However, it may be interesting, wh…

    Submitted 23 August, 2021; v1 submitted 11 June, 2021; originally announced June 2021.

    Comments: Accepted at the 38th International Conference on Machine Learning (ICML, 2021)

    Journal ref: Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021, 286-296

  12. arXiv:2006.01294  [pdf, other]

    cs.DB cs.AI

    NEMA: Automatic Integration of Large Network Management Databases

    Authors: Fubao Wu, Han Hee Song, Jiangtao Yin, Lixin Gao, Mario Baldi, Narendra Anand

    Abstract: Network management, whether for malfunction analysis, failure prediction, performance monitoring and improvement, generally involves large amounts of data from different sources. To effectively integrate and manage these sources, automatically finding semantic matches among their schemas or ontologies is crucial. Existing approaches on database matching mainly fall into two categories. One focuses…

    Submitted 1 June, 2020; originally announced June 2020.

    Comments: 14 pages, 13 Figures, 7 tables

  13. arXiv:2004.03497  [pdf, other]

    q-bio.BM cs.LG stat.ML

    ProGen: Language Modeling for Protein Generation

    Authors: Ali Madani, Bryan McCann, Nikhil Naik, Nitish Shirish Keskar, Namrata Anand, Raphael R. Eguchi, Po-Ssu Huang, Richard Socher

    Abstract: Generative modeling for protein engineering is key to solving fundamental problems in synthetic biology, medicine, and material science. We pose protein engineering as an unsupervised sequence generation problem in order to leverage the exponentially growing set of proteins that lack costly, structural annotations. We train a 1.2B-parameter language model, ProGen, on ~280M protein sequences condit…

    Submitted 7 March, 2020; originally announced April 2020.

  14. arXiv:1905.09562  [pdf, other]

    cs.LG stat.ML

    Recurrent Value Functions

    Authors: Pierre Thodoroff, Nishanth Anand, Lucas Caccia, Doina Precup, Joelle Pineau

    Abstract: Despite recent successes in Reinforcement Learning, value-based methods often suffer from high variance hindering performance. In this paper, we illustrate this in a continuous control setting where state of the art methods perform poorly whenever sensor noise is introduced. To overcome this issue, we introduce Recurrent Value Functions (RVFs) as an alternative to estimate the value function of a…

    Submitted 23 May, 2019; originally announced May 2019.

  15. arXiv:1412.7399  [pdf, other]

    quant-ph cond-mat.dis-nn cs.DS physics.data-an

    Do quantum strategies always win?

    Authors: Namit Anand, Colin Benjamin

    Abstract: In a seminal paper, Meyer [David Meyer, Phys. Rev. Lett. 82, 1052 (1999)] described the advantages of quantum game theory by looking at the classical penny flip game. A player using a quantum strategy can win against a classical player almost 100% of the time. Here we make a slight modification to the quantum game, with the two players sharing an entangled state to begin with. We then analyze two…

    Submitted 11 August, 2015; v1 submitted 23 December, 2014; originally announced December 2014.

    Comments: 12 pages, 3 figures, expanded with material on general quantum unitaries and discussion on gaming the quantum

    Journal ref: Quantum Information Processing, Volume 14, issue 11, pp 4027-4038 (November 2015)