Showing 1–9 of 9 results for author: Maheshwary, R

  1. arXiv:2406.17415  [pdf, other]

    cs.CL cs.AI cs.LG

    Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels

    Authors: Razvan-Gabriel Dumitru, Vikas Yadav, Rishabh Maheshwary, Paul-Ioan Clotan, Sathwik Tejaswi Madhusudhan, Mihai Surdeanu

    Abstract: We present a simple variable quantization approach that quantizes different layers of a large language model (LLM) at different bit levels. Specifically, we quantize the most important layers to higher bit precision and less important layers to lower bits to achieve floating point quantization levels. We propose two effective strategies to measure the importance of layers within LLMs: the first me…

    Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

    Comments: submitted to EMNLP, 15 pages, 10 figures, 4 tables

    ACM Class: I.2.7; I.2.0

  2. arXiv:2406.16783  [pdf, other]

    cs.CL cs.AI cs.LG

    M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models

    Authors: Rishabh Maheshwary, Vikas Yadav, Hoang Nguyen, Khyati Mahajan, Sathwik Tejaswi Madhusudhan

    Abstract: Instruction finetuning (IFT) is critical for aligning Large Language Models (LLMs) to follow instructions. While many effective IFT datasets have been introduced recently, they predominantly focus on high-resource languages like English. To better align LLMs across a broad spectrum of languages and tasks, we propose a fully synthetic, novel taxonomy (Evol) guided Multilingual, Multi-turn instructi…

    Submitted 28 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: 39 pages

  3. arXiv:2403.07230  [pdf, other]

    cs.CL cs.AI cs.LG

    Curry-DPO: Enhancing Alignment using Curriculum Learning & Ranked Preferences

    Authors: Pulkit Pattnaik, Rishabh Maheshwary, Kelechi Ogueji, Vikas Yadav, Sathwik Tejaswi Madhusudhan

    Abstract: Direct Preference Optimization (DPO) is an effective technique that leverages pairwise preference data (usually one chosen and rejected response pair per user prompt) to align LLMs to human preferences. In practice, multiple responses can exist for a given prompt with varying quality relative to each other. With availability of such quality ratings for multiple responses, we propose utilizing thes…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: Work in progress

  4. arXiv:2306.08751  [pdf, other]

    cs.CV

    Improving Selective Visual Question Answering by Learning from Your Peers

    Authors: Corentin Dancette, Spencer Whitehead, Rishabh Maheshwary, Ramakrishna Vedantam, Stefan Scherer, Xinlei Chen, Matthieu Cord, Marcus Rohrbach

    Abstract: Despite advances in Visual Question Answering (VQA), the ability of models to assess their own correctness remains underexplored. Recent work has shown that VQA models, out-of-the-box, can have difficulties abstaining from answering when they are wrong. The option to abstain, also called Selective Prediction, is highly relevant when deploying systems to users who must trust the system's output (e.…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: CVPR 2023. Code available here: https://github.com/facebookresearch/selective-vqa_ood

  5. arXiv:2205.00177  [pdf, other]

    cs.CL

    Practice Makes a Solver Perfect: Data Augmentation for Math Word Problem Solvers

    Authors: Vivek Kumar, Rishabh Maheshwary, Vikram Pudi

    Abstract: Existing Math Word Problem (MWP) solvers have achieved high accuracy on benchmark datasets. However, prior works have shown that such solvers do not generalize well and rely on superficial cues to achieve high performance. In this paper, we first conduct experiments to showcase that this behaviour is mainly associated with the limited size and diversity present in existing MWP datasets. Next, we p…

    Submitted 30 April, 2022; originally announced May 2022.

    Comments: Accepted at NAACL 2022

  6. arXiv:2109.05925  [pdf, other]

    cs.CL

    Adversarial Examples for Evaluating Math Word Problem Solvers

    Authors: Vivek Kumar, Rishabh Maheshwary, Vikram Pudi

    Abstract: Standard accuracy metrics have shown that Math Word Problem (MWP) solvers have achieved high performance on benchmark datasets. However, the extent to which existing MWP solvers truly understand language and its relation with numbers is still unclear. In this paper, we generate adversarial attacks to evaluate the robustness of state-of-the-art MWP solvers. We propose two methods Question Reorderin…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP Findings 2021

  7. arXiv:2109.04775  [pdf, other]

    cs.CL

    A Strong Baseline for Query Efficient Attacks in a Black Box Setting

    Authors: Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

    Abstract: Existing black box search methods have achieved high success rate in generating adversarial attacks against NLP models. However, such search methods are inefficient as they do not consider the amount of queries required to generate adversarial attacks. Also, prior attacks do not maintain a consistent search space while comparing different search methods. In this paper, we propose a query efficient…

    Submitted 10 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 - Main Conference

  8. arXiv:2012.14956  [pdf, other]

    cs.CL

    Generating Natural Language Attacks in a Hard Label Black Box Setting

    Authors: Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

    Abstract: We study an important and challenging task of attacking natural language processing models in a hard label black box setting. We propose a decision-based attack strategy that crafts high quality adversarial examples on text classification and entailment tasks. Our proposed attack strategy leverages population-based optimization algorithm to craft plausible and semantically similar adversarial exam…

    Submitted 29 April, 2021; v1 submitted 29 December, 2020; originally announced December 2020.

    Comments: Accepted at AAAI 2021 (Main Conference)

  9. arXiv:2012.13339  [pdf, ps, other]

    cs.CL

    A Context Aware Approach for Generating Natural Language Attacks

    Authors: Rishabh Maheshwary, Saket Maheshwary, Vikram Pudi

    Abstract: We study an important task of attacking natural language processing models in a black box setting. We propose an attack strategy that crafts semantically similar adversarial examples on text classification and entailment tasks. Our proposed attack finds candidate words by considering the information of both the original word and its surrounding context. It jointly leverages masked language modelli…

    Submitted 24 December, 2020; originally announced December 2020.

    Comments: Accepted as Student Poster at AAAI 2021