Showing 1–50 of 70 results for author: Khapra, M

  1. arXiv:2406.13439  [pdf, other]

    cs.CL

    Finding Blind Spots in Evaluator LLMs with Interpretable Checklists

    Authors: Sumanth Doddapaneni, Mohammed Safi Ur Rahman Khan, Sshubam Verma, Mitesh M. Khapra

    Abstract: Large Language Models (LLMs) are increasingly relied upon to evaluate text outputs of other LLMs, thereby influencing leaderboards and development decisions. However, concerns persist over the accuracy of these assessments and the potential for misleading conclusions. In this work, we investigate the effectiveness of LLMs as evaluators for text generation tasks. We propose FBI, a novel framework d…

    Submitted 19 June, 2024; originally announced June 2024.

  2. arXiv:2406.03893  [pdf, other]

    cs.CL

    How Good is Zero-Shot MT Evaluation for Low Resource Indian Languages?

    Authors: Anushka Singh, Ananya B. Sai, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, Mitesh M Khapra

    Abstract: While machine translation evaluation has been studied primarily for high-resource languages, there has been a recent interest in evaluation for low-resource languages due to the increasing availability of data and models. In this paper, we focus on a zero-shot evaluation setting focusing on low-resource Indian languages, namely Assamese, Kannada, Maithili, and Punjabi. We collect sufficient Multi-…

    Submitted 6 June, 2024; originally announced June 2024.

  3. arXiv:2403.06350  [pdf, other]

    cs.CL

    IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages

    Authors: Mohammed Safi Ur Rahman Khan, Priyam Mehta, Ananth Sankar, Umashankar Kumaravelan, Sumanth Doddapaneni, Suriyaprasaad G, Varun Balan G, Sparsh Jain, Anoop Kunchukuttan, Pratyush Kumar, Raj Dabre, Mitesh M. Khapra

    Abstract: Despite the considerable advancements in English LLMs, the progress in building comparable models for other languages has been hindered due to the scarcity of tailored resources. Our work aims to bridge this divide by introducing an expansive suite of resources specifically designed for the development of Indic LLMs, covering 22 languages, containing a total of 251B tokens and 74.8M instruction-re…

    Submitted 10 March, 2024; originally announced March 2024.

  4. arXiv:2403.01926  [pdf, other]

    cs.CL

    IndicVoices: Towards building an Inclusive Multilingual Speech Dataset for Indian Languages

    Authors: Tahir Javed, Janki Atul Nawale, Eldho Ittan George, Sakshi Joshi, Kaushal Santosh Bhogale, Deovrat Mehendale, Ishvinder Virender Sethi, Aparna Ananthanarayanan, Hafsah Faquih, Pratiti Palit, Sneha Ravishankar, Saranya Sukumaran, Tripura Panchagnula, Sunjay Murali, Kunal Sharad Gandhi, Ambujavalli R, Manickam K M, C Venkata Vaijayanthi, Krishnan Srinivasa Raghavan Karunganni, Pratyush Kumar, Mitesh M Khapra

    Abstract: We present INDICVOICES, a dataset of natural and spontaneous speech containing a total of 7348 hours of read (9%), extempore (74%) and conversational (17%) audio from 16237 speakers covering 145 Indian districts and 22 languages. Of these 7348 hours, 1639 hours have already been transcribed, with a median of 73 hours per language. Through this paper, we share our journey of capturing the cultural,…

    Submitted 4 March, 2024; originally announced March 2024.

  5. arXiv:2401.15006  [pdf, other]

    cs.CL cs.AI

    Airavata: Introducing Hindi Instruction-tuned LLM

    Authors: Jay Gala, Thanmay Jayakumar, Jaavid Aktar Husain, Aswanth Kumar M, Mohammed Safi Ur Rahman Khan, Diptesh Kanojia, Ratish Puduppully, Mitesh M. Khapra, Raj Dabre, Rudra Murthy, Anoop Kunchukuttan

    Abstract: We announce the initial release of "Airavata," an instruction-tuned LLM for Hindi. Airavata was created by fine-tuning OpenHathi with diverse, instruction-tuning Hindi datasets to make it better suited for assistive tasks. Along with the model, we also share the IndicInstruct dataset, which is a collection of diverse instruction-tuning datasets to enable further research for Indic LLMs. Additional…

    Submitted 26 February, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: Work in progress

  6. arXiv:2305.16307  [pdf]

    cs.CL

    IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled Indian Languages

    Authors: Jay Gala, Pranjal A. Chitale, Raghavan AK, Varun Gumma, Sumanth Doddapaneni, Aswanth Kumar, Janki Nawale, Anupama Sujatha, Ratish Puduppully, Vivek Raghavan, Pratyush Kumar, Mitesh M. Khapra, Raj Dabre, Anoop Kunchukuttan

    Abstract: India has a rich linguistic landscape with languages from 4 major language families spoken by over a billion people. 22 of these languages, listed in the Constitution of India (referred to as scheduled languages), are the focus of this work. Given the linguistic diversity, high-quality and accessible Machine Translation (MT) systems are essential in a country like India. Prior to this work, ther…

    Submitted 20 December, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: Accepted at TMLR

  7. arXiv:2305.15814  [pdf, other]

    cs.CL

    Bhasha-Abhijnaanam: Native-script and romanized Language Identification for 22 Indic languages

    Authors: Yash Madhani, Mitesh M. Khapra, Anoop Kunchukuttan

    Abstract: We create publicly available language identification (LID) datasets and models in all 22 Indian languages listed in the Indian constitution in both native-script and romanized text. First, we create Bhasha-Abhijnaanam, a language identification test set for native-script as well as romanized text which spans all 22 Indic languages. We also train IndicLID, a language identifier for all the above-me…

    Submitted 26 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  8. arXiv:2305.15760  [pdf, other]

    cs.CL cs.SD eess.AS

    Svarah: Evaluating English ASR Systems on Indian Accents

    Authors: Tahir Javed, Sakshi Joshi, Vignesh Nagarajan, Sai Sundaresan, Janki Nawale, Abhigyan Raman, Kaushal Bhogale, Pratyush Kumar, Mitesh M. Khapra

    Abstract: India is the second largest English-speaking country in the world with a speaker base of roughly 130 million. Thus, it is imperative that automatic speech recognition (ASR) systems for English should be evaluated on Indian accents. Unfortunately, Indian speakers find a very poor representation in existing English ASR benchmarks such as LibriSpeech, Switchboard, Speech Accent Archive, etc. In this…

    Submitted 25 May, 2023; originally announced May 2023.

  9. arXiv:2305.15386  [pdf, other]

    cs.CL cs.SD eess.AS

    Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR

    Authors: Kaushal Santosh Bhogale, Sai Sundaresan, Abhigyan Raman, Tahir Javed, Mitesh M. Khapra, Pratyush Kumar

    Abstract: Improving ASR systems is necessary to make new LLM-based use-cases accessible to people across the globe. In this paper, we focus on Indian languages, and make the case that diverse benchmarks are required to evaluate and improve ASR systems for Indian languages. To address this, we collate Vistaar as a set of 59 benchmarks across various language and domain combinations, on which we evaluate 3 pu…

    Submitted 2 August, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted in INTERSPEECH 2023

  10. arXiv:2305.07491  [pdf, other]

    cs.CL

    A Comprehensive Analysis of Adapter Efficiency

    Authors: Nandini Mundra, Sumanth Doddapaneni, Raj Dabre, Anoop Kunchukuttan, Ratish Puduppully, Mitesh M. Khapra

    Abstract: Adapters have been positioned as a parameter-efficient fine-tuning (PEFT) approach, whereby a minimal number of parameters are added to the model and fine-tuned. However, adapters have not been sufficiently analyzed to understand if PEFT translates to benefits in training/deployment efficiency and maintainability/extensibility. Through extensive experiments on many adapters, tasks, and languages i…

    Submitted 12 May, 2023; originally announced May 2023.

  11. arXiv:2212.10180  [pdf, other]

    cs.CL

    IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation metrics for Indian Languages

    Authors: Ananya B. Sai, Vignesh Nagarajan, Tanay Dixit, Raj Dabre, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

    Abstract: The rapid growth of machine translation (MT) systems has necessitated comprehensive studies to meta-evaluate evaluation metrics being used, which enables a better selection of metrics that best reflect MT quality. Unfortunately, most of the research focuses on high-resource languages, mainly English, the observations for which may not always apply to other languages. Indian languages, having over…

    Submitted 3 July, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: ACL 2023 long paper

  12. arXiv:2212.10168  [pdf, other]

    cs.CL

    Naamapadam: A Large-Scale Named Entity Annotated Data for Indic Languages

    Authors: Arnav Mhaske, Harshit Kedia, Sumanth Doddapaneni, Mitesh M. Khapra, Pratyush Kumar, Rudra Murthy V, Anoop Kunchukuttan

    Abstract: We present Naamapadam, the largest publicly available Named Entity Recognition (NER) dataset for the 11 major Indian languages from two language families. The dataset contains more than 400k sentences annotated with a total of at least 100k entities from three standard entity categories (Person, Location, and Organization) for 9 out of the 11 languages. The training dataset has been automaticall…

    Submitted 28 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: ACL 2023

  13. arXiv:2212.05409  [pdf, other]

    cs.CL

    Towards Leaving No Indic Language Behind: Building Monolingual Corpora, Benchmark and Models for Indic Languages

    Authors: Sumanth Doddapaneni, Rahul Aralikatte, Gowtham Ramesh, Shreya Goyal, Mitesh M. Khapra, Anoop Kunchukuttan, Pratyush Kumar

    Abstract: Building Natural Language Understanding (NLU) capabilities for Indic languages, which have a collective speaker base of more than one billion speakers, is absolutely crucial. In this work, we aim to improve the NLU capabilities of Indic languages by making contributions along 3 important axes (i) monolingual corpora (ii) NLU testsets (iii) multilingual LLMs focusing on Indic languages. Specifically…

    Submitted 24 May, 2023; v1 submitted 10 December, 2022; originally announced December 2022.

    Comments: ACL 2023

  14. arXiv:2211.09536  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Towards Building Text-To-Speech Systems for the Next Billion Users

    Authors: Gokul Karthik Kumar, Praveen S V, Pratyush Kumar, Mitesh M. Khapra, Karthik Nandakumar

    Abstract: Deep learning based text-to-speech (TTS) systems have been evolving rapidly with advances in model architectures, training methodologies, and generalization across speakers and languages. However, these advances have not been thoroughly investigated for Indian language speech synthesis. Such investigation is computationally expensive given the number and diversity of Indian languages, relatively l…

    Submitted 17 February, 2023; v1 submitted 17 November, 2022; originally announced November 2022.

    Comments: Accepted at ICASSP 2023. Gokul and Praveen contributed equally

  15. arXiv:2208.12666  [pdf, other]

    cs.CL cs.SD eess.AS

    Effectiveness of Mining Audio and Text Pairs from Public Data for Improving ASR Systems for Low-Resource Languages

    Authors: Kaushal Santosh Bhogale, Abhigyan Raman, Tahir Javed, Sumanth Doddapaneni, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

    Abstract: End-to-end (E2E) models have become the default choice for state-of-the-art speech recognition systems. Such models are trained on large amounts of labelled data, which are often not available for low-resource languages. Techniques such as self-supervised learning and transfer learning hold promise, but have not yet been effective in training accurate models. On the other hand, collecting labelled…

    Submitted 26 August, 2022; originally announced August 2022.

  16. arXiv:2208.11761  [pdf, other]

    cs.CL cs.SD eess.AS

    IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages

    Authors: Tahir Javed, Kaushal Santosh Bhogale, Abhigyan Raman, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

    Abstract: A cornerstone in AI research has been the creation and adoption of standardized training and test datasets to earmark the progress of state-of-the-art models. A particularly successful example is the GLUE dataset for training and evaluating Natural Language Understanding (NLU) models for English. The large body of research around self-supervised BERT-based language models revolved around performan…

    Submitted 15 December, 2022; v1 submitted 24 August, 2022; originally announced August 2022.

  17. arXiv:2205.03018  [pdf]

    cs.CL

    Aksharantar: Open Indic-language Transliteration datasets and models for the Next Billion Users

    Authors: Yash Madhani, Sushane Parthan, Priyanka Bedekar, Gokul NC, Ruchi Khapra, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

    Abstract: Transliteration is very important in the Indian language context due to the usage of multiple scripts and the widespread use of romanized inputs. However, few training and evaluation sets are publicly available. We introduce Aksharantar, the largest publicly available transliteration dataset for Indian languages created by mining from monolingual and parallel corpora, as well as collecting data fr…

    Submitted 26 October, 2023; v1 submitted 6 May, 2022; originally announced May 2022.

    Comments: This manuscript is an extended version of the paper accepted to EMNLP Findings 2023. You can find the EMNLP Findings version at https://anoopkunchukuttan.gitlab.io/publications/emnlp_findings_2023_aksharantar.pdf

  18. arXiv:2203.14049  [pdf, other]

    cs.LG cs.CL cs.HC

    Joint Transformer/RNN Architecture for Gesture Typing in Indic Languages

    Authors: Emil Biju, Anirudh Sriram, Mitesh M. Khapra, Pratyush Kumar

    Abstract: Gesture typing is a method of typing words on a touch-based keyboard by creating a continuous trace passing through the relevant keys. This work is aimed at developing a keyboard that supports gesture typing in Indic languages. We begin by noting that when dealing with Indic languages, one needs to cater to two different sets of users: (i) users who prefer to type in the native Indic script (Devan…

    Submitted 26 March, 2022; originally announced March 2022.

    Comments: Published at COLING 2020, 12 pages, 4 Tables and 5 Figures

  19. arXiv:2203.12298  [pdf, other]

    cs.CL cs.CR cs.LG

    Input-specific Attention Subnetworks for Adversarial Detection

    Authors: Emil Biju, Anirudh Sriram, Pratyush Kumar, Mitesh M Khapra

    Abstract: Self-attention heads are characteristic of Transformer models and have been well studied for interpretability and pruning. In this work, we demonstrate an altogether different utility of attention heads, namely for adversarial detection. Specifically, we propose a method to construct input-specific attention subnetworks (IAS) from which we extract three features to discriminate between authentic a…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: Accepted at Findings of ACL 2022, 14 pages, 6 Tables and 9 Figures

  20. arXiv:2203.06414  [pdf, other]

    cs.CL

    A Survey of Adversarial Defences and Robustness in NLP

    Authors: Shreya Goyal, Sumanth Doddapaneni, Mitesh M. Khapra, Balaraman Ravindran

    Abstract: In the past few years, it has become increasingly evident that deep neural networks are not resilient enough to withstand adversarial perturbations in input data, leaving them vulnerable to attack. Various authors have proposed strong adversarial attacks for computer vision and Natural Language Processing (NLP) tasks. As a response, many defense mechanisms have also been proposed to prevent these…

    Submitted 18 April, 2023; v1 submitted 12 March, 2022; originally announced March 2022.

    Comments: Accepted for publication at ACM Computing Surveys

  21. arXiv:2203.06063  [pdf, other]

    cs.CL cs.AI cs.LG

    Active Evaluation: Efficient NLG Evaluation with Few Pairwise Comparisons

    Authors: Akash Kumar Mohankumar, Mitesh M. Khapra

    Abstract: Recent studies have shown the advantages of evaluating NLG systems using pairwise comparisons as opposed to direct assessment. Given $k$ systems, a naive approach for identifying the top-ranked system would be to uniformly obtain pairwise comparisons from all ${k \choose 2}$ pairs of systems. However, this can be very expensive as the number of human annotations required would grow quadratically w…

    Submitted 17 April, 2022; v1 submitted 11 March, 2022; originally announced March 2022.

    Comments: Accepted at ACL 2022; 21 pages and 12 figures

  22. arXiv:2203.05437  [pdf]

    cs.CL cs.AI

    IndicNLG Benchmark: Multilingual Datasets for Diverse NLG Tasks in Indic Languages

    Authors: Aman Kumar, Himani Shrotriya, Prachi Sahu, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, Amogh Mishra, Mitesh M. Khapra, Pratyush Kumar

    Abstract: Natural Language Generation (NLG) for non-English languages is hampered by the scarcity of datasets in these languages. In this paper, we present the IndicNLG Benchmark, a collection of datasets for benchmarking NLG for 11 Indic languages. We focus on five diverse tasks, namely, biography generation using Wikipedia infoboxes, news headline generation, sentence summarization, paraphrase generation…

    Submitted 26 October, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

    Comments: Accepted at EMNLP 2022

  23. arXiv:2111.03945  [pdf, other]

    cs.CL cs.SD eess.AS

    Towards Building ASR Systems for the Next Billion Users

    Authors: Tahir Javed, Sumanth Doddapaneni, Abhigyan Raman, Kaushal Santosh Bhogale, Gowtham Ramesh, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

    Abstract: Recent methods in speech and language technology pretrain very LARGE models which are fine-tuned for specific tasks. However, the benefits of such LARGE models are often limited to a few resource rich languages of the world. In this work, we make multiple contributions towards building ASR systems for low resource languages from the Indian subcontinent. First, we curate 17,000 hours of raw speech…

    Submitted 22 December, 2021; v1 submitted 6 November, 2021; originally announced November 2021.

  24. arXiv:2110.05877  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    OpenHands: Making Sign Language Recognition Accessible with Pose-based Pretrained Models across Languages

    Authors: Prem Selvaraj, Gokul NC, Pratyush Kumar, Mitesh Khapra

    Abstract: AI technologies for Natural Languages have made tremendous progress recently. However, commensurate progress has not been made on Sign Languages, in particular, in recognizing signs as individual words or as complete sentences. We introduce OpenHands, a library where we take four key ideas from the NLP community for low-resource languages and apply them to sign languages for word-level recognition…

    Submitted 12 October, 2021; originally announced October 2021.

    Comments: Submitted to AAAI22, 13 pages, 9 figures, 6 tables

    ACM Class: I.2.7

  25. arXiv:2110.04620  [pdf, other]

    cs.CL cs.AI

    A Framework for Rationale Extraction for Deep QA models

    Authors: Sahana Ramnath, Preksha Nema, Deep Sahni, Mitesh M. Khapra

    Abstract: As neural-network-based QA models become deeper and more complex, there is a demand for robust frameworks which can access a model's rationale for its prediction. Current techniques that provide insights on a model's working are either dependent on adversarial datasets or are proposing models with explicit explanation generation components. These techniques are time-consuming and challenging to ex…

    Submitted 9 October, 2021; originally announced October 2021.

    Comments: 5 pages including references

  26. arXiv:2109.12683  [pdf, other]

    cs.CL

    On the Prunability of Attention Heads in Multilingual BERT

    Authors: Aakriti Budhraja, Madhura Pande, Pratyush Kumar, Mitesh M. Khapra

    Abstract: Large multilingual models, such as mBERT, have shown promise in crosslingual transfer. In this work, we employ pruning to quantify the robustness and interpret layer-wise importance of mBERT. On four GLUE tasks, the relative drops in accuracy due to pruning have almost identical results on mBERT and BERT suggesting that the reduced attention capacity of the multilingual models does not affect robu…

    Submitted 26 September, 2021; originally announced September 2021.

  27. arXiv:2109.05771  [pdf, other]

    cs.CL

    Perturbation CheckLists for Evaluating NLG Evaluation Metrics

    Authors: Ananya B. Sai, Tanay Dixit, Dev Yashpal Sheth, Sreyas Mohan, Mitesh M. Khapra

    Abstract: Natural Language Generation (NLG) evaluation is a multifaceted task requiring assessment of multiple desirable criteria, e.g., fluency, coherency, coverage, relevance, adequacy, overall quality, etc. Across existing datasets for 6 NLG tasks, we observe that the human evaluation scores on these multiple criteria are often not correlated. For example, there is a very low correlation between human sc…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP 2021. See https://iitmnlp.github.io/EvalEval/ for our templates and code

  28. IndicBART: A Pre-trained Model for Indic Natural Language Generation

    Authors: Raj Dabre, Himani Shrotriya, Anoop Kunchukuttan, Ratish Puduppully, Mitesh M. Khapra, Pratyush Kumar

    Abstract: In this paper, we study pre-trained sequence-to-sequence models for a group of related languages, with a focus on Indic languages. We present IndicBART, a multilingual, sequence-to-sequence pre-trained model focusing on 11 Indic languages and English. IndicBART utilizes the orthographic similarity between Indic scripts to improve transfer learning between similar Indic languages. We evaluate Indic…

    Submitted 26 October, 2022; v1 submitted 7 September, 2021; originally announced September 2021.

    Comments: Published at ACL 2022, 15 pages

  29. arXiv:2107.00676  [pdf, other]

    cs.CL

    A Primer on Pretrained Multilingual Language Models

    Authors: Sumanth Doddapaneni, Gowtham Ramesh, Mitesh M. Khapra, Anoop Kunchukuttan, Pratyush Kumar

    Abstract: Multilingual Language Models (MLLMs) such as mBERT, XLM, XLM-R, etc. have emerged as a viable option for bringing the power of pretraining to a large number of languages. Given their success in zero-shot transfer learning, there has emerged a large body of work in (i) building bigger MLLMs covering a large number of languages (ii) creating exhaustive benchmarks covering a wider variety…

    Submitted 23 December, 2021; v1 submitted 1 July, 2021; originally announced July 2021.

  30. arXiv:2104.05596  [pdf]

    cs.CL

    Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages

    Authors: Gowtham Ramesh, Sumanth Doddapaneni, Aravinth Bheemaraj, Mayank Jobanputra, Raghavan AK, Ajitesh Sharma, Sujit Sahoo, Harshita Diddee, Mahalakshmi J, Divyanshu Kakwani, Navneet Kumar, Aswin Pradeep, Srihari Nagaraj, Kumar Deepak, Vivek Raghavan, Anoop Kunchukuttan, Pratyush Kumar, Mitesh Shantadevi Khapra

    Abstract: We present Samanantar, the largest publicly available parallel corpora collection for Indic languages. The collection contains a total of 49.7 million sentence pairs between English and 11 Indic languages (from two language families). Specifically, we compile 12.4 million sentence pairs from existing, publicly-available parallel corpora, and additionally mine 37.4 million sentence pairs from the w…

    Submitted 12 June, 2023; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: Accepted to the Transactions of the Association for Computational Linguistics (TACL)

  31. arXiv:2101.09115  [pdf, other]

    cs.CL cs.AI

    The heads hypothesis: A unifying statistical approach towards understanding multi-headed attention in BERT

    Authors: Madhura Pande, Aakriti Budhraja, Preksha Nema, Pratyush Kumar, Mitesh M. Khapra

    Abstract: Multi-headed attention heads are a mainstay in transformer-based models. Different methods have been proposed to classify the role of each attention head based on the relations between tokens which have high pair-wise attention. These roles include syntactic (tokens with some syntactic relation), local (nearby tokens), block (tokens in the same sentence) and delimiter (the special [CLS], [SEP] tok…

    Submitted 22 January, 2021; originally announced January 2021.

    Comments: accepted at AAAI 2021 (Main conference)

  32. arXiv:2011.15045  [pdf, other]

    eess.IV cs.CV cs.LG stat.ML

    Unsupervised Deep Video Denoising

    Authors: Dev Yashpal Sheth, Sreyas Mohan, Joshua L. Vincent, Ramon Manzorro, Peter A. Crozier, Mitesh M. Khapra, Eero P. Simoncelli, Carlos Fernandez-Granda

    Abstract: Deep convolutional neural networks (CNNs) for video denoising are typically trained with supervision, assuming the availability of clean videos. However, in many applications, such as microscopy, noiseless videos are not available. To address this, we propose an Unsupervised Deep Video Denoiser (UDVD), a CNN architecture designed to be trained exclusively with noisy data. The performance of UDVD i…

    Submitted 19 August, 2021; v1 submitted 30 November, 2020; originally announced November 2020.

    Comments: Dev and Sreyas contributed equally. To appear at 2021 IEEE/CVF International Conference on Computer Vision (ICCV). See https://sreyas-mohan.github.io/udvd/ for code and more results

  33. arXiv:2010.08983  [pdf, other]

    cs.CL cs.AI

    Towards Interpreting BERT for Reading Comprehension Based QA

    Authors: Sahana Ramnath, Preksha Nema, Deep Sahni, Mitesh M. Khapra

    Abstract: BERT and its variants have achieved state-of-the-art performance in various NLP tasks. Since then, various works have been proposed to analyze the linguistic information being captured in BERT. However, the current works do not provide an insight into how BERT is able to achieve near human-level performance on the task of Reading Comprehension based Question Answering. In this work, we attempt to…

    Submitted 18 October, 2020; originally announced October 2020.

    Comments: 7 pages including references and appendix. Accepted at EMNLP 2020

  34. arXiv:2010.00722  [pdf, other]

    cs.LG cs.CL cs.IR

    Evaluating a Generative Adversarial Framework for Information Retrieval

    Authors: Ameet Deshpande, Mitesh M. Khapra

    Abstract: Recent advances in Generative Adversarial Networks (GANs) have resulted in their widespread application to multiple domains. A recent model, IRGAN, applies this framework to Information Retrieval (IR) and has gained significant attention over the last few years. In this focused work, we critically analyze multiple components of IRGAN, while providing experimental and theoretical evidence of some of…

    Submitted 1 October, 2020; originally announced October 2020.

  35. arXiv:2009.11321  [pdf, other]

    cs.CL

    Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining

    Authors: Ananya B. Sai, Akash Kumar Mohankumar, Siddhartha Arora, Mitesh M. Khapra

    Abstract: There is an increasing focus on model-based dialog evaluation metrics such as ADEM, RUBER, and the more recent BERT-based metrics. These models aim to assign a high score to all relevant responses and a low score to all irrelevant responses. Ideally, such models should be trained using multiple relevant and irrelevant responses for any given context. However, no such data is publicly available, an…

    Submitted 23 September, 2020; originally announced September 2020.

    Comments: Accepted for publication in TACL

  36. arXiv:2008.12009  [pdf, other]

    cs.CL

    A Survey of Evaluation Metrics Used for NLG Systems

    Authors: Ananya B. Sai, Akash Kumar Mohankumar, Mitesh M. Khapra

    Abstract: The success of Deep Learning has created a surge in interest in a wide range of Natural Language Generation (NLG) tasks. Deep Learning has not only pushed the state of the art in several existing NLG tasks but has also facilitated researchers to explore various newer NLG tasks such as image captioning. Such rapid progress in NLG has necessitated the development of accurate automatic evaluation m…

    Submitted 5 October, 2020; v1 submitted 27 August, 2020; originally announced August 2020.

    Comments: A condensed version of this paper is submitted to ACM CSUR

  37. arXiv:2008.05828  [pdf, other]

    cs.CL cs.LG

    On the Importance of Local Information in Transformer Based Models

    Authors: Madhura Pande, Aakriti Budhraja, Preksha Nema, Pratyush Kumar, Mitesh M. Khapra

    Abstract: The self-attention module is a key component of Transformer-based models, wherein each token pays attention to every other token. Recent studies have shown that these heads exhibit syntactic, semantic, or local behaviour. Some studies have also identified promise in restricting this attention to be local, i.e., a token attending to other tokens only in a small neighbourhood around it. However, no…

    Submitted 13 August, 2020; originally announced August 2020.

    Comments: 10 pages, 4 figures

  38. arXiv:2007.02240  [pdf, other]

    cs.CV

    A Systematic Evaluation of Object Detection Networks for Scientific Plots

    Authors: Pritha Ganguly, Nitesh Methani, Mitesh M. Khapra, Pratyush Kumar

    Abstract: Are existing object detection methods adequate for detecting text and visual elements in scientific plots which are arguably different than the objects found in natural images? To answer this question, we train and compare the accuracy of various SOTA object detection networks on the PlotQA dataset. At the standard IOU setting of 0.5, most networks perform well with mAP scores greater than 80% in…

    Submitted 19 December, 2020; v1 submitted 5 July, 2020; originally announced July 2020.

    Comments: This work has been accepted and will be presented at AAAI 2021

  39. arXiv:2005.14315  [pdf, other]

    cs.CL

    On Incorporating Structural Information to improve Dialogue Response Generation

    Authors: Nikita Moghe, Priyesh Vijayan, Balaraman Ravindran, Mitesh M. Khapra

    Abstract: We consider the task of generating dialogue responses from background knowledge comprising domain specific resources. Specifically, given a conversation around a movie, the task is to generate the next response based on background knowledge about the movie such as the plot, review, Reddit comments etc. This requires capturing structural, sequential and semantic information from the conversation…

    Submitted 28 May, 2020; originally announced May 2020.

  40. arXiv:2005.00085  [pdf, ps, other]

    cs.CL

    AI4Bharat-IndicNLP Corpus: Monolingual Corpora and Word Embeddings for Indic Languages

    Authors: Anoop Kunchukuttan, Divyanshu Kakwani, Satish Golla, Gokul N. C., Avik Bhattacharyya, Mitesh M. Khapra, Pratyush Kumar

    Abstract: We present the IndicNLP corpus, a large-scale, general-domain corpus containing 2.7 billion words for 10 Indian languages from two language families. We share pre-trained word embeddings trained on these corpora. We create news article category classification datasets for 9 languages to evaluate the embeddings. We show that the IndicNLP embeddings significantly outperform publicly available pre-tr…

    Submitted 30 April, 2020; originally announced May 2020.

    Comments: 7 pages, 8 tables, https://github.com/ai4bharat-indicnlp/indicnlp_corpus

  41. arXiv:2004.14243  [pdf, other]

    cs.CL

    Towards Transparent and Explainable Attention Models

    Authors: Akash Kumar Mohankumar, Preksha Nema, Sharan Narasimhan, Mitesh M. Khapra, Balaji Vasan Srinivasan, Balaraman Ravindran

    Abstract: Recent studies on interpretability of attention distributions have led to notions of faithful and plausible explanations for a model's predictions. Attention distributions can be considered a faithful explanation if a higher attention weight implies a greater impact on the model's prediction. They can be considered a plausible explanation if they provide a human-understandable justification for th…

    Submitted 29 April, 2020; originally announced April 2020.

    Comments: Accepted at ACL 2020

  42. arXiv:1911.00850  [pdf, other]

    cs.AI cs.CL cs.CV

    Scene Graph based Image Retrieval -- A case study on the CLEVR Dataset

    Authors: Sahana Ramnath, Amrita Saha, Soumen Chakrabarti, Mitesh M. Khapra

    Abstract: With the proliferation of multimodal interaction in various domains, recently there has been much interest in text based image retrieval in the computer vision community. However, most of the state of the art techniques model this problem in a purely neural way, which makes it difficult to incorporate pragmatic strategies in searching a large scale catalog especially when the search requirements ar…

    Submitted 3 November, 2019; originally announced November 2019.

    Comments: 3 pages including references, Accepted at the ICCV 2019 Workshop - 'Linguistics Meets Image and Video Retrieval' (received Best Paper Award)

  43. arXiv:1909.05355  [pdf, other]

    cs.CL cs.AI

    Let's Ask Again: Refine Network for Automatic Question Generation

    Authors: Preksha Nema, Akash Kumar Mohankumar, Mitesh M. Khapra, Balaji Vasan Srinivasan, Balaraman Ravindran

    Abstract: In this work, we focus on the task of Automatic Question Generation (AQG) where given a passage and an answer the task is to generate the corresponding question. It is desired that the generated question should be (i) grammatically correct (ii) answerable from the passage and (iii) specific to the given answer. An analysis of existing AQG models shows that they produce questions which do not adher…

    Submitted 31 August, 2019; originally announced September 2019.

    Comments: accepted in EMNLP 2019 in Main Conference, (10 pages)

  44. arXiv:1909.00997  [pdf, other]

    cs.CV cs.AI cs.CL

    PlotQA: Reasoning over Scientific Plots

    Authors: Nitesh Methani, Pritha Ganguly, Mitesh M. Khapra, Pratyush Kumar

    Abstract: Existing synthetic datasets (FigureQA, DVQA) for reasoning over plots do not contain variability in data labels, real-valued data, or complex reasoning questions. Consequently, proposed models for these datasets do not fully address the challenge of reasoning over plots. In particular, they assume that the answer comes either from a small fixed size vocabulary or from a bounding box within the ima…

    Submitted 1 February, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: This is an extension of our previous arxiv paper "Data Interpretation over Plots" and it is to be presented at WACV 2020

  45. arXiv:1904.02665  [pdf, ps, other]

    cs.CL

    Frustratingly Poor Performance of Reading Comprehension Models on Non-adversarial Examples

    Authors: Soham Parikh, Ananya B. Sai, Preksha Nema, Mitesh M. Khapra

    Abstract: When humans learn to perform a difficult task (say, reading comprehension (RC) over longer passages), it is typically the case that their performance improves significantly on an easier version of this task (say, RC over shorter passages). Ideally, we would want an intelligent agent to also exhibit such a behavior. However, on experimenting with state of the art RC models using the standard RACE d…

    Submitted 4 April, 2019; originally announced April 2019.

    Comments: 8 pages

  46. arXiv:1904.02651  [pdf, other]

    cs.CL

    ElimiNet: A Model for Eliminating Options for Reading Comprehension with Multiple Choice Questions

    Authors: Soham Parikh, Ananya B. Sai, Preksha Nema, Mitesh M. Khapra

    Abstract: The task of Reading Comprehension with Multiple Choice Questions requires a human (or machine) to read a given passage, question pair and select one of the n given options. The current state of the art model for this task first computes a question-aware representation for the passage and then selects the option which has the maximum similarity with this representation. However, when humans perfor…

    Submitted 4 April, 2019; originally announced April 2019.

    Comments: IJCAI-18

    Journal ref: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (2018) Main track. Pages 4272-4278

  47. arXiv:1902.10640  [pdf, other]

    cs.CV

    Efficient Video Classification Using Fewer Frames

    Authors: Shweta Bhardwaj, Mukundhan Srinivasan, Mitesh M. Khapra

    Abstract: Recently, there has been a lot of interest in building compact models for video classification which have a small memory footprint (<1 GB). While these models are compact, they typically operate by repeated application of a small weight matrix to all the frames in a video. E.g. recurrent neural network based methods compute a hidden state for every frame of the video using a recurrent weight matrix…

    Submitted 27 February, 2019; originally announced February 2019.

    Comments: To Appear in Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition (CVPR'2019)

  48. arXiv:1902.08832  [pdf, other]

    cs.CL

    Re-evaluating ADEM: A Deeper Look at Scoring Dialogue Responses

    Authors: Ananya B. Sai, Mithun Das Gupta, Mitesh M. Khapra, Mukundhan Srinivasan

    Abstract: Automatically evaluating the quality of dialogue responses for unstructured domains is a challenging problem. ADEM (Lowe et al. 2017) formulated the automatic evaluation of dialogue systems as a learning problem and showed that such a model was able to predict responses which correlate significantly with human judgements, both at utterance and system level. Their system was shown to have beaten wor…

    Submitted 23 February, 2019; originally announced February 2019.

    Comments: Accepted as a long paper in the proceedings of AAAI-2019

  49. arXiv:1812.10240  [pdf, other]

    cs.LG cs.CV cs.NE

    Studying the Plasticity in Deep Convolutional Neural Networks using Random Pruning

    Authors: Deepak Mittal, Shweta Bhardwaj, Mitesh M. Khapra, Balaraman Ravindran

    Abstract: Recently there has been a lot of work on pruning filters from deep convolutional neural networks (CNNs) with the intention of reducing computations. The key idea is to rank the filters based on a certain criterion (say, l1-norm) and retain only the top ranked filters. Once the low scoring filters are pruned away the remainder of the network is fine tuned and is shown to give performance comparable…

    Submitted 26 December, 2018; originally announced December 2018.

    Comments: To appear in the Journal of Machine Vision and Applications, Springer. This work is an extended version of our previous work arXiv:1801.10447, "Recovering from Random Pruning: On the Plasticity of Deep Convolutional Neural Networks", accepted at WACV 2018

  50. arXiv:1810.11975  [pdf, other]

    cs.LG cs.CL stat.ML

    On Controllable Sparse Alternatives to Softmax

    Authors: Anirban Laha, Saneem A. Chemmengath, Priyanka Agrawal, Mitesh M. Khapra, Karthik Sankaranarayanan, Harish G. Ramaswamy

    Abstract: Converting an n-dimensional vector to a probability distribution over n objects is a commonly used component in many machine learning tasks like multiclass classification, multilabel classification, attention mechanisms etc. For this, several probability mapping functions have been proposed and employed in literature such as softmax, sum-normalization, spherical softmax, and sparsemax, but there i…

    Submitted 30 October, 2018; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: To appear in NIPS 2018, Total 16 pages including appendix