Showing 1–12 of 12 results for author: Parvez, M R

  1. arXiv:2405.14277  [pdf, other]

    cs.CL

    Improving Language Models Trained with Translated Data via Continual Pre-Training and Dictionary Learning Analysis

    Authors: Sabri Boughorbel, MD Rizwan Parvez, Majd Hawasly

    Abstract: Training LLMs in low-resource languages usually utilizes data augmentation with machine translation (MT) from English. However, translation brings a number of challenges: there are large costs attached to translating and curating huge amounts of content with high-end machine translation solutions, the translated content carries over cultural biases, and if the translation is not faithful…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 15 pages

  2. arXiv:2405.11403  [pdf, other]

    cs.CL cs.AI

    MapCoder: Multi-Agent Code Generation for Competitive Problem Solving

    Authors: Md. Ashraful Islam, Mohammed Eunus Ali, Md Rizwan Parvez

    Abstract: Code synthesis, which requires a deep understanding of complex natural language problem descriptions, generation of code instructions for complex algorithms and data structures, and the successful execution of comprehensive unit tests, presents a significant challenge. While large language models (LLMs) demonstrate impressive proficiency in natural language processing, their performance in code ge…

    Submitted 18 May, 2024; originally announced May 2024.

  3. arXiv:2403.09028  [pdf, other]

    cs.CL

    ChartInstruct: Instruction Tuning for Chart Comprehension and Reasoning

    Authors: Ahmed Masry, Mehrad Shahmohammadi, Md Rizwan Parvez, Enamul Hoque, Shafiq Joty

    Abstract: Charts provide visual representations of data and are widely used for analyzing information, addressing queries, and conveying insights to others. Various chart-related downstream tasks have emerged recently, such as question-answering and summarization. A common strategy to solve these tasks is to fine-tune various models originally trained on vision-language tasks. However, such task-specific mo…

    Submitted 13 March, 2024; originally announced March 2024.

  4. arXiv:2401.05787  [pdf, other]

    cs.CL

    Evidence to Generate (E2G): A Single-agent Two-step Prompting for Context Grounded and Retrieval Augmented Reasoning

    Authors: Md Rizwan Parvez

    Abstract: While chain-of-thought (CoT) prompting has revolutionized how LLMs perform reasoning tasks, its current methods and variations (e.g., Self-consistency, ReACT, Reflexion, Tree-of-Thoughts (ToT), Cumulative Reasoning (CR)) suffer from limitations like slowness, limited context grounding, hallucination and inconsistent outputs. To overcome these challenges, we introduce Evidence to Generate (E2G), a n…

    Submitted 11 January, 2024; originally announced January 2024.

  5. arXiv:2312.05200  [pdf, other]

    cs.CL

    DelucionQA: Detecting Hallucinations in Domain-specific Question Answering

    Authors: Mobashir Sadat, Zhengyu Zhou, Lukas Lange, Jun Araki, Arsalan Gundroo, Bingqing Wang, Rakesh R Menon, Md Rizwan Parvez, Zhe Feng

    Abstract: Hallucination is a well-known phenomenon in text generated by large language models (LLMs). Hallucinatory responses are found in almost all application scenarios, e.g., summarization and question answering (QA). For applications requiring high reliability (e.g., customer-facing assistants), the potential existence of hallucination in LLM-generated text is a critical problem. The am…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: Accepted in EMNLP 2023 (Findings)

  6. arXiv:2311.08377  [pdf, other]

    cs.CL cs.AI

    Learning to Filter Context for Retrieval-Augmented Generation

    Authors: Zhiruo Wang, Jun Araki, Zhengbao Jiang, Md Rizwan Parvez, Graham Neubig

    Abstract: On-the-fly retrieval of relevant knowledge has proven an essential element of reliable systems for tasks such as open-domain question answering and fact verification. However, because retrieval systems are not perfect, generation models are required to generate outputs given partially or entirely irrelevant passages. This can cause over- or under-reliance on context, and result in problems in the…

    Submitted 14 November, 2023; originally announced November 2023.

  7. arXiv:2303.03004  [pdf, other]

    cs.CL

    xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval

    Authors: Mohammad Abdullah Matin Khan, M Saiful Bari, Xuan Long Do, Weishi Wang, Md Rizwan Parvez, Shafiq Joty

    Abstract: Recently, pre-trained large language models (LLMs) have shown impressive abilities in generating code from natural language descriptions, repairing buggy code, translating code between languages, and retrieving relevant code segments. However, the evaluation of these models has often been performed in a scattered way on only one or two specific tasks, in a few languages, at a partial granularit…

    Submitted 6 November, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

    Comments: Code & Data available at https://github.com/ntunlp/xCodeEval, https://huggingface.co/datasets/NTU-NLP-sg/xCodeEval. Evaluation framework available at https://github.com/ntunlp/execeval

  8. arXiv:2204.08952  [pdf, other]

    cs.CL

    Retrieval Enhanced Data Augmentation for Question Answering on Privacy Policies

    Authors: Md Rizwan Parvez, Jianfeng Chi, Wasi Uddin Ahmad, Yuan Tian, Kai-Wei Chang

    Abstract: Prior studies in privacy policies frame the question answering (QA) task as identifying the most relevant text segment or a list of sentences from a policy document given a user query. Existing labeled datasets are heavily imbalanced (only a few relevant segments), limiting the QA performance in this domain. In this paper, we develop a data augmentation framework based on ensembling retriever mode…

    Submitted 22 April, 2023; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: EACL 2023

  9. arXiv:2108.11601  [pdf, other]

    cs.SE cs.CL

    Retrieval Augmented Code Generation and Summarization

    Authors: Md Rizwan Parvez, Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

    Abstract: Software developers write a lot of source code and documentation during software development. Intrinsically, developers often recall parts of source code or code summaries that they had written in the past while implementing software or documenting them. To mimic developers' code or summary generation behavior, we propose a retrieval augmented framework, REDCODER, that retrieves relevant code or s…

    Submitted 10 September, 2021; v1 submitted 26 August, 2021; originally announced August 2021.

    Comments: accepted in EMNLP-Findings 2021

  10. arXiv:2104.12567  [pdf, other]

    cs.CL

    Evaluating the Values of Sources in Transfer Learning

    Authors: Md Rizwan Parvez, Kai-Wei Chang

    Abstract: Transfer learning that adapts a model trained on data-rich sources to low-resource targets has been widely applied in natural language processing (NLP). However, when training a transfer model over multiple sources, not every source is equally useful for the target. To better transfer a model, it is essential to understand the values of the sources. In this paper, we develop SEAL-Shap, an efficien…

    Submitted 26 April, 2021; originally announced April 2021.

    Comments: NAACL 2021 Camera Ready

    Journal ref: @inproceedings{parvez2021evaluating, title = {Evaluating the Values of Sources in Transfer Learning}, author = {Parvez, Md Rizwan and Chang, Kai-Wei}, booktitle = {NAACL}, year = {2021} }

  11. arXiv:1808.08270  [pdf, other]

    cs.LG cs.CL stat.ML

    Robust Text Classifier on Test-Time Budgets

    Authors: Md Rizwan Parvez, Tolga Bolukbasi, Kai-Wei Chang, Venkatesh Saligrama

    Abstract: We propose a generic and interpretable learning framework for building a robust text classification model that achieves accuracy comparable to full models under test-time budget constraints. Our approach learns a selector to identify words that are relevant to the prediction tasks and passes them to the classifier for processing. The selector is trained jointly with the classifier and directly learn…

    Submitted 13 September, 2019; v1 submitted 24 August, 2018; originally announced August 2018.

    Comments: To appear at EMNLP-IJCAI 2019, 6 pages + 2 pages appendix

  12. arXiv:1805.04836  [pdf, other]

    cs.CL

    Building Language Models for Text with Named Entities

    Authors: Md Rizwan Parvez, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

    Abstract: Text in many domains involves a significant amount of named entities. Predicting the entity names is often challenging for a language model as they appear less frequently in the training corpus. In this paper, we propose a novel and effective approach to building a discriminative language model which can learn the entity names by leveraging their entity type information. We also introduce two benc…

    Submitted 13 May, 2018; originally announced May 2018.