Showing 1–2 of 2 results for author: Palavalli, M

  1. arXiv:2407.08716  [pdf, other]

    cs.CL

    A Taxonomy for Data Contamination in Large Language Models

    Authors: Medha Palavalli, Amanda Bertsch, Matthew R. Gormley

    Abstract: Large language models pretrained on extensive web corpora demonstrate remarkable performance across a wide range of downstream tasks. However, a growing concern is data contamination, where evaluation datasets may be contained in the pretraining corpus, inflating model performance. Decontamination, the process of detecting and removing such data, is a potential solution; yet these contaminants may…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 19 pages, 8 figures, accepted to CONDA Workshop on Data Contamination @ ACL 2024

  2. arXiv:2306.17384  [pdf, other]

    cs.CL

    SummQA at MEDIQA-Chat 2023: In-Context Learning with GPT-4 for Medical Summarization

    Authors: Yash Mathur, Sanketh Rangreji, Raghav Kapoor, Medha Palavalli, Amanda Bertsch, Matthew R. Gormley

    Abstract: Medical dialogue summarization is challenging due to the unstructured nature of medical conversations, the use of medical terminology in gold summaries, and the need to identify key information across multiple symptom sets. We present a novel system for the Dialogue2Note Medical Summarization tasks in the MEDIQA 2023 Shared Task. Our approach for section-wise summarization (Task A) is a two-stage…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: ClinicalNLP @ ACL 2023