Showing 1–4 of 4 results for author: Ratnakar, V

  1. arXiv:2404.01474

    cs.CL

    Finding Replicable Human Evaluations via Stable Ranking Probability

    Authors: Parker Riley, Daniel Deutsch, George Foster, Viresh Ratnakar, Ali Dabirmoghaddam, Markus Freitag

    Abstract: Reliable human evaluation is critical to the development of successful natural language generation models, but achieving it is notoriously difficult. Stability is a crucial requirement when ranking systems by quality: consistent ranking of systems across repeated evaluations is not just desirable, but essential. Without it, there is no reliable foundation for hill-climbing or product launch decisi…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: To appear at NAACL 2024

  2. arXiv:2211.09102

    cs.CL

    Prompting PaLM for Translation: Assessing Strategies and Performance

    Authors: David Vilar, Markus Freitag, Colin Cherry, Jiaming Luo, Viresh Ratnakar, George Foster

    Abstract: Large language models (LLMs) that have been trained on multilingual but not parallel text exhibit a remarkable ability to translate between languages. We probe this ability in an in-depth study of the pathways language model (PaLM), which has demonstrated the strongest machine translation (MT) performance among similarly-trained LLMs to date. We investigate various strategies for choosing translat…

    Submitted 25 June, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: ACL 2023

  3. arXiv:2112.08709

    cs.CL

    DOCmT5: Document-Level Pretraining of Multilingual Language Models

    Authors: Chia-Hsuan Lee, Aditya Siddhant, Viresh Ratnakar, Melvin Johnson

    Abstract: In this paper, we introduce DOCmT5, a multilingual sequence-to-sequence language model pretrained with large scale parallel documents. While previous approaches have focused on leveraging sentence-level parallel data, we try to build a general-purpose pretrained model that can understand and generate long documents. We propose a simple and effective pretraining objective - Document reordering Mach…

    Submitted 4 May, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: NAACL 2022 Findings

  4. arXiv:2104.14478

    cs.CL cs.AI cs.LG

    Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation

    Authors: Markus Freitag, George Foster, David Grangier, Viresh Ratnakar, Qijun Tan, Wolfgang Macherey

    Abstract: Human evaluation of modern high-quality machine translation systems is a difficult problem, and there is increasing evidence that inadequate evaluation procedures can lead to erroneous conclusions. While there has been considerable research on human evaluation, the field still lacks a commonly-accepted standard procedure. As a step toward this goal, we propose an evaluation methodology grounded in…

    Submitted 29 April, 2021; originally announced April 2021.