Showing 1–1 of 1 results for author: Marroquin, E

  1. arXiv:2406.07524  [pdf, other]

    cs.CL cs.AI cs.LG

    Simple and Effective Masked Diffusion Language Models

    Authors: Subham Sekhar Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov

    Abstract: While diffusion models excel at generating high-quality images, prior work reports a significant performance gap between diffusion and autoregressive (AR) methods in language modeling. In this work, we show that simple masked discrete diffusion is more performant than previously thought. We apply an effective training recipe that improves the performance of masked diffusion models and derive a sim…

    Submitted 11 June, 2024; originally announced June 2024.

    Report number: cr07