
Showing 1–3 of 3 results for author: Arriola, M

  1. arXiv:2406.07524  [pdf, other]

    cs.CL cs.AI cs.LG

    Simple and Effective Masked Diffusion Language Models

    Authors: Subham Sekhar Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov

    Abstract: While diffusion models excel at generating high-quality images, prior work reports a significant performance gap between diffusion and autoregressive (AR) methods in language modeling. In this work, we show that simple masked discrete diffusion is more performant than previously thought. We apply an effective training recipe that improves the performance of masked diffusion models and derive a sim…

    Submitted 11 June, 2024; originally announced June 2024.

    Report number: cr07

  2. arXiv:2405.11280  [pdf, other]

    cs.LG

    Joint Analysis of Single-Cell Data across Cohorts with Missing Modalities

    Authors: Marianne Arriola, Weishen Pan, Manqi Zhou, Qiannan Zhang, Chang Su, Fei Wang

    Abstract: Joint analysis of multi-omic single-cell data across cohorts has significantly enhanced the comprehensive analysis of cellular processes. However, most of the existing approaches for this purpose require access to samples with complete modality availability, which is impractical in many real-world scenarios. In this paper, we propose (Single-Cell Cross-Cohort Cross-Category) integration, a novel f…

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: 10 pages, 7 figures, 5 tables

  3. arXiv:cmp-lg/9503020  [pdf, ps]

    cs.CL

    Different Issues in the Design of a Lemmatizer/Tagger for Basque

    Authors: I. Aduriz, I. Alegria, J. M. Arriola, X. Artola, Diaz de Illarraza A., N. Ezeiza, K. Gojenola, M. Maritxalar

    Abstract: This paper presents relevant issues that have been considered in the design of a general purpose lemmatizer/tagger for Basque (EUSLEM). The lemmatizer/tagger is conceived as a basic tool necessary for other linguistic applications. It uses the lexical data base and the morphological analyzer previously developed and implemented. Due to the characteristics of the language, the tagset here propose…

    Submitted 20 March, 1995; originally announced March 1995.

    Comments: in Proceedings of SIGDAT-95 (EACL-Dublin), 6 pages, uuencoded and gz-compressed postscript