Skip to main content

Showing 1–2 of 2 results for author: Mikuła, M

Searching in archive cs. Search in all archives.
.
  1. arXiv:2403.08295  [pdf, other

    cs.CL cs.AI

    Gemma: Open Models Based on Gemini Research and Technology

    Authors: Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Pier Giuseppe Sessa, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev, Alex Castro-Ros, Ambrose Slone, Amélie Héliou, Andrea Tacchetti, Anna Bulanova, Antonia Paterson, Beth Tsai, Bobak Shahriari , et al. (83 additional authors not shown)

    Abstract: This work introduces Gemma, a family of lightweight, state-of-the art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety. We release two sizes of models (2 billion and 7 billion parameters), and provide both pretrained and fine-tuned checkpoints. Ge… ▽ More

    Submitted 16 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  2. arXiv:2303.04488  [pdf, other

    cs.LG cs.AI cs.LO

    Magnushammer: A Transformer-Based Approach to Premise Selection

    Authors: Maciej Mikuła, Szymon Tworkowski, Szymon Antoniak, Bartosz Piotrowski, Albert Qiaochu Jiang, ** Peng Zhou, Christian Szegedy, Łukasz Kuciński, Piotr Miłoś, Yuhuai Wu

    Abstract: This paper presents a novel approach to premise selection, a crucial reasoning task in automated theorem proving. Traditionally, symbolic methods that rely on extensive domain knowledge and engineering effort are applied to this task. In contrast, this work demonstrates that contrastive training with the transformer architecture can achieve higher-quality retrieval of relevant premises, without th… ▽ More

    Submitted 18 March, 2024; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: ICLR 2024