Showing 1–42 of 42 results for author: Michalewski, H

Searching in archive cs.
  1. arXiv:2403.08295  [pdf, other]

    cs.CL cs.AI

    Gemma: Open Models Based on Gemini Research and Technology

    Authors: Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Pier Giuseppe Sessa, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev, Alex Castro-Ros, Ambrose Slone, Amélie Héliou, Andrea Tacchetti, Anna Bulanova, Antonia Paterson, Beth Tsai, Bobak Shahriari, et al. (83 additional authors not shown)

    Abstract: This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety. We release two sizes of models (2 billion and 7 billion parameters), and provide both pretrained and fine-tuned checkpoints. Ge…

    Submitted 16 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  2. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  3. arXiv:2402.08073  [pdf, other]

    cs.LG cs.PL cs.SE

    Grounding Data Science Code Generation with Input-Output Specifications

    Authors: Yeming Wen, Pengcheng Yin, Kensen Shi, Henryk Michalewski, Swarat Chaudhuri, Alex Polozov

    Abstract: Large language models (LLMs) have recently demonstrated a remarkable ability to generate code from natural language (NL) prompts. However, in the real world, NL is often too ambiguous to capture the true intent behind programming problems, requiring additional input-output (I/O) specifications. Unfortunately, LLMs can have difficulty aligning their outputs with both the NL prompt and the I/O speci…

    Submitted 14 March, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  4. arXiv:2402.03877  [pdf, other]

    cs.CL cs.AI

    Beyond Lines and Circles: Unveiling the Geometric Reasoning Gap in Large Language Models

    Authors: Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski

    Abstract: Large Language Models (LLMs) demonstrate ever-increasing abilities in mathematical and algorithmic tasks, yet their geometric reasoning skills are underexplored. We investigate LLMs' abilities in constructive geometric problem-solving, one of the most fundamental steps in the development of human mathematical reasoning. Our work reveals notable challenges that the state-of-the-art LLMs face in this…

    Submitted 14 February, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: Preprint. Work in progress

  5. arXiv:2312.17296  [pdf, other]

    cs.CL

    Structured Packing in LLM Training Improves Long Context Utilization

    Authors: Konrad Staniszewski, Szymon Tworkowski, Sebastian Jaszczur, Yu Zhao, Henryk Michalewski, Łukasz Kuciński, Piotr Miłoś

    Abstract: Recent advancements in long-context large language models have attracted significant attention, yet their practical applications often suffer from suboptimal context utilization. This study investigates structuring training data to enhance semantic interdependence, demonstrating that this approach effectively improves context utilization. To this end, we introduce the Structured Packing for Long C…

    Submitted 24 June, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: new experiments with a 13B model

  6. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  7. arXiv:2309.16797  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution

    Authors: Chrisantha Fernando, Dylan Banarse, Henryk Michalewski, Simon Osindero, Tim Rocktäschel

    Abstract: Popular prompt strategies like Chain-of-Thought Prompting can dramatically improve the reasoning abilities of Large Language Models (LLMs) in various domains. However, such hand-crafted prompt-strategies are often sub-optimal. In this paper, we present Promptbreeder, a general-purpose self-referential self-improvement mechanism that evolves and adapts prompts for a given domain. Driven by an LLM,…

    Submitted 28 September, 2023; originally announced September 2023.

  8. arXiv:2307.15818  [pdf, other]

    cs.RO cs.CL cs.CV cs.LG

    RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control

    Authors: Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, Pete Florence, Chuyuan Fu, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Kehang Han, Karol Hausman, Alexander Herzog, Jasmine Hsu, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, et al. (29 additional authors not shown)

    Abstract: We study how vision-language models trained on Internet-scale data can be incorporated directly into end-to-end robotic control to boost generalization and enable emergent semantic reasoning. Our goal is to enable a single end-to-end trained model to both learn to map robot observations to actions and enjoy the benefits of large-scale pretraining on language and vision-language data from the web.…

    Submitted 28 July, 2023; originally announced July 2023.

    Comments: Website: https://robotics-transformer.github.io/

  9. arXiv:2307.03170  [pdf, other]

    cs.CL cs.AI cs.LG

    Focused Transformer: Contrastive Training for Context Scaling

    Authors: Szymon Tworkowski, Konrad Staniszewski, Mikołaj Pacek, Yuhuai Wu, Henryk Michalewski, Piotr Miłoś

    Abstract: Large language models have an exceptional capability to incorporate new information in a contextual manner. However, the full potential of such an approach is often restrained due to a limitation in the effective context length. One solution to this issue is to endow an attention layer with access to an external memory, which comprises (key, value) pairs. Yet, as the number of documents increas…

    Submitted 30 November, 2023; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: Accepted at 37th Conference on Neural Information Processing Systems (NeurIPS 2023). 28 pages, 10 figures, 11 tables

  10. arXiv:2212.09248  [pdf, other]

    cs.CL cs.SE

    Natural Language to Code Generation in Interactive Data Science Notebooks

    Authors: Pengcheng Yin, Wen-Ding Li, Kefan Xiao, Abhishek Rao, Yeming Wen, Kensen Shi, Joshua Howland, Paige Bailey, Michele Catasta, Henryk Michalewski, Alex Polozov, Charles Sutton

    Abstract: Computational notebooks, such as Jupyter notebooks, are interactive computing environments that are ubiquitous among data scientists to perform data wrangling and analytic tasks. To measure the performance of AI pair programmers that automatically synthesize programs for those tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1082 code generation problems using…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: 46 pages. 32 figures

  11. arXiv:2211.00609  [pdf, other]

    cs.AI cs.PL

    A Simple, Yet Effective Approach to Finding Biases in Code Generation

    Authors: Spyridon Mouselinos, Mateusz Malinowski, Henryk Michalewski

    Abstract: Recently, high-performing code generation systems based on large language models have surfaced. They are trained on massive corpora containing much more natural text than actual executable computer code. This work shows that current code generation systems exhibit undesired biases inherited from their large language model backbones, which can reduce the quality of the generated code under specific…

    Submitted 9 May, 2023; v1 submitted 31 October, 2022; originally announced November 2022.

    Comments: To appear in ACL Findings 2023

  12. arXiv:2207.10342  [pdf, ps, other]

    cs.CL cs.AI

    Language Model Cascades

    Authors: David Dohan, Winnie Xu, Aitor Lewkowycz, Jacob Austin, David Bieber, Raphael Gontijo Lopes, Yuhuai Wu, Henryk Michalewski, Rif A. Saurous, Jascha Sohl-dickstein, Kevin Murphy, Charles Sutton

    Abstract: Prompted models have demonstrated impressive few-shot learning abilities. Repeated interactions at test-time with a single model, or the composition of multiple models together, further expands capabilities. These compositions are probabilistic models, and may be expressed in the language of graphical models with random variables whose values are complex data types such as strings. Cases with cont…

    Submitted 28 July, 2022; v1 submitted 21 July, 2022; originally announced July 2022.

    Comments: Presented as spotlight at the Beyond Bayes workshop at ICML 2022 (https://beyond-bayes.github.io)

  13. arXiv:2206.14858  [pdf, other]

    cs.CL cs.AI cs.LG

    Solving Quantitative Reasoning Problems with Language Models

    Authors: Aitor Lewkowycz, Anders Andreassen, David Dohan, Ethan Dyer, Henryk Michalewski, Vinay Ramasesh, Ambrose Slone, Cem Anil, Imanol Schlag, Theo Gutman-Solo, Yuhuai Wu, Behnam Neyshabur, Guy Gur-Ari, Vedant Misra

    Abstract: Language models have achieved remarkable performance on a wide range of tasks that require natural language understanding. Nevertheless, state-of-the-art models have generally struggled with tasks that require quantitative reasoning, such as solving mathematics, science, and engineering problems at the college level. To help close this gap, we introduce Minerva, a large language model pretrained o…

    Submitted 30 June, 2022; v1 submitted 29 June, 2022; originally announced June 2022.

    Comments: 12 pages, 5 figures + references and appendices

  14. arXiv:2205.15241  [pdf, other]

    cs.AI cs.LG

    Multi-Game Decision Transformers

    Authors: Kuang-Huei Lee, Ofir Nachum, Mengjiao Yang, Lisa Lee, Daniel Freeman, Winnie Xu, Sergio Guadarrama, Ian Fischer, Eric Jang, Henryk Michalewski, Igor Mordatch

    Abstract: A longstanding goal of the field of AI is a method for learning a highly capable, generalist agent from diverse experience. In the subfields of vision and language, this was largely achieved by scaling up transformer-based models and training them on large, diverse datasets. Motivated by this progress, we investigate whether the same strategy can be used to produce generalist reinforcement learnin…

    Submitted 15 October, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: NeurIPS 2022. 24 pages, 16 figures. Additional information, videos and code can be seen at https://sites.google.com/view/multi-game-transformers

  15. arXiv:2204.02311  [pdf, other]

    cs.CL

    PaLM: Scaling Language Modeling with Pathways

    Authors: Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, et al. (42 additional authors not shown)

    Abstract: Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Tran…

    Submitted 5 October, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

  16. arXiv:2202.12162  [pdf, other]

    cs.LG cs.AI cs.CV

    Measuring CLEVRness: Blackbox testing of Visual Reasoning Models

    Authors: Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski

    Abstract: How can we measure the reasoning capabilities of intelligence systems? Visual question answering provides a convenient framework for testing the model's abilities by interrogating the model through questions about the scene. However, despite scores of various visual QA datasets and architectures, which sometimes yield even a super-human performance, the question of whether those architectures can…

    Submitted 28 February, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

    Comments: ICLR 2022

  17. arXiv:2112.00114  [pdf, other]

    cs.LG cs.NE

    Show Your Work: Scratchpads for Intermediate Computation with Language Models

    Authors: Maxwell Nye, Anders Johan Andreassen, Guy Gur-Ari, Henryk Michalewski, Jacob Austin, David Bieber, David Dohan, Aitor Lewkowycz, Maarten Bosma, David Luan, Charles Sutton, Augustus Odena

    Abstract: Large pre-trained language models perform remarkably well on tasks that can be done "in one pass", such as generating realistic text or synthesizing computer programs. However, they struggle with tasks that require unbounded multi-step computation, such as adding integers or executing programs. Surprisingly, we find that these same models are able to perform complex multi-step computations -- even…

    Submitted 30 November, 2021; originally announced December 2021.

  18. arXiv:2111.12763  [pdf, other]

    cs.LG cs.CL

    Sparse is Enough in Scaling Transformers

    Authors: Sebastian Jaszczur, Aakanksha Chowdhery, Afroz Mohiuddin, Łukasz Kaiser, Wojciech Gajewski, Henryk Michalewski, Jonni Kanerva

    Abstract: Large Transformer models yield impressive results on many tasks, but are expensive to train, or even fine-tune, and so slow at decoding that their use and study becomes out of reach. We address this problem by leveraging sparsity. We study sparse variants for all layers in the Transformer and propose Scaling Transformers, a family of next generation Transformer models that use sparse layers to sca…

    Submitted 24 November, 2021; originally announced November 2021.

    Comments: NeurIPS 2021

  19. arXiv:2111.11229  [pdf, other]

    cs.LG cs.AI cs.MA

    Off-Policy Correction For Multi-Agent Reinforcement Learning

    Authors: Michał Zawalski, Błażej Osiński, Henryk Michalewski, Piotr Miłoś

    Abstract: Multi-agent reinforcement learning (MARL) provides a framework for problems involving multiple interacting agents. Despite apparent similarity to the single-agent case, multi-agent problems are often harder to train and analyze theoretically. In this work, we propose MA-Trace, a new on-policy actor-critic algorithm, which extends V-Trace to the MARL setting. The key advantage of our algorithm is i…

    Submitted 3 April, 2024; v1 submitted 22 November, 2021; originally announced November 2021.

    ACM Class: I.2

  20. arXiv:2110.13711  [pdf, other]

    cs.LG cs.CL

    Hierarchical Transformers Are More Efficient Language Models

    Authors: Piotr Nawrot, Szymon Tworkowski, Michał Tyrolski, Łukasz Kaiser, Yuhuai Wu, Christian Szegedy, Henryk Michalewski

    Abstract: Transformer models yield impressive results on many NLP and sequence modeling tasks. Remarkably, Transformers can handle long sequences which allows them to produce long coherent outputs: full paragraphs produced by GPT-3 or well-structured images produced by DALL-E. These large language models are impressive but also very inefficient and costly, which limits their applications and accessibility.…

    Submitted 16 April, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

  21. arXiv:2108.07732  [pdf, other]

    cs.PL cs.LG

    Program Synthesis with Large Language Models

    Authors: Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, Charles Sutton

    Abstract: This paper explores the limits of the current generation of large language models for program synthesis in general purpose programming languages. We evaluate a collection of such models (with between 244M and 137B parameters) on two new benchmarks, MBPP and MathQA-Python, in both the few-shot and fine-tuning regimes. Our benchmarks are designed to measure the ability of these models to synthesize…

    Submitted 15 August, 2021; originally announced August 2021.

    Comments: Jacob and Augustus contributed equally

  22. arXiv:2106.03921  [pdf, other]

    cs.CL cs.AI cs.LG

    Measuring and Improving BERT's Mathematical Abilities by Predicting the Order of Reasoning

    Authors: Piotr Piękos, Henryk Michalewski, Mateusz Malinowski

    Abstract: Imagine you are in a supermarket. You have two bananas in your basket and want to buy four apples. How many fruits do you have in total? This seemingly straightforward question can be challenging for data-driven language models, even if trained at scale. However, we would expect such generic language models to possess some mathematical abilities in addition to typical linguistic competence. Toward…

    Submitted 7 June, 2021; originally announced June 2021.

    Comments: The paper has been accepted to the ACL-IJCNLP 2021 conference

    ACM Class: I.2.7

  23. arXiv:2102.06782  [pdf, other]

    cs.LG

    Q-Value Weighted Regression: Reinforcement Learning with Limited Data

    Authors: Piotr Kozakowski, Łukasz Kaiser, Henryk Michalewski, Afroz Mohiuddin, Katarzyna Kańska

    Abstract: Sample efficiency and performance in the offline setting have emerged as significant challenges of deep reinforcement learning. We introduce Q-Value Weighted Regression (QWR), a simple RL algorithm that excels in these aspects. QWR is an extension of Advantage Weighted Regression (AWR), an off-policy actor-critic algorithm that performs very well on continuous control tasks, also in the offline se…

    Submitted 12 February, 2021; originally announced February 2021.

  24. arXiv:2012.11329  [pdf, other]

    cs.RO cs.AI cs.LG

    CARLA Real Traffic Scenarios -- novel training ground and benchmark for autonomous driving

    Authors: Błażej Osiński, Piotr Miłoś, Adam Jakubowski, Paweł Zięcina, Michał Martyniak, Christopher Galias, Antonia Breuer, Silviu Homoceanu, Henryk Michalewski

    Abstract: This work introduces interactive traffic scenarios in the CARLA simulator, which are based on real-world traffic. We concentrate on tactical tasks lasting several seconds, which are especially challenging for current control methods. The CARLA Real Traffic Scenarios (CRTS) is intended to be a training and testing ground for autonomous driving systems. To this end, we open-source the code under a p…

    Submitted 19 September, 2021; v1 submitted 16 December, 2020; originally announced December 2020.

  25. arXiv:2005.13406  [pdf, other]

    cs.AI

    Neural heuristics for SAT solving

    Authors: Sebastian Jaszczur, Michał Łuszczyk, Henryk Michalewski

    Abstract: We use neural graph networks with a message-passing architecture and an attention mechanism to enhance the branching heuristic in two SAT-solving algorithms. We report improvements of learned neural heuristics compared with two standard human-designed heuristics.

    Submitted 27 May, 2020; originally announced May 2020.

  26. arXiv:1911.12905  [pdf, other]

    cs.LG cs.AI cs.RO

    Simulation-based reinforcement learning for real-world autonomous driving

    Authors: Błażej Osiński, Adam Jakubowski, Piotr Miłoś, Paweł Zięcina, Christopher Galias, Silviu Homoceanu, Henryk Michalewski

    Abstract: We use reinforcement learning in simulation to obtain a driving system controlling a full-size real-world vehicle. The driving policy takes RGB images from a single camera and their semantic segmentation as input. We use mostly synthetic data, with labelled real-world data appearing only in the training of the segmentation network. Using reinforcement learning in simulation and synthetic data is…

    Submitted 3 April, 2024; v1 submitted 28 November, 2019; originally announced November 2019.

  27. arXiv:1905.13100  [pdf, other]

    cs.LO cs.AI cs.LG

    Towards Finding Longer Proofs

    Authors: Zsolt Zombori, Adrián Csiszárik, Henryk Michalewski, Cezary Kaliszyk, Josef Urban

    Abstract: We present a reinforcement learning (RL) based guidance system for automated theorem proving geared towards Finding Longer Proofs (FLoP). Unlike most learning based approaches, we focus on generalising from very little training data and achieving near complete confidence. We use several simple, structured datasets with very long proofs to show that FLoP can successfully generalise a single trainin…

    Submitted 29 June, 2021; v1 submitted 30 May, 2019; originally announced May 2019.

    Comments: 16 pages, 3 figures, published at TABLEAUX2021

  28. arXiv:1903.00374  [pdf, other]

    cs.LG stat.ML

    Model-Based Reinforcement Learning for Atari

    Authors: Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski

    Abstract: Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction -- substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and…

    Submitted 3 April, 2024; v1 submitted 1 March, 2019; originally announced March 2019.

  29. arXiv:1809.03447  [pdf, other]

    cs.LG stat.ML

    Expert-augmented actor-critic for ViZDoom and Montezumas Revenge

    Authors: Michał Garmulewicz, Henryk Michalewski, Piotr Miłoś

    Abstract: We propose an expert-augmented actor-critic algorithm, which we evaluate on two environments with sparse rewards: Montezumas Revenge and a demanding maze from the ViZDoom suite. In the case of Montezumas Revenge, an agent trained with our method achieves very good results consistently scoring above 27,000 points (in many experiments beating the first world). With an appropriate choice of hyperpara…

    Submitted 10 September, 2018; originally announced September 2018.

  30. arXiv:1805.07563  [pdf, ps, other]

    cs.AI cs.LG cs.LO

    Reinforcement Learning of Theorem Proving

    Authors: Cezary Kaliszyk, Josef Urban, Henryk Michalewski, Mirek Olšák

    Abstract: We introduce a theorem proving algorithm that uses practically no domain heuristics for guiding its connection-style proof search. Instead, it runs many Monte-Carlo simulations guided by reinforcement learning from previous proof attempts. We produce several versions of the prover, parameterized by different learning and guiding algorithms. The strongest version of the system is trained on a large…

    Submitted 19 May, 2018; originally announced May 2018.

  31. arXiv:1804.00361  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments

    Authors: Łukasz Kidziński, Sharada Prasanna Mohanty, Carmichael Ong, Zhewei Huang, Shuchang Zhou, Anton Pechenko, Adam Stelmaszczyk, Piotr Jarosik, Mikhail Pavlov, Sergey Kolesnikov, Sergey Plis, Zhibo Chen, Zhizheng Zhang, Jiale Chen, Jun Shi, Zhuobin Zheng, Chun Yuan, Zhihui Lin, Henryk Michalewski, Piotr Miłoś, Błażej Osiński, Andrew Melnik, Malte Schilling, Helge Ritter, Sean Carroll, et al. (4 additional authors not shown)

    Abstract: In the NIPS 2017 Learning to Run challenge, participants were tasked with building a controller for a musculoskeletal model to make it run as fast as possible through an obstacle course. Top participants were invited to describe their algorithms. In this work, we present eight solutions that used deep reinforcement learning approaches, based on algorithms such as Deep Deterministic Policy Gradient…

    Submitted 1 April, 2018; originally announced April 2018.

    Comments: 27 pages, 17 figures

  32. arXiv:1801.02852  [pdf, ps, other]

    cs.AI

    Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes

    Authors: Igor Adamski, Robert Adamski, Tomasz Grel, Adam Jędrych, Kamil Kaczmarek, Henryk Michalewski

    Abstract: We present a study in Distributed Deep Reinforcement Learning (DDRL) focused on scalability of a state-of-the-art Deep Reinforcement Learning algorithm known as Batch Asynchronous Advantage ActorCritic (BA3C). We show that using the Adam optimization algorithm with a batch size of up to 2048 is a viable choice for carrying out large scale machine learning computations. This, combined with careful…

    Submitted 9 April, 2018; v1 submitted 9 January, 2018; originally announced January 2018.

  33. Atari games and Intel processors

    Authors: Robert Adamski, Tomasz Grel, Maciej Klimek, Henryk Michalewski

    Abstract: The asynchronous nature of the state-of-the-art reinforcement learning algorithms such as the Asynchronous Advantage Actor-Critic algorithm, makes them exceptionally suitable for CPU computations. However, given the fact that deep reinforcement learning often deals with interpreting visual information, a large part of the train and inference time is spent performing convolutions. In this work we p…

    Submitted 19 May, 2017; originally announced May 2017.

  34. arXiv:1703.04736  [pdf, other]

    cs.FL

    Some connections between universal algebra and logics for trees

    Authors: Mikołaj Bojańczyk, Henryk Michalewski

    Abstract: One of the major open problems in automata and logic is the following: is there an algorithm which inputs a regular tree language and decides if the language can be defined in first-order logic? The goal of this paper is to present this problem and similar ones using the language of universal algebra, highlighting potential connections to the structural theory of finite algebras, including Tame Co…

    Submitted 14 March, 2017; originally announced March 2017.

  35. arXiv:1702.04769  [pdf, other]

    cs.LO cs.FL math.LO

    Monadic Second Order Logic with Measure and Category Quantifiers

    Authors: Matteo Mio, Michał Skrzypczak, Henryk Michalewski

    Abstract: We investigate the extension of Monadic Second Order logic, interpreted over infinite words and trees, with generalized "for almost all" quantifiers interpreted using the notions of Baire category and Lebesgue measure.

    Submitted 9 April, 2018; v1 submitted 15 February, 2017; originally announced February 2017.

    Journal ref: Logical Methods in Computer Science, Volume 14, Issue 2, Automata and logic (April 10, 2018) lmcs:3148

  36. The logical strength of Büchi's decidability theorem

    Authors: Leszek Kołodziejczyk, Henryk Michalewski, Cécilia Pradic, Michał Skrzypczak

    Abstract: We study the strength of axioms needed to prove various results related to automata on infinite words and Büchi's theorem on the decidability of the MSO theory of $(N, {\le})$. We prove that the following are equivalent over the weak second-order arithmetic theory $RCA_0$: (1) the induction scheme for $Σ^0_2$ formulae of arithmetic, (2) a variant of Ramsey's Theorem for pairs restricted to so-…

    Submitted 22 May, 2019; v1 submitted 26 August, 2016; originally announced August 2016.

    Journal ref: Logical Methods in Computer Science, Volume 15, Issue 2 (May 23, 2019) lmcs:4866

  37. On the Regular Emptiness Problem of Subzero Automata

    Authors: Henryk Michalewski, Matteo Mio, Mikołaj Bojańczyk

    Abstract: Subzero automata is a class of tree automata whose acceptance condition can express probabilistic constraints. Our main result is that the problem of determining if a subzero automaton accepts some regular tree is decidable.

    Submitted 10 August, 2016; originally announced August 2016.

    Comments: In Proceedings ICE 2016, arXiv:1608.03131

    Journal ref: EPTCS 223, 2016, pp. 1-23

  38. arXiv:1605.01335  [pdf, other]

    cs.LG cs.AI

    Learning from the memory of Atari 2600

    Authors: Jakub Sygnowski, Henryk Michalewski

    Abstract: We train a number of neural networks to play games Bowling, Breakout and Seaquest using information stored in the memory of a video game console Atari 2600. We consider four models of neural networks which differ in size and architecture: two networks which use only information contained in the RAM and two mixed networks which use both information in the RAM and information from the screen. As the…

    Submitted 4 May, 2016; originally announced May 2016.

  39. arXiv:1510.01640  [pdf, other]

    cs.FL

    On the Problem of Computing the Probability of Regular Sets of Trees

    Authors: Henryk Michalewski, Matteo Mio

    Abstract: We consider the problem of computing the probability of regular languages of infinite trees with respect to the natural coin-flipping measure. We propose an algorithm which computes the probability of languages recognizable by game automata. In particular this algorithm is applicable to all deterministic automata. We then use the algorithm to prove through examples three properties of measu…

    Submitted 6 October, 2015; originally announced October 2015.

    ACM Class: F.1.1; F.1.2

  40. arXiv:1508.06780  [pdf, ps, other]

    math.LO cs.FL cs.LO

    How unprovable is Rabin's decidability theorem?

    Authors: Leszek Aleksander Kołodziejczyk, Henryk Michalewski

    Abstract: We study the strength of set-theoretic axioms needed to prove Rabin's theorem on the decidability of the MSO theory of the infinite binary tree. We first show that the complementation theorem for tree automata, which forms the technical core of typical proofs of Rabin's theorem, is equivalent over the moderately strong second-order arithmetic theory $\mathsf{ACA}_0$ to a determinacy principle impl…

    Submitted 27 August, 2015; originally announced August 2015.

    Comments: 21 pages

    MSC Class: 03B30; 03F35; 03B25; 03D05; 68Q45; 03E60

  41. arXiv:1403.3502  [pdf, other]

    cs.LO cs.FL math.LO

    Deciding the Borel complexity of regular tree languages

    Authors: Alessandro Facchini, Henryk Michalewski

    Abstract: We show that it is decidable whether a given regular tree language belongs to the class ${\bf Δ^0_2}$ of the Borel hierarchy, or equivalently whether the Wadge degree of a regular tree language is countable.

    Submitted 14 March, 2014; originally announced March 2014.

    Comments: 15 pages, 2 figures

  42. arXiv:1401.4025  [pdf, ps, other]

    cs.FL

    Unambiguous Buchi is weak

    Authors: Henryk Michalewski, Michał Skrzypczak

    Abstract: A non-deterministic automaton running on infinite trees is unambiguous if it has at most one accepting run on every tree. The class of languages recognisable by unambiguous tree automata is still not well-understood. In particular, decidability of the problem whether a given language is recognisable by some unambiguous automaton is open. Moreover, there are no known upper bounds on the descriptive…

    Submitted 9 May, 2016; v1 submitted 16 January, 2014; originally announced January 2014.