Showing 1–8 of 8 results for author: Bamford, C

  1. arXiv:2401.04088  [pdf, other]

    cs.LG cs.CL

    Mixtral of Experts

    Authors: Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, Arthur Mensch, Blanche Savary, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Emma Bou Hanna, Florian Bressand, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Sandeep Subramanian, Sophia Yang, Szymon Antoniak, Teven Le Scao, Théophile Gervet, Thibaut Lavril, Thomas Wang, Timothée Lacroix, et al. (1 additional author not shown)

    Abstract: We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has the same architecture as Mistral 7B, with the difference that each layer is composed of 8 feedforward blocks (i.e. experts). For every token, at each layer, a router network selects two experts to process the current state and combine their outputs. Even though each token only sees two experts, the selected e…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: See more details at https://mistral.ai/news/mixtral-of-experts/

  2. arXiv:2310.06825  [pdf, other]

    cs.CL cs.AI cs.LG

    Mistral 7B

    Authors: Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed

    Abstract: We introduce Mistral 7B v0.1, a 7-billion-parameter language model engineered for superior performance and efficiency. Mistral 7B outperforms Llama 2 13B across all evaluated benchmarks, and Llama 1 34B in reasoning, mathematics, and code generation. Our model leverages grouped-query attention (GQA) for faster inference, coupled with sliding window attention (SWA) to effectively handle sequences o…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Models and code are available at https://mistral.ai/news/announcing-mistral-7b/

  3. arXiv:2207.06105  [pdf, other]

    cs.AI

    GriddlyJS: A Web IDE for Reinforcement Learning

    Authors: Christopher Bamford, Minqi Jiang, Mikayel Samvelyan, Tim Rocktäschel

    Abstract: Progress in reinforcement learning (RL) research is often driven by the design of new, challenging environments -- a costly undertaking requiring skills orthogonal to that of a typical machine learning researcher. The complexity of environment development has only increased with the rise of procedural-content generation (PCG) as the prevailing paradigm for producing varied environments capable of…

    Submitted 12 October, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

  4. arXiv:2105.13807  [pdf, other]

    cs.LG

    Gym-μRTS: Toward Affordable Full Game Real-time Strategy Games Research with Deep Reinforcement Learning

    Authors: Shengyi Huang, Santiago Ontañón, Chris Bamford, Lukasz Grela

    Abstract: In recent years, researchers have achieved great success in applying Deep Reinforcement Learning (DRL) algorithms to Real-time Strategy (RTS) games, creating strong autonomous agents that could defeat professional players in StarCraft II. However, existing approaches to tackle full games have high computational costs, usually requiring the use of thousands of GPUs and CPUs for weeks. This paper ha…

    Submitted 28 July, 2021; v1 submitted 21 May, 2021; originally announced May 2021.

    Comments: Accepted to IEEE Conference on Games (CoG) 2021. See the blog post at https://wandb.ai/vwxyzjn/gym-microrts-paper/reports/Gym-RTS-Toward-Affordable-Deep-Reinforcement-Learning-Research-in-Real-Time-Strategy-Games--Vmlldzo2MDIzMTg and the source code at https://github.com/vwxyzjn/gym-microrts-paper

  5. arXiv:2104.07294  [pdf, other]

    cs.LG cs.AI stat.ML

    Generalising Discrete Action Spaces with Conditional Action Trees

    Authors: Christopher Bamford, Alvaro Ovalle

    Abstract: There are relatively few conventions followed in reinforcement learning (RL) environments to structure the action spaces. As a consequence the application of RL algorithms to tasks with large action spaces with multiple components require additional effort to adjust to different formats. In this paper we introduce Conditional Action Trees with two main objectives: (1) as a method of structur…

    Submitted 15 April, 2021; originally announced April 2021.

  6. arXiv:2011.06363  [pdf, other]

    cs.AI

    Griddly: A platform for AI research in games

    Authors: Chris Bamford, Shengyi Huang, Simon Lucas

    Abstract: In recent years, there have been immense breakthroughs in Game AI research, particularly with Reinforcement Learning (RL). Despite their success, the underlying games are usually implemented with their own preset environments and game mechanics, thus making it difficult for researchers to prototype different game environments. However, testing the RL agents against a variety of game environments i…

    Submitted 12 July, 2022; v1 submitted 12 November, 2020; originally announced November 2020.

  7. arXiv:2003.10520  [pdf, other]

    cs.AI

    Neural Game Engine: Accurate learning of generalizable forward models from pixels

    Authors: Chris Bamford, Simon Lucas

    Abstract: Access to a fast and easily copied forward model of a game is essential for model-based reinforcement learning and for algorithms such as Monte Carlo tree search, and is also beneficial as a source of unlimited experience data for model-free algorithms. Learning forward models is an interesting and important challenge in order to address problems where a model is not available. Building upon previ…

    Submitted 31 March, 2020; v1 submitted 23 March, 2020; originally announced March 2020.

  8. arXiv:1903.12508  [pdf, other]

    cs.AI

    A Local Approach to Forward Model Learning: Results on the Game of Life Game

    Authors: Simon M. Lucas, Alexander Dockhorn, Vanessa Volz, Chris Bamford, Raluca D. Gaina, Ivan Bravi, Diego Perez-Liebana, Sanaz Mostaghim, Rudolf Kruse

    Abstract: This paper investigates the effect of learning a forward model on the performance of a statistical forward planning agent. We transform Conway's Game of Life simulation into a single-player game where the objective can be either to preserve as much life as possible or to extinguish all life as quickly as possible. In order to learn the forward model of the game, we formulate the problem in a nov…

    Submitted 29 March, 2019; originally announced March 2019.

    Comments: Submitted to IEEE Conference on Games 2019