Showing 1–17 of 17 results for author: Pritzel, A

Searching in archive cs.
  1. arXiv:2403.05530

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2312.11805

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  3. arXiv:2212.12794

    cs.LG physics.ao-ph

    GraphCast: Learning skillful medium-range global weather forecasting

    Authors: Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson, Peter Wirnsberger, Meire Fortunato, Ferran Alet, Suman Ravuri, Timo Ewalds, Zach Eaton-Rosen, Weihua Hu, Alexander Merose, Stephan Hoyer, George Holland, Oriol Vinyals, Jacklynn Stott, Alexander Pritzel, Shakir Mohamed, Peter Battaglia

    Abstract: Global medium-range weather forecasting is critical to decision-making across many social and economic domains. Traditional numerical weather prediction uses increased compute resources to improve forecast accuracy, but cannot directly use historical weather data to improve the underlying model. We introduce a machine learning-based method called "GraphCast", which can be trained directly from rea…

    Submitted 4 August, 2023; v1 submitted 24 December, 2022; originally announced December 2022.

    Comments: GraphCast code and trained weights are available at: https://github.com/deepmind/graphcast

  4. arXiv:2210.00612

    cs.LG cs.CE

    MultiScale MeshGraphNets

    Authors: Meire Fortunato, Tobias Pfaff, Peter Wirnsberger, Alexander Pritzel, Peter Battaglia

    Abstract: In recent years, there has been a growing interest in using machine learning to overcome the high cost of numerical simulation, with some learned models achieving impressive speed-ups over classical solvers whilst maintaining accuracy. However, these methods are usually tested at low-resolution settings, and it remains to be seen whether they can scale to the costly high-resolution simulations tha…

    Submitted 2 October, 2022; originally announced October 2022.

    Journal ref: 2nd AI4Science Workshop at the 39th International Conference on Machine Learning (ICML), 2022

  5. arXiv:2002.06038

    cs.LG stat.ML

    Never Give Up: Learning Directed Exploration Strategies

    Authors: Adrià Puigdomènech Badia, Pablo Sprechmann, Alex Vitvitskyi, Daniel Guo, Bilal Piot, Steven Kapturowski, Olivier Tieleman, Martín Arjovsky, Alexander Pritzel, Andrew Bolt, Charles Blundell

    Abstract: We propose a reinforcement learning agent to solve hard exploration games by learning a range of directed exploratory policies. We construct an episodic memory-based intrinsic reward using k-nearest neighbors over the agent's recent experience to train the directed exploratory policies, thereby encouraging the agent to repeatedly revisit all states in its environment. A self-supervised inverse dyn…

    Submitted 14 February, 2020; originally announced February 2020.

    Comments: Published as a conference paper in ICLR 2020

  6. arXiv:1905.03030

    cs.LG cs.AI stat.ML

    Meta-learning of Sequential Strategies

    Authors: Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg

    Abstract: In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal pred…

    Submitted 18 July, 2019; v1 submitted 8 May, 2019; originally announced May 2019.

    Comments: DeepMind Technical Report (15 pages, 6 figures). Version V1.1

  7. arXiv:1810.08163

    cs.LG cs.AI

    Fast deep reinforcement learning using online adjustments from the past

    Authors: Steven Hansen, Pablo Sprechmann, Alexander Pritzel, André Barreto, Charles Blundell

    Abstract: We propose Ephemeral Value Adjustments (EVA): a means of allowing deep reinforcement learning agents to rapidly adapt to experience in their replay buffer. EVA shifts the value predicted by a neural network with an estimate of the value function found by planning over experience tuples from the replay buffer near the current state. EVA combines a number of recent ideas around combining episodic mem…

    Submitted 18 October, 2018; originally announced October 2018.

    Comments: Accepted at NIPS 2018

  8. arXiv:1806.07917

    cs.NE cs.AI cs.LG

    Meta-Learning by the Baldwin Effect

    Authors: Chrisantha Thomas Fernando, Jakub Sygnowski, Simon Osindero, Jane Wang, Tom Schaul, Denis Teplyashin, Pablo Sprechmann, Alexander Pritzel, Andrei A. Rusu

    Abstract: The scope of the Baldwin effect was recently called into question by two papers that closely examined the seminal work of Hinton and Nowlan. To this date there has been no demonstration of its necessity in empirically challenging tasks. Here we show that the Baldwin effect is capable of evolving few-shot supervised and reinforcement learning mechanisms, by shaping the hyperparameters and the initi…

    Submitted 22 June, 2018; v1 submitted 6 June, 2018; originally announced June 2018.

  9. arXiv:1804.09401

    stat.ML cs.LG

    Generative Temporal Models with Spatial Memory for Partially Observed Environments

    Authors: Marco Fraccaro, Danilo Jimenez Rezende, Yori Zwols, Alexander Pritzel, S. M. Ali Eslami, Fabio Viola

    Abstract: In model-based reinforcement learning, generative and temporal models of environments can be leveraged to boost agent performance, either by tuning the agent's representations during training or via use as part of an explicit planning mechanism. However, their application in practice has been limited to simplistic environments, due to the difficulty of training such models in larger, potentially p…

    Submitted 19 July, 2018; v1 submitted 25 April, 2018; originally announced April 2018.

    Comments: ICML 2018

  10. arXiv:1802.10542

    stat.ML cs.LG

    Memory-based Parameter Adaptation

    Authors: Pablo Sprechmann, Siddhant M. Jayakumar, Jack W. Rae, Alexander Pritzel, Adrià Puigdomènech Badia, Benigno Uria, Oriol Vinyals, Demis Hassabis, Razvan Pascanu, Charles Blundell

    Abstract: Deep neural networks have excelled on a wide range of problems, from vision to language and game playing. Neural networks very gradually incorporate information into weights as they process data, requiring very low learning rates. If the training distribution shifts, the network is slow to adapt, and when it does adapt, it typically performs badly on the training distribution before the shift. Our…

    Submitted 28 February, 2018; originally announced February 2018.

    Comments: Published as a conference paper at ICLR 2018

  11. arXiv:1707.08475

    stat.ML cs.AI cs.LG

    DARLA: Improving Zero-Shot Transfer in Reinforcement Learning

    Authors: Irina Higgins, Arka Pal, Andrei A. Rusu, Loic Matthey, Christopher P Burgess, Alexander Pritzel, Matthew Botvinick, Charles Blundell, Alexander Lerchner

    Abstract: Domain adaptation is an important open problem in deep reinforcement learning (RL). In many scenarios of interest data is hard to obtain, so agents may learn a source policy in a setting where data is readily available, with the hope that it generalises well to the target domain. We propose a new multi-stage RL agent, DARLA (DisentAngled Representation Learning Agent), which learns to see before l…

    Submitted 6 June, 2018; v1 submitted 26 July, 2017; originally announced July 2017.

    Comments: ICML 2017

  12. arXiv:1703.01988

    cs.LG stat.ML

    Neural Episodic Control

    Authors: Alexander Pritzel, Benigno Uria, Sriram Srinivasan, Adrià Puigdomènech, Oriol Vinyals, Demis Hassabis, Daan Wierstra, Charles Blundell

    Abstract: Deep reinforcement learning methods attain super-human performance in a wide range of environments. Such methods are grossly inefficient, often taking orders of magnitudes more data than humans to achieve reasonable performance. We propose Neural Episodic Control: a deep reinforcement learning agent that is able to rapidly assimilate new experiences and act upon them. Our agent uses a semi-tabular…

    Submitted 6 March, 2017; originally announced March 2017.

  13. arXiv:1701.08734

    cs.NE cs.LG

    PathNet: Evolution Channels Gradient Descent in Super Neural Networks

    Authors: Chrisantha Fernando, Dylan Banarse, Charles Blundell, Yori Zwols, David Ha, Andrei A. Rusu, Alexander Pritzel, Daan Wierstra

    Abstract: For artificial general intelligence (AGI) it would be efficient if multiple users trained the same giant neural network, permitting parameter reuse, without catastrophic forgetting. PathNet is a first step in this direction. It is a neural network algorithm that uses agents embedded in the neural network whose task is to discover which parts of the network to re-use for new tasks. Agents are pathw…

    Submitted 30 January, 2017; originally announced January 2017.

  14. arXiv:1612.01474

    stat.ML cs.LG

    Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

    Authors: Balaji Lakshminarayanan, Alexander Pritzel, Charles Blundell

    Abstract: Deep neural networks (NNs) are powerful black box predictors that have recently achieved impressive performance on a wide spectrum of tasks. Quantifying predictive uncertainty in NNs is a challenging and yet unsolved problem. Bayesian NNs, which learn a distribution over weights, are currently the state-of-the-art for estimating predictive uncertainty; however these require significant modificatio…

    Submitted 3 November, 2017; v1 submitted 5 December, 2016; originally announced December 2016.

    Comments: NIPS 2017

  15. arXiv:1606.04460

    stat.ML cs.LG q-bio.NC

    Model-Free Episodic Control

    Authors: Charles Blundell, Benigno Uria, Alexander Pritzel, Yazhe Li, Avraham Ruderman, Joel Z Leibo, Jack Rae, Daan Wierstra, Demis Hassabis

    Abstract: State of the art deep reinforcement learning algorithms take many millions of interactions to attain human-level performance. Humans, on the other hand, can very quickly exploit highly rewarding nuances of an environment upon first discovery. In the brain, such rapid learning is thought to depend on the hippocampus and its capacity for episodic memory. Here we investigate whether a simple model of…

    Submitted 14 June, 2016; originally announced June 2016.

  16. arXiv:1602.04621

    cs.LG cs.AI eess.SY stat.ML

    Deep Exploration via Bootstrapped DQN

    Authors: Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy

    Abstract: Efficient exploration in complex environments remains a major challenge for reinforcement learning. We propose bootstrapped DQN, a simple algorithm that explores in a computationally and statistically efficient manner through use of randomized value functions. Unlike dithering strategies such as epsilon-greedy exploration, bootstrapped DQN carries out temporally-extended (or deep) exploration; thi…

    Submitted 4 July, 2016; v1 submitted 15 February, 2016; originally announced February 2016.

  17. arXiv:1509.02971

    cs.LG stat.ML

    Continuous control with deep reinforcement learning

    Authors: Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra

    Abstract: We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain. We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. Using the same learning algorithm, network architecture and hyper-parameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic pr…

    Submitted 5 July, 2019; v1 submitted 9 September, 2015; originally announced September 2015.

    Comments: 10 pages + supplementary