Showing 1–50 of 61 results for author: Santoro, A

  1. arXiv:2407.03117  [pdf, other]

    physics.soc-ph cs.CY

    A 72h exploration of the co-evolution of food insecurity and international migration

    Authors: Duncan Cassells, Lorenzo Costantini, Ariel Flint Ashery, Shreyas Gadge, Diogo L. Pires, Miguel Á. Sánchez-Cortés, Arnaldo Santoro, Elisa Omodei

    Abstract: Food insecurity, defined as the lack of physical or economic access to safe, nutritious and sufficient food, remains one of the main challenges of the 2030 Agenda for Sustainable Development. Food insecurity is a complex phenomenon, resulting from the interplay of environmental, socio-demographic, and political events. Previous work has investigated the nexus between climate change, conflict, migr…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2404.02258  [pdf, other]

    cs.LG cs.CL

    Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

    Authors: David Raposo, Sam Ritter, Blake Richards, Timothy Lillicrap, Peter Conway Humphreys, Adam Santoro

    Abstract: Transformer-based language models spread FLOPs uniformly across input sequences. In this work we demonstrate that transformers can instead learn to dynamically allocate FLOPs (or compute) to specific positions in a sequence, optimising the allocation along the sequence for different layers across the model depth. Our method enforces a total compute budget by capping the number of tokens ($k$) that…

    Submitted 2 April, 2024; originally announced April 2024.

  3. arXiv:2302.02764  [pdf, other]

    physics.ins-det hep-ex

    Machine Learning based tool for CMS RPC currents quality monitoring

    Authors: E. Shumka, A. Samalan, M. Tytgat, M. El Sawy, G. A. Alves, F. Marujo, E. A. Coelho, E. M. Da Costa, H. Nogima, A. Santoro, S. Fonseca De Souza, D. De Jesus Damiao, M. Thiel, K. Mota Amarilo, M. Barroso Ferreira Filho, A. Aleksandrov, R. Hadjiiska, P. Iaydjiev, M. Rodozov, M. Shopova, G. Soultanov, A. Dimitrov, L. Litov, B. Pavlov, P. Petkov, et al. (83 additional authors not shown)

    Abstract: The muon system of the CERN Compact Muon Solenoid (CMS) experiment includes more than a thousand Resistive Plate Chambers (RPC). They are gaseous detectors operated in the hostile environment of the CMS underground cavern on the Large Hadron Collider where pp luminosities of up to $2\times 10^{34}$ $\text{cm}^{-2}\text{s}^{-1}$ are routinely achieved. The CMS RPC system performance is constantly m…

    Submitted 6 February, 2023; originally announced February 2023.

  4. RPC based tracking system at CERN GIF++ facility

    Authors: K. Mota Amarilo, A. Samalan, M. Tytgat, M. El Sawy, G. A. Alves, F. Marujo, E. A. Coelho, E. M. Da Costa, H. Nogima, A. Santoro, S. Fonseca De Souza, D. De Jesus Damiao, M. Thiel, M. Barroso Ferreira Filho, A. Aleksandrov, R. Hadjiiska, P. Iaydjiev, M. Rodozov, M. Shopova, G. Soultanov, A. Dimitrov, L. Litov, B. Pavlov, P. Petkov, A. Petrov, et al. (83 additional authors not shown)

    Abstract: With the HL-LHC upgrade of the LHC machine, an increase of the instantaneous luminosity by a factor of five is expected and the current detection systems need to be validated for such working conditions to ensure stable data taking. At the CERN Gamma Irradiation Facility (GIF++) many muon detectors undergo such studies, but the high gamma background can pose a challenge to the muon trigger system…

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: 12 pages, 9 figures. Contribution to XVI Workshop on Resistive Plate Chambers and Related Detectors (RPC2022), September 26-30 2022. Submitted to Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment

  5. arXiv:2211.11602  [pdf, other]

    cs.LG cs.HC cs.MA

    Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback

    Authors: Josh Abramson, Arun Ahuja, Federico Carnevale, Petko Georgiev, Alex Goldin, Alden Hung, Jessica Landon, Jirka Lhotka, Timothy Lillicrap, Alistair Muldal, George Powell, Adam Santoro, Guy Scully, Sanjana Srivastava, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan, Rui Zhu

    Abstract: An important goal in artificial intelligence is to create agents that can both interact naturally with humans and learn from their feedback. Here we demonstrate how to use reinforcement learning from human feedback (RLHF) to improve upon simulated, embodied agents trained to a base level of competency with imitation learning. First, we collected data of humans interacting with agents in a simulate…

    Submitted 21 November, 2022; originally announced November 2022.

  6. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  7. arXiv:2206.03139  [pdf, other]

    cs.LG cs.AI cs.CL

    Intra-agent speech permits zero-shot task acquisition

    Authors: Chen Yan, Federico Carnevale, Petko Georgiev, Adam Santoro, Aurelia Guy, Alistair Muldal, Chia-Chun Hung, Josh Abramson, Timothy Lillicrap, Gregory Wayne

    Abstract: Human language learners are exposed to a trickle of informative, context-sensitive language, but a flood of raw sensory data. Through both social language use and internal processes of rehearsal and practice, language learners are able to build high-level, semantic representations that explain their perceptions. Here, we take inspiration from such processes of "inner speech" in humans (Vygotsky, 1…

    Submitted 7 June, 2022; originally announced June 2022.

  8. arXiv:2205.13274  [pdf, other]

    cs.LG cs.AI

    Evaluating Multimodal Interactive Agents

    Authors: Josh Abramson, Arun Ahuja, Federico Carnevale, Petko Georgiev, Alex Goldin, Alden Hung, Jessica Landon, Timothy Lillicrap, Alistair Muldal, Blake Richards, Adam Santoro, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan

    Abstract: Creating agents that can interact naturally with humans is a common goal in artificial intelligence (AI) research. However, evaluating these interactions is challenging: collecting online human-agent interactions is slow and expensive, yet faster proxy metrics often do not correlate well with interactive evaluation. In this paper, we assess the merits of these existing evaluation metrics and prese…

    Submitted 14 July, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

  9. arXiv:2205.05055  [pdf, other]

    cs.LG cs.AI cs.CL

    Data Distributional Properties Drive Emergent In-Context Learning in Transformers

    Authors: Stephanie C. Y. Chan, Adam Santoro, Andrew K. Lampinen, Jane X. Wang, Aaditya Singh, Pierre H. Richemond, Jay McClelland, Felix Hill

    Abstract: Large transformer-based models are able to perform in-context few-shot learning, without being explicitly trained for it. This observation raises the question: what aspects of the training regime lead to this emergent behavior? Here, we show that this behavior is driven by the distributions of the training data itself. In-context learning emerges when the training data exhibits particular distribu…

    Submitted 17 November, 2022; v1 submitted 22 April, 2022; originally announced May 2022.

    Comments: Accepted at NeurIPS 2022 (Oral). Code is available at: https://github.com/deepmind/emergent_in_context_learning

  10. arXiv:2203.10702  [pdf, other]

    physics.soc-ph cs.SI q-bio.NC

    Unveiling the higher-order organization of multivariate time series

    Authors: Andrea Santoro, Federico Battiston, Giovanni Petri, Enrico Amico

    Abstract: Time series analysis has proven to be a powerful method to characterize several phenomena in biology, neuroscience and economics, and to understand some of their underlying dynamical features. Although a plethora of methods have been proposed for the analysis of multivariate time series, most of them neglect the effect of non-pairwise interactions on the emerging dynamics. Here, we propose a novel…

    Submitted 12 September, 2022; v1 submitted 20 March, 2022; originally announced March 2022.

    Comments: 16 pages, 5 figures. Supplementary Information (16 figures, 2 tables)

  11. arXiv:2202.08137  [pdf, other]

    cs.LG

    A data-driven approach for learning to control computers

    Authors: Peter C Humphreys, David Raposo, Toby Pohlen, Gregory Thornton, Rachita Chhaparia, Alistair Muldal, Josh Abramson, Petko Georgiev, Alex Goldin, Adam Santoro, Timothy Lillicrap

    Abstract: It would be useful for machines to use computers as humans do so that they can aid us in everyday tasks. This is a setting in which there is also the potential to leverage large-scale expert demonstrations and human judgements of interactive behaviour, which are two ingredients that have driven much recent success in AI. Here we investigate the setting of computer control using keyboard and mouse,…

    Submitted 11 November, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

    Journal ref: Proceedings of the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022

  12. arXiv:2112.03763  [pdf, other]

    cs.LG

    Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning

    Authors: DeepMind Interactive Agents Team, Josh Abramson, Arun Ahuja, Arthur Brussee, Federico Carnevale, Mary Cassin, Felix Fischer, Petko Georgiev, Alex Goldin, Mansi Gupta, Tim Harley, Felix Hill, Peter C Humphreys, Alden Hung, Jessica Landon, Timothy Lillicrap, Hamza Merzic, Alistair Muldal, Adam Santoro, Guy Scully, Tamara von Glehn, Greg Wayne, Nathaniel Wong, Chen Yan, Rui Zhu

    Abstract: A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. We show that imitation learning of human-human interactions in a…

    Submitted 2 February, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

  13. arXiv:2112.03753  [pdf, other]

    cs.LG cs.AI stat.ML

    Tell me why! Explanations support learning relational and causal structure

    Authors: Andrew K. Lampinen, Nicholas A. Roy, Ishita Dasgupta, Stephanie C. Y. Chan, Allison C. Tam, James L. McClelland, Chen Yan, Adam Santoro, Neil C. Rabinowitz, Jane X. Wang, Felix Hill

    Abstract: Inferring the abstract relational and causal structure of the world is a major challenge for reinforcement-learning (RL) agents. For humans, language--particularly in the form of explanations--plays a considerable role in overcoming this challenge. Here, we show that language can play a similar role for deep RL agents in complex environments. While agents typically struggle to acquire relational a…

    Submitted 25 May, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

    Comments: ICML 2022; 23 pages

    ACM Class: I.2.6

  14. arXiv:2109.14331  [pdf, other]

    physics.ins-det hep-ex

    Upgrade of the CMS Resistive Plate Chambers for the High Luminosity LHC

    Authors: A. Samalan, M. Tytgat, G. A. Alves, F. Marujo, F. Torres Da Silva De Araujo, E. M. DaCosta, D. De Jesus Damiao, H. Nogima, A. Santoro, S. Fonseca De Souza, A. Aleksandrov, R. Hadjiiska, P. Iaydjiev, M. Rodozov, M. Shopova, G. Soultanov, M. Bonchev, A. Dimitrov, L. Litov, B. Pavlov, P. Petkov, A. Petrov, S. J. Qian, C. Bernal, A. Cabrera, et al. (86 additional authors not shown)

    Abstract: During the upcoming High Luminosity phase of the Large Hadron Collider (HL-LHC), the integrated luminosity of the accelerator will increase to 3000 fb$^{-1}$. The expected experimental conditions in that period in terms of background rates, event pileup, and the probable aging of the current detectors present a challenge for all the existing experiments at the LHC, including the Compact Muon Solen…

    Submitted 2 November, 2021; v1 submitted 29 September, 2021; originally announced September 2021.

  15. arXiv:2102.13013  [pdf, other]

    physics.soc-ph cs.SI physics.bio-ph

    Optimising the mitigation of epidemic spreading through targeted adoption of contact tracing apps

    Authors: Aleix Bassolas, Andrea Santoro, Sandro Sousa, Silvia Rognone, Vincenzo Nicosia

    Abstract: The ongoing COVID-19 pandemic is the first epidemic in human history in which digital contact-tracing has been deployed at a global scale. Tracking and quarantining all the contacts of individuals who test positive for a virus can help slow down an epidemic, but the impact of contact-tracing is severely limited by the generally low adoption of contact-tracing apps in the population. We derive he…

    Submitted 25 February, 2021; originally announced February 2021.

    Comments: 10 pages, 5 figures, + 12 SI pages, 9 suppl. figs, 1 suppl. table

  16. arXiv:2102.12425  [pdf, other]

    cs.LG

    Synthetic Returns for Long-Term Credit Assignment

    Authors: David Raposo, Sam Ritter, Adam Santoro, Greg Wayne, Theophane Weber, Matt Botvinick, Hado van Hasselt, Francis Song

    Abstract: Since the earliest days of reinforcement learning, the workhorse method for assigning credit to actions over time has been temporal-difference (TD) learning, which propagates credit backward timestep-by-timestep. This approach suffers when delays between actions and rewards are long and when intervening unrelated events contribute variance to long-term returns. We propose state-associative (SA) le…

    Submitted 24 February, 2021; originally announced February 2021.

  17. arXiv:2102.03406  [pdf, other]

    cs.AI cs.LG

    Symbolic Behaviour in Artificial Intelligence

    Authors: Adam Santoro, Andrew Lampinen, Kory Mathewson, Timothy Lillicrap, David Raposo

    Abstract: The ability to use symbols is the pinnacle of human intelligence, but has yet to be fully replicated in machines. Here we argue that the path towards symbolically fluent artificial intelligence (AI) begins with a reinterpretation of what symbols are, how they come to exist, and how a system behaves when it uses them. We begin by offering an interpretation of symbols as entities whose meaning is es…

    Submitted 21 January, 2022; v1 submitted 5 February, 2021; originally announced February 2021.

  18. arXiv:2012.08508  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Attention over learned object embeddings enables complex visual reasoning

    Authors: David Ding, Felix Hill, Adam Santoro, Malcolm Reynolds, Matt Botvinick

    Abstract: Neural networks have achieved success in a wide array of perceptual tasks but often fail at tasks involving both perception and higher-level reasoning. On these more challenging tasks, bespoke approaches (such as modular symbolic components, independent dynamics models or semantic parsers) targeted towards that specific type of task have typically performed better. The downside to these targeted a…

    Submitted 26 October, 2021; v1 submitted 15 December, 2020; originally announced December 2020.

    Comments: 22 pages, 5 figures

  19. arXiv:2012.05672  [pdf, other]

    cs.LG cs.AI cs.MA

    Imitating Interactive Intelligence

    Authors: Josh Abramson, Arun Ahuja, Iain Barr, Arthur Brussee, Federico Carnevale, Mary Cassin, Rachita Chhaparia, Stephen Clark, Bogdan Damoc, Andrew Dudzik, Petko Georgiev, Aurelia Guy, Tim Harley, Felix Hill, Alden Hung, Zachary Kenton, Jessica Landon, Timothy Lillicrap, Kory Mathewson, Soňa Mokrá, Alistair Muldal, Adam Santoro, Nikolay Savinov, Vikrant Varma, Greg Wayne, et al. (4 additional authors not shown)

    Abstract: A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. This setting nevertheless integrates a number of the central cha…

    Submitted 20 January, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

  20. Comparison of $pp$ and $p \bar{p}$ differential elastic cross sections and observation of the exchange of a colorless $C$-odd gluonic compound

    Authors: V. M. Abazov, B. Abbott, B. S. Acharya, M. Adams, T. Adams, J. P. Agnew, G. D. Alexeev, G. Alkhazov, A. Alton, G. A. Alves, G. Antchev, A. Askew, P. Aspell, A. C. S. Assis Jesus, I. Atanassov, S. Atkins, K. Augsten, V. Aushev, Y. Aushev, V. Avati, C. Avila, F. Badaud, J. Baechler, L. Bagby, C. Baldenegro Barrera, et al. (451 additional authors not shown)

    Abstract: We describe an analysis comparing the $p\bar{p}$ elastic cross section as measured by the D0 Collaboration at a center-of-mass energy of 1.96 TeV to that in $pp$ collisions as measured by the TOTEM Collaboration at 2.76, 7, 8, and 13 TeV using a model-independent approach. The TOTEM cross sections extrapolated to a center-of-mass energy of $\sqrt{s} =$ 1.96 TeV are compared with the D0 measurement…

    Submitted 25 June, 2021; v1 submitted 7 December, 2020; originally announced December 2020.

    Comments: D0 and TOTEM Collaborations

    Journal ref: Phys. Rev. Lett. 127, 062003 (2021)

  21. arXiv:2006.03662  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Rapid Task-Solving in Novel Environments

    Authors: Sam Ritter, Ryan Faulkner, Laurent Sartran, Adam Santoro, Matt Botvinick, David Raposo

    Abstract: We propose the challenge of rapid task-solving in novel environments (RTS), wherein an agent must solve a series of tasks as rapidly as possible in an unfamiliar environment. An effective RTS agent must balance between exploring the unfamiliar environment and solving its current task, all while building a model of the new environment over which it can plan when faced with later tasks. While modern…

    Submitted 19 April, 2021; v1 submitted 5 June, 2020; originally announced June 2020.

  22. arXiv:2005.12769  [pdf, other]

    physics.ins-det hep-ex

    CMS RPC Background -- Studies and Measurements

    Authors: R. Hadjiiska, A. Samalan, M. Tytgat, N. Zaganidis, G. A. Alves, F. Marujo, F. Torres Da Silva De Araujo, E. M. Da Costa, D. De Jesus Damiao, H. Nogima, A. Santoro, S. Fonseca De Souza, A. Aleksandrov, P. Iaydjiev, M. Rodozov, M. Shopova, G. Sultanov, M. Bonchev, A. Dimitrov, L. Litov, B. Pavlov, P. Petkov, A. Petrov, S. J. Qian, C. Bernal, et al. (84 additional authors not shown)

    Abstract: The expected radiation background in the CMS RPC system has been studied using the MC prediction with the CMS FLUKA simulation of the detector and the cavern. The MC geometry used in the analysis describes very accurately the present RPC system but still does not include the complete description of the RPC upgrade region with pseudorapidity $1.9 < \lvert \eta \rvert < 2.4$. Present results will be upd…

    Submitted 13 December, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

    Comments: 6 pages, Conference proceeding for the 2020 Resistive Plate Chambers and Related Detectors. Minor revision of the report, the results remain unchanged. Three new plots are added and some details were explained better

  23. arXiv:1910.04783  [pdf, other]

    physics.soc-ph cond-mat.stat-mech cs.SI

    Optimal percolation in correlated multilayer networks with overlap

    Authors: Andrea Santoro, Vincenzo Nicosia

    Abstract: Multilayer networks have been found to be prone to abrupt cascading failures under random and targeted attacks, but most of the targeting algorithms proposed so far have been mainly tested on uncorrelated systems. Here we show that the size of the critical percolation set of a multilayer network is substantially affected by the presence of inter-layer degree correlations and edge overlap. We provi…

    Submitted 22 July, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

    Comments: 14 pages, 9 figures, 1 table

    Journal ref: Phys. Rev. Research 2, 033122 (2020)

  24. arXiv:1910.00571  [pdf, other]

    cs.AI

    Environmental drivers of systematicity and generalization in a situated agent

    Authors: Felix Hill, Andrew Lampinen, Rosalia Schneider, Stephen Clark, Matthew Botvinick, James L. McClelland, Adam Santoro

    Abstract: The question of whether deep neural networks are good at generalising beyond their immediate training experience is of critical importance for learning-based approaches to AI. Here, we consider tests of out-of-sample generalisation that require an agent to respond to never-seen-before instructions by manipulating and positioning objects in a 3D Unity simulated room. We first describe a comparative…

    Submitted 19 February, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

  25. arXiv:1909.12892  [pdf, other]

    cs.LG cs.AI stat.ML

    Automated curricula through setter-solver interactions

    Authors: Sebastien Racaniere, Andrew K. Lampinen, Adam Santoro, David P. Reichert, Vlad Firoiu, Timothy P. Lillicrap

    Abstract: Reinforcement learning algorithms use correlations between policies and rewards to improve agent performance. But in dynamic or sparsely rewarding environments these correlations are often too small, or rewarding events are too infrequent to make learning feasible. Human education instead relies on curricula--the breakdown of tasks into simpler, static challenges with dense rewards--to build up to…

    Submitted 21 January, 2020; v1 submitted 27 September, 2019; originally announced September 2019.

    Journal ref: International Conference on Learning Representations, 2020

  26. arXiv:1906.09068  [pdf, other]

    physics.soc-ph cs.LG cs.SI math.AT

    Simplex2Vec embeddings for community detection in simplicial complexes

    Authors: Jacob Charles Wright Billings, Mirko Hu, Giulia Lerda, Alexey N. Medvedev, Francesco Mottes, Adrian Onicas, Andrea Santoro, Giovanni Petri

    Abstract: Topological representations are rapidly becoming a popular way to capture and encode higher-order interactions in complex systems. They have found applications in disciplines as different as cancer genomics, brain function, and computational social science, in representing both descriptive features of data and inference models. While intense research has focused on the connectivity and homological…

    Submitted 21 June, 2019; originally announced June 2019.

  27. arXiv:1904.10396  [pdf, other]

    q-bio.NC cs.AI cs.LG

    Is coding a relevant metaphor for building AI? A commentary on "Is coding a relevant metaphor for the brain?", by Romain Brette

    Authors: Adam Santoro, Felix Hill, David Barrett, David Raposo, Matthew Botvinick, Timothy Lillicrap

    Abstract: Brette contends that the neural coding metaphor is an invalid basis for theories of what the brain does. Here, we argue that it is an insufficient guide for building an artificial intelligence that learns to accomplish short- and long-term goals in a complex, changing environment.

    Submitted 18 April, 2019; originally announced April 2019.

  28. arXiv:1903.08049  [pdf, other]

    physics.soc-ph cond-mat.stat-mech cs.IT physics.data-an

    Algorithmic complexity of multiplex networks

    Authors: Andrea Santoro, Vincenzo Nicosia

    Abstract: Multilayer networks preserve full information about the different interactions among the constituents of a complex system, and have recently proven quite useful in modelling transportation networks, social circles, and the human brain. A fundamental and still open problem is to assess if and when the multilayer representation of a system provides a qualitatively better model than the classical sin…

    Submitted 26 June, 2020; v1 submitted 19 March, 2019; originally announced March 2019.

    Comments: 28 pages, 17 figures, 3 tables

    Journal ref: Phys. Rev. X 10, 021069 (2020)

  29. arXiv:1902.00120  [pdf, other]

    cs.AI

    Learning to Make Analogies by Contrasting Abstract Relational Structure

    Authors: Felix Hill, Adam Santoro, David G. T. Barrett, Ari S. Morcos, Timothy Lillicrap

    Abstract: Analogical reasoning has been a principal focus of various waves of AI research. Analogy is particularly challenging for machines because it requires relational structures to be represented such that they can be flexibly applied across diverse domains of experience. Here, we study how analogical reasoning can be induced in neural networks that learn to perceive and reason about raw visual data. We…

    Submitted 31 January, 2019; originally announced February 2019.

  30. arXiv:1901.03559  [pdf, other]

    cs.LG cs.AI stat.ML

    An investigation of model-free planning

    Authors: Arthur Guez, Mehdi Mirza, Karol Gregor, Rishabh Kabra, Sébastien Racanière, Théophane Weber, David Raposo, Adam Santoro, Laurent Orseau, Tom Eccles, Greg Wayne, David Silver, Timothy Lillicrap

    Abstract: The field of reinforcement learning (RL) is facing increasingly challenging domains with combinatorial complexity. For an RL agent to address these challenges, it is essential that it can plan effectively. Prior work has typically utilized an explicit model of the environment, combined with a specific planning algorithm (such as tree search). More recently, a new family of methods have been propos…

    Submitted 20 May, 2019; v1 submitted 11 January, 2019; originally announced January 2019.

  31. arXiv:1808.00300  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.NE

    Learning Visual Question Answering by Bootstrapping Hard Attention

    Authors: Mateusz Malinowski, Carl Doersch, Adam Santoro, Peter Battaglia

    Abstract: Attention mechanisms in biological perception are thought to select subsets of perceptual information for more sophisticated processing which would be prohibitive to perform on all sensory inputs. In computer vision, however, there has been relatively little exploration of hard attention, where some information is selectively ignored, in spite of the success of soft attention, where information is…

    Submitted 1 August, 2018; originally announced August 2018.

    Comments: ECCV 2018

  32. arXiv:1807.05680  [pdf, other]

    physics.ins-det hep-ex

    High Rate RPC detector for LHC

    Authors: F. Lagarde, A. Fagot, M. Gul, C. Roskas, M. Tytgat, N. Zaganidis, S. Fonseca De Souza, A. Santoro, F. Torres Da Silva De Araujo, A. Aleksandrov, R. Hadjiiska, P. Iaydjiev, M. Rodozov, M. Shopova, G. Sultanov, A. Dimitrov, L. Litov, B. Pavlov, P. Petkov, A. Petrov, S. J. Qian, D. Han, W. Yi, C. Avila, A. Cabrera, et al. (77 additional authors not shown)

    Abstract: The High Luminosity LHC (HL-LHC) phase is designed to increase by an order of magnitude the amount of data to be collected by the LHC experiments. The foreseen gradual increase of the instantaneous luminosity of up to more than twice its nominal value of $10^{34}\ {\rm cm}^{-2}{\rm s}^{-1}$ during Phase I and Phase II of the LHC running presents special challenges for the experiments. The…

    Submitted 16 July, 2018; originally announced July 2018.

  33. arXiv:1807.04587  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures

    Authors: Sergey Bartunov, Adam Santoro, Blake A. Richards, Luke Marris, Geoffrey E. Hinton, Timothy Lillicrap

    Abstract: The backpropagation of error algorithm (BP) is impossible to implement in a real brain. The recent success of deep networks in machine learning and AI, however, has inspired proposals for understanding how the brain might learn across multiple layers, and hence how it might approximate BP. As of yet, none of these proposals have been rigorously evaluated on tasks where BP-guided deep learning has…

    Submitted 20 November, 2018; v1 submitted 12 July, 2018; originally announced July 2018.

    Comments: NIPS 2018. Version 2 contains more experimental data including best hyperparameters found

  34. arXiv:1807.04225  [pdf, other]

    cs.LG stat.ML

    Measuring abstract reasoning in neural networks

    Authors: David G. T. Barrett, Felix Hill, Adam Santoro, Ari S. Morcos, Timothy Lillicrap

    Abstract: Whether neural networks can learn abstract reasoning or whether they merely rely on superficial statistics is a topic of recent debate. Here, we propose a dataset and challenge designed to probe abstract reasoning, inspired by a well-known human IQ test. To succeed at this challenge, models must cope with various generalisation `regimes' in which the training and test data differ in clearly-define…

    Submitted 11 July, 2018; originally announced July 2018.

    Comments: ICML 2018

  35. arXiv:1806.01830  [pdf, other]

    cs.LG stat.ML

    Relational Deep Reinforcement Learning

    Authors: Vinicius Zambaldi, David Raposo, Adam Santoro, Victor Bapst, Yujia Li, Igor Babuschkin, Karl Tuyls, David Reichert, Timothy Lillicrap, Edward Lockhart, Murray Shanahan, Victoria Langston, Razvan Pascanu, Matthew Botvinick, Oriol Vinyals, Peter Battaglia

    Abstract: We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning. It uses self-attention to iteratively reason about the relations between entities in a scene and to guide a model-free policy. Our results show that in a novel navigation and…

    Submitted 28 June, 2018; v1 submitted 5 June, 2018; originally announced June 2018.

  36. arXiv:1806.01822  [pdf, other]

    cs.LG stat.ML

    Relational recurrent neural networks

    Authors: Adam Santoro, Ryan Faulkner, David Raposo, Jack Rae, Mike Chrzanowski, Theophane Weber, Daan Wierstra, Oriol Vinyals, Razvan Pascanu, Timothy Lillicrap

    Abstract: Memory-based neural networks model temporal data by leveraging an ability to remember information for long periods. It is unclear, however, whether they also have an ability to perform complex relational reasoning with the information they remember. Here, we first confirm our intuitions that standard memory architectures may struggle at tasks that heavily involve an understanding of the ways in wh…

    Submitted 28 June, 2018; v1 submitted 5 June, 2018; originally announced June 2018.

  37. arXiv:1806.01261  [pdf, other]

    cs.LG cs.AI stat.ML

    Relational inductive biases, deep learning, and graph networks

    Authors: Peter W. Battaglia, Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, Caglar Gulcehre, Francis Song, Andrew Ballard, Justin Gilmer, George Dahl, Ashish Vaswani, Kelsey Allen, Charles Nash, Victoria Langston, Chris Dyer, Nicolas Heess, Daan Wierstra, Pushmeet Kohli, Matt Botvinick, Oriol Vinyals , et al. (2 additional authors not shown)

    Abstract: Artificial intelligence (AI) has undergone a renaissance recently, making major progress in key domains such as vision, language, control, and decision-making. This has been due, in part, to cheap data and cheap compute resources, which have fit the natural strengths of deep learning. However, many defining characteristics of human intelligence, which developed under much different pressures, rema…

    Submitted 17 October, 2018; v1 submitted 4 June, 2018; originally announced June 2018.

  38. arXiv:1805.09786  [pdf, other]

    cs.NE

    Hyperbolic Attention Networks

    Authors: Caglar Gulcehre, Misha Denil, Mateusz Malinowski, Ali Razavi, Razvan Pascanu, Karl Moritz Hermann, Peter Battaglia, Victor Bapst, David Raposo, Adam Santoro, Nando de Freitas

    Abstract: We introduce hyperbolic attention networks to endow neural networks with enough capacity to match the complexity of data with hierarchical and power-law structure. A few recent approaches have successfully demonstrated the benefits of imposing hyperbolic geometry on the parameters of shallow networks. We extend this line of work by imposing hyperbolic geometry on the activations of neural networks…

    Submitted 24 May, 2018; originally announced May 2018.

  39. arXiv:1803.10760  [pdf, other]

    cs.LG stat.ML

    Unsupervised Predictive Memory in a Goal-Directed Agent

    Authors: Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matt Botvinick, Demis Hassabis, Timothy Lillicrap

    Abstract: Animals execute goal-directed behaviours despite the limited range and scope of their sensors. To cope, they explore environments and store memories maintaining estimates of important information that is not presently available. Recently, progress has been made with artificial intelligence (AI) agents that learn to perform tasks from sensory input, even at a human level, by merging reinforcement l…

    Submitted 28 March, 2018; originally announced March 2018.

  40. arXiv:1711.08378  [pdf]

    cs.AI

    Building Machines that Learn and Think for Themselves: Commentary on Lake et al., Behavioral and Brain Sciences, 2017

    Authors: M. Botvinick, D. G. T. Barrett, P. Battaglia, N. de Freitas, D. Kumaran, J. Z Leibo, T. Lillicrap, J. Modayil, S. Mohamed, N. C. Rabinowitz, D. J. Rezende, A. Santoro, T. Schaul, C. Summerfield, G. Wayne, T. Weber, D. Wierstra, S. Legg, D. Hassabis

    Abstract: We agree with Lake and colleagues on their list of key ingredients for building humanlike intelligence, including the idea that model-based reasoning is essential. However, we favor an approach that centers on one additional ingredient: autonomy. In particular, we aim toward agents that can both build and exploit their own internal models, with minimal human hand-engineering. We believe an approac…

    Submitted 22 November, 2017; originally announced November 2017.

  41. arXiv:1710.01068  [pdf, ps, other]

    physics.soc-ph cond-mat.dis-nn cond-mat.stat-mech math.OC nlin.AO

    Pareto optimality in multilayer network growth

    Authors: Andrea Santoro, Vito Latora, Giuseppe Nicosia, Vincenzo Nicosia

    Abstract: We model the formation of multi-layer transportation networks as a multi-objective optimization process, where service providers compete for passengers, and the creation of routes is determined by a multi-objective cost function encoding a trade-off between efficiency and competition. The resulting model reproduces well real-world systems as diverse as airplane, train and bus networks, thus sugges…

    Submitted 19 July, 2018; v1 submitted 3 October, 2017; originally announced October 2017.

    Comments: 6 pages, 4 figures, Supplemental Material

    Journal ref: Phys. Rev. Lett. 121, 128302 (2018)

  42. arXiv:1706.08606  [pdf, other]

    stat.ML cs.CV cs.LG

    Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study

    Authors: Samuel Ritter, David G. T. Barrett, Adam Santoro, Matt M. Botvinick

    Abstract: Deep neural networks (DNNs) have achieved unprecedented performance on a wide range of complex tasks, rapidly outpacing our understanding of the nature of their solutions. This has caused a recent surge of interest in methods for rendering modern neural systems more interpretable. In this work, we propose to address the interpretability problem in modern DNNs using the rich history of problem desc…

    Submitted 29 June, 2017; v1 submitted 26 June, 2017; originally announced June 2017.

    Comments: ICML 2017

  43. arXiv:1706.01427  [pdf, other]

    cs.CL cs.LG

    A simple neural network module for relational reasoning

    Authors: Adam Santoro, David Raposo, David G. T. Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia, Timothy Lillicrap

    Abstract: Relational reasoning is a central component of generally intelligent behavior, but has proven difficult for neural networks to learn. In this paper we describe how to use Relation Networks (RNs) as a simple plug-and-play module to solve problems that fundamentally hinge on relational reasoning. We tested RN-augmented networks on three tasks: visual question answering using a challenging dataset ca…

    Submitted 5 June, 2017; originally announced June 2017.

  44. arXiv:1702.05068  [pdf, other]

    cs.LG cs.CV

    Discovering objects and their relations from entangled scene representations

    Authors: David Raposo, Adam Santoro, David Barrett, Razvan Pascanu, Timothy Lillicrap, Peter Battaglia

    Abstract: Our world can be succinctly and compactly described as structured scenes of objects and relations. A typical room, for example, contains salient objects such as tables, chairs and books, and these objects typically relate to each other by their underlying causes and semantics. This gives rise to correlated features, such as position, function and shape. Humans exploit knowledge of objects and thei…

    Submitted 16 February, 2017; originally announced February 2017.

    Comments: ICLR Workshop 2017

  45. arXiv:1702.04649  [pdf, other]

    cs.LG cs.NE stat.ML

    Generative Temporal Models with Memory

    Authors: Mevlana Gemici, Chia-Chun Hung, Adam Santoro, Greg Wayne, Shakir Mohamed, Danilo J. Rezende, David Amos, Timothy Lillicrap

    Abstract: We consider the general problem of modeling temporal data with long-range dependencies, wherein new observations are fully or partially predictable based on temporally-distant, past observations. A sufficiently powerful temporal model should separate predictable elements of the sequence from unpredictable elements, express uncertainty about those unpredictable elements, and rapidly identify novel…

    Submitted 21 February, 2017; v1 submitted 15 February, 2017; originally announced February 2017.

  46. LHC Forward Physics

    Authors: K. Akiba, M. Akbiyik, M. Albrow, M. Arneodo, V. Avati, J. Baechler, O. Villalobos Baillie, P. Bartalini, J. Bartels, S. Baur, C. Baus, W. Beaumont, U. Behrens, D. Berge, M. Berretti, E. Bossini, R. Boussarie, S. Brodsky, M. Broz, M. Bruschi, P. Bussey, W. Byczynski, J. C. Cabanillas Noris, E. Calvo Villar, A. Campbell , et al. (162 additional authors not shown)

    Abstract: The goal of this report is to give a comprehensive overview of the rich field of forward physics, with a special attention to the topics that can be studied at the LHC. The report starts presenting a selection of the Monte Carlo simulation tools currently available, chapter 2, then enters the rich phenomenology of QCD at low, chapter 3, and high, chapter 4, momentum transfer, while the unique scat…

    Submitted 9 December, 2017; v1 submitted 15 November, 2016; originally announced November 2016.

    Comments: 358 pages; authors added that were missing; minor fixes in affiliations

    Report number: CERN-PH-LPCC-2015-001, SLAC-PUB-16364, DESY 15-167

    Journal ref: J. Phys. G: Nucl. Part. Phys. 43 (2016) 110201

  47. arXiv:1605.06065  [pdf, other]

    cs.LG

    One-shot Learning with Memory-Augmented Neural Networks

    Authors: Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, Timothy Lillicrap

    Abstract: Despite recent breakthroughs in the applications of deep neural networks, one setting that presents a persistent challenge is that of "one-shot learning." Traditional gradient-based networks require a lot of data to learn, often through extensive iterative training. When new data is encountered, the models must inefficiently relearn their parameters to adequately incorporate the new information wi…

    Submitted 19 May, 2016; originally announced May 2016.

    Comments: 13 pages, 8 figures

  48. arXiv:1506.04981  [pdf, other]

    physics.geo-ph cond-mat.mtrl-sci cond-mat.stat-mech physics.data-an

    Analogies between the cracking noise of ethanol-dampened charcoal and earthquakes

    Authors: H. V. Ribeiro, L. S. Costa, L. G. A. Alves, P. A. Santoro, S. Picoli, E. K. Lenzi, R. S. Mendes

    Abstract: We report on an extensive characterization of the cracking noise produced by charcoal samples when dampened with ethanol. We argue that the evaporation of ethanol causes transient and irregularly distributed internal stresses that promote the fragmentation of the samples and mimic some situations found in mining processes. The results show that, in general, the most fundamental seismic laws ruling…

    Submitted 1 July, 2015; v1 submitted 16 June, 2015; originally announced June 2015.

    Comments: Accepted for publication in PRL

    Journal ref: Phys. Rev. Lett. 115, 025503 (2015)

  49. arXiv:1411.4413  [pdf, other]

    hep-ex hep-ph

    Observation of the rare $B^0_s\to\mu^+\mu^-$ decay from the combined analysis of CMS and LHCb data

    Authors: The CMS, LHCb Collaborations: V. Khachatryan, A. M. Sirunyan, A. Tumasyan, W. Adam, T. Bergauer, M. Dragicevic, J. Erö, M. Friedl, R. Frühwirth, V. M. Ghete, C. Hartl, N. Hörmann, J. Hrubec, M. Jeitler, W. Kiesenhofer, V. Knünz, M. Krammer, I. Krätschmer, D. Liko, I. Mikulec, D. Rabady, B. Rahbaran , et al. (2807 additional authors not shown)

    Abstract: A joint measurement is presented of the branching fractions $B^0_s\to\mu^+\mu^-$ and $B^0\to\mu^+\mu^-$ in proton-proton collisions at the LHC by the CMS and LHCb experiments. The data samples were collected in 2011 at a centre-of-mass energy of 7 TeV, and in 2012 at 8 TeV. The combined analysis produces the first observation of the $B^0_s\to\mu^+\mu^-$ decay, with a statistical significance exceeding six sta…

    Submitted 17 August, 2015; v1 submitted 17 November, 2014; originally announced November 2014.

    Comments: Correspondence should be addressed to [email protected]

    Report number: CERN-PH-EP-2014-220, CMS-BPH-13-007, LHCb-PAPER-2014-049

    Journal ref: Nature 522, 68-72 (04 June 2015)

  50. arXiv:1206.2404  [pdf, ps, other]

    physics.data-an physics.comp-ph

    Complexity-Entropy Causality Plane as a Complexity Measure for Two-dimensional Patterns

    Authors: H. V. Ribeiro, L. Zunino, E. K. Lenzi, P. A. Santoro, R. S. Mendes

    Abstract: Complexity measures are essential to understand complex systems and there are numerous definitions to analyze one-dimensional data. However, extensions of these approaches to two or higher-dimensional data, such as images, are much less common. Here, we reduce this gap by applying the ideas of the permutation entropy combined with a relative entropic index. We build up a numerical procedure that c…

    Submitted 11 June, 2012; originally announced June 2012.

    Comments: Accepted for publication in PLoS One

    Journal ref: PLoS ONE 7, e40689 (2012)