Showing 1–28 of 28 results for author: Lerer, A

  1. arXiv:2310.01427  [pdf, other]

    cs.CL cs.AI

    Attention Sorting Combats Recency Bias In Long Context Language Models

    Authors: Alexander Peysakhovich, Adam Lerer

    Abstract: Current language models often fail to incorporate long contexts efficiently during generation. We show that a major contributor to this issue are attention priors that are likely learned during pre-training: relevant information located earlier in context is attended to less on average. Yet even when models fail to use the information from a relevant document in their response, they still pay pref…

    Submitted 28 September, 2023; originally announced October 2023.

  2. arXiv:2210.05492  [pdf, other]

    cs.GT cs.AI cs.LG cs.MA

    Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning

    Authors: Anton Bakhtin, David J Wu, Adam Lerer, Jonathan Gray, Athul Paul Jacob, Gabriele Farina, Alexander H Miller, Noam Brown

    Abstract: No-press Diplomacy is a complex strategy game involving both cooperation and competition that has served as a benchmark for multi-agent AI research. While self-play reinforcement learning has resulted in numerous successes in purely adversarial games like chess, Go, and poker, self-play alone is insufficient for achieving optimal performance in domains involving cooperation with humans. We address…

    Submitted 11 October, 2022; originally announced October 2022.

  3. arXiv:2210.05125  [pdf, ps, other]

    cs.AI cs.LG cs.MA

    Human-AI Coordination via Human-Regularized Search and Learning

    Authors: Hengyuan Hu, David J Wu, Adam Lerer, Jakob Foerster, Noam Brown

    Abstract: We consider the problem of making AI agents that collaborate well with humans in partially observable fully cooperative environments given datasets of human behavior. Inspired by piKL, a human-data-regularized search method that improves upon a behavioral cloning policy without diverging far away from it, we develop a three-step algorithm that achieves strong performance in coordinating with real h…

    Submitted 10 October, 2022; originally announced October 2022.

  4. arXiv:2206.04907  [pdf, other]

    cs.LG stat.ME

    Efficient Heterogeneous Treatment Effect Estimation With Multiple Experiments and Multiple Outcomes

    Authors: Leon Yao, Caroline Lo, Israel Nir, Sarah Tan, Ariel Evnine, Adam Lerer, Alex Peysakhovich

    Abstract: Learning heterogeneous treatment effects (HTEs) is an important problem across many fields. Most existing methods consider the setting with a single treatment arm and a single outcome metric. However, in many real world domains, experiments are run consistently - for example, in internet companies, A/B tests are run every day to measure the impacts of potential changes across many different metric…

    Submitted 10 June, 2022; originally announced June 2022.

  5. arXiv:2204.04826  [pdf, other]

    cs.GT

    Equilibrium Finding in Normal-Form Games Via Greedy Regret Minimization

    Authors: Hugh Zhang, Adam Lerer, Noam Brown

    Abstract: We extend the classic regret minimization framework for approximating equilibria in normal-form games by greedily weighing iterates based on regrets observed at runtime. Theoretically, our method retains all previous convergence rate guarantees. Empirically, experiments on large randomly generated games and normal-form subgames of the AI benchmark Diplomacy show that greedy weights outperforms pre…

    Submitted 10 April, 2022; originally announced April 2022.

    Comments: AAAI 2022

  6. arXiv:2112.07544  [pdf, other]

    cs.MA cs.AI cs.GT cs.LG

    Modeling Strong and Human-Like Gameplay with KL-Regularized Search

    Authors: Athul Paul Jacob, David J. Wu, Gabriele Farina, Adam Lerer, Hengyuan Hu, Anton Bakhtin, Jacob Andreas, Noam Brown

    Abstract: We consider the task of building strong but human-like policies in multi-agent decision-making problems, given examples of human behavior. Imitation learning is effective at predicting human actions but may not match the strength of expert humans, while self-play learning and search techniques (e.g. AlphaZero) lead to strong performance but may produce policies that are difficult for humans to und…

    Submitted 16 February, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

  7. arXiv:2110.02924  [pdf, other]

    cs.LG cs.AI cs.GT cs.MA

    No-Press Diplomacy from Scratch

    Authors: Anton Bakhtin, David Wu, Adam Lerer, Noam Brown

    Abstract: Prior AI successes in complex games have largely focused on settings with at most hundreds of actions at each decision point. In contrast, Diplomacy is a game with more than 10^20 possible actions per turn. Previous attempts to address games with large branching factors, such as Diplomacy, StarCraft, and Dota, used human data to bootstrap the policy or used handcrafted reward shaping. In this pape…

    Submitted 6 October, 2021; originally announced October 2021.

  8. arXiv:2106.09086  [pdf, other]

    cs.AI

    Learned Belief Search: Efficiently Improving Policies in Partially Observable Settings

    Authors: Hengyuan Hu, Adam Lerer, Noam Brown, Jakob Foerster

    Abstract: Search is an important tool for computing effective policies in single- and multi-agent environments, and has been crucial for achieving superhuman performance in several benchmark fully and partially observable games. However, one major limitation of prior search approaches for partially observable environments is that the computational cost scales poorly with the amount of hidden information. In…

    Submitted 16 June, 2021; originally announced June 2021.

  9. arXiv:2103.04000  [pdf, other]

    cs.AI cs.LG

    Off-Belief Learning

    Authors: Hengyuan Hu, Adam Lerer, Brandon Cui, David Wu, Luis Pineda, Noam Brown, Jakob Foerster

    Abstract: The standard problem setting in Dec-POMDPs is self-play, where the goal is to find a set of policies that play optimally together. Policies learned through self-play may adopt arbitrary conventions and implicitly rely on multi-step reasoning based on fragile assumptions about other agents' actions and thus fail when paired with humans or independently trained agents at test time. To address this,…

    Submitted 17 August, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

  10. arXiv:2011.09679  [pdf, other]

    cs.LG

    Scalable Graph Neural Networks for Heterogeneous Graphs

    Authors: Lingfan Yu, Jiajun Shen, Jinyang Li, Adam Lerer

    Abstract: Graph neural networks (GNNs) are a popular class of parametric model for learning over graph-structured data. Recent work has argued that GNNs primarily use the graph for feature smoothing, and have shown competitive results on benchmark tasks by simply operating on graph-smoothed node features, rather than using end-to-end learned feature hierarchies that are challenging to scale to large graphs.…

    Submitted 19 November, 2020; originally announced November 2020.

  11. arXiv:2011.06505  [pdf, other]

    cs.LG

    Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian

    Authors: Jack Parker-Holder, Luke Metz, Cinjon Resnick, Hengyuan Hu, Adam Lerer, Alistair Letcher, Alex Peysakhovich, Aldo Pacchiano, Jakob Foerster

    Abstract: Over the last decade, a single algorithm has changed many facets of our lives - Stochastic Gradient Descent (SGD). In the era of ever decreasing loss functions, SGD and its various offspring have become the go-to optimization tool in machine learning and are a key component of the success of deep neural networks (DNNs). While SGD is guaranteed to converge to a local optimum (under loose assumption…

    Submitted 12 November, 2020; originally announced November 2020.

    Comments: Camera-ready version, NeurIPS 2020

  12. arXiv:2010.02923  [pdf, other]

    cs.AI cs.GT cs.LG

    Human-Level Performance in No-Press Diplomacy via Equilibrium Search

    Authors: Jonathan Gray, Adam Lerer, Anton Bakhtin, Noam Brown

    Abstract: Prior AI breakthroughs in complex games have focused on either the purely adversarial or purely cooperative settings. In contrast, Diplomacy is a game of shifting alliances that involves both cooperation and competition. For this reason, Diplomacy has proven to be a formidable research challenge. In this paper we describe an agent for the no-press variant of Diplomacy that combines supervised lear…

    Submitted 3 May, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

  13. arXiv:2007.13544  [pdf, other]

    cs.GT cs.AI cs.LG

    Combining Deep Reinforcement Learning and Search for Imperfect-Information Games

    Authors: Noam Brown, Anton Bakhtin, Adam Lerer, Qucheng Gong

    Abstract: The combination of deep reinforcement learning and search at both training and test time is a powerful paradigm that has led to a number of successes in single-agent settings and perfect-information games, best exemplified by AlphaZero. However, prior algorithms of this form cannot cope with imperfect-information games. This paper presents ReBeL, a general framework for self-play reinforcement lea…

    Submitted 28 November, 2020; v1 submitted 27 July, 2020; originally announced July 2020.

  14. arXiv:2006.10410  [pdf, other]

    cs.LG cs.GT stat.ML

    DREAM: Deep Regret minimization with Advantage baselines and Model-free learning

    Authors: Eric Steinberger, Adam Lerer, Noam Brown

    Abstract: We introduce DREAM, a deep reinforcement learning algorithm that finds optimal strategies in imperfect-information games with multiple agents. Formally, DREAM converges to a Nash Equilibrium in two-player zero-sum games and to an extensive-form coarse correlated equilibrium in all other games. Our primary innovation is an effective algorithm that, in contrast to other regret-based deep learning al…

    Submitted 29 November, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

  15. arXiv:2003.02979  [pdf, other]

    cs.AI

    "Other-Play" for Zero-Shot Coordination

    Authors: Hengyuan Hu, Adam Lerer, Alex Peysakhovich, Jakob Foerster

    Abstract: We consider the problem of zero-shot coordination - constructing AI agents that can coordinate with novel partners they have not seen before (e.g. humans). Standard Multi-Agent Reinforcement Learning (MARL) methods typically focus on the self-play (SP) setting where agents construct strategies by playing the game with themselves repeatedly. Unfortunately, applying SP naively to the zero-shot coord…

    Submitted 12 May, 2021; v1 submitted 5 March, 2020; originally announced March 2020.

  16. arXiv:1912.02318  [pdf, other]

    cs.AI cs.MA

    Improving Policies via Search in Cooperative Partially Observable Games

    Authors: Adam Lerer, Hengyuan Hu, Jakob Foerster, Noam Brown

    Abstract: Recent superhuman results in games have largely been achieved in a variety of zero-sum settings, such as Go and Poker, in which agents need to compete against others. However, just like humans, real-world AI systems have to coordinate and communicate with other agents in cooperative partially observable environments as well. These settings commonly require participants to both interpret the action…

    Submitted 4 December, 2019; originally announced December 2019.

    Journal ref: AAAI 2020

  17. arXiv:1912.01703  [pdf, other]

    cs.LG cs.MS stat.ML

    PyTorch: An Imperative Style, High-Performance Deep Learning Library

    Authors: Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala

    Abstract: Deep learning frameworks have often focused on either usability or speed, but not both. PyTorch is a machine learning library that shows that these two goals are in fact compatible: it provides an imperative and Pythonic programming style that supports code as a model, makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting…

    Submitted 3 December, 2019; originally announced December 2019.

    Comments: 12 pages, 3 figures, NeurIPS 2019

  18. arXiv:1904.02235  [pdf, other]

    cs.GT cs.AI cs.MA stat.AP

    Robust Multi-agent Counterfactual Prediction

    Authors: Alexander Peysakhovich, Christian Kroer, Adam Lerer

    Abstract: We consider the problem of using logged data to make predictions about what would happen if we changed the 'rules of the game' in a multi-agent system. This task is difficult because in many cases we observe actions individuals take but not their private information or their full reward functions. In addition, agents are strategic, so when the rules change, they will also change their actions. Exi…

    Submitted 3 April, 2019; originally announced April 2019.

  19. arXiv:1903.12287  [pdf, other]

    cs.LG cs.AI cs.DC cs.SI stat.ML

    PyTorch-BigGraph: A Large-scale Graph Embedding System

    Authors: Adam Lerer, Ledell Wu, Jiajun Shen, Timothee Lacroix, Luca Wehrstedt, Abhijit Bose, Alex Peysakhovich

    Abstract: Graph embedding methods produce unsupervised node features from graphs that can then be used for a variety of machine learning tasks. Modern graphs, particularly in industrial applications, contain billions of nodes and trillions of edges, which exceeds the capability of existing embedding systems. We present PyTorch-BigGraph (PBG), an embedding system that incorporates several modifications to tr…

    Submitted 9 April, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

    Journal ref: Proceedings of The Conference on Systems and Machine Learning, 2019

  20. arXiv:1811.00164  [pdf, other]

    cs.AI cs.GT cs.LG

    Deep Counterfactual Regret Minimization

    Authors: Noam Brown, Adam Lerer, Sam Gross, Tuomas Sandholm

    Abstract: Counterfactual Regret Minimization (CFR) is the leading framework for solving large imperfect-information games. It converges to an equilibrium by iteratively traversing the game tree. In order to deal with extremely large games, abstraction is typically applied before running CFR. The abstracted game is solved with tabular CFR, and its solution is mapped back to the full game. This process can be…

    Submitted 22 May, 2019; v1 submitted 31 October, 2018; originally announced November 2018.

    Journal ref: International Conference on Machine Learning (ICML), 2019

  21. arXiv:1806.10071  [pdf, other]

    cs.AI cs.GT

    Learning Existing Social Conventions via Observationally Augmented Self-Play

    Authors: Adam Lerer, Alexander Peysakhovich

    Abstract: In order for artificial agents to coordinate effectively with people, they must act consistently with existing conventions (e.g. how to navigate in traffic, which language to speak, or how to coordinate with teammates). A group's conventions can be viewed as a choice of equilibrium in a coordination game. We consider the problem of an agent learning a policy for a coordination game in a simulated…

    Submitted 13 March, 2019; v1 submitted 26 June, 2018; originally announced June 2018.

    Comments: Published in AAAI-AIES2019 - Best Paper

  22. arXiv:1803.07616  [pdf, other]

    cs.AI cs.CV

    IntPhys: A Framework and Benchmark for Visual Intuitive Physics Reasoning

    Authors: Ronan Riochet, Mario Ynocente Castro, Mathieu Bernard, Adam Lerer, Rob Fergus, Véronique Izard, Emmanuel Dupoux

    Abstract: In order to reach human performance on complex visual tasks, artificial systems need to incorporate a significant amount of understanding of the world in terms of macroscopic objects, movements, forces, etc. Inspired by work on intuitive physics in infants, we propose an evaluation benchmark which diagnoses how much a given system understands about physics by testing whether it can tell apart well matc…

    Submitted 11 February, 2020; v1 submitted 20 March, 2018; originally announced March 2018.

  23. arXiv:1803.00512  [pdf, other]

    cs.AI

    Composable Planning with Attributes

    Authors: Amy Zhang, Adam Lerer, Sainbayar Sukhbaatar, Rob Fergus, Arthur Szlam

    Abstract: The tasks that an agent will need to solve often are not known during training. However, if the agent knows which properties of the environment are important then, after learning how its actions affect those properties, it may be able to use this knowledge to solve complex tasks without training specifically for them. Towards this end, we consider a setup in which an environment is augmented with…

    Submitted 25 April, 2019; v1 submitted 1 March, 2018; originally announced March 2018.

    Journal ref: International Conference on Machine Learning, 2018

  24. arXiv:1710.06975  [pdf, other]

    cs.AI cs.GT cs.MA

    Consequentialist conditional cooperation in social dilemmas with imperfect information

    Authors: Alexander Peysakhovich, Adam Lerer

    Abstract: Social dilemmas, where mutual cooperation can lead to high payoffs but participants face incentives to cheat, are ubiquitous in multi-agent interaction. We wish to construct agents that cooperate with pure cooperators, avoid exploitation by pure defectors, and incentivize cooperation from the rest. However, often the actions taken by a partner are (partially) unobserved or the consequences of indi…

    Submitted 2 March, 2018; v1 submitted 18 October, 2017; originally announced October 2017.

    Journal ref: Proceedings of the International Conference on Learning Representations 2018

  25. arXiv:1709.02865  [pdf, other]

    cs.AI cs.GT

    Prosocial learning agents solve generalized Stag Hunts better than selfish ones

    Authors: Alexander Peysakhovich, Adam Lerer

    Abstract: Deep reinforcement learning has become an important paradigm for constructing agents that can enter complex multi-agent situations and improve their policies through experience. One commonly used technique is reactive training - applying standard RL methods while treating other agents as a part of the learner's environment. It is known that in general-sum games reactive training can lead groups of…

    Submitted 8 December, 2017; v1 submitted 8 September, 2017; originally announced September 2017.

  26. arXiv:1707.01068  [pdf, other]

    cs.AI cs.GT cs.MA

    Maintaining cooperation in complex social dilemmas using deep reinforcement learning

    Authors: Adam Lerer, Alexander Peysakhovich

    Abstract: Social dilemmas are situations where individuals face a temptation to increase their payoffs at a cost to total welfare. Building artificially intelligent agents that achieve good outcomes in these situations is important because many real world interactions include a tension between selfish interests and the welfare of others. We show how to modify modern reinforcement learning methods to constru…

    Submitted 2 March, 2018; v1 submitted 4 July, 2017; originally announced July 2017.

  27. arXiv:1604.02135  [pdf, other]

    cs.CV

    A MultiPath Network for Object Detection

    Authors: Sergey Zagoruyko, Adam Lerer, Tsung-Yi Lin, Pedro O. Pinheiro, Sam Gross, Soumith Chintala, Piotr Dollár

    Abstract: The recent COCO object detection dataset presents several new challenges for object detection. In particular, it contains objects at a broad range of scales, less prototypical images, and requires more precise localization. To address these challenges, we test three modifications to the standard Fast R-CNN object detector: (1) skip connections that give the detector access to features at multiple…

    Submitted 8 August, 2016; v1 submitted 7 April, 2016; originally announced April 2016.

  28. arXiv:1603.01312  [pdf, other]

    cs.AI

    Learning Physical Intuition of Block Towers by Example

    Authors: Adam Lerer, Sam Gross, Rob Fergus

    Abstract: Wooden blocks are a common toy for infants, allowing them to develop motor skills and gain intuition about the physical behavior of the world. In this paper, we explore the ability of deep feed-forward models to learn such intuitive physics. Using a 3D game engine, we create small towers of wooden blocks whose stability is randomized and render them collapsing (or remaining upright). This data all…

    Submitted 3 March, 2016; originally announced March 2016.