Showing 1–28 of 28 results for author: Tesauro, G

  1. arXiv:2303.17508  [pdf, other]

    cs.AI cs.CV cs.HC q-bio.NC

    Learning in Factored Domains with Information-Constrained Visual Representations

    Authors: Tyler Malloy, Miao Liu, Matthew D. Riemer, Tim Klinger, Gerald Tesauro, Chris R. Sims

    Abstract: Humans learn quickly even in tasks that contain complex visual information. This is due in part to the efficient formation of compressed representations of visual information, allowing for better generalization and robustness. However, compressed representations alone are insufficient for explaining the high speed of human learning. Reinforcement learning (RL) models that seek to replicate this im…

    Submitted 30 March, 2023; originally announced March 2023.

  2. arXiv:2210.16175  [pdf, other]

    cs.GT cs.AI

    Game-Theoretical Perspectives on Active Equilibria: A Preferred Solution Concept over Nash Equilibria

    Authors: Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Gerald Tesauro, Jonathan P. How

    Abstract: Multiagent learning settings are inherently more difficult than single-agent learning because each agent interacts with other simultaneously learning agents in a shared environment. An effective approach in multiagent reinforcement learning is to consider the learning process of agents and influence their future policies toward desirable behaviors from each agent's perspective. Importantly, if eac…

    Submitted 28 October, 2022; originally announced October 2022.

  3. arXiv:2203.03535  [pdf, other]

    cs.LG cs.AI cs.MA

    Influencing Long-Term Behavior in Multiagent Reinforcement Learning

    Authors: Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Michael Everett, Chuangchuang Sun, Gerald Tesauro, Jonathan P. How

    Abstract: The main challenge of multiagent reinforcement learning is the difficulty of learning useful policies in the presence of other simultaneously learning agents whose changing behaviors jointly affect the environment's transition and reward dynamics. An effective approach that has recently emerged for addressing this non-stationarity is for each agent to anticipate the learning of other agents and in…

    Submitted 15 October, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

    Comments: Accepted to NeurIPS 2022. The earlier version was presented at the Gamification and Multiagent Solutions Workshop (ICLR 2022) with a spotlight. Code at https://github.com/dkkim93/further and videos at https://sites.google.com/view/further-marl

  4. arXiv:2109.09876  [pdf, other]

    cs.LG cs.AI

    Context-Specific Representation Abstraction for Deep Option Learning

    Authors: Marwa Abdulhai, Dong-Ki Kim, Matthew Riemer, Miao Liu, Gerald Tesauro, Jonathan P. How

    Abstract: Hierarchical reinforcement learning has focused on discovering temporally extended actions, such as options, that can provide benefits in problems requiring extensive exploration. One promising approach that learns these options end-to-end is the option-critic (OC) framework. We examine and show in this paper that OC does not decompose a problem into simpler sub-problems, but instead increases the…

    Submitted 23 April, 2022; v1 submitted 20 September, 2021; originally announced September 2021.

    Comments: Accepted at AAAI 2022

  5. arXiv:2011.11517  [pdf, other]

    cs.AI

    Consolidation via Policy Information Regularization in Deep RL for Multi-Agent Games

    Authors: Tyler Malloy, Tim Klinger, Miao Liu, Matthew Riemer, Gerald Tesauro, Chris R. Sims

    Abstract: This paper introduces an information-theoretic constraint on learned policy complexity in the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) reinforcement learning algorithm. Previous research with a related approach in continuous control experiments suggests that this method favors learning policies that are more robust to changing environment dynamics. The multi-agent game setting natur…

    Submitted 23 November, 2020; originally announced November 2020.

  6. arXiv:2011.00382  [pdf, other]

    cs.LG cs.AI cs.MA

    A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning

    Authors: Dong-Ki Kim, Miao Liu, Matthew Riemer, Chuangchuang Sun, Marwa Abdulhai, Golnaz Habibi, Sebastian Lopez-Cot, Gerald Tesauro, Jonathan P. How

    Abstract: A fundamental challenge in multiagent reinforcement learning is to learn beneficial behaviors in a shared environment with other simultaneously learning agents. In particular, each agent perceives the environment as effectively non-stationary due to the changing policies of other agents. Moreover, each agent is itself constantly learning, leading to natural non-stationarity in the distribution of…

    Submitted 11 June, 2021; v1 submitted 31 October, 2020; originally announced November 2020.

    Comments: Accepted to ICML 2021. Code at https://github.com/dkkim93/meta-mapg and Videos at https://sites.google.com/view/meta-mapg/home

  7. arXiv:2010.04646  [pdf, other]

    cs.LG cs.AI

    Deep RL With Information Constrained Policies: Generalization in Continuous Control

    Authors: Tyler Malloy, Chris R. Sims, Tim Klinger, Miao Liu, Matthew Riemer, Gerald Tesauro

    Abstract: Biological agents learn and act intelligently in spite of a highly limited capacity to process and store information. Many real-world problems involve continuous control, which represents a difficult task for artificial intelligence agents. In this paper we explore the potential learning advantages a natural constraint on information flow might confer onto artificial agents in continuous control t…

    Submitted 9 October, 2020; originally announced October 2020.

  8. arXiv:2010.03790  [pdf, other]

    cs.AI cs.CL cs.LG

    Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines

    Authors: Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Pushkar Shukla, Sadhana Kumaravel, Gerald Tesauro, Kartik Talamadupula, Mrinmaya Sachan, Murray Campbell

    Abstract: Text-based games have emerged as an important test-bed for Reinforcement Learning (RL) research, requiring RL agents to combine grounded language understanding with sequential decision making. In this paper, we examine the problem of infusing RL agents with commonsense knowledge. Such knowledge would allow agents to efficiently act in the world by pruning out implausible actions, and to perform lo…

    Submitted 8 October, 2020; originally announced October 2020.

  9. arXiv:2004.13242  [pdf, other]

    cs.AI cs.LG stat.ML

    Efficient Black-Box Planning Using Macro-Actions with Focused Effects

    Authors: Cameron Allen, Michael Katz, Tim Klinger, George Konidaris, Matthew Riemer, Gerald Tesauro

    Abstract: The difficulty of deterministic planning increases exponentially with search-tree depth. Black-box planning presents an even greater challenge, since planners must operate without an explicit model of the domain. Heuristics can make search more efficient, but goal-aware heuristics for black-box planning usually rely on goal counting, which is often quite uninformative. In this work, we show how to…

    Submitted 23 June, 2021; v1 submitted 27 April, 2020; originally announced April 2020.

    Comments: To appear at IJCAI 2021; code available at https://github.com/camall3n/focused-macros

  10. arXiv:1912.13408  [pdf, other]

    cs.LG stat.ML

    On the Role of Weight Sharing During Deep Option Learning

    Authors: Matthew Riemer, Ignacio Cases, Clemens Rosenbaum, Miao Liu, Gerald Tesauro

    Abstract: The options framework is a popular approach for building temporally extended actions in reinforcement learning. In particular, the option-critic architecture provides general purpose policy gradient theorems for learning actions from scratch that are extended in time. However, past work makes the key assumption that each of the components of option-critic has independent parameters. In this work w…

    Submitted 6 February, 2020; v1 submitted 31 December, 2019; originally announced December 2019.

    Comments: AAAI 2020

  11. arXiv:1903.04110  [pdf, other]

    cs.LG cs.AI stat.ML

    Hybrid Reinforcement Learning with Expert State Sequences

    Authors: Xiaoxiao Guo, Shiyu Chang, Mo Yu, Gerald Tesauro, Murray Campbell

    Abstract: Existing imitation learning approaches often require that the complete demonstration data, including sequences of actions and states, are available. In this paper, we consider a more realistic and difficult scenario where a reinforcement learning agent only has access to the state sequences of an expert, while the expert actions are unobserved. We propose a novel tensor-based model to infer the un…

    Submitted 10 March, 2019; originally announced March 2019.

    Comments: AAAI 2019; https://github.com/XiaoxiaoGuo/tensor4rl

  12. arXiv:1903.03216  [pdf, other]

    cs.LG cs.AI cs.MA

    Learning Hierarchical Teaching Policies for Cooperative Agents

    Authors: Dong-Ki Kim, Miao Liu, Shayegan Omidshafiei, Sebastian Lopez-Cot, Matthew Riemer, Golnaz Habibi, Gerald Tesauro, Sami Mourad, Murray Campbell, Jonathan P. How

    Abstract: Collective learning can be greatly enhanced when agents effectively exchange knowledge with their peers. In particular, recent work studying agents that learn to teach other teammates has demonstrated that action advising accelerates team-wide learning. However, the prior work has simplified the learning of advising policies by using simple function approximations and only considered advising with…

    Submitted 18 May, 2020; v1 submitted 7 March, 2019; originally announced March 2019.

    Comments: Presented at AAMAS 2020; arXiv version added with the appendix

  13. arXiv:1810.11910  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference

    Authors: Matthew Riemer, Ignacio Cases, Robert Ajemian, Miao Liu, Irina Rish, Yuhai Tu, Gerald Tesauro

    Abstract: Lack of performance when it comes to continual learning over non-stationary distributions of data remains a major challenge in scaling neural network learning to more human realistic settings. In this work we propose a new conceptualization of the continual learning problem in terms of a temporally symmetric trade-off between transfer and interference that can be optimized by enforcing gradient al…

    Submitted 2 May, 2019; v1 submitted 28 October, 2018; originally announced October 2018.

    Comments: ICLR 2019

  14. arXiv:1810.11583  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Abstract Options

    Authors: Matthew Riemer, Miao Liu, Gerald Tesauro

    Abstract: Building systems that autonomously create temporal abstractions from data is a key challenge in scaling learning and planning in reinforcement learning. One popular approach for addressing this challenge is the options framework (Sutton et al., 1999). However, only recently in (Bacon et al., 2017) was a policy gradient theorem derived for online learning of general purpose options in an end to end…

    Submitted 31 December, 2019; v1 submitted 26 October, 2018; originally announced October 2018.

    Comments: NeurIPS 2018

  15. arXiv:1805.07830  [pdf, other]

    cs.MA cs.AI

    Learning to Teach in Cooperative Multiagent Reinforcement Learning

    Authors: Shayegan Omidshafiei, Dong-Ki Kim, Miao Liu, Gerald Tesauro, Matthew Riemer, Christopher Amato, Murray Campbell, Jonathan P. How

    Abstract: Collective human knowledge has clearly benefited from the fact that innovations by individuals are taught to others through communication. Similar to human social groups, agents in distributed learning systems would likely benefit from communication to share knowledge and teach skills. The problem of teaching to improve agent learning has been investigated by prior works, but these approaches make…

    Submitted 31 August, 2018; v1 submitted 20 May, 2018; originally announced May 2018.

  16. arXiv:1805.07513  [pdf, other]

    cs.CL cs.LG

    Diverse Few-Shot Text Classification with Multiple Metrics

    Authors: Mo Yu, Xiaoxiao Guo, Jinfeng Yi, Shiyu Chang, Saloni Potdar, Yu Cheng, Gerald Tesauro, Haoyu Wang, Bowen Zhou

    Abstract: We study few-shot learning in natural language domains. Compared to many existing works that apply either metric-based or optimization-based meta-learning to image domain with low inter-task variance, we consider a more realistic setting, where tasks are diverse. However, it imposes tremendous difficulties to existing state-of-the-art metric-based algorithms since a single metric is insufficient t…

    Submitted 19 May, 2018; originally announced May 2018.

    Comments: NAACL 2018. 11+5 pages. arXiv admin note: text overlap with arXiv:1708.07918

  17. arXiv:1805.00145  [pdf, other]

    cs.CV cs.AI

    Dialog-based Interactive Image Retrieval

    Authors: Xiaoxiao Guo, Hui Wu, Yu Cheng, Steven Rennie, Gerald Tesauro, Rogerio Schmidt Feris

    Abstract: Existing methods for interactive image retrieval have demonstrated the merit of integrating user feedback, improving retrieval results. However, most current systems rely on restricted forms of user feedback, such as binary relevance responses, or feedback based on a fixed set of relative attributes, which limits their impact. In this paper, we introduce a new approach to interactive image search…

    Submitted 20 December, 2018; v1 submitted 30 April, 2018; originally announced May 2018.

    Comments: accepted at NeurIPS 2018

  18. arXiv:1712.04065  [pdf, other]

    cs.AI

    The Eigenoption-Critic Framework

    Authors: Miao Liu, Marlos C. Machado, Gerald Tesauro, Murray Campbell

    Abstract: Eigenoptions (EOs) have been recently introduced as a promising idea for generating a diverse set of options through the graph Laplacian, having been shown to allow efficient exploration. Despite its initial promising results, a couple of issues in current algorithms limit its application, namely: (1) EO methods require two separate steps (eigenoption discovery and reward maximization) to learn a…

    Submitted 11 December, 2017; originally announced December 2017.

  19. arXiv:1711.05116  [pdf, other]

    cs.CL cs.AI

    Evidence Aggregation for Answer Re-Ranking in Open-Domain Question Answering

    Authors: Shuohang Wang, Mo Yu, Jing Jiang, Wei Zhang, Xiaoxiao Guo, Shiyu Chang, Zhiguo Wang, Tim Klinger, Gerald Tesauro, Murray Campbell

    Abstract: A popular recent approach to answering open-domain questions is to first search for question-related passages and then apply reading comprehension models to extract answers. Existing methods usually extract answers from single passages independently. But some questions require a combination of evidence from across different sources to answer correctly. In this paper, we propose two models which ma…

    Submitted 26 April, 2018; v1 submitted 14 November, 2017; originally announced November 2017.

    Comments: 13 pages

  20. arXiv:1710.11089  [pdf, other]

    cs.LG cs.AI

    Eigenoption Discovery through the Deep Successor Representation

    Authors: Marlos C. Machado, Clemens Rosenbaum, Xiaoxiao Guo, Miao Liu, Gerald Tesauro, Murray Campbell

    Abstract: Options in reinforcement learning allow agents to hierarchically decompose a task into subtasks, having the potential to speed up learning and planning. However, autonomously learning effective sets of options is still a major challenge in the field. In this paper we focus on the recently introduced idea of using representation learning methods to guide the option discovery process. Specifically,…

    Submitted 23 February, 2018; v1 submitted 30 October, 2017; originally announced October 2017.

    Comments: Published as a conference paper at ICLR 2018

  21. arXiv:1709.00023  [pdf, other]

    cs.CL cs.AI

    R$^3$: Reinforced Reader-Ranker for Open-Domain Question Answering

    Authors: Shuohang Wang, Mo Yu, Xiaoxiao Guo, Zhiguo Wang, Tim Klinger, Wei Zhang, Shiyu Chang, Gerald Tesauro, Bowen Zhou, Jing Jiang

    Abstract: In recent years researchers have achieved considerable success applying neural network methods to question answering (QA). These approaches have achieved state of the art results in simplified closed-domain settings such as the SQuAD (Rajpurkar et al., 2016) dataset, which provides a pre-selected passage, from which the answer to a given question may be extracted. More recently, researchers have b…

    Submitted 21 November, 2017; v1 submitted 31 August, 2017; originally announced September 2017.

    Comments: 8 pages, accepted by AAAI 2018

  22. arXiv:1708.07918  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Robust Task Clustering for Deep Many-Task Learning

    Authors: Mo Yu, Xiaoxiao Guo, Jinfeng Yi, Shiyu Chang, Saloni Potdar, Gerald Tesauro, Haoyu Wang, Bowen Zhou

    Abstract: We investigate task clustering for deep-learning based multi-task and few-shot learning in a many-task setting. We propose a new method to measure task similarities with cross-task transfer performance matrix for the deep learning scenario. Although this matrix provides us critical information regarding similarity between tasks, its asymmetric property and unreliable performance scores can affect…

    Submitted 17 May, 2018; v1 submitted 25 August, 2017; originally announced August 2017.

  23. arXiv:1606.00776  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE stat.ML

    Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation

    Authors: Iulian Vlad Serban, Tim Klinger, Gerald Tesauro, Kartik Talamadupula, Bowen Zhou, Yoshua Bengio, Aaron Courville

    Abstract: We introduce the multiresolution recurrent neural network, which extends the sequence-to-sequence framework to model natural language generation as two parallel discrete stochastic processes: a sequence of high-level coarse tokens, and a sequence of natural language tokens. There are many ways to estimate or learn the high-level coarse tokens, but we argue that a simple extraction procedure is suf…

    Submitted 13 June, 2016; v1 submitted 2 June, 2016; originally announced June 2016.

    Comments: 21 pages, 2 figures, 10 tables

    ACM Class: I.5.1; I.2.7

  24. arXiv:1605.07427  [pdf, other]

    stat.ML cs.CL cs.LG cs.NE

    Hierarchical Memory Networks

    Authors: Sarath Chandar, Sungjin Ahn, Hugo Larochelle, Pascal Vincent, Gerald Tesauro, Yoshua Bengio

    Abstract: Memory networks are neural networks with an explicit memory component that can be both read and written to by the network. The memory is often addressed in a soft way using a softmax function, making end-to-end training with backpropagation possible. However, this is not computationally scalable for applications which require the network to read from extremely large memories. On the other hand, it…

    Submitted 24 May, 2016; originally announced May 2016.

    Comments: 10 pages

  25. arXiv:1601.00024  [pdf, other]

    cs.LG stat.ML

    Selecting Near-Optimal Learners via Incremental Data Allocation

    Authors: Ashish Sabharwal, Horst Samulowitz, Gerald Tesauro

    Abstract: We study a novel machine learning (ML) problem setting of sequentially allocating small subsets of training data amongst a large set of classifiers. The goal is to select a classifier that will give near-optimal accuracy when trained on all data, while also minimizing the cost of misallocated samples. This is motivated by large modern datasets and ML toolkits with many combinations of learning alg…

    Submitted 31 December, 2015; originally announced January 2016.

    Comments: AAAI-2016: The Thirtieth AAAI Conference on Artificial Intelligence

  26. Analysis of Watson's Strategies for Playing Jeopardy!

    Authors: Gerald Tesauro, David C. Gondek, Jonathan Lenchner, James Fan, John M. Prager

    Abstract: Major advances in Question Answering technology were needed for IBM Watson to play Jeopardy! at championship level -- the show requires rapid-fire answers to challenging natural language questions, broad general knowledge, high precision, and accurate confidence estimates. In addition, Jeopardy! features four types of decision making carrying great strategic importance: (1) Daily Double wagering;…

    Submitted 3 February, 2014; originally announced February 2014.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 47, pages 205-251, 2013

  27. arXiv:1212.2443  [pdf]

    cs.GT

    Cooperative Negotiation in Autonomic Systems using Incremental Utility Elicitation

    Authors: Craig Boutilier, Rajarshi Das, Jeffrey O. Kephart, Gerald Tesauro, William E. Walsh

    Abstract: Decentralized resource allocation is a key problem for large-scale autonomic (or self-managing) computing systems. Motivated by a data center scenario, we explore efficient techniques for resolving resource conflicts via cooperative negotiation. Rather than computing in advance the functional dependence of each element's utility upon the amount of resource it receives, which could be prohibitivel…

    Submitted 19 October, 2012; originally announced December 2012.

    Comments: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)

    Report number: UAI-P-2003-PG-89-97

  28. arXiv:1203.3519  [pdf]

    cs.LG cs.AI stat.ML

    Bayesian Inference in Monte-Carlo Tree Search

    Authors: Gerald Tesauro, V T Rajan, Richard Segal

    Abstract: Monte-Carlo Tree Search (MCTS) methods are drawing great interest after yielding breakthrough results in computer Go. This paper proposes a Bayesian approach to MCTS that is inspired by distribution-free approaches such as UCT [13], yet significantly differs in important respects. The Bayesian framework allows potentially much more accurate (Bayes-optimal) estimation of node values and node uncerta…

    Submitted 15 March, 2012; originally announced March 2012.

    Comments: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)

    Report number: UAI-P-2010-PG-580-588