Showing 1–6 of 6 results for author: McAleer, S M

  1. arXiv:2309.17179  [pdf, other]

    cs.LG cs.AI cs.CL

    Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training

    Authors: Xidong Feng, Ziyu Wan, Muning Wen, Stephen Marcus McAleer, Ying Wen, Weinan Zhang, Jun Wang

    Abstract: Recent works like Tree-of-Thought (ToT) and Reasoning via Planning (RAP) aim to augment the reasoning capabilities of LLMs by using tree-search algorithms to guide multi-step reasoning. These methods rely on prompting a pre-trained model to serve as a value function and focus on problems with low search depth. As a result, these methods will not work in domains where the pre-trained LLM does not h…

    Submitted 8 February, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

  2. arXiv:2306.05221  [pdf, other]

    cs.GT

    Steering No-Regret Learners to a Desired Equilibrium

    Authors: Brian Hu Zhang, Gabriele Farina, Ioannis Anagnostides, Federico Cacciamani, Stephen Marcus McAleer, Andreas Alexander Haupt, Andrea Celli, Nicola Gatti, Vincent Conitzer, Tuomas Sandholm

    Abstract: A mediator observes no-regret learners playing an extensive-form game repeatedly across $T$ rounds. The mediator attempts to steer players toward some desirable predetermined equilibrium by giving (nonnegative) payments to players. We call this the steering problem. The steering problem captures several problems of interest, among them equilibrium selection and information design (persuas…

    Submitted 17 February, 2024; v1 submitted 8 June, 2023; originally announced June 2023.

  3. arXiv:2306.05216  [pdf, ps, other]

    cs.GT

    Computing Optimal Equilibria and Mechanisms via Learning in Zero-Sum Extensive-Form Games

    Authors: Brian Hu Zhang, Gabriele Farina, Ioannis Anagnostides, Federico Cacciamani, Stephen Marcus McAleer, Andreas Alexander Haupt, Andrea Celli, Nicola Gatti, Vincent Conitzer, Tuomas Sandholm

    Abstract: We introduce a new approach for computing optimal equilibria via learning in games. It applies to extensive-form settings with any number of players, including mechanism design, information design, and solution concepts such as correlated, communication, and certification equilibria. We observe that optimal equilibria are minimax equilibrium strategies of a player in an extensive-form zero-sum gam…

    Submitted 23 May, 2024; v1 submitted 8 June, 2023; originally announced June 2023.

  4. arXiv:2304.10498  [pdf, other]

    cs.GT

    Regret-Minimizing Double Oracle for Extensive-Form Games

    Authors: Xiaohang Tang, Le Cong Dinh, Stephen Marcus McAleer, Yaodong Yang

    Abstract: By incorporating regret minimization, double oracle methods have demonstrated rapid convergence to Nash Equilibrium (NE) in normal-form games and extensive-form games, through algorithms such as online double oracle (ODO) and extensive-form double oracle (XDO), respectively. In this study, we further examine the theoretical convergence rate and sample complexity of such regret minimization-based d…

    Submitted 13 July, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: Accepted at ICML, 2023

  5. arXiv:2206.08686  [pdf, other]

    cs.RO cs.AI cs.LG cs.MA

    Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning

    Authors: Yuanpei Chen, Tianhao Wu, Shengjie Wang, Xidong Feng, Jiechuang Jiang, Stephen Marcus McAleer, Yiran Geng, Hao Dong, Zongqing Lu, Song-Chun Zhu, Yaodong Yang

    Abstract: Achieving human-level dexterity is an important open problem in robotics. However, tasks of dexterous hand manipulation, even at the baby level, are challenging to solve through reinforcement learning (RL). The difficulty lies in the high degrees of freedom and the required cooperation among heterogeneous agents (e.g., joints of fingers). In this study, we propose the Bimanual Dexterous Hands Benc…

    Submitted 11 October, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: 38 pages, 8 figures

    Report number: V-02

  6. arXiv:2205.15434  [pdf, other]

    cs.LG cs.AI cs.GT cs.MA

    A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems

    Authors: Oliver Slumbers, David Henry Mguni, Stephen Marcus McAleer, Stefano B. Blumberg, Jun Wang, Yaodong Yang

    Abstract: In order for agents in multi-agent systems (MAS) to be safe, they need to take into account the risks posed by the actions of other agents. However, the dominant paradigm in game theory (GT) assumes that agents are not affected by risk from other agents and only strive to maximise their expected utility. For example, in hybrid human-AI driving systems, it is necessary to limit large deviations in…

    Submitted 2 March, 2023; v1 submitted 30 May, 2022; originally announced May 2022.