Showing 1–14 of 14 results for author: Sokota, S

Searching in archive cs.
  1. arXiv:2405.19540

    cs.IT cs.CR

    Computing Low-Entropy Couplings for Large-Support Distributions

    Authors: Samuel Sokota, Dylan Sam, Christian Schroeder de Witt, Spencer Compton, Jakob Foerster, J. Zico Kolter

    Abstract: Minimum-entropy coupling (MEC) -- the process of finding a joint distribution with minimum entropy for given marginals -- has applications in areas such as causality and steganography. However, existing algorithms are either computationally intractable for large-support distributions or limited to specific distribution types and sensitive to hyperparameter choices. This work addresses these limita…

    Submitted 29 May, 2024; originally announced May 2024.

  2. arXiv:2305.13546

    cs.LG cs.AI

    Neural Functional Transformers

    Authors: Allan Zhou, Kaien Yang, Yiding Jiang, Kaylee Burns, Winnie Xu, Samuel Sokota, J. Zico Kolter, Chelsea Finn

    Abstract: The recent success of neural networks as implicit representations of data has driven growing interest in neural functionals: models that can process other neural networks as input by operating directly over their weight spaces. Nevertheless, constructing expressive and efficient neural functional architectures that can handle high-dimensional weight-space objects remains challenging. This paper use…

    Submitted 22 May, 2023; originally announced May 2023.

  3. arXiv:2304.13138

    cs.AI cs.LG

    The Update-Equivalence Framework for Decision-Time Planning

    Authors: Samuel Sokota, Gabriele Farina, David J. Wu, Hengyuan Hu, Kevin A. Wang, J. Zico Kolter, Noam Brown

    Abstract: The process of revising (or constructing) a policy at execution time -- known as decision-time planning -- has been key to achieving superhuman performance in perfect-information games like chess and Go. A recent line of work has extended decision-time planning to imperfect-information games, leading to superhuman performance in poker. However, these methods involve solving subgames whose sizes gr…

    Submitted 13 May, 2024; v1 submitted 25 April, 2023; originally announced April 2023.

  4. arXiv:2303.10733

    cs.AI cs.MA

    Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning

    Authors: Yat Long Lo, Christian Schroeder de Witt, Samuel Sokota, Jakob Nicolaus Foerster, Shimon Whiteson

    Abstract: By enabling agents to communicate, recent cooperative multi-agent reinforcement learning (MARL) methods have demonstrated better task performance and more coordinated behavior. Most existing approaches facilitate inter-agent communication by allowing agents to send messages to each other through free communication channels, i.e., cheap talk channels. Current methods require these channels to be co…

    Submitted 19 March, 2023; originally announced March 2023.

    Comments: The 11th International Conference on Learning Representations (ICLR)

  5. arXiv:2302.14040

    cs.LG cs.AI

    Permutation Equivariant Neural Functionals

    Authors: Allan Zhou, Kaien Yang, Kaylee Burns, Adriano Cardace, Yiding Jiang, Samuel Sokota, J. Zico Kolter, Chelsea Finn

    Abstract: This work studies the design of neural networks that can process the weights or gradients of other neural networks, which we refer to as neural functional networks (NFNs). Despite a wide range of potential applications, including learned optimization, processing implicit neural representations, network editing, and policy evaluation, there are few unifying principles for designing effective archit…

    Submitted 26 September, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: To appear in Neural Information Processing Systems (NeurIPS), 2023

  6. arXiv:2301.09159

    cs.GT cs.AI cs.LG

    Abstracting Imperfect Information Away from Two-Player Zero-Sum Games

    Authors: Samuel Sokota, Ryan D'Orazio, Chun Kai Ling, David J. Wu, J. Zico Kolter, Noam Brown

    Abstract: In their seminal work, Nayyar et al. (2013) showed that imperfect information can be abstracted away from common-payoff games by having players publicly announce their policies as they play. This insight underpins sound solvers and decision-time planning algorithms for common-payoff games. Unfortunately, a naive application of the same insight to two-player zero-sum games fails because Nash equili…

    Submitted 31 July, 2023; v1 submitted 22 January, 2023; originally announced January 2023.

  7. arXiv:2210.14889

    cs.CR cs.AI cs.MM

    Perfectly Secure Steganography Using Minimum Entropy Coupling

    Authors: Christian Schroeder de Witt, Samuel Sokota, J. Zico Kolter, Jakob Foerster, Martin Strohmeier

    Abstract: Steganography is the practice of encoding secret information into innocuous content in such a manner that an adversarial third party would not realize that there is hidden meaning. While this problem has classically been studied in security literature, recent advances in generative models have led to a shared interest among security and machine learning researchers in developing scalable steganogr…

    Submitted 30 October, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

  8. arXiv:2207.12322

    cs.AI cs.LG

    Self-Explaining Deviations for Coordination

    Authors: Hengyuan Hu, Samuel Sokota, David Wu, Anton Bakhtin, Andrei Lupu, Brandon Cui, Jakob N. Foerster

    Abstract: Fully cooperative, partially observable multi-agent problems are ubiquitous in the real world. In this paper, we focus on a specific subclass of coordination problems in which humans are able to discover self-explaining deviations (SEDs). SEDs are actions that deviate from the common understanding of what reasonable behavior would be in normal circumstances. They are taken with the intention of ca…

    Submitted 13 July, 2022; originally announced July 2022.

  9. arXiv:2206.05825

    cs.LG cs.AI cs.GT

    A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games

    Authors: Samuel Sokota, Ryan D'Orazio, J. Zico Kolter, Nicolas Loizou, Marc Lanctot, Ioannis Mitliagkas, Noam Brown, Christian Kroer

    Abstract: This work studies an algorithm, which we call magnetic mirror descent, that is inspired by mirror descent and the non-Euclidean proximal gradient algorithm. Our contribution is demonstrating the virtues of magnetic mirror descent as both an equilibrium solver and as an approach to reinforcement learning in two-player zero-sum games. These virtues include: 1) Being the first quantal response equili…

    Submitted 11 April, 2023; v1 submitted 12 June, 2022; originally announced June 2022.

  10. arXiv:2201.12658

    cs.LG cs.AI cs.MA

    Learning Intuitive Policies Using Action Features

    Authors: Mingwei Ma, Jizhou Liu, Samuel Sokota, Max Kleiman-Weiner, Jakob Foerster

    Abstract: An unaddressed challenge in multi-agent coordination is to enable AI agents to exploit the semantic relationships between the features of actions and the features of observations. Humans take advantage of these relationships in highly intuitive ways. For instance, in the absence of a shared language, we might point to the object we desire or hold up our fingers to indicate how many objects we want…

    Submitted 5 June, 2023; v1 submitted 29 January, 2022; originally announced January 2022.

    Comments: ICML 2023

  11. arXiv:2107.08295

    cs.AI cs.MA

    Communicating via Markov Decision Processes

    Authors: Samuel Sokota, Christian Schroeder de Witt, Maximilian Igl, Luisa Zintgraf, Philip Torr, Martin Strohmeier, J. Zico Kolter, Shimon Whiteson, Jakob Foerster

    Abstract: We consider the problem of communicating exogenous information by means of Markov decision process trajectories. This setting, which we call a Markov coding game (MCG), generalizes both source coding and a large class of referential games. MCGs also isolate a problem that is important in decentralized control settings in which cheap-talk is not available -- namely, they require balancing communica…

    Submitted 12 June, 2022; v1 submitted 17 July, 2021; originally announced July 2021.

    Comments: ICML 2022

  12. arXiv:2101.04237

    cs.AI cs.LG

    Solving Common-Payoff Games with Approximate Policy Iteration

    Authors: Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot

    Abstract: For artificially intelligent learning systems to have widespread applicability in real-world settings, it is important that they be able to operate decentrally. Unfortunately, decentralized control is difficult -- computing even an epsilon-optimal joint policy is a NEXP-complete problem. Nevertheless, a recently rediscovered insight -- that a team of agents can coordinate via common knowledge -- h…

    Submitted 11 January, 2021; originally announced January 2021.

    Comments: AAAI 2021

  13. arXiv:2007.02418

    cs.LG cs.AI stat.ML

    Selective Dyna-style Planning Under Limited Model Capacity

    Authors: Zaheer Abbas, Samuel Sokota, Erin J. Talvitie, Martha White

    Abstract: In model-based reinforcement learning, planning with an imperfect model of the environment has the potential to harm learning progress. But even when a model is imperfect, it may still contain information that is useful for planning. In this paper, we investigate the idea of using an imperfect model selectively. The agent should plan in parts of the state space where the model would be helpful but…

    Submitted 7 March, 2021; v1 submitted 5 July, 2020; originally announced July 2020.

    Comments: Accepted at ICML 2020

  14. arXiv:1906.10780

    cs.LG stat.AP stat.ML

    Simultaneous Prediction Intervals for Patient-Specific Survival Curves

    Authors: Samuel Sokota, Ryan D'Orazio, Khurram Javed, Humza Haider, Russell Greiner

    Abstract: Accurate models of patient survival probabilities provide important information to clinicians prescribing care for life-threatening and terminal ailments. A recently developed class of models - known as individual survival distributions (ISDs) - produces patient-specific survival functions that offer greater descriptive power of patient outcomes than was previously possible. Unfortunately, at the…

    Submitted 25 June, 2019; originally announced June 2019.

    Comments: 7 pages, 7 figures, IJCAI 19