Showing 1–6 of 6 results for author: Tracey, B D

Searching in archive cs.
  1. arXiv:2307.11546  [pdf, other]

    physics.plasm-ph cs.LG

    Towards practical reinforcement learning for tokamak magnetic control

    Authors: Brendan D. Tracey, Andrea Michi, Yuri Chervonyi, Ian Davies, Cosmin Paduraru, Nevena Lazic, Federico Felici, Timo Ewalds, Craig Donner, Cristian Galperti, Jonas Buchli, Michael Neunert, Andrea Huber, Jonathan Evens, Paula Kurylowicz, Daniel J. Mankowitz, Martin Riedmiller, The TCV Team

    Abstract: Reinforcement learning (RL) has shown promising results for real-time control systems, including the domain of plasma magnetic control. However, there are still significant drawbacks compared to traditional feedback control approaches for magnetic confinement. In this work, we address key drawbacks of the RL method; achieving higher control accuracy for desired plasma properties, reducing the stea…

    Submitted 5 October, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

  2. arXiv:2105.12196  [pdf, other]

    cs.AI cs.MA cs.NE cs.RO

    From Motor Control to Team Play in Simulated Humanoid Football

    Authors: Siqi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

    Abstract: Intelligent behaviour in the physical world exhibits structure at multiple spatial and temporal scales. Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to serve goals defined on much longer timescales, and in terms of relations that extend far beyond the body itself, ultimately involving coordination with other agents…

    Submitted 25 May, 2021; originally announced May 2021.

  3. arXiv:1808.07593  [pdf, other]

    stat.ML cs.LG

    Caveats for information bottleneck in deterministic scenarios

    Authors: Artemy Kolchinsky, Brendan D. Tracey, Steven Van Kuyk

    Abstract: Information bottleneck (IB) is a method for extracting information from one random variable $X$ that is relevant for predicting another random variable $Y$. To do so, IB identifies an intermediate "bottleneck" variable $T$ that has low mutual information $I(X;T)$ and high mutual information $I(Y;T)$. The "IB curve" characterizes the set of bottleneck variables that achieve maximal $I(Y;T)$ for a g…

    Submitted 8 February, 2019; v1 submitted 22 August, 2018; originally announced August 2018.

    Journal ref: International Conference on Learning Representations (ICLR), 2019

  4. Deep Reinforcement Learning for Event-Driven Multi-Agent Decision Processes

    Authors: Kunal Menda, Yi-Chun Chen, Justin Grana, James W. Bono, Brendan D. Tracey, Mykel J. Kochenderfer, David Wolpert

    Abstract: The incorporation of macro-actions (temporally extended actions) into multi-agent decision problems has the potential to address the curse of dimensionality associated with such decision problems. Since macro-actions last for stochastic durations, multiple agents executing decentralized policies in cooperative environments must act asynchronously. We present an algorithm that modifies generalized…

    Submitted 29 May, 2019; v1 submitted 19 September, 2017; originally announced September 2017.

    Comments: Published in IEEE Transactions on Intelligent Transportation Systems (Volume: 20, Issue: 4, April 2019). https://ieeexplore.ieee.org/document/8419722

    Journal ref: IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 4, pp. 1259-1268, April 2019

  5. arXiv:1706.02419  [pdf, other]

    cs.IT stat.ME stat.ML

    Estimating Mixture Entropy with Pairwise Distances

    Authors: Artemy Kolchinsky, Brendan D. Tracey

    Abstract: Mixture distributions arise in many parametric and non-parametric settings -- for example, in Gaussian mixture models and in non-parametric estimation. It is often necessary to compute the entropy of a mixture, but, in most cases, this quantity has no closed-form expression, making some form of approximation necessary. We propose a family of estimators based on a pairwise distance function between…

    Submitted 22 August, 2018; v1 submitted 7 June, 2017; originally announced June 2017.

    Comments: Corrects several errata in published version, in particular in Section V (bounds on mutual information)

    Journal ref: Entropy, 2017

  6. arXiv:1705.02436  [pdf, other]

    cs.IT cs.LG stat.ML

    Nonlinear Information Bottleneck

    Authors: Artemy Kolchinsky, Brendan D. Tracey, David H. Wolpert

    Abstract: Information bottleneck (IB) is a technique for extracting information in one random variable $X$ that is relevant for predicting another random variable $Y$. IB works by encoding $X$ in a compressed "bottleneck" random variable $M$ from which $Y$ can be accurately decoded. However, finding the optimal bottleneck variable involves a difficult optimization problem, which until recently has been cons…

    Submitted 30 November, 2019; v1 submitted 5 May, 2017; originally announced May 2017.

    Journal ref: Entropy, 2019