Showing 1–6 of 6 results for author: Zinkevich, M

  1. arXiv:1711.11165  [pdf, other]

    cs.LG eess.SY

    Safe Exploration for Identifying Linear Systems via Robust Optimization

    Authors: Tyler Lu, Martin Zinkevich, Craig Boutilier, Binz Roy, Dale Schuurmans

    Abstract: Safely exploring an unknown dynamical system is critical to the deployment of reinforcement learning (RL) in physical systems where failures may have catastrophic consequences. In scenarios where one knows little about the dynamics, diverse transition data covering relevant regions of state-action space is needed to apply either model-based or model-free RL. Motivated by the cooling of Google's da…

    Submitted 29 November, 2017; originally announced November 2017.

  2. arXiv:1206.6855  [pdf]

    cs.GT

    An Efficient Optimal-Equilibrium Algorithm for Two-player Game Trees

    Authors: Michael L. Littman, Nishkam Ravi, Arjun Talwar, Martin Zinkevich

    Abstract: Two-player complete-information game trees are perhaps the simplest possible setting for studying general-sum games and the computational problem of finding equilibria. These games admit a simple bottom-up algorithm for finding subgame perfect Nash equilibria efficiently. However, such an algorithm can fail to identify optimal equilibria, such as those that maximize social welfare. The reason is t…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)

    Report number: UAI-P-2006-PG-298-305

  3. arXiv:1206.3318  [pdf, other]

    cs.AI

    On Local Regret

    Authors: Michael Bowling, Martin Zinkevich

    Abstract: Online learning aims to perform nearly as well as the best hypothesis in hindsight. For some hypothesis classes, though, even finding the best hypothesis offline is challenging. In such offline cases, local search techniques are often employed and only local optimality guaranteed. For online decision-making with such hypothesis classes, we introduce local regret, a generalization of regret that ai…

    Submitted 14 June, 2012; originally announced June 2012.

    Comments: This is the longer version of the same-titled paper appearing in the Proceedings of the Twenty-Ninth International Conference on Machine Learning (ICML), 2012

    Report number: TR12-04

  4. arXiv:1205.0622  [pdf, other]

    cs.GT cs.AI

    No-Regret Learning in Extensive-Form Games with Imperfect Recall

    Authors: Marc Lanctot, Richard Gibson, Neil Burch, Martin Zinkevich, Michael Bowling

    Abstract: Counterfactual Regret Minimization (CFR) is an efficient no-regret learning algorithm for decision problems modeled as extensive games. CFR's regret bounds depend on the requirement of perfect recall: players always remember information that was revealed to them and the order in which it was revealed. In games without perfect recall, however, CFR's guarantees do not apply. In this paper, we presen…

    Submitted 3 May, 2012; originally announced May 2012.

    Comments: 21 pages, 4 figures, expanded version of article to appear in Proceedings of the Twenty-Ninth International Conference on Machine Learning

  5. arXiv:0911.0491  [pdf, ps, other]

    math.OC stat.ML

    Slow Learners are Fast

    Authors: John Langford, Alexander Smola, Martin Zinkevich

    Abstract: Online learning algorithms have impressive convergence properties when it comes to risk minimization and convex games on very large problems. However, they are inherently sequential in their design which prevents them from taking advantage of modern multi-core architectures. In this paper we prove that online learning with delayed updates converges well, thereby facilitating parallel online lear…

    Submitted 3 November, 2009; originally announced November 2009.

    Comments: Extended version of conference paper - NIPS 2009

    MSC Class: 49M30; 80M50

  6. arXiv:cond-mat/0208300  [pdf]

    cond-mat.supr-con cond-mat.mtrl-sci

    Crystal growth of MgB2 from Mg-Cu-B melt flux and superconducting properties

    Authors: D. Souptel, G. Behr, W. Loser, W. Kopylov, M. Zinkevich

    Abstract: A new method for preparation of single crystals of the superconducting intermetallic MgB2 compound from a Mg-Cu-B melt flux is presented. The high vapour pressure of Mg at elevated temperature is a serious challenge of the preparation process. The approximate thermodynamic calculations of the ternary Mg-Cu-B phase diagram show a beneficial effect of Cu, which extends the range of formation of Mg…

    Submitted 15 August, 2002; originally announced August 2002.

    Comments: 22 pages, 6 figures