Skip to main content

Showing 1–1 of 1 results for author: Dennis, M D

Searching in archive cs. Search in all archives.
.
  1. arXiv:2211.00241  [pdf, other

    cs.LG cs.AI cs.CR stat.ML

    Adversarial Policies Beat Superhuman Go AIs

    Authors: Tony T. Wang, Adam Gleave, Tom Tseng, Kellin Pelrine, Nora Belrose, Joseph Miller, Michael D. Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell

    Abstract: We attack the state-of-the-art Go-playing AI system KataGo by training adversarial policies against it, achieving a >97% win rate against KataGo running at superhuman settings. Our adversaries do not win by playing Go well. Instead, they trick KataGo into making serious blunders. Our attack transfers zero-shot to other superhuman Go-playing AIs, and is comprehensible to the extent that human exper… ▽ More

    Submitted 13 July, 2023; v1 submitted 31 October, 2022; originally announced November 2022.

    Comments: Accepted to ICML 2023, see paper for changelog

    ACM Class: I.2.6