Showing 1–19 of 19 results for author: Bakhtin, A

Searching in archive cs.
  1. arXiv:2403.06941  [pdf, other]

    cs.SE

    Comparison of Static Analysis Architecture Recovery Tools for Microservice Applications

    Authors: Simon Schneider, Alexander Bakhtin, Xiaozhou Li, Jacopo Soldani, Antonio Brogi, Tomas Cerny, Riccardo Scandariato, Davide Taibi

    Abstract: Architecture recovery tools help software engineers obtain an overview of their software systems during all phases of the software development lifecycle. This is especially important for microservice applications because their distributed nature makes it more challenging to oversee the architecture. Various tools and techniques for this task are presented in academic and grey literature sources. P…

    Submitted 11 March, 2024; originally announced March 2024.

  2. arXiv:2306.16388  [pdf, other]

    cs.CL cs.AI

    Towards Measuring the Representation of Subjective Global Opinions in Language Models

    Authors: Esin Durmus, Karina Nguyen, Thomas I. Liao, Nicholas Schiefer, Amanda Askell, Anton Bakhtin, Carol Chen, Zac Hatfield-Dodds, Danny Hernandez, Nicholas Joseph, Liane Lovitt, Sam McCandlish, Orowa Sikder, Alex Tamkin, Janel Thamkul, Jared Kaplan, Jack Clark, Deep Ganguli

    Abstract: Large language models (LLMs) may not equitably represent diverse global perspectives on societal issues. In this paper, we develop a quantitative framework to evaluate whose opinions model-generated responses are more similar to. We first build a dataset, GlobalOpinionQA, comprised of questions and answers from cross-national surveys designed to capture diverse opinions on global issues across dif…

    Submitted 11 April, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

  3. arXiv:2210.05492  [pdf, other]

    cs.GT cs.AI cs.LG cs.MA

    Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning

    Authors: Anton Bakhtin, David J Wu, Adam Lerer, Jonathan Gray, Athul Paul Jacob, Gabriele Farina, Alexander H Miller, Noam Brown

    Abstract: No-press Diplomacy is a complex strategy game involving both cooperation and competition that has served as a benchmark for multi-agent AI research. While self-play reinforcement learning has resulted in numerous successes in purely adversarial games like chess, Go, and poker, self-play alone is insufficient for achieving optimal performance in domains involving cooperation with humans. We address…

    Submitted 11 October, 2022; originally announced October 2022.

  4. arXiv:2207.12322  [pdf, other]

    cs.AI cs.LG

    Self-Explaining Deviations for Coordination

    Authors: Hengyuan Hu, Samuel Sokota, David Wu, Anton Bakhtin, Andrei Lupu, Brandon Cui, Jakob N. Foerster

    Abstract: Fully cooperative, partially observable multi-agent problems are ubiquitous in the real world. In this paper, we focus on a specific subclass of coordination problems in which humans are able to discover self-explaining deviations (SEDs). SEDs are actions that deviate from the common understanding of what reasonable behavior would be in normal circumstances. They are taken with the intention of ca…

    Submitted 13 July, 2022; originally announced July 2022.

  5. arXiv:2207.02776  [pdf, other]

    cs.SE

    Using Microservice Telemetry Data for System Dynamic Analysis

    Authors: Abdullah Al Maruf, Alexander Bakhtin, Tomas Cerny, Davide Taibi

    Abstract: Microservices bring various benefits to software systems. They also bring decentralization and loose coupling across self-contained system parts. Since these systems likely evolve in a decentralized manner, they need to be monitored to identify when possibly poorly designed extensions deteriorate the overall system quality. For monolith systems, such tasks have been commonly addressed through stati…

    Submitted 6 July, 2022; originally announced July 2022.

    Comments: 10 pages, 4 figures, IEEE SOSE 2022

  6. arXiv:2205.10133  [pdf, ps, other]

    cs.SE

    Survey on Tools and Techniques Detecting Microservice API Patterns

    Authors: Alexander Bakhtin, Abdullah Al Maruf, Tomas Cerny, Davide Taibi

    Abstract: It is well recognized that design patterns improve system development and maintenance in many aspects. While we commonly recognize these patterns in monolithic systems, many patterns emerged for cloud computing, specifically microservices. Unfortunately, while various patterns have been proposed, available quality assessment tools often do not recognize many. This article performs a grey literatur…

    Submitted 20 May, 2022; originally announced May 2022.

    Journal ref: IEEE SCC 2022

  7. arXiv:2112.07544  [pdf, other]

    cs.MA cs.AI cs.GT cs.LG

    Modeling Strong and Human-Like Gameplay with KL-Regularized Search

    Authors: Athul Paul Jacob, David J. Wu, Gabriele Farina, Adam Lerer, Hengyuan Hu, Anton Bakhtin, Jacob Andreas, Noam Brown

    Abstract: We consider the task of building strong but human-like policies in multi-agent decision-making problems, given examples of human behavior. Imitation learning is effective at predicting human actions but may not match the strength of expert humans, while self-play learning and search techniques (e.g. AlphaZero) lead to strong performance but may produce policies that are difficult for humans to und…

    Submitted 16 February, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

  8. arXiv:2110.02924  [pdf, other]

    cs.LG cs.AI cs.GT cs.MA

    No-Press Diplomacy from Scratch

    Authors: Anton Bakhtin, David Wu, Adam Lerer, Noam Brown

    Abstract: Prior AI successes in complex games have largely focused on settings with at most hundreds of actions at each decision point. In contrast, Diplomacy is a game with more than 10^20 possible actions per turn. Previous attempts to address games with large branching factors, such as Diplomacy, StarCraft, and Dota, used human data to bootstrap the policy or used handcrafted reward shaping. In this pape…

    Submitted 6 October, 2021; originally announced October 2021.

  9. arXiv:2102.10336  [pdf, other]

    cs.AI cs.LG

    Physical Reasoning Using Dynamics-Aware Models

    Authors: Eltayeb Ahmed, Anton Bakhtin, Laurens van der Maaten, Rohit Girdhar

    Abstract: A common approach to solving physical reasoning tasks is to train a value learner on example tasks. A limitation of such an approach is that it requires learning about object dynamics solely from reward values assigned to the final state of a rollout of the environment. This study aims to address this limitation by augmenting the reward value with self-supervised signals about object dynamics. Spe…

    Submitted 1 September, 2021; v1 submitted 20 February, 2021; originally announced February 2021.

    Comments: ICML 2021 Workshop on Self-Supervised Learning for Reasoning and Perception; Webpage/Code: https://facebookresearch.github.io/DynamicsAware

  10. arXiv:2010.02923  [pdf, other]

    cs.AI cs.GT cs.LG

    Human-Level Performance in No-Press Diplomacy via Equilibrium Search

    Authors: Jonathan Gray, Adam Lerer, Anton Bakhtin, Noam Brown

    Abstract: Prior AI breakthroughs in complex games have focused on either the purely adversarial or purely cooperative settings. In contrast, Diplomacy is a game of shifting alliances that involves both cooperation and competition. For this reason, Diplomacy has proven to be a formidable research challenge. In this paper we describe an agent for the no-press variant of Diplomacy that combines supervised lear…

    Submitted 3 May, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

  11. arXiv:2007.13544  [pdf, other]

    cs.GT cs.AI cs.LG

    Combining Deep Reinforcement Learning and Search for Imperfect-Information Games

    Authors: Noam Brown, Anton Bakhtin, Adam Lerer, Qucheng Gong

    Abstract: The combination of deep reinforcement learning and search at both training and test time is a powerful paradigm that has led to a number of successes in single-agent settings and perfect-information games, best exemplified by AlphaZero. However, prior algorithms of this form cannot cope with imperfect-information games. This paper presents ReBeL, a general framework for self-play reinforcement lea…

    Submitted 28 November, 2020; v1 submitted 27 July, 2020; originally announced July 2020.

  12. arXiv:2004.11714  [pdf, other]

    cs.CL cs.LG

    Residual Energy-Based Models for Text Generation

    Authors: Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato

    Abstract: Text generation is ubiquitous in many NLP tasks, from summarization, to dialogue and machine translation. The dominant parametric approach is based on locally normalized models which predict one word at a time. While these work remarkably well, they are plagued by exposure bias due to the greedy nature of the generation process. In this work, we investigate un-normalized energy-based models (EBMs)…

    Submitted 22 April, 2020; originally announced April 2020.

    Comments: published at ICLR 2020. arXiv admin note: substantial text overlap with arXiv:2004.10188

    Journal ref: ICLR 2020

  13. arXiv:2004.10188  [pdf, other]

    cs.CL cs.LG stat.ML

    Residual Energy-Based Models for Text

    Authors: Anton Bakhtin, Yuntian Deng, Sam Gross, Myle Ott, Marc'Aurelio Ranzato, Arthur Szlam

    Abstract: Current large-scale auto-regressive language models display impressive fluency and can generate convincing text. In this work we start by asking the question: Can the generations of these models be reliably distinguished from real text by statistical discriminators? We find experimentally that the answer is affirmative when we have access to the training data for the model, and guardedly affirmati…

    Submitted 21 December, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

    Comments: long journal version

    Journal ref: Journal of Machine Learning Research 21 (2020) 1-41

  14. arXiv:1909.01066  [pdf, other]

    cs.CL

    Language Models as Knowledge Bases?

    Authors: Fabio Petroni, Tim Rocktäschel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel

    Abstract: Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and may be able to answer queries structured as "fill-in-the-blank" cloze statements. Language models have many advantages over structured knowledge…

    Submitted 4 September, 2019; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: accepted at EMNLP 2019

  15. arXiv:1908.05656  [pdf, other]

    cs.LG cs.AI stat.ML

    PHYRE: A New Benchmark for Physical Reasoning

    Authors: Anton Bakhtin, Laurens van der Maaten, Justin Johnson, Laura Gustafson, Ross Girshick

    Abstract: Understanding and reasoning about physics is an important ability of intelligent agents. We develop the PHYRE benchmark for physical reasoning that contains a set of simple classical mechanics puzzles in a 2D physical environment. The benchmark is designed to encourage the development of learning algorithms that are sample-efficient and generalize well across puzzles. We test several modern learni…

    Submitted 15 August, 2019; originally announced August 2019.

  16. arXiv:1906.03351  [pdf, other]

    cs.LG cs.CL stat.ML

    Real or Fake? Learning to Discriminate Machine from Human Generated Text

    Authors: Anton Bakhtin, Sam Gross, Myle Ott, Yuntian Deng, Marc'Aurelio Ranzato, Arthur Szlam

    Abstract: Energy-based models (EBMs), a.k.a. un-normalized models, have had recent successes in continuous spaces. However, they have not been successfully applied to model text sequences. While decreasing the energy at training samples is straightforward, mining (negative) samples where the energy should be increased is difficult. In part, this is because standard gradient-based methods are not readily app…

    Submitted 25 November, 2019; v1 submitted 7 June, 2019; originally announced June 2019.

  17. arXiv:1804.07705  [pdf, other]

    cs.CL

    Lightweight Adaptive Mixture of Neural and N-gram Language Models

    Authors: Anton Bakhtin, Arthur Szlam, Marc'Aurelio Ranzato, Edouard Grave

    Abstract: It is often the case that the best performing language model is an ensemble of a neural language model with n-grams. In this work, we propose a method to improve how these two models are combined. By using a small network which predicts the mixture weight between the two models, we adapt their relative importance at each time step. Because the gating network is small, it trains quickly on small am…

    Submitted 26 October, 2018; v1 submitted 20 April, 2018; originally announced April 2018.

  18. arXiv:1710.09617  [pdf, other]

    cs.CL

    Streaming Small-Footprint Keyword Spotting using Sequence-to-Sequence Models

    Authors: Yanzhang He, Rohit Prabhavalkar, Kanishka Rao, Wei Li, Anton Bakhtin, Ian McGraw

    Abstract: We develop streaming keyword spotting systems using a recurrent neural network transducer (RNN-T) model: an all-neural, end-to-end trained, sequence-to-sequence model which jointly learns acoustic and language model components. Our models are trained to predict either phonemes or graphemes as subword units, thus allowing us to detect arbitrary keyword phrases, without any out-of-vocabulary words.…

    Submitted 26 October, 2017; originally announced October 2017.

    Comments: To appear in Proceedings of IEEE ASRU 2017

  19. arXiv:1607.04683  [pdf, other]

    cs.LG cs.CL

    On the efficient representation and execution of deep acoustic models

    Authors: Raziel Alvarez, Rohit Prabhavalkar, Anton Bakhtin

    Abstract: In this paper we present a simple and computationally efficient quantization scheme that enables us to reduce the resolution of the parameters of a neural network from 32-bit floating point values to 8-bit integer values. The proposed quantization scheme leads to significant memory savings and enables the use of optimized hardware instructions for integer arithmetic, thus significantly reducing th…

    Submitted 16 December, 2016; v1 submitted 15 July, 2016; originally announced July 2016.

    Comments: Accepted conference paper: "The Annual Conference of the International Speech Communication Association (Interspeech), 2016"