
Showing 1–3 of 3 results for author: Kudashkina, K

  1. arXiv:2104.08543  [pdf, other]

    cs.AI

    Planning with Expectation Models for Control

    Authors: Katya Kudashkina, Yi Wan, Abhishek Naik, Richard S. Sutton

    Abstract: In model-based reinforcement learning (MBRL), Wan et al. (2019) showed conditions under which the environment model could produce the expectation of the next feature vector rather than the full distribution, or a sample thereof, with no loss in planning performance. Such expectation models are of interest when the environment is stochastic and non-stationary, and the model is approximate, such as…

    Submitted 17 April, 2021; originally announced April 2021.

  2. arXiv:2008.12095  [pdf, other]

    cs.AI cs.HC cs.LG

    Document-editing Assistants and Model-based Reinforcement Learning as a Path to Conversational AI

    Authors: Katya Kudashkina, Patrick M. Pilarski, Richard S. Sutton

    Abstract: Intelligent assistants that follow commands or answer simple questions, such as Siri and Google search, are among the most economically important applications of AI. Future conversational AI assistants promise even greater capabilities and a better user experience through a deeper understanding of the domain, the user, or the user's purposes. But what domain and what methods are best suited to res…

    Submitted 27 August, 2020; originally announced August 2020.

    Comments: Currently under review

  3. arXiv:2004.13657  [pdf, other]

    cs.LG cs.AI stat.ML

    Sample-Efficient Model-based Actor-Critic for an Interactive Dialogue Task

    Authors: Katya Kudashkina, Valliappa Chockalingam, Graham W. Taylor, Michael Bowling

    Abstract: Human-computer interactive systems that rely on machine learning are becoming paramount to the lives of millions of people who use digital assistants on a daily basis. Yet, further advances are limited by the availability of data and the cost of acquiring new samples. One way to address this problem is by improving the sample efficiency of current approaches. As a solution path, we present a model…

    Submitted 28 April, 2020; originally announced April 2020.