Showing 1–4 of 4 results for author: Stojic, H

Searching in archive cs.
  1. arXiv:2302.08436  [pdf, other]

    stat.ML cs.LG

    Trieste: Efficiently Exploring The Depths of Black-box Functions with TensorFlow

    Authors: Victor Picheny, Joel Berkeley, Henry B. Moss, Hrvoje Stojic, Uri Granta, Sebastian W. Ober, Artem Artemev, Khurram Ghani, Alexander Goodall, Andrei Paleyes, Sattar Vakili, Sergio Pascual-Diaz, Stratis Markou, Jixiang Qing, Nasrulloh R. B. S Loka, Ivo Couckuyt

    Abstract: We present Trieste, an open-source Python package for Bayesian optimization and active learning benefiting from the scalability and efficiency of TensorFlow. Our library enables the plug-and-play of popular TensorFlow-based models within sequential decision-making loops, e.g. Gaussian processes from GPflow or GPflux, or neural networks from Keras. This modular mindset is central to the package and…

    Submitted 16 February, 2023; originally announced February 2023.

  2. arXiv:2103.14407  [pdf, other]

    cs.LG

    Bellman: A Toolbox for Model-Based Reinforcement Learning in TensorFlow

    Authors: John McLeod, Hrvoje Stojic, Vincent Adam, Dongho Kim, Jordi Grau-Moya, Peter Vrancx, Felix Leibfried

    Abstract: In the past decade, model-free reinforcement learning (RL) has provided solutions to challenging domains such as robotics. Model-based RL shows the prospect of being more sample-efficient than model-free methods in terms of agent-environment interactions, because the model enables extrapolation to unseen situations. In the more recent past, model-based methods have shown superior results compared…

    Submitted 13 April, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

  3. An empirical evaluation of active inference in multi-armed bandits

    Authors: Dimitrije Markovic, Hrvoje Stojic, Sarah Schwoebel, Stefan J. Kiebel

    Abstract: A key feature of sequential decision making under uncertainty is a need to balance between exploiting--choosing the best action according to the current knowledge, and exploring--obtaining information about values of other actions. The multi-armed bandit problem, a classical task that captures this trade-off, served as a vehicle in machine learning for developing bandit algorithms that proved to b…

    Submitted 4 August, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

  4. arXiv:1703.10970  [pdf, other]

    cs.AI cs.MA

    Diversity of preferences can increase collective welfare in sequential exploration problems

    Authors: Pantelis P. Analytis, Hrvoje Stojic, Alexandros Gelastopoulos, Mehdi Moussaïd

    Abstract: In search engines, online marketplaces and other human-computer interfaces, large collectives of individuals sequentially interact with numerous alternatives of varying quality. In these contexts, trial and error (exploration) is crucial for uncovering novel high-quality items or solutions, but entails a high cost for individual users. Self-interested decision makers are often better off imitating…

    Submitted 2 April, 2017; v1 submitted 28 March, 2017; originally announced March 2017.

    Comments: 4 pages, 1 figure, originally presented at the Collective Intelligence (CI) conference in June 2017