Search | arXiv e-print repository

arXiv:2407.02078 [pdf, other]

doi 10.1016/j.robot.2024.104654

MARLIN: A Cloud Integrated Robotic Solution to Support Intralogistics in Retail

Authors: Dennis Mronga, Andreas Bresser, Fabian Maas, Adrian Danzglock, Simon Stelter, Alina Hawkin, Hoang Giang Nguyen, Michael Beetz, Frank Kirchner

Abstract: In this paper, we present the service robot MARLIN and its integration with the K4R platform, a cloud system for complex AI applications in retail. At its core, this platform contains so-called semantic digital twins, a semantically annotated representation of the retail store. MARLIN continuously exchanges data with the K4R platform, improving the robot's capabilities in perception, autonomous na… ▽ More In this paper, we present the service robot MARLIN and its integration with the K4R platform, a cloud system for complex AI applications in retail. At its core, this platform contains so-called semantic digital twins, a semantically annotated representation of the retail store. MARLIN continuously exchanges data with the K4R platform, improving the robot's capabilities in perception, autonomous navigation, and task planning. We exploit these capabilities in a retail intralogistics scenario, specifically by assisting store employees in stocking shelves. We demonstrate that MARLIN is able to update the digital representation of the retail store by detecting and classifying obstacles, autonomously planning and executing replenishment missions, adapting to unforeseen changes in the environment, and interacting with store employees. Experiments are conducted in simulation, in a laboratory environment, and in a real store. We also describe and evaluate a novel algorithm for autonomous navigation of articulated tractor-trailer systems. The algorithm outperforms the manufacturer's proprietary navigation approach and improves MARLIN's navigation capabilities in confined spaces. △ Less

Submitted 2 July, 2024; originally announced July 2024.

Journal ref: MARLIN: A cloud integrated robotic solution to support intralogistics in retail, Robotics and Autonomous Systems, Volume 175, 2024

arXiv:2405.17072 [pdf, other]

A novel framework for systematic propositional formula simplification based on existential graphs

Authors: Jordina Francès de Mas, Juliana Bowles

Abstract: This paper presents a novel simplification calculus for propositional logic derived from Peirce's existential graphs' rules of inference and implication graphs. Our rules can be applied to propositional logic formulae in nested form, are equivalence-preserving, guarantee a monotonically decreasing number of variables, clauses and literals, and maximise the preservation of structural problem inform… ▽ More This paper presents a novel simplification calculus for propositional logic derived from Peirce's existential graphs' rules of inference and implication graphs. Our rules can be applied to propositional logic formulae in nested form, are equivalence-preserving, guarantee a monotonically decreasing number of variables, clauses and literals, and maximise the preservation of structural problem information. Our techniques can also be seen as higher-level SAT preprocessing, and we show how one of our rules (TWSR) generalises and streamlines most of the known equivalence-preserving SAT preprocessing methods. In addition, we propose a simplification procedure based on the systematic application of two of our rules (EPR and TWSR) which is solver-agnostic and can be used to simplify large Boolean satisfiability problems and propositional formulae in arbitrary form, and we provide a formal analysis of its algorithmic complexity in terms of space and time. Finally, we show how our rules can be further extended with a novel n-ary implication graph to capture all known equivalence-preserving preprocessing procedures. △ Less

Submitted 27 May, 2024; originally announced May 2024.

Comments: 19 pages, 12 figures. Under consideration in Theory and Practice of Logic Programming (TPLP)

MSC Class: 03B35; 03B70; 68N17; 68T27 ACM Class: F.4.1; I.2.2; I.2.3; I.2.4

arXiv:2207.09521 [pdf, other]

doi 10.1007/978-3-031-16443-9_51

The Dice loss in the context of missing or empty labels: Introducing $Φ$ and $ε$

Authors: Sofie Tilborghs, Jeroen Bertels, David Robben, Dirk Vandermeulen, Frederik Maes

Abstract: Albeit the Dice loss is one of the dominant loss functions in medical image segmentation, most research omits a closer look at its derivative, i.e. the real motor of the optimization when using gradient descent. In this paper, we highlight the peculiar action of the Dice loss in the presence of missing or empty labels. First, we formulate a theoretical basis that gives a general description of the… ▽ More Albeit the Dice loss is one of the dominant loss functions in medical image segmentation, most research omits a closer look at its derivative, i.e. the real motor of the optimization when using gradient descent. In this paper, we highlight the peculiar action of the Dice loss in the presence of missing or empty labels. First, we formulate a theoretical basis that gives a general description of the Dice loss and its derivative. It turns out that the choice of the reduction dimensions $Φ$ and the smoothing term $ε$ is non-trivial and greatly influences its behavior. We find and propose heuristic combinations of $Φ$ and $ε$ that work in a segmentation setting with either missing or empty labels. Second, we empirically validate these findings in a binary and multiclass segmentation setting using two publicly available datasets. We confirm that the choice of $Φ$ and $ε$ is indeed pivotal. With $Φ$ chosen such that the reductions happen over a single batch (and class) element and with a negligible $ε$, the Dice loss deals with missing labels naturally and performs similarly compared to recent adaptations specific for missing labels. With $Φ$ chosen such that the reductions happen over multiple batch elements or with a heuristic value for $ε$, the Dice loss handles empty labels correctly. We believe that this work highlights some essential perspectives and hope that it encourages researchers to better describe their exact implementation of the Dice loss in future work. △ Less

Submitted 9 November, 2022; v1 submitted 19 July, 2022; originally announced July 2022.

Comments: 8 pages, 3 figures, 1 table, International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) 2022

Journal ref: Medical Image Computing and Computer Assisted Intervention (MICCAI) 2022. Lecture Notes in Computer Science, vol 13435. Springer, Cham

arXiv:2203.01089 [pdf, other]

doi 10.1016/j.media.2022.102533

Shape constrained CNN for segmentation guided prediction of myocardial shape and pose parameters in cardiac MRI

Authors: Sofie Tilborghs, Jan Bogaert, Frederik Maes

Abstract: Semantic segmentation using convolutional neural networks (CNNs) is the state-of-the-art for many medical image segmentation tasks including myocardial segmentation in cardiac MR images. However, the predicted segmentation maps obtained from such standard CNN do not allow direct quantification of regional shape properties such as regional wall thickness. Furthermore, the CNNs lack explicit shape c… ▽ More Semantic segmentation using convolutional neural networks (CNNs) is the state-of-the-art for many medical image segmentation tasks including myocardial segmentation in cardiac MR images. However, the predicted segmentation maps obtained from such standard CNN do not allow direct quantification of regional shape properties such as regional wall thickness. Furthermore, the CNNs lack explicit shape constraints, occasionally resulting in unrealistic segmentations. In this paper, we use a CNN to predict shape parameters of an underlying statistical shape model of the myocardium learned from a training set of images. Additionally, the cardiac pose is predicted, which allows to reconstruct the myocardial contours. The integrated shape model regularizes the predicted contours and guarantees realistic shapes. We enforce robustness of shape and pose prediction by simultaneously performing pixel-wise semantic segmentation during training and define two loss functions to impose consistency between the two predicted representations: one distance-based loss and one overlap-based loss. We evaluated the proposed method in a 5-fold cross validation on an in-house clinical dataset with 75 subjects and on the ACDC and LVQuan19 public datasets. We show the benefits of simultaneous semantic segmentation and the two newly defined loss functions for the prediction of shape parameters. Our method achieved a correlation of 99% for left ventricular (LV) area on the three datasets, between 91% and 97% for myocardial area, 98-99% for LV dimensions and between 80% and 92% for regional wall thickness. △ Less

Submitted 2 March, 2022; originally announced March 2022.

arXiv:2202.12295 [pdf, other]

doi 10.1016/j.media.2022.102706

Factorizer: A Scalable Interpretable Approach to Context Modeling for Medical Image Segmentation

Authors: Pooya Ashtari, Diana M. Sima, Lieven De Lathauwer, Dominique Sappey-Marinier, Frederik Maes, Sabine Van Huffel

Abstract: Convolutional Neural Networks (CNNs) with U-shaped architectures have dominated medical image segmentation, which is crucial for various clinical purposes. However, the inherent locality of convolution makes CNNs fail to fully exploit global context, essential for better recognition of some structures, e.g., brain lesions. Transformers have recently proven promising performance on vision tasks, in… ▽ More Convolutional Neural Networks (CNNs) with U-shaped architectures have dominated medical image segmentation, which is crucial for various clinical purposes. However, the inherent locality of convolution makes CNNs fail to fully exploit global context, essential for better recognition of some structures, e.g., brain lesions. Transformers have recently proven promising performance on vision tasks, including semantic segmentation, mainly due to their capability of modeling long-range dependencies. Nevertheless, the quadratic complexity of attention makes existing Transformer-based models use self-attention layers only after somehow reducing the image resolution, which limits the ability to capture global contexts present at higher resolutions. Therefore, this work introduces a family of models, dubbed Factorizer, which leverages the power of low-rank matrix factorization for constructing an end-to-end segmentation model. Specifically, we propose a linearly scalable approach to context modeling, formulating Nonnegative Matrix Factorization (NMF) as a differentiable layer integrated into a U-shaped architecture. The shifted window technique is also utilized in combination with NMF to effectively aggregate local information. Factorizers compete favorably with CNNs and Transformers in terms of accuracy, scalability, and interpretability, achieving state-of-the-art results on the BraTS dataset for brain tumor segmentation and ISLES'22 dataset for stroke lesion segmentation. Highly meaningful NMF components give an additional interpretability advantage to Factorizers over CNNs and Transformers. Moreover, our ablation studies reveal a distinctive feature of Factorizers that enables a significant speed-up in inference for a trained Factorizer without any extra steps and without sacrificing much accuracy. The code and models are publicly available at https://github.com/pashtari/factorizer. △ Less

Submitted 13 December, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

Comments: 31 pages, 8 figures, 4 tables

Journal ref: Medical Image Analysis 84 (2023) 102706

arXiv:2112.12560 [pdf, other]

On the relationship between calibrated predictors and unbiased volume estimation

Authors: Teodora Popordanoska, Jeroen Bertels, Dirk Vandermeulen, Frederik Maes, Matthew B. Blaschko

Abstract: Machine learning driven medical image segmentation has become standard in medical image analysis. However, deep learning models are prone to overconfident predictions. This has led to a renewed focus on calibrated predictions in the medical imaging and broader machine learning communities. Calibrated predictions are estimates of the probability of a label that correspond to the true expected value… ▽ More Machine learning driven medical image segmentation has become standard in medical image analysis. However, deep learning models are prone to overconfident predictions. This has led to a renewed focus on calibrated predictions in the medical imaging and broader machine learning communities. Calibrated predictions are estimates of the probability of a label that correspond to the true expected value of the label conditioned on the confidence. Such calibrated predictions have utility in a range of medical imaging applications, including surgical planning under uncertainty and active learning systems. At the same time it is often an accurate volume measurement that is of real importance for many medical applications. This work investigates the relationship between model calibration and volume estimation. We demonstrate both mathematically and empirically that if the predictor is calibrated per image, we can obtain the correct volume by taking an expectation of the probability scores per pixel/voxel of the image. Furthermore, we show that convex combinations of calibrated classifiers preserve volume estimation, but do not preserve calibration. Therefore, we conclude that having a calibrated predictor is a sufficient, but not necessary condition for obtaining an unbiased estimate of the volume. We validate our theoretical findings empirically on a collection of 18 different (calibrated) training strategies on the tasks of glioma volume estimation on BraTS 2018, and ischemic stroke lesion volume estimation on ISLES 2018 datasets. △ Less

Submitted 23 December, 2021; originally announced December 2021.

Comments: Published at MICCAI 2021

arXiv:2011.11719 [pdf, other]

Explainable-by-design Semi-Supervised Representation Learning for COVID-19 Diagnosis from CT Imaging

Authors: Abel Díaz Berenguer, Hichem Sahli, Boris Joukovsky, Maryna Kvasnytsia, Ine Dirks, Mitchel Alioscha-Perez, Nikos Deligiannis, Panagiotis Gonidakis, Sebastián Amador Sánchez, Redona Brahimetaj, Evgenia Papavasileiou, Jonathan Cheung-Wai Chana, Fei Li, Shangzhen Song, Yixin Yang, Sofie Tilborghs, Siri Willems, Tom Eelbode, Jeroen Bertels, Dirk Vandermeulen, Frederik Maes, Paul Suetens, Lucas Fidon, Tom Vercauteren, David Robben , et al. (15 additional authors not shown)

Abstract: Our motivating application is a real-world problem: COVID-19 classification from CT imaging, for which we present an explainable Deep Learning approach based on a semi-supervised classification pipeline that employs variational autoencoders to extract efficient feature embedding. We have optimized the architecture of two different networks for CT images: (i) a novel conditional variational autoenc… ▽ More Our motivating application is a real-world problem: COVID-19 classification from CT imaging, for which we present an explainable Deep Learning approach based on a semi-supervised classification pipeline that employs variational autoencoders to extract efficient feature embedding. We have optimized the architecture of two different networks for CT images: (i) a novel conditional variational autoencoder (CVAE) with a specific architecture that integrates the class labels inside the encoder layers and uses side information with shared attention layers for the encoder, which make the most of the contextual clues for representation learning, and (ii) a downstream convolutional neural network for supervised classification using the encoder structure of the CVAE. With the explainable classification results, the proposed diagnosis system is very effective for COVID-19 classification. Based on the promising results obtained qualitatively and quantitatively, we envisage a wide deployment of our developed technique in large-scale clinical studies.Code is available at https://git.etrovub.be/AVSP/ct-based-covid-19-diagnostic-tool.git. △ Less

Submitted 2 September, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

arXiv:2010.13499 [pdf, other]

doi 10.1109/TMI.2020.3002417

Optimization for Medical Image Segmentation: Theory and Practice when evaluating with Dice Score or Jaccard Index

Authors: Tom Eelbode, Jeroen Bertels, Maxim Berman, Dirk Vandermeulen, Frederik Maes, Raf Bisschops, Matthew B. Blaschko

Abstract: In many medical imaging and classical computer vision tasks, the Dice score and Jaccard index are used to evaluate the segmentation performance. Despite the existence and great empirical success of metric-sensitive losses, i.e. relaxations of these metrics such as soft Dice, soft Jaccard and Lovasz-Softmax, many researchers still use per-pixel losses, such as (weighted) cross-entropy to train CNNs… ▽ More In many medical imaging and classical computer vision tasks, the Dice score and Jaccard index are used to evaluate the segmentation performance. Despite the existence and great empirical success of metric-sensitive losses, i.e. relaxations of these metrics such as soft Dice, soft Jaccard and Lovasz-Softmax, many researchers still use per-pixel losses, such as (weighted) cross-entropy to train CNNs for segmentation. Therefore, the target metric is in many cases not directly optimized. We investigate from a theoretical perspective, the relation within the group of metric-sensitive loss functions and question the existence of an optimal weighting scheme for weighted cross-entropy to optimize the Dice score and Jaccard index at test time. We find that the Dice score and Jaccard index approximate each other relatively and absolutely, but we find no such approximation for a weighted Hamming similarity. For the Tversky loss, the approximation gets monotonically worse when deviating from the trivial weight setting where soft Tversky equals soft Dice. We verify these results empirically in an extensive validation on six medical segmentation tasks and can confirm that metric-sensitive losses are superior to cross-entropy based loss functions in case of evaluation with Dice Score or Jaccard Index. This further holds in a multi-class setting, and across different object sizes and foreground/background ratios. These results encourage a wider adoption of metric-sensitive loss functions for medical segmentation tasks where the performance measure of interest is the Dice score or Jaccard index. △ Less

Submitted 26 October, 2020; originally announced October 2020.

Comments: 15 pages, 14 figures, accepted for publication in IEEE Transactions on Medical Imaging (2020)

arXiv:2010.08952 [pdf, other]

doi 10.1007/978-3-030-68107-4_13

Shape Constrained CNN for Cardiac MR Segmentation with Simultaneous Prediction of Shape and Pose Parameters

Authors: Sofie Tilborghs, Tom Dresselaers, Piet Claus, Jan Bogaert, Frederik Maes

Abstract: Semantic segmentation using convolutional neural networks (CNNs) is the state-of-the-art for many medical segmentation tasks including left ventricle (LV) segmentation in cardiac MR images. However, a drawback is that these CNNs lack explicit shape constraints, occasionally resulting in unrealistic segmentations. In this paper, we perform LV and myocardial segmentation by regression of pose and sh… ▽ More Semantic segmentation using convolutional neural networks (CNNs) is the state-of-the-art for many medical segmentation tasks including left ventricle (LV) segmentation in cardiac MR images. However, a drawback is that these CNNs lack explicit shape constraints, occasionally resulting in unrealistic segmentations. In this paper, we perform LV and myocardial segmentation by regression of pose and shape parameters derived from a statistical shape model. The integrated shape model regularizes predicted segmentations and guarantees realistic shapes. Furthermore, in contrast to semantic segmentation, it allows direct calculation of regional measures such as myocardial thickness. We enforce robustness of shape and pose prediction by simultaneously constructing a segmentation distance map during training. We evaluated the proposed method in a fivefold cross validation on a in-house clinical dataset with 75 subjects containing a total of 1539 delineated short-axis slices covering LV from apex to base, and achieved a correlation of 99% for LV area, 94% for myocardial area, 98% for LV dimensions and 88% for regional wall thicknesses. The method was additionally validated on the LVQuan18 and LVQuan19 public datasets and achieved state-of-the-art results. △ Less

Submitted 18 October, 2020; originally announced October 2020.

Comments: Presented at the 11th Workshop on Statistical Atlases and Computational Modeling of the Heart (STACOM 2020), held in conjunction with MICCAI 2020

arXiv:2007.15546 [pdf, other]

Comparative study of deep learning methods for the automatic segmentation of lung, lesion and lesion type in CT scans of COVID-19 patients

Authors: Sofie Tilborghs, Ine Dirks, Lucas Fidon, Siri Willems, Tom Eelbode, Jeroen Bertels, Bart Ilsen, Arne Brys, Adriana Dubbeldam, Nico Buls, Panagiotis Gonidakis, Sebastián Amador Sánchez, Annemiek Snoeckx, Paul M. Parizel, Johan de Mey, Dirk Vandermeulen, Tom Vercauteren, David Robben, Dirk Smeets, Frederik Maes, Jef Vandemeulebroucke, Paul Suetens

Abstract: Recent research on COVID-19 suggests that CT imaging provides useful information to assess disease progression and assist diagnosis, in addition to help understanding the disease. There is an increasing number of studies that propose to use deep learning to provide fast and accurate quantification of COVID-19 using chest CT scans. The main tasks of interest are the automatic segmentation of lung a… ▽ More Recent research on COVID-19 suggests that CT imaging provides useful information to assess disease progression and assist diagnosis, in addition to help understanding the disease. There is an increasing number of studies that propose to use deep learning to provide fast and accurate quantification of COVID-19 using chest CT scans. The main tasks of interest are the automatic segmentation of lung and lung lesions in chest CT scans of confirmed or suspected COVID-19 patients. In this study, we compare twelve deep learning algorithms using a multi-center dataset, including both open-source and in-house developed algorithms. Results show that ensembling different methods can boost the overall test set performance for lung segmentation, binary lesion segmentation and multiclass lesion segmentation, resulting in mean Dice scores of 0.982, 0.724 and 0.469, respectively. The resulting binary lesions were segmented with a mean absolute volume error of 91.3 ml. In general, the task of distinguishing different lesion types was more difficult, with a mean absolute volume difference of 152 ml and mean Dice scores of 0.369 and 0.523 for consolidation and ground glass opacity, respectively. All methods perform binary lesion segmentation with an average volume error that is better than visual assessment by human raters, suggesting these methods are mature enough for a large-scale evaluation for use in clinical practice. △ Less

Submitted 10 January, 2022; v1 submitted 29 July, 2020; originally announced July 2020.

Comments: Updated acknowledgments

arXiv:1911.01685 [pdf, other]

doi 10.1007/978-3-030-32245-8_11

Optimizing the Dice Score and Jaccard Index for Medical Image Segmentation: Theory & Practice

Authors: Jeroen Bertels, Tom Eelbode, Maxim Berman, Dirk Vandermeulen, Frederik Maes, Raf Bisschops, Matthew Blaschko

Abstract: The Dice score and Jaccard index are commonly used metrics for the evaluation of segmentation tasks in medical imaging. Convolutional neural networks trained for image segmentation tasks are usually optimized for (weighted) cross-entropy. This introduces an adverse discrepancy between the learning optimization objective (the loss) and the end target metric. Recent works in computer vision have pro… ▽ More The Dice score and Jaccard index are commonly used metrics for the evaluation of segmentation tasks in medical imaging. Convolutional neural networks trained for image segmentation tasks are usually optimized for (weighted) cross-entropy. This introduces an adverse discrepancy between the learning optimization objective (the loss) and the end target metric. Recent works in computer vision have proposed soft surrogates to alleviate this discrepancy and directly optimize the desired metric, either through relaxations (soft-Dice, soft-Jaccard) or submodular optimization (Lovász-softmax). The aim of this study is two-fold. First, we investigate the theoretical differences in a risk minimization framework and question the existence of a weighted cross-entropy loss with weights theoretically optimized to surrogate Dice or Jaccard. Second, we empirically investigate the behavior of the aforementioned loss functions w.r.t. evaluation with Dice score and Jaccard index on five medical segmentation tasks. Through the application of relative approximation bounds, we show that all surrogates are equivalent up to a multiplicative factor, and that no optimal weighting of cross-entropy exists to approximate Dice or Jaccard measures. We validate these findings empirically and show that, while it is important to opt for one of the target metric surrogates rather than a cross-entropy-based loss, the choice of the surrogate does not make a statistical difference on a wide range of medical segmentation tasks. △ Less

Submitted 5 November, 2019; originally announced November 2019.

Comments: MICCAI 2019

Journal ref: LNCS 11765, Springer Nature Switzerland AG 2019

arXiv:1208.4773 [pdf, ps, other]

Optimized Look-Ahead Tree Policies: A Bridge Between Look-Ahead Tree Policies and Direct Policy Search

Authors: Tobias Jung, Louis Wehenkel, Damien Ernst, Francis Maes

Abstract: Direct policy search (DPS) and look-ahead tree (LT) policies are two widely used classes of techniques to produce high performance policies for sequential decision-making problems. To make DPS approaches work well, one crucial issue is to select an appropriate space of parameterized policies with respect to the targeted problem. A fundamental issue in LT approaches is that, to take good decisions,… ▽ More Direct policy search (DPS) and look-ahead tree (LT) policies are two widely used classes of techniques to produce high performance policies for sequential decision-making problems. To make DPS approaches work well, one crucial issue is to select an appropriate space of parameterized policies with respect to the targeted problem. A fundamental issue in LT approaches is that, to take good decisions, such policies must develop very large look-ahead trees which may require excessive online computational resources. In this paper, we propose a new hybrid policy learning scheme that lies at the intersection of DPS and LT, in which the policy is an algorithm that develops a small look-ahead tree in a directed way, guided by a node scoring function that is learned through DPS. The LT-based representation is shown to be a versatile way of representing policies in a DPS scheme, while at the same time, DPS enables to significantly reduce the size of the look-ahead trees that are required to take high-quality decisions. We experimentally compare our method with two other state-of-the-art DPS techniques and four common LT policies on four benchmark domains and show that it combines the advantages of the two techniques from which it originates. In particular, we show that our method: (1) produces overall better performing policies than both pure DPS and pure LT policies, (2) requires a substantially smaller number of policy evaluations than other DPS techniques, (3) is easy to tune and (4) results in policies that are quite robust with respect to perturbations of the initial conditions. △ Less

Submitted 23 August, 2012; originally announced August 2012.

Comments: In Submission

arXiv:1208.4692 [pdf, other]

Monte Carlo Search Algorithm Discovery for One Player Games

Authors: Francis Maes, David Lupien St-Pierre, Damien Ernst

Abstract: Much current research in AI and games is being devoted to Monte Carlo search (MCS) algorithms. While the quest for a single unified MCS algorithm that would perform well on all problems is of major interest for AI, practitioners often know in advance the problem they want to solve, and spend plenty of time exploiting this knowledge to customize their MCS algorithm in a problem-driven way. We propo… ▽ More Much current research in AI and games is being devoted to Monte Carlo search (MCS) algorithms. While the quest for a single unified MCS algorithm that would perform well on all problems is of major interest for AI, practitioners often know in advance the problem they want to solve, and spend plenty of time exploiting this knowledge to customize their MCS algorithm in a problem-driven way. We propose an MCS algorithm discovery scheme to perform this in an automatic and reproducible way. We first introduce a grammar over MCS algorithms that enables inducing a rich space of candidate algorithms. Afterwards, we search in this space for the algorithm that performs best on average for a given distribution of training problems. We rely on multi-armed bandits to approximately solve this optimization problem. The experiments, generated on three different domains, show that our approach enables discovering algorithms that outperform several well-known MCS algorithms such as Upper Confidence bounds applied to Trees and Nested Monte Carlo search. We also show that the discovered algorithms are generally quite robust with respect to changes in the distribution over the training problems. △ Less

Submitted 18 December, 2012; v1 submitted 23 August, 2012; originally announced August 2012.

arXiv:1207.5208 [pdf, other]

Meta-Learning of Exploration/Exploitation Strategies: The Multi-Armed Bandit Case

Authors: Francis Maes, Damien Ernst, Louis Wehenkel

Abstract: The exploration/exploitation (E/E) dilemma arises naturally in many subfields of Science. Multi-armed bandit problems formalize this dilemma in its canonical form. Most current research in this field focuses on generic solutions that can be applied to a wide range of problems. However, in practice, it is often the case that a form of prior information is available about the specific class of targe… ▽ More The exploration/exploitation (E/E) dilemma arises naturally in many subfields of Science. Multi-armed bandit problems formalize this dilemma in its canonical form. Most current research in this field focuses on generic solutions that can be applied to a wide range of problems. However, in practice, it is often the case that a form of prior information is available about the specific class of target problems. Prior knowledge is rarely used in current solutions due to the lack of a systematic approach to incorporate it into the E/E strategy. To address a specific class of E/E problems, we propose to proceed in three steps: (i) model prior knowledge in the form of a probability distribution over the target class of E/E problems; (ii) choose a large hypothesis space of candidate E/E strategies; and (iii), solve an optimization problem to find a candidate E/E strategy of maximal average performance over a sample of problems drawn from the prior distribution. We illustrate this meta-learning approach with two different hypothesis spaces: one where E/E strategies are numerically parameterized and another where E/E strategies are represented as small symbolic formulas. We propose appropriate optimization algorithms for both cases. Our experiments, with two-armed Bernoulli bandit problems and various playing budgets, show that the meta-learnt E/E strategies outperform generic strategies of the literature (UCB1, UCB1-Tuned, UCB-v, KL-UCB and epsilon greedy); they also evaluate the robustness of the learnt E/E strategies, by tests carried out on arms whose rewards follow a truncated Gaussian distribution. △ Less

Submitted 22 July, 2012; originally announced July 2012.

Comments: 16 pages, Springer Selection of papers of ICAART'12

Showing 1–14 of 14 results for author: Maes, F