Information-seeking polynomial NARX model-predictive control through expected free energy minimization

Abstract: We propose an adaptive model-predictive controller that balances driving the system to a goal state and seeking system observations that are informative with respect to the parameters of a nonlinear autoregressive exogenous model. The controller's objective function is derived from an expected free energy functional and contains information-theoretic terms expressing uncertainty over model parameters and output predictions. Experiments illustrate how parameter uncertainty affects the control objective and evaluate the proposed controller for a pendulum swing-up task.

Submitted 22 December, 2023; originally announced December 2023.

Comments: 6 pages, 3 figures

arXiv:2009.00845 [pdf, other]

doi 10.1007/978-3-030-64919-7_6

Online system identification in a Duffing oscillator by free energy minimisation

Authors: Wouter M Kouw

Abstract: Online system identification is the estimation of parameters of a dynamical system, such as mass or friction coefficients, for each measurement of the input and output signals. Here, the nonlinear stochastic differential equation of a Duffing oscillator is cast to a generative model and dynamical parameters are inferred using variational message passing on a factor graph of the model. The approach is validated with an experiment on data from an electronic implementation of a Duffing oscillator. The proposed inference procedure performs as well as offline prediction error minimisation in a state-of-the-art nonlinear model.

Submitted 2 September, 2020; originally announced September 2020.

Comments: 10 pages, 5 figures. Accepted to the International Workshop on Active Inference (final author version)

arXiv:2002.12105 [pdf, other]

doi 10.1371/journal.pone.0237009

The Data Representativeness Criterion: Predicting the Performance of Supervised Classification Based on Data Set Similarity

Authors: Evelien Schat, Rens van de Schoot, Wouter M. Kouw, Duco Veen, Adriënne M. Mendrik

Abstract: In a broad range of fields it may be desirable to reuse a supervised classification algorithm and apply it to a new data set. However, generalization of such an algorithm and thus achieving a similar classification performance is only possible when the training data used to build the algorithm is similar to new unseen data one wishes to apply it to. It is often unknown in advance how an algorithm will perform on new unseen data, which is a crucial reason for not deploying an algorithm at all. Therefore, tools are needed to measure the similarity of data sets. In this paper, we propose the Data Representativeness Criterion (DRC) to determine how representative a training data set is of a new unseen data set. We present a proof of principle, to see whether the DRC can quantify the similarity of data sets and whether the DRC relates to the performance of a supervised classification algorithm. We compared a number of magnetic resonance imaging (MRI) data sets, ranging from subtle to severe differences in acquisition parameters. Results indicate that, based on the similarity of data sets, the DRC is able to give an indication as to when the performance of a supervised classifier decreases. The strictness of the DRC can be set by the user, depending on what one considers to be an acceptable underperformance.

Submitted 27 February, 2020; originally announced February 2020.

Comments: 12 pages, 6 figures

Journal ref: PLoS ONE 15(8): e0237009, 2020, pp. 1-16

arXiv:1903.04191 [pdf, other]

doi 10.1007/978-3-030-20351-1_27

A cross-center smoothness prior for variational Bayesian brain tissue segmentation

Authors: Wouter M. Kouw, Silas N. Ørting, Jens Petersen, Kim S. Pedersen, Marleen de Bruijne

Abstract: Suppose one is faced with the challenge of tissue segmentation in MR images, without annotators at their center to provide labeled training data. One option is to go to another medical center for a trained classifier. Sadly, tissue classifiers do not generalize well across centers due to voxel intensity shifts caused by center-specific acquisition protocols. However, certain aspects of segmentations, such as spatial smoothness, remain relatively consistent and can be learned separately. Here we present a smoothness prior that is fit to segmentations produced at another medical center. This informative prior is presented to an unsupervised Bayesian model. The model clusters the voxel intensities, such that it produces segmentations that are similarly smooth to those of the other medical center. In addition, the unsupervised Bayesian model is extended to a semi-supervised variant, which needs no visual interpretation of clusters into tissues.

Submitted 11 March, 2019; originally announced March 2019.

Comments: 12 pages, 2 figures, 1 table. Accepted to the International Conference on Information Processing in Medical Imaging (2019)

Journal ref: International Conference on Information Processing in Medical Imaging (IPMI), Hong Kong, 2019, pp. 360-371

arXiv:1901.05335 [pdf, ps, other]

doi 10.1109/TPAMI.2019.2945942

A review of domain adaptation without target labels

Authors: Wouter M. Kouw, Marco Loog

Abstract: Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: how can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches, divided into what we refer to as sample-based, feature-based and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting and representing features such that a source classifier performs well on the target domain, and inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research.

Submitted 24 July, 2019; v1 submitted 16 January, 2019; originally announced January 2019.

Comments: 20 pages, 5 figures

arXiv:1812.11806 [pdf, ps, other]

An introduction to domain adaptation and transfer learning

Authors: Wouter M. Kouw, Marco Loog

Abstract: In machine learning, if the training data is an unbiased sample of an underlying distribution, then the learned classification function will make accurate predictions for new samples. However, if the training data is not an unbiased sample, then there will be differences between how the training data is distributed and how the test data is distributed. Standard classifiers cannot cope with changes in data distributions between training and test phases, and will not perform well. Domain adaptation and transfer learning are sub-fields within machine learning that are concerned with accounting for these types of changes. Here, we present an introduction to these fields, guided by the question: when and how can a classifier generalize from a source to a target domain? We will start with a brief introduction into risk minimization, and how transfer learning and domain adaptation expand upon this framework. Following that, we discuss three special cases of data set shift, namely prior, covariate and concept shift. For more complex domain shifts, there are a wide variety of approaches. These are categorized into: importance-weighting, subspace mapping, domain-invariant spaces, feature augmentation, minimax estimators and robust algorithms. A number of points will arise, which we will discuss in the last section. We conclude with the remark that many open questions will have to be addressed before transfer learners and domain-adaptive classifiers become practical.

Submitted 14 January, 2019; v1 submitted 31 December, 2018; originally announced December 2018.

Comments: Technical Report. 41 pages, 5 figures

arXiv:1810.07430 [pdf, other]

doi 10.1109/ISBI.2019.8759281

Learning an MR acquisition-invariant representation using Siamese neural networks

Authors: Wouter M. Kouw, Marco Loog, Wilbert Bartels, Adriënne M. Mendrik

Abstract: Generalization of voxelwise classifiers is hampered by differences between MRI-scanners, e.g. different acquisition protocols and field strengths. To address this limitation, we propose a Siamese neural network (MRAI-NET) that extracts acquisition-invariant feature vectors. These can consequently be used by task-specific methods, such as voxelwise classifiers for tissue segmentation. MRAI-NET is tested on both simulated and real patient data. Experiments show that MRAI-NET outperforms voxelwise classifiers trained on the source or target scanner data when a small number of labeled samples is available.

Submitted 17 October, 2018; originally announced October 2018.

Comments: 3 figures, submitted to International Symposium on Biomedical Imaging 2019

Journal ref: 16th IEEE International Symposium on Biomedical Imaging (ISBI), Venice, 2019, pp. 364-367

arXiv:1806.09463 [pdf, ps, other]

doi 10.1007/978-3-030-73973-7_1

Target Robust Discriminant Analysis

Authors: Wouter M. Kouw, Marco Loog

Abstract: In practice, the data distribution at test time often differs, to a smaller or larger extent, from that of the original training data. Consequentially, the so-called source classifier, trained on the available labelled data, deteriorates on the test, or target, data. Domain adaptive classifiers aim to combat this problem, but typically assume some particular form of domain shift. Most are not robust to violations of domain shift assumptions and may even perform worse than their non-adaptive counterparts. We construct robust parameter estimators for discriminant analysis that guarantee performance improvements of the adaptive classifier over the non-adaptive source classifier.

Submitted 8 February, 2021; v1 submitted 21 June, 2018; originally announced June 2018.

Comments: 10 pages, no figures, 2 tables, 2 lemmas, 1 theorem. arXiv admin note: substantial text overlap with arXiv:1706.08082. Accepted to the IAPR Joint International Workshops on Statistical + Structural and Syntactic Pattern Recognition (S+SSPR 2020). The final authenticated publication will soon be available online

arXiv:1804.07344 [pdf, other]

doi 10.1109/ICPR.2018.8546186

Effects of sampling skewness of the importance-weighted risk estimator on model selection

Authors: Wouter M. Kouw, Marco Loog

Abstract: Importance-weighting is a popular and well-researched technique for dealing with sample selection bias and covariate shift. It has desirable characteristics such as unbiasedness, consistency and low computational complexity. However, weighting can have a detrimental effect on an estimator as well. In this work, we empirically show that the sampling distribution of an importance-weighted estimator can be skewed. For sample selection bias settings, and for small sample sizes, the importance-weighted risk estimator produces overestimates for datasets in the body of the sampling distribution, i.e. the majority of cases, and large underestimates for data sets in the tail of the sampling distribution. These over- and underestimates of the risk lead to suboptimal regularization parameters when used for importance-weighted validation.

Submitted 19 April, 2018; originally announced April 2018.

Comments: Conference paper, 6 pages, 5 figures

Journal ref: 24th International Conference on Pattern Recognition (ICPR), Beijing, 2018, pp. 1468-1473

arXiv:1710.06514 [pdf, ps, other]

Robust importance-weighted cross-validation under sample selection bias

Authors: Wouter M. Kouw, Jesse H. Krijthe, Marco Loog

Abstract: Cross-validation under sample selection bias can, in principle, be done by importance-weighting the empirical risk. However, the importance-weighted risk estimator produces sub-optimal hyperparameter estimates in problem settings where large weights arise with high probability. We study its sampling variance as a function of the training data distribution and introduce a control variate to increase its robustness to problematically large weights.

Submitted 27 August, 2019; v1 submitted 17 October, 2017; originally announced October 2017.

Comments: 6 pages, 8 figures, Accepted to the IEEE International Workshop on Machine Learning for Signal Processing 2019

arXiv:1709.07944 [pdf, other]

MR Acquisition-Invariant Representation Learning

Authors: Wouter M. Kouw, Marco Loog, Lambertus W. Bartels, Adriënne M. Mendrik

Abstract: Voxelwise classification approaches are popular and effective methods for tissue quantification in brain magnetic resonance imaging (MRI) scans. However, generalization of these approaches is hampered by large differences between sets of MRI scans such as differences in field strength, vendor or acquisition protocols. Due to this acquisition related variation, classifiers trained on data from a specific scanner fail or under-perform when applied to data that was acquired differently. In order to address this lack of generalization, we propose a Siamese neural network (MRAI-net) to learn a representation that minimizes the between-scanner variation, while maintaining the contrast between brain tissues necessary for brain tissue quantification. The proposed MRAI-net was evaluated on both simulated and real MRI data. After learning the MR acquisition invariant representation, any supervised classification model that uses feature vectors can be applied. In this paper, we provide a proof of principle, which shows that a linear classifier applied on the MRAI representation is able to outperform supervised convolutional neural network classifiers for tissue classification when little target training data is available.

Submitted 19 April, 2018; v1 submitted 22 September, 2017; originally announced September 2017.

Comments: 36 pages, 2 appendices, 12 figures, 3 tables

arXiv:1706.08082 [pdf, ps, other]

doi 10.1016/j.patrec.2021.05.005

Target contrastive pessimistic risk for robust domain adaptation

Authors: Wouter M. Kouw, Marco Loog

Abstract: In domain adaptation, classifiers with information from a source domain adapt to generalize to a target domain. However, an adaptive classifier can perform worse than a non-adaptive classifier due to invalid assumptions, increased sensitivity to estimation errors or model misspecification. Our goal is to develop a domain-adaptive classifier that is robust in the sense that it does not rely on restrictive assumptions on how the source and target domains relate to each other and that it does not perform worse than the non-adaptive classifier. We formulate a conservative parameter estimator that only deviates from the source classifier when a lower risk is guaranteed for all possible labellings of the given target samples. We derive the classical least-squares and discriminant analysis cases and show that these perform on par with state-of-the-art domain adaptive classifiers in sample selection bias settings, while outperforming them in more general domain adaptation settings.

Submitted 25 June, 2017; originally announced June 2017.

Comments: 35 pages, 3 figures, 6 tables, 2 algorithms, 1 theorem

arXiv:1608.00250 [pdf, other]

doi 10.1109/ICPR.2016.7899671

On Regularization Parameter Estimation under Covariate Shift

Authors: Wouter M. Kouw, Marco Loog

Abstract: This paper identifies a problem with the usual procedure for L2-regularization parameter estimation in a domain adaptation setting. In such a setting, there are differences between the distributions generating the training data (source domain) and the test data (target domain). The usual cross-validation procedure requires validation data, which can not be obtained from the unlabeled target data. The problem is that if one decides to use source validation data, the regularization parameter is underestimated. One possible solution is to scale the source validation data through importance weighting, but we show that this correction is not sufficient. We conclude the paper with an empirical analysis of the effect of several importance weight estimators on the estimation of the regularization parameter.

Submitted 31 July, 2016; originally announced August 2016.

Comments: 6 pages, 2 figures, 2 tables. Accepted to ICPR 2016

Journal ref: 23rd International Conference on Pattern Recognition (ICPR), Cancun, 2016, pp. 426-431

arXiv:1512.04829 [pdf, other]

Feature-Level Domain Adaptation

Authors: Wouter M. Kouw, Jesse H. Krijthe, Marco Loog, Laurens J. P. van der Maaten

Abstract: Domain adaptation is the supervised learning setting in which the training and test data are sampled from different distributions: training data is sampled from a source domain, whilst test data is sampled from a target domain. This paper proposes and studies an approach, called feature-level domain adaptation (FLDA), that models the dependence between the two domains by means of a feature-level transfer model that is trained to describe the transfer from source to target domain. Subsequently, we train a domain-adapted classifier by minimizing the expected loss under the resulting transfer model. For linear classifiers and a large family of loss functions and transfer models, this expected loss can be computed or approximated analytically, and minimized efficiently. Our empirical evaluation of FLDA focuses on problems comprising binary and count data in which the transfer can be naturally modeled via a dropout distribution, which allows the classifier to adapt to differences in the marginal probability of features in the source and the target domain. Our experiments on several real-world problems show that FLDA performs on par with state-of-the-art domain-adaptation techniques.

Submitted 7 June, 2016; v1 submitted 15 December, 2015; originally announced December 2015.

Comments: 32 pages, 13 figures, 9 tables

Journal ref: JMLR 17:171 (2016) 1-32

Showing 1–14 of 14 results for author: Kouw, W M