
Incorporating Sum Constraints into Multitask Gaussian Processes

Authors: Philipp Pilar, Carl Jidling, Thomas B. Schön, Niklas Wahlström

Abstract: Machine learning models can be improved by adapting them to respect existing background knowledge. In this paper we consider multitask Gaussian processes, with background knowledge in the form of constraints that require a specific sum of the outputs to be constant. This is achieved by conditioning the prior distribution on the constraint fulfillment. The approach allows for both linear and nonlinear constraints. We demonstrate that the constraints are fulfilled with high precision and that the construction can improve the overall prediction accuracy as compared to the standard Gaussian process.

Submitted 1 February, 2023; v1 submitted 3 February, 2022; originally announced February 2022.

Journal ref: Transactions on Machine Learning Research, 2022
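
To make the idea of conditioning a prior on constraint fulfillment concrete, the following is a minimal finite-dimensional sketch for the linear case only. It is not the authors' implementation: the function name `condition_on_sum`, the three-task covariance matrix, and the target sum of 1 are all hypothetical choices made for illustration. It applies the standard Gaussian conditioning formulas to a noiseless constraint of the form weights @ f == total.

```python
import numpy as np

def condition_on_sum(mean, cov, weights, total):
    """Condition N(mean, cov) on the noiseless linear constraint weights @ f == total."""
    weights = np.asarray(weights, dtype=float)
    k_a = cov @ weights                  # covariance between f and the weighted sum
    s = weights @ k_a                    # variance of the weighted sum (a scalar)
    gain = k_a / s
    new_mean = mean + gain * (total - weights @ mean)
    new_cov = cov - np.outer(gain, k_a)
    return new_mean, new_cov

# Hypothetical example: three task outputs at a single input, required to sum to 1.
mean = np.zeros(3)
cov = np.array([[1.0, 0.3, 0.1],
                [0.3, 1.0, 0.2],
                [0.1, 0.2, 1.0]])
weights = np.ones(3)
post_mean, post_cov = condition_on_sum(mean, cov, weights, total=1.0)

# The conditioned mean satisfies the constraint, and the conditioned
# distribution has no remaining variance along the constrained direction.
constraint_value = weights @ post_mean
constraint_variance = weights @ post_cov @ weights
```

In this sketch the constraint holds exactly up to floating-point error; the nonlinear constraints mentioned in the abstract are not covered here.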

arXiv:2102.10880 [pdf, other]

A Probabilistically Motivated Learning Rate Adaptation for Stochastic Optimization

Authors: Filip de Roos, Carl Jidling, Adrian Wills, Thomas Schön, Philipp Hennig

Abstract: Machine learning practitioners invest significant manual and computational resources in finding suitable learning rates for optimization algorithms. We provide a probabilistic motivation, in terms of Gaussian inference, for popular stochastic first-order methods. As an important special case, it recovers the Polyak step with a general metric. The inference allows us to relate the learning rate to a dimensionless quantity that can be automatically adapted during training by a control algorithm. The resulting meta-algorithm is shown to adapt learning rates in a robust manner across a large range of initial values when applied to deep learning benchmark problems.

Submitted 22 February, 2021; originally announced February 2021.
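
For readers unfamiliar with the Polyak step mentioned in the abstract, here is a minimal sketch of its classical deterministic form with the Euclidean metric, where the step length is (f(x) - f_min) / ||grad||^2. This is not the paper's probabilistic derivation or its adaptive meta-algorithm; the function `polyak_step`, the quadratic objective, and the assumption that the optimal value `f_min` is known are all illustrative.

```python
import numpy as np

def polyak_step(x, f_value, grad, f_min=0.0):
    """One gradient step using the classical Polyak step length."""
    step = (f_value - f_min) / (grad @ grad)
    return x - step * grad

def objective(x):
    # Hypothetical test objective 0.5 * ||x||^2, whose minimum value is 0.
    return 0.5 * (x @ x)

x = np.array([2.0, -4.0])
for _ in range(5):
    gradient = x.copy()          # the gradient of 0.5 * ||x||^2 is x itself
    x = polyak_step(x, objective(x), gradient)
```

On this particular quadratic the step length is always 0.5, so each iteration halves the iterate.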

arXiv:2002.01600 [pdf, other]

Linearly Constrained Neural Networks

Authors: Johannes Hendriks, Carl Jidling, Adrian Wills, Thomas Schön

Abstract: We present a novel approach to modelling and learning vector fields from physical systems using neural networks that explicitly satisfy known linear operator constraints. To achieve this, the target function is modelled as a linear transformation of an underlying potential field, which is in turn modelled by a neural network. This transformation is chosen such that any prediction of the target function is guaranteed to satisfy the constraints. The approach is demonstrated on both simulated and real data examples.

Submitted 27 April, 2021; v1 submitted 4 February, 2020; originally announced February 2020.

arXiv:1909.01844 [pdf, other]

Deep kernel learning for integral measurements

Authors: Carl Jidling, Johannes Hendriks, Thomas B. Schön, Adrian Wills

Abstract: Deep kernel learning refers to a Gaussian process that incorporates neural networks to improve the modelling of complex functions. We present a method that makes this approach feasible for problems where the data consists of line integral measurements of the target function. The performance is illustrated on computed tomography reconstruction examples.

Submitted 4 September, 2019; originally announced September 2019.

arXiv:1905.06854 [pdf, other]

doi 10.1016/j.nimb.2019.07.005

Neutron Transmission Strain Tomography for Non-Constant Stress-Free Lattice Spacing

Authors: J. N. Hendriks, C. Jidling, T. B. Schön, A. Wills, C. M. Wensrich, E. H. Kisi

Abstract: Recently, several algorithms for strain tomography from energy-resolved neutron transmission measurements have been proposed. These methods assume that the stress-free lattice spacing $d_0$ is a known constant, limiting their application to the study of stresses generated by manufacturing and loading methods that do not alter this parameter. In this paper, we consider the more general problem of jointly reconstructing the strain and $d_0$ fields. A method for solving this inherently non-linear problem is presented that ensures the estimated strain field satisfies equilibrium and can include knowledge of boundary conditions. This method is tested on a simulated data set with realistic noise levels, demonstrating that it is possible to jointly reconstruct $d_0$ and the strain field.

Submitted 18 July, 2019; v1 submitted 15 May, 2019; originally announced May 2019.

Comments: Journal article, 14 pages, 5 figures

Journal ref: Nuclear instruments and methods in physics research section B, 456:64-73, 2019

arXiv:1812.07319 [pdf, other]

Evaluating the squared-exponential covariance function in Gaussian processes with integral observations

Authors: J. N. Hendriks, C. Jidling, A. Wills, T. B. Schön

Abstract: This paper deals with the evaluation of double line integrals of the squared exponential covariance function. We propose a new approach in which the double integral is reduced to a single integral using the error function. This single integral is then computed with efficiently implemented numerical techniques. The performance is compared against existing state-of-the-art methods and the results show superior properties in numerical robustness and accuracy per computation time.

Submitted 18 December, 2018; originally announced December 2018.

arXiv:1810.01269 [pdf, other]

A fast quasi-Newton-type method for large-scale stochastic optimisation

Authors: Adrian Wills, Carl Jidling, Thomas Schon

Abstract: During recent years there has been an increased interest in stochastic adaptations of limited memory quasi-Newton methods, which compared to pure gradient-based routines can improve the convergence by incorporating second order information. In this work we propose a direct least-squares approach conceptually similar to the limited memory quasi-Newton methods, but that computes the search direction in a slightly different way. This is achieved in a fast and numerically robust manner by maintaining a Cholesky factor of low dimension. This is combined with a stochastic line search relying upon fulfilment of the Wolfe condition in a backtracking manner, where the step length is adaptively modified with respect to the optimisation progress. We support our new algorithm by providing several theoretical results guaranteeing its performance. The performance is demonstrated on real-world benchmark problems, where it shows improved results in comparison with already established methods.

Submitted 29 September, 2018; originally announced October 2018.

Comments: arXiv admin note: substantial text overlap with arXiv:1802.04310

arXiv:1809.03779 [pdf, other]

doi 10.1088/1361-6420/ab2e2a

Probabilistic approach to limited-data computed tomography reconstruction

Authors: Zenith Purisha, Carl Jidling, Niklas Wahlström, Simo Särkkä, Thomas B. Schön

Abstract: In this work, we consider the inverse problem of reconstructing the internal structure of an object from limited x-ray projections. We use a Gaussian process prior to model the target function and estimate its (hyper)parameters from measured data. In contrast to other established methods, this comes with the advantage of not requiring any manual parameter tuning, which usually arises in classical regularization strategies. Our method uses a basis function expansion technique for the Gaussian process which significantly reduces the computational complexity and avoids the need for numerical integration. The approach also allows for reformulation of some classical regularization methods, such as Laplacian and Tikhonov regularization, as Gaussian process regression, and hence provides an efficient algorithm and principled means for their parameter tuning. Results from simulated and real data indicate that this approach is less sensitive to streak artifacts as compared to the commonly used method of filtered backprojection.

Submitted 3 July, 2019; v1 submitted 11 September, 2018; originally announced September 2018.

arXiv:1802.03636 [pdf, other]

doi 10.1016/j.nimb.2018.08.051

Probabilistic modelling and reconstruction of strain

Authors: Carl Jidling, Johannes Hendriks, Niklas Wahlström, Alexander Gregg, Thomas B. Schön, Christopher Wensrich, Adrian Wills

Abstract: This paper deals with modelling and reconstruction of strain fields, relying upon data generated from neutron Bragg-edge measurements. We propose a probabilistic approach in which the strain field is modelled as a Gaussian process, assigned a covariance structure customised by incorporation of the so-called equilibrium constraints. The computational complexity is significantly reduced by utilising an approximation scheme well suited for the problem. We illustrate the method on simulations and real data. The results indicate a high potential and can hopefully inspire the concept of probabilistic modelling to be used within other tomographic applications as well.

Submitted 5 November, 2018; v1 submitted 10 February, 2018; originally announced February 2018.

arXiv:1703.00787 [pdf, other]

Linearly constrained Gaussian processes

Authors: Carl Jidling, Niklas Wahlström, Adrian Wills, Thomas B. Schön

Abstract: We consider a modification of the covariance function in Gaussian processes to correctly account for known linear constraints. By modelling the target function as a transformation of an underlying function, the constraints are explicitly incorporated in the model such that they are guaranteed to be fulfilled by any sample drawn or prediction made. We also propose a constructive procedure for designing the transformation operator and illustrate the result on both simulated and real-data examples.

Submitted 19 September, 2017; v1 submitted 2 March, 2017; originally announced March 2017.

Comments: A few fixes and added citation information
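
The idea of obtaining a constrained function as a transformation of an underlying function can be illustrated with a small numerical sketch. The abstract does not name a specific constraint, so the example below assumes a zero-divergence constraint on a 2D vector field, built as f = (dg/dy, -dg/dx) from a scalar potential g. The potential, the function names, and the finite-difference step are hypothetical, and no Gaussian process is fitted here; the sketch only checks that the transformed field satisfies the constraint by construction.

```python
import numpy as np

def potential(x, y):
    # Hypothetical smooth scalar potential g(x, y); any smooth function would do.
    return np.sin(x) * np.cos(2.0 * y) + 0.5 * x * y

def field(x, y, h=1e-3):
    """Vector field f = (dg/dy, -dg/dx), evaluated with central differences."""
    dg_dx = (potential(x + h, y) - potential(x - h, y)) / (2 * h)
    dg_dy = (potential(x, y + h) - potential(x, y - h)) / (2 * h)
    return dg_dy, -dg_dx

def divergence(x, y, h=1e-3):
    """Central-difference estimate of d f_x / dx + d f_y / dy."""
    fx_plus, _ = field(x + h, y, h)
    fx_minus, _ = field(x - h, y, h)
    _, fy_plus = field(x, y + h, h)
    _, fy_minus = field(x, y - h, h)
    return (fx_plus - fx_minus) / (2 * h) + (fy_plus - fy_minus) / (2 * h)

# The divergence vanishes (up to rounding) whatever potential is chosen,
# because the mixed partial derivatives of g cancel.
div = divergence(0.7, -0.3)
```

Replacing `potential` with any other smooth function leaves the divergence at zero, which is the sense in which the constraint is built into the model rather than learned from data.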

Showing 1–10 of 10 results for author: Jidling, C