-
Incorporating Sum Constraints into Multitask Gaussian Processes
Authors:
Philipp Pilar,
Carl Jidling,
Thomas B. Schön,
Niklas Wahlström
Abstract:
Machine learning models can be improved by adapting them to respect existing background knowledge. In this paper we consider multitask Gaussian processes, with background knowledge in the form of constraints that require a specific sum of the outputs to be constant. This is achieved by conditioning the prior distribution on the constraint fulfillment. The approach allows for both linear and nonlinear constraints. We demonstrate that the constraints are fulfilled with high precision and that the construction can improve the overall prediction accuracy as compared to the standard Gaussian process.
Submitted 1 February, 2023; v1 submitted 3 February, 2022;
originally announced February 2022.
-
A Probabilistically Motivated Learning Rate Adaptation for Stochastic Optimization
Authors:
Filip de Roos,
Carl Jidling,
Adrian Wills,
Thomas Schön,
Philipp Hennig
Abstract:
Machine learning practitioners invest significant manual and computational resources in finding suitable learning rates for optimization algorithms. We provide a probabilistic motivation, in terms of Gaussian inference, for popular stochastic first-order methods. As an important special case, it recovers the Polyak step with a general metric. The inference allows us to relate the learning rate to a dimensionless quantity that can be automatically adapted during training by a control algorithm. The resulting meta-algorithm is shown to adapt learning rates in a robust manner across a large range of initial values when applied to deep learning benchmark problems.
Submitted 22 February, 2021;
originally announced February 2021.
-
Linearly Constrained Neural Networks
Authors:
Johannes Hendriks,
Carl Jidling,
Adrian Wills,
Thomas Schön
Abstract:
We present a novel approach to modelling and learning vector fields from physical systems using neural networks that explicitly satisfy known linear operator constraints. To achieve this, the target function is modelled as a linear transformation of an underlying potential field, which is in turn modelled by a neural network. This transformation is chosen such that any prediction of the target function is guaranteed to satisfy the constraints. The approach is demonstrated on both simulated and real data examples.
Submitted 27 April, 2021; v1 submitted 4 February, 2020;
originally announced February 2020.
-
Deep kernel learning for integral measurements
Authors:
Carl Jidling,
Johannes Hendriks,
Thomas B. Schön,
Adrian Wills
Abstract:
Deep kernel learning refers to a Gaussian process that incorporates neural networks to improve the modelling of complex functions. We present a method that makes this approach feasible for problems where the data consists of line integral measurements of the target function. The performance is illustrated on computed tomography reconstruction examples.
Submitted 4 September, 2019;
originally announced September 2019.
-
Neutron Transmission Strain Tomography for Non-Constant Stress-Free Lattice Spacing
Authors:
J. N. Hendriks,
C. Jidling,
T. B. Schön,
A. Wills,
C. M. Wensrich,
E. H. Kisi
Abstract:
Recently, several algorithms for strain tomography from energy-resolved neutron transmission measurements have been proposed. These methods assume that the stress-free lattice spacing $d_0$ is a known constant, limiting their application to the study of stresses generated by manufacturing and loading methods that do not alter this parameter. In this paper, we consider the more general problem of jointly reconstructing the strain and $d_0$ fields. A method for solving this inherently non-linear problem is presented that ensures the estimated strain field satisfies equilibrium and can include knowledge of boundary conditions. This method is tested on a simulated data set with realistic noise levels, demonstrating that it is possible to jointly reconstruct $d_0$ and the strain field.
Submitted 18 July, 2019; v1 submitted 15 May, 2019;
originally announced May 2019.
-
Evaluating the squared-exponential covariance function in Gaussian processes with integral observations
Authors:
J. N. Hendriks,
C. Jidling,
A. Wills,
T. B. Schön
Abstract:
This paper deals with the evaluation of double line integrals of the squared exponential covariance function. We propose a new approach in which the double integral is reduced to a single integral using the error function. This single integral is then computed with efficiently implemented numerical techniques. The performance is compared against existing state-of-the-art methods and the results show superior properties in numerical robustness and accuracy per computation time.
Submitted 18 December, 2018;
originally announced December 2018.
-
A fast quasi-Newton-type method for large-scale stochastic optimisation
Authors:
Adrian Wills,
Carl Jidling,
Thomas Schon
Abstract:
During recent years there has been an increased interest in stochastic adaptations of limited memory quasi-Newton methods, which, compared to pure gradient-based routines, can improve the convergence by incorporating second order information. In this work we propose a direct least-squares approach conceptually similar to the limited memory quasi-Newton methods, but that computes the search direction in a slightly different way. This is achieved in a fast and numerically robust manner by maintaining a Cholesky factor of low dimension. This is combined with a stochastic line search relying upon fulfilment of the Wolfe condition in a backtracking manner, where the step length is adaptively modified with respect to the optimisation progress. We support our new algorithm by providing several theoretical results guaranteeing its performance. The performance is demonstrated on real-world benchmark problems, showing improved results in comparison with already established methods.
Submitted 29 September, 2018;
originally announced October 2018.
-
Probabilistic approach to limited-data computed tomography reconstruction
Authors:
Zenith Purisha,
Carl Jidling,
Niklas Wahlström,
Simo Särkkä,
Thomas B. Schön
Abstract:
In this work, we consider the inverse problem of reconstructing the internal structure of an object from limited x-ray projections. We use a Gaussian process prior to model the target function and estimate its (hyper)parameters from measured data. In contrast to other established methods, this comes with the advantage of not requiring any manual parameter tuning, which usually arises in classical regularization strategies. Our method uses a basis function expansion technique for the Gaussian process which significantly reduces the computational complexity and avoids the need for numerical integration. The approach also allows some classical regularization methods, such as Laplacian and Tikhonov regularization, to be reformulated as Gaussian process regression, and hence provides an efficient algorithm and principled means for their parameter tuning. Results from simulated and real data indicate that this approach is less sensitive to streak artifacts as compared to the commonly used method of filtered backprojection.
Submitted 3 July, 2019; v1 submitted 11 September, 2018;
originally announced September 2018.
-
Probabilistic modelling and reconstruction of strain
Authors:
Carl Jidling,
Johannes Hendriks,
Niklas Wahlström,
Alexander Gregg,
Thomas B. Schön,
Christopher Wensrich,
Adrian Wills
Abstract:
This paper deals with modelling and reconstruction of strain fields, relying upon data generated from neutron Bragg-edge measurements. We propose a probabilistic approach in which the strain field is modelled as a Gaussian process, assigned a covariance structure customised by incorporation of the so-called equilibrium constraints. The computational complexity is significantly reduced by utilising an approximation scheme well suited for the problem. We illustrate the method on simulations and real data. The results indicate a high potential and can hopefully inspire the concept of probabilistic modelling to be used within other tomographic applications as well.
Submitted 5 November, 2018; v1 submitted 10 February, 2018;
originally announced February 2018.
-
Linearly constrained Gaussian processes
Authors:
Carl Jidling,
Niklas Wahlström,
Adrian Wills,
Thomas B. Schön
Abstract:
We consider a modification of the covariance function in Gaussian processes to correctly account for known linear constraints. By modelling the target function as a transformation of an underlying function, the constraints are explicitly incorporated in the model such that they are guaranteed to be fulfilled by any sample drawn or prediction made. We also propose a constructive procedure for designing the transformation operator and illustrate the result on both simulated and real-data examples.
Submitted 19 September, 2017; v1 submitted 2 March, 2017;
originally announced March 2017.