-
Probabilistic Matching of Real and Generated Data Statistics in Generative Adversarial Networks
Authors:
Philipp Pilar,
Niklas Wahlström
Abstract:
Generative adversarial networks constitute a powerful approach to generative modeling. While generated samples often are indistinguishable from real data, mode-collapse may occur and there is no guarantee that they will follow the true data distribution. For scientific applications in particular, it is essential that the true distribution is well captured by the generated distribution. In this work, we propose a method to ensure that the distributions of certain generated data statistics coincide with the respective distributions of the real data. In order to achieve this, we add a new loss term to the generator loss function, which quantifies the difference between these distributions via suitable f-divergences. Kernel density estimation is employed to obtain representations of the true distributions, and to estimate the corresponding generated distributions from minibatch values at each iteration. When compared to other methods, our approach has the advantage that the complete shapes of the distributions are taken into account. We evaluate the method on a synthetic dataset and a real-world dataset and demonstrate improved performance of our approach.
Submitted 8 February, 2024; v1 submitted 19 June, 2023;
originally announced June 2023.
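The loss term described in the abstract above compares the distribution of a statistic on real data with its distribution on generated data. A minimal sketch of that idea follows, assuming a one-dimensional statistic, a Gaussian kernel with a hand-picked bandwidth, and the Kullback-Leibler divergence as one example of an f-divergence. The function names, the bandwidth, and the toy data are illustrative assumptions and are not taken from the paper; the sketch only evaluates the loss value and does not backpropagate through a generator.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a function x -> Gaussian kernel density estimate built from `samples`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def kl_divergence(p, q, grid):
    """Approximate KL(p || q) with a Riemann sum on an evenly spaced grid."""
    step = grid[1] - grid[0]
    eps = 1e-12  # guards against log(0) in the tails
    total = 0.0
    for x in grid:
        px, qx = p(x), q(x)
        total += px * math.log((px + eps) / (qx + eps))
    return total * step


rng = random.Random(0)
# Hypothetical values of one statistic, evaluated on real and generated samples.
real_stat = [rng.gauss(0.0, 1.0) for _ in range(500)]
matched_stat = [rng.gauss(0.0, 1.0) for _ in range(500)]    # generator matching the data
collapsed_stat = [rng.gauss(0.0, 0.3) for _ in range(500)]  # generator with too little spread

grid = [-5.0 + 0.05 * i for i in range(201)]
p_real = gaussian_kde(real_stat, bandwidth=0.3)
loss_matched = kl_divergence(p_real, gaussian_kde(matched_stat, 0.3), grid)
loss_collapsed = kl_divergence(p_real, gaussian_kde(collapsed_stat, 0.3), grid)
```

In this toy setting the divergence is small when the generated statistic follows the real distribution and grows when the generated values concentrate on a narrower range, which is the kind of mismatch the abstract associates with mode collapse.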
-
Invertible Kernel PCA with Random Fourier Features
Authors:
Daniel Gedon,
Antônio H. Ribeiro,
Niklas Wahlström,
Thomas B. Schön
Abstract:
Kernel principal component analysis (kPCA) is a widely studied method to construct a low-dimensional data representation after a nonlinear transformation. The prevailing method to reconstruct the original input signal from kPCA -- an important task for denoising -- requires us to solve a supervised learning problem. In this paper, we present an alternative method where the reconstruction follows naturally from the compression step. We first approximate the kernel with random Fourier features. Then, we exploit the fact that the nonlinear transformation is invertible in a certain subdomain. Hence the name invertible kernel PCA (ikPCA). We experiment with different data modalities and show that ikPCA performs similarly to kPCA with supervised reconstruction on denoising tasks, making it a strong alternative.
Submitted 9 March, 2023;
originally announced March 2023.
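The first step named in the abstract above, approximating a kernel with random Fourier features, can be sketched as follows. The sketch assumes a Gaussian (RBF) kernel and the standard cosine feature map with random frequencies and phase offsets; the function names, the lengthscale, and the number of features are illustrative assumptions. It does not attempt the inversion step that gives ikPCA its name.

```python
import math
import random


def make_rff(dim, n_features, lengthscale, rng):
    """Sample a random Fourier feature map for the RBF kernel
    k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2))."""
    # Frequencies are drawn from the kernel's spectral density, N(0, 1/lengthscale^2).
    weights = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
               for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def features(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return features


def rbf_kernel(x, y, lengthscale):
    """Exact RBF kernel value, used here only as a reference."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))


rng = random.Random(0)
phi = make_rff(dim=2, n_features=5000, lengthscale=1.0, rng=rng)
x, y = [0.3, -0.5], [1.0, 0.2]
approx = sum(a * b for a, b in zip(phi(x), phi(y)))  # inner product of feature vectors
exact = rbf_kernel(x, y, lengthscale=1.0)
```

The inner product of the two feature vectors is a Monte Carlo estimate of the kernel value, so `approx` approaches `exact` as `n_features` grows.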
-
Physics-informed Neural Networks with Unknown Measurement Noise
Authors:
Philipp Pilar,
Niklas Wahlström
Abstract:
Physics-informed neural networks (PINNs) constitute a flexible approach to both finding solutions and identifying parameters of partial differential equations. Most works on the topic assume noiseless data, or data contaminated with weak Gaussian noise. We show that the standard PINN framework breaks down in case of non-Gaussian noise. We give a way of resolving this fundamental issue and we propose to jointly train an energy-based model (EBM) to learn the correct noise distribution. We illustrate the improved performance of our approach using multiple examples.
Submitted 19 June, 2024; v1 submitted 28 November, 2022;
originally announced November 2022.
-
Incorporating Sum Constraints into Multitask Gaussian Processes
Authors:
Philipp Pilar,
Carl Jidling,
Thomas B. Schön,
Niklas Wahlström
Abstract:
Machine learning models can be improved by adapting them to respect existing background knowledge. In this paper we consider multitask Gaussian processes, with background knowledge in the form of constraints that require a specific sum of the outputs to be constant. This is achieved by conditioning the prior distribution on the constraint fulfillment. The approach allows for both linear and nonlinear constraints. We demonstrate that the constraints are fulfilled with high precision and that the construction can improve the overall prediction accuracy as compared to the standard Gaussian process.
Submitted 1 February, 2023; v1 submitted 3 February, 2022;
originally announced February 2022.
-
Learning deep autoregressive models for hierarchical data
Authors:
Carl R. Andersson,
Niklas Wahlström,
Thomas B. Schön
Abstract:
We propose a model for hierarchical structured data as an extension to the stochastic temporal convolutional network. The proposed model combines an autoregressive model with a hierarchical variational autoencoder and downsampling to achieve superior computational complexity. We evaluate the proposed model on two different types of sequential data: speech and handwritten text. The results are promising with the proposed model achieving state-of-the-art performance.
Submitted 1 July, 2021; v1 submitted 28 April, 2021;
originally announced April 2021.
-
Deep State Space Models for Nonlinear System Identification
Authors:
Daniel Gedon,
Niklas Wahlström,
Thomas B. Schön,
Lennart Ljung
Abstract:
Deep state space models (SSMs) are an actively researched model class for temporal models developed in the deep learning community which have a close connection to classic SSMs. The use of deep SSMs as a black-box identification model can describe a wide range of dynamics due to the flexibility of deep neural networks. Additionally, the probabilistic nature of the model class allows the uncertainty of the system to be modelled. In this work a deep SSM class and its parameter learning algorithm are explained in an effort to extend the toolbox of nonlinear identification methods with a deep learning based method. Six recent deep SSMs are evaluated in a first unified implementation on nonlinear system identification benchmarks.
Submitted 18 June, 2021; v1 submitted 31 March, 2020;
originally announced March 2020.
-
Deep Convolutional Networks in System Identification
Authors:
Carl Andersson,
Antônio H. Ribeiro,
Koen Tiels,
Niklas Wahlström,
Thomas B. Schön
Abstract:
Recent developments within deep learning are relevant for nonlinear system identification problems. In this paper, we establish connections between the deep learning and the system identification communities. It has recently been shown that convolutional architectures are at least as capable as recurrent architectures when it comes to sequence modeling tasks. Inspired by these results we explore the explicit relationships between the recently proposed temporal convolutional network (TCN) and two classic system identification model structures: Volterra series and block-oriented models. We end the paper with an experimental study where we provide results on two real-world problems, the well-known Silverbox dataset and a newer dataset originating from ground vibration experiments on an F-16 fighter aircraft.
Submitted 19 November, 2019; v1 submitted 4 September, 2019;
originally announced September 2019.
-
Probabilistic approach to limited-data computed tomography reconstruction
Authors:
Zenith Purisha,
Carl Jidling,
Niklas Wahlström,
Simo Särkkä,
Thomas B. Schön
Abstract:
In this work, we consider the inverse problem of reconstructing the internal structure of an object from limited x-ray projections. We use a Gaussian process prior to model the target function and estimate its (hyper)parameters from measured data. In contrast to other established methods, this comes with the advantage of not requiring any manual parameter tuning, which usually arises in classical regularization strategies. Our method uses a basis function expansion technique for the Gaussian process which significantly reduces the computational complexity and avoids the need for numerical integration. The approach also allows for reformulation of some classical regularization methods, such as Laplacian and Tikhonov regularization, as Gaussian process regression, and hence provides an efficient algorithm and principled means for their parameter tuning. Results from simulated and real data indicate that this approach is less sensitive to streak artifacts as compared to the commonly used method of filtered backprojection.
Submitted 3 July, 2019; v1 submitted 11 September, 2018;
originally announced September 2018.
-
Probabilistic modelling and reconstruction of strain
Authors:
Carl Jidling,
Johannes Hendriks,
Niklas Wahlström,
Alexander Gregg,
Thomas B. Schön,
Christopher Wensrich,
Adrian Wills
Abstract:
This paper deals with modelling and reconstruction of strain fields, relying upon data generated from neutron Bragg-edge measurements. We propose a probabilistic approach in which the strain field is modelled as a Gaussian process, assigned a covariance structure customised by incorporation of the so-called equilibrium constraints. The computational complexity is significantly reduced by utilising an approximation scheme well suited for the problem. We illustrate the method on simulations and real data. The results indicate a high potential and can hopefully inspire the concept of probabilistic modelling to be used within other tomographic applications as well.
Submitted 5 November, 2018; v1 submitted 10 February, 2018;
originally announced February 2018.
-
Data-Driven Impulse Response Regularization via Deep Learning
Authors:
Carl Andersson,
Niklas Wahlström,
Thomas B. Schön
Abstract:
We consider the problem of impulse response estimation of stable linear single-input single-output systems. It is a well-studied problem where flexible non-parametric models recently offered a leap in performance compared to the classical finite-dimensional model structures. Inspired by this development and the success of deep learning we propose a new flexible data-driven model. Our experiments indicate that the new model is capable of exploiting even more of the hidden patterns that are present in the input-output data as compared to the non-parametric models.
Submitted 11 October, 2018; v1 submitted 25 January, 2018;
originally announced January 2018.
-
Linearly constrained Gaussian processes
Authors:
Carl Jidling,
Niklas Wahlström,
Adrian Wills,
Thomas B. Schön
Abstract:
We consider a modification of the covariance function in Gaussian processes to correctly account for known linear constraints. By modelling the target function as a transformation of an underlying function, the constraints are explicitly incorporated in the model such that they are guaranteed to be fulfilled by any sample drawn or prediction made. We also propose a constructive procedure for designing the transformation operator and illustrate the result on both simulated and real-data examples.
Submitted 19 September, 2017; v1 submitted 2 March, 2017;
originally announced March 2017.
-
Data-Efficient Learning of Feedback Policies from Image Pixels using Deep Dynamical Models
Authors:
John-Alexander M. Assael,
Niklas Wahlström,
Thomas B. Schön,
Marc Peter Deisenroth
Abstract:
Data-efficient reinforcement learning (RL) in continuous state-action spaces using very high-dimensional observations remains a key challenge in developing fully autonomous systems. We consider a particularly important instance of this challenge, the pixels-to-torques problem, where an RL agent learns a closed-loop control policy ("torques") from pixel information only. We introduce a data-efficient, model-based reinforcement learning algorithm that learns such a closed-loop policy directly from pixel information. The key ingredient is a deep dynamical model for learning a low-dimensional feature embedding of images jointly with a predictive model in this low-dimensional feature space. Joint learning is crucial for long-term predictions, which lie at the core of the adaptive nonlinear model predictive control strategy that we use for closed-loop control. Compared to state-of-the-art RL methods for continuous states and actions, our approach learns quickly, scales to high-dimensional state spaces, is lightweight and an important step toward fully autonomous end-to-end learning from pixels to torques.
Submitted 9 October, 2015; v1 submitted 7 October, 2015;
originally announced October 2015.
-
Modeling and interpolation of the ambient magnetic field by Gaussian processes
Authors:
Arno Solin,
Manon Kok,
Niklas Wahlström,
Thomas B. Schön,
Simo Särkkä
Abstract:
Anomalies in the ambient magnetic field can be used as features in indoor positioning and navigation. By using Maxwell's equations, we derive and present a Bayesian non-parametric probabilistic modeling approach for interpolation and extrapolation of the magnetic field. We model the magnetic field components jointly by imposing a Gaussian process (GP) prior on the latent scalar potential of the magnetic field. By rewriting the GP model in terms of a Hilbert space representation, we circumvent the computational pitfalls associated with GP modeling and provide a computationally efficient and physically justified modeling tool for the ambient magnetic field. The model allows for sequential updating of the estimate and time-dependent changes in the magnetic field. The model is shown to work well in practice in different applications: we demonstrate mapping of the magnetic field both with an inexpensive Raspberry Pi powered robot and on foot using a standard smartphone.
Submitted 21 March, 2018; v1 submitted 15 September, 2015;
originally announced September 2015.
-
From Pixels to Torques: Policy Learning with Deep Dynamical Models
Authors:
Niklas Wahlström,
Thomas B. Schön,
Marc Peter Deisenroth
Abstract:
Data-efficient learning in continuous state-action spaces using very high-dimensional observations remains a key challenge in developing fully autonomous systems. In this paper, we consider one instance of this challenge, the pixels to torques problem, where an agent must learn a closed-loop control policy from pixel information only. We introduce a data-efficient, model-based reinforcement learning algorithm that learns such a closed-loop policy directly from pixel information. The key ingredient is a deep dynamical model that uses deep auto-encoders to learn a low-dimensional embedding of images jointly with a predictive model in this low-dimensional feature space. Joint learning ensures that not only static but also dynamic properties of the data are accounted for. This is crucial for long-term predictions, which lie at the core of the adaptive model predictive control strategy that we use for closed-loop control. Compared to state-of-the-art reinforcement learning methods for continuous states and actions, our approach learns quickly, scales to high-dimensional state spaces and is an important step toward fully autonomous learning from pixels to torques.
Submitted 18 June, 2015; v1 submitted 8 February, 2015;
originally announced February 2015.
-
Learning deep dynamical models from image pixels
Authors:
Niklas Wahlström,
Thomas B. Schön,
Marc Peter Deisenroth
Abstract:
Modeling dynamical systems is important in many disciplines, e.g., control, robotics, or neurotechnology. Commonly the state of these systems is not directly observed, but only available through noisy and potentially high-dimensional observations. In these cases, system identification, i.e., finding the measurement mapping and the transition mapping (system dynamics) in latent space, can be challenging. For linear system dynamics and measurement mappings, efficient solutions for system identification are available. However, in practical applications, the linearity assumption does not hold, requiring non-linear system identification techniques. If additionally the observations are high-dimensional (e.g., images), non-linear system identification is inherently hard. To address the problem of non-linear system identification from high-dimensional observations, we combine recent advances in deep learning and system identification. In particular, we jointly learn a low-dimensional embedding of the observation by means of deep auto-encoders and a predictive transition model in this low-dimensional space. We demonstrate that our model enables learning good predictive models of dynamical systems from pixel information only.
Submitted 28 October, 2014;
originally announced October 2014.
-
Discretizing stochastic dynamical systems using Lyapunov equations
Authors:
Niklas Wahlström,
Patrik Axelsson,
Fredrik Gustafsson
Abstract:
Stochastic dynamical systems are fundamental in state estimation, system identification and control. System models are often provided in continuous time, while a major part of the applied theory is developed for discrete-time systems. Discretization of continuous-time models is hence fundamental. We present a novel algorithm using a combination of Lyapunov equations and analytical solutions, enabling efficient implementation in software. The proposed method circumvents numerical problems exhibited by standard algorithms in the literature. Both theoretical and simulation results are provided.
Submitted 6 February, 2014;
originally announced February 2014.
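The abstract above does not spell out its algorithm, so the following is only a sketch of one way a Lyapunov equation can enter the discretization, assuming a linear model dx = A x dt + dw with process noise spectral density Q and sampling period T. The discrete-time noise covariance Qd, the integral of exp(A s) Q exp(A s)^T over s from 0 to T, satisfies A Qd + Qd A^T = F Q F^T - Q with F = exp(A T). The sketch solves that equation and compares the result with a Van Loan style block matrix exponential as a reference. All function names and the example system are illustrative assumptions, and the Lyapunov solve requires that no two eigenvalues of A sum to zero.

```python
import numpy as np


def expm(M, terms=20, squarings=10):
    """Matrix exponential via a scaled Taylor series followed by repeated squaring."""
    scaled = M / (2 ** squarings)
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result


def solve_lyapunov(A, C):
    """Solve A X + X A^T = C for X using the Kronecker (vectorised) form."""
    n = A.shape[0]
    eye = np.eye(n)
    K = np.kron(A, eye) + np.kron(eye, A)
    return np.linalg.solve(K, C.reshape(-1)).reshape(n, n)


def discretize_lyapunov(A, Q, T):
    """Discrete-time transition matrix F and noise covariance Qd via a Lyapunov equation."""
    F = expm(A * T)
    Qd = solve_lyapunov(A, F @ Q @ F.T - Q)
    return F, Qd


def discretize_van_loan(A, Q, T):
    """Reference computation using one exponential of a 2n x 2n block matrix."""
    n = A.shape[0]
    M = np.block([[-A, Q], [np.zeros((n, n)), A.T]]) * T
    E = expm(M)
    F = E[n:, n:].T
    Qd = F @ E[:n, n:]
    return F, Qd


# Hypothetical stable second-order system with noise entering the second state.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
Q = np.array([[0.0, 0.0], [0.0, 1.0]])
T = 0.1
F_lyap, Qd_lyap = discretize_lyapunov(A, Q, T)
F_vl, Qd_vl = discretize_van_loan(A, Q, T)
```

Both routes should return the same covariance up to rounding error; the Lyapunov route needs only the n x n matrix exponential rather than a 2n x 2n one.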