-
Re-Envisioning Numerical Information Field Theory (NIFTy.re): A Library for Gaussian Processes and Variational Inference
Authors:
Gordian Edenhofer,
Philipp Frank,
Jakob Roth,
Reimar H. Leike,
Massin Guerdi,
Lukas I. Scheel-Platz,
Matteo Guardiani,
Vincent Eberle,
Margret Westerkamp,
Torsten A. Enßlin
Abstract:
Imaging is the process of transforming noisy, incomplete data into a space that humans can interpret. NIFTy is a Bayesian framework for imaging and has already successfully been applied to many fields in astrophysics. Previous design decisions held the performance and the development of methods in NIFTy back. We present a rewrite of NIFTy, coined NIFTy.re, which reworks the modeling principle, extends the inference strategies, and outsources much of the heavy lifting to JAX. The rewrite dramatically accelerates models written in NIFTy, lays the foundation for new types of inference machineries, improves maintainability, and enables interoperability between NIFTy and the JAX machine learning ecosystem.
Submitted 15 June, 2024; v1 submitted 26 February, 2024;
originally announced February 2024.
-
Attention to Entropic Communication
Authors:
Torsten Enßlin,
Carolin Weidinger,
Philipp Frank
Abstract:
The concept of attention, numerical weights that emphasize the importance of particular data, has proven to be very relevant in artificial intelligence. Relative entropy (RE, aka Kullback-Leibler divergence) plays a central role in communication theory. Here we combine these concepts, attention and RE. RE guides optimal encoding of messages in bandwidth-limited communication as well as optimal message decoding via the maximum entropy principle (MEP). In the coding scenario, RE can be derived from four requirements, namely being analytical, local, proper, and calibrated. Weighted RE, used for attention steering in communications, turns out to be improper. To see how proper attention communication can emerge, we analyze a scenario of a message sender who wants to ensure that the receiver of the message can perform well-informed actions. If the receiver decodes the message using the MEP, the sender only needs to know the receiver's utility function to inform optimally, but not the receiver's initial knowledge state. In case only the curvature of the utility function maxima is known, it becomes desirable to accurately communicate an attention function, in this case a probability function weighted by this curvature and re-normalized. Entropic attention communication is here proposed as the desired generalization of entropic communication that permits weighting while being proper, thereby aiding the design of optimal communication protocols in technical applications and helping to understand human communication. For example, our analysis shows how to derive the level of cooperation expected under misaligned interests of otherwise honest communication partners.
Submitted 9 January, 2024; v1 submitted 21 July, 2023;
originally announced July 2023.
-
Sparse Kernel Gaussian Processes through Iterative Charted Refinement (ICR)
Authors:
Gordian Edenhofer,
Reimar H. Leike,
Philipp Frank,
Torsten A. Enßlin
Abstract:
Gaussian Processes (GPs) are highly expressive, probabilistic models. A major limitation is their computational complexity. Naively, exact GP inference requires $\mathcal{O}(N^3)$ computations with $N$ denoting the number of modeled points. Current approaches to overcome this limitation either rely on sparse, structured or stochastic representations of data or kernel respectively and usually involve nested optimizations to evaluate a GP. We present a new, generative method named Iterative Charted Refinement (ICR) to model GPs on nearly arbitrarily spaced points in $\mathcal{O}(N)$ time for decaying kernels without nested optimizations. ICR represents long- as well as short-range correlations by combining views of the modeled locations at varying resolutions with a user-provided coordinate chart. In our experiment with points whose spacings vary over two orders of magnitude, ICR's accuracy is comparable to state-of-the-art GP methods. ICR outperforms existing methods in terms of computational speed by one order of magnitude on the CPU and GPU and has already been successfully applied to model a GP with $122$ billion parameters.
Submitted 21 June, 2022;
originally announced June 2022.
-
The Galactic 3D large-scale dust distribution via Gaussian process regression on spherical coordinates
Authors:
R. H. Leike,
G. Edenhofer,
J. Knollmüller,
C. Alig,
P. Frank,
T. A. Enßlin
Abstract:
Knowing the Galactic 3D dust distribution is relevant for understanding many processes in the interstellar medium and for correcting many astronomical observations for dust absorption and emission. Here, we aim for a 3D reconstruction of the Galactic dust distribution with an increase in the number of meaningful resolution elements by orders of magnitude with respect to previous reconstructions, while taking advantage of the dust's spatial correlations to inform the dust map. We use iterative grid refinement to define a log-normal process in spherical coordinates. This log-normal process assumes a fixed correlation structure, which was inferred in an earlier reconstruction of Galactic dust. Our map is informed through 111 million data points, combining data of PANSTARRS, 2MASS, Gaia DR2 and ALLWISE. The log-normal process is discretized to 122 billion degrees of freedom, a factor of 400 more than our previous map. We derive the most probable posterior map and an uncertainty estimate using natural gradient descent and the Fisher-Laplace approximation. The dust reconstruction covers a quarter of the volume of our Galaxy, with a maximum coordinate distance of $16\,\text{kpc}$, and meaningful information can be found at distances of up to $4\,\text{kpc}$, still improving upon our earlier map by a factor of 5 in maximal distance, of $900$ in volume, and of about eighteen in angular grid resolution. Unfortunately, the maximum posterior approach chosen to make the reconstruction computationally affordable introduces artifacts and reduces the accuracy of our uncertainty estimate. Despite the apparent limitations of the presented 3D dust map, a good part of the reconstructed structures are confirmed by independent maser observations. Thus, the map is a step towards reliable 3D Galactic cartography and can already serve for a number of tasks, if used with care.
Submitted 25 April, 2022;
originally announced April 2022.
-
Information Field Theory and Artificial Intelligence
Authors:
Torsten Enßlin
Abstract:
Information field theory (IFT), the information theory for fields, is a mathematical framework for signal reconstruction and non-parametric inverse problems. Artificial intelligence (AI) and machine learning (ML) aim at generating intelligent systems, including systems for perception, cognition, and learning. This overlaps with IFT, which is designed to address perception, reasoning, and inference tasks. Here, the relation between concepts and tools in IFT and those in AI and ML research is discussed. In the context of IFT, fields denote physical quantities that change continuously as a function of space (and time) and information theory refers to Bayesian probabilistic logic equipped with the associated entropic information measures. Reconstructing a signal with IFT is a computational problem similar to training a generative neural network (GNN) in ML. In this paper, the process of inference in IFT is reformulated in terms of GNN training. In contrast to classical neural networks, IFT based GNNs can operate without pre-training thanks to incorporating expert knowledge into their architecture. Furthermore, the cross-fertilization of variational inference methods used in IFT and ML is discussed. These discussions suggest that IFT is well suited to address many problems in AI and ML research and application.
Submitted 7 March, 2022; v1 submitted 19 December, 2021;
originally announced December 2021.
-
Probabilistic Autoencoder using Fisher Information
Authors:
Johannes Zacherl,
Philipp Frank,
Torsten A. Enßlin
Abstract:
Neural Networks play a growing role in many science disciplines, including physics. Variational Autoencoders (VAEs) are neural networks that are able to represent the essential information of a high dimensional data set in a low dimensional latent space, which have a probabilistic interpretation. In particular the so-called encoder network, the first part of the VAE, which maps its input onto a position in latent space, additionally provides uncertainty information in terms of a variance around this position. In this work, an extension to the Autoencoder architecture is introduced, the FisherNet. In this architecture, the latent space uncertainty is not generated using an additional information channel in the encoder, but derived from the decoder, by means of the Fisher information metric. This architecture has advantages from a theoretical point of view as it provides a direct uncertainty quantification derived from the model, and also accounts for uncertainty cross-correlations. We can show experimentally that the FisherNet produces more accurate data reconstructions than a comparable VAE and its learning performance also apparently scales better with the number of latent space dimensions.
Submitted 7 December, 2021; v1 submitted 28 October, 2021;
originally announced October 2021.
-
A Reputation Game Simulation: Emergent Social Phenomena from Information Theory
Authors:
Torsten Enßlin,
Viktoria Kainz,
Céline Bœhm
Abstract:
Reputation is a central element of social communications, be it with human or artificial intelligence (AI), and as such can be the primary target of malicious communication strategies. There is already a vast amount of literature on trust networks addressing this issue and proposing ways to simulate these networks' dynamics using Bayesian principles and involving Theory of Mind models. The main issue for these simulations is usually the amount of information that can be stored, which is typically addressed by discretising variables and using hard thresholds. Here we propose a novel approach to the way information is updated that accounts for knowledge uncertainty and is closer to reality. In our game, agents use information compression techniques to capture their complex environment and store it in their finite memories. The loss of information that results from this leads to emergent phenomena, such as echo chambers, self-deception, deception symbiosis, and freezing of group opinions. Various malicious strategies of agents are studied for their impact on group sociology, like sycophancy, egocentricity, pathological lying, and aggressiveness. Even though our modeling could be made more complex, our set-up can already provide insights into social interactions and can be used to investigate the effects of various communication strategies and find ways to counteract malicious ones. Eventually this work should help to safeguard the design of non-abusive AI systems.
Submitted 3 February, 2022; v1 submitted 9 June, 2021;
originally announced June 2021.
-
Causal, Bayesian, & Non-parametric Modeling of the SARS-CoV-2 Viral Load Distribution vs. Patient's Age
Authors:
Matteo Guardiani,
Philipp Frank,
Andrija Kostić,
Gordian Edenhofer,
Jakob Roth,
Berit Uhlmann,
Torsten Enßlin
Abstract:
The viral load of patients infected with SARS-CoV-2 varies on logarithmic scales and possibly with age. Controversial claims have been made in the literature regarding whether the viral load distribution actually depends on the age of the patients. Such a dependence would have implications for the COVID-19 spreading mechanism, the age-dependent immune system reaction, and thus for policymaking. We hereby develop a method to analyze viral-load distribution data as a function of the patients' age within a flexible, non-parametric, hierarchical, Bayesian, and causal model. The causal nature of the developed reconstruction additionally allows testing for bias in the data. This could be due to, e.g., bias in patient-testing and data collection or systematic errors in the measurement of the viral load. We perform these tests by calculating the Bayesian evidence for each implied possible causal direction. The possibility of testing for bias in data collection and identifying causal directions can be very useful in other contexts as well. For this reason we make our model freely available. When applied to publicly available age and SARS-CoV-2 viral load data, we find a statistically significant increase in the viral load with age, but only for one of the two analyzed datasets. If we consider this dataset, and based on the current understanding of viral load's impact on patients' infectivity, we expect a non-negligible difference in the infectivity of different age groups. This difference is nonetheless too small to justify considering any age group as noninfectious.
Submitted 12 December, 2022; v1 submitted 27 May, 2021;
originally announced May 2021.
-
Geometric variational inference
Authors:
Philipp Frank,
Reimar Leike,
Torsten A. Enßlin
Abstract:
Efficiently accessing the information contained in non-linear and high dimensional probability distributions remains a core challenge in modern statistics. Traditionally, estimators that go beyond point estimates are either categorized as Variational Inference (VI) or Markov-Chain Monte-Carlo (MCMC) techniques. While MCMC methods that utilize the geometric properties of continuous probability distributions to increase their efficiency have been proposed, VI methods rarely use the geometry. This work aims to fill this gap and proposes geometric Variational Inference (geoVI), a method based on Riemannian geometry and the Fisher information metric. It is used to construct a coordinate transformation that relates the Riemannian manifold associated with the metric to Euclidean space. The distribution, expressed in the coordinate system induced by the transformation, takes a particularly simple form that allows for an accurate variational approximation by a normal distribution. Furthermore, the algorithmic structure allows for an efficient implementation of geoVI which is demonstrated on multiple examples, ranging from low-dimensional illustrative ones to non-linear, hierarchical Bayesian inverse problems in thousands of dimensions.
Submitted 2 July, 2021; v1 submitted 21 May, 2021;
originally announced May 2021.
-
Probabilistic simulation of partial differential equations
Authors:
Philipp Frank,
Torsten A. Enßlin
Abstract:
Computer simulations of differential equations require a time discretization, which prevents identifying the exact solution with certainty. Probabilistic simulations take this into account via uncertainty quantification. The construction of a probabilistic simulation scheme can be regarded as Bayesian filtering by means of probabilistic numerics. Gaussian prior based filters, specifically Gauss-Markov priors, have successfully been applied to simulation of ordinary differential equations (ODEs) and give rise to filtering problems that can be solved efficiently. This work extends this approach to partial differential equations (PDEs) subject to periodic boundary conditions and utilizes continuous Gaussian processes in space and time to arrive at a Bayesian filtering problem structurally similar to the ODE setting. The usage of a process that is Markov in time and statistically homogeneous in space leads to a probabilistic spectral simulation method that allows for an efficient realization. Furthermore, the Bayesian perspective allows the incorporation of methods developed within the context of information field theory, such as the estimation of the power spectrum associated with the prior distribution, which can be estimated jointly with the solution of the PDE.
Submitted 13 October, 2020;
originally announced October 2020.
-
Comparison of classical and Bayesian imaging in radio interferometry
Authors:
Philipp Arras,
Hertzog L. Bester,
Richard A. Perley,
Reimar Leike,
Oleg Smirnov,
Rüdiger Westermann,
Torsten A. Enßlin
Abstract:
CLEAN, the commonly employed imaging algorithm in radio interferometry, suffers from a number of shortcomings: in its basic version it does not have the concept of diffuse flux, and the common practice of convolving the CLEAN components with the CLEAN beam erases the potential for super-resolution; it does not output uncertainty information; it produces images with unphysical negative flux regions; and its results are highly dependent on the so-called weighting scheme as well as on any human choice of CLEAN masks to guide the imaging. Here, we present the Bayesian imaging algorithm resolve, which solves the above problems and naturally leads to super-resolution. We take a VLA observation of Cygnus A at four different frequencies and image it with single-scale CLEAN, multi-scale CLEAN and resolve. Alongside the sky brightness distribution, resolve estimates a baseline-dependent correction function for the noise budget, the Bayesian equivalent of weighting schemes. We report noise correction factors between 0.4 and 429. The enhancements achieved by resolve come at the cost of higher computational effort.
Submitted 25 January, 2021; v1 submitted 26 August, 2020;
originally announced August 2020.
-
Bayesian Reasoning with Trained Neural Networks
Authors:
Jakob Knollmüller,
Torsten Enßlin
Abstract:
We showed how to use trained neural networks to perform Bayesian reasoning in order to solve tasks outside their initial scope. Deep generative models provide prior knowledge, and classification/regression networks impose constraints. The tasks at hand were formulated as Bayesian inference problems, which we approximately solved through variational or sampling techniques. The approach built on top of already trained networks, and the addressable questions grew super-exponentially with the number of available networks. In its simplest form, the approach yielded conditional generative models. However, multiple simultaneous constraints constitute elaborate questions. We compared the approach to specifically trained generators, showed how to solve riddles, and demonstrated its compatibility with state-of-the-art architectures.
Submitted 1 June, 2021; v1 submitted 29 January, 2020;
originally announced January 2020.
-
Field dynamics inference for local and causal interactions
Authors:
Philipp Frank,
Reimar Leike,
Torsten A. Enßlin
Abstract:
Inference of fields defined in space and time from observational data is a core discipline in many scientific areas. This work approaches the problem in a Bayesian framework. The proposed method is based on statistically homogeneous random fields defined in space and time and demonstrates how to reconstruct the field together with its prior correlation structure from data. The prior model of the correlation structure is described in a non-parametric fashion and solely builds on fundamental physical assumptions such as space-time homogeneity, locality, and causality. These assumptions are sufficient to successfully infer the field and its prior correlation structure from noisy and incomplete data of a single realization of the process as demonstrated via multiple numerical examples.
Submitted 4 May, 2021; v1 submitted 5 February, 2019;
originally announced February 2019.
-
Metric Gaussian Variational Inference
Authors:
Jakob Knollmüller,
Torsten A. Enßlin
Abstract:
Solving Bayesian inference problems approximately with variational approaches can provide fast and accurate results. Capturing correlation within the approximation requires an explicit parametrization. This intrinsically either limits this approach to moderately dimensional problems or requires the strongly simplifying mean-field approach. We propose Metric Gaussian Variational Inference (MGVI) as a method that goes beyond mean-field. Here correlations between all model parameters are taken into account, while still scaling linearly in computational time and memory. With this method we achieve higher accuracy and in many cases a significant speedup compared to traditional methods. MGVI is an iterative method that performs a series of Gaussian approximations to the posterior. We alternate between approximating the covariance with the inverse Fisher information metric evaluated at an intermediate mean estimate and optimizing the KL-divergence for the given covariance with respect to the mean. This procedure is iterated until the uncertainty estimate is self-consistent with the mean parameter. We achieve linear scaling by never storing the covariance explicitly. Instead we draw samples from the approximating distribution relying on an implicit representation and numerical schemes to approximately solve linear equations. Those samples are used to approximate the KL-divergence and its gradient. The usage of natural gradient descent allows for rapid convergence. Formulating the Bayesian model in standardized coordinates makes MGVI applicable to any inference problem with continuous parameters. We demonstrate the high accuracy of MGVI by comparing it to HMC and its fast convergence relative to other established methods in several examples. We investigate real-data applications, as well as synthetic examples of varying size and complexity and up to a million model parameters.
Submitted 30 January, 2020; v1 submitted 30 January, 2019;
originally announced January 2019.
-
A Bayesian Model for Bivariate Causal Inference
Authors:
Maximilian Kurthen,
Torsten A. Enßlin
Abstract:
We address the problem of two-variable causal inference without intervention. This task is to infer an existing causal relation between two random variables, i.e. $X \rightarrow Y$ or $Y \rightarrow X$, from purely observational data. As the option to modify a potential cause is not given in many situations, only structural properties of the data can be used to solve this ill-posed problem. We briefly review a number of state-of-the-art methods for this, including very recent ones. A novel inference method is introduced, Bayesian Causal Inference (BCI), which assumes a generative Bayesian hierarchical model to pursue the strategy of Bayesian model selection. In the adopted model the distribution of the cause variable is given by a Poisson lognormal distribution, which makes it possible to explicitly regard the discrete nature of datasets, correlations in the parameter spaces, as well as the variance of probability densities on logarithmic scales. We assume Fourier diagonal field covariance operators. The model itself is restricted to use cases where a direct causal relation $X \rightarrow Y$ has to be decided against a relation $Y \rightarrow X$; therefore, we compare it to other methods for this exact problem setting. The generative model assumed provides synthetic causal data for benchmarking our model in comparison to existing state-of-the-art models, namely LiNGAM, ANM-HSIC, ANM-MML, IGCI and CGNN. We explore how well the above methods perform in case of high noise settings, strongly discretized data and very sparse data. BCI generally performs reliably with synthetic data as well as with the real world TCEP benchmark set, with an accuracy comparable to state-of-the-art algorithms. We discuss directions for the future development of BCI.
Submitted 5 January, 2020; v1 submitted 24 December, 2018;
originally announced December 2018.
-
Bayesian parameter estimation of miss-specified models
Authors:
Johannes Oberpriller,
T. A. Enßlin
Abstract:
Fitting a simplifying model with several parameters to real data of complex objects is a highly nontrivial task, but makes it possible to gain insights into the objects' physics. Here, we present a method to infer the parameters of the model, the model error as well as the statistics of the model error. This method relies on the usage of many data sets in a simultaneous analysis in order to overcome the problems caused by the degeneracy between model parameters and model error. Errors in the modeling of the measurement instrument can be absorbed in the model error, allowing for applications with complex instruments.
Submitted 19 December, 2018;
originally announced December 2018.
-
Encoding prior knowledge in the structure of the likelihood
Authors:
Jakob Knollmüller,
Torsten A. Enßlin
Abstract:
The inference of deep hierarchical models is problematic due to strong dependencies between the hierarchies. We investigate a specific transformation of the model parameters based on the multivariate distributional transform. This transformation is a special form of the reparametrization trick, flattens the hierarchy and leads to a standard Gaussian prior on all resulting parameters. The transformation also transfers all the prior information into the structure of the likelihood, thereby decoupling the transformed parameters a priori from each other. A variational Gaussian approximation in this standardized space will be excellent in situations of relatively uninformative data. Additionally, the curvature of the log-posterior is well-conditioned in directions that are weakly constrained by the data, allowing for fast inference in such a scenario. In an example we perform the transformation explicitly for Gaussian process regression with a priori unknown correlation structure. Deep models are inferred rapidly in highly and slowly in poorly informed situations. The flat model shows exactly the opposite performance pattern. A synthesis of both, the deep and the flat perspective, provides their combined advantages and overcomes the individual limitations, leading to a faster inference.
△ Less
Submitted 11 December, 2018;
originally announced December 2018.
-
Separating diffuse from point-like sources - a Bayesian approach
Authors:
Jakob Knollmüller,
Philipp Frank,
Torsten A. Enßlin
Abstract:
We present the starblade algorithm, a method to separate superimposed point sources from auto-correlated, diffuse flux using a Bayesian model. Point sources are assumed to be independent from each other and to follow a power-law brightness distribution. The diffuse emission is described as a non-parametric log-normal model with a priori unknown correlation structure. This model enforces positivity of the underlying emission and allows for variations over orders of magnitude. The correlation structure is recovered non-parametrically in addition to the diffuse flux and is used for the separation of the point sources. Additionally, many measurement artifacts appear as point-like or quasi-point-like effects, not compatible with superimposed diffuse emission. An estimate of the separation uncertainty can be provided as well. We demonstrate the capabilities of the derived method on synthetic data and data obtained by the Hubble Space Telescope, emphasizing its effect on instrumental artifacts as well as physical sources. The performance of this method is compared to the background estimation of the SExtractor method, as well as to a denoising auto-encoder.
Submitted 6 August, 2018; v1 submitted 16 April, 2018;
originally announced April 2018.
-
The rationality of irrationality in the Monty Hall problem
Authors:
Torsten Enßlin,
Margret Westerkamp
Abstract:
The rational solution of the Monty Hall problem unsettles many people. Most people, including the authors, think it feels wrong to switch the initial choice of one of the three doors, despite having fully accepted the mathematical proof for its superiority. Many people, if given the choice to switch, think the chances are fifty-fifty between their options, but still strongly prefer to stay with their initial choice. Is there some sense behind these irrational feelings?
We entertain the possibility that intuition solves the problem of how to behave in a real game show, not in the abstract textbook version of the Monty Hall problem. A real showmaster sometimes plays evil, either to make the show more interesting, to save money, or because he is in a bad mood. A moody showmaster erases any information advantage the guest could extract by him opening other doors which drives the chance of the car being behind the chosen door towards fifty percent. Furthermore, the showmaster could try to read or manipulate the guest's strategy to the guest's disadvantage. Given this, the preference to stay with the initial choice turns out to be a very rational defense strategy of the show's guest against the threat of being manipulated by its host. Thus, the intuitive feelings most people have about the Monty Hall problem coincide with what would be a rational strategy for a real-world game show. Although these investigations are mainly intended to be an entertaining mathematical commentary on an information-theoretic puzzle, they touch on interesting psychological questions.
Submitted 22 October, 2018; v1 submitted 10 April, 2018;
originally announced April 2018.
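The textbook result referred to in the abstract above, that switching wins with probability 2/3, can be checked by exact enumeration. The following is a minimal sketch of the standard game only, assuming a host who always opens a goat door and always offers the switch. It does not model the moody or manipulative showmaster the paper discusses, and the function name `win_probabilities` is chosen here purely for illustration.

```python
from fractions import Fraction
from itertools import product


def win_probabilities():
    """Exact win probabilities (stay, switch) in the textbook Monty Hall game."""
    doors = (0, 1, 2)
    stay = Fraction(0)
    switch = Fraction(0)
    # Car position and the guest's first pick are independent and uniform.
    for car, pick in product(doors, doors):
        weight = Fraction(1, 9)
        # Assumption: the host opens a goat door that is not the guest's pick,
        # choosing uniformly when two such doors are available.
        options = [d for d in doors if d != pick and d != car]
        for opened in options:
            w = weight / len(options)
            # The door the guest would end up with after switching.
            switched = next(d for d in doors if d not in (pick, opened))
            stay += w * (pick == car)
            switch += w * (switched == car)
    return stay, switch


stay, switch = win_probabilities()
print(stay, switch)  # prints: 1/3 2/3
```

Using exact fractions rather than random sampling keeps the check deterministic; a different host behaviour would be modelled by changing how `options` and the weights are built.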
-
Radio Imaging With Information Field Theory
Authors:
Philipp Arras,
Jakob Knollmüller,
Henrik Junklewitz,
Torsten A. Enßlin
Abstract:
Data from radio interferometers pose a substantial challenge for statisticians. They are incomplete, noise-dominated, and originate from a non-trivial measurement process. The signal is corrupted not only by imperfect measurement devices but also by effects like fluctuations in the ionosphere that act as a distortion screen. In this paper we focus on the imaging part of data reduction in radio astronomy and present RESOLVE, a Bayesian imaging algorithm for radio interferometry in its new incarnation. It is formulated in the language of information field theory. Solely through algorithmic advances, the inference could be sped up significantly and now behaves noticeably more stably. This is one more step towards a fully user-friendly version of RESOLVE which can be applied routinely by astronomers.
Submitted 6 March, 2018;
originally announced March 2018.
-
Inference of signals with unknown correlation structure from nonlinear measurements
Authors:
Jakob Knollmüller,
Theo Steininger,
Torsten A. Enßlin
Abstract:
We present a method to reconstruct autocorrelated signals together with their autocorrelation structure from nonlinear, noisy measurements for arbitrary monotonic nonlinear instrument responses. In the presented formulation the algorithm provides a significant speedup compared to prior implementations, allowing for a wider range of applications. The nonlinearity can be used to model instrument characteristics or to enforce properties on the underlying signal, such as positivity. Uncertainties on any posterior quantities can be provided thanks to independent samples from an approximate posterior distribution. We demonstrate the method's applicability via simulated and real measurements, using different measurement instruments, nonlinearities, and dimensionalities.
Submitted 13 February, 2018; v1 submitted 8 November, 2017;
originally announced November 2017.
-
Towards information optimal simulation of partial differential equations
Authors:
Reimar H. Leike,
Torsten A. Enßlin
Abstract:
Most simulation schemes for partial differential equations (PDEs) focus on minimizing a simple error norm of a discretized version of a field. This paper takes a fundamentally different approach; the discretized field is interpreted as data providing information about a real physical field that is unknown. This information is sought to be conserved by the scheme as the field evolves in time. Such an information theoretic approach to simulation was pursued before by information field dynamics (IFD). In this paper we work out the theory of IFD for nonlinear PDEs in a noiseless Gaussian approximation. The result is an action that can be minimized to obtain an informationally optimal simulation scheme. It can be brought into a closed form using field operators to calculate the appearing Gaussian integrals. The resulting simulation schemes are tested numerically in two instances for the Burgers equation. Their accuracy surpasses finite-difference schemes on the same resolution. The IFD scheme, however, has to be correctly informed on the subgrid correlation structure. In certain limiting cases we recover well-known simulation schemes like spectral Fourier Galerkin methods. We discuss implications of the approximations made.
Submitted 11 December, 2017; v1 submitted 8 September, 2017;
originally announced September 2017.
-
Field dynamics inference via spectral density estimation
Authors:
Philipp Frank,
Theo Steininger,
Torsten A. Enßlin
Abstract:
Stochastic differential equations (SDEs) are of utmost importance in various scientific and industrial areas. They are the natural description of dynamical processes whose precise equations of motion are either not known or too expensive to solve, e.g., when modeling Brownian motion. In some cases, the equations governing the dynamics of a physical system on macroscopic scales turn out to be unknown since they typically cannot be deduced from general principles. In this work, we describe how the underlying laws of a stochastic process can be approximated by the spectral density of the corresponding process. Furthermore, we show how the density can be inferred from possibly very noisy and incomplete measurements of the dynamical field. Generally, inverse problems like these can be tackled with the help of Information Field Theory (IFT). For now, we restrict ourselves to linear and autonomous processes. This limitation is not conceptual, however, and may be lifted in future work. To demonstrate its applicability, we apply our reconstruction algorithm to a time series and to spatio-temporal processes.
Submitted 17 August, 2017;
originally announced August 2017.
-
Noisy independent component analysis of auto-correlated components
Authors:
Jakob Knollmüller,
Torsten A. Enßlin
Abstract:
We present a new method for the separation of superimposed, independent, auto-correlated components from noisy multi-channel measurement. The presented method simultaneously reconstructs and separates the components, taking all channels into account and thereby increases the effective signal-to-noise ratio considerably, allowing separations even in the high noise regime. Characteristics of the measurement instruments can be included, allowing for application in complex measurement situations. Independent posterior samples can be provided, permitting error estimates on all desired quantities. Using the concept of information field theory, the algorithm is not restricted to any dimensionality of the underlying space or discretization scheme thereof.
Submitted 4 August, 2017; v1 submitted 5 May, 2017;
originally announced May 2017.
-
Correlated signal inference by free energy exploration
Authors:
Torsten A. Enßlin,
Jakob Knollmüller
Abstract:
The inference of correlated signal fields with unknown correlation structures is of high scientific and technological relevance, but poses significant conceptual and numerical challenges. To address these, we develop the correlated signal inference (CSI) algorithm within information field theory (IFT) and discuss its numerical implementation. To this end, we introduce the free energy exploration (FrEE) strategy for numerical information field theory (NIFTy) applications. The FrEE strategy is to let the mathematical structure of the inference problem determine the dynamics of the numerical solver. FrEE uses the Gibbs free energy formalism for all involved unknown fields and correlation structures without marginalization of nuisance quantities. It thereby avoids the complexity that marginalization often imposes on IFT equations. FrEE simultaneously solves for the mean and the uncertainties of signal, nuisance, and auxiliary fields, while exploiting any analytically calculable quantity. Finally, FrEE uses a problem-specific and self-tuning exploration strategy to swiftly identify the optimal field estimates as well as their uncertainty maps. For all estimated fields, properly weighted posterior samples drawn from their exact, fully non-Gaussian distributions can be generated. Here, we develop the FrEE strategies for the CSI of a normal, a log-normal, and a Poisson log-normal IFT signal inference problem and demonstrate their performance via their NIFTy implementations.
Submitted 13 February, 2017; v1 submitted 26 December, 2016;
originally announced December 2016.
-
Operator Calculus for Information Field Theory
Authors:
Reimar H. Leike,
Torsten A. Enßlin
Abstract:
Signal inference problems with non-Gaussian posteriors can be hard to tackle. By using the concept of Gibbs free energy, these posteriors are rephrased as Gaussian posteriors at the price of computing various expectation values with respect to a Gaussian distribution. We present a new way of translating these expectation values into a language of operators which is similar to that in quantum mechanics. This simplifies many calculations, for instance those involving log-normal priors. The operator calculus is illustrated by deriving a novel self-calibrating algorithm which is tested with mock data.
Submitted 21 October, 2016; v1 submitted 2 May, 2016;
originally announced May 2016.
-
Stochastic determination of matrix determinants
Authors:
Sebastian Dorn,
Torsten A. Enßlin
Abstract:
Matrix determinants play an important role in data analysis, in particular when Gaussian processes are involved. Due to currently exploding data volumes, linear operations - matrices - acting on the data are often not accessible directly but are only represented indirectly in form of a computer routine. Such a routine implements the transformation a data vector undergoes under matrix multiplication. While efficient probing routines to estimate a matrix's diagonal or trace, based solely on such computationally affordable matrix-vector multiplications, are well known and frequently used in signal inference, there is no stochastic estimate for its determinant. We introduce a probing method for the logarithm of a determinant of a linear operator. Our method rests upon a reformulation of the log-determinant by an integral representation and the transformation of the involved terms into stochastic expressions. This stochastic determinant determination enables large-size applications in Bayesian inference, in particular evidence calculations, model comparison, and posterior determination.
Submitted 7 July, 2015; v1 submitted 10 April, 2015;
originally announced April 2015.
-
Signal inference with unknown response: Calibration-uncertainty renormalized estimator
Authors:
Sebastian Dorn,
Torsten A. Enßlin,
Maksim Greiner,
Marco Selig,
Vanessa Boehm
Abstract:
The calibration of a measurement device is crucial for every scientific experiment in which a signal has to be inferred from data. We present CURE, the calibration uncertainty renormalized estimator, to reconstruct a signal and simultaneously the instrument's calibration from the same data without knowing the exact calibration, but only its covariance structure. The idea of CURE, developed in the framework of information field theory, is to start with an assumed calibration, to successively include more and more portions of calibration uncertainty into the signal inference equations, and to absorb the resulting corrections into renormalized signal (and calibration) solutions. Thereby, the signal inference and calibration problem turns into solving a single system of ordinary differential equations and can be identified with common resummation techniques used in field theories. We verify CURE by applying it to a simplistic toy example and compare it against existing self-calibration schemes, Wiener filter solutions, and Markov Chain Monte Carlo sampling. We conclude that the method is able to keep up in accuracy with the best self-calibration methods and serves as a non-iterative alternative to them.
Submitted 2 March, 2015; v1 submitted 23 October, 2014;
originally announced October 2014.
-
Astrophysical data analysis with information field theory
Authors:
Torsten Enßlin
Abstract:
Non-parametric imaging and data analysis in astrophysics and cosmology can be addressed by information field theory (IFT), a means of Bayesian, data-based inference on spatially distributed signal fields. IFT is a statistical field theory, which permits the construction of optimal signal recovery algorithms. It exploits spatial correlations of the signal fields even for nonlinear and non-Gaussian signal inference problems. The alleviation of a perception threshold for recovering signals of unknown correlation structure by using IFT will be discussed in particular, as well as a novel improvement on instrumental self-calibration schemes. IFT can be applied to many areas. Here, applications in cosmology (cosmic microwave background, large-scale structure) and astrophysics (galactic magnetism, radio interferometry) are presented.
Submitted 29 May, 2014;
originally announced May 2014.
-
Improving self-calibration
Authors:
Torsten A. Enßlin,
Henrik Junklewitz,
Lars Winderling,
Maksim Greiner,
Marco Selig
Abstract:
Response calibration is the process of inferring how much the measured data depend on the signal one is interested in. It is essential for any quantitative signal estimation on the basis of the data. Here, we investigate self-calibration methods for linear signal measurements and linear dependence of the response on the calibration parameters. The common practice is to augment an external calibration solution using a known reference signal with an internal calibration on the unknown measurement signal itself. Contemporary self-calibration schemes try to find a self-consistent solution for signal and calibration by exploiting redundancies in the measurements. This can be understood in terms of maximizing the joint probability of signal and calibration. However, the full uncertainty structure of this joint probability around its maximum is not taken into account by these schemes. Therefore, better schemes, in the sense of minimal square error, can be designed by accounting for asymmetries in the uncertainty of signal and calibration. We argue that at least a systematic correction of the common self-calibration scheme should be applied in many measurement situations in order to properly treat uncertainties of the signal on which one calibrates. Otherwise, the calibration solutions suffer from a systematic bias, which consequently distorts the signal reconstruction. Furthermore, we argue that non-parametric, signal-to-noise filtered calibration should provide more accurate reconstructions than the common bin averages, and we provide a new, improved self-calibration scheme. We illustrate our findings with a simplistic numerical example.
Submitted 6 September, 2014; v1 submitted 4 December, 2013;
originally announced December 2013.
-
D$^3$PO - Denoising, Deconvolving, and Decomposing Photon Observations
Authors:
Marco Selig,
Torsten Enßlin
Abstract:
The analysis of astronomical images is a non-trivial task. The D3PO algorithm addresses the inference problem of denoising, deconvolving, and decomposing photon observations. Its primary goal is the simultaneous but individual reconstruction of the diffuse and point-like photon flux given a single photon count image, where the fluxes are superimposed. In order to discriminate between these morphologically different signal components, a probabilistic algorithm is derived in the language of information field theory based on a hierarchical Bayesian parameter model. The signal inference exploits prior information on the spatial correlation structure of the diffuse component and the brightness distribution of the spatially uncorrelated point-like sources. A maximum a posteriori solution and a solution minimizing the Gibbs free energy of the inference problem using variational Bayesian methods are discussed. Since the derivation of the solution is not dependent on the underlying position space, the implementation of the D3PO algorithm uses the NIFTY package to ensure applicability to various spatial grids and at any resolution. The fidelity of the algorithm is validated by the analysis of simulated data, including a realistic high energy photon count image showing a 32 x 32 arcmin^2 observation with a spatial resolution of 0.1 arcmin. In all tests the D3PO algorithm successfully denoised, deconvolved, and decomposed the data into a diffuse and a point-like signal estimate for the respective photon flux components.
Submitted 29 January, 2015; v1 submitted 8 November, 2013;
originally announced November 2013.
-
NIFTY - Numerical Information Field Theory - a versatile Python library for signal inference
Authors:
Marco Selig,
Michael R. Bell,
Henrik Junklewitz,
Niels Oppermann,
Martin Reinecke,
Maksim Greiner,
Carlos Pachajoa,
Torsten A. Enßlin
Abstract:
NIFTY, "Numerical Information Field Theory", is a software package designed to enable the development of signal inference algorithms that operate regardless of the underlying spatial grid and its resolution. Its object-oriented framework is written in Python, although it accesses libraries written in Cython, C++, and C for efficiency. NIFTY offers a toolkit that abstracts discretized representations of continuous spaces, fields in these spaces, and operators acting on fields into classes. Thereby, the correct normalization of operations on fields is taken care of automatically without concerning the user. This allows for an abstract formulation and programming of inference algorithms, including those derived within information field theory. Thus, NIFTY permits its user to rapidly prototype algorithms in 1D, and then apply the developed code in higher-dimensional settings of real world problems. The set of spaces on which NIFTY operates comprises point sets, n-dimensional regular grids, spherical spaces, their harmonic counterparts, and product spaces constructed as combinations of those. The functionality and diversity of the package is demonstrated by a Wiener filter code example that successfully runs without modification regardless of the space on which the inference problem is defined.
Submitted 5 June, 2013; v1 submitted 18 January, 2013;
originally announced January 2013.
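The abstract above mentions a Wiener filter code example. As a rough illustration of what such a filter computes, the sketch below handles the simplest case, in which signal covariance, noise covariance, and response are all diagonal, so that the posterior mean m = D j with D = (S^-1 + R^T N^-1 R)^-1 and j = R^T N^-1 d reduces to a per-pixel formula. This is plain Python written for this listing, not NIFTY code; the function name `wiener_filter_diagonal` and its arguments are assumptions made for illustration.

```python
def wiener_filter_diagonal(data, signal_var, noise_var, response=None):
    """Per-pixel Wiener filter for diagonal S, N and R.

    Returns the posterior mean and posterior variance for each pixel.
    """
    if response is None:
        # Assumption: a unit response, i.e. the data measure the signal directly.
        response = [1.0] * len(data)
    means, variances = [], []
    for d, s, n, r in zip(data, signal_var, noise_var, response):
        # Information source j = R^T N^-1 d, here a scalar per pixel.
        j = r * d / n
        # Posterior variance D = (S^-1 + R^T N^-1 R)^-1.
        post_var = 1.0 / (1.0 / s + r * r / n)
        means.append(post_var * j)
        variances.append(post_var)
    return means, variances


means, variances = wiener_filter_diagonal(
    data=[2.0, 5.0], signal_var=[1.0, 4.0], noise_var=[1.0, 1.0]
)
print(means, variances)
```

With a unit response the mean reduces to d * S / (S + N): pixels whose prior variance is large compared to the noise follow the data closely, while noise-dominated pixels are pulled towards zero.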
-
Information field theory
Authors:
Torsten Enßlin
Abstract:
Non-linear image reconstruction and signal analysis deal with complex inverse problems. To tackle such problems in a systematic way, I present information field theory (IFT) as a means of Bayesian, data based inference on spatially distributed signal fields. IFT is a statistical field theory, which permits the construction of optimal signal recovery algorithms even for non-linear and non-Gaussian signal inference problems. IFT algorithms exploit spatial correlations of the signal fields and benefit from techniques developed to investigate quantum and statistical field theories, such as Feynman diagrams, re-normalisation calculations, and thermodynamic potentials. The theory can be used in many areas, and applications in cosmology and numerics are presented.
Submitted 11 January, 2013;
originally announced January 2013.
-
Reconstruction of Gaussian and log-normal fields with spectral smoothness
Authors:
Niels Oppermann,
Marco Selig,
Michael R. Bell,
Torsten A. Enßlin
Abstract:
We develop a method to infer log-normal random fields from measurement data affected by Gaussian noise. The log-normal model is well suited to describe strictly positive signals with fluctuations whose amplitude varies over several orders of magnitude. We use the formalism of minimum Gibbs free energy to derive an algorithm that uses the signal's correlation structure to regularize the reconstruction. The correlation structure, described by the signal's power spectrum, is thereby reconstructed from the same data set. We show that the minimization of the Gibbs free energy, corresponding to a Gaussian approximation to the posterior marginalized over the power spectrum, is equivalent to the empirical Bayes ansatz, in which the power spectrum is fixed to its maximum a posteriori value. We further introduce a prior for the power spectrum that enforces spectral smoothness. The appropriateness of this prior in different scenarios is discussed and its effects on the reconstruction's results are demonstrated. We validate the performance of our reconstruction algorithm in a series of one- and two-dimensional test cases with varying degrees of non-linearity and different noise levels.
Submitted 13 March, 2013; v1 submitted 25 October, 2012;
originally announced October 2012.
-
Reply to "Comment on `Inference with minimal Gibbs free energy in information field theory'" by Iatsenko, Stefanovska and McClintock
Authors:
Torsten A. Enßlin,
Cornelius Weig
Abstract:
We endorse the comment on our recent paper [Enßlin and Weig, Phys. Rev. E 82, 051112 (2010)] by Iatsenko, Stefanovska and McClintock [Phys. Rev. E 85 033101 (2012)] and we try to clarify the origin of the apparent controversy on two issues. The aim of the minimal Gibbs free energy approach to provide a signal estimate is not affected by their Comment. However, if one wants to extend the method to also infer the a posteriori signal uncertainty any tempering of the posterior has to be undone at the end of the calculations, as they correctly point out. Furthermore, a distinction is made here between maximum entropy, the maximum entropy principle, and the so-called maximum entropy method in imaging, hopefully clarifying further the second issue of their Comment paper.
Submitted 20 March, 2012;
originally announced March 2012.
-
Improving stochastic estimates with inference methods: calculating matrix diagonals
Authors:
Marco Selig,
Niels Oppermann,
Torsten A. Enßlin
Abstract:
Estimating the diagonal entries of a matrix, that is not directly accessible but only available as a linear operator in the form of a computer routine, is a common necessity in many computational applications, especially in image reconstruction and statistical inference. Here, methods of statistical inference are used to improve the accuracy or the computational costs of matrix probing methods to estimate matrix diagonals. In particular, the generalized Wiener filter methodology, as developed within information field theory, is shown to significantly improve estimates based on only a few sampling probes, in cases in which some form of continuity of the solution can be assumed. The strength, length scale, and precise functional form of the exploited autocorrelation function of the matrix diagonal is determined from the probes themselves. The developed algorithm is successfully applied to mock and real world problems. These performance tests show that, in situations where a matrix diagonal has to be calculated from only a small number of computationally expensive probes, a speedup by a factor of 2 to 10 is possible with the proposed method.
Submitted 24 February, 2012; v1 submitted 2 August, 2011;
originally announced August 2011.
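As a rough illustration of the plain matrix probing that the abstract above takes as its starting point, the following sketch estimates a diagonal from matrix-vector products alone. It shows only the unimproved baseline, not the Wiener-filter refinement proposed in the paper. The function name `probe_diagonal`, the choice of Rademacher (random ±1) probes, and the test matrix are assumptions made for this example.

```python
import numpy as np


def probe_diagonal(apply_op, n, n_probes, rng):
    """Estimate diag(A) using only products A @ xi (hypothetical helper)."""
    acc = np.zeros(n)
    for _ in range(n_probes):
        # Rademacher probe: entries are +1 or -1, so xi * xi == 1 elementwise.
        xi = rng.choice([-1.0, 1.0], size=n)
        # E[xi * (A @ xi)] equals diag(A); off-diagonal terms average out.
        acc += xi * apply_op(xi)
    return acc / n_probes


rng = np.random.default_rng(0)
n = 50

# Illustrative test matrix; in practice A would exist only as a routine.
B = rng.normal(size=(n, n))
A = B @ B.T / n + np.diag(np.linspace(1.0, 3.0, n))

estimate = probe_diagonal(lambda v: A @ v, n, 2000, rng)
max_error = np.max(np.abs(estimate - np.diag(A)))
print(f"largest absolute error after 2000 probes: {max_error:.3f}")
```

The error of this baseline shrinks only with the square root of the number of probes, which is why a method that exploits continuity of the diagonal can pay off when each probe is expensive.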
-
Inference with minimal Gibbs free energy in information field theory
Authors:
Torsten A. Ensslin,
Cornelius Weig
Abstract:
Non-linear and non-Gaussian signal inference problems are difficult to tackle. Renormalization techniques permit us to construct good estimators for the posterior signal mean within information field theory (IFT), but the approximations and assumptions made are not very obvious. Here we introduce the simple concept of minimal Gibbs free energy to IFT, and show that previous renormalization results emerge naturally. They can be understood as being the Gaussian approximation to the full posterior probability, which has maximal cross information with it. We derive optimized estimators for three applications, to illustrate the usage of the framework: (i) reconstruction of a log-normal signal from Poissonian data with background counts and point spread function, as it is needed for gamma ray astronomy and for cosmography using photometric galaxy redshifts, (ii) inference of a Gaussian signal with unknown spectrum and (iii) inference of a Poissonian log-normal signal with unknown spectrum, the combination of (i) and (ii). Finally we explain how Gaussian knowledge states constructed by the minimal Gibbs free energy principle at different temperatures can be combined into a more accurate surrogate of the non-Gaussian posterior.
Submitted 31 August, 2010; v1 submitted 16 April, 2010;
originally announced April 2010.
-
Reconstruction of signals with unknown spectra in information field theory with parameter uncertainty
Authors:
Torsten Ensslin,
Mona Frommert
Abstract:
The optimal reconstruction of cosmic metric perturbations and other signals requires knowledge of their power spectra and other parameters. If these are not known a priori, they have to be measured simultaneously from the same data used for the signal reconstruction. We formulate the general problem of signal inference in the presence of unknown parameters within the framework of information field theory. We develop a generic parameter uncertainty renormalized estimation (PURE) technique and address the problem of reconstructing Gaussian signals with unknown power-spectrum with five different approaches: (i) separate maximum-a-posteriori power spectrum measurement and subsequent reconstruction, (ii) maximum-a-posteriori power reconstruction with marginalized power-spectrum, (iii) maximizing the joint posterior of signal and spectrum, (iv) guessing the spectrum from the variance in the Wiener filter map, and (v) renormalization flow analysis of the field theoretical problem providing the PURE filter. In all cases, the reconstruction can be described or approximated as Wiener filter operations with assumed signal spectra derived from the data according to the same recipe, but with differing coefficients. All of these filters, except the renormalized one, exhibit a perception threshold in case of a Jeffreys prior for the unknown spectrum. Data modes with variance below this threshold do not affect the signal reconstruction at all. Filter (iv) seems to be similar to the so-called Karhunen-Loève and Feldman-Kaiser-Peacock estimators for galaxy power spectra used in cosmology, which therefore should also exhibit a marginal perception threshold if correctly implemented. We present statistical performance tests and show that the PURE filter is superior to the others.
Submitted 10 May, 2011; v1 submitted 15 February, 2010;
originally announced February 2010.
-
Information field theory for cosmological perturbation reconstruction and non-linear signal analysis
Authors:
Torsten A. Ensslin,
Mona Frommert,
Francisco S. Kitaura
Abstract:
We develop information field theory (IFT) as a means of Bayesian inference on spatially distributed signals, the information fields. A didactical approach is attempted. Starting from general considerations on the nature of measurements, signals, noise, and their relation to a physical reality, we derive the information Hamiltonian, the source field, propagator, and interaction terms. Free IFT reproduces the well known Wiener-filter theory. Interacting IFT can be diagrammatically expanded, for which we provide the Feynman rules in position-, Fourier-, and spherical harmonics space, and the Boltzmann-Shannon information measure. The theory should be applicable in many fields. However, here, two cosmological signal recovery problems are discussed in their IFT-formulation. 1) Reconstruction of the cosmic large-scale structure matter distribution from discrete galaxy counts in incomplete galaxy surveys within a simple model of galaxy formation. We show that a Gaussian signal, which should resemble the initial density perturbations of the Universe, observed with a strongly non-linear, incomplete and Poissonian-noise affected response, as the processes of structure and galaxy formation and observations provide, can be reconstructed thanks to the virtue of a response-renormalization flow equation. 2) We design a filter to detect local non-linearities in the cosmic microwave background, which are predicted from some Early-Universe inflationary scenarios, and expected due to measurement imperfections. This filter is the optimal Bayes' estimator up to linear order in the non-linearity parameter and can be used even to construct sky maps of non-linearities in the data.
Submitted 29 September, 2009; v1 submitted 20 June, 2008;
originally announced June 2008.
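The statement in the abstract above that free IFT reproduces Wiener-filter theory can be made concrete with a small numerical sketch. It assumes the standard linear setup d = R s + n with a Gaussian signal of covariance S and Gaussian noise of covariance N. The posterior mean is then m = D j, with source field j = Rᵀ N⁻¹ d and propagator D = (S⁻¹ + Rᵀ N⁻¹ R)⁻¹, which agrees with the familiar data-space form of the Wiener filter. The specific covariances, the response matrix, and all variable names below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n_signal, n_data = 8, 5

# Assumed prior covariance: exponentially decaying correlations on a 1D grid.
idx = np.arange(n_signal)
S = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 2.0)
# Assumed white noise covariance and a random linear response.
N = 0.1 * np.eye(n_data)
R = rng.normal(size=(n_data, n_signal))

# Mock signal and data drawn from the assumed model d = R s + n.
s = rng.multivariate_normal(np.zeros(n_signal), S)
d = R @ s + rng.normal(scale=np.sqrt(0.1), size=n_data)

# Signal-space form: source field j and propagator D.
j = R.T @ np.linalg.solve(N, d)
D = np.linalg.inv(np.linalg.inv(S) + R.T @ np.linalg.solve(N, R))
m_signal_space = D @ j

# Data-space form of the same Wiener filter.
m_data_space = S @ R.T @ np.linalg.solve(R @ S @ R.T + N, d)

print(np.allclose(m_signal_space, m_data_space))
```

Explicit matrix inverses are used here only because the example is tiny; for large problems one would typically apply the operators implicitly and solve with an iterative method.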