Search | arXiv e-print repository

FlopPITy: Enabling self-consistent exoplanet atmospheric retrievals with machine learning

Authors: Francisco Ardévol Martínez, Michiel Min, Daniela Huppenkothen, Inga Kamp, Paul I. Palmer

Abstract: Interpreting the observations of exoplanet atmospheres to constrain physical and chemical properties is typically done using Bayesian retrieval techniques. Because these methods require many model computations, a compromise is made between model complexity and run time. Reaching this compromise leads to the simplification of many physical and chemical processes (e.g. parameterised temperature structure). Here we implement and test sequential neural posterior estimation (SNPE), a machine learning inference algorithm, for exoplanet atmospheric retrievals. The goal is to speed up retrievals so they can be run with more computationally expensive atmospheric models, such as those computing the temperature structure using radiative transfer. We generate 100 synthetic observations using ARCiS (ARtful Modeling Code for exoplanet Science, an atmospheric modelling code with the flexibility to compute models in varying degrees of complexity) and perform retrievals on them to test the faithfulness of the SNPE posteriors. The faithfulness quantifies whether the posteriors contain the ground truth as often as we expect. We also generate a synthetic observation of a cool brown dwarf using the self-consistent capabilities of ARCiS and run a retrieval with self-consistent models to showcase the possibilities that SNPE opens. We find that SNPE provides faithful posteriors and is therefore a reliable tool for exoplanet atmospheric retrievals. We are able to run a self-consistent retrieval of a synthetic brown dwarf spectrum using only 50,000 forward model evaluations. We find that SNPE can speed up retrievals between $\sim2\times$ and $\geq10\times$ depending on the computational load of the forward model, the dimensionality of the observation, and the signal-to-noise ratio of the observation. We make the code publicly available for the community on GitHub.

Submitted 8 January, 2024; originally announced January 2024.

Comments: Accepted for publication at A&A

arXiv:2210.17227 [pdf, other]

Modelling M/M/R-JSQ-PS sojourn time distribution for Ultra-Reliable Low Latency Communication services

Authors: Geraint I. Palmer, Jorge Martín-Pérez

Abstract: The future Internet promises to support time-sensitive services that require ultra low latencies and reliabilities of 99.99%. Recent advances in cellular and WiFi connections enhance the network to meet high reliability and ultra low latencies. However, the aforementioned services require that the server processing time ensures low latencies with high reliability, otherwise the end-to-end performance is not met. To that end, in this paper we use queuing theory to model the sojourn time distribution for Ultra-Reliable Low Latency Communication services of M/M/R-JSQ-PS systems: Markovian queues with R CPU servers following a join shortest queue processor-sharing discipline (for example Linux systems). We develop open-source simulation software, and develop and compare six analytical approximations for the sojourn time distribution. The proposed approximations yield Wasserstein distances below 2 time units, and under medium loads incur errors of less than 1.78 time units (e.g., milliseconds) for the 99.99th percentile sojourn time. Moreover, the proposed sojourn time approximations are stable regardless of the number of CPUs and stay close to the simulations.

Submitted 22 December, 2022; v1 submitted 31 October, 2022; originally announced October 2022.

Comments: 14 Pages, 10 figures, submitted to Elsevier European Journal of Operational Research

arXiv:2203.01236 [pdf, other]

doi 10.1051/0004-6361/202142976

Convolutional neural networks as an alternative to Bayesian retrievals

Authors: Francisco Ardevol Martinez, Michiel Min, Inga Kamp, Paul I. Palmer

Abstract: Exoplanet observations are currently analysed with Bayesian retrieval techniques. Due to the computational load of the models used, a compromise is needed between model complexity and computing time. Analysis of data from future facilities will need more complex models which will increase the computational load of retrievals, prompting the search for a faster approach for interpreting exoplanet observations. Our goal is to compare machine learning retrievals of exoplanet transmission spectra with nested sampling, and understand if machine learning can be as reliable as Bayesian retrievals for a statistically significant sample of spectra while being orders of magnitude faster. We generate grids of synthetic transmission spectra and their corresponding planetary and atmospheric parameters, one using free chemistry models, and the other using equilibrium chemistry models. Each grid is subsequently rebinned to simulate both HST/WFC3 and JWST/NIRSpec observations, yielding four datasets in total. Convolutional neural networks (CNNs) are trained with each of the datasets. We perform retrievals on 1,000 simulated observations for each combination of model type and instrument with nested sampling and machine learning. We also use both methods to perform retrievals on real WFC3 transmission spectra. Finally, we test how robust machine learning and nested sampling are against incorrect assumptions in our models. CNNs reach a lower coefficient of determination between predicted and true values of the parameters. Nested sampling underestimates the uncertainty in ~8% of retrievals, whereas CNNs estimate them correctly. For real WFC3 observations, nested sampling and machine learning agree within $2\sigma$ for ~86% of spectra. When doing retrievals with incorrect assumptions, nested sampling underestimates the uncertainty in ~12% to ~41% of cases, whereas this is always below ~10% for the CNN.

Submitted 3 March, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

Comments: Accepted for publication in A&A

Journal ref: A&A 662, A108 (2022)

arXiv:2111.03729 [pdf, other]

Explaining neural network predictions of material strength

Authors: Ian A. Palmer, T. Nathan Mundhenk, Brian Gallagher, Yong Han

Abstract: We recently developed a deep learning method that can determine the critical peak stress of a material by looking at scanning electron microscope (SEM) images of the material's crystals. However, it has been somewhat unclear what kind of image features the network is keying off of when it makes its prediction. It is common in computer vision to employ an explainable AI saliency map to tell one what parts of an image are important to the network's decision. One can usually deduce the important features by looking at these salient locations. However, SEM images of crystals are more abstract to the human observer than natural image photographs. As a result, it is not easy to tell what features are important at the locations which are most salient. To solve this, we developed a method that helps us map features from important locations in SEM images to non-abstract textures that are easier to interpret.

Submitted 5 November, 2021; originally announced November 2021.

arXiv:2110.07575 [pdf, other]

Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset

Authors: Ian Palmer, Andrew Rouditchenko, Andrei Barbu, Boris Katz, James Glass

Abstract: Visually-grounded spoken language datasets can enable models to learn cross-modal correspondences with very weak supervision. However, modern audio-visual datasets contain biases that undermine the real-world performance of models trained on that data. We introduce Spoken ObjectNet, which is designed to remove some of these biases and provide a way to better evaluate how effectively models will perform in real-world scenarios. This dataset expands upon ObjectNet, which is a bias-controlled image dataset that features similar image classes to those present in ImageNet. We detail our data collection pipeline, which features several methods to improve caption quality, including automated language model checks. Lastly, we show baseline results on image retrieval and audio retrieval tasks. These results show that models trained on other datasets and then evaluated on Spoken ObjectNet tend to perform poorly due to biases in other datasets that the models have learned. We also show evidence that the performance decrease is due to the dataset controls, and not the transfer setting.

Submitted 14 October, 2021; originally announced October 2021.

Comments: Presented at Interspeech 2021. This version contains additional experiments on the Spoken ObjectNet test set

arXiv:1710.03561 [pdf, other]

Ciw: An open source discrete event simulation library

Authors: Geraint I. Palmer, Vincent A. Knight, Paul R. Harper, Asyl L. Hawa

Abstract: This paper introduces Ciw, an open source library for conducting discrete event simulations that has been developed in Python. The strengths of the library are illustrated in terms of best practice and reproducibility for computational research. An analysis of Ciw's performance and comparison to several alternative discrete event simulation frameworks is presented.

Submitted 27 September, 2017; originally announced October 2017.

arXiv:0906.2997

The Jewett-Krieger Construction for Tilings

Authors: Ian Palmer, Jean Bellissard

Abstract: Given a random distribution of impurities on a periodic crystal, an equivalent uniquely ergodic tiling space is built, made of aperiodic, repetitive tilings with finite local complexity, and with configurational entropy close to the entropy of the impurity distribution. The construction is the tiling analog of the Jewett-Krieger theorem.

Submitted 3 November, 2010; v1 submitted 16 June, 2009; originally announced June 2009.

Comments: This paper has been withdrawn in order to address conceptual problems. It will be rewritten and resubmitted at a later date

MSC Class: 37B50 (Primary); 37A35; 37B10; 37A50 (Secondary)

Showing 1–7 of 7 results for author: Palmer, I