Search | arXiv e-print repository

arXiv:2407.03198 [pdf, other]

BOWIE-ALIGN: A JWST comparative survey of aligned vs misaligned hot Jupiters to test the dependence of atmospheric composition on migration history

Authors: James Kirk, Eva-Maria Ahrer, Anna B. T. Penzlin, James E. Owen, Richard A. Booth, Lili Alderson, Duncan A. Christie, Alastair B. Claringbold, Emma Esparza-Borges, Chloe E. Fisher, Mercedes López-Morales, N. J. Mayne, Mason McCormack, Annabella Meech, Vatsal Panwar, Diana Powell, Jake Taylor, Denis E. Sergeev, Daniel Valentine, Hannah R. Wakeford, Peter J. Wheatley, Maria Zamyatina

Abstract: A primary objective of exoplanet atmosphere characterisation is to learn about planet formation and evolution, however, this is challenged by degeneracies. To determine whether differences in atmospheric composition can be reliably traced to differences in evolution, we are undertaking a new survey with JWST to compare the compositions of a sample of hot Jupiters that orbit F stars above the Kraft break with different orbital alignments. Under the assumption that aligned planets migrate through the inner disc, while misaligned planets migrate after disc dispersal, the act of migrating through the inner disc should lead to a measurable difference in the C/O between aligned and misaligned planets. We expect the amplitude and sign of this difference to depend on the amount of planetesimal accretion and whether silicates accreted from the inner disc release their oxygen. Here, we identify all known exoplanets that are suitable for testing this hypothesis, describe our JWST survey, and use noise simulations and atmospheric retrievals to estimate our survey's sensitivity. With the selected sample of four aligned and four misaligned hot Jupiters, we will be sensitive to the predicted differences in C/O between aligned and misaligned hot Jupiters for a wide range of model scenarios.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 13 pages, 8 figures, submitted to RASTI

arXiv:2405.14895 [pdf, other]

TOI-1685 b is a Hot Rocky Super-Earth: Updates to the Stellar and Planet Parameters of a Popular JWST Cycle 2 Target

Authors: Jennifer A. Burt, Matthew J. Hooton, Eric E. Mamajek, Oscar Barragán, Sarah C. Millholland, Tyler R. Fairnington, Chloe Fisher, Samuel P. Halverson, Chelsea X. Huang, Madison Brady, Andreas Seifahrt, Eric Gaidos, Rafael Luque, David Kasper, Jacob L. Bean

Abstract: We present an updated characterization of the TOI-1685 planetary system, which consists of a P$_{\rm{b}}$ = 0.69\,day USP super-Earth planet orbiting a nearby ($d$ = 37.6\,pc) M2.5V star (TIC 28900646, 2MASS J04342248+4302148). This planet was previously featured in two contemporaneous discovery papers, but the best-fit planet mass, radius, and bulk density values were discrepant allowing it to be interpreted either as a hot, bare rock or a 50\% H$_{2}$O / 50\% MgSiO$_{3}$ water world. TOI-1685 b will be observed in three independent JWST cycle two programs, two of which assume the planet is a water world while the third assumes that it is a hot rocky planet. Here we include a refined stellar classification with a focus on addressing the host star's metallicity, an updated planet radius measurement that includes two sectors of TESS data and multi-color photometry from a variety of ground-based facilities, and a more accurate dynamical mass measurement from a combined CARMENES, IRD, and MAROON-X radial velocity data set. We find that the star is very metal-rich ([Fe/H] $\simeq$ +0.3) and that the planet is systematically smaller, lower mass, and higher density than initially reported, with new best-fit parameters of R$_{\rm{p}}$ = 1.468 $^{+0.050}_{-0.051}$ R$_{\oplus}$ and M$_{\rm{p}}$ = 3.03$^{+0.33}_{-0.32}$ M$_{\oplus}$. These results fall in between the previously derived values and suggest that TOI-1685 b is a hot, rocky, planet with an Earth-like density ($\rho_{\rm{p}}$ = 5.3 $\pm$ 0.8 g cm$^{-3}$, or 0.96 $\rho_{\oplus}$), high equilibrium temperature (T$_{\rm{eq}}$ = 1062 $\pm$ 27 K) and negligible volatiles, rather than a water world.

Submitted 21 May, 2024; originally announced May 2024.

Comments: 20 pages, 9 Figures, accepted for publication in ApJL. Datasets and software available via Zenodo and GitHub links found in the paper

arXiv:2405.02656 [pdf, other]

Information content of JWST spectra of WASP-39b

Authors: Anna Lueber, Aline Novais, Chloe Fisher, Kevin Heng

Abstract: WASP-39b was observed using several different JWST instrument modes and the spectra were published in a series of papers by the ERS team. The current study examines the information content of these spectra measured using the different instrument modes, focusing on the complexity of the temperature-pressure profiles and number of chemical species warranted by the data. We examine if H2O, CO, CO2, K, H2S, CH4, and SO2 are detected in each of the instrument modes. Two Bayesian inference methods are used to perform atmospheric retrievals: standard nested sampling and supervised machine learning of the random forest (trained on a model grid). For nested sampling, Bayesian model comparison is used as a guide to identify the set of models with the required complexity to explain the data. Generally, non-isothermal transit chords are needed to fit the transmission spectra of WASP-39b, although the complexity of the Tp-profile required is mode-dependent. The minimal set of chemical species needed to fit a spectrum is mode-dependent as well, and also depends on whether grey or non-grey clouds are assumed. When a non-grey cloud model is used to fit the G395H spectrum, it generates a spectral continuum that compensates for the H2O opacity. The same compensation is absent when fitting the non-grey cloud model to the PRISM spectrum (which has broader wavelength coverage), suggesting that it is spurious. The interplay between the cloud spectral continuum and the H2O opacity determines if SO2 is needed to fit either spectrum. The inferred elemental abundances of carbon and oxygen and the carbon-to-oxygen (C/O) ratios are all mode- and model-dependent, and should be interpreted with caution. Bayesian model comparison does not always offer a clear path forward for favouring specific retrieval models (e.g. grey versus non-grey clouds) and thus for enabling unambiguous interpretations of exoplanet spectra.

Submitted 4 May, 2024; originally announced May 2024.

Comments: Accepted by A&A. 25 pages, 26 figures, 3 tables

arXiv:2405.01488 [pdf, other]

Digital Twin Generators for Disease Modeling

Authors: Nameyeh Alam, Jake Basilico, Daniele Bertolini, Satish Casie Chetty, Heather D'Angelo, Ryan Douglas, Charles K. Fisher, Franklin Fuller, Melissa Gomes, Rishabh Gupta, Alex Lang, Anton Loukianov, Rachel Mak-McCully, Cary Murray, Hanalei Pham, Susanna Qiao, Elena Ryapolova-Webb, Aaron Smith, Dimitri Theoharatos, Anil Tolwani, Eric W. Tramel, Anna Vidovszky, Judy Viduya, Jonathan R. Walsh

Abstract: A patient's digital twin is a computational model that describes the evolution of their health over time. Digital twins have the potential to revolutionize medicine by enabling individual-level computer simulations of human health, which can be used to conduct more efficient clinical trials or to recommend personalized treatment options. Due to the overwhelming complexity of human biology, machine learning approaches that leverage large datasets of historical patients' longitudinal health records to generate patients' digital twins are more tractable than potential mechanistic models. In this manuscript, we describe a neural network architecture that can learn conditional generative models of clinical trajectories, which we call Digital Twin Generators (DTGs), that can create digital twins of individual patients. We show that the same neural network architecture can be trained to generate accurate digital twins for patients across 13 different indications simply by changing the training set and tuning hyperparameters. By introducing a general purpose architecture, we aim to unlock the ability to scale machine learning approaches to larger datasets and across more indications so that a digital twin could be created for any patient in the world.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2402.18900 [pdf, ps, other]

Prognostic Covariate Adjustment for Logistic Regression in Randomized Controlled Trials

Authors: Yunfan Li, Arman Sabbaghi, Jonathan R. Walsh, Charles K. Fisher

Abstract: Randomized controlled trials (RCTs) with binary primary endpoints introduce novel challenges for inferring the causal effects of treatments. The most significant challenge is non-collapsibility, in which the conditional odds ratio estimand under covariate adjustment differs from the unconditional estimand in the logistic regression analysis of RCT data. This issue gives rise to apparent paradoxes, such as the variance of the estimator for the conditional odds ratio from a covariate-adjusted model being greater than the variance of the estimator from the unadjusted model. We address this challenge in the context of adjustment based on predictions of control outcomes from generative artificial intelligence (AI) algorithms, which are referred to as prognostic scores. We demonstrate that prognostic score adjustment in logistic regression increases the power of the Wald test for the conditional odds ratio under a fixed sample size, or alternatively reduces the necessary sample size to achieve a desired power, compared to the unadjusted analysis. We derive formulae for prospective calculations of the power gain and sample size reduction that can result from adjustment for the prognostic score. Furthermore, we utilize g-computation to expand the scope of prognostic score adjustment to inferences on the marginal risk difference, relative risk, and odds ratio estimands. We demonstrate the validity of our formulae via extensive simulation studies that encompass different types of logistic regression model specifications. Our simulation studies also indicate how prognostic score adjustment can reduce the variance of g-computation estimators for the marginal estimands while maintaining frequentist properties such as asymptotic unbiasedness and Type I error rate control. Our methodology can ultimately enable more definitive and conclusive analyses for RCTs with binary primary endpoints.

Submitted 29 February, 2024; originally announced February 2024.

Comments: 27 pages, 1 figure, 9 tables

MSC Class: 62J12

arXiv:2310.18027 [pdf, other]

Bayesian Prognostic Covariate Adjustment With Additive Mixture Priors

Authors: Alyssa M. Vanderbeek, Arman Sabbaghi, Jon R. Walsh, Charles K. Fisher

Abstract: Effective and rapid decision-making from randomized controlled trials (RCTs) requires unbiased and precise treatment effect inferences. Two strategies to address this requirement are to adjust for covariates that are highly correlated with the outcome, and to leverage historical control information via Bayes' theorem. We propose a new Bayesian prognostic covariate adjustment methodology, referred to as Bayesian PROCOVA, that combines these two strategies. Covariate adjustment in Bayesian PROCOVA is based on generative artificial intelligence (AI) algorithms that construct a digital twin generator (DTG) for RCT participants. The DTG is trained on historical control data and yields a digital twin (DT) probability distribution for each RCT participant's outcome under the control treatment. The expectation of the DT distribution, referred to as the prognostic score, defines the covariate for adjustment. Historical control information is leveraged via an additive mixture prior with two components: an informative prior probability distribution specified based on historical control data, and a weakly informative prior distribution. The mixture weight determines the extent to which posterior inferences are drawn from the informative component, versus the weakly informative component. This weight has a prior distribution as well, and so the entire additive mixture prior is completely pre-specifiable without involving any RCT information. We establish an efficient Gibbs algorithm for sampling from the posterior distribution, and derive closed-form expressions for the posterior mean and variance of the treatment effect parameter conditional on the weight, in Bayesian PROCOVA. We evaluate efficiency gains of Bayesian PROCOVA via its bias control and variance reduction compared to frequentist PROCOVA in simulation studies that encompass different discrepancies. These gains translate to smaller RCTs.

Submitted 28 February, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

Comments: 33 pages, 12 figures, 2 tables; Added a new appendix section (conclusions unchanged)

MSC Class: 62F15

arXiv:2309.14256 [pdf, other]

A Weighted Prognostic Covariate Adjustment Method for Efficient and Powerful Treatment Effect Inferences in Randomized Controlled Trials

Authors: Alyssa M. Vanderbeek, Anna A. Vidovszky, Jessica L. Ross, Arman Sabbaghi, Jonathan R. Walsh, Charles K. Fisher, the Critical Path for Alzheimer's Disease, the Alzheimer's Disease Neuroimaging Initiative, the European Prevention of Alzheimer's Disease Consortium, the Alzheimer's Disease Cooperative Study

Abstract: A crucial task for a randomized controlled trial (RCT) is to specify a statistical method that can yield an efficient estimator and powerful test for the treatment effect. A novel and effective strategy to obtain efficient and powerful treatment effect inferences is to incorporate predictions from generative artificial intelligence (AI) algorithms into covariate adjustment for the regression analysis of a RCT. Training a generative AI algorithm on historical control data enables one to construct a digital twin generator (DTG) for RCT participants, which utilizes a participant's baseline covariates to generate a probability distribution for their potential control outcome. Summaries of the probability distribution from the DTG are highly predictive of the trial outcome, and adjusting for these features via regression can thus improve the quality of treatment effect inferences, while satisfying regulatory guidelines on statistical analyses, for a RCT. However, a critical assumption in this strategy is homoskedasticity, or constant variance of the outcome conditional on the covariates. In the case of heteroskedasticity, existing covariate adjustment methods yield inefficient estimators and underpowered tests. We propose to address heteroskedasticity via a weighted prognostic covariate adjustment methodology (Weighted PROCOVA) that adjusts for both the mean and variance of the regression model using information obtained from the DTG. We prove that our method yields unbiased treatment effect estimators, and demonstrate via comprehensive simulation studies and case studies from Alzheimer's disease that it can reduce the variance of the treatment effect estimator, maintain the Type I error rate, and increase the power of the test for the treatment effect from 80% to 85%~90% when the variances from the DTG can explain 5%~10% of the variation in the RCT participants' outcomes.

Submitted 25 September, 2023; originally announced September 2023.

Comments: 49 pages, 6 figures, 12 tables

MSC Class: 62J99

arXiv:2308.07330 [pdf, ps, other]

A Rule of Thumb for the Power Gain due to Covariate Adjustment in Randomized Controlled Trials with Continuous Outcomes

Authors: Charles K. Fisher

Abstract: Randomized Controlled Trials (RCTs) often adjust for baseline covariates in order to increase power. This technical note provides a short derivation of a simple rule of thumb for approximating the ratio of the power of an adjusted analysis to that of an unadjusted analysis. Specifically, if the unadjusted analysis is powered to approximately 80\%, then the ratio of the power of the adjusted analysis to the power of the unadjusted analysis is approximately $1 + \frac{1}{2} R^2$, where $R$ is the correlation between the baseline covariate and the outcome.

Submitted 9 August, 2023; originally announced August 2023.
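
The rule of thumb quoted in the abstract above can be checked numerically. The sketch below is not the note's own derivation. It assumes a two-sided z-test at significance level 0.05 under a normal approximation, and it assumes that adjusting for a covariate with correlation $R$ shrinks the residual variance by a factor of $1 - R^2$. The function names `adjusted_power`, `power_ratio`, and `rule_of_thumb` are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def adjusted_power(unadjusted_power: float, r: float, alpha: float = 0.05) -> float:
    """Approximate power after covariate adjustment (assumes |r| < 1).

    Assumption: adjustment divides the standard error of the treatment
    effect estimate by 1 / sqrt(1 - r**2), i.e. it removes a fraction
    r**2 of the outcome variance.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(unadjusted_power)
    # Standardized effect size implied by the unadjusted power.
    noncentrality = z_alpha + z_power
    adjusted_nc = noncentrality / (1 - r**2) ** 0.5
    # The far rejection tail is ignored because it is negligible here.
    return _STD_NORMAL.cdf(adjusted_nc - z_alpha)


def power_ratio(r: float, unadjusted_power: float = 0.8, alpha: float = 0.05) -> float:
    """Ratio of adjusted power to unadjusted power."""
    return adjusted_power(unadjusted_power, r, alpha) / unadjusted_power


def rule_of_thumb(r: float) -> float:
    """The approximation quoted in the abstract: 1 + R^2 / 2."""
    return 1 + 0.5 * r**2


if __name__ == "__main__":
    for r in (0.1, 0.3, 0.5):
        print(f"R={r}: exact ratio {power_ratio(r):.4f}, rule {rule_of_thumb(r):.4f}")
```

Under these assumptions the two columns agree to within about 0.01 for moderate correlations when the unadjusted power is 80%.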

arXiv:2306.15041 [pdf]

A Comparison of Neuroelectrophysiology Databases

Authors: Priyanka Subash, Alex Gray, Misque Boswell, Samantha L. Cohen, Rachael Garner, Sana Salehi, Calvary Fisher, Samuel Hobel, Satrajit Ghosh, Yaroslav Halchenko, Benjamin Dichter, Russell A. Poldrack, Chris Markiewicz, Dora Hermes, Arnaud Delorme, Scott Makeig, Brendan Behan, Alana Sparks, Stephen R Arnott, Zhengjia Wang, John Magnotti, Michael S. Beauchamp, Nader Pouratian, Arthur W. Toga, Dominique Duncan

Abstract: As data sharing has become more prevalent, three pillars - archives, standards, and analysis tools - have emerged as critical components in facilitating effective data sharing and collaboration. This paper compares four freely available intracranial neuroelectrophysiology data repositories: Data Archive for the BRAIN Initiative (DABI), Distributed Archives for Neurophysiology Data Integration (DANDI), OpenNeuro, and Brain-CODE. The aim of this review is to describe archives that provide researchers with tools to store, share, and reanalyze both human and non-human neurophysiology data based on criteria that are of interest to the neuroscientific community. The Brain Imaging Data Structure (BIDS) and Neurodata Without Borders (NWB) are utilized by these archives to make data more accessible to researchers by implementing a common standard. As the necessity for integrating large-scale analysis into data repository platforms continues to grow within the neuroscientific community, this article will highlight the various analytical and customizable tools developed within the chosen archives that may advance the field of neuroinformatics.

Submitted 30 August, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

Comments: 22 pages, 6 figures, 5 tables

arXiv:2305.08337 [pdf, other]

Neural Boltzmann Machines

Authors: Alex H. Lang, Anton D. Loukianov, Charles K. Fisher

Abstract: Conditional generative models are capable of using contextual information as input to create new imaginative outputs. Conditional Restricted Boltzmann Machines (CRBMs) are one class of conditional generative models that have proven to be especially adept at modeling noisy discrete or continuous data, but the lack of expressivity in CRBMs has limited their widespread adoption. Here we introduce Neural Boltzmann Machines (NBMs) which generalize CRBMs by converting each of the CRBM parameters to their own neural networks that are allowed to be functions of the conditional inputs. NBMs are highly flexible conditional generative models that can be trained via stochastic gradient descent to approximately maximize the log-likelihood of the data. We demonstrate the utility of NBMs especially with normally distributed data which has historically caused problems for Gaussian-Bernoulli CRBMs. Code to reproduce our results can be found at https://github.com/unlearnai/neural-boltzmann-machines.

Submitted 15 May, 2023; originally announced May 2023.

Comments: 7 pages, 4 figures

arXiv:2305.07719 [pdf, other]

Intercomparison of Brown Dwarf Model Grids and Atmospheric Retrieval Using Machine Learning

Authors: Anna Lueber, Daniel Kitzmann, Chloe E. Fisher, Brendan P. Bowler, Adam J. Burgasser, Mark Marley, Kevin Heng

Abstract: Understanding differences between sub-stellar spectral data and models has proven to be a major challenge, especially for self-consistent model grids that are necessary for a thorough investigation of brown dwarf atmospheres. Using the supervised machine learning method of the random forest, we study the information content of 14 previously published model grids of brown dwarfs (from 1997 to 2021). The random forest method allows us to analyze the predictive power of these model grids, as well as interpret data within the framework of Approximate Bayesian Computation (ABC). Our curated dataset includes 3 benchmark brown dwarfs (Gl 570D, ε Indi Ba and Bb) as well as a sample of 19 L and T dwarfs; this sample was previously analyzed in Lueber et al. (2022) using traditional Bayesian methods (nested sampling). We find that the effective temperature of a brown dwarf can be robustly predicted independent of the model grid chosen for the interpretation. However, inference of the surface gravity is model-dependent. Specifically, the BT-Settl, Sonora Bobcat and Sonora Cholla model grids tend to predict logg ~3-4 (cgs units) even after data blueward of 1.2 μm have been disregarded to mitigate for our incomplete knowledge of the shapes of alkali lines. Two major, longstanding challenges associated with understanding the influence of clouds in brown dwarf atmospheres remain: our inability to model them from first principles and also to robustly validate these models.

Submitted 6 July, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

Comments: Accepted for publication in The Astrophysical Journal

arXiv:2305.00113 [pdf, other]

doi 10.1103/PhysRevB.95.014111

Lattice dynamics and ferroelectric properties of the nitride perovskite ${\mathrm{LaWN}}_{3}$

Authors: Yue-Wen Fang, Craig A. J. Fisher, Akihide Kuwabara, Xin-Wei Shen, Takafumi Ogawa, Hiroki Moriwake, Rong Huang, Chun-Gang Duan

Abstract: Using first-principles calculations we examine the crystal structures and phase transitions of nitride perovskite LaWN$_3$. Lattice dynamics calculations indicate that the ground-state structure belongs to space group $R3c$. Two competitive phase transition pathways are identified which are characterized by symmetry-adapted distortion modes. The results suggest that $R3c$ LaWN$_3$ should be an excellent ferroelectric semiconductor: its large spontaneous polarization of around 61 $μ$C/cm$^2$ is comparable to that of PbTiO$_3$, and its band gap is about 1.72 eV. Ferroelectricity is found to result from the \emph{B}-site instability driven by hybridization between W-5$d$ and N-2$p$ orbitals. These properties make LaWN$_3$ an attractive candidate material for use in ferroelectric memory devices and photovoltaic cells.

Submitted 28 April, 2023; originally announced May 2023.

Comments: 13 pages, 8 figures in main text and 5 figures in supplementary

Journal ref: Phys. Rev. B 95, 014111 (2017)

arXiv:2304.11052 [pdf]

A Multiagent CyberBattleSim for RL Cyber Operation Agents

Authors: Thomas Kunz, Christian Fisher, James La Novara-Gsell, Christopher Nguyen, Li Li

Abstract: Hardening cyber physical assets is both crucial and labor-intensive. Recently, Machine Learning (ML) in general and Reinforcement Learning (RL) more specifically has shown great promise to automate tasks that otherwise would require significant human insight/intelligence. The development of autonomous RL agents requires a suitable training environment that allows us to quickly evaluate various alternatives, in particular how to arrange training scenarios that pit attackers and defenders against each other. CyberBattleSim is a training environment that supports the training of red agents, i.e., attackers. We added the capability to train blue agents, i.e., defenders. The paper describes our changes and reports on the results we obtained when training blue agents, either in isolation or jointly with red agents. Our results show that training a blue agent does lead to stronger defenses against attacks. In particular, training a blue agent jointly with a red agent increases the blue agent's capability to thwart sophisticated red agents.

Submitted 3 April, 2023; originally announced April 2023.

Comments: To appear in Proceedings of the 2022 International Conference on Computational Science and Computational Intelligence

arXiv:2211.02174 [pdf, other]

Can RBMs be trained with zero step contrastive divergence?

Authors: Charles K. Fisher

Abstract: Restricted Boltzmann Machines (RBMs) are probabilistic generative models that can be trained by maximum likelihood in principle, but are usually trained by an approximate algorithm called Contrastive Divergence (CD) in practice. In general, a CD-k algorithm estimates an average with respect to the model distribution using a sample obtained from a k-step Markov Chain Monte Carlo Algorithm (e.g., block Gibbs sampling) starting from some initial configuration. Choices of k typically vary from 1 to 100. This technical report explores if it's possible to leverage a simple approximate sampling algorithm with a modified version of CD in order to train an RBM with k=0. As usual, the method is illustrated on MNIST.

Submitted 3 November, 2022; originally announced November 2022.

arXiv:2206.12194 [pdf, other]

doi 10.3847/1538-4357/ac7801

How do we optimally sample model grids of exoplanet spectra?

Authors: Chloe Fisher, Kevin Heng

Abstract: The construction and implementation of atmospheric model grids is a popular tool in exoplanet characterisation. These typically vary a number of parameters linearly, containing one model for every combination of parameter values. Here we investigate alternative methods of sampling parameters, including random sampling and Latin hypercube (LH) sampling, and how these compare to linearly sampled grids. We use a random forest to analyse the performance of these grids for two different models, as well as investigate the information content of the particular model grid from Goyal et al. 2019. We also use nested-sampling to implement mock atmospheric retrievals on simulated JWST transmission spectra by interpolating on linearly sampled model grids. Our results show that random or LH sampling out-performs linear sampling in parameter predictability for our higher dimensional models, requiring fewer models in the grid, and thus allowing for more computationally intensive forward models to be used. We also find that using a traditional retrieval with interpolation on a linear grid can produce biased posterior distributions, especially for parameters with non-linear effects on the spectrum. In particular, we advise caution when performing linear interpolation on the C/O ratio, cloud properties, and metallicity. Finally, we find that the information content analysis of the grid from Goyal et al. 2019 is able to highlight key areas of the spectra where the presence or absence of certain molecules can be detected, providing good indicators for parameters such as temperature and C/O ratio.

Submitted 24 June, 2022; originally announced June 2022.

Comments: 14 pages, 8 figures. Accepted for publication in ApJ
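
As a rough illustration of the sampling schemes contrasted in the abstract above, the sketch below builds a linearly sampled grid (one model for every combination of parameter values) and a Latin hypercube sample over the same parameter ranges. It is a minimal sketch, not the authors' code: the function names, the three parameters, and their bounds are assumptions chosen only for the example.

```python
import itertools
import random


def linear_grid(bounds, points_per_axis):
    """Evenly spaced values on each axis, one model per combination."""
    axes = []
    for lo, hi in bounds:
        step = (hi - lo) / (points_per_axis - 1)
        axes.append([lo + i * step for i in range(points_per_axis)])
    return list(itertools.product(*axes))


def latin_hypercube(bounds, n_samples, rng):
    """Cut each axis into n_samples equal strata and use every stratum once."""
    columns = []
    for lo, hi in bounds:
        width = (hi - lo) / n_samples
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across axes
        # One uniformly random point inside each stratum
        columns.append([lo + (s + rng.random()) * width for s in strata])
    return list(zip(*columns))


# Hypothetical parameter ranges: temperature (K), C/O ratio, log metallicity
bounds = [(500.0, 2500.0), (0.1, 1.5), (-1.0, 2.0)]

grid = linear_grid(bounds, 5)                        # 5**3 = 125 models
lhs = latin_hypercube(bounds, 25, random.Random(0))  # 25 models

print(len(grid), "grid models with", len({p[0] for p in grid}), "distinct temperatures")
print(len(lhs), "LH models with", len({p[0] for p in lhs}), "distinct temperatures")
```

With these assumed settings the linear grid spends 125 models to probe only 5 distinct values per parameter, while the Latin hypercube probes 25 distinct values per parameter with 25 models. This shows only the coverage difference between the schemes; it says nothing about retrieval performance, which is what the paper evaluates.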

arXiv:2201.09905 [pdf, other]

The Effect of Stellar Contamination on Low-resolution Transmission Spectroscopy: Needs Identified by NASA's Exoplanet Exploration Program Study Analysis Group 21

Authors: Benjamin V. Rackham, Néstor Espinoza, Svetlana V. Berdyugina, Heidi Korhonen, Ryan J. MacDonald, Benjamin T. Montet, Brett M. Morris, Mahmoudreza Oshagh, Alexander I. Shapiro, Yvonne C. Unruh, Elisa V. Quintana, Robert T. Zellem, Dániel Apai, Thomas Barclay, Joanna K. Barstow, Giovanni Bruno, Ludmila Carone, Sarah L. Casewell, Heather M. Cegla, Serena Criscuoli, Catherine Fischer, Damien Fournier, Mark S. Giampapa, Helen Giles, Aishwarya Iyer, et al. (36 additional authors not shown)

Abstract: Study Analysis Group 21 (SAG21) of NASA's Exoplanet Exploration Program Analysis Group (ExoPAG) was organized to study the effect of stellar contamination on space-based transmission spectroscopy, a method for studying exoplanetary atmospheres by measuring the wavelength-dependent radius of a planet as it transits its star. Transmission spectroscopy relies on a precise understanding of the spectrum of the star being occulted. However, stars are not homogeneous, constant light sources but have temporally evolving photospheres and chromospheres with inhomogeneities like spots, faculae, plages, granules, and flares. This SAG brought together an interdisciplinary team of more than 100 scientists, with observers and theorists from the heliophysics, stellar astrophysics, planetary science, and exoplanetary atmosphere research communities, to study the current research needs that can be addressed in this context to make the most of transit studies from current NASA facilities like HST and JWST. The analysis produced 14 findings, which fall into three Science Themes encompassing (1) how the Sun is used as our best laboratory to calibrate our understanding of stellar heterogeneities ("The Sun as the Stellar Benchmark"), (2) how stars other than the Sun extend our knowledge of heterogeneities ("Surface Heterogeneities of Other Stars") and (3) how to incorporate information gathered for the Sun and other stars into transit studies ("Mapping Stellar Knowledge to Transit Studies"). In this invited review, we largely reproduce the final report of SAG21 as a contribution to the peer-reviewed literature.

Submitted 17 March, 2023; v1 submitted 24 January, 2022; originally announced January 2022.

Comments: Invited review in press at RASTI. Based on the ExoPAG SAG21 report (arXiv:2201.09905v1) and refined via feedback from three reviewers. 75 pages, 30 figures, 5 tables

arXiv:2111.12732 [pdf, other]

doi 10.1038/s41550-021-01581-z

Titanium oxide and chemical inhomogeneity in the atmosphere of the exoplanet WASP-189b

Authors: Bibiana Prinoth, H. Jens Hoeijmakers, Daniel Kitzmann, Elin Sandvik, Julia V. Seidel, Monika Lendl, Nicholas W. Borsato, Brian Thorsbro, David R. Anderson, David Barrado, Kateryna Kravchenko, Romain Allart, Vincent Bourrier, Heather M. Cegla, David Ehrenreich, Chloe Fisher, Christophe Lovis, Andrea Guzmán-Mesa, Simon Grimm, Matthew Hooton, Brett M. Morris, Maria Oreshenko, Lorenzo Pino, Kevin Heng

Abstract: The temperature of an atmosphere decreases with increasing altitude, unless a shortwave absorber exists that causes a temperature inversion. Ozone plays this role in the Earth's atmosphere. In the atmospheres of highly irradiated exoplanets, shortwave absorbers are predicted to be titanium oxide (TiO) and vanadium oxide (VO). Detections of TiO and VO have been claimed using both low and high spectral resolution observations, but later observations have failed to confirm these claims or overturned them. Here we report the unambiguous detection of TiO in the ultra-hot Jupiter WASP-189b using high-resolution transmission spectroscopy. This detection is based on applying the cross-correlation technique to many spectral lines of TiO from 460 to 690 nm. Moreover, we report detections of metals, including neutral and singly ionised iron and titanium, as well as chromium, magnesium, vanadium and manganese (Fe, Fe+, Ti, Ti+, Cr, Mg, V, Mn). The line positions of the detected species differ, which we interpret as a consequence of spatial gradients in their chemical abundances, such that they exist in different regions or dynamical regimes. This is direct observational evidence for the three-dimensional thermo-chemical stratification of an exoplanet atmosphere derived from high-resolution ground-based spectroscopy.

Submitted 30 January, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

Comments: Published in Nature Astronomy on 27 January 2022, accepted on 1 December 2021 (32 pages, 21 figures, 3 tables)

arXiv:2101.02005 [pdf, other]

doi 10.3847/1538-4365/abd773

HELIOS-K 2.0 Opacity Calculator and Open-source Opacity Database for Exoplanetary Atmospheres

Authors: Simon L. Grimm, Matej Malik, Daniel Kitzmann, Andrea Guzmán-Mesa, H. Jens Hoeijmakers, Chloe Fisher, João M. Mendonça, Sergey N. Yurchenko, Jonathan Tennyson, Fabien Alesina, Nicolas Buchschacher, Julien Burnier, Damien Segransan, Robert L. Kurucz, Kevin Heng

Abstract: Computing and using opacities is a key part of modeling and interpreting data of exoplanetary atmospheres. Since the underlying spectroscopic line lists are constantly expanding and currently include up to ~ 10^10 - 10^11 transition lines, the opacity calculator codes need to become more powerful. Here we present major upgrades to the HELIOS-K GPU-accelerated opacity calculator and describe the necessary steps to process large line lists within a reasonable amount of time. Besides performance improvements, we include more capabilities and present a toolbox for handling different atomic and molecular data sets: from downloading and pre-processing the data to performing the opacity calculations in a user-friendly way. HELIOS-K supports line lists from ExoMol, HITRAN, HITEMP, NIST, Kurucz and VALD3. By matching the resolution of 0.1 cm^-1 and cutting length of 25 cm^-1 used by the ExoCross code for timing performance (251 seconds excluding data read-in time), HELIOS-K can process the ExoMol BT2 water line list in 12.5 seconds. Using a resolution of 0.01 cm^-1, it takes 45 seconds - equivalent to about 10^7 lines per second. As a wavenumber resolution of 0.01 cm^-1 suffices for most exoplanetary atmosphere spectroscopic calculations, we adopt this resolution in calculating opacity functions for several hundred atomic and molecular species, and make them freely available on the open-access DACE database. For the opacity calculations of the database, we use a cutting length of 100 cm^-1 for molecules and no cutting length for atoms. Our opacities are available for downloading from https://dace.unige.ch/opacityDatabase and may be visualized using https://dace.unige.ch/opacity.

Submitted 22 March, 2021; v1 submitted 6 January, 2021; originally announced January 2021.

Comments: Published in The Astrophysical Journal Supplement Series

arXiv:2012.13455 [pdf, other]

Modeling Disease Progression in Mild Cognitive Impairment and Alzheimer's Disease with Digital Twins

Authors: Daniele Bertolini, Anton D. Loukianov, Aaron M. Smith, David Li-Bland, Yannick Pouliot, Jonathan R. Walsh, Charles K. Fisher

Abstract: Alzheimer's Disease (AD) is a neurodegenerative disease that affects subjects in a broad range of severity and is assessed in clinical trials with multiple cognitive and functional instruments. As clinical trials in AD increasingly focus on earlier stages of the disease, especially Mild Cognitive Impairment (MCI), the ability to model subject outcomes across the disease spectrum is extremely important. We use unsupervised machine learning models called Conditional Restricted Boltzmann Machines (CRBMs) to create Digital Twins of AD subjects. Digital Twins are simulated clinical records that share baseline data with actual subjects and comprehensively model their outcomes under standard-of-care. The CRBMs are trained on a large set of records from subjects in observational studies and the placebo arms of clinical trials across the AD spectrum. These data exhibit a challenging, but common, patchwork of measured and missing observations across subjects in the dataset, and we present a novel model architecture designed to learn effectively from it. We evaluate performance against a held-out test dataset and show how Digital Twins simultaneously capture the progression of a number of key endpoints in clinical trials across a broad spectrum of disease severity, including MCI and mild-to-moderate AD.

Submitted 24 December, 2020; originally announced December 2020.

arXiv:2012.13112 [pdf, other]

Bayesian prognostic covariate adjustment

Authors: David Walsh, Alejandro Schuler, Diana Hall, Jon Walsh, Charles Fisher

Abstract: Historical data about disease outcomes can be integrated into the analysis of clinical trials in many ways. We build on existing literature that uses prognostic scores from a predictive model to increase the efficiency of treatment effect estimates via covariate adjustment. Here we go further, utilizing a Bayesian framework that combines prognostic covariate adjustment with an empirical prior distribution learned from the predictive performances of the prognostic model on past trials. The Bayesian approach interpolates between prognostic covariate adjustment with strict type I error control when the prior is diffuse, and a single-arm trial when the prior is sharply peaked. This method is shown theoretically to offer a substantial increase in statistical power, while limiting the type I error rate under reasonable conditions. We demonstrate the utility of our method in simulations and with an analysis of a past Alzheimer's disease clinical trial.

Submitted 24 December, 2020; originally announced December 2020.

Comments: 28 pages, 11 figures

arXiv:2012.09935 [pdf, ps, other]

Increasing the efficiency of randomized trial estimates via linear adjustment for a prognostic score

Authors: Alejandro Schuler, David Walsh, Diana Hall, Jon Walsh, Charles Fisher

Abstract: Estimating causal effects from randomized experiments is central to clinical research. Reducing the statistical uncertainty in these analyses is an important objective for statisticians. Registries, prior trials, and health records constitute a growing compendium of historical data on patients under standard-of-care that may be exploitable to this end. However, most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control. Here, we propose a use of historical data that exploits linear covariate adjustment to improve the efficiency of trial analyses without incurring bias. Specifically, we train a prognostic model on the historical data, then estimate the treatment effect using a linear regression while adjusting for the trial subjects' predicted outcomes (their prognostic scores). We prove that, under certain conditions, this prognostic covariate adjustment procedure attains the minimum variance possible among a large class of estimators. When those conditions are not met, prognostic covariate adjustment is still more efficient than raw covariate adjustment and the gain in efficiency is proportional to a measure of the predictive accuracy of the prognostic model above and beyond the linear relationship with the raw covariates. We demonstrate the approach using simulations and a reanalysis of an Alzheimer's Disease clinical trial and observe meaningful reductions in mean-squared error and the estimated variance. Lastly, we provide a simplified formula for asymptotic variance that enables power calculations that account for these gains. Sample size reductions between 10% and 30% are attainable when using prognostic models that explain a clinically realistic percentage of the outcome variance.

Submitted 2 December, 2021; v1 submitted 17 December, 2020; originally announced December 2020.
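
The adjustment step described in the abstract above (a linear regression of the outcome on the treatment indicator and the prognostic score) can be sketched on a toy data set. This is a minimal sketch under stated assumptions, not the authors' implementation: the scores are hard-coded stand-ins for predictions from a model trained on historical data, the outcomes are noise-free and generated with a true treatment effect of 2.0, and the helper `ols` is a hypothetical name.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Toy trial: treatment indicator and prognostic score for 8 subjects.
# The scores are assumed values standing in for a trained model's predictions.
treatment = [0, 1, 0, 1, 0, 1, 0, 1]
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 9.0]

# Noise-free outcomes with a true treatment effect of 2.0 (assumption)
outcome = [1.0 + 2.0 * w + s for w, s in zip(treatment, scores)]

# Unadjusted estimate: difference in mean outcome between the arms
treated = [y for y, w in zip(outcome, treatment) if w == 1]
control = [y for y, w in zip(outcome, treatment) if w == 0]
unadjusted = sum(treated) / len(treated) - sum(control) / len(control)

# Adjusted estimate: coefficient on treatment in y ~ 1 + treatment + score
X = [[1.0, float(w), s] for w, s in zip(treatment, scores)]
adjusted = ols(X, outcome)[1]

print(f"unadjusted: {unadjusted:.3f}")
print(f"adjusted:   {adjusted:.3f}")
```

In this toy data the treated arm happens to have higher prognostic scores, so the raw difference in means picks up that chance imbalance, while the regression that includes the score recovers the effect used to generate the data. The example does not reproduce the variance or power results of the paper.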

arXiv:2007.15769 [pdf, other]

Instrument variable detection with graph learning : an application to high dimensional GIS-census data for house pricing

Authors: Ning Xu, Timothy C. G. Fisher, Jian Hong

Abstract: Endogeneity bias and instrument variable validation have always been important topics in statistics and econometrics. In the era of big data, such issues typically combine with dimensionality issues and, hence, require even more attention. In this paper, we merge two well-known tools from machine learning and biostatistics (variable selection algorithms and probabilistic graphs) to estimate house prices and the corresponding causal structure using 2010 data on Sydney. The estimation uses a 200-gigabyte ultrahigh dimensional database consisting of local school data, GIS information, census data, house characteristics and other socio-economic records. Using "big data", we show that it is possible to perform a data-driven instrument selection efficiently and purge out the invalid instruments. Our approach improves the sparsity of variable selection, stability and robustness in the presence of high dimensionality, complicated causal structures and the consequent multicollinearity, and recovers a sparse and intuitive causal structure. The approach also reveals an efficiency and effectiveness in endogeneity detection, instrument validation, weak instrument pruning and the selection of valid instruments. From the perspective of machine learning, the estimation results both align with and confirm the facts of the Sydney house market, the classical economic theories and the previous findings of simultaneous equations modeling. Moreover, the estimation results are consistent with and supported by classical econometric tools such as two-stage least squares regression and different instrument tests. All the code may be found at https://github.com/isaac2math/solar_graph_learning.

Submitted 16 December, 2020; v1 submitted 30 July, 2020; originally announced July 2020.

Comments: introduction rewritten; detailed graph learning and variable selection procedure explained

arXiv:2007.15707 [pdf, other]

Solar: $L_0$ solution path averaging for fast and accurate variable selection in high-dimensional data

Authors: Ning Xu, Timothy C. G. Fisher

Abstract: We propose a new variable selection algorithm, subsample-ordered least-angle regression (solar), and its coordinate descent generalization, solar-cd. Solar re-constructs lasso paths using the $L_0$ norm and averages the resulting solution paths across subsamples. Path averaging retains the ranking information of the informative variables while averaging out sensitivity to high dimensionality, improving variable selection stability, efficiency, and accuracy. We prove that: (i) with a high probability, path averaging perfectly separates informative variables from redundant variables on the average $L_0$ path; (ii) solar variable selection is consistent and accurate; and (iii) the probability that solar omits weak signals is controllable for finite sample size. We also demonstrate that: (i) solar yields, with less than $1/3$ of the lasso computation load, substantial improvements over lasso in terms of the sparsity (64-84% reduction in redundant variable selection) and accuracy of variable selection; (ii) compared with the lasso safe/strong rule and variable screening, solar largely avoids selection of redundant variables and rejection of informative variables in the presence of complicated dependence structures; (iii) the sparsity and stability of solar conserves residual degrees of freedom for data-splitting hypothesis testing, improving the accuracy of post-selection inference on weak signals with limited $n$; (iv) replacing lasso with solar in bootstrap selection (e.g., bolasso or stability selection) produces a multi-layer variable ranking scheme that improves selection sparsity and ranking accuracy with the computation load of only one lasso realization; and (v) given the computation resources, solar bootstrap selection is substantially faster (98% lower computation time) than the theoretical maximum speedup for parallelized bootstrap lasso (confirmed by Amdahl's law).

Submitted 5 May, 2022; v1 submitted 30 July, 2020; originally announced July 2020.

arXiv:2007.15614 [pdf, ps, other]

Accuracy and stability of solar variable selection comparison under complicated dependence structures

Authors: Ning Xu, Timothy C. G. Fisher, Jian Hong

Abstract: In this paper we focus on the empirical variable-selection performance of subsample-ordered least angle regression (Solar), a novel ultrahigh dimensional redesign of lasso, on empirical data with complicated dependence structures and, hence, severe multicollinearity and grouping effect issues. Previous research shows that Solar largely alleviates several known high-dimensional issues with least-angle regression and $\mathcal{L}_1$ shrinkage. Also, with the same computation load, solar yields substantial improvements over two lasso solvers (least-angle regression for lasso and coordinate-descent) in terms of the sparsity (37-64% reduction in the average number of selected variables), stability and accuracy of variable selection. Simulations also demonstrate that solar enhances the robustness of variable selection to different settings of the irrepresentable condition and to variations in the dependence structures assumed in regression analysis. To confirm that the improvements are also available for empirical research, we choose the prostate cancer data and the Sydney house price data and apply two lasso solvers, elastic net and Solar to them for comparison. The results show that (i) lasso is affected by the grouping effect and randomly drops variables with high correlations, resulting in unreliable and uninterpretable results; (ii) elastic net is more robust to the grouping effect; however, it completely loses variable-selection sparsity when the dependence structure of the data is complicated; (iii) solar demonstrates its superior robustness to complicated dependence structures and the grouping effect, returning variable-selection results with better stability and sparsity. The code can be found at https://github.com/isaac2math/solar_application

Submitted 16 December, 2020; v1 submitted 30 July, 2020; originally announced July 2020.

Comments: Minor errors on data and table fixed; to focus on variable selection, causal inference moved to arXiv:2007.15769

arXiv:2007.15598 [pdf, other]

Rademacher upper bounds for cross-validation errors with an application to the lasso

Authors: Ning Xu, Timothy C. G. Fisher, Jian Hong

Abstract: We establish a general upper bound for $K$-fold cross-validation ($K$-CV) errors that can be adapted to many $K$-CV-based estimators and learning algorithms. Based on Rademacher complexity of the model and the Orlicz-$Ψ_ν$ norm of the error process, the CV error upper bound applies to both light-tail and heavy-tail error distributions. We also extend the CV error upper bound to $β$-mixing data using the technique of independent blocking. We provide a Python package (CVbound, https://github.com/isaac2math) for computing the CV error upper bound in $K$-CV-based algorithms. Using the lasso as an example, we demonstrate in simulations that the upper bounds are tight and stable across different parameter settings and random seeds. As well as accurately bounding the CV errors for the lasso, the minimizer of the new upper bounds can be used as a criterion for variable selection. Compared with the CV-error minimizer, simulations show that tuning the lasso penalty parameter according to the minimizer of the upper bound yields a more sparse and more stable model that retains all of the relevant variables.

Submitted 30 July, 2020; originally announced July 2020.

arXiv:2004.10106 [pdf, other]

doi 10.3847/1538-3881/ab9176

Information content of JWST-NIRSPEC transmission spectra of warm Neptunes

Authors: Andrea Guzmán-Mesa, Daniel Kitzmann, Chloe Fisher, Adam J. Burgasser, H. Jens Hoeijmakers, Pablo Márquez-Neila, Simon L. Grimm, Avi M. Mandell, Raphael Sznitman, Kevin Heng

Abstract: Warm Neptunes offer a rich opportunity for understanding exo-atmospheric chemistry. With the upcoming James Webb Space Telescope (JWST), there is a need to elucidate the balance between investments in telescope time versus scientific yield. We use the supervised machine learning method of the random forest to perform an information content analysis on an 11-parameter model of transmission spectra from the various NIRSpec modes. The three bluest medium-resolution NIRSpec modes (0.7 - 1.27 microns, 0.97 - 1.84 microns, 1.66 - 3.07 microns) are insensitive to the presence of CO. The reddest medium-resolution mode (2.87 - 5.10 microns) is sensitive to all of the molecules assumed in our model: CO, CO2, CH4, C2H2, H2O, HCN and NH3. It competes effectively with the three bluest modes on the information encoded on cloud abundance and particle size. It is also competitive with the low-resolution prism mode (0.6 - 5.3 microns) on the inference of every parameter except for the temperature and ammonia abundance. We recommend that astronomers use the reddest medium-resolution NIRSpec mode for studying the atmospheric chemistry of 800-1200 K warm Neptunes; its corresponding high-resolution counterpart offers diminishing returns. We compare our findings to previous JWST information content analyses that favor the blue orders, and suggest that the reliance on chemical equilibrium could lead to biased outcomes if this assumption does not apply. A simple, pressure-independent diagnostic for identifying chemical disequilibrium is proposed based on measuring the abundances of H2O, CO and CO2.

Submitted 7 May, 2020; v1 submitted 21 April, 2020; originally announced April 2020.

Comments: 21 pages, 17 figures, 3 tables. Accepted by AJ

arXiv:2004.02099 [pdf, other]

doi 10.1109/BigData47090.2019.9005548

Learning and Recognizing Archeological Features from LiDAR Data

Authors: Conrad M Albrecht, Chris Fisher, Marcus Freitag, Hendrik F Hamann, Sharathchandra Pankanti, Florencia Pezzutti, Francesca Rossi

Abstract: We present a remote sensing pipeline that processes LiDAR (Light Detection And Ranging) data through machine & deep learning for the application of archeological feature detection on big geo-spatial data platforms such as IBM PAIRS Geoscope. Today, archeologists get overwhelmed by the task of visually surveying huge amounts of (raw) LiDAR data in order to identify areas of interest for inspection on the ground. We showcase a software system pipeline that results in significant savings in terms of expert productivity while missing only a small fraction of the artifacts. Our work employs artificial neural networks in conjunction with an efficient spatial segmentation procedure based on domain knowledge. Data processing is constrained by a limited amount of training labels and noisy LiDAR signals due to vegetation cover and decay of ancient structures. We aim at identifying geo-spatial areas with archeological artifacts in a supervised fashion allowing the domain expert to flexibly tune parameters based on her needs.

Submitted 5 April, 2020; originally announced April 2020.

Journal ref: 2019 IEEE International Conference on Big Data (Big Data)

arXiv:2002.02779 [pdf, other]

Generating Digital Twins with Multiple Sclerosis Using Probabilistic Neural Networks

Authors: Jonathan R. Walsh, Aaron M. Smith, Yannick Pouliot, David Li-Bland, Anton Loukianov, Charles K. Fisher

Abstract: Multiple Sclerosis (MS) is a neurodegenerative disorder characterized by a complex set of clinical assessments. We use an unsupervised machine learning model called a Conditional Restricted Boltzmann Machine (CRBM) to learn the relationships between covariates commonly used to characterize subjects and their disease progression in MS clinical trials. A CRBM is capable of generating digital twins, which are simulated subjects having the same baseline data as actual subjects. Digital twins allow for subject-level statistical analyses of disease progression. The CRBM is trained using data from 2395 subjects enrolled in the placebo arms of clinical trials across the three primary subtypes of MS. We discuss how CRBMs are trained and show that digital twins generated by the model are statistically indistinguishable from their actual subject counterparts along a number of measures.

Submitted 19 April, 2020; v1 submitted 3 February, 2020; originally announced February 2020.

arXiv:1910.11795 [pdf, other]

doi 10.3847/1538-3881/ab5955

Supervised Machine Learning for Intercomparison of Model Grids of Brown Dwarfs: Application to GJ 570D and the Epsilon Indi B Binary System

Authors: Maria Oreshenko, Daniel Kitzmann, Pablo Marquez-Neila, Matej Malik, Brendan P. Bowler, Adam J. Burgasser, Raphael Sznitman, Chloe E. Fisher, Kevin Heng

Abstract: Self-consistent model grids of brown dwarfs involve complex physics and chemistry, and are often computed using proprietary computer codes, making it challenging to identify the reasons for discrepancies between model and data as well as between the models produced by different research groups. In the current study, we demonstrate a novel method for analyzing brown dwarf spectra, which combines the use of the Sonora, AMES-Cond and HELIOS model grids with the supervised machine learning method of the random forest. Besides performing atmospheric retrieval, the random forest enables information content analysis of the three model grids as a natural outcome of the method, both individually on each grid and by comparing the grids against one another, via computing large suites of mock retrievals. Our analysis reveals that the different choices made in modelling the alkali line shapes hinder the use of the alkali lines as gravity indicators. Nevertheless, the spectrum longward of 1.2 micron encodes enough information on the surface gravity to allow its inference from retrieval. Temperature may be accurately and precisely inferred independent of the choice of model grid, but not the surface gravity. We apply random forest retrieval to three objects: the benchmark T7.5 brown dwarf GJ 570D; and Epsilon Indi Ba (T1.5 brown dwarf) and Bb (T6 brown dwarf), which are part of a binary system and have measured dynamical masses. For GJ 570D, the inferred effective temperature and surface gravity are consistent with previous studies. For Epsilon Indi Ba and Bb, the inferred surface gravities are broadly consistent with the values informed by the dynamical masses.

Submitted 18 December, 2019; v1 submitted 25 October, 2019; originally announced October 2019.

Comments: Accepted for publication in The Astronomical Journal

arXiv:1910.11627 [pdf, other]

doi 10.3847/1538-3881/ab7a92

Interpreting High-Resolution Spectroscopy of Exoplanets Using Cross-Correlations and Supervised Machine Learning

Authors: Chloe Fisher, H. Jens Hoeijmakers, Daniel Kitzmann, Pablo Márquez-Neila, Simon L. Grimm, Raphael Sznitman, Kevin Heng

Abstract: We present a new method for performing atmospheric retrieval on ground-based, high-resolution data of exoplanets. Our method combines cross-correlation functions with a random forest, a supervised machine learning technique, to overcome challenges associated with high-resolution data. A series of cross-correlation functions are concatenated to give a "CCF-sequence" for each model atmosphere, which reduces the dimensionality by a factor of ~100. The random forest, trained on our grid of ~65,000 models, provides a likelihood-free method of retrieval. The pre-computed grid spans 31 values of both temperature and metallicity, and incorporates a realistic noise model. We apply our method to HARPS-N observations of the ultra-hot Jupiter KELT-9b, and obtain a metallicity consistent with solar (logM = $-0.2\pm0.2$). Our retrieved transit chord temperature (T = $6000^{+0}_{-200}$K) is unreliable as the ion cross-correlations lie outside of the training set, which we interpret as being indicative of missing physics in our atmospheric model. We compare our method to traditional nested-sampling, as well as other machine learning techniques, such as Bayesian neural networks. We demonstrate that the likelihood-free aspect of the random forest makes it more robust than nested-sampling to different error distributions, and that the Bayesian neural network we tested is unable to reproduce complex posteriors. We also address the claim in Cobb et al. (2019) that our random forest retrieval technique can be over-confident but incorrect. We show that this is an artefact of the training set, rather than the machine learning method, and that the posteriors agree with those obtained using nested-sampling.

Submitted 29 February, 2020; v1 submitted 25 October, 2019; originally announced October 2019.

Comments: 15 pages, 18 figures

arXiv:1906.07035 [pdf, other]

doi 10.3847/1538-4357/ab29e8

How Much Information Does the Sodium Doublet Encode? Retrieval Analysis of Non-LTE Sodium Lines at Low and High Spectral Resolutions

Authors: Chloe Fisher, Kevin Heng

Abstract: Motivated by both ground- and space-based detections of the sodium doublet in the transmission spectra of exoplanetary atmospheres, we revisit the theory and interpretation of sodium lines in non-local thermodynamic equilibrium (NLTE), where collisions are not efficient enough to maintain a Boltzmann distribution for the excited and ground states of the sodium atom. We consider non-Boltzmann distributions that account for the ineffectiveness of collisions. We analyze the sodium doublet in transmission spectra measured at low (HAT-P-1b, HAT-P-12b, HD 189733b, WASP-6b, WASP-17b and WASP-39b) and high (WASP-49b) spectral resolutions. Nested-sampling retrievals performed on low-resolution optical/visible transmission spectra are unable to break the normalization degeneracy if the spectral continuum is associated with Rayleigh scattering by small cloud particles. Using mock retrievals, we demonstrate that un-normalized ground-based, high-resolution spectra centered on the sodium doublet alone are unable to precisely inform us about the pressure levels probed by the transit chord and hence to identify the region (i.e., thermosphere, exosphere) of the atmosphere being probed. Retrievals performed on the HARPS transmission spectrum of WASP-49b support this conclusion. Generally, we are unable to distinguish between LTE versus NLTE interpretations of the sodium doublet based on the computed Bayesian evidence with the implication that LTE interpretations tend to under-estimate the temperature probed by the transit chord. With the current low-resolution data, the sodium line shapes are consistent with Voigt profiles without the need for sub-Lorentzian wings. The retrieved sodium abundances are consistent with being sub-solar to solar.

Submitted 17 June, 2019; originally announced June 2019.

Comments: Accepted for publication in ApJ. 19 pages, 11 figures

arXiv:1905.02096 [pdf, other]

doi 10.1051/0004-6361/201935089

A spectral survey of an ultra-hot Jupiter: Detection of metals in the transmission spectrum of KELT-9 b

Authors: H. J. Hoeijmakers, D. Ehrenreich, D. Kitzmann, R. Allart, S. L. Grimm, J. V. Seidel, A. Wyttenbach, L. Pino, L. D. Nielsen, C. Fisher, P. B. Rimmer, V. Bourrier, H. M. Cegla, B. Lavie, C. Lovis, A. B. C. Patzer, J. W. Stock, F. A. Pepe, Kevin Heng

Abstract: Context: KELT-9 b exemplifies a newly emerging class of short-period gaseous exoplanets that tend to orbit hot, early type stars - termed ultra-hot Jupiters. The severe stellar irradiation heats their atmospheres to temperatures of $\sim 4,000$ K, similar to the photospheres of dwarf stars. Due to the absence of aerosols and complex molecular chemistry at such temperatures, these planets offer the potential of detailed chemical characterisation through transit and day-side spectroscopy. Studies of their chemical inventories may provide crucial constraints on their formation process and evolution history. Aims: To search the optical transmission spectrum of KELT-9 b for absorption lines by metals using the cross-correlation technique. Methods: We analyse 2 transits observed with the HARPS-N spectrograph. We use an isothermal equilibrium chemistry model to predict the transmission spectrum for each of the neutral and singly-ionized atoms with atomic numbers between 3 and 78. Of these, we identify the elements that are expected to have spectral lines in the visible wavelength range and use those as cross-correlation templates. Results: We detect absorption of Na I, Cr II, Sc II and Y II, and confirm previous detections of Mg I, Fe I, Fe II and Ti II. In addition, we find evidence of Ca I, Cr I, Co I, and Sr II that will require further observations to verify. The detected absorption lines are significantly deeper than model predictions, suggesting that material is transported to higher altitudes where the density is enhanced compared to a hydrostatic profile. There appears to be no significant blue-shift of the absorption spectrum due to a net day-to-night side wind. In particular, the strong Fe II feature is shifted by $0.18 \pm 0.27$ km s$^{-1}$, consistent with zero. Using the orbital velocity of the planet we revise the stellar and planetary masses and radii.

Submitted 6 May, 2019; originally announced May 2019.

Comments: Submitted to Astronomy and Astrophysics on January 18, 2019. Accepted on May 3, 2019. 26 pages, 11 figures

Journal ref: A&A 627, A165 (2019)

arXiv:1903.06490 [pdf, other]

doi 10.18637/jss.v096.i01

colorspace: A Toolbox for Manipulating and Assessing Colors and Palettes

Authors: Achim Zeileis, Jason C. Fisher, Kurt Hornik, Ross Ihaka, Claire D. McWhite, Paul Murrell, Reto Stauffer, Claus O. Wilke

Abstract: The R package colorspace provides a flexible toolbox for selecting individual colors or color palettes, manipulating these colors, and employing them in statistical graphics and data visualizations. In particular, the package provides a broad range of color palettes based on the HCL (Hue-Chroma-Luminance) color space. The three HCL dimensions have been shown to match those of the human visual system very well, thus facilitating intuitive selection of color palettes through trajectories in this space. Using the HCL color model general strategies for three types of palettes are implemented: (1) Qualitative for coding categorical information, i.e., where no particular ordering of categories is available. (2) Sequential for coding ordered/numeric information, i.e., going from high to low (or vice versa). (3) Diverging for coding ordered/numeric information around a central neutral value, i.e., where colors diverge from neutral to two extremes. To aid selection and application of these palettes the package also contains scales for use with ggplot2, shiny (and tcltk) apps for interactive exploration, visualizations of palette properties, accompanying manipulation utilities (like desaturation and lighten/darken), and emulation of color vision deficiencies.

Submitted 14 March, 2019; originally announced March 2019.

Journal ref: Journal of Statistical Software, Volume 96, Issue 1 (2020), 1-49

arXiv:1902.00001 [pdf, other]

doi 10.1051/0004-6361/201834776

Hot Exoplanet Atmospheres Resolved with Transit Spectroscopy (HEARTS) - II. A broadened sodium feature on the ultra-hot giant WASP-76b

Authors: J. V. Seidel, D. Ehrenreich, A. Wyttenbach, R. Allart, M. Lendl, L. Pino, V. Bourrier, H. M. Cegla, C. Lovis, D. Barrado, D. Bayliss, N. Astudillo-Defru, A. Deline, C. Fisher, K. Heng, R. Joseph, B. Lavie, C. Melo, F. Pepe, D. Ségrasan, S. Udry

Abstract: High-resolution optical spectroscopy is a powerful tool to characterise exoplanetary atmospheres from the ground. The sodium D lines, with their large cross sections, are especially suited to study the upper layers of atmospheres in this context. We report on the results from HEARTS, a spectroscopic survey of exoplanet atmospheres, performing a comparative study of hot gas giants to determine the effects of stellar irradiation. In this second installation of the series, we highlight the detection of neutral sodium on the ultra-hot giant WASP-76b. We observed three transits of the planet using the HARPS high-resolution spectrograph at the ESO 3.6m telescope and collected 175 spectra of WASP-76. We repeatedly detect the absorption signature of neutral sodium in the planet atmosphere ($0.371\pm0.034\%$; $10.75 σ$ in a $0.75$ Å passband). The sodium lines have a Gaussian profile with full width at half maximum (FWHM) of $27.6\pm2.8$ km s$^{-1}$. This is significantly broader than the line spread function of HARPS ($2.7$ km s$^{-1}$). We surmise that the observed broadening could trace the super-rotation in the upper atmosphere of this ultra-hot gas giant.

Submitted 30 January, 2019; originally announced February 2019.

Comments: 11 pages, 9 figures; accepted by Astronomy and Astrophysics (29.01.2019)

Journal ref: A&A 623, A166 (2019)

arXiv:1809.06894 [pdf, other]

doi 10.1093/mnras/sty2550

Retrieval analysis of 38 WFC3 transmission spectra and resolution of the normalisation degeneracy

Authors: Chloe Fisher, Kevin Heng

Abstract: A comprehensive analysis of 38 previously published Wide Field Camera 3 (WFC3) transmission spectra is performed using a hierarchy of nested-sampling retrievals: with versus without clouds, grey versus non-grey clouds, isothermal versus non-isothermal transit chords and with water, hydrogen cyanide and/or ammonia. We revisit the "normalisation degeneracy": the relative abundances of molecules are degenerate at the order-of-magnitude level with the absolute normalisation of the transmission spectrum. Using a suite of mock retrievals, we demonstrate that the normalisation degeneracy may be partially broken using WFC3 data alone, even in the absence of optical/visible data and without appealing to the presence of patchy clouds, although lower limits to the mixing ratios may be prior-dominated depending on the measurement uncertainties. With James Webb Space Telescope-like spectral resolutions, the normalisation degeneracy may be completely broken from infrared spectra alone. We find no trend in the retrieved water abundances across nearly two orders of magnitude in exoplanet mass and a factor of 5 in retrieved temperature (about 500 to 2500 K). We further show that there is a general lack of strong Bayesian evidence to support interpretations of non-grey over grey clouds (only for WASP-69b and WASP-76b) and non-isothermal over isothermal atmospheres (no objects). 35 out of 38 WFC3 transmission spectra are well-fitted by an isothermal transit chord with grey clouds and water only, while 8 are adequately explained by flat lines. Generally, the cloud composition is unconstrained.

Submitted 18 September, 2018; originally announced September 2018.

Comments: Accepted by MNRAS. 33 pages, 29 figures, 3 tables

arXiv:1807.03876 [pdf, other]

doi 10.1038/s41598-019-49656-2

Deep learning for comprehensive forecasting of Alzheimer's Disease progression

Authors: Charles K. Fisher, Aaron M. Smith, Jonathan R. Walsh, the Coalition Against Major Diseases

Abstract: Most approaches to machine learning from electronic health data can only predict a single endpoint. Here, we present an alternative that uses unsupervised deep learning to simulate detailed patient trajectories. We use data comprising 18-month trajectories of 44 clinical variables from 1908 patients with Mild Cognitive Impairment or Alzheimer's Disease to train a model for personalized forecasting of disease progression. We simulate synthetic patient data including the evolution of each sub-component of cognitive exams, laboratory tests, and their associations with baseline clinical characteristics, generating both predictions and their confidence intervals. Our unsupervised model predicts changes in total ADAS-Cog scores with the same accuracy as specifically trained supervised models and identifies sub-components associated with word recall as predictive of progression. The ability to simultaneously simulate dozens of patient characteristics is a crucial step towards personalized medicine for Alzheimer's Disease.

Submitted 7 November, 2018; v1 submitted 10 July, 2018; originally announced July 2018.

arXiv:1806.06608 [pdf, other]

Variability in IC5070: two young stars with deep recurring eclipses

Authors: D. Froebrich, A. Scholz, J. Campbell-White, J. Crumpton, E. D'Arcy, S. V. Makin, T. Zegmott, S. J. Billington, R. Hibbert, R. J. Newport, C. R. Fisher

Abstract: We present two low-mass YSOs in IC5070 (V1490Cyg, V1706Cyg) with deep recurring eclipses.

Submitted 18 June, 2018; originally announced June 2018.

Comments: Accepted for publication by RNAAS, 2 pages, 1 figure, full version with full appendix available at http://astro.kent.ac.uk/~df/papers.html

arXiv:1806.03944 [pdf, other]

Supervised Machine Learning for Analysing Spectra of Exoplanetary Atmospheres

Authors: Pablo Marquez-Neila, Chloe Fisher, Raphael Sznitman, Kevin Heng

Abstract: The use of machine learning is becoming ubiquitous in astronomy, but remains rare in the study of the atmospheres of exoplanets. Given the spectrum of an exoplanetary atmosphere, a multi-parameter space is swept through in real time to find the best-fit model. Known as atmospheric retrieval, it is a technique that originates from the Earth and planetary sciences. Such methods are very time-consuming and by necessity there is a compromise between physical and chemical realism versus computational feasibility. Machine learning has previously been used to determine which molecules to include in the model, but the retrieval itself was still performed using standard methods. Here, we report an adaptation of the random forest method of supervised machine learning, trained on a pre-computed grid of atmospheric models, which retrieves full posterior distributions of the abundances of molecules and the cloud opacity. The use of a pre-computed grid allows a large part of the computational burden to be shifted offline. We demonstrate our technique on a transmission spectrum of the hot gas-giant exoplanet WASP-12b using a five-parameter model (temperature, a constant cloud opacity and the volume mixing ratios or relative abundance by number of water, ammonia and hydrogen cyanide). We obtain results consistent with the standard nested-sampling retrieval method. Additionally, we can estimate the sensitivity of the measured spectrum to constraining the model parameters and we can quantify the information content of the spectrum. Our method can be straightforwardly applied using more sophisticated atmospheric models and also to interpreting an ensemble of spectra without having to retrain the random forest.

Submitted 11 June, 2018; originally announced June 2018.

Comments: 11 pages, 7 figures, 1 table

arXiv:1804.08682 [pdf, other]

Boltzmann Encoded Adversarial Machines

Authors: Charles K. Fisher, Aaron M. Smith, Jonathan R. Walsh

Abstract: Restricted Boltzmann Machines (RBMs) are a class of generative neural network that are typically trained to maximize a log-likelihood objective function. We argue that likelihood-based training strategies may fail because the objective does not sufficiently penalize models that place a high probability in regions where the training data distribution has low probability. To overcome this problem, we introduce Boltzmann Encoded Adversarial Machines (BEAMs). A BEAM is an RBM trained against an adversary that uses the hidden layer activations of the RBM to discriminate between the training data and the probability distribution generated by the model. We present experiments demonstrating that BEAMs outperform RBMs and GANs on multiple benchmarks.

Submitted 23 April, 2018; originally announced April 2018.

arXiv:1803.08823 [pdf, other]

doi 10.1016/j.physrep.2019.03.001

A high-bias, low-variance introduction to Machine Learning for physicists

Authors: Pankaj Mehta, Marin Bukov, Ching-Hao Wang, Alexandre G. R. Day, Clint Richardson, Charles K. Fisher, David J. Schwab

Abstract: Machine Learning (ML) is one of the most exciting and dynamic areas of modern research and application. The purpose of this review is to provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists. The review begins by covering fundamental concepts in ML and modern statistics such as the bias-variance tradeoff, overfitting, regularization, generalization, and gradient descent before moving on to more advanced topics in both supervised and unsupervised learning. Topics covered in the review include ensemble models, deep learning and neural networks, clustering and data visualization, energy-based models (including MaxEnt models and Restricted Boltzmann Machines), and variational methods. Throughout, we emphasize the many natural connections between ML and statistical physics. A notable aspect of the review is the use of Python Jupyter notebooks to introduce modern ML/statistical packages to readers using physics-inspired datasets (the Ising Model and Monte-Carlo simulations of supersymmetric decays of proton-proton collisions). We conclude with an extended outlook discussing possible uses of machine learning for furthering our understanding of the physical world as well as open problems in ML where physicists may be able to contribute. (Notebooks are available at https://physics.bu.edu/~pankajm/MLnotebooks.html )

Submitted 27 May, 2019; v1 submitted 23 March, 2018; originally announced March 2018.

Comments: Notebooks have been updated. 122 pages, 78 figures, 20 Python notebooks

Journal ref: Physics Reports 810 (2019) 1-124

arXiv:1802.01081 [pdf, other]

doi 10.3847/1538-3881/aabb03

A search for technosignatures from 14 planetary systems in the Kepler field with the Green Bank Telescope at 1.15-1.73 GHz

Authors: Jean-Luc Margot, Adam H. Greenberg, Pavlo Pinchuk, Akshay Shinde, Yashaswi Alladi, Srinivas Prasad MN, M. Oliver Bowman, Callum Fisher, Szilard Gyalay, Willow McKibbin, Brittany Miles, Donald Nguyen, Conor Power, Namrata Ramani, Rashmi Raviprasad, Jesse Santana, Ryan S. Lynch

Abstract: Analysis of Kepler mission data suggests that the Milky Way includes billions of Earth-like planets in the habitable zone of their host star. Current technology enables the detection of technosignatures emitted from a large fraction of the Galaxy. We describe a search for technosignatures that is sensitive to Arecibo-class transmitters located within ~420 ly of Earth and transmitters that are 1000 times more effective than Arecibo within ~13 000 ly of Earth. Our observations focused on 14 planetary systems in the Kepler field and used the L-band receiver (1.15-1.73 GHz) of the 100 m diameter Green Bank Telescope. Each source was observed for a total integration time of 5 minutes. We obtained power spectra at a frequency resolution of 3 Hz and examined narrowband signals with Doppler drift rates between +/-9 Hz/s. We flagged any detection with a signal-to-noise ratio in excess of 10 as a candidate signal and identified approximately 850 000 candidates. Most (99%) of these candidate signals were automatically classified as human-generated radio-frequency interference (RFI). A large fraction (>99%) of the remaining candidate signals were also flagged as anthropogenic RFI because they have frequencies that overlap those used by global navigation satellite systems, satellite downlinks, or other interferers detected in heavily polluted regions of the spectrum. All 19 remaining candidate signals were scrutinized and none were attributable to an extraterrestrial source.

Submitted 30 March, 2018; v1 submitted 4 February, 2018; originally announced February 2018.

Comments: 15 pages, 5 figures, accepted for publication in the Astronomical Journal

arXiv:1705.07349 [pdf, other]

$\left( β, \varpi \right)$-stability for cross-validation and the choice of the number of folds

Authors: Ning Xu, Jian Hong, Timothy C. G. Fisher

Abstract: In this paper, we introduce a new concept of stability for cross-validation, called the $\left( β, \varpi \right)$-stability, and use it as a new perspective to build the general theory for cross-validation. The $\left( β, \varpi \right)$-stability mathematically connects the generalization ability and the stability of the cross-validated model via the Rademacher complexity. Our result reveals mathematically the effect of cross-validation from two sides: on one hand, cross-validation picks the model with the best empirical generalization ability by validating all the alternatives on test sets; on the other hand, cross-validation may compromise the stability of the model selection by causing subsampling error. Moreover, the difference between training and test errors in the $q$th round, sometimes referred to as the generalization error, might be autocorrelated in $q$. Guided by the ideas above, the $\left( β, \varpi \right)$-stability helps us derive a new class of Rademacher bounds, referred to as the one-round/convoluted Rademacher bounds, for the stability of cross-validation in both the i.i.d. and non-i.i.d. cases. For both light-tail and heavy-tail losses, the new bounds quantify the stability of the one-round/average test error of the cross-validated model in terms of its one-round/average training error, the sample sizes $n$, number of folds $K$, the tail property of the loss (encoded as Orlicz-$Ψ_ν$ norms) and the Rademacher complexity of the model class $Λ$. The new class of bounds not only quantitatively reveals the stability of the generalization ability of the cross-validated model, it also shows empirically the optimal choice for number of folds $K$, at which the upper bound of the one-round/average test error is lowest, or, to put it in another way, where the test error is most stable.

Submitted 5 July, 2017; v1 submitted 20 May, 2017; originally announced May 2017.

arXiv:1610.06683 [pdf]

New Survey Questions and Estimators for Network Clustering with Respondent-Driven Sampling Data

Authors: Ashton M. Verdery, Jacob C. Fisher, Nalyn Siripong, Kahina Abdesselam, Shawn Bauldry

Abstract: Respondent-driven sampling (RDS) is a popular method for sampling hard-to-survey populations that leverages social network connections through peer recruitment. While RDS is most frequently applied to estimate the prevalence of infections and risk behaviors of interest to public health, like HIV/AIDS or condom use, it is rarely used to draw inferences about the structural properties of social networks among such populations because it does not typically collect the necessary data. Drawing on recent advances in computer science, we introduce a set of data collection instruments and RDS estimators for network clustering, an important topological property that has been linked to a network's potential for diffusion of information, disease, and health behaviors. We use simulations to explore how these estimators, originally developed for random walk samples of computer networks, perform when applied to RDS samples with characteristics encountered in realistic field settings that depart from random walks. In particular, we explore the effects of multiple seeds, without vs. with replacement, branching chains, imperfect response rates, preferential recruitment, and misreporting of ties. We find that clustering coefficient estimators retain desirable properties in RDS samples. This paper takes an important step towards calculating network characteristics using non-traditional sampling methods, and it expands RDS's potential to tell researchers more about hidden populations and the social factors driving disease prevalence.

Submitted 21 October, 2016; originally announced October 2016.

Comments: 47 pages, 4 figures, 5 tables

arXiv:1610.05448 [pdf, other]

Generalization error minimization: a new approach to model evaluation and selection with an application to penalized regression

Authors: Ning Xu, Jian Hong, Timothy C. G. Fisher

Abstract: We study model evaluation and model selection from the perspective of generalization ability (GA): the ability of a model to predict outcomes in new samples from the same population. We believe that GA is one way formally to address concerns about the external validity of a model. The GA of a model estimated on a sample can be measured by its empirical out-of-sample errors, called the generalization errors (GE). We derive upper bounds for the GE, which depend on sample sizes, model complexity and the distribution of the loss function. The upper bounds can be used to evaluate the GA of a model, ex ante. We propose using generalization error minimization (GEM) as a framework for model selection. Using GEM, we are able to unify a big class of penalized regression estimators, including lasso, ridge and bridge, under the same set of assumptions. We establish finite-sample and asymptotic properties (including $\mathcal{L}_2$-consistency) of the GEM estimator for both the $n \geqslant p$ and the $n < p$ cases. We also derive the $\mathcal{L}_2$-distance between the penalized and corresponding unpenalized regression estimates. In practice, GEM can be implemented by validation or cross-validation. We show that the GE bounds can be used for selecting the optimal number of folds in $K$-fold cross-validation. We propose a variant of $R^2$, the $GR^2$, as a measure of GA, which considers both in-sample and out-of-sample goodness of fit. Simulations are used to demonstrate our key results.

Submitted 18 October, 2016; originally announced October 2016.

Comments: The theoretical generalization and extension of arXiv:1606.00142 and arXiv:1609.03344

arXiv:1609.03344 [pdf, other]

Finite-sample and asymptotic analysis of generalization ability with an application to penalized regression

Authors: Ning Xu, Jian Hong, Timothy C. G. Fisher

Abstract: In this paper, we study the performance of extremum estimators from the perspective of generalization ability (GA): the ability of a model to predict outcomes in new samples from the same population. By adapting the classical concentration inequalities, we derive upper bounds on the empirical out-of-sample prediction errors as a function of the in-sample errors, in-sample data size, heaviness in the tails of the error distribution, and model complexity. We show that the error bounds may be used for tuning key estimation hyper-parameters, such as the number of folds $K$ in cross-validation. We also show how $K$ affects the bias-variance trade-off for cross-validation. We demonstrate that the $\mathcal{L}_2$-norm difference between penalized and the corresponding un-penalized regression estimates is directly explained by the GA of the estimates and the GA of empirical moment conditions. Lastly, we prove that all penalized regression estimates are $L_2$-consistent for both the $n \geqslant p$ and the $n < p$ cases. Simulations are used to demonstrate key results. Keywords: generalization ability, upper bound of generalization error, penalized regression, cross-validation, bias-variance trade-off, $\mathcal{L}_2$ difference between penalized and unpenalized regression, lasso, high-dimensional data.

Submitted 13 September, 2016; v1 submitted 12 September, 2016; originally announced September 2016.

Comments: The theoretical generalization and extension of arXiv:1606.00142

arXiv:1606.00142 [pdf, other]

Model selection consistency from the perspective of generalization ability and VC theory with an application to Lasso

Authors: Ning Xu, Jian Hong, Timothy C. G. Fisher

Abstract: Model selection is difficult to analyse yet theoretically and empirically important, especially for high-dimensional data analysis. Recently the least absolute shrinkage and selection operator (Lasso) has been applied in the statistical and econometric literature. Consistency of Lasso has been established under various conditions, some of which are difficult to verify in practice. In this paper, we study model selection from the perspective of generalization ability, under the framework of structural risk minimization (SRM) and Vapnik-Chervonenkis (VC) theory. The approach emphasizes the balance between the in-sample and out-of-sample fit, which can be achieved by using cross-validation to select a penalty on model complexity. We show that an exact relationship exists between the generalization ability of a model and model selection consistency. By implementing SRM and the VC inequality, we show that Lasso is L2-consistent for model selection under assumptions similar to those imposed on OLS. Furthermore, we derive a probabilistic bound for the distance between the penalized extremum estimator and the extremum estimator without penalty, which is dominated by overfitting. We also propose a new measurement of overfitting, GR2, based on generalization ability, that converges to zero if model selection is consistent. Using simulations, we demonstrate that the proposed CV-Lasso algorithm performs well in terms of model selection and overfitting control.

Submitted 1 June, 2016; originally announced June 2016.

arXiv:1511.09166 [pdf, other]

doi 10.1103/PhysRevE.94.022423

An analytically tractable model for community ecology with many species

Authors: Benjamin Dickens, Charles K. Fisher, Pankaj Mehta

Abstract: A fundamental problem in community ecology is to understand how ecological processes such as selection, drift, and immigration give rise to observed patterns in species composition and diversity. Here, we present a simple, analytically tractable, presence-absence (PA) model for community assembly and use it to ask how ecological traits such as the strength of competition, the amount of diversity, and demographic and environmental stochasticity affect species composition in a community. In the PA model, species are treated as stochastic binary variables that can either be present or absent in a community: species can immigrate into the community from a regional species pool and can go extinct due to competition and stochasticity. Despite its simplicity, the PA model reproduces the qualitative features of more complicated models of community assembly. In agreement with recent work on large, competitive Lotka-Volterra systems, the PA model exhibits distinct ecological behaviors organized around a special ("critical") point corresponding to Hubbell's neutral theory of biodiversity. These results suggest that the concepts of ecological "phases" and phase diagrams can provide a powerful framework for thinking about community ecology and that the PA model captures the essential ecological dynamics of community assembly.

Submitted 30 November, 2015; originally announced November 2015.

Comments: 15 pages, 4 figures

Journal ref: Phys. Rev. E 94, 022423 (2016)

arXiv:1510.00198 [pdf, other]

Habitat Fluctuations Drive Species Covariation in the Human Microbiota

Authors: Charles K. Fisher, Thierry Mora, Aleksandra M. Walczak

Abstract: Two species with similar resource requirements respond in a characteristic way to variations in their habitat -- their abundances rise and fall in concert. We use this idea to learn how bacterial populations in the microbiota respond to habitat conditions that vary from person-to-person across the human population. Our mathematical framework shows that habitat fluctuations are sufficient for explaining intra-bodysite correlations in relative species abundances from the Human Microbiome Project. We explicitly show that the relative abundances of phylogenetically related species are positively correlated and can be predicted from taxonomic relationships. We identify a small set of functional pathways related to metabolism and maintenance of the cell wall that form the basis of a common resource sharing niche space of the human microbiota.

Submitted 1 October, 2015; originally announced October 2015.

arXiv:1507.00725 [pdf]

doi 10.1117/1.JATIS.1.3.035001

Prime Focus Spectrograph for the Subaru telescope: massively multiplexed optical and near-infrared fiber spectrograph

Authors: Hajime Sugai, Naoyuki Tamura, Hiroshi Karoji, Atsushi Shimono, Naruhisa Takato, Masahiko Kimura, Youichi Ohyama, Akitoshi Ueda, Hrand Aghazarian, Marcio Vital de Arruda, Robert H. Barkhouser, Charles L. Bennett, Steve Bickerton, Alexandre Bozier, David F. Braun, Khanh Bui, Christopher M. Capocasale, Michael A. Carr, Bruno Castilho, Yin-Chang Chang, Hsin-Yo Chen, Richard C. Y. Chou, Olivia R. Dawson, Richard G. Dekany, Eric M. Ek, et al. (59 additional authors not shown)

Abstract: The Prime Focus Spectrograph (PFS) is an optical/near-infrared multifiber spectrograph with 2394 science fibers distributed across a 1.3-deg diameter field of view at the Subaru 8.2-m telescope. The wide wavelength coverage from 0.38 μm to 1.26 μm, with a resolving power of 3000, simultaneously strengthens its ability to target three main survey programs: cosmology, galactic archaeology and galaxy/AGN evolution. A medium resolution mode with a resolving power of 5000 for 0.71 μm to 0.89 μm will also be available by simply exchanging dispersers. We highlight some of the technological aspects of the design. To transform the telescope focal ratio, a broad-band coated microlens is glued to each fiber tip. A higher transmission fiber is selected for the longest part of the cable system, optimizing overall throughput; a fiber with low focal ratio degradation is selected for the fiber-positioner and fiber-slit components, minimizing the effects of fiber movements and fiber bending. Fiber positioning will be performed by a positioner consisting of two stages of piezo-electric rotary motors. The positions of these motors are measured by taking an image of artificially back-illuminated fibers with the metrology camera located in the Cassegrain container; the fibers are placed in the proper location by iteratively measuring and then adjusting the positions of the motors. Target light reaches one of the four identical fast-Schmidt spectrograph modules, each with three arms. The PFS project has passed several project-wide design reviews and is now in the construction phase.

Submitted 2 July, 2015; originally announced July 2015.

Comments: 19 pages, 11 figures

Journal ref: Journal of Astronomical Telescopes, Instruments, and Systems, 1(3), 035001 (2015)

arXiv:1506.00810 [pdf, ps, other]

Circle incidence theorems

Authors: J. Chris Fisher, Eberhard M. Schröder, Jan Stevens

Abstract: Larry Hoehn discovered a remarkable concurrence theorem about pentagrams. Draw circles through two consecutive vertices and the intersection points of the sides in between. Then the radical axes of each pair of consecutive circles are concurrent or parallel. In this note we prove a generalisation to n-gons.

Submitted 2 June, 2015; originally announced June 2015.

Comments: 18 pages

Journal ref: Forum Geometricorum 15 (2015), 211-228

Showing 1–50 of 67 results for author: Fisher, C