Search | arXiv e-print repository

Towards an optimal marked correlation function analysis for the detection of modified gravity

Authors: Martin Kärcher, Julien Bel, Sylvain de la Torre

Abstract: Modified gravity (MG) theories have emerged as a promising alternative to explain the late-time acceleration of the Universe. However, the detection of MG in observations of the large-scale structure remains challenging due to the screening mechanisms that obscure any deviations from General Relativity (GR) in high-density regions. The marked two-point correlation function offers a promising approach to potentially detect MG signals. This work investigates novel marks based on large-scale environment estimates but also that exploit the anti-correlation between objects in low- and high-density regions. This is the first time discreteness effects in density-dependent marked correlation functions are investigated in depth. We assess the performance of various marks to distinguish GR from MG by using the ELEPHANT simulations, comprised of realisations of GR as well as $f(R)$ and nDGP gravity. In addition, discreteness effects are studied using the high-density Covmos catalogues. We establish a robust method to correct for shot-noise effects that allows the recovery of the true signal with an accuracy below $5\%$ over a wide range of scales. We find such correction to be crucial to measure the amplitude of the marked correlation function in an unbiased manner. Furthermore, we demonstrate that marks, anti-correlating objects in low- and high-density regions, are among the most effective in distinguishing between MG and GR. We report differences in the marked correlation function between $f(R)$ with $|f_{R0}|=10^{-6}$ and GR simulations of the order of 3-5$\sigma$ in real space up to scales of about $80\, h^{-1} \, {\rm Mpc}$. The redshift-space monopole exhibits similar features and performances. The combination of the proposed $\tanh$-mark with shot-noise correction paves the way towards an optimal approach for the detection of MG in current and future galaxy spectroscopic surveys.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 27 pages, 22 figures, submitted to A&A

arXiv:2405.13491 [pdf, other]

Euclid. I. Overview of the Euclid mission

Authors: Euclid Collaboration, Y. Mellier, Abdurro'uf, J. A. Acevedo Barroso, A. Achúcarro, J. Adamek, R. Adam, G. E. Addison, N. Aghanim, M. Aguena, V. Ajani, Y. Akrami, A. Al-Bahlawan, A. Alavi, I. S. Albuquerque, G. Alestas, G. Alguero, A. Allaoui, S. W. Allen, V. Allevato, A. V. Alonso-Tetilla, B. Altieri, A. Alvarez-Candal, A. Amara, L. Amendola , et al. (1086 additional authors not shown)

Abstract: The current standard model of cosmology successfully describes a variety of measurements, but the nature of its main ingredients, dark matter and dark energy, remains unknown. Euclid is a medium-class mission in the Cosmic Vision 2015-2025 programme of the European Space Agency (ESA) that will provide high-resolution optical imaging, as well as near-infrared imaging and spectroscopy, over about 14,000 deg^2 of extragalactic sky. In addition to accurate weak lensing and clustering measurements that probe structure formation over half of the age of the Universe, its primary probes for cosmology, these exquisite data will enable a wide range of science. This paper provides a high-level overview of the mission, summarising the survey characteristics, the various data-processing steps, and data products. We also highlight the main science objectives and expected performance.

Submitted 22 May, 2024; originally announced May 2024.

Comments: Paper submitted as part of the A&A special issue 'Euclid on Sky'

arXiv:2312.00679 [pdf, other]

Euclid preparation. TBD. Galaxy power spectrum modelling in real space

Authors: Euclid Collaboration, A. Pezzotta, C. Moretti, M. Zennaro, A. Moradinezhad Dizgah, M. Crocce, E. Sefusatti, I. Ferrero, K. Pardede, A. Eggemeier, A. Barreira, R. E. Angulo, M. Marinucci, B. Camacho Quevedo, S. de la Torre, D. Alkhanishvili, M. Biagetti, M. -A. Breton, E. Castorina, G. D'Amico, V. Desjacques, M. Guidi, M. Kärcher, A. Oddo, M. Pellejero Ibanez , et al. (224 additional authors not shown)

Abstract: We investigate the accuracy of the perturbative galaxy bias expansion in view of the forthcoming analysis of the Euclid spectroscopic galaxy samples. We compare the performance of an Eulerian galaxy bias expansion, using state-of-art prescriptions from the effective field theory of large-scale structure (EFTofLSS), against a hybrid approach based on Lagrangian perturbation theory and high-resolution simulations. These models are benchmarked against comoving snapshots of the Flagship I N-body simulation at $z=(0.9,1.2,1.5,1.8)$, which have been populated with H$\alpha$ galaxies leading to catalogues of millions of objects within a volume of about $58\,h^{-3}\,{\rm Gpc}^3$. Our analysis suggests that both models can be used to provide a robust inference of the parameters $(h, \omega_{\rm c})$ in the redshift range under consideration, with comparable constraining power. We additionally determine the range of validity of the EFTofLSS model in terms of scale cuts and model degrees of freedom. From these tests, it emerges that the standard third-order Eulerian bias expansion can accurately describe the full shape of the real-space galaxy power spectrum up to the maximum wavenumber $k_{\rm max}=0.45\,h\,{\rm Mpc}^{-1}$, even with a measurement precision well below the percent level. In particular, this is true for a configuration with six free nuisance parameters, including local and non-local bias parameters, a matter counterterm, and a correction to the shot-noise contribution. Fixing either tidal bias parameters to physically-motivated relations still leads to unbiased cosmological constraints. We finally repeat our analysis assuming a volume that matches the expected footprint of Euclid, but without considering observational effects, as purity and completeness, showing that we can get consistent cosmological constraints over this range of scales and redshifts.

Submitted 1 December, 2023; originally announced December 2023.

Comments: 38 pages, 19 figures

arXiv:2109.07629 [pdf, other]

How trustworthy is your tree? Bayesian phylogenetic effective sample size through the lens of Monte Carlo error

Authors: Andrew F. Magee, Michael D. Karcher, Frederick A. Matsen IV, Vladimir N. Minin

Abstract: Bayesian inference is a popular and widely-used approach to infer phylogenies (evolutionary trees). However, despite decades of widespread application, it remains difficult to judge how well a given Bayesian Markov chain Monte Carlo (MCMC) run explores the space of phylogenetic trees. In this paper, we investigate the Monte Carlo error of phylogenies, focusing on high-dimensional summaries of the posterior distribution, including variability in estimated edge/branch (known in phylogenetics as "split") probabilities and tree probabilities, and variability in the estimated summary tree. Specifically, we ask if there is any measure of effective sample size (ESS) applicable to phylogenetic trees which is capable of capturing the Monte Carlo error of these three summary measures. We find that there are some ESS measures capable of capturing the error inherent in using MCMC samples to approximate the posterior distributions on phylogenies. We term these tree ESS measures, and identify a set of three which are useful in practice for assessing the Monte Carlo error. Lastly, we present visualization tools that can improve comparisons between multiple independent MCMC runs by accounting for the Monte Carlo error present in each chain. Our results indicate that common post-MCMC workflows are insufficient to capture the inherent Monte Carlo error of the tree, and highlight the need for both within-chain mixing and between-chain convergence assessments.

Submitted 3 September, 2022; v1 submitted 15 September, 2021; originally announced September 2021.

Comments: 30 pages, 7 figures

arXiv:2104.11191 [pdf, other]

Variational Bayesian Supertrees

Authors: Michael Karcher, Cheng Zhang, Frederick A Matsen IV

Abstract: Given overlapping subsets of a set of taxa (e.g. species), and posterior distributions on phylogenetic tree topologies for each of these taxon sets, how can we infer a posterior distribution on phylogenetic tree topologies for the entire taxon set? Although the equivalent problem in the non-Bayesian case has attracted substantial research, the Bayesian case has not attracted the attention it deserves. In this paper we develop a variational Bayes approach to this problem and demonstrate its effectiveness.

Submitted 22 April, 2021; originally announced April 2021.

arXiv:1903.11797 [pdf, other]

doi 10.1371/journal.pcbi.1007774

Estimating effective population size changes from preferentially sampled genetic sequences

Authors: Michael D. Karcher, Marc A. Suchard, Gytis Dudas, Vladimir N. Minin

Abstract: Coalescent theory combined with statistical modeling allows us to estimate effective population size fluctuations from molecular sequences of individuals sampled from a population of interest. When sequences are sampled serially through time and the distribution of the sampling times depends on the effective population size, explicit statistical modeling of sampling times improves population size estimation. Previous work assumed that the genealogy relating sampled sequences is known and modeled sampling times as an inhomogeneous Poisson process with log-intensity equal to a linear function of the log-transformed effective population size. We improve this approach in two ways. First, we extend the method to allow for joint Bayesian estimation of the genealogy, effective population size trajectory, and other model parameters. Next, we improve the sampling time model by incorporating additional sources of information in the form of time-varying covariates. We validate our new modeling framework using a simulation study and apply our new methodology to analyses of population dynamics of seasonal influenza and to the recent Ebola virus outbreak in West Africa.

Submitted 28 March, 2019; originally announced March 2019.

Comments: 47 pages

arXiv:1802.02328 [pdf, other]

Reduced basis approximation and a posteriori error bounds for 4D-Var data assimilation

Authors: Mark Kärcher, Sébastien Boyaval, Martin A. Grepl, Karen Veroy

Abstract: We propose a certified reduced basis approach for the strong- and weak-constraint four-dimensional variational (4D-Var) data assimilation problem for a parametrized PDE model. While the standard strong-constraint 4D-Var approach uses the given observational data to estimate only the unknown initial condition of the model, the weak-constraint 4D-Var formulation additionally provides an estimate for the model error and thus can deal with imperfect models. Since the model error is a distributed function in both space and time, the 4D-Var formulation leads to a large-scale optimization problem for every given parameter instance of the PDE model. To solve the problem efficiently, various reduced order approaches have therefore been proposed in the recent past. Here, we employ the reduced basis method to generate reduced order approximations for the state, adjoint, initial condition, and model error. Our main contribution is the development of efficiently computable a posteriori upper bounds for the error of the reduced basis approximation with respect to the underlying high-dimensional 4D-Var problem. Numerical results are conducted to test the validity of our approach.

Submitted 7 February, 2018; originally announced February 2018.

Report number: hal-01556304; MSC Class: 65K10

arXiv:1610.05817 [pdf, other]

phylodyn: an R package for phylodynamic simulation and inference

Authors: Michael D. Karcher, Julia A. Palacios, Shiwei Lan, Vladimir N. Minin

Abstract: We introduce phylodyn, an R package for phylodynamic analysis based on gene genealogies. The package's main functionality is Bayesian nonparametric estimation of effective population size fluctuations over time. Our implementation includes several Markov chain Monte Carlo-based methods and an integrated nested Laplace approximation-based approach for phylodynamic inference that have been developed in recent years. Genealogical data describe the timed ancestral relationships of individuals sampled from a population of interest. Here, individuals are assumed to be sampled at the same point in time (isochronous sampling) or at different points in time (heterochronous sampling); in addition, sampling events can be modeled with preferential sampling, which means that the intensity of sampling events is allowed to depend on the effective population size trajectory. We assume the coalescent and the sequentially Markov coalescent processes as generative models of genealogies. We include several coalescent simulation functions that are useful for testing our phylodynamics methods via simulation studies. We compare the performance and outputs of various methods implemented in phylodyn and outline their strengths and weaknesses. R package phylodyn is available at https://github.com/mdkarcher/phylodyn.

Submitted 18 October, 2016; originally announced October 2016.

Comments: 9 pages, 3 figures

arXiv:1607.02151 [pdf, other]

doi 10.1093/mnras/stx1160

The Romulus Cosmological Simulations: A Physical Approach to the Formation, Dynamics and Accretion Models of SMBHs

Authors: Michael Tremmel, Michael Karcher, Fabio Governato, Marta Volonteri, Tom Quinn, Andrew Pontzen, Lauren Anderson, Jillian Bellovary

Abstract: We present a novel implementation of supermassive black hole (SMBH) formation, dynamics, and accretion in the massively parallel tree+SPH code, ChaNGa. This approach improves the modeling of SMBHs in fully cosmological simulations, allowing for a more detailed analysis of SMBH-galaxy co-evolution throughout cosmic time. Our scheme includes novel, physically motivated models for SMBH formation, dynamics and sinking timescales within galaxies, and SMBH accretion of rotationally supported gas. The sub-grid parameters that regulate star formation (SF) and feedback from SMBHs and SNe are optimized against a comprehensive set of z = 0 galaxy scaling relations using a novel, multi-dimensional parameter search. We have incorporated our new SMBH implementation and parameter optimization into a new set of high resolution, large-scale cosmological simulations called Romulus. We present initial results from our flagship simulation, Romulus25, showing that our SMBH model results in SF efficiency, SMBH masses, and global SF and SMBH accretion histories at high redshift that are consistent with observations. We discuss the importance of SMBH physics in shaping the evolution of massive galaxies and show how SMBH feedback is much more effective at regulating star formation compared to SNe feedback in this regime. Further, we show how each aspect of our SMBH model impacts this evolution compared to more common approaches. Finally, we present a science application of this scheme studying the properties and time evolution of an example dual AGN system, highlighting how our approach allows simulations to better study galaxy interactions and SMBH mergers in the context of galaxy-BH co-evolution.

Submitted 27 June, 2017; v1 submitted 7 July, 2016; originally announced July 2016.

Comments: 21 pages, 17 figures, Accepted to MNRAS, in press. Updated references

arXiv:1606.05352 [pdf, other]

doi 10.1093/mnras/stx709

The Little Galaxies that Could (Reionize the Universe): Predicting Faint End Slopes & Escape Fractions at z > 4

Authors: Lauren Anderson, Fabio Governato, Michael Karcher, Tom Quinn, James Wadsley

Abstract: The sources that reionized the universe are still unknown, but likely candidates are faint but numerous galaxies. In this paper we present results from running a high resolution, uniform volume simulation, the Vulcan, to predict the number densities of undetectable, faint galaxies and their escape fractions of ionizing radiation, $f_\mathrm{esc}$, during reionization. Our approach combines a high spatial resolution, a realistic treatment of feedback and hydro processes, a strict threshold for minimum number of resolution elements per galaxy, and a converged measurement of $f_\mathrm{esc}$. We calibrate our physical model using a novel approach to create realistic galaxies at z=0, so the simulation is predictive at high redshifts. With this approach we can (1) robustly predict the evolution of the galaxy UV luminosity function at faint magnitudes down to $M_\mathrm{UV} \sim -15$, two magnitudes fainter than observations, and (2) estimate $f_\mathrm{esc}$ over a large range of galaxy masses based on the detailed stellar and gas distributions in resolved galaxies. We find steep faint end slopes, implying high number densities of faint galaxies, and the dependence of $f_\mathrm{esc}$ on the UV magnitude of a galaxy, given by the power-law: $\log f_\mathrm{esc} = (0.51 \pm 0.04)\,M_\mathrm{UV} + 7.3 \pm 0.8$, with the faint population having $f_\mathrm{esc} \sim 35\%$. Convolving the UV luminosity function with $f_\mathrm{esc}(M_\mathrm{UV})$, we find an ionizing emissivity that is (1) dominated by the faintest galaxies and (2) reionizes the universe at the appropriate rate, consistent with observational constraints of the ionizing emissivity and the optical depth to the decoupling surface $\tau_\mathrm{es}$, without the need for additional sources of ionizing radiation.

Submitted 21 March, 2017; v1 submitted 16 June, 2016; originally announced June 2016.

Comments: 16 pages, 12 Figures, Accepted for publication to MNRAS

arXiv:1510.00775 [pdf, other]

doi 10.1371/journal.pcbi.1004789

Quantifying and mitigating the effect of preferential sampling on phylodynamic inference

Authors: Michael D. Karcher, Julia A. Palacios, Trevor Bedford, Marc A. Suchard, Vladimir N. Minin

Abstract: Phylodynamics seeks to estimate effective population size fluctuations from molecular sequences of individuals sampled from a population of interest. One way to accomplish this task formulates an observed sequence data likelihood exploiting a coalescent model for the sampled individuals' genealogy and then integrating over all possible genealogies via Monte Carlo or, less efficiently, by conditioning on one genealogy estimated from the sequence data. However, when analyzing sequences sampled serially through time, current methods implicitly assume either that sampling times are fixed deterministically by the data collection protocol or that their distribution does not depend on the size of the population. Through simulation, we first show that, when sampling times do probabilistically depend on effective population size, estimation methods may be systematically biased. To correct for this deficiency, we propose a new model that explicitly accounts for preferential sampling by modeling the sampling times as an inhomogeneous Poisson process dependent on effective population size. We demonstrate that in the presence of preferential sampling our new model not only reduces bias, but also improves estimation precision. Finally, we compare the performance of the currently used phylodynamic methods with our proposed model through clinically-relevant, seasonal human influenza examples.

Submitted 3 October, 2015; originally announced October 2015.

Comments: 30 pages, 7 figures plus 7 appendix figures

arXiv:1412.0158 [pdf, other]

doi 10.1093/bioinformatics/btv378

An Efficient Bayesian Inference Framework for Coalescent-Based Nonparametric Phylodynamics

Authors: Shiwei Lan, Julia A. Palacios, Michael Karcher, Vladimir N. Minin, Babak Shahbaba

Abstract: Phylodynamics focuses on the problem of reconstructing past population size dynamics from current genetic samples taken from the population of interest. This technique has been extensively used in many areas of biology, but is particularly useful for studying the spread of quickly evolving infectious disease agents, e.g., influenza virus. Phylodynamics inference uses a coalescent model that defines a probability density for the genealogy of randomly sampled individuals from the population. When we assume that such a genealogy is known, the coalescent model, equipped with a Gaussian process prior on population size trajectory, allows for nonparametric Bayesian estimation of population size dynamics. While this approach is quite powerful, large data sets collected during infectious disease surveillance challenge the state-of-the-art of Bayesian phylodynamics and demand a computationally more efficient inference framework. To satisfy this demand, we provide a computationally efficient Bayesian inference framework based on Hamiltonian Monte Carlo for coalescent process models. Moreover, we show that by splitting the Hamiltonian function we can further improve the efficiency of this approach. Using several simulated and real datasets, we show that our method provides accurate estimates of population size dynamics and is substantially faster than alternative methods based on elliptical slice sampler and Metropolis-adjusted Langevin algorithm.

Submitted 29 November, 2014; originally announced December 2014.

Showing 1–12 of 12 results for author: Karcher, M