Scalable Bayesian divergence time estimation with ratio transformations
Authors:
Xiang Ji,
Alexander A. Fisher,
Shuo Su,
Jeffrey L. Thorne,
Barney Potter,
Philippe Lemey,
Guy Baele,
Marc A. Suchard
Abstract:
Divergence time estimation is crucial to provide temporal signals for dating biologically important events, from species divergence to viral transmissions in space and time. With the advent of high-throughput sequencing, recent Bayesian phylogenetic studies have analyzed hundreds to thousands of sequences. Such large-scale analyses challenge divergence time reconstruction by requiring inference on highly correlated internal node heights, which often becomes computationally infeasible. To overcome this limitation, we explore a ratio transformation that maps the original N - 1 internal node heights into a space of one height parameter and N - 2 ratio parameters. To make analyses scalable, we develop a collection of linear-time algorithms to compute the gradient and Jacobian-associated terms of the log-likelihood with respect to these ratios. We then apply Hamiltonian Monte Carlo sampling with the ratio transform in a Bayesian framework to learn the divergence times in four pathogenic virus phylogenies: West Nile virus, rabies virus, Lassa virus and Ebola virus. Our method both resolves a mixing issue in the West Nile virus example and improves inference efficiency by at least 5-fold for the Lassa and rabies virus examples. For the Ebola virus example, our method makes it computationally feasible to incorporate mixed-effects molecular clock models, confirms the findings from the original study, and reveals clearer multimodal distributions of the divergence times of some clades of interest.
Submitted 25 October, 2021;
originally announced October 2021.
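The ratio transformation named in the abstract above can be illustrated with a minimal sketch. It assumes the simplest setting, in which every tip sits at height 0, so each non-root internal node's ratio is its height divided by its parent's height; the paper's own definition may differ, for example when tips are sampled at different times. The function names `heights_to_ratios` and `ratios_to_heights` and the dictionary-based tree encoding are hypothetical, chosen only for illustration.

```python
def heights_to_ratios(parent, heights, root):
    """Map internal node heights to (root height, ratios).

    parent:  dict mapping each non-root internal node to its parent node
    heights: dict mapping every internal node to its height
    Assumption: all tips are at height 0, so ratio = height / parent height.
    """
    ratios = {
        node: heights[node] / heights[parent[node]]
        for node in heights
        if node != root
    }
    return heights[root], ratios


def ratios_to_heights(parent, root_height, ratios, root):
    """Invert the transform: rebuild every height from the root downwards."""
    heights = {root: root_height}

    def resolve(node):
        # A node's height is its ratio times its (already resolved) parent height.
        if node not in heights:
            heights[node] = ratios[node] * resolve(parent[node])
        return heights[node]

    for node in ratios:
        resolve(node)
    return heights


# A tree with N = 4 tips has N - 1 = 3 internal nodes:
# one root height plus N - 2 = 2 ratios.
parent = {"a": "root", "b": "a"}
heights = {"root": 10.0, "a": 6.0, "b": 3.0}
root_height, ratios = heights_to_ratios(parent, heights, "root")
recovered = ratios_to_heights(parent, root_height, ratios, "root")
```

Under this assumption each ratio lies in (0, 1) regardless of the other parameters, which is one plausible reason such a parameterization is convenient for gradient-based samplers; the sketch does not cover the gradient or Jacobian computations described in the abstract.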
Generalized Integrated Functional Test for Regional Methylation Rates
Authors:
Duchwan Ryu,
Hongyan Xu,
Varghese George,
Shaoyong Su,
Xiaoling Wang,
Huidong Shi,
Robert H. Podolsky
Abstract:
Motivation: Methods are needed to test pre-defined genomic regions such as promoters for differential methylation in genome-wide association studies, where the number of samples is limited and the data have large amounts of measurement error. Results: We developed a new statistical test, the generalized integrated functional test (GIFT), which tests for regional differences in methylation based on differences in the functional relationship between methylation percent and location of the CpG sites within a region. In this method, subject-specific functional profiles are first estimated, and the average profile within groups is compared between groups using an ANOVA-like test. Simulations and analyses of data obtained from patients with chronic lymphocytic leukemia indicate that GIFT has good statistical properties and is able to identify promising genomic regions. Further, GIFT is likely to work with multiple different types of experiments since different smoothing functions can be used to estimate the functional relationship between methylation percent and CpG site location. Availability and Implementation: Matlab code for GIFT and sample data are available at http://biostat.gru.edu/~dryu/research.html. Contact: [email protected] or [email protected]
Submitted 24 July, 2014;
originally announced July 2014.
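The two-step idea in the abstract above (estimate a functional profile per subject, then compare group-average profiles) can be sketched as follows. This is not the authors' GIFT implementation, which is distributed as Matlab code: the straight-line smoother, the integrated squared difference statistic, and the permutation p-value are stand-ins assumed here for illustration, and all function names are hypothetical. The sketch also assumes exactly two groups labelled 0 and 1.

```python
import random


def fit_line(x, y):
    """Least-squares intercept and slope; a deliberately crude smoother."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def subject_profile(positions, methylation, grid):
    """Evaluate one subject's fitted profile on a common grid of locations."""
    intercept, slope = fit_line(positions, methylation)
    return [intercept + slope * g for g in grid]


def mean_profile(profiles):
    """Pointwise average of several profiles defined on the same grid."""
    return [sum(col) / len(col) for col in zip(*profiles)]


def integrated_sq_diff(profiles, labels, grid):
    """Trapezoid-rule integral of the squared gap between the group means."""
    g0 = mean_profile([p for p, lab in zip(profiles, labels) if lab == 0])
    g1 = mean_profile([p for p, lab in zip(profiles, labels) if lab == 1])
    sq = [(a - b) ** 2 for a, b in zip(g0, g1)]
    return sum(
        (sq[i] + sq[i + 1]) / 2.0 * (grid[i + 1] - grid[i])
        for i in range(len(grid) - 1)
    )


def permutation_pvalue(profiles, labels, grid, n_perm=999, seed=0):
    """Compare the observed statistic with label-shuffled replicates."""
    rng = random.Random(seed)
    observed = integrated_sq_diff(profiles, labels, grid)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps group sizes fixed
        if integrated_sq_diff(profiles, shuffled, grid) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

Any smoother that returns a profile on a shared grid could replace `subject_profile`, which mirrors the abstract's remark that different smoothing functions can be used.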