-
Leveraging advances in machine learning for the robust classification and interpretation of networks
Authors:
Raima Carol Appaw,
Nicholas Fountain-Jones,
Michael A. Charleston
Abstract:
The ability to simulate realistic networks based on empirical data is an important task across scientific disciplines, from epidemiology to computer science. Often, simulation approaches involve selecting a suitable network generative model such as Erdős-Rényi or small-world. However, few tools are available to quantify whether a particular generative model is suitable for capturing a given network structure or organization. We utilize advances in interpretable machine learning to classify simulated networks by our generative models based on various network attributes, using both primary features and their interactions. Our study underscores the significance of specific network features and their interactions in distinguishing generative models, comprehending complex network structures, and understanding the formation of real-world networks.
Submitted 12 June, 2024; v1 submitted 19 March, 2024;
originally announced March 2024.
-
Microgravity Mass Gauging with Capacitance Sensing: Sensor Design and Experiment
Authors:
M. A. Charleston,
S. M. Chowdhury,
Q. M. Marashdeh,
B. J. Straiton,
F. L. Teixeira
Abstract:
The use of capacitance sensors for fuel mass gauging has been in consideration since the early days of manned space flight. However, certain difficulties arise when considering tanks in microgravity environments. Surface tension effects lead to fluid wetting of the interior surface of the tank, leaving large interior voids, while thrust/settling effects can lead to dispersed two-phase mixtures. With the exception of Electrical Capacitance Volume Tomography (ECVT), few sensing technologies are well suited for measuring annular, stratified, and dispersed fluid configurations as well as handling the additional complications of mechanical installation inside a spherical tank. To optimize the design of future ECVT-based spherical tank mass gauging sensors, different electrode plate layouts are considered, and their effect on the performance of the sensor as a fuel mass gauge is analyzed using imaging and averaging techniques.
Submitted 9 January, 2024;
originally announced January 2024.
-
Exploring the consequences of lack of closure in codon models
Authors:
Michael D. Woodhams,
Jeremy G. Sumner,
David A. Liberles,
Michael A. Charleston,
Barbara R. Holland
Abstract:
Models of codon evolution are commonly used to identify positive selection. Positive selection is typically a heterogeneous process, i.e., it acts on some branches of the evolutionary tree and not others. Previous work on DNA models showed that when evolution occurs under a heterogeneous process it is important to consider the property of model closure, because non-closed models can give biased estimates of evolutionary processes. The existing codon models that account for the genetic code are not closed; to establish this it is enough to show that they are not linear (meaning that the sum of two codon rate matrices in the model is not a matrix in the model). This raises the concern that a single codon model fit to a heterogeneous process might mis-estimate both the effect of selection and branch lengths.
Codon models are typically constructed by choosing an underlying DNA model (e.g., HKY) that acts identically and independently at each codon position, and then applying the genetic code via the parameter $ω$ to modify the rate of transitions between codons that code for different amino acids. Here we use simulation to investigate the accuracy of estimation of both the selection parameter $ω$ and branch lengths in cases where the underlying DNA process is heterogeneous but $ω$ is constant. We find that both $ω$ and branch lengths can be mis-estimated in these scenarios. Errors in $ω$ were usually less than 2% but could be as high as 17%. We also assessed whether choosing different underlying DNA models had any effect on accuracy; in particular, we assessed whether using closed DNA models gave any advantage. However, a DNA model being closed does not imply that the codon model constructed from it is closed, and in general we found that using closed DNA models did not decrease errors in the estimation of $ω$.
Submitted 15 September, 2017;
originally announced September 2017.
-
WiSPA: A new approach for dealing with widespread parasitism
Authors:
Benjamin Drinkwater,
Angela Qiao,
Michael A. Charleston
Abstract:
Traditionally, studies of coevolving systems have considered cases where a parasite may inhabit only a single host. The case where a parasite may infect many hosts, widespread parasitism, has until recently gained little traction. This is due in part to the computational complexity involved in reconstructing the coevolutionary histories where parasites may infect only a single host, which is NP-hard. Allowing parasites to inhabit more than one host has been seen to only further compound this computationally intractable problem. Recently, however, well-established algorithms for estimating the problem instance where a parasite may infect only a single host have been extended to handle widespread parasites. Although this has offered significant progress, it has been noted that these algorithms poorly handle parasites that inhabit phylogenetically distant hosts.
In this work we extend these previous algorithms to handle cases where parasites inhabit phylogenetically distant hosts using an additional evolutionary event which we call spread. Our new framework is shown to infer significantly more congruent coevolutionary histories compared to existing methods over both synthetic and biological data sets. We then apply the newly proposed algorithm, which we call WiSPA (WideSpread Parasitism Analyser), to the well-studied coevolutionary system of Primates and Enterobius (pinworms), where existing methods have been unable to reconcile the widespread parasitism present without permitting additional divergence events. Using WiSPA and the new biological event, spread, we provide the first statistically significant coevolutionary hypothesis for this system.
Submitted 30 March, 2016;
originally announced March 2016.
-
Phylogenetic estimation with partial likelihood tensors
Authors:
J. G. Sumner,
M. A. Charleston
Abstract:
We present an alternative method for calculating likelihoods in molecular phylogenetics. Our method is based on partial likelihood tensors, which are generalizations of partial likelihood vectors, as used in Felsenstein's approach. Exploiting a lexicographic sorting and partial likelihood tensors, it is possible to obtain significant computational savings. We show this on a range of simulated data by enumerating all numerical calculations that are required by our method and the standard approach.
Submitted 22 July, 2008;
originally announced July 2008.
-
Markov invariants, plethysms, and phylogenetics (the long version)
Authors:
J. G. Sumner,
M. A. Charleston,
L. S. Jermiin,
P. D. Jarvis
Abstract:
We explore model-based techniques of phylogenetic tree inference exercising Markov invariants. Markov invariants are group invariant polynomials and are distinct from what is known in the literature as phylogenetic invariants, although we establish a commonality in some special cases. We show that the simplest Markov invariant forms the foundation of the Log-Det distance measure. We take as our primary tool group representation theory, and show that it provides a general framework for analysing Markov processes on trees. From this algebraic perspective, the inherent symmetries of these processes become apparent, and focusing on plethysms, we are able to define Markov invariants and give existence proofs. We give an explicit technique for constructing the invariants, valid for any number of character states and taxa. For phylogenetic trees with three and four leaves, we demonstrate that the corresponding Markov invariants can be fruitfully exploited in applied phylogenetic studies.
Submitted 22 July, 2008; v1 submitted 22 November, 2007;
originally announced November 2007.