-
Counting cherry reduction sequences is counting linear extensions (in phylogenetic tree-child networks)
Authors:
Tomás M. Coronado,
Joan Carles Pons,
Gabriel Riera
Abstract:
Orchard and tree-child networks share an important property with phylogenetic trees: they can be completely reduced to a single node by iteratively deleting cherries and reticulated cherries. As is the case with phylogenetic trees, the number of ways in which this can be done gives information about the topology of the network. Here, we show that the problem of computing this number in tree-child networks is akin to that of finding the number of linear extensions of the poset induced by each network, and give an algorithm based on this reduction whose complexity is bounded in terms of the level of the network.
Submitted 21 March, 2024;
originally announced March 2024.
-
Comparison of orchard networks using their extended $μ$-representation
Authors:
Gabriel Cardona,
Joan Carles Pons,
Gerard Ribas,
Tomás Martínez Coronado
Abstract:
Phylogenetic networks generalize phylogenetic trees in order to model reticulation events. Although the comparison of phylogenetic trees is well studied, and there are multiple ways to do it in an efficient way, the situation is much different for phylogenetic networks.
Some classes of phylogenetic networks, mainly tree-child networks, are known to be classified efficiently by their $μ$-representation, which essentially counts, for every node, the number of paths to each leaf. In this paper, we introduce the extended $μ$-representation of networks, where the number of paths to reticulations is also taken into account. This modification allows us to distinguish orchard networks and to define a sound metric on the space of such networks that can, moreover, be computed efficiently.
The class of orchard networks, as well as being one of the classes with biological significance (one such network can be interpreted as a tree with extra arcs involving coexisting organisms), is one of the most generic ones (in mathematical terms) for which such a representation can (conjecturally) exist, since a slight relaxation of the definition leads to a problem that is Graph Isomorphism Complete.
Submitted 20 February, 2023;
originally announced February 2023.
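The plain $μ$-representation mentioned in the abstract above (for every node, the number of paths to each leaf) can be illustrated with a minimal sketch. The dictionary encoding of the network, the function name `mu_representation` and the example network are assumptions made here for illustration; they are not taken from the paper, and the extended representation (which also counts paths to reticulations) is not covered.

```python
from functools import lru_cache

def mu_representation(children, leaves):
    """Map every node to its vector of path counts, one entry per leaf.

    `children` maps each internal node to the list of its children
    (hypothetical encoding of a rooted DAG); `leaves` lists the leaf labels.
    """
    order = sorted(leaves)
    position = {leaf: i for i, leaf in enumerate(order)}

    @lru_cache(maxsize=None)
    def mu(node):
        if node in position:
            # A leaf reaches only itself, by exactly one (trivial) path.
            vec = [0] * len(order)
            vec[position[node]] = 1
            return tuple(vec)
        # Paths from an internal node are the paths from its children, added up.
        vectors = [mu(child) for child in children[node]]
        return tuple(sum(column) for column in zip(*vectors))

    nodes = set(children) | set(order)
    return {node: mu(node) for node in nodes}

# Hypothetical network: h is a reticulation with parents a and b.
example = {"r": ["a", "b"], "a": ["1", "h"], "b": ["h", "3"], "h": ["2"]}
mu = mu_representation(example, ["1", "2", "3"])
```

In this example the root has two paths to leaf 2 (one through each parent of the reticulation), so its vector is `(1, 2, 1)`.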
-
A polynomial invariant for a new class of phylogenetic networks
Authors:
Joan Carles Pons,
Tomás M. Coronado,
Michael Hendriksen,
Andrew Francis
Abstract:
Invariants for complicated objects such as those arising in phylogenetics, whether they are invariants as matrices, polynomials, or other mathematical structures, are important tools for distinguishing and working with such objects. In this paper, we generalize a complete polynomial invariant on trees to a class of phylogenetic networks called separable networks, which will include orchard networks. Networks are becoming increasingly important for their ability to represent reticulation events, such as hybridization, in evolutionary history. We provide a function from the space of internally multi-labelled phylogenetic networks, a more generic graph structure than phylogenetic networks where the reticulations are also labelled, to a polynomial ring. We prove that the separability condition allows us to characterize, via the polynomial, the phylogenetic networks with the same number of leaves and same number of reticulations by considering their internally labelled versions. While the invariant for trees is a polynomial in $\mathbb{Z}[x_1,\ldots,x_n,y]$, where $n$ is the number of leaves, the invariant for internally multi-labelled phylogenetic networks is an element of $\mathbb{Z}[x_1,\ldots,x_n,\lambda_1,\ldots,\lambda_r,y]$, where $r$ is the number of reticulations in the network. When the networks are considered without leaf labels, the number of variables reduces to $r+2$.
Submitted 5 April, 2022; v1 submitted 30 December, 2021;
originally announced December 2021.
-
Explicit solution of divide-and-conquer dividing by a half recurrences with polynomial independent term
Authors:
Tomás M. Coronado,
Arnau Mir,
Francesc Rosselló
Abstract:
Divide-and-conquer dividing by a half recurrences, of the form $x_n =a\cdot x_{\left\lceil{n}/{2}\right\rceil}+a\cdot x_{\left\lfloor{n}/{2}\right\rfloor}+p(n)$, $n\geq 2$, appear in many areas of applied mathematics, from the analysis of algorithms to the optimization of phylogenetic balance indices. The Master Theorems that solve these equations do not provide the solution's explicit expression, only its big-$Θ$ order of growth. In this paper we give an explicit expression (in terms of the binary decomposition of $n$) for the solution $x_n$ of a recurrence of this form, with given initial condition $x_1$, when the independent term $p(n)$ is a polynomial in $\lceil{n}/{2}\rceil$ and $\lfloor{n}/{2}\rfloor$.
Submitted 24 November, 2021;
originally announced November 2021.
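As a point of comparison for the explicit expression announced in the abstract above, the recurrence itself can be evaluated numerically by memoized recursion. This is only a sketch: the function name `solve_recurrence` and the choice to pass the independent term as a callable `p` are assumptions made here, and the code does not implement the paper's closed formula.

```python
from functools import lru_cache

def solve_recurrence(n, a, x1, p):
    """Evaluate x_n = a*x_ceil(n/2) + a*x_floor(n/2) + p(n) for n >= 2, with x_1 = x1.

    `p` is any callable giving the independent term (assumed interface).
    """
    @lru_cache(maxsize=None)
    def x(m):
        if m == 1:
            return x1
        upper = (m + 1) // 2  # ceil(m / 2)
        lower = m // 2        # floor(m / 2)
        return a * x(upper) + a * x(lower) + p(m)

    return x(n)

# Example: a = 1, x_1 = 0, p(n) = n.
values = [solve_recurrence(n, 1, 0, lambda m: m) for n in range(1, 5)]
```

Because every call halves its argument, only O(log n) distinct values of `x` are computed thanks to the cache.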
-
Squaring within the Colless index yields a better balance index
Authors:
Tomás M. Coronado,
Arnau Mir,
Francesc Rosselló
Abstract:
The Colless index for bifurcating phylogenetic trees, introduced by Colless (1982), is defined as the sum, over all internal nodes $v$ of the tree, of the absolute value of the difference of the sizes of the clades defined by the children of $v$. It is one of the most popular phylogenetic balance indices, because, in addition to measuring the balance of a tree in a very simple and intuitive way, it turns out to be one of the most powerful and discriminating phylogenetic shape indices. But it has some drawbacks. On the one hand, although its minimum value is reached at the so-called maximally balanced trees, it is almost always reached also at trees that are not maximally balanced. On the other hand, its definition as a sum of absolute values of differences makes it difficult to study its distribution analytically under probabilistic models of bifurcating phylogenetic trees. In this paper we show that if we replace in its definition the absolute values of the differences of clade sizes by the squares of these differences, all these drawbacks are overcome and the resulting index is still more powerful and discriminating than the original Colless index.
Submitted 29 July, 2020;
originally announced July 2020.
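The two definitions in the abstract above (sum of absolute differences of clade sizes, and sum of their squares) can be sketched as follows. The nested-tuple encoding of bifurcating trees and the names `colless` and `power` are assumptions made here for illustration, not notation from the paper.

```python
def colless(tree, power=1):
    """Sum over internal nodes of |difference of child clade sizes| ** power.

    A tree is a leaf (any non-tuple value) or a pair (left, right) of trees.
    power=1 gives the Colless index, power=2 the squared variant.
    """
    def walk(node):
        # Returns (number of leaves below node, index accumulated below node).
        if not isinstance(node, tuple):
            return 1, 0
        left, right = node
        n_left, i_left = walk(left)
        n_right, i_right = walk(right)
        return n_left + n_right, i_left + i_right + abs(n_left - n_right) ** power

    return walk(tree)[1]

# Two trees with 6 leaves: the first is maximally balanced, the second is not.
balanced = (((1, 2), 3), ((4, 5), 6))
other = (((1, 2), (3, 4)), (5, 6))
```

For these two trees the Colless index is 2 in both cases, while the squared variant gives 2 for `balanced` and 4 for `other`, a small instance of the first drawback described in the abstract.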
-
On the minimum value of the Colless index and the bifurcating trees that achieve it
Authors:
Tomás M. Coronado,
Mareike Fischer,
Lina Herbst,
Francesc Rosselló,
Kristina Wicke
Abstract:
Measures of tree balance play an important role in the analysis of phylogenetic trees. One of the oldest and most popular indices in this regard is the Colless index for rooted bifurcating trees, introduced by Colless (1982). While many of its statistical properties under different probabilistic models for phylogenetic trees have already been established, little is known about its minimum value and the trees that achieve it. In this manuscript, we fill this gap in the literature. To begin with, we derive both recursive and closed expressions for the minimum Colless index of a tree with $n$ leaves. Surprisingly, these expressions show a connection between the minimum Colless index and the so-called Blancmange curve, a fractal curve. We then fully characterize the tree shapes that achieve this minimum value and we introduce both an algorithm to generate them and a recurrence to count them. After focusing on two extremal classes of trees with minimum Colless index (the maximally balanced trees and the greedy from the bottom trees), we conclude by showing that all trees with minimum Colless index also have minimum Sackin index, another popular balance index.
Submitted 17 February, 2020; v1 submitted 11 July, 2019;
originally announced July 2019.
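Since the Colless index is additive over the two subtrees of the root, its minimum over all trees with $n$ leaves can be checked by brute force over root splits and compared with the value on the maximally balanced tree. The sketch below does exactly that; the function names are assumptions made here, and it is a numerical sanity check, not the recursive or closed expressions derived in the paper.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def min_colless(n):
    """Minimum Colless index over bifurcating trees with n leaves (brute force)."""
    if n == 1:
        return 0
    # Try every root split (k, n - k) with k <= n - k, so |n - 2k| = n - 2k.
    return min(
        min_colless(k) + min_colless(n - k) + (n - 2 * k)
        for k in range(1, n // 2 + 1)
    )

@lru_cache(maxsize=None)
def balanced_colless(n):
    """Colless index of the maximally balanced tree with n leaves."""
    if n == 1:
        return 0
    upper, lower = (n + 1) // 2, n // 2
    return balanced_colless(upper) + balanced_colless(lower) + (upper - lower)

first_values = [min_colless(n) for n in range(1, 9)]
```

For $n = 1, \ldots, 8$ both functions give 0, 0, 1, 0, 2, 2, 2, 0, with the value 0 at powers of 2 as expected.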
-
The minimum value of the Colless index
Authors:
Tomás M. Coronado,
Francesc Rosselló
Abstract:
The Colless index is one of the oldest and most widely used balance indices for rooted bifurcating trees. Despite its popularity, its minimum value on the space $\mathcal{T}_n$ of rooted bifurcating trees with $n$ leaves is only known when $n$ is a power of 2. In this paper we fill this gap in the literature, by providing a formula that computes, for each $n$, the minimum Colless index on $\mathcal{T}_n$, and characterizing those trees where this minimum value is reached.
Submitted 23 July, 2019; v1 submitted 27 March, 2019;
originally announced March 2019.
-
The Fair Proportion is a Shapley Value on phylogenetic networks too
Authors:
Tomás M. Coronado,
Gabriel Riera,
Francesc Rosselló
Abstract:
The Fair Proportion of a species in a phylogenetic tree is a very simple measure that has been used to assess its value relative to the overall phylogenetic diversity represented by the tree. It has recently been proved by Fuchs and Jin to be equal to the Shapley Value of the coalitional game that sends each subset of species to its rooted Phylogenetic Diversity in the tree. We prove in this paper that this result extends to the natural translations of the Fair Proportion and the rooted Phylogenetic Diversity to rooted phylogenetic networks. We also generalize to rooted phylogenetic networks the expression for the Shapley Value of the unrooted Phylogenetic Diversity game on a phylogenetic tree established by Haake, Kashiwada and Su.
Submitted 5 April, 2018;
originally announced April 2018.
-
A balance index for phylogenetic trees based on rooted quartets
Authors:
Tomás M. Coronado,
Arnau Mir,
Francesc Rosselló,
Gabriel Valiente
Abstract:
We define a new balance index for rooted phylogenetic trees based on the symmetry of the evolutionary history of every set of 4 leaves. This index makes sense for multifurcating trees and it can be computed in time linear in the number of leaves. We determine its maximum and minimum values for arbitrary and bifurcating trees, and we provide exact formulas for its expected value and variance on bifurcating trees under Ford's $α$-model and Aldous' $β$-model and on arbitrary trees under the $α$-$γ$-model.
Submitted 22 March, 2019; v1 submitted 5 March, 2018;
originally announced March 2018.
-
The probabilities of trees and cladograms under Ford's $α$-model
Authors:
Tomás M. Coronado,
Arnau Mir,
Francesc Rosselló
Abstract:
We give correct explicit formulas for the probabilities of rooted binary trees and cladograms under Ford's $α$-model.
Submitted 11 January, 2018;
originally announced January 2018.