Heuristic algorithms for the Maximum Colorful Subtree problem
Authors:
Kai Dührkop,
Marie Anne Lataretu,
W. Timothy J. White,
Sebastian Böcker
Abstract:
In metabolomics, small molecules are structurally elucidated using tandem mass spectrometry (MS/MS); this gave rise to the computational Maximum Colorful Subtree problem, which is NP-hard. Unfortunately, data from a single metabolite requires us to solve hundreds or thousands of instances of this problem, and in a single Liquid Chromatography MS/MS run, hundreds or thousands of metabolites are measured.
Here, we comprehensively evaluate the performance of several heuristic algorithms for the problem against an exact algorithm. We put particular emphasis on whether a heuristic is able to rank candidates such that the correct solution is ranked highly. We propose this "intermediate" evaluation because evaluating the approximation quality of heuristics is misleading: even a slightly suboptimal solution can be structurally very different from the true solution. On the other hand, we cannot structurally evaluate against the ground truth, as this is unknown. We find that one particular heuristic consistently ranks the correct solution in a top position, allowing us to speed up computations about 100-fold. We also find that the scores of the best heuristic solutions are very close to the optimal score; in contrast, the structure of the solutions can deviate significantly from the optimal structures.
Submitted 13 February, 2018; v1 submitted 23 January, 2018;
originally announced January 2018.
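The rank-based evaluation described in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: the function name `rank_of_candidate`, the candidate labels, and all scores are hypothetical, and it assumes each candidate receives one score from the exact algorithm and one from a heuristic.

```python
def rank_of_candidate(scores, reference):
    """Return the 1-based rank of `reference` when candidates are sorted
    by descending score. Ties are resolved optimistically: only strictly
    better-scoring candidates push the reference down."""
    target = scores[reference]
    return 1 + sum(1 for s in scores.values() if s > target)


# Hypothetical scores for three candidates (illustrative values only).
exact_scores = {"A": 10.0, "B": 9.5, "C": 7.0}
heuristic_scores = {"A": 9.0, "B": 9.2, "C": 6.0}

# Assume "A" is the candidate we consider correct.
exact_rank = rank_of_candidate(exact_scores, "A")
heuristic_rank = rank_of_candidate(heuristic_scores, "A")
print(exact_rank, heuristic_rank)
```

In this toy case the heuristic score of "A" is within 10% of its exact score, yet "A" drops from rank 1 to rank 2, which is the kind of effect a score-only comparison would not reveal.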
Exact and heuristic algorithms for Cograph Editing
Authors:
W. Timothy J. White,
Marcus Ludwig,
Sebastian Böcker
Abstract:
We present a dynamic programming algorithm for optimally solving the Cograph Editing problem on an $n$-vertex graph that runs in $O(3^n n)$ time and uses $O(2^n)$ space. In this problem, we are given a graph $G = (V, E)$ and the task is to find a smallest possible set $F \subseteq V \times V$ of vertex pairs such that $(V, E \bigtriangleup F)$ is a cograph (or $P_4$-free graph), where $\bigtriangleup$ represents the symmetric difference operator. We also describe a technique for speeding up the algorithm in practice. Additionally, we present a heuristic for solving the Cograph Editing problem which produces good results on small to medium datasets. In applications, finding the ground truth is much more important than finding some optimal solution. For the first time, we evaluate whether the cograph property is strict enough to recover the true graph from data to which noise has been added.
Submitted 10 January, 2018; v1 submitted 15 November, 2017;
originally announced November 2017.
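To make the Cograph Editing problem statement above concrete, here is a minimal brute-force sketch for very small graphs. It is not the $O(3^n n)$ dynamic program or the heuristic from the paper; it simply tries edit sets $F$ of increasing size until $(V, E \bigtriangleup F)$ contains no induced $P_4$. The function names `is_cograph` and `min_cograph_edit` are hypothetical.

```python
from itertools import combinations, permutations


def is_cograph(n, edges):
    """Return True if the graph on vertices 0..n-1 has no induced P4."""
    adj = {frozenset(e) for e in edges}

    def has(u, v):
        return frozenset((u, v)) in adj

    for quad in combinations(range(n), 4):
        for a, b, c, d in permutations(quad):
            # Induced path a-b-c-d: three path edges, no chords.
            if (has(a, b) and has(b, c) and has(c, d)
                    and not has(a, c) and not has(a, d) and not has(b, d)):
                return False
    return True


def min_cograph_edit(n, edges):
    """Brute-force a smallest set F of vertex pairs such that the
    symmetric difference of the edge set and F is a cograph.
    Exponential in n; only meant for toy inputs."""
    edge_set = {frozenset(e) for e in edges}
    pairs = [frozenset(p) for p in combinations(range(n), 2)]
    for k in range(len(pairs) + 1):
        for candidate in combinations(pairs, k):
            if is_cograph(n, edge_set ^ set(candidate)):
                return set(candidate)


# The path 0-1-2-3 is itself a P4, so at least one edit is needed.
path = [(0, 1), (1, 2), (2, 3)]
print(is_cograph(4, path), len(min_cograph_edit(4, path)))
```

Because the search enumerates edit sets by increasing size, the first set found is a smallest one; a smallest edit set need not be unique.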