Nothing Special   »   [go: up one dir, main page]

skip to main content
article

Algorithms for Efficient Near-Perfect Phylogenetic Tree Reconstruction in Theory and Practice

Published: 01 October 2007 Publication History

Abstract

We consider the problem of reconstructing near-perfect phylogenetic trees using binary character states (referred to as BNPP). A perfect phylogeny assumes that every character mutates at most once in the evolutionary tree, yielding an algorithm for binary character states that is computationally efficient but not robust to imperfections in real data. A near-perfect phylogeny relaxes the perfect phylogeny assumption by allowing at most a constant number of additional mutations. We develop two algorithms for constructing optimal near-perfect phylogenies and provide empirical evidence of their performance. The first simple algorithm is fixed parameter tractable when the number of additional mutations and the number of characters that share four gametes with some other character are constants. The second, more involved algorithm for the problem is fixed parameter tractable when only the number of additional mutations is fixed. We have implemented both algorithms and shown them to be extremely efficient in practice on biologically significant data sets. This work proves the BNPP problem fixed parameter tractable and provides the first practical phylogenetic tree reconstruction algorithms that find guaranteed optimal solutions while being easily implemented and computationally feasible for data sets of biologically meaningful size and complexity.

References

[1]
R. Agarwala and D. Fernandez-Baca, “A Polynomial-Time Algorithm for the Perfect Phylogeny Problem When the Number of Character States Is Fixed,” SIAM J. Computing, vol. 23, pp. 1216-1224, 1994.
[2]
G.E. Blelloch, K. Dhamdhere, E. Halperin, R. Ravi, R. Schwartz, and S. Sridhar, “Fixed Parameter Tractability of Binary Near-Perfect Phylogenetic Tree Reconstruction,” Proc. 33rd Int'l Colloquium Automata, Languages, and Programming, 2006.
[3]
H. Bodlaender, M. Fellows, and T. Warnow, “Two Strikes against Perfect Phylogeny,” Proc. 19th Int'l Colloquium on Automata, Languages, and Programming, pp. 273-283, 1992.
[4]
H. Bodlaender, M. Fellows, M. Hallett, H. Wareham, and T. Warnow, “The Hardness of Perfect Phylogeny, Feasible Register Assignment and Other Problems on Thin Colored Graphs,” Theoretical Computer Science, vol. 244, nos. 1-2, pp. 167-188, 2000.
[5]
M. Bonet, M. Steel, T. Warnow, and S. Yooseph, “Better Methods for Solving Parsimony and Compatibility,” J. Computational Biology, vol. 5, no. 3, pp. 409-422, 1992.
[6]
R.G. Downey and M.R. Fellows, Parameterized Complexity. Springer, 1999.
[7]
W.H. Day and D. Sankoff, “Computational Complexity of Inferring Phylogenies by Compatibility,” Systematic Zoology, vol. 35, pp. 224-229, 1986.
[8]
P. Damaschke, “Parameterized Enumeration, Transversals, and Imperfect Phylogeny Reconstruction,” Proc. Int'l Workshop Parameterized and Exact Computation, pp. 1-12, 2004.
[9]
E. Eskin, E. Halperin, and R.M. Karp, “Efficient Reconstruction of Haplotype Structure via Perfect Phylogeny,” J. Bioinformatics and Computational Biology, vol. 1, no. 1, pp. 1-20, 2003.
[10]
J. Felsenstein, PHYLIP (Phylogeny Inference Package) version 3.6, distributed by the author, Dept. of Genome Sciences, Univ. of Washington, Seattle, 2005.
[11]
D. Fernandez-Baca and J. Lagergren, “A Polynomial-Time Algorithm for Near-Perfect Phylogeny,” SIAM J. Computing, vol. 32, pp.1115-1127, 2003.
[12]
L.R. Foulds and R.L. Graham, “The Steiner Problem in Phylogeny Is NP-Complete,” Advances in Applied Math., vol. 3, pp. 43-49, 1982.
[13]
G. Ganapathy, V. Ramachandran, and T. Warnow, “Better Hill-Climbing Searches for Parsimony,” Proc. Third Int'l Workshop Algorithms in Bioinformatics (WABI '03), pp. 254-258, 2003.
[14]
D. Gusfield, “Efficient Algorithms for Inferring Evolutionary Trees,” Networks, vol. 21, pp. 19-28, 1991.
[15]
D. Gusfield, Algorithms on Strings, Trees and Sequences. Cambridge Univ. Press, 1999.
[16]
D. Gusfield and V. Bansal, “A Fundamental Decomposition Theory for Phylogenetic Networks and Incompatible Characters,” Proc. Ninth Ann. Int'l Conf. Research in Computational Molecular Biology, pp. 217-232, 2005.
[17]
D. Gusfield, S. Eddhu, and C. Langley, “Efficient Reconstruction of Phylogenetic Networks with Constrained Recombination,” Proc. Second IEEE CS Computational Systems Bioinformatics Conf., pp. 363-374, 2003.
[18]
D.A. Hinds, L.L. Stuve, G.B. Nilsen, E. Halperin, E. Eskin, D.G. Ballinger, K.A. Frazer, and D.R. Cox, “Whole Genome Patterns of Common DNA Variation in Three Human Populations,” Science, vol. 307, no. 5712, pp. 1072-1079, 2005.
[19]
The International HapMap Consortium, “The International HapMap Project,” Nature, vol. 426, pp. 789-796, 2003.
[20]
S. Kannan and T. Warnow, “A Fast Algorithm for the Computation and Enumeration of Perfect Phylogenies,” SIAM J. Computing, vol. 26, pp. 1749-1763, 1997.
[21]
M. Merimaa, M. Liivak, E. Heinaru, J. Truu, and A. Heinaru, “Functional Co-Adaption of Phenol Hydroxylase and Catechol 2,3-Dioxygenase Genes in Bacteria Possessing Different Phenol and P-Cresol Degradation Pathways,” Proc. Eighth Symp. Bacterial Genetics and Ecology, 2005.
[22]
H.J. Promel and A. Steger, The Steiner Tree Problem: A Tour through Graphs Algorithms and Complexity. Vieweg Verlag, 2002.
[23]
S. Sridhar, K. Dhamdhere, G.E. Blelloch, E. Halperin, R. Ravi, and R. Schwartz, “Simple Reconstruction of Binary Near-Perfect Phylogenetic Trees,” Proc. Int'l Workshop Bioinformatics Research and Applications, 2006.
[24]
C. Semple and M. Steel, Phylogenetics. Oxford Univ. Press, 2003.
[25]
M.A. Steel, “The Complexity of Reconstructing Trees from Qualitative Characters and Subtrees,” J. Classification, vol. 9, pp.91-116, 1992.
[26]
S.T. Sherry, M.H. Ward, M. Kholodov, J. Baker, L. Pham, E. Smigielski, and K. Sirotkin, “dbSNP: The NCBI Database of Genetic Variation,” Nucleic Acids Research, vol. 29, pp. 308-311, 2001.
[27]
A.C. Stone, R.C. Griffiths, S.L. Zegura, and M.F. Hammer, “High Levels of Y-Chromosome Nucleotide Diversity in the Genus Pan,” Proc. Nat'l Academy of Sciences USA, vol. 99, no. 1, pp. 43-48, 2002.
[28]
T. Wirth, X. Wang, B. Linz, R.P. Novick, J.K. Lum, M. Blaser, G. Morelli, D. Falush, and M. Achtman, “Distinguishing Human Ethnic Groups by Means of Sequences from Helicobacter pylori: Lessons from Ladakh,” Proc. Nat'l Academy of Sciences USA, vol. 101, no. 14, pp. 4746-4751, 2004.

Cited By

View all
  • (2025)Parameterized algorithms for the Steiner arborescence problem on a hypercubeActa Informatica10.1007/s00236-024-00474-862:1Online publication date: 1-Mar-2025
  • (2009)Generalizing the four gamete condition and splits equivalence theoremProceedings of the 9th international conference on Algorithms in bioinformatics10.5555/1812906.1812924(206-219)Online publication date: 12-Sep-2009

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image IEEE/ACM Transactions on Computational Biology and Bioinformatics
IEEE/ACM Transactions on Computational Biology and Bioinformatics  Volume 4, Issue 4
October 2007
192 pages

Publisher

IEEE Computer Society Press

Washington, DC, United States

Publication History

Published: 01 October 2007
Published in TCBB Volume 4, Issue 4

Author Tags

  1. biology and genetics
  2. computations on discrete structures
  3. trees

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)4
  • Downloads (Last 6 weeks)1
Reflects downloads up to 28 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2025)Parameterized algorithms for the Steiner arborescence problem on a hypercubeActa Informatica10.1007/s00236-024-00474-862:1Online publication date: 1-Mar-2025
  • (2009)Generalizing the four gamete condition and splits equivalence theoremProceedings of the 9th international conference on Algorithms in bioinformatics10.5555/1812906.1812924(206-219)Online publication date: 12-Sep-2009

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media