DOI: 10.1145/3107411.3107450

Synthesizing Species Trees from Unrooted Gene Trees: A Parameterized Approach

Published: 20 August 2017

Abstract

Synthesizing species trees from a collection of smaller gene trees is a widely used approach for inferring credible species tree estimates. While corresponding computational problems are typically NP-hard, several of these problems have been effectively addressed by using the parameterized Strict Consensus Approach. This approach is limited to gene trees that are rooted. In practice, however, most gene trees are unrooted, and it is often difficult, if not impossible, to identify accurate rootings. Here, we address this stringent limitation by proposing efficient algorithms that adopt the parameterized Strict Consensus Approach to handle unrooted gene trees. Finally, we demonstrate the performance of our algorithms in a comparative study using empirical and simulated data sets.
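
As background for the terminology above, the following is a minimal sketch of one standard ingredient, the strict consensus of rooted trees: the clusters (leaf sets below an internal node) that appear in every input tree. It is not the paper's algorithm, which the abstract does not spell out. The nested-tuple tree encoding, the function names, and the assumption that all trees share the same leaf set are illustrative choices made here.

```python
def clusters(tree):
    """Collect the leaf set below every internal node of a rooted tree.

    Assumed encoding (hypothetical): a leaf is a string, an internal
    node is a tuple of child subtrees.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        # Leaf set of an internal node is the union of its children's leaf sets.
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def strict_consensus_clusters(trees):
    """Return the clusters present in every input tree.

    Assumes all trees are rooted and defined on the same leaf set.
    """
    return set.intersection(*(clusters(t) for t in trees))


# Two rooted trees that agree on the cluster {a, b} but not on where c and d attach.
t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "b"), "d"), "c")
shared = strict_consensus_clusters([t1, t2])
```

Here `shared` contains only the cluster `{a, b}` and the full leaf set. A cluster is defined relative to a root, which is why a definition like this one does not carry over directly to unrooted gene trees.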

Cited By

  • (2019) Consensus of all Solutions for Intractable Phylogenetic Tree Inference. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 1-1. DOI: 10.1109/TCBB.2019.2947051. Online publication date: 2019.
  • (2018) Phylogenetic Consensus for Exact Median Trees. Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, 366-375. DOI: 10.1145/3233547.3233560. Online publication date: 15-Aug-2018.

Published In

ACM-BCB '17: Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
August 2017
800 pages
ISBN: 9781450347228
DOI: 10.1145/3107411

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. gene duplication
  2. guidance tree based tree rooting
  3. guidance tree based unrooted tree fill-in
  4. pareto for clusters
  5. strict consensus approach

Qualifiers

  • Research-article

Conference

BCB '17

Acceptance Rates

ACM-BCB '17 Paper Acceptance Rate 42 of 132 submissions, 32%;
Overall Acceptance Rate 254 of 885 submissions, 29%
