Summary
A progressive alignment method is described that utilizes the Needleman and Wunsch pairwise alignment algorithm iteratively to achieve the multiple alignment of a set of protein sequences and to construct an evolutionary tree depicting their relationship. The sequences are assumed a priori to share a common ancestor, and the trees are constructed from difference matrices derived directly from the multiple alignment. The thrust of the method involves putting more trust in the comparison of recently diverged sequences than in those evolved in the distant past. In particular, this rule is followed: “once a gap, always a gap”. The method has been applied to three sets of protein sequences: 7 superoxide dismutases, 11 globins, and 9 tyrosine kinase-like sequences. Multiple alignments and phylogenetic trees for these sets of sequences were determined and compared with trees derived by conventional pairwise treatments. In several instances, the progressive method led to trees that appeared to be more in line with biological expectations than were trees obtained by more commonly used methods.
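The pairwise step that the method applies iteratively can be illustrated with a minimal sketch of Needleman and Wunsch style global alignment. The function name `needleman_wunsch` and the match, mismatch, and gap values are assumptions chosen for illustration; the summary does not state the scoring scheme used in the paper, and this sketch covers only a single pairwise comparison, not the progressive procedure or the tree construction.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b by dynamic programming.

    Scoring values are illustrative assumptions, not those of the paper.
    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for a[i-1] against b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

In a progressive scheme of the kind the summary describes, gaps placed by such a pairwise step in closely related sequences would be kept when more distant sequences are added later, which is the sense of "once a gap, always a gap".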
References
Bajaj M, Blundell T (1984) Evolution and the tertiary structure of proteins. Ann Rev Biophys Bioeng 13:453–492
Bannister JV, Parker MW (1985) The presence of a copper/zinc superoxide dismutase in the bacterium Photobacterium leiognathi: a likely case of gene transfer from eukaryotes to prokaryotes. Proc Natl Acad Sci USA 82:149–152
Cannon RE, White JA, Scandalios JG (1987) Cloning of cDNA for maize superoxide dismutase 2 (SOD2). Proc Natl Acad Sci USA 84:179–183
Dayhoff MO, Eck RV (1968) Atlas of protein sequence and structure 1967–1968, National Biomedical Research Foundation, Silver Spring MD, p 19
Dayhoff MO, Park CM, McLaughlin PJ (1972) Building a phylogenetic tree: cytochrome c. In: Dayhoff MO (ed) Atlas of protein sequence and structure, vol 5. National Biomedical Research Foundation, Washington DC, pp 7–16
Dayhoff MO, Schwartz RM, Orcutt BC (1978) A model for evolutionary change. In: Dayhoff MO (ed) Atlas of protein sequence and structure, vol 5, suppl 3. National Biomedical Research Foundation, Washington DC, pp 345–358
Doolittle RF (1981) Similar amino acid sequences: chance or common ancestry? Science 214:149–159
Feng DF, Johnson MS, Doolittle RF (1985) Aligning amino acid sequences: comparison of commonly used methods. J Mol Evol 21:112–125
Fitch WM (1966) An improved method of testing for evolutionary homology. J Mol Biol 16:9–16
Fitch WM (1970) Further improvements in the method of testing for evolutionary homology among proteins. J Mol Biol 49:1–14
Fitch WM (1977) On the problem of discovering the most parsimonious tree. Am Nat 111:223–257
Fitch WM (1981) The old REH theory remains unsatisfactory and the new REH theory is problematical—a reply to Holmquist and Jukes. J Mol Evol 18:60–67
Fitch WM, Margoliash E (1967) Construction of phylogenetic trees. Science 155:279–284
Fredman ML (1984) Computing evolutionary similarity measures with length independent gap penalties. Bull Math Biol 46:553–566
Goodman M, Moore GW, Barnabas J, Matsuda G (1974) The phylogeny of human globin genes investigated by the maximum parsimony method. J Mol Evol 3:1–48
Hogeweg P, Hesper B (1984) The alignment of sets of sequences and the construction of phyletic trees: an integrated method. J Mol Evol 20:175–186
Holmquist R (1979) The method of parsimony: an experimental test and theoretical analysis of the adequacy of molecular restoration studies. J Mol Biol 135:939–958
Holmquist R, Jukes T (1981) The current status of REH theory. Reply to an essay by Fitch. J Mol Evol 18:47–59
Hunt LT, Hurst-Calderone S, Dayhoff MO (1978) Globins. In: Dayhoff MO (ed) Atlas of protein sequence and structure, vol 5, suppl 3. National Biomedical Research Foundation, Washington DC, pp 229–249
Jabusch JR, Farb DL, Kerschensteiner DA, Deutsch HF (1980) Some sulfhydryl properties and primary structure of human superoxide dismutase. Biochemistry 19:2310–2316
Johansen JT, Overballe-Petersen C, Martin B, Hasemann B, Svendsen I (1979) The complete amino acid sequence of copper-zinc superoxide dismutase from Saccharomyces cerevisiae. Carlsberg Res Commun 44:201–217
Johnson MS, Doolittle RF (1986) A method for the simultaneous alignment of three or more amino acid sequences. J Mol Evol 23:267–278
Jue RA, Woodbury NW, Doolittle RF (1980) Sequence homologies among E. coli ribosomal proteins: evidence for evolutionarily related groupings and internal duplications. J Mol Evol 15:129–148
Kernighan BW, Ritchie DM (1978) The C programming language. Prentice-Hall, Englewood Cliffs NJ
Klotz LC, Blanken RL (1981) A practical method for calculating evolutionary trees from sequence data. J Theor Biol 91:261–272
Lee YM, Friedman DJ, Ayala FJ (1985) Superoxide dismutase: an evolutionary puzzle. Proc Natl Acad Sci USA 82:824–828
Leunissen JAM, De Jong WW (1986) Copper/zinc superoxide dismutase: how likely is gene transfer from ponyfish to Photobacterium leiognathi? J Mol Evol 23:250–258
Martin JP, Fridovich I (1981) Evidence for a natural gene transfer from the ponyfish to its bioluminescent bacterial symbiont Photobacter leiognathi. J Biol Chem 256:6080–6089
Moore GW, Goodman M, Barnabas J (1973) An iterative approach from the standpoint of the additive hypothesis to the dendrogram problem posed by molecular data sets. J Theor Biol 38:423–457
Murata M, Richardson JS, Sussman JL (1985) Simultaneous comparison of three protein sequences. Proc Natl Acad Sci USA 82:3073–3077
Needleman SB, Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48:443–453
Penny D, Hendy M (1986) Estimating the reliability of evolutionary trees. Mol Biol Evol 3:403–417
Rocha HA, Bannister WH, Bannister JV (1984) The amino acid sequence of copper/zinc superoxide dismutase from swordfish liver. Eur J Biochem 145:477–484
Sankoff D, Cedergren RJ, McKay WM (1982) A strategy for sequence phylogeny research. Nucleic Acids Res 10:421–431
Sellers PH (1974) Evolutionary distances. SIAM J Appl Math 26:787–793
Steffens GJ, Bannister JV, Bannister WH, Flohe L, Gunzler WA, Kim S-MA, Otting F (1983) The primary structure of Cu-Zn superoxide dismutase from Photobacterium leiognathi: evidence for a separate evolution of Cu-Zn superoxide dismutase in bacteria. Hoppe-Seyler's Z Physiol Chem 364:675–690
Steinman HM, Naik VR, Abernathy JL, Hill RL (1974) Bovine erythrocyte superoxide dismutase. J Biol Chem 249:7326–7338
Tateno Y, Nei M, Tajima F (1982) Accuracy of estimated phylogenetic trees from molecular data. I. Distantly related species. J Mol Evol 18:387–404
Wakabayashi S, Matsubara H, Webster DA (1986) Primary sequence of a dimeric bacterial hemoglobin from Vitreoscilla. Nature 322:481–483
Zelenik M, Rudloff V, Braunitzer G (1979) Die Aminosäure-Sequenz des monomeren Hämoglobins von Lampetra fluviatilis. Hoppe-Seyler's Z Physiol Chem 360:1879–1894
Cite this article
Feng, DF., Doolittle, R.F. Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J Mol Evol 25, 351–360 (1987). https://doi.org/10.1007/BF02603120