Data base for protein sequences
Pages 261 - 266
Abstract
Proteins are linear polymers synthesized in living organisms from twenty different kinds of amino acids according to the message carried in the chromosomes. Typically, they have evolved by natural selection over hundreds of millions of years through many small changes in sequence. The present computerized data base includes 77,267 amino acid residues from 767 sequences. We believe that all of the protein structures occurring in living organisms can be combined into fewer than 1,000 groups containing proteins of similar sequence. Each group can be characterized by a few sequences that are known exactly and by a number of evolutionary parameters. The rest of the structures, occurring in organisms not examined, can be described with estimated precision in terms of the number of differences from a known sequence or from sequences inferred to have been present in ancestral forms. The following kinds of information are needed: sequences of proteins from each group, the phylogenetic tree of biological species, a list of protein groups and the gene duplications in each, a description of the quantitative parameters of the evolutionary processes affecting proteins, and methods of estimation of sequences in ancestral species and in living forms phylogenetically close to those investigated. The conceptual tools and the computer programs necessary for the prediction of all of the 10^10 to 10^11 protein sequences in living species are described. One can readily visualize the separate parts operating as an integrated interactive computerized data base that could predict sequences for specified organisms with an estimated precision based on the collection of known sequences.
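The abstract describes unexamined proteins "in terms of the number of differences from a known sequence." As a minimal sketch of that idea only, the Python below counts position-by-position differences between two sequences that are assumed to be already aligned and of equal length. The function names and the two toy sequences are hypothetical illustrations, not data or programs from the paper, and the sketch does not attempt the evolutionary estimation the abstract refers to.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count positions at which two pre-aligned sequences differ.

    Assumption: both strings use one-letter amino acid codes and have
    already been aligned to the same length (no gap handling here).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def percent_difference(seq_a: str, seq_b: str) -> float:
    """Express the difference count as a percentage of aligned positions."""
    if not seq_a:
        raise ValueError("sequences must not be empty")
    return 100.0 * count_differences(seq_a, seq_b) / len(seq_a)


# Toy sequences invented for illustration; they are not real proteins.
known = "MKTAYIAKQR"
related = "MKSAYIGKQR"

print(count_differences(known, related))   # differing positions
print(percent_difference(known, related))  # same count as a percentage
```

A raw difference count of this kind is only a starting point; turning it into an estimate with stated precision would need the evolutionary parameters the abstract lists.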
Published In
June 1976
1125 pages
ISBN:9781450379175
DOI:10.1145/1499799
Copyright © 1976 ACM.
Sponsors
- AFIPS: American Federation of Information Processing Societies
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
Published: 07 June 1976
Qualifiers
- Research-article