Using knowledge-based neural networks to improve algorithms: Refining the Chou-Fasman algorithm for protein folding

Richard Maclin¹ & Jude W. Shavlik¹

Abstract

This article describes a connectionist method for refining algorithms represented as generalized finite-state automata. The method translates the rule-like knowledge in an automaton into a corresponding artificial neural network, and then refines the reformulated automaton by applying backpropagation to a set of examples. This technique for translating an automaton into a network extends the KBANN algorithm, a system that translates a set of propositional rules into a corresponding neural network. The extended system, FSKBANN, allows one to refine the large class of algorithms that can be represented as state-based processes. As a test, FSKBANN is used to improve the Chou-Fasman algorithm, a method for predicting how globular proteins fold. Empirical evidence shows that the multistrategy approach of FSKBANN leads to a statistically significantly more accurate solution than both the original Chou-Fasman algorithm and a neural network trained using the standard approach. Extensive statistics report the types of errors made by the Chou-Fasman algorithm, the standard neural network, and the FSKBANN network.
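
As a rough illustration of the translation step, the sketch below maps a single conjunctive rule onto a sigmoid unit using the weight-setting scheme commonly described for KBANN: a link weight ω on each positive antecedent, −ω on each negated antecedent, and a bias of −(P − ½)ω for P positive antecedents. The previous state of the automaton is supplied as an ordinary input, and one gradient step stands in for backpropagation refinement. The rule, the feature names (`prev_helix`, `helix_former`, `breaker`), the value ω = 4, and the learning rate are all assumptions made for illustration; none of them come from the article, and the actual FSKBANN network for the Chou-Fasman algorithm is considerably larger.

```python
import math

OMEGA = 4.0  # assumed link weight for rule antecedents (illustrative choice)


def sigmoid(x):
    """Standard logistic activation."""
    return 1.0 / (1.0 + math.exp(-x))


def rule_to_unit(positive, negated, omega=OMEGA):
    """Translate one conjunctive rule into (weights, bias) for a sigmoid unit.

    The unit's net input is positive only when every positive antecedent
    is true and every negated antecedent is false.
    """
    weights = {name: omega for name in positive}
    weights.update({name: -omega for name in negated})
    bias = -(len(positive) - 0.5) * omega
    return weights, bias


def activate(weights, bias, inputs):
    """Compute the unit's output; antecedents missing from inputs count as 0."""
    net = bias + sum(w * inputs.get(name, 0.0) for name, w in weights.items())
    return sigmoid(net)


def refine_step(weights, bias, inputs, target, lr=0.5):
    """One gradient-descent step on squared error for a single example."""
    out = activate(weights, bias, inputs)
    delta = (target - out) * out * (1.0 - out)  # error signal through the sigmoid
    new_weights = {n: w + lr * delta * inputs.get(n, 0.0) for n, w in weights.items()}
    return new_weights, bias + lr * delta


# Hypothetical state-based rule:
#   helix_now :- prev_helix, helix_former, not breaker.
# prev_helix plays the role of the automaton's previous state.
weights, bias = rule_to_unit(["prev_helix", "helix_former"], ["breaker"])

satisfied = {"prev_helix": 1.0, "helix_former": 1.0}
fires = activate(weights, bias, satisfied)  # above 0.5: rule satisfied
blocked = activate(weights, bias, dict(satisfied, breaker=1.0))  # below 0.5

# Refinement: an example that contradicts the rule nudges the weights.
new_weights, new_bias = refine_step(weights, bias, satisfied, target=0.0)
```

Because the initial weights encode the rule rather than random values, training starts from the behavior of the original algorithm and only adjusts it where the examples disagree, which is the intuition behind refining an existing automaton instead of learning from scratch.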

Author information

Authors and Affiliations

Computer Sciences Department, University of Wisconsin, 1210 W. Dayton St., Madison, WI 53706
Richard Maclin & Jude W. Shavlik

About this article

Cite this article

Maclin, R., Shavlik, J.W. Using knowledge-based neural networks to improve algorithms: Refining the Chou-Fasman algorithm for protein folding. Mach Learn 11, 195–215 (1993). https://doi.org/10.1007/BF00993077

Received: 11 December 1991
Accepted: 16 January 1992
Issue Date: May 1993
DOI: https://doi.org/10.1007/BF00993077
