Abstract
A central issue in building neural network systems for protein classification is the sequence encoding scheme used to feed the network. To address it, we propose a method that maps a protein sequence into a numerical feature space using the local matching scores of the sequence against groups of conserved patterns (called motifs). We consider two alternative schemes for discovering a group of D motifs within a set of sequences drawn from K classes. We also evaluate the impact of background features (2-grams) on the performance of the neural system. Experimental results on real datasets indicate that the proposed method is superior to other known protein classification approaches.
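The encoding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each motif is given as a position weight matrix (one residue-to-probability dict per position), scores a sequence by its best-matching window using log-odds against a uniform background, and optionally appends normalized 2-gram frequencies. The names `motif_score`, `two_gram_features`, and `encode`, the uniform background, and the probability floor are all hypothetical choices made for illustration.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BACKGROUND = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequency


def motif_score(sequence, pwm, floor=1e-6):
    """Best log-odds score of any window of `sequence` against one motif.

    `pwm` is a list of dicts, one per motif position, mapping
    residue -> probability. Residues missing from a column get `floor`.
    """
    width = len(pwm)
    # Lowest score a full window can reach; also used when the
    # sequence is shorter than the motif and no window exists.
    best = width * math.log(floor / BACKGROUND)
    for start in range(len(sequence) - width + 1):
        score = 0.0
        for offset, column in enumerate(pwm):
            p = column.get(sequence[start + offset], floor)
            score += math.log(p / BACKGROUND)
        best = max(best, score)
    return best


def two_gram_features(sequence):
    """Relative frequency of each of the 400 ordered residue pairs."""
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    for i in range(len(sequence) - 1):
        gram = sequence[i:i + 2]
        if gram in counts:  # skip pairs with non-standard residues
            counts[gram] += 1
    total = max(len(sequence) - 1, 1)
    return [counts[p] / total for p in pairs]


def encode(sequence, motifs, use_two_grams=False):
    """Map a sequence to D motif scores, optionally plus 2-gram features."""
    features = [motif_score(sequence, pwm) for pwm in motifs]
    if use_two_grams:
        features.extend(two_gram_features(sequence))
    return features
```

With D motifs, `encode` returns a D-dimensional vector, or D + 400 dimensions when the 2-gram background features are included; either vector could then serve as input to a classifier.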
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Blekas, K., Fotiadis, D.I., Likas, A. (2003). Protein Sequence Classification Using Probabilistic Motifs and Neural Networks. In: Kaynak, O., Alpaydin, E., Oja, E., Xu, L. (eds) Artificial Neural Networks and Neural Information Processing — ICANN/ICONIP 2003. Lecture Notes in Computer Science, vol 2714. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44989-2_84
DOI: https://doi.org/10.1007/3-540-44989-2_84
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40408-8
Online ISBN: 978-3-540-44989-8
eBook Packages: Springer Book Archive