Abstract
DNA barcoding is a promising technique for identifying biological species from a relatively short sequence of the COI gene. One way to improve DNA barcoding is to study classification techniques that exploit common properties of DNA and amino acid sequences, such as the variable lengths of gene sequences, and that allow different reference genes to be compared. In this study, we evaluate a classification model for DNA barcoding induced by genetic programming. The proposed method can be adapted to both DNA and amino acid sequences. Performance is evaluated using representations of the two sequence types and one representation based on their properties. The proposed method evaluates common significant sites on the reference genes that are useful for differentiating between species.
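To make the notion of "significant sites" concrete, the following is a minimal sketch, not the genetic programming procedure evaluated in the paper. It assumes the sequences are already aligned and treats a site as significant when every species shows a single residue there and at least two species differ. The function name `diagnostic_sites`, the truncation to the shortest sequence, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def diagnostic_sites(sequences, labels):
    """Return positions where each species is fixed for one residue
    and at least two species carry different residues.

    Hypothetical helper: works on any character sequences, so DNA and
    amino acid strings are handled the same way.
    """
    # Assumption: handle variable lengths by comparing only the shared prefix.
    length = min(len(seq) for seq in sequences)
    sites = []
    for pos in range(length):
        residues_by_species = defaultdict(set)
        for seq, label in zip(sequences, labels):
            residues_by_species[label].add(seq[pos])
        # Every species must show exactly one residue at this position.
        fixed_within = all(len(r) == 1 for r in residues_by_species.values())
        # Collect the single residue of each species and require a difference.
        distinct = {next(iter(r)) for r in residues_by_species.values()}
        if fixed_within and len(distinct) > 1:
            sites.append(pos)
    return sites


# Toy, made-up data (not taken from the paper or from BOLD).
sequences = ["ACGTAC", "ACGTAT", "ACCTAC", "ACCTAT"]
labels = ["sp1", "sp1", "sp2", "sp2"]
print(diagnostic_sites(sequences, labels))
```

In the toy data only position 2 (zero-based) separates the two species: position 5 varies, but it varies within each species, so it carries no diagnostic signal. A learned classifier such as the one described in the abstract would be expected to rely on sites of the first kind.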
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Zamani, M., Chiu, D.K.Y. (2010). An Evaluation of DNA Barcoding Using Genetic Programming-Based Process. In: Li, K., Jia, L., Sun, X., Fei, M., Irwin, G.W. (eds) Life System Modeling and Intelligent Computing. ICSEE LSMS 2010. Lecture Notes in Computer Science, vol 6330. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15615-1_36
DOI: https://doi.org/10.1007/978-3-642-15615-1_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15614-4
Online ISBN: 978-3-642-15615-1
eBook Packages: Computer Science (R0)