Abstract
In whole-genome shotgun sequencing of an environmental sample, where the DNA fragments derive from thousands of microorganisms, traditional alignment methods are impractical because of their high computational complexity. In this paper, we take as the feature vector a divergence vector consisting of Kullback-Leibler divergences computed for different word lengths. On this basis, we use a back-propagation (BP) neural network to identify whether two fragments come from the same microorganism and to obtain the similarity between fragments. Finally, we develop a new method to cluster DNA fragments from different microorganisms into different groups. Experiments show that it performs well.
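To make the feature vector concrete, the following is a minimal Python sketch of one way to build a divergence vector for a pair of fragments. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact definition, so the function names (`word_distribution`, `kl_divergence`, `divergence_vector`), the add-one pseudocount smoothing, the range of word lengths (1 to `max_k`), and the direction of the divergence (fragment `a` relative to fragment `b`) are all hypothetical choices.

```python
from collections import Counter
from itertools import product
from math import log

ALPHABET = "ACGT"


def word_distribution(seq, k, pseudocount=1.0):
    """Smoothed relative frequency of every length-k word over ACGT in seq.

    The pseudocount is an assumption of this sketch; it keeps every
    probability positive so the divergence below is always finite.
    Words containing other symbols (e.g. N) are ignored.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product(ALPHABET, repeat=k)]
    total = sum(counts[w] for w in words) + pseudocount * len(words)
    return {w: (counts[w] + pseudocount) / total for w in words}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats over a shared word set."""
    return sum(p[w] * log(p[w] / q[w]) for w in p)


def divergence_vector(a, b, max_k=3):
    """One KL divergence per word length k = 1..max_k (max_k is hypothetical)."""
    return [
        kl_divergence(word_distribution(a, k), word_distribution(b, k))
        for k in range(1, max_k + 1)
    ]
```

A vector such as `divergence_vector(fragment_a, fragment_b)` would then serve as the input to a classifier that decides whether the two fragments share a source organism. Because the Kullback-Leibler divergence is not symmetric, swapping the two fragments generally gives a different vector; whether and how the original method symmetrizes it is not stated in the abstract.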
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pi, X., Yang, W., Zhang, L. (2005). Blind Clustering of DNA Fragments Based on Kullback-Leibler Divergence. In: Wang, L., Chen, K., Ong, Y.S. (eds) Advances in Natural Computation. ICNC 2005. Lecture Notes in Computer Science, vol 3610. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11539087_139
DOI: https://doi.org/10.1007/11539087_139
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28323-2
Online ISBN: 978-3-540-31853-8
eBook Packages: Computer Science (R0)