Abstract
Clustering genes into groups that exhibit similar expression patterns is one of the most fundamental problems in microarray data analysis. In this paper, we present a normalized Expectation-Maximization (EM) approach to gene-based clustering. The normalized EM follows the framework of generative clustering models, but is designed for data that lie in a fixed manifold. We illustrate its effectiveness on two real microarray data sets by comparing its clustering results with those produced by other related clustering algorithms. The normalized EM performs better than the related algorithms in terms of clustering outcomes.
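The abstract does not say which manifold is used or how the EM updates are defined, so the following is only a rough sketch of the general idea and not the authors' algorithm. It assumes the manifold is the unit hypersphere (each expression profile is centered and scaled to unit length) and uses a simplified mixture with one shared, fixed concentration parameter. The names `normalize_rows`, `spherical_soft_em`, and `kappa`, as well as the farthest-first initialization, are hypothetical choices made for this illustration.

```python
import numpy as np

def normalize_rows(X):
    """Center each expression profile and scale it to unit Euclidean norm,
    so that every gene lies on the unit hypersphere (an assumed manifold)."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms == 0, 1.0, norms)

def spherical_soft_em(X, k, kappa=10.0, n_iter=50):
    """Simplified EM-style clustering on the unit hypersphere.

    Assumption: all components share a fixed concentration `kappa`,
    so only the mean directions and mixing weights are re-estimated.
    """
    Z = normalize_rows(X)

    # Deterministic farthest-first initialization (illustrative choice):
    # start from the first profile, then repeatedly add the profile that
    # is least similar (by cosine) to the centers chosen so far.
    centers = [Z[0]]
    for _ in range(1, k):
        sim = np.max(Z @ np.array(centers).T, axis=1)
        centers.append(Z[np.argmin(sim)])
    mu = np.array(centers)
    pi = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: soft assignments from scaled cosine similarity.
        logits = kappa * (Z @ mu.T) + np.log(pi)
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        R = np.exp(logits)
        R /= R.sum(axis=1, keepdims=True)

        # M-step: update mixing weights (clipped to avoid log(0)) and
        # project the weighted mean of each cluster back onto the sphere.
        pi = np.clip(R.mean(axis=0), 1e-12, None)
        M = R.T @ Z
        norms = np.linalg.norm(M, axis=1, keepdims=True)
        mu = M / np.where(norms == 0, 1.0, norms)

    return R.argmax(axis=1), mu
```

Calling `spherical_soft_em(X, k)` on a genes-by-conditions matrix `X` returns one hard label per gene and the estimated mean directions. Because every profile is normalized first, genes that differ only in offset or amplitude are treated as having the same pattern.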
© 2008 Springer-Verlag Berlin Heidelberg
Phuong, N.M., Tuan, H.D. (2008). Clustering in a Fixed Manifold to Detect Groups of Genes with Similar Expression Patterns. In: Elloumi, M., Küng, J., Linial, M., Murphy, R.F., Schneider, K., Toma, C. (eds) Bioinformatics Research and Development. BIRD 2008. Communications in Computer and Information Science, vol 13. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70600-7_3
DOI: https://doi.org/10.1007/978-3-540-70600-7_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70598-7
Online ISBN: 978-3-540-70600-7
eBook Packages: Computer Science (R0)