Abstract
The huge volume of gene expression data produced by microarrays and other high-throughput techniques has encouraged the development of new computational techniques to evaluate the data and to formulate new biological hypotheses. To this end, co-clustering techniques are widely used: they identify groups of genes that show similar activity patterns under a specific subset of the experimental conditions by measuring the similarity in expression within these groups. However, in many applications, distance metrics based only on expression levels fail to capture biologically meaningful clusters.
We propose a methodology in which a standard expression-based co-clustering algorithm is enhanced by sets of constraints that take into account the similarity or dissimilarity (inferred from the Gene Ontology, GO) between pairs of genes. Our approach minimizes the intervention of the analyst in the co-clustering process. It provides meaningful co-clusters whose discovery and interpretation are improved by embedding GO annotations.
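As an illustration only, the idea of turning GO-based similarity into pairwise constraints can be sketched as follows. The abstract does not state how similarity is computed or how constraints are selected, so everything here is an assumption: the Jaccard overlap of GO term sets stands in for the similarity measure, the function names `jaccard` and `derive_constraints` and both thresholds are hypothetical, and the gene names are placeholders.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of GO term identifiers."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def derive_constraints(annotations, must_threshold=0.5, cannot_threshold=0.0):
    """Derive pairwise constraints from GO annotations.

    annotations maps a gene name to its set of GO term identifiers.
    Pairs with similarity >= must_threshold become must-link constraints;
    pairs with similarity <= cannot_threshold become cannot-link constraints.
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    must_link, cannot_link = [], []
    for g1, g2 in combinations(sorted(annotations), 2):
        t1, t2 = annotations[g1], annotations[g2]
        if not t1 or not t2:
            continue  # an unannotated gene yields no constraint
        sim = jaccard(t1, t2)
        if sim >= must_threshold:
            must_link.append((g1, g2))
        elif sim <= cannot_threshold:
            cannot_link.append((g1, g2))
    return must_link, cannot_link


# Toy annotations; gene names are placeholders.
annotations = {
    "geneA": {"GO:0006412", "GO:0003735"},
    "geneB": {"GO:0006412", "GO:0003735", "GO:0005840"},
    "geneC": {"GO:0006281"},
    "geneD": set(),
}
must_link, cannot_link = derive_constraints(annotations)
print(must_link)
print(cannot_link)
```

Because the constraints are derived automatically from the annotations, the analyst does not have to select gene pairs by hand. A constrained co-clustering algorithm could then take the two lists as side information next to the expression matrix.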
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cordero, F., Pensa, R.G., Visconti, A., Ienco, D., Botta, M. (2009). Ontology-Driven Co-clustering of Gene Expression Data. In: Serra, R., Cucchiara, R. (eds) AI*IA 2009: Emergent Perspectives in Artificial Intelligence. AI*IA 2009. Lecture Notes in Computer Science, vol 5883. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10291-2_43
DOI: https://doi.org/10.1007/978-3-642-10291-2_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10290-5
Online ISBN: 978-3-642-10291-2
eBook Packages: Computer Science, Computer Science (R0)