Abstract
The explosion of ‘‘omics’’ data over the past few decades has generated an increasing need of efficiently analyzing high-dimensional gene expression data in several different and heterogenous contexts, such as for example in information retrieval, knowledge discovery, and data mining. For this reason, biclustering, or simultaneous clustering of both genes and conditions has generated considerable interest over the past few decades. Unfortunately, the problem of locating the most significant bicluster has been shown to be NP-complete. We have designed and implemented a GRASP-like heuristic algorithm to efficiently find good solutions in reasonable running times, and to overcome the inner intractability of the problem from a computational point of view.
Experimental results on two datasets of expression data are promising indicating that this algorithm is able to find significant biclusters, especially from a biological point of view.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Hartigan, J.: Direct clustering of a data matrix. J. Am. Stat. Assoc. 67, 123–127 (1972)
Cheng, Y., Church, G.M.: Biclustering of expression data. In: Altman, R., Bailey, T., Bourne, P., Gribskov, M., Lengauer, T., Shindyalov, I. (eds.) Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology (ISMB 2000), pp. 93–103 (2000)
Madeira, S., Oliveira, A.: Biclustering algorithms for biological data analysis: A survey. IEEE/ACM Trans. Comput. Biol. Bioinform. 1, 24–45 (2004)
Tanay, A., Sharan, R., Shamir, R.: Discovering statistically significant biclusters in gene expression data. Bioinformatics 18(suppl. 1), S136–S144 (2002)
Wang, H., Wang, W., Yang, J., Yu, P.: Clustering by pattern similarity in large data sets. In: Proc. 2002 ACM SIGMOD Int’l Conf. Management of Data, pp. 394–405 (2002)
Getz, G., Levine, E., Domany, E.: Coupled two-way clustering analysis of gene microarray data. Proc. Natl. Acad. Sci. USA 97 22, 12079–12084 (2000)
Tang, C., Zhang, L., Zhang, I., Ramanathan, M.: Interrelated two-way clustering: An unsupervised approach for gene expression data analysis. In: Proc. Second IEEE Int’l Symp. Bioinformatics and Bioeng., pp. 41–48 (2001)
Duffy, D., Quiroz, A.: A permutation based algorithm for block clustering. J. Classif. 8, 65–91 (1991)
Cho, H., Dhillon, I., Guan, Y., Sra, S.: Minimum Sum-Squared Residue Co-clustering of Gene Expression Data. In: Berry, M., Dayal, U. (eds.) Proceedings of the 4th SIAM Int’l Conf. Data Mining (2004)
Yang, J., Wang, W., Wang, H., Yu, P.: δ-clusters: Capturing subspace correlation in a large data set. In: Proc. 18th IEEE Int’l Conf. Data Eng., pp. 517–528 (2002)
Yang, J., Wang, W., Wang, H., Yu, P.: Enhanced biclustering on expression data. In: Proc. Third IEEE Conf. Bioinformatics and Bioeng., pp. 321–327 (2003)
Klugar, Y., Basri, R., Chang, J., Gerstein, M.: Spectral biclustering of microarray data: Coclustering genes and conditions. Genome Res. 13, 703–716 (2003)
Segal, E., Taskar, B., Gasch, A., Friedman, N., Koller, D.: Rich probabilistic models for gene expression. Bioinformatics 17(suppl. 1), S243–S252 (2001)
Sheng, Q., Moreau, Y., Moor, B.D.: Biclustering microarray data by gibbs sampling. Bioinformatics 19(suppl. 2), ii196–ii205 (2003)
Manjunath Aradhya, V.N., Masulli, F., Rovetta, S.: A Novel Approach for Biclustering Gene Expression Data Using Modular Singular Value Decomposition. In: Masulli, F., Peterson, L.E., Tagliaferri, R. (eds.) CIBB 2009. LNCS, vol. 6160, pp. 254–265. Springer, Heidelberg (2010)
Bryan, K., Cunningham, P., Bolshakova, N.: Application of simulated annealing to the biclustering of gene expression data. IEEE Trans. Inf. Technol. Biomed. 10(3), 519–525 (2006)
Mitra, S., Banka, H.: Multi-objective evolutionary biclustering of gene expression data. Pattern Recogn. 39, 2464–2477 (2006)
Dharan, S., Nair, A.: Biclustering of gene expression data using reactive greedy randomized adaptive search procedure. BMC Bioinformatics 10(suppl. 1), S27 (2009)
Tanay, A., Sharan, R., Shamir, R.: Biclustering Algorithms: A Survey. In: Aluru, S. (ed.) Handbook of Computational Molecular Biology. Computer and Information Science Series. S. Chapman & Hall/CRC (2005)
Peeters, R.: The maximum edge biclique problem is NP-Complete. Discrete Appl. Math. 131(3), 651–654 (2003)
Feo, T., Resende, M.: A probabilistic heuristic for a computationally difficult set covering problem. Oper. Res. Lett. 8, 67–71 (1989)
Feo, T., Resende, M.: Greedy randomized adaptive search procedures. J. Global Optim. 6, 109–133 (1995)
Festa, P., Resende, M.: GRASP: An annotated bibliography. In: Ribeiro, C., Hansen, P. (eds.) Essays and Surveys on Metaheuristics, pp. 325–367. Kluwer Academic Publishers (2002)
Festa, P., Resende, M.: An annotated bibliography of GRASP – Part I: Algorithms. International Transactions in Operational Research 16(1), 1–24 (2009)
Festa, P., Resende, M.: An annotated bibliography of GRASP – Part II: Applications. International Transactions in Operational Research 16(2), 131–172 (2009)
Prais, M., Ribeiro, C.: Reactive GRASP: An application to a matrix decomposition problem in TDMA traffic assignment. INFORMS J. Comput. 12, 164–176 (2000)
Binato, S., Oliveira, G.: A Reactive GRASP for transmission network expansion planning. In: Ribeiro, C., Hansen, P. (eds.) Essays and Surveys on Metaheuristics, pp. 81–100. Kluwer Academic Publishers (2002)
Delmaire, H., Díaz, J., Fernández, E., Ortega, M.: Reactive GRASP and tabu search based heuristics for the single source capacitated plant location problem. INFOR 37, 194–225 (1999)
Tavazoie, S., Hughes, J., Campbell, M.J., Cho, R.J., Church, G.M.: Systematic determination of genetic network architecture. Nat. Genet. 22, 281–285 (1999)
Alizadeh, A., Eisen, M., Davis, R., Ma, C., Lossos, I., Rosenwald, A., Boldrick, J., Sabet, H., Tran, T., Yu, X., Powell, J., Yang, L., Marti, G., Moore, T., Hudson, J., Lu, L., Lewis, D., Tibshirani, R., Sherlock, G., Chan, W., Greiner, T., Weisenburger, D., Armitage, J., Warnke, R., Levy, R., Wilson, W., Grever, M., Byrd, J., Botstein, D., Brown, P., Staudt, L.: Distinct types of diffuse large b-cell lymphoma identified by gene expression profiling. Nature 403, 503–511 (2000)
Mi, H., Dong, Q., Muruganujan, A., Gaudet, P., Lewis, S., Thomas, P.: PANTHER version 7: improved phylogenetic trees, orthologs and collaboration with the gene ontology consortium. Nucleic Acids Res. 38, D204–D210 (2010)
Frinhani, R.M.D., Silva, R.M.A., Mateus, G.R., Festa, P., Resende, M.G.C.: GRASP with Path-Relinking for Data Clustering: A Case Study for Biological Data. In: Pardalos, P.M., Rebennack, S. (eds.) SEA 2011. LNCS, vol. 6630, pp. 410–420. Springer, Heidelberg (2011)
Laguna, M., Martí, R.: GRASP and path relinking for 2-layer straight line crossing minimization. INFORMS J. Comput. 11, 44–52 (1999)
Festa, P., Pardalos, P., Resende, M., Ribeiro, C.: Randomized heuristics for the MAX-CUT problem. Optim. Methods Softw. 7, 1033–1058 (2002)
Mladenović, N., Hansen, P.: Variable neighborhood search. Comput. Oper. Res. 24, 1097–1100 (1997)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Facchiano, A., Festa, P., Marabotti, A., Milanesi, L., Musacchia, F. (2012). Solving Biclustering with a GRASP-Like Metaheuristic: Two Case-Studies on Gene Expression Analysis. In: Biganzoli, E., Vellido, A., Ambrogi, F., Tagliaferri, R. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2011. Lecture Notes in Computer Science(), vol 7548. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35686-5_21
Download citation
DOI: https://doi.org/10.1007/978-3-642-35686-5_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35685-8
Online ISBN: 978-3-642-35686-5
eBook Packages: Computer ScienceComputer Science (R0)