Abstract
Rough set-based rule induction yields easily interpretable descriptions of complex biological systems. Here, we review a number of applications of rough sets to problems in bioinformatics, including cancer classification, gene and protein function prediction, gene regulation, protein-drug interaction, and drug resistance.
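To make the idea of rough set-based rule induction concrete, here is a minimal sketch on a toy decision table. It is not the authors' method or any tool's API: the gene names, expression levels, class labels, and function names are all hypothetical. It groups objects that are indiscernible on the condition attributes, computes the lower and upper approximations of one decision class, and reads off IF-THEN rules from the groups whose members all share one decision. Practical rough set tools typically also compute reducts to shorten the rules, which this sketch skips.

```python
from collections import defaultdict

# Hypothetical toy decision table: discretized expression levels of two
# made-up genes (condition attributes) and a made-up class label (decision).
TABLE = [
    {"geneA": "high", "geneB": "low",  "label": "tumor"},
    {"geneA": "high", "geneB": "low",  "label": "tumor"},
    {"geneA": "low",  "geneB": "low",  "label": "normal"},
    {"geneA": "low",  "geneB": "high", "label": "normal"},
    {"geneA": "high", "geneB": "high", "label": "tumor"},
    {"geneA": "high", "geneB": "high", "label": "normal"},
]
CONDITIONS = ("geneA", "geneB")


def indiscernibility_classes(table, attributes):
    """Group row indices that share the same values on all attributes."""
    classes = defaultdict(list)
    for index, row in enumerate(table):
        classes[tuple(row[a] for a in attributes)].append(index)
    return dict(classes)


def approximations(table, attributes, decision, value):
    """Return (lower, upper) approximations of the class decision == value."""
    target = {i for i, row in enumerate(table) if row[decision] == value}
    lower, upper = set(), set()
    for members in indiscernibility_classes(table, attributes).values():
        block = set(members)
        if block <= target:   # every member is in the class: certainly in
            lower |= block
        if block & target:    # some member is in the class: possibly in
            upper |= block
    return lower, upper


def certain_rules(table, attributes, decision):
    """Emit one rule per indiscernibility class with a single decision."""
    rules = []
    for key, members in indiscernibility_classes(table, attributes).items():
        outcomes = {table[i][decision] for i in members}
        if len(outcomes) == 1:
            rules.append((dict(zip(attributes, key)), outcomes.pop(), len(members)))
    return rules


LOWER, UPPER = approximations(TABLE, CONDITIONS, "label", "tumor")
RULES = certain_rules(TABLE, CONDITIONS, "label")

if __name__ == "__main__":
    print("lower approximation of tumor:", sorted(LOWER))
    print("upper approximation of tumor:", sorted(UPPER))
    for condition, outcome, support in RULES:
        premise = " AND ".join(f"{k} = {v}" for k, v in condition.items())
        print(f"IF {premise} THEN label = {outcome} (support {support})")
```

In this toy table, the two rows with both genes high carry conflicting labels, so they fall in the upper but not the lower approximation of the tumor class and produce no certain rule. The rules that remain are readable by a domain expert, which is the interpretability the abstract refers to.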
© 2007 Springer Berlin Heidelberg
About this chapter
Cite this chapter
Hvidsten, T.R., Komorowski, J. (2007). Rough Sets in Bioinformatics. In: Peters, J.F., Skowron, A., Marek, V.W., Orłowska, E., Słowiński, R., Ziarko, W. (eds) Transactions on Rough Sets VII. Lecture Notes in Computer Science, vol 4400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71663-1_14
DOI: https://doi.org/10.1007/978-3-540-71663-1_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71662-4
Online ISBN: 978-3-540-71663-1
eBook Packages: Computer Science, Computer Science (R0)