Rough Sets in Bioinformatics

Torgeir R. Hvidsten¹ &
Jan Komorowski¹

Part of the book series: Lecture Notes in Computer Science (TRS, volume 4400)

Abstract

Rough set-based rule induction produces easily interpretable descriptions of complex biological systems. Here, we review applications of rough sets to problems in bioinformatics, including cancer classification, gene and protein function prediction, gene regulation, protein-drug interaction, and drug resistance.

Author information

Authors and Affiliations

The Linnaeus Centre for Bioinformatics, Uppsala University, Uppsala, Sweden
Torgeir R. Hvidsten & Jan Komorowski

Editor information

James F. Peters, Andrzej Skowron, Victor W. Marek, Ewa Orłowska, Roman Słowiński, Wojciech Ziarko

About this chapter

Cite this chapter

Hvidsten, T.R., Komorowski, J. (2007). Rough Sets in Bioinformatics. In: Peters, J.F., Skowron, A., Marek, V.W., Orłowska, E., Słowiński, R., Ziarko, W. (eds) Transactions on Rough Sets VII. Lecture Notes in Computer Science, vol 4400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71663-1_14

DOI: https://doi.org/10.1007/978-3-540-71663-1_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71662-4
Online ISBN: 978-3-540-71663-1
eBook Packages: Computer Science, Computer Science (R0)
