Abstract
High-density DNA microarrays are widely used in cancer research because they monitor thousands of genes at once. Because microarray experiments combine small sample sizes with a large number of genes, selecting significant genes from their expression patterns is an important step in cancer classification. Many gene selection methods have been investigated, but it is hard to identify a single best one. In this paper we propose a new gene selection method, based on partial correlation in regression analysis, for finding the genes that are informative for predicting cancer. The genes selected by this method tend to carry information about the cancer that does not overlap with that of the genes selected previously. We measured the sensitivity, specificity, and recognition rate of the selected genes with a k-nearest neighbor classifier on a colon cancer dataset. In most cases, the proposed method produced better results than gene selection methods based on correlation coefficients, reaching an accuracy of 90.3% on the colon cancer dataset.
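The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one way partial-correlation-based selection could work, not the authors' implementation. It assumes a greedy forward scheme: at each step, the class label and every candidate gene are regressed on the genes already chosen, and the gene whose residual correlates most strongly with the label residual is added. The function names, the toy data, and the greedy scheme itself are assumptions made for illustration.

```python
import numpy as np

def residualize(v, Z):
    """Residual of v after least-squares regression on an intercept and the columns of Z."""
    A = np.column_stack([np.ones(len(v)), Z])  # Z may have zero columns
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef

def select_genes_partial_corr(X, y, n_genes):
    """Greedy forward selection by absolute partial correlation with y (hypothetical helper).

    X: (samples, genes) expression matrix; y: numeric class labels (e.g. 0/1).
    """
    y = np.asarray(y, dtype=float)
    selected = []
    for _ in range(n_genes):
        Z = X[:, selected]              # genes chosen so far
        ry = residualize(y, Z)          # part of the label not yet explained
        best, best_score = None, -1.0
        for j in range(X.shape[1]):
            if j in selected:
                continue
            rx = residualize(X[:, j], Z)  # part of gene j not explained by chosen genes
            denom = np.linalg.norm(rx) * np.linalg.norm(ry)
            score = 0.0 if denom < 1e-12 else abs(rx @ ry) / denom
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected

# Toy data (assumed, not from the paper): 8 samples, 4 "genes".
a = np.array([-1, -1, -1, -1, 1, 1, 1, 1], dtype=float)
b = np.array([-1, -1, 1, 1, -1, -1, 1, 1], dtype=float)
c = np.array([-1, 1, -1, 1, -1, 1, -1, 1], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
X = np.column_stack([
    a + 0.5 * b,            # gene 0: strongly related to the label
    a + 0.5 * b + 0.1 * c,  # gene 1: near-duplicate of gene 0
    b,                      # gene 2: uncorrelated with the label on its own, but complementary to gene 0
    c,                      # gene 3: unrelated to the label
])

print(select_genes_partial_corr(X, y, 2))  # [0, 2]
```

In this toy example, ranking genes by plain correlation with the label would pick genes 0 and 1, which carry almost the same information. The partial-correlation criterion skips the near-duplicate and picks gene 2 instead, which is the non-overlap property the abstract describes. The k-nearest neighbor classification and the sensitivity and specificity measurements are not covered by this sketch.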
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yoo, SH., Cho, SB. (2004). Optimal Gene Selection for Cancer Classification with Partial Correlation and k-Nearest Neighbor Classifier. In: Zhang, C., Guesgen, H.W., Yeap, WK. (eds) PRICAI 2004: Trends in Artificial Intelligence. PRICAI 2004. Lecture Notes in Computer Science, vol 3157. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28633-2_75
DOI: https://doi.org/10.1007/978-3-540-28633-2_75
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22817-2
Online ISBN: 978-3-540-28633-2
eBook Packages: Springer Book Archive