Abstract
Microarrays allow simultaneous measurement of the expression levels of thousands of genes in cells under different physiological or disease states. Because the number of genes far exceeds the number of samples, class prediction on microarray expression data suffers from an extreme “curse of dimensionality”. A principal goal of these studies is therefore to identify a small subset of informative genes for class prediction, which mitigates the dimensionality problem. We propose a novel genetic approach that selects a subset of predictive genes for classification on the basis of gene expression data. Our genetic algorithm maximizes the correlation between genes and classes and minimizes the intercorrelation among genes. We tested the genetic algorithm on leukemia data sets and obtained better results than those previously reported.
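The objective described above can be illustrated with a small scoring function for a candidate gene subset. This is only a sketch under stated assumptions: the abstract does not give the fitness formula, so the difference of mean absolute Pearson correlations used here, the name `subset_fitness`, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def subset_fitness(X, y, subset):
    """Score a gene subset: reward gene-class correlation, penalize gene-gene correlation.

    X: (n_samples, n_genes) expression matrix.
    y: (n_samples,) numeric class labels (e.g. 0.0 / 1.0).
    subset: indices of the selected genes.
    The "relevance minus redundancy" form is an illustrative assumption,
    not the fitness function used in the paper.
    """
    Xs = X[:, np.asarray(subset)]
    n_selected = Xs.shape[1]
    # Relevance: mean absolute Pearson correlation between each gene and the class.
    relevance = np.mean(
        [abs(np.corrcoef(Xs[:, j], y)[0, 1]) for j in range(n_selected)]
    )
    if n_selected < 2:
        return relevance
    # Redundancy: mean absolute pairwise correlation among the selected genes.
    corr = np.abs(np.corrcoef(Xs, rowvar=False))
    upper = np.triu_indices(n_selected, k=1)
    redundancy = corr[upper].mean()
    return relevance - redundancy


# Synthetic example (hypothetical data): genes 0 and 2 are independently
# informative, while gene 1 is a near-copy of gene 0.
rng = np.random.default_rng(0)
y = np.repeat([0.0, 1.0], 30)
g0 = y + rng.normal(0, 0.5, 60)
g1 = g0 + rng.normal(0, 0.01, 60)
g2 = y + rng.normal(0, 0.5, 60)
X = np.column_stack([g0, g1, g2])

print("redundant pair   [0, 1]:", subset_fitness(X, y, [0, 1]))
print("independent pair [0, 2]:", subset_fitness(X, y, [0, 2]))
```

In this toy setting the pair of independently informative genes should score higher than the pair containing a near-duplicate, which is the behavior a genetic algorithm searching over subsets would exploit. Constant-valued genes would yield undefined correlations and would need to be filtered out beforehand.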
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, YH., Lee, SY., Moon, BR. (2004). A Genetic Approach for Gene Selection on Microarray Expression Data. In: Deb, K. (eds) Genetic and Evolutionary Computation – GECCO 2004. GECCO 2004. Lecture Notes in Computer Science, vol 3102. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24854-5_36
DOI: https://doi.org/10.1007/978-3-540-24854-5_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22344-3
Online ISBN: 978-3-540-24854-5
eBook Packages: Springer Book Archive