Abstract
Gene expression detection is a key bioinformatics problem that has been tackled as a classification problem on microarray gene expression data, obtained by light reflection analysis of genomic material. A typical microarray dataset may contain thousands of genes but only a small number of patterns (often fewer than two hundred). On datasets with these characteristics, state-of-the-art classification models perform poorly. A two-stage algorithm is proposed to address the problem of microarray classification. In the first stage, two filter algorithms identify salient expression genes from among thousands of genes. In the second stage, the proposed methodology is applied using the selected gene subsets as new input variables. The methodology combines Logistic Regression (LR) with Evolutionary Generalized Radial Basis Function (EGRBF) neural networks, which have been shown in previous research to be highly accurate in modeling high-dimensional patterns. Finally, the results obtained are compared using nonparametric statistical tests, which confirm good synergy between the EGRBF and LR models.
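The two-stage idea can be illustrated with a minimal sketch on synthetic data. It is not the authors' implementation, and several parts are assumptions made only for illustration: a single Pearson-correlation filter stands in for the two filter algorithms, RBF centres are drawn at random from the training samples instead of being evolved, the basis function is assumed to have the form exp(-(||x - c|| / r)^tau), and the logistic regression is assumed to receive both the selected genes and the RBF outputs. All function names and parameter values are hypothetical.

```python
import numpy as np


def filter_top_genes(X, y, k):
    """Filter stage: rank genes by |Pearson correlation| with the label, keep the top k."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    scores = np.abs(Xc.T @ yc) / denom
    return np.argsort(scores)[::-1][:k]


def grbf_features(X, centers, radius, tau):
    """Assumed generalized RBF: exp(-(||x - c|| / radius) ** tau); tau = 2 is Gaussian."""
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-((dist / radius) ** tau))


def add_bias(Z):
    """Prepend a column of ones for the intercept term."""
    return np.hstack([np.ones((Z.shape[0], 1)), Z])


def fit_logistic(Z, y, lr=0.1, epochs=2000, ridge=1e-3):
    """Binary logistic regression fitted by gradient descent with a small ridge penalty."""
    Zb = add_bias(Z)
    w = np.zeros(Zb.shape[1])
    for _ in range(epochs):
        logits = np.clip(Zb @ w, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-logits))
        w -= lr * (Zb.T @ (p - y) / len(y) + ridge * w)
    return w


def predict(Z, w):
    """Predict class 1 when the linear score is positive."""
    return (add_bias(Z) @ w > 0).astype(int)


def run_demo(seed=0, n_samples=80, n_genes=500, n_informative=5,
             k=10, n_centers=4, tau=1.5):
    rng = np.random.default_rng(seed)
    # Synthetic "microarray": many genes, few samples, signal in the first few genes.
    y = np.repeat([0.0, 1.0], n_samples // 2)
    X = rng.normal(size=(n_samples, n_genes))
    X[y == 1.0, :n_informative] += 3.0
    order = rng.permutation(n_samples)
    train, test = order[:60], order[60:]

    # Stage 1: select genes on the training samples only, then standardize them.
    selected = filter_top_genes(X[train], y[train], k)
    mu = X[train][:, selected].mean(axis=0)
    sd = X[train][:, selected].std(axis=0)
    G_train = (X[train][:, selected] - mu) / sd
    G_test = (X[test][:, selected] - mu) / sd

    # Stage 2: RBF outputs (random centres, not evolved) plus the selected genes
    # are used as covariates of a logistic regression.
    centers = G_train[rng.choice(len(train), size=n_centers, replace=False)]
    radius = np.linalg.norm(G_train[:, None, :] - centers[None, :, :], axis=2).mean()
    Z_train = np.hstack([G_train, grbf_features(G_train, centers, radius, tau)])
    Z_test = np.hstack([G_test, grbf_features(G_test, centers, radius, tau)])
    w = fit_logistic(Z_train, y[train])
    accuracy = float((predict(Z_test, w) == y[test]).mean())
    return selected, accuracy


if __name__ == "__main__":
    genes, acc = run_demo()
    print("selected genes:", sorted(int(g) for g in genes))
    print("held-out accuracy:", acc)
```

Selecting genes on the training split only is a deliberate choice in this sketch: with thousands of genes and few samples, running the filter on all samples before splitting would leak label information into the held-out estimate.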
Cite this article
Castaño, A., Fernández-Navarro, F., Hervás-Martínez, C. et al. Neuro-logistic Models Based on Evolutionary Generalized Radial Basis Function for the Microarray Gene Expression Classification Problem. Neural Process Lett 34, 117–131 (2011). https://doi.org/10.1007/s11063-011-9187-8
DOI: https://doi.org/10.1007/s11063-011-9187-8