Abstract
In this work, we present two significant improvements for feature selection in wrapper methods. The first is a proper preordering of the feature set. The second is an irrelevant feature elimination method in which the irrelevance condition depends on the feature subset that the wrapper method has partially selected so far. We validate these approaches on the Diffuse Large B-Cell Lymphoma subtype classification problem and show that, in this domain, the two changes yield an important improvement in both the computational cost and the classification accuracy of these wrapper methods.
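The abstract does not spell out the algorithm, so the following is only a minimal sketch of how the two ideas could fit together, not the authors' implementation. It assumes a Gaussian naive Bayes classifier (as in the paper title), a preordering by single-feature cross-validated accuracy, and an irrelevance test that discards a candidate when it does not improve accuracy given the subset selected so far. All function names (`fit_gnb`, `cv_accuracy`, `select_features`, `make_toy_data`) and the toy data are hypothetical.

```python
import math
import random
from statistics import mean, pvariance


def fit_gnb(X, y, feats):
    """Fit per-class prior, mean and variance for the given feature indices."""
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        # A small constant keeps the variance strictly positive.
        stats = {f: (mean(r[f] for r in rows),
                     pvariance([r[f] for r in rows]) + 1e-6) for f in feats}
        model[c] = (len(rows) / len(X), stats)
    return model


def predict(model, x):
    """Return the class with the highest Gaussian naive Bayes log-posterior."""
    def log_post(c):
        prior, stats = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (x[f] - mu) ** 2 / (2 * var)
            for f, (mu, var) in stats.items())
    return max(model, key=log_post)


def cv_accuracy(X, y, feats, k=5):
    """k-fold cross-validated accuracy; sample i belongs to fold i % k."""
    n, hits = len(X), 0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = fit_gnb([X[i] for i in train], [y[i] for i in train], feats)
        hits += sum(predict(model, X[i]) == y[i] for i in range(fold, n, k))
    return hits / n


def select_features(X, y, k=5):
    """Wrapper selection with preordering and conditional elimination."""
    # Assumed preordering: rank features by their individual CV accuracy.
    order = sorted(range(len(X[0])),
                   key=lambda f: cv_accuracy(X, y, [f], k), reverse=True)
    selected, best = [], cv_accuracy(X, y, [], k)
    for f in order:
        score = cv_accuracy(X, y, selected + [f], k)
        if score > best:
            selected.append(f)
            best = score
        # Otherwise f is treated as irrelevant given `selected` and is
        # never evaluated again (assumed form of the elimination step).
    return selected, best


def make_toy_data(n=40, seed=0):
    """Hypothetical data: feature 0 separates the classes, 1 and 2 are noise."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(10 * label, 1), rng.gauss(0, 1), rng.gauss(0, 1)]
         for label in y]
    return X, y


X, y = make_toy_data()
selected, best = select_features(X, y)
print(selected, best)
```

In this sketch each candidate is scored exactly once against the current subset, which is where the saving in wrapper evaluations would come from; a real gene expression study would need a less greedy irrelevance test and a held-out evaluation.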
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cano, A., Castellano, J.G., Masegosa, A.R., Moral, S. (2005). Selective Gaussian Naïve Bayes Model for Diffuse Large-B-Cell Lymphoma Classification: Some Improvements in Preprocessing and Variable Elimination. In: Godo, L. (ed.) Symbolic and Quantitative Approaches to Reasoning with Uncertainty. ECSQARU 2005. Lecture Notes in Computer Science, vol 3571. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11518655_76
DOI: https://doi.org/10.1007/11518655_76
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27326-4
Online ISBN: 978-3-540-31888-0
eBook Packages: Computer Science (R0)