Abstract
Most methods designed within the framework of Bayesian networks (BNs) assume that the variables involved are discrete, but this assumption rarely holds in real problems. The Bayesian classifier AODE (Aggregating One-Dependence Estimators), for example, can only work directly with discrete variables. The HAODE (Hybrid AODE) classifier has been proposed as an appealing alternative to AODE that is less affected by the discretization process. In this paper, we study whether this behavior holds when different discretization methods are applied. More importantly, we include other Bayesian classifiers in the comparison to find out to what extent the type of discretization affects their results in terms of accuracy and bias-variance decomposition. If the type of discretization applied is not decisive, then future experiments can be k times faster, k being the number of discretization methods considered, because a single discretization method would suffice instead of all k.
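The abstract does not name the discretization methods that are compared. Purely as an illustration of what "different discretization methods" can mean, the following minimal sketch contrasts two common unsupervised schemes, equal-width and equal-frequency binning. The function names and the sample data are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def equal_width_cuts(values, bins):
    """Cut points that split the range [min, max] into `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + i * width for i in range(1, bins)]


def equal_frequency_cuts(values, bins):
    """Cut points chosen so that each interval holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // bins] for i in range(1, bins)]


def discretize(values, cuts):
    """Map each continuous value to the index of the interval it falls into."""
    return [bisect_right(cuts, v) for v in values]


# Hypothetical attribute with one outlier, to show how the two schemes differ.
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 100.0]
width_labels = discretize(sample, equal_width_cuts(sample, 2))
freq_labels = discretize(sample, equal_frequency_cuts(sample, 2))
```

With this sample, equal-width binning places every value except the outlier in the first interval, while equal-frequency binning splits the values four and four. A classifier trained on one labelling may therefore behave differently from one trained on the other, which is the kind of sensitivity the paper sets out to measure.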
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Flores, M.J., Gámez, J.A., Martínez, A.M., Puerta, J.M. (2010). Analyzing the Impact of the Discretization Method When Comparing Bayesian Classifiers. In: García-Pedrajas, N., Herrera, F., Fyfe, C., Benítez, J.M., Ali, M. (eds) Trends in Applied Intelligent Systems. IEA/AIE 2010. Lecture Notes in Computer Science, vol 6096. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13022-9_57
DOI: https://doi.org/10.1007/978-3-642-13022-9_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13021-2
Online ISBN: 978-3-642-13022-9
eBook Packages: Computer Science, Computer Science (R0)