Analyzing the Impact of the Discretization Method When Comparing Bayesian Classifiers

M. Julia Flores,
José A. Gámez,
Ana M. Martínez &
José M. Puerta

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6096)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

Abstract

Most methods designed within the framework of Bayesian networks (BNs) assume that the variables involved are discrete, but this assumption rarely holds in real problems. The Bayesian classifier AODE (Aggregating One-Dependence Estimators), for example, can only work directly with discrete variables. The HAODE (Hybrid AODE) classifier has been proposed as an appealing alternative to AODE that is less affected by the discretization process. In this paper, we study whether this behavior holds when different discretization methods are applied. More importantly, we include other Bayesian classifiers in the comparison to find out to what extent the type of discretization affects their results in terms of accuracy and bias-variance decomposition. If the type of discretization applied is not decisive, then future experiments can be k times faster, k being the number of discretization methods considered, because a single method would suffice instead of all k.
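The abstract does not name the discretization methods that are compared. Purely as an illustration of how two methods can bin the same continuous attribute differently, the sketch below assumes two common unsupervised schemes, equal-width and equal-frequency binning. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def equal_width_cuts(values, bins):
    """Cut points splitting [min, max] into `bins` intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + i * width for i in range(1, bins)]


def equal_frequency_cuts(values, bins):
    """Cut points placed so each interval holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // bins] for i in range(1, bins)]


def discretize(values, cuts):
    """Map each value to the index of the interval it falls into."""
    return [bisect_right(cuts, v) for v in values]


# Hypothetical toy attribute, deliberately skewed (assumption, not paper data).
values = [1.0, 1.2, 1.5, 2.0, 2.2, 3.0, 9.0, 10.0]

ew = discretize(values, equal_width_cuts(values, 3))
ef = discretize(values, equal_frequency_cuts(values, 3))

print("equal width    :", ew)
print("equal frequency:", ef)
```

On skewed data such as this, equal-width binning crowds most values into a single interval while equal-frequency binning spreads them out, which is one way the choice of method could change the discrete input a classifier such as AODE receives.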

Author information

Authors and Affiliations

Computer Systems Department, Intelligent Systems & Data Mining - SIMD, I3A. University of Castilla-La Mancha, Albacete, Spain
M. Julia Flores, José A. Gámez, Ana M. Martínez & José M. Puerta

Editor information

Editors and Affiliations

Dept. of Computing and Numerical Analysis, University of Cordoba, Campus Universitario de Rabanales, Einstein Building, 3rd floor, 14071, Cordoba, Spain
Nicolás García-Pedrajas
Dept. of Computer Science and Artificial Intelligence, ETS de Ingenierias Informática y de Telecomunicación, University of Granada, 18071, Granada, Spain
Francisco Herrera
School of Computing, University of the West of Scotland, PA1 2BE, Paisley, UK
Colin Fyfe
Dept. Computer Science and Artificial Intelligence, ETS de Ingenierias Informática y de Telecomunicación, University of Granada, 18071, Granada, Spain
José Manuel Benítez
Department of Computer Science, Texas State University-San Marcos, 601 University Drive, TX 78666-4616, San Marcos, USA
Moonis Ali

About this paper

Cite this paper

Flores, M.J., Gámez, J.A., Martínez, A.M., Puerta, J.M. (2010). Analyzing the Impact of the Discretization Method When Comparing Bayesian Classifiers. In: García-Pedrajas, N., Herrera, F., Fyfe, C., Benítez, J.M., Ali, M. (eds) Trends in Applied Intelligent Systems. IEA/AIE 2010. Lecture Notes in Computer Science, vol 6096. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13022-9_57

DOI: https://doi.org/10.1007/978-3-642-13022-9_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13021-2
Online ISBN: 978-3-642-13022-9
eBook Packages: Computer Science, Computer Science (R0)
