DOI: 10.1007/978-3-540-92137-0_39
Article

A Combined Classification Algorithm Based on C4.5 and NB

Published: 19 December 2008

Abstract

When the learning task is to build an accurate classifier, C4.5 and naive Bayes (NB) are two important algorithms because of their simplicity and high performance. In this paper, we present a combined classification algorithm based on C4.5 and NB, called C4.5-NB. In C4.5-NB, the class probability estimates of C4.5 and NB are weighted according to their classification accuracy on the training data. We experimentally tested C4.5-NB in the Weka system on all 36 UCI data sets selected by Weka, and compared it with C4.5 and NB. The experimental results show that C4.5-NB significantly outperforms C4.5 and NB in terms of classification accuracy. We also examine the ranking performance of C4.5-NB in terms of AUC (the area under the Receiver Operating Characteristic curve), where C4.5-NB again significantly outperforms C4.5 and NB.
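The weighting step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact formula, so the normalized accuracy-weighted average used here is an assumption, and the function names and all numbers are hypothetical.

```python
from typing import List, Sequence


def combine_estimates(
    p_tree: Sequence[float],
    p_nb: Sequence[float],
    acc_tree: float,
    acc_nb: float,
) -> List[float]:
    """Accuracy-weighted average of two class probability vectors.

    Assumption: each model's weight is its training accuracy, and the
    result is divided by the sum of the weights so it still sums to 1.
    """
    if len(p_tree) != len(p_nb):
        raise ValueError("both models must score the same set of classes")
    total = acc_tree + acc_nb
    if total <= 0:
        raise ValueError("at least one training accuracy must be positive")
    return [(acc_tree * a + acc_nb * b) / total for a, b in zip(p_tree, p_nb)]


def predict(probabilities: Sequence[float]) -> int:
    """Return the index of the class with the highest combined estimate."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


# Hypothetical estimates for one instance of a two-class problem.
p_c45 = [0.70, 0.30]  # class probabilities from the decision tree
p_nb = [0.20, 0.80]   # class probabilities from naive Bayes

# Hypothetical training accuracies used as weights.
combined = combine_estimates(p_c45, p_nb, acc_tree=0.80, acc_nb=0.90)
label = predict(combined)
```

In this made-up case the two models disagree, and the combined estimate follows NB because NB's estimate is more confident and its assumed training accuracy is slightly higher.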

Published In

Guide Proceedings
ISICA '08: Proceedings of the 3rd International Symposium on Advances in Computation and Intelligence
December 2008
857 pages
ISBN: 9783540921363
Editors: Lishan Kang, Zhihua Cai, Xuesong Yan, Yong Liu

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. classification
  2. combined algorithms
  3. data mining
  4. decision trees
  5. naive Bayes
  6. ranking
  7. weights
