DOI: 10.1609/aaai.v38i10.28984
research-article

MEPSI: an MDL-based ensemble pruning approach with structural information

Published: 20 February 2024

Abstract

Ensemble pruning, which combines a subset of the individual learners generated in parallel to make predictions, is an important topic in ensemble learning. Over the past decades, many pruning algorithms have been developed that focus on the external behavior of learners on samples, which may lead to over-fitting. In this paper, we conjecture that the generalization performance of an ensemble depends not only on its external behavior on samples but also on the internal structure of its individual learners. We propose the general MEPSI approach, based on Kolmogorov complexity and the Minimum Description Length (MDL) principle, which formulates ensemble pruning as a two-objective optimization problem comprising the empirical error and the structural information among individual learners. We also provide a concrete implementation of MEPSI for decision trees. Theoretical results give generalization bounds for both the general MEPSI approach and the tree-based implementation, and comparative experiments on multiple real-world data sets demonstrate the effectiveness of the proposed method.
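To make the two-objective formulation above concrete, here is a minimal Python sketch (using scikit-learn) of the general idea: choose a subset of trees by trading off validation error against a structural-information term. Everything specific in it is an assumption made for illustration, not the authors' MEPSI algorithm: the node-count proxy for description length, the trade-off weight lam, the greedy search, and the helper names (structure_cost, ensemble_error, greedy_prune) are all hypothetical, whereas MEPSI defines structural information via Kolmogorov complexity and MDL and comes with its own optimization procedure and theory.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def structure_cost(trees):
    # Crude stand-in for a description-length / structural-information term:
    # total number of nodes in the selected trees. Illustration only.
    return sum(t.tree_.node_count for t in trees)

def ensemble_error(trees, X, y):
    # Empirical error of the majority vote of the selected trees (binary case).
    votes = np.mean([t.predict(X) for t in trees], axis=0)
    return np.mean((votes >= 0.5).astype(int) != y)

def greedy_prune(trees, X_val, y_val, k, lam=1e-4):
    # Greedily pick k trees minimizing error + lam * structural cost.
    # A simple illustrative search, not the optimization used in the paper.
    selected, remaining = [], list(trees)
    while len(selected) < k and remaining:
        best, best_obj = None, np.inf
        for t in remaining:
            cand = selected + [t]
            obj = ensemble_error(cand, X_val, y_val) + lam * structure_cost(cand)
            if obj < best_obj:
                best, best_obj = t, obj
        selected.append(best)
        remaining.remove(best)
    return selected

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)
pruned = greedy_prune(forest.estimators_, X_val, y_val, k=10)
print("pruned ensemble validation error:", ensemble_error(pruned, X_val, y_val))

In the paper itself, the structural term is an MDL-based measure computed over the individual trees rather than a node count, and the pruning objective is optimized according to the procedure described there.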

Published In

AAAI'24/IAAI'24/EAAI'24: Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intelligence
February 2024
23861 pages
ISBN: 978-1-57735-887-9

Sponsors

• Association for the Advancement of Artificial Intelligence

Publisher

AAAI Press

