
DOI: 10.5555/3524938.3525841

Born-again tree ensembles

Published: 13 July 2020

Abstract

The use of machine learning algorithms in finance, medicine, and criminal justice can deeply impact human lives. As a consequence, research into interpretable machine learning has grown rapidly in an attempt to better control and fix possible sources of mistakes and biases. Tree ensembles offer good prediction quality in various domains, but the concurrent use of multiple trees reduces the interpretability of the ensemble. Against this background, we study born-again tree ensembles, i.e., the process of constructing a single decision tree of minimum size that reproduces exactly the same behavior as a given tree ensemble over its entire feature space. To find such a tree, we develop a dynamic-programming-based algorithm that exploits sophisticated pruning and bounding rules to reduce the number of recursive calls. This algorithm generates optimal born-again trees for many datasets of practical interest, leading to classifiers that are typically simpler and more interpretable without any other form of compromise.
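
The paper specifies the algorithm in full; as a rough illustration of the idea summarized above, the following is a minimal Python sketch, not the authors' implementation. It assumes a hypothetical toy ensemble of three stumps on two features in [0, 1]: the ensemble's split points induce a finite grid of cells over the feature space, the majority vote is constant inside each cell, and a single equivalent tree can be built by recursing over hyper-rectangular regions of that grid with memoization, returning a leaf as soon as the prediction is constant over the whole region. The names ensemble_thresholds, ensemble_predict, cell_value, and build are invented for this sketch, and the paper's pruning and bounding rules are omitted.

    # Minimal sketch of the born-again idea on a hypothetical toy ensemble (not the paper's code).
    from functools import lru_cache
    from itertools import product

    # Split points used anywhere in the ensemble, per feature (features assumed to lie in [0, 1]).
    ensemble_thresholds = [[0.3, 0.7], [0.5]]

    def ensemble_predict(x):
        """Majority vote of three hard-coded stumps; stands in for a real tree ensemble."""
        votes = [x[0] > 0.3, x[0] > 0.7, x[1] > 0.5]
        return int(sum(votes) >= 2)

    def cell_value(cell):
        """Ensemble prediction inside one grid cell, evaluated at a representative point."""
        point = []
        for j, idx in enumerate(cell):
            ts = [0.0] + ensemble_thresholds[j] + [1.0]
            point.append((ts[idx] + ts[idx + 1]) / 2.0)
        return ensemble_predict(point)

    @lru_cache(maxsize=None)
    def build(region):
        """Smallest tree (node count, structure) reproducing the ensemble on `region`,
        where `region` is a tuple of inclusive (lo, hi) cell-index ranges per feature."""
        values = {cell_value(c) for c in product(*(range(lo, hi + 1) for lo, hi in region))}
        if len(values) == 1:                       # prediction constant on region -> single leaf
            return 1, ("leaf", values.pop())
        best = None
        for j, (lo, hi) in enumerate(region):      # try every admissible split position
            for cut in range(lo, hi):
                left = region[:j] + ((lo, cut),) + region[j + 1:]
                right = region[:j] + ((cut + 1, hi),) + region[j + 1:]
                ls, ltree = build(left)
                rs, rtree = build(right)
                size = ls + rs + 1
                if best is None or size < best[0]:
                    best = (size, ("split", j, ensemble_thresholds[j][cut], ltree, rtree))
        return best

    whole_grid = tuple((0, len(t)) for t in ensemble_thresholds)
    size, tree = build(whole_grid)
    print("born-again tree node count:", size)
    print(tree)

On this toy ensemble the recursion terminates almost immediately; for realistic forests the induced grid is far larger, which is why the pruning and bounding rules developed in the paper are needed to keep the number of recursive calls manageable.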

Supplementary Material

Additional material (3524938.3525841_supp.pdf)
Supplemental material.

Cited By

  • On computing optimal tree ensembles. In Proceedings of the 40th International Conference on Machine Learning, pp. 17364-17374, 23 July 2023. DOI: 10.5555/3618408.3619123.

Information & Contributors

Information

Published In

cover image Guide Proceedings
ICML'20: Proceedings of the 37th International Conference on Machine Learning
July 2020
11702 pages

Publisher

JMLR.org

Publication History

Published: 13 July 2020

Qualifiers

  • Research-article
  • Research
  • Refereed limited
