Faster Decision Tree Induction with Impurity-Based Heuristic Schema

Junlong Liu, Yunfeng Liu, Jinhong Zhong & Wangyang Shen

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9937)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

Decision trees are among the most commonly used tools in data mining. Most popular induction algorithms construct decision trees in a top-down manner. These algorithms generally select the splitting feature using only the current node's data, ignoring history information. Such approaches must search the whole feature space when splitting each node, which is time-consuming in high-dimensional cases. To tackle this problem, we propose an impurity-based heuristic schema (IBH) that uses history information to accelerate existing top-down induction algorithms. In detail, when a child node's impurity is smaller than its parent node's, IBH takes each feature's performance in the parent node as a pseudo upper bound on its performance in the child node, so that unpromising computation can be skipped. Feature selection under IBH is therefore biased toward features that perform better in parent nodes. Both mathematical analysis and experimental results demonstrate the coherence between IBH and the original induction algorithms. Experiments show that IBH significantly reduces induction time without degrading accuracy, for both decision trees and related ensemble methods.
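
The abstract does not give the algorithm itself, so the following Python sketch is only one possible reading of the pruning idea, not the authors' implementation. It assumes Gini impurity as the impurity measure, an exhaustive threshold scan per feature, and that the parent node's per-feature gains are available as a complete list. The names gini, best_gain and select_feature are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_gain(rows, labels, f):
    """Largest impurity decrease over all thresholds of feature f."""
    n = len(labels)
    node_impurity = gini(labels)
    best = 0.0
    for t in sorted({r[f] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[f] <= t]
        right = [y for r, y in zip(rows, labels) if r[f] > t]
        if not left or not right:
            continue  # not a real split
        gain = (node_impurity
                - len(left) / n * gini(left)
                - len(right) / n * gini(right))
        best = max(best, gain)
    return best


def select_feature(rows, labels, parent_gains=None, parent_impurity=None):
    """Return (best feature index, {feature: gain} for features evaluated).

    Assumption: the parent's gain for a feature is treated as a pseudo
    upper bound on that feature's gain here, but only when this node is
    purer than its parent. Otherwise every feature is evaluated.
    """
    order = list(range(len(rows[0])))
    use_bound = (parent_gains is not None and parent_impurity is not None
                 and gini(labels) < parent_impurity)
    if use_bound:
        # Try the features that did best in the parent first.
        order.sort(key=lambda f: parent_gains[f], reverse=True)
    best_f, best_g, evaluated = None, -1.0, {}
    for f in order:
        if use_bound and parent_gains[f] <= best_g:
            break  # remaining pseudo bounds cannot beat the current best
        g = best_gain(rows, labels, f)
        evaluated[f] = g
        if g > best_g:
            best_f, best_g = f, g
    return best_f, evaluated
```

Calling select_feature without parent_gains gives the ordinary exhaustive search, so the two can be compared on the same node. Because the bound is only a heuristic, the pruned search is not guaranteed to pick the same feature as the exhaustive one; the abstract reports that the two are coherent in analysis and experiments.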

Author information

Authors and Affiliations

USTC-Birmingham Joint Research Institute in Intelligent Computation and Its Applications, University of Science and Technology of China, Hefei, China
Junlong Liu, Yunfeng Liu, Jinhong Zhong & Wangyang Shen

Corresponding author

Correspondence to Junlong Liu.

Editor information

Editors and Affiliations

University of Manchester, Manchester, United Kingdom
Hujun Yin
Nanjing University, Nanjing, China
Yang Gao
Yangzhou University, Yangzhou, Jiangsu, China
Bin Li
Nanjing University of Aeronautics and Astronautics, Nanjing, China
Daoqiang Zhang
Nanjing Normal University, Nanjing, China
Ming Yang
Yangzhou University, Yangzhou, Jiangsu, China
Yun Li
Ostfalia University of Applied Sciences, Wolfenbüttel, Germany
Frank Klawonn
University of Seville, Seville, Spain
Antonio J. Tallón-Ballesteros

About this paper

Cite this paper

Liu, J., Liu, Y., Zhong, J., Shen, W. (2016). Faster Decision Tree Induction with Impurity-Based Heuristic Schema. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2016. IDEAL 2016. Lecture Notes in Computer Science, vol 9937. Springer, Cham. https://doi.org/10.1007/978-3-319-46257-8_55

DOI: https://doi.org/10.1007/978-3-319-46257-8_55
Published: 13 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46256-1
Online ISBN: 978-3-319-46257-8
eBook Packages: Computer Science, Computer Science (R0)
