Abstract
We propose a method for constructing regression trees with range and region splitting. We present an efficient algorithm that computes the optimal two-dimensional region minimizing the mean squared error of an objective numeric attribute in a given database. As two-dimensional regions, we consider a class R of grid regions, such as "x-monotone," "rectilinear-convex," and "rectangular" regions, in the plane associated with two numeric attributes. Having computed the optimal region R, we propose a test that splits the data into the records lying inside R and those lying outside it, and we use this test in the construction of regression trees. Experiments confirm that region splitting yields compact and accurate regression trees in many domains.
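The region-splitting test described above can be illustrated with a minimal sketch. This is not the paper's optimized algorithm for finding the best region; it only shows what a grid-region split looks like once a region is given. Here an x-monotone grid region is represented as one contiguous row interval per column, and all names (`region_split`, `mean_squared_error`, the attribute keys) are illustrative assumptions, not the authors' API.

```python
def mean_squared_error(values):
    """MSE of values around their mean (the intra-node variance)."""
    if not values:
        return 0.0
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

def region_split(records, region, x_attr, y_attr, n_bins, lo_x, hi_x, lo_y, hi_y):
    """Split records by membership in a grid region.

    region: dict mapping a column index to an inclusive (row_lo, row_hi)
    interval; columns absent from the dict contribute no grid cells.
    For an x-monotone region, each column holds one contiguous interval.
    """
    def bucket(v, lo, hi):
        # Map a numeric value to a grid index in [0, n_bins - 1].
        i = int((v - lo) / (hi - lo) * n_bins)
        return min(max(i, 0), n_bins - 1)

    inside, outside = [], []
    for r in records:
        cx = bucket(r[x_attr], lo_x, hi_x)
        cy = bucket(r[y_attr], lo_y, hi_y)
        iv = region.get(cx)
        (inside if iv and iv[0] <= cy <= iv[1] else outside).append(r)
    return inside, outside

# Tiny example on a 4x4 grid: the region covers columns 1 and 2 only.
records = [
    {"x": 0.1, "y": 0.1, "t": 1.0},
    {"x": 0.4, "y": 0.6, "t": 5.0},
    {"x": 0.6, "y": 0.4, "t": 5.5},
    {"x": 0.9, "y": 0.9, "t": 1.5},
]
region = {1: (1, 3), 2: (0, 2)}  # one row interval per column
ins, outs = region_split(records, region, "x", "y", 4, 0.0, 1.0, 0.0, 1.0)
```

Splitting on the region separates the high-`t` records from the low-`t` ones, so the MSE of the objective attribute `t` drops sharply on both sides of the test, which is exactly the criterion the optimal-region computation minimizes.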
Cite this article
Morimoto, Y., Ishii, H. & Morishita, S. Efficient Construction of Regression Trees with Range and Region Splitting. Machine Learning 45, 235–259 (2001). https://doi.org/10.1023/A:1017980905332