Abstract
Logistic Model Trees have been shown to be very accurate and compact classifiers [8]. Their greatest disadvantage is the computational complexity of inducing the logistic regression models at the nodes of the tree. We address this issue by using the AIC criterion [1] instead of cross-validation to prevent these models from overfitting. In addition, a weight trimming heuristic is applied, which produces a significant speedup. We compare the training time and accuracy of the new induction process with the original one on various datasets, and show that training time often decreases while classification accuracy diminishes only slightly.
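For reference, the AIC criterion [1] scores a fitted model by trading off goodness of fit against complexity: AIC = 2k − 2 ln(L̂), where k is the number of parameters and L̂ the maximized likelihood. Choosing the number of LogitBoost iterations that minimizes AIC avoids the repeated model fitting that cross-validation requires.

The weight trimming heuristic goes back to Friedman et al. [6]: at each boosting iteration, instances carrying only a small fraction of the total weight are skipped when fitting the weighted regression. The following Python sketch illustrates the idea; the function name, the trimming fraction beta, and its default value are illustrative assumptions, not taken from the paper.

import numpy as np

def trim_indices(weights, beta=0.1):
    """Indices of the heaviest instances that together carry at least
    a (1 - beta) fraction of the total boosting weight. The remaining
    low-weight instances are skipped in the current iteration.
    (Sketch only; names and defaults are assumptions, not the paper's.)"""
    order = np.argsort(weights)[::-1]            # heaviest instances first
    cum = np.cumsum(weights[order])
    cutoff = np.searchsorted(cum, (1.0 - beta) * weights.sum()) + 1
    return order[:cutoff]

# Example: the lightest instance (5% of the weight mass) is dropped.
w = np.array([0.4, 0.3, 0.15, 0.1, 0.05])
print(trim_indices(w, beta=0.1))                 # [0 1 2 3]

Because boosting weights concentrate on hard-to-classify instances after a few iterations, even a small beta can exclude a large share of the training data from each regression fit, which is the source of the speedup.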
Keywords
- Training Time
- Training Instance
- Linear Logistic Regression
- Simple Linear Regression Model
- Model Selection Method
References
Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Second Int. Symposium on Information Theory, pp. 267–281 (1973)
Blake, C.L., Merz, C.J.: UCI repository of machine learning databases (1998), http://www.ics.uci.edu/~mlearn/MLRepository.html
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth (1984)
Bühlmann, P., Yu, B.: Boosting, model selection, lasso and nonnegative garrote. Technical Report 2005-127, Seminar for Statistics, ETH Zürich (2005)
Freund, Y., Schapire, R.E.: Experiments with a new boosting algorithm. In: Proc. Int. Conf. on Machine Learning, pp. 148–156. Morgan Kaufmann, San Francisco (1996)
Friedman, J., Hastie, T., Tibshirani, R.: Additive logistic regression: a statistical view of boosting. The Annals of Statistics 28(2), 337–374 (2000)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, Heidelberg (2001)
Landwehr, N., Hall, M., Frank, E.: Logistic model trees. Machine Learning 59(1/2), 161–205 (2005)
Nadeau, C., Bengio, Y.: Inference for the generalization error. In: Advances in Neural Information Processing Systems, vol. 12, pp. 307–313. MIT Press, Cambridge (1999)
Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Francisco (1993)
Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco (2000)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sumner, M., Frank, E., Hall, M. (2005). Speeding Up Logistic Model Tree Induction. In: Jorge, A.M., Torgo, L., Brazdil, P., Camacho, R., Gama, J. (eds) Knowledge Discovery in Databases: PKDD 2005. Lecture Notes in Computer Science, vol. 3721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564126_72
DOI: https://doi.org/10.1007/11564126_72
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29244-9
Online ISBN: 978-3-540-31665-7