Statistics > Machine Learning

arXiv:2302.03931 (stat)

[Submitted on 8 Feb 2023]

Title: Fast Linear Model Trees by PILOT

Authors: Jakob Raymaekers, Peter J. Rousseeuw, Tim Verdonck, Ruicong Yao

Abstract: Linear model trees are regression trees that incorporate linear models in the leaf nodes. This preserves the intuitive interpretation of decision trees and at the same time enables them to better capture linear relationships, which is hard for standard decision trees. But most existing methods for fitting linear model trees are time-consuming and therefore not scalable to large data sets. In addition, they are more prone to overfitting and extrapolation issues than standard regression trees. In this paper we introduce PILOT, a new algorithm for linear model trees that is fast, regularized, stable and interpretable. PILOT trains in a greedy fashion like classic regression trees, but incorporates an $L^2$ boosting approach and a model selection rule for fitting linear models in the nodes. The abbreviation PILOT stands for PIecewise Linear Organic Tree, where 'organic' refers to the fact that no pruning is carried out. PILOT has the same low time and space complexity as CART without its pruning. An empirical study indicates that PILOT tends to outperform standard decision trees and other linear model trees on a variety of data sets. Moreover, we prove its consistency in an additive model setting under weak assumptions. When the data are generated by a linear model, the convergence rate is polynomial.
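
To make the idea of a linear model tree concrete, the following is a minimal sketch of a depth-1 tree on a single feature, comparing a linear model in each leaf with the constant fit of a standard regression tree. It is not the PILOT algorithm: the $L^2$ boosting step and the model selection rule mentioned in the abstract are not reproduced, and all function names (`fit_line`, `best_split`) and the toy data are assumptions made for illustration only.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit y ~ a + b*x; returns (a, b, sse)."""
    A = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(coef[0]), float(coef[1]), float(resid @ resid)

def fit_const(x, y):
    """Constant fit y ~ mean(y); returned as (a, 0.0, sse) for a uniform interface."""
    mean = float(np.mean(y))
    resid = y - mean
    return mean, 0.0, float(resid @ resid)

def best_split(x, y, leaf_fit, min_leaf=5):
    """Greedy threshold search for a depth-1 tree (hypothetical helper).

    Every candidate threshold is scored by the summed squared error
    of the two leaf models produced by `leaf_fit`.
    """
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best = None
    for i in range(min_leaf, len(xs) - min_leaf + 1):
        if xs[i - 1] == xs[i]:
            continue  # no valid threshold between equal feature values
        left = leaf_fit(xs[:i], ys[:i])
        right = leaf_fit(xs[i:], ys[i:])
        sse = left[2] + right[2]
        if best is None or sse < best["sse"]:
            best = {
                "threshold": float((xs[i - 1] + xs[i]) / 2),
                "left": left[:2],    # (intercept, slope) for x <= threshold
                "right": right[:2],  # (intercept, slope) for x > threshold
                "sse": sse,
            }
    return best

def predict(tree, x):
    """Evaluate the depth-1 tree at the points x."""
    a_l, b_l = tree["left"]
    a_r, b_r = tree["right"]
    return np.where(x <= tree["threshold"], a_l + b_l * x, a_r + b_r * x)

# Toy data (assumed): two linear regimes with a jump at x = 0.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = np.where(x < 0, 1.0 + 2.0 * x, -1.0 - x) + rng.normal(0.0, 0.05, x.size)

linear_tree = best_split(x, y, fit_line)
constant_tree = best_split(x, y, fit_const)
print("linear leaves:   threshold=%.3f, SSE=%.3f" % (linear_tree["threshold"], linear_tree["sse"]))
print("constant leaves: threshold=%.3f, SSE=%.3f" % (constant_tree["threshold"], constant_tree["sse"]))
```

On data with linear structure inside each region, the linear leaves need only one split to reach a small error, whereas constant leaves would need many more splits to approximate the same slopes.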

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
Cite as:	arXiv:2302.03931 [stat.ML]
	(or arXiv:2302.03931v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2302.03931
Journal reference:	Machine Learning, 2024
Related DOI:	https://doi.org/10.1007/s10994-024-06590-3

Submission history

From: Peter Rousseeuw
[v1] Wed, 8 Feb 2023 08:11:10 UTC (70 KB)
