Using Iterated Bagging to Debias Regressions
Breiman, L. Machine Learning 45, 261–277 (2001). https://doi.org/10.1023/A:1017934522171

Abstract
Breiman (Machine Learning, 24(2), 123–140) showed that bagging could effectively reduce the variance of regression predictors while leaving the bias relatively unchanged. A new form of bagging we call iterated bagging is effective in reducing both bias and variance. The procedure works in stages: the first stage is bagging. Based on the outcomes of the first stage, the output values are altered, and a second stage of bagging is carried out using the altered output values. This is repeated until a simple rule stops the process. The method is tested using both trees and nearest neighbor regression methods. Accuracy on the Boston Housing benchmark is comparable to the best results obtained using highly tuned and compute-intensive Support Vector Regression Machines. Some heuristic theory is given to clarify what is going on. Application to two-class classification data gives interesting results.
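The abstract only sketches the mechanics. Below is a minimal illustrative reading in Python, with scikit-learn regression trees as the base learner: each stage is a bagged ensemble, the "altered output values" for the next stage are taken to be the out-of-bag residuals, and staging stops when the residual mean squared error stops improving. The ensemble size, the base learner, and the 1.1 stopping factor are assumptions for illustration, not the paper's settings.

```python
# Illustrative sketch of iterated bagging for regression (one plausible
# reading of the procedure described in the abstract; not the paper's code).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def iterated_bagging(X, y, n_trees=50, max_stages=10, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    stages = []                       # one bagged ensemble per stage
    targets = y.astype(float).copy()  # stage 1 trains on the raw outputs
    best_mse = np.inf
    for _ in range(max_stages):
        trees = []
        oob_sum, oob_cnt = np.zeros(n), np.zeros(n)
        for _ in range(n_trees):
            idx = rng.integers(0, n, size=n)          # bootstrap sample
            trees.append(DecisionTreeRegressor().fit(X[idx], targets[idx]))
            oob = np.setdiff1d(np.arange(n), idx)     # out-of-bag cases
            oob_sum[oob] += trees[-1].predict(X[oob])
            oob_cnt[oob] += 1
        stages.append(trees)
        # Alter the outputs: the next stage trains on out-of-bag residuals.
        oob_pred = oob_sum / np.maximum(oob_cnt, 1)
        targets = targets - oob_pred
        mse = float(np.mean(targets ** 2))
        if mse > 1.1 * best_mse:      # simple stopping rule (1.1 is an assumption)
            stages.pop()              # discard the stage that made things worse
            break
        best_mse = min(best_mse, mse)
    return stages

def predict(stages, X):
    # Each stage models what earlier stages left unexplained, so the final
    # prediction is the sum over stages of each stage's bagged average.
    return sum(np.mean([t.predict(X) for t in trees], axis=0) for trees in stages)
```

In this reading, later stages fit residual structure the first bagged ensemble missed, which is the sense in which the procedure reduces bias as well as variance.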
References
Breiman, L. (1993). Hinging hyperplanes for regression, classification and noiseless function approximation. IEEE Transactions on Information Theory, 39, 999–1013.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140.
Breiman, L. (1997a). Arcing the edge. Technical Report, Statistics Department, University of California.
Breiman, L. (1997b). Out-of-bag estimation. Technical Report, Statistics Department, University of California.
Breiman, L. (1998a). Arcing classifiers, discussion paper. Annals of Statistics, 26, 801–824.
Breiman, L. (1998b). Half and half bagging and hard boundary points. Technical Report, Statistics Department, University of California.
Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and Regression Trees. Belmont, CA: Wadsworth.
Drucker, H. (1997). Improving regressors using boosting techniques. In Proceedings of the International Conference on Machine Learning (pp. 107–115).
Drucker, H. (1999). Combining Artificial Neural Nets (pp. 51–77). Berlin: Springer.
Drucker, H., Burges, C., Kaufman, K., Smola, A., & Vapnik, V. (1997). Support vector regression machines. Advances in Neural Information Processing Systems, 9, 155–161.
Freund, Y., & Schapire, R. (1996). Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference, July 1996.
Friedman, J. (1991). Multivariate adaptive regression splines. Annals of Statistics, 19(1), 1–67.
Friedman, J. (1997). On bias, variance, 0/1 loss, and the curse of dimensionality. Data Mining and Knowledge Discovery, 1(1), 55–77.
Friedman, J. (1999a). Greedy function approximation: A gradient boosting machine. Available at http://www-stat.stanford.edu/~jhf/.
Friedman, J. (1999b). Stochastic gradient boosting. Available at http://www-stat.stanford.edu/~jhf/.
Friedman, J., Hastie, T., & Tibshirani, R. (1998). Additive logistic regression: A statistical view of boosting. Technical Report, Statistics Department, Stanford University.
Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4, 1–58.
Schölkopf, B., Bartlett, P., Smola, A., & Williamson, R. (1999). Shrinking the tube: A new support vector regression algorithm. Advances in Neural Information Processing Systems, 11, 330–336.
Stitson, M., Gammerman, A., Vapnik, V., Vovk, V., Watkins, C., & Weston, J. (1999). Support vector regression with ANOVA decomposition kernels. In Advances in Kernel Methods—Support Vector Learning (pp. 285–291). Cambridge, MA: MIT Press.
Tibshirani, R. (1996). Bias, variance, and prediction error for classification rules. Technical Report, Statistics Department, University of Toronto.
Vapnik, V. (1998). Statistical Learning Theory. New York: Wiley.
Wolpert, D. H., & Macready, W. G. (1999). An efficient method to estimate bagging's generalization error. Machine Learning, 35(1), 41–55.