Using Iterated Bagging to Debias Regressions
Breiman, L. Machine Learning 45, 261–277 (2001). https://doi.org/10.1023/A:1017934522171

Abstract
Breiman (Machine Learning, 24(2), 123–140) showed that bagging could effectively reduce the variance of regression predictors while leaving the bias relatively unchanged. A new form of bagging we call iterated bagging is effective in reducing both bias and variance. The procedure works in stages: the first stage is bagging. Based on the outcomes of the first stage, the output values are altered, and a second stage of bagging is carried out using the altered output values. This is repeated until a simple rule stops the process. The method is tested using both trees and nearest neighbor regression methods. Accuracy on the Boston Housing benchmark is comparable to the best results obtained using highly tuned and compute-intensive Support Vector Regression Machines. Some heuristic theory is given to clarify what is going on. Application to two-class classification data gives interesting results.
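The abstract only sketches the mechanics. Below is a minimal illustrative reading in Python, with scikit-learn regression trees as the base learner: each stage is a bagged ensemble, the "altered output values" for the next stage are taken to be the out-of-bag residuals, and staging stops when the residual mean squared error stops improving. The ensemble size, the base learner, and the 1.1 stopping factor are assumptions for illustration, not the paper's settings.

```python
# Illustrative sketch of iterated bagging for regression (one plausible
# reading of the procedure described in the abstract; not the paper's code).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def iterated_bagging(X, y, n_trees=50, max_stages=10, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    stages = []                       # one bagged ensemble per stage
    targets = y.astype(float).copy()  # stage 1 trains on the raw outputs
    best_mse = np.inf
    for _ in range(max_stages):
        trees = []
        oob_sum, oob_cnt = np.zeros(n), np.zeros(n)
        for _ in range(n_trees):
            idx = rng.integers(0, n, size=n)          # bootstrap sample
            trees.append(DecisionTreeRegressor().fit(X[idx], targets[idx]))
            oob = np.setdiff1d(np.arange(n), idx)     # out-of-bag cases
            oob_sum[oob] += trees[-1].predict(X[oob])
            oob_cnt[oob] += 1
        stages.append(trees)
        # Alter the outputs: the next stage trains on out-of-bag residuals.
        oob_pred = oob_sum / np.maximum(oob_cnt, 1)
        targets = targets - oob_pred
        mse = float(np.mean(targets ** 2))
        if mse > 1.1 * best_mse:      # simple stopping rule (1.1 is an assumption)
            stages.pop()              # discard the stage that made things worse
            break
        best_mse = min(best_mse, mse)
    return stages

def predict(stages, X):
    # Each stage models what earlier stages left unexplained, so the final
    # prediction is the sum over stages of each stage's bagged average.
    return sum(np.mean([t.predict(X) for t in trees], axis=0) for trees in stages)
```

In this reading, later stages fit residual structure the first bagged ensemble missed, which is the sense in which the procedure reduces bias as well as variance.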
References
Breiman, L. (1993). Hinging hyperplanes for regression, classification and noiseless function approximation. IEEE Transactions on Information Theory, 39, 999–1013.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140.
Breiman, L. (1997a). Arcing the edge. Technical Report, Statistics Department, University of California.
Breiman, L. (1997b). Out-of-bag estimation. Technical Report, Statistics Department, University of California.
Breiman, L. (1998a). Arcing classifiers, discussion paper. Annals of Statistics, 26, 801–824.
Breiman, L. (1998b). Half and half bagging and hard boundary points. Technical Report, Statistics Department, University of California.
Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and Regression Trees. Belmont, CA: Wadsworth.
Drucker, H. (1997). Improving regressors using boosting techniques. In Proceedings of the International Conference on Machine Learning (pp. 107–115).
Drucker, H. (1999). Combining Artificial Neural Nets (pp. 51–77). Berlin: Springer.
Drucker, H., Burges, C., Kaufman, K., Smola, A., & Vapnik, V. (1997). Support vector regression machines. Advances in Neural Information Processing Systems, 9, 155–161.
Freund, Y., & Schapire, R. (1996). Experiments with a new boosting algorithm. In Machine Learning: Proceedings of the Thirteenth International Conference, July 1996.
Friedman, J. (1991). Multivariate adaptive regression splines. Annals of Statistics, 19(1), 1–67.
Friedman, J. (1997). On bias, variance, 0/1 loss, and the curse of dimensionality. Data Mining and Knowledge Discovery, 1(1), 55–77.
Friedman, J. (1999a). Greedy function approximation: A gradient boosting machine. Available at http://www-stat.stanford.edu/~jhf/.
Friedman, J. (1999b). Stochastic gradient boosting. Available at http://www-stat.stanford.edu/~jhf/.
Friedman, J., Hastie, T., & Tibshirani, R. (1998). Additive logistic regression: A statistical view of boosting. Technical Report, Statistics Department, Stanford University.
Geman, S., Bienenstock, E., & Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4, 1–58.
Schölkopf, B., Bartlett, P., Smola, A., & Williamson, R. (1999). Shrinking the tube: A new support vector regression algorithm. Advances in Neural Information Processing Systems, 11, 330–336.
Stitson, M., Gammerman, A., Vapnik, V., Vovk, V., Watkins, C., & Weston, J. (1999). Support vector regression with ANOVA decomposition kernels. In Advances in Kernel Methods—Support Vector Learning (pp. 285–291). Cambridge, MA: MIT Press.
Tibshirani, R. (1996). Bias, variance, and prediction error for classification rules. Technical Report, Statistics Department, University of Toronto.
Vapnik, V. (1998). Statistical Learning Theory. New York: Wiley.
Wolpert, D. H., & Macready, W. G. (1999). An efficient method to estimate bagging's generalization error. Machine Learning, 35(1), 41–55.