DOI: 10.1145/1102351.1102381
Article

Closed-form dual perturb and combine for tree-based models

Published: 07 August 2005

Abstract

This paper studies the aggregation of predictions made by tree-based models over several perturbed versions of the attribute vector of a test case. A closed-form approximation of this scheme, combined with cross-validation to tune the level of perturbation, is proposed. This yields soft-tree models in a parameter-free way and preserves their interpretability. Empirical evaluations on classification and regression problems show that accuracy and the bias/variance tradeoff are significantly improved, at the price of an acceptable computational overhead. The method is further compared and combined with tree bagging.
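The scheme the abstract describes perturbs the *test* input (rather than the training set, as in bagging) and aggregates the tree's predictions over the perturbed copies; under Gaussian perturbation, the probability of reaching each leaf has a closed form via the normal CDF, which removes the need for sampling. A minimal single-split sketch of both variants is given below. This is an illustrative toy, not the paper's algorithm: the threshold 0.5, the leaf values, the noise level, and the function names are all invented for this example, whereas the paper handles full trees and tunes the noise level by cross-validation.

```python
import math
import random

# Toy regression tree: one split on a single feature x at threshold 0.5.
# Leaf values (1.0 and 3.0) are arbitrary illustrative choices.
def tree_predict(x):
    return 1.0 if x < 0.5 else 3.0

def monte_carlo_dpc(x, sigma, n=1000, seed=0):
    """Dual perturb and combine by sampling: average the tree's
    predictions over n Gaussian-perturbed copies of the test input."""
    rng = random.Random(seed)
    return sum(tree_predict(x + rng.gauss(0.0, sigma)) for _ in range(n)) / n

def closed_form_dpc(x, sigma):
    """Closed-form variant: under Gaussian perturbation the probability
    of landing in each leaf is a normal CDF evaluated at the split
    threshold, so the expectation is a CDF-weighted leaf average."""
    p_left = 0.5 * (1.0 + math.erf((0.5 - x) / (sigma * math.sqrt(2.0))))
    return 1.0 * p_left + 3.0 * (1.0 - p_left)

print(tree_predict(0.49), tree_predict(0.51))   # hard jump: 1.0 3.0
print(round(closed_form_dpc(0.5, 0.2), 3))      # softened at the split: 2.0
```

Near the threshold the perturbed prediction interpolates smoothly between the leaf values, which is the "soft tree" behaviour the abstract refers to, while far from any split it coincides with the hard tree's output.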





Published In

ICML '05: Proceedings of the 22nd international conference on Machine learning
August 2005
1113 pages
ISBN:1595931805
DOI:10.1145/1102351
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%


Article Metrics

  • Downloads (Last 12 months)8
  • Downloads (Last 6 weeks)0
Reflects downloads up to 18 Nov 2024


Cited By

  • (2022) Suspended sediment load modeling using advanced hybrid rotation forest based elastic network approach. Journal of Hydrology, 610, 127963. DOI: 10.1016/j.jhydrol.2022.127963. Online: Jul 2022.
  • (2022) Multi-level Machine Learning-Driven Tunnel Squeezing Prediction: Review and New Insights. Archives of Computational Methods in Engineering, 29(7), 5493-5509. DOI: 10.1007/s11831-022-09774-z. Online: 10 Jun 2022.
  • (2020) Regularisation of neural networks by enforcing Lipschitz continuity. Machine Learning. DOI: 10.1007/s10994-020-05929-w. Online: 6 Dec 2020.
  • (2018) Neural Random Forests. Sankhya A. DOI: 10.1007/s13171-018-0133-y. Online: 21 Jun 2018.
  • (2016) Comments on: A random forest guided tour. TEST, 25(2), 247-253. DOI: 10.1007/s11749-016-0487-1. Online: 19 Apr 2016.
  • (2015) Prediction intervals for electric load forecast: Evaluation for different profiles. 2015 18th International Conference on Intelligent System Application to Power Systems (ISAP), 1-6. DOI: 10.1109/ISAP.2015.7325539. Online: Sep 2015.
