Abstract
Random forests are among the most successful ensemble methods, exhibiting performance on the level of boosting and support vector machines. The method is fast, robust to noise, does not overfit, and offers possibilities for explanation and visualization of its output. We investigate ways to increase the strength or decrease the correlation of individual trees in the forest. Using several attribute evaluation measures instead of just one gives promising results. Moreover, replacing ordinary voting with voting weighted by the margin achieved on the most similar instances yields improvements that are statistically highly significant over several data sets.
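For illustration, the weighted-voting idea can be sketched in a few lines of Python. The sketch below is an assumption-laden approximation, not the procedure from the paper: scikit-learn's RandomForestClassifier plays the role of the forest, plain Euclidean nearest neighbours stand in for the paper's notion of "most similar instances", and each tree's weight is a simple correct-minus-incorrect margin on the neighbours of the instance being classified.

# Minimal sketch of margin-weighted voting in a random forest.
# Assumptions (not from the paper itself): Euclidean k-nearest
# neighbours approximate similarity, and a tree's weight is its
# correct-minus-incorrect fraction on those neighbours.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
nn = NearestNeighbors(n_neighbors=10).fit(X_tr)

def weighted_vote(x):
    # Find the training instances most similar to x.
    _, idx = nn.kneighbors(x.reshape(1, -1))
    neigh_X = X_tr[idx[0]]
    # Individual trees predict class indices into forest.classes_.
    neigh_codes = np.searchsorted(forest.classes_, y_tr[idx[0]])
    votes = np.zeros(len(forest.classes_))
    for tree in forest.estimators_:
        pred = tree.predict(neigh_X)
        # Margin stand-in: correct minus incorrect fraction on the neighbours.
        margin = np.mean(pred == neigh_codes) - np.mean(pred != neigh_codes)
        if margin > 0:  # only trees competent near x take part in the vote
            votes[int(tree.predict(x.reshape(1, -1))[0])] += margin
    return forest.classes_[np.argmax(votes)]

acc = np.mean([weighted_vote(x) == t for x, t in zip(X_te, y_te)])
print(f"margin-weighted voting accuracy: {acc:.3f}")

The number of neighbours and the positive-margin cutoff are arbitrary choices here; the sketch is meant only to show the mechanics of per-instance tree weighting, not to reproduce the paper's results.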
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Robnik-Šikonja, M. (2004). Improving Random Forests. In: Boulicaut, J.-F., Esposito, F., Giannotti, F., Pedreschi, D. (eds) Machine Learning: ECML 2004. Lecture Notes in Computer Science, vol. 3201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30115-8_34
DOI: https://doi.org/10.1007/978-3-540-30115-8_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23105-9
Online ISBN: 978-3-540-30115-8