Abstract
Given an ensemble of randomized regression trees, it is possible to restructure them as a collection of multilayered neural networks with particular connection weights. Following this principle, we reformulate the random forest method of Breiman (2001) into a neural network setting, and in turn propose two new hybrid procedures that we call neural random forests. Both predictors exploit the prior knowledge of regression trees for their architecture, have fewer parameters to tune than standard networks, and place fewer restrictions on the geometry of the decision boundaries than trees. Consistency results are proved, and substantial numerical evidence is provided on both synthetic and real data sets to assess the excellent performance of our methods in a large variety of prediction problems.
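To fix ideas, the following Python sketch illustrates this tree-to-network translation in the spirit of Sethi (1990); it is a minimal illustration, not the authors' exact construction. Each split of a fitted scikit-learn regression tree becomes a first-layer unit testing the sign of x_j − threshold, each leaf becomes a second-layer unit that fires only when all splits along its root-to-leaf path agree, and the output layer returns the mean of the selected leaf. The helper name tree_to_network and the saturation constant c are our own illustrative choices.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def tree_to_network(reg, c=1e6):
        """Encode a fitted DecisionTreeRegressor as a two-hidden-layer network."""
        t = reg.tree_
        splits = [i for i in range(t.node_count) if t.children_left[i] != -1]
        leaves = [i for i in range(t.node_count) if t.children_left[i] == -1]
        split_idx = {node: k for k, node in enumerate(splits)}

        # First hidden layer: one unit per split, tanh(c * (x[feature] - threshold)),
        # which approaches the sign of the split test as c grows.
        W1 = np.zeros((reg.n_features_in_, len(splits)))
        b1 = np.zeros(len(splits))
        for k, node in enumerate(splits):
            W1[t.feature[node], k] = c
            b1[k] = -c * t.threshold[node]

        # For every non-root node, record its parent split and which side it lies on.
        parent, side = {}, {}
        for i in splits:
            parent[t.children_left[i]], side[t.children_left[i]] = i, -1.0
            parent[t.children_right[i]], side[t.children_right[i]] = i, +1.0

        # Second hidden layer: one unit per leaf, positive only when every split
        # on the root-to-leaf path sends x toward that leaf.
        W2 = np.zeros((len(splits), len(leaves)))
        b2 = np.zeros(len(leaves))
        leaf_values = np.array([t.value[leaf][0, 0] for leaf in leaves])
        for j, leaf in enumerate(leaves):
            node, depth = leaf, 0
            while node in parent:
                W2[split_idx[parent[node]], j] = side[node]
                node, depth = parent[node], depth + 1
            b2[j] = -(depth - 0.5)  # the path sum equals `depth` iff all splits agree

        def predict(X):
            H1 = np.tanh(X @ W1 + b1)              # soft split indicators in (-1, 1)
            H2 = (H1 @ W2 + b2 > 0).astype(float)  # one-hot leaf membership
            return H2 @ leaf_values                # prediction of the selected leaf

        return predict

    # The network reproduces the tree when c is large enough to saturate the tanh:
    rng = np.random.default_rng(0)
    X = rng.uniform(size=(200, 3))
    y = X[:, 0] + np.sin(4.0 * X[:, 1]) + 0.1 * rng.normal(size=200)
    reg = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, y)
    net = tree_to_network(reg)
    print(np.max(np.abs(net(X) - reg.predict(X))))  # close to 0

In the hybrid procedures studied in the paper, the hard threshold in the second layer is replaced by a smooth activation, so that all connection weights, initialized from the trees as above, can subsequently be fine-tuned by gradient-based training.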
References
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., et al. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems. http://tensorflow.org/.
Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2, 1–127.
Biau, G. and Scornet, E. (2016). A random forest guided tour (with comments and a rejoinder by the authors). TEST, 25, 197–227.
Boulesteix, A.-L., Janitza, S., Kruppa, J. and König, I.R. (2012). Overview of random forest methodology and practical guidance with emphasis on computational biology and bioinformatics. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2, 493–507.
Breiman, L. (2001). Random forests. Machine Learning, 45, 5–32.
Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.J. (1984). Classification and Regression Trees. Chapman & Hall/CRC, Boca Raton.
Brent, R.P. (1991). Fast training algorithms for multi-layer neural nets. IEEE Transactions on Neural Networks, 2, 346–354.
Chipman, H.A., George, E.I. and McCulloch, R.E. (2010). BART: Bayesian additive regression trees. The Annals of Applied Statistics, 4, 266–298.
Cortez, P. and Morais, A. (2007). A data mining approach to predict forest fires using meteorological data. In: J. Neves, M.F. Santos and J. Machado, editors, New Trends in Artificial Intelligence, Proceedings of the 13th Portuguese Conference on Artificial Intelligence, pp. 512–523.
Devroye, L., Györfi, L. and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Springer, New York.
Fernández-Delgado, M., Cernadas, E., Barro, S. and Amorim, D. (2014). Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research, 15, 3133–3181.
Geurts, P. and Wehenkel, L. (2005). Closed-form dual perturb and combine for tree-based models. In: Proceedings of the 22nd International Conference on Machine Learning, pp. 233–240. ACM, New York.
Györfi, L., Kohler, M., Krzyżak, A. and Walk, H. (2002). A Distribution-Free Theory of Nonparametric Regression. Springer, New York.
Hastie, T., Tibshirani, R. and Friedman, J. (2009). The Elements of Statistical Learning, 2nd edition. Springer, New York.
Ioannou, Y., Robertson, D., Zikic, D., Kontschieder, P., Shotton, J., Brown, M. and Criminisi, A. (2016). Decision forests, convolutional networks and the models in-between. arXiv:1603.01250.
Ishwaran, H., Kogalur, U.B., Chen, X. and Minn, A.J. (2011). Random survival forests for high-dimensional data. Statistical Analysis and Data Mining, 4, 115–132.
Jordan, M.I. and Jacobs, R.A. (1994). Hierarchical mixtures of experts and the EM algorithm. Neural Computation, 6, 181–214.
Kingma, D.P. and Ba, J. (2015). Adam: A method for stochastic optimization. In: International conference on learning representations.
Kontschieder, P., Fiterau, M., Criminisi, A. and Rota Bulò, S. (2015). Deep neural decision forests. In: International conference on computer vision.
Lichman, M. (2013). UCI machine learning repository. http://archive.ics.uci.edu/ml.
Lugosi, G. and Zeger, K. (1995). Nonparametric estimation via empirical risk minimization. IEEE Transactions on Information Theory, 41, 677–687.
Meinshausen, N. (2006). Quantile regression forests. Journal of Machine Learning Research, 7, 983–999.
Menze, B.H., Kelm, B.M., Splitthoff, D.N., Koethe, U. and Hamprecht, F.A. (2011). On oblique random forests. In: D. Gunopulos, T. Hofmann, D. Malerba and M. Vazirgiannis, editors, Machine Learning and Knowledge Discovery in Databases, pp. 453–469. Springer, Berlin.
Olaru, C. and Wehenkel, L. (2003). A complete fuzzy decision tree technique. Fuzzy Sets and Systems, 138, 221–254.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., et al. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825–2830.
Pollard, D. (1984). Convergence of Stochastic Processes. Springer, New York.
Redmond, M. and Baveja, A. (2002). A data-driven software tool for enabling cooperative information sharing among police departments. European Journal of Operational Research, 141, 660–678.
Richmond, D.L., Kainmueller, D., Yang, M.Y., Myers, E.W. and Rother, C. (2015). Relating cascaded random forests to deep convolutional neural networks for semantic segmentation. arXiv:1507.07583.
Rokach, L. and Maimon, O. (2008). Data Mining with Decision Trees: Theory and Applications. World Scientific, Singapore.
Scornet, E., Biau, G. and Vert, J.-P. (2015). Consistency of random forests. The Annals of Statistics, 43, 1716–1741.
Sethi, I.K. (1990). Entropy nets: From decision trees to neural networks. Proceedings of the IEEE, 78, 1605–1613.
Sethi, I.K. (1991). Decision tree performance enhancement using an artificial neural network interpretation. In: I.K. Sethi and A.K. Jain, editors, Artificial Neural Networks and Statistical Pattern Recognition: Old and New Connections, pp. 71–88. Elsevier, Amsterdam.
Welbl, J. (2014). Casting random forests as artificial neural networks (and profiting from it). In: X. Jiang, J. Hornegger and R. Koch, editors, Pattern Recognition, pp. 765–771. Springer, Berlin.
Yeh, I.-C. (1998). Modeling of strength of high-performance concrete using artificial neural networks. Cement and Concrete Research, 28, 1797–1808.
Yildiz, O.T. and Alpaydin, E. (2013). Regularizing soft decision trees. In: Information Sciences and Systems 2013, pp. 15–21. Springer, Cham.
Acknowledgments
We sincerely thank the Associate Editor and the Referee for their valuable comments and insightful suggestions, which led to a substantial improvement of the paper.
Cite this article
Biau, G., Scornet, E. & Welbl, J. Neural Random Forests. Sankhya A 81, 347–386 (2019). https://doi.org/10.1007/s13171-018-0133-y