The R Journal: article published in 2015, volume 7:2

VSURF: An R Package for Variable Selection Using Random Forests
Robin Genuer, Jean-Michel Poggi and Christine Tuleau-Malot, The R Journal (2015) 7:2, pages 19-33.

Abstract This paper describes the R package VSURF. Based on random forests, and for both regression and classification problems, it returns two subsets of variables. The first is a subset of important variables including some redundancy, which can be relevant for interpretation. The second is a smaller subset corresponding to a model that tries to avoid redundancy and focuses more closely on the prediction objective. The two-stage strategy is based on a preliminary ranking of the explanatory variables using the random forests permutation-based score of importance, and proceeds using a stepwise forward strategy for variable introduction. The two proposals can be obtained automatically using data-driven default values, good enough to provide interesting results, but the strategy can also be tuned by the user. The algorithm is illustrated on a simulated example and its applications to real datasets are presented.

Received: 2014-07-28; online 2015-11-08
CRAN packages: VSURF, rpart, randomForest, party, ipred, Boruta, varSelRF, spikeSlabGAM, BioMark, mlbench, mixOmics
CRAN Task Views implied by cited CRAN packages: MachineLearning, Environmetrics, Survival, ChemPhys, Multivariate, Bayesian, HighPerformanceComputing


CC BY 4.0
This article is licensed under a Creative Commons Attribution 3.0 Unported license.

@article{RJ-2015-018,
  author = {Robin Genuer and Jean-Michel Poggi and Christine Tuleau-Malot},
  title = {{VSURF: An R Package for Variable Selection Using Random
          Forests}},
  year = {2015},
  journal = {{The R Journal}},
  doi = {10.32614/RJ-2015-018},
  url = {https://doi.org/10.32614/RJ-2015-018},
  pages = {19--33},
  volume = {7},
  number = {2}
}