Computer Science > Machine Learning

arXiv:1306.0237 (cs)

[Submitted on 2 Jun 2013 (v1), last revised 18 Nov 2013 (this version, v3)]

Title: Guided Random Forest in the RRF Package


Abstract: Random Forest (RF) is a powerful supervised learner and has been widely used in many applications such as bioinformatics.
In this work we propose the guided random forest (GRF) for feature selection. Like the feature selection method called guided regularized random forest (GRRF), GRF is built using the importance scores from an ordinary RF. However, the trees in GRRF are built sequentially, are highly correlated, and do not allow for parallel computing, while the trees in GRF are built independently and can be built in parallel. Experiments on 10 high-dimensional gene data sets show that, with a fixed parameter value (without tuning the parameter), RF applied to features selected by GRF outperforms RF applied to all features on 9 data sets, and the differences are significant at the 0.05 level on 7 of them. Therefore, both accuracy and interpretability are significantly improved. GRF selects more features than GRRF but leads to better classification accuracy. Note that in this work the guided random forest is guided by the importance scores from an ordinary random forest; however, it can also be guided by other methods such as human insights (by specifying $\lambda_i$). GRF can be used in "RRF" v1.4 (and later versions), a package that also includes the regularized random forest methods.
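
To illustrate how importance scores might guide tree construction, here is a minimal Python sketch. It is not the RRF package's code. The abstract does not state how $\lambda_i$ is computed, so the form $\lambda_i = 1 - \gamma + \gamma \cdot \mathrm{imp}_i / \max_j \mathrm{imp}_j$ used here is an assumption, as are the names `guided_weights`, `pick_split_feature`, and `gamma`. The sketch only shows the idea of scaling each feature's split gain by its weight, so that features with low importance in an ordinary RF are less likely to be chosen.

```python
def guided_weights(importances, gamma=1.0):
    """Map importance scores to per-feature weights lambda_i in [1 - gamma, 1].

    Assumed form (not given in the abstract):
        lambda_i = 1 - gamma + gamma * imp_i / max(imp)
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    max_imp = max(importances)
    if max_imp <= 0:
        # No informative scores: fall back to unweighted gains.
        return [1.0] * len(importances)
    # Negative importances are clipped to zero before normalizing.
    return [1.0 - gamma + gamma * max(imp, 0.0) / max_imp for imp in importances]


def pick_split_feature(gains, weights):
    """Return the index of the feature with the largest weighted gain."""
    weighted = [g * w for g, w in zip(gains, weights)]
    return max(range(len(weighted)), key=lambda i: weighted[i])


# Hypothetical numbers: importance scores from an ordinary RF and the raw
# split gains of three features at one tree node.
weights = guided_weights([0.5, 1.0, 0.0], gamma=0.5)
best = pick_split_feature([0.4, 0.35, 0.5], weights)
```

In this toy case the third feature has the largest raw gain, but its zero importance halves that gain, so the second feature is chosen. Because each node's choice depends only on the fixed weights and not on features picked by earlier trees, trees guided this way can be grown independently, which matches the parallelism the abstract describes.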

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:1306.0237 [cs.LG]
	(or arXiv:1306.0237v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1306.0237

Submission history

From: Houtao Deng
[v1] Sun, 2 Jun 2013 18:30:45 UTC (14 KB)
[v2] Sat, 12 Oct 2013 03:56:07 UTC (14 KB)
[v3] Mon, 18 Nov 2013 08:52:49 UTC (13 KB)
