
Rotation Forest: A New Classifier Ensemble Method

Published: 01 October 2006

Abstract

We propose a method for generating classifier ensembles based on feature extraction. To create the training data for a base classifier, the feature set is randomly split into K subsets (K is a parameter of the algorithm) and Principal Component Analysis (PCA) is applied to each subset. All principal components are retained in order to preserve the variability information in the data. Thus, K axis rotations take place to form the new features for a base classifier. The idea of the rotation approach is to encourage individual accuracy and diversity within the ensemble simultaneously. Diversity is promoted through the feature extraction for each base classifier. Decision trees were chosen here because they are sensitive to rotation of the feature axes, hence the name "forest." Accuracy is sought by keeping all principal components and also by using the whole data set to train each base classifier. Using WEKA, we examined the Rotation Forest ensemble on a random selection of 33 benchmark data sets from the UCI repository and compared it with Bagging, AdaBoost, and Random Forest. The results were favorable to Rotation Forest and prompted an investigation into the diversity-accuracy landscape of the ensemble models. Diversity-error diagrams revealed that Rotation Forest ensembles construct individual classifiers which are more accurate than those in AdaBoost and Random Forest, and more diverse than those in Bagging, sometimes more accurate as well.
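The construction described above is straightforward to prototype. The following is a minimal sketch in Python, assuming NumPy and scikit-learn; it is not the authors' WEKA implementation, and it omits parts of the full algorithm (for example, the bootstrap sampling of classes and instances before fitting PCA on each subset). The function names build_rotation_forest and predict_rotation_forest are illustrative, and the sketch assumes numeric features, non-negative integer class labels, and more training samples than features in any subset so that every PCA block is square.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

def build_rotation_forest(X, y, n_trees=10, K=3, seed=None):
    """Train n_trees decision trees, each on its own block-diagonal PCA rotation of X."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    ensemble = []
    for _ in range(n_trees):
        # Randomly partition the feature indices into K roughly equal subsets.
        subsets = np.array_split(rng.permutation(n_features), K)
        # Assemble the rotation matrix from the PCA loadings of each subset;
        # all principal components are kept, so each block is square.
        R = np.zeros((n_features, n_features))
        for s in subsets:
            pca = PCA().fit(X[:, s])            # keep all components
            R[np.ix_(s, s)] = pca.components_.T
        tree = DecisionTreeClassifier(random_state=int(rng.integers(2**31)))
        tree.fit(X @ R, y)                      # whole data set, rotated
        ensemble.append((R, tree))
    return ensemble

def predict_rotation_forest(ensemble, X):
    """Majority vote over the trees (assumes non-negative integer class labels)."""
    votes = np.stack([tree.predict(X @ R) for R, tree in ensemble]).astype(int)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

Because every principal component is retained, each rotation matrix R is square and invertible, so no variability information is discarded; diversity comes only from the random feature partition (and, in the full algorithm, from the sampling used when fitting the PCA).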


Published In

IEEE Transactions on Pattern Analysis and Machine Intelligence  Volume 28, Issue 10
October 2006
176 pages

Publisher

IEEE Computer Society

United States

Publication History

Published: 01 October 2006

Author Tags

  1. AdaBoost
  2. Classifier ensembles
  3. PCA
  4. bagging
  5. feature extraction
  6. kappa-error diagrams
  7. random forest

Qualifiers

  • Research-article


