
An efficient ensemble learning method based on multi-objective feature selection

Published: 19 September 2024

Abstract

Ensemble learning (EL) boosts model prediction performance across various domains through two main steps: generating individual classifiers (ICs) and combining them. Creating accurate and diverse ICs is crucial for a strong ensemble, while selecting the best ICs, known as ensemble selection (ES), is critical yet challenging because of the accuracy-diversity trade-off and the lack of agreed-upon diversity metrics. This paper introduces an EL strategy that tackles these challenges through multi-objective feature selection (MOFS) and feature-relevance-guided ensemble selection. The approach first applies a hybrid MOFS algorithm to produce accurate and diverse ICs, and then employs a novel knowledge-based, feature-relevance-guided metric for precise diversity assessment during ES. ES is cast as an optimization problem that maximizes both diversity and accuracy, and an efficient ES algorithm is developed to select the optimal subset of ICs. Extensive tests on public datasets and a real-world prediction task demonstrate the effectiveness of the method, especially in achieving high accuracy.
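To make the two-step idea concrete, the following minimal Python sketch illustrates the general pattern the abstract describes; it is not the authors' algorithm. Random feature subsets stand in for the paper's hybrid MOFS step, mean pairwise disagreement stands in for the knowledge-based feature-relevance-guided diversity metric, and a greedy search with an illustrative weight alpha stands in for the paper's ES optimizer. All parameter values and names here are hypothetical.

# Sketch only: two-step ensemble construction in the spirit of the abstract.
import numpy as np
from itertools import combinations
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=600, n_features=30, n_informative=10,
                           random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 1 (stand-in for the hybrid MOFS step): each IC is trained on its own
# feature subset, which promotes diversity across the pool.
pool, subsets = [], []
for _ in range(15):
    feats = rng.choice(X.shape[1], size=10, replace=False)
    clf = DecisionTreeClassifier(max_depth=5).fit(X_tr[:, feats], y_tr)
    pool.append(clf)
    subsets.append(feats)

# Validation predictions and per-IC accuracy.
preds = np.array([c.predict(X_val[:, f]) for c, f in zip(pool, subsets)])
acc = (preds == y_val).mean(axis=1)

def disagreement(sel):
    # Mean pairwise disagreement, a common diversity proxy (the paper uses a
    # feature-relevance-guided metric instead).
    if len(sel) < 2:
        return 0.0
    return float(np.mean([(preds[i] != preds[j]).mean()
                          for i, j in combinations(sel, 2)]))

# Step 2 (stand-in for the ES optimizer): greedily grow the sub-ensemble,
# maximizing alpha * accuracy + (1 - alpha) * diversity.
alpha = 0.7  # illustrative trade-off weight
selected = [int(np.argmax(acc))]
while len(selected) < 7:
    scores = [(alpha * acc[k] + (1 - alpha) * disagreement(selected + [k]), k)
              for k in range(len(pool)) if k not in selected]
    selected.append(max(scores)[1])

# Majority vote of the selected ICs on the validation split.
ens_pred = (preds[selected].mean(axis=0) >= 0.5).astype(int)
print("selected ICs:", selected)
print("sub-ensemble validation accuracy:", (ens_pred == y_val).mean())

Under these assumptions the greedy loop mirrors the accuracy-diversity trade-off the abstract describes: with alpha near 1 it degenerates into picking the most accurate ICs, while smaller alpha favors ICs that disagree with the current selection.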



          Information & Contributors

          Information

          Published In

          cover image Information Sciences: an International Journal
          Information Sciences: an International Journal  Volume 679, Issue C
          Sep 2024
          1581 pages

          Publisher

          Elsevier Science Inc.

          United States


          Author Tags

          1. Ensemble learning
          2. Multi-objective feature selection
          3. Ensemble selection
          4. Knowledge-based
          5. Binary state transition algorithm

          Qualifiers

          • Research-article
