Abstract
High dimensionality is a data quality problem that degrades the predictive performance of software defect prediction (SDP) models. Feature selection (FS) is a viable solution and has been used to address high dimensionality in SDP. Existing studies distinguish two basic types of FS methods: filter-based feature selection (FFS) and wrapper feature selection (WFS). Of the two, WFS methods are regarded as having superior performance. However, WFS methods are known to have a high computational cost, because the number of executions required for feature subset search, evaluation and selection is not known in advance. This often leads to overfitting of prediction models because the search is easily trapped in local maxima. Applying an appropriate search method in the subset evaluator phase of WFS can resolve this trapping in local maxima. Best First Search (BFS) and Greedy Step-wise Search (GSS) have been used extensively and conventionally as search methods in WFS, with positive impact. However, metaheuristic search methods can be as effective as BFS and GSS. Consequently, this study conducts an empirical comparative analysis of 13 search methods (11 state-of-the-art metaheuristic search methods and 2 conventional search methods) in WFS methods for SDP. The experimental results showed that the metaheuristic search methods (AS, BS, BAT, CS, ES, FS, FLS, GS, NSGA-II, PSOS, RS) performed better in WFS than the conventional search methods (BFS and GSS), although the average computational time of metaheuristic-based WFS methods is relatively high. We recommend metaheuristic search methods as alternative search methods for WFS in SDP.
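To make the wrapper idea concrete, the following is a minimal sketch in plain Python, not the setup used in the study. It assumes a toy dataset, a leave-one-out 1-nearest-neighbour classifier as the wrapped evaluator, and a plain random subset search as a simple stand-in for a metaheuristic; it is not one of the 11 methods compared in the paper. All function names (`make_toy_data`, `make_scorer`, `greedy_stepwise`, `random_search`) are hypothetical.

```python
import random
from typing import Callable, FrozenSet, List, Tuple

Subset = FrozenSet[int]
Scorer = Callable[[Subset], float]


def make_toy_data(n: int = 20, n_noise: int = 3, seed: int = 0):
    """Toy data (assumption): feature 0 separates the classes, the rest is noise."""
    rng = random.Random(seed)
    X: List[List[float]] = []
    y: List[int] = []
    for i in range(n):
        label = i % 2
        X.append([label + 0.01 * i] + [rng.random() for _ in range(n_noise)])
        y.append(label)
    return X, y


def make_scorer(X: List[List[float]], y: List[int]) -> Scorer:
    """Wrapper evaluator: leave-one-out accuracy of a 1-NN classifier
    that only sees the features in the candidate subset."""
    def score(subset: Subset) -> float:
        if not subset:
            return 0.0  # no features means no usable model
        correct = 0
        for i, xi in enumerate(X):
            nearest = min(
                (j for j in range(len(X)) if j != i),
                key=lambda j: sum((xi[c] - X[j][c]) ** 2 for c in subset),
            )
            correct += y[nearest] == y[i]
        return correct / len(X)
    return score


def greedy_stepwise(n_features: int, score: Scorer) -> Tuple[Subset, float]:
    """Forward greedy search: add the single best feature per step and
    stop as soon as no addition strictly improves the score."""
    selected: Subset = frozenset()
    best = score(selected)
    while True:
        candidates = [selected | {f} for f in range(n_features) if f not in selected]
        if not candidates:
            break
        top = max(candidates, key=score)
        top_score = score(top)
        if top_score <= best:
            break  # local maximum reached
        selected, best = top, top_score
    return selected, best


def random_search(n_features: int, score: Scorer,
                  iterations: int, seed: int) -> Tuple[Subset, float]:
    """Sample random subsets and keep the best one seen so far."""
    rng = random.Random(seed)
    best_subset: Subset = frozenset()
    best = score(best_subset)
    for _ in range(iterations):
        candidate = frozenset(f for f in range(n_features) if rng.random() < 0.5)
        candidate_score = score(candidate)
        if candidate_score > best:
            best_subset, best = candidate, candidate_score
    return best_subset, best


X, y = make_toy_data()
score = make_scorer(X, y)
print("greedy stepwise:", greedy_stepwise(len(X[0]), score))
print("random search:  ", random_search(len(X[0]), score, iterations=30, seed=1))
```

Every call to `score` retrains and evaluates the wrapped classifier, which is where the computational cost of a wrapper method comes from. The greedy search stops at the first subset that no single added feature improves, while the random search spends a fixed evaluation budget and can jump between unrelated subsets.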
Acknowledgement
This research/paper was fully supported by Universiti Teknologi PETRONAS, under the Yayasan Universiti Teknologi PETRONAS (YUTP) Research Grant Scheme (YUTP-FRG/015LC0240).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Balogun, A.O. et al. (2020). Search-Based Wrapper Feature Selection Methods in Software Defect Prediction: An Empirical Analysis. In: Silhavy, R. (eds) Intelligent Algorithms in Software Engineering. CSOC 2020. Advances in Intelligent Systems and Computing, vol 1224. Springer, Cham. https://doi.org/10.1007/978-3-030-51965-0_43
DOI: https://doi.org/10.1007/978-3-030-51965-0_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-51964-3
Online ISBN: 978-3-030-51965-0
eBook Packages: Intelligent Technologies and Robotics (R0)