Abstract
The lack of available training data is the most common issue in software defect prediction, especially for newly developed projects. Adopting training data from other software projects is often not the best solution because software metrics are heterogeneous across projects. Unsupervised approaches have been proposed to address this issue by building the prediction model without a training dataset. The spectral classifier is one such unsupervised approach and has been applied successfully when training data are lacking. However, the method runs into a problem when the dataset does not satisfy the nonnegative Laplacian assumption, which happens when the adjacency matrix contains negative values. The spectral classifier works with the Laplacian matrix, which is constructed from the adjacency matrix. In this paper, a signed Laplacian-based spectral classifier is proposed to solve the problem of negative values in the adjacency matrix by converting the negative values into absolute values. The experimental results show that the proposed method can improve the performance of unsupervised classification compared to the unsigned Laplacian-based spectral classifier. Hence, the proposed method is strongly suggested for unsupervised software defect prediction in software projects that have no historical dataset.
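The idea of taking absolute values can be illustrated with a minimal sketch. It assumes the common signed-graph definition of the Laplacian, L = D̄ − W with D̄_ii = Σ_j |w_ij|, and splits modules by the sign of the eigenvector belonging to the smallest eigenvalue. The toy matrix `W`, the function names, and the choice of eigenvector are illustrative assumptions and are not taken from the paper's procedure.

```python
import numpy as np


def naive_laplacian(W):
    """Unsigned-style Laplacian D - W, with degrees summed as-is (signs kept)."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W


def signed_laplacian(W):
    """Signed Laplacian D_bar - W, with degrees summed over absolute values."""
    W = np.asarray(W, dtype=float)
    return np.diag(np.abs(W).sum(axis=1)) - W


def spectral_split(W):
    """Two-way split by the sign of the eigenvector of the smallest eigenvalue.

    Assumption: this eigenvector choice follows a common convention for
    signed graphs; the paper may select or post-process it differently.
    """
    eigenvalues, eigenvectors = np.linalg.eigh(signed_laplacian(W))  # ascending
    return eigenvectors[:, 0] > 0


# Hypothetical adjacency matrix: modules 0 and 1 are similar to each other,
# modules 2 and 3 are similar to each other, and the two pairs are dissimilar.
W = np.array([
    [0.0, 1.0, -1.0, -1.0],
    [1.0, 0.0, -1.0, -1.0],
    [-1.0, -1.0, 0.0, 1.0],
    [-1.0, -1.0, 1.0, 0.0],
])

naive_min = np.linalg.eigvalsh(naive_laplacian(W)).min()
signed_min = np.linalg.eigvalsh(signed_laplacian(W)).min()
labels = spectral_split(W)

print("smallest eigenvalue, naive degrees:", naive_min)
print("smallest eigenvalue, absolute-value degrees:", signed_min)
print("cluster labels:", labels)
```

With signed degrees the Laplacian of this toy matrix has a negative eigenvalue, so it is no longer positive semidefinite. With absolute-value degrees the spectrum stays nonnegative, and the sign split separates the two pairs of modules. Which cluster is then labeled defective is a separate step that this sketch does not cover.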
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Human and animal participants
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Marjuni, A., Adji, T.B. & Ferdiana, R. Unsupervised software defect prediction using signed Laplacian-based spectral classifier. Soft Comput 23, 13679–13690 (2019). https://doi.org/10.1007/s00500-019-03907-6
DOI: https://doi.org/10.1007/s00500-019-03907-6