Deep learning approaches for bad smell detection: a systematic literature review

Abstract

Context

Bad smells negatively impact software quality metrics such as understandability, reusability, and maintainability. Accurate bad smell detection can therefore reduce costs and improve software quality.

Objective

This review aims to summarize and synthesize the studies that used deep learning (DL) techniques for bad smell detection. Given the rapid growth of DL techniques, we believe that reviewing and analyzing the current body of knowledge would facilitate the development of new techniques and help researchers identify research gaps in this area.

Method

We followed a systematic approach to identify 67 studies on DL-based bad smell detection published up to October 2021. We collected and analyzed quantitative and qualitative data to obtain our results.

Results

Code Clone was the most frequently recurring smell. Supervised learning was the most widely adopted learning approach for DL-based bad smell detection. Convolutional neural network (CNN), artificial neural network (ANN), deep neural network (DNN), long short-term memory (LSTM), attention model, and recursive autoencoder (RAE) were the most commonly used DL models. DL models that efficiently detect bad smells, such as the tree-based CNN (TBCNN) and the abstract syntax tree-based LSTM (AST-LSTM), tend to be specifically designed to encode features for bad smell detection.

Conclusion

Many factors can affect the detection performance of DL models. Although studies exist on DL-based bad smell detection, more work is needed that uses DL models other than those already studied. In this systematic literature review (SLR), we summarize existing research and recommend further research directions on DL-based bad smell detection.

Data Availability

The datasets generated and/or analyzed during the current study are available in the GitHub repository at https://github.com/amalazba/Deep-Learning-Approaches-for-Bad-Smell-Detection-SLR.

References

AbuHassan A, Alshayeb M, Ghouti L (2021) Software smell detection techniques: A systematic literature review. J Softw Evol Process 33(3):e2320. https://doi.org/10.1002/smr.2320
Article Google Scholar
Alkharabsheh K, Crespo Y, Manso E, Taboada JA (2019) Software Design Smell Detection: A systematic mapping study. Software Qual J 27(3):1069–1148. https://doi.org/10.1007/s11219-018-9424-8
Article Google Scholar
Al-Shaaby A, Aljamaan H, Alshayeb M (2020) Bad Smell Detection Using Machine Learning Techniques: A Systematic Literature Review. Arab J Sci Eng. https://doi.org/10.1007/s13369-019-04311-w
Article Google Scholar
Anne-Wil Harzing (2006) Publish or perish. Harzing.Com. Retrieved January 23, 2022, from https://harzing.com/resources/publish-or-perish
Ardimento P, Aversano L, Bernardi ML, Cimitile M, Iammarino M (2021) Temporal convolutional networks for just-in-time design smells prediction using fine-grained software metrics. Neurocomputing 463:454–471. https://doi.org/10.1016/j.neucom.2021.08.010
Article Google Scholar
Azeem MI, Palomba F, Shi L, Wang Q (2019) Machine learning techniques for code smell detection: A systematic literature review and meta-analysis. Inf Softw Technol 108:115–138. https://doi.org/10.1016/j.infsof.2018.12.009
Article Google Scholar
Barbez A, Khomh F, Gueheneuc Y-G (2019) Deep Learning Anti-Patterns from Code Metrics History. IEEE International Conference on Software Maintenance and Evolution (ICSME) 2019:114–124. https://doi.org/10.1109/ICSME.2019.00021
Article Google Scholar
Bengio Y, Courville AC, Vincent P (2013) Representation Learning: A Review and New Perspectives. IEEE Trans Pattern Anal Mach Intell 35(8):1798–1828. https://doi.org/10.1109/TPAMI.2013.50
Article Google Scholar
Brier G (1950). VERIFICATION OF FORECASTS EXPRESSED IN TERMS OF PROBABILITY. https://doi.org/10.1175/1520-0493(1950)078%3c0001:VOFEIT%3e2.0.CO;2
Article Google Scholar
Brown WH, Malveau RC, McCormick HWS, Mowbray TJ (1998) AntiPatterns: refactoring software, architectures, and projects in crisis (1st edn). John Wiley & Sons, Inc.
Buch, L, Andrzejak, A (2019) Learning-Based Recursive Aggregation of Abstract Syntax Trees for Code Clone Detection. 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER), 95–104. https://doi.org/10.1109/SANER.2019.8668039
Bui, NDQ, Yu, Y, Jiang, L (2021) InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees. 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), 1186–1197. https://doi.org/10.1109/ICSE43902.2021.00109
Caram FL, Rodrigues BRDO, Campanelli AS, Parreiras FS (2019) Machine Learning Techniques for Code Smells Detection: A Systematic Mapping Study. Int J Software Eng Knowl Eng 29(02):285–316. https://doi.org/10.1142/S021819401950013X
Article Google Scholar
Chen, L, Ye, W, Zhang, S (2019) Capturing source code semantics via tree-based convolution over API-enhanced AST. Proceedings of the 16th ACM International Conference on Computing Frontiers, 174–182. https://doi.org/10.1145/3310273.3321560
Chicco D, Jurman G (2020) The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics 21:6. https://doi.org/10.1186/s12864-019-6413-7
Article Google Scholar
Cruzes DS, Dybå T (2011) Research synthesis in software engineering. Inf Softw Technol 53(5):440–455. https://doi.org/10.1016/j.infsof.2011.01.004
Article Google Scholar
Das, AK, Yadav, S, Dhal, S (2019) Detecting Code Smells using Deep Learning. TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON), 2081–2086. https://doi.org/10.1109/TENCON.2019.8929628
Devlin J, Chang M-W, Lee K, Toutanova K (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. http://arxiv.org/abs/1810.04805. Accessed 07-03-2022
Dìşlì H, Tosun A (2020) Code Clone Detection with Convolutional Neural Networks. Bilişim Teknolojileri Dergisi 13(1):1–12. https://doi.org/10.17671/gazibtd.541476
Article Google Scholar
Dong W, Feng Z, Wei H, Luo H (2020) A Novel Code Stylometry-based Code Clone Detection Strategy. International Wireless Communications and Mobile Computing (IWCMC) 2020:1516–1521. https://doi.org/10.1109/IWCMC48107.2020.9148302
Article Google Scholar
Fakhoury, S, Arnaoudova, V, Noiseux, C, Khomh, F, Antoniol, G (2018) Keep it simple: Is deep learning good for linguistic smell detection? 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), 602–611. https://doi.org/10.1109/SANER.2018.8330265
Fang, C, Liu, Z, Shi, Y, Huang, J, Shi, Q (2020) Functional code clone detection with syntax and semantics fusion learning. Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, 516–527. https://doi.org/10.1145/3395363.3397362
Feng, C, Wang, T, Yu, Y, Zhang, Y, Zhang, Y, Wang, H (2020) Sia-RAE: A Siamese Network based on Recursive AutoEncoder for Effective Clone Detection. 2020 27th Asia-Pacific Software Engineering Conference (APSEC), 238–246. https://doi.org/10.1109/APSEC51365.2020.00032
Fontana FA, Mäntylä MV, Zanoni M, Marino A et al (2016) Comparing and experimenting machine learning techniques for code smell detection. Empir Softw Eng 21:1143–1191. https://doi.org/10.1007/s10664-015-9378-4
Fowler M, Beck K, Brant J, Opdyke W, Roberts D, Gamma E (1999) Refactoring: improving the design of existing code (1 edn). Addison-Wesley Professional
Gao Y, Wang Z, Liu S, Yang L, Sang W, Cai Y (2019) TECCD: A Tree Embedding Approach for Code Clone Detection. IEEE International Conference on Software Maintenance and Evolution (ICSME) 2019:145–156. https://doi.org/10.1109/ICSME.2019.00025
Article Google Scholar
Gentleman, R, Carey, VJ (2008) Unsupervised Machine Learning. In F. Hahne, W. Huber, R. Gentleman, & S. Falcon (Eds.), Bioconductor Case Studies (pp. 137–157). Springer. https://doi.org/10.1007/978-0-387-77240-0_10
Goodfellow I, Bengio Y, Courville A (2016) Deep Learning. MIT Press
MATH Google Scholar
Guggulothu T, Moiz SA (2020) Code smell detection using multi-label classification approach. Software Qual J 28(3):1063–1086. https://doi.org/10.1007/s11219-020-09498-y
Article Google Scholar
Guo, X, Shi, C, Jiang, H (2019) Deep semantic-Based Feature Envy Identification. Proceedings of the 11th Asia-Pacific Symposium on Internetware, 1–6. https://doi.org/10.1145/3361242.3361257
Guo C, Yang H, Huang D, Zhang J, Dong N, Xu J, Zhu J (2020) Review Sharing via Deep Semi-Supervised Code Clone Detection. IEEE Access 8:24948–24965. https://doi.org/10.1109/ACCESS.2020.2966532
Article Google Scholar
Hadj-Kacem, M, Bouassida, N (2018) A Hybrid Approach To Detect Code Smells using Deep Learning. Proceedings of the 13th International Conference on Evaluation of Novel Approaches to Software Engineering, 137–146. https://doi.org/10.5220/0006709801370146
Hadj-Kacem, M, Bouassida, N (2019a) Improving the Identification of Code Smells by Combining Structural and Semantic Information. In T. Gedeon, K. W. Wong, & M. Lee (Eds.), Neural Information Processing (pp. 296–304). Springer International Publishing. https://doi.org/10.1007/978-3-030-36808-1_32
Hadj-Kacem M, Bouassida N (2019b) Deep Representation Learning for Code Smells Detection using Variational Auto-Encoder. International Joint Conference on Neural Networks (IJCNN) 2019:1–8. https://doi.org/10.1109/IJCNN.2019.8851854
Article Google Scholar
Hall T, Beecham S, Bowes D, Gray D, Counsell S (2012) A Systematic Literature Review on Fault Prediction Performance in Software Engineering. IEEE Trans Software Eng 38(6):1276–1304. https://doi.org/10.1109/TSE.2011.103
Article Google Scholar
Hamdy A, Tazy M (2020) Deep Hybrid Features for Code Smells Detection. J Theor Appl Inf Technol 98:2684–2696
Google Scholar
He H, Garcia EA (2009) Learning from Imbalanced Data. IEEE Trans Knowl Data Eng 21(9):1263–1284. https://doi.org/10.1109/TKDE.2008.239
Article Google Scholar
Hosseini S, Turhan B, Gunarathna D (2019) A Systematic Literature Review and Meta-Analysis on Cross Project Defect Prediction. IEEE Trans Software Eng 45(2):111–147. https://doi.org/10.1109/TSE.2017.2770124
Article Google Scholar
Hua W, Sui Y, Wan Y, Liu G, Xu G (2021) FCCA: Hybrid Code Representation for Functional Clone Detection Using Attention Networks. IEEE Trans Reliab 70(1):304–318. https://doi.org/10.1109/TR.2020.3001918
Article Google Scholar
Jaiswal A, Babu AR, Zadeh MZ, Banerjee D, Makedon F (2021) A Survey on Contrastive Self-Supervised Learning. Technologies 9(1):2. https://doi.org/10.3390/technologies9010002
Article Google Scholar
Ji X, Liu L, Zhu J (2021) Code Clone Detection with Hierarchical Attentive Graph Embedding. Int J Software Eng Knowl Eng 31(06):837–861. https://doi.org/10.1142/S021819402150025X
Article Google Scholar
Jiang, L, Misherghi, G, Su, Z, Glondu, S (2007) DECKARD: Scalable and Accurate Tree-Based Detection of Code Clones. 29th International Conference on Software Engineering (ICSE’07), 96–105. https://doi.org/10.1109/ICSE.2007.30
Jo Y-B, Lee J, Yoo C-J (2021) Two-Pass Technique for Clone Detection and Type Classification Using Tree-Based Convolution Neural Network. Appl Sci 11(14):6613. https://doi.org/10.3390/app11146613
Article Google Scholar
Karabulut EM, Özel SA, İbrikçi T (2012) A comparative study on the effect of feature selection on classification accuracy. Procedia Technol 1:323–327. https://doi.org/10.1016/j.protcy.2012.02.068
Article Google Scholar
Kaur, A, Jain, S, Goel, S, Dhiman, G (2020) A Review on Machine-learning Based Code Smell Detection Techniques in Object-oriented Software System(s). https://doi.org/10.2174/2352096513999200922125839
Kaur A, Saini M (2021) Enhancing the Software Clone Detection in BigCloneBench: A Neural Network Approach. International Journal of Open Source Software and Processes (IJOSSP) 12(3):17–31. https://doi.org/10.4018/IJOSSP.2021070102
Article Google Scholar
Khan MA, Le H, Do K, Tran T, Ghose A, Dam K, Sindhgatta R (2018) Memory-augmented neural networks for predictive process analytics. arXiv preprint. https://arxiv.org/abs/1802.00938. Accessed 07-01-2022
Kim DK (2019) Enhancing code clone detection using control flow graphs. Int J Electric Comput Eng (IJECE) 9(5):3804. https://doi.org/10.11591/ijece.v9i5.pp3804-3812
Article Google Scholar
Kim DK (2020) A Deep Neural Network-Based Approach to Finding Similar Code Segments. IEICE Trans Inf Syst E103D(4):874–878. https://doi.org/10.1587/transinf.2019EDL8195
Kitchenham B (2004) Procedures for performing systematic reviews. Keele, UK, Keele University, 33(2004), 1–26.
Kitchenham B, Charters S (2007) Guidelines for performing systematic literature reviews in software engineering. Technical Report EBSE 2007-001, Keele University and Durham University Joint Report.
Kitchenham B, Pearl Brereton O, Budgen D, Turner M, Bailey J, Linkman S (2009) Systematic literature reviews in software engineering – A systematic literature review. Inf Softw Technol 51(1):7–15. https://doi.org/10.1016/j.infsof.2008.09.009
Article Google Scholar
Kotsiantis SB (2007) Supervised machine learning: a review of classification techniques. Informatica 31:249–268
Lacerda G, Petrillo F, Pimenta M, Guéhéneuc YG (2020) Code smells and refactoring: A tertiary systematic review of challenges and observations. J Syst Softw 167:110610. https://doi.org/10.1016/j.jss.2020.110610
Article Google Scholar
Le QV, Ngiam J, Coates A, Lahiri A, Prochnow B, Ng AY (2011) On optimization methods for deep learning. In Proceedings of the 28th International Conference on International Conference on Machine Learning (ICML'11). Omnipress, Madison, WI, USA, pp 265–272
Lei, M, Li, H, Li, J, Aundhkar, N, Kim, D-K (2022) Deep learning application on code clone detection: A review of current knowledge. J Syst Softw, 184(C). https://doi.org/10.1016/j.jss.2021.111141
Lewowski, T, Madeyski, L (2022) Code Smells Detection Using Artificial Intelligence Techniques: A Business-Driven Systematic Review. In N. Kryvinska & A. Poniszewska-Marańda (Eds.), Developments in Information & Knowledge Management for Business Applications: Volume 3 (pp. 285–319). Springer International Publishing. https://doi.org/10.1007/978-3-030-77916-0_12
Li L, Feng H, Zhuang W, Meng N, Ryder B (2017a) CCLearner: A Deep Learning-Based Clone Detection Approach. IEEE International Conference on Software Maintenance and Evolution (ICSME) 2017:249–260. https://doi.org/10.1109/ICSME.2017.46
Article Google Scholar
Li, Y, Tarlow, D, Brockschmidt, M, Zemel, R (2017b) Gated Graph Sequence Neural Networks (arXiv:1511.05493). arXiv. https://doi.org/10.48550/arXiv.1511.05493
Li, B, Ye, C, Guan, S, Zhou, H (2020a) Semantic Code Clone Detection Via Event Embedding Tree and GAT Network. 2020a IEEE 20th International Conference on Software Quality, Reliability and Security (QRS), 382–393. https://doi.org/10.1109/QRS51102.2020.00057
Li G, Tang Y, Zhang X, Yi B (2020b) A Deep Learning Based Approach to Detect Code Clones. International Conference on Intelligent Computing and Human-Computer Interaction (ICHCI) 2020:337–340. https://doi.org/10.1109/ICHCI51889.2020.00078
Article Google Scholar
Liang H, Ai L (2021) AST-path Based Compare-Aggregate Network for Code Clone Detection. International Joint Conference on Neural Networks (IJCNN) 2021:1–8. https://doi.org/10.1109/IJCNN52387.2021.9534099
Article Google Scholar
Lim T-S, Loh W-Y, Shih Y-S (2000) A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-Three Old and New Classification Algorithms. Mach Learn 40(3):203–228. https://doi.org/10.1023/A:1007608224229
Article MATH Google Scholar
Liu, H, Jin, J, Xu, Z, Bu, Y, Zou, Y, Zhang, L (2019) Deep Learning Based Code Smell Detection. IEEE Trans Soft Eng, 1–1. https://doi.org/10.1109/TSE.2019.2936376
Liu, X, Zhang, F, Hou, Z, Wang, Z, Mian, L, Zhang, J, Tang, J (2021) Self-supervised Learning: Generative or Contrastive. ArXiv:2006.08218 [Cs, Stat]. http://arxiv.org/abs/2006.08218
Ma Y, He H (eds) (2013) Imbalanced learning: foundations, algorithms, and applications (1st edn). Wiley-IEEE Press.
Marinescu C, Marinescu R, Mihancea PF, Ratiu D, Wettel R (2005) Iplasma: an integrated platform for quality assessment of object-oriented design. ICSM, pp 77–80
Mayvan BB, Rasoolzadegan A, Jafari AJ (2020) Bad smell detection using quality metrics and refactoring opportunities. J Softw Evol Process 32(8):e2255. https://doi.org/10.1002/smr.2255
Article Google Scholar
Mehrotra, N, Agarwal, N, Gupta, P, Anand, S, Lo, D, Purandare, R (2021) Modeling Functional Similarity in Source Code with Graph-Based Siamese Networks. IEEE Trans Softw Eng, 1–1. https://doi.org/10.1109/TSE.2021.3105556
Meng Y, Liu L (2020) A Deep Learning Approach for a Source Code Detection Model Using Self-Attention. Complexity 2020:1–15. https://doi.org/10.1155/2020/5027198
Article Google Scholar
Menshawy, RS, Yousef, AH, Salem, A (2021) Code Smells and Detection Techniques: A Survey. 2021 International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC), 78–83. https://doi.org/10.1109/MIUCC52538.2021.9447669
Moha N, Gueheneuc Y-G, Duchien L, Le Meur A-F (2010) DECOR: A Method for the Specification and Detection of Code and Design Smells. IEEE Trans Softw Eng 36(1):20–36. https://doi.org/10.1109/TSE.2009.50
Article MATH Google Scholar
Mostaeen G, Roy B, Roy CK, Schneider K, Svajlenko J (2020) A machine learning based framework for code clone validation. J Syst Softw 169:110686. https://doi.org/10.1016/j.jss.2020.110686
Article Google Scholar
Mumtaz H, Alshayeb M, Mahmood S, Niazi M (2019) A survey on UML model smells detection techniques for software refactoring. J Softw Evol Process 31(3):e2154. https://doi.org/10.1002/smr.2154
Article Google Scholar
Nafi KW, Roy B, Roy CK, Schneider KA (2020) A universal cross language software similarity detector for open source software categorization. J Syst Softw 162:110491. https://doi.org/10.1016/j.jss.2019.110491
Article Google Scholar
Nafi, KW, Kar, TS, Roy, B, Roy, CK, Schneider, KA (2019) CLCDSA: Cross Language Code Clone Detection using Syntactical Features and API Documentation. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), 1026–1037. https://doi.org/10.1109/ASE.2019.00099
Nair, A, Roy, A, Meinke, K (2020) funcGNN: A Graph Neural Network Approach to Program Similarity. Proceedings of the 14th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), 1–11. https://doi.org/10.1145/3382494.3410675
Ohri K, Kumar M (2021) Review on self-supervised image recognition using deep neural networks. Knowl Based Syst 224:107090. https://doi.org/10.1016/j.knosys.2021.107090
Article Google Scholar
Olbrich SM, Cruzes DS, Sjøberg DIK (2010) Are all code smells harmful? A study of God Classes and Brain Classes in the evolution of three open source systems. IEEE International Conference on Software Maintenance 2010:1–10. https://doi.org/10.1109/ICSM.2010.5609564
Article Google Scholar
Palomba, F, Di Nucci, D, Tufano, M, Bavota, G, Oliveto, R, Poshyvanyk, D, De Lucia, A (2015) Landfill: An Open Dataset of Code Smells with Public Evaluation. 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories, 482–485. https://doi.org/10.1109/MSR.2015.69
Patnaik A, Padhy N (2021) A Hybrid Approach to Identify Code Smell Using Machine Learning Algorithms. International Journal of Open Source Software and Processes 12(2):21–35. https://doi.org/10.4018/IJOSSP.2021040102
Article Google Scholar
Pecorelli F, Nucci DD, Roover CD, Lucia AD (2020) A large empirical assessment of the role of data balancing in machine-learning-based code smell detection. J Syst Softw. https://doi.org/10.1016/j.jss.2020.110693
Article Google Scholar
Perez, D, Chiba, S (2019) Cross-Language Clone Detection by Learning Over Abstract Syntax Trees. 2019 IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), 518–528. https://doi.org/10.1109/MSR.2019.00078
Pérez, J (2013) Refactoring Planning for Design Smell Correction: Summary, Opportunities and Lessons Learned. Proceedings of the 2013 IEEE International Conference on Software Maintenance, 572–577. https://doi.org/10.1109/ICSM.2013.98
Rasmussen CE, Ghahramani Z (2001) Occam’s Razor. In Advances in Neural Information Processing Systems 13:294–300
Google Scholar
Ren, S, Shi, C, Zhao, S (2021) Exploiting Multi-aspect Interactions for God Class Detection with Dataset Fine-tuning. 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC), 864–873. https://doi.org/10.1109/COMPSAC51774.2021.00119
Sabir F, Palma F, Rasool G, Guéhéneuc Y-G, Moha N (2019) A systematic literature review on the detection of smells and their evolution in object-oriented and service-oriented systems. Softw Practice Experience 49(1):3–39. https://doi.org/10.1002/spe.2639
Article Google Scholar
Saini, V, Farmahinifarahani, F, Lu, Y, Baldi, P, Lopes, CV (2018) Oreo: Detection of clones in the twilight zone. Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 354–365. https://doi.org/10.1145/3236024.3236026
Saini, V, Farmahinifarahani, F, Lu, Y, Yang, D, Martins, P, Sajnani, H, Baldi, P, Lopes, CV (2019) Towards Automating Precision Studies of Clone Detectors. 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), 49–59. https://doi.org/10.1109/ICSE.2019.00023
Sajnani, H, Saini, V, Svajlenko, J, Roy, C K, Lopes, CV (2016) SourcererCC: Scaling Code Clone Detection to Big-Code. 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE), 1157–1168. https://doi.org/10.1145/2884781.2884877
Sammut C, Webb GI (2011) Encyclopedia of machine learning. Springer Sci Bus Med. https://doi.org/10.1007/978-0-387-30164-8
Sarker IH (2021) Deep Learning: A Comprehensive Overview on Techniques, Taxonomy, Applications and Research Directions. SN Computer Science 2(6):420. https://doi.org/10.1007/s42979-021-00815-1
Article Google Scholar
Sharma T, Efstathiou V, Louridas P, Spinellis D (2021) Code smell detection by deep direct-learning and transfer-learning. J Syst Softw 176:110936. https://doi.org/10.1016/j.jss.2021.110936
Article Google Scholar
Sheneamer A, Roy S, Kalita J (2021) An Effective Semantic Code Clone Detection Framework Using Pairwise Feature Fusion. IEEE Access 9:84828–84844. https://doi.org/10.1109/ACCESS.2021.3079156
Article Google Scholar
Sheneamer, A, Hazazi, H, Roy, S, Kalita, J (2017) Schemes for Labeling Semantic Code Clones using Machine Learning. 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), 981–985. https://doi.org/10.1109/ICMLA.2017.00-25
Sheneamer, A (2018) CCDLC Detection Framework-Combining Clustering with Deep Learning Classification for Semantic Clones. 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), 701–706. https://doi.org/10.1109/ICMLA.2018.00111
Sidhu, BK, Singh, K, Sharma, N (2020) A machine learning approach to software model refactoring. Int J Comput Appl, 1–12. https://doi.org/10.1080/1206212X.2020.1711616
Storey, M-A, Zagalsky, A (2016) Disrupting developer productivity one bot at a time. Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, 928–931. https://doi.org/10.1145/2950290.2983989
Suryanarayana, G, Samarthyam, G, Sharma, T (2015) Refactoring for Software Design Smells: Managing Technical Debt, Chapter 2—Design Smells. In G. Suryanarayana, G. Samarthyam, & T. Sharma (Eds.), Refactoring for Software Design Smells (pp. 9–19). Morgan Kaufmann. https://doi.org/10.1016/B978-0-12-801397-7.00002-3
Sutskever I, Martens J, Dahl G, Hinton G (2013) On the importance of initialization and momentum in deep learning. Proceedings of the 30th International Conference on Machine Learning, 1139–1147. https://proceedings.mlr.press/v28/sutskever13.html. Accessed 01 Jan 2022
Svajlenko J, Islam JF, Keivanloo I, Roy CK, Mia MM (2014) Towards a Big Data Curated Benchmark of Inter-project Code Clones. IEEE International Conference on Software Maintenance and Evolution 2014:476–480. https://doi.org/10.1109/ICSME.2014.77
Article Google Scholar
Tantithamthavorn C, McIntosh S, Hassan AE, Matsumoto K (2019) The Impact of Automated Parameter Optimization on Defect Prediction Models. IEEE Trans Software Eng 45(7):683–711. https://doi.org/10.1109/TSE.2018.2794977
Article Google Scholar
Tsantalis, N, Chaikalis, T, Chatzigeorgiou, A (2008) JDeodorant: Identification and Removal of Type-Checking Bad Smells. 2008 12th European Conference on Software Maintenance and Reengineering, 329–331. https://doi.org/10.1109/CSMR.2008.4493342
Tufano, M, Watson, C, Bavota, G, Di Penta, M, White, M, Poshyvanyk, D (2018) Deep learning similarities from different representations of source code. Proceedings of the 15th International Conference on Mining Software Repositories, 542–553. https://doi.org/10.1145/3196398.3196431
Ullah F, Naeem MR, Mostarda L, Shah SA (2021) Clone detection in 5G-enabled social IoT system using graph semantics and deep learning model. Int J Mach Learn Cybern 12(11):3115–3127. https://doi.org/10.1007/s13042-020-01246-9
Article Google Scholar
Wang W, Li G, Shen S, Xia X, Jin Z (2020c) Modular Tree Network for Source Code Representation Learning. ACM Transactions on Software Engineering and Methodology 29(4):1–23. https://doi.org/10.1145/3409331
Article Google Scholar
Wang, C, Gao, J, Jiang, Y, Xing, Z, Zhang, H, Yin, W, Gu, M, Sun, J (2019) Go-clone: Graph-embedding based clone detector for Golang. Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, 374–377. https://doi.org/10.1145/3293882.3338996
Wang, H, Liu, J, Kang, J, Yin, W, Sun, H, Wang, H (2020a). Feature Envy Detection based on Bi-LSTM with Self-Attention Mechanism. 2020a IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), 448–457. https://doi.org/10.1109/ISPA-BDCloud-SocialCom-SustainCom51426.2020.00082
Wang, W, Li, G, Ma, B, Xia, X, Jin, Z (2020b) Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Tree. 2020b IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER), 261–271. https://doi.org/10.1109/SANER48275.2020.9054857
Wei, H, Li, M (2017) Supervised Deep Features for Software Functional Clone Detection by Exploiting Lexical and Syntactical Information in Source Code. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 3034–3040. https://doi.org/10.24963/ijcai.2017/423
Wei, H-H, Li, M (2018) Positive and Unlabeled Learning for Detecting Software Functional Clones with Adversarial Training. Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2840–2846. https://doi.org/10.24963/ijcai.2018/394
Wen J, Li S, Lin Z, Hu Y, Huang C (2012) Systematic literature review of machine learning based software development effort estimation models. Inf Softw Technol 54(1):41–59. https://doi.org/10.1016/j.infsof.2011.09.002
Article Google Scholar
White M, Tufano M, Vendome C, Poshyvanyk D (2016) Deep learning code fragments for code clone detection. 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE), 87–98. https://doi.org/10.1145/2970276.2970326
Wu, Y, Zou, D, Dou, S, Yang, S, Yang, W, Cheng, F, Liang, H, Jin, H (2020) SCDetector: Software functional clone detection based on semantic tokens analysis. Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 821–833. https://doi.org/10.1145/3324884.3416562
Wu Y, Wang W (2021) Code Similarity Detection Based on Siamese Network. IEEE International Conference on Information Communication and Software Engineering (ICICSE) 2021:47–51. https://doi.org/10.1109/ICICSE52190.2021.9404110
Article Google Scholar
Xie C, Wang X, Qian C, Wang M (2020) A Source Code Similarity Based on Siamese Neural Network. Appl Sci 10(21):7519. https://doi.org/10.3390/app10217519
Article Google Scholar
Xu, W (2021) Multi-Granularity Code Smell Detection using Deep Learning Method based on Abstract Syntax Tree. 503–509. https://doi.org/10.18293/SEKE2021-014
Xue, H, Venkataramani, G, Lan, T (2018) Clone-Slicer: Detecting Domain Specific Binary Code Clones through Program Slicing. Proceedings of the 2018 Workshop on Forming an Ecosystem Around Software Transformation - FEAST ’18, 27–33. https://doi.org/10.1145/3273045.3273047
Yamashita A, Counsell S (2013) Code smells as system-level indicators of maintainability: An empirical study. J Syst Softw 10(86):2639–2653. https://doi.org/10.1016/j.jss.2013.05.007
Article Google Scholar
Yin, X, Shi, C, Zhao, S (2021) Local and Global Feature Based Explainable Feature Envy Detection. 2021 IEEE 45th Annual Computers, Software, and Applications Conference (COMPSAC), 942–951. https://doi.org/10.1109/COMPSAC51774.2021.00127
Yu, H, Lam, W, Chen, L, Li, G, Xie, T, Wang, Q (2019) Neural Detection of Semantic Code Clones Via Tree-Based Convolution. 2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC), 70–80. https://doi.org/10.1109/ICPC.2019.00021
Yuan, Y, Kong, W, Hou, G, Hu, Y, Watanabe, M, Fukuda, A (2020) From Local to Global Semantic Clone Detection. 2019 6th International Conference on Dependable Systems and Their Applications (DSA), 13–24. https://doi.org/10.1109/DSA.2019.00012
Zeng J, Ben K, Li X, Zhang X (2019) Fast Code Clone Detection Based on Weighted Recursive Autoencoders. IEEE Access 7:125062–125078. https://doi.org/10.1109/ACCESS.2019.2938825
Article Google Scholar
Zhang Y, Wang T (2021) CCEyes: An Effective Tool for Code Clone Detection on Large-Scale Open Source Repositories. IEEE International Conference on Information Communication and Software Engineering (ICICSE) 2021:61–70. https://doi.org/10.1109/ICICSE52190.2021.9404141
Article Google Scholar
Zhang M, Hall T, Baddoo N (2011) Code Bad Smells: A review of current knowledge. J Softw Maint Evol Res Pract 23(3):179–202. https://doi.org/10.1002/smr.521
Article Google Scholar
Zhang L, Feng Z, Ren W, Luo H (2020) Siamese-Based BiLSTM Network for Scratch Source Code Similarity Measuring. International Wireless Communications and Mobile Computing (IWCMC) 2020:1800–1805. https://doi.org/10.1109/IWCMC48107.2020.9148382
Article Google Scholar
Zhang J, Wang X, Zhang H, Sun H, Wang K, Liu X (2019) A Novel Neural Source Code Representation Based on Abstract Syntax Tree. 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), 783–794. https://doi.org/10.1109/ICSE.2019.00086
Zhang J, Hong H, Zhang Y, Wan Y, Liu Y, Sui Y (2021) Disentangled Code Representation Learning for Multiple Programming Languages. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 4454–4466. https://doi.org/10.18653/v1/2021.findings-acl.391
Zhao G, Huang J (2018) DeepSim: Deep learning code functional similarity. Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 141–151. https://doi.org/10.1145/3236024.3236068
Zhou X, Jin Y, Zhang H, Li S, Huang X (2016) A Map of Threats to Validity of Systematic Literature Reviews in Software Engineering. 2016 23rd Asia-Pacific Software Engineering Conference (APSEC), 153–160. https://doi.org/10.1109/APSEC.2016.031

Acknowledgements

The authors acknowledge the support of King Fahd University of Petroleum and Minerals in the development of this work.

Author information

Authors and Affiliations

Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran, 31261, Saudi Arabia
Amal Alazba, Hamoud Aljamaan & Mohammad Alshayeb
Department of Information Systems, King Saud University, Riyadh, 11362, Saudi Arabia
Amal Alazba
Interdisciplinary Research Center for Intelligent Secure Systems, Dhahran, 31261, Saudi Arabia
Mohammad Alshayeb

Authors

Amal Alazba
Hamoud Aljamaan
Mohammad Alshayeb

Corresponding author

Correspondence to Amal Alazba.

Ethics declarations

Conflicts of Interests/Competing Interests

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Communicated by: Denys Poshyvanyk

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Tables 23, 24, 25 and 26

Table 23 The selected primary studies

Table 24 The number of primary studies published in each venue and the publication type

Table 25 Types of bad smells detected using DL

Table 26 Publicly available datasets for each bad smell

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Alazba, A., Aljamaan, H. & Alshayeb, M. Deep learning approaches for bad smell detection: a systematic literature review. Empir Software Eng 28, 77 (2023). https://doi.org/10.1007/s10664-023-10312-z

Accepted: 01 March 2023
Published: 11 May 2023
DOI: https://doi.org/10.1007/s10664-023-10312-z
