Abstract
Context
Bad smells negatively impact software quality attributes such as understandability, reusability, and maintainability. Accurate bad smell detection can reduce costs and improve software quality.
Objective
This review aims to summarize and synthesize the studies that used deep learning (DL) techniques for bad smell detection. Given the rapid growth of DL techniques, we believe that reviewing and analyzing the current body of knowledge would facilitate the development of new techniques and help researchers identify research gaps in this area.
Method
We followed a systematic approach to identify 67 studies on DL-based bad smell detection published until October 2021. We collected and analyzed quantitative and qualitative data to obtain our results.
Results
Code Clone was the most frequently recurring smell. Supervised learning was the most widely adopted learning approach for DL-based bad smell detection. Convolutional neural networks (CNN), artificial neural networks (ANN), deep neural networks (DNN), long short-term memory (LSTM), attention models, and recursive autoencoders (RAE) were the most commonly used DL models. DL models that detect bad smells efficiently, such as the tree-based CNN (TBCNN) and the abstract syntax tree-based LSTM (AST-LSTM), tend to be designed specifically to encode features for bad smell detection.
Conclusion
Many factors can affect the detection performance of DL models. Although studies on DL-based bad smell detection exist, more work is needed on DL models other than those already studied. In this systematic literature review (SLR), we summarize existing research and recommend directions for further research on DL-based bad smell detection.
Data Availability
The datasets generated and/or analyzed during the current study are available in the GitHub repository at https://github.com/amalazba/Deep-Learning-Approaches-for-Bad-Smell-Detection-SLR.
Acknowledgements
The authors acknowledge the support of King Fahd University of Petroleum and Minerals in the development of this work.
Ethics declarations
Conflicts of Interests/Competing Interests
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Additional information
Communicated by: Denys Poshyvanyk
About this article
Cite this article
Alazba, A., Aljamaan, H. & Alshayeb, M. Deep learning approaches for bad smell detection: a systematic literature review. Empir Software Eng 28, 77 (2023). https://doi.org/10.1007/s10664-023-10312-z
DOI: https://doi.org/10.1007/s10664-023-10312-z