Abstract
Today’s software development is typically driven by incremental changes made to software to implement new functionality, fix a bug, or improve performance and security. Each change request is often described as an issue. Recent studies suggest that the set of components (e.g., software modules) relevant to the resolution of an issue is one of the most important pieces of information provided with the issue, and one that software engineers often rely on. However, assigning an issue to the correct component(s) is challenging, especially for large-scale projects that have up to hundreds of components. In this paper, we propose a predictive model that learns from historical issue reports and recommends the most relevant components for new issues. Our model uses Long Short-Term Memory, a deep learning technique, to automatically learn semantic features representing an issue report, and combines them with traditional textual similarity features. An extensive evaluation on 142,025 issues from 11 large projects shows that our approach outperforms one common baseline, two state-of-the-art techniques, and six alternative techniques, with an improvement in predictive performance of 16.70%–66.31% on average across all projects.
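To make the recommendation task concrete, the following is a minimal sketch of ranking components for a new issue using only a textual similarity signal. It is not the model proposed in the paper: the LSTM-learned semantic features are omitted entirely, and the choice of TF-IDF with cosine similarity, the max-over-issues scoring rule, the function names (`recommend_components`, `tfidf_vectors`) and the toy issue data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return sparse TF-IDF vectors (term -> weight) and the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF so that terms present in every document keep a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def recommend_components(history, new_issue, k=2):
    """Score each component by the best similarity between the new issue and
    any historical issue assigned to it, then return the top-k components."""
    docs = [tokenize(text) for text, _ in history]
    vectors, idf = tfidf_vectors(docs)
    # Terms never seen in the history carry no information and are dropped.
    query = {t: tf * idf[t]
             for t, tf in Counter(tokenize(new_issue)).items() if t in idf}
    scores = defaultdict(float)
    for vec, (_, components) in zip(vectors, history):
        sim = cosine(query, vec)
        for comp in components:
            scores[comp] = max(scores[comp], sim)
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [comp for comp, _ in ranked[:k]]


# Hypothetical historical issues, each labeled with one or more components.
history = [
    ("login page crashes on invalid password", ["auth", "ui"]),
    ("database connection pool exhausted under load", ["storage"]),
    ("password reset email never sent", ["auth", "mail"]),
]
top = recommend_components(history, "crash when password is invalid on login", k=2)
print(top)
```

Because an issue may be assigned to several components, the sketch scores components rather than issues, and `k` plays the role of the user-specified number of recommendations. A learned model would replace or complement the similarity score with features derived from the issue text.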
Notes
We use “assigned to” to denote the identification of the relation between an issue and the set of components relevant to the resolution of that issue.
The number of top k components is specified by the user.
The model was implemented in Python using Theano (Team 2016).
We used the implementation of Doc2Vec in Gensim: https://radimrehurek.com/gensim/models/doc2vec.html
References
Al-Kofahi JM, Tamrawi A, Nguyen TN (2010) Fuzzy set approach for automatic tagging in evolving software. In: Proceedings of the international conference on software maintenance (ICSM). https://doi.org/10.1109/ICSM.2010.5609751, pp 1–10
Alencar da Costa D, Abebe SL, McIntosh S, Kulesza U, Hassan AE (2014) An empirical study of delays in the integration of addressed issues. In: Proceedings of the international conference on software maintenance and evolution (ICSME), IEEE, pp 281–290
Antoniol G, Ayari K, Di Penta M, Khomh F, Guéhéneuc YG (2008) Is it a bug or an enhancement?: a text-based approach to classify change requests. In: Proceedings of the conference of the center for advanced studies on collaborative research: meeting of minds, ACM. https://doi.org/10.1145/1463788.1463819, pp 304–318
Anvik J, Murphy GC (2011) Reducing the effort of bug report triage. ACM Trans Softw Eng Methodol 20(3):1–35
Anvik J, Hiew L, Murphy GC (2006) Who should fix this bug? In: Proceedings of the 28th international conference on software engineering (ICSE), ACM Press, New York, USA, pp 361–370
Arcuri A, Briand L (2011) A practical guide for using statistical tests to assess randomized algorithms in software engineering. In: Proceedings of the 33rd international conference on software engineering (ICSE). https://doi.org/10.1145/1985793.1985795, pp 1–10
Atzmueller M, Chin A, Scholz C, Trattner C (2015) Mining, modeling, and recommending ‘things’ in social media. Lect Notes Comput Sci 8940:55–74. https://doi.org/10.1007/978-3-319-14723-9
Baroni M, Dinu G, Kruszewski G (2014) Don’t count, predict! a systematic comparison of context-counting vs. context-predicting semantic vectors. In: ACL (1), pp 238–247
Bettenburg N, Just S, Schröter A, Weiss C, Premraj R, Zimmermann T (2008a) What makes a good bug report? In: Proceedings of the 16th ACM SIGSOFT international symposium on foundations of software engineering, ACM Press, New York, USA, pp 308–318
Bettenburg N, Premraj R, Zimmermann T (2008b) Duplicate bug reports considered harmful … really? In: Proceedings of the international conference on software maintenance (ICSM), pp 337–345
Blei DM, Ng AY, Jordan MI (2012) Latent Dirichlet allocation. J Mach Learn Res 3(4-5):993–1022
Cherman EA, Monard MC, Metz J (2011) Multi-label problem transformation methods: a case study. CLEI Electron J 14(1):1–10
Choetkiertikul M, Dam KH, Tran T, Pham TTM, Ghose A (2018) Poster: predicting components for issue reports using deep. In: Proceedings of the 40th international conference on software engineering (ICSE) poster track, pp 244–245
Cottrell R, Walker RJ, Denzinger J (2008) Semi-automating small-scale source code reuse via structural correspondence. Science 214–225. https://doi.org/10.1145/1453101.1453130
Cubranic D, Murphy G (2004) Automatic bug triage using text categorization. In: Proceedings of the 16th international conference on software engineering & knowledge engineering (SEKE), pp 92–97
Dam H, Tran T, Pham T (2016) A deep language model for software code. arXiv:1608.02715 (August):1–4
Denninger O (2012) Recommending relevant code artifacts for change requests using multiple predictors. In: Proceedings of the 3rd International Workshop on Recommendation Systems for Software Engineering (RSSE). https://doi.org/10.1109/RSSE.2012.6233416, pp 78–79
Elisseeff A, Weston J (2001) A kernel method for multi-labelled classification. Advances in Neural Information Processing Systems 14:681–687
Fu W, Menzies T (2017) Easy over hard: a case study on deep learning. In: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, Association for Computing Machinery, New York, NY, USA, ESEC/FSE 2017. https://doi.org/10.1145/3106237.3106256, pp 49–60
Gasparic M, Janes A (2016) What recommendation systems for software engineering recommend: a systematic literature review. J Syst Softw 113:101–113. https://doi.org/10.1016/j.jss.2015.11.036
Gers FA, Schmidhuber J, Cummins F (2000) Learning to forget: continual prediction with LSTM. Neural Comput 12(10):2451–2471
Glasmachers T (2017) Limits of end-to-end learning. In: Proceedings of the 9th Asian conference on machine learning, pp 17–32
Graves A, Mohamed AR, Hinton G (2013) Speech recognition with deep recurrent neural networks. In: IEEE international conference on acoustics, speech and signal processing (ICASSP), 2013, IEEE, pp 6645–6649
Gu X, Zhang H, Zhang D, Kim S (2016) Deep API learning. In: Proceedings of the 2016 24th ACM SIGSOFT international symposium on foundations of software engineering, ACM, FSE 2016, pp 631–642
Gutmann MU, Hyvärinen A (2012) Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics. J Mach Learn Res 13:307–361
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
Hochreiter S, Bengio Y, Frasconi P, Schmidhuber J (2001) Gradient flow in recurrent nets: the difficulty of learning long-term dependencies
Hu H, Zhang H, Xuan J, Sun W (2014) Effective bug triage based on historical bug-fix information. In: Proceedings of the international Symposium on Software Reliability Engineering (ISSRE). https://doi.org/10.1109/ISSRE.2014.17, pp 122–132
Iqbal A (2014) Understanding contributor to developer turnover patterns in oss projects: a case study of apache projects. ISRN Softw Eng 2014:1–10. https://doi.org/10.1155/2014/535724
Jalbert N, Weimer W (2008) Automated duplicate detection for bug tracking systems. In: Proceedings of the international conference on dependable systems and networks with FTCS and DCC (DSN), IEEE, pp 52–61
James ER (2002) Some implications of remedial and preventive legislation in the United States. Am J Sociol 18(6):769–783. https://doi.org/10.1086/212157, arXiv:1603.06111
Jindal R, Malhotra R, Jain A (2017) Prediction of defect severity by mining software project reports. International Journal of System Assurance Engineering and Management 8(2):334–351. https://doi.org/10.1007/s13198-016-0438-y
Johnson R, Zhang T (2015) Effective use of word order for text categorization with convolutional neural networks. In: NAACL HLT 2015 - 2015 conference of the North American chapter of the association for computational linguistics: human language technologies, proceedings of the conference, 2011, pp 103–112
Jones C (2004) Software project management practices: failure versus success. CrossTalk: The Journal of Defense Software Engineering 17(10):5–9
Kakarontzas G, Stamelos I, Skalistis S, Naskos A (2012) Extracting components from open source: the component adaptation environment (COPE) approach. In: Proceedings of the 38th EUROMICRO conference on software engineering and advanced applications (SEAA). https://doi.org/10.1109/SEAA.2012.39, pp 192–199
Kerzner H, Kerzner HR (2017) Project management: a systems approach to planning, scheduling, and controlling. Wiley
Kochhar PS, Thung F, Lo D (2014) Automatic fine-grained issue report reclassification. In: Proceedings of the IEEE international conference on engineering of complex computer systems (ICECCS), pp 126–135
Kullback S, Leibler RA (1951) On information and sufficiency. Ann Math Stat 22(1):79–86
Kumari M, Singh VB (2020) An improved classifier based on entropy and deep learning for bug priority prediction. In: Intelligent systems design and applications, Springer International Publishing, pp 571–580
Lam AN, Nguyen AT, Nguyen HA, Nguyen TN (2016) Combining deep learning with information retrieval to localize buggy files for bug reports. In: Proceedings of the 30th IEEE/ACM international conference on automated software engineering (ASE). https://doi.org/10.1109/ASE.2015.73, pp 476–481
Lam AN, Nguyen AT, Nguyen HA, Nguyen TN (2017) Bug localization with combination of deep learning and information retrieval. In: Proceedings of the 25th IEEE/ACM international conference on program comprehension (ICPC). https://doi.org/10.1109/ICPC.2017.24, pp 218–229
Lamkanfi A, Demeyer S (2013) Predicting reassignments of bug reports - an exploratory investigation. In: Proceedings of the European conference on software maintenance and reengineering, CSMR. https://doi.org/10.1109/CSMR.2013.42, pp 327–330
Lamkanfi A, Demeyer S, Giger E, Goethals B (2010) Predicting the severity of a reported bug. In: Proceedings of the 7th IEEE working conference on mining software repositories (MSR), IEEE, pp 1–10
Lamkanfi A, Demeyer S, Soetens QD, Verdonck T (2011) Comparing mining algorithms for predicting the severity of a reported bug. In: Proceedings of the European Conference on Software Maintenance and Reengineering (CSMR). https://doi.org/10.1109/CSMR.2011.31, pp 249–258
Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: Proceedings of the 31st international conference on machine learning (ICML). https://doi.org/10.1145/2740908.2742760, vol 32, pp 1188–1196
Lederer AL, Prasad J (1992) Nine management guidelines for better cost estimating. Commun ACM 35(2):51–59
Lee SR, Heo MJ, Lee CG, Kim M, Jeong G (2017) Applying deep learning based automatic bug triager to industrial projects. In: Proceedings of the 2017 11th joint meeting on foundations of software engineering - ESEC/FSE 2017, pp 926–931
Li L, Feng H, Zhuang W, Meng N, Ryder B (2017) CCLearner: a deep learning-based clone detection approach. In: IEEE international conference on software maintenance and evolution (ICSME ’17), pp 249–260
Linares-Vásquez M, McMillan C, Poshyvanyk D, Grechanik M (2014) On using machine learning to automatically classify software applications into domain categories. Empir Softw Eng 19:582–618. https://doi.org/10.1007/s10664-012-9230-z
Mani S, Sankaran A, Aralikatte R (2019) Deeptriage: exploring the effectiveness of deep learning for bug triaging. In: Proceedings of the ACM India joint international conference on data science and management of data - CoDS-COMAD ’19, pp 171–179
McCallum A, Nigam K (1998) A comparison of event models for naïve Bayes text classification. In: Proceedings of the AAAI-98 workshop on learning for text categorization, AAAI Press, pp 41–48
Menzies T, Marcus A (2008) Automated severity assessment of software defect reports. In: Proceedings of the international conference on software maintenance (ICSM), IEEE, pp 346–355
Mohammad F (2018) Is preprocessing of text really worth your time for toxic comment classification? In: Proceedings of the International Conference on Artificial Intelligence (ICAI) 1(1):447–453. arXiv:1806.02908
Muller K (1989) Statistical power analysis for the behavioral sciences. Technometrics 31(4):499–500
Nam J, Kim J, Mencía EL, Gurevych I, Fürnkranz J (2013) Large-scale multi-label text classification - revisiting neural networks. In: Machine learning and knowledge discovery in databases. ECML PKDD 2014. Lecture notes in computer science. arXiv:1312.5419, pp 437–452
Navarro-Almanza R, Juárez-Ramírez R, Licea G (2018) Towards supporting software engineering using deep learning: a case of software requirements classification. In: Proceedings - 2017 5th international conference in software engineering research and innovation, CONISOFT 2017 2018-Janua:116–120. https://doi.org/10.1109/CONISOFT.2017.00021
Nguyen AT, Nguyen TT, Al-Kofahi J, Nguyen HV, Nguyen TN (2011) A topic-based approach for narrowing the search space of buggy files from a bug report. In: Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering (ASE). https://doi.org/10.1109/ASE.2011.6100062, pp 263–272
Otoom AF, Al-shdaifat D, Hammad M, Abdallah EE (2016) Severity prediction of software bugs. In: Proceedings of the 7th international conference on information and communication systems (ICICS), pp 92–95
Pandey N, Sanyal DK, Hudait A, Sen A (2017) Automated classification of software issue reports using machine learning techniques: an empirical study. Innov Syst Softw Eng 13(4):279–297. https://doi.org/10.1007/s11334-017-0294-1
Park YJ, Tuzhilin A (2008) The long tail of recommender systems and how to leverage it. In: Proceedings of the 2008 ACM conference on Recommender systems - RecSys ’08, p 11
Project Management Institute Inc (2000) A guide to the project management body of knowledge (PMBOK guide). Project Management Institute. https://doi.org/10.5860/CHOICE.34-1636, ISBN 978-1-933890-51-7
Rahman MM, Ruhe G, Zimmermann T (2009) Optimized assignment of developers for fixing bugs an initial evaluation for eclipse projects. In: Proceedings of the 3rd international symposium on empirical software engineering and measurement, IEEE, pp 439–442
Robillard MP, Walker RJ, Zimmermann T (2010) Recommendation systems for software engineering. IEEE Softw 27(4):80–86. https://doi.org/10.1109/MS.2009.161
Runeson P, Alexandersson M, Nyholm O (2007) Detection of duplicate defect reports using natural language processing. In: Proceedings of the 29th international conference on software engineering (ICSE), IEEE, pp 499–510
Saha RK, Saha AK, Perry DE (2013) Toward understanding the causes of unanswered questions in software information sites: a case study of stack overflow. In: Proceedings of the 2013 9th joint meeting on foundations of software engineering. https://doi.org/10.1145/2491411.2494585, pp 663–666
Saini V, Farmahinifarahani F, Lu Y, Baldi P, Lopes CV (2018) Oreo: detection of clones in the twilight zone. In: Proceedings of the 2018 26th ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering (ESEC/FSE ’18), ACM Press, pp 354–365
Sarro F, Petrozziello A, Harman M (2016) Multi-objective software effort estimation. In: Proceedings of the 38th international conference on software engineering (ICSE), pp 619–630
Schmidhuber J (2015) Deep learning in neural networks: an overview. Neural Netw 61:85–117. https://doi.org/10.1016/j.neunet.2014.09.003, arXiv:1404.7828
Somasundaram K, Murphy GC (2012) Automatic categorization of bug reports using latent Dirichlet allocation. In: Proceedings of the 5th India software engineering conference (ISEC). https://doi.org/10.1145/2134254.2134276, pp 125–130
Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15:1929–1958
Steck H (2010) Training and testing of recommender systems on data missing not at random. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining. https://doi.org/10.1145/1835804.1835895, pp 713–722
Sun C, Lo D, Khoo SC, Jiang J (2011) Towards more accurate retrieval of duplicate bug reports. In: Proceedings of the 26th IEEE/ACM international conference on automated software engineering (ASE), IEEE, pp 253–262
Sundermeyer M, Schlüter R, Ney H (2012) LSTM neural networks for language modeling. In: INTERSPEECH, pp 194–197
Sureka A (2012) Learning to classify bug reports into components. In: Proceedings of the 50th international conference on objects, models, components, patterns, Springer. https://doi.org/10.1007/978-3-642-30561-0_20, pp 288–303
Team TD (2016) Theano: a Python framework for fast computation of mathematical expressions. arXiv:1605.0. http://deeplearning.net/software/theano
Thung F, Lo D, Jiang L (2012) Automatic defect categorization. In: Proceedings of the working conference on reverse engineering (WCRE), pp 205–214
Tian Y, Lo D, Xia X, Sun C (2015) Automated prediction of bug report priority using multi-factor analysis. Empir Softw Eng 20(5):1354–1383
Vargas-Baldrich S, Linares-Vásquez M, Poshyvanyk D (2016) Automated tagging of software projects using bytecode and dependencies. In: Proceedings of the 30th IEEE/ACM international conference on automated software engineering (ASE). https://doi.org/10.1109/ASE.2015.38, pp 289–294
Vargha A, Delaney HD (2000) A critique and improvement of the CL common language effect size statistics of McGraw and Wong. J Educ Behav Stat 25(2):101–132. https://doi.org/10.3102/10769986025002101
Wang S, Lo D, Lawall J (2014a) Compositional vector space models for improved bug localization. In: Proceedings of the 30th international conference on software maintenance and evolution (ICSME). https://doi.org/10.1109/ICSME.2014.39, pp 171–180
Wang S, Lo D, Vasilescu B, Serebrenik A (2014b) Entagrec: an enhanced tag recommendation system for software information sites. In: International conference on software maintenance and evolution (ICSME ’14), pp 291–300
Wang S, Liu T, Tan L (2016) Automatically learning semantic features for defect prediction. In: Proceedings of the international conference on software engineering (ICSE). https://doi.org/10.1145/2884781.2884804, vol 14–22, pp 297–308
Wang T, Wang H, Yin G, Ling CX, Li X, Zou P (2014c) Tag recommendation for open source software. Front Comput Sci 8(1):69–82
Wang X, Zhang L, Xie T, Anvik J, Sun J (2008) An approach to detecting duplicate bug reports using natural language and execution information. In: Proceedings of the 30th international conference on software engineering (ICSE), pp 461–470
White M, Vendome C, Linares-Vásquez M, Poshyvanyk D (2015) Toward deep learning software repositories. In: Proceedings of the 12th working conference on mining software repositories (MSR), pp 334–345
White M, Tufano M, Vendome C, Poshyvanyk D (2016) Deep learning code fragments for code clone detection. In: IEEE/ACM international conference on automated software engineering. https://doi.org/10.1145/2970276.2970326, pp 87–98
Xi S, Yao Y, Xiao X, Xu F, Lu J (2018) An effective approach for routing the bug reports to the right fixers. In: Proceedings of the tenth Asia-Pacific symposium on internetware - internetware ’18, pp 1–10
Xi SQ, Yao Y, Xiao XS, Xu F, Lv J (2019) Bug triaging based on tossing sequence modeling. J Comput Sci Technol 34(5):942–956. https://doi.org/10.1007/s11390-019-1953-5
Xia X, Lo D, Wang X, Zhou B (2013) Tag recommendation in software information sites. In: Proceedings of the 10th working conference on mining software repositories (MSR), IEEE. https://doi.org/10.1109/MSR.2013.6624040, pp 287–296
Xia X, Lo D, Wen M, Shihab E, Zhou B (2014) An empirical study of bug report field reassignment. In: Proceedings of the conference on software maintenance, reengineering, and reverse engineering, pp 174–183
Xia X, Lo D, Ding Y, Al-Kofahi JM, Nguyen TN, Wang X (2016) Improving automated bug triaging with specialized topic model. IEEE Trans Softw Eng 43(3):272–297. https://doi.org/10.1109/TSE.2016.2576454
Yan M, Zhang X, Yang D, Xu L, Kymer JD (2016) A component recommender for bug reports using discriminative probability latent semantic analysis. Inf Softw Technol 73:37–51
Yang X, Lo D, Xia X, Zhang Y, Sun J (2015) Deep learning for just-in-time defect prediction. In: Proceedings of the IEEE international conference on software quality, reliability and security (QRS), 1. https://doi.org/10.1109/QRS.2015.14, pp 17–26
Yin H, Cui B, Li J, Yao J, Chen C (2012) Challenging the long tail recommendation. Proceedings of the VLDB Endowment 5(9):896–907. http://dl.acm.org/citation.cfm?doid=2311906.2311916
Yin W, Kann K, Yu M, Schütze H (2017) Comparative study of CNN and RNN for natural language processing. arXiv:1702.01923
Yue-Hei Ng J, Hausknecht M, Vijayanarasimhan S, Vinyals O, Monga R, Toderici G (2015) Beyond short snippets: deep networks for video classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4694–4702
Zanoni M, Perin F, Fontana FA, Viscusi G (2014) Dual analysis for recommending developers to resolve bugs. Journal of Software: Evolution and Process 26(12):1172–1192
Zhang M, Zhou Z (2006) Multilabel neural networks with applications to functional genomics and text categorization. IEEE Trans Knowl Data Eng 18(10):1338–1351
Zhou J, Zhang H, Lo D (2012) Where should the bugs be fixed? In: Proceedings of the 34th international conference on software engineering (ICSE). https://doi.org/10.1109/ICSE.2012.6227210, pp 14–24
Zhou P, Liu J, Yang Z, Zhou G (2017) Scalable tag recommendation for software information sites. In: SANER 2017 - 24th IEEE international conference on software analysis, evolution, and reengineering, IEEE, 1, pp 272–282
Additional information
Communicated by: Bram Adams
About this article
Cite this article
Choetkiertikul, M., Dam, H.K., Tran, T. et al. Automatically recommending components for issue reports using deep learning. Empir Software Eng 26, 14 (2021). https://doi.org/10.1007/s10664-020-09898-5
DOI: https://doi.org/10.1007/s10664-020-09898-5