Abstract
Software developers insert logging statements in their source code to record important runtime information; such logged information is valuable for understanding system usage in production and for debugging system failures. However, providing proper logging statements remains a manual and challenging task. Missing an important logging statement may increase the difficulty of debugging a system failure, while too much logging can increase system overhead and mask the truly important information. Intuitively, the actual functionality of a software component is one of the major drivers behind logging decisions. For instance, a method maintaining network communications is more likely to be logged than getters and setters. In this paper, we used automatically computed topics of a code snippet to approximate its functionality. We studied the relationship between the topics of a code snippet and the likelihood of the snippet being logged (i.e., containing a logging statement). Our driving intuition is that certain topics in the source code are more likely to be logged than others. To validate our intuition, we conducted a case study on six open source systems, and we found that i) there exists a small number of “log-intensive” topics that are more likely to be logged than other topics; ii) each pair of the studied systems shares 12% to 62% common topics, and the likelihood of logging such common topics has a statistically significant correlation of 0.35 to 0.62 among all the studied systems; and iii) our topic-based metrics help explain the likelihood of a code snippet being logged, providing an improvement of 3% to 13% on AUC and 6% to 16% on balanced accuracy over a set of baseline metrics that capture the structural information of a code snippet. Our findings highlight that topics contain valuable information that can help guide and drive developers’ logging decisions.
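For readers unfamiliar with the two evaluation measures named above, the following is a minimal sketch of how AUC and balanced accuracy can be computed for a model that scores code snippets as logged or not logged. It is not the implementation used in the study: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (logged, unlogged) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one logged and one unlogged snippet")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def balanced_accuracy(labels: Sequence[int], scores: Sequence[float],
                      threshold: float = 0.5) -> float:
    """Mean of the true-positive rate and the true-negative rate
    at the given (assumed) decision threshold."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one logged and one unlogged snippet")
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


# Hypothetical toy data: 1 = snippet contains a logging statement.
labels = [1, 1, 0, 0, 0, 0, 0, 0]
# Hypothetical predicted probabilities from two models.
baseline_scores = [0.6, 0.3, 0.4, 0.2, 0.1, 0.5, 0.2, 0.1]
topic_scores = [0.8, 0.45, 0.4, 0.2, 0.1, 0.5, 0.2, 0.1]

print("baseline AUC:", auc(labels, baseline_scores))
print("with-topics AUC:", auc(labels, topic_scores))
print("baseline balanced accuracy:", balanced_accuracy(labels, baseline_scores))
```

Balanced accuracy averages the per-class recall, which makes it less sensitive than plain accuracy to class imbalance. That matters in a setting like this toy data, where unlogged snippets outnumber logged ones.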
Notes
Qpid-Java git commit: d606368b92f3952f57dbabd8553b3b6f426305e1
We share our replication package online: http://sailhome.cs.queensu.ca/replication/LoggingTopicModel
Additional information
Communicated by: Miryung Kim
About this article
Cite this article
Li, H., Chen, TH., Shang, W. et al. Studying software logging using topic models. Empir Software Eng 23, 2655–2694 (2018). https://doi.org/10.1007/s10664-018-9595-8
DOI: https://doi.org/10.1007/s10664-018-9595-8