

LoGenText-Plus: Improving Neural Machine Translation Based Logging Texts Generation with Syntactic Templates

Published: 22 December 2023

Abstract

Developers insert logging statements in the source code to collect important runtime information about software systems. The textual descriptions in logging statements (i.e., logging texts) are printed during system executions and exposed to multiple stakeholders, including developers, operators, users, and regulatory authorities. Writing proper logging texts is an important but often challenging task for developers. Prior studies find that developers spend significant effort modifying their logging texts. However, despite extensive research on automated logging suggestions, little research exists on suggesting logging texts. To fill this knowledge gap, we first propose LoGenText (initially reported in our conference paper), an automated approach that uses neural machine translation (NMT) models to generate logging texts by translating the related source code into short textual descriptions. LoGenText takes the source code preceding a logging text as its input and considers other context information, such as the location of the logging statement, to automatically generate the logging text. LoGenText’s evaluation on 10 open source projects indicates that the approach is promising for automatic logging text generation and significantly outperforms the state-of-the-art approach. Furthermore, we extend LoGenText to LoGenText-Plus by incorporating the syntactic templates of the logging texts. Unlike LoGenText, LoGenText-Plus decomposes the logging text generation process into two stages. LoGenText-Plus first adopts an NMT model to generate the syntactic template of the target logging text. It then feeds the source code and the generated template as the input to another NMT model for logging text generation. We also evaluate LoGenText-Plus on the same 10 projects and observe that it outperforms LoGenText on 9 of them. According to a human evaluation from developers’ perspectives, the logging texts generated by LoGenText-Plus have a higher quality than those generated by LoGenText and the prior baseline approach. By manually examining the generated logging texts, we then identify five aspects that can serve as guidance for writing or generating good logging texts. Our work is an important step toward the automated generation of logging statements, which can potentially save developers’ effort and improve the quality of software logging. Our findings shed light on research opportunities that leverage advances in NMT techniques for automated generation and suggestion of logging statements.
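The two-stage decomposition described above can be sketched as a simple pipeline: one model maps the preceding source code to a syntactic template, and a second model consumes both the code and the template to produce the final logging text. The sketch below uses toy stub functions in place of the paper’s trained NMT models, and the names (`generate_template`, `generate_logging_text`, `logentext_plus`) and the template placeholders are illustrative assumptions, not the authors’ implementation.

```python
# Hypothetical sketch of LoGenText-Plus's two-stage generation pipeline.
# The "models" here are toy stubs standing in for the two NMT models.

def generate_template(preceding_code: str) -> str:
    """Stage 1 (stub): an NMT model would map the preceding source code
    to a syntactic template of the target logging text."""
    # Toy heuristic in place of the template-generation NMT model.
    if "catch" in preceding_code or "error" in preceding_code:
        return "[VERB] to [NOUN] : [VAR]"
    return "[VERB] [NOUN]"

def generate_logging_text(preceding_code: str, template: str) -> str:
    """Stage 2 (stub): a second NMT model consumes both the source code
    and the stage-1 template to produce the final logging text."""
    # Toy fill-in in place of the text-generation NMT model.
    return (template.replace("[VERB]", "failed")
                    .replace("[NOUN]", "connect")
                    .replace("[VAR]", "{}"))

def logentext_plus(preceding_code: str) -> str:
    """End-to-end pipeline: source code -> template -> logging text."""
    template = generate_template(preceding_code)
    return generate_logging_text(preceding_code, template)

code = "try { socket.connect(addr); } catch (IOException e) {"
print(logentext_plus(code))  # -> "failed to connect : {}"
```

The point of the sketch is the data flow, not the stubs: stage 2 conditions on the stage-1 template, which is how LoGenText-Plus constrains the wording of the generated logging text.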


Cited By

  • (2024) Go Static: Contextualized Logging Statement Generation. Proceedings of the ACM on Software Engineering 1, FSE (2024), 609–630. DOI: 10.1145/3643754. Online publication date: 12 July 2024.


Published In

    ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 2
    February 2024, 947 pages
    EISSN: 1557-7392
    DOI: 10.1145/3618077
    Editor: Mauro Pezzè

Publisher

    Association for Computing Machinery, New York, NY, United States

    Publication History

    Published: 22 December 2023
    Online AM: 18 September 2023
    Accepted: 16 August 2023
    Revised: 11 July 2023
    Received: 30 October 2022
    Published in TOSEM Volume 33, Issue 2


    Author Tags

    1. Software logging
    2. logging text
    3. neural machine translation

    Qualifiers

    • Research-article

