LILAC: Log Parsing using LLMs with Adaptive Parsing Cache

Authors:

Zhuangbin Chen,

Michael R. Lyu

Proceedings of the ACM on Software Engineering, Volume 1, Issue FSE

Article No.: 7, Pages 137 - 160

https://doi.org/10.1145/3643733

Published: 12 July 2024

Abstract

Log parsing transforms log messages into structured formats and is a prerequisite for many log analysis tasks. Although a variety of log parsing approaches have been proposed, their performance on complicated log data remains compromised because they rely on human-crafted rules or on learning-based models with limited training data. The recent emergence of powerful large language models (LLMs), with their vast pre-trained knowledge of code and logging, makes it promising to apply LLMs to log parsing. However, their lack of specialized log parsing capabilities currently hinders their parsing accuracy. Moreover, their inherently inconsistent answers and substantial overhead prevent the practical adoption of LLM-based log parsing. To address these challenges, we propose LILAC, the first practical Log parsIng framework using LLMs with Adaptive parsing Cache. To facilitate accurate and robust log parsing, LILAC leverages the in-context learning (ICL) capability of the LLM by performing a hierarchical candidate sampling algorithm and selecting high-quality demonstrations. Furthermore, LILAC incorporates a novel component, an adaptive parsing cache, to store and refine the templates generated by the LLM. The cache mitigates the LLM's inefficiency by enabling rapid retrieval of previously processed log templates. In this process, LILAC adaptively updates the templates within the parsing cache to ensure the consistency of parsed results. Extensive evaluation on public large-scale datasets shows that LILAC outperforms state-of-the-art methods by 69.5% in terms of the average F1 score of template accuracy. In addition, LILAC reduces the query times to LLMs by several orders of magnitude, achieving efficiency comparable to the fastest baseline.
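
To illustrate why a parsing cache can cut the number of LLM queries, the following is a minimal sketch and not the authors' implementation. It assumes a flat list of cached templates, a `<*>` placeholder for variable tokens, and whitespace tokenization; `ParsingCache`, `matches`, and `fake_llm` are hypothetical names introduced here, and `fake_llm` is a stand-in for a real LLM call. The adaptive template refinement described in the abstract is omitted.

```python
from typing import Callable, List, Optional

WILDCARD = "<*>"  # assumed placeholder for the variable parts of a log message


def matches(template: str, log: str) -> bool:
    """Token-wise match: a wildcard token matches any single log token."""
    t_tokens, l_tokens = template.split(), log.split()
    if len(t_tokens) != len(l_tokens):
        return False
    return all(t == WILDCARD or t == l for t, l in zip(t_tokens, l_tokens))


class ParsingCache:
    """Hypothetical flat cache: query the LLM only when no template matches."""

    def __init__(self, query_llm: Callable[[str], str]) -> None:
        self.query_llm = query_llm
        self.templates: List[str] = []
        self.llm_queries = 0

    def lookup(self, log: str) -> Optional[str]:
        # Linear scan over cached templates; fine for a sketch.
        for template in self.templates:
            if matches(template, log):
                return template
        return None

    def parse(self, log: str) -> str:
        cached = self.lookup(log)
        if cached is not None:
            return cached  # cache hit: no LLM call needed
        self.llm_queries += 1  # cache miss: ask the (stand-in) LLM
        template = self.query_llm(log)
        self.templates.append(template)
        return template


def fake_llm(log: str) -> str:
    """Stand-in for an LLM: masks every token that contains a digit."""
    return " ".join(
        WILDCARD if any(c.isdigit() for c in tok) else tok for tok in log.split()
    )


cache = ParsingCache(fake_llm)
logs = [
    "Connection from 10.0.0.1 closed",
    "Connection from 10.0.0.2 closed",
    "Disk usage at 91%",
    "Connection from 10.0.0.3 closed",
]
parsed = [cache.parse(log) for log in logs]
print(parsed, cache.llm_queries)
```

In this toy run, four log messages trigger only two calls to the stand-in LLM, because the repeated "Connection from ... closed" messages are served from the cache. The saving grows with the ratio of log messages to distinct templates.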

Index Terms

1. Software and its engineering
  1. Software creation and management

Published In

Proceedings of the ACM on Software Engineering Volume 1, Issue FSE

July 2024

2770 pages

EISSN:2994-970X

DOI:10.1145/3554322

Editor:
Luciano Baresi
Politecnico di Milano, Italy

Copyright © 2024 Copyright held by the owner/author(s).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

Publisher

Association for Computing Machinery

New York, NY, United States
