DOI: 10.1145/3639476.3639778
research-article

Toward Adaptive Tracing: Efficient System Behavior Analysis using Language Models

Published: 24 May 2024

Abstract

Tracing, a technique essential for understanding the behavior of complex computer systems, involves the organized collection of low-level events, enabling anomaly identification, performance debugging, and root cause analysis. However, the significant performance and storage overhead it imposes on large-scale systems has made it a less favorable tool for system maintenance. Previous efforts to mitigate tracing's burden have mostly centered on automating trace analysis but have largely neglected the duration of events, a significant part of the information provided by tracers. To address these challenges, we propose an Adaptive Tracing method that leverages Language Models and kernel traces for precise system modeling. This novel approach minimizes overhead by recording detailed traces only during significant behavioral shifts and by focusing on the subsystems related to the root cause. Using a multi-task model that incorporates system call sequences and durations, we propose a root cause analysis method that enhances model transparency and enables targeted system tracing. Evaluation on a dataset of normal and noisy traces from an Apache server shows that our Adaptive Tracer captures events related to abrupt changes with only 5.8% loss, reduces the collected trace by 77.1%, and determines the respective noise set with 91.3% accuracy, outperforming previous state-of-the-art trace models by 20.9%.
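
The idea of recording detailed traces only during significant behavioral shifts can be illustrated with a small sketch. This is not the authors' implementation: it swaps in a toy add-one-smoothed bigram model over system call names for the paper's language model, ignores event durations and the root cause step, and uses a hypothetical fixed threshold. The names `BigramModel`, `surprise`, and `adaptive_trace`, and the sample system call sequences, are assumptions made for illustration only.

```python
import math
from collections import Counter


class BigramModel:
    """Toy stand-in (assumption) for a learned model of system call sequences."""

    def __init__(self, vocab):
        self.vocab_size = len(set(vocab))
        self.pair_counts = Counter()   # counts of (previous call, current call)
        self.prev_counts = Counter()   # counts of each call seen as a predecessor

    def fit(self, sequence):
        for prev, cur in zip(sequence, sequence[1:]):
            self.pair_counts[(prev, cur)] += 1
            self.prev_counts[prev] += 1

    def surprise(self, window):
        """Mean negative log-likelihood per transition, with add-one smoothing."""
        total = 0.0
        for prev, cur in zip(window, window[1:]):
            p = (self.pair_counts[(prev, cur)] + 1) / (
                self.prev_counts[prev] + self.vocab_size
            )
            total -= math.log(p)
        return total / max(len(window) - 1, 1)


def adaptive_trace(model, windows, threshold):
    """Enable detailed tracing only for windows whose surprise exceeds threshold.

    Returns the per-window decisions and the fraction of windows kept.
    """
    decisions = [model.surprise(w) > threshold for w in windows]
    kept = sum(decisions) / len(decisions) if decisions else 0.0
    return decisions, kept


# Hypothetical data: a regular request loop and one window disturbed by noise.
normal = ["accept", "read", "write", "close"] * 50
noisy = ["accept", "futex", "futex", "read", "futex", "write", "futex", "close"]

model = BigramModel(set(normal) | set(noisy))
model.fit(normal)

windows = [normal[:8], normal[8:16], noisy, normal[16:24]]
decisions, kept = adaptive_trace(model, windows, threshold=1.0)
print(decisions, kept)
```

In this sketch only the disturbed window triggers detailed recording, so three of the four windows would be skipped. A real system would presumably calibrate the threshold on held-out normal traces rather than fix it by hand.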

Published In

ICSE-NIER'24: Proceedings of the 2024 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results
April 2024
127 pages
ISBN: 9798400705007
DOI: 10.1145/3639476
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

  • Faculty of Engineering of University of Porto

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. adaptive tracing
  2. language model
  3. root cause analysis
  4. change detection
  5. trace duration modeling
  6. sequence modeling

Qualifiers

  • Research-article

Conference

ICSE-NIER'24