DOI: 10.1145/3132747.3132778
research-article
Open access

Log20: Fully Automated Optimal Placement of Log Printing Statements under Specified Overhead Threshold

Published: 14 October 2017

Abstract

When systems fail in production environments, log data is often the only information available to programmers for postmortem debugging. Consequently, programmers' decisions about where to place log printing statements are crucial, as they directly affect how effective and efficient postmortem debugging can be. This paper presents Log20, a tool that determines a near-optimal placement of log printing statements under the constraint of adding less than a specified amount of performance overhead. Log20 does this in an automated way without any human involvement. Guided by information theory, the core of our algorithm measures how effective each log printing statement is in disambiguating code paths. To do so, it uses the frequencies of different execution paths that are collected from a production environment by a low-overhead tracing library. We evaluated Log20 on HDFS, HBase, Cassandra, and ZooKeeper, and observed that Log20 is substantially more efficient at disambiguating code paths than the developers' manually placed log printing statements. Log20 can also output a curve showing the trade-off between the informativeness of the logs and the performance slowdown, so that a developer can choose the right balance.
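The idea of scoring a placement by how well it disambiguates code paths can be illustrated with a small sketch. This is not Log20's implementation. It assumes, for illustration only, that each execution path is a tuple of basic-block names with an observed count, that a placement is a set of blocks containing a log statement, and that two paths are indistinguishable when they emit the same sequence of logs. The names `conditional_entropy` and `overhead` are hypothetical. The remaining uncertainty is measured as the Shannon entropy of the path given the log output (lower is more informative), and the cost as the average number of log statements executed per path.

```python
import math
from collections import defaultdict


def conditional_entropy(paths, placement):
    """Entropy (in bits) of the executed path given the log output.

    paths: dict mapping a tuple of block names to a positive observed count.
    placement: set of block names that contain a log statement (assumption).
    """
    total = sum(paths.values())
    # Group paths by the log sequence they would emit under this placement.
    groups = defaultdict(list)
    for path, count in paths.items():
        log_output = tuple(block for block in path if block in placement)
        groups[log_output].append(count)
    entropy = 0.0
    for counts in groups.values():
        group_total = sum(counts)
        for count in counts:
            # p(path) * log2(p(group) / p(path))
            entropy += (count / total) * math.log2(group_total / count)
    return entropy


def overhead(paths, placement):
    """Average number of log statements executed per path (a cost proxy)."""
    total = sum(paths.values())
    executed = sum(
        count * sum(1 for block in path if block in placement)
        for path, count in paths.items()
    )
    return executed / total


# Hypothetical path frequencies, not data from the paper.
paths = {("a", "b", "d"): 50, ("a", "c", "d"): 25, ("a", "c", "e"): 25}
for placement in (set(), {"a"}, {"b"}, {"b", "e"}):
    print(sorted(placement),
          conditional_entropy(paths, placement),
          overhead(paths, placement))
```

In this toy input, logging in block `a` costs one log per execution yet removes no uncertainty, because every path passes through `a`. Logging in `b` costs half as much and reduces the entropy from 1.5 bits to 0.5 bits. Sweeping over placements and plotting entropy against cost gives a trade-off curve of the kind the abstract describes.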

Supplementary Material

MP4 File (log20.mp4)


Information & Contributors

Information

Published In

SOSP '17: Proceedings of the 26th Symposium on Operating Systems Principles
October 2017
677 pages
ISBN: 9781450350853
DOI: 10.1145/3132747
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Log placement
  2. distributed systems
  3. information theory

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SOSP '17
Acceptance Rates

Overall Acceptance Rate 174 of 961 submissions, 18%
