Statistical debugging for real-world performance problems

Published: 15 October 2014

Abstract

Design and implementation defects that lead to inefficient computation are widespread in software. These defects are difficult to avoid and to discover. They cause severe performance degradation and energy waste during production runs, and they are becoming increasingly critical as single-core hardware performance improves only marginally and energy constraints draw growing concern. Effective tools that diagnose performance problems and point out the root cause of the inefficiency are sorely needed.
The state of the art in performance diagnosis is preliminary. Profiling can identify the functions that consume the most computation resources, but it can neither identify the ones that waste the most resources nor explain why. Performance-bug detectors can identify specific types of inefficient computation, but they are not suited for diagnosing general performance problems. Effective failure-diagnosis techniques, such as statistical debugging, have been proposed for functional bugs, but whether they work for performance problems remains an open question.
In this paper, we first conduct an empirical study of how performance problems are observed and reported by real-world users. Our study shows that statistical debugging is a natural fit for diagnosing performance problems, which are often observed through comparison-based approaches and reported together with both good and bad inputs. We then thoroughly investigate different design points in statistical debugging, including three types of predicates and two types of statistical models, to understand which design point works best for performance diagnosis. Finally, we study how the unique characteristics of performance bugs allow sampling techniques to lower the overhead of run-time performance diagnosis without lengthening diagnosis latency.
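
The abstract does not name the three predicates or the two statistical models the paper compares, but the general shape of statistical debugging is easy to illustrate. The short Python sketch below, in the spirit of classic cooperative bug isolation rather than the paper's own implementation, scores each instrumented predicate by how much more often it is true in bad (slow) runs than in good runs; high-scoring predicates localize the inefficiency. All names and counts in it are hypothetical.

# A minimal, hypothetical sketch of predicate-based statistical debugging,
# in the spirit of cooperative bug isolation; not the paper's implementation.
from dataclasses import dataclass

@dataclass
class PredicateStats:
    observed_good: int = 0   # good runs that reached the predicate's site
    observed_bad: int = 0    # bad (slow) runs that reached the site
    true_good: int = 0       # good runs where the predicate was ever true
    true_bad: int = 0        # bad runs where the predicate was ever true

def increase(p: PredicateStats) -> float:
    """How much observing the predicate true raises the probability of a bad run."""
    # Failure(P): fraction of bad runs among runs where P was true.
    failure = p.true_bad / max(p.true_bad + p.true_good, 1)
    # Context(P): background bad-run rate among runs that merely reached P's site.
    context = p.observed_bad / max(p.observed_bad + p.observed_good, 1)
    return failure - context

# Usage with made-up counts: predicates whose truth concentrates in bad runs
# rank highest and point toward the code responsible for the slowdown.
stats = {
    "branch taken at foo.c:42": PredicateStats(90, 10, 4, 9),
    "ret < 0 at bar.c:17":      PredicateStats(90, 10, 50, 5),
}
for name in sorted(stats, key=lambda n: increase(stats[n]), reverse=True):
    print(f"{increase(stats[name]):+.2f}  {name}")

For performance diagnosis, the good/bad labels would come from the comparison-based observations the study describes (for example, a user-reported bad input paired with a good one), and sampling the predicate counters across many production runs would lower per-run overhead while the aggregate counts still converge.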





Published In

ACM SIGPLAN Notices, Volume 49, Issue 10 (OOPSLA '14)
October 2014, 907 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2714064
Editor: Andy Gill

  • OOPSLA '14: Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications
    October 2014, 946 pages
    ISBN: 9781450325851
    DOI: 10.1145/2660193

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 15 October 2014
Published in SIGPLAN Volume 49, Issue 10


Author Tags

  1. empirical study
  2. performance bugs
  3. performance diagnosis
  4. statistical debugging

Qualifiers

  • Research-article


Bibliometrics

Article Metrics

  • Downloads (last 12 months): 64
  • Downloads (last 6 weeks): 8

Reflects downloads up to 26 Sep 2024.

Cited By
  • (2024) A Platform-Agnostic Framework for Automatically Identifying Performance Issue Reports with Heuristic Linguistic Patterns. IEEE Transactions on Software Engineering, pages 1-22. DOI: 10.1109/TSE.2024.3390623. Online publication date: 2024.
  • (2023) Performance Bug Analysis and Detection for Distributed Storage and Computing Systems. ACM Transactions on Storage, 19(3):1-33. DOI: 10.1145/3580281. Online publication date: 19-Jun-2023.
  • (2023) Toward More Efficient Statistical Debugging with Abstraction Refinement. ACM Transactions on Software Engineering and Methodology, 32(2):1-38. DOI: 10.1145/3544790. Online publication date: 30-Mar-2023.
  • (2023) Automated Detection of Software Performance Antipatterns in Java-Based Applications. IEEE Transactions on Software Engineering, 49(4):2873-2891. DOI: 10.1109/TSE.2023.3234321. Online publication date: 1-Apr-2023.
  • (2023) PASD: A Performance Analysis Approach Through the Statistical Debugging of Kernel Events. 2023 IEEE 23rd International Working Conference on Source Code Analysis and Manipulation (SCAM), pages 151-161. DOI: 10.1109/SCAM59687.2023.00025. Online publication date: 2-Oct-2023.
  • (2023) A systematic mapping study of software performance research. Software: Practice and Experience, 53(5):1249-1270. DOI: 10.1002/spe.3185. Online publication date: 2-Jan-2023.
  • (2022) Unicorn. Proceedings of the Seventeenth European Conference on Computer Systems, pages 199-217. DOI: 10.1145/3492321.3519575. Online publication date: 28-Mar-2022.
  • (2021) Dynaplex: analyzing program complexity using dynamically inferred recurrence relations. Proceedings of the ACM on Programming Languages, 5(OOPSLA):1-23. DOI: 10.1145/3485515. Online publication date: 15-Oct-2021.
  • (2020) Detecting and understanding real-world differential performance bugs in machine learning libraries. Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, pages 189-199. DOI: 10.1145/3395363.3404540. Online publication date: 13-Jul-2020.
  • (2020) Automated Performance Modeling of HPC Applications Using Machine Learning. IEEE Transactions on Computers, 69(5):749-763. DOI: 10.1109/TC.2020.2964767. Online publication date: 1-May-2020.
