Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2635868.2635885acmconferencesArticle/Chapter ViewAbstractPublication PagesfseConference Proceedingsconference-collections
research-article

AI: a lightweight system for tolerating concurrency bugs

Published: 11 November 2014 Publication History

Abstract

Concurrency bugs are notoriously difficult to eradicate during software testing because of their non-deterministic nature. Moreover, fixing concurrency bugs is time-consuming and error-prone. Thus, tolerating concurrency bugs during production runs is an attractive complementary approach to bug detection and testing. Unfortunately, existing bug-tolerating tools are usually either 1) constrained in types of bugs they can handle or 2) requiring roll-back mechanism, which can hitherto not be fully achieved efficiently without hardware supports. This paper presents a novel program invariant, called Anticipating Invariant (AI), which can help anticipate bugs before any irreversible changes are made. Benefiting from this ability of anticipating bugs beforehand, our software-only system is able to forestall the failures with a simple thread stalling technique, which does not rely on execution roll-back and hence has good performance Experiments with 35 real-world concurrency bugs demonstrate that AI is capable of detecting and tolerating most types of concurrency bugs, including both atomicity and order violations. Two new bugs have been detected and confirmed by the corresponding developers. Performance evaluation with 6 representative parallel programs shows that AI incurs negligible overhead (<1%) for many nontrivial desktop and server applications.

References

[1]
MySQL. Bug report time to close stats. http://bugs.mysql.com/bugstats.php.
[2]
Mysql bugs: Statistics. http://bugs.mysql.com/bugstats.php.
[3]
S. Burckhardt, P. Kothari, M. Musuvathi, and S. Nagarakatte. A randomized scheduler with probabilistic guarantees of finding bugs. ASPLOS ’10.
[4]
M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao. The daikon system for dynamic detection of likely invariants. Sci. Comput. Program., 69(1-3), Dec. 2007.
[5]
Z. Gu, E. T. Barr, D. J. Hamilton, and Z. Su. Has the bug really been fixed? ICSE ’10.
[6]
G. Jin, A. Thakur, B. Liblit, and S. Lu. Instrumentation and sampling strategies for cooperative concurrency bug isolation. OOPSLA ’10.
[7]
S. Kundu, M. K. Ganai, and C. Wang. Contessa: Concurrency testing augmented with symbolic analysis. CAV’10.
[8]
R. H. A.-D. S. L. Lanyue Lu, Andrea C. Arpaci-Dusseau. A study of linux file system evolution. FAST ’04.
[9]
C. Lattner and V. Adve. LLVM: A compilation framework for lifelong program analysis & transformation. CGO ’’04.
[10]
C. E. Leiserson, R. L. Rivest, C. Stein, and T. H. Cormen. Introduction to algorithms. The MIT press, 2001.
[11]
S. Lu, S. Park, E. Seo, and Y. Zhou. Learning from mistakes: a comprehensive study on real world concurrency bug characteristics. ASPLOS ’08.
[12]
S. Lu, J. Tucek, F. Qin, and Y. Zhou. AVIO: detecting atomicity violations via access interleaving invariants. ASPLOS ’06, 2006.
[13]
B. Lucia and L. Ceze. Cooperative empirical failure avoidance for multithreaded programs. ASPLOS ’13.
[14]
B. Lucia and L. Ceze. Finding concurrency bugs with context-aware communication graphs. MICRO 42, 2009.
[15]
B. Lucia, L. Ceze, and K. Strauss. ColorSafe: architectural support for debugging and dynamically avoiding multi-variable atomicity violations. ISCA ’10, 2010.
[16]
B. Lucia, J. Devietti, K. Strauss, and L. Ceze. Atom-Aid: Detecting and surviving atomicity violations. ISCA ’08, 2008.
[17]
D. Marino, M. Musuvathi, and S. Narayanasamy. LiteRace: effective sampling for lightweight data-race detection. PLDI ’09.
[18]
M. Musuvathi and S. Qadeer. Iterative context bounding for systematic testing of multithreaded programs. PLDI ’07, 2007.
[19]
M. Musuvathi, S. Qadeer, T. Ball, G. Basler, P. Nainar, and I. Neamtiu. Finding and reproducing heisenbugs in concurrent programs. OSDI ’08.
[20]
A. Muzahid, N. Otsuki, and J. Torrellas. AtomTracker: A comprehensive approach to atomic region inference and violation detection. MICRO 43, 2010.
[21]
S. Nagarakatte, S. Burckhardt, M. M. Martin, and M. Musuvathi. Multicore acceleration of priority-based schedulers for concurrency bug detection. PLDI ’12, 2012.
[22]
C.-S. Park and K. Sen. Randomized active atomicity violation detection in concurrent programs. SIGSOFT ’08/FSE-16, 2008.
[23]
S. Park, S. Lu, and Y. Zhou. CTrigger: exposing atomicity violation bugs from their hiding places. ASPLOS ’09, 2009.
[24]
S. Park, R. W. Vuduc, and M. J. Harrold. Falcon: fault localization in concurrent programs. ICSE ’10.
[25]
K. Sen. Race directed random testing of concurrent programs. PLDI ’08.
[26]
K. Serebryany, A. Potapenko, T. Iskhodzhanov, and D. Vyukov. Dynamic race detection with llvm compiler. In Runtime Verification, pages 110–114. Springer, 2012.
[27]
Y. Shi, S. Park, Z. Yin, S. Lu, Y. Zhou, W. Chen, and W. Zheng. Do I use the wrong definition?: Defuse: definition-use invariants for detecting concurrency and sequential bugs. OOPSLA ’10, 2010.
[28]
Y. Smaragdakis, J. Evans, C. Sadowski, J. Yi, and C. Flanagan. Sound predictive race detection in polynomial time. POPL ’12.
[29]
K. Veeraraghavan, P. M. Chen, J. Flinn, and S. Narayanasamy. Detecting and surviving data races using complementary schedules. SOSP ’11.
[30]
H. Volos, A. Tack, M. Swift, and S. Lu. Applying transactional memory to concurrency bugs. ASPLOS ’12.
[31]
C. Wang, M. Said, and A. Gupta. Coverage guided systematic concurrency testing. ICSE ’11.
[32]
S. C. Woo, M. Ohara, E. Torrie, J. P. Singh, and A. Gupta. The SPLASH-2 programs: characterization and methodological considerations. ISCA ’95, 1995.
[33]
J. Wu, H. Cui, and J. Yang. Bypassing races in live applications with execution filters. OSDI ’10, pages 1–13, 2010.
[34]
J. Yu and S. Narayanasamy. A case for an interleaving constrained shared-memory multi-processor. ISCA ’09, 2009.
[35]
J. Yu and S. Narayanasamy. Tolerating concurrency bugs using transactions as lifeguards. MICRO 43, 2010.
[36]
W. Zhang, M. de Kruijf, A. Li, S. Lu, and K. Sankaralingam. ConAir: Featherweight concurrency bug recovery via single-threaded idempotent execution. ASPLOS ’13.
[37]
W. Zhang, C. Sun, and S. Lu. ConMem: detecting severe concurrency bugs through an effect-oriented approach. ASPLOS ’10, 2010. Introduction Motivation Our New Approach Contributions Anticipating Invariant Definition Case Studies Atomicity Violation Order Violation Rationales Implementation Overview Training & Tolerating Custom Instrumentation Strategy Experimental Evaluation Test Platform, Applications, and Bugs Detecting and Anticipating Capability AVIO DUI CCI PSet Discussion New Bugs Performance Sufficient Training Related Work Conclusion Acknowledgments References

Cited By

View all
  • (2022)Efficiently detecting concurrency bugs in persistent memory programsProceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems10.1145/3503222.3507755(873-887)Online publication date: 28-Feb-2022
  • (2019)Applying Transactional Memory for Concurrency-Bug Failure Recovery in Production RunsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2018.287765630:5(990-1006)Online publication date: 1-May-2019
  • (2019)Convoider: A Concurrency Bug Avoider Based on Transparent Software Transactional MemoryInternational Journal of Parallel Programming10.1007/s10766-019-00642-1Online publication date: 12-Sep-2019
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
FSE 2014: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering
November 2014
856 pages
ISBN:9781450330565
DOI:10.1145/2635868
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 November 2014

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Bug Tolerating
  2. Concurrency Bugs
  3. Software Reliability

Qualifiers

  • Research-article

Conference

SIGSOFT/FSE'14
Sponsor:

Acceptance Rates

Overall Acceptance Rate 17 of 128 submissions, 13%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)19
  • Downloads (Last 6 weeks)2
Reflects downloads up to 03 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2022)Efficiently detecting concurrency bugs in persistent memory programsProceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems10.1145/3503222.3507755(873-887)Online publication date: 28-Feb-2022
  • (2019)Applying Transactional Memory for Concurrency-Bug Failure Recovery in Production RunsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2018.287765630:5(990-1006)Online publication date: 1-May-2019
  • (2019)Convoider: A Concurrency Bug Avoider Based on Transparent Software Transactional MemoryInternational Journal of Parallel Programming10.1007/s10766-019-00642-1Online publication date: 12-Sep-2019
  • (2018)Testing multithreaded programs via thread speed controlProceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3236024.3236077(15-25)Online publication date: 26-Oct-2018
  • (2017)A comprehensive study on real world concurrency bugs in Node.jsProceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering10.5555/3155562.3155628(520-531)Online publication date: 30-Oct-2017
  • (2017)Supervised testing of concurrent software in embedded systems2017 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)10.1109/SAMOS.2017.8344633(233-238)Online publication date: Jul-2017
  • (2017)Repairing event race errors by controlling nondeterminismProceedings of the 39th International Conference on Software Engineering10.1109/ICSE.2017.34(289-299)Online publication date: 20-May-2017
  • (2017)A comprehensive study on real world concurrency bugs in Node.js2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)10.1109/ASE.2017.8115663(520-531)Online publication date: Oct-2017
  • (2016)A Lightweight System for Detecting and Tolerating Concurrency BugsIEEE Transactions on Software Engineering10.1109/TSE.2016.253166642:10(899-917)Online publication date: 1-Oct-2016
  • (2016)EventHealerJournal of Systems and Software10.1016/j.jss.2016.02.051118:C(208-220)Online publication date: 1-Aug-2016
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media