Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1736020.1736039acmconferencesArticle/Chapter ViewAbstractPublication PagesasplosConference Proceedingsconference-collections
research-article

Analyzing multicore dumps to facilitate concurrency bug reproduction

Published: 13 March 2010 Publication History

Abstract

Debugging concurrent programs is difficult. This is primarily because the inherent non-determinism that arises because of scheduler interleavings makes it hard to easily reproduce bugs that may manifest only under certain interleavings. The problem is exacerbated in multi-core environments where there are multiple schedulers, one for each core. In this paper, we propose a reproduction technique for concurrent programs that execute on multi-core platforms. Our technique performs a lightweight analysis of a failing execution that occurs in a multi-core environment, and uses the result of the analysis to enable reproduction of the bug in a single-core system, under the control of a deterministic scheduler.
More specifically, our approach automatically identifies the execution point in the re-execution that corresponds to the failure point. It does so by analyzing the failure core dump and leveraging a technique called execution indexing that identifies a related point in the re-execution. By generating a core dump at this point, and comparing the differences betwen the two dumps, we are able to guide a search algorithm to efficiently generate a failure inducing schedule. Our experiments show that our technique is highly effective and has reasonable overhead.

References

[1]
A. R. Alameldeen and D. A. Wood. Addressing Workload Variability in Architectural Simulations. In IEEE Micro, 23(6):94--98, 2003.
[2]
G. Altekar and I. Stoica. ODR: Output-Deterministic Replay for Multicore Debugging. In SOSP, pages 193--206, 2009.
[3]
A. Ayers, R. Schooler, C. Metcalf, A. Agarwal, J. Rhee, and E. Witchel. Traceback: First Fault Diagnosis by Reconstruction of Distributed Control Flow. In PLDI, pages 201--212, 2005.
[4]
S. Bhansali, W.-K. Chen, S. de Jong, A. Edwards, R. Murray, M. Drinic, D. Mihocka, and J. Chau. Framework for Instruction-Level Tracing and Analysis of Program Executions. In VEE, pages 154--163, 2006.
[5]
H. J. Boehm and M. Weiser. Garbage Collection in an Uncooperative Environment. In Software Practice and Experience, 18(9):807--820, 1988.
[6]
M. D. Bond and K. S. McKinley. Probabilistic Calling Context. In OOPSLA, pages 97--112, 2007.
[7]
J.-D. Choi and H. Srinivasan. Deterministic Replay of Java Multi-threaded Applications. In SIGMETRICS, pages 48--59, 1998.
[8]
G. W. Dunlap, D. G. Lucchetti, M. A. Fetterman, and P. M. Chen. Execution Replay of Multiprocessor Virtual Machines. In VEE, pages 121--130, 2008.
[9]
J. Ferrante, K. J. Ottenstein, and J. D. Warren. The Program Dependence Graph and its Use in Optimization. ACM Transactions on Programming Languages and Systems, 9(3):319--349, 1987.
[10]
B. C. M. Fung, K. Wang, R. Chen, and P. S. Yu. Privacy-preserving Data Publishing: A Survey on Recent Developments. In ACM Computing Surveys, 2009.
[11]
Z. Guo, X. Wang, J. Tang, X. Liu, Z. Xu, M. Wu, M. F. Kaashoek, and Z. Zhang. R2: An Application-Level Kernel for Record and Replay. In OSDI, pages 193--208, 2008.
[12]
D. R. Hower and M. D. Hill. Rerun: Exploiting Episodes for Lightweight Memory Race Recording. In ISCA, pages 265--276, 2008.
[13]
P. Joshi, C. S. Park, K. Sen, and M. Naik. A Randomized Dynamic Program Analysis Technique for Detecting Real Deadlocks. In PLDI, pages 110--120, 2009.
[14]
S. T. King, G. W. Dunlap, and P. M. Chen. Debugging Operating Systems with Time-Traveling Virtual Machines. In USENIX, pages 1--15, 2005.
[15]
B. Korel and J. Laski. Dynamic Program Slicing. In Information Processing Letters, 29(3):155--163, 1988.
[16]
P. Montesinos, M. Hicks, S. T. King, and J. Torrellas. Capo: A Software-Hardware Interface for Practical Deterministic Multiprocessor Replay. In ASPLOS, pages 73--84, 2009.
[17]
M. Musuvathi and S. Qadeer. Iterative Context Bounding for Systematic Testing of Multithreaded Programs. In PLDI, pages 446--455, 2007.
[18]
S. Narayanasamy, C. Pereira, and B. Calder. Recording Shared Memory Dependencies Using Strata. In ASPLOS, pages 229--240, 2006.
[19]
N. Nethercote and J. Seward. Valgrind: A Framework for Heavy-weight Dynamic Binary Instrumentation. In PLDI, pages 89--100, 2007.
[20]
R. H. B. Netzer and M. H. Weaver. Optimal Tracing and Incremental Reexecution for Debugging Long-Running Programs. In PLDI, pages 313--325, 1994.
[21]
D. Z. Pan and M. A. Linton. Supporting Reverse Execution for Parallel Programs. In SIGPLAN and SIGOPS Workshop on Parallel and Distributed Debugging, pages 124--129, 1988.
[22]
S. Park, S. Lu, and Y. Zhou. Ctrigger: Exposing Atomicity Violation Bugs from Their Hiding Places. In ASPLOS, pages 25--36, 2009.
[23]
S. Park, W. Xiong, Z. Yin, R. Kaushik, K. Lee, S. Lu, and Y. Zhou. Do You Have to Reproduce the Bug at the First Replay Attempt? -- pres: Probabilistic Replay with Execution Sketching on Multiprocessors. In SOSP, pages 177--192, 2009.
[24]
M. Ronsse, K. D. Bosschere, M. Christiaens, J. C. d. Kergommeaux, and D. Kranzlmüller. Record/Replay for Nondeterministic Program Executions. In Communcation of the ACM, 46(9):62--67, 2003.
[25]
Y. Saito. Jockey: A User-Space Library for Record-Replay Debugging. In Automated Analysis--Driven Debugging, pages 69--76, 2005.
[26]
, S. Sarkar, P. Sewell, F.Z. Nardelli, S. Owens, T. Ridge, T. Braibant, M. Myreen, and J. Aglave The Semantics of x86-CC Multiprocessor Machine Code In POPL, pages 379--391, 2009.
[27]
K. Sen. Race Directed Random Testing of Concurrent Programs. In PLDI, pages 11--21, 2008.
[28]
S. M. Srinivasan, S. Kandula, C. R. Andrews, and Y. Zhou. Flashback: A Lightweight Extension For Rollback and Deterministic Replay for Software Debugging. In USENIX, pages 29--44, 2004.
[29]
B. Xin, N. Sumner, and X. Zhang. Efficient Program Execution Indexing. In PLDI, pages 238--249, 2008.
[30]
X. Zhang, R. Gupta, and Y. Zhang. Cost and Precision Tradeoffs of Dynamic Data Slicing Algorithms. ACM Transactions on Programming Languages and Systems, 27(4):631--661, 2005.

Cited By

View all
  • (2024)How Well Industry-Level Cause Bisection Works in Real-World: A Study on Linux KernelCompanion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering10.1145/3663529.3663828(62-73)Online publication date: 10-Jul-2024
  • (2024)MissConf: LLM-Enhanced Reproduction of Configuration-Triggered BugsProceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings10.1145/3639478.3647635(484-495)Online publication date: 14-Apr-2024
  • (2022)Enriching Compiler Testing with Real Program from Bug ReportProceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering10.1145/3551349.3556894(1-12)Online publication date: 10-Oct-2022
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ASPLOS XV: Proceedings of the fifteenth International Conference on Architectural support for programming languages and operating systems
March 2010
422 pages
ISBN:9781605588391
DOI:10.1145/1736020
  • General Chair:
  • James C. Hoe,
  • Program Chair:
  • Vikram S. Adve
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 45, Issue 3
    ASPLOS '10
    March 2010
    399 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/1735971
    Issue’s Table of Contents
  • cover image ACM SIGARCH Computer Architecture News
    ACM SIGARCH Computer Architecture News  Volume 38, Issue 1
    ASPLOS '10
    March 2010
    399 pages
    ISSN:0163-5964
    DOI:10.1145/1735970
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 March 2010

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. concurrency bugs
  2. execution indexing
  3. multi-core
  4. reproduction

Qualifiers

  • Research-article

Conference

ASPLOS '10

Acceptance Rates

ASPLOS XV Paper Acceptance Rate 32 of 181 submissions, 18%;
Overall Acceptance Rate 535 of 2,713 submissions, 20%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)13
  • Downloads (Last 6 weeks)3
Reflects downloads up to 23 Sep 2024

Other Metrics

Citations

Cited By

View all
  • (2024)How Well Industry-Level Cause Bisection Works in Real-World: A Study on Linux KernelCompanion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering10.1145/3663529.3663828(62-73)Online publication date: 10-Jul-2024
  • (2024)MissConf: LLM-Enhanced Reproduction of Configuration-Triggered BugsProceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings10.1145/3639478.3647635(484-495)Online publication date: 14-Apr-2024
  • (2022)Enriching Compiler Testing with Real Program from Bug ReportProceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering10.1145/3551349.3556894(1-12)Online publication date: 10-Oct-2022
  • (2020)Assessing and restoring reproducibility of Jupyter notebooksProceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering10.1145/3324884.3416585(138-149)Online publication date: 21-Dec-2020
  • (2020)RSX: Reproduction Scenario Extraction Technique for Business Application Workloads in DBMS2020 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)10.1109/ISSREW51248.2020.00043(91-96)Online publication date: Oct-2020
  • (2019)ReCDroidProceedings of the 41st International Conference on Software Engineering10.1109/ICSE.2019.00030(128-139)Online publication date: 25-May-2019
  • (2019)RENNProceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering10.1109/ASE.2019.00090(924-935)Online publication date: 10-Nov-2019
  • (2019)A benchmark-based evaluation of search-based crash reproductionEmpirical Software Engineering10.1007/s10664-019-09762-1Online publication date: 29-Aug-2019
  • (2017)Hybridizing and Relaxing Dependence Tracking for Efficient Parallel Runtime SupportACM Transactions on Parallel Computing10.1145/31081384:2(1-42)Online publication date: 30-Aug-2017
  • (2017)Reproducing concurrency failures from crash stacksProceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering10.1145/3106237.3106292(705-716)Online publication date: 21-Aug-2017
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media