research-article

Open access

Accuracy Bugs: A New Class of Concurrency Bugs to Exploit Algorithmic Noise Tolerance

Authors:

Mohammad Majharul Islam,

Abdullah Muzahid,

Ulya R. KarpuzcuAuthors Info & Claims

ACM Transactions on Architecture and Code Optimization (TACO), Volume 13, Issue 4

Article No.: 48, Pages 1 - 24

https://doi.org/10.1145/3017991

Published: 12 December 2016 Publication History

Abstract

Parallel programming introduces notoriously difficult bugs, usually referred to as concurrency bugs. This article investigates the potential for deviating from the conventional wisdom of writing concurrency bug--free, parallel programs. It explores the benefit of accepting buggy but approximately correct parallel programs by leveraging the inherent tolerance of emerging parallel applications to inaccuracy in computations. Under algorithmic noise tolerance, a new class of concurrency bugs, accuracy bugs, degrade the accuracy of computation (often at acceptable levels) rather than causing catastrophic termination. This study demonstrates how embracing accuracy bugs affects the application output quality and performance and analyzes the impact on execution semantics.

References

[1]

Christian Bienia. 2011. Benchmarking Modern Multiprocessors. Ph.D. Dissertation. Princeton University, Princeton, NJ.

Digital Library

[2]

Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh, and Kai Li. 2008. The PARSEC benchmark suite: Characterization and architectural implications. In Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques (PACT’08).

Digital Library

[3]

Robert L. Bocchino, Jr., Vikram S. Adve, Danny Dig, Sarita V. Adve, Stephen Heumann, Rakesh Komuravelli, Jeffrey Overbey, Patrick Simmons, Hyojin Sung, and Mohsen Vakilian. 2009. A type and effect system for deterministic parallel Java. In Proceedings of the 24th ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications (OOPSLA’09).

Digital Library

[4]

Tom Britton, Lisa Jeng, Graham Carver, Paul Cheak, and Tomer Katzenellenbogen. 2013. Reversible Debugging Software. Available at https://www.jbs.cam.ac.uk/media/2013/.

[5]

Michael Carbin, Deokhwan Kim, Sasa Misailovic, and Martin C. Rinard. 2012. Proving acceptability properties of relaxed nondeterministic approximate programs. In Proceedings of the 33rd SIGPLAN Conference on Programming Language Design and Implementation (PLDI’12).

Digital Library

[6]

Trevor E. Carlson, Wim Heirman, and Lieven Eeckhout. 2011. Sniper: Exploring the level of abstraction for scalable and accurate parallel multi-core simulations. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC’11).

Digital Library

[7]

Yen-Kuang Chen, Jatin Chhugani, Pradeep Dubey, Christopher J. Hughes, Daehyun Kim, Sanjeev Kumar, Victor W. Lee, Anthony D. Nguyen, and Mikhail Smelyanskiy. 2008. Convergence of recognition, mining, and synthesis workloads and its implications. Proceedings of the IEEE 96, 5, 790--807.

[8]

Hyungmin Cho, Larkhoon Leem, and Subhasish Mitra. 2012. ERSA: Error resilient system architecture for probabilistic applications. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 31, 4, 546--558.

Digital Library

[9]

Marc de Kruijf, Shuou Nomura, and Karthikeyan Sankaralingam. 2010. Relax: An architectural framework for software recovery of hardware faults. In Proceedings of the 37th International Symposium on Computer Architecture (ISCA’10).

Digital Library

[10]

Jeffrey Dean, Greg S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc’Aurelio Ranzato, Andrew Senior, Paul Tucker, Ke Yang, and Andrew Y. Ng. 2012. Large scale distributed deep networks. In Proceedings of the 26th Annual Conference on Neural Information Processing Systems (NIPS’12).

Digital Library

[11]

Joseph Devietti, Brandon Lucia, Luis Ceze, and Mark Oskin. 2009. DMP: Deterministic shared memory multiprocessing. In Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XIV).

Digital Library

[12]

Hadi Esmaeilzadeh, Adrian Sampson, Luis Ceze, and Doug Burger. 2012. Architecture support for disciplined approximate programming. In Proceedings of the 17th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XVII).

Digital Library

[13]

Rajamohana Hegde and Naresh R. Shanbhag. 1999. Energy-efficient signal processing via algorithmic noise-tolerance. In Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED’99).

Digital Library

[14]

Alberto Leon-Garcia. 1994. Probability and Random Processes for Electrical Engineering (2nd ed.). Addison Wesley.

[15]

Xuanhua Li and Donald Yeung. 2007. Application-level correctness and its impact on fault tolerance. In Proceedings of the 13th International Symposium on High Performance Computer Architecture (HPCA’07).

Digital Library

[16]

Kwei-Jaiy Lin, Swaminathan Natarajan, and Jane W. S. Liu. 1987. Imprecise results: Utilizing partial computations in real-time systems. In Proceedings of the IEEE 8th Real-Time Systems Symposium (RTSS’87).

[17]

Peng Liu, Omer Tripp, and Charles Zhang. 2014. Grail: Context-aware fixing of concurrency bugs. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE’14).

Digital Library

[18]

Shan Lu, Soyeon Park, Chongfeng Hu, Xiao Ma, Weihang Jiang, Zhenmin Li, Raluca A. Popa, and Yuanyuan Zhou. 2007. MUVI: Automatically inferring multi-variable access correlations and detecting related semantic and concurrency bugs. In Proceedings of the 21st ACM SIGOPS Symposium on Operating Systems Principles (SOSP’07).

Digital Library

[19]

Shan Lu, Soyeon Park, Eunsoo Seo, and Yuanyuan Zhou. 2008. Learning from mistakes: A comprehensive study on real world concurrency bug characteristics. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XIII).

Digital Library

[20]

Shan Lu, Joseph Tucek, Feng Qin, and Yuanyuan Zhou. 2006. AVIO: Detecting atomicity violations via access interleaving invariants. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XII).

Digital Library

[21]

Brandon Lucia and Luis Ceze. 2009. Finding concurrency bugs with context-aware communication graphs. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’09).

Digital Library

[22]

Madanlal Musuvathi, Shaz Qadeer, Thomas Ball, Gerard Basler, Piramanayagam Arumuga Nainar, and Iulian Neamtiu. 2008. Finding and reproducing Heisenbugs in concurrent programs. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (OSDI’08).

Digital Library

[23]

Satish Narayanasamy, Zhenghao Wang, Jordan Tigani, Andrew Edwards, and Brad Calder. 2007. Automatically classifying benign and harmful data races using replay analysis. In Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’07).

Digital Library

[24]

Nicholas Nethercote. 2004. Dynamic Binary Analysis and Instrumentation. Ph.D. Dissertation. Computer Laboratory, University of Cambridge, UK.

[25]

Soyeon Park, Shan Lu, and Yuanyuan Zhou. 2009. CTrigger: Exposing atomicity violation bugs from their hiding places. In Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XIV).

Digital Library

[26]

Paul Petersen. 2011. Intel parallel inspector. In Encyclopedia of Parallel Computing, D. Padua (Ed.). Springer.

[27]

Paruj Ratanaworabhan, Martin Burtscher, Darko Kirovski, Benjamin Zorn, Rahul Nagpal, and Karthik Pattabiraman. 2012. Efficient runtime detection and toleration of asymmetric races. IEEE Transactions on Computers 61, 4, 548--562.

Digital Library

[28]

Benjamin Recht, Christopher Re, Stephen Wright, and Feng Niu. 2011. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In Advances in Neural Information Processing Systems 24. Curran Associates.

Digital Library

[29]

Lakshminarayanan Renganarayana, Vijayalakshmi Srinivasan, Ravi Nair, and Daniel Prener. 2012. Programming with relaxed synchronization. In Proceedings of the ACM Workshop on Relaxing Synchronization for Multicore and Manycore Scalability (RACES’12).

Digital Library

[30]

Martin Rinard. 2012. Unsynchronized techniques for approximate parallel computing. In Proceedings of the ACM Workshop on Relaxing Synchronization for Multicore and Manycore Scalability (RACES’12).

[31]

Martin Rinard. 2013. Parallel synchronization-free approximate data structure construction. In Proceedings of the 5th USENIX Workshop on Hot Topics in Parallelism.

Digital Library

[32]

Stefan Savage, Michael Burrows, Greg Nelson, Patrick Sobalvarro, and Thomas Anderson. 1997. Eraser: A dynamic data race detector for multithreaded programs. ACM Transactions on Computer Systems 15, 4, 391--411.

Digital Library

[33]

Koushik Sen. 2008. Race directed random testing of concurrent programs. In Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’08).

Digital Library

[34]

Konstantin Serebryany and Timur Iskhodzhanov. 2009. ThreadSanitizer: Data race detection in practice. In Proceedings of the Workshop on Binary Instrumentation and Applications (WBIA’09).

Digital Library

[35]

Naresh Shanbhag. 2002. Reliable and energy-efficient digital signal processing. In Proceedings of the 39th Design Automation Conference (DAC’02).

Digital Library

[36]

Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. 2004. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing 13, 4, 600--612.

Digital Library

[37]

Mark Weiser. 1981. Program slicing. In Proceedings of the 5th International Conference on Software Engineering (ICSE’81).

Digital Library

[38]

Vicky Wong and Mark Horowitz. 2006. Soft error resilience of probabilistic inference applications. In Proceedings of the Workshop on Silicon Errors in Logic-System Effects (SELSE’06).

[39]

Steven Cameron Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh, and Anoop Gupta. 1995. The SPLASH-2 programs: Characterization and methodological considerations. In Proceedings of the 22nd International Symposium on Computer Architecture (ISCA’05).

Digital Library

[40]

Jie Yu and Satish Narayanasamy. 2009. A case for an interleaving constrained shared-memory multi-processor. In Proceedings of the 36th Annual International Symposium on Computer Architecture (ISCA’09).

Digital Library

Cited By

Bahar RKarpuzcu UMisailovic S(2019)Special Session: Does Approximation Make Testing Harder (or Easier)?2019 IEEE 37th VLSI Test Symposium (VTS)10.1109/VTS.2019.8758649(1-9)Online publication date: Apr-2019
https://doi.org/10.1109/VTS.2019.8758649

Index Terms

Accuracy Bugs: A New Class of Concurrency Bugs to Exploit Algorithmic Noise Tolerance

Recommendations

Finding complex concurrency bugs in large multi-threaded applications
EuroSys '11: Proceedings of the sixth conference on Computer systems

Parallel software is increasingly necessary to take advantage of multi-core architectures, but it is also prone to concurrency bugs which are particularly hard to avoid, find, and fix, since their occurrence depends on specific thread interleavings. In ...
AI: a lightweight system for tolerating concurrency bugs
FSE 2014: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering

Concurrency bugs are notoriously difficult to eradicate during software testing because of their non-deterministic nature. Moreover, fixing concurrency bugs is time-consuming and error-prone. Thus, tolerating concurrency bugs during production runs is ...
Detecting Concurrency Bugs from the Perspectives of Synchronization Intentions

Concurrency bugs are among the most difficult to detect and diagnose of all software bugs. This paper combats concurrency bugs from the perspective of programmers' synchronization intentions. We first study the root causes of 74 real-world concurrency ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Architecture and Code Optimization

ACM Transactions on Architecture and Code Optimization Volume 13, Issue 4

December 2016

648 pages

ISSN:1544-3566

EISSN:1544-3973

DOI:10.1145/3012405

Editor:
Koen De Bosschere
Ghent University

Issue’s Table of Contents

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 December 2016

Accepted: 01 November 2016

Revised: 01 October 2016

Received: 01 April 2016

Published in TACO Volume 13, Issue 4

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Research
Refereed

Funding Sources

NSF

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

1
Total Citations
View Citations
354
Total Downloads

Downloads (Last 12 months)37
Downloads (Last 6 weeks)3

Reflects downloads up to 14 Dec 2024

Other Metrics

View Author Metrics

Citations

Cited By

Bahar RKarpuzcu UMisailovic S(2019)Special Session: Does Approximation Make Testing Harder (or Easier)?2019 IEEE 37th VLSI Test Symposium (VTS)10.1109/VTS.2019.8758649(1-9)Online publication date: Apr-2019
https://doi.org/10.1109/VTS.2019.8758649

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Media

Figures

Other

Tables

View Issue’s Table of Contents