DOI: 10.1145/3461648.3463856
research-article
Open access

Cache abstraction for data race detection in heterogeneous systems with non-coherent accelerators

Published: 22 June 2021

Abstract

Embedded systems are becoming increasingly complex and heterogeneous, featuring multiple processor cores (which might themselves be heterogeneous) as well as specialized hardware accelerators, all accessing shared memory. Many accelerators are non-coherent (i.e., they do not support hardware cache coherence) because non-coherence reduces hardware complexity, cost, and power consumption, while potentially offering superior performance. The disadvantage of non-coherence, however, is that software must explicitly synchronize between accelerators and processors, and this synchronization is notoriously error-prone.
We propose an analysis technique to find data races in software for heterogeneous systems that include non-coherent accelerators. Our approach builds on classical results for data race detection, but the challenge turns out to be analyzing cache behavior rather than the behavior of the non-coherent accelerators. Accordingly, our central contribution is a novel, sound (data-race-preserving) abstraction of cache behavior. We prove the abstraction sound. Then, to demonstrate its precision, we implement it in a simple dynamic race detector for a system with a processor and a massively parallel accelerator provided by a commercial FPGA-based accelerator vendor. On eleven software examples provided by the vendor, the tool reported zero false positives and detected previously unknown data races in two of the eleven examples.
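To make the kind of bug described above concrete, the following is a minimal toy sketch in Python. It is not the paper's cache abstraction or its detector; it is an illustration under simplifying assumptions: one processor with a write-back cache, one accelerator that accesses memory directly, word-granularity tracking, no spontaneous cache evictions, and a trace that is already ordered by the program's synchronization. The class name `ToyRaceDetector` and its methods are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRaceDetector:
    """Toy model (an assumption of this sketch, not the paper's method):
    a CPU with a write-back cache plus a non-coherent accelerator."""
    cached: set = field(default_factory=set)   # addresses that may sit in the CPU cache
    dirty: set = field(default_factory=set)    # CPU writes not yet written back to memory
    stale: set = field(default_factory=set)    # cached addresses the accelerator overwrote
    reports: list = field(default_factory=list)

    def cpu_write(self, addr):
        # The write lands in the cache only; memory still holds the old value.
        self.cached.add(addr)
        self.dirty.add(addr)

    def cpu_read(self, addr):
        # Reading a line the accelerator has overwritten returns stale data.
        if addr in self.stale:
            self.reports.append(("stale-read", addr))
        self.cached.add(addr)

    def cpu_flush(self, addr):
        # Modeled as write-back plus invalidate: memory and cache agree again.
        self.cached.discard(addr)
        self.dirty.discard(addr)
        self.stale.discard(addr)

    def acc_read(self, addr):
        # The accelerator bypasses the cache, so it misses unflushed CPU writes.
        if addr in self.dirty:
            self.reports.append(("unflushed-read", addr))

    def acc_write(self, addr):
        # A pending CPU write-back could later clobber the accelerator's result.
        if addr in self.dirty:
            self.reports.append(("write-conflict", addr))
        if addr in self.cached:
            self.stale.add(addr)


d = ToyRaceDetector()
d.cpu_write(0x100)
d.acc_read(0x100)    # missing flush: reported
d.cpu_flush(0x100)
d.acc_read(0x100)    # flushed first: not reported
d.cpu_read(0x200)    # line is now cached
d.acc_write(0x200)   # cached copy becomes stale
d.cpu_read(0x200)    # missing invalidate: reported
print(d.reports)
```

A real cache may evict or write back lines at times the software does not control, which this sketch ignores; handling that behavior soundly is the part the abstract identifies as the central difficulty.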


Published In

LCTES 2021: Proceedings of the 22nd ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems
June 2021
162 pages
ISBN: 9781450384728
DOI: 10.1145/3461648
  • General Chair: Jörg Henkel
  • Program Chair: Xu Liu
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Caching
  2. Data Race
  3. Hardware Accelerator
  4. Memory Coherence

Qualifiers

  • Research-article

Funding Sources

  • NSERC

Conference

LCTES '21

Acceptance Rates

Overall Acceptance Rate 116 of 438 submissions, 26%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 323
  • Downloads (Last 12 months): 41
  • Downloads (Last 6 weeks): 8
Reflects downloads up to 19 Nov 2024
