ConMem: detecting severe concurrency bugs through an effect-oriented approach

W Zhang, C Sun, S Lu - ACM Sigplan Notices, 2010 - dl.acm.org
Multicore technology is making concurrent programs increasingly pervasive. Unfortunately, it is difficult to deliver reliable concurrent programs, because of the huge and non-deterministic interleaving space. In reality, without the resources to thoroughly check the interleaving space, critical concurrency bugs can slip into production runs and cause failures in the field. Approaches to making the best use of the limited resources and exposing severe concurrency bugs before software release would be desirable.
Unlike previous work that focuses on bugs caused by specific interleavings (e.g., races and atomicity violations), this paper targets concurrency bugs that result in one type of severe effect: program crashes. Our study of the error-propagation process of real-world concurrency bugs reveals a common pattern (50% of our non-deadlock concurrency bug set) that is highly correlated with program crashes. We call this pattern concurrency-memory bugs: buggy interleavings directly cause memory bugs (NULL-pointer dereference, dangling pointer, buffer overflow, uninitialized read) on shared memory objects.
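The pattern described above can be illustrated with a small, deterministic simulation. This is only a sketch and is not taken from the paper: the step names (`t1_check`, `t1_use`, `t2_clear`) and the `run` helper are hypothetical, and a Python `None` stands in for a NULL pointer. The two logical threads are replayed under an explicit schedule, so the same code is harmless under one interleaving and crashes under another.

```python
def run(schedule):
    """Replay the steps of two logical threads in the given order.

    Returns "ok" if the schedule completes and "crash" if the shared
    object is dereferenced after it has been cleared.
    """
    shared = {"ptr": [1, 2, 3]}      # shared object, initially valid
    state = {"checked": False}       # thread 1's local view of the check

    def t1_check():                  # thread 1: test the pointer
        state["checked"] = shared["ptr"] is not None

    def t1_use():                    # thread 1: use it if the test passed
        if state["checked"]:
            shared["ptr"].append(4)  # AttributeError if ptr is now None

    def t2_clear():                  # thread 2: tear the object down
        shared["ptr"] = None

    steps = {"t1_check": t1_check, "t1_use": t1_use, "t2_clear": t2_clear}
    try:
        for name in schedule:
            steps[name]()
    except AttributeError:
        return "crash"
    return "ok"


benign = run(["t1_check", "t1_use", "t2_clear"])   # use happens before clear
buggy = run(["t1_check", "t2_clear", "t1_use"])    # clear lands between check and use
```

Only the second schedule crashes: the interleaving itself turns an ordinary shared-pointer access into a NULL-pointer dereference, which is the effect the abstract singles out.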
Guided by this study, we built ConMem to monitor program execution, analyze memory accesses and synchronizations, and predictively detect these common and severe concurrency-memory bugs. We also built a validator, ConMem-v, to automatically prune false positives by enforcing potential bug-triggering interleavings.
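To make the idea of predictive detection concrete, the following is a minimal sketch under stated assumptions, not ConMem's actual algorithm. It takes a trace recorded from a run that did not crash and flags a dereference and a later NULL write on the same variable, by different threads, when no synchronization orders the dereference first. The `Event` record, the operation names (`deref`, `write_null`, `signal`, `wait`), and the representation of synchronization as explicit index pairs are all hypothetical.

```python
from collections import namedtuple

# Hypothetical trace record: which thread did what to which variable.
Event = namedtuple("Event", "thread op var")


def make_reaches(trace, sync_edges):
    """Build a happens-before reachability test from program order
    plus explicit synchronization edges (pairs of trace indices)."""
    succ = {i: set() for i in range(len(trace))}
    last = {}
    for i, e in enumerate(trace):
        if e.thread in last:             # program order within a thread
            succ[last[e.thread]].add(i)
        last[e.thread] = i
    for a, b in sync_edges:              # e.g. a signal paired with its wait
        succ[a].add(b)

    def reaches(a, b):
        stack, seen = [a], set()
        while stack:
            x = stack.pop()
            if x == b:
                return True
            if x not in seen:
                seen.add(x)
                stack.extend(succ[x])
        return False

    return reaches


def null_deref_candidates(trace, sync_edges=()):
    """Return (deref_index, write_index) pairs that could be reordered
    into a NULL-pointer dereference."""
    reaches = make_reaches(trace, sync_edges)
    found = []
    for i, d in enumerate(trace):
        if d.op != "deref":
            continue
        for j in range(i + 1, len(trace)):
            w = trace[j]
            if (w.op == "write_null" and w.var == d.var
                    and w.thread != d.thread and not reaches(i, j)):
                found.append((i, j))
    return found


trace = [
    Event("T1", "deref", "p"),
    Event("T1", "signal", "c"),
    Event("T2", "wait", "c"),
    Event("T2", "write_null", "p"),
]
unordered = null_deref_candidates(trace)                     # no sync known
ordered = null_deref_candidates(trace, sync_edges=[(1, 2)])  # signal -> wait
```

When the signal/wait pair is ignored, the dereference and the NULL write are unordered and the pair is reported. Once that edge is supplied, the dereference is guaranteed to come first and nothing is flagged. A validator in the spirit of ConMem-v would then try to force each reported pair into the crashing order to confirm it.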
We evaluated ConMem using 7 open-source programs with 9 real-world severe concurrency bugs. ConMem detects more of the tested bugs (8 out of 9) than a lock-set-based race detector and an unserializable-interleaving detector, which detect 4 and 5 bugs respectively, with a false-positive rate about one tenth that of the compared tools. ConMem-v further prunes out all the false positives. ConMem's overhead is reasonable and suitable for use during development.