Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3236024.3275438acmconferencesArticle/Chapter ViewAbstractPublication PagesfseConference Proceedingsconference-collections
abstract

Dara: hybrid model checking of distributed systems

Published: 26 October 2018 Publication History

Abstract

Building correct implementations of distributed systems continues to elude us. Solutions consist of abstract modeling languages such as TLA+, PLusCal, which specify models of systems and tools like Coq, and SPIN which verify correctness of models but require considerable amount of effort, or transparent model checkers like MODIST, CMC and CHESS which suffer from state space explosion, rendering them impractical to use as they are too slow.
We propose Dara, a novel hybrid technique that combines the speed of abstract model checkers with the correctness and ease-of-use of transparent model checkers. Dara utilizes tests as well as a transparent model checker to generate logs from real executions of the system. The generated logs are analyzed to infer a model of the system which is model-checked by SPIN to verify user-provided invariants. Invariant violations are reported as likely bug traces. These traces are then passed to a replay engine which tries to replay the traces as real executions of the system to remove false positives. We are currently evaluating Dara's efficiency and usability.

References

[1]
Cyrille Artho, Quentin Gros, Guillaume Rousset, Kazuaki Banzai, Lei Ma, Takashi Kitamura, Masami Hagiya, Yoshinori Tanabe, and Mitsuharu Yamamoto. 2017.
[2]
Model-based API testing of Apache ZooKeeper. In Software Testing, Verification and Validation (ICST), 2017 IEEE International Conference on. IEEE, 288–298.
[3]
Fletcher Babb. 2015. AmazonâĂŹs AWS DynamoDB Experiences Outage, Affecting Netflix, Reddit, Medium, and More. en-US. https://venturebeat.com/2015/09/20/amazons-aws-outage-takes-down-netflixreddit-medium-and-more/.
[4]
Ivan Beschastnikh, Yuriy Brun, Michael D Ernst, and Arvind Krishnamurthy. 2014. Inferring models of concurrent systems from logs of their behavior with CSight. In Proceedings of the 36th International Conference on Software Engineering. ACM, 468–479.
[5]
btcsuite. 2013. an alternative full node bitcoin implementation. https://github.com/btcsuite/btcd.
[6]
Coreos. 2013.
[7]
Distributed reliable key-value store. https://github.com/coreos/etcd.
[8]
Pantazis Deligiannis, Matt McCutchen, Paul Thomson, Shuo Chen, Alastair F Donaldson, John Erickson, Cheng Huang, Akash Lal, Rashmi Mudduluru, Shaz Qadeer, et al. 2016. Uncovering Bugs in Distributed Storage Systems during Testing (Not in Production!). In FAST. 249–262.
[9]
Michael D. Ernst, Jeff H. Perkins, Philip J. Guo, Stephen McCamant, Carlos Pacheco, Matthew S. Tschantz, and Chen Xiao. 2007. The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program. (2007).
[10]
Cormac Flanagan and Patrice Godefroid. 2005. Dynamic partial-order reduction for model checking software. In ACM Sigplan Notices, Vol. 40. ACM, 110–121.
[11]
Stewart Grant, Hendrik Cech, and Ivan Beschastnikh. 2018. Inferring and Asserting Distributed System Invariants. In Proceedings of the 40th International Conference on Software Engineering (ICSE ’18). ACM, New York, NY, USA, 1149– 1159.
[12]
Rachid Guerraoui and Maysam Yabandeh. 2011. Model Checking a Networked System Without the Network. In Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation (NSDI’11). USENIX Association, Berkeley, CA, USA, 225–238. http://dl.acm.org/citation.cfm?id=1972457.1972481
[13]
Huayang Guo, Ming Wu, Lidong Zhou, Gang Hu, Junfeng Yang, and Lintao Zhang. 2011. Practical Software Model Checking via Dynamic Interface Reduction. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles (SOSP ’11). ACM, New York, NY, USA, 265–278.
[14]
[15]
Chris Hawblitzel, Jon Howell, Manos Kapritsos, Jacob R. Lorch, Bryan Parno, Michael L. Roberts, Srinath Setty, and Brian Zill. {n. d.}. IronFleet: Proving Practical Distributed Systems Correct. In SOSP 2015. 17. 2815400.2815428
[16]
Gerard J. Holzmann. 1997. The Model Checker SPIN. IEEE Trans. Softw. Eng. 23, 5 (May 1997), 279–295.
[17]
Charles Killian, James W. Anderson, Ranjit Jhala, and Amin Vahdat. 2007. Life, death, and the critical transition: finding liveness bugs in systems code. In Networked Systems Design and Implementation (NSDI). Cambridge, MA, USA.
[18]
Sandeep Kumar, Siau-Cheng Khoo, Abhik Roychoudhury, and David Lo. 2012. Inferring class level specifications for distributed systems. In Proceedings of the 34th International Conference on Software Engineering. IEEE Press, 914–924.
[19]
François Laroussinie and Kim G Larsen. 1998. CMC: A tool for compositional model-checking of real-time systems. In Formal Description Techniques and Protocol Specification, Testing and Verification. Springer, 439–456.
[20]
Tanakorn Leesatapornwongsa, Mingzhe Hao, Pallavi Joshi, Jeffrey F Lukman, and Haryadi S Gunawi. 2014. SAMC: Semantic-Aware Model Checking for Fast Discovery of Deep Bugs in Cloud Systems. In OSDI. 399–414.
[21]
Friedemann Mattern. 1989. Virtual Time and Global States of Distributed Systems. In Parallel and Distributed Algorithms. 215–226.
[22]
Madanlal Musuvathi, Shaz Qadeer, Thomas Ball, Gerard Basler, Piramanayagam Arumuga Nainar, and Iulian Neamtiu. 2008. Finding and Reproducing Heisenbugs in Concurrent Programs. In OSDI, Vol. 8. 267–280.
[23]
Chris Newcombe, Tim Rath, Fan Zhang, Bogdan Munteanu, Marc Brooker, and Michael Deardeuff. 2015. How Amazon Web Services Uses Formal Methods. Commun. ACM 58, 4 (March 2015), 66–73.
[24]
Diego Ongaro and John K Ousterhout. 2014. In search of an understandable consensus algorithm. In USENIX Annual Technical Conference. 305–319.
[25]
Shannon Vavra. 2017. Amazon outage cost S&P 500 companies $150M. https://www.axios.com/amazon-outage-cost-sp-500-companies-150m- 1513300728-aaff3a9e-d5de-4700-aded-55614cf7852c.html.
[26]
James R. Wilcox, Doug Woos, Pavel Panchekha, Zachary Tatlock, Xi Wang, Michael D. Ernst, and Thomas Anderson. {n. d.}. Verdi: A Framework for Implementing and Formally Verifying Distributed Systems. In PLDI 2015.
[27]
Maysam Yabandeh, Nikola Knezevic, Dejan Kostic, and Viktor Kuncak. 2009. CrystalBall: Predicting and preventing inconsistencies in deployed distributed systems. In The 6th USENIX Symposium on Networked Systems Design and Implementation (NSDIâĂŹ09).
[28]
Junfeng Yang, Tisheng Chen, Ming Wu, Zhilei Xu, Xuezheng Liu, Haoxiang Lin, Mao Yang, Fan Long, Lintao Zhang, and Lidong Zhou. {n. d.}. MODIST: Transparent Model Checking of Unmodified Distributed Systems. In NSDI 2009.
[29]
USENIX Association, 16. http://dl.acm.org/citation.cfm?id=1558977.1558992 Abstract 1 BACKGROUND & MOTIVATION 2 APPROACH 3 EVALUATION & FUTURE WORK 4 CONCLUSION REFERENCES

Cited By

View all
  • (2022)Common Data Guided Crash Injection for Cloud Systems2022 IEEE/ACM 44th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion)10.1109/ICSE-Companion55297.2022.9793803(36-40)Online publication date: May-2022

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ESEC/FSE 2018: Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
October 2018
987 pages
ISBN:9781450355735
DOI:10.1145/3236024
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 October 2018

Check for updates

Author Tags

  1. Distributed Systems
  2. Model Checking

Qualifiers

  • Abstract

Conference

ESEC/FSE '18
Sponsor:

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)26
  • Downloads (Last 6 weeks)4
Reflects downloads up to 27 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2022)Common Data Guided Crash Injection for Cloud Systems2022 IEEE/ACM 44th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion)10.1109/ICSE-Companion55297.2022.9793803(36-40)Online publication date: May-2022

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media