Nothing Special   »   [go: up one dir, main page]

Skip to main content

Relational Equivalence Proofs Between Imperative and MapReduce Algorithms

  • Conference paper
  • First Online:
Verified Software. Theories, Tools, and Experiments (VSTTE 2018)

Abstract

Distributed programming frameworks like MapReduce, Spark and Thrill, are widely used for the implementation of algorithms operating on large datasets. However, implementing in these frameworks is more demanding than coming up with sequential implementations. One way to achieve correctness of an optimized implementation is by deriving it from an existing imperative sequential algorithm description through a sequence of behavior-preserving transformations.

We present a novel approach for proving equivalence between imperative and deterministic MapReduce algorithms based on partitioning the equivalence proof into a sequence of equivalence proofs between intermediate programs with smaller differences. Our approach is based on the insight that proofs are best conducted using a combination of two kinds of steps: (1) uniform context-independent rewriting transformations; and (2) context-dependent flexible transformations that can be proved using relational reasoning with coupling invariants.

We demonstrate the feasibility of our approach by evaluating it on two prototypical algorithms commonly used as examples in MapReduce frameworks: k-means and PageRank. To carry out the proofs, we use a higher-order theorem prover with partial proof automation. The results show that our approach and its prototypical implementation enable equivalence proofs of non-trivial algorithms and could be automated to a large degree.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Barthe, G., Crespo, J.M., Kunz, C.: Relational verification using product programs. In: Butler, M., Schulte, W. (eds.) FM 2011. LNCS, vol. 6664, pp. 200–214. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-21437-0_17

    Chapter  Google Scholar 

  2. Beckert, B., Bingmann, T., Kiefer, M., Sanders, P., Ulbrich, M., Weigl, A.: Relational Equivalence Proofs Between Imperative and MapReduce Algorithms. ArXiv e-prints, January 2018. arXiv:1801.08766

  3. Beckert, B., Bingmann, T., Kiefer, M., Sanders, P., Ulbrich, M., Weigl, A.: Proving equivalence between imperative and mapreduce implementations using program transformations. In: Third Workshop Models for Formal Analysis of Real Systems and Sixth International Workshop on Verification and Program Transformation. Electronic Proceedings in Theoretical Computer Science, vol. 268, pp. 185–199. Open Publishing Association (2018)

    Google Scholar 

  4. Bingmann, T., et al.: Thrill: high-performance algorithmic distributed batch data processing with C++. In: IEEE International Conference on Big Data, pp. 172–183. IEEE, December 2016. preprint arXiv:1608.05634

  5. Brin, S., Page, L.: The anatomy of a large-scale hypertextual web search engine. Comput. Netw. ISDN Syst. 30(1–7), 107–117 (1998). https://doi.org/10.1016/S0169-7552(98)00110-X

    Article  Google Scholar 

  6. Chambers, C., et al.: FlumeJava: easy, efficient data-parallel pipelines. ACM SIGPLAN Notices 45(6), 363–375 (2010)

    Article  Google Scholar 

  7. Chen, Y.F., Hong, C.D., Lengál, O., Mu, S.C., Sinha, N., Wang, B.Y.: An Executable Sequential Specification for Spark Aggregation (2017). arXiv:1702.02439

  8. Chen, Y.-F., Hong, C.-D., Sinha, N., Wang, B.-Y.: Commutativity of reducers. In: Baier, C., Tinelli, C. (eds.) TACAS 2015. LNCS, vol. 9035, pp. 131–146. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46681-0_9

    Chapter  Google Scholar 

  9. Chen, Y., Song, L., Wu, Z.: The Commutativity Problem of the MapReduce Framework: A Transducer-based Approach. CoRR abs/1605.01497 (2016). arXiv:1605.01497

  10. Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2008)

    Article  Google Scholar 

  11. Elenbogen, D., Katz, S., Strichman, O.: Proving mutual termination. Form. Methods Syst. Des. 47(2), 204–229 (2015). https://doi.org/10.1007/s10703-015-0234-3

    Article  MATH  Google Scholar 

  12. Felsing, D., Grebing, S., Klebanov, V., Rümmer, P., Ulbrich, M.: Automating regression verification. In: Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering, pp. 349–360. ASE 2014. ACM, New York, NY, USA (2014)

    Google Scholar 

  13. Godlin, B., Strichman, O.: Regression verification. In: Proceedings of the 46th Annual Design Automation Conference, pp. 466–471. DAC 2009. ACM, New York, NY, USA (2009)

    Google Scholar 

  14. Grossman, S., Cohen, S., Itzhaky, S., Rinetzky, N., Sagiv, M.: Verifying equivalence of spark programs. In: Majumdar, R., Kunčak, V. (eds.) CAV 2017. LNCS, vol. 10427, pp. 282–300. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63390-9_15

    Chapter  Google Scholar 

  15. Hawblitzel, C., Kawaguchi, M., Lahiri, S., Rebêlo, H.: Mutual summaries: unifying program comparison techniques. In: Informal proceedings of BOOGIE 2011 workshop (2011). https://www.microsoft.com/en-us/research/publication/mutual-summaries-unifying-program-comparison-techniques/

  16. Kahn, G.: Natural semantics. STACS 87, 22–39 (1987)

    MathSciNet  MATH  Google Scholar 

  17. Kiefer, M., Klebanov, V., Ulbrich, M.: Relational program reasoning using compiler IR - combining static verification and dynamic analysis. J. Autom. Reason. (2017)

    Google Scholar 

  18. Klebanov, V., Rümmer, P., Ulbrich, M.: Automating regression verification of pointer programs by predicate abstraction. J. Formal Methods Syst. Des. 52, 229–259 (2017)

    Article  Google Scholar 

  19. Lahiri, S.K., Hawblitzel, C., Kawaguchi, M., Rebêlo, H.: SYMDIFF: a language-agnostic semantic diff tool for imperative programs. In: Madhusudan, P., Seshia, S.A. (eds.) CAV 2012. LNCS, vol. 7358, pp. 712–717. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31424-7_54

    Chapter  Google Scholar 

  20. Lloyd, S.: Least squares quantization in PCM. IEEE Trans. Inf. Theor. 28(2), 129–137 (1982). https://doi.org/10.1109/TIT.1982.1056489

    Article  MathSciNet  MATH  Google Scholar 

  21. Radoi, C., Fink, S.J., Rabbah, R., Sridharan, M.: Translating Imperative Code to MapReduce. SIGPLAN Not. 49(10), 909–927 (2014)

    Article  Google Scholar 

  22. The Coq development team: The Coq proof assistant reference manual. LogiCal Project, version 8.6 (2004). http://coq.inria.fr

  23. Verdoolaege, S., Janssens, G., Bruynooghe, M.: Equivalence checking of static affine programs using widening to handle recurrences. ACM Trans. Program. Lang. Syst. 34(3), 11:1–11:35 (2012). https://doi.org/10.1145/2362389.2362390

    Article  MATH  Google Scholar 

  24. White, T.: Hadoop: The Definitive Guide. O’Reilly Media Inc., Sebastopol (2012)

    Google Scholar 

  25. Zaharia, M., Chowdhury, M., Franklin, M.J., Shenker, S., Stoica, I.: Spark: cluster computing with working sets. In: Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing, pp. 10–10. HotCloud 2010, USENIX Association, Berkeley, CA, USA (2010). http://dl.acm.org/citation.cfm?id=1863103.1863113

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Alexander Weigl .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Beckert, B., Bingmann, T., Kiefer, M., Sanders, P., Ulbrich, M., Weigl, A. (2018). Relational Equivalence Proofs Between Imperative and MapReduce Algorithms. In: Piskac, R., Rümmer, P. (eds) Verified Software. Theories, Tools, and Experiments. VSTTE 2018. Lecture Notes in Computer Science(), vol 11294. Springer, Cham. https://doi.org/10.1007/978-3-030-03592-1_14

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-03592-1_14

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-03591-4

  • Online ISBN: 978-3-030-03592-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics