research-article

Public Access

Parallelizing Sequential Graph Computations

Authors:

Chao TianAuthors Info & Claims

SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data

Pages 495 - 510

https://doi.org/10.1145/3035918.3035942

Published: 09 May 2017 Publication History

Abstract

This paper presents GRAPE, a parallel system for graph computations. GRAPE differs from prior systems in its ability to parallelize existing sequential graph algorithms as a whole. Underlying GRAPE are a simple programming model and a principled approach, based on partial evaluation and incremental computation. We show that sequential graph algorithms can be "plugged into" GRAPE with minor changes, and get parallelized. As long as the sequential algorithms are correct, their GRAPE parallelization guarantees to terminate with correct answers under a monotonic condition. Moreover, we show that algorithms in MapReduce, BSP and PRAM can be optimally simulated on GRAPE. In addition to the ease of programming, we experimentally verify that GRAPE achieves comparable performance to the state-of-the-art graph systems, using real-life and synthetic graphs.

References

[1]

Aliyun.sl https://intl.aliyun.com.

[2]

DBpedia.sl http://wiki.dbpedia.org/Datasets.

[3]

Giraph.sl http://giraph.apache.org/.

[4]

GRAPE.sl http://grapedb.io/.

[5]

Movielens.sl http://grouplens.org/datasets/movielens/.

[6]

MPICH.sl https://www.mpich.org/.

[7]

Snap.sl http://snap.stanford.edu/data/index.html.

[8]

Traffic.sl http://www.dis.uniroma1.it/challenge9/download.shtml.

[9]

U. A. Acar. Self-Adjusting Computation. PhD thesis, Carnegie Mellon University, 2005.

Digital Library

[10]

J. Bang-Jensen and G. Z. Gutin. Digraphs: Theory, Algorithms and Applications. Springer, 2008.

Digital Library

[11]

P. A. Bernstein and N. Goodman. Concurrency control in distributed database systems. ACM Comput. Surv., 13(2):185--221.

Digital Library

[12]

E. G. Boman, K. D. Devine, and S. Rajamanickam. Scalable matrix computations on large scale-free graphs using 2D graph partitioning. In SC, 2013.

Digital Library

[13]

F. Bourse, M. Lelarge, and M. Vojnovic. Balanced graph edge partition. In SIGKDD, pages 1456--1465, 2014.

Digital Library

[14]

P. Buneman, G. Cong, W. Fan, and A. Kementsietsidis. Using partial evaluation in distributed query evaluation. In VLDB, 2006.

Digital Library

[15]

E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. SICOMP, 32(5):1338--1355, 2003.

Digital Library

[16]

L. P. Cordella, P. Foggia, C. Sansone, and M. Vento. A (sub) graph isomorphism algorithm for matching large graphs. TPAMI, 26(10):1367--1372, 2004.

Digital Library

[17]

J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. Commun. ACM, 51(1), 2008.

Digital Library

[18]

W. Fan, C. Hu, and C. Tian. Incremental graph computations: Doable and undoable. In SIGMOD, 2017.

Digital Library

[19]

W. Fan, J. Li, S. Ma, N. Tang, Y. Wu, and Y. Wu. Graph pattern matching: From intractability to polynomial time. In PVLDB, 2010.

Digital Library

[20]

W. Fan, J. Li, X. Wang, and Y. Wu. Query preserving graph compression. In SIGMOD, 2012.

Digital Library

[21]

W. Fan, X. Wang, and Y. Wu. Incremental graph pattern matching. TODS, 38(3), 2013.

Digital Library

[22]

W. Fan, X. Wang, and Y. Wu. Distributed graph simulation: Impossibility and possibility. PVLDB, 7(12), 2014.

Digital Library

[23]

M. L. Fredman and R. E. Tarjan. Fibonacci heaps and their uses in improved network optimization algorithms. JACM, 34(3):596--615, 1987.

Digital Library

[24]

J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. Powergraph: Distributed graph-parallel computation on natural graphs. In USENIX, 2012.

Digital Library

[25]

J. E. Gonzalez, R. S. Xin, A. Dave, D. Crankshaw, M. J. Franklin, and I. Stoica. GraphX: Graph processing in a distributed dataflow framework. In OSDI, 2014.

Digital Library

[26]

T. J. Harris. A survey of PRAM simulation techniques. ACM Comput. Surv., 26(2):187--206, 1994.

Digital Library

[27]

M. R. Henzinger, T. Henzinger, and P. Kopke. Computing simulations on finite and infinite graphs. In FOCS, 1995.

Digital Library

[28]

N. D. Jones. An introduction to partial evaluation. ACM Computing Surveys, 28(3), 1996.

Digital Library

[29]

H. J. Karloff, S. Suri, and S. Vassilvitskii. A model of computation for MapReduce. In SODA, 2010.

Digital Library

[30]

G. Karypis and V. Kumar. METIS--unstructured graph partitioning and sparse matrix ordering system, version 2.0. Technical report, 1995.

[31]

A. Khan, Y. Wu, C. C. Aggarwal, and X. Yan. Nema: Fast graph search with label similarity. PVLDB, 6(3), 2013.

Digital Library

[32]

M. Kim and K. S. Candan. SBV-Cut: Vertex-cut based graph partitioning using structural balance vertices. Data & Knowledge Engineering, 72:285--303, 2012.

Digital Library

[33]

Y. Koren, R. Bell, C. Volinsky, et al. Matrix factorization techniques for recommender systems. Computer, 42(8):30--37, 2009.

Digital Library

[34]

Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein. Distributed GraphLab: A framework for machine learning in the cloud. PVLDB, 5(8), 2012.

Digital Library

[35]

G. Malewicz, M. H. Austern, A. J. C. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In SIGMOD, 2010.

Digital Library

[36]

T. Mytkowicz, M. Musuvathi, and W. Schulte. Data-parallel finite-state machines. In ASPLOS, 2014.

Digital Library

[37]

K. Pingali, D. Nguyen, M. Kulkarni, M. Burtscher, M. A. Hassaan, R. Kaleem, T.-H. Lee, A. Lenharth, R. Manevich, M. Méndez-Lojo, et al. The tao of parallelism in algorithms. In ACM Sigplan Notices, volume 46, pages 12--25, 2011.

Digital Library

[38]

C. Radoi, S. J. Fink, R. M. Rabbah, and M. Sridharan. Translating imperative code to MapReduce. In OOPSLA, 2014.

Digital Library

[39]

G. Ramalingam and T. Reps. An incremental algorithm for a generalization of the shortest-path problem. J. Algorithms, 21(2):267--305, 1996.

Digital Library

[40]

G. Ramalingam and T. Reps. On the computational complexity of dynamic graph problems. TCS, 158(1--2), 1996.

Digital Library

[41]

V. Raychev, M. Musuvathi, and T. Mytkowicz. Parallelizing user-defined aggregations using symbolic execution. In SOSP, 2015.

Digital Library

[42]

S. Salihoglu and J. Widom. GPS: a graph processing system. In SSDBM, 2013.

Digital Library

[43]

I. Stanton and G. Kliot. Streaming graph partitioning for large distributed graphs. In KDD, pages 1222--1230, 2012.

Digital Library

[44]

Y. Tian, A. Balmin, S. A. Corsten, and J. M. Shirish Tatikonda. From "think like a vertex" to "think like a graph". PVLDB, 7(7):193--204, 2013.

Digital Library

[45]

P. Trinder. A Functional Database. PhD thesis, University of Oxford, 1989.

Digital Library

[46]

L. G. Valiant. A bridging model for parallel computation. Commun. ACM, 33(8):103--111, 1990.

Digital Library

[47]

L. G. Valiant. General purpose parallel architectures. In Handbook of Theoretical Computer Science, Vol A. 1990.

Digital Library

[48]

J. Vinagre, A. M. Jorge, and J. Gama. Fast incremental matrix factorization for recommendation with positive-only feedback. In International Conference on User Modeling, Adaptation, and Personalization, 2014.

[49]

G. Wang, W. Xie, A. J. Demers, and J. Gehrke. Asynchronous large-scale graph processing made easy. In CIDR, 2013.

[50]

D. Yan, J. Cheng, Y. Lu, and W. Ng. Blogel: A block-centric framework for distributed computation on real-world graphs. PVLDB, 7(14):1981--1992, 2014.

Digital Library

[51]

D. Yan, J. Cheng, K. Xing, Y. Lu, W. Ng, and Y. Bu. Pregel algorithms for graph connectivity problems with performance guarantees. PVLDB, 7(14):1821--1832, 2014.

Digital Library

[52]

Y. Zhou, L. Liu, K. Lee, C. Pu, and Q. Zhang. Fast iterative graph computation with resource aware graph parallel abstractions. In HPDC, 2015.

Digital Library

Cited By

Liu XDong BFu WWu NWang XWang W(2024)Extending Graph Rules with OraclesProceedings of the VLDB Endowment10.14778/3654621.365464117:7(1775-1787)Online publication date: 1-Mar-2024
https://dl.acm.org/doi/10.14778/3654621.3654641
Yao FTao QLin SZhang YYu WGong SWang QYu GZhou J(2024)Towards Efficient Graph Processing in Geo-Distributed Data CentersIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.345387235:11(2147-2160)Online publication date: Nov-2024
https://doi.org/10.1109/TPDS.2024.3453872
Jiang JChoi BHuang XXu JBhowmick S(2024)DKWS: A Distributed System for Keyword Search on Massive GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.3313726(1-16)Online publication date: 2024
https://doi.org/10.1109/TKDE.2023.3313726
Show More Cited By

Index Terms

Parallelizing Sequential Graph Computations
1. Information systems
  1. Data management systems
    1. Database management system engines
      1. Parallel and distributed DBMSs

Recommendations

Parallelizing Sequential Graph Computations
Best of SIGMOD 2017 Papers

This article presents GRAPE, a parallel <underline>GRAP</underline>h <underline>E</underline>ngine for graph computations. GRAPE differs from prior systems in its ability to parallelize existing sequential graph algorithms as a whole, without the need ...
Parallelizing Subroutines in Sequential Programs

An algorithm for making sequential programs parallel is described, which first identifies all subroutines, then determines the appropriate execution mode and restructures the code. It works recursively to parallelize the entire program. We use Fortran ...
A communication-reduced and computation-balanced framework for fast graph computation

The bulk synchronous parallel (BSP) model is very user friendly for coding and debugging parallel graph algorithms. However, existing BSP-based distributed graph-processing frameworks, such as Pregel, GPS and Giraph, routinely suffer from high ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data

May 2017

1810 pages

ISBN:9781450341974

DOI:10.1145/3035918

General Chairs:
Rada Chirkova
North Carolina State University, USA
,
Jun Yang
Duke University, USA
,
Program Chair:
Dan Suciu
University of Washington, USA

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 May 2017

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Badges

Best Paper

Author Tags

Qualifiers

Research-article

Funding Sources

973 Program
the Foundation for Innovative Research Groups of NSFC
Shenzhen Peacock Program
Guangdong Innovative Research Team Program
European Research Council
National Science Foundation
National Natural Science Foundation of China
Engineering and Physical Sciences Research Council
Beijing Advanced Innovation Center for Big Data and Brain Computing
google

Conference

SIGMOD/PODS'17

Sponsor:

SIGMOD

SIGMOD/PODS'17: International Conference on Management of Data

May 14 - 19, 2017

Illinois, Chicago, USA

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

54
Total Citations
View Citations
4,097
Total Downloads

Downloads (Last 12 months)339
Downloads (Last 6 weeks)56

Reflects downloads up to 14 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

Liu XDong BFu WWu NWang XWang W(2024)Extending Graph Rules with OraclesProceedings of the VLDB Endowment10.14778/3654621.365464117:7(1775-1787)Online publication date: 1-Mar-2024
https://dl.acm.org/doi/10.14778/3654621.3654641
Yao FTao QLin SZhang YYu WGong SWang QYu GZhou J(2024)Towards Efficient Graph Processing in Geo-Distributed Data CentersIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.345387235:11(2147-2160)Online publication date: Nov-2024
https://doi.org/10.1109/TPDS.2024.3453872
Jiang JChoi BHuang XXu JBhowmick S(2024)DKWS: A Distributed System for Keyword Search on Massive GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.3313726(1-16)Online publication date: 2024
https://doi.org/10.1109/TKDE.2023.3313726
Xu RWang YXiao X(2024)Graph Computation with Adaptive Granularity2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00169(2123-2136)Online publication date: 13-May-2024
https://doi.org/10.1109/ICDE60146.2024.00169
Xia YZhang FXu QZhang MYao ZLu LDu XDeng DHe BMa S(2024)GPU-based butterfly countingThe VLDB Journal10.1007/s00778-024-00861-033:5(1543-1567)Online publication date: 27-Jun-2024
https://doi.org/10.1007/s00778-024-00861-0
Gong STian CYin QWang ZYu SZhang YYu WGeng LFu CYu GZhou J(2024)Ingress: an automated incremental graph processing systemThe VLDB Journal10.1007/s00778-024-00838-z33:3(781-806)Online publication date: 20-Feb-2024
https://doi.org/10.1007/s00778-024-00838-z
Zhang SJiang ZHou XLi MYuan MYou H(2023)DRONE: An Efficient Distributed Subgraph-Centric Framework for Processing Large-Scale Power-law GraphsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2022.322306834:2(463-474)Online publication date: 1-Feb-2023
https://doi.org/10.1109/TPDS.2022.3223068
Yu SGong SZhang YYu WYin QTian CTao QYan YYu GZhou J(2023)Layph: Making Change Propagation Constraint in Incremental Graph Processing by Layering Graph2023 IEEE 39th International Conference on Data Engineering (ICDE)10.1109/ICDE55515.2023.00212(2766-2779)Online publication date: Apr-2023
https://doi.org/10.1109/ICDE55515.2023.00212
Ahmad AYuan LYan DGuo GChen JZhang C(2023)Accelerating k-Core Decomposition by a GPU2023 IEEE 39th International Conference on Data Engineering (ICDE)10.1109/ICDE55515.2023.00142(1818-1831)Online publication date: Apr-2023
https://doi.org/10.1109/ICDE55515.2023.00142
Peng JChen ZShao YShen YChen LCao J(2022)SancusProceedings of the VLDB Endowment10.14778/3538598.353861415:9(1937-1950)Online publication date: 27-Jul-2022
https://dl.acm.org/doi/10.14778/3538598.3538614
Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Media

Figures

Other

Tables

View Table of Contents