Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3035918.3035942acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article
Public Access

Parallelizing Sequential Graph Computations

Published: 09 May 2017 Publication History

Abstract

This paper presents GRAPE, a parallel system for graph computations. GRAPE differs from prior systems in its ability to parallelize existing sequential graph algorithms as a whole. Underlying GRAPE are a simple programming model and a principled approach, based on partial evaluation and incremental computation. We show that sequential graph algorithms can be "plugged into" GRAPE with minor changes, and get parallelized. As long as the sequential algorithms are correct, their GRAPE parallelization guarantees to terminate with correct answers under a monotonic condition. Moreover, we show that algorithms in MapReduce, BSP and PRAM can be optimally simulated on GRAPE. In addition to the ease of programming, we experimentally verify that GRAPE achieves comparable performance to the state-of-the-art graph systems, using real-life and synthetic graphs.

References

[1]
Aliyun.sl https://intl.aliyun.com.
[2]
DBpedia.sl http://wiki.dbpedia.org/Datasets.
[3]
Giraph.sl http://giraph.apache.org/.
[4]
GRAPE.sl http://grapedb.io/.
[5]
Movielens.sl http://grouplens.org/datasets/movielens/.
[6]
MPICH.sl https://www.mpich.org/.
[7]
Snap.sl http://snap.stanford.edu/data/index.html.
[8]
Traffic.sl http://www.dis.uniroma1.it/challenge9/download.shtml.
[9]
U. A. Acar. Self-Adjusting Computation. PhD thesis, Carnegie Mellon University, 2005.
[10]
J. Bang-Jensen and G. Z. Gutin. Digraphs: Theory, Algorithms and Applications. Springer, 2008.
[11]
P. A. Bernstein and N. Goodman. Concurrency control in distributed database systems. ACM Comput. Surv., 13(2):185--221.
[12]
E. G. Boman, K. D. Devine, and S. Rajamanickam. Scalable matrix computations on large scale-free graphs using 2D graph partitioning. In SC, 2013.
[13]
F. Bourse, M. Lelarge, and M. Vojnovic. Balanced graph edge partition. In SIGKDD, pages 1456--1465, 2014.
[14]
P. Buneman, G. Cong, W. Fan, and A. Kementsietsidis. Using partial evaluation in distributed query evaluation. In VLDB, 2006.
[15]
E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and distance queries via 2-hop labels. SICOMP, 32(5):1338--1355, 2003.
[16]
L. P. Cordella, P. Foggia, C. Sansone, and M. Vento. A (sub) graph isomorphism algorithm for matching large graphs. TPAMI, 26(10):1367--1372, 2004.
[17]
J. Dean and S. Ghemawat. MapReduce: simplified data processing on large clusters. Commun. ACM, 51(1), 2008.
[18]
W. Fan, C. Hu, and C. Tian. Incremental graph computations: Doable and undoable. In SIGMOD, 2017.
[19]
W. Fan, J. Li, S. Ma, N. Tang, Y. Wu, and Y. Wu. Graph pattern matching: From intractability to polynomial time. In PVLDB, 2010.
[20]
W. Fan, J. Li, X. Wang, and Y. Wu. Query preserving graph compression. In SIGMOD, 2012.
[21]
W. Fan, X. Wang, and Y. Wu. Incremental graph pattern matching. TODS, 38(3), 2013.
[22]
W. Fan, X. Wang, and Y. Wu. Distributed graph simulation: Impossibility and possibility. PVLDB, 7(12), 2014.
[23]
M. L. Fredman and R. E. Tarjan. Fibonacci heaps and their uses in improved network optimization algorithms. JACM, 34(3):596--615, 1987.
[24]
J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. Powergraph: Distributed graph-parallel computation on natural graphs. In USENIX, 2012.
[25]
J. E. Gonzalez, R. S. Xin, A. Dave, D. Crankshaw, M. J. Franklin, and I. Stoica. GraphX: Graph processing in a distributed dataflow framework. In OSDI, 2014.
[26]
T. J. Harris. A survey of PRAM simulation techniques. ACM Comput. Surv., 26(2):187--206, 1994.
[27]
M. R. Henzinger, T. Henzinger, and P. Kopke. Computing simulations on finite and infinite graphs. In FOCS, 1995.
[28]
N. D. Jones. An introduction to partial evaluation. ACM Computing Surveys, 28(3), 1996.
[29]
H. J. Karloff, S. Suri, and S. Vassilvitskii. A model of computation for MapReduce. In SODA, 2010.
[30]
G. Karypis and V. Kumar. METIS--unstructured graph partitioning and sparse matrix ordering system, version 2.0. Technical report, 1995.
[31]
A. Khan, Y. Wu, C. C. Aggarwal, and X. Yan. Nema: Fast graph search with label similarity. PVLDB, 6(3), 2013.
[32]
M. Kim and K. S. Candan. SBV-Cut: Vertex-cut based graph partitioning using structural balance vertices. Data & Knowledge Engineering, 72:285--303, 2012.
[33]
Y. Koren, R. Bell, C. Volinsky, et al. Matrix factorization techniques for recommender systems. Computer, 42(8):30--37, 2009.
[34]
Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein. Distributed GraphLab: A framework for machine learning in the cloud. PVLDB, 5(8), 2012.
[35]
G. Malewicz, M. H. Austern, A. J. C. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: a system for large-scale graph processing. In SIGMOD, 2010.
[36]
T. Mytkowicz, M. Musuvathi, and W. Schulte. Data-parallel finite-state machines. In ASPLOS, 2014.
[37]
K. Pingali, D. Nguyen, M. Kulkarni, M. Burtscher, M. A. Hassaan, R. Kaleem, T.-H. Lee, A. Lenharth, R. Manevich, M. Méndez-Lojo, et al. The tao of parallelism in algorithms. In ACM Sigplan Notices, volume 46, pages 12--25, 2011.
[38]
C. Radoi, S. J. Fink, R. M. Rabbah, and M. Sridharan. Translating imperative code to MapReduce. In OOPSLA, 2014.
[39]
G. Ramalingam and T. Reps. An incremental algorithm for a generalization of the shortest-path problem. J. Algorithms, 21(2):267--305, 1996.
[40]
G. Ramalingam and T. Reps. On the computational complexity of dynamic graph problems. TCS, 158(1--2), 1996.
[41]
V. Raychev, M. Musuvathi, and T. Mytkowicz. Parallelizing user-defined aggregations using symbolic execution. In SOSP, 2015.
[42]
S. Salihoglu and J. Widom. GPS: a graph processing system. In SSDBM, 2013.
[43]
I. Stanton and G. Kliot. Streaming graph partitioning for large distributed graphs. In KDD, pages 1222--1230, 2012.
[44]
Y. Tian, A. Balmin, S. A. Corsten, and J. M. Shirish Tatikonda. From "think like a vertex" to "think like a graph". PVLDB, 7(7):193--204, 2013.
[45]
P. Trinder. A Functional Database. PhD thesis, University of Oxford, 1989.
[46]
L. G. Valiant. A bridging model for parallel computation. Commun. ACM, 33(8):103--111, 1990.
[47]
L. G. Valiant. General purpose parallel architectures. In Handbook of Theoretical Computer Science, Vol A. 1990.
[48]
J. Vinagre, A. M. Jorge, and J. Gama. Fast incremental matrix factorization for recommendation with positive-only feedback. In International Conference on User Modeling, Adaptation, and Personalization, 2014.
[49]
G. Wang, W. Xie, A. J. Demers, and J. Gehrke. Asynchronous large-scale graph processing made easy. In CIDR, 2013.
[50]
D. Yan, J. Cheng, Y. Lu, and W. Ng. Blogel: A block-centric framework for distributed computation on real-world graphs. PVLDB, 7(14):1981--1992, 2014.
[51]
D. Yan, J. Cheng, K. Xing, Y. Lu, W. Ng, and Y. Bu. Pregel algorithms for graph connectivity problems with performance guarantees. PVLDB, 7(14):1821--1832, 2014.
[52]
Y. Zhou, L. Liu, K. Lee, C. Pu, and Q. Zhang. Fast iterative graph computation with resource aware graph parallel abstractions. In HPDC, 2015.

Cited By

View all
  • (2024)Extending Graph Rules with OraclesProceedings of the VLDB Endowment10.14778/3654621.365464117:7(1775-1787)Online publication date: 1-Mar-2024
  • (2024)Towards Efficient Graph Processing in Geo-Distributed Data CentersIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.345387235:11(2147-2160)Online publication date: Nov-2024
  • (2024)DKWS: A Distributed System for Keyword Search on Massive GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.3313726(1-16)Online publication date: 2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data
May 2017
1810 pages
ISBN:9781450341974
DOI:10.1145/3035918
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 May 2017

Permissions

Request permissions for this article.

Check for updates

Badges

  • Best Paper

Author Tags

  1. graph computation
  2. in-cremental evaluation
  3. parallel model
  4. partial evaluation
  5. scalability

Qualifiers

  • Research-article

Funding Sources

Conference

SIGMOD/PODS'17
Sponsor:

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)339
  • Downloads (Last 6 weeks)56
Reflects downloads up to 14 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Extending Graph Rules with OraclesProceedings of the VLDB Endowment10.14778/3654621.365464117:7(1775-1787)Online publication date: 1-Mar-2024
  • (2024)Towards Efficient Graph Processing in Geo-Distributed Data CentersIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.345387235:11(2147-2160)Online publication date: Nov-2024
  • (2024)DKWS: A Distributed System for Keyword Search on Massive GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2023.3313726(1-16)Online publication date: 2024
  • (2024)Graph Computation with Adaptive Granularity2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00169(2123-2136)Online publication date: 13-May-2024
  • (2024)GPU-based butterfly countingThe VLDB Journal10.1007/s00778-024-00861-033:5(1543-1567)Online publication date: 27-Jun-2024
  • (2024)Ingress: an automated incremental graph processing systemThe VLDB Journal10.1007/s00778-024-00838-z33:3(781-806)Online publication date: 20-Feb-2024
  • (2023)DRONE: An Efficient Distributed Subgraph-Centric Framework for Processing Large-Scale Power-law GraphsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2022.322306834:2(463-474)Online publication date: 1-Feb-2023
  • (2023)Layph: Making Change Propagation Constraint in Incremental Graph Processing by Layering Graph2023 IEEE 39th International Conference on Data Engineering (ICDE)10.1109/ICDE55515.2023.00212(2766-2779)Online publication date: Apr-2023
  • (2023)Accelerating k-Core Decomposition by a GPU2023 IEEE 39th International Conference on Data Engineering (ICDE)10.1109/ICDE55515.2023.00142(1818-1831)Online publication date: Apr-2023
  • (2022)SancusProceedings of the VLDB Endowment10.14778/3538598.353861415:9(1937-1950)Online publication date: 27-Jul-2022
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media