Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2465351.2465369acmconferencesArticle/Chapter ViewAbstractPublication PageseurosysConference Proceedingsconference-collections
research-article

Mizan: a system for dynamic load balancing in large-scale graph processing

Published: 15 April 2013 Publication History

Abstract

Pregel [23] was recently introduced as a scalable graph mining system that can provide significant performance improvements over traditional MapReduce implementations. Existing implementations focus primarily on graph partitioning as a preprocessing step to balance computation across compute nodes. In this paper, we examine the runtime characteristics of a Pregel system. We show that graph partitioning alone is insufficient for minimizing end-to-end computation. Especially where data is very large or the runtime behavior of the algorithm is unknown, an adaptive approach is needed. To this end, we introduce Mizan, a Pregel system that achieves efficient load balancing to better adapt to changes in computing needs. Unlike known implementations of Pregel, Mizan does not assume any a priori knowledge of the structure of the graph or behavior of the algorithm. Instead, it monitors the runtime characteristics of the system. Mizan then performs efficient fine-grained vertex migration to balance computation and communication. We have fully implemented Mizan; using extensive evaluation we show that---especially for highly-dynamic workloads---Mizan provides up to 84% improvement over techniques leveraging static graph pre-partitioning.

References

[1]
H. Balakrishnan, M. F. Kaashoek, D. Karger, R. Morris, and I. Stoica. Looking Up Data in P2P Systems. Communications of the ACM, 46(2):43--48, 2003.
[2]
P. Boldi, B. Codenotti, M. Santini, and S. Vigna. UbiCrawler: A Scalable Fully Distributed Web Crawler. Software: Practice & Experience, 34(8):711--726, 2004.
[3]
D. Buntinas. Designing a Common Communication Subsystem. In Proceedings of the 12th European Parallel Virtual Machine and Message Passing Interface Conference (Euro PVM MPI), pages 156--166, Sorrento, Italy, 2005.
[4]
R. Chen, M. Yang, X. Weng, B. Choi, B. He, and X. Li. Improving Large Graph Processing on Partitioned Graphs in the Cloud. In Proceedings of the 3rd ACM Symposium on Cloud Computing (SOCC), pages 3:1--3:13, San Jose, California, USA, 2012.
[5]
R. Cheng, J. Hong, A. Kyrola, Y. Miao, X. Weng, M. Wu, F. Yang, L. Zhou, F. Zhao, and E. Chen. Kineograph: Taking the Pulse of a Fast-Changing and Connected World. In Proceedings of the 7th ACM european conference on Computer Systems (EuroSys), pages 85--98, Bern, Switzerland, 2012.
[6]
J. Dean and S. Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. In Proceedings of the 6th USENIX Symposium on Opearting Systems Design and Implementation (OSDI), pages 137--150, San Francisco, California, USA, 2004.
[7]
R. G. Gallager, P. A. Humblet, and P. M. Spira. A Distributed Algorithm for Minimum-Weight Spanning Trees. ACM Transactions on Programming Languages and Systems (TOPLAS), 5(1):66--77, 1983.
[8]
Giraph. Apache Incubator Giraph. http://incubator.apache.org/giraph, 2012.
[9]
GoldenOrb. A Cloud-based Open Source Project for Massive-Scale Graph Analysis. http://goldenorbos.org/, 2012.
[10]
J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. In Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI), pages 17--30, Hollywood, California, USA, 2012.
[11]
B. Hendrickson and K. Devine. Dynamic Load Balancing in Computational Mechanics. Computer Methods in Applied Mechanics and Engineering, 184(2--4):485--500, 2000.
[12]
L.-Y. Ho, J.-J. Wu, and P. Liu. Distributed Graph Database for Large-Scale Social Computing. In Proceedings of the IEEE 5th International Conference in Cloud Computing (CLOUD), pages 455--462, Honolulu, Hawaii, USA, 2012.
[13]
U. Kang, C. E. Tsourakakis, and C. Faloutsos. PEGASUS: A Peta-Scale Graph Mining System Implementation and Observations. In Proceedings of the IEEE 9th International Conference on Data Mining (ICDM), pages 229--238, Miami, Florida, USA, 2009.
[14]
U. Kang, C. Tsourakakis, A. P. Appel, C. Faloutsos, and J. Leskovec. HADI: Fast Diameter Estimation and Mining in Massive Graphs with Hadoop. ACM Trasactions on Knowledge Discovery from Data (TKDD), 5(2):8:1--8:24, 2011.
[15]
G. Karypis and V. Kumar. Parallel Multilevel K-Way Partitioning Scheme for Irregular Graphs. In Proceedings of the ACM/IEEE Conference on Supercomputing (CDROM), pages 278--300, Pittsburgh, Pennsylvania, USA.
[16]
G. Karypis and V. Kumar. Multilevel k-way Partitioning Scheme for Irregular Graphs. Journal of Parallel and Distributed Computing, 48(1):96--129, 1998.
[17]
Z. Khayyat, K. Awara, H. Jamjoom, and P. Kalnis. Mizan: Optimizing Graph Mining in Large Parallel Systems. Technical report, King Abdullah University of Science and Technology, 2012.
[18]
E. Krepska, T. Kielmann, W. Fokkink, and H. Bal. HipG: Parallel Processing of Large-Scale Graphs. ACM SIGOPS Operating Systems Review, 45(2):3--13, 2011.
[19]
A. Kyrola, G. Blelloch, and C. Guestrin. GraphChi: Large-Scale Graph computation on Just a PC. In Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI), pages 31--46, Hollywood, California, USA, 2012.
[20]
LAW. The Laboratory for Web Algorithmics. http://law.di.unimi.it/index.php, 2012.
[21]
J. Leskovec, D. Chakrabarti, J. Kleinberg, C. Faloutsos, and Z. Ghahramani. Kronecker Graphs: An Approach to Modeling Networks. Journal of Machine Learning Research, 11: 985--1042, 2010.
[22]
Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. Hellerstein. Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud. Proceedings of the VLDB Endowment (PVLDB), 5(8):716--727, 2012.
[23]
G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: A System for Large-Scale Graph Processing. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, pages 135--146, Indianapolis, Indiana, USA.
[24]
L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. http://ilpubs.stanford.edu:8090/422/, 2001.
[25]
J. M. Pujol, V. Erramilli, G. Siganos, X. Yang, N. Laoutaris, P. Chhabra, and P. Rodriguez. The Little Engine(s) That Could: Scaling Online Social Networks. IEEE/ACM Transactions on Networking, 20(4):1162--1175, 2012.
[26]
D. Rees. Foundations of Statistics. Chapman and Hall, London New York, 1987. ISBN 0412285606.
[27]
S. Salihoglu and J. Widom. GPS: A Graph Processing System. Technical report, Stanford University, 2012.
[28]
S. Seo, E. Yoon, J. Kim, S. Jin, J.-S. Kim, and S. Maeng. HAMA: An Efficient Matrix Computation with the MapReduce Framework. In Proceedings of the 2010 IEEE Second International Conference on Cloud Computing Technology and Science (CloudCom), pages 721--726, Athens, Greece.
[29]
B. Shao, H. Wang, and Y. Li. Trinity: A Distributed Graph Engine on a Memory Cloud. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, New York, New York, USA.
[30]
L. G. Valiant. A Bridging Model for Parallel Computation. Communications of the ACM, 33:103--111, 1990.
[31]
W. Xue, J. Shi, and B. Yang. X-RIME: Cloud-Based Large Scale Social Network Analysis. In Proceedings of the 2010 IEEE International Conference on Services Computing (SCC), pages 506--513, Shanghai, China.

Cited By

View all
  • (2024)FSM: A Fine-Grained Splitting and Merging Framework for Dual-Balanced Graph PartitionProceedings of the VLDB Endowment10.14778/3665844.366586417:9(2378-2391)Online publication date: 1-May-2024
  • (2024)MST: Topology-Aware Message Aggregation for Exascale Graph Processing of Traversal-Centric AlgorithmsACM Transactions on Architecture and Code Optimization10.1145/367684621:4(1-22)Online publication date: 3-Jul-2024
  • (2024)Grafu: Unleashing the Full Potential of Future Value Computation for Out-of-core Synchronous Graph ProcessingProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 210.1145/3620665.3640409(467-481)Online publication date: 27-Apr-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
EuroSys '13: Proceedings of the 8th ACM European Conference on Computer Systems
April 2013
401 pages
ISBN:9781450319942
DOI:10.1145/2465351
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 April 2013

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Research-article

Conference

EuroSys '13
Sponsor:
EuroSys '13: Eighth Eurosys Conference 2013
April 15 - 17, 2013
Prague, Czech Republic

Acceptance Rates

EuroSys '13 Paper Acceptance Rate 28 of 143 submissions, 20%;
Overall Acceptance Rate 241 of 1,308 submissions, 18%

Upcoming Conference

EuroSys '25
Twentieth European Conference on Computer Systems
March 30 - April 3, 2025
Rotterdam , Netherlands

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)82
  • Downloads (Last 6 weeks)8
Reflects downloads up to 20 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)FSM: A Fine-Grained Splitting and Merging Framework for Dual-Balanced Graph PartitionProceedings of the VLDB Endowment10.14778/3665844.366586417:9(2378-2391)Online publication date: 1-May-2024
  • (2024)MST: Topology-Aware Message Aggregation for Exascale Graph Processing of Traversal-Centric AlgorithmsACM Transactions on Architecture and Code Optimization10.1145/367684621:4(1-22)Online publication date: 3-Jul-2024
  • (2024)Grafu: Unleashing the Full Potential of Future Value Computation for Out-of-core Synchronous Graph ProcessingProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 210.1145/3620665.3640409(467-481)Online publication date: 27-Apr-2024
  • (2024)Locality-Preserving Graph Traversal With Split Live MigrationIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.343682835:10(1810-1825)Online publication date: Oct-2024
  • (2024)DeltaGNN: Accelerating Graph Neural Networks on Dynamic Graphs With Delta UpdatingIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2023.333515343:4(1163-1176)Online publication date: Apr-2024
  • (2024)TorchGT: A Holistic System for Large-Scale Graph Transformer TrainingProceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis10.1109/SC41406.2024.00083(1-17)Online publication date: 17-Nov-2024
  • (2024)How to Fit the SCC Algorithm Efficiently into Distributed Graph Iterative Computation2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC)10.1109/COMPSAC61105.2024.00043(254-263)Online publication date: 2-Jul-2024
  • (2024)Optimising Queries for Pattern Detection Over Large Scale Temporally Evolving GraphsIEEE Access10.1109/ACCESS.2024.341735212(86790-86808)Online publication date: 2024
  • (2024)Performance optimization of heterogeneous computing for large-scale dynamic graph dataThe Journal of Supercomputing10.1007/s11227-024-06562-381:1Online publication date: 21-Oct-2024
  • (2023)SageProceedings of the VLDB Endowment10.14778/3565838.356584415:13(3897-3910)Online publication date: 20-Jan-2023
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media