DOI:10.1145/3234664.3234682
research-article

The Construction of Large Graph Data Structures in a Scalable Distributed Message System

Published: 22 June 2018

Abstract

A large-scale distributed graph data substrate is the foundation of graph databases and knowledge base systems. Existing systems either scale up the capacity of a single compute node or build on distributed relational models (table-based systems). The former limits the scalability of the graph system, and the latter can constrain its performance. We have designed a new graph data system based on a scalable message system. The new data model enables most existing graph algorithms designed for a monolithic computer to run smoothly across multiple distributed nodes, and achieves high performance on complex graph operations. The implementation supports 100 million vertices and one billion arcs on a commodity computer cluster, and can potentially scale to tens of billions of vertices.
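
The abstract does not describe the data model in detail, so the following is only an illustrative sketch of the general idea that a single-machine graph algorithm can run over distributed storage when the store hides vertex placement behind a neighbor-lookup interface. The `PartitionedGraph` class, the modulo placement rule, and the `messages_sent` counter are all hypothetical and are not taken from the paper.

```python
from collections import deque


class PartitionedGraph:
    """Toy stand-in for a distributed graph store (hypothetical, not the paper's design)."""

    def __init__(self, num_partitions):
        # Each dict plays the role of one node's local vertex store.
        self.partitions = [dict() for _ in range(num_partitions)]
        self.messages_sent = 0  # counts simulated remote lookups

    def _owner(self, vertex):
        # Assumed placement rule: integer vertex id modulo the partition count.
        return self.partitions[vertex % len(self.partitions)]

    def add_arc(self, src, dst):
        self._owner(src).setdefault(src, []).append(dst)
        self._owner(dst).setdefault(dst, [])

    def neighbors(self, vertex):
        # In a real system this would be a request message to the owning node.
        self.messages_sent += 1
        return self._owner(vertex).get(vertex, [])


def bfs_order(graph, start):
    """Textbook breadth-first search; it only relies on graph.neighbors()."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph.neighbors(v):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return order


g = PartitionedGraph(num_partitions=3)
for src, dst in [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]:
    g.add_arc(src, dst)
result = bfs_order(g, 0)
```

In this sketch the traversal code never refers to partitions; only `neighbors()` knows where a vertex lives, which is one plausible way an existing algorithm could be reused over a distributed substrate.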


Published In

HPCCT '18: Proceedings of the 2018 2nd High Performance Computing and Cluster Technologies Conference
June 2018
126 pages
ISBN:9781450364850
DOI:10.1145/3234664

In-Cooperation

  • Shanghai Jiao Tong University
  • Xi'an Jiaotong-Liverpool University
  • Chinese Academy of Sciences

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Distributed System
  2. Graph Database
  3. Message Space

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

HPCCT 2018
