Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Weaver: a high-performance, transactional graph database based on refinable timestamps

Published: 01 July 2016 Publication History

Abstract

Graph databases have become a common infrastructure component. Yet existing systems either operate on offline snapshots, provide weak consistency guarantees, or use expensive concurrency control techniques that limit performance.
In this paper, we introduce a new distributed graph database, called Weaver, which enables efficient, transactional graph analyses as well as strictly serializable ACID transactions on dynamic graphs. The key insight that allows Weaver to combine strict serializability with horizontal scalability and high performance is a novel request ordering mechanism called refinable timestamps. This technique couples coarse-grained vector timestamps with a fine-grained timeline oracle to pay the overhead of strong consistency only when needed. Experiments show that Weaver enables a Bitcoin blockchain explorer that is 8x faster than Blockchain.info, and achieves 10.9x higher throughput than the Titan graph database on social network workloads and 4x lower latency than GraphLab on offline graph traversal workloads.

References

[1]
Blockchain. https://blockchain.info.
[2]
The Plan to Build a Massive Online Brain for All the World's Robots. http://www.wired.com/2014/08/robobrain.
[3]
Apache Giraph. http://giraph.apache.org/.
[4]
Apache Hama. http://hama.apache.org/.
[5]
Facebook Graph Search. https://www.facebook.com/about/graphsearch.
[6]
Neo4j. http://neo4j.org.
[7]
TPC-C. http://www.tpc.org/tpcc.
[8]
Three and a half degrees of separation. https://research.facebook.com/blog/three-and-a-half-degrees-of-separation/.
[9]
Titan. https://github.com/thinkaurelius/titan/wiki.
[10]
L. Backstrom, D. Huttenlocher, J. Kleinberg, and X. Lan. Group Formation in Large Social Networks: Membership, Growth, and Evolution. KDD, pages 44--54, Aug. 2006.
[11]
N. Bronson, Z. Amsden, G. Cabrera, P. Chakka, P. Dimov, H. Ding, J. Ferris, A. Giardullo, S. Kulkarni, H. Li, M. Marchukov, D. Petrov, L. Puzar, Y. J. Song, and V. Venkataramani. TAO: Facebook's Distributed Data Store for the Social Graph. USENIX ATC, pages 49--60, June 2013.
[12]
Y. Bu, V. Borkar, J. Jia, M. J. Carey, and T. Condie. Pregelix: Big(ger) Graph Analytics on A Dataflow Engine. PVLDB, 8(2), 2014.
[13]
R. Chen, M. Yang, X. Weng, B. Choi, B. He, and X. Li. Improving Large Graph Processing on Partitioned Graphs in the Cloud. SoCC, pages 25--36, Oct. 2012.
[14]
R. Chen, J. Shi, Y. Chen, and H. Chen. PowerLyra: Differentiated Graph Computation and Partitioning on Skewed Graphs. Eurosys, Apr. 2015.
[15]
R. Cheng, J. Hong, A. Kyrola, Y. Miao, X. Weng, M. Wu, F. Yang, L. Zhou, F. Zhao, and E. Chen. Kineograph: Taking the Pulse of a Fast-Changing and Connected World. Eurosys, pages 85--98, Apr. 2012.
[16]
A. Ching, S. Edunov, M. Kabiljo, D. Logothetis, and S. Muthukrishnan. One Trillion Edges: Graph Processing at Facebook-Scale. PVLDB, 8(12):1804--1815, 2015.
[17]
J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito, M. Szymaniak, C. Taylor, R. Wang, and D. Woodford. Spanner: Google's Globally-Distributed Database. OSDI, pages 251--264, Oct. 2012.
[18]
B. Ding, L. Kot, A. J. Demers, and J. Gehrke. Centiman: Elastic, High Performance Optimistic Concurrency Control by Watermarking. SoCC, pages 262--275, Nov. 2015.
[19]
A. Dragojevic, D. Narayanan, E. B. Nightingale, M. Renzelmann, A. Shamis, A. Badam, and M. Castro. No Compromises: Distributed Transactions with Consistency, Availability, and Performance. SOSP, pages 54--70, Oct. 2015.
[20]
R. Escriva, A. Dubey, B. Wong, and E. G. Sirer. Kronos: The Design and Implementation of an Event Ordering Service. Eurosys, Apr. 2014.
[21]
R. Escriva, B. Wong, and E. G. Sirer. Warp: Lightweight Multi-Key Transactions for Key-Value Stores. CoRR, abs/1509.07815, 2015.
[22]
J. M. Faleiro and D. J. Abadi. Rethinking serializable multiversion concurrency control. PVLDB, 8(11):1190--1201, 2015.
[23]
C. J. Fidge. Timestamps in Message-Passing Systems That Preserve the Partial Ordering. Australian Computer Science Communications, 10(1):56--66, 1988.
[24]
H. Garcia-Molina, J. D. Ullman, and J. Widom. Database Systems. Pearson Prentice Hall, 2009.
[25]
B. Gedik and R. Bordawekar. Disk-Based Management of Interaction Graphs. IEEE Transactions on Knowledge and Data Engineering, 26(11), 2014.
[26]
J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. OSDI, pages 17--30, Oct. 2012.
[27]
J. E. Gonzalez, R. S. Xin, A. Dave, D. Crankshaw, M. J. Franklin, and I. Stoica. GraphX: Graph Processing in a Distributed Dataflow Framework. OSDI, Oct. 2014.
[28]
J. N. Gray. Notes on Data Base Operating Systems. Springer, 1978.
[29]
M. Han and K. Daudjee. Giraph Unchained: Barrierless Asynchronous Parallel Execution in Pregel-like Graph Processing Systems. PVLDB, 8(9):950--961, 2015.
[30]
W. Han, Y. Miao, K. Li, M. Wu, F. Yang, L. Zhou, V. Prabhakaran, W. Chen, and E. Chen. Chronos: A Graph Engine for Temporal Graph Analysis. Eurosys, Apr. 2014.
[31]
B. Iordanov. HyperGraphDB: A Generalized Graph Database. WAIM, pages 25--36, July 2010.
[32]
Z. Khayyat, K. Awara, A. Alonazi, H. Jamjoom, D. Williams, and P. Kalnis. Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing. Eurosys, pages 169--182, Apr. 2013.
[33]
H. T. Kung and J. T. Robinson. On Optimistic Methods for Concurrency Control. ACM ToDS, 6(2):213--226, 1981.
[34]
H. Kwak, C. Lee, H. Park, and S. Moon. What is Twitter, a Social Network or a News Media? WWW, pages 591--600, Apr. 2010.
[35]
A. Kyrola, G. E. Blelloch, and C. Guestrin. GraphChi: Large-Scale Graph Computation on Just a PC. OSDI, pages 31--46, Oct. 2012.
[36]
A. Kyrola and C. Guestrin. GraphChi-DB: Simple Design for a Scalable Graph Database System - on Just a PC. CoRR, abs/1403.0701, 2014.
[37]
L. Lamport. The Part-Time Parliament. ACM ToCS, 16(2):133--169, 1998.
[38]
J. Levandoski, D. Lomet, S. Sengupta, R. Stutsman, and R. Wang. High Performance Transactions in Deuteronomy. CIDR, Jan. 2015.
[39]
Z. Li, L. Tan, X. Wang, S. Lu, Y. Zhou, and C. Zhai. Have Things Changed Now?: An Empirical Study of Bug Characteristics in Modern Open Source Software. ACM Workshop on Architectural and System Support for Improving Software Dependability, pages 25--33, Oct. 2006.
[40]
H. Lu, K. Veeraraghavan, P. Ajoux, J. Hunt, Y. J. Song, W. Tobagus, S. Kumar, and W. Lloyd. Existential Consistency: Measuring and Understanding Consistency at Facebook. SOSP, Oct. 2015.
[41]
G. Malewicz, M. H. Austern, A. J. C. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: A System for Large-Scale Graph Processing. SIGMOD, pages 135--146, June 2010.
[42]
J. J. McAuley and J. Leskovec. Learning to Discover Social Circles in Ego Networks. NIPS, pages 548--556, Dec. 2012.
[43]
F. McSherry, M. Isard, and D. G. Murray. Scalability! But at what COST? HotOS Workshop, May 2015.
[44]
D. G. Murray, F. McSherry, R. Isaacs, M. Isard, P. Barham, and M. Abadi. Naiad: A Timely Dataflow System. SOSP, pages 439--455, Nov. 2013.
[45]
M. Najork. The Scalable Hyperlink Store. Hypertext, pages 89--98, June 2009.
[46]
S. Nakamoto. Bitcoin: A Peer-to-Peer Electronic Cash System. 2008.
[47]
T. Neumann, T. Mühlbauer, and A. Kemper. Fast Serializable Multi-Version Concurrency Control for Main-Memory Database Systems. SIGMOD, pages 677--689, 2015.
[48]
J. Nishimura and J. Ugander. Restreaming Graph Partitioning: Simple Versatile Algorithms For Advanced Balancing. KDD, pages 1106--1114, Aug. 2013.
[49]
V. Prabhakaran, M. Wu, X. Weng, F. McSherry, L. Zhou, and M. Haridasan. Managing Large Graphs on Multi-Cores With Graph Awareness. USENIX ATC, pages 41--52, June 2012.
[50]
D. P. Reed. Naming And Synchronization in a Decentralized Computer System. Massachusetts Institute of Technology, Technical Report, 1978.
[51]
M. A. Rodriguez and M. Broecheler. http://www.slideshare.net/slidarko/titan-the-rise-of-big-graph-data.
[52]
A. Roy, I. Mihailovic, and W. Zwaenepoel. X-Stream: Edge-centric Graph Processing Using Streaming Partitions. SOSP, pages 472--488, Nov. 2013.
[53]
S. Salihoglu and J. Widom. GPS: A Graph Processing System. SSDBM, July 2013.
[54]
A. Saxena, A. Jain, O. Sener, A. Jami, D. K. Misra, and H. S. Koppula. RoboBrain: Large-Scale Knowledge Engine for Robots. Cornell University and Stanford University, Technical Report, 2015.
[55]
F. B. Schneider. Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial. ACM Computing Surveys, 22(4), 1990.
[56]
B. Shao, H. Wang, and Y. Li. Trinity: A Distributed Graph Engine on a Memory Cloud. SIGMOD, pages 505--516, June 2013.
[57]
P. Smith. Personal communication, 2015.
[58]
I. Stanton and G. Kliot. Streaming Graph Partitioning for Large Distributed Graphs. KDD, pages 1222--1230, Aug. 2012.
[59]
W. Sun, A. Fokoue, K. Srinivas, A. Kementsietsidis, G. Hu, and G. T. Xie. SQLGraph: An Efficient Relational-Based Property Graph Store. SIGMOD, pages 1887--1901, 2015.
[60]
Y. Tian, A. Balmin, S. A. Corsten, S. Tatikonda, and J. McPherson. From "Think Like a Vertex" to "Think Like a Graph". PVLDB, 7(3):193--204, 2013.
[61]
J. Ugander, B. Karrer, L. Backstrom, and C. Marlow. The Anatomy of the Facebook Social Graph. CoRR, abs/1111.4503, 2011.
[62]
R. van Renesse and F. B. Schneider. Chain Replication for Supporting High Throughput and Availability. OSDI, pages 91--104, Dec. 2004.
[63]
K. Wang, G. Xu, Z. Su, and Y. D. Liu. GraphQ: Graph Query Processing with Abstraction Refinement--Scalable and Programmable Analytics over Very Large Graphs on a Single PC. USENIX ATC, pages 387--401, July 2015.
[64]
W. Xie, G. Wang, D. Bindel, A. Demers, and J. Gehrke. Fast Iterative Graph Computation with Block Updates. PVLDB, 6(14):2014--2025, 2013.
[65]
D. Yan, J. Cheng, Y. Lu, and W. Ng. Blogel: A Block-Centric Framework for Distributed Computation on Real-World Graphs. PVLDB, 7(14):1981--1992, 2014.
[66]
D. Yan, J. Cheng, Y. Lu, and W. Ng. Effective Techniques for Message Reduction and Load Balancing in Distributed Graph Computation. WWW, pages 1307--1317, 2015.
[67]
C. Zhou, J. Gao, B. Sun, and J. X. Yu. MOCgraph: Scalable Distributed Graph Processing Using Message Online Computing. PVLDB, 8(4):377--388, 2014.
[68]
X. Zhu, W. Han, and W. Chen. GridGraph: Large-Scale Graph Processing on a Single Machine Using 2-Level Hierarchical Partitioning. USENIX ATC, pages 375--386, July 2015.

Cited By

View all
  • (2024)AeonG: An Efficient Built-in Temporal Support in Graph DatabasesProceedings of the VLDB Endowment10.14778/3648160.364818717:6(1515-1527)Online publication date: 1-Feb-2024
  • (2024)LSGraph: A Locality-centric High-performance Streaming Graph EngineProceedings of the Nineteenth European Conference on Computer Systems10.1145/3627703.3650076(33-49)Online publication date: 22-Apr-2024
  • (2023)Demystifying Graph Databases: Analysis and Taxonomy of Data Organization, System Designs, and Graph QueriesACM Computing Surveys10.1145/360493256:2(1-40)Online publication date: 15-Sep-2023
  • Show More Cited By
  1. Weaver: a high-performance, transactional graph database based on refinable timestamps

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Proceedings of the VLDB Endowment
    Proceedings of the VLDB Endowment  Volume 9, Issue 11
    July 2016
    60 pages
    ISSN:2150-8097
    Issue’s Table of Contents

    Publisher

    VLDB Endowment

    Publication History

    Published: 01 July 2016
    Published in PVLDB Volume 9, Issue 11

    Qualifiers

    • Research-article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)14
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 10 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)AeonG: An Efficient Built-in Temporal Support in Graph DatabasesProceedings of the VLDB Endowment10.14778/3648160.364818717:6(1515-1527)Online publication date: 1-Feb-2024
    • (2024)LSGraph: A Locality-centric High-performance Streaming Graph EngineProceedings of the Nineteenth European Conference on Computer Systems10.1145/3627703.3650076(33-49)Online publication date: 22-Apr-2024
    • (2023)Demystifying Graph Databases: Analysis and Taxonomy of Data Organization, System Designs, and Graph QueriesACM Computing Surveys10.1145/360493256:2(1-40)Online publication date: 15-Sep-2023
    • (2023)The Graph Database Interface: Scaling Online Transactional and Analytical Graph Workloads to Hundreds of Thousands of CoresProceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis10.1145/3581784.3607068(1-18)Online publication date: 12-Nov-2023
    • (2022)ByteGraphProceedings of the VLDB Endowment10.14778/3554821.355482415:12(3306-3318)Online publication date: 1-Aug-2022
    • (2022)G-tranProceedings of the VLDB Endowment10.14778/3551793.355181315:11(2545-2558)Online publication date: 29-Sep-2022
    • (2021)Terrace: A Hierarchical Graph Container for Skewed Dynamic GraphsProceedings of the 2021 International Conference on Management of Data10.1145/3448016.3457313(1372-1385)Online publication date: 9-Jun-2021
    • (2021)RisGraph: A Real-Time Streaming System for Evolving Graphs to Support Sub-millisecond Per-update Analysis at Millions Ops/sProceedings of the 2021 International Conference on Management of Data10.1145/3448016.3457263(513-527)Online publication date: 9-Jun-2021
    • (2021)Vertex-Centric Visual Programming for Graph Neural NetworksProceedings of the 2021 International Conference on Management of Data10.1145/3448016.3452770(2803-2807)Online publication date: 9-Jun-2021
    • (2020)LiveGraphProceedings of the VLDB Endowment10.14778/3384345.338435113:7(1020-1034)Online publication date: 1-Mar-2020
    • Show More Cited By

    View Options

    Get Access

    Login options

    Full Access

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media