Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/1247480.1247573acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
Article

Fast and practical indexing and querying of very large graphs

Published: 11 June 2007 Publication History

Abstract

Many applications work with graph-structured data. As graphs grow in size, indexing becomes essential to ensure sufficient query performance. We present the GRIPP index structure (GRaph Indexing based on Pre- and Postorder numbering) for answering reachability queries in graphs.
GRIPP requires only linear time and space. Using GRIPP, we can answer reachability queries on graphs with 5 million nodes on average in less than 5 milliseconds, which is unrivaled by previous methods. We evaluate the performance and scalability of our approach on real and synthetic random and scale-free graphs and compare our approach to existing indexing schemes. GRIPP is implemented as stored procedure inside a relational database management system and can therefore very easily be integrated into existing graph-oriented applications.

References

[1]
R. Agrawal, A. Borgida, and H. V. Jagadish. Efficient Management of Transitive Relationships in Large Data and Knowledge Bases. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 253--262, 1989. ACM Press.
[2]
R. Agrawal and H. V. Jagadish. Direct algorithms for computing the transitive closure of database relations. In Proceedings of the 13th International Conference on Very Large Data Bases (VLDB), pages 255--266, 1987. Morgan Kaufmann.
[3]
A. L. Barabàsi and R. Albert. Emergence of Scaling in Random Networks. Science, 286(5439):509--512, Oct 1999.
[4]
A. L. Barabàsi and Z. N. Oltvai. Network biology: understanding the cell's functional organization. Nature Reviews Genetics, 5(2):101--113, Feb 2004.
[5]
I. Borodina and J. Nielsen. From genomes to in silico cells via metabolic networks. Current Opinion in Biotechnology, 16(3):350--355, Jun 2005.
[6]
L. Chen, A. Gupta, and M. E. Kurul. Stack-based Algorithms for Pattern Matching on DAGs. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB), pages 493--504, 2005. ACM Press.
[7]
J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu. Fast Computation of Reachability Labeling for Large Graphs. In Proceedings of the 10th International Conference on Extending Database Technology (EDBT), volume 3896 of Lecture Notes in Computer Science, pages 961--979, 2006. Springer.
[8]
E. Cohen, E. Halperin, H. Kaplan, and U. Zwick. Reachability and Distance Queries via 2-Hop Labels. SIAM J. Comput., 32(5):1338--1355, 2003.
[9]
T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. MIT Press, 2001.
[10]
P. Dietz and D. Sleator. Two algorithms for maintaining order in a list. In Proceedings of the 19th annual ACM Symposium on Theory of computing (STOC), pages 365--372, 1987. ACM Press.
[11]
P. Erdös and A. Rényi. On the Evolution of Random Graphs. Publ. Math. Inst. Hungar. Acad. Sci., 5:17--61, 1960.
[12]
M. F. Fernandez, D. Florescu, A. Y. Levy, and D. Suciu. A query language for a web-site management system. SIGMOD Record, 26(3):4--11, 1997.
[13]
T. Grust, M. van Keulen, and J. Teubner. Accelerating XPath evaluation in any RDBMS. ACM Trans. Database Syst., 29:91--131, 2004.
[14]
R. H. Güting. GraphDB: Modeling and Querying Graphs in Databases. In Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), pages 297--308, 1994. Morgan Kaufmann.
[15]
H. He, H. Wang, J. Yang, and P. S. Yu. Compact Reachability Labeling for Graph-Structured Data. In Proceedings of the 2005 ACM International Conference on Information and Knowledge Management (CIKM), pages 594--601, 2005. ACM Press.
[16]
J. van Helden, A. Naim, R. Mancuso, M. Eldridge, et al. Representing and analysing molecular and cellular function using the computer. Journal of Biological Chemistry, 381(9-10):921--935, Sep-Oct 2000.
[17]
G. Joshi-Tope, M. Gillespie, I. Vastrik, P. D'Eustachio, et al. Reactome: a knowledgebase of biological pathways. Nucleic Acids Research, 33:D428--D432, Jan 2005.
[18]
M. Kanehisa, S. Goto, S. Kavashima, Y. Okuno, and M. Hattori. The KEGG resource for deciphering the genome. Nucleic Acids Research, 32:D277--D280, Jan 2004.
[19]
G. Karvounarakis, S. Alexaki, V. Christophides, D. Plexousakis, and M. Scholl. RQL: A declarative query language for RDF, 2002. In Proceedings of the 11th Intl. World Wide Web Conference (WWW), 2002.
[20]
C. Lemer, E. Antezana, F. Couche, F. Fays, et al. The aMAZE LightBench: a web interface to a relational database of cellular processes. Nucleic Acids Research, 32:D443--D448, Jan 2004.
[21]
H. Lu. New Strategies for Computing the Transitive Closure of a Database Relation. In Proceedings of the 13th International Conference on Very Large Data Bases (VLDB), pages 267--274, 1987. Morgan Kaufmann.
[22]
R. Schenkel, A. Theobald, and G. Weikum. Efficient Creation and Incremental Maintenance of the HOPI Index for Complex XML Document Collections. In Proceedings of the 21st International Conference on Data Engineering (ICDE), pages 360--371, 2005. IEEE Computer Society.
[23]
U. Stelzl, U. Worm, M. Lalowski, C. Haenig, et al. A human protein-protein interaction network: a resource for annotating the proteome. Cell, 122(6):957--968, Sep 2005.
[24]
S. Trißl and U. Leser. Querying Ontologies in Relational Database Systems. In Proceedings of the Second International Workshop on Data Integration in the Life Sciences (DILS), volume 3615 of Lecture Notes in Computer Science, pages 63--79, 2005. Springer.
[25]
S. Trißl and U. Leser. GRIPP-Indexing and Querying Graphs based on Pre- and Postorder Numbering. Technical Report No. 207, Humboldt-Universität zu Berlin, Institut für Informatik, 2006.
[26]
H. Wang, H. He, J. Yang, P. S. Yu, and J. X. Yu. Dual Labeling: Answering Graph Reachability Queries in Constant Time. In Proceedings of the 22nd International Conference on Data Engineering (ICDE), page 75, 2006. IEEE Computer Society.

Cited By

View all
  • (2025)Answering Min-Max Resource-Constrained Shortest Path Queries Over Large GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.348809537:1(60-74)Online publication date: Jan-2025
  • (2024)ActiveReach: an active learning framework for approximate reachability query answering in large-scale graphsFrontiers in Big Data10.3389/fdata.2024.14271047Online publication date: 19-Nov-2024
  • (2024)Enabling Privacy-Preserving K-Hop Reachability Query Over Encrypted GraphsIEEE Transactions on Services Computing10.1109/TSC.2024.338295417:3(893-904)Online publication date: May-2024
  • Show More Cited By

Index Terms

  1. Fast and practical indexing and querying of very large graphs

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data
    June 2007
    1210 pages
    ISBN:9781595936868
    DOI:10.1145/1247480
    • General Chairs:
    • Lizhu Zhou,
    • Tok Wang Ling,
    • Program Chair:
    • Beng Chin Ooi
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 June 2007

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. databases
    2. graph indexing
    3. reachability queries

    Qualifiers

    • Article

    Conference

    SIGMOD/PODS07
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)38
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 01 Jan 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2025)Answering Min-Max Resource-Constrained Shortest Path Queries Over Large GraphsIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.348809537:1(60-74)Online publication date: Jan-2025
    • (2024)ActiveReach: an active learning framework for approximate reachability query answering in large-scale graphsFrontiers in Big Data10.3389/fdata.2024.14271047Online publication date: 19-Nov-2024
    • (2024)Enabling Privacy-Preserving K-Hop Reachability Query Over Encrypted GraphsIEEE Transactions on Services Computing10.1109/TSC.2024.338295417:3(893-904)Online publication date: May-2024
    • (2024)Verifiable Strong Privacy-Preserving Any-Hop Reachability Query on Blockchain-Assisted CloudIEEE Internet of Things Journal10.1109/JIOT.2024.344543111:24(39637-39650)Online publication date: 15-Dec-2024
    • (2024)Efficient algorithms for reachability and path queries on temporal bipartite graphsThe VLDB Journal10.1007/s00778-024-00854-z33:5(1399-1426)Online publication date: 23-May-2024
    • (2023)The case for learned provenance graph storage systemsProceedings of the 32nd USENIX Conference on Security Symposium10.5555/3620237.3620421(3277-3294)Online publication date: 9-Aug-2023
    • (2023)HR-Index: An Effective Index Method for Historical Reachability Queries over Evolving GraphsProceedings of the ACM on Management of Data10.1145/35892721:2(1-25)Online publication date: 20-Jun-2023
    • (2023)Reachability Queries on Dynamic Temporal Bipartite GraphsProceedings of the 31st ACM International Conference on Advances in Geographic Information Systems10.1145/3589132.3625647(1-11)Online publication date: 13-Nov-2023
    • (2023)An Overview of Reachability Indexes on GraphsCompanion of the 2023 International Conference on Management of Data10.1145/3555041.3589408(61-68)Online publication date: 4-Jun-2023
    • (2023)IFCA: Index-Free Community-Aware Reachability Processing Over Large Dynamic Graphs2023 IEEE 39th International Conference on Data Engineering (ICDE)10.1109/ICDE55515.2023.00172(2220-2234)Online publication date: Apr-2023
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media