Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

SAPPER: subgraph indexing and approximate matching in large graphs

Published: 01 September 2010 Publication History

Abstract

With the emergence of new applications, e.g., computational biology, new software engineering techniques, social networks, etc., more data is in the form of graphs. Locating occurrences of a query graph in a large database graph is an important research topic. Due to the existence of noise (e.g., missing edges) in the large database graph, we investigate the problem of approximate subgraph indexing, i.e., finding the occurrences of a query graph in a large database graph with (possible) missing edges. The SAPPER method is proposed to solve this problem. Utilizing the hybrid neighborhood unit structures in the index, SAPPER takes advantage of pre-generated random spanning trees and a carefully designed graph enumeration order. Real and synthetic data sets are employed to demonstrate the efficiency and scalability of our approximate subgraph indexing method.

References

[1]
D. J. Aldous, The random walk construction of uniform spanning trees and uniform labelled trees, SIAM J. Discrete Math, 1990.
[2]
R. Agrawal and R. Srikant, Fast algorithms for mining association rules, Prof. of VLDB, 1994.
[3]
B. H. Bloom, Space/time trade-offs in hash coding with allowable errors, Communications of the ACM 13 (7), 1970.
[4]
J. Cheng, Y. Ke, W. Ng and A. Lu, FG-Index: towards verification-free query processing on graph databases. Proc. of SIGMOD, 2007.
[5]
B. Chazelle, J. Kilian, R. Rubinfeld and A. Tal, The bloomier filter: an efficient data structure for static support lookup tables, Proc. of 5th Annual ACM-SIAM Symposium on Discrete Algorithms, 2004.
[6]
L. Cordella, P. Foggia, C. Sansone and M. Vento, A (sub)graph isomorphism algorithm for matching large graphs. PAMI, 2004.
[7]
B. Dost, T. Shlomi, N. Gupta, E. Ruppin, V. Bafna and R. Sharan, QNet: a tool for querying protein interaction networks, Proc. of RECOMB, 2007.
[8]
T. Nguyen, H. Nguyen, N. Pham, J. AI-Kofahi and T. Nguyen, Graph-based mining of multiple object usage patterns, Proc. of the Joint Meeting of ESEC and ACM SIGSOFT, 2009.
[9]
R. Giugno and D. Shasha, GraphGrep: A fast and universal method for querying graphs. Proc. of ICPR, 2002.
[10]
J. Han, J. Pei and Y. Yin, Mining frequent patterns without candidate generation, Proc. of SIGMOD, 2000.
[11]
H. He and A. K. Singh, Closure-Tree: an index structure for graph queries. Proc. of ICDE, 2006.
[12]
H. Jiang, H. Wang, P. Yu and S. Zhou, GString: A novel approach for efficient search in graph databases. Proc. of ICDE, 2007.
[13]
M. Kanehisa and S. Goto, KEGG: Kyoto encyclopedia of genes and genomes, Nuc. Ac. Res, 2000, 28:27--30
[14]
M. Koyuturk, A. Grama and W. Szpankowski, Pairwise local alignment of protein interaction networks guided by models of evolution. Proc. of RECOMB, 2005.
[15]
F. Mandreoli, R. Martoglia, G. Villani and W. Penzo, Flexible query answering on graph-modeled data. Proc. of EDBT, 2009.
[16]
M. Mongiovi, R. Natale, R. Giugno, A, Pulvirenti, and A. Ferro. A set-cover-based approach for inexact graph matching. Proc. of CSB, 2009.
[17]
R. Pinter, O. Rokhlenko, E. Yeger-Lotem and M. Ziv-Ukelson, Alignment of metabolic pathways, Bioinformatics, 2005.
[18]
H. Shang, Y. Zhang, X. Lin, and J. Yu, Taming verification hardness: an efficient algorithm for testing subgraph isomorphism. PVLDB, 2008.
[19]
Y. Tian and J. Patel, TALE: a tool for approximate large graph matching, Proc. of ICDE, 2008.
[20]
J. Ullmann, An algorithm for subgraph isomorphism. J. ACM, 1976.
[21]
X. Wang, A. Smalter, J. Huan, and G. Lushington, G-Hash: towards fast kernel-based similarity search in large graph databases, Proc. of EDBT, 2009.
[22]
X. Yan, P. Yu and J. Han, Graph indexing, a frequent structure-based approach. Proc. of SIGMOD, 2004.
[23]
S. Zhang, M. Hu, and J. Yang, Treepi: a novel graph indexing method. Proc. of ICDE, 2007.
[24]
S. Zhang, S. Li, and J. Yang, Gaddi: distance index based subgraph matching in biological networks. Proc. of EDBT, 2009.
[25]
Gene Ontology. http://www.geneontology.org/.

Cited By

View all
  • (2024)Graph-based Spatial Pattern Matching: A Theoretical ComparisonProceedings of the 32nd ACM International Conference on Advances in Geographic Information Systems10.1145/3678717.3691227(505-508)Online publication date: 29-Oct-2024
  • (2024)Grand: A Fast and Accurate Graph Retrieval Framework via Knowledge DistillationProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657773(1639-1648)Online publication date: 10-Jul-2024
  • (2024)Path-based approximate matching of fuzzy spatiotemporal RDF dataWorld Wide Web10.1007/s11280-024-01247-627:2Online publication date: 3-Feb-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the VLDB Endowment
Proceedings of the VLDB Endowment  Volume 3, Issue 1-2
September 2010
1658 pages

Publisher

VLDB Endowment

Publication History

Published: 01 September 2010
Published in PVLDB Volume 3, Issue 1-2

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)29
  • Downloads (Last 6 weeks)1
Reflects downloads up to 14 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Graph-based Spatial Pattern Matching: A Theoretical ComparisonProceedings of the 32nd ACM International Conference on Advances in Geographic Information Systems10.1145/3678717.3691227(505-508)Online publication date: 29-Oct-2024
  • (2024)Grand: A Fast and Accurate Graph Retrieval Framework via Knowledge DistillationProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657773(1639-1648)Online publication date: 10-Jul-2024
  • (2024)Path-based approximate matching of fuzzy spatiotemporal RDF dataWorld Wide Web10.1007/s11280-024-01247-627:2Online publication date: 3-Feb-2024
  • (2023)Partial Recovery in the Graph Alignment ProblemOperations Research10.1287/opre.2022.235571:1(259-272)Online publication date: 1-Jan-2023
  • (2023)Efficient m-closest entity matching over heterogeneous information networksKnowledge-Based Systems10.1016/j.knosys.2023.110299263:COnline publication date: 5-Mar-2023
  • (2023)A Scalable Query Pricing Framework for Incomplete Graph DataDatabase Systems for Advanced Applications10.1007/978-3-031-30637-2_7(97-113)Online publication date: 17-Apr-2023
  • (2022)Evaluating pattern matching queries for spatial databasesThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-019-00550-328:5(649-673)Online publication date: 11-Mar-2022
  • (2021)VerSaChIProceedings of the 30th ACM International Conference on Information & Knowledge Management10.1145/3459637.3482217(2812-2816)Online publication date: 26-Oct-2021
  • (2020)ChiSeLProceedings of the VLDB Endowment10.14778/3401960.340196413:10(1654-1668)Online publication date: 1-Jun-2020
  • (2020)Simulation-based Approximate Graph Pattern MatchingProceedings of the 2020 ACM SIGMOD International Conference on Management of Data10.1145/3318464.3384401(2825-2827)Online publication date: 11-Jun-2020
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media