Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2588555.2610500acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Fast and unified local search for random walk based k-nearest-neighbor query in large graphs

Published: 18 June 2014 Publication History

Abstract

Given a large graph and a query node, finding its k-nearest-neighbor (kNN) is a fundamental problem. Various random walk based measures have been developed to measure the proximity (similarity) between nodes. Existing algorithms for the random walk based top-k proximity search can be categorized as global and local methods based on their search strategies. Global methods usually require an expensive precomputing step. By only searching the nodes near the query node, local methods have the potential to support more efficient query. However, most existing local search methods cannot guarantee the exactness of the solution. Moreover, they are usually designed for specific proximity measures. Can we devise an efficient local search method that applies to different measures and also guarantees result exactness? In this paper, we present FLoS (Fast Local Search), a unified local search method for efficient and exact top-k proximity query in large graphs. FLoS is based on the no local optimum property of proximity measures. We show that many measures have no local optimum. Utilizing this property, we introduce several simple operations on transition probabilities, which allow developing lower and upper bounds on the proximity. The bounds monotonically converge to the exact proximity when more nodes are visited. We further show that FLoS can also be applied to measures having local optimum by utilizing relationship among different measures. We perform comprehensive experiments to evaluate the efficiency and applicability of the proposed method.

References

[1]
D. Aldous and J. Fill. Reversible markov chains and random walks on graphs, 2002.
[2]
P. Berkhin. Bookmark-coloring algorithm for personalized PageRank computing. Internet Mathematics, 3(1):41--62, 2006.
[3]
P. Bogdanov and A. Singh. Accurate and scalable nearest neighbors in large networks based on effective importance. In CIKM, pages 523--528, 2013.
[4]
D. Chakrabarti, Y. Zhan, and C. Faloutsos. R-MAT: A recursive model for graph mining. In SDM, pages 442--446, 2004.
[5]
S. Chakrabarti, A. Pathak, and M. Gupta. Index design and query processing for graph conductance search. VLDB, 20(3):445--470, 2011.
[6]
S. Cohen, B. Kimelfeld, and G. Koutrika. A survey on proximity measures for social networks. In Search Computing, pages 191--206, 2012.
[7]
P. Erdos and A. Rényi. On the evolution of random graphs. Magyar Tud. Akad. Mat. Kutató Int. Kozl, 5:17--61, 1960.
[8]
Y. Fujiwara, M. Nakatsuji, M. Onizuka, and M. Kitsuregawa. Fast and exact top-k search for random walk with restart. VLDB, 5(5):442--453, 2012.
[9]
Y. Fujiwara, M. Nakatsuji, H. Shiokawa, T. Mishima, and M. Onizuka. Efficient ad-hoc search for personalized PageRank. In SIGMOD, pages 445--456, 2013.
[10]
Y. Fujiwara, M. Nakatsuji, T. Yamamuro, H. Shiokawa, and M. Onizuka. Efficient personalized PageRank with accuracy assurance. In KDD, pages 15--23, 2012.
[11]
Z. Guan, J. Wu, Q. Zhang, A. Singh, and X. Yan. Assessing and ranking structural correlations in graphs. In SIGMOD, pages 937--948, 2011.
[12]
E. A. Guillemin. Introductory circuit theory. John Wiley & Sons, 1953.
[13]
P. Lee, L. V. Lakshmanan, and J. X. Yu. On top-k structural similarity search. In ICDE, pages 774--785, 2012.
[14]
Q. Mei, D. Zhou, and K. Church. Query suggestion using hitting time. In CIKM, pages 469--478, 2008.
[15]
C. Meyer. Matrix analysis and applied linear algebra. SIAM, 2000.
[16]
Y. Saad. Iterative methods for sparse linear systems. SIAM, 2003.
[17]
P. Sarkar and A. W. Moore. A tractable approach to finding closet truncated communicate time nieghbors in large graphs. In UAI, pages 335--343, 2007.
[18]
P. Sarkar and A. W. Moore. Fast nearest-neighbor search in disk-resident graphs. In KDD, pages 513--522, 2010.
[19]
P. Sarkar, A. W. Moore, and A. Prakash. Fast incremental proximity search in large graphs. In ICML, pages 896--903, 2008.
[20]
H. Tong, C. Faloutsos, and J.-Y. Pan. Fast random walk with restart and its applications. In ICDM, pages 613--622, 2006.
[21]
C. Zhang, L. Shou, K. Chen, G. Chen, and Y. Bei. Evaluating geo-social influence in location-based social networks. In CIKM, pages 1442--1451, 2012.
[22]
X. Zhao, A. Chang, A. D. Sarma, H. Zheng, and B. Y. Zhao. On the embeddability of random walk distances. VLDB, 6(14), 2013.

Cited By

View all
  • (2025)AquaPipe: A Quality-Aware Pipeline for Knowledge Retrieval and Large Language ModelsProceedings of the ACM on Management of Data10.1145/37096613:1(1-26)Online publication date: 11-Feb-2025
  • (2024)Enhanced Data Mining and Visualization of Sensory-Graph-Modeled Datasets through SummarizationSensors10.3390/s2414455424:14(4554)Online publication date: 14-Jul-2024
  • (2024)GUITAR: Gradient Pruning toward Fast Neural RankingProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657728(163-173)Online publication date: 10-Jul-2024
  • Show More Cited By

Index Terms

  1. Fast and unified local search for random walk based k-nearest-neighbor query in large graphs

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image ACM Conferences
      SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
      June 2014
      1645 pages
      ISBN:9781450323765
      DOI:10.1145/2588555
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Sponsors

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 18 June 2014

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. local search
      2. nearest neighbors
      3. proximity search
      4. random walk
      5. top-k search

      Qualifiers

      • Research-article

      Funding Sources

      Conference

      SIGMOD/PODS'14
      Sponsor:

      Acceptance Rates

      SIGMOD '14 Paper Acceptance Rate 107 of 421 submissions, 25%;
      Overall Acceptance Rate 785 of 4,003 submissions, 20%

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)21
      • Downloads (Last 6 weeks)1
      Reflects downloads up to 17 Feb 2025

      Other Metrics

      Citations

      Cited By

      View all
      • (2025)AquaPipe: A Quality-Aware Pipeline for Knowledge Retrieval and Large Language ModelsProceedings of the ACM on Management of Data10.1145/37096613:1(1-26)Online publication date: 11-Feb-2025
      • (2024)Enhanced Data Mining and Visualization of Sensory-Graph-Modeled Datasets through SummarizationSensors10.3390/s2414455424:14(4554)Online publication date: 14-Jul-2024
      • (2024)GUITAR: Gradient Pruning toward Fast Neural RankingProceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval10.1145/3626772.3657728(163-173)Online publication date: 10-Jul-2024
      • (2024)Towards Efficient Graph Processing in Geo-Distributed Data CentersIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2024.345387235:11(2147-2160)Online publication date: Nov-2024
      • (2024)Efficient Algorithms for Personalized PageRank Computation: A SurveyIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.337600036:9(4582-4602)Online publication date: Sep-2024
      • (2024)Attributed Network Embedding in Streaming Style2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00243(3138-3150)Online publication date: 13-May-2024
      • (2024)Fast Iterative Graph Computing with Updated Neighbor States2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00193(2449-2462)Online publication date: 13-May-2024
      • (2024)Personalized PageRanks over Dynamic Graphs - The Case for Optimizing Quality of Service2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00038(409-422)Online publication date: 13-May-2024
      • (2023)iQANProceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3572848.3577527(313-328)Online publication date: 25-Feb-2023
      • (2022)Fast neural ranking on bipartite graph indicesProceedings of the VLDB Endowment10.14778/3503585.350358915:4(794-803)Online publication date: 14-Apr-2022
      • Show More Cited By

      View Options

      Login options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Figures

      Tables

      Media

      Share

      Share

      Share this Publication link

      Share on social media