Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

Parallel personalized pagerank on dynamic graphs

Published: 01 September 2017 Publication History

Abstract

Personalized PageRank (PPR) is a well-known proximity measure in graphs. To meet the need for dynamic PPR maintenance, recent works have proposed a local update scheme to support incremental computation. Nevertheless, sequential execution of the scheme is still too slow for highspeed stream processing. Therefore, we are motivated to design a parallel approach for dynamic PPR computation. First, as updates always come in batches, we devise a batch processing method to reduce synchronization cost among every single update and enable more parallelism for iterative parallel execution. Our theoretical analysis shows that the parallel approach has the same asymptotic complexity as the sequential approach. Second, we devise novel optimization techniques to effectively reduce runtime overheads for parallel processes. Experimental evaluation shows that our parallel algorithm can achieve orders of magnitude speedups on GPUs and multi-core CPUs compared with the state-of-the-art sequential algorithm.

References

[1]
Extended version. http://www.comp.nus.edu.sg/~wentian/paper/tr2017--ppr.pdf.
[2]
PAPI. http://icl.utk.edu/papi/.
[3]
Twitter usage statistics. http://www.internetlivestats.com/twitter-statistics/. accessed 18-April-2017.
[4]
V. Agarwal, F. Petrini, D. Pasetto, and D. A. Bader. Scalable graph exploration on multicore processors. In Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1--11. IEEE Computer Society, 2010.
[5]
R. Andersen, C. Borgs, J. Chayes, J. Hopcraft, V. S. Mirrokni, and S.-H. Teng. Local computation of pagerank contributions. In International Workshop on Algorithms and Models for the Web-Graph, pages 150--165. Springer, 2007.
[6]
R. Andersen, F. Chung, and K. Lang. Local graph partitioning using pagerank vectors. In 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06), pages 475--486. IEEE, 2006.
[7]
K. Avrachenkov, N. Litvak, D. Nemirovsky, and N. Osipova. Monte carlo methods in pagerank computation: When one iteration is sufficient. SIAM Journal on Numerical Analysis, 45(2):890--904, 2007.
[8]
L. Backstrom and J. Leskovec. Supervised random walks: predicting and recommending links in social networks. In Proceedings of the fourth ACM international conference on Web search and data mining, pages 635--644. ACM, 2011.
[9]
B. Bahmani, K. Chakrabarti, and D. Xin. Fast personalized pagerank on mapreduce. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, pages 973--984. ACM, 2011.
[10]
B. Bahmani, A. Chowdhury, and A. Goel. Fast incremental and personalized pagerank. PVLDB, 4(3):173--184, 2010.
[11]
P. Berkhin. Bookmark-coloring algorithm for personalized pagerank computing. Internet Mathematics, 3(1):41--62, 2006.
[12]
T. David, R. Guerraoui, and V. Trigonakis. Everything you always wanted to know about synchronization but were afraid to ask. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pages 33--48. ACM, 2013.
[13]
A. Davidson, S. Baxter, M. Garland, and J. D. Owens. Work-efficient parallel gpu methods for single-source shortest paths. In Parallel and Distributed Processing Symposium, 2014 IEEE 28th International, pages 349--359. IEEE, 2014.
[14]
D. Ediger, R. McColl, J. Riedy, and D. A. Bader. Stinger: High performance data structure for streaming graphs. In High Performance Extreme Computing (HPEC), 2012 IEEE Conference on, pages 1--5. IEEE, 2012.
[15]
S. L. Feld. Why your friends have more friends than you do. American Journal of Sociology, 96(6):1464--1477, 1991.
[16]
J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin. Powergraph: Distributed graph-parallel computation on natural graphs. In OSDI, volume 12, page 2, 2012.
[17]
S. Guha and A. McGregor. Graph synopses, sketches, and streams: A survey. PVLDB, 5(12):2030--2031, 2012.
[18]
T. Guo, X. Cao, G. Cong, J. Lu, and X. Lin. Distributed algorithms on exact personalized pagerank. In Proceedings of the 2017 ACM International Conference on Management of Data, pages 479--494. ACM, 2017.
[19]
P. Gupta, A. Goel, J. Lin, A. Sharma, D. Wang, and R. Zadeh. Wtf: The who to follow service at twitter. In Proceedings of the 22nd international conference on World Wide Web, pages 505--514. ACM, 2013.
[20]
M. Herlihy. Wait-free synchronization. ACM Transactions on Programming Languages and Systems (TOPLAS), 13(1):124--149, 1991.
[21]
S. Hong, T. Oguntebi, and K. Olukotun. Efficient parallel graph exploration on multi-core cpu and gpu. In Parallel Architectures and Compilation Techniques (PACT), 2011 International Conference on, pages 78--88. IEEE, 2011.
[22]
G. Jeh and J. Widom. Scaling personalized web search. In Proceedings of the 12th international conference on World Wide Web, pages 271--279. ACM, 2003.
[23]
F. Khorasani, K. Vora, R. Gupta, and L. N. Bhuyan. Cusha: vertex-centric graph processing on gpus. In Proceedings of the 23rd international symposium on High-performance parallel and distributed computing, pages 239--252. ACM, 2014.
[24]
C. E. Leiserson. The cilk++ concurrency platform. In Design Automation Conference, 2009. DAC'09. 46th ACM/IEEE, pages 522--527. IEEE, 2009.
[25]
H. Liu and H. H. Huang. Enterprise: Breadth-first graph traversal on gpus. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, page 68. ACM, 2015.
[26]
H. Liu, H. H. Huang, and Y. Hu. ibfs: Concurrent breadth-first search on gpus. In Proceedings of the 2016 International Conference on Management of Data, pages 403--416. ACM, 2016.
[27]
Q. Liu, Z. Li, J. Lui, and J. Cheng. Powerwalk: Scalable personalized pagerank via random walks with vertex-centric decomposition. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pages 195--204. ACM, 2016.
[28]
P. Lofgren. Efficient algorithms for personalized pagerank. arXiv preprint arXiv:1512.04633, 2015.
[29]
P. Lofgren, S. Banerjee, and A. Goel. Personalized pagerank estimation and search: A bidirectional approach. In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, pages 163--172. ACM, 2016.
[30]
P. Lofgren and A. Goel. Personalized pagerank to a target node. arXiv preprint arXiv:1304.4658, 2013.
[31]
P. A. Lofgren, S. Banerjee, A. Goel, and C. Seshadhri. Fast-ppr: scaling personalized pagerank estimation for large graphs. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1436--1445. ACM, 2014.
[32]
T. Maehara, T. Akiba, Y. Iwata, and K.-i. Kawarabayashi. Computing personalized pagerank quickly by exploiting graph structures. PVLDB, 7(12):1023--1034, 2014.
[33]
A. McLaughlin and D. A. Bader. Scalable and high performance betweenness centrality on the gpu. In Proceedings of the International Conference for High performance computing, networking, storage and analysis, pages 572--583. IEEE Press, 2014.
[34]
D. Merrill, M. Garland, and A. Grimshaw. High-performance and scalable gpu graph traversal. ACM Transactions on Parallel Computing, 1(2):14, 2015.
[35]
R. Nasre, M. Burtscher, and K. Pingali. Data-driven versus topology-driven irregular computations on gpus. In Parallel & Distributed Processing (IPDPS), 2013 IEEE 27th International Symposium on, pages 463--474. IEEE, 2013.
[36]
D. Nguyen, A. Lenharth, and K. Pingali. A lightweight infrastructure for graph analytics. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, pages 456--471. ACM, 2013.
[37]
J. Nickolls, I. Buck, M. Garland, and K. Skadron. Scalable parallel programming with cuda. Queue, 6(2):40--53, 2008.
[38]
N. Ohsaka, T. Maehara, and K.-i. Kawarabayashi. Efficient pagerank tracking in evolving networks. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 875--884. ACM, 2015.
[39]
L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: bringing order to the web. 1999.
[40]
B. Perozzi, C. McCubbin, and J. T. Halbert. Scalable graph clustering with parallel approximate pagerank. Social Network Analysis and Mining, 4(1):1--11, 2014.
[41]
M. Sha, Y. Li, B. He, and K.-L. Tan. Accelerating dynamic graph analytics on gpus. PVLDB, 11(1), 2017.
[42]
J. Shun and G. E. Blelloch. Ligra: a lightweight graph processing framework for shared memory. In ACM Sigplan Notices, volume 48, pages 135--146. ACM, 2013.
[43]
J. Shun, L. Dhulipala, and G. Blelloch. A simple and practical linear-work parallel algorithm for connectivity. In Proceedings of the 26th ACM symposium on Parallelism in algorithms and architectures, pages 143--153. ACM, 2014.
[44]
J. Shun, F. Roosta-Khorasani, K. Fountoulakis, and M. W. Mahoney. Parallel local graph clustering. PVLDB, 9(12):1041--1052, 2016.
[45]
M. Then, M. Kaufmann, F. Chirigati, T.-A. Hoang-Vu, K. Pham, A. Kemper, T. Neumann, and H. T. Vo. The more the merrier: Efficient multi-source graph traversal. PVLDB, 8(4):449--460, 2014.
[46]
S. Wang, Y. Tang, X. Xiao, Y. Yang, and Z. Li. Hubppr: effective indexing for approximate personalized pagerank. PVLDB, 10(3):205--216, 2016.
[47]
Y. Wang, A. Davidson, Y. Pan, Y. Wu, A. Riffel, and J. D. Owens. Gunrock: A high-performance graph processing library on the gpu. In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, page 11. ACM, 2016.
[48]
J. Yang and J. Leskovec. Defining and evaluating network communities based on ground-truth. Knowledge and Information Systems, 42(1):181--213, 2015.
[49]
H. Zhang, P. Lofgren, and A. Goel. Approximate personalized pagerank on dynamic graphs. arXiv preprint arXiv:1603.07796, 2016.

Cited By

View all
  • (2024)Revisiting Local Computation of PageRank: Simple and OptimalProceedings of the 56th Annual ACM Symposium on Theory of Computing10.1145/3618260.3649661(911-922)Online publication date: 10-Jun-2024
  • (2024)A survey on dynamic graph processing on GPUs: concepts, terminologies and systemsFrontiers of Computer Science: Selected Publications from Chinese Universities10.1007/s11704-023-2656-118:4Online publication date: 1-Aug-2024
  • (2023)Efficient Personalized PageRank Computation: The Power of Variance-Reduced Monte Carlo ApproachesProceedings of the ACM on Management of Data10.1145/35893051:2(1-26)Online publication date: 20-Jun-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the VLDB Endowment
Proceedings of the VLDB Endowment  Volume 11, Issue 1
Proceedings of the 44th International Conference on Very Large Data Bases, Rio de Janeiro, Brazil
September 2017
120 pages
ISSN:2150-8097
Issue’s Table of Contents

Publisher

VLDB Endowment

Publication History

Published: 01 September 2017
Published in PVLDB Volume 11, Issue 1

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)43
  • Downloads (Last 6 weeks)3
Reflects downloads up to 16 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Revisiting Local Computation of PageRank: Simple and OptimalProceedings of the 56th Annual ACM Symposium on Theory of Computing10.1145/3618260.3649661(911-922)Online publication date: 10-Jun-2024
  • (2024)A survey on dynamic graph processing on GPUs: concepts, terminologies and systemsFrontiers of Computer Science: Selected Publications from Chinese Universities10.1007/s11704-023-2656-118:4Online publication date: 1-Aug-2024
  • (2023)Efficient Personalized PageRank Computation: The Power of Variance-Reduced Monte Carlo ApproachesProceedings of the ACM on Management of Data10.1145/35893051:2(1-26)Online publication date: 20-Jun-2023
  • (2023)Personalized PageRank on Evolving Graphs with an Incremental Index-Update SchemeProceedings of the ACM on Management of Data10.1145/35887051:1(1-26)Online publication date: 30-May-2023
  • (2023)Real-Time PageRank on Dynamic GraphsProceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing10.1145/3588195.3593004(239-251)Online publication date: 7-Aug-2023
  • (2022)Subset Node Anomaly Tracking over Large Dynamic GraphsProceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3534678.3539389(475-485)Online publication date: 14-Aug-2022
  • (2022)An Approach to Detect Fake Profiles in Social Networks Using Cellular Automata-Based PageRank Validation Model Involving Energy TransferSN Computer Science10.1007/s42979-022-01315-63:6Online publication date: 6-Oct-2022
  • (2021)ThunderRWProceedings of the VLDB Endowment10.14778/3476249.347625714:11(1992-2005)Online publication date: 27-Oct-2021
  • (2021)Accelerating exact constrained shortest paths on GPUsProceedings of the VLDB Endowment10.14778/3436905.343691414:4(547-559)Online publication date: 22-Feb-2021
  • (2021)AgendaProceedings of the 30th ACM International Conference on Information & Knowledge Management10.1145/3459637.3482317(1315-1324)Online publication date: 26-Oct-2021
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media