Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3605573.3605579acmotherconferencesArticle/Chapter ViewAbstractPublication PagesicppConference Proceedingsconference-collections
research-article

Connectivity-Aware Link Analysis for Skewed Graphs

Published: 13 September 2023 Publication History

Abstract

Link analysis is a fundamental task for graph analytics, as it enables the identification of important nodes and patterns in the graph. Link analysis algorithms typically require traversing the graph and accessing the links of each node. However, for graphs with a skewed degree distribution, the computing efficiency of link analysis is severely constrained due to irregular connectivity, which results in randomized memory accesses and high cache miss ratio.
In this paper, we conduct a thorough investigation into skewed graphs’ structural characteristics, focusing on their interaction with the computational patterns observed in link analysis algorithms and the underlying hardware architecture. Generalizing the understanding, we develop a novel framework named Mixen. Mixen applies a lightweight filtering procedure to enhance graph locality and reschedule computations. Different graph components are selectively processed under distinctive paradigms. Thereby, Mixen allows for efficient graph traversal and performance optimization.
We evaluate Mixen on modern multicore systems, comparing its performance to state-of-the-art frameworks across various graphs and algorithms. The results indicate that Mixen significantly outperforms its counterparts. Over the best-performing alternative, Mixen achieves a 3.42 × speedup in execution time. Furthermore, we explore the design space of Mixen, examining its trade-offs and the correlation with cache-memory dynamics.

References

[1]
Hartwig Anzt, Terry Cojean, Chen Yen-Chen, Jack Dongarra, Goran Flegar, Pratik Nayak, Stanimire Tomov, Yuhsiang M Tsai, and Weichung Wang. 2020. Load-balancing sparse matrix vector product kernels on GPUs. ACM Transactions on Parallel Computing (TOPC) 7, 1 (2020), 1–26.
[2]
Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary Ives. 2007. Dbpedia: A nucleus for a web of open data. In The semantic web. Springer, 722–735.
[3]
Scott Beamer, Krste Asanović, and David Patterson. 2015. The GAP benchmark suite. arXiv preprint arXiv:1508.03619 (2015).
[4]
Scott Beamer, Krste Asanović, and David Patterson. 2017. Reducing pagerank communication via propagation blocking. In 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 820–831.
[5]
Ronald F Boisvert, Ronald F Boisvert, and Karin A Remington. 1996. The matrix market exchange formats: Initial design. Vol. 5935. US Department of Commerce, National Institute of Standards and Technology.
[6]
Allan Borodin, Gareth O Roberts, Jeffrey S Rosenthal, and Panayiotis Tsaparas. 2005. Link analysis ranking: algorithms, theory, and experiments. ACM Transactions on Internet Technology (TOIT) 5, 1 (2005), 231–297.
[7]
Daniele Buono, Fabrizio Petrini, Fabio Checconi, Xing Liu, Xinyu Que, Chris Long, and Tai-Ching Tuan. 2016. Optimizing sparse matrix-vector multiplication for large-scale data analytics. In Proceedings of the 2016 International Conference on Supercomputing. 1–12.
[8]
Peter J Carrington, John Scott, and Stanley Wasserman. 2005. Models and methods in social network analysis. Vol. 28. Cambridge university press.
[9]
Deepayan Chakrabarti, Yiping Zhan, and Christos Faloutsos. 2004. R-MAT: A recursive model for graph mining. In Proceedings of the 2004 SIAM International Conference on Data Mining. SIAM, 442–446.
[10]
YuAng Chen and Yeh-ching Chung. 2021. HiPa: Hierarchical Partitioning for Fast PageRank on NUMA Multicore Systems. In 50th International Conference on Parallel Processing. 1–10.
[11]
YuAng Chen and Yeh-Ching Chung. 2021. Workload Balancing via Graph Reordering on Multicore Systems. IEEE Transactions on Parallel and Distributed Systems 33, 5 (2021), 1231–1245.
[12]
YuAng Chen and Yeh-Ching Chung. 2023. An Unequal Caching Strategy for Shared-Memory Graph Analytics. IEEE Transactions on Parallel and Distributed Systems (2023).
[13]
Leonardo Dagum and Ramesh Menon. 1998. OpenMP: an industry standard API for shared-memory programming. IEEE computational science and engineering 5, 1 (1998), 46–55.
[14]
Michalis Faloutsos, Petros Faloutsos, and Christos Faloutsos. 1999. On power-law relationships of the internet topology. ACM SIGCOMM computer communication review 29, 4 (1999), 251–262.
[15]
Goran Flegar and Hartwig Anzt. 2017. Overcoming load imbalance for irregular sparse matrices. In Proceedings of the Seventh Workshop on Irregular Applications: Architectures and Algorithms. 1–8.
[16]
Joseph L Greathouse and Mayank Daga. 2014. Efficient sparse matrix-vector multiplication on GPUs using the CSR storage format. In SC’14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 769–780.
[17]
Roger G Grimes, David R Kincaid, and David M Young. 1979. ITPACK 2.0 user’s guide. Center for Numerical Analysis, Univ.
[18]
Jeremy Kepner and John Gilbert. 2011. Graph algorithms in the language of linear algebra. SIAM.
[19]
Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016).
[20]
Jon M Kleinberg. 1999. Authoritative sources in a hyperlinked environment. Journal of the ACM (JACM) 46, 5 (1999), 604–632.
[21]
Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix factorization techniques for recommender systems. Computer 42, 8 (2009), 30–37.
[22]
Jérôme Kunegis. 2013. Konect: the koblenz network collection. In Proceedings of the 22nd International Conference on World Wide Web. 1343–1350.
[23]
Aapo Kyrola, Guy Blelloch, and Carlos Guestrin. 2012. Graphchi: Large-scale graph computation on just a { PC}. In 10th { USENIX} Symposium on Operating Systems Design and Implementation ({ OSDI} 12). 31–46.
[24]
Kartik Lakhotia, Rajgopal Kannan, Sourav Pati, and Viktor Prasanna. 2020. GPOP: A Scalable Cache-and Memory-efficient Framework for Graph Processing over Parts. ACM Transactions on Parallel Computing (TOPC) 7, 1 (2020), 1–24.
[25]
Kartik Lakhotia, Rajgopal Kannan, and Viktor Prasanna. 2018. Accelerating pagerank using partition-centric processing. In 2018 { USENIX} Annual Technical Conference ({ USENIX}{ ATC} 18). 427–440.
[26]
Oliver Lehmberg, Robert Meusel, and Christian Bizer. 2014. Graph structure in the web: aggregated by pay-level domain. In Proceedings of the 2014 ACM conference on Web science. 119–128.
[27]
Ronny Lempel and Shlomo Moran. 2001. SALSA: the stochastic approach for link-structure analysis. ACM Transactions on Information Systems (TOIS) 19, 2 (2001), 131–160.
[28]
Min Li, Yulong Ao, and Chao Yang. 2020. Adaptive SpMV/SpMSpV on GPUs for input vectors of varied sparsity. IEEE Transactions on Parallel and Distributed Systems 32, 7 (2020), 1842–1853.
[29]
Steffen Maass, Changwoo Min, Sanidhya Kashyap, Woonhak Kang, Mohan Kumar, and Taesoo Kim. 2017. Mosaic: Processing a trillion-edge graph on a single machine. In Proceedings of the Twelfth European Conference on Computer Systems. 527–543.
[30]
Frank McSherry, Michael Isard, and Derek G Murray. 2015. Scalability! But at what { COST} ?. In 15th Workshop on Hot Topics in Operating Systems (HotOS { XV}).
[31]
Robert Meusel, Sebastiano Vigna, Oliver Lehmberg, and Christian Bizer. 2015. The graph structure in the web–analyzed on different aggregation levels. The Journal of Web Science 1 (2015).
[32]
Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999. The PageRank citation ranking: Bringing order to the web.Technical Report. Stanford InfoLab.
[33]
Ryan A Rossi and Nesreen K Ahmed. 2016. An interactive data repository with visual analytics. ACM SIGKDD Explorations Newsletter 17, 2 (2016), 37–41.
[34]
Amitabha Roy, Ivo Mihailovic, and Willy Zwaenepoel. 2013. X-stream: Edge-centric graph processing using streaming partitions. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. 472–488.
[35]
Yousef Saad. 2003. Iterative methods for sparse linear systems. SIAM.
[36]
Sebastian Schelter and Jérôme Kunegis. 2016. Tracking the trackers: A large-scale analysis of embedded web trackers. In Tenth International AAAI Conference on Web and Social Media.
[37]
Julian Shun and Guy E Blelloch. 2013. Ligra: a lightweight graph processing framework for shared memory. In Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming. 135–146.
[38]
Narayanan Sundaram, Nadathur Rajagopalan Satish, Md Mostofa Ali Patwary, Subramanya R Dulloor, Satya Gautam Vadlamudi, Dipankar Das, and Pradeep Dubey. 2015. Graphmat: High performance graph analytics made productive. arXiv preprint arXiv:1503.07241 (2015).
[39]
Jan Treibig, Georg Hager, and Gerhard Wellein. 2010. Likwid: A lightweight performance-oriented tool suite for x86 multicore environments. In 2010 39th International Conference on Parallel Processing Workshops. IEEE, 207–216.
[40]
Endong Wang, Qing Zhang, Bo Shen, Guangyong Zhang, Xiaowei Lu, Qing Wu, and Yajuan Wang. 2014. Intel math kernel library. In High-Performance Computing on the Intel® Xeon Phi™. Springer, 167–188.
[41]
Vincent M Weaver. 2013. Linux perf_event features and overhead. In The 2nd International Workshop on Performance Analysis of Workload Optimized Systems, FastPath, Vol. 13. 5.
[42]
Biwei Xie, Jianfeng Zhan, Xu Liu, Wanling Gao, Zhen Jia, Xiwen He, and Lixin Zhang. 2018. Cvr: Efficient vectorization of spmv on x86 processors. In Proceedings of the 2018 International Symposium on Code Generation and Optimization. 149–162.
[43]
Kaiyuan Zhang, Rong Chen, and Haibo Chen. 2015. NUMA-aware graph-structured analytics. In Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. 183–193.
[44]
Yunming Zhang, Vladimir Kiriansky, Charith Mendis, Saman Amarasinghe, and Matei Zaharia. 2017. Making caches work for graph analytics. In 2017 IEEE International Conference on Big Data (Big Data). IEEE, 293–302.
[45]
Yunming Zhang, Mengjiao Yang, Riyadh Baghdadi, Shoaib Kamil, Julian Shun, and Saman Amarasinghe. 2018. Graphit: A high-performance graph dsl. Proceedings of the ACM on Programming Languages 2, OOPSLA (2018), 1–30.
[46]
Xiaowei Zhu, Wenguang Chen, Weimin Zheng, and Xiaosong Ma. 2016. Gemini: A computation-centric distributed graph processing system. In 12th { USENIX} symposium on operating systems design and implementation ({ OSDI} 16). 301–316.

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Other conferences
ICPP '23: Proceedings of the 52nd International Conference on Parallel Processing
August 2023
858 pages
ISBN:9798400708435
DOI:10.1145/3605573
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 September 2023

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Link Analysis Algorithms
  2. Multicore Systems
  3. Skewed Graphs

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPP 2023
ICPP 2023: 52nd International Conference on Parallel Processing
August 7 - 10, 2023
UT, Salt Lake City, USA

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 40
    Total Downloads
  • Downloads (Last 12 months)33
  • Downloads (Last 6 weeks)5
Reflects downloads up to 19 Nov 2024

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media