Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3524059.3532379acmconferencesArticle/Chapter ViewAbstractPublication PagesicsConference Proceedingsconference-collections
research-article
Open access

Bring orders into uncertainty: enabling efficient uncertain graph processing via novel path sampling on multi-accelerator systems

Published: 28 June 2022 Publication History

Abstract

Uncertain or probabilistic graphs have been ubiquitously used to represent noisy, incomplete, and inaccurate linked data in many emerging big-data mining and analytics applications. It is impractical to solve uncertain graph problems exactly as it requires to evaluate an exponential number of certain instances (or "possible worlds") generated from an uncertain graph. Previously, several CPU-based techniques were proposed to use sampling for uncertain graph processing. However, we observe that (1) they suffer from low computation efficiency and large memory overhead due to unnecessary edge sampling at runtime; (2) they cannot leverage the massive parallelism provided by modern general-purpose accelerators; and (3) there lacks a general programming framework for high-performance uncertain graph processing. To tackle these challenges, we propose a novel runtime path sampling method, which is able to identify and eliminate unnecessary edge sampling via incremental path identification and filtering, resulting in significant reduction in computation and data movement. Centered around this idea, we introduce a general uncertain graph processing framework for multi-GPU systems, named BPGraph1. BPGraph provides general support for users to design and optimize a wide-range of uncertain graph algorithms and applications without concerning about the underlying complexity. Extensive evaluation on a variety of real-world uncertain graph applications demonstrates an average speedup of 26X (up to 43X) and better scalability from BPGraph over the state-of-the-art frameworks.

References

[1]
http://law.di.unimi.it/webdata/. LAW web dataset. (http://law.di.unimi.it/webdata/).
[2]
Eytan Adar and Christopher Re. 2007. Managing uncertainty in social networks. IEEE Data Eng. Bull. 30, 2 (2007), 15--22.
[3]
Michael O Ball. 1986. Computational complexity of network reliability analysis: An overview. IEEE Transactions on Reliability 35, 3 (1986), 230--239.
[4]
Tal Ben-Nun, Michael Sutton, Sreepathi Pai, and Keshav Pingali. 2017. Groute: An asynchronous multi-GPU programming model for irregular computations. In Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, 235--248.
[5]
Paolo Boldi, Francesco Bonchi, Aris Gionis, and Tamir Tassa. 2012. Injecting uncertainty in graphs for identity obfuscation. arXiv preprint arXiv:1208.4145 (2012).
[6]
Federico Busato and Nicola Bombieri. 2015. BFS-4K: an efficient implementation of BFS for kepler GPU architectures. IEEE Transactions on Parallel and Distributed Systems 26, 7 (2015), 1826--1838.
[7]
Yurong Cheng, Ye Yuan, Lei Chen, and Guoren Wang. 2015. The reachability query over distributed uncertain graphs. In 2015 IEEE 35th International Conference on Distributed Computing Systems. IEEE, 786--787.
[8]
Yurong Cheng, Ye Yuan, Lei Chen, Guoren Wang, Christophe Giraud-Carrier, and Yongjiao Sun. 2016. Distr: A distributed method for the reachability query over large uncertain graphs. IEEE Transactions on Parallel and Distributed Systems 27, 11 (2016), 3172--3185.
[9]
Hristo Djidjev, Sunil Thulasidasan, Guillaume Chapuis, Rumen Andonov, and Dominique Lavenier. 2014. Efficient multi-GPU computation of all-pairs shortest paths. In Parallel and Distributed Processing Symposium, 2014 IEEE 28th International. IEEE, 360--369.
[10]
Talya Eden, Shweta Jain, Ali Pinar, Dana Ron, and C. Seshadhri. 2018. Provable and Practical Approximations for the Degree Distribution Using Sublinear Graph Samples. In Proceedings of the 2018 World Wide Web Conference (Lyon, France) (WWW'18). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, 449--458.
[11]
Talya Eden, Amit Levi, Dana Ron, and C. Seshadhri. 2015. Approximately Counting Triangles in Sublinear Time. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science. 614--633.
[12]
Talya Eden, Dana Ron, and C. Seshadhri. 2020. Faster Sublinear Approximation of the Number of k-Cliques in Low-Arboricity Graphs. In Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms (Salt Lake City, Utah) (SODA'20). Society for Industrial and Applied Mathematics, USA, 1467--1478.
[13]
George S Fishman. 1986. A comparison of four Monte Carlo methods for estimating the probability of st connectedness. IEEE Transactions on reliability 35, 2 (1986), 145--155.
[14]
Zhisong Fu, Michael Personick, and Bryan Thompson. 2014. Mapgraph: A high level API for fast development of high performance graph analytics on GPUs. In Proceedings of Workshop on GRAph Data management Experiences and Systems. ACM, 1--6.
[15]
Anil Gaihre, Zhenlin Wu, Fan Yao, and Hang Liu. 2019. XBFS: eXploring runtime optimizations for breadth-first search on GPUs. In Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing. 121--131.
[16]
Anil Gaihre, Da Zheng, Scott Weitze, Lingda Li, Shuaiwen Leon Song, Caiwen Ding, Xiaoye S Li, and Hang Liu. 2021. Dr. Top-k: delegate-centric Top-k on GPUs. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. 1--14.
[17]
Xiulian Gao and Yuan Gao. 2013. Connectedness index of uncertain graph. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 21, 01 (2013), 127--137.
[18]
Tong Geng, Tianqi Wang, Chunshu Wu, Chen Yang, Shuaiwen Leon Song, Ang Li, and Martin Herbordt. 2019. LP-BNN: Ultra-low-latency BNN inference with layer parallelism. In 2019 IEEE 30th International Conference on Application-specific Systems, Architectures and Processors (ASAP), Vol. 2160. IEEE, 9--16.
[19]
Joseph E Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin. 2012. Powergraph: Distributed graph-parallel computation on natural graphs. In Presented as part of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 12). 17--30.
[20]
Pawan Harish and PJ Narayanan. 2007. Accelerating large graph algorithms on the GPU using CUDA. In International Conference on High-Performance Computing. Springer, 197--208.
[21]
Mark Harris and Kyrylo Perelygin. 2017. Cooperative groups: Flexible CUDA thread programming. NVIDIA Developer Blog (2017).
[22]
Sungpack Hong, Sang Kyun Kim, Tayo Oguntebi, and Kunle Olukotun. 2011. Accelerating CUDA graph algorithms at maximum warp. In ACM SIGPLAN Notices, Vol. 46. ACM, 267--276.
[23]
Sungpack Hong, Tayo Oguntebi, and Kunle Olukotun. 2011. Efficient parallel graph exploration on multi-core CPU and GPU. In Parallel Architectures and Compilation Techniques (PACT), 2011 International Conference on. IEEE, 78--88.
[24]
Ming Hua and Jian Pei. 2010. Probabilistic path queries in road networks: traffic uncertainty aware path selection. In Proceedings of the 13th International Conference on Extending Database Technology. 347--358.
[25]
Ruoming Jin, Lin Liu, Bolin Ding, and Haixun Wang. 2011. Distance-constraint reachability computation in uncertain graphs. Proceedings of the VLDB Endowment 4, 9 (2011), 551--562.
[26]
Arijit Khan and Lei Chen. 2015. On uncertain graphs modeling and queries. Proceedings of the VLDB Endowment 8, 12 (2015), 2042--2043.
[27]
Farzad Khorasani, Rajiv Gupta, and Laxmi N Bhuyan. 2015. Scalable simd-efficient graph processing on gpus. In 2015 International Conference on Parallel Architecture and Compilation (PACT). IEEE, 39--50.
[28]
Farzad Khorasani, Keval Vora, Rajiv Gupta, and Laxmi N Bhuyan. 2014. CuSha: vertex-centric graph processing on GPUs. In Proceedings of the 23rd international symposium on High-performance parallel and distributed computing. ACM, 239--252.
[29]
Hyeongsik Kim, Abhisha Bhattacharyya, and Kemafor Anyanwu. 2019. Semantic Query Transformations for Increased Parallelization in Distributed Knowledge Graph Query Processing. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (Denver, Colorado) (SC '19). Association for Computing Machinery, New York, NY, USA, Article 4, 14 pages.
[30]
Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. 2010. What is Twitter, a social network or a news media?. In WWW '10: Proceedings of the 19th international conference on World wide web (Raleigh, North Carolina, USA). ACM, New York, NY, USA, 591--600.
[31]
Dominique LaSalle and George Karypis. 2013. Multi-threaded graph partitioning. In 2013 IEEE 27th International Symposium on Parallel and Distributed Processing. IEEE, 225--236.
[32]
Ang Li, Weifeng Liu, Linnan Wang, Kevin Barker, and Shuaiwen Leon Song. 2018. Warp-consolidation: A novel execution model for gpus. In Proceedings of the 2018 International Conference on Supercomputing. 53--64.
[33]
Ang Li, Shuaiwen Leon Song, Eric Brugel, Akash Kumar, Daniel Chavarria-Miranda, and Henk Corporaal. 2016. X: A comprehensive analytic model for parallel machines. In 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 242--252.
[34]
Ang Li, Shuaiwen Leon Song, Akash Kumar, Eddy Z Zhang, Daniel Chavarría-Miranda, and Henk Corporaal. 2016. Critical points based register-concurrency autotuning for GPUs. In 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 1273--1278.
[35]
Rong-Hua Li, Jeffrey Xu Yu, Rui Mao, and Tan Jin. 2015. Recursive stratified sampling: A new framework for query evaluation on uncertain graphs. IEEE Transactions on Knowledge and Data Engineering 28, 2 (2015), 468--482.
[36]
Fragkiskos D Malliaros, Christos Giatsidis, Apostolos N Papadopoulos, and Michalis Vazirgiannis. 2020. The core decomposition of networks: Theory, algorithms and applications. The VLDB Journal 29, 1 (2020), 61--92.
[37]
Silviu Maniu, Reynold Cheng, and Pierre Senellart. 2017. An Indexing Framework for Queries on Probabilistic Graphs. ACM Transactions on Database Systems 42, 2 (2017).
[38]
Maxime Martinasso, Grzegorz Kwasniewski, Sadaf R Alam, Thomas C Schulthess, and Torsten Hoefler. 2016. A PCIe congestion-aware performance model for densely populated accelerator servers. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE Press, 63.
[39]
Robert Ryan McCune, Tim Weninger, and Greg Madey. 2015. Thinking like a vertex: a survey of vertex-centric frameworks for large-scale distributed graph processing. ACM Computing Surveys (CSUR) 48, 2 (2015), 25.
[40]
Duane Merrill, Michael Garland, and Andrew Grimshaw. 2012. Scalable GPU graph traversal. In ACM SIGPLAN Notices, Vol. 47. ACM, 117--128.
[41]
Duane Merrill, Michael Garland, and Andrew Grimshaw. 2015. High-performance and scalable GPU graph traversal. ACM Transactions on Parallel Computing 1, 2 (2015), 14.
[42]
Marco Minutoli, Prathyush Sambaturu, Mahantesh Halappanavar, Antonino Tumeo, Ananth Kalyanaraman, and Anil Vullikanti. 2020. Preempt: Scalable Epidemic Interventions Using Submodular Optimization on Multi-GPU Systems. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (Atlanta, Georgia) (SC '20). IEEE Press, Article 55, 15 pages.
[43]
Mark EJ Newman. 2003. The structure and function of complex networks. SIAM review 45, 2 (2003), 167--256.
[44]
Evdokia Nikolova, Matthew Brand, and David R Karger. 2006. Optimal Route Planning under Uncertainty. In Icaps, Vol. 6. 131--141.
[45]
Yuechao Pan, Yangzihao Wang, Yuduo Wu, Carl Yang, and John D Owens. 2015. Multi-GPU graph analytics. arXiv preprint arXiv:1504.04804 (2015).
[46]
Santosh Pandey, Lingda Li, Adolfy Hoisie, Xiaoye S Li, and Hang Liu. 2020. C-SAW: A framework for graph sampling and random walk on GPUs. In SC20: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 1--15.
[47]
Panos Parchas, Francesco Gullo, Dimitris Papadias, and Franceseco Bonchi. 2014. The pursuit of a good possible world: extracting representative instances of uncertain graphs. In Proceedings of the 2014 ACM SIGMOD international conference on management of data. 967--978.
[48]
Panos Parchas, Francesco Gullo, Dimitris Papadias, and Francesco Bonchi. 2015. Uncertain graph processing through representative instances. ACM Transactions on Database Systems (TODS) 40, 3 (2015), 1--39.
[49]
Michalis Potamias, Francesco Bonchi, Aristides Gionis, and George Kollios. 2010. K-nearest neighbors in uncertain graphs. Proceedings of the VLDB Endowment 3, 1--2 (2010), 997--1008.
[50]
Ryan A. Rossi and Nesreen K. Ahmed. 2015. The Network Data Repository with Interactive Graph Analytics and Visualization. In AAAI. http://networkrepository.com
[51]
Arkaprava Saha, Ruben Brokkelkamp, Yllka Velaj, Arijit Khan, and Francesco Bonchi. 2021. Shortest paths and centrality in uncertain networks. Proceedings of the VLDB Endowment 14, 7 (2021), 1188--1201.
[52]
Stephen B Seidman. 1983. Network structure and minimum degree. Social networks 5, 3 (1983), 269--287.
[53]
Dipanjan Sengupta and Shuaiwen Leon Song. 2017. EvoGraph: On-the-Fly Efficient Mining of Evolving Graphs on GPU. In High Performance Computing. Springer International Publishing, Cham, 97--119.
[54]
Dipanjan Sengupta, Shuaiwen Leon Song, Kapil Agarwal, and Karsten Schwan. 2015. GraphReduce: processing large-scale graphs on accelerator-based systems. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 28.
[55]
Philip Taffet and John Mellor-Crummey. 2019. Understanding Congestion in High Performance Interconnection Networks Using Sampling. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (Denver, Colorado) (SC '19). Association for Computing Machinery, New York, NY, USA, Article 43, 24 pages.
[56]
Jingweijia Tan, Shuaiwen Leon Song, Kaige Yan, Xin Fu, Andres Marquez, and Darren Kerbyson. 2016. Combating the reliability challenge of GPU register file at low supply voltage. In 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT). IEEE, 3--15.
[57]
Yuanyuan Tian, Andrey Balmin, Severin Andreas Corsten, Shirish Tatikonda, and John McPherson. 2013. From think like a vertex to think like a graph. Proceedings of the VLDB Endowment 7, 3 (2013), 193--204.
[58]
Ha-Nguyen Tran, Jung-jae Kim, and Bingsheng He. 2015. Fast subgraph matching on large graphs using graphics processors. In International Conference on Database Systems for Advanced Applications. Springer, 299--315.
[59]
Yangzihao Wang, Andrew Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, and John D Owens. 2016. Gunrock: A high-performance graph processing library on the GPU. In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM, 11.
[60]
Yuduo Wu, Yangzihao Wang, Yuechao Pan, Carl Yang, and John D Owens. 2015. Performance characterization of high-level programming models for GPU graph analytics. In Workload Characterization (IISWC), 2015 IEEE International Symposium on. IEEE, 66--75.
[61]
Reynold S Xin, Joseph E Gonzalez, Michael J Franklin, and Ion Stoica. 2013. Graphx: A resilient distributed graph system on spark. In First international workshop on graph data management experiences and systems. 1--6.
[62]
Bohua Yang, Dong Wen, Lu Qin, Ying Zhang, Lijun Chang, and Rong-Hua Li. 2019. Index-Based Optimal Algorithm for Computing K-Cores in Large Uncertain Graphs. In 2019 IEEE 35th International Conference on Data Engineering (ICDE). IEEE, 64--75.
[63]
Ye Yuan, Lei Chen, and Guoren Wang. 2010. Efficiently answering probability threshold-based shortest path queries over uncertain graphs. In International Conference on Database Systems for Advanced Applications. Springer, 155--170.
[64]
Yu Zhang, Xiaofei Liao, Hai Jin, Bingsheng He, Haikun Liu, and Lin Gu. 2019. DiGraph: An Efficient Path-Based Iterative Directed Graph Processing System on Multiple GPUs. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (Providence, RI, USA) (ASPLOS '19). Association for Computing Machinery, New York, NY, USA, 601--614.
[65]
Jianlong Zhong and Bingsheng He. 2014. Medusa: Simplified graph processing on GPUs. IEEE Transactions on Parallel and Distributed Systems 25, 6 (2014), 1543--1552.
[66]
Jian Zhou, Fan Yang, and Ke Wang. 2014. An inverse shortest path problem on an uncertain graph. Journal of Networks 9, 9 (2014), 2353.
[67]
Rong Zhu, Zhaonian Zou, and Jianzhong Li. 2017. Towards efficient top-k reliability search on uncertain graphs. Knowledge and Information Systems 50, 3 (2017), 723--750.
[68]
Xiaowei Zhu, Wentao Han, and Wenguang Chen. 2015. GridGraph: Large-Scale Graph Processing on a Single Machine Using 2-Level Hierarchical Partitioning. In USENIX Annual Technical Conference. 375--386.
[69]
Zhaonian Zou, Faming Li, Jianzhong Li, and Yingshu Li. 2017. Scalable Processing of Massive Uncertain Graph Data: A Simultaneous Processing Approach. In IEEE International Conference on Data Engineering.
[70]
Zhaonian Zou, Jianzhong Li, Hong Gao, and Shuo Zhang. 2010. Finding top-k maximal cliques in an uncertain graph. In 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010). IEEE, 649--652.

Cited By

View all
  • (2024)uBlade: Efficient Batch Processing for Uncertainty Graph QueriesProceedings of the ACM on Management of Data10.1145/36549822:3(1-24)Online publication date: 30-May-2024

Index Terms

  1. Bring orders into uncertainty: enabling efficient uncertain graph processing via novel path sampling on multi-accelerator systems

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    ICS '22: Proceedings of the 36th ACM International Conference on Supercomputing
    June 2022
    514 pages
    ISBN:9781450392815
    DOI:10.1145/3524059
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 28 June 2022

    Check for updates

    Author Tags

    1. GPU
    2. performance
    3. sampling
    4. uncertain graph

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    ICS '22
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 629 of 2,180 submissions, 29%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)173
    • Downloads (Last 6 weeks)18
    Reflects downloads up to 24 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)uBlade: Efficient Batch Processing for Uncertainty Graph QueriesProceedings of the ACM on Management of Data10.1145/36549822:3(1-24)Online publication date: 30-May-2024

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media