research-article

Public Access

Distributed Graph Processing System and Processing-in-memory Architecture with Precise Loop-carried Dependency Guarantee

Authors:

Xuehai QianAuthors Info & Claims

ACM Transactions on Computer Systems (TOCS), Volume 37, Issue 1-4

Article No.: 5, Pages 1 - 37

https://doi.org/10.1145/3453681

Published: 01 July 2021 Publication History

All formats PDF

Abstract

To hide the complexity of the underlying system, graph processing frameworks ask programmers to specify graph computations in user-defined functions (UDFs) of graph-oriented programming model. Due to the nature of distributed execution, current frameworks cannot precisely enforce the semantics of UDFs, leading to unnecessary computation and communication. It exemplifies a gap between programming model and runtime execution. This article proposes novel graph processing frameworks for distributed system and Processing-in-memory (PIM) architecture that precisely enforces loop-carried dependency; i.e., when a condition is satisfied by a neighbor, all following neighbors can be skipped. Our approach instruments the UDFs to express the loop-carried dependency, then the distributed execution framework enforces the precise semantics by performing dependency propagation dynamically. Enforcing loop-carried dependency requires the sequential processing of the neighbors of each vertex distributed in different nodes. We propose to circulant scheduling in the framework to allow different nodes to process disjoint sets of edges/vertices in parallel while satisfying the sequential requirement. The technique achieves an excellent trade-off between precise semantics and parallelism—the benefits of eliminating unnecessary computation and communication offset the reduced parallelism. We implement a new distributed graph processing framework SympleGraph, and two variants of runtime systems—GraphS and GraphSR—for PIM-based graph processing architecture, which significantly outperform the state-of-the-art.

References

[1]

Junwhan Ahn, Sungpack Hong, Sungjoo Yoo, Onur Mutlu, and Kiyoung Choi. 2015. A scalable processing-in-memory accelerator for parallel graph processing. In Proceedings of the ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA’15). IEEE, 105–117.

Digital Library

[2]

Tero Aittokallio and Benno Schwikowski. 2006. Graph-based methods for analysing networks in cell biology. Brief. Bioinform. 7, 3 (2006), 243–255.

[3]

Andrei Alexandrescu and Katrin Kirchhoff. 2007. Data-driven graph construction for semi-supervised graph-based learning in NLP. In Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL’07). 204–211.

[4]

ARM. 2009. ARM Cortex-A5 Processor. Retrieved from http://www.arm.com/products/processors/cortex-a/cortex-a5.php.

[5]

Abanti Basak, Shuangchen Li, Xing Hu, Sang Min Oh, Xinfeng Xie, Li Zhao, Xiaowei Jiang, and Yuan Xie. 2019. Analysis and optimization of the memory hierarchy for graph processing workloads. In Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA’19). IEEE, 373–386.

[6]

Peter W Battaglia, Jessica B Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner et al. 2018. Relational inductive biases, deep learning, and graph networks. Retrieved from https://arXiv:1806.01261.

[7]

Scott Beamer, Krste Asanović, and David Patterson. 2012. Direction-optimizing breadth-first search. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC’12). IEEE Computer Society Press, Los Alamitos, CA, Article 12, 10 pages. Retrieved from http://dl.acm.org/citation.cfm?id=2388996.2389013.

Digital Library

[8]

Scott Beamer, Krste Asanović, and David Patterson. 2015. The GAP Benchmark Suite. Retrieved from https://arXiv:cs.DC/1508.03619.

[9]

Scott Beamer, Aydin Buluc, Krste Asanovic, and David Patterson. 2013. Distributed memory breadth-first search revisited: Enabling bottom-up search. In Proceeding sof the IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum. IEEE, 1618–1627.

[10]

Paolo Boldi, Marco Rosa, Massimo Santini, and Sebastiano Vigna. 2011. Layered label propagation: A multiresolution coordinate-free ordering for compressing social networks. In Proceedings of the 20th International Conference on World Wide Web. ACM, 587–596.

Digital Library

[11]

Paolo Boldi and Sebastiano Vigna. 2004. The webgraph framework I: Compression techniques. In Proceedings of the 13th International Conference on World Wide Web. ACM, 595–602.

Digital Library

[12]

Aydin Buluc, Scott Beamer, Kamesh Madduri, Krste Asanovic, and David Patterson. 2017. Distributed-memory breadth-first search on massive graphs. Retrieved from https://arXiv:1705.04590.

[13]

Deepayan Chakrabarti, Yiping Zhan, and Christos Faloutsos. 2004. R-MAT: A recursive model for graph mining. In Proceedings of the SIAM International Conference on Data Mining. SIAM, 442–446.

[14]

Rong Chen, Jiaxin Shi, Yanzhe Chen, and Haibo Chen. 2015. PowerLyra: Differentiated graph computation and partitioning on skewed graphs. In Proceedings of the 10th European Conference on Computer Systems (EuroSys’15). ACM, New York, NY, Article 1, 15 pages.

Digital Library

[15]

Thayne Coffman, Seth Greenblatt, and Sherry Marcus. 2004. Graph-based technologies for intelligence analysis. Commun. ACM 47, 3 (Mar. 2004), 45–47.

Digital Library

[16]

Hybrid Memory Cube Consortium. 2015. Hybrid Memory Cube Specification Version 2.1. Technical Report.

[17]

Guohao Dai, Tianhao Huang, Yuze Chi, Jishen Zhao, Guangyu Sun, Yongpan Liu, Yu Wang, Yuan Xie, and Huazhong Yang. 2018. Graphh: A processing-in-memory architecture for large-scale graph processing. IEEE Trans. Comput.-Aided Design Integr. Circ. Syst. 34, 4 (2018), 640–653.

[18]

Roshan Dathathri, Gurbinder Gill, Loc Hoang, Hoang-Vu Dang, Alex Brooks, Nikoli Dryden, Marc Snir, and Keshav Pingali. 2018. Gluon: A communication-optimizing substrate for distributed heterogeneous graph analytics. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’18). ACM, New York, NY, 752–768.

Digital Library

[19]

Anton J. Enright and Christos A. Ouzounis. 2001. BioLayout—An automatic graph layout algorithm for similarity visualization. Bioinformatics 17, 9 (2001), 853–854.

[20]

Wenfei Fan, Jingbo Xu, Yinghui Wu, Wenyuan Yu, Jiaxin Jiang, Zeyu Zheng, Bohan Zhang, Yang Cao, and Chao Tian. 2017. Parallelizing sequential graph computations. In Proceedings of the ACM International Conference on Management of Data (SIGMOD’17). ACM, New York, NY, 495–510.

Digital Library

[21]

Francois Fouss, Alain Pirotte, Jean-Michel Renders, and Marco Saerens. 2007. Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Trans. Knowl. Data Eng. 19, 3 (2007), 355–369.

Digital Library

[22]

Mingyu Gao, Grant Ayers, and Christos Kozyrakis. 2015. Practical near-data processing for in-memory analytics frameworks. In Proceedings of the International Conference on Parallel Architecture and Compilation (PACT’15). IEEE, 113–124.

Digital Library

[23]

Gurbinder Gill, Roshan Dathathri, Loc Hoang, Andrew Lenharth, and Keshav Pingali. 2018. Abelian: A compiler for graph analytics on distributed, heterogeneous platforms. In Proceedings of the European Conference on Parallel Processing. Springer, 249–264.

[24]

Joseph E. Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin. 2012. PowerGraph: Distributed graph-parallel computation on natural graphs. In Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation (OSDI’12). USENIX Association, Berkeley, CA, 17–30. Retrieved from http://dl.acm.org/citation.cfm?id=2387880.2387883.

Digital Library

[25]

Joseph E. Gonzalez, Reynold S. Xin, Ankur Dave, Daniel Crankshaw, Michael J. Franklin, and Ion Stoica. 2014. GraphX: Graph processing in a distributed dataflow framework. In Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI’14). USENIX Association, Berkeley, CA, 599–613. Retrieved from http://dl.acm.org/citation.cfm?id=2685048.2685096.

Digital Library

[26]

Amit Goyal, Hal Daumé III, and Raul Guerra. 2012. Fast large-scale approximate graph construction for nlp. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, 1069–1080.

[27]

Graph500. 2010. Graph 500 Benchmarks. Retrieved from http://www.graph500.org.

[28]

Aditya Grover and Jure Leskovec. 2016. Node2Vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’16). ACM, New York, NY, 855–864.

Digital Library

[29]

Ziyu Guan, Jiajun Bu, Qiaozhu Mei, Chun Chen, and Can Wang. 2009. Personalized tag recommendation using graph-based ranking on multi-type interrelated objects. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 540–547.

Digital Library

[30]

Tae Jun Ham, Lisa Wu, Narayanan Sundaram, Nadathur Satish, and Margaret Martonosi. 2016. Graphicionado: A high-performance and energy-efficient accelerator for graph analytics. In Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’16). IEEE, 1–13.

[31]

Sungpack Hong, Hassan Chafi, Edic Sedlar, and Kunle Olukotun. 2012. Green-Marl: A DSL for easy and efficient graph analysis. SIGPLAN Not. 47, 4 (Mar. 2012), 349–362.

Digital Library

[32]

Sungpack Hong, Siegfried Depner, Thomas Manhardt, Jan Van Der Lugt, Merijn Verstraaten, and Hassan Chafi. 2015. PGX.D: A fast distributed graph processing engine. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC’15). ACM, New York, NY, Article 58, 12 pages.

Digital Library

[33]

Sungpack Hong, Nicole C. Rodia, and Kunle Olukotun. 2013. On fast parallel detection of strongly connected components (SCC) in small-world graphs. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC’13). ACM, New York, NY, Article 92, 11 pages.

Digital Library

[34]

Imranul Hoque and Indranil Gupta. 2013. LFGraph: Simple and fast distributed graph analytics. In Proceedings of the First ACM SIGOPS Conference on Timely Results in Operating Systems (TRIOS’13). ACM, New York, NY, Article 9, 17 pages.

Digital Library

[35]

M. C. Jeffrey, S. Subramanian, C. Yan, J. Emer, and D. Sanchez. 2016. Unlocking ordered parallelism with the swarm architecture. IEEE Micro 36, 3 (2016), 105–117.

[36]

Andrew B. Kahng, Bin Li, Li-Shiuan Peh, and Kambiz Samadi. 2012. ORION 2.0: A power-area simulator for interconnection networks. IEEE Trans. Very Large Scale Integr. Syst. 20, 1 (Jan. 2012), 191–196.

Digital Library

[37]

Gwangsun Kim, John Kim, Jung Ho Ahn, and Jaeha Kim. 2013. Memory-centric system interconnect design with hybrid memory cubes. In Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques. IEEE Press, 145–156.

Digital Library

[38]

Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. 2010. What is Twitter, a social network or a news media? In Proceedings of the 19th International Conference on World Wide Web (WWW’10). ACM, New York, NY, 591–600.

Digital Library

[39]

Nicolas Le Novere, Michael Hucka, Huaiyu Mi, Stuart Moodie, Falk Schreiber, Anatoly Sorokin, Emek Demir, Katja Wegner, Mirit I. Aladjem, Sarala M. Wimalaratne, et al. 2009. The systems biology graphical notation. Nature Biotechnology 27, 8 (2009), 735–741.

[40]

Dong Uk Lee, Kyung Whan Kim, Kwan Weon Kim, Hongjung Kim, Ju Young Kim, Young Jun Park, Jae Hwan Kim, Dae Suk Kim, Heat Bit Park, Jin Wook Shin, et al. 2014. 25.2 A 1.2 V 8Gb 8-channel 128GB/s high-bandwidth memory (HBM) stacked DRAM with effective microbump I/O test methods using 29nm process and TSV. In Proceedings of the IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC’14). IEEE, 432–433.

[41]

Jure Leskovec and Andrej Krevl. 2014. friendster. Retrieved from https://snap.stanford.edu/data/com-Friendster.html.

[42]

Jure Leskovec, Kevin J. Lang, Anirban Dasgupta, and Michael W. Mahoney. 2009. Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters. Internet Math. 6, 1 (2009), 29–123.

[43]

Sheng Li, Jung Ho Ahn, Richard D. Strong, Jay B. Brockman, Dean M. Tullsen, and Norman P. Jouppi. 2009. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’09). 469–480.

[44]

Yucheng Low, Danny Bickson, Joseph Gonzalez, Carlos Guestrin, Aapo Kyrola, and Joseph M. Hellerstein. 2012. Distributed GraphLab: A framework for machine learning and data mining in the cloud. Proc. VLDB Endow. 5, 8 (Apr. 2012), 716–727.

Digital Library

[45]

Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. 2010. Pregel: A system for large-scale graph processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’10). ACM, New York, NY, 135–146.

Digital Library

[46]

Mugilan Mariappan and Keval Vora. 2019. GraphBolt: Dependency-driven synchronous processing of streaming graphs. In Proceedings of the 14th EuroSys Conference 2019 (EuroSys’19). ACM, New York, NY, Article 25, 16 pages.

Digital Library

[47]

David W. Matula and Leland L. Beck. 1983. Smallest-last ordering and clustering and graph coloring algorithms. J. ACM 30, 3 (1983), 417–427.

Digital Library

[48]

Julian McAuley and Jure Leskovec. 2012. Learning to discover social circles in ego networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS’12). Curran Associates, 539–547. Retrieved from http://dl.acm.org/citation.cfm?id=2999134.2999195.

[49]

Frank McSherry. 2017. COST in the land of databases. Retrieved from https://github.com/frankmcsherry/blog/blob/master/posts/2017-09-23.md.

[50]

Frank McSherry, Michael Isard, and Derek G Murray. 2015. Scalability! But at what {COST}? In Proceedings of the 15th Workshop on Hot Topics in Operating Systems (HotOS’15).

[51]

Batul J. Mirza, Benjamin J. Keller, and Naren Ramakrishnan. 2003. Studying recommendation algorithms by graph analysis. J. Intell. Info. Syst. 20, 2 (2003), 131–160.

Digital Library

[52]

Anurag Mukkara, Nathan Beckmann, Maleen Abeydeera, Xiaosong Ma, and Daniel Sanchez. 2018. Exploiting locality in graph analytics through hardware-accelerated traversal scheduling. In Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’18). IEEE, 1–14.

Digital Library

[53]

Lifeng Nai, Ramyad Hadidi, Jaewoong Sim, Hyojong Kim, Pranith Kumar, and Hyesoon Kim. 2017. GraphPIM: Enabling instruction-level PIM offloading in graph computing frameworks. In Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA’17). IEEE, 457–468.

[54]

Donald Nguyen, Andrew Lenharth, and Keshav Pingali. 2013. A lightweight infrastructure for graph analytics. In Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP’13). ACM, New York, NY, 456–471.

Digital Library

[55]

The University of Texas at Austin. 2019. Texas Advanced Computing Center (TACC). Retrieved from https://www.tacc.utexas.edu/.

[56]

Muhammet Mustafa Ozdal, Serif Yesil, Taemin Kim, Andrey Ayupov, John Greth, Steven Burns, and Ozcan Ozturk. 2016. Energy efficient architecture for graph analytics accelerators. In Proceedings of the ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA’16). IEEE, 166–177.

Digital Library

[57]

Sreepathi Pai and Keshav Pingali. 2016. A compiler for throughput optimization of graph algorithms on GPUs. In Proceedings of the ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA’16). ACM, New York, NY, 1–19.

Digital Library

[58]

Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’14). ACM, New York, NY, 701–710.

Digital Library

[59]

The Lemur Project. 2013. The ClueWeb12 Dataset. Retrieved from http://lemurproject.org/clueweb12/.

[60]

Meikang Qiu, Lei Zhang, Zhong Ming, Zhi Chen, Xiao Qin, and Laurence T. Yang. 2013. Security-aware optimization for ubiquitous computing systems with SEAT graph approach. J. Comput. Syst. Sci. 79, 5 (2013), 518–529.

Digital Library

[61]

Amitabha Roy, Ivo Mihailovic, and Willy Zwaenepoel. 2013. X-stream: Edge-centric graph processing using streaming partitions. In Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP’13). Association for Computing Machinery, New York, NY, 472–488.

Digital Library

[62]

Semih Salihoglu and Jennifer Widom. 2013. GPS: A graph processing system. In Proceedings of the 25th International Conference on Scientific and Statistical Database Management (SSDBM’13). ACM, New York, NY, Article 22, 12 pages.

Digital Library

[63]

Daniel Sanchez and Christos Kozyrakis. 2013. ZSim: Fast and accurate microarchitectural simulation of thousand-core systems. In Proceedings of the 40th Annual International Symposium on Computer Architecture (ISCA’13). ACM, New York, NY, 475–486.

Digital Library

[64]

Satu Elisa Schaeffer. 2007. Graph clustering. Comput. Sci. Rev. 1, 1 (2007), 27–64.

Digital Library

[65]

Jiwon Seo, Jongsoo Park, Jaeho Shin, and Monica S. Lam. 2013. Distributed socialite: A datalog-based language for large-scale graph analysis. Proc. VLDB Endow. 6, 14 (Sep. 2013), 1906–1917.

Digital Library

[66]

Manjunath Shevgoor, Jung-Sik Kim, Niladrish Chatterjee, Rajeev Balasubramonian, Al Davis, and Aniruddha N. Udipi. 2013. Quantifying the relationship between the power delivery network and architectural policies in a 3D-stacked memory device. In Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture. ACM, 198–209.

[67]

Julian Shun. 2019. K-Core. Retrieved from http://jshun.github.io/ligra/docs/tutorial_kcore.html.

[68]

Julian Shun and Guy E. Blelloch. 2013. Ligra: A lightweight graph processing framework for shared memory. In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’13). ACM, New York, NY, 135–146.

[69]

Julian Shun, Farbod Roosta-Khorasani, Kimon Fountoulakis, and Michael W. Mahoney. 2016. Parallel local graph clustering. Proc. VLDB Endow. 9, 12 (Aug. 2016), 1041–1052.

Digital Library

[70]

AM Stankovic and MS Calovic. 1989. Graph oriented algorithm for the steady-state security enhancement in distribution networks. IEEE Trans. Power Delivery 4, 1 (1989), 539–544.

[71]

Lei Tang and Huan Liu. 2010. Graph mining applications to social network analysis. In Managing and Mining Graph Data. Springer, 487–513.

[72]

Po-An Tsai, Nathan Beckmann, and Daniel Sanchez. 2017. Jenga: Sotware-defined cache hierarchies. In Proceedings of the 44th Annual International Symposium on Computer Architecture. ACM, 652–665.

Digital Library

[73]

Keval Vora. 2019. LUMOS: Dependency-driven disk-based graph processing. In Proceedings of the USENIX Conference on Usenix Annual Technical Conference (USENIX ATC’19). USENIX Association, USA, 429–442.

[74]

Keval Vora, Rajiv Gupta, and Guoqing Xu. 2017. KickStarter: Fast and accurate computations on streaming graphs via trimmed approximations. In Proceedings of the 22nd International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS’17). Association for Computing Machinery, New York, NY, 237–251.

Digital Library

[75]

Keval Vora, Sai Charan Koduru, and Rajiv Gupta. 2014. ASPIRE: Exploiting asynchronous parallelism in iterative algorithms using a relaxed consistency based DSM. In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages & Applications (OOPSLA’14). ACM, New York, NY, 861–878.

Digital Library

[76]

Tianyi Wang, Yang Chen, Zengbin Zhang, Tianyin Xu, Long Jin, Pan Hui, Beixing Deng, and Xing Li. 2011. Understanding graph sampling algorithms for social network analysis. In Proceedings of the 31st International Conference on Distributed Computing Systems Workshops. IEEE, 123–128.

Digital Library

[77]

English Wikipedia. 2013. enwiki-2013. Retrieved from http://law.di.unimi.it/webdata/enwiki-2013/.

[78]

Ming Wu, Fan Yang, Jilong Xue, Wencong Xiao, Youshan Miao, Lan Wei, Haoxiang Lin, Yafei Dai, and Lidong Zhou. 2015. GraM: Scaling graph computation to the trillions. In Proceedings of the 6th ACM Symposium on Cloud Computing (SoCC’15). ACM, New York, NY, 408–421.

Digital Library

[79]

Wencong Xiao, Jilong Xue, Youshan Miao, Zhen Li, Cheng Chen, Ming Wu, Wei Li, and Lidong Zhou. 2017. Tux2: Distributed graph computation for machine learning. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI’17). USENIX Association, Berkeley, CA, 669–682.

[80]

Yuan Yu, Pradeep Kumar Gunda, and Michael Isard. 2009. Distributed aggregation for data-parallel computing: Interfaces and implementations. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles (SOSP’09). Association for Computing Machinery, New York, NY, 247–260.

Digital Library

[81]

Torsten Zesch and Iryna Gurevych. 2007. Analysis of the Wikipedia category graph for NLP applications. In Proceedings of the TextGraphs-2 Workshop (NAACL-HLT’07). 1–8.

[82]

Mingxing Zhang, Yongwei Wu, Kang Chen, Xuehai Qian, Xue Li, and Weimin Zheng. 2016. Exploring the hidden dimension in graph processing. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI’16). USENIX Association, Berkeley, CA, 285–300. Retrieved from http://dl.acm.org/citation.cfm?id=3026877.3026900.

Digital Library

[83]

Mingxing Zhang, Youwei Zhuo, Chao Wang, Mingyu Gao, Yongwei Wu, Kang Chen, Christos Kozyrakis, and Xuehai Qian. 2018. GraphP: Reducing communication for PIM-based graph processing with efficient data partition. In Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA’18). IEEE, 544–557.

[84]

Yunming Zhang, Mengjiao Yang, Riyadh Baghdadi, Shoaib Kamil, Julian Shun, and Saman Amarasinghe. 2018. GraphIt: A high-performance graph DSL. Proc. ACM Program. Lang. 2, OOPSLA, Article 121 (Oct. 2018), 30 pages.

Digital Library

[85]

Xiaowei Zhu, Wenguang Chen, Weimin Zheng, and Xiaosong Ma. 2016. Gemini: A computation-centric distributed graph processing system. In Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI’16). USENIX Association, Berkeley, CA, 301–316. http://dl.acm.org/citation.cfm?id=3026877.3026901

Digital Library

[86]

Youwei Zhuo, Chao Wang, Mingxing Zhang, Rui Wang, Dimin Niu, Yanzhi Wang, and Xuehai Qian. 2019. GraphQ: Scalable PIM-based graph processing. In Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’52). ACM, New York, NY, 712–725.

Digital Library

Cited By

Theodorakopoulos LKarras ATheodoropoulou AKampiotis G(2024)Benchmarking Big Data Systems: Performance and Decision-Making Implications in Emerging TechnologiesTechnologies10.3390/technologies1211021712:11(217)Online publication date: 3-Nov-2024
https://doi.org/10.3390/technologies12110217
Sun XWang WHuang T(2024)How to Fit the SCC Algorithm Efficiently into Distributed Graph Iterative Computation2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC)10.1109/COMPSAC61105.2024.00043(254-263)Online publication date: 2-Jul-2024
https://doi.org/10.1109/COMPSAC61105.2024.00043
Fei XHan JHuang JZheng WZhang Y(2022)Accelerating Neural Network Training with Processing-in-Memory GPU2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid)10.1109/CCGrid54584.2022.00051(414-421)Online publication date: May-2022
https://doi.org/10.1109/CCGrid54584.2022.00051

Index Terms

Distributed Graph Processing System and Processing-in-memory Architecture with Precise Loop-carried Dependency Guarantee
1. Computing methodologies
  1. Distributed computing methodologies
    1. Distributed programming languages

Recommendations

SympleGraph: distributed graph processing with precise loop-carried dependency guarantee
PLDI 2020: Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation

Graph analytics is an important way to understand relationships in real-world applications. At the age of big data, graphs have grown to billions of edges. This motivates distributed graph processing. Graph processing frameworks ask programmers to ...
L(2,1)-labeling of dually chordal graphs and strongly orderable graphs

An L(2,1)-labeling of a graph G=(V,E) is a function f:V(G)->{0,1,2,...} such that |f(u)-f(v)|>=2 whenever uv@__ __E(G) and |f(u)-f(v)|>=1 whenever u and v are at distance two apart. The span of an L(2,1)-labeling f of G, denoted as SP"2(f,G), is the ...
Finding a chain graph in a bipartite permutation graph

We present a polynomial-time algorithm for solving Subgraph Isomorphism where the base graphs are bipartite permutation graphs and the pattern graphs are chain graphs. Subgraph Isomorphism is studied on graph classes.A polynomial-time algorithm is given ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Computer Systems

ACM Transactions on Computer Systems Volume 37, Issue 1-4

November 2019

177 pages

ISSN:0734-2071

EISSN:1557-7333

DOI:10.1145/3446674

Editor:
Michael Swift
University of Wisconsin, USA

Issue’s Table of Contents

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 July 2021

Accepted: 01 March 2021

Revised: 01 December 2020

Received: 01 July 2020

Published in TOCS Volume 37, Issue 1-4

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Research
Refereed

Funding Sources

National Science Foundation

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

3
Total Citations
View Citations
971
Total Downloads

Downloads (Last 12 months)418
Downloads (Last 6 weeks)40

Reflects downloads up to 20 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

Theodorakopoulos LKarras ATheodoropoulou AKampiotis G(2024)Benchmarking Big Data Systems: Performance and Decision-Making Implications in Emerging TechnologiesTechnologies10.3390/technologies1211021712:11(217)Online publication date: 3-Nov-2024
https://doi.org/10.3390/technologies12110217
Sun XWang WHuang T(2024)How to Fit the SCC Algorithm Efficiently into Distributed Graph Iterative Computation2024 IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC)10.1109/COMPSAC61105.2024.00043(254-263)Online publication date: 2-Jul-2024
https://doi.org/10.1109/COMPSAC61105.2024.00043
Fei XHan JHuang JZheng WZhang Y(2022)Accelerating Neural Network Training with Processing-in-Memory GPU2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid)10.1109/CCGrid54584.2022.00051(414-421)Online publication date: May-2022
https://doi.org/10.1109/CCGrid54584.2022.00051

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

HTML Format

View this article in HTML Format.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Media

Figures

Other

Tables

View Issue’s Table of Contents