research-article

GraphABCD: scaling out graph analytics with asynchronous block coordinate descent

Authors:

Leibo LiuAuthors Info & Claims

ISCA '20: Proceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture

Pages 419 - 432

https://doi.org/10.1109/ISCA45697.2020.00043

Published: 23 September 2020 Publication History

Abstract

It is of vital importance to efficiently process large graphs for many data-intensive applications. As a result, a large collection of graph analytic frameworks has been proposed to improve the per-iteration performance on a single kind of computation resource. However, heavy coordination and synchronization overhead make it hard to scale out graph analytic frameworks from single platform to heterogeneous platforms. Furthermore, increasing the convergence rate, i.e. reducing the number of iterations, which is equally vital for improving the overall performance of iterative graph algorithms, receives much less attention.

In this paper, we introduce the Block Coordinate Descent (BCD) view of graph algorithms and propose an asynchronous heterogeneous graph analytic framework, GraphABCD, using the BCD view. The BCD view offers key insights and trade-offs on achieving high convergence rate of iterative graph algorithms. GraphABCD features fast convergence under the algorithm design options suggested by BCD. GraphABCD offers algorithm and architectural supports for asynchronous execution, without undermining its fast convergence properties. With minimum synchronization overhead, GraphABCD is able to scale out to heterogeneous and distributed accelerators efficiently. To demonstrate GraphABCD, we prototype its whole system on Intel HARPv2 CPU-FPGA heterogeneous platform. Evaluations on HARPv2 show that GraphABCD achieves geo-mean speedups of 4.8x and 2.0x over GraphMat, a state-of-the-art framework in terms of convergence rate and execution time, respectively.

References

[1]

J. Ahn, S. Hong, S. Yoo, O. Mutlu, and K. Choi, "A scalable processing-in-memory accelerator for parallel graph processing," in Proceedings of the 42nd Annual International Symposium on Computer Architecture, Portland, OR, USA, June 13-17, 2015, 2015, pp. 105--117.

[2]

Z. Ai, M. Zhang, Y. Wu, X. Qian, K. Chen, and W. Zheng, "Clip: A disk I/O focused parallel out-of-core graph processing system," IEEE Trans. Parallel Distrib. Syst., vol. 30, no. 1, pp. 45--62, 2019.

Digital Library

[3]

M. S. B. Altaf and D. A. Wood, "Logca: A high-level performance model for hardware accelerators," in Proceedings of the 44th Annual International Symposium on Computer Architecture, ISCA 2017, Toronto, ON, Canada, June 24-28, 2017, 2017, pp. 375--388.

[4]

L. Backstrom, D. P. Huttenlocher, J. M. Kleinberg, and X. Lan, "Group formation in large social networks: membership, growth, and evolution," in Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA, USA, August 20-23, 2006, 2006, pp. 44--54.

[5]

J. Bennett, S. Lanning et al., "The netflix prize," in Proceedings of KDD cup and workshop, vol. 2007. New York, NY, USA., 2007, p. 35.

[6]

A. Ching, H. Choi, J. Homan, C. Kunz, O. O'Malley, J. Mannix, D. Ryaboy, C. Martella, R. Shaposhnik, S. Schelter, E. Koontz, A. Presta, E. Reisman, M. Kabiljo, N. Joffe, S. Edunov, P. Kumar, I. Kabiljo, and H. Eslami, "Apache giraph," 2013. [Online]. Available: http://giraph.apache.org/

[7]

G. Dai, T. Huang, Y. Chi, N. Xu, Y. Wang, and H. Yang, "Foregraph: Exploring large-scale graph processing on multi-fpga architecture," in Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA 2017, Monterey, CA, USA, February 22-24, 2017, 2017, pp. 217--226.

[8]

R. Dathathri, G. Gill, L. Hoang, H. Dang, A. Brooks, N. Dryden, M. Snir, and K. Pingali, "Gluon: a communication-optimizing substrate for distributed heterogeneous graph analytics," in Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2018, Philadelphia, PA, USA, June 18-22, 2018, 2018, pp. 752--768.

Digital Library

[9]

N. Engelhardt and H. K. So, "Gravf: A vertex-centric distributed graph processing framework on fpgas," in 26th International Conference on Field Programmable Logic and Applications, FPL 2016, Lausanne, Switzerland, August 29 - September 2, 2016, 2016, pp. 1--4.

[10]

J. E. Gonzalez, Y. Low, H. Gu, D. Bickson, and C. Guestrin, "Powergraph: Distributed graph-parallel computation on natural graphs," in 10th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2012, Hollywood, CA, USA, October 8-10, 2012, 2012, pp. 17--30.

[11]

S. Grossman, H. Litz, and C. Kozyrakis, "Making pull-based graph processing performant," in Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2018, Vienna, Austria, February 24-28, 2018, A. Krall and T. R. Gross, Eds. ACM, 2018, pp. 246--260.

[12]

T. J. Ham, L. Wu, N. Sundaram, N. Satish, and M. Martonosi, "Graphicionado: A high-performance and energy-efficient accelerator for graph analytics," in 49th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2016, Taipei, Taiwan, October 15-19, 2016, 2016, pp. 56:1--56:13.

[13]

F. M. Harper and J. A. Konstan, "The movielens datasets: History and context," TiiS, vol. 5, no. 4, pp. 19:1--19:19, 2016.

Digital Library

[14]

S. Hong, S. K. Kim, T. Oguntebi, and K. Olukotun, "Accelerating CUDA graph algorithms at maximum warp," in Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2011, San Antonio, TX, USA, February 12-16, 2011, 2011, pp. 267--276.

[15]

F. Khorasani, K. Vora, R. Gupta, and L. N. Bhuyan, "Cusha: vertex-centric graph processing on gpus," in The 23rd International Symposium on High-Performance Parallel and Distributed Computing, HPDC'14, Vancouver, BC, Canada - June 23 - 27, 2014, 2014, pp. 239--252.

Digital Library

[16]

D. Kotkov, J. A. Konstan, Q. Zhao, and J. Veijalainen, "Investigating serendipity in recommender systems based on real user feedback," in Proceedings of the 33rd Annual ACM Symposium on Applied Computing, SAC 2018, Pau, France, April 09-13, 2018, 2018, pp. 1341--1350.

Digital Library

[17]

H. Kwak, C. Lee, H. Park, and S. B. Moon, "What is twitter, a social network or a news media?" in Proceedings of the 19th International Conference on World Wide Web, WWW 2010, Raleigh, North Carolina, USA, April 26-30, 2010, M. Rappa, P. Jones, J. Freire, and S. Chakrabarti, Eds. ACM, 2010, pp. 591--600.

[18]

A. Kyrola, G. E. Blelloch, and C. Guestrin, "Graphchi: Large-scale graph computation on just a PC," in 10th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2012, Hollywood, CA, USA, October 8-10, 2012, 2012, pp. 31--46.

[19]

K. Lakhotia, R. Kannan, S. Pati, and V. K. Prasanna, "GPOP: a cache and memory-efficient framework for graph processing over partitions," in Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2019, Washington, DC, USA, February 16-20, 2019, 2019, pp. 393--394.

[20]

A. Lenharth, D. Nguyen, and K. Pingali, "Parallel graph analytics," Commun. ACM, vol. 59, no. 5, pp. 78--87, Apr. 2016.

Digital Library

[21]

J. Leskovec, D. P. Huttenlocher, and J. M. Kleinberg, "Predicting positive and negative links in online social networks," in Proceedings of the 19th International Conference on World Wide Web, WWW 2010, Raleigh, North Carolina, USA, April 26-30, 2010, 2010, pp. 641--650.

[22]

J. Leskovec, D. P. Huttenlocher, and J. M. Kleinberg, "Signed networks in social media," in Proceedings of the 28th International Conference on Human Factors in Computing Systems, CHI 2010, Atlanta, Georgia, USA, April 10-15, 2010. ACM, 2010, pp. 1361--1370.

[23]

J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney, "Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters," arXiv e-prints, p. arXiv:0810.1355, Oct. 2008.

[24]

Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein, "Distributed graphlab: A framework for machine learning in the cloud," PVLDB, vol. 5, no. 8, pp. 716--727, 2012.

Digital Library

[25]

L. Luo and Y. Liu, "Improving parallel efficiency for asynchronous graph analytics using gauss-seidel-based matrix computation," Concurrency and Computation: Practice and Experience, vol. 31, no. 17, 2019.

[26]

G. Malewicz, M. H. Austern, A. J. C. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski, "Pregel: a system for large-scale graph processing," in Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2010, Indianapolis, Indiana, USA, June 6-10, 2010, 2010, pp. 135--146.

[27]

D. Nguyen, A. Lenharth, and K. Pingali, "A lightweight infrastructure for graph analytics," in Proceedings of ACM Symposium on Operating Systems Principles, ser. SOSP '13, 2013, pp. 456--471.

Digital Library

[28]

E. Nurvitadhi, G. Weisz, Y. Wang, S. Hurkat, M. Nguyen, J. C. Hoe, J. F. Martínez, and C. Guestrin, "Graphgen: An FPGA framework for vertex-centric graph computation," in 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, FCCM 2014, Boston, MA, USA, May 11-13, 2014, 2014, pp. 25--28.

[29]

J. Nutini, I. Laradji, and M. Schmidt, "Let's Make Block Coordinate Descent Go Fast: Faster Greedy Rules, Message-Passing, Active-Set Complexity, and Superlinear Convergence," ArXiv e-prints, Dec. 2017.

[30]

J. Nutini, M. W. Schmidt, I. H. Laradji, M. P. Friedlander, and H. A. Koepke, "Coordinate descent converges faster with the gauss-southwell rule than random selection," in Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015, 2015, pp. 1632--1641.

[31]

T. Oguntebi and K. Olukotun, "Graphops: A dataflow library for graph analytics acceleration," in Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Monterey, CA, USA, February 21--23, 2016, 2016, pp. 111--117.

[32]

N. Oliver, R. R. Sharma, S. Chang, B. Chitlur, E. Garcia, J. Grecco, A. Grier, N. Ijih, Y. Liu, P. Marolia, H. Mitchel, S. Subhaschandra, A. Sheiman, T. Whisonant, and P. Gupta, "A reconfigurable computing system based on a cache-coherent fabric," in 2011 International Conference on Reconfigurable Computing and FPGAs, ReConFig 2011, Cancun, Mexico, November 30 - December 2, 2011, 2011, pp. 80--85.

[33]

M. M. Ozdal, S. Yesil, T. Kim, A. Ayupov, J. Greth, S. M. Burns, and Ö. Özturk, "Energy efficient architecture for graph analytics accelerators," in 43rd ACM/IEEE Annual International Symposium on Computer Architecture, ISCA 2016, Seoul, South Korea, June 18-22, 2016, 2016, pp. 166--177.

[34]

B. Recht, C. Ré, S. J. Wright, and F. Niu, "Hogwild: A lock-free approach to parallelizing stochastic gradient descent," in Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, Granada, Spain, J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. C. N. Pereira, and K. Q. Weinberger, Eds., 2011, pp. 693--701.

[35]

A. H. N. Sabet, J. Qiu, and Z. Zhao, "Tigr: Transforming irregular graphs for gpu-friendly graph processing," in Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2018, Williamsburg, VA, USA, March 24-28, 2018, 2018, pp. 622--636.

[36]

J. Shun and G. E. Blelloch, "Ligra: a lightweight graph processing framework for shared memory," in ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '13, Shenzhen, China, February 23-27, 2013, 2013, pp. 135--146.

Digital Library

[37]

J. Shun, L. Dhulipala, and G. E. Blelloch, "Smaller and faster: Parallel processing of compressed graphs with ligra+," in 2015 Data Compression Conference, DCC 2015, Snowbird, UT, USA, April 7-9, 2015, A. Bilgin, M. W. Marcellin, J. Serra-Sagristà, and J. A. Storer, Eds. IEEE, 2015, pp. 403--412.

[38]

P. Stutz, A. Bernstein, and W. W. Cohen, "Signal/collect: Graph algorithms for the (semantic) web," in The Semantic Web - ISWC 2010 - 9th International Semantic Web Conference, ISWC 2010, Shanghai, China, November 7-11, 2010, Revised Selected Papers, Part I, 2010, pp. 764--780.

[39]

T. Sun, R. Hannah, and W. Yin, "Asynchronous Coordinate Descent Under More Realistic Assumption," in Proceedings of the 31st International Conference on Neural Information Processing Systems, ser. NIPS'17. USA: Curran Associates Inc., 2017, pp. 6183--6191, event-place: Long Beach, California, USA.

[40]

N. Sundaram, N. Satish, M. M. A. Patwary, S. Dulloor, M. J. Anderson, S. G. Vadlamudi, D. Das, and P. Dubey, "Graphmat: High performance graph analytics made productive," PVLDB, vol. 8, no. 11, pp. 1214--1225, 2015.

Digital Library

[41]

L. Takac and M. Zabovsky, "Data analysis in public social networks," International Scientific Conference and International Workshop Present Day Trends of Innovations, pp. 1--6, 01 2012.

[42]

S. Tasci and M. Demirbas, "Giraphx: Parallel yet serializable large-scale graph processing," in Euro-Par 2013 Parallel Processing - 19th International Conference, Aachen, Germany, August 26-30, 2013. Proceedings, 2013, pp. 458--469.

[43]

P. Tseng, "Convergence of a block coordinate descent method for nondifferentiable minimization," Journal of optimization theory and applications, vol. 109, no. 3, pp. 475--494, 2001.

Digital Library

[44]

D. Voitsechov and Y. Etsion, "Single-graph multiple flows: Energy efficient design alternative for gpgpus," in ACM/IEEE 41st International Symposium on Computer Architecture, ISCA 2014, Minneapolis, MN, USA, June 14-18, 2014, 2014, pp. 205--216.

[45]

Y. Wang, S. Baxter, and J. D. Owens, "Mini-gunrock: A lightweight graph analytics framework on the GPU," in 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPS Workshops 2017, Orlando / Buena Vista, FL, USA, May 29 - June 2, 2017, 2017, pp. 616--626.

[46]

J. J. Whang, A. Lenharth, I. S. Dhillon, and K. Pingali, "Scalable Data-Driven PageRank: Algorithms, System Issues, and Lessons Learned," in Euro-Par 2015: Parallel Processing, J. L. Träff, S. Hunold, and F. Versaci, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2015, vol. 9233, pp. 438--450.

[47]

D. Yan, J. Cheng, K. Xing, Y. Lu, W. Ng, and Y. Bu, "Pregel algorithms for graph connectivity problems with performance guarantees," PVLDB, vol. 7, no. 14, pp. 1821--1832, 2014.

Digital Library

[48]

M. Yan, L. Deng, X. Hu, L. Liang, Y. Feng, X. Ye, Z. Zhang, D. Fan, and Y. Xie, "Hygcn: A GCN accelerator with hybrid architecture," CoRR, vol. abs/2001.02514, 2020.

[49]

M. Yan, X. Hu, S. Li, A. Basak, H. Li, X. Ma, I. Akgun, Y. Feng, P. Gu, L. Deng, X. Ye, Z. Zhang, D. Fan, and Y. Xie, "Alleviating irregularity in graph analytics acceleration: a hardware/software co-design approach," in Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2019, Columbus, OH, USA, October 12-16, 2019, 2019, pp. 615--628.

[50]

E. Yoneki and A. Roy, "Scale-up graph processing: a storage-centric view," in First International Workshop on Graph Data Management Experiences and Systems, GRADES 2013, co-loated with SIGMOD/PODS 2013, New York, NY, USA, June 24, 2013, 2013, p. 8.

[51]

M. Zhang, Y. Wu, K. Chen, X. Qian, X. Li, and W. Zheng, "Exploring the hidden dimension in graph processing," in 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2016, Savannah, GA, USA, November 2-4, 2016., 2016, pp. 285--300.

[52]

M. Zhang, Y. Wu, Y. Zhuo, X. Qian, C. Huan, and K. Chen, "Wonderland: A novel abstraction-based out-of-core graph processing system," in Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2018, Williamsburg, VA, USA, March 24-28, 2018, 2018, pp. 608--621.

[53]

M. Zhang, Y. Zhuo, C. Wang, M. Gao, Y. Wu, K. Chen, C. Kozyrakis, and X. Qian, "Graphp: Reducing communication for pim-based graph processing with efficient data partition," in IEEE International Symposium on High Performance Computer Architecture, HPCA 2018, Vienna, Austria, February 24-28, 2018, 2018, pp. 544--557.

[54]

Y. Zhang, M. Yang, R. Baghdadi, S. Kamil, J. Shun, and S. P. Amarasinghe, "Graphit: a high-performance graph DSL," PACMPL, vol. 2, no. OOPSLA, pp. 121:1--121:30, 2018.

[55]

S. Zhou, C. Chelmis, and V. K. Prasanna, "High-throughput and energy-efficient graph processing on FPGA," in 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, FCCM 2016, Washington, DC, USA, May 1-3, 2016, 2016, pp. 103--110.

[56]

S. Zhou, R. Kannan, Y. Min, and V. K. Prasanna, "FASTCF: fpga-based accelerator for stochastic-gradient-descent-based collaborative filtering," in Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA 2018, Monterey, CA, USA, February 25-27, 2018, 2018, pp. 259--268.

[57]

S. Zhou, R. Kannan, H. Zeng, and V. K. Prasanna, "An FPGA framework for edge-centric graph processing," in Proceedings of the 15th ACM International Conference on Computing Frontiers, CF 2018, Ischia, Italy, May 08-10, 2018, 2018, pp. 69--77.

[58]

S. Zhou and V. K. Prasanna, "Accelerating graph analytics on CPU-FPGA heterogeneous platform," in 29th International Symposium on Computer Architecture and High Performance Computing, SBAC-PAD 2017, Campinas, Brazil, October 17-20, 2017, 2017, pp. 137--144.

[59]

X. Zhu, W. Chen, W. Zheng, and X. Ma, "Gemini: A computation-centric distributed graph processing system," in 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2016, Savannah, GA, USA, November 2-4, 2016., 2016, pp. 301--316.

[60]

Y. Zhuo, C. Wang, M. Zhang, R. Wang, D. Niu, Y. Wang, and X. Qian, "Graphq: Scalable pim-based graph processing," in Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2019, Columbus, OH, USA, October 12-16, 2019, 2019, pp. 712--725.

Cited By

Zhang BDu HChen SKang YBartolini ARietveld KSchuman CMoreira J(2023)GGPAProceedings of the 20th ACM International Conference on Computing Frontiers10.1145/3587135.3592198(33-41)Online publication date: 9-May-2023
https://dl.acm.org/doi/10.1145/3587135.3592198
Tan LYan MYe XFan DSavidis ISasan AThapliyal HDeMara R(2022)HetGraph: A High Performance CPU-CGRA Architecture for Matrix-based Graph AnalyticsProceedings of the Great Lakes Symposium on VLSI 202210.1145/3526241.3530382(387-391)Online publication date: 6-Jun-2022
https://dl.acm.org/doi/10.1145/3526241.3530382
Li SNiu DWang YHan WZhang ZGuan TGuan YLiu HHuang LDu ZXue FFang YZheng HXie YSalapura VZahran MChong FTang L(2022)Hyperscale FPGA-as-a-service architecture for large-scale distributed graph neural networkProceedings of the 49th Annual International Symposium on Computer Architecture10.1145/3470496.3527439(946-961)Online publication date: 18-Jun-2022
https://dl.acm.org/doi/10.1145/3470496.3527439
Show More Cited By

Recommendations

Lightweight asynchronous scheduling in heterogeneous reconfigurable systems
Abstract
The trend for heterogeneous embedded systems is the integration of accelerators and general-purpose CPU cores on the same die. In these integrated architectures, like the Zynq UltraScale+ board (CPU+FPGA) that we target in this work, hardware ...
ACTS: A Near-Memory FPGA Graph Processing Framework
FPGA '23: Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays

Despite the high off-chip bandwidth and on-chip parallelism offered by today's near-memory accelerators, software-based (CPU and GPU) graph processing frameworks still suffer performance degradation from under-utilization of available memory bandwidth ...
A configurable mapreduce accelerator for multi-core FPGAs (abstract only)
FPGA '14: Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays

MapReduce is a widely used programming framework for the implementation of cloud computing application in data centers. This work presents a novel configurable hardware accelerator that is used to speed up the processing of multi-core and cloud ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

ISCA '20: Proceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture

May 2020

1152 pages

ISBN:9781728146614

General Chairs:
José Martínez
Cornell University
,
José Duato
Universitat Politècnica de València
,
Program Chair:
Lieven Eeckhout
Ghent University

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture

In-Cooperation

IEEE

Publisher

IEEE Press

Publication History

Published: 23 September 2020

Check for updates

Author Tags

Qualifiers

Research-article

Conference

ISCA '20

Sponsor:

SIGARCH

ISCA '20: The 47th Annual International Symposium on Computer Architecture

May 30 - June 3, 2020

Virtual Event

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%

Upcoming Conference

ISCA '25

Sponsor:
sigarch

The 52nd Annual International Symposium on Computer Architecture

June 21 - 25, 2025

Tokyo , Japan

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

8
Total Citations
View Citations
110
Total Downloads

Downloads (Last 12 months)10
Downloads (Last 6 weeks)0

Reflects downloads up to 16 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

Zhang BDu HChen SKang YBartolini ARietveld KSchuman CMoreira J(2023)GGPAProceedings of the 20th ACM International Conference on Computing Frontiers10.1145/3587135.3592198(33-41)Online publication date: 9-May-2023
https://dl.acm.org/doi/10.1145/3587135.3592198
Tan LYan MYe XFan DSavidis ISasan AThapliyal HDeMara R(2022)HetGraph: A High Performance CPU-CGRA Architecture for Matrix-based Graph AnalyticsProceedings of the Great Lakes Symposium on VLSI 202210.1145/3526241.3530382(387-391)Online publication date: 6-Jun-2022
https://dl.acm.org/doi/10.1145/3526241.3530382
Li SNiu DWang YHan WZhang ZGuan TGuan YLiu HHuang LDu ZXue FFang YZheng HXie YSalapura VZahran MChong FTang L(2022)Hyperscale FPGA-as-a-service architecture for large-scale distributed graph neural networkProceedings of the 49th Annual International Symposium on Computer Architecture10.1145/3470496.3527439(946-961)Online publication date: 18-Jun-2022
https://dl.acm.org/doi/10.1145/3470496.3527439
Zhao JYang YZhang YLiao XGu LHe LHe BJin HLiu HJiang XYu HSalapura VZahran MChong FTang L(2022)TDGraphProceedings of the 49th Annual International Symposium on Computer Architecture10.1145/3470496.3527409(116-129)Online publication date: 18-Jun-2022
https://dl.acm.org/doi/10.1145/3470496.3527409
Zhao JZhang YLiao XHe LHe BJin HLiu Hde Supinski BHall MGamblin T(2021)LCCGProceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis10.1145/3458817.3480854(1-14)Online publication date: 14-Nov-2021
https://dl.acm.org/doi/10.1145/3458817.3480854
Yang YEmer JSanchez DMartínez JDuato JJohn L(2021)SpZipProceedings of the 48th Annual International Symposium on Computer Architecture10.1109/ISCA52012.2021.00087(1069-1082)Online publication date: 14-Jun-2021
https://dl.acm.org/doi/10.1109/ISCA52012.2021.00087
Dadu VLiu SNowatzki TMartínez JDuato JJohn L(2021)PolyGraphProceedings of the 48th Annual International Symposium on Computer Architecture10.1109/ISCA52012.2021.00053(595-608)Online publication date: 14-Jun-2021
https://dl.acm.org/doi/10.1109/ISCA52012.2021.00053
Chen XHuang TXu SBourgeat TChung CArvind Martínez JDuato JJohn L(2021)FlexMinerProceedings of the 48th Annual International Symposium on Computer Architecture10.1109/ISCA52012.2021.00052(581-594)Online publication date: 14-Jun-2021
https://dl.acm.org/doi/10.1109/ISCA52012.2021.00052

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Media

Figures

Other

Tables

View Table of Contents