research-article

Effective exploitation of SIMD resources in cross-ISA virtualization

Authors:

Decheng Zuo

VEE 2021: Proceedings of the 17th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments

Pages 84 - 97

https://doi.org/10.1145/3453933.3454016

Published: 07 April 2021

Abstract

System virtualization is a fundamental technology that enables many important applications. However, existing virtualization techniques make only limited use of host SIMD hardware resources, especially when a guest application has no inherent fine-grained data-level parallelism. To bridge this utilization gap and unleash the full potential of host SIMD resources, this paper proposes an effective and unconventional SIMD exploitation technique. The technique takes advantage of ample host SIMD registers and powerful host SIMD instructions to generate more efficient host binary code for guest applications, even those without any fine-grained data-level parallelism. It also mitigates the shortage of general-purpose registers on the host platform and improves the efficiency of accessing guest registers. We have implemented the technique in QEMU, a widely used virtualization platform. Experimental results on a comprehensive list of benchmarks from PARSEC, SPEC CPU 2017, and the Google Octane JavaScript benchmark suite show that an average performance speedup of 2.2× can be achieved for AArch64 binaries on an x86-64 host machine. We believe the proposed technique will provide a new perspective for our community to rethink the exploitation of SIMD hardware resources.
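
To make the register-pressure argument concrete, the following toy model in Python may help. It is an illustrative sketch only, not the paper's implementation, and the abstract does not describe the actual mechanism at this level of detail. The sketch assumes 31 AArch64 general-purpose registers (X0 to X30) and 16 128-bit x86-64 vector registers (XMM0 to XMM15, without AVX-512), and it packs two 64-bit guest values into each host vector. The names `PackedGuestRegs` and `slot` are hypothetical and introduced here purely for illustration.

```python
# Toy model: guest general-purpose registers packed into wide host vectors.
# All constants below are assumptions of this sketch, not taken from the paper.
GUEST_GPRS = 31         # AArch64 X0..X30
HOST_VECTOR_REGS = 16   # x86-64 XMM0..XMM15 (no AVX-512)
VECTOR_BITS = 128
LANE_BITS = 64
LANES = VECTOR_BITS // LANE_BITS
LANE_MASK = (1 << LANE_BITS) - 1


def slot(guest_reg):
    """Map a guest register index to (host vector index, lane index)."""
    if not 0 <= guest_reg < GUEST_GPRS:
        raise ValueError(f"no such guest register: {guest_reg}")
    return divmod(guest_reg, LANES)


class PackedGuestRegs:
    """Hypothetical register file: each host vector is a 128-bit Python int."""

    def __init__(self):
        self.vectors = [0] * HOST_VECTOR_REGS

    def write(self, guest_reg, value):
        """Store a 64-bit value in the lane assigned to guest_reg."""
        vec, lane = slot(guest_reg)
        shift = lane * LANE_BITS
        cleared = self.vectors[vec] & ~(LANE_MASK << shift)
        self.vectors[vec] = cleared | ((value & LANE_MASK) << shift)

    def read(self, guest_reg):
        """Load the 64-bit value held in the lane assigned to guest_reg."""
        vec, lane = slot(guest_reg)
        return (self.vectors[vec] >> (lane * LANE_BITS)) & LANE_MASK


regs = PackedGuestRegs()
regs.write(4, 0x1111)   # X4 and X5 share one host vector
regs.write(5, 0x2222)
print(slot(4), slot(5), hex(regs.read(4)), hex(regs.read(5)))
```

Under these assumptions, 16 vectors with 2 lanes each provide 32 slots, enough for all 31 guest registers, whereas the 16 host general-purpose registers could not hold them one-to-one.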

Published In

VEE 2021: Proceedings of the 17th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments

April 2021

200 pages

ISBN: 9781450383943

DOI: 10.1145/3453933

General Chair: Ben L. Titzer (Google, Germany)

Program Chairs: Harry Xu (University of California at Los Angeles, USA), Irene Zhang (Microsoft Research, USA)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

VEE '21: 17th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments

April 16, 2021

Virtual, USA

Acceptance Rates

Overall Acceptance Rate 80 of 235 submissions, 34%

Cited By

Xie W, Luo Q, Tian X, Huang J, Qi F. (2024). Performance Improvements via Peephole Optimization in Dynamic Binary Translation. Electronics 13:9 (1608). Online publication date: 23-Apr-2024. https://doi.org/10.3390/electronics13091608

Xie W, Tang D, Qi F, Chai Z, Luo Q, Lin Y. (2023). Towards Efficient Dynamic Binary Translation Optimizations Based on RISC Architectural Features. Journal of Circuits, Systems and Computers 33:06. Online publication date: 26-Oct-2023. https://doi.org/10.1142/S0218126624501044

Zeng H, Xie M, Dong Y, Wu Z, Lin H. (2023). Efficient condition code emulation for dynamic binary translation systems. Third International Symposium on Computer Engineering and Intelligent Communications (ISCEIC 2022) (25). Online publication date: 2-Feb-2023. https://doi.org/10.1117/12.2660798

Wu J, Dong J, Fang R, Zhang W, Wang W, Zuo D, Dwyer M, Damian D, Zeller A. (2022). FADATest. Proceedings of the 44th International Conference on Software Engineering (896-908). Online publication date: 21-May-2022. https://dl.acm.org/doi/10.1145/3510003.3510169

Wu J, Dong J, Fang R, Zhang W, Wang W, Zuo D. (2022). WDBT. Journal of Systems and Software 187:C. Online publication date: 1-May-2022. https://dl.acm.org/doi/10.1016/j.jss.2022.111247

Wu J, Dong J, Fang R, Zhang W, Wang W, Zuo D. (2021). WDBT: Wear Characterization, Reduction, and Leveling of DBT Systems for Non-Volatile Memory. Proceedings of the International Symposium on Memory Systems (1-13). Online publication date: 27-Sep-2021. https://dl.acm.org/doi/10.1145/3488423.3519337

Huang J, Wang H, Fei X, Wang X, Chen W. (2021). TCStream: Large-Scale Graph Triangle-Counting on a single Machine using GPUs. IEEE Transactions on Parallel and Distributed Systems (1-1). Online publication date: 2021. https://doi.org/10.1109/TPDS.2021.3135329
