DOI: 10.1145/3611643.3616318
research-article
Open access

Leveraging Hardware Probes and Optimizations for Accelerating Fuzz Testing of Heterogeneous Applications

Published: 30 November 2023

Abstract

There is growing interest in the computer architecture community in incorporating heterogeneity and specialization to improve performance. Developers can create heterogeneous applications that consist of both host code and kernel code, where compute-intensive kernels are offloaded from the CPU to hardware accelerators. Testing such applications on real heterogeneous architectures is extremely challenging because kernels are black boxes: they expose no information about their internal execution that could help diagnose issues such as silent hangs or unexpected results. Additionally, inputs to heterogeneous applications are often large matrices, leading to a vast search space for identifying bug-revealing inputs.
We propose a novel fuzz testing technique, HFuzz, to enable efficient testing on real heterogeneous architectures. HFuzz aims to increase both the observability of hardware kernels and testing efficiency through a three-pronged approach. First, HFuzz automatically generates test guidance by inserting device-side, in-kernel hardware probes in addition to host-side software monitors. Second, it performs rapid input space exploration by offloading compute-intensive input mutations to hardware kernels. Third, HFuzz parallelizes fuzzing and enables fast on-chip memory access by applying four FPGA-level optimizations: loop unrolling, shannonization, data preloading, and dynamic kernel sharing.
We evaluate HFuzz on seven open-source oneAPI subjects from Intel. HFuzz speeds up fuzz testing by 4.7× with HW-accelerated input space exploration. By incorporating HW probes in tandem with SW monitors, HFuzz finds 33 defects within 4 hours and reveals 25 unique, unexpected behavior symptoms that could not be found by SW-based monitoring alone. HFuzz is the first technique to design hardware optimizations to accelerate fuzz testing.
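
To make the idea of probe-guided fuzzing concrete, the following is a minimal sketch in Python and not HFuzz's actual implementation. It assumes a software stand-in for an offloaded kernel (`kernel_with_probe`, a hypothetical name) that returns, alongside its result, a dictionary of "probe" values mimicking what in-kernel hardware probes might expose. The fuzzing loop keeps a mutated matrix only when the probes show a behavior bucket that has not been seen before. The 16-bit accumulator, the bucketing by bit length, and the single-cell mutation are all illustrative assumptions.

```python
import random


def kernel_with_probe(matrix):
    """Software stand-in for an offloaded kernel (hypothetical).

    Returns (result, probes). `probes` mimics in-kernel probes: internal
    values that a black-box kernel would normally hide from the host.
    """
    acc = 0
    max_acc = 0
    iterations = 0
    for row in matrix:
        for v in row:
            acc = (acc + v * v) & 0xFFFF  # 16-bit accumulator that can wrap
            max_acc = max(max_acc, acc)
            iterations += 1
    return acc, {"max_acc": max_acc, "iterations": iterations}


def mutate(matrix, rng):
    """Replace one random cell with a random byte value (host-side mutation)."""
    child = [row[:] for row in matrix]
    r = rng.randrange(len(child))
    c = rng.randrange(len(child[r]))
    child[r][c] = rng.randrange(0, 256)
    return child


def fuzz(seed, rounds, rng):
    """Keep inputs whose probe values fall into a previously unseen bucket."""
    corpus = [seed]
    seen = set()
    for _ in range(rounds):
        parent = rng.choice(corpus)
        child = mutate(parent, rng)
        _, probes = kernel_with_probe(child)
        bucket = probes["max_acc"].bit_length()  # coarse feedback signal
        if bucket not in seen:
            seen.add(bucket)
            corpus.append(child)
    return corpus, seen
```

A call such as `fuzz([[0] * 4 for _ in range(4)], rounds=200, rng=random.Random(0))` returns the retained inputs and the set of observed buckets. In a real heterogeneous setting the probe values would come from the device rather than from a Python function, and the feedback would be combined with host-side monitors, as the abstract describes.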

Supplementary Material

Video (fse23main-p677-p-video.mp4)


Cited By

  • (2024) Stalling in Queuing Systems with Heterogeneous Channels. Applied Sciences 14:2 (773). DOI: 10.3390/app14020773. Online publication date: 16-Jan-2024
  • (2023) Software Engineering for Data Intensive Scalable Computing and Heterogeneous Computing. 2023 IEEE/ACM International Conference on Software Engineering: Future of Software Engineering (ICSE-FoSE), 54-68. DOI: 10.1109/ICSE-FoSE59343.2023.00006. Online publication date: 14-May-2023


Published In

ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
November 2023
2215 pages
ISBN: 9798400703270
DOI: 10.1145/3611643
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Fuzz Testing
  2. Heterogeneous Applications

Conference

ESEC/FSE '23

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%
