research-article

Open access

Towards Efficient Control Flow Handling in Spatial Architecture via Architecting the Control Flow Plane

Authors:

Shouyi Yin

MICRO '23: Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture

Pages 1395 - 1408

https://doi.org/10.1145/3613424.3614246

Published: 08 December 2023

Abstract

Spatial architectures are high-performance architectures that use control flow graphs and data flow graphs as their computational model and producer/consumer models as their execution model. Existing spatial architectures, however, handle control flow poorly. Categorizing their processing element (PE) execution models shows that they lack autonomous, peer-to-peer, and temporally loosely coupled control flow handling, which limits their performance on control-intensive programs.

We propose Marionette, a spatial architecture with an explicitly designed control flow plane. The Control Flow Plane enables autonomous, peer-to-peer, and temporally loosely coupled control flow handling. Proactive PE Configuration overlaps configuration with computation and delivers it in time, improving the handling of branch divergence. Agile PE Assignment improves the pipeline performance of imperfect loops. We develop the full Marionette stack (ISA, compiler, simulator, RTL) and show that, across a variety of challenging control-intensive programs, Marionette outperforms the state-of-the-art spatial architectures Softbrain, TIA, REVEL, and RipTide by geomean 2.88×, 3.38×, 1.55×, and 2.66×, respectively.
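To make the two program patterns named in the abstract concrete, the following is a minimal sketch of a kernel containing both an imperfect loop nest and a data-dependent branch. It is not taken from the paper or its benchmarks; the function name `thresholded_row_sums` and its inputs are hypothetical and chosen only for illustration.

```python
def thresholded_row_sums(rows, threshold):
    """Illustrative control-intensive kernel (hypothetical, not from the paper).

    rows: list of lists of numbers; rows may have different lengths.
    """
    results = []
    for row in rows:              # outer loop
        acc = 0                   # work at the outer level: the nest is "imperfect"
        for x in row:             # inner loop, trip count depends on the data
            if x > threshold:     # data-dependent branch: iterations take different paths
                acc += x
            else:
                acc -= 1
        results.append(acc)       # more outer-level work after the inner loop
    return results


print(thresholded_row_sums([[1, 5, 3], [], [10]], 2))
```

The statements before and after the inner loop are what make the nest imperfect, and the `if`/`else` inside it is where execution paths diverge from one element to the next. These are the kinds of control flow the abstract says existing spatial architectures handle poorly.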

Cited By

Deng J, Tang X, Yue Z, Lu G, Yang Q, Zhang J, Li J, Li C, Wei S, Hu Y, Yin S. (2024). Efficient Orchestrated AI Workflows Execution on Scale-Out Spatial Architecture. IEEE Transactions on Circuits and Systems for Artificial Intelligence 1:2 (229-243). Online publication date: Dec-2024.
https://doi.org/10.1109/TCASAI.2024.3476237

Index Terms

Towards Efficient Control Flow Handling in Spatial Architecture via Architecting the Control Flow Plane
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Reconfigurable computing
2. Theory of computation
  1. Models of computation

Information & Contributors

Information

Published In

MICRO '23: Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture

October 2023

1528 pages

ISBN:9798400703294

DOI:10.1145/3613424

Copyright © 2023 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

the Science and Technology Innovation 2030 - New Generation of AI Project
Beijing Municipal Science and Technology Project
NSFC
the National Key Research and Development Program
Beijing National Research Center For Information Science And Technology
the Beijing Advanced Innovation Center for Integrated Circuits
Tsinghua University-China Mobile Communications Group Co.,Ltd. Joint Institute

Conference

MICRO '23

Sponsor:

SIGMICRO

MICRO '23: 56th Annual IEEE/ACM International Symposium on Microarchitecture

October 28 - November 1, 2023

Toronto, ON, Canada

Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%
