Keyword: modulo scheduling : Search

research-article

Open Access

SAT-Based Exact Modulo Scheduling Mapping for Resource-Constrained CGRAs

ACM Journal on Emerging Technologies in Computing Systems (JETC), Volume 20, Issue 3, Article No.: 8, Pages 1–26, https://doi.org/10.1145/3663675

Coarse-Grain Reconfigurable Arrays (CGRAs) represent emerging low-power architectures designed to accelerate Compute-Intensive Loops (CILs). The effectiveness of CGRAs in providing acceleration relies on the quality of mapping: how efficiently the CIL is ...

research-article

Towards High-Quality CGRA Mapping with Graph Neural Networks and Reinforcement Learning

ICCAD '22: Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, Article No.: 61, Pages 1–9, https://doi.org/10.1145/3508352.3549458

Coarse-Grained Reconfigurable Architectures (CGRAs) are a promising solution to accelerate domain applications due to their good combination of energy-efficiency and flexibility. Loops, as computation-intensive parts of applications, are often mapped onto ...

research-article

RF-CGRA: a routing-friendly CGRA with hierarchical register chains

DATE '22: Proceedings of the 2022 Conference & Exhibition on Design, Automation & Test in Europe, Pages 262–267

CGRAs are promising architectures to accelerate domain-specific applications as they combine high energy-efficiency and flexibility. With either isolated register files (RFs) or link-consuming distributed registers in each processing element (PE), ...

research-article

Folded Integer Multiplication for FPGAs

FPGA '21: The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Pages 160–170, https://doi.org/10.1145/3431920.3439299

Encryption - especially the key exchange algorithms such as RSA - is an increasing use-model for FPGAs, driven by the adoption of the FPGA as a SmartNIC in the datacenter. While bulk encryption such as AES maps well to generic FPGA features, the very ...

research-article

Free

A slack-based approach to efficiently deploy radix 8 booth multipliers

DATE '17: Proceedings of the Conference on Design, Automation & Test in Europe, Pages 1153–1158

In 1951 A. Booth published his algorithm to efficiently multiply signed numbers. Since the appearance of such algorithm, it has been widely accepted that radix 4-based Booth multipliers are the most efficient. They allow the height of the multiplier to ...

poster

Joint Modulo Scheduling and Memory Partitioning with Multi-Bank Memory for High-Level Synthesis (Abstract Only)

FPGA '17: Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Page 290, https://doi.org/10.1145/3020078.3021778

High-Level Synthesis (HLS) has been widely recognized and accepted as an efficient compilation process targeting FPGAs for algorithm evaluation and product prototyping. However, the massively parallel memory access demands and the extremely expensive ...

research-article

Integrated modulo scheduling and cluster assignment for TI TMS320C64x+ architecture

ODES '14: Proceedings of the 11th Workshop on Optimizations for DSP and Embedded Systems, Pages 25–32, https://doi.org/10.1145/2568326.2568327

To exploit the available parallelism, clustered Very Long Instruction Word (VLIW) processors rely on highly optimizing compilers. Aiming at this parallelism, many advanced compiler concepts have been developed and proposed in the past. Many of ...

research-article

Throughput-memory footprint trade-off in synthesis of streaming software on embedded multiprocessors

ACM Transactions on Embedded Computing Systems (TECS), Volume 13, Issue 3, Article No.: 46, Pages 1–26, https://doi.org/10.1145/2539036.2539042

We study the trade-off between throughput and memory footprint of embedded software that is synthesized from acyclic static dataflow (task graph) specifications targeting distributed memory multiprocessors. We identify iteration overlapping as a knob in ...

research-article

Open Access

Fast modulo scheduler utilizing patternized routes for coarse-grained reconfigurable architectures

ACM Transactions on Architecture and Code Optimization (TACO), Volume 10, Issue 4, Article No.: 58, Pages 1–24, https://doi.org/10.1145/2541228.2555314

Coarse-Grained Reconfigurable Architectures (CGRAs) present a potential of high compute throughput with energy efficiency. A CGRA consists of an array of Functional Units (FUs), which communicate with each other through an interconnect network ...

research-article

EPIMap: using epimorphism to map applications on CGRAs

DAC '12: Proceedings of the 49th Annual Design Automation Conference, Pages 1284–1291, https://doi.org/10.1145/2228360.2228600

Coarse-Grained Reconfigurable Architectures (CGRAs) are an attractive platform that promise simultaneous high-performance and high power-efficiency. One of the primary challenges in using CGRAs is to develop efficient compilers that can automatically ...

research-article

Integrated Code Generation for Loops

ACM Transactions on Embedded Computing Systems (TECS), Volume 11S, Issue 1, Article No.: 19, Pages 1–24, https://doi.org/10.1145/2180887.2180896

Code generation in a compiler is commonly divided into several phases: instruction selection, scheduling, register allocation, spill code generation, and, in the case of clustered architectures, cluster assignment. These phases are interdependent; for ...

research-article

Resource recycling: putting idle resources to work on a composable accelerator

CASES '10: Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems, Pages 21–30, https://doi.org/10.1145/1878921.1878925

Mobile computing platforms in the form of smart phones, netbooks, and personal digital assistants have become an integral part of our everyday lives. Moving ahead to the future, mobile multimedia support will become a key differentiating factor for ...

research-article

CGRA express: accelerating execution using dynamic operation fusion

CASES '09: Proceedings of the 2009 international conference on Compilers, architecture, and synthesis for embedded systems, Pages 271–280, https://doi.org/10.1145/1629395.1629433

Coarse-grained reconfigurable architectures (CGRAs) present an appealing hardware platform by providing programmability with the potential for high computation throughput, scalability, low cost, and energy efficiency. CGRAs have been effectively used ...

research-article

Modulo scheduling without overlapped lifetimes

LCTES '09: Proceedings of the 2009 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems, Pages 1–10, https://doi.org/10.1145/1542452.1542454

This paper describes complementary software- and hardware-based approaches for handling overlapping register lifetimes that occur in modulo scheduled loops. Modulo scheduling takes the N-instructions in a loop body and constructs an M-stage software ...

Also Published in:

ACM SIGPLAN Notices: Volume 44 Issue 7

research-article

AGAMOS: A Graph-Based Approach to Modulo Scheduling for Clustered Microarchitectures

IEEE Transactions on Computers (ITCO), Volume 58, Issue 6, Pages 770–783, https://doi.org/10.1109/TC.2009.32

This paper presents AGAMOS, a technique to modulo schedule loops on clustered microarchitectures. The proposed scheme uses a multilevel graph partitioning strategy to distribute the workload among clusters and reduces the number of intercluster ...

Article

Reconstructing Control Flow in Modulo Scheduled Loops

ICIS '08: Proceedings of the Seventh IEEE/ACIS International Conference on Computer and Information Science (ICIS 2008), Pages 539–544, https://doi.org/10.1109/ICIS.2008.16

Software pipelining is a loop optimization technique used to exploit instruction level parallelism in the loop. EPIC architectures, such as Intel IA-64 (Itanium), provide extensive hardware support for software pipelining to generate compact and highly ...

research-article

Modulo scheduling for highly customized datapaths to increase hardware reusability

CGO '08: Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization, Pages 124–133, https://doi.org/10.1145/1356058.1356075

In the embedded domain, custom hardware in the form of ASICs is often used to implement critical parts of applications when performance and energy efficiency goals cannot be met with software implementations on a general purpose processor or DSP. The ...

research-article

Latency-tolerant software pipelining in a production compiler

CGO '08: Proceedings of the 6th annual IEEE/ACM international symposium on Code generation and optimization, Pages 104–113, https://doi.org/10.1145/1356058.1356073

In this paper we investigate the benefit of scheduling non-critical loads for a higher latency during software pipelining. "Non-critical" denotes those loads that have sufficient slack in the cyclic data dependence graph so that increasing the ...

Article

Hierarchical coarse-grained stream compilation for software defined radio

CASES '07: Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems, Pages 115–124, https://doi.org/10.1145/1289881.1289903

Software Defined Radio (SDR) is an emerging embedded domain where the physical layer of wireless protocols is implemented in software rather than the traditional application specific hardware. The operation throughput requirements of current third-...

Article

Compiler assisted architectural exploration for coarse grained reconfigurable arrays

GLSVLSI '07: Proceedings of the 17th ACM Great Lakes symposium on VLSI, Pages 164–167, https://doi.org/10.1145/1228784.1228827

A large number of factors influence the hardware cost and the mapping efficiency of applications on coarse grain reconfigurable architectures. This paper investigates for the first time in a unified way the four factors that are directly related with ...
