Welcome to the 2014 International Conference on Compilers, Architecture and Synthesis for Embedded Systems -- CASES'14! We are very pleased to continue the CASES tradition of bringing a unique focus on the intersection of architecture and compilers to Embedded Systems Week (ESWEEK). CASES has served this purpose since 2006, when it became one of the three core conferences that make up ESWEEK.
Proceeding Downloads
The improbable but highly appropriate marriage of 3D stacking and neuromorphic accelerators
3D stacking is a promising technology (low latency/power/area, high bandwidth); its main shortcoming is increased power density. Simultaneously, motivated by energy constraints, architectures are evolving towards greater customization, with tasks ...
Heuristics for greedy transport triggered architecture interconnect exploration
Most power dissipation in Very Long Instruction Word (VLIW) processors occurs in their large, multi-port register files (RFs). Transport Triggered Architecture (TTA) is a VLIW variant whose exposed datapath reduces the need for RF accesses and ports. However,...
Energy-efficient VFI-partitioned multicore design using wireless NoC architectures
In recent years, multiple Voltage Frequency Island (VFI)-based designs have increasingly made their way into both commercial and research multicore platforms. On the other hand, the wireless Network-on-Chip (WiNoC) architecture has emerged as an energy-...
Retargetable automatic generation of compound instructions for CGRA based reconfigurable processor applications
Reconfigurable processors such as the SRP (Samsung Reconfigurable Processor) have become increasingly important because they offer just enough flexibility to accept software solutions while providing application-specific hardware configurability for faster ...
COREFAB: concurrent reconfigurable fabric utilization in heterogeneous multi-core systems
Application-specific accelerators may provide considerable speedup in single-core systems with a runtime-reconfigurable fabric (for simplicity called "fabric" in the following). A reconfigurable core, i.e., a processor core pipeline coupled to a fabric, ...
Automatic custom instruction identification in memory streaming algorithms
Application-specific instruction set processors (ASIPs) extend the instruction set of a general purpose processor by dedicated custom instructions (CIs). In the last decade, reconfigurable processors advanced this concept towards run-time ...
Auto-parallelization of data structure operations for GPUs
We present an auto-parallelization technique for generating GPU implementations of data-structure operations from a sequential specification. The technique partitions the data-structure operations into barrier-separated phases such that each phase ...
A compilation flow for parametric dataflow: programming model, scheduling, and application to heterogeneous MPSoC
Efficient programming of signal processing applications on embedded systems is a complex problem. High-level models such as Synchronous Dataflow (SDF) have been privileged candidates for dealing with this complexity. These models make it possible to express ...
A compiler framework for automatically mapping data parallel programs to heterogeneous MPSoCs
Many of today's embedded devices are based on Multiprocessor Systems-on-Chip (MPSoCs). Such devices are usually heterogeneous, containing DSPs and specialized accelerators as well as one or more CPUs. This heterogeneity allows efficient implementations ...
Team up: cooperative memory management in embedded systems
- Isabella Stilkerich,
- Philip Taffner,
- Christoph Erhardt,
- Christian Dietrich,
- Christian Wawersich,
- Michael Stilkerich
The use of managed, type-safe languages such as Java in real-time and embedded systems can offer productivity and, in particular, safety and dependability benefits over the dominant unsafe languages at reasonable cost. A JVM that has dynamic memory-...
A low-cost memory interface for high-throughput accelerators
Heterogeneous multi-cores, a mix of cores and accelerators, are becoming prevalent. These accelerators are designed for both speed and energy improvements, and thus, they increasingly come with a large number of load/store ports for achieving a high ...
EnVM: virtual memory design for new memory architectures
Virtual memory is optimized for SRAM-based memory devices in which memory accesses are symmetric, i.e., the latencies of read and write accesses are similar. Unfortunately, with the emergence of newer non-volatile memory (NVM) technologies that are denser ...
A system-level simulation framework for evaluating task migration in MPSoCs
Task migration is the transfer of the execution of a process (task) from one processing element to another. It originates from the massive deployment of distributed systems in the parallel computing field to enable dynamic load distribution, fault ...
Context-sensitive timing simulation of binary embedded software
We present an approach to accurately simulate the temporal behavior of binary embedded software based on timing data generated using static analysis. As the timing of an instruction sequence is significantly influenced by the microarchitecture state ...
Automated ISA branch coverage analysis and test case generation for retargetable instruction set simulators
Processor design tools integrate into their workflows generators for instruction set simulators (ISSs) from architecture descriptions. However, it is difficult to validate the correctness of these simulators. ISA coverage analysis is insufficient to ...
Control-layer optimization for flow-based mVLSI microfluidic biochips
Recent advances in flow-based microfluidic biochips have enabled the emergence of lab-on-a-chip devices for biomolecular recognition and point-of-care disease diagnostics. However, the adoption of flow-based biochips is hampered today by the lack of ...
Splitting functions into single-entry regions
As the performance requirements of today's real-time systems are on the rise, system engineers are increasingly forced to optimize and tune the execution time of real-time software. Apart from usual optimizations targeting the average-case performance ...
Construction of GCCFG for inter-procedural optimizations in software managed manycore (SMM) architectures
Software Managed Manycore (SMM) architectures, in which each core has only a scratch pad memory (instead of caches), are a promising solution for scaling the memory hierarchy to hundreds of cores. However, in these architectures, the code and data of ...
CAPED: context-aware personalized display brightness for mobile devices
The display remains the primary user interface on many computing devices, ranging from traditional devices such as desktops and laptops, to the more pervasive devices such as smartphones and smartwatches. Thus, the overall user experience with these ...
A high-level model of embedded flash energy consumption
The alignment of code in the flash memory of deeply embedded SoCs can have a large impact on the total energy consumption of a computation. We investigate the effect of code alignment in six SoCs and find that a large proportion of this energy (up to 15%...
Reducing cache leakage energy for hybrid SPM-cache architectures
In this paper, we study how to reduce the cache leakage energy efficiently in a hybrid SPM (Scratch-Pad Memory) and cache architecture. Since SPM can reduce the access frequency to the cache, we find it is possible to place the cache lines of the hybrid ...
AdaPNet: adapting process networks in response to resource variations
A widely considered strategy to prevent interference issues on multi-processor systems is to isolate the execution of the individual applications by running each of them on a dedicated virtual guest machine. The amount of computing power available to a ...
SDCTune: a model for predicting the SDC proneness of an application for configurable protection
Silent Data Corruption (SDC) is a serious reliability issue in many domains, including embedded systems. However, current protection techniques are brittle, and do not allow programmers to trade off performance for SDC coverage. Further, many of them ...
Fault resilient physical neural networks on a single chip
Device scaling engineering is facing major challenges in producing reliable transistors for future electronic technologies. With shrinking device sizes, the total circuit sensitivity to both permanent and transient faults has increased significantly. ...
Index Terms
- Proceedings of the 2014 International Conference on Compilers, Architecture and Synthesis for Embedded Systems