Vectorization-aware loop unrolling with seed forwarding
- Rodrigo C. O. Rocha,
- Vasileios Porpodas,
- Pavlos Petoumenos,
- Luís F. W. Góes,
- Zheng Wang,
- Murray Cole,
- Hugh Leather
Loop unrolling is a widely adopted loop transformation, commonly used for enabling subsequent optimizations. Straight-line-code vectorization (SLP) is an optimization that benefits from unrolling. SLP converts isomorphic instruction sequences into ...
Secure delivery of program properties through optimizing compilation
Annotations and assertions capturing static program properties are ubiquitous, from robust software engineering to safety-critical or secure code. These may be functional or non-functional properties of control and data flow, memory usage, I/O and real ...
Mix your contexts well: opportunities unleashed by recent advances in scaling context-sensitivity
Existing precise context-sensitive heap analyses do not scale well for large OO programs. Further, identifying the right context abstraction becomes quite intriguing as two of the most popular categories of context abstractions (call-site- and object-...
Scalable pointer analysis of data structures using semantic models
Pointer analysis is widely used as a base for different kinds of static analyses and compiler optimizations. Designing a scalable pointer analysis with acceptable precision for use in production compilers is still an open question. Modern object ...
A study of event frequency profiling with differential privacy
Program profiling is widely used to measure run-time execution properties, such as the frequency of method and statement execution. Such profiling could be applied to deployed software to gain performance insights about the behavior of many ...
Improving database query performance with automatic fusion
Array-based programming languages have shown significant promise for improving performance of column-based in-memory database systems, allowing elegant representation of query execution plans that are also amenable to standard compiler optimization ...
Robust quantization of deep neural networks
We studied robust quantization of deep neural networks (DNNs) for embedded devices. Existing compression techniques often generate DNNs that are sensitive to external errors. Because embedded devices may be affected by external lights and outside ...
Generating fast sparse matrix vector multiplication from a high level generic functional IR
Usage of high-level intermediate representations promises the generation of fast code from a high-level description, improving the productivity of developers while achieving the performance traditionally only reached with low-level programming ...
Runtime multi-versioning and specialization inside a memoized speculative loop optimizer
In this paper, we propose a runtime framework that implements code multi-versioning and specialization to optimize and parallelize loop kernels that are invoked many times with varying parameters. These parameters may influence the code structure, the ...
Dynamic property caches: a step towards faster JavaScript proxy objects
Inline caches and hidden classes are two essential components for closing the performance gap between static languages such as Java, Scheme, or ML and dynamic languages such as JavaScript or Python. They rely on the observation that for a particular ...
Mixed-data-model heterogeneous compilation and OpenMP offloading
- Andreas Kurth,
- Koen Wolters,
- Björn Forsberg,
- Alessandro Capotondi,
- Andrea Marongiu,
- Tobias Grosser,
- Luca Benini
Heterogeneous computers combine a general-purpose host processor with domain-specific programmable many-core accelerators, uniting high versatility with high performance and energy efficiency. While the host manages ever-more application memory, ...
Balancing performance and productivity for the development of dynamic binary instrumentation tools: a case study on Arm systems
Dynamic Binary Instrumentation (DBI) is a well-established approach for analysing the execution of applications at the level of machine code. DBI frameworks implement a runtime system capable of modifying running applications without access to their ...
Compiling first-order functions to session-typed parallel code
Building correct and efficient message-passing parallel programs still poses many challenges. The incorrect use of message-passing constructs can introduce deadlocks, and a bad task decomposition will not achieve good speedups. Current approaches focus ...
Is stateful packrat parsing really linear in practice? a counter-example, an improved grammar, and its parsing algorithms
Stateful packrat parsing is an algorithm for parsing syntaxes that have context-sensitive features. It is well known among researchers that the running time of stateful packrat parsing is linear for real-world grammars, as demonstrated in ...
Bitwidth customization in image processing pipelines using interval analysis and SMT solvers
Unlike CPUs and GPUs, FPGAs allow the use of custom fixed-point data types, specified as a tuple (α, β). The parameters α and β denote the integral and fractional bitwidths, respectively. The power and area savings while performing ...
Automatically harnessing sparse acceleration
Sparse linear algebra is central to many scientific programs, yet compilers fail to optimize it well. High-performance libraries are available, but adoption costs are significant. Moreover, libraries tie programs into vendor-specific software and ...
Postcondition-preserving fusion of postorder tree transformations
Tree transformations are common in applications such as program rewriting in compilers. Using a series of simple transformations to build a more complex system can make the resulting software easier to understand, maintain, and reason about. Fusion ...
Compiler-based graph representations for deep learning models of code
In natural language processing, novel methods in deep learning, like recurrent neural networks (RNNs) on sequences of words, have been very successful. In contrast to natural languages, programming languages usually have a well-defined structure. With ...
Relaxing the one definition rule in interpreted C++
- Javier López-Gómez,
- Javier Fernández,
- David del Rio Astorga,
- Vassil Vassilev,
- Axel Naumann,
- J. Daniel García
Most implementations of the C++ programming language generate binary executable code. However, interpreted execution of C++ sources has its own use cases as the Cling interpreter from CERN's ROOT project has shown. Some limitations are derived from the ...
Index Terms
- Proceedings of the 29th International Conference on Compiler Construction