Cited By
- Kelefouras V., Keramidas G. (2023). Design and Implementation of Deep Learning 2D Convolutions on Modern CPUs. IEEE Transactions on Parallel and Distributed Systems, 34:12 (3104-3116). DOI: 10.1109/TPDS.2023.3322037. Online publication date: 1-Dec-2023
We present the design and implementation of an automatic polyhedral source-to-source transformation framework that can optimize regular programs (sequences of possibly imperfectly nested loops) for parallelism and locality simultaneously. Through this ...
Global locality analysis is a technique for improving the cache performance of a sequence of loop nests through a combination of loop and data layout optimizations. Pure loop transformations are restricted by data dependences and may not be very ...
Tiling is a well-known loop transformation to improve temporal locality of nested loops. Current compiler algorithms for tiling are limited to loops which are perfectly nested or can be transformed, in trivial ways, into a perfect nest. This paper ...
Association for Computing Machinery
New York, NY, United States