Keyword: cache-oblivious : Search

research-article

Processor-Aware Cache-Oblivious Algorithms✱

ICPP '21: Proceedings of the 50th International Conference on Parallel ProcessingArticle No.: 55, Pages 1–10https://doi.org/10.1145/3472456.3472506

Frigo et al. proposed an ideal cache model and a recursive technique to design sequential cache-efficient algorithms in a cache-oblivious fashion. Ballard et al. pointed out that it is a fundamental open problem to extend the technique to an arbitrary ...

extended-abstract

Balanced Partitioning of Several Cache-Oblivious Algorithms

Yuan Tang

SPAA '20: Proceedings of the 32nd ACM Symposium on Parallelism in Algorithms and ArchitecturesPages 575–577https://doi.org/10.1145/3350755.3400214

Frigo et al. proposed an ideal cache model and a recursive cache-oblivious technique to design sequential cache-efficient algorithms in an oblivious fashion. Ballard et al. pointed out that it is a fundamental open problem to extend the technique to an ...

research-article

Cache-oblivious High-performance Similarity Join

SIGMOD '19: Proceedings of the 2019 International Conference on Management of DataPages 87–104https://doi.org/10.1145/3299869.3319859

A similarity join combines vectors based on a distance condition. Typically, such algorithms apply a filter step (by indexing or sorting) and then refine pairs of candidate vectors. In this paper, we propose to refine the pairs in an order defined by a ...

research-article

Public Access

Cache-Oblivious Buffer Heap and Cache-Efficient Computation of Shortest Paths in Graphs

ACM Transactions on Algorithms (TALG), Volume 14, Issue 1Article No.: 1, Pages 1–33https://doi.org/10.1145/3147172

We present the buffer heap, a cache-oblivious priority queue that supports Delete-Min, Delete, and a hybrid Insert/Decrease-Key operation in O(1/B log₂ N/M) amortized block transfers from main memory, where M and B are the (unknown) cache size and block ...

research-article

Public Access

Autogen: Automatic Discovery of Efficient Recursive Divide-8-Conquer Algorithms for Solving Dynamic Programming Problems

ACM Transactions on Parallel Computing (TOPC), Volume 4, Issue 1Article No.: 4, Pages 1–30https://doi.org/10.1145/3125632

We present Autogen—an algorithm that for a wide class of dynamic programming (DP) problems automatically discovers highly efficient cache-oblivious parallel recursive divide-and-conquer algorithms from inefficient iterative descriptions of DP ...

research-article

Public Access

Provably Efficient Scheduling of Cache-oblivious Wavefront Algorithms

SPAA '17: Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and ArchitecturesPages 339–350https://doi.org/10.1145/3087556.3087586

Iterative wavefront algorithms for evaluating dynamic programming recurrences exploit optimal parallelism but show poor cache performance. Tiled-iterative wavefront algorithms achieve optimal cache complexity and high parallelism but are cache-aware and ...

poster

Public Access

POSTER: Provably Efficient Scheduling of Cache-Oblivious Wavefront Algorithms

PPoPP '17: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel ProgrammingPages 435–436https://doi.org/10.1145/3018743.3019031

Standard cache-oblivious recursive divide-and-conquer algorithms for evaluating dynamic programming recurrences have optimal serial cache complexity but often have lower parallelism compared with iterative wavefront algorithms due to artificial ...

Also Published in:

ACM SIGPLAN Notices: Volume 52 Issue 8

research-article

Public Access

Anti-Persistence on Persistent Storage: History-Independent Sparse Tables and Dictionaries

PODS '16: Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database SystemsPages 289–302https://doi.org/10.1145/2902251.2902276

We present history-independent alternatives to a B-tree, the primary indexing data structure used in databases. A data structure is history independent (HI) if it is impossible to deduce any information by examining the bit representation of the data ...

research-article

Public Access

AUTOGEN: automatic discovery of cache-oblivious parallel recursive algorithms for solving dynamic programs

PPoPP '16: Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel ProgrammingArticle No.: 10, Pages 1–12https://doi.org/10.1145/2851141.2851167

We present AUTOGEN---an algorithm that for a wide class of dynamic programming (DP) problems automatically discovers highly efficient cache-oblivious parallel recursive divide-and-conquer algorithms from inefficient iterative descriptions of DP ...

Also Published in:

ACM SIGPLAN Notices: Volume 51 Issue 8

Article

High-Performance Energy-Efficient Recursive Dynamic Programming with Matrix-Multiplication-Like Flexible Kernels

IPDPS '15: Proceedings of the 2015 IEEE International Parallel and Distributed Processing SymposiumPages 303–312https://doi.org/10.1109/IPDPS.2015.107

Dynamic Programming (DP) problems arise in wide range of application areas spanning from logistics to computational biology. In this paper, we show how to obtain high-performing parallel implementations for a class of Problems by reducing them to highly ...

research-article

The input/output complexity of triangle enumeration

PODS '14: Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systemsPages 224–233https://doi.org/10.1145/2594538.2594552

We consider the well-known problem of enumerating all triangles of an undirected graph. Our focus is on determining the input/output (I/O) complexity of this problem. Let E be the number of edges, M<E the size of internal memory, and B the block size. ...

article

High-Level Strategies for Parallel Shared-Memory Sparse Matrix-Vector Multiplication

IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 25, Issue 1Pages 116–125https://doi.org/10.1109/TPDS.2013.31

The sparse matrix-vector multiplication is an important computational kernel, but is hard to efficiently execute even in the sequential case. The problems--namely low arithmetic intensity, inefficient cache use, and limited memory bandwidth--are ...

Article

NUMA Aware Iterative Stencil Computations on Many-Core Systems

IPDPS '12: Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing SymposiumPages 461–473https://doi.org/10.1109/IPDPS.2012.50

Temporal blocking in iterative stencil computations allows to surpass the performance of peak system bandwidth that holds for a single stencil computation. However, the effectiveness of temporal blocking depends strongly on the tiling scheme, which must ...

research-article

Cache-Oblivious Algorithms

ACM Transactions on Algorithms (TALG), Volume 8, Issue 1Article No.: 4, Pages 1–22https://doi.org/10.1145/2071379.2071383

This article presents asymptotically optimal algorithms for rectangular matrix transpose, fast Fourier transform (FFT), and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms, these algorithms are cache oblivious: ...

research-article

Data management for SSDs for large-scale interactive graphics applications

I3D '11: Symposium on Interactive 3D Graphics and GamesPages 175–182https://doi.org/10.1145/1944745.1944775

Solid state drives (SSDs) are emerging as an alternative storage medium to HDDs. SSDs have performance characteristics (e.g., fast random reads) that are very different from those of HDDs. Because of the high performance of SSDs, there are increasingly ...

research-article

Binary Mesh Partitioning for Cache-Efficient Visualization

IEEE Transactions on Visualization and Computer Graphics (ITVC), Volume 16, Issue 5Pages 815–828https://doi.org/10.1109/TVCG.2010.19

One important bottleneck when visualizing large data sets is the data transfer between processor and memory. Cache-aware (CA) and cache-oblivious (CO) algorithms take into consideration the memory hierarchy to design cache efficient algorithms. CO ...

poster

Porting existing cache-oblivious linear algebra HPC modules to larrabee architecture

CF '10: Proceedings of the 7th ACM international conference on Computing frontiersPages 91–92https://doi.org/10.1145/1787275.1787298

Cache-obliviousness represents an important but relatively new concept for cache optimization. As cache-oblivious algorithms perform well on architectures with arbitrary cache configurations, the programming effort required for porting and optimizing ...

article

Cache-Oblivious Sparse Matrix-Vector Multiplication by Using Sparse Matrix Partitioning Methods

SIAM Journal on Scientific Computing (SISC), Volume 31, Issue 4Pages 3128–3154https://doi.org/10.1137/080733243

In this article, we introduce a cache-oblivious method for sparse matrix-vector multiplication. Our method attempts to permute the rows and columns of the input matrix using a recursive hypergraph-based sparse matrix partitioning scheme so that the ...

Article

EaseDB: a cache-oblivious in-memory query processor

SIGMOD '07: Proceedings of the 2007 ACM SIGMOD international conference on Management of dataPages 1064–1066https://doi.org/10.1145/1247480.1247607

We propose to demonstrate EaseDB, the first cache-oblivious queryprocessor for in-memory relational query processing. The cache-oblivious notion from the theory community refers to the property that no parameters in an algorithm or a data structure need ...

Article

Locality and parallelism optimization for dynamic programming algorithm in bioinformatics

SC '06: Proceedings of the 2006 ACM/IEEE conference on SupercomputingPages 78–eshttps://doi.org/10.1145/1188455.1188538

Dynamic programming has been one of the most efficient approaches to sequence analysis and structure prediction in biology. However, their performance is limited due to the drastic increase in both the number of biological data and variety of the ...

Applied Filters

People

Names

Institutions

Authors

Publications

Journal/Magazine Names

Proceedings/Book Names

All Publications

Content Type

Supplemental Material Type

Media Formats

Publisher

Conferences

Sponsors

Conference Event

Proceedings Series

Reproducibility Badges

Publication Date

Save to Binder

Upcoming Conferences

Also Published in:

Also Published in: