Nothing Special   »   [go: up one dir, main page]

skip to main content
article

The Cache Complexity of Multithreaded Cache Oblivious Algorithms

Published: 10 June 2009 Publication History

Abstract

We present a technique for analyzing the number of cache misses incurred by multithreaded cache oblivious algorithms on an idealized parallel machine in which each processor has a private cache. We specialize this technique to computations executed by the Cilk work-stealing scheduler on a machine with dag-consistent shared memory. We show that a multithreaded cache oblivious matrix multiplication incurs Image (http://www.springerlink.com/content/l8946u2p54m72263/224_2007_9098_Article_IEq1.gif) is missing or otherwise invalid. cache misses when executed by the Cilk scheduler on a machine with  P processors, each with a cache of size  Z , with high probability. This bound is tighter than previously published bounds. We also present a new multithreaded cache oblivious algorithm for 1D stencil computations incurring Image (http://www.springerlink.com/content/l8946u2p54m72263/224_2007_9098_Article_IEq2.gif) is missing or otherwise invalid. cache misses with high probability, one for Gaussian elimination and back substitution, and one for the length computation part of the longest common subsequence problem incurring Image (http://www.springerlink.com/content/l8946u2p54m72263/224_2007_9098_Article_IEq3.gif) is missing or otherwise invalid. cache misses with high probability.

Cited By

View all
  • (2024)Fast American Option Pricing using Nonlinear StencilsProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638506(316-332)Online publication date: 2-Mar-2024
  • (2023)A Fast Algorithm for Aperiodic Linear Stencil Computation using Fast Fourier TransformsACM Transactions on Parallel Computing10.1145/360633810:4(1-34)Online publication date: 24-Jul-2023
  • (2023)DHTS: A Dynamic Hybrid Tiling Strategy for Optimizing Stencil Computation on GPUsIEEE Transactions on Computers10.1109/TC.2023.327106072:10(2795-2807)Online publication date: 27-Apr-2023
  • Show More Cited By

Index Terms

  1. The Cache Complexity of Multithreaded Cache Oblivious Algorithms
      Index terms have been assigned to the content through auto-classification.

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image Theory of Computing Systems
      Theory of Computing Systems  Volume 45, Issue 2
      Special Issue: Symposium on Parallelism in Algorithms and Architectures 2006; Guest Editors: Robert Kleinberg and Christian Scheideler
      June 2009
      239 pages

      Publisher

      Springer-Verlag

      Berlin, Heidelberg

      Publication History

      Published: 10 June 2009

      Author Tags

      1. Cache oblivious algorithms
      2. Multithreading
      3. Stencil computations

      Qualifiers

      • Article

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)0
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 22 Sep 2024

      Other Metrics

      Citations

      Cited By

      View all
      • (2024)Fast American Option Pricing using Nonlinear StencilsProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638506(316-332)Online publication date: 2-Mar-2024
      • (2023)A Fast Algorithm for Aperiodic Linear Stencil Computation using Fast Fourier TransformsACM Transactions on Parallel Computing10.1145/360633810:4(1-34)Online publication date: 24-Jul-2023
      • (2023)DHTS: A Dynamic Hybrid Tiling Strategy for Optimizing Stencil Computation on GPUsIEEE Transactions on Computers10.1109/TC.2023.327106072:10(2795-2807)Online publication date: 27-Apr-2023
      • (2021)Processor-Aware Cache-Oblivious Algorithms✱Proceedings of the 50th International Conference on Parallel Processing10.1145/3472456.3472506(1-10)Online publication date: 9-Aug-2021
      • (2021)Tile size selection of affine programs for GPGPUs using polyhedral cross-compilationProceedings of the 35th ACM International Conference on Supercomputing10.1145/3447818.3460369(13-26)Online publication date: 3-Jun-2021
      • (2021)Fast Stencil Computations using Fast Fourier TransformsProceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3409964.3461803(8-21)Online publication date: 6-Jul-2021
      • (2021)Data Oblivious Algorithms for MulticoresProceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3409964.3461783(373-384)Online publication date: 6-Jul-2021
      • (2020)PencilProceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis10.5555/3433701.3433814(1-16)Online publication date: 9-Nov-2020
      • (2020)Improving the Space-Time Efficiency of Matrix Multiplication AlgorithmsWorkshop Proceedings of the 49th International Conference on Parallel Processing10.1145/3409390.3409404(1-10)Online publication date: 17-Aug-2020
      • (2020)Balanced Partitioning of Several Cache-Oblivious AlgorithmsProceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3350755.3400214(575-577)Online publication date: 6-Jul-2020
      • Show More Cited By

      View Options

      View options

      Get Access

      Login options

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media