Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3350755.3400227acmconferencesArticle/Chapter ViewAbstractPublication PagesspaaConference Proceedingsconference-collections
research-article
Open access

Optimal Parallel Algorithms in the Binary-Forking Model

Published: 09 July 2020 Publication History

Abstract

In this paper we develop optimal algorithms in the binary-forking model for a variety of fundamental problems, including sorting, semisorting, list ranking, tree contraction, range minima, and ordered set union, intersection and difference. In the binary-forking model, tasks can only fork into two child tasks, but can do so recursively and asynchronously. The tasks share memory, supporting reads, writes and test-and-sets. Costs are measured in terms of work (total number of instructions), and span (longest dependence chain).
The binary-forking model is meant to capture both algorithm performance and algorithm-design considerations on many existing multithreaded languages, which are also asynchronous and rely on binary forks either explicitly or under the covers. In contrast to the widely studied PRAM model, it does not assume arbitrary-way forks nor synchronous operations, both of which are hard to implement in modern hardware. While optimal PRAM algorithms are known for the problems studied herein, it turns out that arbitrary-way forking and strict synchronization are powerful, if unrealistic, capabilities. Natural simulations of these PRAM algorithms in the binary-forking model (i.e., implementations in existing parallel languages) incur an Ω(log n) overhead in span. This paper explores techniques for designing optimal algorithms when limited to binary forking and assuming asynchrony. All algorithms described in this paper are the first algorithms with optimal work and span in the binary-forking model. Most of the algorithms are simple. Many are randomized.

References

[1]
U. A. Acar, G. E. Blelloch, and R. D. Blumofe. 2002. The Data Locality of Work Stealing. Theoretical Computer Science (TCS), Vol. 35, 3 (2002).
[2]
Umut A. Acar, Arthur Charguéraud, Adrien Guatto, Mike Rainey, and Filip Sieczkowski. 2018. Heartbeat Scheduling: Provable Efficiency for Nested Parallelism. In ACM Conference on Programming Language Design and Implementation (PLDI). 769--782.
[3]
Kunal Agrawal, Jeremy T. Fineman, Kefu Lu, Brendan Sheridan, Jim Sukha, and Robert Utterback. 2014. Provably Good Scheduling for Parallel Programs That Use Data Structures Through Implicit Batching. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[4]
Kunal Agrawal, Seth Gilbert, and Wei Quan Lim. 2018. Parallel Working-Set Search Structures. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[5]
Yaroslav Akhremtsev and Peter Sanders. 2016. Fast Parallel Operations on Search Trees. In IEEE International Conference on High Performance Computing (HiPC).
[6]
Laurent Alonso and Ren Schott. 1996. A parallel algorithm for the generation of a permutation and applications. Theoretical Computer Science (TCS), Vol. 159, 1 (1996).
[7]
Stephen Alstrup, Cyril Gavoille, Haim Kaplan, and Theis Rauhe. 2004. Nearest common ancestors: A survey and a new algorithm for a distributed environment. Theory of Computing Systems (TOCS), Vol. 37, 3 (2004), 441--456.
[8]
R. Anderson. 1990. Parallel Algorithms for Generating Random Permutations on a Shared Memory Machine. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[9]
Richard J Anderson and Gary L Miller. 1990. A simple randomized parallel algorithm for list-ranking. Inform. Process. Lett., Vol. 33, 5 (1990), 269--273.
[10]
Lars Arge, Johannes Fischer, Peter Sanders, and Nodari Sitchinava. 2013. On (dynamic) range minimum queries in external memory. In Workshop on Algorithms and Data Structures. Springer, 37--48.
[11]
N. S. Arora, R. D. Blumofe, and C. G. Plaxton. 2001. Thread Scheduling for Multiprogrammed Multiprocessors. Theory of Computing Systems (TOCS), Vol. 34, 2 (01 Apr 2001).
[12]
Sara Baase. 1993. Introduction to Parallel Connectivity, List Ranking, and Euler Tour Techniques. In Synthesis of Parallel Algorithms, John Reif (Ed.). Morgan Kaufmann, 61--114.
[13]
Naama Ben-David, Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Yan Gu, Charles McGuffey, and Julian Shun. 2016. Parallel Algorithms for Asymmetric Read-Write Costs. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[14]
Naama Ben-David, Guy E Blelloch, Jeremy T Fineman, Phillip B Gibbons, Yan Gu, Charles McGuffey, and Julian Shun. 2018. Implicit Decomposition for Write-Efficient Connectivity Algorithms. In IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[15]
Michael A Bender, Alex Conway, Mart'in Farach-Colton, William Jannen, Yizheng Jiao, Rob Johnson, Eric Knorr, Sara McAllister, Nirjhar Mukherjee, Prashant Pandey, et al. 2019. Small refinements to the dam can have big consequences for data-structure design. In The 31st ACM Symposium on Parallelism in Algorithms and Architectures. 265--274.
[16]
Michael A Bender and Martin Farach-Colton. 2000. The LCA problem revisited. In Latin American Symposium on Theoretical Informatics (LATIN). Springer, 88--94.
[17]
Omer Berkman and Uzi Vishkin. 1993. Recursive star-tree parallel data structure. SIAM J. Scientific Computing, Vol. 22, 2 (1993), 221--242.
[18]
Guy E. Blelloch. 1993. Prefix Sums and Their Applications. In Synthesis of Parallel Algorithms, John Reif (Ed.). Morgan Kaufmann.
[19]
Guy E. Blelloch. 1996. Programming Parallel Algorithms. Commun. ACM, Vol. 39, 3 (March 1996).
[20]
Guy E. Blelloch, Rezaul Alam Chowdhury, Phillip B. Gibbons, Vijaya Ramachandran, Shimin Chen, and Michael Kozuch. 2008. Provably good multicore cache performance for divide-and-conquer algorithms. In ACM-SIAM Symposium on Discrete Algorithms (SODA).
[21]
Guy E Blelloch, Daniel Ferizovic, and Yihan Sun. 2016a. Just Join for Parallel Ordered Sets. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[22]
Guy E. Blelloch, Daniel Ferizovic, and Yihan Sun. 2016b. Just Join for Parallel Ordered Sets. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[23]
Guy E Blelloch, Jeremy T Fineman, Phillip B Gibbons, and Julian Shun. 2012a. Internally deterministic parallel algorithms can be fast. In ACM Symposium on Principles and Practice of Parallel Programming (PPOPP).
[24]
Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, and Harsha Vardhan Simhadri. 2011. Scheduling Irregular Parallel Computations on Hierarchical Caches. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[25]
Guy E Blelloch, Jeremy T Fineman, Yan Gu, and Yihan Sun. 2020 a. Optimal parallel algorithms in the binary-forking model. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[26]
Guy E. Blelloch and Phillip B. Gibbons. 2004. Effectively sharing a cache among threads. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[27]
Guy E. Blelloch, Phillip B. Gibbons, and Yossi Matias. 1999. Provably Efficient Scheduling for Languages with Fine-grained Parallelism. J. ACM, Vol. 46, 2 (March 1999).
[28]
Guy E Blelloch, Phillip B Gibbons, and Harsha Vardhan Simhadri. 2010. Low depth cache-oblivious algorithms. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[29]
Guy E Blelloch, Yan Gu, Julian Shun, and Yihan Sun. 2016c. Parallelism in Randomized Incremental Algorithms. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[30]
Guy E Blelloch, Yan Gu, Julian Shun, and Yihan Sun. 2018. Parallel Write-Efficient Algorithms and Data Structures for Computational Geometry. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[31]
Guy E Blelloch, Yan Gu, Julian Shun, and Yihan Sun. 2020 b. Randomized Incremental Convex Hull is Highly Parallel. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[32]
Guy E. Blelloch and Margaret Reid-Miller. 1998. Fast Set Operations Using Treaps. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[33]
Guy E Blelloch and Margaret Reid-Miller. 1999. Pipelining with futures. Theory of Computing Systems (TOCS), Vol. 32, 3 (1999).
[34]
Guy E. Blelloch, Harsha Vardhan Simhadri, and Kanat Tangwongsan. 2012b. Parallel and I/O efficient set covering algorithms. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[35]
Robert D. Blumofe and Charles E. Leiserson. 1998. Space-Efficient Scheduling of Multithreaded Computations. SIAM J. Scientific Computing, Vol. 27, 1 (1998).
[36]
Robert D Blumofe and Charles E Leiserson. 1999. Scheduling multithreaded computations by work stealing. J. ACM, Vol. 46, 5 (1999), 720--748.
[37]
Allan Borodin. 1977. On Relating Time and Space to Size and Depth. SIAM J. Scientific Computing, Vol. 6, 4 (1977), 733--744.
[38]
Anastasia Braginsky and Erez Petrank. 2012. A lock-free B
[39]
tree. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 58--67.
[40]
Mark R. Brown and Robert Endre Tarjan. 1980. Design and Analysis of a Data Structure for Representing Sorted Lists. SIAM J. Scientific Computing, Vol. 9, 3 (1980), 594--614.
[41]
Zoran Budimlić, Vincent Cavé, Raghavan Raman, Jun Shirako, Saug nak Tacs irlar, Jisheng Zhao, and Vivek Sarkar. 2011. The design and implementation of the habanero-java parallel programming language. In Proceedings of the ACM international Conference Companion on Object Oriented Programming Systems Languages and Applications (OOPSLA). 185--186.
[42]
Philippe Charles, Christian Grothoff, Vijay Saraswat, Christopher Donawa, Allan Kielstra, Kemal Ebcioglu, Christoph Von Praun, and Vivek Sarkar. 2005. X10: an object-oriented approach to non-uniform cluster computing. In ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA). 519--538.
[43]
Rezaul Chowdhury, Pramod Ganapathi, Yuan Tang, and Jesmin Jahan Tithi. 2017. Provably efficient scheduling of cache-oblivious wavefront algorithms. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). ACM, 339--350.
[44]
Rezaul Alam Chowdhury, Vijaya Ramachandran, Francesco Silvestri, and Brandon Blakeley. 2013. Oblivious algorithms for multicores and networks of processors. J. Parallel Distrib. Comput., Vol. 73, 7 (2013).
[45]
Richard Cole. 1988. Parallel Merge Sort. SIAM J. Scientific Computing, Vol. 17, 4 (1988).
[46]
Richard Cole and Vijaya Ramachandran. 2017a. Bounding Cache Miss Costs of Multithreaded Computations Under General Schedulers: Extended Abstract. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[47]
Richard Cole and Vijaya Ramachandran. 2017b. Resource Oblivious Sorting on Multicores. ACM Transactions on Parallel Computing (TOPC), Vol. 3, 4 (2017).
[48]
Richard Cole and Uzi Vishkin. 1986. Deterministic Coin Tossing and Accelerating Cascades: Micro and Macro Techniques for Designing Parallel Algorithms. In ACM Symposium on Theory of Computing (STOC). 206--219.
[49]
Richard Cole and Uzi Vishkin. 1988. Approximate Parallel Scheduling. Part I: The Basic Technique with Applications to Optimal Parallel List Ranking in Logarithmic Time. SIAM J. Scientific Computing, Vol. 17 (1988), 128--142.
[50]
Richard Cole and Uzi Vishkin. 1989. Faster optimal parallel prefix sums and list ranking. Information and computation, Vol. 81, 3 (1989), 334--352.
[51]
Guojing Cong and David A. Bader. 2005. An Empirical Analysis of Parallel Random Permutation Algorithms ON SMPs. In Parallel and Distributed Computing and Systems (PDCS).
[52]
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. 2009. Introduction to Algorithms (3rd edition) .MIT Press.
[53]
Artur Czumaj, Przemyslawa Kanarek, Miroslaw Kutylowski, and Krzysztof Lorys. 1998. Fast generation of random permutations via networks simulation. Algorithmica (1998).
[54]
Laxman Dhulipala, Guy E. Blelloch, and Julian Shun. 2018. Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[55]
Laxman Dhulipala, Charlie McGuffey, Hongbo Kang, Yan Gu, Guy E Blelloch, Phillip B Gibbons, and Julian Shun. 2020. Semi-Asymmetric Parallel Graph Algorithms for NVRAMs. Proceedings of the VLDB Endowment (PVLDB), Vol. 13, 9 (2020).
[56]
David Dinh, Harsha Vardhan Simhadri, and Yuan Tang. 2016. Extending the nested parallel model to the nested dataflow model with provably efficient schedulers. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[57]
James R Driscoll, Neil Sarnak, Daniel D Sleator, and Robert E Tarjan. 1989. Making data structures persistent. J. Computer and System Sciences, Vol. 38, 1 (1989), 86--124.
[58]
Richard Durstenfeld. 1964. Algorithm 235: Random Permutation. Commun. ACM, Vol. 7, 7 (1964), 420.
[59]
Panagiota Fatourou, Elias Papavasileiou, and Eric Ruppert. 2019. Persistent non-blocking binary search trees supporting wait-free range queries. In The 31st ACM Symposium on Parallelism in Algorithms and Architectures. 275--286.
[60]
Johannes Fischer and Volker Heun. 2006. Theoretical and practical improvements on the RMQ-problem, with applications to LCA and LCE. In Symposium on Combinatorial Pattern Matching (CPM). Springer, 36--48.
[61]
W. D. Frazer and A. C. McKellar. 1970. Samplesort: A Sampling Approach to Minimal Storage Tree Sorting. J. ACM, Vol. 17, 3 (July 1970).
[62]
Matteo Frigo, Charles E Leiserson, and Keith H Randall. 1998. The implementation of the Cilk-5 multithreaded language. ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), Vol. 33, 5 (1998), 212--223.
[63]
Phillip B. Gibbons, Yossi Matias, and Vijaya Ramachandran. 1996. Efficient Low-Contention Parallel Algorithms. J. Computer and System Sciences, Vol. 53, 3 (1996).
[64]
Joseph Gil. 1991. Fast load balancing on a PRAM. In IEEE International Parallel and Distributed Processing Symposium (IPDPS).
[65]
J. Gil, Y. Matias, and U. Vishkin. 1991. Towards a theory of nearly constant time parallel algorithms. In IEEE Symposium on Foundations of Computer Science (FOCS).
[66]
Seth Gilbert and Wei Quan Lim. 2019. Parallel Finger Search Structures. arXiv preprint arXiv:1908.02741 (2019).
[67]
Michael T. Goodrich, Yossi Matias, and Uzi Vishkin. 1994. Optimal Parallel Approximation for Prefix Sums and Integer Sorting. In ACM-SIAM Symposium on Discrete Algorithms (SODA).
[68]
Yan Gu, Julian Shun, Yihan Sun, and Guy E Blelloch. 2015. A top-down parallel semisort. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[69]
Jens Gustedt. 2003. Randomized permutations in a coarse grained parallel environment. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA).
[70]
Jens Gustedt. 2008. Engineering Parallel In-place Random Generation of Integer Permutations. In Workshop on Experimental Algorithmics.
[71]
Torben Hagerup. 1991. Fast parallel generation of random permutations. In Intl. Colloq. on Automata, Languages and Programming (ICALP). Springer.
[72]
Maurice Herlihy. 1991. Wait-Free Synchronization. ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 13, 1 (1991), 124--149.
[73]
Frank K. Hwang and Shen Lin. 1972. A simple algorithm for merging two disjoint linearly ordered sets. SIAM J. Scientific Computing, Vol. 1, 1 (1972), 31--39.
[74]
Riko Jacob, Tobias Lieber, and Nodari Sitchinava. 2014. On the complexity of list ranking in the parallel external memory model. In International Symposium on Mathematical Foundations of Computer Science. Springer, 384--395.
[75]
J. JaJa. 1992. Introduction to Parallel Algorithms .Addison-Wesley Professional.
[76]
Richard M. Karp and Vijaya Ramachandran. 1990. Parallel Algorithms for Shared-Memory Machines. In Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity (A). MIT Press.
[77]
Donald E. Knuth. 1969. The Art of Computer Programming, Volume II: Seminumerical Algorithms .Addison-Wesley.
[78]
G.L. Miller and J.H. Reif. 1985. Parallel tree contraction and its application. In IEEE Symposium on Foundations of Computer Science (FOCS). 478--489.
[79]
Gal Milman, Alex Kogan, Yossi Lev, Victor Luchangco, and Erez Petrank. 2018. Bq: A lock-free queue with batching. In ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 99--109.
[80]
Jürg Nievergelt and Edward M Reingold. 1973. Binary search trees of bounded balance. SIAM J. Scientific Computing, Vol. 2, 1 (1973).
[81]
Heejin Park and Kunsoo Park. 2001. Parallel Algorithms for Red-Black Trees. Theoretical Computer Science (TCS), Vol. 262, 1--2 (2001), 415--435.
[82]
Wolfgang J. Paul, Uzi Vishkin, and Hubert Wagener. 1983. Parallel Dictionaries in 2--3 Trees. In Intl. Colloq. on Automata, Languages and Programming (ICALP). 597--609.
[83]
S. Rajasekaran and J. H. Reif. 1989. Optimal and sublogarithmic time randomized parallel sorting algorithms. SIAM J. Scientific Computing, Vol. 18, 3 (1989).
[84]
A. Ranade. 1998. A simple optimal list ranking algorithm. In IEEE International Conference on High Performance Computing (HiPC).
[85]
Margaret Reid-Miller, Gary L. Miller, and Francesmary Modugno. 1993. List Ranking and Parallel Tree Contraction. In Synthesis of Parallel Algorithms, John Reif (Ed.). Morgan Kaufmann, 115--194.
[86]
John H. Reif. 1993. Synthesis of Paralell Algorithms .Morgan Kaufmann.
[87]
Yossi Shiloach and Uzi Vishkin. 1981. Finding the Maximum, Merging, and Sorting in a Parallel Computation Model. J. Algorithms, Vol. 2, 1 (1981).
[88]
Julian Shun, Yan Gu, Guy E Blelloch, Jeremy T Fineman, and Phillip B Gibbons. 2015. Sequential random permutation, list contraction and tree contraction are highly parallel. In ACM-SIAM Symposium on Discrete Algorithms (SODA). 431--448.
[89]
Yihan Sun, Daniel Ferizovic, and Guy E Blelloch. 2018. PAM: Parallel Augmented Maps. In ACM Symposium on Principles and Practice of Parallel Programming (PPOPP).
[90]
Yuan Tang, Ronghui You, Haibin Kan, Jesmin Jahan Tithi, Pramod Ganapathi, and Rezaul A Chowdhury. 2015. Cache-oblivious wavefront: improving parallelism of recursive dynamic programming algorithms without losing cache-efficiency. In ACM Symposium on Principles and Practice of Parallel Programming (PPOPP).
[91]
Robert Endre Tarjan. 1983. Data Structures and Network Algorithms .Society for Industrial and Applied Mathematics, Philadelphia, PA, USA.
[92]
L. G. Valiant. 1990. General Purpose Parallel Architectures. In Handbook of Theoretical Computer Science (Vol. A), Jan van Leeuwen (Ed.). MIT Press, 943--973.
[93]
Uzi Vishkin. 1984. Randomized Speed-Ups in Parallel Computation. In ACM Symposium on Theory of Computing (STOC). 230--239.
[94]
Uzi Vishkin. 1993. Advanced Parallel Prefix-sums, List Ranking and Connectivity. In Synthesis of Parallel Algorithms, John Reif (Ed.). Morgan Kaufmann, 215--257.
[95]
James C. Wyllie. 1979. The Complexity of Parallel Computations. Technical Report TR-79--387. Department of Computer Science, Cornell University, Ithaca, NY.

Cited By

View all
  • (2024)Parallel Integer Sort: Theory and PracticeProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638483(301-315)Online publication date: 2-Mar-2024
  • (2024)ParlayANN: Scalable and Deterministic Parallel Graph-Based Approximate Nearest Neighbor Search AlgorithmsProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638475(270-285)Online publication date: 2-Mar-2024
  • (2024)Deterministic and Low-Span Work-Efficient Parallel Batch-Dynamic TreesProceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3626183.3659976(247-258)Online publication date: 17-Jun-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SPAA '20: Proceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures
July 2020
601 pages
ISBN:9781450369350
DOI:10.1145/3350755
This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 July 2020

Check for updates

Badges

  • Honorable Mention

Author Tags

  1. RMQ
  2. binary forking
  3. difference
  4. fork-join
  5. intersection
  6. list contraction
  7. list ranking
  8. parallel computational model
  9. range minimum query
  10. semisort
  11. set-set operation
  12. sorting
  13. tree contraction
  14. tree merging
  15. union

Qualifiers

  • Research-article

Funding Sources

Conference

SPAA '20
Sponsor:

Acceptance Rates

Overall Acceptance Rate 447 of 1,461 submissions, 31%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)236
  • Downloads (Last 6 weeks)35
Reflects downloads up to 14 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Parallel Integer Sort: Theory and PracticeProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638483(301-315)Online publication date: 2-Mar-2024
  • (2024)ParlayANN: Scalable and Deterministic Parallel Graph-Based Approximate Nearest Neighbor Search AlgorithmsProceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3627535.3638475(270-285)Online publication date: 2-Mar-2024
  • (2024)Deterministic and Low-Span Work-Efficient Parallel Batch-Dynamic TreesProceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3626183.3659976(247-258)Online publication date: 17-Jun-2024
  • (2024)Parallel and (Nearly) Work-Efficient Dynamic ProgrammingProceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3626183.3659958(219-232)Online publication date: 17-Jun-2024
  • (2024)Teaching Parallel Algorithms Using the Binary-Forking Model2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)10.1109/IPDPSW63119.2024.00080(346-351)Online publication date: 27-May-2024
  • (2023)Fast and Space-Efficient Parallel Algorithms for Influence MaximizationProceedings of the VLDB Endowment10.14778/3632093.363210417:3(400-413)Online publication date: 1-Nov-2023
  • (2023)A Fast Algorithm for Aperiodic Linear Stencil Computation using Fast Fourier TransformsACM Transactions on Parallel Computing10.1145/3606338Online publication date: 24-Jul-2023
  • (2023)Parallel Strong Connectivity Based on Faster ReachabilityProceedings of the ACM on Management of Data10.1145/35892591:2(1-29)Online publication date: 20-Jun-2023
  • (2023)Provably Fast and Space-Efficient Parallel BiconnectivityProceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming10.1145/3572848.3577483(52-65)Online publication date: 25-Feb-2023
  • (2023)Optimal Parallel Sorting with Comparison ErrorsProceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3558481.3591093(355-365)Online publication date: 17-Jun-2023
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media