Nothing Special   »   [go: up one dir, main page]

skip to main content
article
Public Access

Responsive parallel computation: bridging competitive and cooperative threading

Published: 14 June 2017 Publication History

Abstract

Competitive and cooperative threading are widely used abstractions in computing. In competitive threading, threads are scheduled preemptively with the goal of minimizing response time, usually of interactive applications. In cooperative threading, threads are scheduled non-preemptively with the goal of maximizing throughput or minimizing the completion time, usually in compute-intensive applications, e.g. scientific computing, machine learning and AI.
Although both of these forms of threading rely on the same abstraction of a thread, they have, to date, remained largely separate forms of computing. Motivated by the recent increase in the mainstream use of multicore computers, we propose a threading model that aims to unify competitive and cooperative threading. To this end, we extend the classic graph-based cost model for cooperative threading to allow for competitive threading, and describe how such a cost model may be used in a programming language by presenting a language and a corresponding cost semantics. Finally, we show that the cost model and the semantics are realizable by presenting an operational semantics for the language that specifies the behavior of an implementation, as well as an implementation and a small empirical evaluation.

References

[1]
U. A. Acar, G. E. Blelloch, and R. D. Blumofe. The data locality of work stealing. Theory of Computing Systems (TOCS), 35(3):321–347, 2002.
[2]
U. A. Acar, A. Charguéraud, and M. Rainey. Scheduling parallel programs by work stealing with private deques. In PPoPP ’13, 2013.
[3]
U. A. Acar, A. Charguéraud, and M. Rainey. Oracle-guided scheduling for controlling granularity in implicitly parallel languages. Journal of Functional Programming (JFP), 26:e23, 2016.
[4]
N. S. Arora, R. D. Blumofe, and C. G. Plaxton. Thread scheduling for multiprogrammed multiprocessors. Theory of Computing Systems, 34(2):115–144, 2001.
[5]
Arvind and K. P. Gostelow. The Id report: An asychronous language and computing machine. Technical Report TR-114, Department of Information and Computer Science, University of California, Irvine, Sept. 1978.
[6]
A. Baumann, P. Barham, P.-E. Dagand, T. Harris, R. Isaacs, S. Peter, T. Roscoe, A. Schüpbach, and A. Singhania. The multikernel: A new OS architecture for scalable multicore systems. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, SOSP ’09, pages 29–44, 2009.
[7]
G. Blake, R. G. Dreslinski, T. Mudge, and K. Flautner. Evolution of thread-level parallelism in desktop applications. In Proceedings of the 37th Annual International Symposium on Computer Architecture, ISCA ’10, pages 302–313, 2010.
[8]
G. Blelloch and J. Greiner. Parallelism in sequential functional languages. In Proceedings of the 7th International Conference on Functional Programming Languages and Computer Architecture, FPCA ’95, pages 226–237. ACM, 1995.
[9]
G. E. Blelloch and J. Greiner. A provable time and space e fficient implementation of NESL. In Proceedings of the 1st ACM SIGPLAN International Conference on Functional Programming, pages 213–225. ACM, 1996.
[10]
G. E. Blelloch, J. C. Hardwick, J. Sipelstein, M. Zagha, and S. Chatterjee. Implementation of a portable nested data-parallel language. J. Parallel Distrib. Comput., 21(1):4–14, 1994.
[11]
G. E. Blelloch, P. B. Gibbons, and Y. Matias. Provably e fficient scheduling for languages with fine-grained parallelism. J. ACM, 46:281–321, Mar. 1999.
[12]
R. D. Blumofe and C. E. Leiserson. Space-e fficient scheduling of multithreaded computations. SIAM Journal on Computing, 27(1):202–229, 1998.
[13]
R. D. Blumofe and C. E. Leiserson. Scheduling multithreaded computations by work stealing. J. ACM, 46:720–748, Sept. 1999.
[14]
S. Boyd-Wickizer, H. Chen, R. Chen, Y. Mao, F. Kaashoek, R. Morris, A. Pesterev, L. Stein, M. Wu, Y. Dai, Y. Zhang, and Z. Zhang. Corey: An operating system for many cores. In Proceedings of the 8th USENIX Conference on Operating Systems Design and Implementation, pages 43–57, 2008.
[15]
R. P. Brent. The parallel evaluation of general arithmetic expressions. J. ACM, 21(2):201–206, 1974.
[16]
F. W. Burton and M. R. Sleep. Executing functional programs on a virtual tree of processors. In Functional Programming Languages and Computer Architecture (FPCA ’81), pages 187– 194. ACM Press, Oct. 1981.
[17]
M. M. T. Chakravarty, R. Leshchinskiy, S. L. Peyton Jones, G. Keller, and S. Marlow. Data parallel Haskell: a status report. In Proceedings of the POPL 2007 Workshop on Declarative Aspects of Multicore Programming, DAMP 2007, Nice, France, January 16, 2007, pages 10–18, 2007.
[18]
P. Charles, C. Grotho ff, V. Saraswat, C. Donawa, A. Kielstra, K. Ebcioglu, C. von Praun, and V. Sarkar. X10: an objectoriented approach to non-uniform cluster computing. In Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications, OOPSLA ’05, pages 519–538. ACM, 2005.
[19]
R. Davies. A temporal-logic approach to binding-time analysis. In LICS, pages 184–195, 1996.
[20]
D. L. Eager, J. Zahorjan, and E. D. Lazowska. Speedup versus e fficiency in parallel systems. IEEE Transactions on Computing, 38(3):408–423, 1989.
[21]
Y. Endo, Z. Wang, J. B. Chen, and M. Seltzer. Using latency to evaluate interactive system performance. In Proceedings of the Second USENIX Symposium on Operating Systems Design and Implementation, OSDI ’96, pages 185–199, New York, NY, USA, 1996. ACM.
[22]
D. G. Feitelson, L. Rudolph, and U. Schwiegelshohn. Parallel job scheduling - A status report. In Job Scheduling Strategies for Parallel Processing (JSSPP), 10th International Workshop, pages 1–16, 2004.
[23]
N. Feltman, C. Angiuli, U. A. Acar, and K. Fatahalian. Automatically splitting a two-stage lambda calculus. In Proceedings of the 25 European Symposium on Programming, ESOP, pages 255–281, 2016.
[24]
K. Flautner, R. Uhlig, S. Reinhardt, and T. Mudge. Threadlevel parallelism and interactive performance of desktop applications. In Proceedings of the Ninth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS IX, pages 129–138, 2000.
[25]
M. Fluet, M. Rainey, J. Reppy, and A. Shaw. Implicitly threaded parallelism in Manticore. Journal of Functional Programming, 20(5-6):1–40, 2011.
[26]
M. Frigo, C. E. Leiserson, and K. H. Randall. The implementation of the Cilk-5 multithreaded language. In PLDI, pages 212–223, 1998.
[27]
C. Gao, A. Gutierrez, R. G. Dreslinski, T. Mudge, K. Flautner, and G. Blake. A study of thread level parallelism on mobile devices. In Performance Analysis of Systems and Software (ISPASS), 2014 IEEE International Symposium on, pages 126– 127, March 2014.
[28]
R. L. Graham. Bounds on multiprocessing timing anomalies. SIAM Journal on Applied Mathematics, 17(2):416–429, 1969.
[29]
J. Greiner and G. E. Blelloch. A provably time-e fficient parallel implementation of full speculation. ACM Transactions on Programming Languages and Systems, 21(2):240–285, Mar. 1999.
[30]
R. H. Halstead. Multilisp: a language for concurrent symbolic computation. ACM Transactions on Programming Languages and Systems, 7:501–538, 1985.
[31]
M. Harchol-Balter. Performance Modeling and Design of Computer Systems: Queueing Theory in Action. Cambridge University Press, 2013.
[32]
R. Harper. Practical Foundations for Programming Languages. Cambridge University Press, New York, NY, USA, 2012.
[33]
C. Hauser, C. Jacobi, M. Theimer, B. Welch, and M. Weiser. Using threads in interactive systems: A case study. In Proceedings of the Fourteenth ACM Symposium on Operating Systems Principles, SOSP ’93, pages 94–105, New York, NY, USA, 1993. ACM.
[34]
S. Imam and V. Sarkar. Load balancing prioritized tasks via work-stealing. In Euro-Par 2015: Parallel Processing - 21st International Conference on Parallel and Distributed Computing, pages 222–234, 2015.
[35]
S. M. Imam and V. Sarkar. Habanero-Java library: a Java 8 framework for multicore programming. In 2014 International Conference on Principles and Practices of Programming on the Java Platform Virtual Machines, Languages and Tools, PPPJ ’14, pages 75–86, 2014.
[36]
Intel. Intel threading building blocks, 2011.
[37]
https://www.threadingbuildingblocks.org/.
[38]
S. Jagannathan, A. Navabi, K. Sivaramakrishnan, and L. Ziarek. The design rationale for Multi-MLton. In ML ’10: Proceedings of the ACM SIGPLAN Workshop on ML. ACM, 2010.
[39]
J. Jaja. An introduction to parallel algorithms. Addison Wesley Longman Publishing Company, 1992.
[40]
G. Keller, M. M. Chakravarty, R. Leshchinskiy, S. Peyton Jones, and B. Lippmeier. Regular, shape-polymorphic, parallel arrays in Haskell. In Proceedings of the 15th ACM SIGPLAN international conference on Functional programming, ICFP ’10, pages 261–272, 2010.
[41]
T. B. Knoblock and E. Ruf. Data specialization. In Proceedings of the ACM SIGPLAN 1996 Conference on Programming Language Design and Implementation, PLDI ’96, pages 215– 225, 1996.
[42]
D. Lea. A Java fork /join framework. In Proceedings of the ACM 2000 conference on Java Grande, JAVA ’00, pages 36–43, 2000.
[43]
D. Leijen, W. Schulte, and S. Burckhardt. The design of a task parallel library. In Proceedings of the 24th ACM SIGPLAN conference on Object Oriented Programming Systems Languages and Applications, OOPSLA ’09, pages 227–242, 2009.
[44]
R. Ley-Wild, U. A. Acar, and M. Fluet. A cost semantics for self-adjusting computation. In Proceedings of the 26th Annual ACM Symposium on Principles of Programming Languages, POPL ’09, 2009.
[45]
MLton. MLton web site. http://www.mlton.org.
[46]
S. K. Muller and U. A. Acar. Latency-hiding work stealing: Scheduling interacting parallel computations with work stealing. In Proceedings of the 28th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA ’16, pages 71–82, 2016.
[47]
S. K. Muller, U. A. Acar, and R. Harper. Responsive parallel computation: Bridging competitive and cooperative threading. Technical Report TBD, Carnegie Mellon University School of Computer Science, Apr. 2017.
[48]
T. Murphy, VII, K. Crary, R. Harper, and F. Pfenning. A symmetric modal lambda calculus for distributed computing. In Proceedings of the 19th IEEE Symposium on Logic in Computer Science (LICS), pages 286–295. IEEE Press, 2004.
[49]
A. Nanevski and F. Pfenning. Staged computation with names and necessity. J. Funct. Program., 15(5):893–939, 2005.
[50]
G. J. Narlikar and G. E. Blelloch. Space-e fficient scheduling of nested parallelism. ACM Transactions on Programming Languages and Systems, 21, 1999.
[51]
F. Pfenning and R. Davies. A judgmental reconstruction of modal logic. Mathematical Structures in Computer Science, 11:511–540, 2001.
[52]
R. Raghunathan, S. K. Muller, U. A. Acar, and G. Blelloch. Hierarchical memory management for parallel programs. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, ICFP 2016, pages 392–406, New York, NY, USA, 2016. ACM.
[53]
M. Rosendahl. Automatic complexity analysis. In FPCA ’89: Functional Programming Languages and Computer Architecture, pages 144–156. ACM, 1989.
[54]
A. Saifullah, D. Ferry, J. Li, K. Agrawal, C. Lu, and C. D. Gill. Parallel real-time scheduling of DAGs. IEEE Trans. Parallel Distrib. Syst., 25(12):3242–3252, 2014.
[55]
D. Sands. Complexity analysis for a lazy higher-order language. In ESOP ’90: Proceedings of the 3rd European Symposium on Programming, pages 361–376, London, UK, 1990. Springer-Verlag.
[56]
P. M. Sansom and S. L. Peyton Jones. Time and space profiling for non-strict, higher-order functional languages. In Principles of Programming Languages, pages 355–366, 1995.
[57]
A. Silberschatz, P. B. Galvin, and G. Gagne. Operating system concepts (7. ed.). Wiley, 2005.
[58]
D. C. Smith, C. Irby, R. Kimball, B. Verplank, and E. Harslem. Designing the star user interface. BYTE Magazine, 7(4):242– 282, 1982.
[59]
D. Spoonhower. Scheduling Deterministic Parallel Programs. PhD thesis, Carnegie Mellon University, Pittsburgh, PA, USA, 2009.
[60]
D. Spoonhower, G. E. Blelloch, R. Harper, and P. B. Gibbons. Space profiling for parallel functional programs. In International Conference on Functional Programming, 2008.
[61]
D. C. Swinehart, P. T. Zellweger, R. J. Beach, and R. B. Hagmann. A structural view of the cedar programming environment. ACM Trans. Program. Lang. Syst., 8(4):419– 490, Aug. 1986.
[62]
W. Taha and T. Sheard. MetaML and multi-stage programming with explicit annotations. Theoretical Computer Science, 248 (1):211 – 242, 2000.
[63]
J. Ullman. NP-complete scheduling problems. Journal of Computer and System Sciences, 10(3):384 – 393, 1975.
[64]
M. Wimmer, D. Cederman, J. L. Trä ff, and P. Tsigas. Workstealing with configurable scheduling strategies. In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’13, pages 315–316, 2013.
[65]
M. Wimmer, F. Versaci, J. L. Trä ff, D. Cederman, and P. Tsigas. Data structures for task-based priority scheduling. In Proceedings of the 19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’14, pages 379–380, 2014.

Cited By

View all
  • (2021)Provably space-efficient parallel functional programmingProceedings of the ACM on Programming Languages10.1145/34342995:POPL(1-33)Online publication date: 4-Jan-2021
  • (2024)Disentanglement with Futures, State, and InteractionProceedings of the ACM on Programming Languages10.1145/36328958:POPL(1569-1599)Online publication date: 5-Jan-2024
  • (2024)Automatic Parallelism ManagementProceedings of the ACM on Programming Languages10.1145/36328808:POPL(1118-1149)Online publication date: 5-Jan-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM SIGPLAN Notices
ACM SIGPLAN Notices  Volume 52, Issue 6
PLDI '17
June 2017
708 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/3140587
Issue’s Table of Contents
  • cover image ACM Conferences
    PLDI 2017: Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation
    June 2017
    708 pages
    ISBN:9781450349888
    DOI:10.1145/3062341
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 June 2017
Published in SIGPLAN Volume 52, Issue 6

Check for updates

Author Tags

  1. Cost Models
  2. Cost Semantics
  3. Operational Semantics
  4. Parallelism
  5. Scheduling

Qualifiers

  • Article

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)122
  • Downloads (Last 6 weeks)29
Reflects downloads up to 13 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2021)Provably space-efficient parallel functional programmingProceedings of the ACM on Programming Languages10.1145/34342995:POPL(1-33)Online publication date: 4-Jan-2021
  • (2024)Disentanglement with Futures, State, and InteractionProceedings of the ACM on Programming Languages10.1145/36328958:POPL(1569-1599)Online publication date: 5-Jan-2024
  • (2024)Automatic Parallelism ManagementProceedings of the ACM on Programming Languages10.1145/36328808:POPL(1118-1149)Online publication date: 5-Jan-2024
  • (2023)Efficient Parallel Functional Programming with EffectsProceedings of the ACM on Programming Languages10.1145/35912847:PLDI(1558-1583)Online publication date: 6-Jun-2023
  • (2023)Responsive Parallelism with SynchronizationProceedings of the ACM on Programming Languages10.1145/35912497:PLDI(712-735)Online publication date: 6-Jun-2023
  • (2023)An Efficient Scheduler for Task-Parallel Interactive ApplicationsProceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3558481.3591092(27-38)Online publication date: 17-Jun-2023
  • (2022)Entanglement detection with near-zero costProceedings of the ACM on Programming Languages10.1145/35476466:ICFP(679-710)Online publication date: 31-Aug-2022
  • (2021)Provably space-efficient parallel functional programmingProceedings of the ACM on Programming Languages10.1145/34342995:POPL(1-33)Online publication date: 4-Jan-2021
  • (2020)Priority Scheduling for Interactive ApplicationsProceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures10.1145/3350755.3400236(465-477)Online publication date: 6-Jul-2020
  • (2019)Disentanglement in nested-parallel programsProceedings of the ACM on Programming Languages10.1145/33711154:POPL(1-32)Online publication date: 20-Dec-2019
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media