Fast synchronization for chip multiprocessors

Authors: Jack Sampson, Rubén González, Jean-Francois Collard, Norman P. Jouppi, Mike Schlansker

ACM SIGARCH Computer Architecture News, Volume 33, Issue 4

Pages 64 - 69

https://doi.org/10.1145/1105734.1105743

Published: 01 November 2005

Abstract

This paper presents a novel mechanism for barrier synchronization on chip multiprocessors (CMPs). By forcing the invalidation of selected I-cache lines, the mechanism starves threads and thereby forces their execution to stop. Threads are released once all of them have entered the barrier. We evaluated the mechanism using SMTSim and report much better, and most importantly flatter, performance than the lock-based barriers supported by existing microprocessors.
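For contrast with the hardware mechanism described above, the following is a minimal sketch of a software barrier of the lock-based kind. The abstract does not say which lock-based barrier served as the baseline, so the sense-reversing design used here, and the names `SenseReversingBarrier` and `run_phases`, are illustrative assumptions. The sketch does not model the I-cache invalidation mechanism itself.

```python
import threading
import time


class SenseReversingBarrier:
    """Centralized barrier: a lock-protected arrival counter plus a shared sense flag.

    Illustrative only; not the mechanism or the baseline from the paper.
    """

    def __init__(self, n_threads):
        self.n = n_threads
        self.count = 0
        self.sense = False                 # shared flag, flipped once per episode
        self.lock = threading.Lock()
        self.local = threading.local()     # holds each thread's private sense

    def wait(self):
        # Each thread flips its private sense at every barrier episode.
        my_sense = not getattr(self.local, "sense", False)
        self.local.sense = my_sense
        with self.lock:
            self.count += 1
            last = self.count == self.n
            if last:
                self.count = 0             # reset for the next episode
                self.sense = my_sense      # release all waiting threads
        if not last:
            # Waiters spin on the shared flag; every arrival also contends
            # for the same lock, which is the cost a hardware barrier avoids.
            while self.sense != my_sense:
                time.sleep(0)              # yield so other threads can run


def run_phases(n_threads=4, n_phases=3):
    """Run workers through several phases and return the (phase, thread) log."""
    barrier = SenseReversingBarrier(n_threads)
    log = []
    log_lock = threading.Lock()

    def worker(tid):
        for phase in range(n_phases):
            with log_lock:
                log.append((phase, tid))
            barrier.wait()                 # nobody starts phase + 1 early

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Calling `run_phases(4, 3)` returns a log in which every entry for one phase precedes every entry for the next, which is the ordering guarantee a barrier provides. All threads serialize on one lock and then poll one flag, so the cost of a barrier episode in this design grows with the number of participating threads.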

Published In

ACM SIGARCH Computer Architecture News Volume 33, Issue 4

Special issue: dasCMP'05

November 2005

130 pages

ISSN: 0163-5964

DOI: 10.1145/1105734

Publisher

Association for Computing Machinery

New York, NY, United States
