Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/800046.801638acmconferencesArticle/Chapter ViewAbstractPublication PagesiscaConference Proceedingsconference-collections
Article
Free access

Synchronizing large VLSI processor arrays

Published: 13 June 1983 Publication History

Abstract

Highly parallel VLSI computing structures consist of many processing elements operating simultaneously. In order for such processing elements to communicate among themselves, some provision must be made for synchronization of data transfer. The simplest means of synchronization is the use of a global clock. Unfortunately, large clocked systems can be difficult to implement because of the inevitable problem of clock skews and delays, which can be especially acute in VLSI systems as feature sizes shrink. For the near term, good engineering and technology improvements can be expected to maintain the feasibility of clocking in such systems; however, clock distribution problems crop up in any technology as systems grow. An alternative means of enforcing necessary synchronization is the use of self-timed, asynchronous schemes, at the cost of increased design complexity and hardware cost. Realizing that different circumstances call for different synchronization methods, this paper provides a spectrum of synchronization models; based on the assumptions made for each model, theoretical lower bounds on clock skew are derived, and appropriate or best-possible synchronization schemes for large processor arrays are proposed.
One set of models is based on assumptions that allow the use of a pipelined clocking scheme, where more than one clock event is propagated at a time. In this case, it is shown that even assuming that physical variations along clock lines can produce skews between wires of the same length, any one-dimensional processor array can be correctly synchronized by a global pipelined clock while enjoying desirable properties such as modularity, expandability and robustness. This result cannot be extended to two-dimensional arrays, however—the paper shows that under this assumption, it is impossible to run a clock such that the maximum clock skew between two communicating cells will be bounded by a constant as systems grow. For such cases or where pipelined clocking is unworkable, a synchronization scheme incorporating both clocked and “asynchronous” elements is proposed.

References

[1]
Aleliunas, R.and Rosenberg, A.L., "On Embedding Rectangular Grids in Square Grids," IEEE Trans. Computers, Vol. C-31, No. 9, September 1982, pp. 907-913.
[2]
Bentley, J.L. and Kung, H.T., "A Tree Machine for Searching Problems," Proceedings of 1979 International Conference on Parallel Processing, IEEE, August 1979, pp. 257-266.
[3]
Franklin, M.A. and Wann. D.F., "Asynchronous and Clocked Control Structures for VLSI Interconnection Networks," Proceedings of the 9th International Symposium on Computer Architecture, April 1982, pp. 50-59.
[4]
Kung, H.T., "Why Systolic Architectures?," Computer Magazine, Vol. 15, No. 1, January 1982, pp. 37-46.
[5]
Kung, H.T. and Leiserson, C.E., "Systolic Arrays (for VLSI)," Sparse Matrix Proceedings 1978. Duff, I.S. and Stewart, G.W., eds., Society for Industrial and Applied Mathematics, 1979, pp. 256-282, A slightly different version appears in Introduction to VLSI Systems by C.A. Mead and L.A. Conway, Addison-Wesley, 1980, Section 8.3.
[6]
Kung, S.Y. and Gal-Ezer, R.J., "Synchronous vs. Asynchronous Computation in VLSI Array Processors," Proceedings of the SPIE, vol. 341, May 1982, pp. 53-65.
[7]
Lipton, R.J., Eisenstat, S.C. and DeMillo, R.A., "Space and Time Hierarchies for Classes of Control Structures and Data Structures," Journal of the ACM, Vol. 23, No. 4, October 1976, pp. 720-732.
[8]
Mead, C.A. and Rem, M., "Cost and Performance of VLSI Computing Structures," IEEE Journal of Solid State Circuits, Vol. SC-14, No. 2, April 1979, pp. 455-462.
[9]
Paterson, M.S., Ruzzo, W.L. and Snyder. L., "Bounds on Minimax Edge Length for Complete Binary Trees," Proceedings of the Thirteenth Annual ACM Symposium on Theory of Computing, ACM SIGACT, May 1981, pp. 293-299.
[10]
Seitz, C.L., "Self-Timed VLSI Systems," Proceedings of Conference on Very Large Scale Integration: Architecture, Design, Fabrication, California Institute of Technology, January 1979, pp. 345-355.
[11]
Thompson, C.D.,A Complexity Theory for VLSI, PhD dissertation, Carnegie-Mellon University, Computer Science Department, 1980.

Cited By

View all
  • (2005)Scheduling task-tree with additive scales on parallel/distributed machinesComputing and Combinatorics10.1007/BFb0030883(607-616)Online publication date: 20-Jun-2005
  • (2005)Designing modular linear systolic arrays using dependence graph regular partitionsParallel Processing: CONPAR 92—VAPP V10.1007/3-540-55895-0_421(265-270)Online publication date: 29-May-2005
  • (2005)A fully-pipelined solutions constructor for dynamic programming problemsAdvances in Computing and Information — ICCI '9110.1007/3-540-54029-6_191(421-430)Online publication date: 1-Jun-2005
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ISCA '83: Proceedings of the 10th annual international symposium on Computer architecture
June 1983
424 pages
  • cover image ACM SIGARCH Computer Architecture News
    ACM SIGARCH Computer Architecture News  Volume 11, Issue 3
    June 1983
    413 pages
    ISSN:0163-5964
    DOI:10.1145/1067651
    Issue’s Table of Contents

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 June 1983

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%

Upcoming Conference

ISCA '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)41
  • Downloads (Last 6 weeks)13
Reflects downloads up to 28 Sep 2024

Other Metrics

Citations

Cited By

View all
  • (2005)Scheduling task-tree with additive scales on parallel/distributed machinesComputing and Combinatorics10.1007/BFb0030883(607-616)Online publication date: 20-Jun-2005
  • (2005)Designing modular linear systolic arrays using dependence graph regular partitionsParallel Processing: CONPAR 92—VAPP V10.1007/3-540-55895-0_421(265-270)Online publication date: 29-May-2005
  • (2005)A fully-pipelined solutions constructor for dynamic programming problemsAdvances in Computing and Information — ICCI '9110.1007/3-540-54029-6_191(421-430)Online publication date: 1-Jun-2005
  • (2001)Distributed Memory Parallel Architecture Based on Modular Linear Arrays for 2-D Separable Transforms ComputationJournal of VLSI Signal Processing Systems10.5555/509243.281323528:3(187-203)Online publication date: 1-Jul-2001
  • (1998)Design of a parallel neural processorProceedings of the 1998 Second IEEE International Caracas Conference on Devices, Circuits and Systems. ICCDCS 98. On the 70th Anniversary of the MOSFET and 50th of the BJT. (Cat. No.98TH8350)10.1109/ICCDCS.1998.705817(109-112)Online publication date: 1998
  • (1996)A Modular Systolic Linearization of the Warshall-Floyd AlgorithmIEEE Transactions on Parallel and Distributed Systems10.1109/71.5037697:5(449-455)Online publication date: 1-May-1996
  • (1994)The red-blue algorithm for dynamic programming on linear arraysProceedings of the 1994 6th IEEE Symposium on Parallel and Distributed Processing10.1109/SPDP.1994.346142(484-488)Online publication date: 26-Oct-1994
  • (1993)An asynchronous communication protocol for internode connections in a scalable processor arrayProceedings of IEEE Workshop on VLSI Signal Processing10.1109/VLSISP.1993.404456(489-497)Online publication date: 1993
  • (1993)Mapping dynamic programming onto modular linear systolic arraysDistributed Computing10.1007/BF022427056:3(165-179)Online publication date: 1-Apr-1993
  • (1992)Optimal dynamic scheduling of task tree on constant-dimensional architecturesProceedings of the fourth annual ACM symposium on Parallel algorithms and architectures10.1145/140901.140916(138-146)Online publication date: 1-Jun-1992
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media