DOI:10.1145/223982.224435

Vector multiprocessors with arbitrated memory access

Published: 01 May 1995

Abstract

High memory access latency is one of the main factors limiting the performance of current vector supercomputers. Conflicts in the memory modules, together with collisions in the interconnection network in the case of multiprocessors, significantly increase the execution time of applications. In this work we propose a memory access method that, for both vector uniprocessors and multiprocessors, performs stream accesses with the smallest possible latency in the majority of cases. The basic idea is to arbitrate memory access by defining the order in which the memory modules are visited. The stream elements are requested out of order. In addition, the access method reduces the cost of the interconnection network.
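
The following toy model illustrates the general idea of visiting memory modules in a fixed, arbitrated order. It is a sketch under simplifying assumptions and is not the mechanism evaluated in the paper: memory is assumed to be low-order interleaved (module = address mod M), each processor issues one request per cycle, every stream has exactly M elements with an odd stride (so it touches each module once), and processor p visits module (p + t) mod M at cycle t. All function and variable names are hypothetical.

```python
def module_of(address, num_modules):
    """Assumed low-order interleaving: module = address mod M."""
    return address % num_modules


def in_order(num_modules):
    """Natural stream order: element 0, 1, 2, ..."""
    return list(range(num_modules))


def arbitrated(base, stride, num_modules, start_module):
    """Order the stream elements so that modules are visited as
    start_module, start_module + 1, ... (mod M)."""
    # Map each module to the index of the stream element stored in it.
    element_in_module = {
        module_of(base + i * stride, num_modules): i
        for i in range(num_modules)
    }
    if len(element_in_module) != num_modules:
        raise ValueError("stream does not touch every module exactly once")
    return [
        element_in_module[(start_module + t) % num_modules]
        for t in range(num_modules)
    ]


def count_collisions(streams, schedules, num_modules):
    """Count requests that target a module already requested by
    another processor in the same cycle."""
    collisions = 0
    for t in range(num_modules):
        targets = [
            module_of(base + schedule[t] * stride, num_modules)
            for (base, stride), schedule in zip(streams, schedules)
        ]
        collisions += len(targets) - len(set(targets))
    return collisions


M = 8
# (base address, stride) per processor; all four streams start in module 0.
streams = [(0, 1), (8, 3), (16, 5), (24, 7)]

natural = [in_order(M) for _ in streams]
rotated = [
    arbitrated(base, stride, M, start_module=p)
    for p, (base, stride) in enumerate(streams)
]

print(count_collisions(streams, natural, M))  # 10
print(count_collisions(streams, rotated, M))  # 0
print(rotated[1])  # [3, 6, 1, 4, 7, 2, 5, 0]
```

In this model the in-order schedules collide whenever two processors reach the same module in the same cycle, while the rotating visiting order keeps the processors on distinct modules at every cycle. The price is that each processor requests its stream elements out of order, as the last printed schedule shows. The model ignores module busy time and network topology, which a realistic evaluation would need to include.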


Published In

ISCA '95: Proceedings of the 22nd annual international symposium on Computer architecture
July 1995
426 pages
ISBN:0897916980
DOI:10.1145/223982
  • ACM SIGARCH Computer Architecture News, Volume 23, Issue 2
    Special Issue: Proceedings of the 22nd annual international symposium on Computer architecture (ISCA '95)
    May 1995
    412 pages
    ISSN:0163-5964
    DOI:10.1145/225830
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ISCA95: International Conference on Computer Architecture
June 22 - 24, 1995
S. Margherita Ligure, Italy

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%

Article Metrics

  • Downloads (Last 12 months): 60
  • Downloads (Last 6 weeks): 8
Reflects downloads up to 25 Nov 2024

Cited By

  • (2021) PrioRAT: Criticality-Driven Prioritization Inside the On-Chip Memory Hierarchy. Euro-Par 2021: Parallel Processing, pp. 599-615. DOI:10.1007/978-3-030-85665-6_37. Online publication date: 25-Aug-2021
  • (2012) Isomorphic Recursive Splitting. Proceedings of the 2012 41st International Conference on Parallel Processing Workshops, pp. 574-580. DOI:10.1109/ICPPW.2012.78. Online publication date: 10-Sep-2012
  • (2010) Many-Thread Aware Prefetching Mechanisms for GPGPU Applications. Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture, pp. 213-224. DOI:10.1109/MICRO.2010.44. Online publication date: 4-Dec-2010
  • (2010) Interleaving granularity on high bandwidth memory architecture for CMPs. 2010 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, pp. 250-257. DOI:10.1109/ICSAMOS.2010.5642060. Online publication date: Jul-2010
  • (2005) Conflict-Free Accesses to Strided Vectors on a Banked Cache. IEEE Transactions on Computers 54:7, pp. 913-916. DOI:10.1109/TC.2005.110. Online publication date: 1-Jul-2005
  • (2000) Algorithmic foundations for a parallel vector access memory system. Proceedings of the twelfth annual ACM symposium on Parallel algorithms and architectures, pp. 156-165. DOI:10.1145/341800.341819. Online publication date: 9-Jul-2000
  • (1998) A case for merging the ILP and DLP paradigms. Proceedings of the Sixth Euromicro Workshop on Parallel and Distributed Processing - PDP '98, pp. 217-224. DOI:10.1109/EMPDP.1998.647201. Online publication date: 1998
  • (1996) Increasing the effective bandwidth of complex memory systems in multivector processors. Proceedings of the 1996 ACM/IEEE conference on Supercomputing, pp. 26-es. DOI:10.1145/369028.369084. Online publication date: 17-Nov-1996
  • (1995) Increasing the effective memory bandwidth in multivector processors. Proceedings of EUROMICRO 96. 22nd Euromicro Conference. Beyond 2000: Hardware and Software Design Strategies, pp. 38-45. DOI:10.1109/EURMIC.1996.546363. Online publication date: 1995
  • (2018) Return data interleaving for multi-channel embedded CMPs systems. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 20:7, pp. 1351-1354. DOI:10.1109/TVLSI.2011.2157368. Online publication date: 29-Dec-2018
