EMC-Y: parallel processing element optimizing communication and computation

DOI: 10.1145/165939.165967
Published: 01 August 1993

Abstract

EMC-Y is a new processing element for highly parallel computers, designed to achieve high-performance parallel computation by fusing a dataflow mechanism with a von Neumann execution pipeline. We have already developed EMC-R, the processing element used in the EM-4 prototype. EMC-Y improves on EMC-R's packet communication performance, allowing it to tolerate more network traffic. This paper presents the architecture of EMC-Y, concentrating on the principles of packet communication. EMC-Y uses an output packet buffer and optimal packet routing to improve the performance of sending and transferring packets. It also changes the memory access priority for input packet buffer operation to improve the performance of receiving packets. Because EMC-Y not only improves packet input and output performance but also balances the two, it can tolerate a large amount of traffic and improve execution performance. We evaluate the improvements of the EMC-Y architecture using a clock-level simulator. The results show that, at the same clock speed, EMC-Y improves performance over EMC-R by 50% to 70% on several programs.
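The abstract credits part of the gain to an output packet buffer. As a rough illustration of why buffering outgoing packets can help, the following toy model (not the EMC-Y microarchitecture, and not the authors' clock-level simulator) steps a pipeline and a network port cycle by cycle. Everything in it is an assumption made for illustration: the `simulate` function, the `"S"`/`"C"` instruction encoding, the queue capacities, and a network that accepts one packet every other cycle.

```python
from collections import deque


def simulate(program, capacity, net_ready):
    """Toy cycle-by-cycle model of a pipeline feeding an output packet queue.

    program   -- string of 'S' (instruction that sends a packet) and 'C' (pure compute)
    capacity  -- packets the output queue can hold (1 models a single output latch)
    net_ready -- function cycle -> bool; True when the network accepts a packet
    Returns (total_cycles, stall_cycles). All names here are hypothetical.
    """
    queue = deque()
    pc = cycle = stalls = 0
    while pc < len(program) or queue:
        # Network side: drain at most one packet per cycle when the port is free.
        if queue and net_ready(cycle):
            queue.popleft()
        # Pipeline side: issue one instruction unless a send finds the queue full.
        if pc < len(program):
            if program[pc] == "S":
                if len(queue) < capacity:
                    queue.append(pc)
                    pc += 1
                else:
                    stalls += 1  # pipeline waits for space in the queue
            else:
                pc += 1
        cycle += 1
    return cycle, stalls


def every_other_cycle(cycle):
    # Assumed network behaviour: the port accepts a packet on even cycles only.
    return cycle % 2 == 0


# Bursty workload: four sends followed by four compute instructions, repeated.
PROGRAM = "SSSSCCCC" * 4
unbuffered = simulate(PROGRAM, 1, every_other_cycle)
buffered = simulate(PROGRAM, 4, every_other_cycle)
print("capacity 1 (cycles, stalls):", unbuffered)
print("capacity 4 (cycles, stalls):", buffered)
```

In this sketch the average send rate matches the assumed network rate, so the deeper queue absorbs each burst and the pipeline never stalls, while the single-latch case stalls during every burst. The numbers it prints describe only this toy workload and say nothing about the measured 50% to 70% improvement.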

Published In

ICS '93: Proceedings of the 7th international conference on Supercomputing
August 1993
425 pages
ISBN:089791600X
DOI:10.1145/165939
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate 629 of 2,180 submissions, 29%

Cited By

  • (2005) Programming with distributed data structure for EM-X multiprocessor. Theory and Practice of Parallel Programming, pp. 472-483. DOI: 10.1007/BFb0026585. Online publication date: 15-Jun-2005
  • (2001) Tolerating communication latency through dynamic thread invocation in a multithreaded architecture. Compiler optimizations for scalable parallel systems, pp. 525-549. DOI: 10.5555/380466.380481. Online publication date: 1-Jun-2001
  • (2001) Tolerating Communication Latency through Dynamic Thread Invocation in a Multithreaded Architecture. Compiler Optimizations for Scalable Parallel Systems, pp. 525-549. DOI: 10.1007/3-540-45403-9_15. Online publication date: 18-May-2001
  • (1998) Highly efficient implementation of MPI point-to-point communication using remote memory operations. Proceedings of the 12th international conference on Supercomputing, pp. 267-273. DOI: 10.1145/277830.277890. Online publication date: 13-Jul-1998
  • (1997) Fine-grain multithreading with the EM-X multiprocessor. Proceedings of the ninth annual ACM symposium on Parallel algorithms and architectures, pp. 189-198. DOI: 10.1145/258492.258511. Online publication date: 1-Jun-1997
  • (1995) Multithreading with the EM-4 distributed-memory multiprocessor. Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques, pp. 27-36. DOI: 10.5555/224659.224676. Online publication date: 27-Jun-1995
  • (1995) The EM-X parallel computer. ACM SIGARCH Computer Architecture News, 23:2, pp. 14-23. DOI: 10.1145/225830.223987. Online publication date: 1-May-1995
  • (1995) The EM-X parallel computer. Proceedings of the 22nd annual international symposium on Computer architecture, pp. 14-23. DOI: 10.1145/223982.223987. Online publication date: 1-Jul-1995
  • (1995) OSCAR Fortran Multigrain Compiler. Parallel Language and Compiler Research in Japan, pp. 271-301. DOI: 10.1007/978-1-4615-2269-0_11. Online publication date: 1995
  • (1994) Nonnumeric search results on the EM-4 distributed-memory multiprocessor. Proceedings of the 1994 ACM/IEEE conference on Supercomputing, pp. 301-310. DOI: 10.5555/602770.602828. Online publication date: 14-Nov-1994
