DOI: 10.1145/1366230.1366261

A modular 3d processor for flexible product design and technology migration

Published: 05 May 2008

Abstract

The current methodology used in mass-market processor design is to create a single base microarchitecture (e.g., Intel's "Core" or AMD's "K8") that is used throughout all of the PC market segments, from laptops to servers. To differentiate the products, manufacturers rely on speed binning, different cache sizes, and varying numbers of cores. In this paper, we propose using 3D integration as a new, complementary approach to product differentiation. Past research on using 3D to improve performance has focused on the construction of "fully 3D" circuits, where functional blocks are partitioned across two or more layers. This approach forces one of two undesirable situations: (1) all products must be implemented in, and therefore pay the cost of, 3D, or (2) a 3D-implemented processor is designed for the high-end/high-performance markets and a separate 2D microarchitecture must be designed for the lower-cost markets, thereby incurring significant additional design effort and engineering cost. We present a modular processor architecture in which 3D can be used to enhance performance within a single unified design, and which also provides a more gradual migration path toward fully 3D-integrated designs. To make this work, we describe a generic technique of using "phantom" components, where the baseline processor may believe that 3D-stacked resources exist but are currently unavailable. Simply using 3D to stack more L2 cache provides a 15.1% average performance benefit, but our proposal increases performance by 25.4%.

    Published In

    CF '08: Proceedings of the 5th conference on Computing frontiers
    May 2008
    334 pages
    ISBN:9781605580777
    DOI:10.1145/1366230
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. 3d-integration
    2. modular
    3. superscalar

    Qualifiers

    • Research-article

    Conference

    CF '08: Computing Frontiers Conference
    May 5 - 7, 2008
    Ischia, Italy

    Acceptance Rates

    Overall Acceptance Rate 273 of 785 submissions, 35%
