Utilizing RF-I and intelligent scheduling for better throughput/watt in a mobile GPU memory system

Published: 26 January 2012

Abstract

Smartphones and tablets are becoming increasingly powerful, replacing desktops and laptops as users' main computing systems. As these systems support ever higher resolutions and more complex 3D graphics, a high-throughput, low-power memory system is essential for the mobile GPU. In this article, we propose to improve throughput/watt in a mobile GPU memory system by using intelligent scheduling to reduce power and multi-band radio frequency interconnect (MRF-I) to offset any throughput degradation caused by that scheduling. Overall, we are able to improve throughput by 17% (up to 66%) while increasing throughput per watt by an average of 18% (up to 26%).
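The relationship between the two metrics in the abstract can be made concrete with a little arithmetic. Because throughput/watt is throughput divided by power, a throughput gain and a throughput/watt gain together imply a change in power. The sketch below is illustrative only: the function name `relative_power` is hypothetical, and the input values are borrowed from the abstract purely as sample numbers. The abstract reports averages and maxima, which need not come from the same workload, so the result should not be read as a figure from the article.

```python
def relative_power(throughput_gain: float, throughput_per_watt_gain: float) -> float:
    """Return new power divided by baseline power (1.0 means unchanged).

    Since throughput/watt = throughput / power, it follows that
    power_new / power_old = (1 + throughput_gain) / (1 + throughput_per_watt_gain).
    Gains are fractions, e.g. 0.17 for a 17% improvement.
    """
    return (1.0 + throughput_gain) / (1.0 + throughput_per_watt_gain)


# Assumption: both gains describe the same hypothetical workload.
ratio = relative_power(0.17, 0.18)
print(f"relative power: {ratio:.4f}")
```

A ratio below 1.0 means the hypothetical workload draws less power than the baseline while running faster; a ratio above 1.0 means the throughput gain outpaced the efficiency gain, so absolute power rose.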

        Published In

        ACM Transactions on Architecture and Code Optimization, Volume 8, Issue 4
        Special Issue on High-Performance Embedded Architectures and Compilers
        January 2012
        765 pages
        ISSN:1544-3566
        EISSN:1544-3973
        DOI:10.1145/2086696
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 26 January 2012
        Accepted: 01 November 2011
        Revised: 01 October 2011
        Received: 01 June 2011
        Published in TACO Volume 8, Issue 4

        Author Tags

        1. DRAM
        2. GPU
        3. Mobile
        4. RF-I
        5. memory
        6. power
        7. scheduling

        Qualifiers

        • Research-article
        • Research
        • Refereed
