DOI: 10.5555/2648668.2648750
research-article

PreTrans: reducing TLB CAM-search via page number prediction and speculative pre-translation

Published: 04 September 2013

Abstract

Address translation must complete within a tight window (after the effective address is computed but before the L1 tag check), which imposes many design constraints on the TLB. Relaxing these constraints can potentially lower TLB energy costs. In this paper, we observe that (1) data accesses commonly use base-displacement addressing modes, in which the effective address is computed as the sum of a base and a displacement, and (2) the effective page numbers are predictable once the base address is known. Further, address translations are easy to cache alongside the predicted page numbers, enabling speculative address translation that can filter accesses to the TLB. These two observations enable our PreTrans design, in which (a) a speculative translation is available based solely on the base address, and (b) the translation is available simultaneously with the effective (virtual) address. PreTrans replaces most of the energy-expensive CAM lookups for TLB access with RAM lookups, which translates to significant power improvements in the TLB.
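
The idea in the abstract can be illustrated with a small behavioral model. The sketch below is not the paper's hardware design: the table organization (direct-mapped, indexed and tagged by the base address's page number), the 4 KiB page size, and all names (`PreTranslationTable`, `translate`, `demo`) are assumptions made for illustration. A Python dictionary stands in for the conventional TLB CAM search. The model only counts how many accesses are served by the RAM-style table versus the fallback path.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


class PreTranslationTable:
    """Hypothetical direct-mapped RAM table keyed by the base address's page number."""

    def __init__(self, entries=64):
        self.entries = entries
        # Each slot holds (base_vpn, predicted_vpn, ppn) or None.
        self.table = [None] * entries

    def lookup(self, base):
        base_vpn = base >> PAGE_SHIFT
        slot = self.table[base_vpn % self.entries]
        if slot is not None and slot[0] == base_vpn:
            return slot[1], slot[2]  # (predicted virtual page, cached physical page)
        return None

    def update(self, base, vpn, ppn):
        base_vpn = base >> PAGE_SHIFT
        self.table[base_vpn % self.entries] = (base_vpn, vpn, ppn)


def translate(base, disp, table, page_table, stats):
    """Translate base + disp, using the speculative table when its prediction holds."""
    guess = table.lookup(base)  # needs only the base, so it can overlap the add
    vaddr = base + disp
    vpn = vaddr >> PAGE_SHIFT
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    if guess is not None and guess[0] == vpn:
        stats["ram_hits"] += 1  # prediction verified: no CAM search needed
        ppn = guess[1]
    else:
        stats["cam_lookups"] += 1  # fall back to the conventional TLB path
        ppn = page_table[vpn]  # dictionary stands in for the CAM search
        table.update(base, vpn, ppn)
    return (ppn << PAGE_SHIFT) | offset


def demo():
    page_table = {0x10: 0x80, 0x11: 0x81}  # made-up mappings for illustration
    table = PreTranslationTable()
    stats = {"ram_hits": 0, "cam_lookups": 0}
    base = 0x10000
    addrs = [translate(base, d, table, page_table, stats) for d in (0, 8, 16, 0x1000)]
    return addrs, stats
```

In `demo()`, the first access misses in the table and takes the fallback path, the next two small displacements stay on the predicted page and are served from the table, and the final displacement crosses a page boundary, so the prediction fails verification and the fallback path is taken again.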

Cited By

  • (2021) Fat Loads: Exploiting Locality Amongst Contemporaneous Load Operations to Optimize Cache Accesses. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, pages 366-379. DOI: 10.1145/3466752.3480104. Online publication date: 18-Oct-2021

          Published In

          ISLPED '13: Proceedings of the 2013 International Symposium on Low Power Electronics and Design
          September 2013
          440 pages
          ISBN:9781479912353

          Publisher

          IEEE Press

          Author Tags

          1. TLB
          2. power
          3. prediction
          4. speculation

          Qualifiers

          • Research-article

          Conference

          ISLPED'13

          Acceptance Rates

          Overall Acceptance Rate 398 of 1,159 submissions, 34%
