research-article

WDBT: Non-volatile memory wear characterization and mitigation for DBT systems

Published: 01 May 2022

Abstract

Emerging high-capacity, byte-addressable non-volatile memory (NVM) is a promising basis for next-generation memory systems. However, NVM suffers from limited write endurance: an NVM cell wears out after a limited number of writes, which makes NVM undependable. To address this issue, many wear reduction and wear leveling mechanisms have been proposed. Nevertheless, most of these mechanisms are developed without knowledge of application semantics and behaviors. In this paper, we advocate application-level wear management, which allows us to create effective and flexible wear reduction and leveling techniques for specific application domains. In particular, we find that applications running under dynamic binary translation (DBT) exhibit significantly more writes, because DBT systems need to handle architectural differences when translating instructions across different architectures. We present WDBT, which focuses on wear reduction and leveling for DBT systems on NVM. WDBT is designed around common practices of DBT systems to reduce the majority of the writes introduced by DBT. We also implement a prototype of WDBT using a real-world DBT system, QEMU, for multiple popular instruction sets. Experimental results on SPEC CPU 2017 show that WDBT reduces writes by 52.09% and 34.48% for x86-64 and RISC-V, respectively. Moreover, the performance overhead of WDBT is negligible.

Highlights

NVM wear leveling and reduction for DBT systems.
Characterization of memory writes for DBT systems.
Exploiting host hardware resources for optimizing DBT.
Dynamic CPU state emulation for leveling uneven NVM wear.
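
The last highlight can be illustrated with a toy model. This is only a sketch under stated assumptions, not WDBT's actual mechanism, which this page does not describe: it assumes the emulated guest registers live in a memory-resident CPU state, so every guest register update becomes a memory write, and that wear can be evened out by periodically moving that state to a different memory region. The function `simulate`, the slot count, the rotation interval, and the register trace are all hypothetical.

```python
from collections import Counter


def simulate(reg_writes, num_slots=1, rotate_every=None):
    """Count writes per (slot, register) cell for a stream of guest register writes.

    Toy model (an assumption, not taken from the paper): the emulated CPU state
    occupies one of `num_slots` memory regions, and after every `rotate_every`
    writes it moves to the next region. The cost of copying the state during a
    move is ignored here.
    """
    wear = Counter()
    slot = 0
    for i, reg in enumerate(reg_writes):
        if rotate_every and i > 0 and i % rotate_every == 0:
            slot = (slot + 1) % num_slots
        wear[(slot, reg)] += 1
    return wear


# Hypothetical trace: a hot loop that updates two guest registers 1000 times each.
trace = ["rax", "rcx"] * 1000

fixed = simulate(trace)                                   # state never moves
rotated = simulate(trace, num_slots=4, rotate_every=100)  # state rotates over 4 regions

print("max writes to one cell, fixed:  ", max(fixed.values()))
print("max writes to one cell, rotated:", max(rotated.values()))
```

In this model the total number of writes is unchanged by rotation; only the worst-case wear on any single cell drops, which is the distinction between wear leveling and wear reduction drawn in the abstract.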


Cited By

  • (2022) Challenges and future directions for energy, latency, and lifetime improvements in NVMs. Distributed and Parallel Databases 41:3 (163-189). DOI: 10.1007/s10619-022-07421-x. Online publication date: 21-Sep-2022

Published In

Journal of Systems and Software, Volume 187, Issue C
May 2022
270 pages

Publisher

Elsevier Science Inc.
United States

Author Tags

1. Non-volatile memory
2. NVM Wear leveling
3. NVM Wear reduction
4. Cross-ISA virtualization
5. Dynamic binary translation
6. QEMU

Qualifiers

• Research-article
