Abstract
High-density NAND flash memory has been recommended as a storage medium for edge computing and intelligent storage systems. However, recent studies show that the read latency of this kind of flash is increasing, for at least two reasons. First, high-density flash memory generally stores multiple bits per cell, which greatly increases the access latency of the most significant bit. Second, reliability varies among the bits of a cell, which further increases the access latency of the most significant bit; this seriously degrades read performance and can cause long tail latency. This paper proposes RLV, a read latency variation aware performance optimization scheme that accelerates both data and metadata access to maximize read performance and reduce tail latency. RLV has three parts. First, a read latency variation aware data placement scheme accelerates hot data accesses; it comprises a data identification method and a fine-grained data migration method. Second, a new caching method caches data from slow pages and minimizes the migration cost; it comprises an assisted caching method and a migration tagged caching method. Third, a life-stage aware metadata placement scheme speeds up metadata access. Experimental results show that the proposed method improves read performance by 45.7% on average compared with state-of-the-art works and significantly reduces tail latency at the 95th–99.99th percentiles.
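The data placement idea summarized in the abstract (identify hot data, then migrate it off slow pages) can be illustrated with a minimal sketch. It is not the RLV implementation: the class name `PlacementSketch`, the page-type labels, the latency values, and the fixed read-count threshold are all assumptions chosen for illustration, and the paper's actual identification and migration methods may differ.

```python
from collections import Counter

# Hypothetical read latencies per page type, in microseconds (illustrative only).
READ_LATENCY_US = {"LSB": 50, "CSB": 100, "MSB": 150}
# Assumed hotness rule: a page is hot once it has been read this many times.
HOT_THRESHOLD = 3


class PlacementSketch:
    """Toy model of latency-aware placement; not the paper's algorithm."""

    def __init__(self, placement, free_fast_pages):
        self.placement = dict(placement)        # logical page -> page type
        self.free_fast_pages = free_fast_pages  # free LSB pages left
        self.read_counts = Counter()            # logical page -> reads seen

    def read(self, lpn):
        """Serve a read, then migrate the page if it has become hot."""
        self.read_counts[lpn] += 1
        latency = READ_LATENCY_US[self.placement[lpn]]
        self._maybe_migrate(lpn)
        return latency

    def _maybe_migrate(self, lpn):
        is_hot = self.read_counts[lpn] >= HOT_THRESHOLD
        on_slow_page = self.placement[lpn] != "LSB"
        if is_hot and on_slow_page and self.free_fast_pages > 0:
            # Move only this page (fine-grained), not its whole block.
            self.placement[lpn] = "LSB"
            self.free_fast_pages -= 1


ftl = PlacementSketch({"A": "MSB", "B": "LSB"}, free_fast_pages=1)
latencies = [ftl.read("A") for _ in range(4)]
print(latencies)  # [150, 150, 150, 50]
```

In this toy run, page A is served from a slow page until its third read marks it hot; the fourth read then hits the fast page. A real flash translation layer would also have to account for the migration write cost, which the abstract says the caching methods are designed to minimize.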
Acknowledgements
The authors would like to thank the anonymous reviewers for their valuable comments. This work is supported by the Excellent PhD Student Promotion Project of East China Normal University (YBNLTS2021-036), NSFC (62072177 and 61972154), the Shanghai Science and Technology Project (20ZR1417200), and the Open Project Program of Wuhan National Laboratory for Optoelectronics (No. 2019WNLOKF008).
Cite this article
Shi, L., Lv, Y., Luo, L. et al. Read latency variation aware performance optimization on high-density NAND flash based storage systems. CCF Trans. HPC 4, 265–280 (2022). https://doi.org/10.1007/s42514-022-00102-2
DOI: https://doi.org/10.1007/s42514-022-00102-2