Abstract
Cloud computing should inherently support many types of data-intensive workloads with different storage access patterns, which makes a high-performance storage system an important component of the cloud. Emerging flash technologies such as solid state drives (SSDs) are a viable choice for building high performance computing (HPC) cloud storage systems that must serve fine-grained data access patterns. However, the cost per bit of SSDs is still higher than that of hard disk drives (HDDs). This study proposes an optimized progressive file layout (PFL) method that leverages the advantages of SSDs in a parallel file system such as Lustre so that small-file I/O performance can be significantly improved. A PFL can dynamically adjust chunk sizes and stripe patterns according to the I/O traffic. Extensive experimental results show that this approach (i.e., building a hybrid storage system from a combination of SSDs and HDDs) can achieve balanced throughput over mixed I/O workloads consisting of large and small file access patterns.
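As a rough illustration of the idea of a progressive layout, the sketch below maps a byte offset in a file to a layout component whose stripe count and stripe size grow with file size, with the first component placed on an SSD pool. This is not the authors' implementation or the Lustre API: the `Component` class, the `LAYOUT` table, the pool names, and all extent and stripe values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

MIB = 1024 * 1024


@dataclass(frozen=True)
class Component:
    """One extent of a file with its own striping (hypothetical model)."""
    start: int            # first byte offset covered (inclusive)
    end: Optional[int]    # first byte offset not covered; None = end of file
    stripe_count: int     # number of storage targets striped over
    stripe_size: int      # bytes written to one target before moving on
    pool: str             # illustrative pool name: "ssd" or "hdd"


# Assumed example layout: small files stay entirely on one SSD target,
# larger files spill onto progressively wider HDD stripes.
LAYOUT: List[Component] = [
    Component(0, 1 * MIB, 1, 64 * 1024, "ssd"),
    Component(1 * MIB, 256 * MIB, 4, 1 * MIB, "hdd"),
    Component(256 * MIB, None, 8, 4 * MIB, "hdd"),
]


def component_for(offset: int, layout: List[Component] = LAYOUT) -> Component:
    """Return the component whose extent contains the given byte offset."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    for comp in layout:
        if offset >= comp.start and (comp.end is None or offset < comp.end):
            return comp
    raise ValueError("layout does not cover this offset")


def stripe_index(offset: int, layout: List[Component] = LAYOUT) -> int:
    """Return which stripe (0-based) within its component holds the offset."""
    comp = component_for(offset, layout)
    chunk = (offset - comp.start) // comp.stripe_size
    return chunk % comp.stripe_count
```

Under these assumed values, a file smaller than 1 MiB never leaves the SSD pool, while a large file uses the SSD only for its first extent and then spreads across HDD targets, which is the kind of balance between small and large file access that the abstract describes.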
Acknowledgements
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT & Future Planning (No. NRF-2015R1C1A1A02036524), and by an Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korean government (MSIP) (No. R0190-16-2012, High Performance Big Data Analytics Platform Performance Acceleration Technologies Development).
About this article
Cite this article
Koo, D., Kim, JS., Hwang, S. et al. Adaptive hybrid storage systems leveraging SSDs and HDDs in HPC cloud environments. Cluster Comput 20, 2119–2131 (2017). https://doi.org/10.1007/s10586-017-1002-5
DOI: https://doi.org/10.1007/s10586-017-1002-5