Abstract
A performance-aware correlated datasets replication strategy (PCDRS) is proposed to address the problem of placing datasets in a wide-area distributed environment to support interdisciplinary scientific research. The problem is addressed in three parts. First, we determine the number of replicas from the performance requirements of the datasets. Second, we choose the locations of the dataset replicas according to the performance of the datanodes. Third, we distinguish hot data from cold data to control the number of replicas elastically. The strategy was tested on an HDFS cluster. The results show that it is effective in maintaining the reliability of datasets and improving the access performance of the cluster.
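The three parts named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no formulas, so the throughput-based replica count, the single per-node performance score, the access-rate thresholds, and every name below (`DataNode`, `replica_count`, `place_replicas`, `adjust_for_temperature`) are assumptions made purely for illustration. The floor of three replicas mirrors the HDFS default replication factor.

```python
import math
from dataclasses import dataclass


@dataclass
class DataNode:
    name: str
    throughput_mbps: float  # hypothetical per-node performance score


def replica_count(required_mbps: float, per_replica_mbps: float,
                  min_replicas: int = 3) -> int:
    """Part 1 (assumed model): smallest replica count whose combined
    throughput meets the dataset's requirement, never below the floor."""
    return max(min_replicas, math.ceil(required_mbps / per_replica_mbps))


def place_replicas(nodes: list[DataNode], count: int) -> list[str]:
    """Part 2 (assumed model): pick the `count` best-performing nodes,
    at most one replica per node."""
    ranked = sorted(nodes, key=lambda n: n.throughput_mbps, reverse=True)
    return [n.name for n in ranked[:count]]


def adjust_for_temperature(current: int, accesses_per_hour: float,
                           hot_threshold: float = 100.0,
                           cold_threshold: float = 5.0,
                           min_replicas: int = 3) -> int:
    """Part 3 (assumed model): add a replica for hot data, drop one for
    cold data, and never go below the floor. Thresholds are made up."""
    if accesses_per_hour >= hot_threshold:
        return current + 1
    if accesses_per_hour <= cold_threshold:
        return max(min_replicas, current - 1)
    return current
```

In this sketch a dataset that needs 400 MB/s from replicas that each serve about 100 MB/s gets four replicas, which are placed on the four highest-scoring datanodes and then grown or shrunk one at a time as the access rate crosses the hot or cold threshold.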
Acknowledgment
This work is supported by the National Key Technology R&D Program (Grant No. 2012BAH17F01) and the NSFC-NSF International Cooperation Project (Grant No. 61361126011).
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ye, L., Luan, Z., Yang, H. (2015). Performance-Aware Based Correlated Datasets Replication Strategy. In: Yueming, L., Xu, W., Xi, Z. (eds) Trustworthy Computing and Services. ISCTCS 2014. Communications in Computer and Information Science, vol 520. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-47401-3_42
DOI: https://doi.org/10.1007/978-3-662-47401-3_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-47400-6
Online ISBN: 978-3-662-47401-3
eBook Packages: Computer Science, Computer Science (R0)