Abstract
Data centers are growing to cope with exponential data growth, and their energy consumption challenges both service providers and the environment. Various data placement strategies have been developed to reduce the energy consumed in processing big data at the storage-system level, but they typically target specific applications and storage media. This paper proposes EABD, an energy-aware algorithm for processing big data in a homogeneous cluster with general data storage. We show that a variation of this optimization problem can be reduced to the set cover problem, and we propose a heuristic algorithm that reduces energy consumption by selecting proper nodes and assigning a balanced workload to each selected node. The algorithm is independent of the data placement strategy and the storage medium. Simulation results show that our algorithm significantly reduces energy consumption in different situations.
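To make the two ideas in the abstract concrete (node selection viewed as set cover, followed by balanced workload assignment), the following is a minimal sketch in Python. It is not the EABD algorithm itself, whose details are not given here. It uses the classic greedy heuristic for set cover and a simple least-loaded assignment rule. The names `select_nodes`, `assign_blocks`, and the `replicas` mapping (node name to the set of block ids stored on it) are assumptions made purely for illustration.

```python
def select_nodes(replicas):
    """Greedy set cover (illustrative, not EABD): choose nodes until
    every data block is held by at least one chosen node.

    replicas: dict mapping node name -> set of block ids on that node.
    """
    uncovered = set().union(*replicas.values())
    chosen = []
    while uncovered:
        # Pick the node covering the most still-uncovered blocks.
        # Iterating in sorted order makes ties resolve to the smallest name.
        best = max(sorted(replicas), key=lambda n: len(replicas[n] & uncovered))
        chosen.append(best)
        uncovered -= replicas[best]
    return chosen


def assign_blocks(replicas, chosen):
    """Assign each block to the least-loaded chosen node holding a replica."""
    load = {n: 0 for n in chosen}
    assignment = {}
    blocks = set().union(*(replicas[n] for n in chosen))

    def candidates(block):
        return [n for n in chosen if block in replicas[n]]

    # Place the most constrained blocks (fewest candidate nodes) first.
    for block in sorted(blocks, key=lambda b: (len(candidates(b)), b)):
        target = min(candidates(block), key=lambda n: (load[n], n))
        assignment[block] = target
        load[target] += 1
    return assignment, load


# Hypothetical four-node cluster with six replicated blocks.
replicas = {
    "n1": {1, 2, 3, 4},
    "n2": {3, 4, 5, 6},
    "n3": {1, 2},
    "n4": {5, 6},
}
chosen = select_nodes(replicas)
assignment, load = assign_blocks(replicas, chosen)
print(chosen)  # ['n1', 'n2']
print(load)    # {'n1': 3, 'n2': 3}
```

In this toy cluster two of the four nodes hold every block between them, so under the sketch's assumptions the remaining two could be put into a low-power state while the selected nodes each process three blocks.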
Acknowledgments
The research was supported by the National Natural Science Foundation of China (grant nos. 41301047, 61373015, 61300052), the Research Fund for the Doctoral Program of Higher Education of China (grant no. 20103218110017), a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (grant no. PAPD), and the Fundamental Research Funds for the Central Universities, NUAA (grant nos. NP2013307, NZ2013306).
Cite this article
Ding, Y., Qin, X., Zhou, Q. et al. Energy-aware processing of big data in homogeneous cluster. SIViP 11, 371–379 (2017). https://doi.org/10.1007/s11760-016-0964-8
DOI: https://doi.org/10.1007/s11760-016-0964-8