Abstract
As technology has advanced, the amount of data generated has grown with it. The volume is now so large that mainstream techniques cannot process and analyse it efficiently, which creates the need for dedicated processing techniques. Processing data at this scale often over-exploits computing resources and, in turn, consumes a great deal of power. Many attempts have been made to address this at both the hardware and the software level, with promising results, but one issue has been overlooked: the impact of Big Data’s variety on processing. To address this problem, this paper contributes a novel technique that reduces the energy consumed in processing big data with variety while meeting the deadline as a Quality of Service constraint. The technique works by reading the whole dataset in chunks and then removing stop words with an appropriate algorithm to save processing time. For the evaluation, a set of datasets of approximately 100 GB was aggregated from different sources and evaluated using benchmark applications. The results show that the approach is efficient: it improves energy consumption and meets the deadline constraints.
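The chunked-reading and stop-word-removal step described in the abstract can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper: the stop-word list, the chunk size, and the function names `read_chunks`, `remove_stop_words`, and `filter_stream` are assumptions introduced here for illustration only.

```python
import io
from typing import Iterator, TextIO

# Hypothetical stop-word list; the abstract does not say which list is used.
STOP_WORDS = frozenset({"a", "an", "and", "in", "is", "of", "the", "to"})


def read_chunks(stream: TextIO, chunk_size: int = 1 << 20) -> Iterator[str]:
    """Yield chunks cut at a space or newline so no word is split in two."""
    carry = ""
    while True:
        block = stream.read(chunk_size)
        if not block:
            break
        block = carry + block
        # Hold back the trailing partial word until the next read completes it.
        cut = max(block.rfind(" "), block.rfind("\n"))
        if cut == -1:
            carry = block
            continue
        yield block[:cut + 1]
        carry = block[cut + 1:]
    if carry:
        yield carry


def remove_stop_words(chunk: str, stop_words=STOP_WORDS) -> str:
    """Drop every token whose lowercase form is in the stop-word set."""
    return " ".join(w for w in chunk.split() if w.lower() not in stop_words)


def filter_stream(stream: TextIO, chunk_size: int = 1 << 20) -> Iterator[str]:
    """Read the stream chunk by chunk and yield each non-empty cleaned chunk."""
    for chunk in read_chunks(stream, chunk_size):
        cleaned = remove_stop_words(chunk)
        if cleaned:
            yield cleaned


if __name__ == "__main__":
    # A tiny chunk size is used only to exercise the chunk-boundary handling.
    sample = io.StringIO("the cat is in the hat and the bat")
    print(" ".join(filter_stream(sample, chunk_size=5)))
```

Because only one chunk is held in memory at a time, the same loop applies to files much larger than RAM. Cutting each chunk at whitespace keeps a word that straddles a read boundary from being mistaken for two tokens.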
Data Availability
Amazon Datasets: https://nijianmo.github.io/amazon/index.html
Gutenberg Data: https://web.eecs.umich.edu/~lahiri/gutenberg_dataset.html
Quotes Dataset: https://www.kaggle.com/akmittal/quotes-dataset
TPC dataset: https://relational.fit.cvut.cz/dataset/TPCH
IMDB dataset: https://datasets.imdbws.com/
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
About this article
Cite this article
Kumar, R. Simulative Analysis and Performance Evaluation for Data Variety Aware Power Optimization Technique Using Big Data. Wireless Pers Commun 133, 1987–2002 (2023). https://doi.org/10.1007/s11277-023-10841-2
DOI: https://doi.org/10.1007/s11277-023-10841-2