research-article

Energy consumption estimation and profiling for queries in distributed database systems based on a bottom-up comprehensive energy model

Published: 08 August 2024

Abstract

Quantifying the energy consumption of database operations is the foundation for building energy-efficient database systems. Existing approaches focus only on stand-alone database servers, whereas we are interested in modeling the energy consumption of database operations in distributed environments. In this study, we aim to provide an accurate energy consumption model for queries executed in distributed database systems and to profile the energy consumption characteristics of both individual queries and the system as a whole, in order to guide the design of green database systems and to reveal opportunities for energy-efficient computing. Because the execution of a distributed query is a combination of the subqueries decomposed from it, we start by building energy models for individual subqueries, extracting basic operations that effectively reflect their energy consumption. We then use a bottom-up measuring and modeling method as the basis for a comprehensive energy estimation model for the entire distributed query. To validate the accuracy of the model, we use queries of varying complexity generated from three standard benchmarks (TPC-H, SSB, and Sysbench) on real distributed databases. Extensive experimental results show that our solution achieves a high average accuracy of 94.63% for distributed queries. More importantly, based on these results, we further explore energy consumption patterns for distributed queries and present several important implications. Finally, we find that distributed joins play a significant role in saving both the idle and the dynamic energy cost of the system. We hope that our observations can help readers who wish to substantially improve the energy efficiency of distributed database systems.
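The bottom-up idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the linear form, the basic-operation names (`tuple_scan`, `tuple_compare`, `byte_sent`), the per-operation joule costs, and the relative-error definition of accuracy are all assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical per-operation dynamic energy costs in joules.
# Placeholder values for illustration, not measurements from the paper.
OP_COST_J = {"tuple_scan": 2e-6, "tuple_compare": 5e-7, "byte_sent": 1e-7}


@dataclass
class Subquery:
    node: str        # node that executes this subquery (hypothetical field)
    op_counts: dict  # basic operation name -> number of times it is performed


def subquery_energy(sq, op_cost=OP_COST_J):
    """Estimate one subquery's energy as a weighted sum of its basic operations."""
    return sum(op_cost[op] * count for op, count in sq.op_counts.items())


def query_energy(subqueries, op_cost=OP_COST_J):
    """Bottom-up step: a distributed query's estimate is the sum over its subqueries."""
    return sum(subquery_energy(sq, op_cost) for sq in subqueries)


def accuracy(estimated, measured):
    """Assumed accuracy metric: one minus the relative estimation error."""
    return 1.0 - abs(estimated - measured) / measured


if __name__ == "__main__":
    plan = [
        Subquery("node1", {"tuple_scan": 1_000_000, "byte_sent": 10_000_000}),
        Subquery("node2", {"tuple_scan": 500_000, "tuple_compare": 2_000_000}),
    ]
    estimate = query_energy(plan)
    print(f"estimated energy: {estimate:.2f} J")
    print(f"accuracy vs. a 5.2 J measurement: {accuracy(estimate, 5.2):.2%}")
```

In practice the per-operation costs would have to be fitted against power-meter readings on the target hardware; the constants here only make the aggregation step concrete.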

Highlights

Measuring the real, accurate energy cost of running queries on distributed database systems.
Bottom-up energy estimation model with 94.63% average accuracy for distributed queries.
Extensive results obtained from queries of various complexity on a physical testbed.
Discussion of the energy cost patterns of distributed queries and of the system, revealing energy-saving opportunities.

Published In

Future Generation Computer Systems  Volume 159, Issue C
Oct 2024
580 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. Energy estimation
  2. Energy-efficient computing
  3. Distributed database systems
  4. Query optimization
