Responsible Knowledge Management in Energy Data Ecosystems
Figure 1. The electricity value chain.
Figure 2. The energy big data integration platform as a knowledge-driven data ecosystem.
Figure 3. Software architecture for one node (Institute Mihajlo Pupin, 2021).
Figure 4. Applying the CIM standard.
Figure 5. Example SPARQL query.
Figure 6. Unified knowledge graph creation process.
Figure 7. Data harmonization pipeline.
Figure 8. Annotation of data sources. Visualization powered by Cytoscape.
Figure 9. Energy analytics dashboard.
Abstract
1. Introduction
1.1. Data Ecosystems
1.2. The EU Energy Data Ecosystem
1.3. Overview of Main Contributions
- A new approach that combines data and knowledge management and enhances the analytics services portfolio of various energy stakeholders. With it, energy expert users can develop analytical methods for two-way flows of electricity and information, and optimize electricity generation, distribution, and consumption on top of heterogeneous data sources.
- An abstract architecture for semantic data integration and business analytics. On top of this architecture, various knowledge-driven services for processing data contextually and intelligently are devised.
- A unified knowledge graph that converges data and knowledge collected from the data ecosystem. The knowledge graph is connected to existing encyclopedic knowledge graphs (e.g., DBpedia [14] and Wikidata [15]). Additionally, a federated query engine allows for query processing on top of the connected knowledge graphs in a unified way. This engine provides the basis for the development of interactive and explainable AI-based services on top of the knowledge-driven data ecosystem.
- An analytical layer composed of advanced analytical services (statistical and ML models that work on the edge and on top of integrated data). Depending on the stakeholders’ needs and the available data, the services offered relate to renewable energy source (RES) production forecasting, RES effects calculation, building operation optimization, and predictive maintenance of assets.
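To make the idea of querying connected knowledge graphs "in a unified way" more concrete, the following is a toy sketch in plain Python rather than the federated engine described in the paper. It assumes two in-memory triple sets, one standing in for the local energy knowledge graph and one for an external encyclopedic graph, joined through an `owl:sameAs` link. All identifiers (`ex:WindPlant1`, `ext:Q_city`, the population value) are hypothetical placeholders.

```python
# Toy triple stores: each source is a set of (subject, predicate, object).
# All identifiers and values below are made-up placeholders.
LOCAL_KG = {
    ("ex:WindPlant1", "rdf:type", "ex:WindPowerPlant"),
    ("ex:WindPlant1", "ex:locatedIn", "ex:City1"),
    ("ex:City1", "owl:sameAs", "ext:Q_city"),
}
EXTERNAL_KG = {
    ("ext:Q_city", "ext:population", 12345),
}


def match(store, pattern, binding):
    """Yield every extension of `binding` that makes `pattern` match a triple."""
    for triple in store:
        candidate = dict(binding)
        matched = True
        for term, value in zip(pattern, triple):
            if isinstance(term, str) and term.startswith("?"):
                # Variable: bind it, or check consistency with an earlier binding.
                if candidate.setdefault(term, value) != value:
                    matched = False
                    break
            elif term != value:
                matched = False
                break
        if matched:
            yield candidate


def federated_query(plan):
    """Evaluate (store, pattern) pairs in order, joining on shared variables."""
    bindings = [{}]
    for store, pattern in plan:
        bindings = [b2 for b in bindings for b2 in match(store, pattern, b)]
    return bindings


# Which wind plants are located in a city, and what does the external
# graph say about that city's population?
plan = [
    (LOCAL_KG, ("?plant", "rdf:type", "ex:WindPowerPlant")),
    (LOCAL_KG, ("?plant", "ex:locatedIn", "?city")),
    (LOCAL_KG, ("?city", "owl:sameAs", "?ext")),
    (EXTERNAL_KG, ("?ext", "ext:population", "?pop")),
]
results = federated_query(plan)
```

A real federated engine would decompose a SPARQL query, select sources, and send sub-queries to remote endpoints; the sketch only shows the join across sources that makes the linked graphs look like one.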
2. The Electricity Value Chain: Overview of Challenges
2.1. Example Case Study
2.2. EU Energy Data Spaces and New Business Models
- Data governance according to regulations imposed by data providers;
- Ensuring a trusted and secure data exchange;
- Semantically representing main data concepts and relationships;
- Exchanging formats and protocols;
- Providing software design principles for guiding the implementation of the reference architecture components.
2.3. Example Scenario: The RES Forecasting
- RQ2 Which ontologies cover the needs for modeling the energy value chain and ensure uniform access [21] to data collected with the proprietary SCADA system?
- RQ3 How to build a knowledge graph that will be ready for integration with services in future energy marketplaces?
- RQ4 From a business perspective, what are the benefits of advanced analytics for different kinds of energy actors?
Example Data Sources
- The production dataset contains historical wind power production measurements from the wind power plant.
- The predictive maintenance dataset contains high-resolution measurements collected by the phasor measurement unit (PMU) installed at the renewable power plant.
- The meteorological dataset contains both historical and forecasted meteorological data, which are crucial for a precise RES production forecast.
- The RES effects dataset contains estimates of the effects of the renewable energy source on the power system, based on the PMU measurements (predictive maintenance dataset).
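As an illustration of how the production and meteorological datasets might feed a forecast, the sketch below joins the two on a shared timestamp and fits a deliberately simplistic linear baseline. It is not the forecasting model used in the paper. The field names (`time`, `wind_speed_ms`, `power_kw`) and all sample values are assumptions made for the example.

```python
def build_training_rows(production, weather):
    """Join production measurements with weather records on the timestamp."""
    weather_by_time = {w["time"]: w for w in weather}
    rows = []
    for p in production:
        w = weather_by_time.get(p["time"])
        if w is None:
            continue  # no weather record for this hour, so skip it
        rows.append({
            "time": p["time"],
            "wind_speed_ms": w["wind_speed_ms"],
            "power_kw": p["power_kw"],
        })
    return rows


def fit_line(rows, x="wind_speed_ms", y="power_kw"):
    """Least-squares fit of y = slope * x + intercept (needs two distinct x)."""
    n = len(rows)
    mean_x = sum(r[x] for r in rows) / n
    mean_y = sum(r[y] for r in rows) / n
    sxx = sum((r[x] - mean_x) ** 2 for r in rows)
    sxy = sum((r[x] - mean_x) * (r[y] - mean_y) for r in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical sample data: historical production plus weather records,
# where the last weather record is a forecast with no production yet.
production = [
    {"time": "2022-05-02T10:00", "power_kw": 100.0},
    {"time": "2022-05-02T11:00", "power_kw": 200.0},
    {"time": "2022-05-02T12:00", "power_kw": 300.0},
]
weather = [
    {"time": "2022-05-02T10:00", "wind_speed_ms": 4.0},
    {"time": "2022-05-02T11:00", "wind_speed_ms": 6.0},
    {"time": "2022-05-02T13:00", "wind_speed_ms": 8.0},
]

rows = build_training_rows(production, weather)
slope, intercept = fit_line(rows)
forecast_kw = slope * weather[-1]["wind_speed_ms"] + intercept
```

The join silently drops hours that lack a weather record, which is one reason the harmonization of timestamps and granularity across sources matters before any model is trained.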
3. Developing a Multi-Layer Software Architecture
3.1. Energy Big Data Integration Platform
3.2. Instantiating a DE
- Producer site (e.g., at a wind power plant, a unified knowledge graph shall be integrated with the production forecast and the predictive maintenance services);
- Supplier site, an organization that integrates data from many producers and sells electricity to the TSO (e.g., the power industry of Serbia might be interested in integrating the data sources from the power plants it owns and manages);
- Transmission system operator site, an organization that operates and balances the grid (e.g., the joint stock company EMS might be interested in improving the data integration and the transparency of data exchanged with other actors).
3.2.1. Description of the DE Architecture on the Node Level
- The first layer (denoted with number 1) ensures syntactic interoperability and communication with the physical architecture: for example, phasor measurement units that collect high-resolution data about the generating units (inverters of a PV production plant or turbines of a wind plant), a building or a complex of buildings, or single devices such as energy meters on the consumption side.
- The second layer (denoted with number 2) ensures syntactic interoperability and communication between the control SCADA system and the intelligent layer, that is, analytical services that work on top of a single kind of data (for instance, a single MySQL database is used for retrieving the data).
- The third layer (denoted with number 3) ensures semantic interoperability for advanced business services, where different big data sources need to be integrated and different interoperability issues have to be resolved.
- Representation of attributes’ values: timestamp standardization, measurement unit generalization, and measurement scale.
- Granularity: different aggregations (daily vs. hourly; weather at the wind farm vs. at the city level) and different measurements for the same time intervals (for example, temperature from the wind farm sensor vs. temp from WeatherBit for the city).
- Structuredness: SCADA data are structured (MySQL), WeatherBit data are semi-structured (JSON), and ENTSO-E data are semi-structured (XML).
- Schematic interoperability: various representations of attributes and concepts are used for modeling the same semantic concept (outtemperature in the Wind RES database vs. temp at WeatherBit; obtime at WeatherBit vs. timestamp in the Wind RES database).
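The interoperability issues above can be sketched in a few lines of Python. The attribute names `outtemperature`, `timestamp`, `temp`, and `obtime` come from the list; everything else is an assumption made for illustration, including the target names `time` and `temperature_c`, the record shapes, the timestamp encodings, and the choice to treat naive timestamps as UTC. This is a minimal sketch, not the harmonization pipeline of the paper.

```python
from datetime import datetime, timezone
from statistics import mean

# Hypothetical mapping from source-specific attribute names to shared names.
FIELD_MAP = {
    "scada": {"timestamp": "time", "outtemperature": "temperature_c"},
    "weatherbit": {"obtime": "time", "temp": "temperature_c"},
}


def parse_time(value):
    """Normalize an epoch number or an ISO-style string to a UTC datetime."""
    if isinstance(value, (int, float)):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption: timestamps without a zone are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def harmonize(record, source):
    """Rename attributes to the shared schema and standardize the timestamp."""
    out = {"source": source}
    for src_name, target in FIELD_MAP[source].items():
        if src_name in record:
            out[target] = record[src_name]
    out["time"] = parse_time(out["time"])
    return out


def hourly_mean(records, field="temperature_c"):
    """Bring records to a common hourly granularity by averaging per hour."""
    buckets = {}
    for r in records:
        hour = r["time"].replace(minute=0, second=0, microsecond=0)
        buckets.setdefault(hour, []).append(r[field])
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Hypothetical sample records from the two sources.
scada_row = harmonize(
    {"timestamp": "2022-05-02 10:15:00", "outtemperature": 12.0}, "scada"
)
weather_row = harmonize({"obtime": 1651488300, "temp": 14.0}, "weatherbit")
per_hour = hourly_mean([scada_row, weather_row])
```

Keeping the mapping in a plain dictionary mirrors the role a declarative mapping language plays in a knowledge graph pipeline: the correspondence between source attributes and shared concepts is data, not code, so a new source only needs a new mapping entry.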
3.2.2. Instantiating a Node at the Producers’ Site
3.2.3. Instantiating an IDS Data Connector
4. Data Standardization and Harmonization
4.1. Developing a Global Schema for the Energy Domain
- IEC Common Information Model (CIM) standards (https://www.dmtf.org/standards/cim/cim_schema_v2530, accessed on 2 May 2022); see CIM V2.53.0 Schema (MOF, PDF and UML);
- Smart Appliances REFerence ontology (SAREF), and the extension of SAREF to fully support demand/response use cases in the energy domain (SAREF4EE);
- The International Data Space (IDS) Information Model (https://w3id.org/seas/, accessed on 2 May 2022);
- Smart Energy Aware Systems (SEAS) (https://ci.mines-stetienne.fr/seas/index.html, accessed on 2 May 2022).
4.2. Unified Knowledge Graph Creation Process
4.3. Data Harmonization
5. Knowledge Exploitation
5.1. Traversing the Knowledge Graph
5.2. Federated Query Processing
6. Integration of Advanced Analytics
6.1. RES Forecasting
6.2. Data Analytics on the Edge
7. Discussion
- The use of new approaches capable of managing and processing data to extend the analytics services portfolio of various energy stakeholders. Examples include ESCOs, DSOs, and utilities that aim to achieve two-way flows of electricity and information for optimized electricity generation, distribution, and consumption.
- Distributed/edge processing and data analytics technologies to optimize the operation of the real-time energy system management and automate the “monitor–forecast–optimize–control” loop.
- Effective integration of relevant digital technologies, which will transform energy systems from the top down, moving from a centralized production and rigid distribution framework to a collaborative ecosystem of self-managed prosumers able to act independently on the liberalized energy markets.
- Secure data exchange: For instance, using the industrial data space concept that features various levels of protection, data are exchanged securely across the entire data supply chain (and not just in bilateral data exchange).
- Data governance and sovereignty: In a network of energy DEs, data owners determine the terms and conditions of use of the data provided, while data sovereignty always remains with the respective data provider. A provider makes data available to be requested by certain contractors in a data space by its own rules. Additionally, the provider can also offer data services (e.g., via an »AppStore«) to be found by all DE participants.
- Innovative, scalable, and replicable energy management services: a network of energy DEs opens opportunities for new data-driven and model-driven services that will complement and enhance the existing ones, e.g., balancing services, intelligent energy generation and consumption forecasting services, and energy performance assessment services.
8. Related Work
9. Concluding Remarks
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- El-hawary, M.E. The smart grid—State-of-the-art and future trends. Electr. Power Compon. Syst. 2014, 42, 239–250.
- Liggesmeyer, P.; Rombach, D.; Bomarius, F. Smart Energy. In Digital Transformation; Springer: Berlin/Heidelberg, Germany, 2019; pp. 335–351.
- Noy, N.; Gao, Y.; Jain, A.; Narayanan, A.; Patterson, A.; Taylor, J. Industry-Scale Knowledge Graphs: Lessons and Challenges. Commun. ACM 2019, 62, 36–43.
- Jain, S. Exploiting Knowledge Graphs for Facilitating Product/Service Discovery. arXiv 2020, arXiv:2010.05213.
- Roman, D.; Alexiev, V.; Paniagua, J.; Elvesæter, B.; Marius von Zernichow, B.; Soylu, A.; Simeonov, B.; Taggart, C. The euBusinessGraph ontology: A lightweight ontology for harmonizing basic company information. Semant. Web J. 2022, 13, 41–68.
- Lackshen, G.; Janev, V.; Vraneš, S. Arabic Linked Drug Dataset Consolidating and Publishing. Comput. Sci. Inf. Syst. 2021, 18, 729–748.
- Mijović, V.; Tomašević, N.; Janev, V.; Stanojević, M.; Vraneš, S. Ontology Enabled Decision Support System for Emergency Management at Airports. In Proceedings of the I-SEMANTICS 2011, International Conference on Semantic Systems, Graz, Austria, 7–9 September 2011; ACM: New York, NY, USA, 2011; pp. 163–166.
- Janev, V.; Vidal, M.E.; Endris, K.; Pujić, D. Managing Knowledge in Energy Data Spaces; Association for Computing Machinery: New York, NY, USA, 2021; pp. 7–15.
- Cappiello, C.; Gal, A.; Jarke, M.; Rehof, J. Data Ecosystems: Sovereign Data Exchange among Organizations (Dagstuhl Seminar 19391). In Dagstuhl Reports; Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik: Dagstuhl, Germany, 2020; Volume 9.
- Oliveira, M.I.S.; Lóscio, B.F. What is a Data Ecosystem? In Proceedings of the 19th Annual International Conference on Digital Government Research: Governance in the Data Age, Delft, The Netherlands, 30 May–1 June 2018; pp. 1–9.
- European Commission. A European Strategy for Data (19 February 2020, COM(2020) 66 Final). 2020. Available online: https://ec.europa.eu/info/strategy/priorities-2019-2024/europe-fit-digital-age/european-data-strategy_en (accessed on 23 May 2022).
- Geisler, S.; Vidal, M.; Cappiello, C.; Lóscio, B.F.; Gal, A.; Jarke, M.; Lenzerini, M.; Missier, P.; Otto, B.; Paja, E.; et al. Knowledge-Driven Data Ecosystems Toward Data Transparency. ACM J. Data Inf. Qual. 2022, 14, 3:1–3:12.
- European Commission. The European Green Deal (11 December 2019, COM(2019) 640 Final). 2019. Available online: https://ec.europa.eu/info/publications/communication-european-green-deal_en (accessed on 23 May 2022).
- Auer, S.; Bizer, C.; Kobilarov, G.; Lehmann, J.; Cyganiak, R.; Ives, Z. DBpedia: A Nucleus for a Web of Open Data. In The Semantic Web; Springer: Berlin/Heidelberg, Germany, 2007; pp. 722–735.
- Vrandecic, D.; Krötzsch, M. Wikidata: A free collaborative knowledgebase. Commun. ACM 2014, 57, 78–85.
- Otto, B.; Haas, C.; Pettenpohl, H.; Lohmann, S.; Huber, M.; Pullmann, J.; Auer, S. Reference Architecture Model for the Industrial Data Space. 2017. Available online: https://www.fit.fraunhofer.de/content/dam/fit/en/documents/Industrial-Data-Space_Reference-Architecture-Model-2017.pdf (accessed on 23 May 2022).
- Janev, V.; Jakupović, G. Electricity Balancing: Challenges and Perspectives. In Proceedings of the 2020 28th Telecommunications Forum (TELFOR), Belgrade, Serbia, 24–25 November 2020; pp. 1–4.
- Blackmon, D. How Europe’s Energy Crisis Could Force The EU To Adopt More Sensible Policies. 2022. Available online: https://www.forbes.com/sites/davidblackmon/2022/01/03/how-europes-energy-crisis-could-force-the-eu-to-adopt-more-sensible-policies/?sh=4e5e9a6e3ed3 (accessed on 23 May 2022).
- Hooshyar, H.; Vanfretti, L. A SGAM-based architecture for synchrophasor applications facilitating TSO/DSO interactions. In Proceedings of the 2017 IEEE Power Energy Society Innovative Smart Grid Technologies Conference (ISGT), Washington, DC, USA, 23–26 April 2017; pp. 1–5.
- European Commission. BRIDGE European Energy Data Exchange Reference Architecture. 2021. Available online: https://ec.europa.eu/energy/sites/default/files/documents/bridge_wg_data_management_eu_reference_architcture_report_2020-2021.pdf (accessed on 23 May 2022).
- Mami, M.; Graux, D.; Scerri, S.; Jabeen, H.; Auer, S.; Lehmann, S. Uniform Access to Multiform Data Lakes using Semantic Technologies. In Proceedings of the 21st International Conference on Information Integration and Web-based Applications and Services, Munich, Germany, 2–4 December 2019; pp. 313–322.
- Dimou, A.; Vander Sande, M.; Colpaert, P.; Verborgh, R.; Mannens, E.; Van de Walle, R. RML: A generic language for integrated RDF mappings of heterogeneous data. In Proceedings of the 7th Workshop on Linked Data on the Web, Seoul, Korea, 8 April 2014.
- Iglesias, E.; Jozashoori, S.; Chaves-Fraga, D.; Collarana, D.; Vidal, M.E. SDM-RDFizer: An RML interpreter for the efficient creation of RDF knowledge graphs. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management, Virtual Event, 19–23 October 2020; pp. 3039–3046.
- Endris, K.M.; Rohde, P.D.; Vidal, M.; Auer, S. Ontario: Federated Query Processing Against a Semantic Data Lake. In Lecture Notes in Computer Science, Proceedings of the Database and Expert Systems Applications—30th International Conference, DEXA 2019, Linz, Austria, 26–29 August 2019; Hartmann, S., Küng, J., Chakravarthy, S., Anderst-Kotsis, G., Tjoa, A.M., Khalil, I., Eds.; Springer: Berlin/Heidelberg, Germany, 2019; Volume 11706, pp. 379–395.
- Sakor, A.; Singh, K.; Patel, A.; Vidal, M. Falcon 2.0: An Entity and Relation Linking Tool over Wikidata. In Proceedings of the CIKM’20: The 29th ACM International Conference on Information and Knowledge Management, Virtual Event, Ireland, 19–23 October 2020; d’Aquin, M., Dietze, S., Hauff, C., Curry, E., Cudré-Mauroux, P., Eds.; ACM: New York, NY, USA, 2020; pp. 3141–3148.
- Figuera, M.; Rohde, P.D.; Vidal, M. Trav-SHACL: Efficiently Validating Networks of SHACL Constraints. In Proceedings of The Web Conference WWW, Ljubljana, Slovenia, 19–23 April 2021.
- Lehmann, J.; Sejdiu, G.; Bühmann, L.; Westphal, P.; Stadler, C.; Ermilov, I.; Bin, S.; Chakraborty, N.; Saleem, M.; Ngomo, A.N.; et al. Distributed Semantic Analytics Using the SANSA Stack. In Lecture Notes in Computer Science, Proceedings of the Semantic Web—ISWC 2017—16th International Semantic Web Conference, Vienna, Austria, 21–25 October 2017; d’Amato, C., Fernández, M., Tamma, V.A.M., Lécué, F., Cudré-Mauroux, P., Sequeda, J.F., Lange, C., Heflin, J., Eds.; Springer: Berlin/Heidelberg, Germany, 2017; Volume 10588, pp. 147–155.
- Maggio, M.; Savarino, V.; Rashid, T.T.; Harish, T.M.; Idoia Murua, U.I. Deliverable D3.4 Open Source Data Connector. 2021. Available online: https://cordis.europa.eu/project/id/872592/results (accessed on 23 May 2022).
- Janev, V.; Popadić, D.; Pujić, D.; Vidal, M.E.; Endris, K. Reuse of Semantic Models for Emerging Smart Grids Applications. arXiv 2021, arXiv:2107.06999.
- Endris, K.M.; Vidal, M.E.; Graux, D. Chapter 5 Federated Query Processing. In Knowledge Graphs and Big Data Processing; Janev, V., Graux, D., Jabeen, H., Sallinger, E., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 73–86.
- Fathabad, A.; Cheng, J.; Pan, K.; Qiu, F. Data-Driven Planning for Renewable Distributed Generation Integration. IEEE Trans. Power Syst. 2020, 35, 4357–4368.
- Gelhaar, J.; Otto, B. Challenges in the Emergence of Data Ecosystems. In Proceedings of the 24th Pacific Asia Conference on Information Systems, PACIS, Dubai, United Arab Emirates, 22–24 June 2020; Vogel, D., Shen, K.N., Ling, P.S., Hsu, C., Thong, J.Y.L., Marco, M.D., Limayem, M., Xu, S.X., Eds.; 2020; p. 175.
- Stoyanovich, J.; Howe, B.; Jagadish, H.V. Responsible Data Management. Proc. VLDB Endow. 2020, 13, 3474–3488.
- Quix, C.; Hai, R.; Vatov, I. GEMMS: A Generic and Extensible Metadata Management System for Data Lakes. In Proceedings of the 28th International Conference on Advanced Information Systems Engineering (CAiSE 2016), CEUR-WS, Ljubljana, Slovenia, 13–17 June 2016; pp. 129–136.
- Khan, Y.; Zimmermann, A.; Jha, A.; Gadepally, V.; D’Aquin, M.; Sahay, R. One Size Does Not Fit All: Querying Web Polystores. IEEE Access 2019, 7, 9598–9617.
- Duggan, J.; Elmore, A.J.; Stonebraker, M.; Balazinska, M.; Howe, B.; Kepner, J.; Madden, S.; Maier, D.; Mattson, T.; Zdonik, S. The BigDAWG Polystore System. SIGMOD Rec. 2015, 44, 11–16.
- Hai, R.; Geisler, S.; Quix, C. Constance: An Intelligent Data Lake System. In Proceedings of the 2016 International Conference on Management of Data, SIGMOD, San Francisco, CA, USA, 26 June–1 July 2016; ACM: New York, NY, USA, 2016; pp. 2097–2100.
Dataset | Number of Annotations |
---|---|
PUPIN-RES-PROD | 34 |
PUPIN-RES-PV | 26 |
PUPIN-ENTSO-E | 22 |
PUPIN-RES-Effects | 4 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Janev, V.; Vidal, M.-E.; Pujić, D.; Popadić, D.; Iglesias, E.; Sakor, A.; Čampa, A. Responsible Knowledge Management in Energy Data Ecosystems. Energies 2022, 15, 3973. https://doi.org/10.3390/en15113973