Explaining Commonalities of Clusters of RDF Resources in Natural Language

  • Conference paper
Foundations of Intelligent Systems (ISMIS 2024)

Abstract

We introduce a system that provides explanations in Natural Language for individual clusters of RDF resources, where clusters are obtained using an external clustering tool. Our system is based on the theory of (Least) Common Subsumers (CS) in RDF. We propose an optimized algorithm for computing a CS, which allows us to compute the CS for up to 80 RDF resources (each with its own RDF-graph of linked data). We then generate a Natural Language sentence to describe each cluster. A unique aspect of our explanations is the use of relative sentences, including nested ones, to represent blank nodes in an RDF-path. We demonstrate the usefulness of our tool by describing the resulting clusters of a real, publicly available dataset on Public Procurements.
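The idea of describing a cluster by what its resources share can be illustrated with a minimal sketch. It is not the paper's Algorithm 1: it is a toy simplification that assumes every predicate has a single value per resource, never generalizes predicates, and stops at a fixed depth. The graph, the resource names, and the functions `common_description` and `verbalize` are all hypothetical, introduced only for illustration. When the resources agree on a predicate but not on its value, the differing values are replaced by an anonymous node (playing the role of a blank node), which is then rendered as a nested relative clause.

```python
# Toy data: subject -> {predicate: object}. All names are hypothetical.
GRAPH = {
    "tender1": {"buyer": "cityA", "status": "open"},
    "tender2": {"buyer": "cityB", "status": "open"},
    "cityA": {"country": "Italy"},
    "cityB": {"country": "Italy"},
}


def common_description(resources, graph, depth=2):
    """Return what all `resources` share, as a list of (predicate, value).

    `value` is a resource name when every resource has the same object,
    or a nested list (an anonymous node) describing what the differing
    objects have in common. Assumes single-valued predicates.
    """
    if depth == 0:
        return []
    descriptions = [graph.get(r, {}) for r in resources]
    # Keep only predicates that every resource uses.
    shared = set(descriptions[0])
    for d in descriptions[1:]:
        shared &= set(d)
    result = []
    for predicate in sorted(shared):
        objects = [d[predicate] for d in descriptions]
        if len(set(objects)) == 1:
            result.append((predicate, objects[0]))
        else:
            # Objects differ: recurse to describe the anonymous node.
            nested = common_description(objects, graph, depth - 1)
            result.append((predicate, nested))
    return result


def verbalize(description):
    """Render a description as a phrase; anonymous nodes become relative clauses."""
    parts = []
    # Named values first, anonymous nodes last, to keep the phrase readable.
    for predicate, value in sorted(description, key=lambda pv: isinstance(pv[1], list)):
        if isinstance(value, list):
            if value:
                parts.append(f"a {predicate} which has {verbalize(value)}")
            else:
                parts.append(f"some {predicate}")
        else:
            parts.append(f"{predicate} {value}")
    return " and ".join(parts)


shared = common_description(["tender1", "tender2"], GRAPH)
print("The resources in this cluster all have " + verbalize(shared) + ".")
# prints: The resources in this cluster all have status open and a buyer which has country Italy.
```

The recursion depth bounds how far the sketch follows links from each resource; a full treatment would also handle multi-valued predicates and discard uninformative triples.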

Data Availability Statement

Data publicly available at https://tbfy.github.io/data/.

Notes

  1. The same conclusion was recently reached [1] using cycles instead of trees.

  2. A lean graph G [17] is an RDF-graph which is \(\subseteq\)-minimal with respect to all other RDF-graphs logically equivalent to G.

  3. Average execution time for running Algorithm 1 on 80 randomly selected resources, on a machine equipped with an Intel i7 processor at 3.60 GHz and 32 GB RAM.

  4. https://aclweb.org/aclwiki/Downloadable_NLG_systems.

  5. https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html.

  6. The full Knowledge Graph is downloadable at https://tbfy.github.io/data/.

References

  1. Amendola, G., Manna, M., Ricioppo, A.: Characterizing nexus of similarity within knowledge bases: a logic-based framework and its computational complexity aspects. https://arxiv.org/pdf/2303.10714.pdf

  2. Baader, F., Küsters, R., Molitor, R.: Computing least common subsumers in description logics with existential restrictions. In: IJCAI, vol. 99, pp. 96–101 (1999)

  3. Bae, J., Helldin, T., Riveiro, M., Nowaczyk, S., Bouguelia, M.R., Falkman, G.: Interactive clustering: a comprehensive review. ACM Comput. Surv. 53(1) (2020)

  4. Bandyapadhyay, S., Fomin, F.V., Golovach, P.A., Lochet, W., Purohit, N., Simonov, K.: How to find a good explanation for clustering? Artif. Intell. 322 (2023)

  5. Bouayad-Agha, N., Casamayor, G., Wanner, L.: Natural language generation in the context of the semantic web. Semant. Web 5(6), 493–513 (2014)

  6. Colucci, S., Donini, F., Giannini, S., Di Sciascio, E.: Defining and computing least common subsumers in RDF. Web Semant. Sci. Serv. Agents World Wide Web 39, 62–80 (2016)

  7. Colucci, S., Donini, F.M., Di Sciascio, E.: Common subsumbers in RDF. In: Baldoni, M., Baroglio, C., Boella, G., Micalizio, R. (eds.) AI*IA 2013. LNCS (LNAI), vol. 8249, pp. 348–359. Springer, Cham (2013). https://doi.org/10.1007/978-3-319-03524-6_30

  8. Colucci, S., Donini, F.M., Di Sciascio, E.: On the relevance of explanation for RDF resources similarity. In: Babkin, E., Barjis, J., Malyzhenkov, P., Merunka, V., Molhanec, M. (eds.) MOBA 2023. LNBIP, vol. 488, pp. 96–107. Springer, Cham (2023). https://doi.org/10.1007/978-3-031-45010-5_8

  9. Colucci, S., Giannini, S., Donini, F.M., Di Sciascio, E.: A deductive approach to the identification and description of clusters in linked open data. In: Proceedings of the 21st European Conference on Artificial Intelligence (ECAI 2014). IOS Press (2014)

  10. Gatt, A., Krahmer, E.: Survey of the state of the art in natural language generation: core tasks, applications and evaluation. J. Artif. Int. Res. 61(1), 65–170 (2018)

  11. Hitzler, P., Krötzsch, M., Rudolph, S.: Foundations of Semantic Web Technologies. Chapman & Hall/CRC (2009)

  12. Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice-Hall Inc., Upper Saddle River (1988)

  13. Li, J., et al.: Neural entity summarization with joint encoding and weak supervision. In: Proceedings of IJCAI-2020, pp. 1644–1650. ijcai.org (2020)

  14. Michalski, R.S.: Knowledge acquisition through conceptual clustering: a theoretical framework and an algorithm for partitioning data into conjunctive concepts. Int. J. Policy Anal. Inf. Syst. 4, 219–244 (1980)

  15. Miller, T.: Explanation in artificial intelligence: insights from the social sciences. Artif. Intell. 267, 1–38 (2019)

  16. Moshkovitz, M., Dasgupta, S., Rashtchian, C., Frost, N.: Explainable k-means and k-medians clustering. In: Proceedings of the 37th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 119, pp. 7055–7065. PMLR (2020)

  17. Patel-Schneider, P., Arndt, D., Haudebourg, T.: RDF 1.2 semantics, W3C recommendation (2023). https://www.w3.org/TR/rdf12-semantics/

  18. Pérez-Suárez, A., Martínez-Trinidad, J.F., Carrasco-Ochoa, J.A.: A review of conceptual clustering algorithms. Artif. Intell. Rev. 52(2), 1267–1296 (2019)

  19. Pichler, R., Polleres, A., Skritek, S., Woltran, S.: Complexity of redundancy detection on RDF graphs in the presence of rules, constraints, and queries. Semant. Web 4(4), 351–393 (2013)

  20. Ruta, M., Colucci, S., Scioscia, F., Di Sciascio, E., Donini, F.M.: Finding commonalities in RFID semantic streams. Procedia Comput. Sci. 5, 857–864 (2011)

  21. Shadbolt, N., Hall, W., Berners-Lee, T.: The semantic web revisited. IEEE Intell. Syst. 21(3), 96–101 (2006)

  22. Soylu, A., et al.: TheyBuyForYou platform and knowledge graph: expanding horizons in public procurement with open linked data. Semant. Web 13(2) (2022)

  23. Soylu, A., et al.: Towards an ontology for public procurement based on the open contracting data standard. In: Pappas, I.O., Mikalef, P., Dwivedi, Y.K., Jaccheri, L., Krogstie, J., Mäntymäki, M. (eds.) I3E 2019. LNCS, vol. 11701, pp. 230–237. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-29374-1_19

  24. Vougiouklis, P., et al.: Neural Wikipedian: generating textual summaries from knowledge base triples. J. Web Semant. 52–53, 1–15 (2018)

Acknowledgements

We acknowledge support by project “LIFE: the itaLian system wIde Frailty nEtwork” funded by the Ministry of Health (CUP D93C22000640001).

Author information

Corresponding author

Correspondence to Simona Colucci.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Colucci, S., Donini, F.M., Di Sciascio, E. (2024). Explaining Commonalities of Clusters of RDF Resources in Natural Language. In: Appice, A., Azzag, H., Hacid, MS., Hadjali, A., Ras, Z. (eds) Foundations of Intelligent Systems. ISMIS 2024. Lecture Notes in Computer Science, vol 14670. Springer, Cham. https://doi.org/10.1007/978-3-031-62700-2_15

  • DOI: https://doi.org/10.1007/978-3-031-62700-2_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-62699-9

  • Online ISBN: 978-3-031-62700-2

  • eBook Packages: Computer Science, Computer Science (R0)
