Improving RDF Data Through Association Rule Mining

  • Schwerpunktbeitrag (focus article)
  • Journal: Datenbank-Spektrum

Abstract

Linked Open Data comprises a large number of public data sets, many of them large, which are mostly represented in the RDF triple structure of subject, predicate, and object. However, the heterogeneity of available open data requires significant integration steps before it can be used in applications. A promising and novel technique for exploring such data is association rule mining. We introduce “mining configurations”, which allow us to mine RDF data sets in various ways. Different configurations enable us to identify schema and value dependencies that, in combination, lead to interesting use cases. We present rule-based approaches for predicate suggestion, data enrichment, ontology improvement, and query relaxation. On the one hand, we prevent inconsistencies in the data through predicate suggestion, enrichment with missing facts, and alignment of the corresponding ontology. On the other hand, we help users handle inconsistencies during query formulation through predicate expansion techniques. Based on these approaches, we show that association rule mining benefits the integration and usability of RDF data.
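To make the idea of a mining configuration concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a configuration picks one part of the triple as the context (which groups triples into transactions) and another part as the target (whose values become the items), and it mines only rules with a single item on each side. The toy triples, the function names `build_transactions`, `mine_pair_rules`, and `suggest_predicates`, and the thresholds are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical toy triples (subject, predicate, object); not data from the article.
TRIPLES = [
    ("ex:Alice", "rdf:type", "foaf:Person"),
    ("ex:Alice", "foaf:name", "Alice"),
    ("ex:Alice", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:Bob", "rdf:type", "foaf:Person"),
    ("ex:Bob", "foaf:name", "Bob"),
    ("ex:Bob", "foaf:mbox", "mailto:bob@example.org"),
    ("ex:Carol", "rdf:type", "foaf:Person"),
    ("ex:Carol", "foaf:name", "Carol"),
    ("ex:Berlin", "rdf:type", "ex:City"),
    ("ex:Berlin", "ex:population", "3500000"),
]

POSITIONS = {"subject": 0, "predicate": 1, "object": 2}


def build_transactions(triples, context, target):
    """Group the target values of all triples by their context value."""
    c, t = POSITIONS[context], POSITIONS[target]
    transactions = defaultdict(set)
    for triple in triples:
        transactions[triple[c]].add(triple[t])
    return dict(transactions)


def mine_pair_rules(transactions, min_support, min_confidence):
    """Return single-item rules a -> b as (a, b, support, confidence) tuples."""
    n = len(transactions)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in transactions.values():
        for item in items:
            item_count[item] += 1
        # Count every ordered pair of distinct items that co-occur.
        for a, b in permutations(sorted(items), 2):
            pair_count[(a, b)] += 1
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n               # share of transactions containing a and b
        confidence = count / item_count[a]  # share of a-transactions that also contain b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)


def suggest_predicates(rules, present):
    """Suggest items implied by rules whose antecedent is already present."""
    return sorted({b for a, b, _, _ in rules if a in present and b not in present})


# Assumed configuration: subjects form the transactions, predicates are the items.
transactions = build_transactions(TRIPLES, context="subject", target="predicate")
rules = mine_pair_rules(transactions, min_support=0.5, min_confidence=0.6)
print(suggest_predicates(rules, transactions["ex:Carol"]))
```

With these toy triples, the rule from `foaf:name` to `foaf:mbox` passes both thresholds, so `foaf:mbox` is suggested for `ex:Carol`. Calling `build_transactions` with a different `context` and `target` pair mines the same triples under another configuration. A production setting would more likely use a full frequent itemset algorithm than this pairwise count.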

Author information

Corresponding author

Correspondence to Ziawasch Abedjan.

About this article

Cite this article

Abedjan, Z., Naumann, F. Improving RDF Data Through Association Rule Mining. Datenbank Spektrum 13, 111–120 (2013). https://doi.org/10.1007/s13222-013-0126-x

  • DOI: https://doi.org/10.1007/s13222-013-0126-x
