Improving RDF Data Through Association Rule Mining

Ziawasch Abedjan¹ & Felix Naumann¹

Abstract

Linked Open Data comprises a large number of public data sets, many of them large themselves, which are mostly represented in the RDF triple structure of subject, predicate, and object. However, the heterogeneity of available open data requires significant integration effort before the data can be used in applications. A promising and novel technique for exploring such data is association rule mining. We introduce “mining configurations”, which allow us to mine RDF data sets in various ways. Different configurations enable us to identify schema and value dependencies that, in combination, lead to interesting use cases. We present rule-based approaches for predicate suggestion, data enrichment, ontology improvement, and query relaxation. On the one hand, we prevent inconsistencies in the data through predicate suggestion, enrichment with missing facts, and alignment of the corresponding ontology. On the other hand, we help users handle inconsistencies during query formulation through predicate expansion techniques. Based on these approaches, we show that association rule mining benefits the integration and usability of RDF data.
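
As a rough illustration of the idea, the following Python sketch assumes that a mining configuration picks one part of each triple (here the subject) as the context that groups facts into transactions, and another part (here the predicate) as the target items. That reading, the toy triples, the thresholds, and all function names are assumptions made for this example; the abstract does not spell out the authors' algorithm, and the sketch only mines rules between single items rather than full itemsets.

```python
from collections import defaultdict
from itertools import permutations

# Toy triples (subject, predicate, object); invented purely for illustration.
TRIPLES = [
    ("Berlin", "population", "3.6M"),
    ("Berlin", "country", "Germany"),
    ("Berlin", "mayor", "Mayor A"),
    ("Paris", "population", "2.1M"),
    ("Paris", "country", "France"),
    ("Paris", "mayor", "Mayor B"),
    ("Potsdam", "population", "0.18M"),
    ("Potsdam", "country", "Germany"),
    ("Elbe", "length", "1094 km"),
    ("Elbe", "country", "Germany"),
]

POSITIONS = {"subject": 0, "predicate": 1, "object": 2}


def build_transactions(triples, context, target):
    """Assumed 'mining configuration': group target values by context value."""
    c, t = POSITIONS[context], POSITIONS[target]
    transactions = defaultdict(set)
    for triple in triples:
        transactions[triple[c]].add(triple[t])
    return dict(transactions)


def mine_pair_rules(transactions, min_support, min_confidence):
    """Return sorted rules (lhs, rhs, support, confidence) between single items."""
    n = len(transactions)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in transactions.values():
        for item in items:
            item_count[item] += 1
        # Ordered pairs, so both a -> b and b -> a are counted.
        for a, b in permutations(sorted(items), 2):
            pair_count[(a, b)] += 1
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n                # share of transactions with a and b
        confidence = count / item_count[a]  # share of a-transactions that have b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)


def suggest_predicates(used, rules):
    """Suggest predicates implied by those already used, best confidence first."""
    best = {}
    for lhs, rhs, _support, confidence in rules:
        if lhs in used and rhs not in used:
            best[rhs] = max(best.get(rhs, 0.0), confidence)
    return sorted(best, key=lambda p: (-best[p], p))


if __name__ == "__main__":
    tx = build_transactions(TRIPLES, context="subject", target="predicate")
    rules = mine_pair_rules(tx, min_support=0.5, min_confidence=0.8)
    for lhs, rhs, support, confidence in rules:
        print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
    print(suggest_predicates({"mayor"}, rules))
```

Under the same assumption, swapping the arguments (for example, predicates as context and objects as target) would expose dependencies between values rather than between schema elements, which is one way to read the "schema and value dependencies" mentioned in the abstract.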

Author information

Authors and Affiliations

Hasso Plattner Institute, Potsdam, Germany
Ziawasch Abedjan & Felix Naumann

Corresponding author

Correspondence to Ziawasch Abedjan.

About this article

Cite this article

Abedjan, Z., Naumann, F. Improving RDF Data Through Association Rule Mining. Datenbank Spektrum 13, 111–120 (2013). https://doi.org/10.1007/s13222-013-0126-x

Received: 23 January 2013
Accepted: 25 April 2013
Published: 21 May 2013
Issue Date: July 2013
DOI: https://doi.org/10.1007/s13222-013-0126-x
