Abstract
We present Dedalo, a framework that exploits Linked Data to generate explanations for clusters. In general, the results of a Knowledge Discovery process, including clusters, are interpreted by human experts who use their background knowledge to explain them. For someone without such expert knowledge, however, those results may be difficult to understand, and obtaining a complete and satisfactory explanation becomes a laborious and time-consuming process that may require expertise in several domains. The Web of Data not only contains vast amounts of such background knowledge but also natively connects those domains. While Linked Data can reduce the effort put into interpretation, automatically accessing the right piece of knowledge in such a large space remains an issue. Dedalo dynamically traverses Linked Data to find commonalities that form explanations for the items of a cluster. We have developed different strategies (or heuristics) to guide this traversal, reducing the time needed to reach the best explanation. In our experiments, we compare those strategies and demonstrate that Dedalo finds relevant and sophisticated Linked Data explanations from different areas.
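The idea of traversing a graph to find a commonality shared by the items of a cluster can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the toy graph, the names `follow`, `f_measure` and `explain`, and the use of the F-measure to score a candidate explanation. It also expands paths in plain breadth-first order, as a placeholder for the heuristics that guide Dedalo's traversal.

```python
from collections import defaultdict

# Hypothetical toy graph: subject -> list of (property, object) pairs.
GRAPH = {
    "a": [("topic", "t1")],
    "b": [("topic", "t2")],
    "c": [("topic", "t3")],
    "d": [("topic", "t3")],
    "t1": [("broader", "science")],
    "t2": [("broader", "science")],
    "t3": [("broader", "art")],
}


def follow(entity, path):
    """Return the set of values reached from `entity` along a property path."""
    frontier = {entity}
    for prop in path:
        frontier = {o for s in frontier for p, o in GRAPH.get(s, []) if p == prop}
    return frontier


def f_measure(covered, cluster):
    """Score how well the items sharing a value match the cluster (assumed metric)."""
    tp = len(covered & cluster)
    if tp == 0:
        return 0.0
    precision = tp / len(covered)
    recall = tp / len(cluster)
    return 2 * precision * recall / (precision + recall)


def explain(items, cluster, max_depth=2):
    """Breadth-first search for the (path, value) pair that best matches the cluster."""
    best = (0.0, None, None)
    paths = [()]
    for _ in range(max_depth):
        next_paths = []
        for path in paths:
            # Collect the properties that leave the values at the end of this path.
            props = set()
            for item in items:
                for value in follow(item, path):
                    props.update(p for p, _ in GRAPH.get(value, []))
            for prop in sorted(props):
                new_path = path + (prop,)
                # Group the items by the value they reach through the longer path.
                by_value = defaultdict(set)
                for item in items:
                    for value in follow(item, new_path):
                        by_value[value].add(item)
                for value, covered in sorted(by_value.items()):
                    score = f_measure(covered, cluster)
                    if score > best[0]:
                        best = (score, new_path, value)
                next_paths.append(new_path)
        paths = next_paths
    return best


score, path, value = explain({"a", "b", "c", "d"}, {"a", "b"})
print(score, path, value)
```

In this toy setting no single `topic` value covers both items of the cluster, so the search has to go one step further to find that they share the `broader` value `science`. A guided strategy would decide which paths to expand first instead of expanding all of them.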
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Tiddi, I., d’Aquin, M., Motta, E. (2014). Dedalo: Looking for Clusters Explanations in a Labyrinth of Linked Data. In: Presutti, V., d’Amato, C., Gandon, F., d’Aquin, M., Staab, S., Tordai, A. (eds) The Semantic Web: Trends and Challenges. ESWC 2014. Lecture Notes in Computer Science, vol 8465. Springer, Cham. https://doi.org/10.1007/978-3-319-07443-6_23
DOI: https://doi.org/10.1007/978-3-319-07443-6_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07442-9
Online ISBN: 978-3-319-07443-6
eBook Packages: Computer Science (R0)