DOI: 10.1145/1988688.1988693
research-article

Creating knowledge out of interlinked data: making the web a data washing machine

Published: 25 May 2011

Abstract

Over the past four years, the Semantic Web activity has gained momentum with the widespread publishing of structured data as RDF. The Linked Data paradigm has therefore evolved from a practical research idea into a very promising candidate for addressing one of the biggest challenges of the Semantic Web vision: the exploitation of the Web as a platform for data and information integration. To translate this initial success into a world-scale reality, a number of research challenges need to be addressed: the performance gap between relational and RDF data management has to be closed, the coherence and quality of data published on the Web have to be improved, provenance and trust on the Linked Data Web must be established, and, more generally, the entrance barrier for data publishers and users has to be lowered. We discuss approaches for tackling these challenges and their integration into a mutual refinement cycle: the linked data 'washing machine'.

Published In

WIMS '11: Proceedings of the International Conference on Web Intelligence, Mining and Semantics
May 2011
563 pages
ISBN:9781450301480
DOI:10.1145/1988688
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data web
  2. linked data
  3. semantic web

Qualifiers

  • Research-article

Conference

WIMS '11

Acceptance Rates

Overall Acceptance Rate 140 of 278 submissions, 50%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 7
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 25 Nov 2024

Cited By

  • (2019) The Linked Data Wiki: Leveraging Organizational Knowledge Bases with Linked Open Data. Primate Life Histories, Sex Roles, and Adaptability, pp. 294-319. DOI: 10.1007/978-3-030-15640-4_15. Online publication date: 15-Mar-2019
  • (2018) A systematic review on the use of best practices for publishing linked data. Online Information Review 42:1, pp. 107-123. DOI: 10.1108/OIR-11-2016-0322. Online publication date: 12-Feb-2018
  • (2018) Open Data Interoperability. The World of Open Data, pp. 75-93. DOI: 10.1007/978-3-319-90850-2_5. Online publication date: 22-Sep-2018
  • (2016) Semantic metadata in the publishing industry – technological achievements and economic implications. Electronic Markets 27:1, pp. 9-20. DOI: 10.1007/s12525-016-0238-x. Online publication date: 15-Nov-2016
  • (2015) Parallel mining of OWL 2 EL ontology from large linked datasets. Knowledge-Based Systems 84:C, pp. 10-17. DOI: 10.1016/j.knosys.2015.03.023. Online publication date: 1-Aug-2015
  • (2015) Ontology usage analysis in the ontology lifecycle. Knowledge-Based Systems 80:C, pp. 34-47. DOI: 10.1016/j.knosys.2015.02.026. Online publication date: 1-May-2015
  • (2014) Introduction to Linked Data and Its Lifecycle on the Web. Reasoning Web. Reasoning on the Web in the Big Data Era, pp. 1-99. DOI: 10.1007/978-3-319-10587-1_1. Online publication date: 2014
  • (2013) Introduction to linked data and its lifecycle on the web. Proceedings of the 9th international conference on Reasoning Web: semantic technologies for intelligent data access, pp. 1-90. DOI: 10.1007/978-3-642-39784-4_1. Online publication date: 30-Jul-2013
  • (2013) Leveraging the Crowdsourcing of Lexical Resources for Bootstrapping a Linguistic Data Cloud. Semantic Technology, pp. 191-206. DOI: 10.1007/978-3-642-37996-3_13. Online publication date: 2013
  • (2012) Semantic metadata in the news production process. Proceeding of the 16th International Academic MindTrek Conference, pp. 125-133. DOI: 10.1145/2393132.2393158. Online publication date: 3-Oct-2012
