Reusing Linguistic Resources: Tasks and Goals for a Linked Data Approach

Marieke van Erp

Abstract

There is a need to share linguistic resources, but reuse is impaired by a number of constraints including lack of common formats, differences in conceptual notions, and unsystematic metadata. In this contribution, the five most important constraints and the tasks necessary to overcome these issues are detailed. These constraints lie in the design of linguistic resources, the way they are marked up and their metadata. These issues have also come up in a domain other than linguistics, namely in the semantic web, where the Linked Data approach proved useful. Experiences and lessons learnt from that domain are discussed in the light of standardisation and reconciliation of concepts and representations of linguistic annotations.

Author information

Authors and Affiliations

Web and Media Group, Computer Sciences Department, VU University, De Boelelaan 1081a, 1081 HV, Amsterdam, The Netherlands
Marieke van Erp

Corresponding author

Correspondence to Marieke van Erp.

Editor information

Editors and Affiliations

Information Sciences Institute, University of Southern California, Admiralty Way 4676, Marina del Rey, 90292, California, USA
Christian Chiarcos
Department of Linguistics, Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, Leipzig, 04103, Germany
Sebastian Nordhoff
Business Information Systems, University of Leipzig, Johannisgasse 26, Leipzig, 04103, Germany
Sebastian Hellmann

About this chapter

Cite this chapter

van Erp, M. (2012). Reusing Linguistic Resources: Tasks and Goals for a Linked Data Approach. In: Chiarcos, C., Nordhoff, S., Hellmann, S. (eds) Linked Data in Linguistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28249-2_6

DOI: https://doi.org/10.1007/978-3-642-28249-2_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28248-5
Online ISBN: 978-3-642-28249-2
eBook Packages: Computer Science (R0)
