Abstract
This paper discusses the development of Semantic Annotations, a form of metadata that assigns conceptual entities to textual instances, in this case within archaeological grey literature. Information Extraction (IE), a Natural Language Processing (NLP) technique, is central to the annotation process. The paper explores the use of Ontology-Oriented Information Extraction (OOIE) methods to define rich, semantically aware indices of archaeological documents. The annotation process follows a rule-based information extraction approach using GATE. In particular, the paper discusses a prototype that adopts the core ontology CIDOC CRM, together with an English Heritage archaeological extension, to inform and direct the information extraction effort. The prototype evaluation supports the assumptions made about the capability of the method to construct rich indices of grey literature documents empowered by Semantic Annotations.
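To make the idea of rule-based, ontology-informed annotation concrete, the following is a minimal sketch in Python. It is not the authors' GATE/JAPE pipeline: the gazetteer terms, the mapping of each term to a CIDOC CRM class label, and the names `GAZETTEER`, `Annotation`, `annotate`, and `build_index` are all assumptions made for illustration. It shows only the general pattern of matching vocabulary terms in text, attaching a conceptual entity to each match, and collecting the results into a concept-keyed index.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-gazetteer. The term-to-class mapping is an illustrative
# assumption, not the vocabulary or mapping used in the paper.
GAZETTEER = {
    "pit": "E53 Place",
    "ditch": "E53 Place",
    "pottery": "E19 Physical Object",
    "coin": "E19 Physical Object",
    "roman": "E49 Time Appellation",
    "medieval": "E49 Time Appellation",
}

# One whole-word, case-insensitive pattern covering every gazetteer term.
TERM_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in GAZETTEER) + r")\b",
    re.IGNORECASE,
)


@dataclass(frozen=True)
class Annotation:
    start: int    # character offset where the matched span begins
    end: int      # character offset just past the matched span
    text: str     # the matched surface form
    concept: str  # ontology class label assigned to the span


def annotate(text):
    """Return one Annotation per gazetteer term found in the text."""
    return [
        Annotation(m.start(), m.end(), m.group(0), GAZETTEER[m.group(0).lower()])
        for m in TERM_PATTERN.finditer(text)
    ]


def build_index(docs):
    """Map each concept label to the set of document ids that mention it."""
    index = {}
    for doc_id, text in docs.items():
        for ann in annotate(text):
            index.setdefault(ann.concept, set()).add(doc_id)
    return index


if __name__ == "__main__":
    reports = {
        "report-1": "Roman pottery was recovered from the pit.",
        "report-2": "A medieval ditch was recorded.",
    }
    for concept, doc_ids in sorted(build_index(reports).items()):
        print(concept, sorted(doc_ids))
```

A real system of the kind the abstract describes would rely on much richer rules and knowledge resources than a flat term list; the sketch only illustrates how annotations that carry ontology classes can serve as index entries.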
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vlachidis, A., Tudhope, D. (2011). Semantic Annotation for Indexing Archaeological Context: A Prototype Development and Evaluation. In: García-Barriocanal, E., Cebeci, Z., Okur, M.C., Öztürk, A. (eds) Metadata and Semantic Research. MTSR 2011. Communications in Computer and Information Science, vol 240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24731-6_37
DOI: https://doi.org/10.1007/978-3-642-24731-6_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24730-9
Online ISBN: 978-3-642-24731-6
eBook Packages: Computer Science (R0)