Abstract
A relation extraction system recognises pre-defined relation types between two identified entities in natural language documents. It is important for the task of automatically locating missing instances in a knowledge base, where each instance is represented as a triple (‘entity – relation – entity’). A relation entry specifies a set of rules associated with the syntactic and semantic conditions under which appropriate relations are extracted. Creating such rules manually requires knowledge from information experts; moreover, it is a time-consuming and error-prone task when the input sentences have little consistency in structure and vocabulary. In this paper, we present an approach that applies a symbolic learning algorithm to sentences in order to automatically induce extraction rules, which are then used to classify new sentences. The proposed approach takes semantic attributes (e.g., semantically close words and named entities) into account when generalising common patterns among the sentences, which enables the system to cope better with syntactically different but semantically similar sentences. Not only does this increase the number of relations extracted, but it also improves the accuracy of extraction by adding features that might not be discovered with syntactic analysis alone. Experimental results show that this approach is effective on sentences from Web documents, obtaining 17% higher precision and 34% higher recall.
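The idea of generalising over semantic attributes can be illustrated with a small sketch. It is not the symbolic learner used in the paper: it swaps in a much simpler technique (keeping only the attribute-value pairs shared by all positive examples) and a hypothetical toy lexicon, `SEMANTIC_CLASS`, in place of a real lexical resource. The feature names, entity types, the relation name `created`, and all function names are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy lexicon mapping verbs to a shared semantic class.
# A real system would consult a lexical resource instead.
SEMANTIC_CLASS = {
    "painted": "create",
    "drew": "create",
    "sculpted": "create",
    "visited": "travel",
}

Features = Dict[str, str]
Triple = Tuple[str, str, str]


def featurise(subj_type: str, verb: str, obj_type: str,
              use_semantics: bool) -> Features:
    """Describe a sentence by its entity types and its verb."""
    features = {"subj_type": subj_type, "verb": verb, "obj_type": obj_type}
    if use_semantics:
        # Semantic attribute: the class of the verb, not its surface form.
        features["verb_class"] = SEMANTIC_CLASS.get(verb, "unknown")
    return features


def induce_rule(examples: List[Features]) -> Features:
    """Keep only the attribute-value pairs shared by every positive example."""
    rule = dict(examples[0])
    for example in examples[1:]:
        rule = {k: v for k, v in rule.items() if example.get(k) == v}
    return rule


def matches(rule: Features, candidate: Features) -> bool:
    """A candidate is covered when it satisfies every condition in the rule."""
    return all(candidate.get(k) == v for k, v in rule.items())


def extract(rule: Features, relation: str, subj: str, obj: str,
            candidate: Features) -> Optional[Triple]:
    """Emit an 'entity - relation - entity' triple if the rule covers the sentence."""
    return (subj, relation, obj) if matches(rule, candidate) else None


# Two positive training sentences with different verbs.
train = [
    featurise("Person", "painted", "Artwork", use_semantics=True),
    featurise("Person", "drew", "Artwork", use_semantics=True),
]
rule = induce_rule(train)

# An unseen sentence whose verb never occurred in training.
unseen = featurise("Person", "sculpted", "Artwork", use_semantics=True)
triple = extract(rule, "created", "Rodin", "The Thinker", unseen)
```

In this toy setting, the rule induced without the `verb_class` attribute keeps only the two entity types, so it would also accept a sentence such as "Person visited Artwork". With the semantic attribute, the rule still covers the unseen verb "sculpted" but rejects "visited".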
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, S., Lewis, P., Martinez, K. (2004). The Impact of Enriched Linguistic Annotation on the Performance of Extracting Relation Triples. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2004. Lecture Notes in Computer Science, vol 2945. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24630-5_68
DOI: https://doi.org/10.1007/978-3-540-24630-5_68
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21006-1
Online ISBN: 978-3-540-24630-5
eBook Packages: Springer Book Archive