The Impact of Enriched Linguistic Annotation on the Performance of Extracting Relation Triples

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2004)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2945))

Abstract

A relation extraction system recognises pre-defined relation types between two identified entities in natural language documents. This is important for the task of automatically locating missing instances in a knowledge base, where each instance is represented as a triple (‘entity – relation – entity’). A relation entry specifies a set of rules associated with the syntactic and semantic conditions under which appropriate relations would be extracted. Manually creating such rules requires knowledge from information experts; moreover, it is a time-consuming and error-prone task when the input sentences have little consistency in structure and vocabulary. In this paper, we present an approach that applies a symbolic learning algorithm to sentences in order to automatically induce extraction rules, which then successfully classify new sentences. The proposed approach takes into account semantic attributes (e.g., semantically close words and named entities) when generalising common patterns among the sentences, which enables the system to cope better with syntactically different but semantically similar sentences. Not only does this increase the number of relations extracted, but it also improves extraction accuracy by adding features that might not be discovered with syntactic analysis alone. Experimental results show that this approach is effective on sentences from Web documents, obtaining 17% higher precision and 34% higher recall.
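
To make the idea of a triple and of a rule with syntactic and semantic conditions more concrete, the following is a minimal sketch in Python. It is not the system described in the paper: the class names (`Entity`, `Sentence`, `Rule`, `Triple`), the relation name `studied_at`, the entity types, and the example sentences are all invented for illustration, and the rule is written by hand rather than induced by a learning algorithm. The sentences are assumed to be already annotated with named-entity types and a verb lemma.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Entity:
    text: str
    ne_type: str  # named-entity type (hypothetical labels, e.g. "Person")


@dataclass(frozen=True)
class Sentence:
    # A pre-annotated sentence reduced to subject, main verb lemma and object.
    subject: Entity
    verb_lemma: str
    obj: Entity


@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    obj: str


@dataclass(frozen=True)
class Rule:
    relation: str
    subject_type: str             # semantic condition on the first entity
    object_type: str              # semantic condition on the second entity
    verb_lemmas: FrozenSet[str]   # semantically close verbs accepted by the rule

    def apply(self, s: Sentence) -> Optional[Triple]:
        # The rule fires only if every condition holds.
        if (s.subject.ne_type == self.subject_type
                and s.obj.ne_type == self.object_type
                and s.verb_lemma in self.verb_lemmas):
            return Triple(s.subject.text, self.relation, s.obj.text)
        return None


def extract(rules: List[Rule], sentences: List[Sentence]) -> List[Triple]:
    # Try every rule on every sentence and collect the triples that fire.
    triples = []
    for s in sentences:
        for r in rules:
            t = r.apply(s)
            if t is not None:
                triples.append(t)
    return triples


# Invented example data (assumption, not taken from the paper).
rule = Rule("studied_at", "Person", "Organization",
            frozenset({"study", "train", "enrol"}))
sentences = [
    Sentence(Entity("Alice", "Person"), "study",
             Entity("Example University", "Organization")),
    Sentence(Entity("Bob", "Person"), "train",
             Entity("Example Academy", "Organization")),
    Sentence(Entity("Bob", "Person"), "visit",
             Entity("Paris", "Location")),
]
triples = extract([rule], sentences)
for t in triples:
    print(t)
```

In this sketch the first two sentences use different verbs but yield the same relation, because the rule accepts a set of semantically close verbs; the third sentence is rejected because both the verb and the object type fail the rule's conditions.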

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, S., Lewis, P., Martinez, K. (2004). The Impact of Enriched Linguistic Annotation on the Performance of Extracting Relation Triples. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2004. Lecture Notes in Computer Science, vol 2945. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24630-5_68

  • DOI: https://doi.org/10.1007/978-3-540-24630-5_68

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21006-1

  • Online ISBN: 978-3-540-24630-5

  • eBook Packages: Springer Book Archive
