KYOTO: A Knowledge-Rich Approach to the Interoperable Mining of Events from Text

Chapter in: New Trends of Research in Ontologies and Lexical Resources

Abstract

Extracting events and their participants is a crucial step in automatically understanding text. A language can package the same event in many different ways, and capturing all of them with sufficient precision is a major challenge. The problem becomes more complex still when we consider texts on the same topic in different languages. We describe a knowledge-rich event-mining system developed for the Asian-European project KYOTO that extracts events in a uniform and interoperable way, regardless of how and in which language they are expressed. To achieve this, we developed an open text representation format, semantic processing modules, and a central ontology that is shared across seven languages. We implemented a semantic tagging approach that performs off-line reasoning, and a module that detects semantic and linguistic patterns in the tagged data to extract events from a large variety of expressions. The system can efficiently handle large volumes of documents and is not restricted to a specific domain. We applied the system to an English text on estuaries.
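The two stages named in the abstract, semantic tagging followed by pattern detection over the tagged data, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `ONTOLOGY` lexicon, the class names `Process`, `Substance` and `Place`, the window-based pattern, and the functions `tag` and `extract_events` are hypothetical and are not taken from the KYOTO ontology, its representation format, or its pattern module.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon mapping lemmas to ontology classes.
# These entries are illustrative only, not KYOTO's actual mappings.
ONTOLOGY = {
    "pollute": "Process",
    "pollution": "Process",
    "nitrogen": "Substance",
    "fertilizer": "Substance",
    "estuary": "Place",
    "bay": "Place",
}


@dataclass(frozen=True)
class Event:
    process: str
    participants: tuple  # tuple of (lemma, ontology class) pairs


def tag(lemmas):
    """Stage 1: attach an ontology class (or None) to each lemma."""
    return [(lemma, ONTOLOGY.get(lemma)) for lemma in lemmas]


def extract_events(tagged, window=3):
    """Stage 2: for each Process token, collect tagged non-process
    tokens within `window` positions as its participants."""
    events = []
    for i, (lemma, cls) in enumerate(tagged):
        if cls != "Process":
            continue
        lo = max(0, i - window)
        hi = min(len(tagged), i + window + 1)
        participants = tuple(
            (other, other_cls)
            for j, (other, other_cls) in enumerate(tagged[lo:hi], start=lo)
            if j != i and other_cls not in (None, "Process")
        )
        events.append(Event(process=lemma, participants=participants))
    return events


# A verbal and a nominal packaging of the same event.
verbal = extract_events(tag(["nitrogen", "pollute", "the", "bay"]))
nominal = extract_events(
    tag(["pollution", "of", "the", "bay", "by", "nitrogen"]), window=5
)
```

Because the pattern is stated over ontology classes rather than surface words, the verbal and the nominal phrasing yield the same set of participants. The same idea would carry over to another language if its lemmas were mapped to the same classes, which is the role the abstract assigns to the shared ontology.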

Notes

  1. www.kyoto-project.eu

  2. http://www.sp2000.org/

  3. DOLCE-Lite-Plus version 3.9.7

  4. http://adimen.si.ehu.es/web/BLC

  5. This knowledge model is freely available through the KYOTO website as open-source data.

  6. The mapping relations from wordnet to the ontology need to satisfy the constraints of the ontology, i.e. only those roles can be expressed that are compatible with the role schema of the process in which they participate.

  7. www.acb-online.org/pubs/BayBarometer2008Web.pdf

  8. For cross-lingual retrieval, the lemmas have been translated into all the other languages in KYOTO, using the equivalences in the wordnets. The databases can be searched in any of the languages, and the results are rendered in the query language, regardless of the source language of the information.

  9. http://simile-widgets.org/wiki/Exhibit

  10. To search the Estuary database, follow this URL and log in with any name and any password: http://kyoto.irion.nl/kyoto/web/init.do?project=estuary_en&database=2&queryLg=en&query=

  11. http://www.let.rug.nl/vannoord/alp/Alpino/

Acknowledgements

The KYOTO project is co-funded by the EU FP7 ICT Work Programme 2007 under Challenge 4, Digital Libraries and Content, Objective ICT-2007.4.2 (ICT-2007.4.4): Intelligent Content and Semantics (challenge 4.2). The Asian partners from Taipei and Kyoto are funded from national funds. This work has also been supported by the Spanish project KNOW-2 (TIN2009-14715-C04-01).

Author information

Correspondence to Piek Vossen.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Vossen, P., Agirre, E., Rigau, G., Soroa, A. (2013). KYOTO: A Knowledge-Rich Approach to the Interoperable Mining of Events from Text. In: Oltramari, A., Vossen, P., Qin, L., Hovy, E. (eds) New Trends of Research in Ontologies and Lexical Resources. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31782-8_5

  • DOI: https://doi.org/10.1007/978-3-642-31782-8_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31781-1

  • Online ISBN: 978-3-642-31782-8

  • eBook Packages: Computer Science, Computer Science (R0)
