Abstract
Extracting events and their participants is a crucial step in automatic text understanding. A language can package the same event in many different ways, and capturing all of them with sufficient precision is a major challenge. The challenge grows when we consider texts on the same topic in different languages. We describe a knowledge-rich event-mining system, developed for the Asian-European project KYOTO, that extracts events in a uniform and interoperable way, regardless of how and in which language they are expressed. To achieve this, we developed an open text representation format, semantic processing modules, and a central ontology that is shared across seven languages. We implemented a semantic tagging approach that performs off-line reasoning and a module that detects semantic and linguistic patterns in the tagged data to extract events from a large variety of expressions. The system can efficiently handle large volumes of documents and is not restricted to a specific domain. We applied the system to an English text on estuaries.
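The idea of tagging text against a shared ontology and then matching patterns over the tags can be pictured with a toy sketch. Everything in it is an assumption made for illustration: the `ONTOLOGY` table, the concept names, and the window-based pattern are hypothetical and far simpler than the KYOTO modules. It only shows how a verbal and a nominal expression could end up as the same event record.

```python
# Hypothetical lemma-to-ontology table. KYOTO derives such tags from wordnets
# mapped to its central ontology; this dictionary only imitates that step.
ONTOLOGY = {
    "pollute": ("process", "Pollution"),
    "pollution": ("process", "Pollution"),
    "river": ("participant", "River"),
    "estuary": ("participant", "Estuary"),
}


def tag(lemmas):
    """Return the (kind, concept) tag for each lemma, or None if unknown."""
    return [ONTOLOGY.get(lemma) for lemma in lemmas]


def extract_events(lemmas, window=3):
    """Pair each process-tagged lemma with participant concepts found nearby.

    The fixed window is a stand-in for real semantic and linguistic patterns.
    """
    tags = tag(lemmas)
    events = []
    for i, entry in enumerate(tags):
        if entry is None or entry[0] != "process":
            continue
        nearby = tags[max(0, i - window): i + window + 1]
        participants = tuple(
            concept for kind, concept in filter(None, nearby) if kind == "participant"
        )
        events.append((entry[1], participants))
    return events


# A verbal and a nominal packaging of the same event.
verbal = extract_events(["factory", "pollute", "the", "river"])
nominal = extract_events(["pollution", "of", "the", "river"])
```

Because both lemmas map to the same hypothetical concept, `verbal` and `nominal` hold identical event records, which is the kind of uniformity the abstract describes.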
Notes
- 1.
- 2.
- 3.
DOLCE-Lite-Plus version 3.9.7
- 4.
- 5.
This knowledge model is freely available through the KYOTO website as open-source data.
- 6.
The mapping relations from wordnet to the ontology need to satisfy the constraints of the ontology, i.e., only roles can be expressed that are compatible with the role-schema of the process in which they participate.
- 7.
- 8.
For cross-lingual retrieval, the lemmas have been translated into all the other languages in KYOTO, using the equivalences in the wordnets. The databases can be searched in any of the languages, and the results are rendered in the query language, regardless of the source language of the information.
- 9.
- 10.
Use the following URL to search the Estuary database; log in with any name and any password: http://kyoto.irion.nl/kyoto/web/init.do?project=estuary_en&database=2&queryLg=en&query=
- 11.
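The constraint in note 6 can be read as a filter over candidate mappings: a role survives only if the role-schema of its process admits it. The sketch below is a minimal illustration under that reading. The `ROLE_SCHEMA` table, the role names, and the `valid_mappings` helper are all hypothetical and are not taken from the KYOTO ontology.

```python
# Hypothetical role schemas: the set of roles each process class admits.
ROLE_SCHEMA = {
    "Migration": {"done-by", "has-source", "has-destination"},
    "Pollution": {"done-by", "patient"},
}


def valid_mappings(mappings):
    """Keep (word, role, process) triples whose role fits the process schema.

    Triples that name an unknown process are dropped as well.
    """
    return [
        (word, role, process)
        for word, role, process in mappings
        if role in ROLE_SCHEMA.get(process, set())
    ]


candidates = [
    ("migratory bird", "done-by", "Migration"),
    ("migratory bird", "patient", "Migration"),  # role not in the schema
    ("polluted water", "patient", "Pollution"),
]
accepted = valid_mappings(candidates)
```

Here the second candidate is rejected because the hypothetical `Migration` schema has no `patient` role.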
Acknowledgements
The KYOTO project is co-funded by the EU FP7 ICT Work Programme 2007 under Challenge 4, Digital Libraries and Content, Objective ICT-2007.4.2 (ICT-2007.4.4): Intelligent Content and Semantics (challenge 4.2). The Asian partners from Taipei and Kyoto are funded from national funds. This work has also been supported by the Spanish project KNOW-2 (TIN2009-14715-C04-01).
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Vossen, P., Agirre, E., Rigau, G., Soroa, A. (2013). KYOTO: A Knowledge-Rich Approach to the Interoperable Mining of Events from Text. In: Oltramari, A., Vossen, P., Qin, L., Hovy, E. (eds) New Trends of Research in Ontologies and Lexical Resources. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31782-8_5
DOI: https://doi.org/10.1007/978-3-642-31782-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31781-1
Online ISBN: 978-3-642-31782-8
eBook Packages: Computer Science, Computer Science (R0)