Abstract
Our purpose is to extract RDF-style triples from text corpora in an unsupervised way and to use them as preprocessed material for constructing ontologies from scratch. We have worked on a corpus taken from Internet websites that describes the megalithic ruin of Stonehenge. Using a shallow parser, we select functional relations, such as the syntactic structure subject-verb-object. The selection relies on prepositional structures and frequency measures in order to retain the most relevant triples. The paper therefore stresses the choice of patterns and the filtering carried out to discard all irrelevant structures automatically. At the same time, we are experimenting with a method to objectively evaluate the automatically generated material.
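As a rough illustration of the frequency-based part of this selection, the sketch below collects subject-verb-object triples from already parsed sentences and keeps only those that recur. It is not the authors' implementation: the input format (lists of `(head_word, role)` pairs), the role labels `SUBJ`, `VERB`, `OBJ`, the function names, and the `min_count` threshold are all assumptions made for the example, and the prepositional patterns mentioned in the abstract are not covered.

```python
from collections import Counter


def extract_triples(parsed_sentences):
    """Collect (subject, verb, object) triples from shallow-parsed sentences.

    Each sentence is a list of (head_word, role) pairs. The role labels
    "SUBJ", "VERB" and "OBJ" are a hypothetical stand-in for the output
    of a shallow parser; any other role is ignored.
    """
    triples = []
    for sentence in parsed_sentences:
        roles = {}
        for head, role in sentence:
            # Keep the first chunk seen for each role, lowercased so that
            # "Stones" and "stones" count as the same term.
            roles.setdefault(role, head.lower())
        if all(r in roles for r in ("SUBJ", "VERB", "OBJ")):
            triples.append((roles["SUBJ"], roles["VERB"], roles["OBJ"]))
    return triples


def filter_by_frequency(triples, min_count=2):
    """Keep triples that occur at least min_count times (assumed threshold)."""
    counts = Counter(triples)
    return {t: c for t, c in counts.items() if c >= min_count}


# Toy input, invented for the example (not taken from the Stonehenge corpus).
parsed = [
    [("stones", "SUBJ"), ("form", "VERB"), ("circle", "OBJ")],
    [("Stones", "SUBJ"), ("form", "VERB"), ("circle", "OBJ")],
    [("ditch", "SUBJ"), ("surrounds", "VERB"), ("monument", "OBJ")],
    [("visitors", "SUBJ"), ("arrive", "VERB")],  # no object: not a triple
]

relevant = filter_by_frequency(extract_triples(parsed), min_count=2)
print(relevant)
```

A raw count threshold is the simplest possible frequency measure; the abstract does not specify which measures are used, so any weighting scheme could be substituted in `filter_by_frequency`.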
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Reinberger, ML., Spyns, P. (2005). Generating and Evaluating Triples for Modelling a Virtual Environment. In: Meersman, R., Tari, Z., Herrero, P. (eds) On the Move to Meaningful Internet Systems 2005: OTM 2005 Workshops. OTM 2005. Lecture Notes in Computer Science, vol 3762. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11575863_144
DOI: https://doi.org/10.1007/11575863_144
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29739-0
Online ISBN: 978-3-540-32132-3
eBook Packages: Computer Science, Computer Science (R0)