Abstract
In this paper, we present an approach that derives generalized relations for automatic ontology building from frequent word n-grams. Using the publicly available Google n-grams as our data source, we extract relations in the form of triples and compute generalized, more abstract models from them. We propose an algorithm for building abstractions of the extracted triples using WordNet as background knowledge. We also present a novel heuristic approach to triple extraction, which achieves notably better results than deep parsing applied to n-grams. This allows us to represent information gathered from the web as a set of triples modeling the common and frequent relations expressed in natural language. Our results can be used in different settings, for example as a knowledge base for reasoning or simply as statistical data useful for improving the understanding of natural language.
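To make the idea of abstracting triples concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a small hand-written hypernym map in place of WordNet, and the names `HYPERNYM`, `generalize`, and `abstract_triples`, as well as the triple counts, are hypothetical. It shows how replacing subjects and objects with more general concepts merges several specific triples into one more frequent abstract triple.

```python
from collections import Counter

# Toy hypernym map standing in for WordNet (hypothetical data, not from the paper).
HYPERNYM = {
    "dog": "animal",
    "cat": "animal",
    "animal": "organism",
    "bread": "food",
    "fish": "food",
    "food": "substance",
}

def generalize(word, levels=1):
    """Walk up to `levels` steps up the toy hierarchy, stopping at the top."""
    for _ in range(levels):
        if word not in HYPERNYM:
            break
        word = HYPERNYM[word]
    return word

def abstract_triples(triple_counts, levels=1):
    """Sum (subject, verb, object) counts after generalizing subject and object."""
    merged = Counter()
    for (subj, verb, obj), count in triple_counts.items():
        key = (generalize(subj, levels), verb, generalize(obj, levels))
        merged[key] += count
    return merged

# Hypothetical triple frequencies, as if they had been extracted from n-grams.
triples = {
    ("dog", "eats", "bread"): 120,
    ("cat", "eats", "fish"): 300,
    ("dog", "chases", "cat"): 80,
}
abstracted = abstract_triples(triples)
```

With these made-up counts, the two "eats" triples collapse into a single ("animal", "eats", "food") triple whose count is the sum of both. A real system would also have to choose how far up the hierarchy to generalize, which this sketch leaves to the `levels` parameter.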
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sipoš, R., Mladenić, D., Grobelnik, M., Brank, J. (2009). Modeling Common Real-Word Relations Using Triples Extracted from n-Grams. In: Gómez-Pérez, A., Yu, Y., Ding, Y. (eds) The Semantic Web. ASWC 2009. Lecture Notes in Computer Science, vol 5926. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10871-6_2
DOI: https://doi.org/10.1007/978-3-642-10871-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10870-9
Online ISBN: 978-3-642-10871-6
eBook Packages: Computer Science, Computer Science (R0)