Modeling Common Real-Word Relations Using Triples Extracted from n-Grams

Ruben Sipoš, Dunja Mladenić, Marko Grobelnik & Janez Brank

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5926)

Included in the following conference series:

Asian Semantic Web Conference


Abstract

In this paper, we present an approach that provides generalized relations for automatic ontology building, based on frequent word n-grams. Using the publicly available Google n-grams as our data source, we extract relations in the form of triples and compute generalized, more abstract models. We propose an algorithm for building abstractions of the extracted triples using WordNet as background knowledge. We also present a novel heuristic approach to triple extraction, which achieves notably better results than deep parsing applied to n-grams. This allows us to represent information gathered from the web as a set of triples modeling the common and frequent relations expressed in natural language. Our results can be used in different settings, for example as a knowledge base for reasoning or simply as statistical data that helps improve the understanding of natural languages.
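The pipeline outlined in the abstract (extract triples from n-grams, then abstract them through a concept hierarchy) can be illustrated with a minimal sketch. This is not the authors' algorithm: the word lists, the `HYPERNYM` map standing in for WordNet, the noun-verb-noun pattern, and the sample counts are all hypothetical and chosen only to make the idea concrete.

```python
from collections import Counter

# Hypothetical toy lexicon; a real system would use POS information.
NOUNS = {"dog", "cat", "bone", "fish"}
VERBS = {"eats", "chases"}
STOPWORDS = {"the", "a", "an"}

# Hypothetical one-level hypernym map standing in for WordNet.
HYPERNYM = {"dog": "animal", "cat": "animal", "bone": "food", "fish": "food"}


def extract_triple(ngram):
    """Return (subject, predicate, object) if the n-gram matches a
    noun-verb-noun pattern after dropping stopwords, else None."""
    tokens = [t for t in ngram.lower().split() if t not in STOPWORDS]
    if len(tokens) != 3:
        return None
    subj, pred, obj = tokens
    if subj in NOUNS and pred in VERBS and obj in NOUNS:
        return (subj, pred, obj)
    return None


def generalize(triple):
    """Replace subject and object with their hypernym when one is known."""
    subj, pred, obj = triple
    return (HYPERNYM.get(subj, subj), pred, HYPERNYM.get(obj, obj))


def abstract_triples(ngram_counts):
    """Sum n-gram frequencies over the generalized triples."""
    totals = Counter()
    for ngram, count in ngram_counts.items():
        triple = extract_triple(ngram)
        if triple is not None:
            totals[generalize(triple)] += count
    return totals


# Made-up n-gram frequencies for illustration only.
ngram_counts = {
    "the dog eats a bone": 120,
    "the cat eats fish": 80,
    "the dog chases the cat": 50,
    "of the in a": 999,
}
result = abstract_triples(ngram_counts)
for triple, count in result.most_common():
    print(triple, count)
```

In this toy setting, the two "eats" n-grams collapse into a single abstract triple whose frequency is the sum of its instances, which is the kind of generalized relation the abstract refers to. A fuller treatment would have to choose among several hypernym levels and handle word-sense ambiguity, which this sketch ignores.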

Author information

Authors and Affiliations

Jozef Stefan Institute, Jamova 39, 1000, Ljubljana, Slovenia
Ruben Sipoš, Dunja Mladenić, Marko Grobelnik & Janez Brank


Editor information

Editors and Affiliations

Facultad de Informática, Dpto. de Inteligencia Artificial, Ontology Engineering Group, Universidad Politécnica de Madrid, Campus de Montegancedo s/n, 28660, Boadilla del Monte, Madrid
Asunción Gómez-Pérez
Shanghai Jiao Tong University, 200030, Shanghai, China
Yong Yu
Indiana University, USA
Ying Ding

About this paper

Cite this paper

Sipoš, R., Mladenić, D., Grobelnik, M., Brank, J. (2009). Modeling Common Real-Word Relations Using Triples Extracted from n-Grams. In: Gómez-Pérez, A., Yu, Y., Ding, Y. (eds) The Semantic Web. ASWC 2009. Lecture Notes in Computer Science, vol 5926. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10871-6_2

DOI: https://doi.org/10.1007/978-3-642-10871-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10870-9
Online ISBN: 978-3-642-10871-6
eBook Packages: Computer Science, Computer Science (R0)
