
Content-based text querying with ontological descriptors

Published: 01 February 2004


This paper describes a method and a system for content-based querying of texts based on the availability of an ontology for the concepts in the text domain. A key principle in the system is the extraction of the conceptual content of noun phrases into descriptors forming an integral part of the ontology. The retrieval of text passages rests on matching descriptors from the text against descriptors from the noun phrases in the query. The match does not need to be exact but is mediated by the ontology, invoking in particular taxonomic reasoning with sub- and super-concepts. The paper also reports on a prototype implementation of the system.
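As a rough illustration of a match that "does not need to be exact but is mediated by the ontology", the following minimal Python sketch scores two atomic concepts by their distance in an is-a taxonomy. It is not the paper's algorithm: the toy taxonomy, the function names `ancestors` and `taxonomic_match`, and the `decay` parameter are all assumptions made for this example.

```python
# Hypothetical toy taxonomy: each concept maps to its direct super-concept.
IS_A = {
    "vitamin_c": "vitamin",
    "vitamin_d": "vitamin",
    "vitamin": "nutrient",
    "mineral": "nutrient",
    "nutrient": "substance",
}


def ancestors(concept):
    """Return the chain of super-concepts of `concept`, nearest first."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain


def taxonomic_match(query, text, decay=0.5):
    """Score in [0, 1]: 1.0 for identical concepts, decay**k when one concept
    lies k is-a steps above the other, and 0.0 for unrelated concepts.
    The decay scheme is an assumption, not taken from the paper."""
    if query == text:
        return 1.0
    up = ancestors(text)  # case: the text concept is more specific
    if query in up:
        return decay ** (up.index(query) + 1)
    down = ancestors(query)  # case: the text concept is more general
    if text in down:
        return decay ** (down.index(text) + 1)
    return 0.0
```

Under these assumptions, a query for "vitamin" still matches a passage about "vitamin_c", only with a lower score than an exact match, while sibling concepts such as "mineral" and "vitamin_c" do not match at all.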






Jaroslav Pokorny (Online Computing Reviews Service)

An approach to querying text sources based on extracting and evaluating semantic content, given a formal ontology for the text domain, is described in this paper. Queries take the form of natural language expressions, and the system is primarily intended to retrieve text segments whose semantic content matches the content of noun phrases in the query phrase. The approach is based on a fully automatic generation of descriptors of natural language text. Querying uses the generalization/specialization relations for ranking matches, rather than reformulating queries.

After a short section comparing their approach with other systems, the authors discuss their choice of a feasible ontological representation formalism, in sections 3 and 4. In particular, the descriptors, and their embedding in the ontology by means of a notion of ontological grammar, are introduced. Section 4 addresses the disambiguation of noun phrases, whereas section 5 introduces the notion of semantic roles represented in terms of binary semantic relations. As an application domain for the prototype, the authors use the field of nutrition. Section 6 considers examples of noun phrases from nutrition texts, and their mappings onto descriptors. Finally, sections 7 and 8 describe the query-matching process, and the prototype implementation, respectively. Section 9 concludes the paper, and offers some challenges for the future.

The method was developed in the OntoQuery project, and is described in several other papers by the same authors, profusely cited in the text. This detracts from the readability of the paper. While the method is interesting, the authors provide no well-specified experiments justifying their approach. For example, a comparison with other methods of information retrieval would be beneficial.
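The ranking idea summarized in the review (descriptors built from a concept plus binary semantic relations, with matches ranked through generalization/specialization rather than query reformulation) might be sketched as follows. This is a simplified illustration, not the OntoQuery implementation: the `Descriptor` class, the role name `caused_by_lack_of`, the toy nutrition taxonomy, and the cost function are all hypothetical, and only text descriptors that specialize the query are considered.

```python
from dataclasses import dataclass

# Hypothetical is-a links (child -> parent) for a toy nutrition domain.
IS_A = {
    "scurvy": "disease",
    "rickets": "disease",
    "vitamin_c": "vitamin",
    "vitamin_d": "vitamin",
}


@dataclass(frozen=True)
class Descriptor:
    concept: str
    roles: tuple = ()  # pairs of (role_name, filler_concept)


def steps_up(specific, general):
    """Number of is-a steps from `specific` up to `general`, or None."""
    steps = 0
    while specific != general:
        if specific not in IS_A:
            return None
        specific = IS_A[specific]
        steps += 1
    return steps


def specialization_cost(query, text):
    """Total is-a steps by which `text` specializes `query`; None if no match."""
    total = steps_up(text.concept, query.concept)
    if total is None:
        return None
    text_roles = dict(text.roles)
    for role, filler in query.roles:
        if role not in text_roles:
            return None  # the query asks for a relation the text lacks
        cost = steps_up(text_roles[role], filler)
        if cost is None:
            return None
        total += cost
    return total


def rank(query, passages):
    """Return (cost, passage_id) pairs for matching passages, best first."""
    scored = []
    for pid, desc in passages.items():
        cost = specialization_cost(query, desc)
        if cost is not None:
            scored.append((cost, pid))
    return sorted(scored)


query = Descriptor("disease", (("caused_by_lack_of", "vitamin"),))
passages = {
    "p1": Descriptor("scurvy", (("caused_by_lack_of", "vitamin_c"),)),
    "p2": Descriptor("disease", (("caused_by_lack_of", "vitamin"),)),
    "p3": Descriptor("rickets"),
}
ranking = rank(query, passages)  # [(0, 'p2'), (2, 'p1')]
```

In this sketch the query itself is never rewritten; each passage descriptor is compared against it directly, and passages whose descriptors are closer to the query in the taxonomy rank higher. Passage `p3` is dropped because it carries no relation for the role the query requires.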




Published In

Data & Knowledge Engineering, Volume 48, Issue 2
February 2004
111 pages


Elsevier Science Publishers B. V.



Author Tags

  1. conceptual distance
  2. content-based querying
  3. information retrieval
  4. noun phrase semantics
  5. ontologies
  6. taxonomic reasoning



Cited By

  • (2012) Collapse and reorganization patterns of social knowledge representation in evolving semantic networks. Information Sciences: an International Journal 200 (1-21). DOI: 10.1016/j.ins.2012.02.053. Online publication date: 1-Oct-2012
  • (2010) A Web Knowledge Discovery Engine Based on Concept Algebra. International Journal of Cognitive Informatics and Natural Intelligence 4:1 (80-97). DOI: 10.4018/jcini.2010010105. Online publication date: 1-Jan-2010
  • (2009) Enhancing search results of concept annotated documents. Proceedings of the 10th IEEE international conference on Information Reuse & Integration (330-335). DOI: 10.5555/1689250.1689310. Online publication date: 10-Aug-2009
  • (2009) Concept Search. Proceedings of the 6th European Semantic Web Conference on The Semantic Web: Research and Applications (429-444). DOI: 10.1007/978-3-642-02121-3_33. Online publication date: 31-May-2009
  • (2008) Ontological summaries through hierarchical clustering. Proceedings of the 17th international conference on Foundations of intelligent systems (497-507). DOI: 10.5555/1786474.1786539. Online publication date: 20-May-2008
  • (2008) A knowledge retrieval model using ontology mining and user profiling. Integrated Computer-Aided Engineering 15:4 (313-329). DOI: 10.5555/1402687.1402691. Online publication date: 1-Dec-2008
  • (2007) Using shallow linguistic analysis to improve search on Danish compounds. Natural Language Engineering 13:1 (75-90). DOI: 10.1017/S1351324906004256. Online publication date: 1-Mar-2007
  • (2007) Automated ontology construction for unstructured text documents. Data & Knowledge Engineering 60:3 (547-566). DOI: 10.1016/j.datak.2006.04.001. Online publication date: 1-Mar-2007
  • (2007) On Browsing Domain Ontologies for Information Base Content. Proceedings of the 12th international Fuzzy Systems Association world congress on Foundations of Fuzzy Logic and Soft Computing (135-144). DOI: 10.1007/978-3-540-72950-1_14. Online publication date: 18-Jun-2007
  • (2006) Information extraction and imprecise query answering from web documents. Web Intelligence and Agent Systems 4:4 (407-429). DOI: 10.5555/2636352.2636356. Online publication date: 1-Oct-2006
