Abstract
Users of information retrieval (IR) systems require an interface that is both powerful and easy to use in order to fulfill their information needs. In XML-IR systems this is a non-trivial task, since users expect these systems to fulfill both their structural and content requirements. Most existing XML-IR systems accept queries written in formal query languages; however, these languages are difficult to use. This paper presents NLPX, an XML-IR system with a natural language interface that is user-friendly enough to be used intuitively, yet sophisticated enough to handle complex structured queries. NLPX accepts English queries that contain both users’ content and structural requirements. It uses a set of grammar templates to derive the structural and content requirements and translates them into a formal language (NEXI). The resulting formal language queries can then be processed by many existing XML-IR systems. The system was developed for participation in the NLP Track of the INEX 2004 Workshop, and results indicated that natural language interfaces are able to capture users’ structural and content requirements, but not as accurately as some formal language interfaces.
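To make the idea of template-based translation concrete, the following is a minimal sketch of how a single grammar template might map an English query onto a NEXI expression. It is an illustration only and not the NLPX implementation: the regular-expression template, the `TAGS` vocabulary, and the function name `to_nexi` are all assumptions introduced here, and a real system would need many templates plus linguistic preprocessing.

```python
import re

# Hypothetical mapping from English nouns to XML element names.
# The tag names are assumptions chosen for illustration.
TAGS = {
    "article": "article", "articles": "article",
    "section": "sec", "sections": "sec",
    "paragraph": "p", "paragraphs": "p",
}

# One illustrative template of the form:
#   "Find <target> about <topic> in <context> about <topic>"
TEMPLATE = re.compile(
    r"^(?:find|retrieve)\s+(?P<target>\w+)\s+about\s+(?P<target_topic>.+?)"
    r"\s+in\s+(?P<context>\w+)\s+about\s+(?P<context_topic>.+?)\.?$",
    re.IGNORECASE,
)


def to_nexi(query):
    """Translate an English query into a NEXI string, or return None
    if the query does not fit the single template sketched here."""
    match = TEMPLATE.match(query.strip())
    if match is None:
        return None
    target = TAGS.get(match.group("target").lower())
    context = TAGS.get(match.group("context").lower())
    if target is None or context is None:
        return None
    # Structural requirement: the path of elements.
    # Content requirement: the about() clause on each element.
    return "//{}[about(., {})]//{}[about(., {})]".format(
        context, match.group("context_topic"),
        target, match.group("target_topic"),
    )


if __name__ == "__main__":
    print(to_nexi("Find sections about query languages in articles about XML retrieval"))
```

Under these assumptions, the example query yields a NEXI path in which the containing element and the target element each carry their own `about()` clause. Queries that do not fit the template return `None` rather than a guessed translation.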
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Woodley, A., Geva, S. (2005). NLPX at INEX 2004. In: Fuhr, N., Lalmas, M., Malik, S., Szlávik, Z. (eds) Advances in XML Information Retrieval. INEX 2004. Lecture Notes in Computer Science, vol 3493. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11424550_31
DOI: https://doi.org/10.1007/11424550_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26166-7
Online ISBN: 978-3-540-32053-1
eBook Packages: Computer Science, Computer Science (R0)