
Effective Use of Semantic Structure in XML Retrieval

  • Conference paper
Advances in Information Retrieval (ECIR 2007)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4425))



The objective of XML retrieval is to return relevant XML document fragments that answer a given user information need by exploiting the document structure. This article focuses on automatically deriving semantic XML structure and using it to enhance the retrieval performance of XML retrieval systems. Based on a naive approach to named entity detection, we discuss how the structure of an XML document can be enriched, using the Reuters-21578 news collection.

Based on a retrieval performance experiment, we study the effect of the additional semantic structure on the retrieval performance of our XSee search engine for XML documents. The experiment provides some initial evidence that an XML retrieval system significantly benefits from having meaningful XML structure.
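The enrichment idea summarized in the abstract can be illustrated with a minimal sketch. It assumes a simple gazetteer lookup as the "naive" detector; the abstract does not specify the authors' actual detection method, and the `GAZETTEER` table, the `enrich` function, and the element names `organization` and `location` are all hypothetical choices made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical gazetteer mapping surface forms to semantic element names.
GAZETTEER = {
    "Bank of Japan": "organization",
    "Tokyo": "location",
    "Japan": "location",
}


def enrich(text, gazetteer=GAZETTEER):
    """Build a <body> element in which gazetteer matches are wrapped in semantic tags."""
    # Try longer names first so a longer name beats a shorter one
    # that starts at the same position.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")

    body = ET.Element("body")
    last = None  # most recently created entity element, if any
    pos = 0

    def append_plain(chunk):
        # Plain text goes in body.text before the first child,
        # and in the tail of the latest child afterwards.
        if last is None:
            body.text = (body.text or "") + chunk
        else:
            last.tail = (last.tail or "") + chunk

    for match in pattern.finditer(text):
        append_plain(text[pos:match.start()])
        last = ET.SubElement(body, gazetteer[match.group(1)])
        last.text = match.group(1)
        pos = match.end()
    append_plain(text[pos:])
    return body


if __name__ == "__main__":
    doc = enrich("The Bank of Japan met in Tokyo.")
    print(ET.tostring(doc, encoding="unicode"))
    # Once the entities are elements, a structural query can target them.
    print([e.text for e in doc.findall("location")])
```

With the entities marked up as elements, a retrieval system can restrict a query to, for example, `location` elements rather than matching the word anywhere in the text, which is the kind of structure the experiment evaluates.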






Editor information

Giambattista Amati, Claudio Carpineto, Giovanni Romano


Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

van Zwol, R., van Loosbroek, T. (2007). Effective Use of Semantic Structure in XML Retrieval. In: Amati, G., Carpineto, C., Romano, G. (eds) Advances in Information Retrieval. ECIR 2007. Lecture Notes in Computer Science, vol 4425. Springer, Berlin, Heidelberg.


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71494-1

  • Online ISBN: 978-3-540-71496-5

  • eBook Packages: Computer Science, Computer Science (R0)
