Abstract
The objective of XML retrieval is to return relevant XML document fragments that answer a given user information need, by exploiting the document structure. This article focuses on automatically deriving semantic XML structure and using it to enhance the retrieval performance of XML retrieval systems. Based on a naive approach to named entity detection, we discuss how the structure of an XML document can be enriched, using the Reuters-21578 news collection.
Based on a retrieval performance experiment, we study the effect of the additional semantic structure on the retrieval performance of our XSee search engine for XML documents. The experiment provides some initial evidence that an XML retrieval system significantly benefits from having meaningful XML structure.
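To make the idea of structural enrichment concrete, the following is a minimal sketch of one naive way to detect named entities and wrap them in semantic XML elements. It is not the authors' implementation, which the abstract does not detail: the gazetteer lookup, the tag names `organization` and `location`, and the function name `enrich` are all assumptions introduced for illustration. The sketch only handles the direct text of a single element.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical gazetteer (illustration only): surface form -> semantic tag name.
GAZETTEER = {
    "Reuters": "organization",
    "Utrecht": "location",
}


def enrich(elem, gazetteer=GAZETTEER):
    """Wrap gazetteer matches found in the direct text of `elem`
    in new child elements named after their entity type."""
    text = elem.text or ""
    if not text or not gazetteer:
        return elem
    # Longest surface forms first, so longer names win over their prefixes.
    names = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    # With one capturing group, split() alternates: plain, match, plain, ...
    pieces = pattern.split(text)
    elem.text = pieces[0]
    for pos, i in enumerate(range(1, len(pieces), 2)):
        child = ET.Element(gazetteer[pieces[i]])
        child.text = pieces[i]        # the entity itself
        child.tail = pieces[i + 1]    # plain text that follows the entity
        elem.insert(pos, child)       # keep new children in document order
    return elem


doc = ET.fromstring("<p>Reuters opened an office in Utrecht today.</p>")
enrich(doc)
print(ET.tostring(doc, encoding="unicode"))
# The added structure can now be addressed in a structural query:
print([e.text for e in doc.findall("location")])
```

Once entities are marked up this way, a query can target them by element name rather than by keyword alone, which is the kind of structural hint an XML retrieval system can exploit.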
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
van Zwol, R., van Loosbroek, T. (2007). Effective Use of Semantic Structure in XML Retrieval. In: Amati, G., Carpineto, C., Romano, G. (eds) Advances in Information Retrieval. ECIR 2007. Lecture Notes in Computer Science, vol 4425. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71496-5_59
DOI: https://doi.org/10.1007/978-3-540-71496-5_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71494-1
Online ISBN: 978-3-540-71496-5
eBook Packages: Computer Science (R0)