Abstract
This paper summarizes the work done at the State University of New York at Buffalo (UB) in the GeoCLEF 2006 track. The approach presented uses pure IR techniques (indexing of single-word terms as well as word bigrams, and automatic retrieval feedback) to try to improve retrieval performance for queries with geographical references. The main purpose of this work is to identify the strengths and shortcomings of this approach so that it can serve as a basis for the future development of a geographical reference extraction system. We submitted four runs to the monolingual English task, two automatic and two manual, using the title and description fields of the topics. Our official results are above the median system (auto = 0.2344 MAP, manual = 0.2445 MAP). We also present an unofficial run that uses the title, description, and narrative fields and shows a 10% improvement with respect to our baseline runs. Our manual runs were prepared by creating a Boolean query based on the topic description and manually adding terms from geographical resources available on the web. Although the average performance of the manual runs is comparable to that of the automatic runs, a query-by-query analysis shows significant differences among individual queries. In general, we obtained significant improvements (more than 10% in average precision) in 8 of the 25 queries. However, we also noticed that 5 queries in the manual runs performed significantly below the automatic runs.
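As a rough illustration of what indexing single-word terms together with word bigrams can look like, the following is a minimal sketch and not the system used for the official runs. The function names `index_terms` and `term_frequencies`, the lowercase alphanumeric tokenization, the underscore used to join bigrams, and the absence of stemming and stopword removal are all assumptions made for this example.

```python
import re
from collections import Counter


def index_terms(text):
    """Return single-word terms followed by adjacent word bigrams.

    Assumptions (not from the paper): lowercase alphanumeric tokens,
    no stemming, no stopword removal, bigrams joined with "_".
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    # Pair each word with its successor to form word bigrams.
    bigrams = [f"{first}_{second}" for first, second in zip(words, words[1:])]
    return words + bigrams


def term_frequencies(text):
    """Count how often each single-word term and bigram occurs in a text."""
    return Counter(index_terms(text))


if __name__ == "__main__":
    print(index_terms("Floods in New York"))
```

Under these assumptions, a place name such as "New York" is indexed both as the separate words `new` and `york` and as the single bigram term `new_york`, which is one way a purely term-based index can capture multi-word geographical references.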
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ruiz, M.E., Abbas, J., Mark, D., Shapiro, S., Southwick, S.B. (2007). UB at GeoCLEF 2006. In: Peters, C., et al. Evaluation of Multilingual and Multi-modal Information Retrieval. CLEF 2006. Lecture Notes in Computer Science, vol 4730. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74999-8_126
DOI: https://doi.org/10.1007/978-3-540-74999-8_126
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74998-1
Online ISBN: 978-3-540-74999-8
eBook Packages: Computer Science (R0)