Abstract
A number of methods have been proposed in the literature for evaluating diversified search results. Before using such an evaluation method, it is important to understand how diversity and relevance each contribute to its performance metric. In this paper, we use analysis of variance (ANOVA) to analyse and compare three representative evaluation methods for diversified search, namely \(\alpha \)-nDCG, MAP-IA, and ERR-IA, on the TREC-2009 Web track dataset. We show that the performance scores produced by these evaluation methods do reflect two crucial aspects of diversity (richness and evenness) as well as relevance, though to different degrees.
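To make the two diversity aspects concrete, the following is a minimal sketch rather than the paper's own procedure. It assumes that richness is the number of distinct subtopics covered by a result list and that evenness is Pielou's index (Shannon entropy divided by its maximum, the log of richness). The function name `richness_and_evenness` and the representation of each document as a set of subtopic ids are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def richness_and_evenness(ranking):
    """Return (richness, evenness) for a ranked list of documents.

    ranking: list of sets, each holding the subtopic ids a document covers
    (an assumed representation, not taken from the paper).
    """
    # Count how many documents cover each subtopic.
    counts = Counter(s for doc in ranking for s in doc)
    richness = len(counts)
    if richness < 2:
        # Pielou's index is undefined for fewer than two subtopics;
        # returning 0.0 here is a convention chosen for this sketch.
        return richness, 0.0
    total = sum(counts.values())
    shannon = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Normalise by the maximum possible entropy, log(richness).
    return richness, shannon / math.log(richness)


even_list = [{1}, {2}, {3}]         # three subtopics, one document each
skewed_list = [{1}, {1}, {1}, {2}]  # two subtopics, unevenly covered
print(richness_and_evenness(even_list))
print(richness_and_evenness(skewed_list))
```

Under these assumptions, a list that spreads its documents equally over the subtopics reaches an evenness of 1, while a list dominated by one subtopic scores lower even if its richness is unchanged.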
Notes
1. The parameter \(\alpha \) for \(\alpha \)-nDCG was set to 0.5, the default value used in the TREC-2009 Web track diversity task.
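The role of \(\alpha \) can be illustrated with a short sketch of \(\alpha \)-nDCG using binary subtopic judgements. This is an illustration under stated assumptions, not the official TREC evaluation code: the function names (`alpha_dcg`, `greedy_ideal`, `alpha_ndcg`) and the set-of-subtopics document representation are hypothetical, and the ideal ranking is approximated greedily, since finding the exact ideal ordering is computationally hard.

```python
import math
from collections import defaultdict


def novelty_gain(doc, seen, alpha):
    # Each subtopic contributes (1 - alpha)^(times already seen).
    return sum((1 - alpha) ** seen[s] for s in doc)


def alpha_dcg(ranking, alpha=0.5, k=None):
    """alpha-DCG of a ranking given as a list of subtopic-id sets."""
    seen = defaultdict(int)
    score = 0.0
    for rank, doc in enumerate(ranking[:k], start=1):
        score += novelty_gain(doc, seen, alpha) / math.log2(rank + 1)
        for s in doc:
            seen[s] += 1
    return score


def greedy_ideal(pool, alpha=0.5, k=None):
    """Greedy approximation of the ideal ordering of the judged pool."""
    remaining = list(pool)
    seen = defaultdict(int)
    ideal = []
    depth = len(remaining) if k is None else min(k, len(remaining))
    for _ in range(depth):
        best = max(remaining, key=lambda d: novelty_gain(d, seen, alpha))
        remaining.remove(best)
        ideal.append(best)
        for s in best:
            seen[s] += 1
    return ideal


def alpha_ndcg(ranking, pool=None, alpha=0.5, k=None):
    # Assumption: if no judged pool is given, the ranked documents are used.
    pool = ranking if pool is None else pool
    denom = alpha_dcg(greedy_ideal(pool, alpha, k), alpha, k)
    return alpha_dcg(ranking, alpha, k) / denom if denom > 0 else 0.0


redundant = [{1}, {1}, {2}]  # repeats subtopic 1 before covering subtopic 2
diverse = [{1}, {2}, {1}]    # covers both subtopics in the top two ranks
print(alpha_ndcg(redundant), alpha_ndcg(diverse))
```

With \(\alpha = 0.5\), each repeated coverage of a subtopic earns half the gain of the previous one, so the diverse ordering scores higher than the redundant one. With \(\alpha = 0\) the redundancy penalty disappears and both orderings receive the same \(\alpha \)-DCG.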
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Kingrani, S.K., Levene, M., Zhang, D. (2018). A Meta-Evaluation of Evaluation Methods for Diversified Search. In: Pasi, G., Piwowarski, B., Azzopardi, L., Hanbury, A. (eds) Advances in Information Retrieval. ECIR 2018. Lecture Notes in Computer Science, vol 10772. Springer, Cham. https://doi.org/10.1007/978-3-319-76941-7_43
DOI: https://doi.org/10.1007/978-3-319-76941-7_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76940-0
Online ISBN: 978-3-319-76941-7
eBook Packages: Computer Science (R0)