research-article

On Evaluation Metrics for Diversity-enhanced Recommendations

Authors:

Kenli Li

CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management

Pages 1286 - 1295

https://doi.org/10.1145/3627673.3679629

Published: 21 October 2024

Abstract

Diversity is increasingly recognized as a crucial factor in recommendation systems for enhancing user satisfaction. However, existing studies on diversity-enhanced recommendation systems focus primarily on designing recommendation strategies and often overlook the development of evaluation metrics. Widely used diversity metrics such as CC, ILAD, and ILMD are typically computed independently of accuracy. This separation leads to a critical limitation: existing diversity measures cannot distinguish diversity improvements that come from effective recommendations from those that come from ineffective recommendations. Our evaluations reveal that the diversity improvements come primarily from ineffective recommendations, which often do not contribute to user satisfaction. Furthermore, existing diversity metrics disregard the feature distribution of ground-truth items, potentially skewing the assessment of diversity performance. To address these limitations, we design three new accuracy-aware metrics, DCC, FDCC, and DILAD, and conduct a re-evaluation using them. Surprisingly, our results show that the diversity improvements of existing diversity-enhanced approaches are limited, and even negative, compared to those of accurate recommendations. This finding underscores the need for more sophisticated diversity-enhancing techniques that improve diversity within effective recommendations.
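The limitation described in the abstract can be illustrated with a small sketch. It assumes the common readings of the three existing metrics (CC as category coverage, ILAD as intra-list average distance, ILMD as intra-list minimal distance) and uses Euclidean distance between item vectors, which is an assumption; other work uses cosine distance. The helper `hits_only` and the toy data are hypothetical and only show how a score computed over all recommended items can differ from one computed over the effective (ground-truth) items. This is not the paper's definition of DCC, FDCC, or DILAD, which the abstract does not give.

```python
from itertools import combinations
import math


def category_coverage(items, item_categories, n_categories):
    """CC: fraction of all categories covered by the given items."""
    covered = set()
    for item in items:
        covered.update(item_categories[item])
    return len(covered) / n_categories


def pairwise_distances(items, item_vectors):
    """Euclidean distance for every unordered pair of items (an assumed choice)."""
    return [
        math.dist(item_vectors[a], item_vectors[b])
        for a, b in combinations(items, 2)
    ]


def ilad(items, item_vectors):
    """ILAD: average pairwise distance; 0.0 when fewer than two items."""
    distances = pairwise_distances(items, item_vectors)
    return sum(distances) / len(distances) if distances else 0.0


def ilmd(items, item_vectors):
    """ILMD: minimal pairwise distance; 0.0 when fewer than two items."""
    distances = pairwise_distances(items, item_vectors)
    return min(distances) if distances else 0.0


def hits_only(recommended, ground_truth):
    """Hypothetical helper: keep only recommendations the user actually interacted with."""
    return [item for item in recommended if item in ground_truth]


# Toy data (hypothetical): four items, each in its own category.
item_categories = {"a": {0}, "b": {1}, "c": {2}, "d": {3}}
item_vectors = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (0.0, 1.0), "d": (6.0, 8.0)}
recommended = ["a", "b", "c", "d"]
ground_truth = {"a"}  # only one recommended item is effective

cc_all = category_coverage(recommended, item_categories, n_categories=4)
cc_hits = category_coverage(
    hits_only(recommended, ground_truth), item_categories, n_categories=4
)
print(f"CC over all recommendations: {cc_all:.2f}")
print(f"CC over effective recommendations only: {cc_hits:.2f}")
```

In this toy case the full list covers every category, yet three of the four categories come from items the user never interacted with, which is the kind of gap an accuracy-independent metric cannot reveal.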

Index Terms

1. Information systems
  1. Information retrieval
    1. Evaluation of retrieval results
    2. Retrieval tasks and goals
      1. Recommender systems


Published In

CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management

October 2024

5705 pages

ISBN:9798400704369

DOI:10.1145/3627673

General Chairs: Edoardo Serra (Boise State University, USA), Francesca Spezzano (Boise State University, USA)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Programs of Hunan Province
NSFC

Conference

CIKM '24: The 33rd ACM International Conference on Information and Knowledge Management

Sponsor: SIGIR

October 21 - 25, 2024

Boise, ID, USA

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
