DOI: 10.1145/3627673.3679629
research-article

On Evaluation Metrics for Diversity-enhanced Recommendations

Published: 21 October 2024

Abstract

Diversity is increasingly recognized as a crucial factor in recommendation systems for enhancing user satisfaction. However, existing studies on diversity-enhanced recommendation systems focus primarily on designing recommendation strategies and often overlook the development of evaluation metrics. Widely used diversity metrics such as CC, ILAD, and ILMD are typically assessed independently of accuracy. This separation leads to a critical limitation: existing diversity measures cannot distinguish diversity improvements that come from effective recommendations from those that come from ineffective recommendations. Our evaluations reveal that the diversity improvements are contributed primarily by ineffective recommendations, which often do not contribute positively to user satisfaction. Furthermore, existing diversity metrics disregard the feature distribution of ground-truth items, potentially skewing the assessment of diversity performance. To address these limitations, we design three new accuracy-aware metrics, DCC, FDCC, and DILAD, and conduct a re-evaluation using them. Surprisingly, our results show that the diversity improvements of existing diversity-enhanced approaches are limited, and in some cases negative, compared to those of accurate recommendations. This finding underscores the need to explore more sophisticated diversity-enhanced techniques that improve diversity within effective recommendations.
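The limitation described in the abstract can be illustrated with a small sketch. It assumes the definitions of the three conventional metrics that are common in the literature (CC as the fraction of categories covered by a list, ILAD as the mean pairwise distance within a list, ILMD as the minimum pairwise distance); the abstract itself does not spell them out. The toy items, categories, embeddings, and the idea of restricting each metric to hit items are hypothetical illustrations only. They are not the paper's DCC, FDCC, or DILAD, whose definitions are not given here.

```python
import math
from itertools import combinations


def category_coverage(items, item_categories, num_categories):
    """CC (assumed definition): fraction of all categories covered by the list."""
    covered = set()
    for item in items:
        covered.update(item_categories[item])
    return len(covered) / num_categories


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def pairwise_distances(items, embeddings):
    """Distances for every unordered pair of items in the list."""
    return [cosine_distance(embeddings[a], embeddings[b])
            for a, b in combinations(items, 2)]


def ilad(items, embeddings):
    """ILAD (assumed definition): mean pairwise distance; 0.0 if fewer than 2 items."""
    dists = pairwise_distances(items, embeddings)
    return sum(dists) / len(dists) if dists else 0.0


def ilmd(items, embeddings):
    """ILMD (assumed definition): minimum pairwise distance; 0.0 if fewer than 2 items."""
    dists = pairwise_distances(items, embeddings)
    return min(dists) if dists else 0.0


# Hypothetical toy data: four recommended items, two of which are relevant.
item_categories = {"a": {0}, "b": {0}, "c": {1}, "d": {2}}
embeddings = {"a": (1.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 1.0), "d": (-1.0, 0.0)}
num_categories = 4

recommended = ["a", "b", "c", "d"]
ground_truth = {"a", "b"}
hits = [item for item in recommended if item in ground_truth]

# Accuracy-agnostic view: every recommended item counts toward diversity.
cc_all = category_coverage(recommended, item_categories, num_categories)
ilad_all = ilad(recommended, embeddings)

# Hit-restricted view (illustration only): only effective items count.
cc_hits = category_coverage(hits, item_categories, num_categories)
ilad_hits = ilad(hits, embeddings)

print(f"CC   all={cc_all:.2f}  hits only={cc_hits:.2f}")
print(f"ILAD all={ilad_all:.2f}  hits only={ilad_hits:.2f}")
```

In this toy list, all of the category coverage beyond the first category and all of the pairwise distance come from the two items the user did not interact with, which is the kind of effect the abstract attributes to ineffective recommendations.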

Published In

CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
October 2024
5705 pages
ISBN:9798400704369
DOI:10.1145/3627673

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. diversity
  2. evaluation metrics
  3. recommendation system

Qualifiers

  • Research-article

Funding Sources

  • Programs of Hunan Province
  • NSFC

Conference

CIKM '24

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
