review-article

Similarity measures for Collaborative Filtering-based Recommender Systems: Review and experimental comparison

Published: 01 October 2022

Abstract

Collaborative Filtering (CF) filters the data that a Recommender System (RS) can recommend to a target user according to that user's tastes and preferences. The target user's profile is built from their similarity to other users. For this reason, the CF technique is very sensitive to the similarity measure used to quantify the strength of the dependency between two users (or two items). In this paper, we provide an in-depth review of the similarity measures used for CF-based RS. For each measure, we outline its fundamental background and test its performance in an experimental study. Experiments are carried out on three standard datasets (MovieLens100k, MovieLens1M and Jester). The results show that ITR and IPWR are the most suitable similarity measures for a user-based RS, while AMI is the best choice for an item-based RS. Under the user-based approach, ITR obtains an MAE (mean absolute error) of 0.786 and 0.731 on MovieLens100k and MovieLens1M, respectively, whereas IPWR reaches an MAE of 3.256 on Jester. Under the item-based approach, AMI obtains an MAE of 0.745, 0.724 and 3.281 on MovieLens100k, MovieLens1M and Jester, respectively.
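
To make the role of the similarity measure concrete, the following is a minimal sketch of user-based CF and of the MAE metric reported above. It is not the paper's implementation: the abstract does not define ITR, IPWR or AMI, so plain Pearson correlation is swapped in as the similarity measure, and the prediction rule (a mean-centered weighted average over neighbours) is an assumption. The toy ratings and all function names are invented for illustration.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. Values are invented for illustration.
ratings = {
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 2, "c": 3, "d": 4},
    "u3": {"a": 1, "b": 5, "d": 2},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(u, v, data):
    """Pearson correlation over co-rated items (0.0 when undefined).

    Stand-in similarity measure; the paper's ITR/IPWR/AMI are not shown here.
    """
    common = sorted(set(data[u]) & set(data[v]))
    if len(common) < 2:
        return 0.0
    mu = mean(data[u][i] for i in common)
    mv = mean(data[v][i] for i in common)
    du = [data[u][i] - mu for i in common]
    dv = [data[v][i] - mv for i in common]
    denom = sqrt(sum(x * x for x in du)) * sqrt(sum(y * y for y in dv))
    return sum(x * y for x, y in zip(du, dv)) / denom if denom else 0.0


def predict(user, item, data, sim=pearson):
    """Assumed prediction rule: the user's mean rating plus the
    similarity-weighted average of neighbours' mean-centered ratings."""
    base = mean(data[user].values())
    num = den = 0.0
    for other in data:
        if other == user or item not in data[other]:
            continue
        w = sim(user, other, data)
        num += w * (data[other][item] - mean(data[other].values()))
        den += abs(w)
    return base + num / den if den else base


def mae(pairs):
    """Mean absolute error over (predicted, actual) rating pairs."""
    pairs = list(pairs)
    return sum(abs(p - a) for p, a in pairs) / len(pairs)
```

Because `predict` takes the similarity function as a parameter, changing the measure changes both which neighbours count and how much weight each one gets. That dependence is why the choice of similarity measure affects the resulting MAE.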

        Published In

        Journal of King Saud University - Computer and Information Sciences  Volume 34, Issue 9
        Oct 2022
        1335 pages

        Publisher

        Elsevier Science Inc.

        United States

        Author Tags

        1. Recommender System
        2. Collaborative Filtering
        3. Similarity measure
        4. User-based CF
        5. Item-based CF

        Qualifiers

        • Review-article

        Cited By

        • (2024) Artwork Recommendations based on User Preferences: Integrating Clustering Analysis with Visual Features. Journal on Computing and Cultural Heritage 17:3 (1-10). doi:10.1145/3649901. Online publication date: 15-May-2024
        • (2024) SPINEX. Applied Soft Computing 157:C. doi:10.1016/j.asoc.2024.111518. Online publication date: 1-May-2024
        • (2024) Ontology-based recommender system: a deep learning approach. The Journal of Supercomputing 80:9 (12102-12122). doi:10.1007/s11227-023-05874-0. Online publication date: 1-Jun-2024
        • (2023) A Normative Approach to Privacy-Preserving Recommender Systems. International Journal of Intelligent Systems 2023. doi:10.1155/2023/2959503. Online publication date: 1-Jan-2023
        • (2023) Secure and Enhanced Online Recommendations: A Federated Intelligence Approach. IEEE Transactions on Consumer Electronics 70:1 (2500-2507). doi:10.1109/TCE.2023.3335156. Online publication date: 28-Nov-2023
