
Evaluating collaborative filtering recommender systems

Published: 01 January 2004

Abstract

Recommender systems have been evaluated in many, often incomparable, ways. In this article, we review the key decisions in evaluating collaborative filtering recommender systems: the user tasks being evaluated, the types of analysis and datasets being used, the ways in which prediction quality is measured, the evaluation of prediction attributes other than quality, and the user-based evaluation of the system as a whole. In addition to reviewing the evaluation strategies used by prior researchers, we present empirical results from the analysis of various accuracy metrics on one content domain, where all the tested metrics collapsed roughly into three equivalence classes. Metrics within each equivalence class were strongly correlated, while metrics from different equivalence classes were uncorrelated.
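
To make the idea of correlated accuracy metrics concrete, the following is a minimal sketch, not the article's experiment. It scores three made-up algorithm variants with two error metrics and then checks how closely the two metrics agree across the variants. The ratings, the variant names, and the choice of mean absolute error and root mean squared error are all assumptions made for illustration; the abstract does not say which metrics fell into which class.

```python
from math import sqrt


def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def root_mean_squared_error(predicted, actual):
    """Square root of the average squared gap; penalizes large errors more."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical held-out ratings on a 1-5 scale (invented data).
actual = [4, 3, 5, 2, 1]

# Hypothetical predictions from three invented algorithm variants.
variants = {
    "variant_a": [4, 3, 4, 2, 2],
    "variant_b": [3, 3, 3, 3, 3],
    "variant_c": [5, 1, 5, 4, 1],
}

# One score per variant under each metric.
mae_scores = [mean_absolute_error(p, actual) for p in variants.values()]
rmse_scores = [root_mean_squared_error(p, actual) for p in variants.values()]

# A value near 1 means the two metrics rank and space the variants similarly.
metric_correlation = pearson(mae_scores, rmse_scores)
print(f"MAE:  {mae_scores}")
print(f"RMSE: {rmse_scores}")
print(f"Correlation between metrics: {metric_correlation:.3f}")
```

With only three variants the correlation is purely illustrative; a real comparison would need many more algorithm configurations and a real ratings dataset.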



    Published In

    ACM Transactions on Information Systems, Volume 22, Issue 1
    January 2004
    177 pages
    ISSN: 1046-8188
    EISSN: 1558-2868
    DOI: 10.1145/963770

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Collaborative filtering
    2. evaluation
    3. metrics
    4. recommender systems
