DOI: 10.1145/2578726.2578747

Improving Neighborhood-Based Collaborative Filtering by Reducing Hubness

Published: 01 April 2014

Abstract

For recommending multimedia items, collaborative filtering (CF) denotes the technique of automatically predicting a user's rating or preference for an item by exploiting the item preferences of a (large) group of other users. In traditional memory-based (or neighborhood-based) recommenders, this is accomplished by, first, selecting a number of similar users (or items) and, second, combining their ratings into a single predicted rating of the user for an item. Strategies both for defining similarity (i.e., to identify nearest neighbors) and for combining ratings (i.e., to weight their impact) have been studied extensively, with sometimes inconsistent findings.
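The two steps described above (neighbor selection, then rating combination) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the toy `RATINGS` data, the choice of cosine similarity over co-rated items, and the helper names `cosine` and `predict` are all assumptions made for this example.

```python
import math

# Toy ratings: user -> {item: rating}. Purely illustrative data.
RATINGS = {
    "ann": {"a": 5.0, "b": 3.0, "c": 4.0},
    "bob": {"a": 4.0, "b": 2.0, "c": 5.0, "d": 4.0},
    "cat": {"a": 1.0, "b": 5.0, "d": 2.0},
    "dan": {"b": 4.0, "c": 2.0, "d": 1.0},
}


def cosine(u, v):
    """Cosine similarity over the items both users rated (0.0 if none)."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def predict(ratings, user, item, k=2):
    """Predict `user`'s rating for `item` from the k nearest neighbors."""
    # Step 1: rank the other users who rated `item` by similarity to `user`.
    candidates = [
        (cosine(ratings[user], r), r[item])
        for other, r in ratings.items()
        if other != user and item in r
    ]
    neighbors = sorted(candidates, reverse=True)[:k]
    # Step 2: combine the neighbors' ratings, weighted by similarity.
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return None  # no usable neighbors
    return sum(sim * rating for sim, rating in neighbors) / total


prediction = predict(RATINGS, "ann", "d", k=2)
```

Both the similarity function and the weighting scheme are interchangeable here, which is where the strategies mentioned above differ from one another.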
In this paper, we investigate how the high dimensionality of user × item matrices affects the quality of memory-based movie rating prediction. By examining several publicly available real-world CF data sets, we show that nearest neighbor selection is affected by similarity concentration and hub occurrence, phenomena that arise from high-dimensional data spaces and the class of similarity measures used. To mitigate this, we adapt a normalization technique called mutual proximity, which has been shown to reduce these effects in classification tasks. Finally, we show that removing hubs and incorporating normalized similarity values into the neighbor weighting step increases rating prediction accuracy, observable on all examined data sets as a lower error measure (RMSE).
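To make hub occurrence and mutual proximity concrete, the sketch below counts how often each object appears in other objects' nearest-neighbor lists (the k-occurrence) and then rescales the similarities with an empirical form of mutual proximity. The abstract does not give the exact formulation used in the paper, so the counting variant, the toy matrix `SIM`, and the function names are assumptions made for illustration.

```python
# Toy symmetric similarity matrix; object 0 is deliberately built as a "hub"
# that is fairly similar to every other object. Illustrative values only.
SIM = [
    [1.0, 0.9, 0.7, 0.6, 0.8],
    [0.9, 1.0, 0.5, 0.4, 0.3],
    [0.7, 0.5, 1.0, 0.55, 0.2],
    [0.6, 0.4, 0.55, 1.0, 0.5],
    [0.8, 0.3, 0.2, 0.5, 1.0],
]


def k_occurrence(sim, k):
    """Count how often each object appears in other objects' top-k lists."""
    n = len(sim)
    counts = [0] * n
    for x in range(n):
        others = sorted(
            (j for j in range(n) if j != x),
            key=lambda j: sim[x][j],
            reverse=True,
        )
        for j in others[:k]:
            counts[j] += 1
    return counts


def mutual_proximity(sim):
    """Empirical mutual proximity for similarities (assumed variant).

    MP(x, y) is the fraction of the remaining objects j that are less
    similar to BOTH x and y than x and y are to each other.
    Assumes at least three objects.
    """
    n = len(sim)
    mp = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(x + 1, n):
            s = sim[x][y]
            hits = sum(
                1
                for j in range(n)
                if j != x and j != y and sim[x][j] < s and sim[y][j] < s
            )
            mp[x][y] = mp[y][x] = hits / (n - 2)
    return mp


raw_counts = k_occurrence(SIM, k=1)
mp_counts = k_occurrence(mutual_proximity(SIM), k=1)
```

On this toy matrix, object 0 is the nearest neighbor of all four other objects under the raw similarities, but of only two of them after rescaling. The rescaled values could also replace the raw similarities as neighbor weights in the rating combination step.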

Published In

ICMR '14: Proceedings of International Conference on Multimedia Retrieval
April 2014
564 pages
ISBN:9781450327824
DOI:10.1145/2578726
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. collaborative filtering
  2. hubs
  3. memory-based
  4. mutual proximity
  5. rating prediction

Qualifiers

  • Tutorial
  • Research
  • Refereed limited

Conference

ICMR '14: International Conference on Multimedia Retrieval
April 1 - 4, 2014
Glasgow, United Kingdom

Acceptance Rates

ICMR '14 Paper Acceptance Rate 21 of 111 submissions, 19%;
Overall Acceptance Rate 254 of 830 submissions, 31%

Cited By

  • (2024) Improving Serendipity for Collaborative Metric Learning Based on Mutual Proximity. Big Data Analytics and Knowledge Discovery, pp. 177-191. DOI: 10.1007/978-3-031-68323-7_14. Online publication date: 26-Aug-2024
  • (2022) On the feasibility of crawling-based attacks against recommender systems. Journal of Computer Security, 30:4, pp. 599-621. DOI: 10.3233/JCS-210041. Online publication date: 1-Jan-2022
  • (2022) Rethinking Correlation-based Item-Item Similarities for Recommender Systems. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2287-2291. DOI: 10.1145/3477495.3532055. Online publication date: 6-Jul-2022
  • (2021) Survey of similarity functions on neighborhood-based collaborative filtering. Expert Systems with Applications: An International Journal, 185:C. DOI: 10.1016/j.eswa.2021.115482. Online publication date: 15-Dec-2021
  • (2020) Improving neighbor-based collaborative filtering by using a hybrid similarity measurement. Expert Systems with Applications, 160 (113651). DOI: 10.1016/j.eswa.2020.113651. Online publication date: Dec-2020
  • (2020) Big Enough to Care Not Enough to Scare! Crawling to Attack Recommender Systems. Computer Security – ESORICS 2020, pp. 165-184. DOI: 10.1007/978-3-030-59013-0_9. Online publication date: 14-Sep-2020
  • (2019) A Hybrid Similarity Measure Based on Binary and Decimal Data for Data Mining. Proceedings of the 2019 5th International Conference on Computing and Artificial Intelligence, pp. 72-77. DOI: 10.1145/3330482.3330520. Online publication date: 19-Apr-2019
  • (2019) A comprehensive empirical comparison of hubness reduction in high-dimensional spaces. Knowledge and Information Systems, 59:1, pp. 137-166. DOI: 10.1007/s10115-018-1205-y. Online publication date: 1-Apr-2019
  • (2018) Hubs in Nearest-Neighbor Graphs. Proceedings of the 8th International Conference on Web Intelligence, Mining and Semantics, pp. 1-4. DOI: 10.1145/3227609.3227691. Online publication date: 25-Jun-2018
  • (2016) Cross-system Recommendation. Proceedings of the 27th ACM Conference on Hypertext and Social Media, pp. 183-188. DOI: 10.1145/2914586.2914640. Online publication date: 10-Jul-2016
