research-article

On the feasibility of crawling-based attacks against recommender systems

Authors: Fabio Aiolli, Mauro Conti, Stjepan Picek, Mirko Polato

Journal of Computer Security, Volume 30, Issue 4

Pages 599 - 621

https://doi.org/10.3233/JCS-210041

Published: 01 January 2022

Abstract

Online services such as e-commerce and streaming platforms provide a personalized user experience through recommender systems. These systems are built on a vast amount of data about users and items acquired by the services, and this knowledge is an invaluable resource. Part of it, however, is commonly public and easily accessed via the Internet, so it can also be leveraged by competitors or malicious users. The literature offers many works on attacks against recommender systems, but most of them assume that the attacker can easily access the full rating matrix. In practice, this is never the case: the only way to access the rating matrix is to gather the ratings (e.g., reviews) by crawling the service’s website. Crawling has a cost in time and resources, and the targeted website can employ defensive measures to detect automatic scraping.

In this paper, we assess the impact of a series of attacks on recommender systems. Our analysis aims to set up the most realistic scenarios, considering both the capabilities and the limitations of a potential attacker. In particular, we assess the impact of different crawling approaches when attacking a recommendation service. From the collected information, we mount various profile injection attacks. We measure the value of the collected knowledge through the identification of the most similar user/item. Our empirical results show that while crawling can indeed bring knowledge to the attacker (up to 65% neighborhood reconstruction on a mid-size dataset and up to 90% on a small-size dataset), this knowledge is not enough to mount a successful shilling attack in practice.
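
As an illustration of what "neighborhood reconstruction" could mean, the following is a minimal sketch, not the procedure used in the paper. It assumes cosine similarity over sparse rating profiles, a uniform random sample of ratings as a stand-in for a real crawling strategy, and hypothetical function names (`cosine`, `top_k_neighbors`, `crawl`, `neighborhood_reconstruction`). The score is the average fraction of each user's true top-k neighbors that is recovered from the crawled data alone.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)


def top_k_neighbors(ratings, user, k):
    """Return the set of the k users most similar to `user`.

    Ties are broken by user id so the result is deterministic.
    """
    others = [o for o in ratings if o != user]
    others.sort(key=lambda o: (-cosine(ratings[user], ratings[o]), o))
    return set(others[:k])


def crawl(ratings, fraction, seed=0):
    """Simulate a partial crawl (assumption: each rating is seen
    independently with probability `fraction`)."""
    rng = random.Random(seed)
    return {
        user: {i: r for i, r in items.items() if rng.random() < fraction}
        for user, items in ratings.items()
    }


def neighborhood_reconstruction(full, crawled, k):
    """Average overlap, in [0, 1], between the top-k neighborhoods computed
    on the full rating matrix and on the crawled one."""
    overlaps = [
        len(top_k_neighbors(full, u, k) & top_k_neighbors(crawled, u, k)) / k
        for u in full
    ]
    return sum(overlaps) / len(overlaps)
```

Under these assumptions, calling `neighborhood_reconstruction(full, crawl(full, 0.3), k=10)` on a rating dictionary `full` estimates how much of the neighborhood structure survives when roughly 30% of the ratings have been collected. A realistic evaluation would replace `crawl` with an actual traversal strategy over the service's pages.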

Index Terms

On the feasibility of crawling-based attacks against recommender systems
1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Document filtering
      2. Information extraction
  2. Information systems applications
    1. Data mining
2. Security and privacy

Index terms have been assigned to the content through auto-classification.

Information

Published In

Journal of Computer Security, Volume 30, Issue 4 (2022), 134 pages

ISSN: 0926-227X

© 2022 – IOS Press. All rights reserved.

Publisher: IOS Press, Netherlands
