On the feasibility of crawling-based attacks against recommender systems

Published: 01 January 2022

Abstract

Online services such as e-commerce and streaming platforms provide a personalized user experience through recommender systems. Recommender systems are built upon the vast amount of data about users and items that these services acquire. Such knowledge is an invaluable resource. However, part of this knowledge is commonly public and can be easily accessed via the Internet, so it can also be leveraged by competitors or malicious users. The literature offers a large number of works on attacks against recommender systems, but most of them assume that the attacker can easily access the full rating matrix. In practice, this is never the case. The only way to access the rating matrix is to gather the ratings (e.g., reviews) by crawling the service’s website. Crawling a website has a cost in terms of time and resources. What is more, the targeted website can employ defensive measures to detect automatic scraping.
In this paper, we assess the impact of a series of attacks on recommender systems. Our analysis aims to set up the most realistic scenarios, considering both the capabilities and the limitations of a potential attacker. In particular, we assess the impact of different crawling approaches when attacking a recommendation service. From the collected information, we mount various profile injection attacks. We measure the value of the collected knowledge through the identification of the most similar users and items. Our empirical results show that, while crawling can indeed bring knowledge to the attacker (up to 65% of neighborhood reconstruction on a mid-size dataset and up to 90% on a small-size dataset), this knowledge is not enough to mount a successful shilling attack in practice.
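
To make the notion of neighborhood reconstruction more concrete, the following is a minimal sketch of how such a measure could be computed on a toy rating matrix. It is not the paper's evaluation protocol, which the abstract does not specify. The cosine similarity, the uniform random sampling used to imitate a partial crawl, the toy ratings, and all function names (`cosine`, `top_k_neighbors`, `crawl`, `neighborhood_reconstruction`) are assumptions made for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_neighbors(ratings, user, k):
    """Return the set of k users most similar to `user` (ties broken by user id)."""
    scores = [
        (cosine(ratings[user], ratings[other]), other)
        for other in ratings
        if other != user
    ]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return {other for _, other in scores[:k]}


def crawl(ratings, fraction, seed=0):
    """Imitate a partial crawl: keep each rating independently with
    probability `fraction` (an assumed, simplistic sampling model)."""
    rng = random.Random(seed)
    return {
        user: {i: r for i, r in items.items() if rng.random() < fraction}
        for user, items in ratings.items()
    }


def neighborhood_reconstruction(full, partial, k):
    """Average share of each user's true top-k neighbors that is also
    found in the top-k computed from the crawled (partial) data."""
    total = 0.0
    for user in full:
        true_nn = top_k_neighbors(full, user, k)
        seen_nn = top_k_neighbors(partial, user, k)
        total += len(true_nn & seen_nn) / k
    return total / len(full)


# Hypothetical toy data: six users rating five items on a 1-5 scale.
toy_ratings = {
    "u1": {"a": 5, "b": 4, "c": 1},
    "u2": {"a": 5, "b": 4, "c": 1},
    "u3": {"a": 1, "b": 2, "d": 5},
    "u4": {"c": 5, "d": 4, "e": 3},
    "u5": {"a": 4, "e": 5, "b": 3},
    "u6": {"d": 2, "e": 4, "c": 5},
}

if __name__ == "__main__":
    for fraction in (0.25, 0.5, 0.75, 1.0):
        crawled = crawl(toy_ratings, fraction, seed=1)
        score = neighborhood_reconstruction(toy_ratings, crawled, k=2)
        print(f"crawled fraction {fraction:.2f}: reconstruction {score:.2f}")
```

With the full matrix the score is 1 by construction. Under this sampling model, a crawl that observes fewer ratings would typically recover a smaller share of the true neighborhoods, which is the kind of gap between crawled and complete knowledge that the abstract refers to.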

Published In

Journal of Computer Security, Volume 30, Issue 4
2022
134 pages

Publisher

IOS Press, Netherlands

Author Tags

1. Recommender systems
2. security
3. crawling
4. shilling attack
5. collaborative filtering

Qualifiers

• Research-article
