DOI: 10.1145/2046707.2046762
research-article

SURF: detecting and measuring search poisoning

Published: 17 October 2011

Abstract

Search engine optimization (SEO) techniques are often abused to promote websites among search results, a practice known as blackhat SEO. In this paper we tackle a newly emerging and especially aggressive class of blackhat SEO, namely search poisoning. Unlike other blackhat SEO techniques, which typically attempt to promote a website's ranking only under a limited set of search keywords relevant to the website's content, search poisoning techniques disregard any term-relevance constraint and poison popular search keywords with the sole purpose of diverting large numbers of users to short-lived, traffic-hungry websites for malicious purposes. To accurately detect search poisoning cases, we designed a novel detection system called SURF. SURF runs as a browser component, extracts a number of robust (i.e., difficult to evade) detection features from search-then-visit browsing sessions, and accurately classifies malicious search user redirections that result from users clicking on poisoned search results. Our evaluation on real-world search poisoning instances shows that SURF achieves a detection rate of 99.1% at a false positive rate of 0.9%. Furthermore, we applied SURF to analyze a large dataset of search-related browsing sessions collected over a period of seven months starting in September 2010. Through this long-term measurement study we reveal new trends and interesting patterns across a great variety of poisoning cases, contributing to a better understanding of the prevalence and gravity of the search poisoning problem.
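To make the abstract's terms concrete, the following is a minimal sketch in Python, not SURF's implementation. The `Session` record layout, the two redirection features, and all function names are assumptions introduced purely for illustration; the abstract does not list SURF's actual features. The last function shows how a detection rate and a false positive rate, the two figures reported above, are conventionally computed from labeled classification outcomes.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Session:
    """One search-then-visit session (hypothetical record layout)."""
    search_url: str            # search results page the user clicked from
    redirect_chain: list[str]  # URLs visited in order, ending at the landing page
    poisoned: bool             # ground-truth label for evaluation


def redirect_features(session: Session) -> dict[str, int]:
    """Illustrative features only; NOT the feature set used by SURF."""
    hosts = [urlsplit(u).hostname for u in session.redirect_chain]
    # Count hops where the hostname changes between consecutive URLs.
    cross_host_hops = sum(1 for a, b in zip(hosts, hosts[1:]) if a != b)
    return {
        "hops": max(len(session.redirect_chain) - 1, 0),
        "cross_host_hops": cross_host_hops,
    }


def detection_and_fp_rate(labels, predictions):
    """Return (detection rate, false positive rate) for boolean labels/predictions.

    Detection rate = TP / (TP + FN); false positive rate = FP / (FP + TN).
    A rate whose denominator is zero is reported as 0.0.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    fp = sum(1 for y, p in pairs if not y and p)
    tn = sum(1 for y, p in pairs if not y and not p)
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, fp_rate
```

Under these definitions, a detection rate of 99.1% means that 99.1% of truly poisoned sessions are flagged, and a false positive rate of 0.9% means that 0.9% of benign sessions are flagged by mistake.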



Published In

CCS '11: Proceedings of the 18th ACM conference on Computer and communications security
October 2011
742 pages
ISBN:9781450309486
DOI:10.1145/2046707
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. detection
  2. malicious search engine redirection
  3. measurement
  4. search engine poisoning

Qualifiers

  • Research-article

Conference

CCS'11

Acceptance Rates

CCS '11 Paper Acceptance Rate 60 of 429 submissions, 14%;
Overall Acceptance Rate 1,261 of 6,999 submissions, 18%


Cited By

  • (2024) Overview of Social Engineering Protection and Prevention Methods. Computer Security. ESORICS 2023 International Workshops. DOI: 10.1007/978-3-031-54204-6_4 (64-83). Online publication date: 1-Mar-2024
  • (2023) Scripted Henchmen: Leveraging XS-Leaks for Cross-Site Vulnerability Detection. 2023 IEEE Security and Privacy Workshops (SPW). DOI: 10.1109/SPW59333.2023.00038 (371-383). Online publication date: May-2023
  • (2023) Green Computing and Security Practices for Optimizing Crawler Efficiency. 2023 International Telecommunications Conference (ITC-Egypt). DOI: 10.1109/ITC-Egypt58155.2023.10206217 (151-156). Online publication date: 18-Jul-2023
  • (2023) Understanding, Measuring, and Detecting Modern Technical Support Scams. 2023 IEEE 8th European Symposium on Security and Privacy (EuroS&P). DOI: 10.1109/EuroSP57164.2023.00011 (18-38). Online publication date: Jul-2023
  • (2022) Prevalence of Poisoned Google Search Results of Erectile Dysfunction Medications Redirecting to Illegal Internet Pharmacies: Data Analysis Study. Journal of Medical Internet Research. DOI: 10.2196/38957, 24:11 (e38957). Online publication date: 8-Nov-2022
  • (2022) “I Don’t Just Take Whatever They Hand to Me”: How Women Recently Released from Incarceration Access Internet Health Information. Women & Criminal Justice. DOI: 10.1080/08974454.2022.2040692, 34:5 (306-322). Online publication date: 26-Feb-2022
  • (2021) Effect of Infodemic Regarding the Illegal Sale of Medications on the Internet: Evaluation of Demand and Online Availability of Ivermectin during the COVID-19 Pandemic. International Journal of Environmental Research and Public Health. DOI: 10.3390/ijerph18147475, 18:14 (7475). Online publication date: 13-Jul-2021
  • (2021) Understanding the Fake Removal Information Advertisement Sites. Journal of Information Processing. DOI: 10.2197/ipsjjip.29.392, 29 (392-405). Online publication date: 2021
  • (2021) To Get Lost is to Learn the Way: An Analysis of Multi-Step Social Engineering Attacks on the Web. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences. DOI: 10.1587/transfun.2020CIP0005, E104.A:1 (162-181). Online publication date: 1-Jan-2021
  • (2021) How Do Home Computer Users Browse the Web? ACM Transactions on the Web. DOI: 10.1145/3473343, 16:1 (1-27). Online publication date: 28-Sep-2021
