Beyond blacklists: learning to detect malicious web sites from suspicious URLs

J Ma, LK Saul, S Savage, GM Voelker - Proceedings of the 15th ACM …, 2009 - dl.acm.org

Proceedings of the 15th ACM SIGKDD international conference on Knowledge …, 2009•dl.acm.org

Malicious Web sites are a cornerstone of Internet criminal activities. As a result, there has been broad interest in developing systems to prevent the end user from visiting such sites. In this paper, we describe an approach to this problem based on automated URL classification, using statistical methods to discover the tell-tale lexical and host-based properties of malicious Web site URLs. These methods are able to learn highly predictive models by extracting and automatically analyzing tens of thousands of features potentially indicative of suspicious URLs. The resulting classifiers obtain 95-99% accuracy, detecting large numbers of malicious Web sites from their URLs, with only modest false positives.
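
As a rough illustration of what "lexical properties" of a URL might look like as classifier input, the sketch below maps a URL to a sparse feature dictionary. The function name `lexical_features`, the delimiter set, and the specific features (lengths, dot count, host and path tokens) are assumptions chosen for this example, not the authors' exact feature set. Host-based properties, which would need external lookups such as WHOIS or DNS, are left out.

```python
import re
from urllib.parse import urlsplit

# Delimiters used to tokenize the hostname and path.
# This particular set is an assumption made for the sketch.
_DELIMS = re.compile(r"[/?.=\-_&]+")


def lexical_features(url):
    """Map a URL to a sparse dict of illustrative lexical features."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Treat the query string as part of the path for tokenization.
    path = parts.path + ("?" + parts.query if parts.query else "")

    # A few simple numeric features.
    features = {
        "url_length": len(url),
        "host_length": len(host),
        "host_dot_count": host.count("."),
    }

    # Bag-of-words features: one binary feature per distinct token,
    # kept separate for hostname and path.
    for token in _DELIMS.split(host):
        if token:
            features["host_token=" + token] = 1
    for token in _DELIMS.split(path):
        if token:
            features["path_token=" + token] = 1
    return features


if __name__ == "__main__":
    example = lexical_features("http://www.example.com/login/index.php?id=1")
    for name in sorted(example):
        print(name, example[name])
```

Because every distinct token becomes its own feature, a large URL corpus can plausibly yield tens of thousands of sparse features. Dictionaries like these could then be vectorized and passed to a linear classifier of the reader's choice.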

ACM Digital Library

Cited by 1173
