On the effectiveness of risk prediction based on users browsing behavior

Research article. Published: 04 June 2014. DOI: 10.1145/2590296.2590347

Abstract

Users are typically the final target of web attacks: criminals are interested in stealing their money, their personal information, or in infecting their machines with malicious code. However, while many aspects of web attacks have been carefully studied by researchers and security companies, the reasons that make certain users more "at risk" than others are still unknown. Why do certain users never encounter malicious pages while others seem to end up on them on a daily basis?
To answer this question, in this paper we present a comprehensive study on the effectiveness of risk prediction based only on the web browsing behavior of users. Our analysis is based on a telemetry dataset collected by a major antivirus vendor, comprising millions of URLs visited by more than 100,000 users over a period of three months. For each user, we extract detailed usage statistics and distill this information into 74 unique features that model different aspects of the user's behavior.
After the features are extracted, we perform a correlation analysis to see whether any of them is correlated with the probability of visiting malicious web pages. Afterwards, we leverage machine learning techniques to provide a prediction model that can be used to estimate the risk class of a given user. The results of our experiments show that it is possible to predict with reasonable accuracy (up to 87%) which users are more likely to become victims of web attacks, only by analyzing their browsing history.
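
The two analysis steps described in the abstract (correlating per-user features with exposure to malicious pages, then predicting a risk class) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the feature names `night_ratio` and `distinct_domains`, the eight toy users, and the one-feature threshold rule that stands in for the paper's unspecified machine learning models. It is not the authors' implementation and does not use their data.

```python
import math

# Hypothetical toy data, NOT taken from the paper's dataset.
# Each row: (night_ratio, distinct_domains, at_risk), where at_risk is 1
# if the user visited at least one malicious page and 0 otherwise.
FEATURES = ["night_ratio", "distinct_domains"]
USERS = [
    (0.05, 10, 0), (0.10, 40, 0), (0.15, 15, 0), (0.20, 35, 0),
    (0.40, 20, 1), (0.55, 45, 1), (0.60, 25, 1), (0.70, 30, 1),
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows, names):
    """Rank features by absolute correlation with the at_risk label."""
    labels = [row[-1] for row in rows]
    scores = []
    for i, name in enumerate(names):
        column = [row[i] for row in rows]
        scores.append((name, pearson(column, labels)))
    return sorted(scores, key=lambda s: abs(s[1]), reverse=True)


def best_threshold(values, labels):
    """Return (threshold, accuracy) for the rule 'at risk if value > threshold'.

    Candidate thresholds are midpoints between consecutive distinct values.
    Accuracy is measured on the same data used to pick the threshold, so it
    is a training accuracy, not an estimate of real-world performance.
    """
    candidates = sorted(set(values))
    best = (candidates[0], 0.0)
    for lo, hi in zip(candidates, candidates[1:]):
        t = (lo + hi) / 2
        correct = sum((v > t) == bool(y) for v, y in zip(values, labels))
        acc = correct / len(values)
        if acc > best[1]:
            best = (t, acc)
    return best


ranked = rank_features(USERS, FEATURES)
top_name = ranked[0][0]
top_column = [row[FEATURES.index(top_name)] for row in USERS]
threshold, accuracy = best_threshold(top_column, [row[-1] for row in USERS])

print(ranked)
print(f"rule: at risk if {top_name} > {threshold:.2f} "
      f"(training accuracy {accuracy:.0%})")
```

With real telemetry, the threshold rule would be replaced by a proper classifier evaluated on held-out users; the sketch only shows how a correlation ranking can feed a risk-class prediction.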

    Information & Contributors

    Information

    Published In

    ASIA CCS '14: Proceedings of the 9th ACM symposium on Information, computer and communications security
    June 2014
    556 pages
    ISBN:9781450328005
    DOI:10.1145/2590296
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. profiling
    2. risk prediction
    3. web browsing

    Qualifiers

    • Research-article

    Conference

    ASIA CCS '14
    Acceptance Rates

    ASIA CCS '14 Paper Acceptance Rate 50 of 255 submissions, 20%;
    Overall Acceptance Rate 418 of 2,322 submissions, 18%

    Cited By

    • (2024) A Case-Control Study to Measure Behavioral Risks of Malware Encounters in Organizations. IEEE Transactions on Information Forensics and Security 19, 9419-9432. DOI: 10.1109/TIFS.2024.3456960. Online publication date: 2024.
    • (2024) Unveiling the Connection Between Malware and Pirated Software in Southeast Asian Countries: A Case Study. IEEE Open Journal of the Computer Society 5, 62-72. DOI: 10.1109/OJCS.2024.3364576. Online publication date: 2024.
    • (2024) Modeling and management of cyber risk: a cross-disciplinary review. Annals of Actuarial Science, 1-40. DOI: 10.1017/S1748499523000258. Online publication date: 4-Jan-2024.
    • (2023) One size does not fit all. Proceedings of the 32nd USENIX Conference on Security Symposium, 5683-5700. DOI: 10.5555/3620237.3620555. Online publication date: 9-Aug-2023.
    • (2023) A Comparison of Systemic and Systematic Risks of Malware Encounters in Consumer and Enterprise Environments. ACM Transactions on Privacy and Security 26:2, 1-30. DOI: 10.1145/3565362. Online publication date: 12-Apr-2023.
    • (2023) Infection Risk Prediction and Management. Encyclopedia of Cryptography, Security and Privacy, 1-5. DOI: 10.1007/978-3-642-27739-9_1634-1. Online publication date: 9-Mar-2023.
    • (2022) On recruiting and retaining users for security-sensitive longitudinal measurement panels. Proceedings of the Eighteenth USENIX Conference on Usable Privacy and Security, 347-366. DOI: 10.5555/3563609.3563628. Online publication date: 8-Aug-2022.
    • (2022) Protection of Critical Infrastructure Using an Integrated Cybersecurity Risk Management (i-CSRM) Framework. 5G Internet of Things and Changing Standards for Computing and Electronic Systems, 94-133. DOI: 10.4018/978-1-6684-3855-8.ch004. Online publication date: 3-Jun-2022.
    • (2022) Measuring security practices. Communications of the ACM 65:9, 93-102. DOI: 10.1145/3547133. Online publication date: 19-Aug-2022.
    • (2022) The Filters for Websites by Organizing Distance Learning Processes. 2022 2nd International Conference on Technology Enhanced Learning in Higher Education (TELE), 340-342. DOI: 10.1109/TELE55498.2022.9801033. Online publication date: 26-May-2022.
