On the effectiveness of risk prediction based on users browsing behavior

Authors:

Davide Balzarotti

ASIA CCS '14: Proceedings of the 9th ACM symposium on Information, computer and communications security

Pages 171 - 182

https://doi.org/10.1145/2590296.2590347

Published: 04 June 2014

Abstract

Users are typically the final target of web attacks: criminals are interested in stealing their money, their personal information, or in infecting their machines with malicious code. However, while many aspects of web attacks have been carefully studied by researchers and security companies, the reasons that make certain users more "at risk" than others are still unknown. Why do certain users never encounter malicious pages while others seem to end up on them on a daily basis?

To answer this question, in this paper we present a comprehensive study on the effectiveness of risk prediction based only on the web browsing behavior of users. Our analysis is based on a telemetry dataset collected by a major antivirus vendor, comprising millions of URLs visited by more than 100,000 users during a period of three months. For each user, we extract detailed usage statistics and distill this information into 74 unique features that model different aspects of the user's behavior.
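As a rough illustration of what distilling a browsing log into per-user features might look like, the sketch below computes a handful of usage statistics from a list of timestamped URLs. The abstract does not enumerate the 74 features, so every feature name here (`total_urls`, `distinct_hosts`, `urls_per_active_day`, `night_fraction`, `top_host_share`), the input layout, and the sample data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from datetime import datetime
from urllib.parse import urlsplit


def extract_features(visits):
    """Turn one user's visits into a small feature dictionary.

    `visits` is a list of (iso_timestamp, url) pairs. The features are
    hypothetical examples, not the ones used in the paper.
    """
    total = len(visits)
    if total == 0:
        return {
            "total_urls": 0,
            "distinct_hosts": 0,
            "urls_per_active_day": 0.0,
            "night_fraction": 0.0,
            "top_host_share": 0.0,
        }

    stamps = [datetime.fromisoformat(ts) for ts, _ in visits]
    host_counts = Counter(urlsplit(url).hostname or "" for _, url in visits)
    active_days = {stamp.date() for stamp in stamps}

    return {
        # Volume of browsing activity.
        "total_urls": total,
        # Diversity of visited sites.
        "distinct_hosts": len(host_counts),
        # Intensity of browsing on the days the user was active.
        "urls_per_active_day": total / len(active_days),
        # Share of visits between midnight and 06:00 (assumed definition of "night").
        "night_fraction": sum(stamp.hour < 6 for stamp in stamps) / total,
        # How concentrated the browsing is on the single most visited host.
        "top_host_share": host_counts.most_common(1)[0][1] / total,
    }


# Made-up sample log for one user.
sample = [
    ("2013-08-01T01:15:00", "http://example.com/a"),
    ("2013-08-01T14:30:00", "http://example.com/b"),
    ("2013-08-02T09:00:00", "https://news.example.org/"),
    ("2013-08-02T23:45:00", "http://example.net/x?y=1"),
]
features = extract_features(sample)
```

Each user then becomes one fixed-length feature vector, which is the form that both a correlation analysis and a classifier expect.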

After the features are extracted, we perform a correlation analysis to see whether any of them is correlated with the probability of visiting malicious web pages. Afterwards, we leverage machine learning techniques to provide a prediction model that can be used to estimate the risk class of a given user. The results of our experiments show that it is possible to predict with reasonable accuracy (up to 87%) which users are more likely to be the victims of web attacks, only by analyzing their browsing history.
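The correlation step can be sketched as follows, assuming each user is represented by a feature dictionary and a binary label (1 if the user visited at least one malicious page, 0 otherwise). The abstract does not say which correlation measure was used; Pearson's coefficient is swapped in here as one common choice, and the feature names and values are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(users):
    """Rank features by absolute correlation with the at-risk label.

    `users` is a list of (feature_dict, label) pairs with label in {0, 1}.
    Returns (feature_name, correlation) pairs, strongest first.
    """
    labels = [label for _, label in users]
    names = users[0][0].keys()
    scores = [
        (name, pearson([feats[name] for feats, _ in users], labels))
        for name in names
    ]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical users: (features, visited_malicious_page).
users = [
    ({"distinct_hosts": 10, "night_fraction": 0.2}, 0),
    ({"distinct_hosts": 12, "night_fraction": 0.1}, 0),
    ({"distinct_hosts": 40, "night_fraction": 0.2}, 1),
    ({"distinct_hosts": 45, "night_fraction": 0.1}, 1),
]
ranked = rank_features(users)
```

In this toy data `distinct_hosts` ranks first because it separates the two groups, while `night_fraction` is essentially uncorrelated with the label. A ranking of this kind only indicates association; the prediction model described in the abstract would be trained and evaluated separately.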

Index Terms

On the effectiveness of risk prediction based on users browsing behavior
1. Social and professional topics
  1. Professional topics
    1. Management of computing and information systems

Information & Contributors

Information

Published In

ASIA CCS '14: Proceedings of the 9th ACM symposium on Information, computer and communications security

June 2014

556 pages

ISBN: 9781450328005

DOI: 10.1145/2590296

General Chair: Shiho Moriai (NICT, Japan)

Program Chairs: Trent Jaeger (Penn State University, USA), Kouichi Sakurai (Kyushu University, Japan)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Seventh Framework Programme

Conference

ASIA CCS '14

Sponsor:

SIGSAC

ASIA CCS '14: 9th ACM Symposium on Information, Computer and Communications Security

June 4 - 6, 2014

Kyoto, Japan

Acceptance Rates

ASIA CCS '14 Paper Acceptance Rate 50 of 255 submissions, 20%;

Overall Acceptance Rate 418 of 2,322 submissions, 18%
