DOI: 10.1145/2517312.2517321
research-article

Is data clustering in adversarial settings secure?

Published: 04 November 2013

Abstract

Clustering algorithms have been increasingly adopted in security applications to spot dangerous or illicit activities. However, they were not originally devised to deal with deliberate attack attempts that may aim to subvert the clustering process itself. Whether clustering can be safely adopted in such settings thus remains questionable. In this work we propose a general framework that allows one to identify potential attacks against clustering algorithms, and to evaluate their impact, by making specific assumptions on the adversary's goal, knowledge of the attacked system, and capabilities of manipulating the input data. We show that an attacker may significantly poison the whole clustering process by adding a relatively small percentage of attack samples to the input data, and that some attack samples may be obfuscated to be hidden within some existing clusters. We present a case study on single-linkage hierarchical clustering, and report experiments on clustering of malware samples and handwritten digits.
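As a rough illustration of why single-linkage clustering can be sensitive to a few added samples, the sketch below builds a toy 2-D dataset and cuts the single-linkage dendrogram at a fixed distance threshold. Everything in it is an assumption made for illustration: the grid data, the threshold of 1.5, the hand-placed "bridge" points, and the function name `count_single_linkage_clusters` are hypothetical and are not taken from the paper, whose actual attack strategies and datasets (malware samples, handwritten digits) are not reproduced here.

```python
from itertools import combinations
from math import dist


def count_single_linkage_clusters(points, threshold):
    """Number of clusters after cutting a single-linkage dendrogram at `threshold`.

    Cutting at height `threshold` is equivalent to taking the connected
    components of the graph that links every pair of points whose
    Euclidean distance is <= threshold, so a union-find structure suffices.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})


# Toy data (assumed): two 5x5 grids with unit spacing, separated by a gap of 3.
cluster_a = [(float(x), float(y)) for x in range(0, 5) for y in range(5)]
cluster_b = [(float(x), float(y)) for x in range(7, 12) for y in range(5)]
clean = cluster_a + cluster_b

# Hypothetical attack samples placed in the gap so that every hop is <= threshold.
bridge = [(5.0, 2.0), (6.0, 2.0)]
poisoned = clean + bridge

print("clusters before attack:", count_single_linkage_clusters(clean, 1.5))
print("clusters after attack: ", count_single_linkage_clusters(poisoned, 1.5))
print("attack fraction: {:.1%}".format(len(bridge) / len(poisoned)))
```

In this toy setting, two added points out of 52 are enough to link the two grids into a single connected component at the chosen threshold. The example only conveys the intuition behind bridging; it makes no claim about how the attack samples are chosen or evaluated in the paper.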




Published In

AISec '13: Proceedings of the 2013 ACM workshop on Artificial intelligence and security
November 2013
116 pages
ISBN:9781450324885
DOI:10.1145/2517312
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. adversarial learning
  2. clustering
  3. computer security
  4. malware detection
  5. security evaluation
  6. unsupervised learning

Qualifiers

  • Research-article

Conference

CCS'13

Acceptance Rates

AISec '13 Paper Acceptance Rate 10 of 17 submissions, 59%;
Overall Acceptance Rate 94 of 231 submissions, 41%
