The GANfather: Controllable generation of malicious activity to improve defence systems

Authors:

Ricardo Ribeiro Pereira,

João Tiago Ascensão,

David Aparício,

Pedro Bizarro

ICAIF '23: Proceedings of the Fourth ACM International Conference on AI in Finance

Pages 133 - 140

https://doi.org/10.1145/3604237.3626882

Published: 25 November 2023

Abstract

Machine learning methods to aid defence systems in detecting malicious activity typically rely on labelled data. In some domains, such labelled data is unavailable or incomplete. In practice, this can lead to low detection rates and high false positive rates, which characterise, for example, anti-money laundering systems. In fact, it is estimated that 1.7–4 trillion euros are laundered annually and go undetected. We propose The GANfather, a method to generate samples with properties of malicious activity, without label requirements. We reward the generation of malicious samples by introducing an extra objective to the typical Generative Adversarial Network (GAN) loss. Ultimately, our goal is to enhance the detection of illicit activity using the discriminator network as a novel and robust defence system. Optionally, we may encourage the generator to bypass pre-existing detection systems. This setup then reveals defensive weaknesses for the discriminator to correct. We evaluate our method in two real-world use cases: money laundering and recommendation systems. In the former, our method moves cumulative amounts close to 350 thousand dollars through a network of accounts without being detected by an existing system. In the latter, we recommend the target item to a broad user base with as few as 30 synthetic attackers. In both cases, we train a new defence system to capture the synthetic attacks.
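
The abstract describes a generator loss made of the usual GAN term, an extra objective that rewards malicious behaviour, and an optional term that discourages detection by a pre-existing system. The following is a minimal sketch of how such a combined loss could be assembled, not the authors' formulation: the Wasserstein-style adversarial term, the function name `generator_loss`, the weights `objective_weight` and `detector_weight`, and the example scores are all assumptions made for illustration.

```python
import numpy as np


def generator_loss(critic_scores, objective_values, detector_scores=None,
                   objective_weight=1.0, detector_weight=1.0):
    """Return a scalar that a generator would minimise (hypothetical sketch).

    critic_scores    -- discriminator/critic outputs for generated samples
                        (higher is assumed to mean "more realistic")
    objective_values -- per-sample value of the malicious objective,
                        e.g. the amount of money moved (higher = more malicious)
    detector_scores  -- optional alert scores from a pre-existing detection
                        system (higher = more likely to be flagged)
    """
    critic_scores = np.asarray(critic_scores, dtype=float)
    objective_values = np.asarray(objective_values, dtype=float)

    # Adversarial term (assumed Wasserstein-style): push critic scores up.
    adversarial_term = -critic_scores.mean()

    # Extra objective: reward samples that score high on the malicious goal.
    objective_term = -objective_values.mean()

    loss = adversarial_term + objective_weight * objective_term

    # Optional term: penalise samples that the existing detector would flag.
    if detector_scores is not None:
        detection_term = np.asarray(detector_scores, dtype=float).mean()
        loss += detector_weight * detection_term

    return float(loss)


# Toy values chosen only to exercise the function.
loss_without_detector = generator_loss([0.2, 0.4], [1.0, 3.0])
loss_with_detector = generator_loss([0.2, 0.4], [1.0, 3.0],
                                    detector_scores=[0.5, 0.1])
```

In this sketch the two weights control the trade-off between realism, the malicious objective, and evasion; in a real training loop the three inputs would come from differentiable networks rather than fixed arrays.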

Cited By

Ayemowa M, Ibrahim R, Khan M. (2024). Analysis of Recommender System Using Generative Artificial Intelligence: A Systematic Literature Review. IEEE Access 12, 87742–87766. Online publication date: 2024.
https://doi.org/10.1109/ACCESS.2024.3416962

Index Terms

The GANfather: Controllable generation of malicious activity to improve defence systems
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
    2. Machine learning approaches
2. Security and privacy

Index terms have been assigned to the content through auto-classification.

Information & Contributors

Information

Published In

ICAIF '23: Proceedings of the Fourth ACM International Conference on AI in Finance

November 2023

697 pages

ISBN: 9798400702402

DOI: 10.1145/3604237

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICAIF '23

ICAIF '23: 4th ACM International Conference on AI in Finance

November 27 - 29, 2023

Brooklyn, NY, USA

Bibliometrics

Total Citations: 1
Total Downloads: 81
Downloads (Last 12 months): 31
Downloads (Last 6 weeks): 4

Reflects downloads up to 05 Mar 2025
