Abstract
Detecting fraudulent transactions in a bank is a highly imbalanced classification problem. In addition, a single bank is unlikely to have a wide variety of fraudulent transactions in its own database to learn from. Collaboration between banks is needed to build an effective model, but banks do not share their data with each other because of competition and regulatory restrictions. Federated learning can be leveraged to solve this problem. In this setting, the data held by different banks differ in distribution, so the participants' datasets are non-IID. Moreover, we assume that a minority of the banks could be malicious and will try to disrupt the federated learning process. The problem is therefore to perform federated learning in a non-IID cross-silo setting with active adversaries. To address it, we propose a novel algorithm, Epsilon Cluster Selection, a filter-based aggregation technique that recognizes malicious nodes and prevents them from contributing to the global model being trained. We apply this algorithm to the setting with malicious banks and compare the results. Furthermore, to assess the resilience of our algorithm, we vary the level of maliciousness within a bank by considering stealthier attack scenarios, such as using only part of the data for the attack and keeping malicious banks benign for part of the training time.
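To make the idea of filter-based aggregation concrete, the following is a minimal sketch of a generic distance filter applied before averaging. It is not the Epsilon Cluster Selection algorithm, whose details are not given in this abstract. The function name `filter_and_average`, the `radius` threshold, and the use of a coordinate-wise median as the reference point are all assumptions made for illustration.

```python
from math import dist
from statistics import median


def filter_and_average(updates, radius):
    """Average only the client updates that lie close to a robust center.

    updates: list of equal-length lists of floats, one per client (hypothetical format).
    radius:  hypothetical distance threshold; updates farther away are dropped.
    Returns (averaged_update, kept_updates).
    """
    dim = len(updates[0])
    # Coordinate-wise median: a reference point that a minority of
    # outlying updates cannot move arbitrarily far.
    center = [median(u[i] for u in updates) for i in range(dim)]
    # Keep only updates within `radius` (Euclidean distance) of the center.
    kept = [u for u in updates if dist(u, center) <= radius]
    if not kept:
        raise ValueError("radius too small: every update was filtered out")
    # Plain unweighted mean over the surviving updates.
    averaged = [sum(u[i] for u in kept) / len(kept) for i in range(dim)]
    return averaged, kept


# Three similar updates and one outlier standing in for a malicious client.
client_updates = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2], [10.0, -10.0]]
global_update, accepted = filter_and_average(client_updates, radius=1.0)
```

In this toy run the outlying update is excluded and the remaining three are averaged. A fixed radius is the simplest possible filter; in a non-IID setting honest updates can legitimately differ, so the threshold would need to be chosen with care.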
Data availability statement
We are not sharing any code/data at present.
Notes
This paper is a continuation of our earlier work, Robust Collaborative Fraudulent Transaction Detection using Federated Learning [13], presented at ICMLA 2021.
References
Japkowicz N, Stephen S. The class imbalance problem: a systematic study. Intell Data Anal. 2002;6(5):429–49.
Lopez-Rojas EA, Elmir A, Axelsson S. PaySim: A financial mobile money simulator for fraud detection. In: The 28th European Modeling and Simulation Symposium (EMSS), Larnaca, Cyprus. 2016.
McMahan HB, Moore E, Ramage D, Hampson S, y Arcas BA. Communication-efficient learning of deep networks from decentralized data. In: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS) 2017. JMLR: W&CP. 2017;54.
Kairouz P, McMahan HB. Advances and open problems in federated learning. Found Trends® Mach Learn. 2021. https://doi.org/10.1561/2200000083.
Zhao Y, Li M, Lai L, Suda N, Civin D, Chandra V. Federated learning with non-iid data. arXiv preprint arXiv:1806.00582. 2018.
Blanchard P, El Mhamdi EM, Guerraoui R, Stainer J. Machine learning with adversaries: Byzantine tolerant gradient descent. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R. (eds.) Advances in Neural Information Processing Systems, Curran Associates, Inc. 2017;30.
Li T, Sahu AK, Zaheer M, Sanjabi M, Talwalkar A, Smith V. Federated optimization in heterogeneous networks. Proceedings of Machine learning and systems. 2020;2:429–50.
Wang J, Liu Q, Liang H, Joshi G, Poor HV. Tackling the objective inconsistency problem in heterogeneous federated optimization. Adv Neural Inf Process Syst. 2020;33:7611–23.
Chen X, Liu C, Li B, Lu K, Song D. Targeted backdoor attacks on deep learning systems using data poisoning. arXiv preprint arXiv:1712.05526. 2017 Dec 15.
Li S, Cheng Y, Liu Y, Wang W, Chen T. Abnormal client behavior detection in federated learning. arXiv preprint arXiv:1910.09933. 2019 Oct 22.
Tolpegin V, Truex S, Gursoy ME, Liu L. Data poisoning attacks against federated learning systems. In: Computer Security – ESORICS 2020: 25th European Symposium on Research in Computer Security, Guildford, UK, September 14–18, 2020, Proceedings, Part I. Springer International Publishing; 2020. p. 480–501.
Yu L, Wu L. Towards byzantine-resilient federated learning via group-wise robust aggregation. 2020:81–92. https://doi.org/10.1007/978-3-030-63076-8_6
Myalil D, Rajan MA, Apte M, Lodha S. Robust collaborative fraudulent transaction detection using federated learning. In: 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA). 2021: 373–8. https://doi.org/10.1109/ICMLA52953.2021.00064
Weber M, Chen J, Suzumura T, Pareja A, Ma T, Kanezashi H, Kaler T, Leiserson CE, Schardl TB. Scalable graph learning for anti-money laundering: A first look. arXiv preprint arXiv:1812.00076. 2018 Nov 30.
Mubalaike AM, Adali E. Deep learning approach for intelligent financial fraud detection system. In: 2018 3rd International Conference on Computer Science and Engineering (UBMK). 2018:598–603. https://doi.org/10.1109/UBMK.2018.8566574
Roy A, Sun J, Mahoney R, Alonzi L, Adams S, Beling P. Deep learning detecting fraud in credit card transactions. In: 2018 Systems and Information Engineering Design Symposium (SIEDS). 2018:129–34. https://doi.org/10.1109/SIEDS.2018.8374722
Lebichot B, Le Borgne Y-A, He-Guelton L, Oblé F, Bontempi G. Deep-learning domain adaptation techniques for credit cards fraud detection. In: Oneto L, Navarin N, Sperduti A, Anguita D, editors. Recent advances in big data and deep learning. Cham: Springer; 2020. p. 78–88.
Kazemi Z, Zarrabi H. Using deep networks for fraud detection in the credit card transactions. In: 2017 IEEE 4th International Conference on Knowledge-Based Engineering and Innovation (KBEI). 2017:0630–3. https://doi.org/10.1109/KBEI.2017.8324876
Yang W, Zhang Y, Ye K, Li L, Xu CZ. FFD: A federated learning based method for credit card fraud detection. In: Big Data – BigData 2019: 8th International Congress, Held as Part of the Services Conference Federation, SCF 2019, San Diego, CA, USA, June 25–30, 2019, Proceedings. Springer International Publishing; 2019. p. 18–32.
Huang Y, Chu L, Zhou Z, Wang L, Liu J, Pei J, Zhang Y. Personalized Cross-Silo federated learning on non-IID data. In: press: 2021.
Li Q, Diao Y, Chen Q, He B. Federated learning on non-IID data silos: an experimental study. https://doi.org/10.48550/ARXIV.2102.02079. https://arxiv.org/abs/2102.02079
Li Q, He B, Song D. Practical one-shot federated learning for cross-silo setting. arXiv: Learning. 2021.
Zhang C, Li S, Xia J, Wang W, Yan F, Liu Y. Batchcrypt: efficient homomorphic encryption for cross-silo federated learning. In: 2020 USENIX Annual Technical Conference (USENIX ATC 20), USENIX Association. 2020:493–506.
Wang H, Yurochkin M, Sun Y, Papailiopoulos D, Khazaeni Y. Federated learning with matched averaging. arXiv preprint arXiv:2002.06440. 2020.
Wang H, Kaplan Z, Niu D, Li B. Optimizing federated learning on non-iid data with reinforcement learning. In: IEEE INFOCOM 2020 - IEEE Conference on Computer Communications. 2020:1698–707.
Sattler F, Wiedemann S, Müller K-R, Samek W. Robust communication-efficient federated learning from non-i.i.d. data. IEEE Trans Neural Netw Learn Syst. 2020;31(9):3400–13. https://doi.org/10.1109/TNNLS.2019.2944481.
Yue K, Jin R, Pilgrim R, Wong C-W, Baron D, Dai H. Neural tangent kernel empowered federated learning. arXiv. 2021. https://doi.org/10.48550/ARXIV.2110.03681. arXiv:2110.03681
Fung C, Yoon CJ, Beschastnikh I. Mitigating sybils in federated learning poisoning. arXiv preprint arXiv:1808.04866. 2018 Aug 14.
Chen Y, Yang X, Qin X, Yu H, Chan P, Shen Z. Dealing with label quality disparity in federated learning. Fed Learn Priv Incent. 2020:108–21.
Goldreich O. Foundations of cryptography: volume 2, basic applications. Cambridge university press; 2009 Sep 17.
Rousseeuw PJ. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math. 1987;20:53–65. https://doi.org/10.1016/0377-0427(87)90125-7.
MacQueen J. Classification and analysis of multivariate observations. In: 5th Berkeley Symp. Math. Statist. Probability. 1967 Jun 21 (pp. 281–297). Los Angeles LA USA: University of California.
Davies DL, Bouldin DW. A cluster separation measure. IEEE Trans Pattern Anal Mach Intell PAMI-1. 1979. https://doi.org/10.1109/TPAMI.1979.4766909.
Ryffel T, Trask A, Dahl M, Wagner B, Mancuso J, Rueckert D, Passerat-Palmbach J. A generic framework for privacy preserving deep learning. arXiv preprint arXiv:1811.04017. 2018.
He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition 2016 (pp. 770–778).
Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks. In: Teh YW, Titterington M. (eds.) Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research. 2010;9:249–256. PMLR, Chia Laguna Resort, Sardinia, Italy.
Mohassel P, Zhang Y. SecureML: a system for scalable privacy-preserving machine learning. In: 2017 IEEE Symposium on Security and Privacy (SP). 2017:19–38. https://doi.org/10.1109/SP.2017.12
Funding
The authors declare that they have received no funding.
Author information
Contributions
Research problem identification: MAR, MA; research approach definition: DM, MAR, MA; development and coding: DM; main manuscript text and review: DM, MAR, MA, SL.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Myalil, D., Rajan, M.A., Apte, M. et al. Robust Cross-Silo Federated Fraudulent Transaction Detection in Banks Using Epsilon Cluster Selection. SN COMPUT. SCI. 4, 422 (2023). https://doi.org/10.1007/s42979-023-01873-3
DOI: https://doi.org/10.1007/s42979-023-01873-3