research-article

GraphFC: Customs Fraud Detection with Label Scarcity

Authors:

Karandeep Singh,

Shou-De Lin

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

Pages 4829 - 4835

https://doi.org/10.1145/3583780.3614690

Published: 21 October 2023

Abstract

Customs officials across the world encounter huge volumes of transactions. Associated with customs transactions is customs fraud: the intentional manipulation of goods declarations to avoid taxes and duties. Due to limited manpower, customs offices can manually inspect only a small number of declarations, which necessitates automating customs fraud detection with machine learning (ML) techniques. Because manually inspected ground truth data is scarce, it is essential for the ML approach to generalize well to unseen data. However, current customs fraud detection models are not designed for, or well suited to, this setting. In this work, we propose GraphFC (Graph Neural networks for Customs Fraud), a model-agnostic, domain-specific, graph neural network-based customs fraud detection model designed to work in a real-world setting with limited ground truth data. Extensive experimentation using real customs data from two countries demonstrates that GraphFC generalizes well to unseen data and outperforms various baselines and other models by a large margin.

Cited By

Lu Y, Tsai Y, Li C, Chua T, Ngo C, Kumar R, Lauw H, Ka-Wei Lee R (2024). Burstiness-aware Bipartite Graph Neural Networks for Fraudulent User Detection on Rating Platforms. Companion Proceedings of the ACM Web Conference 2024, 834-837. Online publication date: 13-May-2024.
https://dl.acm.org/doi/10.1145/3589335.3651475

Yang S, Li C, Chua T, Ngo C, Ka-Wei Lee R, Kumar R, Lauw H (2024). Detecting Illicit Food Factories from Chemical Declaration Data via Graph-aware Self-supervised Contrastive Anomaly Ranking. Proceedings of the ACM Web Conference 2024, 4501-4511. Online publication date: 13-May-2024.
https://dl.acm.org/doi/10.1145/3589334.3648138

Cheng Y, Zhong Z, Pang J, Li C (2024). Hierarchical Bipartite Graph Convolutional Network for Recommendation. IEEE Computational Intelligence Magazine 19:2, 49-60. Online publication date: May-2024.
https://doi.org/10.1109/MCI.2024.3363973

Index Terms

GraphFC: Customs Fraud Detection with Label Scarcity
1. Applied computing
  1. Computers in other domains
    1. Computing in government
      1. E-government
2. Social and professional topics
  1. Computing / technology policy
    1. Commerce policy
      1. Taxation

Published In

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management

October 2023

5508 pages

ISBN:9798400701245

DOI:10.1145/3583780

General Chairs:
Ingo Frommholz (University of Wolverhampton, UK),
Frank Hopfgartner (University of Koblenz, Germany),
Mark Lee (University of Birmingham, UK),
Michael Oakes (University of Birmingham, UK)

Program Chairs:
Mounia Lalmas (Spotify, UK),
Min Zhang (Tsinghua University, China),
Rodrygo Santos (Federal University of Minas Gerais, Brazil)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CIKM '23

CIKM '23: The 32nd ACM International Conference on Information and Knowledge Management

October 21 - 25, 2023

Birmingham, United Kingdom

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics

Total Citations: 3
Total Downloads: 168
Downloads (Last 12 months): 115
Downloads (Last 6 weeks): 10

Reflects downloads up to 15 Feb 2025
