DOI: 10.1145/3583780.3614690
research-article

GraphFC: Customs Fraud Detection with Label Scarcity

Published: 21 October 2023

Abstract

Customs officials across the world encounter huge volumes of transactions. Associated with customs transactions is customs fraud: the intentional manipulation of goods declarations to avoid taxes and duties. Due to limited manpower, customs offices can manually inspect only a small number of declarations, necessitating the automation of customs fraud detection with machine learning techniques. The limited availability of manually inspected ground truth data makes it essential for the ML approach to generalize well to unseen data. However, current customs fraud detection models are not well suited to or designed for this setting. In this work, we propose GraphFC (Graph Neural networks for Customs Fraud), a model-agnostic, domain-specific, graph-neural-network-based customs fraud detection model designed to work in a real-world setting with limited ground truth data. Extensive experimentation using real customs data from two countries demonstrates that GraphFC generalizes well to unseen data and outperforms various baselines and other models by a large margin.

Cited By

  • (2024) Burstiness-aware Bipartite Graph Neural Networks for Fraudulent User Detection on Rating Platforms. Companion Proceedings of the ACM Web Conference 2024, 834-837. DOI: 10.1145/3589335.3651475. Online publication date: 13-May-2024
  • (2024) Detecting Illicit Food Factories from Chemical Declaration Data via Graph-aware Self-supervised Contrastive Anomaly Ranking. Proceedings of the ACM Web Conference 2024, 4501-4511. DOI: 10.1145/3589334.3648138. Online publication date: 13-May-2024
  • (2024) Hierarchical Bipartite Graph Convolutional Network for Recommendation. IEEE Computational Intelligence Magazine 19:2, 49-60. DOI: 10.1109/MCI.2024.3363973. Online publication date: May-2024

Published In

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
October 2023
5508 pages
ISBN: 9798400701245
DOI: 10.1145/3583780

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. customs fraud detection
  2. fraud detection
  3. graph neural networks
  4. label scarcity
  5. multi-task learning
  6. tabular data

Qualifiers

  • Research-article

Conference

CIKM '23
Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%


Article Metrics

  • Downloads (Last 12 months): 115
  • Downloads (Last 6 weeks): 10

Reflects downloads up to 15 Feb 2025
