Unsupervised Anomaly Detection for Financial Auditing with Model-Agnostic Explanations

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12873)

Included in the following conference series:

German Conference on Artificial Intelligence (Künstliche Intelligenz)


Abstract

Explainable Artificial Intelligence (AI) has emerged as a key component of black-box Machine Learning (ML) approaches in domains with a high demand for transparency. Besides medical expert systems, which inherently need to be interpretable, transparent, and comprehensible because they deal with life-changing decisions, other application domains such as financial auditing require trust in ML as well. The European General Data Protection Regulation (GDPR) also applies to such highly regulated areas, where an auditor evaluates the financial transactions and statements of a business. In this paper we propose an ML architecture intended to help financial auditors by transparently detecting anomalous data points in the absence of ground truth. While Anomaly Detection (AD) is most often performed in a supervised manner, where model-agnostic explainers can be applied easily, unsupervised AD is hardly comprehensible, especially across different algorithms. In this work we investigate how to resolve this: we describe an integrated architecture for unsupervised AD that identifies outliers at different levels of granularity using an ensemble of independent algorithms. Furthermore, we show how model-agnostic explanations can be generated for such an ensemble using supervised approximation and Local Interpretable Model-Agnostic Explanations (LIME). Additionally, we propose techniques for explanation post-processing that allow explanations to be selective, receiver-dependent, and easily understandable. In a nutshell, our architecture paves the way for model-agnostic explainability in unsupervised AD. It can further be transferred to other unsupervised ML problems such as clustering.
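
The pipeline outlined in the abstract (unsupervised ensemble, supervised approximation, local explanation) can be illustrated with a minimal, standard-library Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two detectors (a median/MAD score and a k-nearest-neighbour distance), the rank averaging, the decision stump used as supervised approximator, the toy data, and all function names. The explanation step is only LIME-style: it fits a proximity-weighted linear model around one instance, but uses a fixed perturbation grid instead of LIME's random sampling so that the result is reproducible.

```python
import math
import statistics
from itertools import product

def mad_scores(rows):
    """Detector 1: largest robust z-score (|x - median| / MAD) over a row's features."""
    cols = list(zip(*rows))
    meds = [statistics.median(c) for c in cols]
    mads = [statistics.median([abs(v - m) for v in c]) or 1.0 for c, m in zip(cols, meds)]
    return [max(abs(v - m) / d for v, m, d in zip(r, meds, mads)) for r in rows]

def knn_scores(rows, k=3):
    """Detector 2: distance to the k-th nearest neighbour."""
    return [sorted(math.dist(r, q) for j, q in enumerate(rows) if j != i)[k - 1]
            for i, r in enumerate(rows)]

def rank_normalise(scores):
    """Map raw scores to ranks in [0, 1] so that detectors become comparable."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    ranks = [0.0] * len(scores)
    for pos, idx in enumerate(order):
        ranks[idx] = pos / (len(scores) - 1)
    return ranks

def ensemble_labels(rows, n_anomalies):
    """Average the rank-normalised detector scores; flag the top n_anomalies rows."""
    per_detector = [rank_normalise(s) for s in (mad_scores(rows), knn_scores(rows))]
    combined = [sum(s) / len(s) for s in zip(*per_detector)]
    top = set(sorted(range(len(rows)), key=combined.__getitem__)[-n_anomalies:])
    return [int(i in top) for i in range(len(rows))]

def fit_stump(rows, labels):
    """Supervised approximation: a one-feature threshold rule imitating the ensemble."""
    best = None
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            errors = sum(int(r[f] > thr) != y for r, y in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, f, thr)
    _, f, thr = best
    return (lambda x: float(x[f] > thr)), f, thr

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bv] for row, bv in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                factor = m[r][c] / m[c][c]
                m[r] = [u - factor * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def local_explanation(predict, x, scales, steps=(-3, -2, -1, 0, 1, 2, 3), width=2.0):
    """LIME-style local surrogate: proximity-weighted linear fit to `predict` around x.

    Perturbations lie on a fixed grid (in units of `scales`), not on random samples.
    """
    n = len(x)
    xtx = [[0.0] * (n + 1) for _ in range(n + 1)]
    xty = [0.0] * (n + 1)
    for z in product(steps, repeat=n):
        point = [xi + zi * s for xi, zi, s in zip(x, z, scales)]
        w = math.exp(-sum(zi * zi for zi in z) / width ** 2)  # proximity kernel
        row = [1.0, *z]
        y = predict(point)
        for i in range(n + 1):
            xty[i] += w * row[i] * y
            for j in range(n + 1):
                xtx[i][j] += w * row[i] * row[j]
    return solve(xtx, xty)[1:]  # drop the intercept: one weight per feature

# Toy "transactions" (amount, payment delay in days); the last two rows are planted outliers.
rows = [(100.0 + 5 * (i % 8), 10.0 + i // 8) for i in range(40)] + [(900.0, 12.0), (950.0, 11.0)]
labels = ensemble_labels(rows, n_anomalies=2)
predict, feature, threshold = fit_stump(rows, labels)
scales = [statistics.pstdev(c) for c in zip(*rows)]
weights = local_explanation(predict, rows[40], scales)
```

On this toy data the surrogate splits on the amount feature, and the local weights attribute the anomaly to the amount rather than to the payment delay. In practice one would presumably replace the stump with a stronger classifier and the grid with a full LIME implementation; the sketch only shows how the three stages connect.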

Supported by DATEV eG.

Acknowledgments

We thank DATEV eG (Markus Decker, Jörg Schaller, Dr. Thilo Edinger, Gregor Fischer) and the University of Bamberg (Prof. Dr. Ute Schmid, head of the Cognitive Systems Group) for professional and organizational support.

Author information

Authors and Affiliations

DATEV eG, Paumgartnerstr. 6-14, 90429, Nürnberg, Germany
Sebastian Kiefer & Günter Pesch
Cognitive Systems, University of Bamberg, Kapuzinerstraße 16, 96047, Bamberg, Germany
Sebastian Kiefer

Corresponding author

Correspondence to Sebastian Kiefer.

Editor information

Editors and Affiliations

Czech Technical University in Prague, Prague, Czech Republic
Stefan Edelkamp
University of Lübeck, Lübeck, Germany
Ralf Möller
University of Leoben, Leoben, Austria
Elmar Rueckert

Appendices

A Anomaly Detection Ensemble

Table 2. Characteristics of the different AD algorithms included in the ensemble

B Explanations

About this paper

Cite this paper

Kiefer, S., Pesch, G. (2021). Unsupervised Anomaly Detection for Financial Auditing with Model-Agnostic Explanations. In: Edelkamp, S., Möller, R., Rueckert, E. (eds) KI 2021: Advances in Artificial Intelligence. KI 2021. Lecture Notes in Computer Science, vol 12873. Springer, Cham. https://doi.org/10.1007/978-3-030-87626-5_22

DOI: https://doi.org/10.1007/978-3-030-87626-5_22
Published: 30 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87625-8
Online ISBN: 978-3-030-87626-5
eBook Packages: Computer Science, Computer Science (R0)
