Unsupervised Anomaly Detection for Financial Auditing with Model-Agnostic Explanations

  • Conference paper
KI 2021: Advances in Artificial Intelligence (KI 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12873)

Abstract

Explainable Artificial Intelligence (AI) has emerged as a key component for black-box Machine Learning (ML) approaches in domains with a high demand for transparency. Besides medical expert systems, which inherently need to be interpretable, transparent, and comprehensible because they deal with life-changing decisions, other application domains such as financial auditing require trust in ML as well. The European General Data Protection Regulation (GDPR) also applies to such highly regulated areas, where an auditor evaluates the financial transactions and statements of a business. In this paper we propose an ML architecture that helps financial auditors by transparently detecting anomalous data points in the absence of ground truth. While Anomaly Detection (AD) is mostly performed in a supervised manner, where model-agnostic explainers can easily be applied, unsupervised AD is hardly comprehensible, especially across different algorithms. In this work we investigate how to resolve this: we describe an integrated architecture for unsupervised AD that identifies outliers at different levels of granularity using an ensemble of independent algorithms. Furthermore, we show how model-agnostic explanations can be generated for such an ensemble using supervised approximation and Local Interpretable Model-Agnostic Explanations (LIME). Additionally, we propose techniques for explanation post-processing that allow explanations to be selective, receiver-dependent, and easily understandable. In a nutshell, our architecture paves the way for model-agnostic explainability for the task of unsupervised AD. It can further be transferred smoothly to other unsupervised ML problems such as clustering.
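The three stages named in the abstract (an ensemble of unsupervised detectors, a supervised approximation of the ensemble, and a local model-agnostic explanation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the toy data, the three stand-in detectors, the voting threshold `min_count = 2`, the logistic-regression surrogate, and the hand-rolled LIME-style local linear fit, which imitates the idea of LIME but does not use the `lime` package.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): 200 ordinary records plus 5 obvious outliers.
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 3)),
               rng.normal(6.0, 0.5, size=(5, 3))])

# --- 1. Ensemble of independent unsupervised detectors (stand-ins) ---
def zscore_flags(X, k=3.0):
    # Flag a record if any feature lies more than k standard deviations from the mean.
    z = np.abs((X - X.mean(axis=0)) / X.std(axis=0))
    return z.max(axis=1) > k

def mad_flags(X, k=3.5):
    # Same idea, but with the median and the median absolute deviation.
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    return (0.6745 * np.abs(X - med) / mad).max(axis=1) > k

def knn_flags(X, n_neighbors=5, quantile=0.97):
    # Flag records whose distance to their n-th nearest neighbour is unusually large.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    kth = np.sort(d, axis=1)[:, n_neighbors]  # column 0 is the point itself
    return kth > np.quantile(kth, quantile)

flags = np.column_stack([zscore_flags(X), mad_flags(X), knn_flags(X)])
min_count = 2  # assumed rule: anomalous if at least min_count detectors agree
y = (flags.sum(axis=1) >= min_count).astype(float)

# --- 2. Supervised approximation: a classifier trained on the ensemble labels ---
def add_bias(X):
    return np.hstack([X, np.ones((len(X), 1))])

def fit_logistic(X, y, lr=0.1, steps=2000):
    # Plain gradient descent on the logistic loss.
    Xb, w = add_bias(X), np.zeros(X.shape[1] + 1)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

w = fit_logistic(X, y)

def predict_proba(Z):
    return 1.0 / (1.0 + np.exp(-add_bias(Z) @ w))

# --- 3. LIME-style local explanation of the surrogate ---
def explain_locally(predict, x, scale, n_samples=500, ridge=1e-3):
    # Perturb x, weight the samples by proximity, fit a weighted ridge regression.
    Z = x + rng.normal(0.0, 1.0, size=(n_samples, x.size)) * scale
    dist = np.linalg.norm((Z - x) / scale, axis=1)
    width = 0.75 * np.sqrt(x.size)
    weights = np.exp(-(dist ** 2) / width ** 2)
    A = add_bias(Z - x)
    AtW = A.T * weights
    coef = np.linalg.solve(AtW @ A + ridge * np.eye(A.shape[1]), AtW @ predict(Z))
    return coef[:-1]  # one local slope per feature, intercept dropped

attributions = explain_locally(predict_proba, X[-1], X.std(axis=0))
```

The surrogate is what makes the explanation step model-agnostic in this sketch: the explainer only calls `predict_proba`, so the detectors behind the labels can be exchanged without touching step 3.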

Supported by DATEV eG.

Acknowledgments

We thank DATEV eG (Markus Decker, Jörg Schaller, Dr. Thilo Edinger, Gregor Fischer) and the University of Bamberg (Prof. Dr. Ute Schmid, head of the Cognitive Systems Group) for their professional and organizational support.

Author information

Corresponding author

Correspondence to Sebastian Kiefer.

Appendices

A Anomaly Detection Ensemble

Table 2. Characteristics of the different AD algorithms included in the ensemble

Fig. 4. Number of detected anomalies depending on threshold min_count for view ACC

Fig. 5. Number of detected anomalies depending on threshold min_count for view TA

Fig. 6. Correlations between different AD methods for view ACC

Fig. 7. Correlations between different AD methods for view TA
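Figs. 4 to 7 concern two quantities: the number of detected anomalies as a function of min_count, and the correlations between the AD methods. A minimal sketch of how such quantities could be computed is given here. It assumes that min_count is the minimum number of detectors that must flag a record, and that correlation means the Pearson correlation of the binary flags; neither is confirmed by this page. The flag matrix is synthetic and the number of detectors is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical binary flag matrix: rows are records, columns are detectors.
n_records, n_detectors = 1000, 6
latent = rng.random(n_records) < 0.05                  # records that are "truly" odd
hit = rng.random((n_records, n_detectors)) < 0.8       # each detector finds most of them
noise = rng.random((n_records, n_detectors)) < 0.03    # plus some false alarms
flags = ((latent[:, None] & hit) | noise).astype(int)

# Number of detected anomalies for each voting threshold min_count.
votes = flags.sum(axis=1)
detected = {m: int((votes >= m).sum()) for m in range(1, n_detectors + 1)}

# Pairwise correlation between detectors (columns are the variables).
corr = np.corrcoef(flags, rowvar=False)

for m, n in detected.items():
    print(f"min_count={m}: {n} records flagged")
```

Raising min_count can only shrink the flagged set, so the counts are non-increasing; where the curve flattens is one possible heuristic for choosing the threshold.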

B Explanations

Fig. 8. Detailed explanation for view TA

Fig. 9. Human-like explanation for view TA

Fig. 10. Detailed explanation for view ACC

Fig. 11. Human-like explanation for view ACC

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Kiefer, S., Pesch, G. (2021). Unsupervised Anomaly Detection for Financial Auditing with Model-Agnostic Explanations. In: Edelkamp, S., Möller, R., Rueckert, E. (eds) KI 2021: Advances in Artificial Intelligence. KI 2021. Lecture Notes in Computer Science, vol 12873. Springer, Cham. https://doi.org/10.1007/978-3-030-87626-5_22

  • DOI: https://doi.org/10.1007/978-3-030-87626-5_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87625-8

  • Online ISBN: 978-3-030-87626-5

  • eBook Packages: Computer Science (R0)
