research-article
Open access

Federated and Privacy-Preserving Learning of Accounting Data in Financial Statement Audits

Published: 26 October 2022

Abstract

The ongoing ‘digital transformation’ fundamentally changes the nature, recording, and volume of audit evidence. Nowadays, the International Standards on Auditing (ISA) require auditors to examine vast volumes of a financial statement’s underlying digital accounting records. As a result, audit firms also ‘digitize’ their analytical capabilities and invest in Deep Learning (DL), a successful sub-discipline of Machine Learning. The application of DL offers the ability to learn specialized audit models from data of multiple clients, e.g., organizations operating in the same industry or jurisdiction. In general, regulations require auditors to adhere to strict data confidentiality measures. At the same time, recent discoveries showed that large-scale DL models can leak sensitive information about their training data. Today, it often remains unclear how audit firms can apply DL models while complying with data protection regulations. In this work, we propose a Federated Learning framework to train DL models on audit-relevant accounting data of multiple clients. The framework encompasses Differential Privacy and Split Learning capabilities to mitigate data confidentiality risks at model inference. Our results provide empirical evidence that auditors can benefit from DL models that accumulate knowledge from multiple sources of proprietary client data.
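
To make the idea of training across several clients more concrete, the following is a minimal sketch of one federated averaging round in which each client update is norm-clipped and Gaussian noise is added to the aggregate. It is an illustration under stated assumptions, not the framework proposed in the paper: the function names (`clip`, `federated_round`), the flat-list weight representation, and the parameters `max_norm` and `noise_std` are all hypothetical, and the noise is not calibrated to any formal privacy budget.

```python
import random
from typing import List, Optional


def clip(update: List[float], max_norm: float) -> List[float]:
    """Scale an update so that its L2 norm is at most max_norm."""
    norm = sum(u * u for u in update) ** 0.5
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [u * scale for u in update]


def federated_round(
    global_weights: List[float],
    client_updates: List[List[float]],
    client_sizes: List[int],
    max_norm: float = 1.0,      # hypothetical clipping bound
    noise_std: float = 0.0,     # hypothetical noise scale, not a calibrated privacy budget
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Apply one round of size-weighted averaging of clipped client updates."""
    rng = rng or random.Random(0)
    total = sum(client_sizes)
    aggregate = [0.0] * len(global_weights)

    # Each client contributes only a clipped weight update, never its raw records.
    for update, size in zip(client_updates, client_sizes):
        clipped = clip(update, max_norm)
        for i, u in enumerate(clipped):
            aggregate[i] += (size / total) * u

    # Gaussian noise on the aggregate limits what any single update can reveal.
    noisy = [a + rng.gauss(0.0, noise_std) for a in aggregate]
    return [w + n for w, n in zip(global_weights, noisy)]


if __name__ == "__main__":
    weights = federated_round(
        global_weights=[0.0, 0.0],
        client_updates=[[3.0, 4.0], [0.0, 0.0]],
        client_sizes=[1, 1],
        max_norm=1.0,
        noise_std=0.0,
    )
    print(weights)
```

In this sketch the server only ever sees clipped updates, which bounds the influence of any one client before noise is added. A production setup would more plausibly rely on established libraries for the privacy accounting rather than a hand-rolled noise step.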



Published In

ICAIF '22: Proceedings of the Third ACM International Conference on AI in Finance
November 2022
527 pages
ISBN:9781450393768
DOI:10.1145/3533271
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. accounting information systems
  2. anomaly detection
  3. computer-assisted audit techniques
  4. differential privacy
  5. enterprise resource planning systems
  6. federated learning
  7. financial auditing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICAIF '22
