Analyzing Textual Information from Financial Statements for Default Prediction

  • Conference paper
Document Analysis and Recognition - ICDAR 2023 (ICDAR 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14189))

Abstract

Financial statements provide both a quantitative and a qualitative view of a company's financial status at a specific point in time. The paper asserts that, beyond the quantitative information, the qualitative information present in textual disclosures has high discriminating power for predicting financial default. To this end, the paper presents a technique to capture comprehensive 360-degree features from qualitative textual data at multiple granularities. It proposes a new sentence embedding (SE), derived from large language models built specifically for the financial domain, to encode the textual data, and presents three deep learning models built on SE for financial default prediction. To accommodate unstructured and non-standard financial statements from small and unlisted companies, the paper also presents a document processing pipeline so that such companies can be included in financial text modelling. Finally, the paper presents comprehensive experimental results on two datasets that demonstrate the discriminating power of textual features for predicting financial defaults.
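The abstract does not describe how the sentence embedding or the three models are constructed, so the following is only a generic illustration of the general idea (token vectors pooled into sentence embeddings, sentence embeddings pooled into document features, and a classifier scoring default risk). Every function name, the masked mean pooling, the mean-plus-max aggregation, and the logistic scorer are assumptions made for this sketch and are not taken from the paper.

```python
from math import exp
from typing import List, Sequence

Vector = List[float]


def mean_pool(token_vectors: Sequence[Sequence[float]], mask: Sequence[int]) -> Vector:
    """Average the token vectors whose mask entry is 1 (padding is ignored).

    Hypothetical helper: in practice the token vectors would come from a
    language model; here they are plain lists of floats.
    """
    kept = [vec for vec, keep in zip(token_vectors, mask) if keep]
    if not kept:
        raise ValueError("mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def document_features(sentence_embeddings: Sequence[Sequence[float]]) -> Vector:
    """Concatenate the element-wise mean and max over all sentence embeddings.

    This is one simple (assumed) way to summarise a document of many sentences.
    """
    if not sentence_embeddings:
        raise ValueError("document has no sentences")
    n = len(sentence_embeddings)
    dim = len(sentence_embeddings[0])
    mean = [sum(s[i] for s in sentence_embeddings) / n for i in range(dim)]
    maximum = [max(s[i] for s in sentence_embeddings) for i in range(dim)]
    return mean + maximum


def default_probability(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Logistic score standing in for a trained default-prediction model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + exp(-z))


# Toy usage: the third token of the first sentence is padding and is masked out.
sentence_a = mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])
sentence_b = mean_pool([[4.0, 1.0]], [1])
features = document_features([sentence_a, sentence_b])
score = default_probability(features, weights=[0.0, 0.0, 0.0, 0.0], bias=0.0)
```

With all-zero weights the score is 0.5 by construction; a real model would learn the weights from labelled default and non-default filings.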

Author information

Corresponding author

Correspondence to Rohit Bhiogade.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Doshi, C., Shrotiya, H., Bhiogade, R., Bhatt, H.S., Jha, A. (2023). Analyzing Textual Information from Financial Statements for Default Prediction. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14189. Springer, Cham. https://doi.org/10.1007/978-3-031-41682-8_4

  • DOI: https://doi.org/10.1007/978-3-031-41682-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41681-1

  • Online ISBN: 978-3-031-41682-8

  • eBook Packages: Computer Science, Computer Science (R0)