Abstract
Bankruptcy is an issue of interest in the business world since decades. It is a crucial endeavor for survival to predict this phenomenon in periods of economic turmoil and recession. In fact, bankruptcy modeling is challenging due to the complexity of contributing factors and the highly imbalanced distribution of available data sets. This work aims at improving the prediction power of bankruptcy modeling, by applying cost-sensitive ensemble methods on a real-world Spanish bankruptcy data set to generate prediction models. The performance of the prediction models is highly competitive in comparison with the related research in the field. Cost-sensitive random forests over-performed other approaches in predicting bankruptcy, achieving a geometric mean of 90.7%, 0.094 and 0.088 type I & type II errors, respectively.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Notes
Bought from http://infotel.es.
References
Akerlof, G.A., Romer, P.M., Hall, R.E., Mankiw, N.G.: Looting: the economic underworld of bankruptcy for profit. Brook. Pap. Econ. Act. 1993(2), 1–73 (1993)
Alaminos, D., del Castillo, A., Fernández, M.Á.: A global model for bankruptcy prediction. PLoS ONE 11(11), e0166693 (2016)
Alswiti, W., Faris, H., Aljawazneh, H., Safi, S., Castillo, P., Mora, A., Abukhurma, R., Alsawalqah, H.: Empirical evaluation of advanced oversampling methods for improving bankruptcy prediction. In: Proceedings of the International Conference on Time Series and Forecasting (ITISE 2018), pp. 1495–1506 (2018)
Altman, E.I.: Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. J. Finance 23(4), 589–609 (1968)
Altman, E.I., Hotchkiss, E.: Corporate financial distress and bankruptcy: predict and avoid bankruptcy, analyze and invest in distressed debt, vol. 289. Wiley, Hoboken (2010)
Baird, D.G., Morrison, E.R.: Bankruptcy decision making. J Law Econ Organ 17(2), 356–372 (2001)
Balakrishnama, S., Ganapathiraju, A.: Linear discriminant analysis-a brief tutorial. In: Institute for Signal and information Processing, p. 18 (1998)
Barboza, F., Kimura, H., Altman, E.: Machine learning models and bankruptcy prediction. Expert Syst. Appl. 83, 405–417 (2017)
Bellovary, J.L., Giacomino, D.E., Akers, M.D.: A review of bankruptcy prediction studies: 1930 to present. J. Financ. Educ. 3, 1–42 (2007)
Blanco-Oliver, A., Irimia-Dieguez, A., Oliver-Alfonso, M., Wilson, N.: Improving bankruptcy prediction in micro-entities by using nonlinear effects and non-financial variables. Finance Uver 65(2), 144 (2015)
Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996)
Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
Chawla, N.V.: Data mining for imbalanced datasets: an overview. In: Data Mining and Knowledge Discovery Handbook, pp. 875–886. Springer (2009)
Chen, N., Ribeiro, B., Vieira, A.S., Duarte, J., Neves, J.C.: A genetic algorithm-based approach to cost-sensitive bankruptcy prediction. Expert Syst. Appl. 38(10), 12939–12945 (2011)
Cho, S., Hong, H., Ha, B.C.: A hybrid approach based on the combination of variable selection using decision trees and case-based reasoning using the mahalanobis distance: For bankruptcy prediction. Expert Syst. Appl. 37(4), 3482–3488 (2010)
Collins, R.A., Green, R.D.: Statistical methods for bankruptcy forecasting. J. Econ. Bus. 34(4), 349–354 (1982)
Constand, R.L., Yazdipour, R.: Firm failure prediction models: a critique and a review of recent developments. In: Advances in Entrepreneurial Finance, pp. 185–204. Springer (2011)
Domingos, P.: Metacost: a general method for making classifiers cost-sensitive. In: Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’99, pp. 155–164. ACM, New York, NY, USA (1999). https://doi.org/10.1145/312129.312220
Elkan, C.: The foundations of cost-sensitive learning. In: International Joint Conference on Artificial Intelligence, vol. 17, pp. 973–978. Lawrence Erlbaum Associates Ltd (2001)
Faris, H., Abukhurma, R., Almanaseer, W., Saadeh, M., Mora, A.M., Castillo, P.A., Aljarah, I.: Improving financial bankruptcy prediction in a highly imbalanced class distribution using oversampling and ensemble learning: a case from the spanish market. In: Progress in Artificial Intelligence, pp. 1–23 (2019)
Fejér-Király, G., et al.: Bankruptcy prediction: a survey on evolution, critiques, and solutions. Acta Universitatis Sapientiae, Econ. Bus. 3(1), 93–108 (2015)
Friedman, J.H.: Regularized discriminant analysis. J. Am. Stat. Assoc. 84(405), 165–175 (1989)
García, V., Marqués, A.I., Sánchez, J.S.: Exploring the synergetic effects of sample types on the performance of ensembles for credit risk and corporate bankruptcy prediction. Inform. Fusion 47, 88–101 (2019). https://doi.org/10.1016/j.inffus.2018.07.004
Gerritsen, P.: Accuracy rate of bankruptcy prediction models for the dutch professional football industry. Master’s thesis, University of Twente (2015)
Ghatasheh, N., Faris, H., AlTaharwa, I., Harb, Y., Harb, A.: Business analytics in telemarketing: cost-sensitive analysis of bank campaigns using artificial neural networks. Appl. Sci. 10(7), 2581 (2020). https://doi.org/10.3390/app10072581
Grice, J.S., Dugan, M.T.: The limitations of bankruptcy prediction models: some cautions for the researcher. Rev. Quant. Financ. Acc. 17(2), 151–166 (2001)
Hanley, J.A., McNeil, B.J.: The meaning and use of the area under a receiver operating characteristic (roc) curve. Radiology 143(1), 29–36 (1982)
Kaski, S., Sinkkonen, J., Peltonen, J.: Bankruptcy analysis with self-organizing maps in learning metrics. IEEE Trans. Neural Netw. 12(4), 936–947 (2001)
Khor, K.C., Ng, K.H.: Evaluation of cost sensitive learning for imbalanced bank direct marketing data. Indian J. Sci. Technol. (2016). https://doi.org/10.17485/ijst/2016/v9i42/100812
Kim, M.J., Kang, D.K., Kim, H.B.: Geometric mean based boosting algorithm with over-sampling to resolve data imbalance problem for bankruptcy prediction. Expert Syst. Appl. 42(3), 1074–1082 (2015)
Kiviluoto, K.: Predicting bankruptcies with the self-organizing map. Neurocomputing 21(1), 191–201 (1998)
Kleinert, M.: Comparison of bankruptcy prediction models of Altman (1969), Ohlson (1980) and Zmijewski (1984) on German and Belgian listed companies between 2008–2013. Master’s thesis, University of Twente (2014)
Korol, T., Korodi, A., et al.: An evaluation of effectiveness of fuzzy logic model in predicting the business bankruptcy. Rom. J. Econ. Forecast. 3(1), 92–107 (2011)
Kumar, P.R., Ravi, V.: Bankruptcy prediction in banks and firms via statistical and intelligent techniques-a review. Eur. J. Oper. Res. 180(1), 1–28 (2007)
Kuncheva, L.I., Whitaker, C.J.: Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Mach. Learn. 51(2), 181–207 (2003)
Laitinen, E.K., Laitinen, T.: Bankruptcy prediction: application of the Taylor’s expansion in logistic regression. Int. Rev. Financ. Anal. 9(4), 327–349 (2001)
Le, T., Vo, M.T., Vo, B., Lee, M.Y., Baik, S.W.: A hybrid approach using oversampling technique and cost-sensitive learning for bankruptcy prediction. Complexity 2019, 8460934 (2019). https://doi.org/10.1155/2019/8460934
Lee, H.H., Lin, C.M.: Industry effect, credit contagion and bankruptcy prediction. In: 20th Annual Conference on Pacific Basin Finance, Economics, Accounting, and Management (2012)
Leo, M., Sharma, S., Maddulety, K.: Machine learning in banking risk management: a literature review. Risks 7(1), 29 (2019)
Melville, P., Mooney, R.J.: Constructing diverse classifier ensembles using artificial training examples. IJCAI 3, 505–510 (2003)
Melville, P., Mooney, R.J.: Creating diversity in ensembles using artificial data. Inform. Fusion 6(1), 99–111 (2005)
Min, S.H., Lee, J., Han, I.: Hybrid genetic algorithms and support vector machines for bankruptcy prediction. Expert Syst. Appl. 31(3), 652–660 (2006)
Mossman, C.E., Bell, G.G., Swartz, L.M., Turtle, H.: An empirical comparison of bankruptcy models. Financ. Rev. 33(2), 35–54 (1998)
Nassimbwa, J., Tian, Y.: Bankruptcy effect on business competitors: Empirical study of US companies (2013). http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-76240
Opitz, D., Maclin, R.: Popular ensemble methods: an empirical study. J. Artif. Intell. Res. 11, 169–198 (1999)
Ouenniche, J., Bouslah, K., Cabello, J.M., Ruiz, F.: A new classifier based on the reference point method with application in bankruptcy prediction. J. Oper. Res. Soc. 69(10), 1653–1660 (2018)
O’Brien, R.G., Castelloe, J.: Sample size analysis for traditional hypothesis testing: concepts and issues. In: Pharmaceutical Statistics Using SAS: A Practical Guide, pp. 237–71 (2007)
Pacey, J.W., Pham, T.M.: The predictiveness of bankruptcy models: methodological problems and evidence. Aust. J. Manag. 15(2), 315–337 (1990)
Pervan, I., Kuvek, T.: The relative importance of financial ratios and nonfinancial variables in predicting of insolvency. Croat. Oper. Res. Rev. 4(1), 187–197 (2013)
Rahim, A.H.A., Rashid, N.A., Nayan, A., Ahmad, A.R.: Smote approach to imbalanced dataset in logistic regression analysis. In: Proceedings of the Third International Conference on Computing, Mathematics and Statistics (iCMS2017), pp. 429–433. Springer (2019)
Rey, D., Neuhäuser, M.: Wilcoxon-Signed-Rank Test, pp. 1658–1659. Springer, Berlin (2011). https://doi.org/10.1007/978-3-642-04898-2_616
Rodriguez, J.J., Kuncheva, L.I., Alonso, C.J.: Rotation forest: a new classifier ensemble method. IEEE Trans. Pattern Anal. Mach. Intell. 28(10), 1619–1630 (2006)
Schapire, R.E.: Explaining adaboost. In: Empirical Inference, pp. 37–52. Springer (2013)
Shen, F., Zhao, X., Li, Z., Li, K., Meng, Z.: A novel ensemble classification model based on neural networks and a classifier optimisation technique for imbalanced credit risk evaluation. Phys. A Stat. Mech. Appl. 526, 121073 (2019)
Shin, K.S., Lee, T.S., Kim, H.J.: An application of support vector machines in bankruptcy prediction model. Expert Syst. Appl. 28(1), 127–135 (2005)
Shin, K.S., Lee, Y.J.: A genetic algorithm application in bankruptcy prediction modeling. Expert Syst. Appl. 23(3), 321–328 (2002)
Shumway, T.: Forecasting bankruptcy more accurately: a simple hazard model. J. Bus. 74(1), 101–124 (2001)
Sun, Y., Kamel, M.S., Wong, A.K., Wang, Y.: Cost-sensitive boosting for classification of imbalanced data. Pattern Recogn. 40(12), 3358–3378 (2007)
Turney, P.D.: Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm. J. Artif. Intell. Res. 2, 369–409 (1994)
Vu, L.T., Vu, L.T., Nguyen, N.T., Do, P.T.T., Dao, D.P.: Feature selection methods and sampling techniques to financial distress prediction for vietnamese listed companies. Invest. Manag. Financ. Innov. 16(1), 276 (2019)
Wang, H.: Cost-sensitive adaboost selective ensemble for financial distress prediction. Int. J. u e Serv. Sci. Technol. 8(10), 83–94 (2015)
Wang, J.: Data Warehousing and Mining: Concepts, Methodologies, Tools, and Applications, vol. 3. IGI Global, Pennsylvania (2008)
Weiss, G.M., McCarthy, K., Zabar, B.: Cost-sensitive learning vs. sampling: which is best for handling unbalanced classes with unequal error costs? DMIN 7, 35–41 (2007)
Witten, I.H., Frank, E., Hall, M.A., Pal, C.J.: Data Mining, Fourth Edition: Practical Machine Learning Tools and Techniques, 4th edn. Morgan Kaufmann Publishers Inc., San Francisco (2016)
Wu, X., Yang, D., Zhang, W., Zhang, S.: A hybrid ensemble model for corporate bankruptcy prediction based on feature engineering method. Int. J. Inform. Commun. Sci. 4(3), 63 (2019)
Xu, W., Fu, H., Pan, Y.: A novel soft ensemble model for financial distress prediction with different sample sizes. Math. Probl. Eng. 2019, 3085247 (2019). https://doi.org/10.1155/2019/3085247
Yu, Q., Miche, Y., Lendasse, A., Séverin, E.: Bankruptcy prediction with missing data. In: Proceedings of 2011 International Conference on Data Mining, Las Vegas, USA, pp. 279–285 (2011)
Zefrehi, H.G., Altınçay, H.: Imbalance learning using heterogeneous ensembles. Expert Syst. Appl. 142, 113005 (2020)
Zhang, G., Hu, M.Y., Patuwo, B.E., Indro, D.C.: Artificial neural networks in bankruptcy prediction: general framework and cross-validation analysis. Eur. J. Oper. Res. 116(1), 16–32 (1999)
Zhou, Z.H.: Cost-sensitive learning. In: International Conference on Modeling Decisions for Artificial Intelligence, pp. 17–18. Springer (2011)
Acknowledgements
This work has been supported in part by Ministerio español de Economía y Competitividad under Project TIN2017-85727-C4-2-P (UGR-DeepBio), SPIP2017-02116 and TEC2015-68752 (also funded by FEDER), as well as Project B-TIC-402-UGR18 (FEDER and Junta de Andalucíıa) and RTI2018-102002-A-I00 (Ministerio español de Ciencia, Innovación y Universidades).
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Ghatasheh, N., Faris, H., Abukhurma, R. et al. Cost-sensitive ensemble methods for bankruptcy prediction in a highly imbalanced data distribution: a real case from the Spanish market. Prog Artif Intell 9, 361–375 (2020). https://doi.org/10.1007/s13748-020-00219-x
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s13748-020-00219-x