Abstract
To date, relatively little research in credit risk analysis has compared the well-known statistical, optimization (heuristic), and machine learning based approaches in a single article. This work reviews credit risk assessment using sixteen well-known approaches. According to the findings, machine learning approaches are more accurate than traditional statistical methods in dealing with financial difficulties, especially when the data contain nonlinear patterns. Hybrid or ensemble algorithms, in turn, have been found to outperform standalone classifiers in the vast majority of situations. Across the reviewed studies, we encountered 46 datasets: 35 were used only once, and among the other 11, the Australian, German, and Japanese datasets were the three most frequently used by researchers. Finally, the paper compares nine machine learning classifiers on two benchmark datasets. In these experiments, ensemble classifiers outperformed the standalone classifiers on both datasets, which is consistent with prior research. Although some of these approaches achieve high accuracy, further study is needed to identify the parameters and procedures that yield better outcomes in a transparent manner. This study also serves as a reference source for credit risk analysis in both academic and practical domains, since it contains relevant information on the major machine learning approaches employed so far.
Data availability
All data generated or analysed during this study are included in this article.
Abbreviations
- AdaBoost: Adaptive Boosting
- ANFIS: Adaptive Neuro-Fuzzy Inference System
- ANN: Artificial Neural Network
- AUC: Area Under Curve
- BPNN: Back-Propagation Neural Network
- CART: Classification And Regression Tree
- CCR: Candidate Classifier Repository
- CGD: Conjugate Gradient Descent
- CNN: Convolutional Neural Network
- ConsA: Consensus Approach
- CRJ: Cycle Reservoir with Regular Jump
- CSVM: Clustered Support Vector Machine
- DA: Discriminant Networks
- DAG: Directed Acyclic Graph
- DNN: Deep Neural Network
- DP: Discriminate Power
- DT: Decision Tree
- EAD: Exposure At Default
- EmNN: Emotional Neural Network
- EMPNGA: Enhanced Multi-Population Niche Genetic Algorithm
- FKNN: Fuzzy K-Nearest Neighbour
- FNN: Feedforward Neural Network
- GA: Genetic Algorithm
- GBDT: Gradient Boosting Decision Tree
- GD: Gradient Descent
- GNG: Gabriel Neighbourhood Graph
- GRNN: General Regression Neural Network
- GWO: Grey Wolf Optimization
- HMM: Hidden Markov Model
- IFOA: Improved Fruit Fly Optimization Algorithm
- IMF: International Monetary Fund
- KDD: Knowledge Discovery in Data
- KNN: K-Nearest Neighbour
- LDA: Linear Discriminant Analysis
- LGD: Loss Given Default
- LM: Levenberg-Marquardt
- LR: Logistic Regression
- MARS: Multivariate Adaptive Regression Splines
- MLP: Multilayer Perceptron
- MLPNN: Multilayer Perceptron Neural Network
- MODE-GL: Multi-Objective Evolutionary Algorithm
- MPGA: Multiple Population Genetic Algorithm
- MSE: Mean Squared Error
- NB: Naïve Bayes
- NN: Neural Network
- OS: One-Step Secant
- P2P: Peer To Peer
- PD: Probability of Default
- PNN: Probabilistic Neural Network
- PSO: Particle Swarm Optimization
- PTVPSO: Parallel TVPSO
- RBF: Radial Basis Function
- RF: Random Forest
- RFoGAPS: Random Forest optimized by genetic algorithm with profit score
- RNN: Recurrent Neural Network
- ROC: Receiver Operating Characteristic
- RoS: Random Over Sampling
- SME: Small- and Medium-sized Enterprises
- SMOTE: Synthetic Minority Over-Sampling Technique
- SVM: Support Vector Machine
- TLP: Traditional Linear Programming
- TVPSO: Time Variant Particle Swarm Optimization
- UNCTAD: UN Conference on Trade and Development
Acknowledgements
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Bhattacharya, A., Biswas, S.K. & Mandal, A. Credit risk evaluation: a comprehensive study. Multimed Tools Appl 82, 18217–18267 (2023). https://doi.org/10.1007/s11042-022-13952-3
DOI: https://doi.org/10.1007/s11042-022-13952-3