An empirical study of classification algorithm evaluation for financial risk prediction

Published: 01 March 2011

Abstract

A wide range of classification methods has been used for the early detection of financial risks in recent years. Selecting an adequate classifier (or set of classifiers) for a given dataset is an important task in financial risk prediction. Previous studies indicate that classifier performance in financial risk prediction may vary with the performance measure used and with the circumstances. The main goal of this paper is to develop a two-step approach to evaluating classification algorithms for financial risk prediction. The approach constructs a performance score to measure the performance of classification algorithms and introduces three multiple criteria decision making (MCDM) methods (TOPSIS, PROMETHEE, and VIKOR) to provide a final ranking of classifiers. An empirical study is designed to assess various classification algorithms over seven real-life credit risk and fraud risk datasets from six countries. The results show that linear logistic, Bayesian Network, and ensemble methods are ranked as the top three classifiers by TOPSIS, PROMETHEE, and VIKOR. In addition, this work discusses the construction of a knowledge-rich financial risk management process to increase the usefulness of classification results in financial risk detection.
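
The abstract names TOPSIS as one of the three MCDM methods used to produce the final ranking. As a rough illustration of how such a ranking can work, the following is a minimal sketch of textbook TOPSIS in plain Python. The classifier names, criteria, scores, and weights are made-up placeholders, not results or settings from the paper, and the paper's own performance score and weighting scheme may differ. The sketch also assumes no criterion column is all zeros and that the alternatives are not all identical.

```python
import math


def topsis(scores, weights, benefit):
    """Return one closeness coefficient per alternative (higher is better).

    scores  : rows are alternatives (e.g. classifiers), columns are criteria
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False if smaller is
    """
    n_criteria = len(weights)
    # Vector-normalize each criterion column, then apply its weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in scores)) for j in range(n_criteria)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_criteria)]
        for row in scores
    ]
    # Ideal point takes the best value per criterion, anti-ideal the worst.
    columns = list(zip(*weighted))
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    anti_ideal = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    closeness = []
    for row in weighted:
        d_best = math.dist(row, ideal)        # Euclidean distance to the ideal
        d_worst = math.dist(row, anti_ideal)  # Euclidean distance to the anti-ideal
        closeness.append(d_worst / (d_best + d_worst))
    return closeness


# Hypothetical data: columns are accuracy, AUC, training time in seconds.
names = ["classifier_A", "classifier_B", "classifier_C"]
scores = [
    [0.90, 0.92, 5.0],
    [0.85, 0.88, 1.0],
    [0.70, 0.75, 0.5],
]
weights = [0.4, 0.4, 0.2]       # assumed weights, chosen only for illustration
benefit = [True, True, False]   # training time is a cost criterion

ranking = sorted(zip(names, topsis(scores, weights, benefit)),
                 key=lambda pair: pair[1], reverse=True)
for name, coefficient in ranking:
    print(f"{name}: {coefficient:.3f}")
```

An alternative that is best on every criterion gets a closeness of 1, and one that is worst on every criterion gets 0. Everything in between depends on the weights, which is one reason different MCDM methods or weightings can rank the same classifiers differently.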



Published In

Applied Soft Computing, Volume 11, Issue 2
March 2011
1443 pages

Publisher

Elsevier Science Publishers B. V., Netherlands

Author Tags

  1. Classification algorithm
  2. Financial risk prediction
  3. Knowledge-rich financial risk analysis
  4. Multiple criteria decision making (MCDM)


Cited By

  • (2022) Classification of Imbalanced Data Set in Financial Field Based on Combined Algorithm. Mobile Information Systems 2022. doi:10.1155/2022/1839204. Online publication date: 1-Jan-2022
  • (2022) An empirical application of a hybrid ANFIS model to predict household over-indebtedness. Neural Computing and Applications 34:20 (17343-17353). doi:10.1007/s00521-022-07389-w. Online publication date: 1-Oct-2022
  • (2021) Deep Learning Based on Hierarchical Self-Attention for Finance Distress Prediction Incorporating Text. Computational Intelligence and Neuroscience 2021. doi:10.1155/2021/1165296. Online publication date: 1-Jan-2021
  • (2021) Biogeography based optimization for mining rules to assess credit risk. International Journal of Intelligent Systems in Accounting and Finance Management 28:1 (35-51). doi:10.1002/isaf.1486. Online publication date: 29-Mar-2021
  • (2020) Predicting Extreme Financial Risks on Imbalanced Dataset: A Combined Kernel FCM and Kernel SMOTE Based SVM Classifier. Computational Economics 56:1 (187-216). doi:10.1007/s10614-020-09975-3. Online publication date: 1-Jun-2020
  • (2020) Performance assessment of ensemble learning systems in financial data classification. International Journal of Intelligent Systems in Accounting and Finance Management 27:1 (3-9). doi:10.1002/isaf.1460. Online publication date: 26-Mar-2020
  • (2019) MCDM method for Financial Fraud Detection. Proceedings of the 4th International Conference on Big Data and Internet of Things (1-8). doi:10.1145/3372938.3372949. Online publication date: 23-Oct-2019
  • (2019) Explaining Decision-Making Algorithms through UI. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (1-12). doi:10.1145/3290605.3300789. Online publication date: 2-May-2019
  • (2019) Free alignment classification of dikarya fungi using some machine learning methods. Neural Computing and Applications 31:11 (6995-7016). doi:10.1007/s00521-018-3539-5. Online publication date: 1-Nov-2019
  • (2018) A Pruning Neural Network Model in Credit Classification Analysis. Computational Intelligence and Neuroscience 2018. doi:10.1155/2018/9390410. Online publication date: 1-Jan-2018
