The impact of class imbalance in classification performance metrics based on the binary confusion matrix

Published: 01 July 2019

Highlights

The proposed imbalance coefficient facilitates the measurement of dataset imbalance.
Geometric Mean and Bookmaker Informedness are the best null-biased metrics.
The Matthews Correlation Coefficient is the best option when classification errors must be considered.
The concept of Class Balance Accuracy can be extended to other metrics.

Abstract

A major issue in the classification of class-imbalanced datasets is determining which performance metrics are most suitable. Previous work has shown, through a number of examples, that imbalance can exert a major impact on the value and meaning of accuracy and of certain other well-known performance metrics. In this paper, we go beyond individual case studies and develop a systematic analysis of this impact by simulating the results obtained by binary classifiers. A set of functions and numerical indicators is obtained that enables the comparison of the behaviour of several performance metrics based on the binary confusion matrix when faced with imbalanced datasets. Throughout the paper, a new way to measure imbalance is defined that surpasses the Imbalance Ratio used in previous studies. From the simulation results, several clusters of performance metrics are identified; among them, Geometric Mean and Bookmaker Informedness emerge as the best null-biased metrics, provided that their focus on classification successes (dismissing the errors) is not a limitation for the application at hand. If classification errors must also be considered, the Matthews Correlation Coefficient is the best choice. Finally, a set of null-biased multi-perspective Class Balance Metrics is proposed that extends the concept of Class Balance Accuracy to other performance metrics.
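To make the metrics discussed above concrete, here is a minimal sketch (in Python, not taken from the paper) of how they can be computed from the four counts of a binary confusion matrix. The function binary_metrics, the helper _div and the example counts are illustrative assumptions; the paper's own imbalance coefficient is defined in the full text and is not reproduced here, so only the conventional Imbalance Ratio (majority count divided by minority count) is shown.

```python
# Minimal sketch (not the authors' code): common binary confusion-matrix
# metrics named in the abstract; names and example counts are illustrative.
import math


def _div(num: float, den: float) -> float:
    """Safe division: return 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Sensitivity, specificity, Geometric Mean, Bookmaker Informedness,
    MCC and (binary) Class Balance Accuracy from confusion-matrix counts."""
    sensitivity = _div(tp, tp + fn)   # true positive rate (recall)
    specificity = _div(tn, tn + fp)   # true negative rate

    geometric_mean = math.sqrt(sensitivity * specificity)
    bookmaker_informedness = sensitivity + specificity - 1.0  # Youden's J

    mcc = _div(tp * tn - fp * fn,
               math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))

    # Class Balance Accuracy (Mosley, 2013), written out for the binary case:
    # per class, correct count over max(row sum, column sum), then averaged.
    cba = 0.5 * (_div(tp, max(tp + fn, tp + fp)) +
                 _div(tn, max(tn + fp, tn + fn)))

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "geometric_mean": geometric_mean,
        "bookmaker_informedness": bookmaker_informedness,
        "mcc": mcc,
        "class_balance_accuracy": cba,
    }


# Example: a strongly imbalanced test set (10 positives vs. 990 negatives).
# Accuracy looks excellent, while the null-biased metrics expose the weak
# minority-class performance, which is the effect the paper analyses.
counts = dict(tp=5, fn=5, fp=20, tn=970)
accuracy = (counts["tp"] + counts["tn"]) / sum(counts.values())
imbalance_ratio = _div(counts["tn"] + counts["fp"], counts["tp"] + counts["fn"])
print(f"accuracy={accuracy:.3f}  imbalance_ratio={imbalance_ratio:.0f}")
print(binary_metrics(**counts))
```

Note that Geometric Mean and Bookmaker Informedness depend only on the two per-class success rates, whereas the Matthews Correlation Coefficient also reacts to how the errors are distributed, which mirrors the grouping described in the abstract.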




        Published In

        Pattern Recognition, Volume 91, Issue C, July 2019, 405 pages

        Publisher

        Elsevier Science Inc.

        United States

        Publication History

        Published: 01 July 2019

        Author Tags

        1. Classification
        2. Performance measures
        3. Imbalanced datasets
        4. Class Balance Metrics

        Qualifiers

        • Research-article


        Cited By

        • (2024) Robust weighted general performance score for various classification scenarios. Intelligent Decision Technologies 18(3), 2033–2054. DOI: 10.3233/IDT-240465. Online publication date: 1-Jan-2024.
        • (2024) Optimisation of Machine Learning Based Data Mining Methods for Network Intrusion Detection. Proceedings of the 2024 6th International Conference on Big-data Service and Intelligent Computation, 17–25. DOI: 10.1145/3686540.3686543. Online publication date: 29-May-2024.
        • (2024) A Survey of Algorithmic Methods for Competency Self-Assessments in Human-Autonomy Teaming. ACM Computing Surveys 56(7), 1–31. DOI: 10.1145/3616010. Online publication date: 9-Apr-2024.
        • (2024) Design Principles for Building Robust Human-Robot Interaction Machine Learning Models. Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction, 247–251. DOI: 10.1145/3610978.3640598. Online publication date: 11-Mar-2024.
        • (2024) Federated Learning for Software Engineering: A Case Study of Code Clone Detection and Defect Prediction. IEEE Transactions on Software Engineering 50(2), 296–321. DOI: 10.1109/TSE.2023.3347898. Online publication date: 1-Feb-2024.
        • (2024) Remote Sensing Change Detection With Bitemporal and Differential Feature Interactive Perception. IEEE Transactions on Image Processing 33, 4543–4555. DOI: 10.1109/TIP.2024.3424335. Online publication date: 15-Jul-2024.
        • (2024) Exploratory Analysis of Methods, Techniques, and Metrics to Handle Class Imbalance Problem. Procedia Computer Science 235(C), 863–877. DOI: 10.1016/j.procs.2024.04.082. Online publication date: 24-Jul-2024.
        • (2024) Multi-granularity detector for enhanced small object detection under sample imbalance. Information Sciences: an International Journal 679(C). DOI: 10.1016/j.ins.2024.121076. Online publication date: 1-Sep-2024.
        • (2024) Worthiness Benchmark. Information Sciences: an International Journal 678(C). DOI: 10.1016/j.ins.2024.120882. Online publication date: 1-Sep-2024.
        • (2024) Efficient hybrid oversampling and intelligent undersampling for imbalanced big data classification. Expert Systems with Applications: An International Journal 246(C). DOI: 10.1016/j.eswa.2024.123149. Online publication date: 15-Jul-2024.
