A Multi-label Imbalanced Data Classification Method Based on Label Partition Integration

Yuxuan Diao,
Zhongbin Sun &
Yong Zhou

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14094)

Included in the following conference series:

International Conference on Web Information Systems and Applications

Abstract

Multi-label classification problems are widespread in practice, and class imbalance in multi-label data seriously degrades classification performance. Resampling methods are currently used to address imbalance in multi-label data. However, resampling methods ignore the correlation between labels, so changing the distribution of the original dataset may introduce new imbalance and reduce classification performance rather than improve it. In addition, the resampling ratio must be set manually, which causes significant fluctuations in classification performance. To address these issues, a multi-label imbalanced data classification method based on label partition integration, ESP, is proposed. ESP divides the dataset into single-label datasets and label-pair datasets without changing the original distribution, and then learns a binary classification model from each dataset. Finally, all binary classification models are integrated into a multi-label classification model. Experimental results show that ESP outperforms five commonly used resampling methods on two common measures: F-Measure and Accuracy.
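
The partition-then-integrate idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how a label-pair dataset is labelled, which base learner is used, or how the binary models are combined. The sketch assumes that a label-pair target is 1 when both labels co-occur, uses a 1-nearest-neighbour stand-in learner, and applies a hypothetical integration rule in which a firing pair model switches on both of its labels. All function names are illustrative.

```python
from itertools import combinations


def partition_labels(Y):
    """Build binary targets for every single label and every label pair.

    Y is a list of 0/1 label vectors. Every instance is kept in every
    derived dataset, so no sample is added or removed.
    """
    n_labels = len(Y[0])
    single = {j: [row[j] for row in Y] for j in range(n_labels)}
    # Assumption: a label-pair target is 1 only when both labels co-occur.
    pair = {
        (j, k): [row[j] & row[k] for row in Y]
        for j, k in combinations(range(n_labels), 2)
    }
    return single, pair


def fit_1nn(X, y):
    """Stand-in binary learner (1-nearest neighbour); any classifier could be used."""
    def predict(x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in X]
        return y[dists.index(min(dists))]
    return predict


def fit_ensemble(X, Y):
    """Train one binary model per single-label and per label-pair dataset."""
    single, pair = partition_labels(Y)
    single_models = {j: fit_1nn(X, t) for j, t in single.items()}
    pair_models = {jk: fit_1nn(X, t) for jk, t in pair.items()}
    return single_models, pair_models


def predict_labels(models, x):
    """Combine all binary models into one multi-label prediction."""
    single_models, pair_models = models
    out = [m(x) for _, m in sorted(single_models.items())]
    # Hypothetical integration rule: a firing pair model switches on both labels.
    for (j, k), m in pair_models.items():
        if m(x) == 1:
            out[j] = out[k] = 1
    return out


# Toy data: 4 instances, 2 features, 3 labels.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 1]]
models = fit_ensemble(X, Y)
print(predict_labels(models, [0.9, 1.0]))
```

With q labels this scheme trains q single-label models and q(q-1)/2 label-pair models, so the number of binary models grows quadratically with the label count.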

Supported by the Fundamental Research Funds for the Central Universities under Grant No.2021QN1075.

Author information

Authors and Affiliations

Mine Digitization Engineering Research Center of Ministry of Education, Xuzhou, 221116, Jiangsu, China
Yuxuan Diao, Zhongbin Sun & Yong Zhou
School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, Jiangsu, China
Yuxuan Diao, Zhongbin Sun & Yong Zhou

Corresponding author

Correspondence to Zhongbin Sun.

Editor information

Editors and Affiliations

Nanjing University of Science and Technology, Nanjing, China
Long Yuan
Guangzhou University, Guangzhou, China
Shiyu Yang
Huazhong University of Science and Technology, Wuhan, China
Ruixuan Li
University of Amsterdam, Amsterdam, The Netherlands
Evangelos Kanoulas
National University of Defense Technology, Changsha, China
Xiang Zhao

About this paper

Cite this paper

Diao, Y., Sun, Z., Zhou, Y. (2023). A Multi-label Imbalanced Data Classification Method Based on Label Partition Integration. In: Yuan, L., Yang, S., Li, R., Kanoulas, E., Zhao, X. (eds) Web Information Systems and Applications. WISA 2023. Lecture Notes in Computer Science, vol 14094. Springer, Singapore. https://doi.org/10.1007/978-981-99-6222-8_2

DOI: https://doi.org/10.1007/978-981-99-6222-8_2
Published: 09 September 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-6221-1
Online ISBN: 978-981-99-6222-8
eBook Packages: Computer Science, Computer Science (R0)
