Abstract
Imbalanced data classification is an important problem in machine learning. Rebalancing algorithms are commonly used to address it, but they have drawbacks of their own, so a classification algorithm that does not depend on rebalancing is desirable. In this paper, we apply a zero-shot classifier to the imbalance problem. A zero-shot classifier works even when the number of minority-class samples is zero, so it should also work under milder imbalance. We use the cluster-based zero-shot learning algorithm to classify imbalanced data. The proposed method is evaluated on an imbalanced-learn dataset and compared with Decision Tree, K-Nearest Neighbor, and over-sampling methods.
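To make the rebalancing baseline concrete, the following is a minimal sketch of random over-sampling followed by a k-nearest-neighbor vote. It is not the cluster-based zero-shot method of this paper, and the abstract does not name the specific over-sampling methods used in the comparison, so plain random duplication stands in for them here. The helper names `random_oversample` and `knn_predict` and the toy data are hypothetical, chosen only for illustration.

```python
import math
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of the smaller classes
    until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(X_train, y_train), key=lambda p: math.dist(p[0], x))[:k]
    votes = Counter(lab for _, lab in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 8 majority points (label 0), 1 minority point (label 1).
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4),
     (0.4, 0.2), (0.1, 0.4), (0.4, 0.4), (1.0, 1.0)]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1]

X_bal, y_bal = random_oversample(X, y)
query = (0.9, 0.9)  # lies next to the single minority point

before = knn_predict(X, y, query, k=3)          # minority point is outvoted 2 to 1
after = knn_predict(X_bal, y_bal, query, k=3)   # duplicated minority points win the vote
print(Counter(y_bal), before, after)
```

On this toy data the unbalanced 3-NN vote assigns the query to the majority class even though its nearest neighbor is the minority point, while the over-sampled training set recovers the minority label. The sketch also hints at one known limitation of this kind of rebalancing: the duplicated points carry no new information, which can encourage overfitting to the few minority samples available.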
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Hayashi, T., Ambai, K., Fujita, H. (2020). Applying Cluster-Based Zero-Shot Classifier to Data Imbalance Problems. In: Fujita, H., Fournier-Viger, P., Ali, M., Sasaki, J. (eds) Trends in Artificial Intelligence Theory and Applications. Artificial Intelligence Practices. IEA/AIE 2020. Lecture Notes in Computer Science, vol 12144. Springer, Cham. https://doi.org/10.1007/978-3-030-55789-8_65
DOI: https://doi.org/10.1007/978-3-030-55789-8_65
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55788-1
Online ISBN: 978-3-030-55789-8
eBook Packages: Computer Science (R0)