Applying Cluster-Based Zero-Shot Classifier to Data Imbalance Problems

  • Conference paper
Trends in Artificial Intelligence Theory and Applications. Artificial Intelligence Practices (IEA/AIE 2020)

Abstract

Imbalanced data classification is an important problem in machine learning. Rebalancing algorithms are commonly used to address data imbalance, but they have many problems of their own. Hence, a classification algorithm that does not depend on rebalancing is required. In this paper, we apply a zero-shot classifier to the imbalance problem. A zero-shot classifier works even when the number of minority-class samples is zero, so it should also work in less severe data imbalance problems. We use the cluster-based zero-shot learning algorithm to classify imbalanced data. The proposed method is evaluated using an imbalanced-learn dataset and compared with Decision Tree, K-Nearest Neighbor, and over-sampling methods.
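To make the notion of a rebalancing algorithm concrete, the following minimal sketch shows plain random over-sampling, which duplicates minority-class samples until every class is as large as the largest one. It is a generic illustration only: the abstract does not say which over-sampling methods were used as baselines, and this is not the proposed cluster-based zero-shot classifier. The function name `random_oversample` and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of each smaller class
    until every class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # Original samples belonging to this class.
        pool = [x for x, lab in zip(X, y) if lab == label]
        # Draw with replacement to fill the gap to the majority size.
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: 8 majority samples (label 0), 2 minority samples (label 1).
X = [[0.1 * i] for i in range(8)] + [[5.0], [5.2]]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y))      # class counts before rebalancing
print(Counter(y_bal))  # class counts after rebalancing
```

Because the added minority samples are exact copies, this kind of rebalancing adds no new information about the minority class. That is one commonly cited limitation of over-sampling, and it is the kind of issue that motivates classifying without a rebalancing step.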

Corresponding author

Correspondence to Toshitaka Hayashi.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Hayashi, T., Ambai, K., Fujita, H. (2020). Applying Cluster-Based Zero-Shot Classifier to Data Imbalance Problems. In: Fujita, H., Fournier-Viger, P., Ali, M., Sasaki, J. (eds) Trends in Artificial Intelligence Theory and Applications. Artificial Intelligence Practices. IEA/AIE 2020. Lecture Notes in Computer Science, vol 12144. Springer, Cham. https://doi.org/10.1007/978-3-030-55789-8_65

  • DOI: https://doi.org/10.1007/978-3-030-55789-8_65

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-55788-1

  • Online ISBN: 978-3-030-55789-8

  • eBook Packages: Computer Science, Computer Science (R0)
