Abstract
Imbalanced learning is a challenging task in predictive modeling and machine learning, and it has motivated many researchers to improve existing algorithms so that they predict more accurately on imbalanced data sets. Because rare events are, by nature, sparsely represented in the data, reliable and efficient classification models for imbalanced data have been difficult to develop, and various methods have been proposed over the past two decades. In this paper, we propose a Linear Programming Support Vector Machine (LP-SVM) model to address imbalanced learning in weather applications. To further improve the model’s predictive accuracy, we tune the parameters of LP-SVM with a selection method based on the multi-objective parametric simplex approach. For numerical tests, we use a real data set of weather observations collected by the Bureau of Meteorology (BM) in Australia. The results obtained from training and testing demonstrate the effectiveness of the proposed model on the tested examples.
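For orientation, a linear programming SVM with class-dependent costs can be sketched as follows. This is a standard 1-norm soft-margin formulation, given here as an assumption about the kind of model the abstract names and not as the authors' exact formulation. The symbols are illustrative: the weight vector is split as w = p − q with p, q ≥ 0, b is the bias, ξ_i are slack variables, and C⁺ and C⁻ are hypothetical misclassification costs for the minority and majority classes.

```latex
\begin{aligned}
\min_{p,\,q,\,b,\,\xi}\quad
  & \sum_{j=1}^{d} (p_j + q_j)
    + C^{+} \sum_{i:\,y_i = +1} \xi_i
    + C^{-} \sum_{i:\,y_i = -1} \xi_i \\
\text{s.t.}\quad
  & y_i \bigl( (p - q)^{\top} x_i + b \bigr) \ge 1 - \xi_i,
    \qquad i = 1, \dots, n, \\
  & p \ge 0, \quad q \ge 0, \quad \xi \ge 0 .
\end{aligned}
```

The objective and constraints are all linear in (p, q, b, ξ), so the problem can be solved with the simplex method. Choosing C⁺ larger than C⁻ penalizes errors on the rare class more heavily, and a parametric simplex scheme can, in principle, trace how the optimal solution changes as such cost parameters vary.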
Change history
28 September 2023
A Correction to this paper has been published: https://doi.org/10.1007/s42979-023-02168-3
Ethics declarations
Conflicts of Interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Ethical Approval
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Jafarigol, E., Trafalis, T. Imbalanced Learning with Parametric Linear Programming Support Vector Machine for Weather Data Application. SN COMPUT. SCI. 1, 360 (2020). https://doi.org/10.1007/s42979-020-00381-y
DOI: https://doi.org/10.1007/s42979-020-00381-y