
Adaptive memetic algorithm enhanced with data geometry analysis to select training data for SVMs

Published: 12 April 2016

Abstract

Support vector machines (SVMs) are among the most popular and powerful machine learning techniques, but they suffer from a significant drawback: the high time and memory complexity of their training. This issue is especially severe for large and noisy datasets. In this paper, we propose a new adaptive memetic algorithm (PCA2MA) for selecting valuable SVM training data from the entire set. It helps improve the classification score and speeds up the classification process by decreasing the number of support vectors. In PCA2MA, a population of reduced training sets undergoes evolution, complemented by refinement procedures. We propose to exploit both a priori information about the training set, extracted using data geometry analysis, and the knowledge attained dynamically during the PCA2MA execution to enhance the refined sets. We also introduce a new adaptation scheme to control the pivotal algorithm parameters on the fly, based on the current search state. An extensive experimental study performed on benchmark, real-world, and artificial datasets clearly confirms the efficacy and convergence capabilities of the proposed approach. We demonstrate that PCA2MA is highly competitive with other state-of-the-art techniques.

Highlights

• We propose a new adaptive memetic algorithm to select SVM training data.
• We perform PCA-based preprocessing to determine valuable training samples.
• We apply a new scheme to adapt the refined training set size and the selection scheme.
• We evaluate the importance of particular components of our algorithm.
• We demonstrate the effectiveness and efficiency of the proposed memetic algorithm.
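
To make the idea above concrete, here is a minimal, self-contained sketch of evolutionary training-data selection for an SVM: candidate reduced training sets are scored by the validation accuracy of an SVM trained on them, and the initial population is seeded with samples lying far from their class centroid in a PCA projection, a crude stand-in for the data geometry analysis. This is not the authors' PCA2MA; the fixed subset size, the seeding heuristic, the crossover and mutation operators, and all parameter values below are illustrative assumptions, and the adaptive parameter control and refinement procedures of the paper are omitted.

```python
# Illustrative sketch only -- NOT the PCA2MA algorithm from the paper.
# Assumptions: fixed reduced-set size, centroid-distance PCA seeding,
# validation accuracy as the fitness, simple crossover/mutation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy data standing in for a large, noisy training set.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           flip_y=0.05, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3,
                                                  random_state=0)

SUBSET_SIZE = 100   # size of each reduced training set (fixed here; adapted in PCA2MA)
POP_SIZE = 10
GENERATIONS = 20

def fitness(idx):
    """Validation accuracy of an SVM trained only on the reduced set `idx`."""
    clf = SVC(kernel="rbf", C=1.0, gamma="scale")
    clf.fit(X_train[idx], y_train[idx])
    return clf.score(X_val, y_val)

# PCA-based seeding: favour samples far from their class centroid in the
# projected space, a crude proxy for potential "boundary" samples.
proj = PCA(n_components=2).fit_transform(X_train)
dist = np.zeros(len(proj))
for c in np.unique(y_train):
    mask = y_train == c
    dist[mask] = np.linalg.norm(proj[mask] - proj[mask].mean(axis=0), axis=1)
seed_pool = np.argsort(-dist)[: 5 * SUBSET_SIZE]   # most peripheral samples

# Initial population: random reduced sets drawn from the seed pool.
population = [rng.choice(seed_pool, SUBSET_SIZE, replace=False)
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    scores = [fitness(ind) for ind in population]
    order = np.argsort(scores)[::-1]
    parents = [population[i] for i in order[: POP_SIZE // 2]]

    children = []
    for _ in range(POP_SIZE - len(parents)):
        a, b = rng.choice(len(parents), 2, replace=False)
        pool = np.union1d(parents[a], parents[b])             # crossover: merge parents
        child = rng.choice(pool, SUBSET_SIZE, replace=False)  # then subsample
        # Mutation (a stand-in for PCA2MA's refinement): swap in a few random samples.
        swap = rng.choice(SUBSET_SIZE, 5, replace=False)
        child[swap] = rng.choice(len(X_train), 5, replace=False)
        children.append(child)

    population = parents + children

best = max(population, key=fitness)
print("Best reduced-set validation accuracy:", fitness(best))
```

In the full PCA2MA, the reduced-set size and the selection scheme are adapted on the fly based on the current search state, and the refined sets exploit both the a priori geometry analysis and knowledge gathered during the run, rather than the fixed heuristics used in this sketch.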

Published In

Neurocomputing, Volume 185, Issue C, April 2016, 254 pages

Publisher

Elsevier Science Publishers B. V., Netherlands

Author Tags

1. Adaptation
2. Memetic algorithm
3. Parameter control
4. Principal component analysis
5. Support vector machine
6. Training data

Qualifiers

• Research-article

Cited By

• (2023) Fast instance selection method for SVM training based on fuzzy distance metric. Applied Intelligence 53(15), 18109-18124. DOI: 10.1007/s10489-022-04447-7. Online publication date: 24-Jan-2023.
• (2022) An escalated convergent firefly algorithm. Journal of King Saud University - Computer and Information Sciences 34(2), 308-315. DOI: 10.1016/j.jksuci.2018.10.007. Online publication date: 1-Feb-2022.
• (2022) Improved bias value and new membership function to enhance the performance of fuzzy support vector Machine. Expert Systems with Applications: An International Journal 208(C). DOI: 10.1016/j.eswa.2022.118003. Online publication date: 1-Dec-2022.
• (2022) Reduction of training data for support vector machine: a survey. Soft Computing - A Fusion of Foundations, Methodologies and Applications 26(8), 3729-3742. DOI: 10.1007/s00500-022-06787-5. Online publication date: 1-Apr-2022.
• (2021) Evolving data-adaptive support vector machines for binary classification. Knowledge-Based Systems 227(C). DOI: 10.1016/j.knosys.2021.107221. Online publication date: 5-Sep-2021.
• (2020) Fully-automated deep learning-powered system for DCE-MRI analysis of brain tumors. Artificial Intelligence in Medicine 102(C). DOI: 10.1016/j.artmed.2019.101769. Online publication date: 1-Jan-2020.
• (2019) On Evolutionary Classification Ensembles. 2019 IEEE Congress on Evolutionary Computation (CEC), 2974-2981. DOI: 10.1109/CEC.2019.8790140. Online publication date: 10-Jun-2019.
• (2019) Segmenting brain tumors from FLAIR MRI using fully convolutional neural networks. Computer Methods and Programs in Biomedicine 176(C), 135-148. DOI: 10.1016/j.cmpb.2019.05.006. Online publication date: 1-Jul-2019.
• (2019) Hybridization of water wave optimization and sequential quadratic programming for cognitive radio system. Soft Computing - A Fusion of Foundations, Methodologies and Applications 23(17), 7991-8011. DOI: 10.1007/s00500-018-3437-x. Online publication date: 1-Sep-2019.
• (2019) Detection and Segmentation of Brain Tumors from MRI Using U-Nets. Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries, 179-190. DOI: 10.1007/978-3-030-46643-5_17. Online publication date: 17-Oct-2019.
