Abstract
Feature selection (FS) is a technique for finding a near-optimal feature subset with which to build an efficient pattern recognition model. Genetic algorithm (GA) and particle swarm optimization (PSO) are widely used in the field of FS. In this paper, we propose a way to perform FS by pooling information from the candidate solutions produced by GA and PSO. Our aim is to combine the complementary exploitation and exploration abilities of GA and PSO. We name this new model binary genetic swarm optimization (BGSO). The proposed method initially lets GA and PSO run independently. To extract sufficient information from the feature subsets they produce, BGSO combines their results using an algorithm called the average weighted combination method to produce an intermediate solution. Thereafter, a local search called sequential one-point flipping is applied to refine the intermediate solution into the final solution. BGSO is applied to 20 popular UCI datasets. The results were obtained with two classifiers, namely, k nearest neighbors (KNN) and multi-layer perceptron (MLP). The overall results and comparisons show that the proposed method outperforms the constituent algorithms in 16 and 14 datasets using KNN and MLP, respectively, whereas among the constituent algorithms, GA achieves the best classification accuracy for 2 and 7 datasets and PSO achieves the best accuracy for 2 and 4 datasets, respectively, for the same set of classifiers. This demonstrates the applicability and usefulness of the method in the domain of FS.
1 Introduction
Every object in real life has certain features, the attributes which define its characteristics. To identify patterns distinctively, researchers rely on various feature extraction techniques. Many such features are chosen heuristically, based on domain understanding and/or inherent properties of the object, such as statistical or morphological properties. However, the extracted features are not always capable of predicting the pattern classes accurately. Two features may be highly correlated or similar to each other, so including both in the learning model leads to redundancy. Features may also be uncorrelated with the pattern class to be predicted, i.e. they are not useful enough to represent the pattern classes properly. To combat these issues, researchers have for decades been working on methods to select an optimal and useful set of features which performs well in a given classification scenario. Such methods must search the space of feature subsets, and in this search the balance between exploitation and exploration is hard to achieve. As a result, it becomes difficult to find the optimal feature subset for a problem so as to maximize the objective function (assuming that a higher value of the objective function is desired). Our proposed algorithm tries to achieve a good trade-off in this regard in order to choose a near-optimal feature subset.
Basically, the key purpose of feature selection (FS) methods is to maximize the classification ability of a learning model by selecting a near-optimal feature subset. Therefore, to choose an effective subset of features, we need a robust algorithm which correctly identifies the subset of features required to classify the patterns under consideration. It is to be noted that the objective of FS is not to find out an individual feature that correlates to the classification problem, but rather it is the combination of different features which when taken together represent the pattern profoundly. FS makes classification problems computationally efficient by reducing the classification cost of patterns with a large feature dimension. This implies that data storage and computation resources required for the training phase can be reduced.
Different searching algorithms are employed to find out the optimal subset of features. Blind search (BS) [11] iterates through each and every combination of subsets to reach the optimal one, but this trivial searching approach has exponential time complexity which is not feasible when the feature dimension is large. As an improvement over BS, researchers have invented heuristic searching algorithms [22], [41] which introduce various directed searches based on domain knowledge. These solutions interact locally with each other to reach a near-optimal solution within a reasonable time. This searching process reduces the time requirement substantially. Problem-specific properties [36] and greedy approaches [29] are often used for subset selection. Meta-heuristic algorithms [13], [16], [44] are used to overcome heuristic algorithms’ inability to circumvent local optima and are applicable to a wide range of problems. They are also problem independent in nature and have the ability to explore the search space more thoroughly which makes these algorithms robust.
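To make the exponential cost of blind search concrete, the following minimal sketch enumerates every non-empty feature subset and counts them. The function names are hypothetical and chosen only for illustration; the feature counts printed at the end (9, 34 and 101) are simply example dimensions.

```python
from itertools import combinations


def count_feature_subsets(n_features: int) -> int:
    """Number of non-empty feature subsets a blind search must evaluate."""
    return 2 ** n_features - 1


def enumerate_subsets(n_features: int):
    """Yield every non-empty subset of feature indices, smallest subsets first."""
    for size in range(1, n_features + 1):
        yield from combinations(range(n_features), size)


if __name__ == "__main__":
    # Enumeration is only practical for a handful of features.
    print(len(list(enumerate_subsets(4))))
    # The count alone shows how quickly the search space grows.
    for d in (9, 34, 101):
        print(d, count_feature_subsets(d))
```

Each extra feature doubles the number of subsets, which is why exhaustive enumeration stops being feasible well before the feature dimension reaches the hundreds.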
FS methods are broadly classified into three categories, namely, filter, wrapper and embedded methods. Filter methods use characteristics of the features to assign a score to each feature. The classification ability of the features is then evaluated based on the score values. No learning algorithm is involved in these methods, which keeps the cost of computation low. Some of the well-known filter methods are the chi-squared test [40], information gain [31], Fisher score [23] and so on. Wrapper methods, on the other hand, consult a learning algorithm to proceed toward an optimal solution. Although computationally expensive, wrapper methods often give better results than filter methods because candidate subsets are evaluated by the learning algorithm itself. Genetic algorithm (GA) [44], particle swarm optimization (PSO) [16] and gravitational search algorithm [37] are examples of some popular wrapper methods. Researchers have incorporated the advantages of wrapper and filter methods in a single technique which is known as the embedded method [15], [20], [45]. These methods incorporate both a filter-based mechanism and the supervision of a learning algorithm to test the fitness of the solutions. In this paper, we propose a new FS method named binary genetic swarm optimization (BGSO), which combines two wrapper-based evolutionary algorithms, namely, GA and PSO.
The remaining paper is organized as follows. Section 2 gives a brief review of the works accomplished with the help of GA and PSO, which are the two ancestors of our proposed FS model. Section 3 provides an introduction of GA and PSO followed by a detailed explanation and analysis of BGSO. Section 4 contains the results obtained by BGSO over some well-known UCI datasets, and its comparison with GA, PSO and histogram-based multi-objective GA (HMOGA) along with the parameter setting we used for our experimentations. Some conclusions drawn from the experimentations and scope of future work regarding BGSO are reported in Section 5.
In this paper our aim is to improve the exploitation capacity of GA and the exploration ability of PSO. A new algorithm is proposed to combine the results (feature subsets in the population) of the two algorithms, namely, GA and PSO, in order to produce a near-optimal feature subset. The algorithm is tested on 20 popular UCI datasets. The results are generated using two classifiers, namely, k nearest neighbors (KNN) and multi-layer perceptron (MLP).
2 Related Work
GA has been used for FS as early as 1997 in [44] where selected feature subsets were evaluated using the neural network as the classifier. GA has been used for FS in a variety of domains like spectral datasets [32], and Colon and Yeast datasets [17], as well as to optimize the kernel parameters of support vector machine (SVM). GA in [17] has utilized the natural phenomenon of death on old age, war and disease to shrink an exploding population. A hybrid model of GA and PSO has been proposed where both GA and PSO run in parallel, and after a specified number of iterations, the subsets in populations of GA and PSO are interchanged. A hybrid of GA and ant colony optimization (ACO) has been used for load prediction in [38] using the MLP classifier. At first, the initial population is enhanced using the genetic operations of GA and the best feature subset is passed for further refinement by ACO. The use of GA to find biomarkers in microarray data [19] is also quite prevalent. A modified version of GA named HMOGA has been proposed in [18]. HMOGA basically divides the entire dataset into a number of smaller datasets. These datasets are then fed to the classifier individually. After this process, the outcomes of the different parts are combined by drawing a histogram and using a cutoff to get to the final solution. In [24], Guha et al. proposed a hybrid GA called Deluge-based GA (DGA) which uses Great Deluge algorithm in place of mutation in order to achieve significant perturbation of the system of solutions. They tested the proposed algorithm over UCI datasets which shows the superiority of DGA over some well-established contemporary metaheuristic algorithms. As an improvement over HMOGA, Guha et al. proposed a memory-oriented HMOGA named M-HMOGA in [25] which uses a memory and stores best population of GA across multiple generations. Abualigah and Hanandeh applied adaptive GA to perform information retrieval using the vector space model in [2].
PSO has been heavily used for the purpose of FS. Binary PSO or discrete PSO was first proposed for FS in [30] where the position values were converted into probability for the inclusion of features using a sigmoid function. In some cases, both continuous and discrete PSO were used together to optimize the parameters of SVM and features, respectively [28]. A distributed system was adopted in which the server performed PSO calculations, and SVM training and testing were done on the client. A different approach could be found in [8] where PSO was modified to form geometric PSO where new agents were generated using crossover operation on the current agents, the local best and global best, and then mutating the agent formed after crossover. The algorithm was used for gene selection. Another minor modification of PSO was the use of a rough set to perform FS [42]. A hybrid algorithm encapsulating PSO with genetic operators was proposed by Abualigah and Khader in [3] to perform text clustering. The proposed hybrid model improved the performance of the k-means clustering algorithm by selecting a better set of informative features. Another approach for text document clustering was proposed in [5] which used adaptive PSO to find a more informative subset of features and also reduced the time requirement to some extent.
ACO was adopted for FS for text classification in [7]. A graph was made with nodes representing features, and instead of assigning pheromones to links, the nodes were assigned pheromone deposits. Each node had a pheromone deposit and a heuristic desirability which determined if the node was selected or not. A hybrid of GA and ACO was proposed in [34] where the two algorithms ran in parallel, and in each iteration, the best result of the two was taken. The hybrid algorithm performed FS for protein function prediction. In [9], a very similar approach was adopted, but here the application domain was text classification. The work reported in [35] compared the usage of ACO, GA and PSO using the SVM classifier on siRNA data. An important observation there was that both GA and PSO had performed better than ACO. In [21], Ghosh et al. proposed an embedded ACO named wrapper filter ACOFS which uses a filter method to evaluate the feature subsets to reduce the time requirement of the overall model. The authors also used a memory to store the best results throughout all the generations. An innovative text clustering method based on Krill herd (KH) was proposed by Abualigah et al. in [6]. The authors introduced a hybrid improved KH algorithm called MMKHA which was applied on eight text datasets. Another approach for text document clustering was proposed in [4] which combined objective functions and hybrid KH algorithm. Abualigah proposed the enhanced KH algorithm for text document clustering in [1].
3 Present Work
The proposed model BGSO is a metaheuristic which draws on the strengths of both GA and PSO in order to overcome the limitations of the individual algorithms. GA is weak in exploitation because its only source of exploitation is mutation, which performs very small perturbations of the chromosomes. On the other hand, GA achieves notable exploration of the search space through the crossover operation. PSO has good local search capabilities, which enhance its exploitation ability, but it is unable to achieve suitable exploration: it frequently gets stuck in local optima [43]. These complementary exploitation-exploration trade-offs of GA and PSO motivate us to combine their results so that a near-optimal and useful outcome can be achieved. In this section, the constituent algorithms, GA and PSO, as well as their combination BGSO are explained in detail.
3.1 GA
GA is a popular evolutionary computation method developed by Holland in 1975 [27] and later enhanced by Goldberg [26]. It is a global search technique that solves a given problem by mimicking the natural process of evolution. Based on Darwin’s theory, GA utilizes the concepts of reproduction and survival of the fittest. GA searches for new and better solutions without any presumption such as continuity or unimodality. Owing to this generality, GA has been used over the years for design, telecommunication optimization, traffic and shipment routing, gaming, market and financial analysis and many more applications [10], [12], [33]. Its growing use in different sectors is due to the fact that GA can handle a large number of parameters and yields a solution which is good enough, though it may not be the best.
GA consists of a set of solutions, chromosomes or individuals which are strings of binary values, “0”s and “1”s. Each value (“0” or “1”) determines the state of attributes in the chromosome. A set of such chromosomes is referred to as a population. Each chromosome is then evaluated using a fitness function. After ranking the chromosomes according to their fitness values, they undergo genetic operations such as crossover and mutation. For this, two chromosomes are selected on the basis of their positions on a roulette wheel (biased according to each chromosome’s fitness). The two chromosomes first go through crossover and then mutation is applied to increase the local coverage of search space by the chromosomes, thereby decreasing the chances of being stuck at a local optimum. If the evolution process generates stronger offspring chromosomes than the previous ones, the algorithm replaces them. The evolution process repeats until it meets the end criteria.
The main steps of GA (as proposed by Holland in 1975) along with the framework are as follows:
(1) Representation of structure data in genetic space, with different combinations to form a candidate solution.
(2) Initialization of randomly generated individuals who constitute the first generation.
(3) Evaluation of each individual to determine the fitness value.
(4) Selection of good chromosomes for breeding purposes.
(5) Crossover and mutation to produce the offspring set.
(6) Evaluation of the new individuals to pass them to the next generation.
(7) Termination if end criteria are met, else back to Step (3).
End criteria for termination can be either of the following:
Highest possible accuracy for the model is reached.
The accuracy results remain unchanged over consecutive generations.
Prefixed maximum number of generations (or value of time) set is reached.
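The steps above can be sketched as a short binary GA loop. This is a minimal illustration rather than the implementation used in the experiments: the fitness function is supplied by the caller, the default parameter values are arbitrary, and the replacement rule (keep the fittest individuals among parents and offspring) is an assumption made here to mirror the statement that stronger offspring replace weaker chromosomes. Only the fixed-generation end criterion is implemented.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Pick one chromosome with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total == 0:
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for chrom, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return chrom
    return population[-1]


def one_point_crossover(a, b, rng):
    """Swap the tails of two parents at a random cut point."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(chrom, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chrom]


def run_ga(fitness, n_features, pop_size=10, generations=20,
           mutation_rate=0.05, seed=0):
    """Evolve binary feature masks; `fitness` must return a non-negative score."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        fitnesses = [fitness(c) for c in population]
        offspring = []
        while len(offspring) < pop_size:
            p1 = roulette_select(population, fitnesses, rng)
            p2 = roulette_select(population, fitnesses, rng)
            c1, c2 = one_point_crossover(p1, p2, rng)
            offspring += [mutate(c1, mutation_rate, rng),
                          mutate(c2, mutation_rate, rng)]
        # Assumed replacement rule: keep the fittest among parents and offspring.
        merged = population + offspring[:pop_size]
        merged.sort(key=fitness, reverse=True)
        population = merged[:pop_size]
    return population
```

In a wrapper setting, `fitness` would be the classification accuracy obtained with the features that a chromosome switches on; any non-negative scoring function works for experimenting with the loop itself.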
3.2 PSO
PSO is a population-based stochastic optimization technique inspired by the social behavior of bird flocking. It was proposed by Eberhart and Kennedy in 1995 [16]. It is a metaheuristic, as it explores a search space while making few or no assumptions about the given problem and converges to an optimal or near-optimal solution. The candidate solutions, referred to as particles, fly around in a multi-dimensional search space to find an optimal or sub-optimal solution by competition as well as by cooperation among them [39]. Like GA, PSO is initialized with a group of random particles and then looks for optima through the movement of candidate solutions in the search space. Each particle is represented by a position vector and an associated velocity vector, which are updated at every iteration.
In the update equations, k represents the kth iteration and d represents the dth feature in the vector. w represents the inertia factor, which assigns a weight to the impact of the previous velocity. c1 and c2 are acceleration constants. r1 and r2 are random numbers in the range [0, 1]. pid and pgd denote the state of the dth feature in pbest and gbest, respectively.
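The symbols defined above match the standard PSO velocity update, v_id(k+1) = w·v_id(k) + c1·r1·(p_id − x_id(k)) + c2·r2·(p_gd − x_id(k)), where x_id is the current state of the dth feature of particle i, and p_id and p_gd are the corresponding entries of pbest and gbest. The sketch below assumes this standard form and the common binary variant in which a sigmoid turns the velocity into the probability of selecting a feature. The values of w, c1, c2 and the velocity clamp v_max are illustrative assumptions, not the settings used in the experiments.

```python
import math
import random


def sigmoid(v):
    """Map a velocity to a selection probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def update_particle(x, v, pbest, gbest, w=0.7, c1=2.0, c2=2.0,
                    v_max=6.0, rng=random):
    """Return the new binary position and velocity of one particle."""
    new_x, new_v = [], []
    for d in range(len(x)):
        r1, r2 = rng.random(), rng.random()
        vel = (w * v[d]
               + c1 * r1 * (pbest[d] - x[d])    # pull toward the particle's own best
               + c2 * r2 * (gbest[d] - x[d]))   # pull toward the swarm's best
        vel = max(-v_max, min(v_max, vel))      # clamp so the sigmoid does not saturate
        new_v.append(vel)
        # Binary position: feature d is selected with probability sigmoid(vel).
        new_x.append(1 if rng.random() < sigmoid(vel) else 0)
    return new_x, new_v
```

Because the position is resampled from the velocity at every step, a feature on which the particle, pbest and gbest all agree keeps a velocity that only decays through the inertia term.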
3.3 BGSO: Combination of GA and PSO
GA and PSO belong to two different categories of FS algorithms, namely, evolutionary algorithm and swarm intelligence, respectively. GA is very useful in passing down useful features from one generation to the next. PSO has the advantage of a thorough search of the search space using particles which relate feature information to one another. These advantages of PSO and GA have been combined to form BGSO which has balanced exploitation and exploration abilities. At the first step of our proposed method, GA and PSO are employed to run separately to produce their final set of population. Then the combination of the population is done by evaluating the importance of all features belonging to any of the two sets of population. This process of combination is done by a method called the average weighted combination method (AWCM). A new feature subset is created based on the mean of the importance of all features (AWCM cutoff). A local search, called sequential one-point flipping (SOPF), is applied thereafter to further enhance the subset’s discriminative abilities. Here lie the main characteristics of the present work.
In AWCM, the importance of a feature is computed as the sum of the accuracies of all the particles (in PSO) and chromosomes (in GA) that select it. Let us consider a feature that is selected by a chromosome of GA giving an accuracy of 85% and also by two particles of PSO having accuracies of 90% and 80%. The importance of the feature is calculated as 2.55 (0.85 + 0.80 + 0.90 = 2.55). The AWCM cutoff is set as the mean of the importance of all the features selected by either GA or PSO. The features having an importance value higher than the AWCM cutoff are included in the new subset. For easy understanding, the steps of AWCM with a simplified example showing the calculation of the importance of a feature are shown in Figure 1. “F” represents the normal feature state (either 1 or 0) and “WF” represents the weighted feature state (accuracy value of the candidate selecting the feature). The size of the population is taken as n. After using AWCM over the 2n feature subsets produced by GA and PSO, we get a single feature subset which includes the relatively important features.
Figure 1: A hypothetical example illustrating the concept of AWCM, which measures the importance of a feature to be considered in the final population.
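The AWCM calculation can be sketched in a few lines. The function name `awcm` and the four-feature masks in the test data are hypothetical; the sketch assumes that the cutoff is the mean importance over the features selected by at least one candidate, which is one reading of the description above.

```python
def awcm(candidates, accuracies):
    """Combine binary feature masks into one subset.

    candidates: list of 0/1 masks (GA chromosomes and PSO particles together)
    accuracies: accuracy of each candidate, as a fraction in [0, 1]
    """
    n_features = len(candidates[0])
    importance = [0.0] * n_features
    for mask, acc in zip(candidates, accuracies):
        for d, bit in enumerate(mask):
            if bit:
                importance[d] += acc  # weighted feature state ("WF")
    # Assumed cutoff: mean importance over features selected at least once.
    selected = [imp for imp in importance if imp > 0]
    if not selected:
        return importance, 0.0, [0] * n_features
    cutoff = sum(selected) / len(selected)
    subset = [1 if imp > cutoff else 0 for imp in importance]
    return importance, cutoff, subset


if __name__ == "__main__":
    masks = [[1, 1, 0, 0],   # GA chromosome, accuracy 0.85
             [1, 0, 1, 0],   # PSO particle, accuracy 0.90
             [1, 1, 0, 0]]   # PSO particle, accuracy 0.80
    print(awcm(masks, [0.85, 0.90, 0.80]))
```

In this toy input the first feature is chosen by all three candidates, so its importance is 0.85 + 0.90 + 0.80 = 2.55; the cutoff is (2.55 + 1.65 + 0.90) / 3 = 1.70, and only the first feature survives.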
Another significant contribution of the present work is the inclusion of a local search, SOPF, in the proposed FS model. This is a simple approach which helps to improve the final result. SOPF sequentially goes through each feature state of a candidate solution. By flipping the state of each feature in turn, it checks the effect of including or excluding that feature while the remaining features are kept as they are. After flipping a feature, the algorithm accepts the flipped solution only when it achieves higher accuracy than the current solution. In this way, SOPF guarantees that the resulting solution is never worse than the one it started from. The algorithm of SOPF is as follows.
Sequential one-point flipping algorithm:
Input: s_old, n
Output: s_new
s_inter: intermediate solution generated from combination
n: number of features in s
Start
    s_inter = s_old
    for i = 1 to n {
        s_temp = flip value of s_inter_i in s_inter
        if (accuracy(s_temp) > accuracy(s_inter)) {
            s_inter = s_temp
        }
    }
    s_new = s_inter
End
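The pseudocode translates almost line for line into Python. In the sketch below, `accuracy` is a caller-supplied function that scores a binary feature mask; in the wrapper setting it would train and test a classifier, but any scoring function can be used to try the procedure.

```python
def sopf(s_old, accuracy):
    """Sequential one-point flipping: flip each bit once, keep strict improvements."""
    s_inter = list(s_old)
    best = accuracy(s_inter)
    for i in range(len(s_inter)):
        s_temp = list(s_inter)
        s_temp[i] = 1 - s_temp[i]   # flip the state of feature i
        acc = accuracy(s_temp)
        if acc > best:              # accept only if strictly better
            s_inter, best = s_temp, acc
    return s_inter
```

Caching the current accuracy in `best` avoids re-evaluating `s_inter` at every step, so the procedure needs n + 1 evaluations for n features, which matters when each evaluation trains a classifier.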
To summarize the steps of BGSO, a flowchart is displayed in Figure 2. It is to be noted that the detailing of GA and PSO is not included in the flowchart as it has already been mentioned earlier in this section.
Figure 2: Flowchart of the proposed FS model called BGSO.
4 Results and Analysis
This section focuses on measuring the strength of the proposed FS model by applying it to various datasets. The performance of the proposed model is tabulated against the performance of GA, PSO and HMOGA. The related information of the experimentations is provided in this section.
4.1 Dataset Description
For evaluation of BGSO, we selected 20 well-known datasets from the UCI repository [14]. The datasets vary in terms of dimensions, number of classes and domain. Chosen datasets can be classified into three categories based on their size: small (number of features <10), medium (10 ≤ number of features ≤ 100) and large (number of features >100). To test for all variations, we used 5 small, 11 medium and 4 large datasets. The names of the datasets under these tags (small, medium and large) are shown in Table 1. The details of the said datasets are represented in Table 2. It contains number of features, number of instances and number of classes of all the datasets we have used.
Table 1: Category-wise Names of the Datasets.
| Small | Medium | Large |
|---|---|---|
| BreastCancer | Horse | Arrhythmia |
| Monk1 | Ionosphere | Hill-valley |
| Monk2 | Sonar | Madelon |
| Monk3 | Soybean-small | PenglungEW |
| Tic-tac-toe | Wine | |
| | Zoo | |
| | Vowel | |
| | Glass | |
| | BreastEW | |
| | CongressEW | |
| | Exactly | |
Table 2: Description of 20 UCI Datasets Used for Evaluation of the Proposed FS Method.
Datasets | No. of features | No. of instances | No. of classes |
---|---|---|---|
Arrhythmia | 279 | 452 | 16 |
BreastCancer | 9 | 699 | 2 |
Glass | 10 | 214 | 7 |
Hill-valley | 101 | 606 | 2 |
Horse | 27 | 368 | 2 |
Ionosphere | 34 | 351 | 2 |
Madelon | 500 | 4400 | 2 |
Monk1 | 6 | 124 | 2 |
Monk2 | 6 | 124 | 2 |
Monk3 | 6 | 124 | 2 |
Sonar | 60 | 208 | 2 |
Soybean-small | 35 | 47 | 4 |
Vowel | 10 | 528 | 11 |
Wine | 13 | 178 | 3 |
Zoo | 16 | 101 | 7 |
BreastEW | 30 | 569 | 2 |
CongressEW | 16 | 435 | 2 |
Exactly | 13 | 1000 | 2 |
Tic-tac-toe | 9 | 958 | 2 |
PenglungEW | 325 | 73 | 7 |
4.2 Parameter Values
The proposed FS model mainly uses two parameters: population size and number of iterations. To find the optimal parameter values, we first evaluated BGSO using different combinations of parameters. Changes of performance by varying the population size for BreastCancer (small), BreastEW (medium) and Hill-valley (large) are shown in Figure 3 where we have varied the size as 5, 10, 15, 20 and 25. Similar experimentations are performed to find out the optimal number of iterations. After performing several experiments, we set the values to the parameters as follows:
Figure 3: Changes in classification accuracy of BGSO for different population sizes over the BreastCancer, BreastEW and Hill-valley datasets.
For the rest of the experimentations, we used this optimal set of parameter values.
4.3 Classifiers Used
As the proposed model is a wrapper approach, it needs to consult a learning algorithm (classifier) to evaluate the generated candidate solutions. To establish the classifier-independent nature of BGSO, we used two classifiers of varying complexity, namely, KNN and MLP. KNN is a simple classifier which classifies a point in the feature space by a vote among its k nearest neighbors. MLP, on the other hand, is a more complex classifier whose network weights are adjusted during training using the backpropagation algorithm. For a uniform comparison, we also evaluated the other methods using both of these classifiers and compared their results with those of BGSO.
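To illustrate how a wrapper consults the classifier, the following sketch scores a binary feature mask by the accuracy of a small KNN classifier restricted to the selected features. It is written in plain Python for self-containment; the function names, the squared Euclidean distance, the value k = 3 and the single train/test split are all assumptions made for this example, not the experimental setup.

```python
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest neighbors."""
    dists = sorted(
        ((sum((a - b) ** 2 for a, b in zip(row, x)), label)
         for row, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def apply_mask(X, mask):
    """Keep only the columns whose mask bit is 1."""
    return [[v for v, bit in zip(row, mask) if bit] for row in X]


def wrapper_fitness(mask, train_X, train_y, test_X, test_y, k=3):
    """Fitness of a feature subset = KNN accuracy on the held-out samples."""
    if not any(mask):
        return 0.0  # an empty subset cannot classify anything
    tr, te = apply_mask(train_X, mask), apply_mask(test_X, mask)
    correct = sum(knn_predict(tr, train_y, x, k) == y
                  for x, y in zip(te, test_y))
    return correct / len(test_y)
```

A function of this shape is what GA, PSO and the local search would call on every candidate, which is why wrapper methods are more expensive than filter methods.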
4.4 Analysis of Outcomes
To ensure the effectiveness of the proposed method, individual results of both GA and PSO are recorded separately and are compared with the final results of the proposed model. A recently proposed meta-heuristic approach named HMOGA [18] is also used for the comparison. Tables 3 and 4 display the results obtained using KNN and MLP, respectively. The best results for each dataset are made bold.
Table 3: Comparison of BGSO with Its Constituent Algorithms (GA and PSO) and HMOGA Using KNN Classifier.
Dataset | GA: No. of features | GA: Accuracy (%) | PSO: No. of features | PSO: Accuracy (%) | HMOGA: No. of features | HMOGA: Accuracy (%) | BGSO: No. of features | BGSO: Accuracy (%) | Rank |
---|---|---|---|---|---|---|---|---|---|
Arrhythmia | 210 | 57.89 | 180 | 58.4 | 195 | 56.58 | 158 | 61.84 | 1 |
BreastCancer | 6 | 98.79 | 6 | 98.79 | 3 | 96.32 | 7 | 99 | 1 |
Glass | 7 | 85.71 | 6 | 77.14 | 6 | 80.12 | 7 | 88.57 | 1 |
Hill-valley | 73 | 54.76 | 57 | 54.76 | 54 | 51.5 | 59 | 55.68 | 1 |
Horse | 22 | 97.06 | 17 | 97.05 | 14 | 97.05 | 13 | 100 | 1 |
Ionosphere | 25 | 93.37 | 24 | 92.05 | 17 | 93.38 | 12 | 96.03 | 1 |
Madelon | 375 | 57.33 | 277 | 54.17 | 240 | 60.33 | 309 | 59.67 | 2 |
Monk1 | 3 | 88.89 | 3 | 88.89 | 3 | 83.23 | 3 | 88.89 | 1 |
Monk2 | 6 | 74.77 | 6 | 74.77 | 2 | 55.09 | 6 | 74.77 | 1 |
Monk3 | 2 | 97.22 | 3 | 97.22 | 3 | 97.12 | 2 | 97.22 | 1 |
Sonar | 48 | 56.72 | 34 | 58.21 | 35 | 68 | 27 | 79.1 | 1 |
Soybean-small | 24 | 100 | 17 | 100 | 19 | 85.71 | 18 | 100 | 1 |
Vowel | 9 | 89.61 | 9 | 88.74 | 8 | 87.85 | 7 | 88.53 | 3 |
Wine | 9 | 97.87 | 8 | 100 | 5 | 70.21 | 6 | 100 | 1 |
Zoo | 13 | 82.93 | 10 | 85.37 | 11 | 84 | 5 | 82.93 | 3 |
BreastEW | 21 | 74.12 | 15 | 90.59 | 17 | 94.80 | 9 | 95.29 | 1 |
CongressEW | 10 | 92.31 | 8 | 90.00 | 8 | 97.00 | 6 | 97.69 | 1 |
Exactly | 9 | 91.50 | 6 | 69.50 | 7 | 72.00 | 9 | 89.00 | 2 |
Tic-tac-toe | 7 | 82.77 | 5 | 73.63 | 6 | 78.00 | 7 | 82.77 | 1 |
PenglungEW | 271 | 86.21 | 174 | 82.76 | 209 | 86.00 | 208 | 89.66 | 1 |
Highest classification accuracy for each dataset is in bold.
Table 4: Comparison of the Performance of BGSO with Its Constituent Algorithms (GA and PSO) and HMOGA Using MLP Classifier.
Dataset | GA: No. of features | GA: Accuracy (%) | PSO: No. of features | PSO: Accuracy (%) | HMOGA: No. of features | HMOGA: Accuracy (%) | BGSO: No. of features | BGSO: Accuracy (%) | Rank |
---|---|---|---|---|---|---|---|---|---|
Arrhythmia | 215 | 67.76 | 166 | 65.78 | 202 | 66.95 | 167 | 68.42 | 1 |
BreastCancer | 7 | 98.32 | 6 | 98.32 | 6 | 98.32 | 5 | 98.66 | 1 |
Glass | 8 | 84.28 | 7 | 82.86 | 7 | 81.88 | 5 | 84.28 | 1 |
Hill-valley | 71 | 54.22 | 73 | 55.31 | 75 | 53.22 | 55 | 56.04 | 1 |
Horse | 21 | 100 | 18 | 100 | 19 | 100 | 13 | 100 | 1 |
Ionosphere | 25 | 97.35 | 20 | 96.12 | 26 | 96.56 | 19 | 97.35 | 1 |
Madelon | 400 | 60.17 | 278 | 57.83 | 271 | 59.8 | 251 | 60.5 | 1 |
Monk1 | 4 | 92.59 | 3 | 97.22 | 4 | 94.54 | 3 | 100 | 1 |
Monk2 | 6 | 81.94 | 6 | 74.31 | 5 | 69.21 | 2 | 67.13 | 4 |
Monk3 | 3 | 100 | 3 | 97.22 | 3 | 97 | 3 | 97.22 | 2 |
Sonar | 48 | 76.11 | 34 | 80.59 | 43 | 77.12 | 34 | 79.1 | 2 |
Soybean-small | 25 | 100 | 14 | 100 | 19 | 100 | 18 | 100 | 1 |
Vowel | 9 | 91.77 | 8 | 89.83 | 8 | 87.25 | 7 | 88.74 | 3 |
Wine | 11 | 100 | 8 | 100 | 9 | 99.98 | 9 | 100 | 1 |
Zoo | 12 | 85.37 | 8 | 85.37 | 11 | 81.98 | 6 | 82.93 | 3 |
BreastEW | 17 | 74.71 | 13 | 92.35 | 15 | 93.33 | 13 | 95.29 | 1 |
CongressEW | 10 | 89.23 | 10 | 94.62 | 8 | 96.30 | 7 | 97.69 | 1 |
Exactly | 9 | 91.50 | 10 | 69.25 | 11 | 88.25 | 9 | 90.75 | 2 |
Tic-tac-toe | 7 | 80.68 | 6 | 74.41 | 6 | 75.00 | 6 | 82.25 | 1 |
PenglungEW | 246 | 86.21 | 200 | 82.76 | 183 | 83.25 | 177 | 86.21 | 1 |
Highest classification accuracy for each dataset is in bold.
AWCM combines the outcomes of GA and PSO to produce a vector containing the importance of each feature. Using the AWCM cutoff, the relatively more important features are selected and the rest are eliminated, which allows BGSO to lower the number of features considerably. Tables 3 and 4 compare the results obtained by GA, PSO, HMOGA and the proposed model. The results in Table 3 show that the proposed model generally decreases the number of features required for classification while increasing the accuracy of the classification model. For several of the 20 datasets (3 for KNN and 4 for MLP), the proposed method provides 100% accuracy, which indicates that a highly discriminative feature subset has been selected. The results also show that the proposed method is able to decrease the number of features required for classification in nearly all cases. We get an accuracy of more than 80% for more than half of the datasets. Even for the datasets having a large number of attributes, like Arrhythmia, Madelon and PenglungEW, the proposed model is able to increase the classification accuracy over its constituents. Although for some of the datasets the proposed method could not produce an enhanced accuracy, it decreases the number of features used for classification. Hence, it can be considered a better FS model than its ancestors. BGSO outperforms or matches its constituents and HMOGA for 16 of 20 datasets when KNN is used as the learning algorithm. Table 4 shows that BGSO obtained the best results for 14 of 20 datasets when MLP was used for classification. The dominance of the proposed model on most of the datasets for both classifiers suggests that the model is largely classifier independent.
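The Rank column of Tables 3 and 4 is consistent with a simple rule, stated here as an assumption since the tables do not define it: the rank of BGSO on a dataset is one plus the number of competing methods with strictly higher accuracy, so ties share the better rank. The sketch below applies this rule to three rows of Table 3.

```python
def method_rank(accuracies, method="BGSO"):
    """Rank of `method`: 1 + number of other methods with strictly higher accuracy."""
    target = accuracies[method]
    return 1 + sum(1 for name, acc in accuracies.items()
                   if name != method and acc > target)


# Accuracy values (%) copied from Table 3 (KNN classifier).
table3_rows = {
    "Madelon": {"GA": 57.33, "PSO": 54.17, "HMOGA": 60.33, "BGSO": 59.67},
    "Sonar":   {"GA": 56.72, "PSO": 58.21, "HMOGA": 68.0,  "BGSO": 79.1},
    "Zoo":     {"GA": 82.93, "PSO": 85.37, "HMOGA": 84.0,  "BGSO": 82.93},
}

if __name__ == "__main__":
    for dataset, row in table3_rows.items():
        print(dataset, method_rank(row))
```

Under this rule a rank of 1 means that no other method is strictly more accurate, which is how the counts of 16 (KNN) and 14 (MLP) datasets include ties such as Monk1 and Soybean-small.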
To examine the robustness of the proposed model, we plot the convergence graphs for BGSO, GA and PSO over the iterations in Figure 4. We select one dataset from each category, i.e. small, medium and large, to observe the convergence of the three algorithms. Figure 4A-C shows the convergence graphs for the BreastCancer (small), BreastEW (medium) and Hill-valley (large) datasets, respectively. From the graphs, we notice that, starting from the same point, BGSO achieves higher accuracy than its constituents in almost every iteration, which indicates the stability of the proposed model over iterations.
Figure 4: Convergence graphs for datasets from different categories (size-wise). The graphs show the changes in classification accuracy over iterations for BGSO and its constituent algorithms (GA and PSO) for the BreastCancer (A), BreastEW (B) and Hill-valley (C) datasets. The three datasets belong to three different categories, namely, small (BreastCancer), medium (BreastEW) and large (Hill-valley), to show the differences in convergence in terms of the size of the dataset.
5 Conclusion
FS has many applications in real-world scenarios, which makes it an interesting and impactful research domain. The use of GA and PSO in the domain of FS is widespread. The literature reveals that several combinations of these two methods have been proposed by various researchers. However, most of those models build a hybrid by running GA and PSO in parallel or one after another. As an alternative, our proposed model BGSO combines the results of the two algorithms in a simple way. The combination is done by assigning an importance to each feature and keeping only the features whose importance is above the mean importance. This is followed by a local search which allows for better exploitation of the search space. The proposed FS model is applied to 20 UCI datasets, selected so that they vary in the number of features, classes and samples. Two different classifiers, KNN and MLP, are used as the learning algorithm. With KNN, BGSO achieves the best result on 16 of the 20 datasets, and with MLP on 14. In future, this concept of combining the results of algorithms can be used for other algorithms. Additionally, this algorithm can be applied to other real-life pattern recognition problems such as handwritten word or digit recognition.
Bibliography
[1] L. M. Q. Abualigah, Feature Selection and Enhanced Krill Herd Algorithm for Text Document Clustering, in: Studies in Computational Intelligence, vol. 816, Springer, Cham, 2019.10.1007/978-3-030-10674-4Search in Google Scholar
[2] L. M. Q. Abualigah and E. S. Hanandeh, Applying genetic algorithms to information retrieval using vector space model, Int. J. Comput. Sci. Eng. Appl. 5 (2015), 19.10.5121/ijcsea.2015.5102Search in Google Scholar
[3] L. M. Abualigah and A. T. Khader, Unsupervised text feature selection technique based on hybrid particle swarm optimization algorithm with genetic operators for the text clustering, J. Supercomput. 73 (2017), 4773–4795.10.1007/s11227-017-2046-2Search in Google Scholar
[4] L. M. Abualigah, A. T. Khader and E. S. Hanandeh, A combination of objective functions and hybrid Krill herd algorithm for text document clustering analysis, Eng. Appl. Artif. Intell. 73 (2018), 111–125.10.1016/j.engappai.2018.05.003Search in Google Scholar
[5] L. M. Abualigah, A. T. Khader and E. S. Hanandeh, A new feature selection method to improve the document clustering using particle swarm optimization algorithm, J. Comput. Sci. 25 (2018), 456–466.10.1016/j.jocs.2017.07.018Search in Google Scholar
[6] L. M. Abualigah, A. T. Khader and E. S. Hanandeh, Hybrid clustering analysis using improved krill herd algorithm, Appl. Intell. 48 (2018), 4047–4071.10.1007/s10489-018-1190-6Search in Google Scholar
[7] M. H. Aghdam, N. Ghasem-Aghaee and M. E. Basiri, Text feature selection using ant colony optimization, Expert Syst. Appl. 36 (2009), 6843–6853.10.1016/j.eswa.2008.08.022Search in Google Scholar
[8] E. Alba, J. Garcia-Nieto, L. Jourdan and E. Talbi, Gene selection in cancer classification using PSO/SVM and GA/SVM hybrid algorithms, in: 2007 IEEE Congress on Evolutionary Computation, Singapore, pp. 284–290, 2007.10.1109/CEC.2007.4424483Search in Google Scholar
[9] M. E. Basiri and S. Nemati, A novel hybrid ACO-GA algorithm for text feature selection, in: 2009 IEEE Congress on Evolutionary Computation, Trondheim, pp. 2561–2568, 2009.10.1109/CEC.2009.4983263Search in Google Scholar
[10] H. Ceylan and M. G. H. Bell, Traffic signal timing optimisation based on genetic algorithm approach, including drivers’ routing,Transport. Res. 38 (2004), 329–342.10.1016/S0191-2615(03)00015-8Search in Google Scholar
[11] J. Culberson, On the futility of blind search, in: Technical Report 96-19, Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada, July 1996.Search in Google Scholar
[12] B. Dengiz, F. Altiparmak and A. E. Smith, Local search genetic algorithm for optimal design of reliable networks, IEEE Trans. Evol. Comput. 1 (1997), 179–188.10.1109/4235.661548Search in Google Scholar
[13] M. Dorigo and M. Birattari, Ant Colony Optimization, in: C. Sammut and G. I. Webb, eds., Encyclopedia of Machine Learning, Springer, Boston, MA, 2011.10.1007/978-0-387-30164-8_22Search in Google Scholar
[14] D. Dua and C. Graff, UCI Machine Learning Repository, University of California, School of Information and Computer Science, Irvine, CA, 2019. http://archive.ics.uci.edu/ml (accessed January 7, 2019).Search in Google Scholar
[15] B. Duval, J.-K. Hao and J. C. Hernandez Hernandez, A memetic algorithm for gene selection and molecular classification of cancer, in: Proc. 11th Annu. Conf. Genet. Evol. Comput. – GECCO ’09,201, 2009.10.1145/1569901.1569930Search in Google Scholar
[16] R. Eberhart and J. Kennedy, A new optimizer using particle swarm theory, in: Micro Mach. Hum. Sci. Proc. Sixth Int. Symp., IEEE, pp. 39–43, 1995.Search in Google Scholar
[17] H. Frohlich, O. Chapelle and B. Scholkopf, Feature selection for support vector machines by means of genetic algorithm, in: Proc 15th IEEE Int. Conf. Tools Artif. Intell., pp. 142–148, 2016.Search in Google Scholar
[18] M. Ghosh, R. Guha, R. Mondal, P. K. Singh and R. Sarkar, Feature Selection Using Histogram-Based Multi-objective GA for Handwritten Devanagari Numeral Recognition, in: Intelligent Engineering Informatics. Advances in Intelligent Systems and Computing, vol. 695, Springer, Singapore, 471–479, 2018.10.1007/978-981-10-7566-7_46Search in Google Scholar
[19] M. Ghosh, S. Adhikary, K. K. Ghosh, A. Sardar, S. Begum and R. Sarkar, Genetic algorithm based cancerous gene identification from microarray data using ensemble of filter methods, Med. Biol. Eng. Comput. 57 (2019), 159–176.10.1007/s11517-018-1874-4Search in Google Scholar PubMed
[20] M. Ghosh, S. Begum, R. Sarkar, D. Chakraborty and U. Maulik, Recursive memetic algorithm for gene selection in microarray data, Expert Syst. Appl. 116 (2019), 172–185.10.1016/j.eswa.2018.06.057Search in Google Scholar
[21] M. Ghosh, R. Guha, R. Sarkar and A. Abraham, A wrapper-filter feature selection technique based on ant colony optimization, Neural Comput. Appl. (2019), 1–19 [Online 11 April 2019].10.1007/s00521-019-04171-3Search in Google Scholar
[22] F. Glover and M. Laguna, Tabu search, in: Handbook of Combinatorial Optimization, Springer, Boston, MA, 1998.10.1007/978-1-4615-6089-0Search in Google Scholar
[23] Q. Gu, Z. Li and J. Han, Generalized Fisher score for feature selection: a brief review of Fisher score, Ratio, p. 19, Citado na, 2010.Search in Google Scholar
[24] R. Guha, M. Ghosh, S. Kapri, S. Shaw, S. Mutsuddi, V. Bhateja and R. Sarkar, Deluge based genetic algorithm for feature selection, Evol. Intell. (2019), 1–11 [Online 7 March 2019].10.1007/s12065-019-00218-5Search in Google Scholar
[25] R. Guha, M. Ghosh, P. K. Singh, R. Sarkar and M. Nasipuri, M-HMOGA: a new multi-objective feature selection algorithm for handwritten numeral classification, J. Intell. Syst. 29 (2020), 1453–1467.10.1515/jisys-2019-0064Search in Google Scholar
[26] G. R. Harik, F. G. Lobo and D. E. Goldberg, IEEE Trans. Evol. Comput. 3 (1999), 287–297.10.1109/4235.797971Search in Google Scholar
[27] J. H. Holland, Genetic algorithms, Sci. Am. 1 (1992), 66–73.10.1038/scientificamerican0792-66Search in Google Scholar
[28] C. Huang and J. Dun, A distributed PSO–SVM hybrid system with feature selection and parameter optimization, Appl. Soft Comput. 8 (2008), 1381–1391.10.1016/j.asoc.2007.10.007Search in Google Scholar
[29] A. L. Kazakovtsev, A. N. Antamoshkin and V. V. Fedosov, Greedy heuristic algorithm for solving series of eee components classification problem, in: IOP Conf. Ser. Mater. Sci. Eng., 2016.10.1088/1757-899X/122/1/012011Search in Google Scholar
[30] J. Kennedy and R. C. Eberhart, A discrete binary version of the particle swarm algorithm, in: 1997 IEEE Int. Conf. Syst. Man, Cybern. Comput. Cybern. Simul., IEEE, pp. 4104–4108, 1997.Search in Google Scholar
[31] J. T. Kent, Information gain and a general measure of correlation, Biometrika. 70 (1983), 163–173.10.1093/biomet/70.1.163Search in Google Scholar
[32] R. Leardi, Application of genetic algorithm – PLS for feature selection in spectral data sets, J. Chemometr. 14 (2000), 643–655.10.1002/1099-128X(200009/12)14:5/6<643::AID-CEM621>3.0.CO;2-ESearch in Google Scholar
[33] C. Miles, S. J. Louis, N. Cole and J. McDonnell, Learning to play like a human: case injected genetic algorithms for strategic computer gaming, in: Proc. 2004 Congr. Evol. Comput. (IEEE Cat. No. 04TH8753), vol. 2, pp. 1441–1448, IEEE, Portland, OR, USA, 2004.10.1109/CEC.2004.1331066Search in Google Scholar
[34] S. Nemati, M. Ehsan, N. Ghasem-aghaee and M. Hosseinzadeh, Expert systems with applications A novel ACO–GA hybrid algorithm for feature selection in protein function prediction, Expert Syst. Appl. 36 (2009), 12086–12094.10.1016/j.eswa.2009.04.023Search in Google Scholar
[35] Y. Prasad, K. K. Biswas and C. K. Jain, SVM classifier based feature selection using GA, ACO and PSO for siRNA design, in: Advances in Swarm Intelligence, pp. 307–314, Springer, Berlin, 2010.10.1007/978-3-642-13498-2_40Search in Google Scholar
[36] Problem-specific knowledge in heuristics. 2016. http://antor.uantwerpen.be/problem-specific-knowledge-in-heuristics/ (accessed January 7, 2019).Search in Google Scholar
[37] E. Rashedi, H. Nezamabadi-Pour and S. Saryazdi, GSA: a gravitational search algorithm, Inf. Sci. (NY). 179 (2009), 2232–2248.10.1016/j.ins.2009.03.004Search in Google Scholar
[38] M. Sheikhan and N. Mohammadi, Neural-based electricity load forecasting using hybrid of GA and ACO for feature selection, Neural. Comput. Appl. 21 (2012), 1961–1970.10.1007/s00521-011-0599-1Search in Google Scholar
[39] J. Sun, B. Feng and W. Xu, Particle swarm optimization with particles having quantum behavior, in: Proc. 2004 Congr. Evol. Comput. (IEEE Cat. No. 04TH8753), pp. 325–331, IEEE, Portland, OR, USA, 2004.10.1109/CEC.2004.1330875Search in Google Scholar
[40] R. J. Tallarida and R. B. Murray, Chi-square test, in: Man. Pharmacol. Calc., pp. 140–142, Springer, New York, NY, 1987.10.1007/978-1-4612-4974-0_43Search in Google Scholar
[41] P. J. Van Laarhoven and E. H. Aarts, Simulated annealing, in: Simulated Annealing: Theory and Applications, 7–15, Springer, Dordrecht, 1987.10.1007/978-94-015-7744-1_2Search in Google Scholar
[42] X. Wang, J. Yang, X. Teng, W. Xia and R. Jensen, Feature selection based on rough sets and particle swarm optimization, Pattern Recognit. Lett. 28 (2007), 459–471.10.1016/j.patrec.2006.09.003Search in Google Scholar
[43] J. Wei, R. Zhang, Z. Yu, R. Hu, J. Tang, C. Gui and Y. Yuan, A BPSO-SVM algorithm based on memory renewal and enhanced mutation mechanisms for feature selection, Appl. Soft Comput. J. 58 (2017), 176–192.10.1016/j.asoc.2017.04.061Search in Google Scholar
[44] J. Yang and V. Honavar, Feature subset selection using a genetic algorithm, IEEE Intell. Syst. 13 (1998), 44–49.10.1007/978-1-4615-5725-8_8Search in Google Scholar
[45] Z. Zhu, Y. S. Ong and M. Dash, Markov blanket-embedded genetic algorithm for gene selection, Pattern Recognit. 40 (2007), 3236–3248.10.1016/j.patcog.2007.02.007Search in Google Scholar
©2020 Walter de Gruyter GmbH, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 Public License.