Open AccessArticle

Multi-Strategy Improved Binary Secretarial Bird Optimization Algorithm for Feature Selection

Department of Artificial Intelligence, Guangzhou Huashang College, Guangzhou 511300, China

College of Design, Hanyang University, Ansan 15588, Republic of Korea

School of Electrical Engineering, Shandong University, Jinan 250000, China

Author to whom correspondence should be addressed.

Mathematics 2025, 13(4), 668; https://doi.org/10.3390/math13040668

Submission received: 15 January 2025 / Revised: 12 February 2025 / Accepted: 17 February 2025 / Published: 18 February 2025

(This article belongs to the Special Issue Optimization Theory, Algorithms and Applications)

Download

Browse Figures

Versions Notes

Abstract

With the rapid development of large model technology, data storage as well as collection is very important to improve the accuracy of model training, and Feature Selection (FS) methods can greatly eliminate redundant features in the data warehouse and improve the interpretability of the model, which makes it particularly important in the field of large model training. In order to better reduce redundant features in data warehouses, this paper proposes an enhanced Secretarial Bird Optimization Algorithm (SBOA), called BSFSBOA, by combining three learning strategies. First, for the problem of insufficient algorithmic population diversity in SBOA, the best-rand exploration strategy is proposed, which utilizes the randomness and optimality of random individuals as well as optimal individuals to effectively improve the population diversity of the algorithm. Second, to address the imbalance in the exploration/exploitation phase of SBOA, the segmented balance strategy is proposed to improve the balance by segmenting the individuals in the population, targeting individuals of different natures with different degrees of exploration and exploitation performance, and improving the quality of the FS subset when the algorithm is solved. Finally, for the problem of insufficient exploitation performance of SBOA, a four-role exploitation strategy is proposed, which strengthens the effective exploitation ability of the algorithm and enhances the classification accuracy of the FS subset by different degrees of guidance through the four natures of individuals in the population. Subsequently, the proposed BSFSBOA-based FS method is applied to solve 36 FS problems involving low, medium, and high dimensions, and the experimental results show that, compared to SBOA, BSFSBOA improves the performance of classification accuracy by more than 60%, also ranks first in feature subset size, obtains the least runtime, and confirms that the BSFSBOA-based FS method is a robust FS method with efficient solution performance, high stability, and high practicality.

Keywords:

best-rand exploration strategy; segmented balance strategy; four-role exploitation strategy; feature selection; optimization algorithm

MSC:

65K10

1. Introduction

With the rapid development of large modeling technologies, there are more and more domains that require data storage and collection [1] to meet the requirements of improving model training accuracy, such as medical prediction [2], natural language processing [3], and digital media [4]. Unfortunately, however, feature elements in a dataset do not always recognize data relationships and often contain many unrepresentative feature elements in the dataset [5]. In addition, the processing cost of high-dimensional datasets usually grows exponentially [6], resulting in a waste of computational resources. In order to alleviate this problem, a common approach is to cull non-critical feature elements from the original dataset [7], which leads to the emergence of Feature Selection (FS) methods, which aim to minimize the redundant features in the original dataset and retain the subset of features with the highest information content, which reduces the computational cost and improves the interpretability of the dataset at the same time [8]. Therefore, the main objective of this study is to propose a stable and robust FS algorithm with efficient search performance to maximize the interpretability of the data warehouse.

Currently, there are two main types of common FS methods: filters and wrappers [9]. Among them, filter-based FS methods use the correlation metric unit to calculate the correlation between each feature and the target variable and select the top-ranked feature elements to form the optimal feature subset based on the correlation score, which has the advantage of high computational efficiency but is limited in capturing the intrinsic relationship between features, resulting in the selected feature subset not necessarily improving the model performance [10]. Unlike filter-based FS methods, wrapper-based FS methods use specific learning algorithms to evaluate the performance of feature subsets, such as Support Vector Machines (SVMs) [11], K-Nearest Neighbors (KNNs) [12], and Artificial Neural Networks (ANNs) [13], to avoid the black-box defects of filter-based FS methods and find the optimal FS subset based on the performance metrics by searching different feature subset combinations. performance metrics, the optimal FS subset, by selecting the most informative combination of feature subsets, substantially improves the model performance [14]. However, wrapper-based FS methods have challenges such as too large a search space and too many combinations in the process of searching for feature subset combinations, and meta-heuristic algorithms are able to effectively alleviate the above challenges in the search process and reduce the computational cost in the search for optimal subset combinations due to their flexibility and lightweight nature [15].

A metaheuristic algorithm is an optimization method inspired by the objective behavior of nature, which improves the efficiency and quality of problem solving by combining different heuristic rules, strategies, or algorithms [16]. We usually categorize metaheuristic algorithms into four types, which are evolutionary-based, human-based, population-based and physicochemical-based [17]. Among them, typical representatives of evolution-based algorithms include Genetic Algorithm (GA) [18], Evolutionary Strategies (ES) [19], and Biogeography-Based Optimization (BBO) [20]. Typical representatives of human-based algorithms are the Cognitive Behavior Optimization Algorithm (COA) [21], the Teaching-Learning-Based Optimization (TLBO) algorithm [22], and the Poor and Rich Optimization (PRO) [23]. Typical representatives of population-based algorithms are Cuckoo Search (CS) [24], Barnacles Mating Optimizer (BMO) [25], and Slime Mold Algorithm (SMA) [26]. Typical examples of physicochemical-based algorithms are the Big Bang-Big Crunch algorithm (BB-BC) [27], the Multi-Verse Optimizer (MVO) [28], and the Atom Search Optimization (ASO) [29].

In recent years, depending on the simplicity and flexibility of meta-heuristic algorithms, the cost of feature subset combination search can be effectively reduced, so many scholars have proposed FS methods based on meta-heuristic algorithms to reduce the redundant feature elements in the original dataset and to improve the interpretability of the data. For example, Mohammed et al. proposed an enhanced dwarf mongoose optimization algorithm to solve the FS problem, which exploits the non-uniformity of chaos theory to enhance the global search capability of the algorithm, and the proposed method is confirmed to enhance the classification accuracy of the feature subset by testing it on 10 UCL datasets. However, the actual running time was not considered in this experimental design, which may lead to a significant reduction in the practicality of the algorithm, and at the same time, by observing its convergence curve, we found that the algorithm suffers from localized stagnation when solving the FS problem [30]. Heba et al. proposed a binary golden jackal optimization algorithm to solve the FS problem, which reduces the dimensionality of the high-dimensional FS problem by utilizing copula entropy and thus improves the classification accuracy of the feature subset, but we can see its drawbacks in that it is only ranked fourth in terms of the actual runtime compared to the comparison algorithm, which is also ranked fourth. Meanwhile, the algorithm was only tested for performance on 15 datasets, which lacks a certain breadth, and its performance in solving high-dimensional FS problems may still need to be improved [31]. Zhu et al. proposed a hybrid artificial immune optimization to solve the high-dimensional FS problem, combining the fatal mutation mechanism of adaptive tuning factor and the Cauchy mutation operator to improve the search performance of the algorithm, and the results on 22 UCL datasets showed that the algorithm exceeded the classification accuracy of the algorithm by 78.2%, and performed the best in terms of feature subset size is the best performance, but it has some defects, the algorithm excessively reduces the number of feature elements when solving high-dimensional FS problems, which leads to its deletion of a part of feature elements with high information content, resulting in the loss of classification accuracy received [32]. Mostafa et al. proposed an improved gorilla troops optimizer to solve the FS problem, combining elite dyadic learning, Cauchy’s inverse cumulative distribution operator, and tangent flight strategy to improve the population diversity of the algorithm as well as the convergence speed, which led to the improvement of the classification accuracy of the feature subset in solving the dimensionality reduction problem with 16 UCL datasets, but the shortcomings is that the stability of the candidate feature subset combination is poor and the distribution is scattered, which is not conducive to the operation of the algorithm in real scenarios, and the computation time is also limited [33]. Pan et al. proposed an improved gray wolf optimization to solve the FS problem, combining coupla entropy as well as a competitive guidance strategy, which was confirmed to improve classification accuracy by performance evaluation on 10 high-dimensional FS datasets. However, it still has some limitations; on the one hand, the fitness function value is not improved, and on the other hand, the proposed method does not achieve good results in dealing with the FS problem for unbalanced category data [34]. Zhou et al. proposed an improved slime mold algorithm to solve the FS problem by introducing a local dimension mutation strategy and a full-dimensional neighborhood search strategy to reduce the probability of the algorithm falling into the trap of locally suboptimal feature subsets, which was confirmed to have an advantage in reducing the size of the feature subset through performance evaluation on 18 UCL datasets, but the shortcomings lies in the fact that features with high information content are often mistakenly deleted when dealing with high-dimensional FS problems, resulting in a loss of classification accuracy [35].

The above facts prove that the FS method based on the meta-heuristic algorithm has strong FS performance, and the combination of feature subsets localized using this algorithm can improve the classification accuracy to different degrees. However, it is undeniable that there are still some defects in solving the FS problem; for example, it is easy to fall into the trap of locally suboptimal feature subset combinations, which leads to a localized stagnation in the FS problem-solving process, resulting in a loss of classification accuracy and running time. At the same time, as the FS problem gradually becomes more complex, on the one hand, the algorithm will have excessive reduction in the number of feature elements, resulting in the deletion of feature elements with strong information characterization ability, resulting in the loss of classification accuracy. On the other hand, the algorithm excessively pursues the improvement of classification accuracy but does not break the limitation of computation time, resulting in the algorithm’s classification accuracy, feature subset size, and operation time not being well weighed, resulting in the algorithm not being able to effectively capture the intrinsic connection between the feature elements, which reduces the practicality and reliability of the algorithm. The above-mentioned problems motivate the need to explore a novel and suitable robust meta-heuristic-based FS algorithm to minimize the redundant features in the original dataset and use fewer feature elements to improve the classification accuracy in a limited runtime. Fortunately, the Secretary Bird Optimization Algorithm (SBOA) [36] has been proved to be a robust optimization tool with efficient search performance, and it has strong application scalability. To date, no relevant FS method research has involved SBOA, so in order to fill this gap and also improve FS problem classification accuracy when solving FS problems, we apply SBOA to solve challenging FS problems in this paper. Meanwhile, as the FS problem becomes more complex, considering that the SBOA may have the problem of falling into the combination of locally suboptimal feature subsets when solving high-dimensional FS problems, resulting in insufficient search performance, in this paper, we combine three novel and excellent learning strategies to enhance the exploration and exploitation capabilities of the SBOA in different degrees and propose an enhanced SBOA, which is known as BSFSBOA, that improves the search performance, stability, and reliability of the algorithm when solving FS.

Specifically, firstly, to address the lack of algorithmic population diversity of the original SBOA in solving the FS problem, the best-rand exploration strategy is proposed, which effectively improves the algorithm’s population diversity by utilizing the randomness and optimality of the random individuals as well as the optimal individuals. Second, to address the imbalance of the exploration/exploitation phase of the original SBOA in solving the FS problem, the segmented balance strategy is proposed to improve the balance, which makes the algorithm balanced in executing the FS problem by segmenting the individuals in the population and targeting the individuals with different natures for different degrees of exploration and exploitation performance and improving the quality of the FS subset when the algorithm solves. Finally, to address the lack of exploitation performance of SBOA in solving the FS problem, a four-role exploitation strategy is proposed, which strengthens the effective exploitation ability of the algorithm and enhances the classification accuracy of the FS subset by different degrees of guidance through the four natures of the individuals in the population. Subsequently, the proposed BSFSBOA-based FS method is applied to solve 36 FS problems involving low, medium, and high dimensions, which are analyzed by population diversity, exploration/exploitation balance, fitness function value, nonparametric test, convergence property, classification accuracy, feature subset size, and runtime, and it is confirmed that the BSFSBOA-based FS method is a robust FS method with efficient solution performance and stable and robust FS method. The contributions of this paper are as follows:

The best-rand exploration strategy is proposed to effectively improve the population diversity of the algorithm by utilizing the randomness and optimality of random individuals as well as optimal individuals.
The segmented balance strategy is proposed to enhance the quality of the FS subset when the algorithm is solved by segmenting the individuals in the population and targeting the individuals of different natures with different levels of exploration and exploitation performance.
The four-role exploitation strategy is proposed to enhance the effective exploitation of the algorithm and improve the classification accuracy of the FS subset by different degrees of guidance through the four natures of the individuals in the population.
A FS method based on BSFSBOA is proposed by combining the above three learning strategies, and the proposed FS method is utilized to solve 36 FS problems involving low, medium, and high dimensions, which confirms that it is a robust FS tool with efficient search performance.

The subsequent work of this paper is organized as follows: Section 2 introduces the knowledge inspiration and mathematical model of the original SBOA. Section 3 gives a detailed description of the best-rand exploration strategy, segmented balancing strategy, and four-role exploitation strategy and gives the operation logic of BSFSBOA. The proposed FS method based on BSFSBOA is used to solve 23 UCL FS problems involving low, medium, and high dimensions in Section 4, which is analyzed by statistical metrics to show that it is a robust FS tool with efficient search performance. Section 5 focuses on applying the proposed BSFSBOA-based FS method to solve the dimensionality reduction of 13 OpenML datasets. Section 6 mainly gives the conclusions of this paper and the organization of future work.

2. Mathematical Framework of Secretary Bird Optimization Algorithm

In this section, we mainly discuss the mathematical model and operational framework of SBOA, which is a novel population-based meta-heuristic algorithm formed by simulating the survival behaviors of secretary birds in nature. Specifically, SBOA mainly considers the hunting and escape behaviors of the secretary bird in its living environment and subsequently forms the way of updating individuals in SBOA by abstracting the mathematical modeling of the above two behaviors. When solving the FS problem, the algorithm adjusts the feature subset scheme through the above individual updating method until it returns the optimal feature subset scheme after satisfying the condition of the maximum number of iterations, thus reducing the redundancy degree of the original dataset. In general, the first step of the algorithm to execute the FS problem is to generate a set of candidate solutions, i.e., the initialization population phase, using the following formula for the individual schemes.

S_{i} = L b + r ⊙ (U b - L b), i = 1, 2, \dots N

(1)

where

S_{i}

denotes the

i th

individual position, i.e., the

i th

solution.

L b

and

U b

denote the lower and upper boundary constraints of the problem to be solved, respectively.

r

denotes a random vector generated in the interval [0,1].

S_{i}

L b

U b

, and

r

are vectors of size

1 \times d i m

, with

d i m

denoting the number of decision variables of the problem to be solved, also known as the problem dimensions.

N

denotes the number of candidate solutions in the population, also known as the population size. Subsequently, the generated

N

individuals form the initialized population during the iterative process of the algorithm, expressed as the following equation.

{S = [\begin{matrix} \begin{array}{l} S_{1} \\ S_{2} \end{array} \\ ⋮ \\ S_{i} \\ ⋮ \\ S_{N} \end{matrix}] = [\begin{matrix} s_{1, 1} & s_{1, 2} & \dots & s_{1, j} & \dots & s_{1, d i m} \\ s_{2, 1} & s_{2, 2} & \dots & s_{2, j} & \dots & s_{2, d i m} \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋱ & ⋮ \\ s_{i, 1} & s_{i, 2} & \dots & s_{i, j} & \dots & s_{i, d i m} \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋱ & ⋮ \\ s_{N, 1} & s_{N, 2} & \dots & s_{N, j} & \dots & s_{N, d i m} \end{matrix}]}_{N \times d i m}

(2)

where

S

denotes the initialized population consisting of

N

individuals,

S_{i} = (s_{1, 1} s_{1, 2} \dots s_{1, j} \dots s_{1, d i m})

. After generating the initialized population, the SBOA performs feature subset scheme tuning by simulating the hunting and escape behaviors of the secretary bird, which we mathematically model in the subsequent subsections. Notably, we assessed the quality of individuals using the fitness function values, expressed as the following equation.

{F = [\begin{matrix} F_{1} \\ ⋮ \\ F_{i} \\ ⋮ \\ F_{N} \end{matrix}]}_{N \times 1} = {[\begin{matrix} F (S_{1}) \\ ⋮ \\ F (S_{i}) \\ ⋮ \\ F (S_{N}) \end{matrix}]}_{N \times 1}

(3)

where,

F

represents the fitness function value vector, and

F_{i}

represents the fitness function value of the

i th

secret bird individual in the population.

2.1. Mathematical Modelling of Hunting Behavior

In this subsection we will mathematically model the hunting behavior of the secretary bird, a process that corresponds to the exploration phase of the algorithm in SBOA. In this case, the hunting behavior of the secretary bird consists of three main phases, i.e., seeking after one’s prey, depleting prey’s energy and attacking prey, and this process is shown in Figure 1. At the same time, the three phases described above are implemented at the same time intervals. The mathematical model of the three phases that make up the hunting behavior will be given in detail in the following subsections.

2.1.1. Seeking After One’s Prey

When hunting for prey, the secretary bird first needs to seek the location of the prey, and it can quickly find the hidden prey in the grass by its advantage in height as well as vision, and this process is modeled as the following equation.

s_{i, j}^{n e w, P 1} = s_{i, j} + (s_{r a n d o m 1, j} - s_{r a n d o m 2, j}) \cdot R_{1}

(4)

where

s_{i, j}^{n e w, P 1}

denotes the new state of the

j th

dimension of the

i th

secretary bird individual after position update through hunting behavior,

s_{i, j}

denotes the

j th

dimension value of the

i th

individual,

s_{r a n d o m 1, j}

and

s_{r a n d o m 2, j}

denote the

j th

dimension values of two mutually different individuals in the population, respectively, and

R_{1}

denotes a random number generated in the interval [0,1]. Subsequently, use the following equation to preserve the new state of the individual.

S_{i} = {\begin{cases} S_{i}^{n e w, P 1}, & i f F_{i}^{n e w, P 1} < F_{i} \\ S_{i}, & e l s e \end{cases}

(5)

where,

S_{i}

represents the position of the

i th

individual,

S_{i}^{n e w, P 1}

represents the new position of the

i th

individual updated through hunting behavior,

F_{i}

represents the fitness function value of the individual

S_{i}

, and

F_{i}^{n e w, P 1}

represents the fitness function value of individual

S_{i}^{n e w, P 1}

2.1.2. Depleting Prey’s Energy

After searching for prey, due to the prey’s aggressiveness, the secret bird still needs to consume the prey’s energy before attacking. The behavior of depleting prey’s energy is modeled as the following equation.

s_{i, j}^{n e w, P 1} = s_{b e s t, j} + e^{{(\frac{t}{T})}^{4}} \cdot (R_{2} - 0.5) \cdot (s_{b e s t, j} - s_{i, j})

(6)

where,

s_{b e s t, j}

represents the

j th

dimension value of the highest quality individual in the population,

e

represents exponential operation,

t

represents the current iteration number of the algorithm,

T

represents the maximum iteration number of the algorithm, and

R_{2}

represents a random number that follows a standard normal distribution. Subsequently, use Equation (5) to preserve the new state of the individual secret bird.

2.1.3. Attacking Prey

In hunting behavior, when the energy of the prey is depleted and the prey does not have the ability to attack, the secret bird will launch an attack on the prey to achieve the purpose of hunting. This process is modeled as the following equation. Subsequently, use Equation (5) to preserve the new state of the individual secret bird.

s_{i, j}^{n e w, P 1} = s_{b e s t, j} + {(1 - \frac{t}{T})}^{2 \cdot \frac{t}{T}} \cdot s_{i, j} \cdot R L

(7)

where,

R L

represents the flight state of the secret bird when initiating attack behavior, expressed using the following equation.

R L = 0.5 \cdot L e v y (d i m)

(8)

where,

L e v y (d i m)

represents the Levy distribution function and is solved using the following equation.

L e v y (d i m) = s \cdot \frac{μ \cdot σ}{| ν |^{\frac{1}{η}}}

(9)

where

s

and

η

, respectively, represent constants with values of 0.01 and 1.5,

μ

and

ν

represent random numbers generated within the interval [0,1], and

σ

is defined using the following equation.

σ = {(\frac{Γ (1 + η) \times s i n (\frac{π η}{2})}{Γ (\frac{1 + η}{2}) \times η \times 2 (\frac{η - 1}{2})})}^{\frac{1}{η}}

(10)

where,

Γ

represents the gamma function.

2.2. Mathematical Modelling of Escape Behaviour

In this section, the mathematical modeling of the escape behavior of the secret bird is mainly conducted. In nature, the secret bird will be attacked by eagles and foxes, and at this time, the secret bird will exhibit escape behavior. The escape behavior of a secret bird can be mainly divided into two types: one is to use its speed advantage to quickly escape and avoid prey attacks, and the other is to use the similarity between its environment and its own characteristics to disguise itself and avoid prey attacks. This process is shown in Figure 2. In the following sections, we will establish mathematical models for the two types of escape behaviors mentioned above.

2.2.1. Escaping by Run or Fly Away

In this section, we mainly mathematically model the escaping by run or fly-away behavior of the secret bird. When the secret bird is attacked by prey and has a good flying state, and the surrounding environment cannot hide, it will quickly fly to escape the prey’s attack. The behavior of escaping by run or fly away is modeled as the following equation.

s_{i, j}^{n e w, P 2} = s_{i, j} + R_{2} \cdot (s_{r a n d o m, j} - K \cdot s_{i, j})

(11)

where,

s_{i, j}^{n e w, P 2}

represents the new state of the

j th

dimension of the

i th

secret bird individual after position update through escape behavior,

s_{i, j}

represents the

j th

dimension value of the

i th

individual,

s_{r a n d o m, j}

represents the

j th

dimension value of a random individual in the population,

R_{2}

represents a random number following a standard normal distribution,

K

represents a constant randomly selected between the sets {1, 2}, expressed as the following equation.

K = r o u n d (1 + r a n d)

(12)

where,

r o u n d

represents the integer up function, and

r a n d

represents a random number in the interval [0,1]. Subsequently, use the following equation to preserve the new state of the individual secret bird.

S_{i} = {\begin{array}{l} S_{i}^{n e w, P 2}, & i f F_{i}^{n e w, P 2} < F_{i} \\ S_{i}, & e l s e \end{array}

(13)

where,

S_{i}

represents the position of the

i th

individual,

S_{i}^{n e w, P 2}

represents the updated new position of the

i th

individual through escape behavior,

F_{i}

represents the fitness function value of the individual

S_{i}

, and

F_{i}^{n e w, P 2}

represents the fitness function value of individual

S_{i}^{n e w, P 2}

2.2.2. Hiding with the Environment

In this section, we mainly mathematically model the hiding with the environment behavior of the secret bird. When attacked by natural enemies, if the surrounding environment is highly similar to itself, the secret bird will combine with the environment to hide and avoid the attack. This process is represented by the following equation.

s_{i, j}^{n e w, P 2} = s_{b e s t, j} + (2 \cdot R B - 1) \cdot {(1 - \frac{t}{T})}^{2} \cdot s_{i, j}

(14)

where,

R B

represents a random number that follows a standard normal distribution, and then Equation (13) is used to preserve the individual’s new state. The pseudocode of SBOA formed by initializing the population and simulating the hunting behavior and escape behavior of the secret bird is shown in Algorithm 1.

Algorithm 1: The execution of SBOA

Input: Initialize parameters:

d i m

U b

L b

N

T

t = 0

.
Output: the best candidate solution (

S_{b e s t}

).
1: Initialize individuals using Equation (1) and form an initialized population using Equation (2).
2: for

t = 1 : T

3: Update the best candidate solution (

S_{b e s t}

).
4: for

i = 1 : N

5: Hunting behavior (Exploration phase of SBOA):
6: if

t < (1 / 3) \cdot T

7: Update the position of the

i th

secret bird using Equation (4).
8: else if

(1 / 3) \cdot T < t < (2 / 3) \cdot T

9: Update the position of the

i th

secret bird using Equation (6).
10: else
11: Update the position of the

i th

secret bird using Equation (7).
12: end if
13: Use Equation (5) to keep the position of the

i th

secret bird

S_{i}

.
14: Escape Behavior (Exploitation Phase of SBOA):
15: if

r < 0.5

16: Update the position of the

i th

secret bird using Equation (11).
17: else
18: Update the position of the

i th

secret bird using Equation (14).
19: end if
20: Use Equation (13) to keep the position of the

i th

secret bird

S_{i}

.
21: end for
22: Save the best candidate solution (

S_{b e s t}

).
23: end for
24: Return the best candidate solution (

S_{b e s t}

3. Mathematical Modeling of BSFSBOA

The current FS problem is gradually developing towards high dimensionality and high complexity, which makes the feasible solution space of the problem very chaotic, leading to the algorithm needing to reduce the redundant features in the original dataset and needing to search for the optimal feature subset in the combinations of feature subsets of the power exponential nature in order to lock the optimal feature subset and to realize the role of the original dataset for cleaning. However, the original SBOA has the problem of insufficient population diversity in FS problem solving, which results in insufficient exploration performance of the algorithm, leading to its inability to locate the optimal subset region in the chaotic feasible solution space in an effective and rapid manner so that the quality of the feature subset cannot be ensured. At the same time, the exploration and exploitation phases of the original SBOA cannot achieve a good balance, which makes the algorithm always tend to perform either exploration or exploitation operations, making the algorithm’s FS search performance suffer a great loss. In addition, after the SBOA locates the optimal feature subset region, the SBOA, due to the limitation of its exploitation strategy, makes it impossible to effectively exploit the optimal feature subset region that it locates, resulting in an increase in the algorithm’s running time and a loss in classification accuracy. In order to alleviate the above problems of SBOA, an enhanced SBOA, called BSFSBOA, is proposed in this section by combining three learning strategies. Firstly, for the problem of insufficient algorithmic population diversity of the original SBOA in solving the FS problem, a best-rand exploration strategy is proposed in this section, which utilizes the randomness and optimality of random individuals as well as the optimal individuals to effectively enhance the algorithm’s population diversity and, at the same time, makes the algorithm’s exploitation performance guaranteed, and then the algorithm’s effective exploration ability is Enhancement. Secondly, for the imbalance of the exploration/exploitation phase of the original SBOA in solving the FS problem, a segmented balance strategy is proposed in this section to improve the balance, which can improve the balance of the algorithm in executing the FS problem by segmenting the individuals in the population and carrying out different degrees of exploration and exploitation performance for the different nature of the individuals to ensure and improve the balance of the algorithm in solving the FS problem. guaranteed and improved the quality of the FS subset when the algorithm was solved. Finally, for the problem of insufficient exploitation performance of SBOA in solving FS problems, this section proposes a four-role exploitation strategy, which strengthens the effective exploitation capability of the algorithm and enhances the classification accuracy of the FS subset by different degrees of guidance for individuals of four natures in the population. Through the introduction of a best-rand exploration strategy, a segmented balance strategy, and a four-role exploitation strategy, the global exploration ability of the algorithm in solving the FS problem is enhanced, which makes the algorithm able to effectively locate the globally optimal feature subset region and also improves the algorithm’s classification accuracy and reduces the running time. We will introduce the above three strategies in detail in the following subsections.

3.1. Best-Rand Exploration Strategy

As the dimension of the FS problem increases, for example, the clean and semeion datasets contain 167 and 256 feature information, respectively. The algorithm needs to search in 2¹⁶⁷ and 2²⁵⁶ feature subset combinations, respectively, when eliminating redundant features on these two datasets, which requires the algorithm to have a good population diversity during the search process, which can make the algorithm have a stronger This requires the algorithm to have a good population diversity during the search process, which can make the algorithm have a stronger ability to locate the optimal feature subset region. The original SBOA has insufficient population diversity in solving the FS problem, which makes the algorithm lose its global exploration ability and cannot localize the optimal FS subset region. Therefore, our main objective in this section is to propose a learning strategy with efficient global search performance to overcome the FS performance degradation due to the lack of population diversity of the algorithm. In the literature [37], it is pointed out that the population diversity of the algorithm can be improved through the guidance of random individuals. Based on this inspiration, at the same time, in order to avoid the excessive randomness causing the quality of the excellent individuals to be affected by the large, we combined with the optimal individual to carry out a part of the individual guidance and then proposed the best-rand exploration strategy. There are two main parts in this strategy. The first part is the bootstrapping through random individuals while combining adaptivity to enhance the algorithm’s population diversity, which in turn enhances the algorithm’s ability to locate the region of the globally optimal feature subset. The second part is individual bootstrapping through globally optimal individuals to ensure that the quality of excellent individuals is not lost and the quality of the FS subset is guaranteed. The global search ability of the algorithm is enhanced by the introduction of the best-rand exploration strategy, which enhances the ability of the algorithm to locate the globally optimal FS subset, reduces the risk of falling into a locally suboptimal FS subset, and improves the classification accuracy of the FS problem. The best-rand exploration strategy is expressed using the following equation.

s_{i, j}^{n e w, P 1} = {\begin{matrix} r \cdot s_{r 1, j} + z \cdot r \cdot (s_{r 2, j} - s_{r 3, j}) + (1 - z) \cdot r \cdot (s_{i, j} - s_{r 1, j}), & i f r a n d < 0.5 \\ r \cdot (s_{b e s t, j} - s_{i, j}) \cdot e^{b l} \cdot c o s (2 π l) + ({(\frac{t}{T})}^{2} \cdot r o u n d (1 + r)) \cdot s_{b e s t, j} & e l s e \end{matrix}

(15)

where

r

denotes a random number generated in the interval [0,1],

s_{r 1, j}

s_{r 2, j}

and

s_{r 3, j}

denotes the

j th

dimension values of three mutually unidentical random individuals in the population,

z = 1 - {(\frac{t}{T})}^{2}

denotes a nonlinear factor that decreases from 1 to 0 with the increase in the number of iterations,

c o s

denotes the cosine function,

r o u n d

denotes the upward-rounding function, and

b

is the adaptive control parameter, expressed in the following equation.

b = \frac{1}{1 + e^{- {(\frac{t}{T})}^{2}}}

(16)

l

denotes the cosine perturbation parameter, which mainly controls the degree of bootstrapping of the optimal individual, expressed as the following equation.

l = \frac{\sqrt{π}}{2} \cdot e^{- {(\frac{t}{T})}^{2}}

(17)

As can be seen from the expression of the best-rand exploration strategy, in the case of

r a n d < 0.5

, the

i th

, the individual is mainly guided by the random individuals as well as the gaps between the random individuals, and the degree of learning of different gaps is also controlled by a nonlinear decreasing factor, which greatly enhances the population diversity of the algorithm in the execution of the FS problem and enhances the algorithm’s ability to locate the global optimal FS subset region. In addition, in the other case, the

i th

individual is mainly guided by the globally optimal individual, while the combination of two different nonlinear adaptive control factors to control the degree of guidance makes the quality of the individual guaranteed and does not suffer from the loss of quality due to excessive randomization. The above two components together constitute the best-rand exploration strategy proposed in this section, which allows the algorithm’s global exploration performance to be enhanced while guaranteeing a certain degree of exploitation capability, making the algorithm less risky of falling into a subset of locally suboptimal features when executing the FS problem, improving the classification accuracy of the subset of features, and reducing the redundancy of the original dataset.

3.2. Segmented Balance Strategy

With the development of artificial intelligence technology, the complexity of the original dataset gradually increases in the solution of the FS problem. An ideal state of FS problem solving is to carry out global exploration in the exploration phase first to locate the global optimal feature subset region and then further exploit the localized region through the exploitation phase to improve the feature subset classification accuracy. The original SBOA does not achieve a good balance between the exploration phase and the exploitation phase, which leads to the phenomenon of over-exploration or over-exploitation, and then makes the algorithm unable to achieve a good balance between the exploration and the exploitation phases, which makes the algorithm’s ability to locate the globally optimal feature subset region weakened, and at the same time, the feature subset classification accuracy is also degraded. In order to alleviate the above problem, in this section we propose the segmented balance strategy. First, we sort all the individuals in the population according to the fitness function value and perform a segmentation operation on the sorted individuals to divide the individuals in the population into the worse individual segment and the better individual segment, and then we again divide the better individual segment into two segments again by the fitness function value, which are called the first segment and the second segment, and the quality of individuals in the first segment is higher than the quality of individuals in the second segment. At the same time, we divided the worse individual segment into two segments by the fitness function value, which are called the third and fourth segments, and the quality of individuals in the third segment is higher than the quality of individuals in the fourth segment. After the segmentation is completed, we consider the properties of each segment to determine its exploration/exploitation trend, which in turn enhances the exploration/exploitation phase balance of the algorithm. First, for the second and fourth segment individuals, since they both possess worse individual properties in the better and worse individual segments, respectively, we need to enhance the exploration capabilities of the second and fourth segment individuals by updating their positions through the following equation.

s_{i, j}^{n e w, P 2} = {\begin{cases} s_{i, j} + r \cdot (s_{f i r s t, j} - s_{i, j}) + (1 - \frac{t}{T}) \cdot s_{r a n d, j}, i f s_{i, j} \in S e c o n d S e g m e n t \\ s_{i, j} + r \cdot (s_{t h i r d, j} - s_{i, j}) + (1 - \frac{t}{T}) \cdot s_{r a n d, j}, i f s_{i, j} \in F o u r t h S e g m e n t \end{cases}

(18)

where

s_{f i r s t, j}

denotes the

j th

dimension value of the randomly selected individuals in the first segment of individuals,

s_{t h i r d, j}

denotes the

j th

dimension value of the randomly selected individuals in the third segment of individuals, and

s_{r a n d, j}

denotes the

j th

dimension value of the randomly selected individuals in the population. From the equation, it can be seen that a certain degree of randomness is guaranteed while guaranteeing the enhancement of the exploitation of the second segment individuals and the fourth segment individuals, which is mainly reflected in the possession of randomness in guiding the selection of individuals. In addition, for the first segment individuals and the third segment individuals, since they both possess better individual attributes in the better individual segment and the worse individual segment, respectively, in order to exclude the individuals in the segment from being caught in the local suboptimal subset trap, we enhance the exploration ability of the first segment individuals and the third segment individuals by updating the position through the following equation.

s_{i, j}^{n e w, P 1} = {\begin{cases} s_{i, j} + r \cdot (s_{b e t t e r_r a n d, j} - s_{i, j}) + (1 - \frac{t}{T}) \cdot s_{r a n d, j}, i f s_{i, j} \in F i r s t S e g m e n t \\ s_{i, j} + r \cdot (s_{w o r s e_r a n d, j} - s_{i, j}) + (1 - \frac{t}{T}) \cdot s_{r a n d, j}, i f s_{i, j} \in T h i r d S e g m e n t \end{cases}

(19)

where

s_{b e t t e r_r a n d, j}

denotes the

j th

dimension value of the individual randomly selected among the individuals in the better segment and

s_{w o r s e_r a n d, j}

denotes the

j th

dimension value of the individual randomly selected among the individuals in the worse segment. From the equation, it can be seen that the first segment individuals and the third segment individuals are randomly guided by the individuals in the better individual segment and the worse individual segment, respectively, which makes the individuals have strong global search capability and reduces the risk of falling into a subset of locally suboptimal features. In order to visually represent the above segmented balance strategy, we visualize it as Figure 3.

By introducing the segmented balance strategy, we segment the population and provide different levels of exploration and exploitation capabilities for individuals with different properties, making the exploration/exploitation phase of the algorithm more balanced, which not only improves the algorithm’s ability to localize the globally optimal feature subset region but also reduces the risk of an individual falling into a locally suboptimal feature subset and improves the feature classification accuracy of the subset.

3.3. Four-Role Exploitation Strategy

After locating the globally optimal feature subset region, the algorithm needs to have a strong and effective exploitation ability to make the classification accuracy guaranteed, but unfortunately, the original SBOA is not able to exploit the potential optimal feature subset region after locating the region due to the gradual increase in the chaos of the original dataset in the execution of the FS problem, mainly due to the excessive use of globally optimal individual guidance in the exploitation phase of the original SBOA’s exploitation phase is overly guided by the global best individuals, which leads to the problem that it has the tendency to fall into the local suboptimal feature subsets, resulting in the loss of feature subset classification accuracy. In order to alleviate this problem, this section proposes a four-role exploitation strategy, which combines the uniquely efficient exploitative nature of fractional order theory [38] with the consideration of the information of the best, better, worse, and random individuals in the population and defines the degree of learning of each type of individual through the fractional order so that in the exploitation phase, the global best individuals should be considered to the maximum extent, followed by the better, worse, and random individuals. The above method fully utilizes the information of the four types of individuals in the population, which greatly improves the effective exploitation ability, and at the same time, the tail term of the polynomial is controlled with certain randomness to avoid over-exploitation phenomenon, which improves the classification accuracy of the feature subset and reduces the actual running time. A four-role exploitation strategy is defined using the following equation.

\begin{array}{l} s_{i, j}^{n e w, P 2} & = \frac{1}{1!} \cdot Q \cdot (s_{b e s t, j} - s_{i, j}) + \frac{1}{2!} \cdot Q \cdot (1 - Q) \cdot (s_{b e t t e r, j} - s_{i, j}) + \frac{1}{3!} \cdot Q \cdot (1 - Q) \cdot (2 - Q) \cdot (s_{w o r s e, j} - s_{i, j}) \\ + \frac{1}{4!} \cdot Q \cdot (1 - Q) \cdot (2 - Q) \cdot (3 - Q) \cdot (s_{r a n d, j} - s_{i, j}) + K \cdot (s_{r a n d 1, j} - s_{r a n d 2, j}) \end{array}

(20)

where “

!

” denotes a factorial operation,

s_{b e t t e r, j}

denotes the

j th

dimension value of a randomly selected individual in the top 50% of the quality ranked population,

s_{w o r s e, j}

denotes the

j th

dimension value of a randomly selected individual in the bottom 50% of the quality ranked population,

s_{r a n d 1, j}

and

s_{r a n d 2, j}

denote the

j th

dimension values of two mutually unidentical individuals randomly selected in the population.

Q

denotes the fractional order self-adaptation factor using the following equation.

Q = \frac{1}{1 + e^{(\frac{t}{T})}} \cdot c o s (2 \cdot π \cdot (\frac{t}{T}))

(21)

K

denotes the tail term control factor, expressed as the following equation.

K = \frac{(1 - \frac{t}{T}) + (1 - {(\frac{t}{T})}^{2})}{2}

(22)

The proposed four-role exploitation strategy in this section enhances the effective exploitation of the algorithm by combining four types of individuals, which are learned through fractional order theory. At the same time, a certain degree of randomness is controlled by the polynomial tail term, and the value of the tail term control coefficient

K

decreases nonlinearly with iteration, which ensures that there is a certain degree of randomness in the early iteration period, and the exploitation will dominate in the late iteration period. In summary, the four-role exploitation strategy proposed in this section makes the effective exploitation of the algorithm improve, reduces the probability of falling into a locally suboptimal feature subset due to over-exploitation, and improves the classification accuracy of the feature subset.

3.4. Implementation of the BSFSBOA

In response to the insufficient exploration ability, imbalance between exploration and development phases, and inadequate development capability of the original SBOA in solving complex FS problems, this section introduces an improved version of SBOA called BSFSBOA. By incorporating the best-rand exploration strategy, segmented balance strategy, and four-role development strategy, BSFSBOA addresses these issues and enhances the algorithm’s global search capability in solving FS problems. It reduces the likelihood of candidate solutions falling into local suboptimal feature subsets and improves the classification accuracy of the selected feature subset. The specific implementation of BSFSBOA is outlined in Algorithm 2, and for a more visual depiction of its execution logic, Figure 4 illustrates the flowchart of BSFSBOA.

Algorithm 2: The execution of BSFSBOA

Input: Initialize parameters:

d i m

U b

L b

N

T

t = 0

.
Output: the best candidate solution (

S_{b e s t}

).
1: Initialize individuals using Equation (1) and form an initialized population using Equation (2).
2: for

t = 1 : T

3: Update the best candidate solution (

S_{b e s t}

).
4: if

r a n d < 0.5

5: for

i = 1 : N

6: Hunting behavior (Exploration phase of BSFSBOA):
7: if

r a n d < 0.5

8: if

t < (1 / 3) \cdot T

9: Update the position of the

i th

secret bird using Equation (4).
10: else if

(1 / 3) \cdot T < t < (2 / 3) \cdot T

11: Update the position of the

i th

secret bird using Equation (6).
12: else
13: Update the position of the

i th

secret bird using Equation (7).
14: end if
15: else
16: Update the position of the

i th

secret bird using Equation (15).
17: end if
18: Use Equation (5) to keep the position of the

i th

secret bird

S_{i}

.
19: Escape Behavior (Exploitation Phase of BSFSBOA):
20: if

r a n d < 0.5

21: if

r < 0.5

22: Update the position of the

i th

secret bird using Equation (11).
23: else
24: Update the position of the

i th

secret bird using Equation (14).
25: end if
26: else
27: Update the position of the

i th

secret bird using Equation (20).
28: end if
29: Use Equation (13) to keep the position of the

i th

secret bird

S_{i}

.
30: end for
31: else
32: for

i = 1 : N

33: Update the position of the

i th

secret bird using Equations (18) and (19).
34: Use Equations (5) and (13) to keep the position of the

i th

secret bird

S_{i}

.
35: end for
36: end if
37: Save the best candidate solution (

S_{b e s t}

).
38: end for
39: Return the best candidate solution (

S_{b e s t}

4. Discussion of Experimental Results on the UCL Datasets

In this section, we focus on evaluating the performance of the BSFSBOA-based FS method proposed in this paper. Specifically, we use the BSFSBOA-based FS method to solve 23 FS datasets involving low, medium, and high dimensions, with dataset-specific information shown in Table 1. Subsequently, the FS performance of BSFSBOA is evaluated through a detailed discussion of population diversity, exploration/exploitation phase balance, fitness function values, stability, Friedman nonparametric statistics, convergence, and running time during the algorithm operation. Meanwhile, in order to visualize the performance advantages of BSFSBOA, we experimentally compare it with eight efficient and novel algorithms, and the specific parameter setting information of the comparison algorithms is shown in Table 2. Meanwhile, in order to avoid the chance of the experiments, we execute each group of comparison experiments independently and without repetition for 30 times in order to count the experimental results. All the codes in this experiment are written and executed on Matlab2022 software on the Windows 11 operating system.

4.1. The Modeling of the FS Problems

In this section, we model the FS problem. The objective of the FS problem is to reduce the redundant features in the original dataset and to obtain the maximum classification accuracy using less feature information. Therefore, the establishment of the objective function involves two important indexes, the classification error rate and the feature subset size, and we establish the objective function of the FS problem according to the designed important indexes as follows:

m i n f (S_{i}) = α_{1} \cdot e r r o r + α_{2} \cdot \frac{R}{n}

(23)

where

S_{i}

denotes the location information of the

i th

individual, i.e., the candidate solution,

e r r o r

denotes the classification error obtained by using the selected feature subset to classify the original dataset,

R

denotes the number of feature information in the selected feature subset, and

n

denotes the number of feature information in the original dataset,

α_{1}

denotes a constant taking the value 0.9,

α_{2} = 1 - α_{1}

It is worth noting that BSFSBOA performs the FS problem in such a way that each dimension value of an individual is a continuous real value throughout the iterative process, so when solving for the fitness function value, we need to convert each dimension value of an individual to a discrete value in such a way as to select the feature subset and compute the fitness function value. In the following, we give the exact procedure to compute the fitness function values.

Step1: The real-valued individual

S_{i} = (s_{i, 1} s_{i, 2} \dots s_{i, j} \dots s_{i, D i m})

is converted to the binary individual

S_{i}^{c} = (s_{i, 1}^{c} s_{i, 2}^{c} \dots s_{i, j}^{c} \dots s_{i, D i m}^{c})

by the following equation.

s_{i, j}^{c} = {\begin{cases} 0, i f s_{i, j} < 0.5 \\ 1, e l s e \end{cases} i = 1, 2, \dots, N; j = 1, 2, \dots, D i m

(24)

Step2: Feature subset selection is performed in the original dataset using

S_{i}^{c}

, where

s_{i, j}^{c} = 1

means that the

i th

feature in the original dataset is selected and

s_{i, j}^{c} = 0

means that the

i th

feature in the original dataset is not selected.

Step3: The classification accuracy is calculated using the subset of features selected in step 2, where the prediction method uses KNN; in this paper, the value of K is 5.

Step4: Combining the feature subset size obtained in Step 2, and the classification error rate obtained in Step 3, the fitness function value is calculated using Equation (23).

4.2. Sensitivity Analysis of Operating Parameters

In the previous subsections, we have explained the model of the problem and the execution logic of BSFSBOA, but before the FS experiments can be carried out, we need to determine the running parameters of this paper, i.e., the population size and the maximum number of iterations. Usually, a larger population may lead to slower convergence of the algorithm, increasing the complexity of the search, which in turn will reduce the convergence accuracy. In addition, too small a population size may easily lead to premature convergence of the algorithm, i.e., the algorithm may fall into a local optimal solution and cannot be further optimized. Therefore, in this section, it is first necessary to determine the optimal population size parameter suitable for solving the FS problem. At the same time, since a larger number of iterations will result in a waste of computational resources, it is also necessary in this section to determine the appropriate maximum number of iterations that can make the algorithm have a strictly converged state when solving the FS problem for subsequent experiments. In order to determine the population size and maximum number of iterations parameters to be used for FS problems, we analyze the convergence behavior using six FS problems involving low, medium, and high dimensions, and the experimental results are shown in Figure 5, where the Y-axis denotes the value of the fitness function, and the X-axis denotes the number of iterations.

From the figure, we can see that in the process of iteration number growing from 0 to 200, the convergence accuracy and speed of the population size set to 10 and 50 are lower than the performance of the population size set to 30, which is mainly due to the fact that the larger population size increases the complexity of the search, and smaller populations increase the probability of the algorithm falling into a subset of locally suboptimal features, so the convergence accuracy is affected, according to this phenomenon, we reasonably set the population size to 30 to make the algorithm have better performance in solving the FS problem. Meanwhile, we can find that the algorithm reaches a stable convergence state at about 90 iterations by testing on 6 FS datasets, while the algorithm still possesses a stable convergence state as the iterations progress. Based on this phenomenon, we reasonably set the maximum number of iterations to 100 so that the algorithm can achieve a stable state of convergence while saving computational resources when solving the FS problem.

4.3. Population Diversity Analysis

In this section, we focus on analyzing the population diversity of the proposed BSFSBOA-based FS method. The original SBOA suffers from its global search performance due to the lack of population diversity during the solution process, which leads to its inability to effectively locate the globally optimal feature subset region. Meanwhile, when solving the high-dimensional FS problem, the ability of individuals to jump out of the local suboptimal FS subset at the late iteration stage is insufficient, resulting in a loss of classification accuracy. In order to reflect the improvement in population diversity of BSFSBOA, we visualize the real-time population diversity of BSFSBOA as well as SBOA in solving FS problems, as shown in Figure 6, where the X-axis denotes the number of iterations and the Y-axis denotes the population diversity value.

As can be seen from Figure 6, when solving the low-dimensional FS datasets Banana and Bupa, BSFSBOA is ahead of SBOA in terms of population diversity throughout the iteration process, and then as the number of iterations increases, the population diversity gradually decreases, but it still occupies a certain degree of superiority. The above phenomenon reflects the fact that the best-rand exploration strategy and segmented balance strategy proposed in this paper effectively enhance the population diversity of the algorithm in executing the low-dimensional FS problem. Compared to SBOA, BSFSBOA possesses higher global search performance in executing low-dimensional FS problems, resulting in improved classification accuracy for FS problems. Meanwhile, as the FS problem dimension increases, the BSFSBOA maintains high population diversity throughout the iteration process when solving the medium-dimensional FS datasets Vote and Lymphography, which indicates that as the FS problem dimension increases, the best-rand exploration strategy and the segmented balance strategy proposed in this paper can still effectively guarantee the global search performance of the algorithm. A segmented balance strategy can still effectively guarantee the global search performance of the algorithm and enhance the ability of the algorithm to locate the global optimal FS subset region. Finally, as the dimension of the FS problem further increases, the algorithm is often prone to falling into the local optimal FS subset trap. Fortunately, the BSFSBOA proposed in this paper maintains high population diversity throughout the iteration process when solving the high-dimensional FS datasets Clean as well as Isolet, which helps to enhance the algorithm to jump out of the local suboptimal FS subset trap and improve the algorithm’s classification accuracy. In summary, we can conclude that the best-rand exploration strategy and the segmented balance strategy proposed in this paper can improve the population diversity of the algorithm in solving FS problems and enhance the convergence accuracy of the algorithm.

4.4. Exploration/Exploitation Balance Analysis

In this section, we focus on analyzing the exploration/exploitation balance of the BSFSBOA-based FS method proposed in this paper. An excellent swarm intelligence algorithm should achieve a good balance between the exploration phase and the development phase. Specifically, during the execution of the algorithm, the ideal state of the algorithm should be to locate the potential globally optimal feature subset region through the exploration phase first and then further exploit the potentially optimal region localized in the exploration phase through the development phase so as to make the algorithm’s accuracy improve, and through the mutual assistance of the two phases, the FS performance of the algorithm can be greatly improved. Through the two phases, the FS performance of the algorithm can be greatly improved. Figure 7 shows the real-time exploration/exploitation rate of BSFSBOA in solving the FS problem, where the X-axis represents the number of iterations, the red line corresponds to the exploitation rate, and the blue line corresponds to the exploration rate.

From the figure, it can be seen that the BSFSBOA-based FS method for solving the low-dimensional datasets Banana and Bupa possesses a high exploration rate in the early stage of the iteration, which helps to improve the algorithm’s ability to locate the globally optimal FS subset regions, and then, the exploitation rate gradually rises as the iteration proceeds, which enables the algorithm to further exploit the potential optimal regions identified in the early stage to improve the classification accuracy, meanwhile, the algorithm still maintains a high exploration rate in the later stages of the iteration, which ensures the algorithm’s ability to jump out of the trap of locally suboptimal FS subsets. The above phenomena are mainly attributed to the best-rand exploration strategy, the four-role exploitation strategy, and the segmented balance strategy proposed in this paper, which provide a good control of the exploration/exploitation balance of the algorithm. In addition, as the dimension of the FS problem increases, the BSFSBOA-based FS method still maintains a high exploration performance in the early part of the iteration when solving the medium- as well as high-dimensional datasets Vote, Lymphography, Semeion, and Isolet, which ensures that the algorithm is able to explore the entire solution space in a more extensive way. Meanwhile, it maintains a high exploitation rate in the later part of the iteration, which effectively strengthens the classification accuracy when solving the FS problem. The above experimental phenomena confirm that the best-rand exploration strategy, the four-role exploitation strategy, and the segmented balance strategy proposed in this paper are able to balance the exploration/exploitation phases of the algorithm well, which can help to improve the global optimization performance of the algorithm and enhance the FS classification accuracy of the algorithm.

4.5. Fitness Function Value Analysis on the UCL Datasets

In this section, we focus on experimentally evaluating the performance of the BSFSBOA-based FS method proposed in this paper. Specifically, we use the BSFSBOA-based FS method to solve 23 FS datasets involving low, medium, and high dimensions and compare the performance with eight excellent algorithms, and each experiment is executed independently and unrepeatably for 30 times in order to count the results of the experiments, which include the best fitness function value, the mean fitness function value, the worse fitness function value, and the ranking based on the above metrics. The results of the experiments on the low, medium, and high dimensions are shown in Table 3, Table 4 and Table 5, respectively, where “Best”, “Mean”, and “Worse” represent the best fitness function, mean fitness function, and worse fitness function values of the algorithm in solving the corresponding dataset, respectively. values, “Rank” represents the ranking based on the above three metrics, “Mean Rank” represents the average ranking of the algorithm’s performance on all the datasets, and ‘Final Rank’ represents the final ranking of the algorithm based on the ‘Mean Rank’.

From Table 3, it can be seen that the FS method based on BSFSBOA achieves first place with 87.5% probability in the best fitness function value metrics when solving eight low-dimensional FS problems, i.e., it ranks first on seven low-dimensional datasets and achieves a very excellent FS performance, which is mainly attributed to the best-rand exploration strategy and the segmented balance strategy proposed in this paper, enhancing the algorithm’s ability to locate the global optimal feature subset region, which makes the algorithm explore the whole solution space. Meanwhile, due to the introduction of a four-role exploitation strategy, the algorithm is able to locate the potential optimal feature subset region and then further develop it in a timely and efficient manner to improve the classification accuracy of the selected feature subset. Meanwhile, in terms of the mean fitness function value index, the FS method based on BSFSBOA achieves a 100% winning rate compared to the comparison algorithms, which demonstrates a strong solution stability. In order to more intuitively show the solution stability of BSFSBOA, Figure 8 demonstrates the box plots of BSFSBOA in solving the low-dimensional FS problems, from which it can also be seen that in most cases, BSFSBOA possesses higher solution concentration, which is mainly due to the segmented balance strategy balancing the exploration/exploitation phases of the algorithm, which makes the algorithm’s solution accuracy and stability improved. Additionally, in terms of the worst fitness function value index, the FS method based on BSFSBOA achieves a winning rate of 87.5% compared to the comparison algorithm, and the fault tolerance of the solution is lower than that of EEFO only in the dimensionality reduction in the Glass dataset, which also shows that although the BSFSBOA proposed in this paper achieves a good advantage in solving the FS dataset of low dimensions, the fault tolerance of the solution in the specific dataset still needs to be improved. The fault tolerance when solving specific datasets still needs to be improved. However, from a comprehensive point of view, BSFSBOA possesses the highest candidate solution tolerance compared to the comparison algorithms, which is beneficial to be executed in a realistic production environment. The average ranking of BSFSBOA in solving low-dimensional FS problems is shown in Figure 9, from which it can be seen that BSFSBOA possesses better solution performance with lower column heights in all the three metrics Best, Mean, and Worse. In summary, we can conclude that the BSFSBOA-based FS method in solving low-dimensional FS problems, due to the introduction of the best-rand exploration strategy, segmented balance strategy, and four-role exploitation strategy in this paper, the algorithm’s optimization ability, stability, and fault tolerance are well improved, so it can be considered that BSFSBOA is an excellent FS method.

Meanwhile, as can be seen from Table 4, as the FS problem dimensions increase, the BSFSBOA-based FS method ranks first in the optimal fitness function value metrics on the six medium-dimensional datasets when solving the eight medium-dimensional FS problems, with a winning rate of 75%, and achieves a stronger FS performance compared to the original SBOA, which also demonstrates that the proposed best-rand exploration strategy and segmented balance strategy proposed in this paper are still effective in improving the algorithm’s global optimization capability as the FS problem complexity increases. Meanwhile, the introduction of a four-role exploitation strategy can still effectively promote the classification accuracy of the algorithm, and the quality of the selected feature subset is well guaranteed. Meanwhile, in terms of the mean fitness function value index, the FS method based on BSFSBOA achieves a 100% winning rate compared to the comparison algorithms, which mainly reflects that the candidate solutions selected by BSFSBOA have less volatility and show stronger industrial applicability. Figure 10 demonstrates the box plots of BSFSBOA for solving the FS problem of medium dimensions, and it can also be seen that in the majority of cases, the BSFSBOA corresponds to the lowest box height, and also the number of anomalous candidate solutions is less, which is mainly due to the fact that the strategy proposed in this paper is still able to effectively escape from the feature subset traps as the dimension of the FS problem increases. Additionally, in terms of the worse fitness function value metric, the BSFSBOA-based FS method achieves a 100%-win rate compared to the comparison algorithms, showing the highest fault tolerance, which indicates that as the FS dimension increases, the feature subset scheme obtained using BSFSBOA still possesses a stronger fault tolerance capability. The average ranking of BSFSBOA in solving low-dimensional FS problems is shown in Figure 11, from which it can be seen that BSFSBOA possesses better solution performance with medium column heights in all three metrics: Best, Mean, and Worse. In summary, we can conclude that as the FS problem dimensionality increases, the best-rand exploration strategy, segmented balance strategy, and four-role exploitation strategy proposed in this paper are still able to provide the best results for the algorithm globally as the FS problem dimensionality increases, and the BSFSBOA-based FS method is still able to provide the best results for the algorithm globally. An exploitation strategy can still provide a comprehensive enhancement of the algorithm’s global exploration capability, and at the same time, the stability as well as fault tolerance is also well improved.

Meanwhile, as can be seen from Table 5, with the sharp increase in the dimensionality of the FS problems, the BSFSBOA-based FS method obtains the first place in the dimension reduction of six datasets when solving seven high-dimensional FS problems, which achieves a substantial improvement in the performance with respect to the SBOA, which confirms that the three learning strategies proposed in this paper, when faced with the challenge of high-dimensional FS problems, can still enable the algorithms to seek the optimal feature subset of the This also confirms that the three learning strategies proposed in this paper can still improve the algorithm’s ability to seek the optimal feature subset when facing the challenge of high-dimensional FS problems and have strong robustness. Meanwhile, in terms of the mean fitness function value metric, the FS method based on BSFSBOA achieves first place in all seven high-dimensional FS problems, and Figure 12 demonstrates the box plots of BSFSBOA in solving the high-dimensional FS problems, and we can find that BSFSBOA possesses lower box heights as well as fewer anomalies, and the distributions of the candidate solutions are very stable, which is beneficial for the complex environment of the solution decision-making. Meanwhile, BSFSBOA also achieves a 100%-win rate in the worse fitness function value metric, obtaining higher fault tolerance. The average ranking of BSFSBOA in solving high-dimensional FS problems is shown in Figure 13, from which it can be seen that BSFSBOA possesses better performance solutions with lower column heights in all three metrics: Best, Mean, and Worse. In summary, we can conclude that with the improvement of the FS problem dimension, the best-rand exploration strategy, the segmented balance strategy, and the four-role exploitation strategy proposed in this paper, we can still effectively strengthen the algorithm’s global search performance and local optimization ability, which in turn improves the classification accuracy of feature subsets.

Finally, by counting the data on the 23 datasets, it can be found that the average rank of the BSFSBOA-based FS method in solving the 23 FS datasets is 1.174, which is 74.7% higher than that of the original SBOA and 60% higher than that of the second-ranked NOA. above. Figure 14 shows the average ranking of BSFSBOA in solving the 23 FS problems, from which it can be seen that BSFSBOA obtains better solution performance with lower column heights in all three metrics: best, average, and worse. The above analysis of the experimental results illustrates that the best-rand exploration strategy, segmented balance strategy, and four-role exploitation strategy introduced in this paper have greatly improved the algorithm’s global optimization ability and its ability to jump out of the trap of local suboptimal subsets, which makes the BSFSBOA-based algorithm more efficient. greatly improved, which makes the FS method based on BSFSBOA have high FS performance and can be regarded as an FS method with high performance, high stability, and robustness.

4.6. Friedman Nonparametric Test Analysis on the UCL Datasets

In the previous subsection we mainly analyzed the fitness function values of the BSFSBOA-based FS method and confirmed that BSFSBOA is a FS method with efficient solution performance as well as stability. However, since the numerical results may be affected by anomalies, for example, an anomalous candidate solution may make the mean fitness value affected. Therefore, in this section, in order to eliminate the effects of outliers and evaluate the FS performance of BSFSBOA in a comprehensive manner, we evaluate the results of 30 runs using Friedman’s nonparametric test with a significance factor of 0.05, and the experimental results are shown in Table 6, where “Mean Rank” denotes the average Friedman rank on the 23 datasets, and “Final Rank” denotes the final rank of the algorithm based on the metric “Mean Rank”.

From the table, it can be seen that the FS method based on BSFSBOA achieves a 100% winning rate in solving the dimensionality reduction of eight low-dimensional datasets, and achieves the first place on all eight low-dimensional datasets, and its average Friedman rank on the low-dimensional datasets is 1.83, and its lead over the second-ranked HO on the average Friedman rank reaches 60.4%, and it achieves a very good performance on the low-dimensional dataset, which is mainly due to the best-rand exploration strategy, segmented balance strategy and four-role exploitation strategy introduced in this paper to enhance the exploration ability and exploitation capability of the algorithm from different degrees. capability. Meanwhile, the FS method based on BSFSBOA also achieves a 100% win rate when solving the dimensionality reduction of eight medium-dimension datasets, and its average Friedman rank on medium-dimension datasets is 1.23, which is ahead of the second-ranked NOA up to 71.7%, which confirms that, as the dimensionality of the FS problem is increased, the three learning strategies proposed in this paper can still make a significant contribution to the algorithm’s FS performance boost. Finally, as the dimensionality of the FS problem increases dramatically, the BSFSBOA-based FS method achieves a 100% winning rate in solving the dimensionality reduction of seven high-dimensional datasets, and at the same time, the average Friedman ranking is 1.36, which is 66% ahead of the second-ranked EEFO and achieves an excellent performance, which also shows that the proposed BSFSBOA-based FS methods proposed in this paper still have high solution performance when dealing with high-dimensional FS problems due to the efficiency of the strategy. Meanwhile, through the “Mean Rank” metric, we can find that the average ranking of the BSFSBOA-based FS method on the 23 dataset dimensionality reduction problems is 1.48, which is far ahead of the comparison algorithms. In summary, we can conclude that the best-rand exploration strategy, segmented balance strategy, and four-role exploitation strategy proposed in this paper can effectively improve the algorithm’s global search capability, jump out of the local suboptimal feature subset trap, and enhance the algorithm’s ability to search for suboptimal features, and at the same time, its promotion effect is still effective with the increase in FS dimension. The facilitation of the three learning strategies enables BSFSBOA to have efficient FS performance.

4.7. Convergence Analysis on the UCL Datasets

In the above subsections, we analyze the performance of the FS method based on BSFSBOA from the perspectives of population diversity, exploration/exploitation ratio, fitness function value, and Friedman’s nonparametric test, which confirms that it is an efficient and stable FS method with high performance and stability and is able to efficiently clean the redundant features in the original dataset and improve the classification accuracy of the feature subset. However, in addition, the convergence nature of the algorithm is also very important in the actual production environment, and a good FS method should have fast convergence speed in addition to high convergence accuracy. Therefore, in this section, we mainly analyze the convergence curves of the FS method based on BSFSBOA. The convergence curves of BSFSBOA in solving the dimensionality reduction problems of low-dimensional, medium-dimensional, and high-dimensional datasets are shown in Figure 15, Figure 16 and Figure 17, respectively, in which the X-axis denotes the number of iterations, and the Y-axis denotes the value of the fitness function.

From Figure 15, it can be seen that all algorithms are able to effectively reduce the fitness function value when solving the low-dimensional FS problem, but the FS method based on BSFSBOA has higher convergence accuracy and faster convergence speed, and it has established a clear lead after about the 20th iteration, which is mainly due to the four-role This is mainly due to the four-role exploitation strategy introduced in this paper, which greatly improves the development capability of the algorithm and thus accelerates the convergence speed of the algorithm. At the same time, we can see that the FS method based on BSFSBOA continues the convergence behavior throughout the iteration process, which is mainly due to the best-rand exploration strategy in this paper. The segmented balance strategy improves the algorithm’s population diversity so that the algorithm in the late iteration can still explore the more promising regions for obtaining the optimal feature subset. Through the above analysis, it is confirmed that the BSFSBOA introduced in this paper not only has higher convergence accuracy but also has a faster convergence speed than the comparison algorithm, which ensures that the algorithm has high practicality.

Meanwhile, it can be seen from Figure 16 that all algorithms are still able to effectively perform the reduction of the fitness function value and reach the convergence state as the dimension of the FS problem increases. However, it is worth noting that the FS method based on BSFSBOA possesses a faster convergence speed, and it has established a large lead after the 20th iteration. Meanwhile, it is able to continuously optimize the feature subset in the subsequent iterations, which enables the fitness function value to be further reduced. In addition, with the sharp increase in the dimension of the FS problem, when the dimension exceeds 100, it can be seen from Figure 17 that all the algorithms are able to effectively reduce the fitness function value of the high-dimensional FS problem, but it is undeniable that, with the sharp increase in the dimension of the FS, the algorithms suffer a loss in the convergence speed, and the FS method based on the BSFSBOA establishes a certain lead only after the 40th iteration, which is mainly due to the search space becoming very complex, but compared to the comparison algorithms, BSFSBOA is able to continuously optimize the feature subset. This is mainly due to the fact that the search space becomes very complex, but compared with the comparison algorithms, BSFSBOA has a faster convergence speed, and it can also be found that BSFSBOA is able to continuously optimize the subset of features, and continuously clean the original dataset, which is mainly due to the fact that the learning strategy introduced in this paper enhances the algorithm’s global optimization performance, and at the same time, the strategy proposed in this paper has a better adaptability to high-dimensional FS problems. In summary, we can conclude that with the improvement of the FS problem dimension, the three learning strategies proposed in this paper can still contribute greatly to the global search performance of the algorithms, and have better adaptability than the comparison algorithms, although the performance will suffer a certain loss, based on which we can consider that the FS method based on the BSFSBOA not only possesses higher convergence accuracy, but also has better practicability, and is a promising FS method.

4.8. Classification Accuracy and Feature Subset Size Analysis on the UCL Datasets

In the above subsections, we have demonstrated that the BSFSBOA-based FS method possesses efficient solution performance as well as better convergence properties. However, it is indisputable that the main objective of the FS problem is to minimize the redundant features in the original dataset and retain more representative feature information to improve the classification accuracy. Therefore, in this section, we focus on the classification accuracy as well as the feature subset size to further analyze the FS performance of BSFSBOA. Table 7 represents the classification accuracy of the algorithm in solving the FS problem, and Table 8 represents the subset size feature obtained by the algorithm in solving the FS problem.

Combining Table 7 and Table 8, we can see that the FS method based on BSFSBOA achieves a 100% win rate in classification accuracy when solving the dimensionality reduction of eight low-dimensional datasets, which is more than 70% ahead of the performance of the second-ranked EEFO, and at the same time, it is more than 80% ahead of the performance of the original SBOA, and the above advantages are mainly due to the fact that the introduction of the best-rand The above advantages are mainly due to the fact that the best-rand exploration strategy and segmented balance strategy introduced in this paper reduce the probability of the algorithm falling into the trap of locally suboptimal feature subsets, while the introduction of the four-role exploitation strategy effectively improves the algorithm’s convergence accuracy so that the quality of the feature subsets is greatly improved. It is worth noting that, through Table 8, we can find that the FS method based on BSFSBOA is ranked fourth in terms of feature subset size when solving the low-dimensional FS problem, which is weaker than EO, WOA, and SBOA, which is mainly due to the limitations of EO, WOA, and SBOA due to the limitations of the search strategy that make the algorithms fall into local suboptimal FS subset traps in the process, resulting in their excessive elimination of the original data subset. This leads to its excessive elimination of the highly representative feature elements in the original dataset, resulting in a decrease in the classification accuracy.

In addition, the FS method based on BSFSBOA achieves first place in 7 FS problems with a winning rate of 87.5% when solving the dimensionality reduction of 8 medium-dimensional datasets, which demonstrates a very excellent performance compared to the comparison algorithms, which is mainly due to the fact that the strategy proposed in this paper improves the development performance and exploration performance of the algorithms from different perspectives. However, it has to be admitted that BSFSBOA is no more than 1% weaker than DE and EO in terms of accuracy when solving the dimensionality reduction in the Vehicle dataset, which is mainly due to the fact that the BSFSBOA proposed in this paper still suffers from a certain lack of searching ability when solving the dimensionality reduction in some specific datasets. In addition, Table 8 shows that the FS method based on BSFSBOA ranks first in terms of feature subset size when solving the FS problem in medium dimensions, which effectively achieves a higher classification accuracy with less feature information to recognize the whole dataset.

Finally, with the sharp increase in the dimension of the FS problem, the BSFSBOA-based FS method obtains the first place in the classification accuracy among the six-dataset downscaling when solving the seven high-dimensional datasets downscaling, which also shows that compared with the comparison algorithms, the three learning strategies proposed in this paper perform better in adapting to the high-dimensional problem solving, demonstrating the robustness of the strategies. Meanwhile, from Table 8, it can be found that when solving the high-dimensional FS problem, the FS method based on BSFSBOA obtains the first-place average score in the feature subset size metrics, which shows that due to the introduction of the three learning strategies in this paper, the algorithms are able to efficiently eliminate the redundant features in the high-dimensional FS dataset and make an efficient selection of the feature information.

In summary, although the performance of the BSFSBOA-based FS method is weaker than the comparison algorithms on specific datasets. However, from a comprehensive point of view, especially from the “Final Rank” index in the last row of the table, it can be seen that due to the advantages of the three learning strategies proposed in this paper, which realize the maximum reduction in the redundancy of the original dataset features and further enhance the classification accuracy of the algorithm, it can be considered as an effective FS method. Meanwhile, Figure 18 demonstrates the stacked graph of the algorithm in terms of feature subset size and classification accuracy, from which it can be seen that BSFSBOA possesses a lower height, which also indicates that BSFSBOA can well achieve a better balance in terms of feature subset size and classification accuracy and possesses a higher global search capability.

4.9. Runtime Analysis on the UCL Datasets

In the above subsections, it has been confirmed that BSFSBOA has high FS performance from several perspectives, but in addition to this, the actual running time of the algorithm during its actual execution is also important, which is directly related to the practicality of the algorithm. Therefore, in this section, we mainly analyze the running time of the FS method based on BSFSBOA, and the experimental results are shown in Table 9. From the table, we can see that the BSFSBOA-based FS method achieves the smallest runtime on the dimensionality reduction of 15 datasets, with a winning rate of 65.2%, and from the “Mean Rank” metric, we can see that BSFSBOA achieves the smallest average runtime on the solution of 23 FS problems. The above advantages are mainly due to the fact that the strategy proposed in this paper is lightweight, which reduces the computational cost during the execution of the algorithm. Therefore, we can consider the FS method based on BSFSBOA proposed in this paper as a practical FS method.

5. Discussion of Expanded Experiments on the OpenML Datasets

In Section 4, we focus on evaluating the FS performance of BSFSBOA proposed in this paper by testing it on 23 UCL datasets involving low, medium, and high dimensions and analyzing in detail the fitness values, stability, convergence, and runtime statistics in the experiments to confirm that BSFSBOA is a robust method with efficient FS performance. However, in order to make the performance of the BSFSBOA-based FS method widely verified, in this section, the FS performance of BSFSBOA is further evaluated using 13 FS datasets from OpenML, and the performance of BSFSBOA is objectively and comprehensively examined by analyzing the fitness values, classification accuracies, feature subset sizes, and runtimes. In addition, in order to enhance the persuasiveness of the experiments, seven very good and stable algorithms, namely DE, EO, WOA, SBOA, HEOA, LSHADE, and IMODE, are used in this section for comparison, which illustrates more intuitively the FS performance possessed by BSFSBOA. Among them, the specific information of the 13 OpenML datasets used for the experiments is shown in Table 10.

5.1. Fitness Function Value Analysis on the OpenML Datasets

In this section, we focus on evaluating the performance of the BSFSBOA-based FS method proposed in this paper on 13 OpenML datasets. Specifically, we solved 13 FS problems from OpenML using the BSFSBOA-based FS method and compared the performance with seven excellent algorithms. Each experiment was executed independently and non-repeatedly for 30 times, and the results of the experiments were also counted, which included the best fitness function value, the mean fitness function value, the worse fitness function value, and the rankings based on the above-mentioned metrics. The experimental results are shown in Table 11, where “Best”, “Mean”, and “Worse” represent the best fitness function values of the algorithms in solving the corresponding datasets, respectively, and Mean and Worse fitness function values, respectively; “Rank” represents the ranking based on the above three metrics; “Mean Rank” represents the average ranking of the algorithm on all datasets, “Final Rank” represents the final ranking of the algorithm based on ‘Mean Rank’.

As can be seen from Table 11, in terms of the best fitness value metric, BSFSBOA based on BSFSBOA ranks first on the 10 FS datasets, with a winning rate of 76.9%, compared to HEOA, which is in the second place with a leading rate of 63.8%, and it achieves excellent FS performance. This is mainly due to the fact that the best-rand exploration strategy and the segmented balance strategy proposed in this paper enhance the algorithm’s ability to locate the globally optimal feature subset region, enabling the algorithm to explore the entire solution space. Meanwhile, due to the introduction of a four-role exploitation strategy, the algorithm is able to locate potential optimal feature subset regions and further exploit them in a timely and effective manner, thus improving the classification accuracy of the selected feature subset. Second, in terms of the mean fitness metric, the BSFSBOA-based FS method obtains the first place on all 13 FS datasets, which is nearly 72.3% ahead of the second place in terms of performance. This result illustrates that when solving the FS problem, the segmented balance strategy balances the exploration/exploitation phase of the algorithm, which makes the algorithm have a higher solution accuracy and stability, which helps to improve the operational stability of the algorithm in real complex environments. Accuracy and stability are improved, which makes the FS method based on BSFSBOA have higher solution stability, which helps to improve the stability of the algorithm’s operation in real complex environments. Finally, for the worse fitness value metric, the optimal value was achieved on 12 FS datasets, and it was only weaker than HEOA on the Pizzacutter3 dataset, with a winning rate of 92.3%, demonstrating greater fault tolerance, which very positively contributes to the algorithm’s robustness. In addition, Figure 19 visualizes the average ranking of the different algorithms, from which it can also be seen that the BSFSBOA-based FS method achieves the lowest column heights on 13 datasets, characterizing that it obtains a higher performance compared to the other algorithms, which is a confirmation of the fact that due to the three efficient learning strategies introduced in this paper, BSFSBOA can be considered as an efficient and stable FS method.

5.2. Classification Accuracy and Feature Subset Size Analysis on the OpenML Datasets

In the above subsections, we have shown that the BSFSBOA-based FS method has efficient solution performance on 13 FS datasets in OpenML. However, it is worth reminding that the main goal of the FS problem is to reduce the redundant features in the original dataset as much as possible, retain more representative feature information, and then achieve the goal of using a small number of representative feature elements to characterize the whole dataset, improve the classification accuracy, and reduce the computation time. Therefore, in this section, we focus on the discussion of classification accuracy and feature subset size to further analyze the FS performance of BSFSBOA. Table 12 represents the classification accuracy of the algorithm in solving 13 FS problems from OpenML, and Table 13 represents the corresponding achieved feature subset size.

As can be seen from Table 12, the FS method based on BSFSBOA obtains the highest classification accuracy on 11 FS problems when solving 13 FS datasets in OpenML, with a winning rate of 84.6%, ahead of the second-ranked HEOA by nearly 65.9%, and the above advantages are mainly due to the introduction of the best-rand exploration strategy and segmented balance strategy introduced in this paper reduces the probability of the algorithm falling into the trap of locally suboptimal feature subsets, while the introduction of four-role exploitation strategy effectively improves the convergence accuracy of the algorithm, so that the quality of the feature subsets is greatly improved, which in turn improves the classification accuracy. In addition, compared with the original SBOA, the algorithm enhanced by the three learning strategies improves the average ranking by 4.16, which also confirms that the strategy proposed in this paper has a great contribution to the performance of the algorithm. Although BSFSBOA achieves good classification accuracy, it has to be admitted that the classification accuracy is weaker than that of WOA and IMODE in solving the two datasets, PC1 and Piechart3, respectively, which shows that the BSFSBOA proposed in this paper still has some limitations in dimensionality reduction in very few datasets. However, on the whole, BSFSBOA possesses better comprehensive performance. In addition, we can see from Table 13 that all the algorithms are able to reduce the size of feature subsets from different degrees, but the noteworthy phenomenon is that the BSFSBOA proposed in this paper achieves the second best final average ranking, while the SBOA obtains the first ranking, the reason for this phenomenon mainly lies in the fact that in the process of feature dimensionality reduction in the SBOA, due to the limitations of its own strategy that The reason for this phenomenon is that SBOA can easily fall into the trap of locally sub-optimal feature subsets, which leads to its excessive reduction in the number of subsets and the deletion of representative features. This argument can be mainly confirmed from Table 12, for example, in the dimensionality reduction in the PC4, Piechart3, and Pizzacutter3 datasets, although SBOA achieves a smaller subset of features from the aspect of classification accuracy. It is easy to see that the feature subset obtained by SBOA makes the classification accuracy greatly reduced, which is mainly due to the fact that SBOA also deletes the effective features. The BSFSBOA proposed in this paper, due to its powerful learning strategy, can balance the classification accuracy and the feature subset size and show a strong global optimization ability.

The advantages and disadvantages of the BSFSBOA-based FS approach in terms of classification accuracy and feature subset size metrics are illustrated above. BSFSBOA, due to its powerful learning strategy, makes it able to balance the two aspects of classification accuracy and feature subset size. Figure 20 shows the stacking diagram of the algorithm in terms of classification accuracy and feature subset size metrics, from which it can be seen that BSFSBOA corresponds to the smallest stacking length, which indicates that although BSFSBOA has certain defects in dimensionality reduction in specific datasets, it can still be considered as a promising FS method from a comprehensive perspective.

5.3. Runtime Analysis on the OpenML Datasets

In the above subsections, we have confirmed the high performance of BSFSBOA in solving the problem of dimensionality reduction in datasets from OpenML from various perspectives, but in addition, the actual running time of the algorithm during the actual execution is also important, which is directly related to the practicality of the algorithm. Therefore, this section mainly analyzes the actual running time of BSFSBOA, and the experimental results are shown in Table 14. From the table, it can be seen that the FS method based on BSFSBOA achieves the smallest running time on the dimensionality reduction of 6 datasets, accounting for 46.1%, while ranking second on 4 datasets. Less runtime is achieved compared to the comparison algorithms. Meanwhile, from the “Mean Rank” metric, it can be seen that BSFSBOA achieves the smallest average runtime on 13 FS problems with an average rank of 2.08. The above advantages are mainly due to the fact that the strategy proposed in this paper is lightweight, which reduces the computational cost during the execution of the algorithm. Therefore, we can consider the BSFSBOA-based FS method proposed in this paper as a practical FS method.

6. Conclusions and Future Work

In this paper, to address the shortcomings of the original SBOA in FS problem solving, we propose an enhanced SBOA, called BSFSBOA, by combining three efficient and novel learning strategies. First, to address the lack of algorithmic population diversity of the original SBOA in solving the FS problem, the best-rand exploration strategy is proposed in this section, which utilizes the randomness and optimality of random individuals as well as optimal individuals to effectively enhance the population diversity of the algorithm. Second, to address the imbalance of the exploration/exploitation phase of the original SBOA in solving the FS problem, the segmented balance strategy is proposed in this section to improve the balance, which makes the algorithm balanced in executing the FS problem by segmenting the individuals in the population, targeting the individuals with different natures for different degrees of exploration and exploitation performance enhancement that improves the quality of the FS subset when the algorithm solves it. Finally, to address the lack of exploitation performance of SBOA in solving the FS problem, the four-role exploitation strategy is proposed, which strengthens the effective exploitation capability of the algorithm and enhances the classification accuracy of the FS subset by different degrees of guidance through the four natures of individuals in the population. Subsequently, the proposed BSFSBOA-based FS method is applied to solve 36 FS problems involving low, medium, and high dimensions, and it is confirmed that the introduction of the best-rand exploration strategy, the segmented balance strategy, and the four-role exploitation strategy makes the algorithm’s The introduction of the global exploration capability is improved, which enhances the algorithm’s classification accuracy and reduces the running time, demonstrating that the FS method based on BSFSBOA is a robust FS method with efficient solving performance and stable robustness.

However, by analyzing the experimental results, it is undeniable that the BSFSBOA-based FS method proposed in this paper, although it achieves good performance on most FS problems, still has certain performance deficiencies in the dimensionality reduction in some specific FS datasets. Therefore, our future research will be devoted to the following aspects: (1) Propose targeted strategies to enhance the strategy for the specific FS dataset dimensionality reduction problem with respect to the above existing problems. (2) The BSFSBOA proposed in this paper is a combinatorial optimization method, and we should extend it to more combinatorial optimization domains, such as aviation scheduling, robot control, and PV model prediction. (3) The current research on SBOA lacks a multi-objective version, and a multi-objective version of SBOA should be developed in future work to solve more optimization problems.

Author Contributions

Conceptualization, F.C.; methodology, F.C. and S.Y.; software, F.C.; validation, S.Y., J.W. and J.L.; formal analysis, F.C. and J.W.; investigation, F.C.; resources, S.Y. and J.W.; data curation, J.L.; writing—original draft preparation, F.C. and S.Y.; writing—review and editing, F.C., S.Y. and J.W.; visualization, J.L.; supervision, J.L.; project administration, S.Y. and J.W.; funding acquisition, S.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the Guangzhou Huashang College Daoshi Project, grant number 2024HSDS06; in part by the Key Area Special Project for General Colleges and Universities in Guangdong Province, grant number 2024ZDZX3035.

Data Availability Statement

Data can be obtained by contacting the corresponding author.

Acknowledgments

Many thanks to the reviewers and to the editorial team for their efforts on our article!

Conflicts of Interest

The authors declare no conflict of interest.

References

Nssibi, M.; Manita, G.; Korbaa, O. Advances in nature-inspired metaheuristic optimization for feature selection problem: A comprehensive survey. Comput. Sci. Rev. 2023, 49, 100559. [Google Scholar] [CrossRef]
Albukhanajer, W.A.; Briffa, J.A.; Jin, Y. Evolutionary multiobjective image feature extraction in the presence of noise. IEEE Trans. Cybern. 2014, 45, 1757–1768. [Google Scholar] [CrossRef]
Manbari, Z.; AkhlaghianTab, F.; Salavati, C. Hybrid fast unsupervised feature selection for high-dimensional data. Expert Syst. Appl. 2019, 124, 97–118. [Google Scholar] [CrossRef]
Zawbaa, H.M.; Emary, E.; Grosan, C.; Snasel, V. Large-dimensionality small-instance set feature selection: A hybrid bio-inspired heuristic approach. Swarm Evol. Comput. 2018, 42, 29–42. [Google Scholar] [CrossRef]
Abdulwahab, H.M.; Ajitha, S.; Saif, M.A.N. Feature selection techniques in the context of big data: Taxonomy and analysis. Appl. Intell. 2022, 52, 13568–13613. [Google Scholar] [CrossRef]
Wang, X.; Yang, J.; Teng, X.; Xia, W.; Jensen, R. Feature selection based on rough sets and particle swarm optimization. Pattern Recognit. Lett. 2007, 28, 459–471. [Google Scholar] [CrossRef]
Dash, M.; Liu, H. Feature selection for classification. Intell. Data Anal. 1997, 1, 131–156. [Google Scholar] [CrossRef]
Kabir, M.M.; Shahjahan, M.; Murase, K. A new hybrid ant colony optimization algorithm for feature selection. Expert Syst. Appl. 2012, 39, 3747–3763. [Google Scholar] [CrossRef]
Chen, F.; Ye, S.; Xu, L.; Xie, R. FTDZOA: An Efficient and Robust FS Method with Multi-Strategy Assistance. Biomimetics 2024, 9, 632. [Google Scholar] [CrossRef]
Ghaemi, M.; Feizi-Derakhshi, M.R. Feature selection using forest optimization algorithm. Pattern Recognit. 2016, 60, 121–129. [Google Scholar] [CrossRef]
Jiménez-Cordero, A.; Morales, J.M.; Pineda, S. A novel embedded min-max approach for feature selection in nonlinear support vector machine classification. Eur. J. Oper. Res. 2021, 293, 24–35. [Google Scholar] [CrossRef]
Wang, A.; An, N.; Chen, G.; Li, L.; Alterovitz, G. Accelerating wrapper-based feature selection with K-nearest-neighbor. Knowl. Based Syst. 2015, 83, 81–91. [Google Scholar] [CrossRef]
Nemnes, G.A.; Filipoiu, N.; Sipica, V. Feature selection procedures for combined density functional theory—Artificial neural network schemes. Phys. Scr. 2021, 96, 065807. [Google Scholar] [CrossRef]
Yuan, C.; Zhao, D.; Heidari, A.A.; Liu, L.; Chen, Y.; Chen, H. Polar lights optimizer: Algorithm and applications in image segmentation and feature selection. Neurocomputing 2024, 607, 128427. [Google Scholar] [CrossRef]
Das, H.; Naik, B.; Behera, H.S. A Jaya algorithm based wrapper method for optimal feature selection in supervised classification. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 3851–3863. [Google Scholar] [CrossRef]
El-kenawy, E.S.M.; Albalawi, F.; Ward, S.A.; Ghoneim, S.S.; Eid, M.M.; Abdelhamid, A.A.; Ibrahim, A. Feature selection and classification of transformer faults based on novel meta-heuristic algorithm. Mathematics 2022, 10, 3144. [Google Scholar] [CrossRef]
Hashim, F.A.; Hussien, A.G. Snake Optimizer: A novel meta-heuristic optimization algorithm. Knowl. Based Syst. 2022, 242, 108320. [Google Scholar] [CrossRef]
Holland, J.H. Genetic algorithms. Sci. Am. 1992, 267, 66–73. [Google Scholar] [CrossRef]
Beyer, H.G.; Schwefel, H.P. Evolution strategies–a comprehensive introduction. Nat. Comput. 2002, 1, 3–52. [Google Scholar] [CrossRef]
Simon, D. Biogeography-based optimization. IEEE Trans. Evol. Comput. 2008, 12, 702–713. [Google Scholar] [CrossRef]
Li, M.; Zhao, H.; Weng, X.; Han, T. Cognitive behavior optimization algorithm for solving optimization problems. Appl. Soft Comput. 2016, 39, 199–222. [Google Scholar] [CrossRef]
Rao, R.V.; Savsani, V.J.; Vakharia, D.P. Teaching–learning-based optimization: A novel method for constrained mechanical design optimization problems. Comput. Aided Des. 2011, 43, 303–315. [Google Scholar] [CrossRef]
Moosavi, S.H.S.; Bardsiri, V.K. Poor and rich optimization algorithm: A new human-based and multi populations algorithm. Eng. Appl. Artif. Intell. 2019, 86, 165–181. [Google Scholar] [CrossRef]
Gandomi, A.H.; Yang, X.S.; Alavi, A.H. Cuckoo search algorithm: A metaheuristic approach to solve structural optimization problems. Eng. Comput. 2013, 29, 17–35. [Google Scholar] [CrossRef]
Sulaiman, M.H.; Mustaffa, Z.; Saari, M.M.; Daniyal, H. Barnacles mating optimizer: A new bio-inspired algorithm for solving engineering optimization problems. Eng. Appl. Artif. Intell. 2020, 87, 103330. [Google Scholar] [CrossRef]
Li, S.; Chen, H.; Wang, M.; Heidari, A.A.; Mirjalili, S. Slime mould algorithm: A new method for stochastic optimization. Future Gener. Comput. Syst. 2020, 111, 300–323. [Google Scholar] [CrossRef]
Erol, O.K.; Eksin, I. A new optimization method: Big bang–big crunch. Adv. Eng. Softw. 2006, 37, 106–111. [Google Scholar] [CrossRef]
Mirjalili, S.; Mirjalili, S.M.; Hatamlou, A. Multi-verse optimizer: A nature-inspired algorithm for global optimization. Neural Comput. Appl. 2016, 27, 495–513. [Google Scholar] [CrossRef]
Zhao, W.; Wang, L.; Zhang, Z. Atom search optimization and its application to solve a hydrogeologic parameter estimation problem. Knowl. Based Syst. 2019, 163, 283–304. [Google Scholar] [CrossRef]
Abdelrazek, M.; Abd Elaziz, M.; El-Baz, A.H. CDMO: Chaotic Dwarf Mongoose optimization algorithm for feature selection. Sci. Rep. 2024, 14, 701. [Google Scholar] [CrossRef]
Askr, H.; Abdel-Salam, M.; Hassanien, A.E. Copula entropy-based golden jackal optimization algorithm for high-dimensional feature selection problems. Expert Syst. Appl. 2024, 238, 121582. [Google Scholar] [CrossRef]
Zhu, Y.; Li, W.; Li, T. A hybrid artificial immune optimization for high-dimensional feature selection. Knowl. Based Syst. 2023, 260, 110111. [Google Scholar] [CrossRef]
Mostafa, R.R.; Gaheen, M.A.; Abd ElAziz, M.; Al-Betar, M.A.; Ewees, A.A. An improved gorilla troops optimizer for global optimization problems and feature selection. Knowl. Based Syst. 2023, 269, 110462. [Google Scholar] [CrossRef]
Pan, H.; Chen, S.; Xiong, H. A high-dimensional feature selection method based on modified Gray Wolf Optimization. Appl. Soft Comput. 2023, 135, 110031. [Google Scholar] [CrossRef]
Zhou, X.; Chen, Y.; Wu, Z.; Heidari, A.A.; Chen, H.; Alabdulkreem, E.; Wang, X. Boosted local dimensional mutation and all-dimensional neighborhood slime mould algorithm for feature selection. Neurocomputing 2023, 551, 126467. [Google Scholar] [CrossRef]
Fu, Y.; Liu, D.; Chen, J.; He, L. Secretary bird optimization algorithm: A new metaheuristic for solving global optimization problems. Artif. Intell. Rev. 2024, 57, 123. [Google Scholar] [CrossRef]
El-Kenawy, E.S.M.; Khodadadi, N.; Mirjalili, S.; Abdelhamid, A.A.; Eid, M.M.; Ibrahim, A. Greylag goose optimization: Nature-inspired optimization algorithm. Expert Syst. Appl. 2024, 238, 122147. [Google Scholar] [CrossRef]
Pahnehkolaei, S.M.A.; Alfi, A.; Machado, J.T. Analytical stability analysis of the fractional-order particle swarm optimization algorithm. Chaos Solitons Fractals 2022, 155, 111658. [Google Scholar] [CrossRef]
Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95-International Conference on Neural Networks, Perth, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948. [Google Scholar]
Storn, R. On the usage of differential evolution for function optimization. In Proceedings of the North American Fuzzy Information Processing, Berkeley, CA, USA, 19–22 June 1996; pp. 519–523. [Google Scholar]
Faramarzi, A.; Heidarinejad, M.; Stephens, B.; Mirjalili, S. Equilibrium optimizer: A novel optimization algorithm. Knowl. Based Syst. 2020, 191, 105190. [Google Scholar] [CrossRef]
Mirjalili, S.; Lewis, A. The whale optimization algorithm. Adv. Eng. Softw. 2016, 95, 51–67. [Google Scholar] [CrossRef]
Abdel-Basset, M.; Mohamed, R.; Jameel, M.; Abouhawwash, M. Nutcracker optimizer: A novel nature-inspired metaheuristic algorithm for global optimization and engineering design problems. Knowl. Based Syst. 2023, 262, 110248. [Google Scholar] [CrossRef]
Zhao, W.; Wang, L.; Zhang, Z.; Fan, H.; Zhang, J.; Mirjalili, S.; Cao, Q. Electric eel foraging optimization: A new bio-inspired optimizer for engineering applications. Expert Syst. Appl. 2024, 238, 122200. [Google Scholar] [CrossRef]
Amiri, M.H.; Mehrabi Hashjin, N.; Montazeri, M.; Mirjalili, S.; Khodadadi, N. Hippopotamus optimization algorithm: A novel nature-inspired optimization algorithm. Sci. Rep. 2024, 14, 5032. [Google Scholar] [CrossRef] [PubMed]
Lian, J.; Hui, G. Human evolutionary optimization algorithm. Expert Syst. Appl. 2024, 241, 122638. [Google Scholar] [CrossRef]
Tanabe, R.; Fukunaga, A.S. Improving the search performance of SHADE using linear population size reduction. In Proceedings of the 2014 IEEE Congress on Evolutionary Computation (CEC), Beijing, China, 6–11 July 2014; pp. 1658–1665. [Google Scholar]
Sallam, K.M.; Abdel-Basset, M.; El-Abd, M.; Wagdy, A. IMODEII: An Improved IMODE algorithm based on the Reinforcement Learning. In Proceedings of the 2022 IEEE Congress on Evolutionary Computation (CEC), Padua, Italy, 18–23 July 2022; pp. 1–8. [Google Scholar]

Figure 1. Hunting behavior simulation of secretary bird.

Figure 2. Escape behavior simulation of secretary bird.

Figure 3. Simulation diagram of segmented balance strategy.

Figure 4. Flowchart of the execution of BSFSBOA.

Figure 5. Convergence plot of the algorithm for different population sizes.

Figure 6. Population diversity in SBOA and BSFSBOA runs.

Figure 7. Exploration/exploitation ratio for BSFSBOA runs.

Figure 8. Box plots of algorithms for solving low-dimensional UCL FS problems.

Figure 9. Average ranking in solving low-dimensional UCL FS problems.

Figure 10. Box plots of algorithms for solving medium-dimensional UCL FS problems.

Figure 11. Average ranking in solving medium-dimensional UCL FS problems.

Figure 12. Box plots of algorithms for solving high-dimensional UCL FS problems.

Figure 13. Average ranking in solving high-dimensional UCL FS problems.

Figure 14. Average ranking in solving 23 UCL FS problems.

Figure 15. Convergence curve of algorithms for solving low-dimensional UCL FS problems.

Figure 16. Convergence curve of algorithms for solving medium-dimensional UCL FS problems.

Figure 17. Convergence curve of algorithms for solving high-dimensional UCL FS problems.

Figure 18. Stacked plot of algorithms on classification accuracy and FS subset size on UCL FS problems.

Figure 19. Average ranking in solving OpenML FS problems.

Figure 20. Stacked plot of algorithms on classification accuracy and FS subset size on OpenML FS problems.

Table 1. Detailed information on the 23 UCL datasets.

Categories	Datasets	Number of Features	Number of Categories	Dataset Size
Low	Aggregation	2	7	788
	Banana	2	2	5300
	Iris	4	3	150
	Bupa	6	2	345
	Glass	9	7	214
	Breastcancer	9	2	699
	Exactly	13	2	1000
	Wine	13	3	178
Medium	Zoo	16	7	101
	Vote	16	2	435
	Congress	16	2	435
	Lymphography	18	4	148
	Vehicle	18	4	846
	Ionosphere	34	2	351
	Landsat	36	6	2000
	SonarEW	60	2	208
High	Libras	90	15	360
	Hillvalley	100	2	606
	Musk	166	2	476
	Clean	167	2	476
	Semeion	256	10	1593
	Arrhythmia	279	16	452
	Isolet	617	26	1559

Table 2. Specific parameter configuration information for all algorithms.

Algorithms	Year	Parameter Settings
Particle Swarm Optimization (PSO) [39]	1995	$w = 1, w_{p} = 0.99, c_{1} = 1.5, c_{2} = 2.0$
Differential Evolution (DE) [40]	1997	$F = 0.5, C R = 0.9$
Equilibrium Optimizer (EO) [41]	2020	$V = 1, a_{1} = 2, a_{2} = 1, G P = 0.5$
Whale Optimization Algorithm (WOA) [42]	2016	$b = 1, a_{1} = 2 - (2 \cdot F E s / M a x F E s)$
Secretary Bird Optimization Algorithm (SBOA)	2024	$K = r o u n d (1 + r a n d (1, 1))$
Nutcracker Optimization Algorithm (NOA) [43]	2023	$α = {\begin{cases} {(1 - \frac{t}{T_{m a x}})}^{2 \frac{t}{T_{m a x}}} & i f r_{1} > r_{2} \\ {(\frac{t}{T_{m a x}})}^{\frac{2}{t}} & e l s e \end{cases}$
Electric Eel Foraging Optimization (EEFO) [44]	2024	$α_{0} = 2 \cdot (e - e^{\frac{t}{T}}), β_{0} = 2 \cdot (e - e^{\frac{t}{T}})$
Hippopotamus Optimization (HO) [45]	2024	$T = e x p (- \frac{t}{T})$
Human Evolutionary Optimization Algorithm (HEOA) [46]	2024	$ω = 0.2 \cos (\frac{π}{2} (1 - \frac{t}{M a x_{i t e r}}))$
LSHADE [47]	2014	$N_{P_{i n i t}} = 18 \cdot D, N_{P_{m i n}} = 4, H = 6$
IMODE [48]	2022	$N P_{m a x} = 18 \cdot N P, N P_{m i n} = 4 \cdot N P$
BSFSBOA	NA	$K = r o u n d (1 + r a n d (1, 1))$

Table 3. The value of the fitness function of the algorithm in solving the low dimension UCL FS problems.

Datasets	Metrics	PSO	DE	EO	WOA	SBOA	NOA	EEFO	HO	BSFSBOA
Aggregation	Best	0.100	0.106	0.100	0.100	0.100	0.100	0.100	0.111	0.100
	Mean	0.100	0.106	0.100	0.100	0.100	0.100	0.100	0.111	0.100
	Worse	0.100	0.106	0.100	0.100	0.100	0.100	0.100	0.111	0.100
	Rank	1/1/1	8/8/8	1/1/1	1/1/1	1/1/1	1/1/1	1/1/1	9/9/9	1/1/1
Banana	Best	0.201	0.190	0.198	0.198	0.199	0.206	0.196	0.188	0.187
	Mean	0.201	0.190	0.198	0.198	0.199	0.206	0.196	0.188	0.187
	Worse	0.201	0.190	0.198	0.198	0.199	0.206	0.196	0.188	0.187
	Rank	8/8/8	3/3/3	6/6/6	5/5/5	7/7/7	9/9/9	4/4/4	2/2/2	1/1/1
Iris	Best	0.050	0.085	0.085	0.025	0.025	0.025	0.080	0.050	0.025
	Mean	0.051	0.089	0.085	0.029	0.025	0.025	0.080	0.050	0.025
	Worse	0.080	0.130	0.085	0.080	0.025	0.025	0.080	0.050	0.025
	Rank	5/6/5	8/9/9	8/8/8	1/4/5	1/1/1	1/1/1	7/7/5	5/5/4	1/1/1
Bupa	Best	0.314	0.320	0.337	0.259	0.343	0.288	0.341	0.320	0.262
	Mean	0.316	0.326	0.342	0.274	0.346	0.299	0.341	0.323	0.264
	Worse	0.344	0.341	0.399	0.350	0.393	0.337	0.341	0.354	0.272
	Rank	4/4/5	5/6/3	7/8/9	1/2/6	9/9/8	3/3/2	8/7/3	5/5/7	2/1/1
Glass	Best	0.312	0.270	0.279	0.334	0.280	0.301	0.227	0.237	0.216
	Mean	0.323	0.287	0.279	0.357	0.291	0.302	0.229	0.244	0.219
	Worse	0.377	0.333	0.280	0.409	0.398	0.313	0.248	0.259	0.259
	Rank	8/8/7	4/5/6	5/4/4	9/9/9	6/6/8	7/7/5	2/2/1	3/3/2	1/1/2
Breastcancer	Best	0.068	0.048	0.068	0.061	0.059	0.059	0.042	0.053	0.040
	Mean	0.071	0.056	0.069	0.068	0.062	0.060	0.045	0.059	0.043
	Worse	0.079	0.066	0.072	0.080	0.074	0.061	0.048	0.066	0.048
	Rank	8/9/8	3/3/4	8/8/6	7/7/9	5/6/7	5/5/3	2/2/1	4/4/4	1/1/1
Exactly	Best	0.046	0.046	0.060	0.046	0.046	0.046	0.046	0.046	0.046
	Mean	0.172	0.099	0.167	0.253	0.159	0.252	0.151	0.098	0.054
	Worse	0.299	0.302	0.291	0.287	0.287	0.291	0.291	0.287	0.287
	Rank	1/7/8	1/3/9	9/6/5	1/9/1	1/5/1	1/8/5	1/4/5	1/2/4	1/1/1
Wine	Best	0.031	0.041	0.038	0.049	0.049	0.031	0.049	0.023	0.023
	Mean	0.047	0.064	0.045	0.089	0.058	0.032	0.052	0.028	0.024
	Worse	0.134	0.098	0.049	0.203	0.082	0.046	0.075	0.054	0.031
	Rank	3/5/8	6/8/7	5/4/3	7/9/9	7/7/6	3/3/2	7/6/5	1/2/4	1/1/1
Mean Rank	Best	4.750	4.750	6.125	4.000	4.625	3.750	4.000	3.750	1.125
	Mean	6.000	5.625	5.625	5.750	5.250	4.625	4.125	4.000	1.000
	Worse	6.250	6.125	5.250	5.625	4.875	3.500	3.125	4.500	1.125
Final Rank	Best	7	7	9	4	6	2	4	2	1
	Mean	9	6	6	8	5	4	3	2	1
	Worse	9	8	6	7	5	3	2	4	1

Table 4. The value of the fitness function of the algorithm in solving the medium dimension UCL FS problems.

Datasets	Metrics	PSO	DE	EO	WOA	SBOA	NOA	EEFO	HO	BSFSBOA
Zoo	Best	0.050	0.038	0.076	0.031	0.044	0.076	0.031	0.038	0.031
	Mean	0.075	0.062	0.083	0.054	0.115	0.079	0.040	0.079	0.033
	Worse	0.128	0.128	0.115	0.095	0.179	0.089	0.076	0.108	0.038
	Rank	7/5/7	4/4/7	8/8/6	1/3/4	6/9/9	8/7/3	1/2/2	4/6/5	1/1/1
Vote	Best	0.035	0.033	0.039	0.048	0.031	0.035	0.035	0.046	0.023
	Mean	0.044	0.044	0.044	0.049	0.041	0.038	0.037	0.048	0.023
	Worse	0.071	0.054	0.048	0.070	0.052	0.046	0.037	0.054	0.023
	Rank	4/7/9	3/6/7	7/5/4	9/9/8	2/4/5	4/3/3	4/2/2	8/8/6	1/1/1
Congress	Best	0.039	0.050	0.048	0.048	0.027	0.060	0.039	0.035	0.017
	Mean	0.051	0.063	0.052	0.051	0.028	0.060	0.046	0.046	0.017
	Worse	0.073	0.081	0.058	0.073	0.035	0.066	0.048	0.058	0.017
	Rank	4/5/7	8/9/9	7/7/4	6/6/7	2/2/2	9/8/6	4/3/3	3/4/5	1/1/1
Lymphography	Best	0.059	0.070	0.075	0.079	0.084	0.053	0.084	0.081	0.042
	Mean	0.081	0.107	0.090	0.095	0.110	0.076	0.092	0.127	0.046
	Worse	0.138	0.149	0.126	0.132	0.146	0.126	0.132	0.143	0.053
	Rank	3/3/6	4/7/9	5/4/2	6/6/4	8/8/8	2/2/2	8/5/4	7/9/7	1/1/1
Vehicle	Best	0.241	0.242	0.236	0.284	0.263	0.209	0.247	0.241	0.225
	Mean	0.271	0.267	0.252	0.315	0.283	0.250	0.264	0.273	0.233
	Worse	0.311	0.322	0.262	0.364	0.311	0.273	0.289	0.300	0.247
	Rank	4/6/7	6/5/8	3/3/2	9/9/9	8/8/6	1/2/3	7/4/4	4/7/5	2/1/1
Ionosphere	Best	0.062	0.072	0.035	0.015	0.037	0.037	0.028	0.022	0.022
	Mean	0.092	0.121	0.054	0.055	0.055	0.053	0.061	0.051	0.039
	Worse	0.134	0.150	0.076	0.083	0.078	0.069	0.088	0.073	0.063
	Rank	8/8/8	9/9/9	5/4/4	1/5/6	6/6/5	6/3/2	4/7/7	2/2/3	2/1/1
landsat	Best	0.117	0.112	0.096	0.105	0.095	0.099	0.109	0.094	0.087
	Mean	0.126	0.122	0.110	0.121	0.111	0.108	0.117	0.104	0.098
	Worse	0.132	0.137	0.120	0.133	0.120	0.116	0.127	0.119	0.105
	Rank	9/9/7	8/8/9	4/4/4	6/7/8	3/5/4	5/3/2	7/6/6	2/2/3	1/1/1
SonarEW	Best	0.049	0.064	0.040	0.067	0.015	0.012	0.020	0.071	0.008
	Mean	0.084	0.126	0.072	0.129	0.051	0.041	0.059	0.101	0.018
	Worse	0.124	0.160	0.106	0.187	0.083	0.084	0.089	0.131	0.027
	Rank	6/6/6	7/8/8	5/5/5	8/9/9	3/3/2	2/2/3	4/4/4	9/7/7	1/1/1
Mean Rank	Best	5.625	6.125	5.500	5.750	4.750	4.625	4.875	4.875	1.250
	Mean	6.125	7.000	5.000	6.750	5.625	3.750	4.125	5.625	1.000
	Worse	7.125	8.250	3.875	6.875	5.125	3.000	4.000	5.125	1.000
Final Rank	Best	7	9	6	8	3	2	4	4	1
	Mean	7	9	4	8	5	2	3	5	1
	Worse	8	9	3	7	5	2	4	5	1

Table 5. The value of the fitness function of the algorithm in solving the high-dimension UCL FS problems.

Datasets	Metrics	PSO	DE	EO	WOA	SBOA	NOA	EEFO	HO	BSFSBOA
Libras	Best	0.123	0.202	0.166	0.188	0.148	0.140	0.149	0.145	0.088
	Mean	0.143	0.230	0.196	0.243	0.190	0.162	0.179	0.170	0.126
	Worse	0.169	0.258	0.222	0.286	0.235	0.181	0.215	0.205	0.151
	Rank	2/2/2	9/8/8	7/7/6	8/9/9	5/6/7	3/3/3	6/5/5	4/4/4	1/1/1
Hillvalley	Best	0.340	0.352	0.345	0.316	0.308	0.303	0.296	0.268	0.250
	Mean	0.379	0.376	0.370	0.345	0.334	0.327	0.327	0.307	0.280
	Worse	0.409	0.398	0.397	0.381	0.405	0.344	0.360	0.344	0.310
	Rank	7/9/9	9/8/7	8/7/6	6/6/5	5/5/8	4/3/3	3/4/4	2/2/2	1/1/1
Musk	Best	0.071	0.069	0.023	0.066	0.055	0.059	0.064	0.051	0.026
	Mean	0.107	0.084	0.052	0.105	0.076	0.082	0.087	0.090	0.049
	Worse	0.140	0.097	0.085	0.165	0.095	0.104	0.111	0.123	0.081
	Rank	9/9/8	8/5/4	1/2/2	7/8/9	4/3/3	5/4/5	6/6/6	3/7/7	2/1/1
Clean	Best	0.070	0.060	0.036	0.069	0.043	0.069	0.029	0.047	0.024
	Mean	0.096	0.076	0.061	0.105	0.074	0.108	0.048	0.065	0.035
	Worse	0.135	0.094	0.079	0.139	0.092	0.138	0.078	0.084	0.059
	Rank	9/7/7	6/6/6	3/3/3	7/8/9	4/5/5	8/9/8	2/2/2	5/4/4	1/1/1
Semeion	Best	0.094	0.126	0.098	0.111	0.089	0.077	0.085	0.092	0.072
	Mean	0.113	0.139	0.111	0.131	0.098	0.095	0.102	0.109	0.089
	Worse	0.129	0.147	0.120	0.151	0.110	0.107	0.117	0.122	0.104
	Rank	6/7/7	9/9/8	7/6/5	8/8/9	4/3/3	2/2/2	3/4/4	5/5/6	1/1/1
Arrhythmia	Best	0.287	0.329	0.265	0.254	0.268	0.249	0.268	0.274	0.234
	Mean	0.322	0.349	0.291	0.300	0.298	0.291	0.302	0.301	0.269
	Worse	0.350	0.369	0.322	0.336	0.325	0.321	0.324	0.327	0.296
	Rank	8/8/8	9/9/9	4/2/3	3/5/7	5/4/5	2/3/2	6/7/4	7/6/6	1/1/1
Isolet	Best	0.126	0.176	0.133	0.148	0.132	0.115	0.100	0.151	0.066
	Mean	0.154	0.192	0.154	0.179	0.148	0.136	0.124	0.173	0.081
	Worse	0.175	0.204	0.173	0.216	0.172	0.156	0.156	0.191	0.097
	Rank	4/5/6	9/9/8	6/6/5	7/8/9	5/4/4	3/3/2	2/2/3	8/7/7	1/1/1
Mean Rank	Best	6.429	8.429	5.143	6.571	4.571	3.857	4.000	4.857	1.143
	Mean	6.714	7.714	4.714	7.429	4.286	3.857	4.286	5.000	1.000
	Worse	6.714	7.143	4.286	8.143	5.000	3.571	4.000	5.143	1.000
Final Rank	Best	7	9	6	8	4	2	3	5	1
	Mean	7	9	5	8	3	2	3	6	1
	Worse	7	8	4	9	5	2	3	6	1

Table 6. Friedman nonparametric test value of the algorithm in solving the UCL FS problems.

Categories	Datasets	PSO	DE	EO	WOA	SBOA	NOA	EEFO	HO	BSFSBOA
Low	Aggregation	4.30	7.60	4.18	4.07	4.15	3.98	4.12	8.80	3.80
	Banana	8.00	3.00	6.00	5.00	7.00	9.00	4.00	2.00	1.00
	Iris	5.47	8.58	8.42	2.83	2.50	2.45	6.95	5.42	2.38
	Bupa	4.00	5.72	6.88	1.93	8.80	3.52	7.70	5.05	1.40
	Glass	7.90	5.45	4.38	8.85	5.65	6.77	1.98	2.88	1.13
	Breastcancer	8.25	3.87	7.95	7.35	5.25	4.90	1.88	4.33	1.22
	Exactly	5.23	4.02	6.27	6.62	4.63	7.45	4.32	4.22	2.25
	Wine	5.07	7.43	4.87	8.13	6.97	2.95	6.05	2.05	1.48
Medium	Zoo	5.75	4.50	6.87	3.67	8.25	6.23	2.30	6.20	1.23
	Vote	5.87	6.05	5.97	7.58	4.38	3.23	3.60	7.32	1.00
	Congress	5.35	8.42	6.43	5.42	2.00	8.08	4.07	4.23	1.00
	Lymphography	3.65	6.73	4.73	5.30	7.02	3.52	4.67	8.35	1.03
	Vehicle	5.83	5.08	3.13	8.77	7.10	3.42	4.65	5.88	1.13
	Ionosphere	7.95	8.83	4.12	4.38	4.43	4.13	5.30	3.85	2.00
	landsat	8.43	7.25	3.90	7.12	4.35	3.65	6.45	2.60	1.25
	SonarEW	5.77	8.10	5.07	8.12	3.38	2.65	3.90	6.80	1.22
High	Libras	2.10	8.23	6.18	8.53	5.98	3.75	4.93	4.15	1.13
	Hillvalley	8.13	8.03	7.30	5.60	4.35	4.00	3.93	2.52	1.13
	Musk	7.93	5.07	1.97	7.50	4.08	4.87	5.65	6.43	1.50
	Clean	7.13	5.50	3.50	8.00	5.03	8.30	2.17	4.10	1.27
	Semeion	6.07	8.77	5.60	8.03	3.03	2.37	3.83	5.53	1.77
	Arrhythmia	7.10	8.90	3.55	5.10	4.68	3.70	5.13	5.10	1.73
	Isolet	5.03	8.80	5.27	7.60	4.43	3.37	2.40	7.10	1.00
	Mean Rank	6.10	6.69	5.33	6.33	5.11	4.62	4.35	5.00	1.48
	Final Rank	7	9	6	8	5	3	2	4	1

Table 7. Classification accuracy of the algorithm in solving the UCL FS problems.

Categories	Datasets	PSO	DE	EO	WOA	SBOA	NOA	EEFO	HO	BSFSBOA
Low	Aggregation	100.00	99.36	100.00	100.00	100.00	100.00	100.00	98.73	100.00
		1	8	1	1	1	1	1	9	1
	Banana	88.77	90.00	89.06	89.15	88.96	88.21	89.34	90.19	90.38
		8	3	6	5	7	9	4	2	1
	Iris	99.89	93.89	93.33	99.78	100.00	100.00	96.67	100.0	100.00
		5	8	9	6	1	1	7	1	1
	Bupa	72.32	68.74	67.49	74.93	63.86	73.00	69.57	67.97	75.94
		4	6	8	2	9	3	5	7	1
	Glass	67.94	73.33	71.67	65.48	72.30	67.78	80.48	77.46	80.79
		7	4	6	9	5	8	2	3	1
	Breastcancer	95.08	96.74	95.25	95.54	96.52	95.83	97.58	97.15	98.80
		9	4	8	7	5	6	2	3	1
	Exactly	84.78	94.60	85.23	74.90	85.98	74.67	87.23	91.35	95.35
		7	2	6	8	5	9	4	3	1
	Wine	98.29	96.86	98.38	93.33	96.57	97.81	96.95	99.90	100.00
		4	7	3	9	8	5	6	2	1
Medium	Zoo	95.17	98.33	94.67	99.17	91.67	94.50	99.83	96.67	100.00
		6	4	7	3	9	8	2	5	1
	Vote	97.16	98.24	96.55	95.29	97.55	97.05	96.63	95.63	98.85
		4	2	7	9	3	5	6	8	1
	Congress	96.55	95.94	96.48	95.33	97.74	94.75	95.98	96.05	98.85
		3	7	4	8	2	9	6	5	1
	Lymphography	94.94	92.76	94.14	91.49	91.15	94.71	92.99	88.28	96.32
		2	6	4	7	8	3	5	9	1
	Vehicle	74.16	75.96	75.76	68.48	72.43	75.56	74.44	73.71	75.74
		6	1	2	9	8	4	5	7	3
	Ionosphere	93.00	90.90	95.52	95.62	95.00	94.71	95.19	95.57	97.14
		8	9	4	2	6	7	5	3	1
	landsat	89.67	90.83	90.41	89.68	90.39	90.29	90.08	91.27	91.73
		9	3	4	8	5	6	7	2	1
	SonarEW	94.63	91.87	94.23	88.70	96.18	96.99	96.18	91.54	100.00
		5	7	6	9	4	2	3	8	1
High	Libras	88.33	80.69	80.19	74.44	80.93	83.84	83.47	82.92	88.52
		2	7	8	9	6	3	4	5	1
	Hillvalley	62.48	64.27	59.72	62.48	63.28	64.74	66.12	66.89	70.61
		7	5	9	7	6	4	3	2	1
	Musk	92.91	97.54	96.46	90.42	95.37	93.72	93.89	91.93	96.88
		7	1	3	9	4	6	5	8	2
	Clean	94.32	98.21	95.26	91.75	95.26	89.65	98.21	94.91	98.63
		7	2	5	8	4	9	2	6	1
	Semeion	92.66	91.93	92.78	90.42	93.98	93.08	93.95	92.78	94.08
		7	8	5	9	2	4	3	5	1
	Arrhythmia	69.11	68.04	69.04	69.44	69.22	68.37	70.11	67.85	71.30
		5	8	6	3	4	7	2	9	1
	Isolet	88.24	85.74	85.41	83.07	86.62	87.43	90.03	83.28	93.29
		3	6	7	9	5	4	2	8	1
	Mean Rank	5.48	5.13	5.57	6.78	5.09	5.35	3.96	5.22	1.13
	Final Rank	7	4	8	9	3	6	2	5	1

Table 8. Feature subset size of the algorithm in solving the UCL FS problems.

Categories	Datasets	PSO	DE	EO	WOA	SBOA	NOA	EEFO	HO	BSFSBOA
Low	Aggregation	2.00	2.00	2.00	2.00	2.00	2.00	2.00	2.00	2.00
		1	1	1	1	1	1	1	1	1
	Banana	2.00	2.00	2.00	2.00	2.00	2.00	2.00	2.00	2.00
		1	1	1	1	1	1	1	1	1
	Iris	2.00	1.37	1.00	1.07	1.00	1.17	2.00	2.00	1.00
		7	6	1	4	1	5	7	7	1
	Bupa	4.03	2.70	2.97	2.90	1.23	3.33	4.00	2.07	3.90
		9	3	5	4	1	6	8	2	7
	Glass	3.13	4.23	2.20	4.17	3.77	3.33	4.80	3.67	4.17
		2	8	1	6	5	3	9	4	6
	Breastcancer	2.43	2.43	2.40	2.50	2.77	2.57	2.10	3.00	2.87
		3	3	2	5	7	6	1	9	8
	Exactly	4.53	6.57	4.43	3.47	4.30	3.17	4.67	7.27	6.73
		5	7	4	2	3	1	6	9	8
	Wine	4.13	4.60	3.93	3.77	3.57	4.53	3.17	3.53	3.17
		7	9	6	5	4	8	1	3	1
Medium	Zoo	5.10	7.53	5.63	7.37	6.43	7.53	6.13	7.77	5.33
		1	7	3	6	5	7	4	9	2
	Vote	3.03	4.43	2.00	1.13	2.97	3.90	1.10	1.33	2.00
		7	9	4	2	6	8	1	3	4
	Congress	3.20	4.17	3.23	1.47	1.17	3.10	1.53	1.70	1.00
		7	9	8	3	2	6	4	5	1
	Lymphography	6.30	7.60	6.67	3.23	5.53	5.07	5.17	3.90	4.63
		7	9	8	1	6	4	5	2	3
	Vehicle	6.97	9.13	6.03	5.60	6.30	5.40	6.10	6.57	5.77
		8	9	4	2	6	1	5	7	3
	Ionosphere	9.83	13.43	4.60	5.13	3.33	5.00	6.00	3.73	4.43
		8	9	4	6	1	5	7	2	3
	landsat	11.70	14.27	8.60	10.23	8.97	10.17	10.03	9.13	8.57
		8	9	2	7	3	6	5	4	1
	SonarEW	21.43	31.73	12.27	16.57	9.83	16.23	14.60	14.90	10.53
		8	9	3	7	1	6	4	5	2
High	Libras	35.70	50.20	16.13	11.67	16.33	27.73	27.60	15.03	18.90
		8	9	3	1	4	7	6	2	5
	Hillvalley	41.20	54.23	7.37	7.70	3.40	25.40	22.33	8.63	15.40
		8	9	2	3	1	7	6	4	5
	Musk	70.97	103.50	33.70	31.53	57.37	63.17	53.97	29.63	34.53
		8	9	3	2	6	7	5	1	4
	Clean	74.37	100.80	31.23	51.80	52.37	58.07	60.10	32.50	32.30
		8	9	1	4	5	6	7	3	2
	Semeion	121.03	168.90	116.67	114.93	112.77	108.83	121.47	113.63	91.20
		7	9	6	5	3	2	8	4	1
	Arrhythmia	121.90	171.00	34.33	68.60	58.13	64.97	91.20	32.93	29.47
		8	9	3	6	4	5	7	2	1
	Isolet	294.87	392.77	138.10	163.63	172.63	197.47	213.40	138.17	126.93
		8	9	2	4	5	6	7	3	1
	Mean Rank	6.26	7.43	3.35	3.78	3.52	4.96	5.00	4.00	3.09
	Final Rank	8	9	2	4	3	6	7	5	1

Table 9. Running time of the algorithm in solving the UCL FS problems.

Categories	Datasets	PSO	DE	EO	WOA	SBOA	NOA	EEFO	HO	BSFSBOA
Low	Aggregation	7.51	4.94	3.42	4.50	4.05	5.39	3.72	11.22	3.46
		8	6	1	5	4	7	3	9	2
	Banana	5.89	12.38	9.19	11.74	12.2	18.88	10.94	30.71	23.06
		1	6	2	4	5	7	3	9	8
	Iris	8.11	4.24	4.26	4.02	2.98	4.73	3.21	10.23	3.32
		8	5	6	4	1	7	2	9	3
	Bupa	8.23	3.93	4.65	4.46	3.10	5.03	4.38	11.73	3.38
		8	3	6	5	1	7	4	9	2
	Glass	7.99	4.18	4.57	4.03	3.80	4.94	4.16	10.77	3.30
		8	5	6	3	2	7	4	9	1
	Breastcancer	9.16	4.41	4.63	4.33	3.97	5.36	3.85	12.02	3.56
		8	5	6	4	3	7	2	9	1
	Exactly	9.31	4.91	5.91	4.43	3.83	5.90	4.22	13.33	3.88
		8	5	7	4	1	6	3	9	2
	Wine	7.99	4.32	4.78	4.69	3.60	4.99	3.52	11.12	3.36
		8	4	6	5	3	7	2	9	1
Medium	Zoo	7.77	4.12	4.43	4.30	3.76	4.91	3.40	11.17	3.30
		8	4	6	5	3	7	2	9	1
	Vote	7.85	4.55	4.44	4.12	3.55	5.10	3.68	9.90	3.32
		8	6	5	4	2	7	3	9	1
	Congress	8.32	4.26	4.80	4.16	3.23	6.05	3.75	9.93	3.34
		8	5	6	4	1	7	3	9	2
	Lymphography	8.56	4.34	4.39	4.00	4.07	5.94	4.02	9.59	3.36
		8	5	6	2	4	7	3	9	1
	Vehicle	11.28	5.17	6.77	5.41	4.41	5.92	6.19	12.61	3.53
		8	3	7	4	2	5	6	9	1
	Ionosphere	8.41	4.25	4.53	3.89	3.48	5.06	4.72	10.05	3.37
		8	4	5	3	2	7	6	9	1
	landsat	15.02	6.18	7.74	5.36	5.46	7.62	7.05	16.49	4.14
		8	4	7	2	3	6	5	9	1
	SonarEW	8.59	3.71	4.51	3.3	3.56	4.54	4.31	10.24	3.09
		8	4	6	2	3	7	5	9	1
High	Libras	11.98	4.37	4.56	3.36	3.73	4.85	6.47	10.28	3.15
		9	4	5	2	3	6	7	8	1
	Hillvalley	8.97	5.57	5.30	3.61	3.42	5.41	5.37	11.81	3.57
		8	7	4	3	1	6	5	9	2
	Musk	8.49	6.02	5.13	3.65	4.28	6.05	4.70	14.48	3.59
		8	6	5	2	3	7	4	9	1
	Clean	8.51	5.93	5.81	3.88	4.13	5.95	4.74	14.63	3.60
		8	6	5	2	3	7	4	9	1
	Semeion	24.19	16.08	12.98	13.68	13.39	19.85	13.63	38.71	8.44
		8	6	2	5	3	7	4	9	1
	Arrhythmia	8.56	6.78	4.98	4.30	4.97	6.53	6.11	13.50	4.05
		8	7	4	2	3	6	5	9	1
	Isolet	30.65	26.41	15.27	16.97	20.85	28.96	19.29	57.06	16.12
		8	6	1	3	5	7	4	9	2
	Mean Rank	7.74	5.04	4.96	3.43	2.65	6.70	3.87	8.96	1.65
	Final Rank	8	6	5	3	2	7	4	9	1

Table 10. Detailed information on the 13 OpenML datasets.

Datasets	Number of Features	Number of Categories	Dataset Size
Titanic	3	2	2201
Balance-scale	4	3	625
Diabetes	8	2	768
Breast-w	9	2	699
HeartEW	13	2	270
KC1	21	2	2109
PC1	21	2	1109
Parkinsons	22	2	195
BreastEW	30	2	569
PC2	36	2	5589
PC4	37	2	1458
Piechart3	37	2	1077
Pizzacutter3	37	2	1043

Table 11. The value of the fitness function of the algorithm in solving the OpenML FS problems.

Datasets	Metrics	DE	EO	WOA	SBOA	HEOA	LSHADE	IMODE	BSFSBOA
Titanic	Best	0.269	0.256	0.232	0.228	0.246	0.254	0.234	0.224
	Mean	0.294	0.256	0.232	0.228	0.246	0.254	0.234	0.224
	Worse	0.455	0.256	0.232	0.228	0.246	0.254	0.234	0.224
	Rank	8/8/8	7/7/7	3/3/3	2/2/2	5/5/5	6/6/6	4/4/4	1/1/1
Balance-Scale	Best	0.222	0.237	0.237	0.222	0.215	0.215	0.280	0.208
	Mean	0.222	0.237	0.237	0.222	0.215	0.215	0.280	0.208
	Worse	0.222	0.237	0.237	0.222	0.215	0.215	0.280	0.208
	Rank	4/4/4	6/6/6	6/6/6	4/4/4	2/2/2	2/2/2	8/8/8	1/1/1
Diabetes	Best	0.077	0.077	0.077	0.077	0.077	0.077	0.077	0.072
	Mean	0.078	0.077	0.077	0.077	0.077	0.077	0.077	0.074
	Worse	0.090	0.077	0.077	0.077	0.077	0.077	0.077	0.077
	Rank	2/8/8	2/2/1	2/2/1	2/2/1	2/2/1	2/2/1	2/2/1	1/1/1
Breast-w	Best	0.068	0.046	0.042	0.066	0.048	0.066	0.059	0.040
	Mean	0.069	0.054	0.056	0.067	0.048	0.067	0.061	0.045
	Worse	0.079	0.061	0.077	0.072	0.057	0.072	0.066	0.053
	Rank	8/8/8	3/3/3	2/4/7	6/7/5	4/2/2	6/6/5	5/5/4	1/1/1
HeartEW	Best	0.172	0.121	0.129	0.140	0.138	0.147	0.181	0.090
	Mean	0.201	0.129	0.148	0.152	0.148	0.161	0.192	0.109
	Worse	0.263	0.173	0.197	0.190	0.258	0.196	0.223	0.172
	Rank	7/8/8	2/2/2	3/4/5	5/5/3	4/3/7	6/6/4	8/7/6	1/1/1
KC1	Best	0.163	0.155	0.164	0.166	0.136	0.180	0.167	0.135
	Mean	0.177	0.159	0.175	0.171	0.144	0.191	0.177	0.142
	Worse	0.188	0.162	0.192	0.177	0.155	0.208	0.188	0.151
	Rank	4/6/5	3/3/3	5/5/7	6/4/4	2/2/2	8/8/8	7/7/6	1/1/1
PC1	Best	0.058	0.051	0.063	0.063	0.067	0.058	0.046	0.050
	Mean	0.068	0.053	0.073	0.068	0.069	0.060	0.057	0.052
	Worse	0.076	0.054	0.083	0.071	0.070	0.062	0.066	0.054
	Rank	5/5/7	3/2/2	6/8/8	6/6/6	8/7/5	4/4/3	1/3/4	2/1/1
Parkinsons	Best	0.069	0.051	0.078	0.055	0.069	0.046	0.041	0.032
	Mean	0.092	0.057	0.100	0.055	0.115	0.079	0.083	0.034
	Worse	0.106	0.083	0.143	0.055	0.152	0.097	0.101	0.051
	Rank	6/6/6	4/3/3	8/7/7	5/2/2	6/8/8	3/4/4	2/5/5	1/1/1
BreastEW	Best	0.061	0.035	0.029	0.020	0.020	0.039	0.013	0.010
	Mean	0.073	0.044	0.053	0.035	0.029	0.051	0.029	0.021
	Worse	0.084	0.056	0.076	0.052	0.039	0.064	0.043	0.035
	Rank	8/8/8	6/5/5	5/7/7	3/4/4	3/3/2	7/6/6	2/2/3	1/1/1
PC2	Best	0.006	0.006	0.006	0.006	0.006	0.006	0.006	0.006
	Mean	0.019	0.006	0.006	0.006	0.006	0.006	0.006	0.006
	Worse	0.031	0.006	0.006	0.006	0.006	0.006	0.006	0.006
	Rank	1/8/8	1/1/1	1/1/1	1/1/1	1/1/1	1/1/1	1/1/1	1/1/1
PC4	Best	0.098	0.097	0.085	0.082	0.084	0.082	0.073	0.072
	Mean	0.120	0.107	0.108	0.091	0.097	0.094	0.092	0.089
	Worse	0.145	0.113	0.123	0.111	0.111	0.111	0.111	0.108
	Rank	8/8/8	7/6/6	6/7/7	3/2/3	5/5/3	3/4/3	2/3/2	1/1/1
Piechart3	Best	0.122	0.103	0.100	0.095	0.095	0.090	0.092	0.093
	Mean	0.137	0.110	0.114	0.107	0.106	0.105	0.104	0.104
	Worse	0.155	0.114	0.120	0.118	0.114	0.114	0.112	0.110
	Rank	8/8/8	7/6/4	6/7/7	4/5/6	4/4/3	1/3/4	2/2/2	3/1/1
Pizzacutter3	Best	0.108	0.086	0.086	0.095	0.084	0.095	0.092	0.086
	Mean	0.124	0.099	0.103	0.103	0.099	0.103	0.107	0.091
	Worse	0.136	0.108	0.124	0.111	0.107	0.110	0.114	0.108
	Rank	8/8/8	2/2/2	2/6/7	6/4/5	1/3/1	6/5/4	5/7/6	2/1/2
MeanRank	Best	5.92	4.08	4.23	4.08	3.62	4.23	3.77	1.31
	Mean	7.15	3.69	5.15	3.69	3.62	4.38	4.31	1.00
	Worse	7.23	3.46	5.62	3.54	3.23	3.92	4.00	1.08
FinalRank	Best	8	4	6	4	2	6	3	1
	Mean	8	3	7	3	2	6	5	1
	Worse	8	3	7	4	2	5	6	1

Table 12. Classification accuracy of the algorithm in solving the OpenML FS problems.

Datasets	DE	EO	WOA	SBOA	HEOA	LSHADE	IMODE	BSFSBOA
Titanic	75.68	74.74	75.23	77.95	76.36	75.45	77.73	78.86
	5	8	7	2	4	6	3	1
Balance-scale	87.20	86.40	84.80	84.80	87.20	87.20	80.00	88.00
	2	5	6	6	2	2	8	1
Diabetes	92.81	92.81	92.81	92.81	92.81	92.81	92.81	92.81
	1	1	1	1	1	1	1	1
Breast-w	95.11	98.01	98.18	97.27	97.17	96.31	96.71	98.20
	8	3	2	4	5	7	6	1
HeartEW	88.64	83.15	88.77	85.99	88.09	85.74	81.79	90.56
	3	7	2	5	4	6	8	1
KC1	82.69	83.34	84.54	82.48	85.53	81.40	82.47	87.02
	5	4	3	6	2	8	7	1
PC1	93.50	94.10	94.86	92.58	92.94	93.92	94.80	94.80
	6	4	1	8	7	5	2	2
Parkinsons	93.85	92.56	94.53	90.94	88.29	93.42	91.88	97.35
	3	5	2	7	8	4	6	1
BreastEW	98.76	96.55	97.32	97.02	99.00	97.11	98.88	99.76
	4	8	5	7	2	6	3	1
PC2	99.64	99.64	99.64	99.64	99.64	99.64	99.64	99.64
	1	8	1	1	1	1	1	1
PC4	88.00	88.91	90.52	88.49	90.58	90.26	90.80	91.36
	8	6	4	7	3	5	2	1
Piechart3	88.79	88.37	88.79	87.88	89.07	88.96	89.29	89.09
	5	7	5	8	3	4	1	2
Pizzacutter3	89.28	89.71	89.31	89.10	89.73	89.36	88.85	90.64
	6	3	5	7	2	4	8	1
Mean Rank	4.38	5.31	3.38	5.31	3.38	4.54	4.31	1.15
Final Rank	5	7	2	7	2	6	4	1

Table 13. Feature subset size of the algorithm in solving the OpenML FS problems.

Datasets	DE	EO	WOA	SBOA	HEOA	LSHADE	IMODE	BSFSBOA
Titanic	1.00	2.00	1.00	1.00	1.00	1.00	1.00	1.00
	1	8	1	1	1	1	1	1
Balance-scale	4.00	4.00	4.00	4.00	4.00	4.00	4.00	4.00
	1	1	1	1	1	1	1	1
Diabetes	1.00	1.03	1.0	1.00	1.57	1.00	1.00	1.00
	1	7	1	1	8	1	1	1
Breast-w	2.23	2.57	3.37	2.83	2.63	3.00	2.83	2.70
	1	2	8	5	3	7	5	4
HeartEW	4.47	6.37	3.57	2.87	2.80	4.20	3.67	3.07
	7	8	4	2	1	6	5	3
KC1	4.70	5.70	4.20	3.70	2.80	5.00	4.13	3.17
	6	8	5	3	1	7	4	2
PC1	2.93	3.07	1.43	1.20	1.80	1.03	2.13	1.03
	7	8	4	3	5	1	6	1
Parkinsons	4.23	5.60	1.73	4.00	2.17	4.37	2.20	2.00
	6	8	1	5	3	7	4	2
BreastEW	10.70	12.43	5.97	7.90	6.03	7.57	5.63	6.10
	7	8	2	6	3	5	1	4
PC2	3.53	5.63	1.00	1.00	1.00	1.00	1.00	1.00
	7	8	1	1	1	1	1	1
PC4	8.93	2.67	4.00	1.60	3.23	2.43	3.33	13.00
	7	3	6	1	4	2	5	8
Piechart3	7.30	11.83	3.20	1.80	2.17	2.20	2.73	1.97
	7	8	6	1	3	4	5	2
Pizzacutter3	10.07	2.50	2.53	1.97	2.77	2.67	2.53	10.40
	7	2	3	1	6	5	3	8
Mean Rank	5.00	6.08	3.31	2.38	3.08	3.69	3.23	2.92
Final Rank	7	8	5	1	3	6	4	2

Table 14. Running time of the algorithm in solving the OpenML FS problems.

Datasets	DE	EO	WOA	SBOA	HEOA	LSHADE	IMODE	BSFSBOA
Titanic	6.15	5.33	4.92	11.73	11.08	12.21	4.72	4.26
	5	4	3	7	6	8	2	1
Balance-scale	5.09	4.83	5.14	10.48	10.46	10.96	4.59	4.60
	4	3	5	7	6	8	1	2
Diabetes	4.68	4.72	4.51	10.40	9.45	9.92	3.84	4.08
	4	5	3	8	6	7	1	2
Breast-w	4.77	5.02	5.06	10.66	10.24	10.44	4.58	4.38
	3	4	5	8	6	7	2	1
HeartEW	4.43	4.80	4.67	9.74	8.63	10.22	4.49	4.11
	2	5	4	7	6	8	3	1
KC1	4.78	5.33	5.24	12.12	9.25	13.19	5.06	4.81
	1	5	4	7	6	8	3	2
PC1	4.74	4.83	4.82	10.29	9.26	10.16	4.50	4.06
	3	5	4	8	6	7	2	1
Parkinsons	4.57	4.44	4.48	9.51	8.56	9.57	3.92	4.05
	5	3	4	7	6	8	1	2
BreastEW	4.01	4.19	4.29	10.21	8.83	9.65	4.45	3.70
	2	3	4	8	6	7	5	1
PC2	13.60	12.46	11.81	23.46	20.99	33.45	9.16	8.00
	5	4	3	7	6	8	2	1
PC4	4.30	4.68	4.39	10.50	8.66	14.12	4.89	4.82
	1	3	2	7	6	8	5	4
Piechart3	4.51	4.07	4.78	10.44	9.32	13.70	4.58	4.98
	2	1	4	7	6	8	3	5
Pizzacutter3	4.22	4.4	5.17	10.35	9.45	13.50	4.51	4.82
	1	2	5	7	6	8	3	4
Mean Rank	2.92	3.62	3.85	7.31	6.00	7.69	2.54	2.08
Final Rank	3	4	5	7	6	8	2	1

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chen, F.; Ye, S.; Wang, J.; Luo, J. Multi-Strategy Improved Binary Secretarial Bird Optimization Algorithm for Feature Selection. Mathematics 2025, 13, 668. https://doi.org/10.3390/math13040668

AMA Style

Chen F, Ye S, Wang J, Luo J. Multi-Strategy Improved Binary Secretarial Bird Optimization Algorithm for Feature Selection. Mathematics. 2025; 13(4):668. https://doi.org/10.3390/math13040668

Chicago/Turabian Style

Chen, Fuqiang, Shitong Ye, Jianfeng Wang, and Jia Luo. 2025. "Multi-Strategy Improved Binary Secretarial Bird Optimization Algorithm for Feature Selection" Mathematics 13, no. 4: 668. https://doi.org/10.3390/math13040668

APA Style

Chen, F., Ye, S., Wang, J., & Luo, J. (2025). Multi-Strategy Improved Binary Secretarial Bird Optimization Algorithm for Feature Selection. Mathematics, 13(4), 668. https://doi.org/10.3390/math13040668

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Multi-Strategy Improved Binary Secretarial Bird Optimization Algorithm for Feature Selection

Abstract

1. Introduction

2. Mathematical Framework of Secretary Bird Optimization Algorithm

2.1. Mathematical Modelling of Hunting Behavior

2.1.1. Seeking After One’s Prey

2.1.2. Depleting Prey’s Energy

2.1.3. Attacking Prey

2.2. Mathematical Modelling of Escape Behaviour

2.2.1. Escaping by Run or Fly Away

2.2.2. Hiding with the Environment

3. Mathematical Modeling of BSFSBOA

3.1. Best-Rand Exploration Strategy

3.2. Segmented Balance Strategy

3.3. Four-Role Exploitation Strategy

3.4. Implementation of the BSFSBOA

4. Discussion of Experimental Results on the UCL Datasets

4.1. The Modeling of the FS Problems

4.2. Sensitivity Analysis of Operating Parameters

4.3. Population Diversity Analysis

4.4. Exploration/Exploitation Balance Analysis

4.5. Fitness Function Value Analysis on the UCL Datasets

4.6. Friedman Nonparametric Test Analysis on the UCL Datasets

4.7. Convergence Analysis on the UCL Datasets

4.8. Classification Accuracy and Feature Subset Size Analysis on the UCL Datasets

4.9. Runtime Analysis on the UCL Datasets

5. Discussion of Expanded Experiments on the OpenML Datasets

5.1. Fitness Function Value Analysis on the OpenML Datasets

5.2. Classification Accuracy and Feature Subset Size Analysis on the OpenML Datasets

5.3. Runtime Analysis on the OpenML Datasets

6. Conclusions and Future Work

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI