Combined Gaussian Mixture Model and Pathfinder Algorithm for Data Clustering
Figures: Clustering results of the eight compared methods on the synthetic datasets Aggregation, Compound, Pathbased, Four Lines, Jain, and S2.
Abstract
1. Introduction
- To perform clustering with GMMs, the number of clusters must be configured manually. Choosing this value is difficult and can significantly affect the outcome of the clustering process;
- During initialization, GMMs may struggle to capture complex structures in the data and can become trapped in local optima, leading to suboptimal clustering performance.
- To address the tendency of Gaussian Mixture Models to become trapped in local optima during initialization, we introduce the Pathfinder algorithm (PFA), a metaheuristic with strong global search ability, into the clustering analysis. PFA searches for the optimal solution during initialization, thereby avoiding the local-convergence trap;
- To eliminate the manual selection of cluster numbers in GMMs, we employ the Davies–Bouldin Index as the fitness function of PFA, allowing the number of clusters to be selected automatically.
2. Related Work
3. Theoretical Background
3.1. Pathfinder Algorithm
Algorithm 1: Pseudo-code of the Pathfinder algorithm

1. Initialize the population
2. Calculate the fitness of initial population
3. Find the Pathfinder
4. While K < maximum number of iterations
5.     α and β = random number in [1, 2]
6.     Update the position of Pathfinder using Equation (1)
7.     If new Pathfinder is better than old
8.         Update Pathfinder
9.     End
10.    For i = 2 to maximum number of populations
11.        Update positions of members using Equation (2)
12.    End
13.    Calculate new fitness of members
14.    Find the best fitness
15.    If best fitness < fitness of Pathfinder
16.        Pathfinder = best member
17.        Fitness = best fitness
18.    End
19.    For i = 2 to maximum number of populations
20.        If new fitness of member(i) < fitness of member(i)
21.            Update members
22.        End
23.    End
24.    Generate new A and ε
25. End
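For concreteness, the sketch below implements the loop above on a toy objective. It is a minimal NumPy rendering, assuming commonly used simplified forms of the Pathfinder update (Equation (1)) and the follower update (Equation (2)); the fluctuation terms A and ε are modeled here as decaying random perturbations, which is an assumption, not the authors' code.

```python
import numpy as np

def pfa(objective, dim, bounds, pop_size=30, max_iter=100, seed=0):
    """Minimal Pathfinder algorithm: minimizes `objective` over a box."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, dim))      # the swarm members
    fit = np.apply_along_axis(objective, 1, pop)
    pf = pop[fit.argmin()].copy()                        # the Pathfinder
    pf_fit, pf_prev = fit.min(), pop[fit.argmin()].copy()
    for k in range(1, max_iter + 1):
        alpha, beta = rng.uniform(1, 2, size=2)          # step 5 of Algorithm 1
        # Equation (1), simplified: the Pathfinder moves along its own
        # trajectory plus a decaying random vibration A.
        A = rng.uniform(-1, 1, dim) * np.exp(-2 * k / max_iter)
        cand = np.clip(pf + 2 * rng.random() * (pf - pf_prev) + A, lo, hi)
        f_cand = objective(cand)
        if f_cand < pf_fit:
            pf_prev, pf, pf_fit = pf, cand, f_cand
        # Equation (2), simplified: members follow a random neighbour and
        # the Pathfinder, with a distance-scaled perturbation epsilon.
        for i in range(pop_size):
            j = rng.integers(pop_size)
            eps = ((1 - k / max_iter) * rng.uniform(-1, 1, dim)
                   * np.linalg.norm(pop[i] - pop[j]))
            new = np.clip(pop[i]
                          + alpha * rng.random() * (pop[j] - pop[i])
                          + beta * rng.random() * (pf - pop[i]) + eps, lo, hi)
            f_new = objective(new)
            if f_new < fit[i]:                           # greedy replacement
                pop[i], fit[i] = new, f_new
        if fit.min() < pf_fit:                           # promote best member
            pf_prev, pf, pf_fit = pf, pop[fit.argmin()].copy(), fit.min()
    return pf, pf_fit

# Example: minimize the 5-D sphere function.
best_x, best_f = pfa(lambda x: float(np.sum(x**2)), dim=5, bounds=(-10.0, 10.0))
```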
3.2. Gaussian Mixture Models and Expectation Maximization
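As a reference point for this section, the following is a compact sketch of EM for a full-covariance Gaussian mixture. It is an illustrative implementation of the standard E- and M-steps only, not the paper's code; in practice a library routine such as sklearn.mixture.GaussianMixture would be preferred.

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, k, n_iter=100, seed=0):
    """Plain EM for a k-component Gaussian mixture with full covariances."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    weights = np.full(k, 1.0 / k)
    means = X[rng.choice(n, size=k, replace=False)]       # random initial means
    covs = np.stack([np.cov(X.T) + 1e-6 * np.eye(d)] * k)
    for _ in range(n_iter):
        # E-step: responsibilities gamma[i, c] = P(component c | x_i).
        dens = np.column_stack([
            w * multivariate_normal.pdf(X, mean=m, cov=c)
            for w, m, c in zip(weights, means, covs)])
        gamma = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: re-estimate weights, means, and covariances from gamma.
        nk = gamma.sum(axis=0) + 1e-12
        weights = nk / n
        means = (gamma.T @ X) / nk[:, None]
        for c in range(k):
            diff = X - means[c]
            covs[c] = ((gamma[:, c, None] * diff).T @ diff) / nk[c] \
                      + 1e-6 * np.eye(d)                  # regularize
    return weights, means, covs, gamma.argmax(axis=1)     # hard assignment
```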
4. Pathfinder Algorithm Based on GMMs for Data Clustering
4.1. Internal Validation Criteria
- Compactness: measures the proximity of data within a cluster. Some methods use variance, with lower variance indicating better compactness; similarity measures, such as pairwise distances between intra-cluster points, have also been applied;
- Separation: measures how distinct clusters are from one another, typically via a distance metric such as the pairwise distance between data points in different clusters or the distance between cluster centroids (see the sketch after this list).
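The Davies–Bouldin Index combines both criteria: intra-cluster spread (compactness) in the numerator and centroid distance (separation) in the denominator, so lower values are better. A minimal sketch, equivalent in intent to sklearn.metrics.davies_bouldin_score:

```python
import numpy as np

def davies_bouldin(X, labels):
    """Davies-Bouldin index: lower means compact, well-separated clusters."""
    clusters = np.unique(labels)
    centroids = np.array([X[labels == c].mean(axis=0) for c in clusters])
    # S_c: average distance of cluster c's points to its centroid (compactness).
    s = np.array([np.linalg.norm(X[labels == c] - centroids[i], axis=1).mean()
                  for i, c in enumerate(clusters)])
    k = len(clusters)
    db = 0.0
    for i in range(k):
        # Worst-case ratio of combined spread to centroid separation.
        ratios = [(s[i] + s[j]) / np.linalg.norm(centroids[i] - centroids[j])
                  for j in range(k) if j != i]
        db += max(ratios)
    return db / k
```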
4.2. PFA-GMM
Algorithm 2: Pseudo-code of PFA-GMM

Input: the set of data points, maximum iterations, population size
Output: the optimal clustering result
1. Initialize the population
2. Select data points randomly and determine the number of clusters C
3. Generate the initial population and apply it through the GMM
4. Calculate the fitness function according to EM
5. Choose the member with the best fitness value as the Pathfinder
6. While K < maximum number of iterations
7.     For i = 1 to maximum number of populations
8.         Update positions of members using Equation (2)
9.         Calculate fitness values of members through EM
10.    End
11.    If best fitness < fitness of Pathfinder
12.        Pathfinder = best member
13.        Fitness = best fitness
14.    End
15.    α and β = random numbers in [1, 2]
16.    Generate new A and ε
17.    Update the position of the Pathfinder using Equation (1)
18.    If the new Pathfinder is better than the old one
19.        Update the Pathfinder
20.        Calculate fitness value
21.    End
22. End
23. Assign data points to the final cluster centroids
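To make the interplay of the two components concrete, the sketch below wires a simplified PFA around scikit-learn's GaussianMixture as the EM step, with the Davies–Bouldin index as fitness. The solution encoding (each member holds a candidate set of k cluster centers) and the perturbation scale are illustrative assumptions; the full method also selects the number of clusters automatically, whereas k is fixed here for brevity.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.metrics import davies_bouldin_score

def fitness(X, centers):
    """Run EM seeded with the candidate centers; score with the DB index."""
    gmm = GaussianMixture(n_components=len(centers), means_init=centers,
                          max_iter=50, reg_covar=1e-4).fit(X)
    labels = gmm.predict(X)
    if len(np.unique(labels)) < 2:        # degenerate partition, reject
        return np.inf, labels
    return davies_bouldin_score(X, labels), labels

def pfa_gmm(X, k, pop_size=15, max_iter=30, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    pop = rng.uniform(lo, hi, size=(pop_size, k, X.shape[1]))
    fit = np.array([fitness(X, p)[0] for p in pop])
    pf = pop[fit.argmin()].copy()         # Pathfinder = best candidate centers
    pf_fit = fit.min()
    for t in range(1, max_iter + 1):
        alpha, beta = rng.uniform(1, 2, size=2)
        for i in range(pop_size):         # follower update (cf. Equation (2))
            j = rng.integers(pop_size)
            eps = ((1 - t / max_iter) * rng.uniform(-1, 1, pop[i].shape)
                   * 0.1 * (hi - lo))     # assumed perturbation scale
            new = np.clip(pop[i]
                          + alpha * rng.random() * (pop[j] - pop[i])
                          + beta * rng.random() * (pf - pop[i]) + eps, lo, hi)
            f_new, _ = fitness(X, new)
            if f_new < fit[i]:
                pop[i], fit[i] = new, f_new
        if fit.min() < pf_fit:            # promote the best member
            pf, pf_fit = pop[fit.argmin()].copy(), fit.min()
    return fitness(X, pf)[1]              # final cluster assignment
```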
5. Experimental Results and Analysis
5.1. Clustering Evaluation Criteria
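The external criteria reported below (ARI, NMI, ACC) are standard. ARI and NMI are available directly in scikit-learn; clustering accuracy (ACC) needs an optimal matching between predicted and true labels, commonly solved with the Hungarian algorithm (cf. the assignment-problem reference of Carpaneto and Toth). A hedged sketch of that matching step:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

def clustering_accuracy(y_true, y_pred):
    """ACC: fraction correct after the best one-to-one relabeling (Hungarian)."""
    true_ids, pred_ids = np.unique(y_true), np.unique(y_pred)
    # Contingency matrix: matches[i, j] = #points with i-th pred and j-th true label.
    matches = np.array([[np.sum((y_pred == p) & (y_true == t))
                         for t in true_ids] for p in pred_ids])
    row, col = linear_sum_assignment(-matches)    # maximize total agreement
    return matches[row, col].sum() / len(y_true)

# ARI and NMI come directly from scikit-learn:
# adjusted_rand_score(y_true, y_pred), normalized_mutual_info_score(y_true, y_pred)
```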
5.2. Experiments on Synthetic Datasets
5.3. Experiments on Real-World Datasets
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Hosseinzadeh, M.; Hemmati, A.; Rahmani, A.M. Clustering for smart cities in the internet of things: A review. Clust. Comput. 2022, 25, 4097–4127.
- Jain, A.K. Data clustering: 50 years beyond k-means. Pattern Recognit. Lett. 2010, 31, 651–666.
- Saraswathi, S.; Sheela, M.I. A comparative study of various clustering algorithms in data mining. Int. J. Comput. Sci. Mob. Comput. 2014, 11, 422–428.
- Tao, X.; Guo, W.; Li, X.; He, Q.; Liu, R.; Zou, J. Fitness peak clustering based dynamic multi-swarm particle swarm optimization with enhanced learning strategy. Expert Syst. Appl. 2022, 191, 116301.
- Xie, Y.; Shekhar, S.; Li, Y. Statistically-robust clustering techniques for mapping spatial hotspots: A survey. ACM Comput. Surv. 2022, 55, 1–38.
- Yapici, H.; Cetinkaya, N. A new metaheuristic optimizer: Pathfinder algorithm. Appl. Soft Comput. 2019, 78, 545–568.
- Loperfido, N. Kurtosis-based projection pursuit for outlier detection in financial time series. Eur. J. Financ. 2020, 26, 142–164.
- Peña, D.; Prieto, F.J. Cluster identification using projections. J. Am. Statist. Assoc. 2001, 96, 1433–1445.
- Chen, X.; Yang, Y. Cutoff for exact recovery of Gaussian Mixture Models. IEEE Trans. Inf. Theory 2021, 67, 4223–4238.
- Qu, J.; Du, Q.; Li, Y.; Tian, L.; Xia, H. Anomaly detection in hyperspectral imagery based on Gaussian mixture model. IEEE Trans. Geosci. Remote Sens. 2020, 59, 9504–9517.
- Fu, Y.; Liu, X.; Sarkar, S.; Wu, T. Gaussian mixture model with feature selection: An embedded approach. Comput. Ind. Eng. 2021, 152, 107000.
- Patel, E.; Kushwaha, D.S. Clustering cloud workloads: K-means vs gaussian mixture model. Procedia Comput. Sci. 2020, 171, 158–167.
- Hennig, C. Asymmetric linear dimension reduction for classification. J. Comput. Graph. Stat. 2004, 13, 930–945.
- De Luca, G.; Loperfido, N. A skew-in-mean GARCH model for financial returns. In Skew-Elliptical Distributions and Their Applications: A Journey Beyond Normality; CRC/Chapman & Hall: Boca Raton, FL, USA, 2004; pp. 205–222.
- Abu Khurma, R.; Aljarah, I.; Sharieh, A.; Abd Elaziz, M.; Damaševičius, R.; Krilavičius, T. A review of the modification strategies of the nature inspired algorithms for feature selection problem. Mathematics 2022, 10, 464.
- Abd Elaziz, M.; Al-Qaness, M.A.A.; Abo Zaid, E.O.; Lu, S.; Ali Ibrahim, R.; Ewees, A.A. Automatic clustering method to segment COVID-19 CT images. PLoS ONE 2021, 16, e0244416.
- Janamala, V.; Reddy, D.S. Coyote optimization algorithm for optimal allocation of interline–photovoltaic battery storage system in islanded electrical distribution network considering EV load penetration. J. Energy Storage 2021, 41, 102981.
- Oladipo, S.; Sun, Y.; Wang, Z. Application of a new fusion of flower pollinated with Pathfinder algorithm for AGC of multi-source interconnected power system. IEEE Access 2021, 9, 94149–94168.
- Tang, C.; Zhou, Y.; Tang, Z.; Luo, Q. Teaching-learning-based Pathfinder algorithm for function and engineering optimization problems. Appl. Intell. 2021, 51, 5040–5066.
- Lee, P.C.; Moss, C.J. Wild female African elephants (Loxodonta africana) exhibit personality traits of leadership and social integration. J. Comp. Psychol. 2012, 126, 224.
- Peterson, R.O.; Jacobs, A.K.; Drummer, T.D.; Mech, L.D.; Smith, D.W. Leadership behavior in relation to dominance and reproductive status in gray wolves, Canis lupus. Can. J. Zool. 2002, 80, 1405–1412.
- Priyadarshani, S.; Subhashini, K.R.; Satapathy, J.K. Pathfinder algorithm optimized fractional order tilt-integral-derivative (FOTID) controller for automatic generation control of multi-source power system. Microsyst. Technol. 2021, 27, 23–35.
- Bilmes, J.A. A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models. Int. Comput. Sci. Inst. 1998, 4, 126.
- Ng, S.K.; McLachlan, G.J. An EM-based semi-parametric mixture model approach to the regression analysis of competing-risks data. Stat. Med. 2003, 22, 1097–1111.
- Davies, D.L.; Bouldin, D.W. A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell. 1979, PAMI-1, 224–227.
- Maulik, U.; Bandyopadhyay, S. Performance evaluation of some clustering algorithms and validity indices. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 1650–1654.
- Ullmann, T.; Hennig, C.; Boulesteix, A.L. Validation of cluster analysis results on validation data: A systematic framework. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2022, 12, e1444.
- Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proceedings of KDD-96; 1996; Volume 96, pp. 226–231.
- Bezdek, J.C.; Ehrlich, R.; Full, W. FCM: The fuzzy c-means clustering algorithm. Comput. Geosci. 1984, 10, 191–203.
- Frey, B.J.; Dueck, D. Clustering by passing messages between data points. Science 2007, 315, 972–976.
- Izakian, Z.; Mesgari, M.S.; Abraham, A. Automated clustering of trajectory data using a particle swarm optimization. Comput. Environ. Urban Syst. 2016, 55, 55–65.
- Li, L.; Liu, X.; Xu, M. A novel fuzzy clustering based on particle swarm optimization. In Proceedings of the 2007 First IEEE International Symposium on Information Technologies and Applications in Education, Kunming, China, 23–25 November 2007; pp. 88–90.
- Doval, D.; Mancoridis, S.; Mitchell, B.S. Automatic clustering of software systems using a genetic algorithm. In Proceedings of the Ninth International Workshop on Software Technology and Engineering Practice (STEP'99), Pittsburgh, PA, USA, 30 August–2 September 1999; IEEE: New York, NY, USA, 1999; pp. 73–81.
- Zhang, C.; Ouyang, D.; Ning, J. An artificial bee colony approach for clustering. Expert Syst. Appl. 2010, 37, 4761–4767.
- Storn, R.; Price, K. Differential evolution—A simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 1997, 11, 341–359.
- Fränti, P.; Sieranoja, S. K-means properties on six clustering benchmark datasets. Appl. Intell. 2018, 48, 4743–4759.
- Derrac, J.; García, S.; Molina, D.; Herrera, F. A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm Evol. Comput. 2011, 1, 3–18.
- Carpaneto, G.; Toth, P. Algorithm 548: Solution of the assignment problem [H]. ACM Trans. Math. Softw. 1980, 6, 104–111.
- Strehl, A.; Ghosh, J. Cluster ensembles—A knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res. 2002, 3, 583–617.
- Morey, L.C.; Agresti, A. The measurement of classification agreement: An adjustment to the Rand statistic for chance agreement. Educ. Psychol. Meas. 1984, 44, 33–37.
- Hubert, L.; Arabie, P. Comparing partitions. J. Classif. 1985, 2, 193–218.
| Algorithm | Parameter | Value |
|---|---|---|
| PSO | c1, c2, weight factor | 0.5, 0.5, 1.2 |
| PSO-FCM | c1, c2, weight factor | 2.0, 2.0, 0.4 |
| GA | Crossover factor, mutation factor | 0.8, 0.01 |
| ABC | Predetermined cycles | 5 |
| DE | Weight factor, crossover probability | 0.3, 0.8 |
| Dataset | Algorithm | ARI | NMI | ACC | Dataset | Algorithm | ARI | NMI | ACC |
|---|---|---|---|---|---|---|---|---|---|
| Aggregation (788 × 2) | PFA-GMM | 0.97 | 0.93 | 0.94 | S2 (5000 × 2) | PFA-GMM | 0.97 | 0.94 | 0.93 |
| | PFA-KM | 0.75 | 0.84 | 0.69 | | PFA-KM | 0.80 | 0.87 | 0.77 |
| | GMM | 0.87 | 0.93 | 0.91 | | GMM | 0.66 | 0.79 | 0.56 |
| | K-MEANS | 0.65 | 0.81 | 0.61 | | K-MEANS | 0.53 | 0.79 | 0.59 |
| | FCM | 0.73 | 0.73 | 0.67 | | FCM | 0.40 | 0.68 | 0.42 |
| | DBSCAN | 0.35 | 0.00 | 0.00 | | DBSCAN | 0.07 | 0.00 | 0.00 |
| | AP | 0.39 | 0.71 | 0.35 | | AP | 0.41 | 0.69 | 0.44 |
| | PSO | 0.61 | 0.68 | 0.49 | | PSO | 0.69 | 0.79 | 0.63 |
| | PSO-FCM | 0.83 | 0.84 | 0.72 | | PSO-FCM | 0.97 | 0.94 | 0.93 |
| | GA | 0.35 | 0.00 | 0.00 | | GA | 0.07 | 0.00 | 0.00 |
| | ABC | 0.82 | 0.75 | 0.67 | | ABC | 0.76 | 0.82 | 0.67 |
| | DE | 0.42 | 0.16 | 0.05 | | DE | 0.14 | 0.28 | 0.06 |
| Compound (399 × 2) | PFA-GMM | 0.88 | 0.86 | 0.84 | Pathbased (300 × 2) | PFA-GMM | 0.47 | 0.31 | 0.14 |
| | PFA-KM | 0.78 | 0.80 | 0.77 | | PFA-KM | 0.75 | 0.55 | 0.47 |
| | GMM | 0.86 | 0.85 | 0.83 | | GMM | 0.45 | 0.21 | 0.13 |
| | K-MEANS | 0.55 | 0.68 | 0.46 | | K-MEANS | 0.55 | 0.55 | 0.45 |
| | FCM | 0.61 | 0.60 | 0.44 | | FCM | 0.70 | 0.48 | 0.42 |
| | DBSCAN | 0.40 | 0.00 | 0.00 | | DBSCAN | 0.37 | 0.00 | 0.00 |
| | AP | 0.36 | 0.63 | 0.30 | | AP | 0.28 | 0.52 | 0.24 |
| | PSO | 0.50 | 0.49 | 0.31 | | PSO | 0.48 | 0.26 | 0.18 |
| | PSO-FCM | 0.66 | 0.70 | 0.54 | | PSO-FCM | 0.75 | 0.55 | 0.47 |
| | GA | 0.40 | 0.00 | 0.00 | | GA | 0.37 | 0.00 | 0.00 |
| | ABC | 0.65 | 0.48 | 0.42 | | ABC | 0.72 | 0.53 | 0.45 |
| | DE | 0.59 | 0.33 | 0.26 | | DE | 0.51 | 0.20 | 0.10 |
| Four Lines (511 × 2) | PFA-GMM | 1.00 | 0.99 | 0.99 | Jain (373 × 2) | PFA-GMM | 0.68 | 0.28 | 0.13 |
| | PFA-KM | 0.73 | 0.68 | 0.51 | | PFA-KM | 0.88 | 0.55 | 0.59 |
| | GMM | 0.57 | 0.42 | 0.33 | | GMM | 0.58 | 0.20 | 0.01 |
| | K-MEANS | 0.66 | 0.83 | 0.72 | | K-MEANS | 0.28 | 0.40 | 0.17 |
| | FCM | 0.64 | 0.64 | 0.48 | | FCM | 0.86 | 0.51 | 0.51 |
| | DBSCAN | 0.29 | 0.00 | 0.00 | | DBSCAN | 0.74 | 0.00 | 0.00 |
| | AP | 0.24 | 0.52 | 0.18 | | AP | 0.18 | 0.36 | 0.10 |
| | PSO | 0.57 | 0.65 | 0.47 | | PSO | 0.72 | 0.29 | 0.19 |
| | PSO-FCM | 0.85 | 0.76 | 0.67 | | PSO-FCM | 0.88 | 0.55 | 0.59 |
| | GA | 0.29 | 0.00 | 0.00 | | GA | 0.74 | 0.00 | 0.00 |
| | ABC | 0.38 | 0.50 | 0.23 | | ABC | 0.74 | 0.01 | 0.01 |
| | DE | 0.47 | 0.48 | 0.26 | | DE | 0.42 | 0.38 | 0.26 |
| Dataset | Algorithm | DB | RI | Dataset | Algorithm | DB | RI |
|---|---|---|---|---|---|---|---|
| Aggregation (788 × 2) | PFA-GMM | 0.12 | 0.98 | S2 (5000 × 2) | PFA-GMM | 0.08 | 0.94 |
| | PFA-KM | 0.12 | 0.92 | | PFA-KM | 0.09 | 0.94 |
| | GMM | 0.45 | 0.84 | | GMM | 0.09 | 0.94 |
| | K-MEANS | 0.13 | 0.89 | | K-MEANS | 0.11 | 0.93 |
| | FCM | 0.19 | 0.89 | | FCM | 0.12 | 0.84 |
| | DBSCAN | 0.34 | 0.22 | | DBSCAN | 0.47 | 0.07 |
| | AP | 0.17 | 0.84 | | AP | 0.41 | 0.69 |
| | PSO | 0.15 | 0.89 | | PSO | 0.09 | 0.94 |
| | PSO-FCM | 0.13 | 0.91 | | PSO-FCM | 0.04 | 0.93 |
| | GA | 0.23 | 0.77 | | GA | 0.36 | 0.07 |
| | ABC | 0.13 | 0.83 | | ABC | 0.08 | 0.93 |
| | DE | 0.14 | 0.61 | | DE | 0.08 | 0.34 |
| Compound (399 × 2) | PFA-GMM | 0.14 | 0.92 | Pathbased (300 × 2) | PFA-GMM | 0.22 | 0.74 |
| | PFA-KM | 0.15 | 0.90 | | PFA-KM | 0.22 | 0.75 |
| | GMM | 0.77 | 0.87 | | GMM | 1.75 | 0.56 |
| | K-MEANS | 0.17 | 0.83 | | K-MEANS | 0.23 | 0.74 |
| | FCM | 0.20 | 0.85 | | FCM | 0.38 | 0.69 |
| | DBSCAN | 0.42 | 0.25 | | DBSCAN | 0.49 | 0.33 |
| | AP | 0.17 | 0.80 | | AP | 0.06 | 0.73 |
| | PSO | 0.22 | 0.73 | | PSO | 0.22 | 0.72 |
| | PSO-FCM | 0.16 | 0.84 | | PSO-FCM | 0.23 | 0.75 |
| | GA | 0.44 | 0.74 | | GA | 0.35 | 0.58 |
| | ABC | 0.16 | 0.81 | | ABC | 0.23 | 0.73 |
| | DE | 0.22 | 0.75 | | DE | 0.36 | 0.59 |
| Four Lines (511 × 2) | PFA-GMM | 0.07 | 1.0 | Jain (373 × 2) | PFA-GMM | 0.48 | 0.57 |
| | PFA-KM | 0.19 | 0.83 | | PFA-KM | 0.39 | 0.79 |
| | GMM | 0.27 | 1.0 | | GMM | 0.53 | 0.51 |
| | K-MEANS | 0.07 | 0.91 | | K-MEANS | 0.53 | 0.51 |
| | FCM | 0.30 | 0.75 | | FCM | 0.81 | 0.59 |
| | DBSCAN | 0.54 | 0.25 | | DBSCAN | 0.59 | 0.61 |
| | AP | 0.32 | 0.83 | | AP | 0.65 | 0.46 |
| | PSO | 0.33 | 0.83 | | PSO | 0.53 | 0.51 |
| | PSO-FCM | 0.35 | 0.82 | | PSO-FCM | 0.39 | 0.78 |
| | GA | 0.45 | 0.59 | | GA | 0.59 | 0.61 |
| | ABC | 0.46 | 0.57 | | ABC | 0.58 | 0.62 |
| | DE | 0.38 | 0.62 | | DE | 0.63 | 0.48 |
| Algorithm | Statistic | Spect (267 × 22) | Seeds (210 × 7) | Iris (150 × 4) | Breast (699 × 9) | Glass (214 × 9) |
|---|---|---|---|---|---|---|
| PFA-GMM | Best | 0.8058 | 0.3209 | 0.2033 | 0.5876 | 0.0512 |
| | Worst | 0.8058 | 0.4650 | 0.3456 | 1.0417 | 0.0809 |
| | Mean | 0.8058 | 0.3609 | 0.3010 | 0.6707 | 0.0603 |
| | Std. | 0.0000 | 0.0397 | 0.0433 | 0.1443 | 0.0087 |
| PFA-KM | Best | 1.1421 | 0.3268 | 0.2303 | 0.8663 | 0.2151 |
| | Worst | 1.1791 | 0.3297 | 0.3321 | 1.0950 | 0.4132 |
| | Mean | 1.1585 | 0.3288 | 0.2723 | 0.9406 | 0.2632 |
| | Std. | 0.0127 | 0.0011 | 0.0440 | 0.0998 | 0.0604 |
| PSO | Best | 1.0737 | 0.3286 | 0.2303 | 0.8069 | 0.2882 |
| | Worst | 2.1367 | 0.7367 | 0.7684 | 1.4967 | 0.4787 |
| | Mean | 1.4930 | 0.4889 | 0.4553 | 1.1616 | 0.3921 |
| | Std. | 0.2983 | 0.1318 | 0.1503 | 0.2063 | 0.0582 |
| PSO-FCM | Best | 1.1227 | 0.3317 | 0.3242 | 1.1236 | 0.3117 |
| | Worst | 1.4690 | 0.3363 | 0.3247 | 1.6832 | 0.3432 |
| | Mean | 1.2241 | 0.3332 | 0.3244 | 1.3712 | 0.3279 |
| | Std. | 0.1166 | 0.0017 | 0.0002 | 0.1896 | 0.0096 |
| GA | Best | 0.8218 | 0.2488 | 0.1902 | 0.5347 | 0.2022 |
| | Worst | 1.1003 | 0.3350 | 0.2529 | 1.0344 | 0.2815 |
| | Mean | 1.0089 | 0.2985 | 0.2136 | 0.7128 | 0.2527 |
| | Std. | 0.0957 | 0.0304 | 0.0221 | 0.1294 | 0.0225 |
| ABC | Best | 0.3079 | 0.1756 | 0.1798 | 0.1907 | 0.1531 |
| | Worst | 0.8413 | 0.3021 | 0.2303 | 0.6749 | 0.2628 |
| | Mean | 0.5714 | 0.2315 | 0.1996 | 0.3864 | 0.2095 |
| | Std. | 0.1727 | 0.0392 | 0.0176 | 0.1808 | 0.0384 |
| DE | Best | 1.1238 | 0.3412 | 0.2999 | 0.9923 | 0.2284 |
| | Worst | 1.6833 | 0.7104 | 0.6976 | 1.5695 | 0.5714 |
| | Mean | 1.3066 | 0.4789 | 0.4749 | 1.2934 | 0.4302 |
| | Std. | 0.1587 | 0.1161 | 0.1260 | 0.1534 | 0.0944 |

| Algorithm | Statistic | Heart (303 × 13) | Liver (345 × 6) | Wine (178 × 13) | Zoo (101 × 16) | Banknote (1372 × 4) |
|---|---|---|---|---|---|---|
| PFA-GMM | Best | 0.2673 | 0.1176 | 0.1983 | 0.1230 | 0.3596 |
| | Worst | 0.4289 | 0.4183 | 0.2975 | 0.1742 | 0.6053 |
| | Mean | 0.2834 | 0.1927 | 0.2261 | 0.1571 | 0.5016 |
| | Std. | 0.0485 | 0.0794 | 0.0274 | 0.0137 | 0.1089 |
| PFA-KM | Best | 1.0394 | 0.1819 | 0.4875 | 0.1539 | 0.6004 |
| | Worst | 1.0394 | 0.5950 | 0.4937 | 0.1583 | 0.6004 |
| | Mean | 1.0394 | 0.2232 | 0.4913 | 0.1560 | 0.6004 |
| | Std. | 0.0000 | 0.1240 | 0.0029 | 0.0015 | 0.0000 |
| PSO | Best | 1.0871 | 0.7896 | 0.5331 | 0.1927 | 0.5152 |
| | Worst | 2.2302 | 1.7341 | 1.3147 | 0.3888 | 0.8781 |
| | Mean | 1.6378 | 1.0881 | 0.8028 | 0.2905 | 0.6749 |
| | Std. | 0.3839 | 0.2783 | 0.2102 | 0.0539 | 0.1150 |
| PSO-FCM | Best | 1.0908 | 0.7688 | 0.4701 | 0.1917 | 0.5998 |
| | Worst | 2.1732 | 0.8308 | 0.4801 | 0.3179 | 0.6064 |
| | Mean | 1.3155 | 0.7945 | 0.4752 | 0.2315 | 0.6023 |
| | Std. | 0.3145 | 0.0187 | 0.0026 | 0.0313 | 0.0020 |
| GA | Best | 0.1545 | 0.1306 | 0.4699 | 0.1833 | 0.2714 |
| | Worst | 1.1754 | 0.5463 | 0.4956 | 0.2547 | 0.4616 |
| | Mean | 0.9042 | 0.3455 | 0.4787 | 0.2250 | 0.3775 |
| | Std. | 0.3762 | 0.1507 | 0.0065 | 0.0212 | 0.0673 |
| ABC | Best | 0.2558 | 0.2558 | 0.2081 | 0.1523 | 0.1845 |
| | Worst | 0.8741 | 0.6353 | 0.4676 | 0.1983 | 0.6263 |
| | Mean | 0.5053 | 0.4117 | 0.3322 | 0.1668 | 0.2944 |
| | Std. | 0.1933 | 0.1418 | 0.1014 | 0.0137 | 0.1259 |
| DE | Best | 0.9603 | 1.0209 | 0.4824 | 0.2177 | 0.5461 |
| | Worst | 2.1271 | 1.8712 | 1.2315 | 0.4304 | 0.8467 |
| | Mean | 1.4722 | 1.3898 | 0.7852 | 0.3022 | 0.6600 |
| | Std. | 0.3907 | 0.3084 | 0.2398 | 0.0612 | 0.0993 |
Wilcoxon test p-values, PFA-GMM vs. each compared algorithm:

| Dataset | PFA-GMM | PFA-KM | PSO | PSO-FCM | GA | ABC | DE |
|---|---|---|---|---|---|---|---|
| Spect | 0.001953 | 0.001953 | 0.001953 | 0.001953 | 0.001953 | 0.009766 | 0.001953 |
| Seeds | 0.001953 | 0.003906 | 0.001953 | 0.037109 | 0.003906 | 0.001953 | 0.019531 |
| Iris | 0.001953 | 0.232422 | 0.001953 | 0.037109 | 0.009766 | 0.001953 | 0.005859 |
| Breast | 0.001953 | 0.001953 | 0.001953 | 0.001953 | 0.232422 | 0.019531 | 0.001953 |
| Glass | 0.001953 | 0.001953 | 0.001953 | 0.001953 | 0.001953 | 0.001953 | 0.001953 |
| Heart | 0.001953 | 0.001953 | 0.001953 | 0.001953 | 0.009766 | 0.064453 | 0.001953 |
| Liver | 0.001953 | 0.461451 | 0.001953 | 0.001953 | 0.037109 | 0.951915 | 0.001953 |
| Wine | 0.001953 | 0.001953 | 0.001953 | 0.001953 | 0.001953 | 0.027344 | 0.001953 |
| Zoo | 0.001953 | 0.556641 | 0.001953 | 0.001953 | 0.001953 | 0.431641 | 0.001953 |
| Banknote | 0.001953 | 0.232422 | 0.001953 | 0.232422 | 0.037109 | 0.037109 | 0.009766 |
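A note on the p-values above: 0.001953 equals 2/1024, the smallest two-sided p-value an exact Wilcoxon signed-rank test can produce with 10 paired observations, so the table is consistent with 10 independent runs per dataset (an inference from the values, not a statement made in this excerpt). A toy reproduction with hypothetical run scores:

```python
from scipy.stats import wilcoxon

# Hypothetical per-run fitness values for one dataset (10 runs each, invented
# purely to illustrate the mechanics; runs_b is strictly worse in every run).
runs_a = [0.31, 0.29, 0.33, 0.28, 0.30, 0.27, 0.32, 0.26, 0.34, 0.25]
runs_b = [x + 0.50 + 0.01 * i for i, x in enumerate(runs_a)]

stat, p = wilcoxon(runs_a, runs_b)   # paired, two-sided, exact for small n
print(p)                             # 0.001953125 == 2 / 2**10
```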
Citation: Huang, H.; Liao, Z.; Wei, X.; Zhou, Y. Combined Gaussian Mixture Model and Pathfinder Algorithm for Data Clustering. Entropy 2023, 25, 946. https://doi.org/10.3390/e25060946