A Novel Adaptive Kernel Picture Fuzzy C-Means Clustering Algorithm Based on Grey Wolf Optimizer Algorithm
Figure 1. The procedures of the proposed KPFCM-GWO.
Figure 2. Initial dataset distribution.
Figure 3. Final clustering result.
Figure 4. Comparison of various fuzzy clustering algorithms on the Iris dataset. Note: SC: Silhouette Coefficient; CH: Calinski–Harabasz; ARI: Adjusted Rand Index; FCM: Fuzzy C-Means clustering; IFCM: Intuitionistic Fuzzy C-Means clustering; KFCM: Kernel-based Fuzzy C-Means clustering; PFCM: Picture Fuzzy C-Means clustering; PFCM-PSO: Picture Fuzzy C-Means clustering with Particle Swarm Optimization algorithm; KPFCM-GWO: Kernel-based Picture Fuzzy C-Means clustering with Grey Wolf Optimizer.
Figure 5. Comparison of various fuzzy clustering algorithms on the Glass dataset. Abbreviations as in Figure 4.
Figure 6. The accuracy values of KPFCM-GWO for varied numbers of wolves.
Figure 7. The SC values of KPFCM-GWO for varied numbers of wolves.
Figure 8. RFM model.
Figure 9. Customer value radar chart.
Abstract
1. Introduction
2. Preliminaries
2.1. Fuzzy C-Means Clustering Algorithm
- $N$ indicates the number of data points. Each data point is represented in $r$ dimensions;
- $C$ indicates the number of clusters of the dataset, which needs to be set in advance;
- $m$ is the fuzzifier. A smaller $m$ results in larger memberships $u_{kj}$. The value of $m$ generally depends on human knowledge and experience and is commonly set to 2;
- $u_{kj}$ is the membership degree of the data point $X_k$ belonging to cluster $j$;
- $X = \{X_1, \ldots, X_N\}$ is the dataset in the feature space;
- $V = \{V_1, \ldots, V_C\}$ is the set of cluster centers, where each element $V_j$ is the center of cluster $j$ in $r$ dimensions;
- $\|\cdot\|$ denotes the Euclidean distance measure.
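The quantities listed above can be made concrete with a minimal sketch of the standard FCM iteration. It assumes the usual alternating updates (centers as membership-weighted means, memberships from relative Euclidean distances); the function name `fcm`, the toy data, and the default values of `eps`, `max_iter`, and `seed` are illustrative assumptions rather than settings taken from this paper.

```python
import math
import random


def fcm(points, n_clusters, m=2.0, eps=1e-6, max_iter=100, seed=0):
    """Minimal fuzzy c-means; returns (memberships u[k][j], centers v[j])."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalised so that each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([value / total for value in row])

    centers = []
    for _ in range(max_iter):
        # Center update: mean of the points weighted by u_kj ** m.
        centers = []
        for j in range(n_clusters):
            weights = [u[k][j] ** m for k in range(n)]
            total = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])
        # Membership update from relative Euclidean distances.
        new_u = []
        for p in points:
            dist = [max(math.dist(p, c), 1e-12) for c in centers]
            new_u.append([
                1.0 / sum((dist[j] / dist[i]) ** (2.0 / (m - 1.0))
                          for i in range(n_clusters))
                for j in range(n_clusters)
            ])
        # Largest change of any membership between two iterations.
        shift = max(abs(a - b) for ra, rb in zip(new_u, u) for a, b in zip(ra, rb))
        u = new_u
        if shift < eps:
            break
    return u, centers


# Toy data (hypothetical): two well-separated groups in two dimensions.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
memberships, centers = fcm(data, n_clusters=2)
labels = [row.index(max(row)) for row in memberships]
```

Each row of `memberships` sums to 1, and taking the largest entry per row gives a crisp label for comparison with ground truth.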
2.2. Picture Fuzzy C-Means Clustering Algorithm
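Algorithm 1. PFCM algorithm
Input: Dataset $X$ with $N$ data points in $r$ dimensions; number of clusters $C$; threshold $\varepsilon$; fuzzifier $m$; exponent $\alpha$; and the maximal number of iterations.
Output: Matrices $u$ (positive degree), $\eta$ (neutral degree), and $\xi$ (refusal degree), and the cluster centers $V$.
1: Set $t = 0$, where $t$ is the number of the current iteration;
2: Initialize $u^{(t)}$, $\eta^{(t)}$, and $\xi^{(t)}$ randomly, satisfying constraint (7);
3: Set $t = t + 1$;
4: Calculate the current cluster centers using Equation (10);
5: Update $u$, $\eta$, and $\xi$ using Equations (7)–(9);
6: Repeat steps 3–5;
7: Continue until the change in $u$, $\eta$, and $\xi$ between successive iterations falls below $\varepsilon$ or $t$ reaches the maximal number of iterations.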
3. Proposed PFCM Algorithm Based on Kernel Function and GWO
3.1. Kernel-Based Picture Fuzzy C-Means Clustering (KPFCM)
- Gaussian kernel function:
- Linear kernel function:
- Radial basis kernel function:
- Hyper tangent kernel function:
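To illustrate how a kernel replaces the Euclidean distance, the following minimal sketch implements the Gaussian and linear kernels in their common textbook forms, together with the squared distance they induce in feature space, $K(x,x) - 2K(x,v) + K(v,v)$. The function names and the bandwidth parameter `sigma` are assumptions made for illustration. The radial basis and hyper tangent kernels are left out because their parameterization varies across the literature and the exact forms used here are not reproduced in this text.

```python
import math


def gaussian_kernel(x, v, sigma=1.0):
    """K(x, v) = exp(-||x - v||^2 / (2 * sigma^2))."""
    squared = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-squared / (2.0 * sigma ** 2))


def linear_kernel(x, v):
    """K(x, v) = <x, v>, the ordinary inner product."""
    return sum(a * b for a, b in zip(x, v))


def kernel_distance_sq(kernel, x, v):
    """Squared distance in the kernel-induced space: K(x,x) - 2K(x,v) + K(v,v)."""
    return kernel(x, x) - 2.0 * kernel(x, v) + kernel(v, v)


x, v = (1.0, 2.0), (4.0, 6.0)
# With the linear kernel the induced distance is the squared Euclidean distance.
linear_d2 = kernel_distance_sq(linear_kernel, x, v)
# With the Gaussian kernel K(x, x) = 1, so the distance reduces to 2 * (1 - K(x, v)).
gaussian_d2 = kernel_distance_sq(gaussian_kernel, x, v)
```

Because the Gaussian-induced distance is bounded by 2, distant outliers contribute less to the clustering objective than they would under the Euclidean distance.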
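Algorithm 2. KPFCM algorithm
Input: Dataset $X$ with $N$ data points in $r$ dimensions; number of clusters $C$; threshold $\varepsilon$; fuzzifier $m$; exponent $\alpha$; kernel function $K$; and the maximal number of iterations.
Output: Matrices $u$ (positive degree), $\eta$ (neutral degree), and $\xi$ (refusal degree), and the cluster centers $V$.
1: Set $t = 0$;
2: Initialize $u^{(t)}$, $\eta^{(t)}$, and $\xi^{(t)}$ randomly, satisfying constraint (19);
3: Set $t = t + 1$;
4: Calculate the cluster centers via Equation (22);
5: Calculate $u$, $\eta$, and $\xi$ via Equations (19)–(21);
6: Repeat steps 3–5;
7: Continue until the change in $u$, $\eta$, and $\xi$ between successive iterations falls below $\varepsilon$ or $t$ reaches the maximal number of iterations.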
3.2. Parameter Selection of KPFCM with GWO
- Encircling prey
- Hunting prey
- Attacking prey
- Search for prey
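The four behaviours listed above can be sketched with the basic update rules of the Grey Wolf Optimizer as described by Mirjalili et al.: the three fittest wolves (alpha, beta, delta) guide the pack, the coefficient $a$ decays linearly from 2 to 0, and each wolf moves to the average of three leader-guided positions. This is a minimal sketch, not the parameter-selection procedure of KPFCM-GWO itself; the objective `shifted_sphere`, the bounds, and the default pack size and iteration count are hypothetical stand-ins for a clustering-quality objective over KPFCM parameters.

```python
import random


def grey_wolf_optimizer(objective, bounds, n_wolves=30, max_iter=300, seed=0):
    """Minimise `objective` inside the box `bounds` with the basic GWO updates."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best, best_score = None, float("inf")

    for t in range(max_iter):
        # Rank the pack; the three fittest wolves act as alpha, beta and delta.
        ranked = sorted(wolves, key=objective)
        leaders = [list(w) for w in ranked[:3]]
        if objective(leaders[0]) < best_score:
            best, best_score = list(leaders[0]), objective(leaders[0])

        # a decays linearly from 2 to 0: |A| > 1 favours searching for prey,
        # |A| < 1 favours attacking it.
        a = 2.0 * (1.0 - t / max_iter)
        for wolf in wolves:
            for d, (lo, hi) in enumerate(bounds):
                guided = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    # Encircling: distance to the leader, then a step around it.
                    D = abs(C * leader[d] - wolf[d])
                    guided += leader[d] - A * D
                # Hunting: average the three guided positions, clipped to the box.
                wolf[d] = min(max(guided / 3.0, lo), hi)

    # Account for the positions produced by the last update.
    final = min(wolves, key=objective)
    if objective(final) < best_score:
        best, best_score = list(final), objective(final)
    return best, best_score


def shifted_sphere(x):
    """Hypothetical objective with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best, score = grey_wolf_optimizer(shifted_sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

In a parameter-selection setting, each wolf position would encode candidate parameter values and `objective` would run the clustering and return a quantity to minimise, for example a negated validity index.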
4. Experiments
4.1. Datasets
- The Calinski–Harabasz (CH) index accounts for both the compactness within clusters and the separation between clusters. A larger CH value indicates tighter clusters that are farther apart, i.e., a better clustering result. The CH index is formulated in Equation (27).
- The Silhouette Coefficient (SC) assesses clustering quality by combining the cohesion and separation of clusters. Its value lies between −1 and 1. The closer the samples of the same cluster are to each other and the farther apart the samples of different clusters are, the larger the SC, indicating a better clustering result.
- The Adjusted Rand Index (ARI) takes values in the range of −1 to 1. The closer the ARI is to 1, the more closely the clustering result matches the true labels, and the better the clustering result.
- Accuracy is the ratio of the number of correctly predicted samples to the total number of predicted samples; it does not consider whether the predicted samples are positive or negative cases [20].
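Three of the four indices can be computed directly from labels and data, as the following minimal sketch shows. It uses the standard pair-counting definition of ARI and the usual ratio of between-cluster to within-cluster dispersion for CH. For accuracy, it assumes that cluster labels are first matched to class labels by the best one-to-one relabelling, since the matching rule is not stated here; that choice and all function names are illustrative assumptions. SC is omitted for brevity.

```python
from collections import Counter
from itertools import permutations
from math import comb


def adjusted_rand_index(true_labels, pred_labels):
    """ARI from the contingency table of two labelings (pair-counting form)."""
    n = len(true_labels)
    cells = Counter(zip(true_labels, pred_labels))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:  # degenerate case, e.g. one cluster on both sides
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def clustering_accuracy(true_labels, pred_labels):
    """Accuracy under the best one-to-one relabelling of clusters (an assumption)."""
    classes = sorted(set(true_labels))
    clusters = sorted(set(pred_labels))
    best_hits = 0
    # Brute force over label assignments; adequate for a handful of clusters.
    for perm in permutations(classes):
        mapping = dict(zip(clusters, perm))
        hits = sum(mapping.get(p) == t for t, p in zip(true_labels, pred_labels))
        best_hits = max(best_hits, hits)
    return best_hits / len(true_labels)


def calinski_harabasz(points, labels):
    """CH = [between-cluster dispersion / (k - 1)] / [within-cluster dispersion / (n - k)]."""
    n, dim = len(points), len(points[0])
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    k = len(groups)
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = within = 0.0
    for members in groups.values():
        center = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        between += len(members) * sum((c - o) ** 2 for c, o in zip(center, overall))
        within += sum(sum((a - c) ** 2 for a, c in zip(p, center)) for p in members)
    return (between / (k - 1)) / (within / (n - k))


truth = [0, 0, 0, 1, 1, 1]
predicted = [1, 1, 0, 0, 0, 0]  # hypothetical clustering output with one error
ari = adjusted_rand_index(truth, predicted)
acc = clustering_accuracy(truth, predicted)
ch = calinski_harabasz([(0.0,), (2.0,), (10.0,), (12.0,)], [0, 0, 1, 1])
```

ARI and accuracy need ground-truth labels, whereas CH (like SC) is computed from the data and the cluster assignment alone.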
4.2. Experiment 1
4.3. Experiment 2
- Segment customers using airline customer data.
- Characterize the resulting customer categories and compare their value.
- Important retention customers (customer group 1) are the airline's high-value customers: the most desirable type of customer, contributing the most to the airline while accounting for a small percentage of customers. Airlines should prioritize resources for these customers, using differentiated, one-to-one marketing to increase their loyalty and satisfaction and to raise their already high level of spending as far as possible.
- Important development customers (customer group 3) are potential-value customers. The airline should encourage them to spend more on flights with the company and its partners, raise their cost of switching to competitors, and gradually turn them into loyal customers.
- Important win-back customers (customer group 5) show high uncertainty in how their customer value changes. Based on changes in the time of their most recent purchase and in their number of purchases, airlines should estimate how their consumption is changing, keep in close contact with them, and adopt targeted marketing measures to extend their customer life cycle.
- Average customers and low-value customers (customer groups 2 and 4) may fly with the company only when tickets are discounted.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Jain, A.K.; Murty, M.N.; Flynn, P.J. Data clustering: A review. ACM Comput. Surv. 1999, 31, 264–323.
- Bezdek, J.C.; Ehrlich, R.; Full, W. FCM: The fuzzy c-means clustering algorithm. Comput. Geosci. 1984, 10, 191–203.
- Shadi, A.; Mahmoud, A.; Yaser, J.; Mohammed, A. Enhanced 3D segmentation techniques for reconstructed 3D medical volumes: Robust and accurate intelligent system. Procedia Comput. Sci. 2017, 113, 531–538.
- Chen, M.M.; Wang, N.; Zhou, H.B.; Chen, Y.Z. FCM technique for efficient intrusion detection system for wireless networks in cloud environment. Comput. Electr. Eng. 2018, 71, 978–987.
- Lee, Z.J.; Lee, C.Y.; Chang, L.Y.; Sano, N. Clustering and classification based on distributed automatic feature engineering for customer segmentation. Symmetry 2021, 13, 1557.
- Hanuman, V.; Deepa, V.; Pawan, K.T. A population based hybrid FCM-PSO algorithm for clustering analysis and segmentation of brain image. Expert Syst. Appl. 2021, 167, 114121.
- Pal, N.R.; Pal, K.; Keller, J.M.; Bezdek, J.C. A possibilistic fuzzy c-means clustering algorithm. IEEE Trans. Fuzzy Syst. 2005, 13, 517–530.
- Krinidis, S.; Chatzis, V. A robust fuzzy local information c-means clustering algorithm. IEEE Trans. Image Process. 2010, 19, 1328–1337.
- Thong, P.H.; Son, L.H. Picture fuzzy clustering: A new computational intelligence method. Soft Comput. 2016, 20, 3549–3562.
- Hwang, C.; Rhee, F.C. Uncertain fuzzy clustering: Interval type-2 fuzzy approach to c-means. IEEE Trans. Fuzzy Syst. 2007, 15, 107–120.
- Xu, Z.; Wu, J. Intuitionistic fuzzy c-means clustering algorithms. J. Syst. Eng. Electron. 2010, 21, 580–590.
- Zeng, W.; Ma, R.; Yin, Q.; Zheng, X.; Xu, Z. Hesitant fuzzy c-means algorithm and its application in image segmentation. J. Intell. Fuzzy Syst. 2020, 39, 3681–3695.
- Hou, W.H.; Wang, Y.T.; Wang, J.Q.; Cheng, P.F.; Li, L. Intuitionistic fuzzy c-means clustering algorithm based on a novel weighted proximity measure and genetic algorithm. Int. J. Mach. Learn. Cyb. 2021, 12, 859–875.
- Cuong, B.C.; Kreinovich, V. Picture fuzzy sets. J. Comput. Sci. Cybern. 2014, 30, 409–420.
- Son, L.H. DPFCM. Expert Syst. Appl. 2015, 42, 51–66.
- Thong, P.H.; Son, L.H. A novel automatic picture fuzzy clustering method based on particle swarm optimization and picture composite cardinality. Knowl. Based Syst. 2016, 109, 48–60.
- Thong, P.H.; Son, L.H. Picture fuzzy clustering for complex data. Eng. Appl. Artif. Intell. 2016, 56, 121–130.
- Wu, C.; Chen, Y. Adaptive entropy weighted picture fuzzy clustering algorithm with spatial information for image segmentation. Appl. Soft Comput. 2020, 86, 105888.
- Graves, D.; Pedrycz, W. Kernel-based fuzzy clustering and fuzzy clustering: A comparative experimental study. Fuzzy Sets Syst. 2010, 161, 522–543.
- Zhou, X.; Zhang, R.; Wang, X.; Huang, T.; Yang, C. Kernel intuitionistic fuzzy c-means and state transition algorithm for clustering problem. Soft Comput. 2020, 24, 15507–15518.
- Wu, C.M.; Cao, Z. Noise distance driven fuzzy clustering based on adaptive weighted local information and entropy-like divergence kernel for robust image segmentation. Digit. Signal Process. 2021, 111, 102963.
- Lin, K.P. A novel evolutionary kernel intuitionistic fuzzy c-means clustering algorithm. IEEE Trans. Fuzzy Syst. 2014, 22, 1074–1087.
- Chou, C.H.; Hsieh, S.C.; Qiu, C.J. Hybrid genetic algorithm and fuzzy clustering for bankruptcy prediction. Appl. Soft Comput. 2017, 56, 298–316.
- Zhang, J.; Ma, Z.H. Hybrid fuzzy clustering method based on FCM and enhanced logarithmical PSO (ELPSO). Comput. Intel. Neurosc. 2020, 2020, 1386839.
- Niknam, T.; Amiri, B. An efficient hybrid approach based on PSO, ACO and k-means for cluster analysis. Appl. Soft Comput. 2010, 10, 183–197.
- Karaboga, D.; Basturk, B. On the performance of artificial bee colony (ABC) algorithm. Appl. Soft Comput. 2008, 8, 687–697.
- Mirjalili, S.; Mirjalili, S.M.; Lewis, A. Grey wolf optimizer. Adv. Eng. Softw. 2014, 69, 46–61.
- Yager, R.R. On the measure of fuzziness and negation. II. Lattices. Inf. Control. 1980, 44, 236–260.
- Keogh, E.; Ratanamahatana, C.A. Exact indexing of dynamic time warping. Knowl. Inf. Syst. 2005, 7, 358–386.
- Qingshan, L.; Rui, H.; Hanqing, L.; Songde, M. Face recognition using kernel-based Fisher discriminant analysis. In Proceedings of the Fifth IEEE International Conference on Automatic Face Gesture Recognition, Washington, DC, USA, 21 May 2002; pp. 197–201.
- Camps-Valls, G.; Bruzzone, L. Kernel-based methods for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2005, 43, 1351–1362.
- Che, J.; Wang, J. Short-term load forecasting using a kernel-based support vector regression combination model. Appl. Energy 2014, 132, 602–609.
- Santos, A.; Figueiredo, E.; Silva, M.F.M.; Sales, C.S.; Costa, J.C.W.A. Machine learning algorithms for damage detection: Kernel-based approaches. J. Sound Vib. 2016, 363, 584–599.
- Chaira, T. A novel intuitionistic fuzzy c means clustering algorithm and its application to medical images. Appl. Soft Comput. 2011, 11, 1711–1717.
- Dua, D.; Graff, C. UCI Machine Learning Repository. 2017. Available online: http://archive.ics.uci.edu/ml (accessed on 12 January 2022).
- Verma, H.; Gupta, A.; Kumar, D. A modified intuitionistic fuzzy c-means algorithm incorporating hesitation degree. Pattern Recognit. Lett. 2019, 122, 45–52.
- Tao, Y. Analysis method for customer value of aviation big data based on LRFMC model. In Data Science, Proceedings of the 6th International Conference of Pioneering Computer Scientists, Engineers and Educators, ICPCSEE 2020, Taiyuan, China, 18–21 September 2020; Zeng, J., Jing, W., Song, X., Lu, Z., Eds.; Springer: Singapore, 2020; pp. 89–100.
- Cheadle, C.; Vawter, M.P.; Freed, W.J.; Becker, K.G. Analysis of microarray data using Z score transformation. J. Mol. Diagn. 2003, 5, 73–81.
Experiment | Dataset | Number of Elements | Number of Features | Number of Classes |
---|---|---|---|---|
Experiment 1 | Iris | 150 | 4 | 3 |
Experiment 1 | Wine | 178 | 13 | 3 |
Experiment 1 | Glass | 214 | 9 | 6 |
Experiment 1 | WDBC | 569 | 30 | 2 |
Experiment 2 | Airline | 62,988 | 25 | 5 |
Dataset | Accuracy | SC | CH | ARI |
---|---|---|---|---|
Iris | 0.93 | 0.579 | 558.85 | 0.745 |
Wine | 0.88 | 0.563 | 557.46 | 0.412 |
Glass | 0.85 | 0.391 | 88.39 | 0.311 |
WDBC | 0.73 | 0.707 | 1317.16 | 0.518 |
Algorithm | Dataset | Accuracy | SC | CH | ARI |
---|---|---|---|---|---|
FCM | Iris | 0.89 | 0.549 | 558.99 | 0.729 |
FCM | Wine | 0.86 | 0.566 | 559.40 | 0.354 |
FCM | Glass | 0.71 | 0.258 | 88.68 | 0.227 |
FCM | WDBC | 0.63 | 0.697 | 1300.21 | 0.491 |
IFCM | Iris | 0.95 | 0.535 | 556.34 | 0.731 |
IFCM | Wine | 0.83 | 0.531 | 317.56 | 0.373 |
IFCM | Glass | 0.73 | 0.277 | 81.23 | 0.327 |
IFCM | WDBC | 0.57 | 0.711 | 1267.69 | 0.535 |
KFCM | Iris | 0.91 | 0.521 | 529.618 | 0.768 |
KFCM | Wine | 0.69 | 0.559 | 554.568 | 0.366 |
KFCM | Glass | 0.52 | 0.280 | 61.482 | 0.239 |
KFCM | WDBC | 0.866 | 0.691 | 1273.503 | 0.529 |
PFCM | Iris | 0.89 | 0.535 | 537.786 | 0.730 |
PFCM | Wine | 0.702 | 0.520 | 528.919 | 0.371 |
PFCM | Glass | 0.542 | 0.261 | 71.367 | 0.267 |
PFCM | WDBC | 0.849 | 0.704 | 1266.015 | 0.476 |
PFCM-PSO | Iris | 0.94 | 0.562 | 541.22 | 0.741 |
PFCM-PSO | Wine | 0.85 | 0.519 | 553.87 | 0.452 |
PFCM-PSO | Glass | 0.78 | 0.310 | 40.36 | 0.296 |
PFCM-PSO | WDBC | 0.72 | 0.715 | 1310.26 | 0.505 |
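Category | Attribute |
---|---|
Customers’ basic information | MEMBER_NO |
Customers’ basic information | FFP_DATE |
Customers’ basic information | FIRST_FLIGHT_DATE |
Customers’ basic information | GENDER |
Customers’ basic information | FFP_TIER |
Customers’ basic information | WORK_CITY |
Customers’ basic information | WORK_PROVINCE |
Customers’ basic information | WORK_COUNTRY |
Customers’ basic information | AGE |
Flight information | FLIGHT_COUNT |
Flight information | LOAD_TIME |
Flight information | LAST_TO_END |
Flight information | AVG_DISCOUNT |
Flight information | SUM_YR |
Flight information | SEG_KM_SUM |
Flight information | LAST_FLIGHT_DATE |
Flight information | AVG_INTERVAL |
Flight information | MAX_INTERVAL |
Points information in the airline system | EXCHANGE_COUNT |
Points information in the airline system | EP_SUM |
Points information in the airline system | PROMOPTIVE_SUM |
Points information in the airline system | PARTNER_SUM |
Points information in the airline system | POINTS_SUM |
Points information in the airline system | POINT_NOTFLIGHT |
Points information in the airline system | BP_SUM |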
LRFMC Indicator | Description |
---|---|
L | Length of membership enrollment in the observed time period |
R | Number of months since the customer’s last flight ended in the observed time period |
F | Number of times the customer flew with the company during the observed time period |
M | Customer’s accumulated flight miles within the observation window |
C | Average of the discount factors corresponding to the customer’s class of travel within the observation window |
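As an illustration of how the five LRFMC indicators can be prepared for clustering, the following minimal sketch derives them from raw attributes and standardizes each one with a z-score, so that indicators on very different scales (months, counts, kilometers, discount factors) contribute comparably. The mapping from raw fields to indicators (for example, L from `LOAD_TIME` minus `FFP_DATE`, R from `LAST_TO_END` taken as given), the 30-day month, the use of the population standard deviation, and the sample records are all assumptions made for illustration.

```python
from datetime import date
from statistics import mean, pstdev

# Hypothetical member records; field names follow the attribute table.
records = [
    {"FFP_DATE": date(2008, 5, 1), "LOAD_TIME": date(2014, 3, 31),
     "LAST_TO_END": 12, "FLIGHT_COUNT": 60, "SEG_KM_SUM": 95000, "AVG_DISCOUNT": 0.95},
    {"FFP_DATE": date(2012, 1, 15), "LOAD_TIME": date(2014, 3, 31),
     "LAST_TO_END": 90, "FLIGHT_COUNT": 8, "SEG_KM_SUM": 12000, "AVG_DISCOUNT": 0.70},
    {"FFP_DATE": date(2010, 9, 30), "LOAD_TIME": date(2014, 3, 31),
     "LAST_TO_END": 300, "FLIGHT_COUNT": 3, "SEG_KM_SUM": 4000, "AVG_DISCOUNT": 0.55},
    {"FFP_DATE": date(2013, 6, 20), "LOAD_TIME": date(2014, 3, 31),
     "LAST_TO_END": 45, "FLIGHT_COUNT": 20, "SEG_KM_SUM": 30000, "AVG_DISCOUNT": 0.80},
]


def lrfmc_features(record):
    """Map one record to [L, R, F, M, C]; the field mapping is an assumption."""
    membership_months = (record["LOAD_TIME"] - record["FFP_DATE"]).days / 30.0
    return [
        membership_months,        # L: length of membership
        record["LAST_TO_END"],    # R: recency of the last flight
        record["FLIGHT_COUNT"],   # F: flight frequency
        record["SEG_KM_SUM"],     # M: accumulated mileage
        record["AVG_DISCOUNT"],   # C: average discount factor
    ]


def zscore_columns(rows):
    """Standardize every column to zero mean and unit (population) variance."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [
        [(value - mu) / sd if sd > 0 else 0.0 for value, (mu, sd) in zip(row, stats)]
        for row in rows
    ]


features = [lrfmc_features(record) for record in records]
standardized = zscore_columns(features)  # columns correspond to ZL, ZR, ZF, ZM, ZC
```

After standardization, a positive value marks a customer above the population average on that indicator and a negative value one below it, which is how standardized cluster centers are usually read.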
Cluster | Center (ZL) | Center (ZR) | Center (ZF) | Center (ZM) | Center (ZC) | Number of Customers |
---|---|---|---|---|---|---|
Customer 1 | 0.0593 | −0.1977 | −0.2353 | 2.1068 | 1.7729 | 15,739 |
Customer 2 | −0.3328 | −0.7509 | 1.3731 | 0.7999 | −0.0403 | 4182 |
Customer 3 | −0.7618 | −0.7053 | 0.4666 | 0.7435 | 0.0125 | 24,661 |
Customer 4 | 0.8262 | −0.6901 | 0.7813 | 0.8079 | −0.1205 | 12,125 |
Customer 5 | 1.2951 | −0.8648 | 2.8262 | 3.4351 | 0.4086 | 5336 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, C.-M.; Liu, Y.; Wang, Y.-T.; Li, Y.-P.; Hou, W.-H.; Duan, S.; Wang, J.-Q. A Novel Adaptive Kernel Picture Fuzzy C-Means Clustering Algorithm Based on Grey Wolf Optimizer Algorithm. Symmetry 2022, 14, 1442. https://doi.org/10.3390/sym14071442