Abstract
This paper proposes a methodology for incorporating a neighborhood relation among clusters into conventional cluster validity measures that use external criteria, that is, class information. The extended measure evaluates cluster validity together with the connectivity of the class distribution, based on the neighborhood relation of clusters. A weighting function is introduced to smooth the basic statistics underlying both set-based and pairwise-based measures. The method can extend any cluster validity measure that is based on sets or pairs of data points. In experiments on synthetic and real-world data, we examined the neighbor component of the extended measure and identified an appropriate neighborhood radius along with some of its properties.
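To make the idea of smoothing a set-based measure concrete, the following is a minimal sketch, not the paper's formulation. It assumes a Gaussian weighting function over a cluster-to-cluster distance matrix and applies it to purity; the function names, the `radius` parameter, and the toy data are all hypothetical choices made for illustration.

```python
import math


def gaussian_weights(dist, radius):
    """Assumed weighting: h[i][j] = exp(-d_ij^2 / (2 * radius^2))."""
    return [[math.exp(-(d * d) / (2.0 * radius * radius)) for d in row]
            for row in dist]


def smoothed_counts(counts, weights):
    """Smooth per-cluster class counts over neighboring clusters.

    counts[i][c] is the number of points of class c in cluster i.
    The result for cluster i is the weighted sum of counts over all clusters j.
    """
    n_clusters = len(counts)
    n_classes = len(counts[0])
    return [[sum(weights[i][j] * counts[j][c] for j in range(n_clusters))
             for c in range(n_classes)]
            for i in range(n_clusters)]


def purity(counts):
    """Purity: sum of per-cluster majority counts divided by the total count."""
    total = sum(sum(row) for row in counts)
    return sum(max(row) for row in counts) / total


# Three clusters placed at positions 0, 1, 2 on a line (hypothetical layout).
positions = [0, 1, 2]
dist = [[abs(a - b) for b in positions] for a in positions]
weights = gaussian_weights(dist, radius=1.0)

# Same classes, different arrangement: in `connected` the first class occupies
# adjacent clusters; in `scattered` it is split by a cluster of the other class.
connected = [[10, 0], [10, 0], [0, 10]]
scattered = [[10, 0], [0, 10], [10, 0]]

plain_connected = purity(connected)
plain_scattered = purity(scattered)
smooth_connected = purity(smoothed_counts(connected, weights))
smooth_scattered = purity(smoothed_counts(scattered, weights))
```

In this toy setting, plain purity cannot tell the two arrangements apart, while the smoothed variant scores the arrangement with a connected class distribution higher. A smaller `radius` brings the smoothed value closer to plain purity.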
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fukui, Ki., Numao, M. (2012). Neighborhood-Based Smoothing of External Cluster Validity Measures. In: Tan, PN., Chawla, S., Ho, C.K., Bailey, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2012. Lecture Notes in Computer Science, vol 7301. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30217-6_30
DOI: https://doi.org/10.1007/978-3-642-30217-6_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30216-9
Online ISBN: 978-3-642-30217-6
eBook Packages: Computer Science (R0)