Computer Science, 2015, Vol. 42, Issue (Z11): 55-57.
黄德才,钱潮恺
HUANG De-cai and QIAN Chao-kai
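Abstract: The affinity propagation clustering algorithm cannot handle data sets with mixed attributes. To address this, a new distance measure is proposed and applied to affinity propagation, yielding a mixed-attribute affinity propagation clustering algorithm based on dimension-attribute distance. Unlike traditional clustering algorithms, the proposed algorithm does not need to compute virtual center points, and it takes into account the influence of the overall distribution of the data set on the clustering result. The algorithm was validated on two mixed-attribute data sets from the UCI repository and compared with the classic K-Prototypes and K-Modes algorithms. The experimental results show that the improved algorithm achieves better clustering quality and execution efficiency than both baselines, which confirms its advantage.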
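The abstract does not define the dimension-attribute distance itself, so the following is only a minimal sketch of the general idea: build a pairwise similarity matrix over mixed numeric and categorical records that an affinity propagation implementation could consume. It swaps in a Gower-style per-dimension dissimilarity (range-normalised absolute difference for numeric dimensions, simple mismatch for categorical ones) as a stand-in, not the authors' measure. The function names `mixed_dissimilarity` and `similarity_matrix` and the `is_numeric` flag list are hypothetical.

```python
from typing import List, Optional, Sequence, Union

Value = Union[float, str]


def mixed_dissimilarity(
    a: Sequence[Value],
    b: Sequence[Value],
    numeric_ranges: Sequence[Optional[float]],
) -> float:
    """Average per-dimension dissimilarity between two records.

    Assumption: a Gower-style stand-in, NOT the paper's measure.
    numeric_ranges[k] is (max - min) of dimension k for numeric
    dimensions and None for categorical dimensions.
    """
    total = 0.0
    for x, y, value_range in zip(a, b, numeric_ranges):
        if value_range is None:
            # Categorical dimension: 0 on match, 1 on mismatch.
            total += 0.0 if x == y else 1.0
        elif value_range > 0:
            # Numeric dimension: absolute difference scaled into [0, 1].
            total += abs(x - y) / value_range
        # A constant numeric dimension (range 0) contributes nothing.
    return total / len(numeric_ranges)


def similarity_matrix(
    data: Sequence[Sequence[Value]],
    is_numeric: Sequence[bool],
) -> List[List[float]]:
    """Pairwise similarities, defined as negative dissimilarities.

    No virtual center point is computed: only record-to-record
    comparisons are needed, which is what exemplar-based clustering uses.
    """
    ranges: List[Optional[float]] = []
    for k, numeric in enumerate(is_numeric):
        if numeric:
            column = [row[k] for row in data]
            ranges.append(max(column) - min(column))
        else:
            ranges.append(None)
    n = len(data)
    return [
        [-mixed_dissimilarity(data[i], data[j], ranges) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical toy records: (numeric value, categorical label).
    records = [(1.0, "a"), (3.0, "a"), (5.0, "b")]
    for row in similarity_matrix(records, is_numeric=[True, False]):
        print(row)
```

The diagonal of this matrix is left at zero. Affinity propagation reads the diagonal as each point's preference for becoming an exemplar, so a caller would normally overwrite it (the median of the off-diagonal similarities is a common choice) before running the message-passing step.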