Abstract
This paper presents novel stochastic generalized cellular automata (GCA) for self-organizing data clustering in enterprise computing. The GCA transforms the data clustering process into a stochastic process over the configuration space of the GCA array. The approach clusters data in a self-organizing manner and offers several advantages: insensitivity to noise, clustering quality that is robust to the data being clustered, suitability for high-dimensional and massive data sets, learning ability, and easier hardware implementation with VLSI systolic technology. Simulations and comparisons show the effectiveness and good performance of the proposed GCA approach to data clustering.
This work was supported by the National Natural Science Foundation of China under Grants No. 60473044, No. 60575040, and No. 60135010.
Keywords
- Data Object
- Transitive Probability
- Stationary Probability Distribution
- Cluster Time
- Enterprise Information System
Copyright information
© 2006 International Federation for Information Processing
About this paper
Cite this paper
Shuai, D., Shuai, Q., Dong, Y., Huang, L. (2006). Data Clustering in Enterprise Computing: A New Generalized Cellular Automata. In: Tjoa, A.M., Xu, L., Chaudhry, S.S. (eds) Research and Practical Issues of Enterprise Information Systems. IFIP International Federation for Information Processing, vol 205. Springer, Boston, MA. https://doi.org/10.1007/0-387-34456-X_4
DOI: https://doi.org/10.1007/0-387-34456-X_4
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-34345-7
Online ISBN: 978-0-387-34456-0
eBook Packages: Computer Science, Computer Science (R0)