Abstract
Very few clustering methods can cluster data without assuming operations that are defined only in strongly structured spaces, such as vector spaces. We propose an efficient data clustering method based on the shared near neighbours approach; it requires only a distance definition and can discover clusters of any shape. Using efficient data structures for querying metric data, together with a scheme for partitioning and sampling the data, the method clusters data sets larger than internal memory both effectively and efficiently.
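To make the "only a distance definition" requirement concrete, the following is a minimal brute-force sketch of shared near neighbours clustering in the style of Jarvis and Patrick. It is an illustration under stated assumptions, not the method proposed in the paper: it uses no metric index, no partitioning and no sampling, and it runs in quadratic time. The function name `snn_clusters`, the parameters `k` and `k_t`, the merge rule (mutual neighbours sharing at least `k_t` neighbours) and the Hamming-distance example are all choices made for this sketch.

```python
from itertools import combinations


def snn_clusters(points, dist, k, k_t):
    """Label points by shared near neighbours, using only dist(a, b).

    Assumed merge rule for this sketch: two points are linked when each is
    among the other's k nearest neighbours and their neighbour lists share
    at least k_t members. Clusters are the connected components of links.
    """
    n = len(points)

    # k nearest neighbours of every point (brute force, self excluded).
    neighbours = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(points[i], points[j]),
        )
        neighbours.append(set(others[:k]))

    # Union-find over point indices to collect connected components.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        mutual = i in neighbours[j] and j in neighbours[i]
        shared = len(neighbours[i] & neighbours[j])
        if mutual and shared >= k_t:
            parent[find(i)] = find(j)

    # Points with the same label belong to the same cluster.
    return [find(i) for i in range(n)]


def hamming(a, b):
    """Number of differing positions between two equal-length strings."""
    return sum(ca != cb for ca, cb in zip(a, b))


# Strings are not vectors, but a distance between them is enough.
words = ["aaaa", "aaab", "aaba", "abaa", "zzzz", "zzzy", "zzyz", "zyzz"]
labels = snn_clusters(words, hamming, k=3, k_t=2)
```

Because the sketch touches the data only through `dist`, any metric can be substituted, for example an edit distance. Replacing the brute-force neighbour search with a metric index is where an efficient method would differ from this sketch.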
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lodi, S., Reami, L., Sartori, C. (1999). Efficient Shared Near Neighbours Clustering of Large Metric Data Sets. In: Żytkow, J.M., Rauch, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1999. Lecture Notes in Computer Science, vol 1704. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-48247-5_53
DOI: https://doi.org/10.1007/978-3-540-48247-5_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66490-1
Online ISBN: 978-3-540-48247-5
eBook Packages: Springer Book Archive