Efficient Shared Near Neighbours Clustering of Large Metric Data Sets

Stefano Lodi,
Luisella Reami &
Claudio Sartori

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1704)

Included in the following conference series:

European Conference on Principles of Data Mining and Knowledge Discovery

Abstract

Very few clustering methods can cluster data without assuming operations that are defined only in strongly structured spaces, such as vector spaces. We propose an efficient data clustering method based on the shared near neighbours approach, which requires only a distance definition and can discover clusters of any shape. Using efficient data structures for querying metric data and a scheme for partitioning and sampling the data, the method can cluster, effectively and efficiently, data sets whose size exceeds that of internal memory.
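To make the shared near neighbours idea concrete, the following is a minimal sketch and not the authors' algorithm: it uses a brute-force neighbour search instead of the metric data structures and the partitioning and sampling scheme mentioned above. It assumes a common variant of the criterion in which two points are linked when each appears in the other's k-nearest-neighbour list and the two lists share at least `min_shared` entries. The names `knn_lists`, `snn_clusters`, `hamming`, `k` and `min_shared` are illustrative assumptions. The data are strings compared by Hamming distance, to show that only a distance function is needed.

```python
from itertools import combinations


def hamming(a, b):
    """Hamming distance between equal-length strings (a metric, no vector space needed)."""
    return sum(x != y for x, y in zip(a, b))


def knn_lists(points, dist, k):
    """Brute-force k-nearest-neighbour index sets, computed from dist alone."""
    n = len(points)
    lists = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: (dist(points[i], points[j]), j))
        lists.append(set(others[:k]))
    return lists


def snn_clusters(points, dist, k, min_shared):
    """Link mutual near neighbours that share >= min_shared neighbours; return a label per point."""
    nbrs = knn_lists(points, dist, k)
    parent = list(range(len(points)))  # union-find forest over point indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        mutual = j in nbrs[i] and i in nbrs[j]
        if mutual and len(nbrs[i] & nbrs[j]) >= min_shared:
            parent[find(i)] = find(j)  # merge the two clusters
    return [find(i) for i in range(len(points))]


# Hypothetical toy data: two groups of strings that differ in every position.
words = ["aaaa", "aaab", "aaba", "abaa", "zzzz", "zzzy", "zzyz", "zyzz"]
labels = snn_clusters(words, hamming, k=3, min_shared=2)
print(labels)
```

The brute-force search costs a quadratic number of distance evaluations, which is exactly the cost an index for metric data is meant to reduce; the sketch only illustrates the linking criterion.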

Author information

Authors and Affiliations

Department of Electronics, Computer Science and Systems, CSITE-CNR, University of Bologna, viale Risorgimento 2, 40136, Bologna, Italy
Stefano Lodi, Luisella Reami & Claudio Sartori

Editor information

Editors and Affiliations

Computer Science Department, UNC Charlotte, Charlotte, N.C. 28223 and Institute of Computer Science, Polish Academy of Sciences,
Jan M. Żytkow
Faculty of Informatics and Statistics, University of Economics, Prague, nám. W. Churchilla 4, 130 67, Prague, Czech Republic
Jan Rauch

Cite this paper

Lodi, S., Reami, L., Sartori, C. (1999). Efficient Shared Near Neighbours Clustering of Large Metric Data Sets. In: Żytkow, J.M., Rauch, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1999. Lecture Notes in Computer Science, vol 1704. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-48247-5_53

DOI: https://doi.org/10.1007/978-3-540-48247-5_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66490-1
Online ISBN: 978-3-540-48247-5
eBook Packages: Springer Book Archive
