Abstract
The results of any clustering or classification of objects depend strongly on the proximity measure chosen, and the user has to select one measure among the many that exist. Yet, under a given notion of topological equivalence, some measures are more or less equivalent. In this paper, we propose a new approach to compare and classify proximity measures according to the topological structure they induce, in a context of discrimination. The concept of topological equivalence rests on the basic notion of local neighborhood: we define the topological equivalence between two proximity measures, in the context of discrimination, through the topological structure induced by each measure. We propose a criterion for choosing the “best” measure, adapted to the data considered, among some of the most widely used proximity measures for quantitative or qualitative data. The principle of the approach is illustrated on two real datasets with conventional proximity measures from the literature for quantitative and qualitative variables. We then conduct experiments to evaluate the performance of this discriminant topological approach and to test whether the proximity measure selected as the “best” discriminant changes with the size or the dimensionality of the data. The “best” discriminating proximity measure is verified a posteriori using a supervised learning method, such as a Support Vector Machine, discriminant analysis, or logistic regression, applied in a topological context.
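To make the notion of a topological structure induced by a proximity measure concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the chapter's own procedure: the abstract does not say which neighborhood structure or which agreement score is used, so the sketch assumes a relative neighborhood graph as the local neighborhood and scores two measures by the fraction of adjacency entries on which their graphs agree. It also ignores class labels, so it does not cover the discrimination setting. The function names (`pairwise`, `rng_adjacency`, `topological_agreement`) are hypothetical.

```python
import numpy as np

def pairwise(X, metric):
    """Return the n x n matrix of proximities metric(x_i, x_j)."""
    n = len(X)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            D[i, j] = metric(X[i], X[j])
    return D

def rng_adjacency(D):
    """Adjacency matrix of the relative neighborhood graph built from D.

    Assumption: i and j are neighbors when no third point k is closer to
    both of them than they are to each other, i.e.
    D[i, j] <= max(D[i, k], D[j, k]) for every k other than i and j.
    The diagonal is left at 0.
    """
    n = D.shape[0]
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            others = [k for k in range(n) if k != i and k != j]
            if all(D[i, j] <= max(D[i, k], D[j, k]) for k in others):
                A[i, j] = 1
    return A

def topological_agreement(A1, A2):
    """Fraction of adjacency entries on which the two graphs agree (assumed score)."""
    return float(np.mean(A1 == A2))

# Three common proximity measures for quantitative data.
def euclidean(x, y):
    return float(np.sqrt(np.sum((x - y) ** 2)))

def manhattan(x, y):
    return float(np.sum(np.abs(x - y)))

def chebyshev(x, y):
    return float(np.max(np.abs(x - y)))

# Illustrative synthetic data, not one of the chapter's datasets.
X_demo = np.random.default_rng(0).normal(size=(12, 3))
graphs = {
    name: rng_adjacency(pairwise(X_demo, metric))
    for name, metric in [("euclidean", euclidean),
                         ("manhattan", manhattan),
                         ("chebyshev", chebyshev)]
}
for a in graphs:
    for b in graphs:
        if a < b:
            print(a, b, topological_agreement(graphs[a], graphs[b]))
```

A score of 1 means the two measures induce the same neighborhood graph on the data, so under this assumed definition they would be topologically equivalent; lower scores indicate measures that disagree about which objects are neighbors.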
© 2017 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Abdesselam, R., Aazi, FZ. (2017). Comparison of Proximity Measures for a Topological Discrimination. In: Guillet, F., Pinaud, B., Venturini, G. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 665. Springer, Cham. https://doi.org/10.1007/978-3-319-45763-5_5
DOI: https://doi.org/10.1007/978-3-319-45763-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45762-8
Online ISBN: 978-3-319-45763-5
eBook Packages: Engineering, Engineering (R0)