Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5633)

Abstract

The notion of distance is the most important basis for classification. This is especially true for unsupervised learning, i.e. clustering, where no objects of known group membership are available for validation. In supervised learning, too, standard distances often do not lead to appropriate results, so an adequate distance has to be chosen for every individual problem. This is demonstrated by three practical examples from very different application areas: social science, music science, and production economics. In social science, clustering is applied to spatial regions with very irregular borders, so that adequate spatial distances may have to be taken into account. In statistical musicology, the main problem is often to find a transformation of the input time series that provides an adequate basis for defining a distance. Local modelling is also proposed in order to account for different subpopulations, e.g. instruments. In production economics, many quality criteria with very different scaling often have to be taken into account. To find a compromise optimum classification, the criteria are first transformed onto a common scale, called desirability.
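The effect of a pre-transformation onto a common scale can be illustrated with a minimal sketch. It assumes the one-sided Harrington desirability d = exp(-exp(-(b0 + b1*y))); the products, criteria, and scale parameters below are hypothetical and chosen only for illustration, and the paper's own transformation may differ.

```python
import math

def harrington_one_sided(y, b0, b1):
    """One-sided Harrington desirability, mapping a raw value y into (0, 1)."""
    return math.exp(-math.exp(-(b0 + b1 * y)))

def euclidean(a, b):
    """Plain Euclidean distance between two equally long tuples."""
    return math.sqrt(sum((x - z) ** 2 for x, z in zip(a, b)))

# Hypothetical products with two quality criteria on very different scales:
# (strength, larger is better; defect rate, smaller is better).
products = {
    "A": (1200.0, 0.02),
    "B": (1250.0, 0.08),
    "C": (1800.0, 0.03),
}

# Hypothetical (b0, b1) per criterion; a negative b1 makes small values desirable.
scale_params = [(-6.0, 0.004), (2.0, -40.0)]

def to_desirability(values):
    """Transform each criterion of one product onto the desirability scale."""
    return tuple(
        harrington_one_sided(y, b0, b1)
        for y, (b0, b1) in zip(values, scale_params)
    )

desirabilities = {name: to_desirability(v) for name, v in products.items()}

# On the raw scale the distance is driven almost entirely by the first criterion.
raw_ab = euclidean(products["A"], products["B"])
# On the desirability scale both criteria can contribute.
des_ab = euclidean(desirabilities["A"], desirabilities["B"])

print(f"raw distance A-B: {raw_ab:.4f}")
print(f"desirability distance A-B: {des_ab:.4f}")
```

In this sketch the raw distance between A and B reflects only the difference in the first criterion, whereas after the transformation the large difference in the second criterion dominates. Any clustering built on these distances would therefore group the products differently depending on the scale used.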

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Weihs, C., Szepannek, G. (2009). Distances in Classification. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2009. Lecture Notes in Computer Science, vol 5633. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03067-3_1

  • DOI: https://doi.org/10.1007/978-3-642-03067-3_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03066-6

  • Online ISBN: 978-3-642-03067-3

  • eBook Packages: Computer Science, Computer Science (R0)