Selecting Among Multi-Mode Partitioning Models of Different Complexities: A Comparison of Four Model Selection Criteria

Journal of Classification

Abstract

Multi-mode partitioning models for N-way N-mode data reduce each of the N modes in the data to a small number of clusters that are mutually exclusive. Given a specific N-mode data set, one may wonder which multi-mode partitioning model (i.e., with which numbers of clusters for each mode) yields the most useful description of this data set and should therefore be selected. In this paper, we address this issue by investigating four possible model selection heuristics: multi-mode extensions of Calinski and Harabasz’s (1974) and Kaufman and Rousseeuw’s (1990) indices for one-mode k-means clustering and multi-mode partitioning versions of Timmerman and Kiers’s (2000) DIFFIT and Ceulemans and Kiers’s (2006) numerical convex hull based model selection heuristic for three-mode principal component analysis. The performance of these four heuristics is systematically compared in a simulation study, which shows that the DIFFIT and numerical convex hull heuristics perform satisfactorily in the two-mode partitioning case and very well in the three-mode partitioning case.
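For orientation, the one-mode index that the first heuristic extends can be sketched as follows. This is a minimal illustration of the standard Calinski and Harabasz (1974) ratio for a single partition of the objects, [B / (k - 1)] / [W / (n - k)], where B and W are the between-cluster and within-cluster sums of squares. It is not the multi-mode extension studied in the paper, which the abstract does not spell out; the function name and the toy data are assumptions made for this example only.

```python
from typing import Sequence


def calinski_harabasz(points: Sequence[Sequence[float]],
                      labels: Sequence[int]) -> float:
    """One-mode Calinski-Harabasz index: [B / (k - 1)] / [W / (n - k)].

    Hypothetical helper for illustration; not the paper's multi-mode version.
    """
    n = len(points)
    clusters = sorted(set(labels))
    k = len(clusters)
    if not 1 < k < n:
        raise ValueError("the index is defined only for 1 < k < n")
    dim = len(points[0])
    # Grand centroid of all objects.
    grand = [sum(p[d] for p in points) / n for d in range(dim)]
    between = 0.0  # B: size-weighted squared distances of centroids to grand centroid
    within = 0.0   # W: squared distances of objects to their own cluster centroid
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        between += len(members) * sum(
            (centroid[d] - grand[d]) ** 2 for d in range(dim))
        within += sum(
            (p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    if within == 0.0:
        return float("inf")  # perfectly compact clusters
    return (between / (k - 1)) / (within / (n - k))


# Toy data (assumed): two well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_partition = [0, 0, 1, 1]  # groups the nearby points together
poor_partition = [0, 1, 0, 1]  # splits each pair across clusters
print(calinski_harabasz(toy_points, good_partition))
print(calinski_harabasz(toy_points, poor_partition))
```

Under this index, a larger value indicates a better trade-off between cluster separation and cluster compactness, so the first partition of the toy data scores far higher than the second. In one-mode k-means practice the index is typically computed for each candidate number of clusters and the maximizing solution is retained.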

References

  • BAIER, D., GAUL, W., and SCHADER, M. (1997), “Two-Mode Overlapping Clustering with Applications to Simultaneous Benefit Segmentation and Market Structuring”, in Classification and Knowledge Organization, Eds. R. Klar and O. Opitz, Berlin: Springer, pp. 557–566.

  • CALINSKI, T. and HARABASZ, J. (1974), “A Dendrite Method for Cluster Analysis”, Communications in Statistics, 3, 1–27.

  • CASTILLO, W. and TREJOS, J. (2002), “Two-mode Partitioning: Review of Methods and Application of Tabu Search”, in Classification, Clustering, and Related Topics. Recent Advances and Applications. Studies in Classification, Data Analysis, and Knowledge Organization, Eds. K. Jajuga, A. Sokolowski, and H.H. Bock, Heidelberg: Springer–Verlag, pp. 43–51.

  • CEULEMANS, E. and KIERS, H.A.L. (2006), “Selecting Among Three-mode Principal Component Models of Different Types and Complexities: A Numerical Convex Hull Based Method”, British Journal of Mathematical and Statistical Psychology, 59, 133–150.

  • CEULEMANS, E. and VAN MECHELEN, I. (2005), “Hierarchical Classes Models for Three-way Three-mode Binary Data: Interrelations and Model Selection”, Psychometrika, 70, 461–480.

  • EVERITT, B.S. (1993), Cluster Analysis, London: Arnold.

  • GAUL, W. and SCHADER, M. (1996), “A New Algorithm for Two-mode Clustering”, in Classification and Knowledge Organization, Eds. H. Bock and W. Polasek, Berlin: Springer, pp. 15–23.

  • HARTIGAN, J.A. (1975), Clustering Algorithms, New York: Wiley.

  • KAUFMAN, L. and ROUSSEEUW, P. (1990), Finding Groups in Data: An Introduction to Cluster Analysis, New York: Wiley.

  • KIERS, H.A.L. (2000), “Towards a Standardized Notation and Terminology in Multiway Analysis”, Journal of Chemometrics, 14, 105–122.

  • KIERS, H.A.L. (2004), Clustering All Three Modes of Three-mode Data: Computational Possibilities and Problems, Paper presented at COMPSTAT 2004, Prague.

  • KROONENBERG, P.M. and OORT, F.J. (2003), “Three-mode Analysis of Multimode Covariance Matrices”, British Journal of Mathematical and Statistical Psychology, 56, 305–336.

  • KROONENBERG, P.M. and VAN DER VOORT, T.H.A. (1987), “Multiplicative Decomposition of Interactions for Judgements of Realism of Television Films”, Kwantitatieve Methoden, 8, 117–144.

  • MACQUEEN, J. (1967), “Some Methods for Classification and Analysis of Multivariate Observations”, in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1, Eds. L.M. LeCam and J. Neyman, Berkeley: University of California Press, pp. 281–297.

  • MILLIGAN, G.W. and COOPER, M.C. (1985), “An Examination of Procedures for Determining the Number of Clusters in a Data Set”, Psychometrika, 50, 159–179.

  • MURAKAMI, T. and KROONENBERG, P.M. (2003), “Three-mode Models and Individual Differences in Semantic Differential Data”, Multivariate Behavioral Research, 38, 247–283.

  • ROCCI, R. and VICHI, M. (2008), “Two-mode Multi-Partitioning”, Computational Statistics and Data Analysis, 52, 1984–2003.

  • SCHEPERS, J., VAN MECHELEN, I., and CEULEMANS, E. (2006), “Three-mode Partitioning”, Computational Statistics and Data Analysis, 51, 1623–1642.

  • SUGAR, C.A. and JAMES, G.M. (2003), “Identifying Groups in Data: An Information-Theoretic Approach”, Journal of the American Statistical Association, 98, 750–763.

  • TIMMERMAN, M.E. and KIERS, H.A.L. (2000), “Three-mode Principal Components Analysis: Choosing the Numbers of Components and Sensitivity to Local Optima”, British Journal of Mathematical and Statistical Psychology, 53, 1–16.

  • TIBSHIRANI, R., WALTHER, G., and HASTIE, T. (2001), “Estimating the Number of Clusters in a Data Set via the Gap Statistic”, Journal of the Royal Statistical Society B, 63, 411–423.

  • TUCKER, L.R. (1966), “Some Mathematical Notes on Three-mode Factor Analysis”, Psychometrika, 31, 279–311.

  • VAN ROSMALEN, J., GROENEN, P.J.F., TREJOS, J., and CASTILLO, W. (2005), “Global Optimization Strategies for Two-Mode Clustering”, Econometric Institute Report EI 2005-33, Erasmus School of Economics, Erasmus University Rotterdam.

  • VICHI, M. (2002), “Double k-means Clustering for Simultaneous Classification of Objects and Variables”, in Advances in Classification and Data Analysis, Eds. S. Borra, R. Rocci, and M. Schader, Heidelberg: Springer, pp. 43–51.

  • WANSBEEK, T. and VERHEES, J. (1989), “Models for Multidimensional Matrices in Econometrics and Psychometrics”, in Multiway Data Analysis, Eds. R. Coppi and S. Bolasco, Amsterdam: Elsevier, pp. 543–552.

Author information

Corresponding author

Correspondence to Jan Schepers.

Additional information

J. Schepers and I. Van Mechelen were supported by the Fund for Scientific Research-Flanders (Belgium), Project No. G.0146.06 awarded to Iven Van Mechelen, and by the Research Council KULeuven (GOA/2005/04). E. Ceulemans is a post-doctoral fellow of the Fund for Scientific Research-Flanders (Belgium).

About this article

Cite this article

Schepers, J., Ceulemans, E. & Van Mechelen, I. Selecting Among Multi-Mode Partitioning Models of Different Complexities: A Comparison of Four Model Selection Criteria. J Classif 25, 67–85 (2008). https://doi.org/10.1007/s00357-008-9005-9

  • DOI: https://doi.org/10.1007/s00357-008-9005-9