Model-based clustering of multivariate ordinal data relying on a stochastic binary search algorithm

Christophe Biernacki¹ &
Julien Jacques²

943 Accesses
21 Citations
Explore all metrics

Abstract

We design a probability distribution for ordinal data by modeling the process generating data, which is assumed to rely only on order comparisons between categories. Contrariwise, most competitors often either forget the order information or add a non-existent distance information. The data generating process is assumed, from optimality arguments, to be a stochastic binary search algorithm in a sorted table. The resulting distribution is natively governed by two meaningful parameters (position and precision) and has very appealing properties: decrease around the mode, shape tuning from uniformity to a Dirac, identifiability. Moreover, it is easily estimated by an EM algorithm since the path in the stochastic binary search algorithm can be considered as missing values. Using then the classical latent class assumption, the previous univariate ordinal model is straightforwardly extended to model-based clustering for multivariate ordinal data. Parameters of this mixture model are estimated by an AECM algorithm. Both simulated and real data sets illustrate the great potential of this model by its ability to parsimoniously identify particularly relevant clusters which were unsuspected by some traditional competitors.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

A Model-Based Approach to Simultaneous Clustering and Dimensional Reduction of Ordinal Data

Article 06 September 2017

Mixture models for ordinal data: a pairwise likelihood approach

Article 11 December 2014

Clustering Ordinal Data Via Parsimonious Models

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Notes

Ranking data occur when some judges are asking to sort m objects, quoted by $1,\ldots ,m$, according to a preference order. Thus, a ranking data is a permutation of the whole data set $\{1,\ldots ,m\}$. Ordinal data occur when some judges are asking to give a note among the set of ordered notes $1<\cdots <m$. Thus, an ordinal data is a unique element of $\{1,\ldots ,m\}$.
We will do the slight abuse of notation $\hbox {p}(x;\mu ,\pi )=\hbox {p}\Big (\{x\};\mu ,\pi \Big )$.
For avoiding the label cluster dependency, we keep the best distance over all label permutations.
Note that a two component multinomial model is not performed in this univariate situation since it is not identifiable.
http://www.aeres-evaluation.com/
All partitions are better displayed in the first and the third axes than in the first and second axes.
We conjecture that all propositions hold for any m value. In particular, we have proved the simplest of them (Propositions (5.1), (5.2), (5.4)) in this general situation (these general proofs are not given in this paper).

References

Agresti, A.: Analysis of Ordinal Categorical Data. Wiley Series in Probability and Statistics. Wiley-Interscience, New York (2010)
Book MATH Google Scholar
Allman, E.S., Matias, C., Rhodes, J.A.: Identifiability of parameters in latent structure models with many observed variables. Ann. Stat. 37(6A), 3099–3132 (2009)
Article MathSciNet MATH Google Scholar
Bartholomew, D.J., Knott, M., Moustaki, I.: Latent Variable Models and Factor Analysis: A Unified Approach. Wiley, Hoboken (2011)
Book MATH Google Scholar
Biernacki, C., Jacques, J.: A generative model for rank data based on sorting algorithm. Comput. Stat. Data Anal. 58, 162–176 (2013)
Article MathSciNet Google Scholar
Celeux, G., Govaert, G.: Gaussian parsimonious clustering models. J. Pattern Recogn. Soc. 28, 781–793 (1995)
Article Google Scholar
D’Elia, A., Piccolo, D.: A mixture model for preferences data analysis. Comput. Stat. Data Anal. 49(3), 917–934 (2005)
Article MathSciNet MATH Google Scholar
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 39(1), 1–38 (1977). With discussion
MathSciNet MATH Google Scholar
Fraley, C., Raftery, A.E.: Model-based clustering, discriminant analysis, and density estimation. J. Am. Stat. Soc. 97, 611–612 (2002)
Article MathSciNet MATH Google Scholar
Giordan, M., Diana, G.: A clustering method for categorical ordinal data. Commun. Stat. Theory Methods 40, 1315–1334 (2011)
Article MathSciNet MATH Google Scholar
Goodman, L.A., Kruskal, W.H.: Measures of association for cross classifications. J. Am. Stat. Assoc. 49, 732–764 (1954)
MATH Google Scholar
Goodman, L.A.: Explanatory latent structure models using both identifiable and unidentifiable models. Biometrika 61, 215–231 (1974)
Article MathSciNet MATH Google Scholar
Gouget, C.: Utilisation des modèles de mélange pour la classification automatique de données ordinales. Ph.D. thesis, Université de Technologie de Compiègne (2006)
Hartigan, J.A., Wong, M.A.: Algorithm as 1326: a k-means clustering algorithm. Appl. Stat. 28, 100–108 (1978)
Article MATH Google Scholar
Iannario, M., Piccolo, D.: A program in r for cub models inference. Technical report, Universita di Napoli Federico II, Version 2.0, www.dipstat.unina.it (2009)
Jacques, J., Biernacki, C.: Model-based clustering for multivariate partial ranking data. J. Stat. Plan. Inference 149, 201–217 (2014)
Article MathSciNet MATH Google Scholar
Jacques, J., Grimonprez, Q., Biernacki, C.: Rankcluster: an r package for clustering multivariate partial rankings. R J. (in press) (2014)
Jacques, J., Preda, C.: Functional data clustering: a survey. Adv. Data Anal. Classif. 8(3), 231–255 (2014)
Article MathSciNet Google Scholar
Jollois, F-X., Nadif, M.: Classification de données ordinales : modèles et algorithmes. In: Proceedings of the 41th Conference of the French Statistical Society, Bordeaux, France (2011)
Kaufman, L., Rousseeuw, P.J.: Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, Hoboken (1990)
Book Google Scholar
Kendall, M.G.: The treatment of ties in ranking problems. Biometrika 33, 239–251 (1945)
Article MathSciNet MATH Google Scholar
Knuth, D.E.: Sorting and Searching: The Art of Computer Programming, vol. 3, 2nd edn. Addison-Wesley Professional, Boston (1998)
MATH Google Scholar
Lewis, S.J.G., Foltynie, T., Blackwell, A.D., Robbins, T.W., Owen, A.M., Barker, R.A.: Heterogeneity of parkinson’s disease in the early clinical stages using a data driven approach. J. Neurol. Neurosurg. Psychiatry 76, 343–348 (2003)
Article Google Scholar
Lipsitz, S.R., Kim, K., Zhao, L.: Analysis of repeated categorical data using generalized estimating equations. Stat. Med. 13, 1149–1163 (1994)
Article Google Scholar
Manisera, M., Zuccolotto, P.: Modeling rating data with nonlinear CUB models. Comput. Stat. Data Anal. 78, 100–118 (2014)
Marbac, M., Biernacki, C., Vandewalle, V.: Model-based clustering of Gaussian copulas for mixed data. ArXiv e-prints, May (2014)
Matechou, E., Liu, I., Pledger, S., Arnold, R.: Biclustering models for ordinal data. In: NZSA 2011 Conference, Auckland, New-Zealand (2011)
McCullagh, P.: Regression models for ordinal data. J. R. Stat. Soc. Ser. B 42, 109–142 (1980)
MathSciNet MATH Google Scholar
McLachlan, G., Peel, D.: Applied Probability and Statistics. Finite Mixture Models. Wiley Series in Probability and Statistics. Wiley-Interscience, New York (2000)
Book MATH Google Scholar
McParland, D., Gormley, C.: Algorithms from and for Nature and Life: Studies in Classification, Data Analysis, and Knowledge Organization, chapter Clustering Ordinal Data via Latent Variable Models. Springer, New York (2013)
Google Scholar
Melnykov, V., Maitra, R.: Finite mixture models and model-based clustering. Stat. Surv. 4, 80–116 (2010)
Article MathSciNet MATH Google Scholar
Nelsen, R.B.: An Introduction to Copulas. Springer, New York (1999)
Book MATH Google Scholar
Podani, J.: Braun-blanquet’s legacy and data analysis in vegetation science. J. Veg. Sci. 17, 113–117 (2006)
Article Google Scholar
Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6(2), 461–464 (1978)
Article MathSciNet MATH Google Scholar
Somers, R.H.: A new assymmetric measure of association for ordinal variables. Am. Sociol. Rev. 27, 799–811 (1962)
Article Google Scholar
Stevens, S.S.: On the theory of scales of measurement. Science 103(2684), 677–680 (1946)
Article MATH Google Scholar
Vermunt, J.K.: A general class of nonparametric models for ordinal categorical data. Sociol. Methodol. 29, 187–223 (1999)
Article Google Scholar
Vermunt, J.K., Magidson, J.: Technical Guide for Latent GOLD 4.0: Basic and Advanced. Statistical Innovations Inc., Belmont (2005)
Google Scholar
Wolfe, J.H.: Pattern clustering by multivariate mixture analysis. Multivar. Behav. Res. 5, 329–350 (1970)
Article Google Scholar
Xu, R., Wunsch, D.C.: Clustering. Wiley, Hoboken (2009)
Google Scholar

Download references

Author information

Authors and Affiliations

Laboratoire Painlevé & Inria, Université Lille 1, 59655, Villeneuve d’Ascq, France
Christophe Biernacki
Laboratoire ERIC, Université Lumière Lyon 2, 69676, Bron, France
Julien Jacques

Authors

Christophe Biernacki
View author publications
You can also search for this author in PubMed Google Scholar
Julien Jacques
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Julien Jacques.

Appendices

Appendix 1: The BOS properties: statements and proofs

From Equations (7), (8) and (10) in Sect. 2.2, it is easily seen that the BOS model has a polynomial expression related to $\pi $ of degree less than or equal to $m-1$:

$$\begin{aligned} \hbox {p}(x ; \mu ,\pi ) = \sum _{j=0}^{m-1} a_j(m,\mu ,x) \pi ^{j}. \end{aligned}$$

(18)

From this statement, we have developed a formal Matlab code computing the exact expression of each coefficient $a_j(m,\mu ,x)$ for any x, $\mu $, and m values. For instance, for $m=5$, $\mu =2$, and $x=4$, we obtain the following exact expression:

$$\begin{aligned} \hbox {p}(4 ; 2,\pi )= & {} a_0(5,2,4) + a_1(5,2,4)\pi + a_2(5,2,4)\pi ^{2}\\&+\, a_3(5,2,4)\pi ^{3} + a_4(5,2,4)\pi ^4\\= & {} \frac{1}{5} - \frac{33}{200}\pi - \frac{457}{7200}\pi ^{2} + \frac{2}{75}\pi ^{3} + \frac{13}{7200}\pi ^4. \end{aligned}$$

This formal calculus tool is used in the proofs of all the following properties. Its advantage is to provide very straightforward proofs. Its drawback is to provide proofs for only some selected values of m. However, we have chosen $m\in \{1,\ldots ,8\}$ since it answers to most practical situations^{Footnote 7}. For instance in the reference book Agresti (2010) all ordinal data have less than 8 categories.

Proposition 5.1

($\pi =0$: Uniformity.) If $\pi =0$, then $\forall (\mu ,x) \in \{1,\ldots ,m\}^2$, $\hbox {p}(x;\mu ,\pi )=m^{-1}$.

Proof

The formal calculus leads to $a_0(m,\mu ,x)=m^{-1}$ for all $m\in \{1,\ldots ,8\}$. Conclusion follows by setting $\pi =0$ in (18).

Proposition 5.2

($\pi =1$: Dirac in $\mu $.) If $\pi =1$, then $\forall (x,\mu ) \in \{1,\ldots ,m\}^2$, $\hbox {p}(x;\mu ,\pi )={\mathbb I}(x=\mu )$.

Proof

The formal calculus leads to $\sum _{j=0}^{m-1}a_j(m,\mu ,\mu )=1$ for all $m\in \{1,\ldots ,8\}$. Conclusion follows by setting $\pi =1$ in (18).

Proposition 5.3

($\mu $: mode if $\pi >0$.) If $\pi >0$, then $\forall (\mu ,x) \in \{1,\ldots ,m\}^2$ such that $\mu \ne x$, $\hbox {p}(\mu ;\mu ,\pi ) > \hbox {p}(x;\mu ,\pi )$.

Proof

For $\pi =1$, Proposition 5.2 already established that $\mu $ is the mode. We consider now $0<\pi <1$. From (18), $\hbox {p}(\mu ;\mu ,\pi ) - \hbox {p}(x;\mu ,\pi )$ is a polynomial of degree less than or equal to $m-1$. Moreover, from Proposition 5.1, $\pi =0$ is a root of $\hbox {p}(\mu ;\mu ,\pi ) - \hbox {p}(x;\mu ,\pi )$, thus we can factorize it by $\pi $

$$\begin{aligned} \hbox {p}(\mu ;\mu ,\pi ) - \hbox {p}(x;\mu ,\pi ) = \pi \left( \sum _{j=0}^{m-2} b_j(m,\mu ,x) \pi ^j \right) .\nonumber \\ \end{aligned}$$

(19)

Since $0<\pi < 1$ implies that $0<\pi ^j< 1$ for any $j>1$, the following lower bound is directly obtained

$$\begin{aligned}&\hbox {p}(\mu ;\mu ,\pi ) - \hbox {p}(x;\mu ,\pi ) \nonumber \\&\quad > \pi \left( b_0(m,\mu ,x) + \sum _{j=1}^{m-2} b_j^-(m,\mu ,x) \right) , \end{aligned}$$

(20)

where $b_j^-(m,\mu ,x)=\min (0,b_j(m,\mu ,x))$. All $b_j(m,\mu ,x)$ are obtained from a formal polynomial Euclidian division for all $m\in \{1,\ldots ,8\}$, allowing to show by formal calculus that $ b_0(m,\mu ,x) + \sum _{j=1}^{m-2} b_j^-(m,\mu ,x) > 0$ for all $m\in \{1,\ldots ,8\}$. Conclusion follows.

Proposition 5.4

($\mu $: absolute growing of its probability with $\pi $. ) $\forall \mu \in \{1,\ldots ,m\}$, $\hbox {p}(\mu ;\mu ,\pi )$ is an increasing function of $\pi $.

Proof

The formal calculus leads to $a_j(m,\mu ,\mu )\ge 0$ for all $m\in \{1,\ldots ,8\}$ and for all $j\in \{1,\ldots ,m-1\}$. Conclusion follows directly from (18).

Proposition 5.5

( $\mu $: relative growing of its probability with $\pi $.) $\forall (\mu ,x) \in \{1,\ldots ,m\}^2$ such that $\mu \ne x$, $\hbox {p}(\mu ;\mu ,\pi )- \hbox {p}(x;\mu ,\pi )$ is an increasing function of $\pi $.

Proof

The key is to prove the differential (in $\pi $) expression $\hbox {p}'(\mu ;\mu ,\pi )- \hbox {p}'(x;\mu ,\pi )>0$. From (18), $\hbox {p}'(\mu ;\mu ,\pi ) - \hbox {p}'(x;\mu ,\pi )$ is a polynomial of degree less than or equal to $m-2$

$$\begin{aligned} \hbox {p}'(\mu ;\mu ,\pi ) - \hbox {p}'(x;\mu ,\pi ) = \sum _{j=0}^{m-2} b_j(m,\mu ,x) \pi ^j. \end{aligned}$$

(21)

Since $0<\pi < 1$ implies that $0<\pi ^j< 1$ for any $j>1$, the following lower bound is directly obtained

$$\begin{aligned} \hbox {p}'(\mu ;\mu ,\pi ) - \hbox {p}'(x;\mu ,\pi ) > b_0(m,\mu ,x) + \sum _{j=1}^{m-2} b_j^-(m,\mu ,x),\nonumber \\ \end{aligned}$$

(22)

where $b_j^-(m,\mu ,x)=\min (0,b_j(m,\mu ,x))$. All $b_j(m,\mu ,x)$ are obtained from a formal derivation of a polynomial for all $m\in \{1,\ldots ,8\}$, allowing to show by formal calculus again that $ b_0(m,\mu ,x) + \sum _{j=1}^{m-2} b_j^-(m,\mu ,x) > 0$ for all $m\in \{1,\ldots ,8\}$. Conclusion follows.

Proposition 5.6

(Decreasing around $\mu $ if $0<\pi <1$.) $\forall (x,x') \in \{1,\ldots ,m\}^2$ such that $x'<x<\mu $ or $\mu >x>x'$, $\hbox {p}(x;\mu ,\pi )> \hbox {p}(x';\mu ,\pi )$.

Proof

From (18), $\hbox {p}(x;\mu ,\pi ) - \hbox {p}(x';\mu ,\pi )$ is a polynomial of degree less than or equal to $m-1$. Moreover, from Propositions 5.1 and 5.2 respectively, $\pi =0$ and $\pi =1$ are roots of $\hbox {p}(x;\mu ,\pi ) - \hbox {p}(x';\mu ,\pi )$, thus we can factorize it by $\pi (1-\pi )$

$$\begin{aligned}&\hbox {p}(x;\mu ,\pi ) - \hbox {p}(x';\mu ,\pi ) \nonumber \\&\quad = \pi (1-\pi )\left( \sum _{j=0}^{m-3} b_j(m,\mu ,x,x') \pi ^j \right) . \end{aligned}$$

(23)

Since $0<\pi < 1$ implies that $0<\pi ^j< 1$ for any $j>1$, the following lower bound is directly obtained

$$\begin{aligned}&\hbox {p}(x;\mu ,\pi ) - \hbox {p}(x';\mu ,\pi ) \nonumber \\&\quad > \pi (1-\pi )\left( b_0(m,\mu ,x,x') + \sum _{j=1}^{m-3} b_j^-(m,\mu ,x,x') \right) ,\nonumber \\ \end{aligned}$$

(24)

where $b_j^-(m,\mu ,x,x')=\min (0,b_j(m,\mu ,x,x'))$. All $b_j$ $(m,\mu ,x,x')$ are obtained from a formal polynomial Euclidian division for all $m\in \{1,\ldots ,8\}$, allowing to show by formal calculus again that $ b_0(m,\mu ,x,x') + \sum _{j=1}^{m-2} b_j^-(m,\mu ,x,$ $x') > 0$ for all $m\in \{1,\ldots ,8\}$. Conclusion follows.

Proposition 5.7

(Identifiability) If $0<\pi \le 1$, identifiability holds.

Proof

The identifiability problem could concern $\mu $ and/or $\pi $.

First, there exists no couple $(\mu ,\mu ')\in \{1,\ldots ,m\}^2$ with $\mu \ne \mu '$ such that $\hbox {p}(x;\mu ,\pi )=\hbox {p}(x;\mu ',\pi ')$ for any $x\in \{1,\ldots ,m\}$ and any $(\pi ,\pi ')\in (0,1]^2$. Indeed, otherwise both $\mu $ and $\mu '$ should be modes of both distributions, which is impossible by unicity of the mode (Propositions 5.3 and 5.6).
Second, there exists no couple $(\pi ,\pi ')\in (0,1]^2$ with $\pi \ne \pi '$ such that $\hbox {p}(x;\mu ,\pi )=\hbox {p}(x;\mu ,\pi ')$ for any $x\in \{1,\ldots ,m\}$ and any $\mu \in \{1,\ldots ,m\}$ since $\hbox {p}(x;\mu ,\pi )$ is a polynomial of order at least one. This last statement comes from the fact that $\hbox {p}(x;\mu ,\pi )$ varies from 1 / m to 0 or 1 when $\pi $ varies (respectively when $x\ne \mu $ and $x=\mu $) from Propositions 5.1 and 5.2, thus depends on the value of $\pi $.

Appendix 2: AERES data

See Table 11.

Table 11 Bachelor’s degrees evaluations (March 2011) of 23 French universities on the ordinal scale A$+$, A, B and C. The four evaluation criteria are described in Sect. 4.2

Full size table

Rights and permissions

Reprints and permissions

About this article

Cite this article

Biernacki, C., Jacques, J. Model-based clustering of multivariate ordinal data relying on a stochastic binary search algorithm. Stat Comput 26, 929–943 (2016). https://doi.org/10.1007/s11222-015-9585-2

Download citation

Received: 26 July 2014
Accepted: 04 June 2015
Published: 14 June 2015
Issue Date: September 2016
DOI: https://doi.org/10.1007/s11222-015-9585-2

Model-based clustering of multivariate ordinal data relying on a stochastic binary search algorithm

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

A Model-Based Approach to Simultaneous Clustering and Dimensional Reduction of Ordinal Data

Mixture models for ordinal data: a pairwise likelihood approach

Clustering Ordinal Data Via Parsimonious Models

Notes

References

Author information

Authors and Affiliations

Corresponding author

Appendices

Appendix 1: The BOS properties: statements and proofs

Proposition 5.1

Proof

Proposition 5.2

Proof

Proposition 5.3

Proof

Proposition 5.4

Proof

Proposition 5.5

Proof

Proposition 5.6

Proof

Proposition 5.7

Proof

Appendix 2: AERES data

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Model-based clustering of multivariate ordinal data relying on a stochastic binary search algorithm

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

A Model-Based Approach to Simultaneous Clustering and Dimensional Reduction of Ordinal Data

Mixture models for ordinal data: a pairwise likelihood approach

Clustering Ordinal Data Via Parsimonious Models

Explore related subjects

Notes

References

Author information

Authors and Affiliations

Corresponding author

Appendices

Appendix 1: The BOS properties: statements and proofs

Proposition 5.1

Proof

Proposition 5.2

Proof

Proposition 5.3

Proof

Proposition 5.4

Proof

Proposition 5.5

Proof

Proposition 5.6

Proof

Proposition 5.7

Proof

Appendix 2: AERES data

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation