
Semiparametric mixtures of regressions with single-index for model based clustering

  • Regular Article
Advances in Data Analysis and Classification

Abstract

In this article, we propose two classes of semiparametric mixture regression models with single-index for model based clustering. Unlike many semiparametric/nonparametric mixture regression models that can only be applied to low dimensional predictors, the new semiparametric models can easily incorporate high dimensional predictors into the nonparametric components. The proposed models are very general, and many of the recently proposed semiparametric/nonparametric mixture regression models are indeed special cases of the new models. Backfitting estimates and the corresponding modified EM algorithms are proposed to achieve optimal convergence rates for both parametric and nonparametric parts. We establish the identifiability results of the proposed two models and investigate the asymptotic properties of the proposed estimation procedures. Simulation studies are conducted to demonstrate the finite sample performance of the proposed models. Two real data applications using the new models reveal some interesting findings.



Acknowledgements

The authors are grateful to the editor, the guest editor, and two referees for numerous helpful comments during the preparation of the article. Funding was provided by National Natural Science Foundation of China (Grant No. 11601477), Natural Science Foundation (USA) (Grant No. DMS-1461677), Department of Energy (Grant No. 10006272), the First Class Discipline of Zhejiang - A (Zhejiang University of Finance and Economics-Statistics), China (Grant No. NA) and Natural Science Foundation of Zhejiang Province (Grant No. LY19A010006).


Corresponding author

Correspondence to Sijia Xiang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A


Technical conditions

  1. (C1)

    The sample \(\{({{\varvec{x}}}_i,Y_i),i=1,\ldots ,n\}\) is independent and identically distributed from its population \(({{\varvec{x}}},Y)\). The support for \({{\varvec{x}}}\), denoted by \(\mathscr {X}\), is a compact subset of \(\mathbb {R}^3\).

  2. (C2)

    The marginal density of \({\varvec{\alpha }}^\top {{\varvec{x}}}\), denoted by \(f(\cdot )\), is twice continuously differentiable and positive at the point z.

  3. (C3)

    The kernel function \(K(\cdot )\) has a bounded support, and satisfies that

    $$\begin{aligned}&\int K(t)dt=1,\qquad \int tK(t)dt=0,\qquad \int t^2K(t)dt<\infty ,\nonumber \\&\int K^2(t)dt<\infty ,\qquad \int |K^3(t)|dt<\infty . \end{aligned}$$
  4. (C4)

    \(h\rightarrow 0\), \(nh\rightarrow \infty \), and \(nh^5=O(1)\) as \(n\rightarrow \infty \).

  5. (C5)

    The third derivative \(|\partial ^3\ell ({\varvec{\theta }},y)/\partial \theta _i\partial \theta _j\partial \theta _k|\le M(y)\) for all y and all \({\varvec{\theta }}\) in a neighborhood of \({\varvec{\theta }}(z)\), and \(E[M(y)]<\infty \).

  6. (C6)

    The unknown functions \({\varvec{\theta }}(z)\) have continuous second derivatives. For \(j=1,\ldots ,k\), \(\sigma _j^2(z)>0\), and \(\pi _j(z)>0\) for all \({{\varvec{x}}}\in \mathscr {X}\).

  7. (C7)

    For all i and j, the following conditions hold:

    $$\begin{aligned} E\left[ \left| \frac{\partial \ell ({\varvec{\theta }}(z),Y)}{\partial \theta _i}\right| ^3\right]<\infty \qquad E\left[ \left( \frac{\partial ^2\ell ({\varvec{\theta }}(z),Y)}{\partial \theta _i\partial \theta _j}\right) ^2\right] <\infty \end{aligned}$$
  8. (C8)

    \({\varvec{\theta }}_0''(\cdot )\) is continuous at the point z.

  9. (C9)

    The third derivative \(|\partial ^3\ell ({\varvec{\pi }},y)/\partial \pi _i\partial \pi _j\partial \pi _k|\le M(y)\) for all y and all \({\varvec{\pi }}\) in a neighborhood of \({\varvec{\pi }}(z)\), and \(E[M(y)]<\infty \).

  10. (C10)

    The unknown functions \({\varvec{\pi }}(z)\) have continuous second derivatives. For \(j=1,\ldots ,k\), \(\pi _j(z)>0\) for all \({{\varvec{x}}}\in \mathscr {X}\).

  11. (C11)

    For all i and j, the following conditions hold:

    $$\begin{aligned} E\left[ \left| \frac{\partial \ell ({\varvec{\pi }}(z),Y)}{\partial \pi _i}\right| ^3\right]<\infty \qquad E\left[ \left( \frac{\partial ^2\ell ({\varvec{\pi }}(z),Y)}{\partial \pi _i\partial \pi _j}\right) ^2\right] <\infty \end{aligned}$$
  12. (C12)

    \({\varvec{\pi }}''(\cdot )\) is continuous at the point z.
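Conditions (C3) and (C4) can be checked numerically for a concrete choice. The sketch below assumes the Epanechnikov kernel and the bandwidth rule \(h=cn^{-1/5}\); neither choice is prescribed by the conditions above, and the names `epanechnikov`, `integrate`, and `bandwidth_diagnostics` are illustrative.

```python
import numpy as np

def epanechnikov(t):
    """Epanechnikov kernel K(t) = 0.75 * (1 - t^2) on [-1, 1], zero elsewhere."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def integrate(values, grid):
    """Trapezoidal rule on a uniform grid."""
    dx = grid[1] - grid[0]
    return float(dx * (values.sum() - 0.5 * (values[0] + values[-1])))

# The kernel has bounded support [-1, 1], so integrating over that interval suffices.
grid = np.linspace(-1.0, 1.0, 20001)
K = epanechnikov(grid)

kernel_moments = {
    "int K": integrate(K, grid),                 # (C3) requires 1
    "int tK": integrate(grid * K, grid),         # (C3) requires 0
    "int t^2 K": integrate(grid**2 * K, grid),   # (C3) requires finite
    "int K^2": integrate(K**2, grid),            # (C3) requires finite
    "int |K^3|": integrate(np.abs(K**3), grid),  # (C3) requires finite
}

def bandwidth_diagnostics(n, c=1.0):
    """Quantities appearing in (C4) for the assumed rule h = c * n^(-1/5)."""
    h = c * n ** (-0.2)
    return {"h": h, "nh": n * h, "nh5": n * h**5}

for n in (100, 10_000, 1_000_000):
    print(n, bandwidth_diagnostics(n))
```

Under this rule \(h\) shrinks, \(nh\) grows without bound, and \(nh^5\) stays equal to \(c^5\), which is the balance (C4) asks for.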

Proof of Theorem 1

Ichimura (1993) has shown that under conditions (i)–(iv), \({\varvec{\alpha }}\) is identifiable. Further, Huang et al. (2013) showed that with condition (v), the nonparametric functions are identifiable. This completes the proof. \(\square \)

Proof of Theorem 2

Let

$$\begin{aligned}&\hat{\pi }_j^*=\sqrt{nh}\{\hat{\pi }_j-\pi _{j}(z)\},\quad j=1,\ldots ,k-1,\\&\hat{m}_j^*=\sqrt{nh}\{\hat{m}_j-m_{j}(z)\},\quad j=1,\ldots ,k,\\&\hat{\sigma }_j^{2*}=\sqrt{nh}\{\hat{\sigma }_j^2-\sigma _{j}^2(z)\},\quad j=1,\ldots ,k. \end{aligned}$$

Define \(\hat{{\varvec{\pi }}}^*=(\hat{\pi }_1^*,\ldots ,\hat{\pi }_{k-1}^*)^\top \), \(\hat{{{\varvec{m}}}}^*=(\hat{m}_1^*,\ldots ,\hat{m}_k^*)^\top \), \(\hat{{{\varvec{\sigma }}}}^{2*}=(\hat{\sigma }_1^{2*},\ldots ,\hat{\sigma }_k^{2*})^\top \) and denote \(\hat{{\varvec{\theta }}}^*=(\hat{{\varvec{\pi }}}^{*\top },\hat{{{\varvec{m}}}}^{*\top },(\hat{{{\varvec{\sigma }}}}^{2*})^\top )^\top \). Let \(a_n=(nh)^{-1/2}\) and

$$\begin{aligned} \ell ({\varvec{\theta }}(z),\tilde{{\varvec{\alpha }}},{{\varvec{x}}}_i,Y_i)=\log \left\{ \sum _{j=1}^k\pi _j(\tilde{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i)\phi (Y_i|m_j(\tilde{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i),\sigma _j^2(\tilde{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i))\right\} K_h(\tilde{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i-z). \end{aligned}$$

If \((\hat{{\varvec{\pi }}},\hat{{{\varvec{m}}}},\hat{{{\varvec{\sigma }}}}^2)^\top \) maximizes (4), then \(\hat{{\varvec{\theta }}}^*\) maximizes

$$\begin{aligned} \ell _n^*({\varvec{\theta }}^*)=h\sum _{i=1}^n[\ell ({\varvec{\theta }}(z)+a_n{\varvec{\theta }}^*,\tilde{{\varvec{\alpha }}},{{\varvec{x}}}_i,Y_i)-\ell ({\varvec{\theta }}(z),\tilde{{\varvec{\alpha }}},{{\varvec{x}}}_i,Y_i)]K_h(\hat{Z}_i-z) \end{aligned}$$
(22)

with respect to \({\varvec{\theta }}^*\). By a Taylor expansion,

$$\begin{aligned} \ell _n^*({\varvec{\theta }}^*)={{\varvec{W}}}_{1n}^\top {\varvec{\theta }}^*+\frac{1}{2}{\varvec{\theta }}^{*T}{{\varvec{A}}}_{1n}{\varvec{\theta }}^*+o_p(1), \end{aligned}$$
(23)

where

$$\begin{aligned} {{\varvec{W}}}_{1n}=\sqrt{\frac{h}{n}}\sum _{i=1}^n\frac{\partial \ell ({\varvec{\theta }}(z),\tilde{{\varvec{\alpha }}},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\theta }}}K_h(\hat{Z}_i-z), \end{aligned}$$

and

$$\begin{aligned} {{\varvec{A}}}_{1n}=\frac{1}{n}\sum _{i=1}^n\frac{\partial ^2\ell ({\varvec{\theta }}(z),\tilde{{\varvec{\alpha }}},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\theta }}\partial {\varvec{\theta }}^\top }K_h(\hat{Z}_i-z). \end{aligned}$$

By WLLN, it can be shown that \({{\varvec{A}}}_{1n}=-f(z)\mathscr {I}^{(1)}_\theta (z)+o_p(1)\). Therefore,

$$\begin{aligned} \ell _n^*({\varvec{\theta }}^*)={{\varvec{W}}}_{1n}^\top {\varvec{\theta }}^*-\frac{1}{2}f(z){\varvec{\theta }}^{*T}\mathscr {I}^{(1)}_\theta (z){\varvec{\theta }}^*+o_p(1). \end{aligned}$$
(24)

Using the quadratic approximation lemma (see, for example, Fan and Gijbels 1996), we have that

$$\begin{aligned} \hat{{\varvec{\theta }}}^*=f(z)^{-1}\mathscr {I}^{(1)}_\theta (z)^{-1}{{\varvec{W}}}_{1n}+o_p(1). \end{aligned}$$
(25)
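As a brief sketch of the step behind (25), and assuming that \(\mathscr {I}^{(1)}_\theta (z)\) is symmetric positive definite (an assumption not restated in this appendix), drop the \(o_p(1)\) term in (24) and set the gradient of the remaining quadratic to zero:

$$\begin{aligned} \frac{\partial }{\partial {\varvec{\theta }}^*}\left\{ {{\varvec{W}}}_{1n}^\top {\varvec{\theta }}^*-\frac{1}{2}f(z){\varvec{\theta }}^{*\top }\mathscr {I}^{(1)}_\theta (z){\varvec{\theta }}^*\right\} ={{\varvec{W}}}_{1n}-f(z)\mathscr {I}^{(1)}_\theta (z){\varvec{\theta }}^*={{\varvec{0}}}\quad \Longrightarrow \quad {\varvec{\theta }}^*=f(z)^{-1}\mathscr {I}^{(1)}_\theta (z)^{-1}{{\varvec{W}}}_{1n}. \end{aligned}$$

Since \(f(z)>0\) by (C2), the quadratic is strictly concave under this assumption, so the stationary point is its unique maximizer. The quadratic approximation lemma then carries the result over to the maximizer of \(\ell _n^*\), up to an \(o_p(1)\) term.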

Note that

$$\begin{aligned} {{\varvec{W}}}_{1n}&=\sqrt{\frac{h}{n}}\sum _{i=1}^n\frac{\partial \ell ({\varvec{\theta }}(z),{\varvec{\alpha }},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\theta }}}K_h(Z_i-z)+D_{1n}+O_p\left( \sqrt{\frac{h}{n}}\Vert \tilde{{\varvec{\alpha }}}-{\varvec{\alpha }}\Vert ^2\right) \end{aligned}$$

where

$$\begin{aligned} D_{1n}&=\sqrt{\frac{h}{n}}\sum _{i=1}^n\left\{ \frac{\partial ^2\ell ({\varvec{\theta }}(z),{\varvec{\alpha }},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\theta }}\partial {\varvec{\theta }}^\top }[{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)]^\top K_h(Z_i-z)\right\} (\tilde{{\varvec{\alpha }}}-{\varvec{\alpha }}). \end{aligned}$$

Since \(\sqrt{n}(\tilde{{\varvec{\alpha }}}-{\varvec{\alpha }})=O_p(1)\), it can be shown that

$$\begin{aligned} D_{1n}=-\sqrt{h}f(z)E\left[ \frac{\partial ^2\ell ({\varvec{\theta }}(z),{\varvec{\alpha }},{{\varvec{x}}},Y)}{\partial {\varvec{\theta }}\partial {\varvec{\theta }}^\top }[{{\varvec{x}}}{\varvec{\theta }}'(Z)]^\top \right] =o_p(1), \end{aligned}$$

and

$$\begin{aligned} O_p\left( \sqrt{\frac{h}{n}}\Vert \tilde{{\varvec{\alpha }}}-{\varvec{\alpha }}\Vert ^2\right) =o_p(1). \end{aligned}$$

Therefore,

$$\begin{aligned} {{\varvec{W}}}_{1n}&=\sqrt{\frac{h}{n}}\sum _{i=1}^n\frac{\partial \ell ({\varvec{\theta }},{\varvec{\alpha }},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\theta }}}K_h(Z_i-z)+o_p(1). \end{aligned}$$

To complete the proof, we now calculate the mean and variance of \({{\varvec{W}}}_{1n}\). Note that

$$\begin{aligned} E({{\varvec{W}}}_{1n})&=\sqrt{nh}E\left[ E\left[ \frac{\partial \ell ({\varvec{\theta }},{\varvec{\alpha }},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\theta }}}K_h(Z_i-z)|Z=z_0\right] \right] \nonumber \\&=\sqrt{nh}\left[ \frac{1}{2}f(z)\varLambda _1^{''}(z|z)+f'(z)\varLambda _1^{'}(z|z)\right] \kappa _2h^2. \end{aligned}$$
(26)

Similarly, we can show that

$$\begin{aligned} \text {Cov}({{\varvec{W}}}_{1n})=f(z)\mathscr {I}^{(1)}_\theta (z)\nu _0+o_p(1), \end{aligned}$$

where \(\kappa _l=\int t^lK(t)dt\) and \(\nu _l=\int t^lK^2(t)dt\). The rest of the proof follows a standard argument. \(\square \)
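The standard argument is not spelled out here. One plausible reading, offered as a sketch rather than as the authors' derivation, combines (25) with the two moments above, again assuming \(\mathscr {I}^{(1)}_\theta (z)\) is symmetric and invertible:

$$\begin{aligned} E(\hat{{\varvec{\theta }}}^*)&\approx f(z)^{-1}\mathscr {I}^{(1)}_\theta (z)^{-1}E({{\varvec{W}}}_{1n})=\sqrt{nh^5}\,\kappa _2\,\mathscr {I}^{(1)}_\theta (z)^{-1}\left[ \frac{1}{2}\varLambda _1^{''}(z|z)+f(z)^{-1}f'(z)\varLambda _1^{'}(z|z)\right] ,\\ \text {Cov}(\hat{{\varvec{\theta }}}^*)&\approx f(z)^{-1}\mathscr {I}^{(1)}_\theta (z)^{-1}\left\{ f(z)\mathscr {I}^{(1)}_\theta (z)\nu _0\right\} \mathscr {I}^{(1)}_\theta (z)^{-1}f(z)^{-1}=\nu _0f(z)^{-1}\mathscr {I}^{(1)}_\theta (z)^{-1}. \end{aligned}$$

Under (C4), \(nh^5=O(1)\), so the centering term stays bounded. Because \(\hat{{\varvec{\theta }}}-{\varvec{\theta }}(z)=a_n\hat{{\varvec{\theta }}}^*\) with \(a_n=(nh)^{-1/2}\), this reading gives a bias of order \(h^2\) and a standard deviation of order \((nh)^{-1/2}\) for \(\hat{{\varvec{\theta }}}\).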

Proof of Theorem 3

Denote \(Z={\varvec{\alpha }}^\top {{\varvec{x}}}\) and \(\hat{Z}=\hat{{\varvec{\alpha }}}^\top {{\varvec{x}}}\). Let \(\ell ({\varvec{\theta }}(z),X,Y)=\log \sum _{j=1}^k\pi _j(z)\phi (Y|m_j(z),\sigma _j^2(z))\). If \(\hat{{\varvec{\theta }}}(z_0;\hat{{\varvec{\alpha }}})\) maximizes (4), then it solves

$$\begin{aligned} {{\varvec{0}}}=n^{-1}\sum _{i=1}^n\frac{\partial \ell (\hat{{\varvec{\theta }}}(z_0;\hat{{\varvec{\alpha }}}),X_i,Y_i)}{\partial {\varvec{\theta }}}K_h(\hat{Z}_i-z_0). \end{aligned}$$
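The display above is the first-order condition of a kernel-weighted mixture log-likelihood. One common way to locate such a root is an EM iteration whose M-step carries the kernel weights. The sketch below is a minimal local-constant version at a single point \(z_0\). It is not the authors' modified EM algorithm: it treats the index direction as known, assumes an Epanechnikov kernel and two components, and runs on made-up toy data. The names `kernel_h`, `normal_pdf`, and `local_em` are illustrative.

```python
import numpy as np

def kernel_h(u, h):
    """Scaled Epanechnikov kernel K_h(u) = K(u / h) / h (kernel choice is an assumption)."""
    t = u / h
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0) / h

def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y, broadcasting over arrays."""
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def local_em(z0, Z, Y, h, m_init, n_iter=200):
    """Local-constant estimates of (pi_j, m_j, sigma_j^2) at z0 via kernel-weighted EM."""
    w = kernel_h(Z - z0, h)
    inside = w > 0                       # only points inside the kernel window contribute
    w, y = w[inside], Y[inside][:, None]
    m = np.asarray(m_init, dtype=float).copy()
    k = m.size
    pi = np.full(k, 1.0 / k)
    var = np.full(k, y.var())
    for _ in range(n_iter):
        # E-step: posterior component probabilities under the current local parameters.
        comp = pi * normal_pdf(y, m, var)
        resp = comp / comp.sum(axis=1, keepdims=True)
        # M-step: closed-form updates, each observation weighted by K_h(Z_i - z0).
        wr = w[:, None] * resp
        tot = wr.sum(axis=0)
        pi = tot / tot.sum()
        m = (wr * y).sum(axis=0) / tot
        var = (wr * (y - m) ** 2).sum(axis=0) / tot
    return pi, m, var

# Toy data (hypothetical): x in [-1, 1]^3, a known unit index direction, two components.
rng = np.random.default_rng(0)
n = 4000
X = rng.uniform(-1.0, 1.0, size=(n, 3))
alpha = np.ones(3) / np.sqrt(3.0)
Z = X @ alpha
first = rng.uniform(size=n) < 0.4        # mixing proportion 0.4 for the first component
Y = np.where(first, 3.0 + Z, -3.0 + Z**2) + 0.5 * rng.standard_normal(n)

pi_hat, m_hat, var_hat = local_em(0.0, Z, Y, h=0.2, m_init=(2.0, -2.0))
print(pi_hat, m_hat, var_hat)
```

In a full backfitting scheme this local step would alternate with an update of the index direction; here the direction is held at its true value so that only the nonparametric step is illustrated.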

Applying a Taylor expansion and using the conditions on h, we obtain

$$\begin{aligned} {{\varvec{0}}}&=n^{-1}\sum _{i=1}^nq_{1i}(Z_i)K_h(Z_i-z_0)\\&\quad + n^{-1}\sum _{i=1}^n\left[ q_{2i}(Z_i)K_h(Z_i-z_0)\right] (\hat{{\varvec{\theta }}}(z_0;\hat{{\varvec{\alpha }}})-{\varvec{\theta }}(z_0))\\&\quad +n^{-1}\sum _{i=1}^nq_{2i}(Z_i)[{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)]^\top K_h(Z_i-z_0)(\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})+o_p(n^{-1/2})+O_p(h^2). \end{aligned}$$

By an argument similar to that in the previous proof,

$$\begin{aligned} \hat{{\varvec{\theta }}}(z_0;\hat{{\varvec{\alpha }}})-{\varvec{\theta }}(z_0)&=n^{-1}f^{-1}(z_0)\mathscr {I}^{(1)-1}_\theta (z_0)\sum _{i=1}^nq_{1i}(Z_i)K_h(Z_i-z_0)\nonumber \\&\quad -\mathscr {I}^{(1)-1}_\theta (z_0)E\{q_2(Z)[{{\varvec{x}}}{\varvec{\theta }}'(Z)]^\top |Z=z_0\}(\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})+o_p(n^{-1/2}). \end{aligned}$$
(27)

Note that

$$\begin{aligned} \hat{{\varvec{\theta }}}&(\hat{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}})-{\varvec{\theta }}({\varvec{\alpha }}^\top {{\varvec{x}}}_i)=\hat{{\varvec{\theta }}}(\hat{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}}) -\hat{{\varvec{\theta }}}({\varvec{\alpha }}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}})+\hat{{\varvec{\theta }}}({\varvec{\alpha }}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}})-{\varvec{\theta }}({\varvec{\alpha }}^\top {{\varvec{x}}}_i)\nonumber \\&=(\hat{{\varvec{\theta }}}'({\varvec{\alpha }}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}}))^\top (\hat{{\varvec{\alpha }}}^\top -{\varvec{\alpha }}^\top ){{\varvec{x}}}_i+\hat{{\varvec{\theta }}}({\varvec{\alpha }}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}})-{\varvec{\theta }}({\varvec{\alpha }}^\top {{\varvec{x}}}_i)+o_p(n^{-1/2})\nonumber \\&=({\varvec{\theta }}'({\varvec{\alpha }}^\top {{\varvec{x}}}_i))^\top (\hat{{\varvec{\alpha }}}^\top -{\varvec{\alpha }}^\top ){{\varvec{x}}}_i+\hat{{\varvec{\theta }}}({\varvec{\alpha }}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}})-{\varvec{\theta }}({\varvec{\alpha }}^\top {{\varvec{x}}}_i)+o_p(n^{-1/2}), \end{aligned}$$
(28)

where the second part is handled by (27).

Since \(\hat{{\varvec{\alpha }}}\) maximizes (9), it is the solution to

$$\begin{aligned} {{\varvec{0}}}=\lambda \hat{{\varvec{\alpha }}}+n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i\hat{{\varvec{\theta }}}'(\hat{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}}) \frac{\partial \ell (\hat{{\varvec{\theta }}}(\hat{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i;\hat{{\varvec{\alpha }}}),X_i,Y_i)}{\partial {\varvec{\theta }}}, \end{aligned}$$

where \(\lambda \) is the Lagrange multiplier. By the Taylor expansion and using (28), we have that

$$\begin{aligned} {{\varvec{0}}}&=\lambda \hat{{\varvec{\alpha }}}+n^{-1/2}\sum _{i=1}^n {{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{1i}(Z_i)\\&\quad +n^{-1/2}\sum _{i=1}^n {{\varvec{x}}}_i{\varvec{\theta }}'(Z_i) q_{2i}(Z_i) [\hat{{\varvec{\theta }}}(\hat{{\varvec{\alpha }}}^\top {{\varvec{x}}}_i)-{\varvec{\theta }}({\varvec{\alpha }}^\top {{\varvec{x}}}_i)]+o_p(1)\\&=\lambda \hat{{\varvec{\alpha }}}+n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{1i}(Z_i)\\&\quad +n^{-1/2}\sum _{i=1}^n {{\varvec{x}}}_i{\varvec{\theta }}'(Z_i) q_{2i}(Z_i)({{\varvec{x}}}_i{\varvec{\theta }}'(Z_i))^\top (\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})\\&\quad +n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{2i}(Z_i) [\hat{{\varvec{\theta }}}(Z_i)-{\varvec{\theta }}(Z_i)]+o_p(1). \end{aligned}$$

Define

$$\begin{aligned} A_\alpha =E\{[{{\varvec{x}}}{\varvec{\theta }}'(Z)]q_2(Z)[{{\varvec{x}}}{\varvec{\theta }}'(Z)]^\top \}, \end{aligned}$$

and apply (27),

$$\begin{aligned} {{\varvec{0}}}&=\lambda \hat{{\varvec{\alpha }}}+n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i) q_{1i}(Z_i)+n^{1/2}A_\alpha (\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})\nonumber \\&\quad -n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{2i}(Z_i)\mathscr {I}^{(1)-1}_\theta (Z_i)E\{q_2(Z)[{{\varvec{x}}}{\varvec{\theta }}'(Z)]^\top |Z=Z_i\}(\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})\nonumber \\&\quad +n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{2i}(Z_i) n^{-1}f^{-1}(Z_i)\mathscr {I}^{(1)-1}_\theta (Z_i)\nonumber \\&\quad \times \sum _{t=1}^nq_{1t}(Z_t)K_h(Z_t-Z_i)+o_p(1)\nonumber \\&=\lambda \hat{{\varvec{\alpha }}}+n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{1i}(Z_i) +{{\varvec{Q}}}_1 n^{1/2}(\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})\nonumber \\&\quad +n^{-1/2}\sum _{i=1}^n{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)q_{2i}(Z_i) n^{-1}f^{-1}(Z_i)\mathscr {I}^{(1)-1}_\theta (Z_i)\nonumber \\&\quad \times \sum _{t=1}^nq_{1t}(Z_t)K_h(Z_t-Z_i)+o_p(1). \end{aligned}$$
(29)

Interchanging the summations in the last term, we get

$$\begin{aligned}&n^{-1/2}\sum _{i=1}^n\left[ n^{-1}\sum _{t=1}^n{{\varvec{x}}}_t{\varvec{\theta }}'(Z_t)q_{2t}(Z_t) K_h(Z_t-Z_i)f^{-1}(Z_t)\mathscr {I}^{(1)-1}_\theta (Z_t)q_{1i}(Z_i)\right] \nonumber \\&\quad =n^{-1/2}\sum _{i=1}^nE[{{\varvec{x}}}{\varvec{\theta }}'(Z)q_2(Z)|Z_i]\mathscr {I}^{(1)-1}_\theta (Z_i)q_{1i}(Z_i)+o_p(1). \end{aligned}$$
(30)
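The interchange of summations used for (30) relies on the kernel being symmetric, so that \(K_h(Z_t-Z_i)=K_h(Z_i-Z_t)\). The check below is a minimal numerical sketch with scalar placeholders: `a` stands in for the factor multiplying the kernel on the \({{\varvec{x}}}{\varvec{\theta }}'q_2f^{-1}\mathscr {I}^{-1}\) side and `b` for \(q_1\). These placeholders, the Epanechnikov kernel, and the simulated values are assumptions made only for illustration. In the matrix-valued case the order of the factors must also be preserved.

```python
import numpy as np

def kernel_h(u, h):
    """Scaled Epanechnikov kernel K_h(u) = K(u / h) / h, symmetric in u."""
    t = u / h
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0) / h

rng = np.random.default_rng(1)
n, h = 300, 0.3
Z = rng.uniform(-1.0, 1.0, n)
a = rng.standard_normal(n)   # scalar placeholder for the factor indexed with the outer sum in (29)
b = rng.standard_normal(n)   # scalar placeholder for q_1

# Kmat[t, i] = K_h(Z_t - Z_i)
Kmat = kernel_h(Z[:, None] - Z[None, :], h)

# Last term of (29): sum_i a_i * n^{-1} * sum_t b_t K_h(Z_t - Z_i)
form_29 = a @ (Kmat.T @ b) / n

# Left side of (30): sum_i [ n^{-1} * sum_t a_t K_h(Z_t - Z_i) ] * b_i
form_30 = b @ (Kmat.T @ a) / n

print(form_29, form_30)
```

The two forms agree because `Kmat` equals its transpose. The remaining step in (30), replacing the inner kernel average by a conditional expectation, is a smoothing approximation and is not checked here.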

Let \(\varGamma _\alpha =I-{\varvec{\alpha }}{\varvec{\alpha }}^\top +o_p(1)\). Combining (29) and (30) and multiplying by \(\varGamma _\alpha \), we have

$$\begin{aligned}&\varGamma _\alpha {{\varvec{Q}}}_1 n^{1/2}(\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})=n^{-1/2}\sum _{i=1}^n\varGamma _\alpha \{{{\varvec{x}}}_i{\varvec{\theta }}'(Z_i)\nonumber \\&\quad +E[{{\varvec{x}}}{\varvec{\theta }}'(Z)q_2(Z)|Z_i]\mathscr {I}^{(1)-1}_\theta (Z_i)\}q_{1i}(Z_i)+o_p(1). \end{aligned}$$
(31)

It can be shown that the right-hand side of (31) has the covariance matrix \(\varGamma _\alpha {{\varvec{Q}}}_1\varGamma _\alpha \), which completes the proof. \(\square \)
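The role of \(\varGamma _\alpha \) in the step leading to (31) can be illustrated numerically. The leading part \(I-{\varvec{\alpha }}{\varvec{\alpha }}^\top \) is the orthogonal projection onto the complement of \({\varvec{\alpha }}\), so it annihilates the Lagrange-multiplier direction. The unit-norm constraint also forces \(\hat{{\varvec{\alpha }}}-{\varvec{\alpha }}\) to be orthogonal to \({\varvec{\alpha }}\) up to a second-order term. The sketch below uses an arbitrary unit vector and a small made-up perturbation; `project_out` is an illustrative name.

```python
import numpy as np

def project_out(alpha):
    """Return I - alpha alpha^T for the normalized direction alpha."""
    alpha = np.asarray(alpha, dtype=float)
    alpha = alpha / np.linalg.norm(alpha)
    return np.eye(alpha.size) - np.outer(alpha, alpha)

alpha = np.array([2.0, -1.0, 2.0]) / 3.0        # a unit vector in R^3
Gamma = project_out(alpha)

killed = Gamma @ alpha                           # the lambda * alpha term is removed
idempotent_gap = np.abs(Gamma @ Gamma - Gamma).max()
rank = np.linalg.matrix_rank(Gamma)              # one dimension is lost

# A nearby unit vector, standing in for an estimate of alpha.
alpha_hat = alpha + 1e-3 * np.array([1.0, 0.0, 0.0])
alpha_hat = alpha_hat / np.linalg.norm(alpha_hat)
delta = alpha_hat - alpha

# Both vectors have unit norm, hence alpha^T delta = -||delta||^2 / 2 exactly.
second_order_gap = alpha @ delta + 0.5 * (delta @ delta)

print(killed, idempotent_gap, rank, second_order_gap)
```

Because \(\varGamma _\alpha \) has rank one less than its dimension, a limiting covariance of the form \(\varGamma _\alpha {{\varvec{Q}}}_1\varGamma _\alpha \) is singular along \({\varvec{\alpha }}\), as expected for a direction constrained to the unit sphere.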

Proof of Theorem 4

Ichimura (1993) has shown that under conditions (i)–(iv), \({\varvec{\alpha }}\) is identifiable. Furthermore, Huang and Yao (2012) showed that with condition (v), \(({\varvec{\pi }}(\cdot ),{\varvec{\beta }},{{\varvec{\sigma }}}^2)\) are identifiable. This completes the proof. \(\square \)

Proof of Theorem 5

This proof is similar to the proof of Theorem 2.

Let \(\hat{\pi }_j^*=\sqrt{nh}\{\hat{\pi }_j-\pi _j(z)\}\), \(j=1,\ldots ,k-1\), and \(\hat{{\varvec{\pi }}}^*=(\hat{\pi }_1^*,\ldots ,\hat{\pi }_{k-1}^*)^\top \). It can be shown that

$$\begin{aligned} \hat{{\varvec{\pi }}}^*=f(z)^{-1}\mathscr {I}_\pi ^{(2)-1}(z){{\varvec{W}}}_{2n}+o_p(1), \end{aligned}$$

where

$$\begin{aligned} {{\varvec{W}}}_{2n}=\sqrt{\frac{h}{n}}\sum _{i=1}^n\frac{\partial \ell ({\varvec{\pi }}(z),\hat{{{\varvec{\lambda }}}},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\pi }}}K_h(\hat{Z}_i-z). \end{aligned}$$

To complete the proof, notice that

$$\begin{aligned} E({{\varvec{W}}}_{2n})=\,&\sqrt{nh}E\left\{ E[\frac{\partial \ell ({\varvec{\pi }},{{\varvec{\lambda }}},{{\varvec{x}}}_i,Y_i)}{\partial {\varvec{\pi }}}K_h(Z_i-z)|Z=z_0]\right\} \\ =\,&\sqrt{nh}[\frac{1}{2}f(z)\varLambda _2''(z|z)+f'(z)\varLambda _2'(z|z)]\kappa _2h^2, \end{aligned}$$

and Cov\(({{\varvec{W}}}_{2n})=f(z)\mathscr {I}^{(2)}_\pi (z)\nu _0+o_p(1)\). The rest of the proof follows a standard argument. \(\square \)

Proof of Theorem 6

The proof is similar to the proof of Theorem 3. It can be shown that

$$\begin{aligned} \hat{{\varvec{\pi }}}(z_0;\hat{{{\varvec{\lambda }}}})-{\varvec{\pi }}(z_0)&=n^{-1}f^{-1}(z_0)\mathscr {I}^{(2)-1}_\pi (z_0)\sum _{i=1}^nq_{\pi i}(Z_i)K_h(Z_i-z_0)\\&\quad -\mathscr {I}^{(2)-1}_\pi (z_0) E\{q_{\pi \pi }(Z)[{{\varvec{x}}}{\varvec{\pi }}'(Z)]^\top |Z=z_0\}(\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})\\&\quad -\mathscr {I}^{(2)-1}_\pi (z_0) E\{q_{\pi \eta }(Z)|Z=z_0\}(\hat{{{\varvec{\eta }}}}-{{\varvec{\eta }}})+o_p(n^{-1/2}), \end{aligned}$$

and therefore,

$$\begin{aligned}&\hat{{\varvec{\pi }}}(\hat{Z}_i;\hat{{{\varvec{\lambda }}}})-{\varvec{\pi }}(Z_i)=\{{{\varvec{x}}}_i{\varvec{\pi }}'(Z_i)\}^\top (\hat{{\varvec{\alpha }}}-{\varvec{\alpha }})+\hat{{\varvec{\pi }}}(Z_i;\hat{{{\varvec{\lambda }}}})\nonumber \\&\quad -{\varvec{\pi }}(Z_i)+o_p(n^{-\frac{1}{2}}). \end{aligned}$$
(32)

Since \(\hat{{{\varvec{\lambda }}}}\) maximizes (14), it is the solution to

$$\begin{aligned} {{\varvec{0}}}=\gamma \begin{pmatrix}\hat{{\varvec{\alpha }}}\\ {{\varvec{0}}}\end{pmatrix}+n^{-\frac{1}{2}}\sum _{i=1}^n\begin{pmatrix}{{\varvec{x}}}_i\hat{{\varvec{\pi }}}'(\hat{Z}_i;\hat{{{\varvec{\lambda }}}})\\ \mathbf{I} \end{pmatrix}q_\pi (\hat{{\varvec{\pi }}}(\hat{Z}_i;\hat{{{\varvec{\lambda }}}}),\hat{{{\varvec{\lambda }}}}), \end{aligned}$$

where \(\gamma \) is the Lagrange multiplier. By a Taylor expansion and (32),

$$\begin{aligned} {{\varvec{0}}}=\,&\gamma \begin{pmatrix}\hat{{\varvec{\alpha }}}\\ {{\varvec{0}}}\end{pmatrix}+n^{-\frac{1}{2}}\sum _{i=1}^n{{\varvec{\varLambda }}}_{1i}q_{\pi i}(Z_i)+n^{\frac{1}{2}}{{\varvec{Q}}}_2\begin{pmatrix}\hat{{\varvec{\alpha }}}-{\varvec{\alpha }}\\ \hat{{{\varvec{\eta }}}}-{{\varvec{\eta }}}\end{pmatrix}\nonumber \\&+n^{-\frac{1}{2}}\sum _{i=1}^n{{\varvec{\varLambda }}}_{1i}q_{\pi \pi i}(Z_i)n^{-1}f^{-1}(Z_i)\mathscr {I}^{(2)-1}_\pi (Z_i)\nonumber \\&\times \sum _{j=1}^nq_{\pi j}(Z_j)K_h(Z_j-Z_i)+o_p(1)\nonumber \\ =\,&\gamma \begin{pmatrix}\hat{{\varvec{\alpha }}}\\ {{\varvec{0}}}\end{pmatrix}+n^{-\frac{1}{2}}\sum _{i=1}^n{{\varvec{\varLambda }}}_{1i}q_{\pi i}(Z_i)+n^{\frac{1}{2}}{{\varvec{Q}}}_2\begin{pmatrix}\hat{{\varvec{\alpha }}}-{\varvec{\alpha }}\\ \hat{{{\varvec{\eta }}}}-{{\varvec{\eta }}}\end{pmatrix}\nonumber \\&+n^{-\frac{1}{2}}\sum _{i=1}^nE[{{\varvec{\varLambda }}}_{1i}q_{\pi \pi }(Z_i)]\mathscr {I}^{(2)-1}_\pi (Z_i)q_{\pi i}(Z_i)+o_p(1), \end{aligned}$$
(33)

where \({{\varvec{\varLambda }}}_{1i}=\begin{pmatrix}{{\varvec{x}}}_i{\varvec{\pi }}'(Z_i)\\ \mathbf{I} \end{pmatrix}\), and the last equation is the result of interchanging the summations. Let \(\varGamma _\alpha =\begin{pmatrix}{{\varvec{I}}}-{\varvec{\alpha }}{\varvec{\alpha }}^\top &{}\mathbf 0 \\ \mathbf 0 &{}{{\varvec{I}}}\end{pmatrix}+o_p(1)\). Multiplying (33) by \(\varGamma _\alpha \), we have

$$\begin{aligned}&n^{\frac{1}{2}}\varGamma _\alpha {{\varvec{Q}}}_2\begin{pmatrix}\hat{{\varvec{\alpha }}}-{\varvec{\alpha }}\\ \hat{{{\varvec{\eta }}}}-{{\varvec{\eta }}}\end{pmatrix}=n^{-\frac{1}{2}}\sum _{i=1}^n\varGamma _\alpha \left\{ {{\varvec{\varLambda }}}_{1i}-\mathscr {I}_\pi ^{(2)-1}(Z_i)E[{{\varvec{\varLambda }}}_{1i}(Z_i)q_{\pi \pi }(Z_i)|Z_i]\right\} \nonumber \\&\quad \times \, q_{\pi i}(Z_i)+o_p(1). \end{aligned}$$
(34)

It can be shown that the right-hand side of (34) has the covariance matrix \(\varGamma _\alpha {{\varvec{Q}}}_2\varGamma _\alpha \), which completes the proof. \(\square \)


Cite this article

Xiang, S., Yao, W. Semiparametric mixtures of regressions with single-index for model based clustering. Adv Data Anal Classif 14, 261–292 (2020). https://doi.org/10.1007/s11634-020-00392-w


  • DOI: https://doi.org/10.1007/s11634-020-00392-w
