Abstract
Using instrumental variable techniques and the partial group smoothly clipped absolute deviation penalty method, we propose a variable selection procedure for a class of partially varying coefficient models with endogenous variables. The proposed variable selection method can eliminate the influence of the endogenous variables. With appropriate selection of the tuning parameters, we establish the oracle property of this variable selection procedure. A simulation study is undertaken to assess the finite sample performance of the proposed variable selection procedure.
References
Cai Z, Xiong H (2012) Partially varying coefficient instrumental variables models. Stat Neerl 66:85–110
Card D (2001) Estimating the return to schooling: progress on some persistent econometric problems. Econometrica 69:1127–1160
Fan JQ, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
Frank IE, Friedman JH (1993) A statistical view of some chemometrics regression tools (with discussion). Technometrics 35:109–148
Greenland S (2000) An introduction to instrumental variables for epidemiologists. Int J Epidemiol 29:722–729
Hernan MA, Robins JM (2006) Instruments for causal inference: an epidemiologist's dream? Epidemiology 17:360–372
Lian H (2012) Variable selection for high-dimensional generalized varying-coefficient models. Stat Sinica 22:1563–1588
Lin ZY, Yuan YZ (2012) Variable selection for generalized varying coefficient partially linear models with diverging number of parameters. Acta Math Appl Sinica Eng Ser 28(2):237–246
Newhouse JP, McClellan M (1998) Econometrics in outcomes research: the use of instrumental variables. Annu Rev Public Health 19:17–24
Schultz TP (1997) Human capital, schooling and health. IUSSP, XXIII, General Population Conference. Yale University
Schumaker LL (1981) Spline functions. Wiley, New York
Tibshirani R (1996) Regression shrinkage and selection via the Lasso. J R Stat Soc Ser B 58:267–288
Wang L, Li H, Huang JZ (2008) Variable selection in nonparametric varying-coefficient models for analysis of repeated measurements. J Am Stat Assoc 103:1556–1569
Wang LC, Lai P, Lian H (2013) Polynomial spline estimation for generalized varying coefficient partially linear models with a diverging number of components. Metrika 76:1083–1103
Yao F (2012) Efficient semiparametric instrumental variable estimation under conditional heteroskedasticity. J Quant Econ 10:32–55
Zhao PX, Li GR (2013) Modified SEE variable selection for varying coefficient instrumental variable models. Stat Methodol 12:60–70
Zhao PX, Xue LG (2009) Variable selection for semiparametric varying coefficient partially linear models. Stat Probab Lett 79:2148–2157
Zhao PX, Xue LG (2013) Empirical likelihood inferences for semiparametric instrumental variable models. J Appl Math Comput 43:75–90
Acknowledgments
This paper is supported by the National Natural Science Foundation of China (11301569), the Higher-education Reform Project of Guangxi (2014JGA209), and the Project of Outstanding Young Teachers Training in Higher Education Institutions of Guangxi.
Appendix: Proof of theorems
For convenience and simplicity, let c denote a positive constant whose value may differ at each appearance throughout this paper. Before proving the main theorems, we list the regularity conditions used in this paper.
- C1. \(\theta (u)\) is r times continuously differentiable on (0, 1), where \(r>1/2\).
- C2. Let \(c_{1},\ldots ,c_{K}\) be the interior knots of [0, 1]. Furthermore, let \(c_{0}=0, c_{K+1}=1\), \(h_{i}=c_{i}-c_{i-1}\) and \(h=\max \{h_{i}\}\). Then, there exists a constant \(C_{0}\) such that
$$\begin{aligned} \frac{h}{\min \{h_{i}\}}\le C_{0},\quad \max \{|h_{i+1}-h_{i}|\}=o\left( K^{-1}\right) . \end{aligned}$$
- C3. The density function of U, say f(u), is bounded away from zero and infinity on [0, 1], and is continuously differentiable on (0, 1).
- C4. Let \(G_{1}(u)=E\{ZZ^{T}|U=u\}, G_{2}(u)=E\{XX^{T}|U=u\}\) and \(\sigma ^{2}(u)=E\{\varepsilon ^{2}|U=u\}\). Then, \(G_{1}(u), G_{2}(u)\) and \(\sigma ^{2}(u)\) are continuous with respect to u. Furthermore, for each given u, \(G_{1}(u)\) and \(G_{2}(u)\) are positive definite matrices, and their eigenvalues are bounded.
- C5. The penalty function \(p_{\lambda }(\cdot )\) satisfies the following:
  - (i) \(\displaystyle \lim _{n\rightarrow \infty }\lambda =0\), and \( \displaystyle \lim _{n\rightarrow \infty }\sqrt{n}\lambda =\infty \).
  - (ii) For any given non-zero \(w\), \(\displaystyle \lim _{n\rightarrow \infty }\sqrt{n}p'_{\lambda }(|w|)=0\), and \( \displaystyle \lim _{n\rightarrow \infty }p''_{\lambda }(|w|)=0\).
  - (iii) \(\displaystyle \lim _{n\rightarrow \infty }\sup _{|w|\le cn^{-1/2}}p''_{\lambda }(|w|)=0\), and \(\displaystyle \lim _{n\rightarrow \infty }\lambda ^{-1}\inf _{|w|\le cn^{-1/2}}p'_{\lambda }(|w|)>0\), where c is a positive constant.
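The first inequality in condition C2 can be checked directly for a given knot sequence. The following is a minimal sketch, assuming distinct interior knots in (0, 1); the helper name `mesh_ratio` is hypothetical and does not come from the paper.

```python
def mesh_ratio(interior_knots):
    """Return (h, h / min h_i) for interior knots c_1 < ... < c_K in (0, 1)."""
    # Add the boundary knots c_0 = 0 and c_{K+1} = 1 from condition C2.
    knots = [0.0] + sorted(interior_knots) + [1.0]
    # Spacings h_i = c_i - c_{i-1}; assumed strictly positive (distinct knots).
    spacings = [right - left for left, right in zip(knots, knots[1:])]
    h = max(spacings)
    return h, h / min(spacings)


# Equally spaced knots give a ratio of 1 (up to rounding), so any C_0 >= 1 works.
K = 4
uniform_knots = [(i + 1) / (K + 1) for i in range(K)]
h_uniform, ratio_uniform = mesh_ratio(uniform_knots)

# An uneven sequence has a larger ratio and therefore needs a larger C_0.
h_uneven, ratio_uneven = mesh_ratio([0.1, 0.2, 0.9])
```

The second part of C2, \(\max \{|h_{i+1}-h_{i}|\}=o(K^{-1})\), is an asymptotic statement about a sequence of partitions and cannot be verified from a single knot vector.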
These conditions are commonly adopted in the nonparametric and variable selection literature. Condition C1 is a smoothness condition on the nonparametric components, which is common in the nonparametric literature. Condition C2 indicates that \(c_{0},\ldots ,c_{K+1}\) is a \(C_{0}\)-quasi-uniform sequence of partitions of [0, 1] (see Schumaker 1981, p. 216); this assumption is also used in Zhao and Li (2013), Zhao and Xue (2009) and Wang et al. (2013). Conditions C3 and C4 are regularity conditions on the covariates, similar to those used in Zhao and Xue (2013), Cai and Xiong (2012) and Wang et al. (2008). Condition C5 contains assumptions on the penalty function. These assumptions are similar to those used in Fan and Li (2001), Wang et al. (2008) and Zhao and Xue (2009), and it is easy to show that the SCAD and Lasso penalty functions satisfy them.
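As an illustration of how condition C5 plays out for the SCAD penalty, the sketch below evaluates the SCAD first derivative given by Fan and Li (2001), \(p'_{\lambda }(w)=\lambda \{I(w\le \lambda )+(a\lambda -w)_{+}/((a-1)\lambda )I(w>\lambda )\}\) with \(a=3.7\). The tuning sequence \(\lambda =n^{-1/4}\) is an assumption chosen only because it satisfies C5(i); it is not the choice made in the paper.

```python
def scad_derivative(w, lam, a=3.7):
    """First derivative p'_lambda(|w|) of the SCAD penalty (Fan and Li 2001)."""
    w = abs(w)
    if w <= lam:
        return lam                            # linear (Lasso-like) region
    return max(a * lam - w, 0.0) / (a - 1.0)  # tapering region, then flat


n = 10 ** 6
lam = n ** -0.25          # assumed tuning sequence: lam -> 0, sqrt(n) * lam -> infinity

# C5(ii): for a fixed non-zero w the derivative vanishes once a * lam < |w|.
fixed_w_term = n ** 0.5 * scad_derivative(0.5, lam)     # 0.0

# C5(iii): for |w| <= c * n^{-1/2} the derivative equals lam, so the ratio is 1.
small_w_ratio = scad_derivative(n ** -0.5, lam) / lam   # 1.0
```

The two values correspond to the first limit in C5(ii) and the second limit in C5(iii) at one finite n; they illustrate the conditions rather than prove them.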
Proof of Theorem 1
Let \(\delta =n^{-r/(2r+1)}+a_{n}, \beta =\beta _{0}+\delta M_{1}\), \(\gamma =\gamma _{0}+\delta M_{2}\) and \(M=(M_{1}^{T}, M_{2}^{T})^{T}\). For part (i), we show that, for any given \(\epsilon >0\), there exists a large constant c such that
Let \(R_{k0}(u)=\theta _{k0}(u)-B(u)^{T}\gamma _{k0}\). Then, noting that \(W_{i}=I_{p}\otimes B(U_{i})\cdot X_{i}\), we have
where \(R_{0}(U_{i})=(R_{10}(U_{i}),\ldots ,R_{p0}(U_{i}))^{T}\). Hence, we have
Then, invoking \(\beta =\beta _{0}+\delta M_{1}\), \(\gamma =\gamma _{0}+\delta M_{2}\) and (9), we can obtain that
By (9) and (10), and based on the formula \(a^{2}-b^{2}=(a+b)(a-b)\), we have that
Let \(\Delta (\gamma ,\beta )=K^{-1}\{\hat{Q}(\gamma ,\beta )-\hat{Q}(\gamma _{0},\beta _{0})\}\), then from (11), we have that
From conditions C1, C2 and Corollary 6.21 in Schumaker (1981), we get that \(\Vert R(u)\Vert =O(K^{-r})\). Then, invoking condition C4, a simple calculation yields
Invoking \(E\{\varepsilon _{i}|\xi _{i},X_{i}\}=0\) and \(\hat{Z}_{i}=\varGamma \xi _{i}+O_{p}(n^{-1/2})\), we can prove that
In addition, note that \(Z_{i}-\hat{Z}_{i}=(\varGamma -\hat{\varGamma })\xi _{i}+e_{i}\), then invoking \(\hat{\varGamma }=\varGamma +O_{p}(n^{-1/2})\) and \(E\{e_{i}|\xi _{i},X_{i}\}=0\), we can prove that
Hence, from (12) to (14), it is easy to show that
Similarly, we can prove that
By condition C5(ii), we have that \(\lim _{n\rightarrow \infty }\sqrt{n}p'_{\lambda }(|w|)=0\) for any given nonzero w. Then, invoking the definition of \(a_{n}\), we obtain \(\sqrt{n}a_{n}\rightarrow 0\) as \(n\rightarrow \infty \). Hence, we obtain that
Hence we have \(I_{2}/I_{1}=O_{p}(1)\Vert M\Vert \). Then, by choosing a sufficiently large \(c\), \(I_{2}\) dominates \(I_{1}\) uniformly in \(\Vert M\Vert =c\). Furthermore, invoking \(p_{\lambda }(0)=0\), and by the standard argument of the Taylor expansion, we get that
Since \(n^{\frac{r}{2r+1}}a_{n}\rightarrow 0\) and \(b_{n}\rightarrow 0\), we obtain that \(I_{3}=o_{p}(1)\Vert M\Vert ^{2}\). Hence, \(I_{3}\) is dominated by \(I_{2}\) uniformly in \(\Vert M\Vert =c\). With the same argument, we can prove that \(I_{4}\) is also dominated by \(I_{2}\) uniformly in \(\Vert M\Vert =c\). In addition, since \(I_{2}\) is positive, (7) holds for a sufficiently large c.
By the continuity of \(\hat{Q}(\cdot ,\cdot )\), the inequality (7) implies that \(\hat{Q}(\cdot ,\cdot )\) has a local minimum on \(\{\Vert M\Vert \le c\}\) with probability greater than \(1-\epsilon \). Hence, there exists a local minimizer \(\hat{\beta }\) such that \(\Vert \hat{\beta }-\beta _{0}\Vert =O_{p}(\delta )\), which completes the proof of part (i).
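The identity \(W_{i}=I_{p}\otimes B(U_{i})\cdot X_{i}\) used at the start of this proof stacks the basis vector scaled by each covariate, so that \(W_{i}^{T}\gamma =\sum _{k}X_{ik}B(U_{i})^{T}\gamma _{k}\). The following is a minimal sketch of that construction, assuming \(B(U_{i})\) is a vector of L basis values and \(X_{i}\) is a p-vector; the function name and the numeric values are hypothetical.

```python
def kron_design(x, b):
    """Return W = (I_p kron b) x as a flat list: the blocks x_1 * b, ..., x_p * b."""
    return [xk * bl for xk in x for bl in b]


x = [2.0, -1.0]                    # hypothetical covariates X_i, p = 2
b = [0.2, 0.5, 0.3]                # hypothetical basis values B(U_i), L = 3
gamma = [[1.0, 2.0, 3.0],          # hypothetical spline coefficients gamma_1
         [0.0, 1.0, -1.0]]         # and gamma_2

w = kron_design(x, b)
flat_gamma = [g for gamma_k in gamma for g in gamma_k]

# Left side: W_i^T gamma with gamma = (gamma_1^T, ..., gamma_p^T)^T.
lhs = sum(wi * gi for wi, gi in zip(w, flat_gamma))
# Right side: sum over k of X_ik * B(U_i)^T gamma_k.
rhs = sum(xk * sum(bl * g for bl, g in zip(b, gamma_k))
          for xk, gamma_k in zip(x, gamma))
```

Both sides agree up to rounding, which is why the varying-coefficient part of the model can be handled as a linear term in \(\gamma \) once the approximation errors \(R_{k0}\) are accounted for.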
Next, we prove part (ii). Note that
With the same arguments as in the proof of part (i), we get that \(\Vert \hat{\gamma }-\gamma _{0}\Vert =O_{p}(n^{-r/(2r+1)}+a_{n})\). Then, a simple calculation yields
In addition, it is easy to show that
Invoking (15) and (16), we complete the proof of part (ii). \(\square \)
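The proof of part (i) relies on \(\hat{\varGamma }=\varGamma +O_{p}(n^{-1/2})\) and on the fitted values \(\hat{Z}_{i}\) from the first-stage regression of \(Z_{i}\) on the instruments \(\xi _{i}\). The simulation below is a small sketch of that first stage, assuming a scalar instrument, a scalar endogenous covariate, standard normal errors and a least-squares fit without intercept; all of these are simplifying assumptions for illustration and do not reproduce the paper's simulation design.

```python
import random


def first_stage(xi, z):
    """Least-squares slope of z on xi without intercept: sum(xi * z) / sum(xi^2)."""
    return sum(a * b for a, b in zip(xi, z)) / sum(a * a for a in xi)


def simulate(n, gamma=1.5, seed=0):
    """Draw (xi_i, Z_i) with Z_i = gamma * xi_i + e_i and return (Gamma_hat, Z_hat)."""
    rng = random.Random(seed)
    xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = [gamma * a + rng.gauss(0.0, 1.0) for a in xi]   # e_i independent of xi_i
    gamma_hat = first_stage(xi, z)
    z_hat = [gamma_hat * a for a in xi]                 # fitted values Z_hat_i
    return gamma_hat, z_hat


gamma_hat, z_hat = simulate(20000)
estimation_error = abs(gamma_hat - 1.5)   # expected to be of order n^{-1/2}
```

Replacing \(Z_{i}\) by \(\hat{Z}_{i}\) removes the component \(e_{i}\), which is the part of the endogenous covariate that may be correlated with \(\varepsilon _{i}\).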
Proof of Theorem 2
We first prove part (i). Invoking the condition \(\lambda \rightarrow 0\), it is easy to show that \(a_{n}=0\) for large n. Then by Theorem 1, it is sufficient to show that, for any given \(\beta _{l}, l=1,\ldots ,s\), which satisfy \(\Vert \beta _{l}-\beta _{l0}\Vert =O_{p}(n^{-r/(2r+1)})\), and a small \(\epsilon \) which satisfies \(\epsilon =cn^{-r/(2r+1)}\), with probability tending to 1, we have
and
Thus, (17) and (18) imply that the minimum is attained at \(\beta _{l}=0, l=s+1,\ldots ,q\).
By an argument similar to the proof of Theorem 1, we have that
Since \(\lim _{n\rightarrow \infty }\liminf _{\beta _{l}\rightarrow 0}\lambda ^{-1}p' _{\lambda }(|\beta _{l}|)>0\) and \(\lambda n^{\frac{r}{2r+1}}\rightarrow \infty \), the sign of the derivative is completely determined by the sign of \(\beta _{l}\), so (17) and (18) hold. This completes the proof of part (i).
Applying techniques similar to those used for part (i) of this theorem, we have, with probability tending to 1, that \(\hat{\gamma }_{k}=0, k=d+1,\ldots ,p\). Then, the result of this theorem follows immediately from \(\hat{\theta }_{k}(u)=B^{T}(u)\hat{\gamma }_{k}\). \(\square \)
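The last step uses the fact that a zero coefficient vector gives an identically zero spline, \(\hat{\theta }_{k}(u)=B^{T}(u)\hat{\gamma }_{k}\). The sketch below evaluates a B-spline basis on [0, 1] with the Cox-de Boor recursion and forms \(B^{T}(u)\gamma \); it assumes that \(B(u)\) is a B-spline basis with clamped boundary knots, and the function names, knots and degree are hypothetical choices for illustration.

```python
def bspline_basis(u, interior_knots, degree=3):
    """Evaluate all B-spline basis functions on [0, 1] at u (Cox-de Boor recursion)."""
    t = [0.0] * (degree + 1) + list(interior_knots) + [1.0] * (degree + 1)
    n_basis = len(t) - degree - 1
    # Degree 0: indicators of the half-open knot intervals [t_j, t_{j+1}).
    vals = [1.0 if t[j] <= u < t[j + 1] else 0.0 for j in range(len(t) - 1)]
    if u == 1.0:
        vals[n_basis - 1] = 1.0        # close the last non-empty interval at u = 1
    for d in range(1, degree + 1):
        new_vals = []
        for j in range(len(t) - 1 - d):
            left = right = 0.0
            if t[j + d] > t[j]:        # skip terms with a zero-length support
                left = (u - t[j]) / (t[j + d] - t[j]) * vals[j]
            if t[j + d + 1] > t[j + 1]:
                right = (t[j + d + 1] - u) / (t[j + d + 1] - t[j + 1]) * vals[j + 1]
            new_vals.append(left + right)
        vals = new_vals
    return vals


def theta_hat(u, interior_knots, gamma, degree=3):
    """Spline value B(u)^T gamma for one coefficient vector gamma."""
    basis = bspline_basis(u, interior_knots, degree)
    return sum(b * g for b, g in zip(basis, gamma))


knots = [0.25, 0.5, 0.75]                          # hypothetical interior knots
basis_at_u = bspline_basis(0.37, knots)            # 7 cubic basis values
zero_function = theta_hat(0.37, knots, [0.0] * 7)  # gamma_k = 0 gives theta_k(u) = 0
```

Because the basis values sum to one at every u in [0, 1], a constant coefficient vector reproduces a constant function, and the zero vector reproduces the zero function.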
Cite this article
Yuan, J., Zhao, P. & Zhang, W. Semiparametric variable selection for partially varying coefficient models with endogenous variables. Comput Stat 31, 693–707 (2016). https://doi.org/10.1007/s00180-015-0601-y
DOI: https://doi.org/10.1007/s00180-015-0601-y