Usage of the GO estimator in high dimensional linear models

351 Accesses
6 Citations
Explore all metrics

Abstract

This paper discusses simultaneous parameter estimation and variable selection and presents a new penalized regression method. The method is based on the idea that the coefficient estimates are shrunken towards a predetermined coefficient vector which represents the prior information. This method can result in smaller length estimates of the coefficients depending on the prior information compared to elastic net. In addition to the establishment of the grouping property, we also show that the new method has the grouping effect when the predictors are highly correlated. Simulation studies and real data example show that the prediction performance of the new method is improved over the well-known ridge, lasso and elastic net regression methods yielding a lower mean squared error and competes about the variable selection under sparse and non-sparse situations.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Shrinkage and Sparse Estimation for High-Dimensional Linear Models

A new double-regularized regression using Liu and lasso regularization

Article 18 June 2021

Variable selection and estimation using a continuous approximation to the $L_0$ penalty

Article 19 October 2016

Notes

This method is originally named as naive elastic net by Zou and Hastie (2005). The authors use a scaled version of the method and called it as elastic net. But we follow the same line with Friedman et al. (2010) who drop this distinction.

References

Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122
Article Google Scholar
Bühlmann P, Kalisch M, Meier L (2014) High-dimensional statistics with a view toward applications in biology. Ann Rev Stat Appl 1(1):255–278
Article Google Scholar
Donoho DL, Johnstone JM (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika 81(3):425–455
Article MathSciNet Google Scholar
Efron B, Hastie T, Johnstone I, Tibshirani R (2004) Least angle regression. Ann Stat 32(2):407–499
Article MathSciNet Google Scholar
Friedman J, Hastie T, Höfling H, Tibshirani R (2007) Pathwise coordinate optimization. Ann Appl Stat 1(2):302–332
Article MathSciNet Google Scholar
Friedman J, Hastie T, Tibshirani R (2010) Regularization paths for generalized linear models via coordinate descent. J Stat Softw 33(1):1
Article Google Scholar
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference and prediction, 2nd edn. Springer, Berlin
Book Google Scholar
Hastie T, Tibshirani R, Wainwright M (2015) Statistical learning with sparsity: the lasso and generalizations. CRC Press, Boca Raton
Book Google Scholar
Hoerl AE, Kennard RW (1970) Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 12(1):55–67
Article Google Scholar
Javanmard A, Montanari A (2014) Confidence intervals and hypothesis testing for high-dimensional regression. J Mach Learn Res 15(1):2869–2909
MathSciNet MATH Google Scholar
Özkale MR, Kaçıranlar S (2007) The restricted and unrestricted two-parameter estimators. Commun Stat Theory Methods 36(15):2707–2725
Article MathSciNet Google Scholar
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B (Methodological) 58:267–288
MathSciNet MATH Google Scholar
Tibshirani R, Saunders M, Rosset S, Zhu J, Knight K (2005) Sparsity and smoothness via the fused lasso. J R Stat Soc Ser B (Stat Methodol) 67(1):91–108
Article MathSciNet Google Scholar
Wang Y, Jiang Y, Zhang J, Chen Z, Xie B, Zhao C (2019) Robust variable selection based on the random quantile lasso. Commun Stat Simul Comput 2009:1–11
Google Scholar
Yuan M, Lin Y (2006) Model selection and estimation in regression with grouped variables. J R Stat Soc Ser B (Stat Methodol) 68(1):49–67
Article MathSciNet Google Scholar
Zhang C, Wu Y, Zhu M (2019) Pruning variable selection ensembles. Stat Anal Data Min ASA Data Sci J 12(3):168–184
Article MathSciNet Google Scholar
Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Ser B (Stat Methodol) 67(2):301–320
Article MathSciNet Google Scholar

Download references

Author information

Authors and Affiliations

Department of Statistics, Faculty of Science and Letters, Çukurova University, Adana, 01330, Turkey
Murat Genç & M. Revan Özkale

Authors

Murat Genç
View author publications
You can also search for this author in PubMed Google Scholar
M. Revan Özkale
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Murat Genç.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

1.1 Proof of Theorem 1

Let

$$\begin{aligned} Q\left( \hat{\varvec{\beta }};{\mathbf {b}},\lambda ,\alpha \right) =\frac{1}{2n}\left\| {\mathbf {y}}-{\mathbf {X}}\varvec{\beta }\right\| _{2}^{2}+\lambda \left( \alpha \left\| \varvec{\beta }\right\| _{1}+\frac{1-\alpha }{2}\left\| \varvec{\beta }-{\mathbf {b}}\right\| _{2}^{2}\right) . \end{aligned}$$

We write the sub-gradients of this function with respect to $\beta _{i}$, $\beta _{j}$ and set them equal to zero:

$$\begin{aligned} \frac{\partial Q}{\partial \beta _{i}}=-\frac{1}{n}{\mathbf {x}}_{i}^{\top }\left( {\mathbf {y}}-{\mathbf {X}}\hat{\varvec{\beta }}\right) +\lambda \alpha {\hat{s}}_{i}+\lambda \left( 1-\alpha \right) \left( {\hat{\beta }}_{i}-b_{i}\right)&=0 \end{aligned}$$

(12)

$$\begin{aligned} \frac{\partial Q}{\partial \beta _{j}}=-\frac{1}{n}{\mathbf {x}}_{j}^{\top }\left( {\mathbf {y}}-{\mathbf {X}}\hat{\varvec{\beta }}\right) +\lambda \alpha {\hat{s}}_{j}+\lambda \left( 1-\alpha \right) \left( {\hat{\beta }}_{j}-b_{j}\right)&=0, \end{aligned}$$

(13)

where ${\hat{s}}_{i}$ and ${\hat{s}}_{j}$ are the sub-gradients of the absolute value function of $\beta _{i}$ and $\beta _{j}$.

Subtracting Eq. (12) from Eq. (13) and applying Cauchy Schwarz inequality, we get

$$\begin{aligned} \left| {\hat{\beta }}_{j}-{\hat{\beta }}_{i}-\left( b_{j}-b_{i}\right) \right| \le \frac{1}{n\lambda \left( 1-\alpha \right) }\sqrt{\left\| {\mathbf {x}}_{i}-{\mathbf {x}}_{j}\right\| _{2}^{2}\left\| \hat{{\mathbf {r}}}\right\| _{2}^{2}}, \end{aligned}$$

(14)

where $\hat{{\mathbf {r}}}={\mathbf {y}}-{\mathbf {X}}\hat{\varvec{\beta }}$. Since $\left\| {\mathbf {x}}_{i}-{\mathbf {x}}_{j}\right\| _{2}^{2}=2\left( 1-\rho \right) $, we obtain

$$\begin{aligned} \left| {\hat{\beta }}_{j}-{\hat{\beta }}_{i}-\left( b_{j}-b_{i}\right) \right| \le \frac{1}{n\lambda \left( 1-\alpha \right) }\sqrt{2\left( 1-\rho \right) \left\| \hat{{\mathbf {r}}}\right\| _{2}^{2}}. \end{aligned}$$

(15)

Furthermore, $Q\left( \hat{\varvec{\beta }};{\mathbf {b}},\lambda ,\alpha \right) \le Q\left( {\mathbf {0}};{\mathbf {b}},\lambda ,\alpha \right) $ holds because $\hat{\varvec{\beta }}$ is the minimizer of Q. Hence, we write

$$\begin{aligned} \frac{1}{2n}\left\| \hat{{\mathbf {r}}}\right\| _{2}^{2}+\lambda \alpha \left\| \hat{\varvec{\beta }}\right\| _{1}+\frac{\lambda \left( 1-\alpha \right) }{2}\left\| \hat{\varvec{\beta }}-{\mathbf {b}}\right\| _{2}^{2}\le \frac{1}{2n}\left\| {\mathbf {y}}\right\| _{2}^{2}+\frac{\lambda \left( 1-\alpha \right) }{2}\left\| {\mathbf {b}}\right\| _{2}^{2} \end{aligned}$$

which implies that

$$\begin{aligned} \left\| \hat{{\mathbf {r}}}\right\| _{2}^{2}\le \left\| {\mathbf {y}}\right\| _{2}^{2}+n\lambda \left( 1-\alpha \right) \left\| {\mathbf {b}}\right\| _{2}^{2}. \end{aligned}$$

(16)

If we consider Eqs. (15) and (16) together, then

$$\begin{aligned} \left| {\hat{\beta }}_{j}-{\hat{\beta }}_{i}-\left( b_{j}-b_{i}\right) \right|&\le \frac{1}{n\lambda \left( 1-\alpha \right) }\sqrt{2\left( 1-\rho \right) }\sqrt{\left\| {\mathbf {y}}\right\| _{1}^{2}+n\lambda \left( 1-\alpha \right) \left\| {\mathbf {b}}\right\| _{2}^{2}} \end{aligned}$$

which completes the proof.$\square $

Rights and permissions

Reprints and permissions

About this article

Cite this article

Genç, M., Özkale, M.R. Usage of the GO estimator in high dimensional linear models. Comput Stat 36, 217–239 (2021). https://doi.org/10.1007/s00180-020-01001-2

Download citation

Received: 12 July 2019
Accepted: 11 June 2020
Published: 18 June 2020
Issue Date: March 2021
DOI: https://doi.org/10.1007/s00180-020-01001-2

Usage of the GO estimator in high dimensional linear models

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Shrinkage and Sparse Estimation for High-Dimensional Linear Models

A new double-regularized regression using Liu and lasso regularization

Variable selection and estimation using a continuous approximation to the \(L_0\) penalty

Notes

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix

1.1 Proof of Theorem 1

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Usage of the GO estimator in high dimensional linear models

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Shrinkage and Sparse Estimation for High-Dimensional Linear Models

A new double-regularized regression using Liu and lasso regularization

Variable selection and estimation using a continuous approximation to the \(L_0\) penalty

Notes

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix

Appendix

1.1 Proof of Theorem 1

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation