Variable Selection for Sparse Logistic Regression with Grouped Variables
Abstract
1. Introduction
2. Penalized Weighted Score Function Method
3. Statistical Properties
- (A1) There exists a positive constant such that .
- (A2) satisfy that , and for .
- (A3) There exists such that .
- (A4) Let be a convex three-times differentiable function such that for all , the function satisfies for all , where is a constant.
- (A5) For some integer s such that and a positive number , the following condition holds.
4. Weighted Block Coordinate Descent Algorithm
- Armijo rule
Choose $\alpha^{0} > 0$ and let $\alpha$ be the largest value of $\{\alpha^{0}\delta^{j}\}_{j=0,1,\ldots}$ satisfying
$$F(\beta + \alpha d) \le F(\beta) + \alpha\sigma\Delta,$$
where $0 < \delta < 1$, $0 < \sigma < 1$, and $\Delta$ is the decrease predicted by the local model in the search direction $d$.
Algorithm 1: Weighted block coordinate gradient descent algorithm
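The Armijo backtracking rule above can be sketched in Python. This is a minimal illustration, not the paper's implementation: the constants (`alpha0 = 1`, `delta = 0.5`, `sigma = 0.1`) and the plain group-lasso-penalized logistic objective are illustrative assumptions.

```python
import numpy as np

def logistic_loss(X, y, beta):
    """Negative log-likelihood of the logistic model (y in {0, 1})."""
    eta = X @ beta
    return float(np.sum(np.log1p(np.exp(eta)) - y * eta))

def group_penalty(beta, groups, lam):
    """Group-lasso penalty: lam times the sum of group-wise L2 norms."""
    return lam * sum(np.linalg.norm(beta[g]) for g in groups)

def armijo_step(X, y, beta, d, delta_F, lam, groups,
                alpha0=1.0, delta=0.5, sigma=0.1, max_backtracks=30):
    """Return the largest alpha in {alpha0 * delta^j} with sufficient decrease.

    delta_F is the model-predicted improvement for the direction d
    (negative for a descent direction).
    """
    F0 = logistic_loss(X, y, beta) + group_penalty(beta, groups, lam)
    alpha = alpha0
    for _ in range(max_backtracks):
        beta_new = beta + alpha * d
        F_new = logistic_loss(X, y, beta_new) + group_penalty(beta_new, groups, lam)
        if F_new <= F0 + alpha * sigma * delta_F:
            return alpha  # sufficient decrease achieved
        alpha *= delta    # backtrack
    return alpha
```

With a descent direction, the loop accepts the first step size small enough to satisfy the sufficient-decrease condition, so only a handful of backtracks are typically needed.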
5. Simulations
- TP: the number of truly non-zero coefficients estimated as non-zero in the selected model.
- TN: the number of truly zero coefficients estimated as zero.
- FP: the number of truly zero coefficients estimated as non-zero.
- FN: the number of truly non-zero coefficients estimated as zero.
- TPR: the true positive rate, TPR = TP / (TP + FN).
- Accur: the proportion of coefficients whose zero/non-zero status is identified correctly, Accur = (TP + TN) / (TP + TN + FP + FN).
- Time: the running time of the algorithm.
- BNE: the block norm of the estimation error between the estimated and the true coefficient vectors.
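The metrics above can be computed directly from the estimated and true coefficient vectors. In the sketch below, BNE is taken to be the sum of group-wise L2 norms of the estimation error, one common choice of block norm (the paper's exact formula is not reproduced here).

```python
import numpy as np

def selection_metrics(beta_hat, beta_true, groups):
    """Selection and estimation metrics for a fitted sparse model.

    beta_hat, beta_true: coefficient vectors of equal length;
    groups: list of index arrays partitioning the coefficients.
    """
    hat_nz = beta_hat != 0
    true_nz = beta_true != 0
    TP = int(np.sum(hat_nz & true_nz))    # non-zero estimated as non-zero
    TN = int(np.sum(~hat_nz & ~true_nz))  # zero estimated as zero
    FP = int(np.sum(hat_nz & ~true_nz))   # zero estimated as non-zero
    FN = int(np.sum(~hat_nz & true_nz))   # non-zero estimated as zero
    TPR = TP / max(TP + FN, 1)
    accur = (TP + TN) / beta_true.size
    # Block norm of the estimation error: sum of group-wise L2 norms.
    BNE = sum(np.linalg.norm(beta_hat[g] - beta_true[g]) for g in groups)
    return {"TP": TP, "TN": TN, "FP": FP, "FN": FN,
            "TPR": TPR, "Accur": accur, "BNE": BNE}
```

For example, with `beta_true = [1, 2, 0, 0, 0, 3]` and `beta_hat = [0.9, 0, 0.1, 0, 0, 2.5]`, the function reports TP = 2, FP = 1, FN = 1, TN = 2, so TPR = 2/3 and Accur = 4/6.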
6. Real Data
6.1. Studies on the Molecular Structure of Muscadine
6.2. Gene Expression Studies in Epithelial Cells of Breast Cancer Patients
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
References
Model I

| p | Method | TP | TPR | FP | Accur | Time | BNE |
|---|---|---|---|---|---|---|---|
| 300 | grpreg (λ = min) | 30.00 (0.00) | 1.000 | 91.28 (19.46) | 0.696 | 300.63 | 18.32 (1.96) |
| 300 | gglasso (λ = min) | 30.00 (0.00) | 1.000 | 41.64 (29.92) | 0.861 | 390.56 | 17.96 (3.11) |
| 300 | gglasso (λ = 1se) | 29.68 (1.10) | 0.990 | 13.44 (14.73) | 0.954 | 389.27 | 21.81 (2.29) |
| 300 | wgrplasso ( = 0.01) | 29.61 (1.06) | 0.987 | 26.15 (7.92) | 0.912 | 23.53 | 18.51 (0.65) |
| 300 | wgrplasso ( = 0.05) | 29.77 (0.85) | 0.993 | 36.14 (9.80) | 0.879 | 29.24 | 17.88 (0.70) |
| 600 | grpreg (λ = min) | 29.90 (0.55) | 0.997 | 116.36 (26.51) | 0.806 | 444.31 | 20.35 (1.73) |
| 600 | gglasso (λ = min) | 29.80 (0.91) | 0.994 | 45.85 (34.78) | 0.923 | 508.35 | 19.95 (2.41) |
| 600 | gglasso (λ = 1se) | 29.32 (2.00) | 0.978 | 17.37 (16.92) | 0.970 | 506.27 | 22.77 (1.81) |
| 600 | wgrplasso ( = 0.01) | 29.25 (1.40) | 0.975 | 41.84 (11.33) | 0.929 | 38.97 | 19.17 (0.71) |
| 600 | wgrplasso ( = 0.05) | 29.50 (1.19) | 0.984 | 55.81 (12.78) | 0.906 | 45.16 | 18.73 (0.76) |
| 900 | grpreg (λ = min) | 29.66 (1.13) | 0.989 | 130.12 (32.66) | 0.855 | 590.55 | 21.56 (1.82) |
| 900 | gglasso (λ = min) | 29.88 (0.59) | 0.996 | 64.84 (39.83) | 0.928 | 614.64 | 20.07 (2.24) |
| 900 | gglasso (λ = 1se) | 29.30 (1.53) | 0.977 | 24.07 (21.79) | 0.972 | 612.24 | 23.13 (1.80) |
| 900 | wgrplasso ( = 0.01) | 29.19 (1.43) | 0.973 | 54.10 (15.45) | 0.939 | 52.63 | 19.58 (0.73) |
| 900 | wgrplasso ( = 0.05) | 29.44 (1.21) | 0.982 | 70.01 (15.98) | 0.922 | 62.81 | 19.20 (0.78) |
Model II

| p | Method | TP | TPR | FP | Accur | Time | BNE |
|---|---|---|---|---|---|---|---|
| 300 | grpreg (λ = min) | 17.82 (4.36) | 0.594 | 65.31 (10.55) | 0.742 | 641.23 | 27.77 (1.32) |
| 300 | gglasso (λ = min) | 14.30 (4.92) | 0.476 | 36.25 (10.33) | 0.827 | 391.28 | 27.69 (1.43) |
| 300 | gglasso (λ = 1se) | 11.36 (4.80) | 0.378 | 27.70 (11.50) | 0.846 | 389.83 | 28.73 (0.96) |
| 300 | wgrplasso ( = 0.01) | 25.07 (2.67) | 0.836 | 6.52 (4.83) | 0.962 | 39.71 | 15.92 (1.09) |
| 300 | wgrplasso ( = 0.05) | 25.02 (2.68) | 0.834 | 6.28 (4.70) | 0.962 | 40.24 | 15.85 (1.09) |
| 600 | grpreg (λ = min) | 12.61 (4.32) | 0.420 | 85.84 (11.35) | 0.828 | 894.47 | 29.13 (1.17) |
| 600 | gglasso (λ = min) | 10.95 (4.99) | 0.365 | 47.08 (13.41) | 0.890 | 584.32 | 28.73 (1.04) |
| 600 | gglasso (λ = 1se) | 8.23 (4.76) | 0.274 | 36.33 (13.85) | 0.903 | 581.74 | 29.26 (0.72) |
| 600 | wgrplasso ( = 0.01) | 24.57 (2.81) | 0.819 | 9.43 (6.08) | 0.975 | 69.48 | 15.96 (0.96) |
| 600 | wgrplasso ( = 0.05) | 24.69 (2.80) | 0.823 | 9.23 (6.26) | 0.976 | 72.05 | 15.89 (0.99) |
| 900 | grpreg (λ = min) | 10.53 (4.60) | 0.351 | 96.88 (12.79) | 0.871 | 1115.73 | 29.64 (1.07) |
| 900 | gglasso (λ = min) | 8.43 (4.49) | 0.281 | 53.67 (13.97) | 0.916 | 746.62 | 29.14 (0.93) |
| 900 | gglasso (λ = 1se) | 6.09 (4.20) | 0.203 | 40.74 (15.09) | 0.928 | 742.62 | 29.49 (0.58) |
| 900 | wgrplasso ( = 0.01) | 24.86 (2.66) | 0.829 | 10.80 (6.39) | 0.982 | 106.94 | 15.85 (1.01) |
| 900 | wgrplasso ( = 0.05) | 24.99 (2.71) | 0.833 | 11.05 (6.23) | 0.982 | 111.95 | 15.80 (1.00) |
Model III

| p | Method | TP | TPR | FP | Accur | Time | BNE |
|---|---|---|---|---|---|---|---|
| 300 | grpreg (λ = min) | 29.39 (1.79) | 0.980 | 73.59 (21.16) | 0.753 | 447.46 | 27.52 (1.96) |
| 300 | gglasso (λ = min) | 29.91 (0.59) | 0.997 | 74.11 (25.60) | 0.753 | 812.03 | 24.06 (2.05) |
| 300 | gglasso (λ = 1se) | 29.57 (2.32) | 0.986 | 40.58 (21.48) | 0.863 | 807.65 | 25.27 (1.69) |
| 300 | wgrplasso ( = 0.01) | 27.69 (2.51) | 0.923 | 24.02 (7.69) | 0.912 | 35.92 | 28.99 (1.27) |
| 300 | wgrplasso ( = 0.05) | 28.55 (2.06) | 0.952 | 32.00 (8.15) | 0.888 | 39.13 | 28.84 (1.38) |
| 600 | grpreg (λ = min) | 28.05 (2.96) | 0.935 | 86.76 (28.04) | 0.852 | 598.05 | 28.65 (1.70) |
| 600 | gglasso (λ = min) | 29.40 (2.37) | 0.980 | 97.53 (36.13) | 0.836 | 974.70 | 25.44 (1.92) |
| 600 | gglasso (λ = 1se) | 27.62 (5.90) | 0.920 | 45.84 (27.29) | 0.920 | 968.57 | 26.65 (1.87) |
| 600 | wgrplasso ( = 0.01) | 27.15 (2.69) | 0.905 | 40.41 (10.68) | 0.928 | 56.35 | 29.40 (1.22) |
| 600 | wgrplasso ( = 0.05) | 28.18 (2.21) | 0.940 | 51.31 (11.66) | 0.911 | 63.67 | 29.34 (1.33) |
| 900 | grpreg (λ = min) | 25.66 (5.66) | 0.856 | 82.92 (36.76) | 0.903 | 745.82 | 29.33 (1.51) |
| 900 | gglasso (λ = min) | 28.77 (3.79) | 0.959 | 105.48 (45.77) | 0.881 | 1121.19 | 26.32 (1.87) |
| 900 | gglasso (λ = 1se) | 24.33 (9.47) | 0.811 | 42.12 (35.83) | 0.947 | 1113.45 | 27.76 (2.14) |
| 900 | wgrplasso ( = 0.01) | 26.85 (2.87) | 0.895 | 50.99 (10.80) | 0.940 | 68.74 | 29.70 (1.18) |
| 900 | wgrplasso ( = 0.05) | 27.80 (2.38) | 0.926 | 63.14 (12.27) | 0.927 | 81.32 | 29.67 (1.27) |
Model IV

| p | Method | TP | TPR | FP | Accur | Time | BNE |
|---|---|---|---|---|---|---|---|
| 300 | grpreg (λ = min) | 21.94 (4.03) | 0.732 | 63.80 (9.64) | 0.760 | 466.73 | 35.16 (1.78) |
| 300 | gglasso (λ = min) | 19.88 (4.43) | 0.662 | 52.83 (11.36) | 0.790 | 409.92 | 28.30 (1.13) |
| 300 | gglasso (λ = 1se) | 17.30 (4.74) | 0.577 | 47.80 (11.44) | 0.798 | 408.22 | 28.93 (0.74) |
| 300 | wgrplasso ( = 0.01) | 28.75 (1.65) | 0.959 | 25.96 (8.12) | 0.909 | 218.10 | 26.09 (2.55) |
| 300 | wgrplasso ( = 0.05) | 28.78 (1.65) | 0.960 | 26.32 (8.14) | 0.908 | 221.08 | 26.13 (2.57) |
| 600 | grpreg (λ = min) | 18.32 (4.40) | 0.611 | 83.08 (12.48) | 0.842 | 689.27 | 35.02 (1.79) |
| 600 | gglasso (λ = min) | 16.48 (5.10) | 0.549 | 70.00 (14.34) | 0.861 | 571.90 | 29.08 (1.01) |
| 600 | gglasso (λ = 1se) | 14.05 (5.17) | 0.468 | 62.39 (14.65) | 0.869 | 567.98 | 29.37 (0.62) |
| 600 | wgrplasso ( = 0.01) | 28.58 (1.80) | 0.953 | 34.33 (10.12) | 0.940 | 384.79 | 26.59 (2.69) |
| 600 | wgrplasso ( = 0.05) | 28.58 (1.83) | 0.953 | 34.76 (10.11) | 0.940 | 380.57 | 26.63 (2.70) |
| 900 | grpreg (λ = min) | 15.66 (4.25) | 0.522 | 94.71 (12.41) | 0.879 | 356.36 | 34.90 (1.50) |
| 900 | gglasso (λ = min) | 13.80 (4.61) | 0.460 | 80.03 (13.92) | 0.893 | 289.06 | 29.45 (0.92) |
| 900 | gglasso (λ = 1se) | 11.61 (4.49) | 0.387 | 70.52 (15.85) | 0.901 | 287.73 | 29.64 (0.54) |
| 900 | wgrplasso ( = 0.01) | 28.55 (1.83) | 0.952 | 39.33 (12.57) | 0.955 | 184.13 | 26.53 (2.34) |
| 900 | wgrplasso ( = 0.05) | 28.56 (1.80) | 0.952 | 39.24 (12.45) | 0.955 | 186.89 | 26.56 (2.36) |
| | wgrplasso ( = 0.05) | grpreg (λ = min) | gglasso (λ = min) | glmnet (λ = min) |
|---|---|---|---|---|
| Prediction accuracy | 0.820 | 0.813 | 0.771 | 0.758 |
| Model size | 66.53 | 31.29 | 30.14 | 53.53 |
| Time | 0.69 | 3.04 | 2.70 | 2.12 |
| | wgrplasso | grpreg (λ = min) | gglasso (λ = min) |
|---|---|---|---|
| Prediction accuracy | 0.73 | 0.63 | 0.71 |
| Model size | 14 | 9 | 14 |
| Selected genes | 117_at 1255_g_at 200000_s_at 200002_at 200030_s_at 200040_at 200041_s_at 200655_s_at 200661_at 200729_s_at 201040_at 201465_s_at 202707_at 211997_x_at | 201464_x_at 201465_s_at 201778_s_at 202707_at 204620_s_at 205544_s_at 211997_x_at 213280_at 217921_at | 200047_s_at 200729_s_at 200801_x_at 201465_s_at 202046_s_at 202707_at 205544_s_at 208443_x_at 211374_x_at 211997_x_at 212234_at 213280_at 217921_at 220811_at |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhong, M.; Yin, Z.; Wang, Z. Variable Selection for Sparse Logistic Regression with Grouped Variables. Mathematics 2023, 11, 4979. https://doi.org/10.3390/math11244979