Abstract
The task of projecting onto \(\ell _p\) norm balls is ubiquitous in statistics and machine learning, yet actionable algorithms for doing so are largely limited to the special cases of \(p \in \left\{ 0, 1, 2, \infty \right\}\). In this paper, we introduce novel, scalable methods for projecting onto the \(\ell _p\)-ball for general \(p>0\). For \(p \ge 1\), we solve the univariate Lagrangian dual via a dual Newton method. We then carefully design a bisection approach for \(p<1\), presenting theoretical and empirical evidence of zero or a small duality gap in the non-convex case. We assess our contributions thoroughly in empirical studies and apply them to large-scale regularized multi-task learning and compressed sensing. The code implementing our methods is publicly available on GitHub.
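To fix ideas, the dual approach for \(p \ge 1\) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it replaces the dual Newton method with plain bisection on the Lagrange multiplier \(\lambda\), and the function names are hypothetical.

```python
import numpy as np

def prox_abs_pow(v, lam, p, iters=60):
    # Solve x + lam * p * x**(p - 1) = v for x in [0, v] by bisection;
    # the left-hand side is increasing in x whenever p >= 1.
    lo, hi = 0.0, v
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + lam * p * mid ** (p - 1) < v:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def project_lp_ball(y, r, p, tol=1e-10):
    # Project y onto {x : ||x||_p^p <= r} for p >= 1 by bisecting on the
    # multiplier lam of the ball constraint (sketch; the paper instead
    # solves the univariate dual with a Newton method).
    y = np.asarray(y, dtype=float)
    if np.sum(np.abs(y) ** p) <= r:
        return y.copy()  # already inside the ball

    def constraint(lam):
        x = np.array([prox_abs_pow(abs(v), lam, p) for v in y])
        return np.sum(x ** p)

    lo, hi = 0.0, 1.0
    while constraint(hi) > r:      # grow the bracket until feasible
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:           # bisect: constraint is decreasing in lam
        mid = 0.5 * (lo + hi)
        if constraint(mid) > r:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    x = np.array([prox_abs_pow(abs(v), lam, p) for v in y])
    return np.sign(y) * x
```

For \(p = 2\) and \(r = 1\) this reduces to the familiar radial projection onto the unit \(\ell_2\) ball, which gives a quick sanity check.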
Notes
These properties have emerged in the context of studying theoretical properties of projected gradient descent for \(\ell _p\)-norm constrained least squares (problem (1) with \(\phi (\varvec{x},\varvec{y})=\frac{1}{2}\Vert \varvec{y}- \varvec{A}\varvec{x}\Vert _2^2\)). However, no actual algorithm for \(\ell _p\)-ball projection is provided in [2].
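For concreteness, projected gradient descent for this constrained least-squares problem takes the following generic form. This is a sketch under stated assumptions: `proj` is any user-supplied projection oracle onto the constraint set (here instantiated, for illustration only, with the closed-form unit \(\ell_2\)-ball projection), and the function names are hypothetical.

```python
import numpy as np

def projected_gradient_descent(A, y, proj, steps=500):
    # PGD for problem (1) with phi(x, y) = 0.5 * ||y - A x||_2^2:
    # take a gradient step on the least-squares loss, then project.
    lr = 1.0 / np.linalg.norm(A, 2) ** 2  # step size 1/L, L = ||A||_2^2
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        grad = A.T @ (A @ x - y)
        x = proj(x - lr * grad)
    return x

# Example oracle: closed-form projection onto the unit l2 ball.
proj_l2 = lambda z: z / max(1.0, np.linalg.norm(z))
```

Any \(\ell_p\)-ball projection routine can be dropped in as `proj`, which is precisely the oracle whose absence in [2] the note above points out.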
Refer to Theorem 1 in https://planetmath.org/newtonsmethodworksforconvexrealfunctions (retrieved 2022-05-08).
Available at https://github.com/Optimizater/Lp-ball-Projection.
Available publicly at https://github.com/won-j/LpBallProjection.
Available at https://github.com/JuliaPy/PyCall.jl.
References
Argyriou, A., Evgeniou, T., Pontil, M.: Convex multi-task feature learning. Mach. Learn. 73(3), 243–272 (2008)
Bahmani, S., Raj, B.: A unifying analysis of projected gradient descent for \(\ell _p\)-constrained least squares. Appl. Comput. Harmon. Anal. 34(3), 366–378 (2013)
Barbero, A., Sra, S.: Modular proximal optimization for multidimensional total-variation regularization. J. Mach. Learn. Res. 19(1), 2232–2313 (2018)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imag. Sci. 2(1), 183–202 (2009)
Bertsekas, D.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont, MA, USA (1999)
Bertsekas, D.P.: Projected Newton methods for optimization problems with simple constraints. SIAM J. Control Optim. 20(2), 221–246 (1982)
Blumensath, T., Davies, M.E.: Iterative hard thresholding for compressed sensing. Appl. Comput. Harmon. Anal. 27(3), 265–274 (2009)
Boyd, S.P., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge, UK (2004)
Candes, E.J., Tao, T.: Decoding by linear programming. IEEE Trans. Inform. Theory 51(12), 4203–4215 (2005)
Chartrand, R., Staneva, V.: Restricted isometry properties and nonconvex compressive sensing. Inverse Prob. 24(3), 035020 (2008)
Chartrand, R., Yin, W.: Nonconvex sparse regularization and splitting algorithms. In: Splitting Methods in Communication, Imaging, Science, and Engineering, pp. 237–249. Springer (2016)
Chen, L., Jiang, X., Liu, X., Kirubarajan, T., Zhou, Z.: Outlier-robust moving object and background decomposition via structured \(\ell _p\)-regularized low-rank representation. IEEE Trans. Emerg. Topics Comput. Intell. 5, 620–638 (2021)
Chen, X., Niu, L., Yuan, Y.: Optimality conditions and a smoothing trust region Newton method for non-Lipschitz optimization. SIAM J. Optim. 23(3), 1528–1552 (2013)
Condat, L.: Fast projection onto the simplex and the \(\ell _1\) ball. Math. Program. 158(1–2), 575–585 (2016)
Das Gupta, M., Kumar, S.: Non-convex p-norm projection for robust sparsity. In: Proc. IEEE Int. Conf. Computer Vision, pp. 1593–1600 (2013)
Donoho, D.L.: Compressed sensing. IEEE Trans. Inform. Theory 52(4), 1289–1306 (2006)
Duchi, J., Shalev-Shwartz, S., Singer, Y., Chandra, T.: Efficient projections onto the \(\ell _1\)-ball for learning in high dimensions. In: Proc. 25th Int. Conf. Mach. Learn., pp. 272–279. ACM (2008)
Fu, W.J.: Penalized regressions: the bridge versus the lasso. J. Comput. Graph. Stat. 7(3), 397–416 (1998)
Hu, Y., Li, C., Meng, K., Qin, J., Yang, X.: Group sparse optimization via \(\ell _{p, q}\) regularization. J. Mach. Learn. Res. 18(1), 960–1011 (2017)
Lange, K.: MM Optimization Algorithms. SIAM, Philadelphia, PA, USA (2016)
Liu, H., Palatucci, M., Zhang, J.: Blockwise coordinate descent procedures for the multi-task lasso, with applications to neural semantic basis discovery. In: Proc. 26th Int. Conf. Mach. Learn., pp. 649–656. ACM (2009)
Liu, J., Ji, S., Ye, J.: SLEP: Sparse learning with efficient projections. Tech. rep., Arizona State University (2011). https://github.com/jiayuzhou/SLEP
Liu, J., Ye, J.: Efficient \(\ell _1\)/\(\ell _q\) norm regularization. arXiv:1009.4766 (2010)
Lu, Z.: Iterative reweighted minimization methods for \(\ell _p\) regularized unconstrained nonlinear programming. Math. Program. 147(1), 277–307 (2014)
Marjanovic, G., Solo, V.: On \(\ell _q\) optimization and matrix completion. IEEE Trans. Signal Process. 60(11), 5714–5724 (2012)
Meier, L., Van De Geer, S., Bühlmann, P.: The group lasso for logistic regression. J. R. Stat. Soc. Ser. B. Stat. Methodol. 70(1), 53–71 (2008)
Oymak, S., Recht, B., Soltanolkotabi, M.: Sharp time-data tradeoffs for linear inverse problems. IEEE Trans. Inform. Theory 64(6), 4129–4158 (2017)
Quattoni, A., Carreras, X., Collins, M., Darrell, T.: An efficient projection for \(\ell _{1,\infty }\) regularization. In: Proc. 26th Int. Conf. Mach. Learn., pp. 857–864. ACM (2009)
Sattar, Y., Oymak, S.: Quickly finding the best linear model in high dimensions via projected gradient descent. IEEE Trans. Signal Process. 68, 818–829 (2020)
Sra, S.: Fast projections onto mixed-norm balls with applications. Data Min. Knowl. Discov. 25(2), 358–377 (2012)
Tibshirani, R., Wainwright, M., Hastie, T.: Statistical Learning with Sparsity: the Lasso and Generalizations. Chapman and Hall/CRC, Boca Raton (2015)
Vogt, J.E., Roth, V.: A complete analysis of the \(\ell _{1,p}\) group-lasso. In: Proc. 29th Int. Conf. Mach. Learn., pp. 1091–1098. Omnipress (2012)
Wang, M., Xu, W., Tang, A.: On the performance of sparse recovery via \(\ell _p\)-minimization \((0 \le p \le 1)\). IEEE Trans. Inform. Theory 57(11), 7255–7278 (2011)
Xu, Z., Chang, X., Xu, F., Zhang, H.: \({L}_{1/2}\) regularization: a thresholding representation theory and a fast solver. IEEE Trans. Neural Netw. Learn. Syst. 23(7), 1013–1027 (2012)
Yang, X., Wang, J., Wang, H.: Towards an efficient approach for the nonconvex \(\ell _p\) ball projection: algorithm and analysis. arXiv:2101.01350 (2021)
Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B. Stat. Methodol. 68(1), 49–67 (2006)
Yukawa, M., Amari, S.-I.: \(\ell _p\)-regularized least squares \((0<p<1)\) and critical path. IEEE Trans. Inform. Theory 62(1), 488–502 (2016)
Zhang, Y., Yeung, D.Y., Xu, Q.: Probabilistic multi-task feature selection. In: Adv. Neural Inf. Process. Syst., pp. 2559–2567 (2010)
Zhou, Z., Zhang, Q., So, A.M.C.: \(\ell _{1,p}\)-norm regularization: error bounds and convergence rate analysis of first-order methods. In: Proc. 32nd Int. Conf. Mach. Learn., vol. 37, pp. 1501–1510 (2015)
Acknowledgements
We thank the associate editor and anonymous referees for providing constructive comments, especially for pointing out references [12, 13, 15, 24, 35]. JW was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2019R1A2C1007126). KL was supported by the United States Public Health Service (USPHS) grants GM53275 and HG006139.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Won, JH., Lange, K. & Xu, J. A unified analysis of convex and non-convex \(\ell _p\)-ball projection problems. Optim Lett 17, 1133–1159 (2023). https://doi.org/10.1007/s11590-022-01919-0