A unified analysis of convex and non-convex \(\ell _p\)-ball projection problems

  • Original Paper
  • Published in Optimization Letters

Abstract

The task of projecting onto \(\ell _p\) norm balls is ubiquitous in statistics and machine learning, yet the availability of actionable algorithms for doing so is largely limited to the special cases of \(p \in \left\{ 0, 1, 2, \infty \right\}\). In this paper, we introduce novel, scalable methods for projecting onto the \(\ell _p\)-ball for general \(p>0\). For \(p \ge 1\), we solve the univariate Lagrangian dual via a dual Newton method. We then carefully design a bisection approach for \(p<1\), presenting theoretical and empirical evidence of zero or a small duality gap in the non-convex case. We thoroughly assess our contributions empirically and apply them to large-scale regularized multi-task learning and compressed sensing. The code implementing our methods is publicly available on GitHub.
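As a rough guide to the convex case described above, the following NumPy sketch projects onto the \(\ell _p\)-ball for \(p \ge 1\) by searching over the single Lagrange multiplier of the univariate dual. It is a minimal illustration written from the abstract alone: where the paper applies a dual Newton method, the sketch falls back on plain bisection for both the multiplier search and the inner proximal subproblems, and it does not attempt the non-convex case \(p < 1\). The names prox_abs_pow and project_lp_ball are ours, not the authors'.

```python
import numpy as np

def prox_abs_pow(y, lam, p, iters=60):
    """Coordinatewise minimizer of 0.5*(t - y_i)**2 + lam*|t|**p for p >= 1.

    The minimizer has the sign of y_i, and its magnitude solves the
    increasing equation t + lam*p*t**(p-1) = |y_i| on [0, |y_i|],
    which we bracket and bisect.
    """
    a = np.abs(y)
    lo, hi = np.zeros_like(a), a.copy()
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        too_big = mid + lam * p * mid ** (p - 1.0) > a  # residual increases in t
        hi = np.where(too_big, mid, hi)
        lo = np.where(too_big, lo, mid)
    return np.sign(y) * 0.5 * (lo + hi)

def project_lp_ball(y, r, p, outer_iters=100):
    """Euclidean projection of y onto {x : ||x||_p <= r} for p >= 1.

    Works on the univariate dual: ||prox_abs_pow(y, lam, p)||_p**p decreases
    in lam, so we bracket the multiplier with constraint value r**p and
    bisect. (The paper uses Newton updates here instead of bisection.)
    """
    if np.sum(np.abs(y) ** p) <= r ** p:
        return y.copy()                   # y is already feasible
    lam_lo, lam_hi = 0.0, 1.0
    while np.sum(np.abs(prox_abs_pow(y, lam_hi, p)) ** p) > r ** p:
        lam_hi *= 2.0                     # grow bracket until feasible
    for _ in range(outer_iters):
        lam = 0.5 * (lam_lo + lam_hi)
        if np.sum(np.abs(prox_abs_pow(y, lam, p)) ** p) > r ** p:
            lam_lo = lam
        else:
            lam_hi = lam
    return prox_abs_pow(y, lam_hi, p)     # the lam_hi end is feasible

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.standard_normal(1000)
    x = project_lp_ball(y, r=1.0, p=1.5)
    print(np.sum(np.abs(x) ** 1.5))       # ~= 1.0 (constraint active)
```

For \(p = 1\) the inner subproblem reduces to soft thresholding, which makes a convenient sanity check against standard \(\ell _1\)-ball projection routines such as [14, 17].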


Notes

  1. These properties emerged in the study of theoretical guarantees for projected gradient descent on \(\ell _p\)-norm constrained least squares (problem (1) with \(\phi (\varvec{x},\varvec{y})=\frac{1}{2}\Vert \varvec{y}- \varvec{A}\varvec{x}\Vert _2^2\)). However, no actual algorithm for \(\ell _p\)-ball projection is provided in [2]; a toy projected-gradient sketch appears after these notes.

  2. Refer to Theorem 1 in https://planetmath.org/newtonsmethodworksforconvexrealfunctions (retrieved 2022-05-08).

  3. For a direct correspondence between Proposition 2.1 and [25, Theorem 1], \(q \leftarrow p\), \(\beta \leftarrow x\), \(z \leftarrow y\), \(\lambda \leftarrow \mu /p\), \(h_a \leftarrow r_p\), \(\beta _a \leftarrow \kappa _p\), and \(\beta _* = z_p(y)\) when \(\mu = 1\).

  4. Available at https://github.com/albarji/proxTV/blob/master/src/LPopt.cpp.

  5. Available at https://github.com/Optimizater/Lp-ball-Projection.

  6. Available at https://github.com/won-j/LpBallProjection.

  7. Available at https://github.com/JuliaPy/PyCall.jl.

  8. The optimal tuning parameter required by the theory of [27] can be relaxed; however, we closely follow the experimental setup of [27] here.
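Note 1 above concerns projected gradient descent for \(\ell _p\)-norm constrained least squares, for which [2] gives guarantees but no projection routine. Purely as an illustration of how the pieces fit together (this is our sketch, not an algorithm from the paper or from [2]), a standard gradient step combined with the projection sketched after the abstract yields a complete solver for the convex range \(p \ge 1\); the name pgd_lp_constrained_ls is hypothetical.

```python
import numpy as np

def pgd_lp_constrained_ls(A, b, r, p, iters=200):
    """Projected gradient descent for
        min_x 0.5 * ||b - A @ x||_2**2   s.t.  ||x||_p <= r,
    i.e. the problem referenced in Note 1 (with b playing the role of y).
    Valid for convex p >= 1; reuses project_lp_ball from the earlier sketch.
    """
    x = np.zeros(A.shape[1])
    # Constant step 1/L, where L = ||A||_2^2 is the gradient's Lipschitz constant.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                     # least-squares gradient
        x = project_lp_ball(x - step * grad, r, p)   # project back onto the ball
    return x
```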

References

  1. Argyriou, A., Evgeniou, T., Pontil, M.: Convex multi-task feature learning. Mach. Learn. 73(3), 243–272 (2008)

  2. Bahmani, S., Raj, B.: A unifying analysis of projected gradient descent for \(\ell _p\)-constrained least squares. Appl. Comput. Harmon. Anal. 34(3), 366–378 (2013)

  3. Barbero, A., Sra, S.: Modular proximal optimization for multidimensional total-variation regularization. J. Mach. Learn. Res. 19(1), 2232–2313 (2018)

  4. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imag. Sci. 2(1), 183–202 (2009)

  5. Bertsekas, D.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont, MA, USA (1999)

  6. Bertsekas, D.P.: Projected Newton methods for optimization problems with simple constraints. SIAM J. Control Optim. 20(2), 221–246 (1982)

  7. Blumensath, T., Davies, M.E.: Iterative hard thresholding for compressed sensing. Appl. Comput. Harmon. Anal. 27(3), 265–274 (2009)

  8. Boyd, S.P., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge, UK (2004)

  9. Candes, E.J., Tao, T.: Decoding by linear programming. IEEE Trans. Inform. Theory 51(12), 4203–4215 (2005)

  10. Chartrand, R., Staneva, V.: Restricted isometry properties and nonconvex compressive sensing. Inverse Prob. 24(3), 035020 (2008)

  11. Chartrand, R., Yin, W.: Nonconvex sparse regularization and splitting algorithms. In: Splitting Methods in Communication, Imaging, Science, and Engineering, pp. 237–249. Springer (2016)

  12. Chen, L., Jiang, X., Liu, X., Kirubarajan, T., Zhou, Z.: Outlier-robust moving object and background decomposition via structured \(\ell _p\)-regularized low-rank representation. IEEE Trans. Emerg. Topics Comput. Intell. 5, 620–638 (2021)

  13. Chen, X., Niu, L., Yuan, Y.: Optimality conditions and a smoothing trust region Newton method for non-Lipschitz optimization. SIAM J. Optim. 23(3), 1528–1552 (2013)

  14. Condat, L.: Fast projection onto the simplex and the \(\ell _1\) ball. Math. Program. 158(1–2), 575–585 (2016)

  15. Das Gupta, M., Kumar, S.: Non-convex p-norm projection for robust sparsity. In: Proc. IEEE Int. Conf. Computer Vision, pp. 1593–1600 (2013)

  16. Donoho, D.L.: Compressed sensing. IEEE Trans. Inform. Theory 52(4), 1289–1306 (2006)

  17. Duchi, J., Shalev-Shwartz, S., Singer, Y., Chandra, T.: Efficient projections onto the \(\ell _1\)-ball for learning in high dimensions. In: Proc. 25th Int. Conf. Mach. Learn., pp. 272–279. ACM (2008)

  18. Fu, W.J.: Penalized regressions: the bridge versus the lasso. J. Comput. Graph. Stat. 7(3), 397–416 (1998)

  19. Hu, Y., Li, C., Meng, K., Qin, J., Yang, X.: Group sparse optimization via \(\ell _{p, q}\) regularization. J. Mach. Learn. Res. 18(1), 960–1011 (2017)

  20. Lange, K.: MM Optimization Algorithms. SIAM, Philadelphia, PA, USA (2016)

  21. Liu, H., Palatucci, M., Zhang, J.: Blockwise coordinate descent procedures for the multi-task lasso, with applications to neural semantic basis discovery. In: Proc. 26th Int. Conf. Mach. Learn., pp. 649–656. ACM (2009)

  22. Liu, J., Ji, S., Ye, J.: SLEP: Sparse learning with efficient projections. Tech. rep., Arizona State University (2011). https://github.com/jiayuzhou/SLEP

  23. Liu, J., Ye, J.: Efficient \(\ell _1\)/\(\ell _q\) norm regularization. arXiv:1009.4766 (2010)

  24. Lu, Z.: Iterative reweighted minimization methods for \(\ell _p\) regularized unconstrained nonlinear programming. Math. Program. 147(1), 277–307 (2014)

  25. Marjanovic, G., Solo, V.: On \(\ell _q\) optimization and matrix completion. IEEE Trans. Signal Process. 60(11), 5714–5724 (2012)

  26. Meier, L., Van De Geer, S., Bühlmann, P.: The group lasso for logistic regression. J. R. Stat. Soc. Ser. B. Stat. Methodol. 70(1), 53–71 (2008)

  27. Oymak, S., Recht, B., Soltanolkotabi, M.: Sharp time-data tradeoffs for linear inverse problems. IEEE Trans. Inform. Theory 64(6), 4129–4158 (2017)

  28. Quattoni, A., Carreras, X., Collins, M., Darrell, T.: An efficient projection for \(\ell _{1,\infty }\) regularization. In: Proc. 26th Int. Conf. Mach. Learn., pp. 857–864. ACM (2009)

  29. Sattar, Y., Oymak, S.: Quickly finding the best linear model in high dimensions via projected gradient descent. IEEE Trans. Signal Process. 68, 818–829 (2020)

  30. Sra, S.: Fast projections onto mixed-norm balls with applications. Data Min. Knowl. Discov. 25(2), 358–377 (2012)

  31. Tibshirani, R., Wainwright, M., Hastie, T.: Statistical Learning with Sparsity: the Lasso and Generalizations. Chapman and Hall/CRC, Boca Raton (2015)

  32. Vogt, J.E., Roth, V.: A complete analysis of the \(\ell _{1,p}\) group-lasso. In: Proc. 29th Int. Conf. Mach. Learn., pp. 1091–1098. Omnipress (2012)

  33. Wang, M., Xu, W., Tang, A.: On the performance of sparse recovery via \(\ell _p\)-minimization \((0 \le p \le 1)\). IEEE Trans. Inform. Theory 57(11), 7255–7278 (2011)

  34. Xu, Z., Chang, X., Xu, F., Zhang, H.: \({L}_{1/2}\) regularization: a thresholding representation theory and a fast solver. IEEE Trans. Neural Netw. Learn. Syst. 23(7), 1013–1027 (2012)

  35. Yang, X., Wang, J., Wang, H.: Towards an efficient approach for the nonconvex \(\ell _p\) ball projection: algorithm and analysis. arXiv:2101.01350 (2021)

  36. Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B. Stat. Methodol. 68(1), 49–67 (2006)

  37. Yukawa, M., Amari, S.-i.: \(\ell _p\)-regularized least squares \((0<p<1)\) and critical path. IEEE Trans. Inform. Theory 62(1), 488–502 (2016)

  38. Zhang, Y., Yeung, D.Y., Xu, Q.: Probabilistic multi-task feature selection. In: Adv. Neural Inf. Process. Syst., pp. 2559–2567 (2010)

  39. Zhou, Z., Zhang, Q., So, A.M.C.: \(\ell _{1,p}\)-norm regularization: error bounds and convergence rate analysis of first-order methods. In: Proc. 32nd Int. Conf. Mach. Learn., vol. 37, pp. 1501–1510 (2015)

Acknowledgements

We thank the associate editor and anonymous referees for providing constructive comments, especially for pointing out references [12, 13, 15, 24, 35]. JW was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2019R1A2C1007126). KL was supported by the United States Public Health Service (USPHS) grants GM53275 and HG006139.

Author information

Correspondence to Joong-Ho Won.


About this article

Cite this article

Won, J.-H., Lange, K., Xu, J.: A unified analysis of convex and non-convex \(\ell _p\)-ball projection problems. Optim. Lett. 17, 1133–1159 (2023). https://doi.org/10.1007/s11590-022-01919-0