An improved algorithm for the \(L_2\)–\(L_p\) minimization problem

  • Full Length Paper
  • Series A
Mathematical Programming

Abstract

In this paper we consider a class of non-Lipschitz and non-convex minimization problems which generalizes the \(L_2\)–\(L_p\) minimization problem. We propose an iterative algorithm that chooses the next iterate based on the local convexity, concavity, and sparsity at its current position. We show that our algorithm finds an \(\epsilon \)-KKT point within \(O(\log {\epsilon ^{-1}})\) iterations from certain initial points. Under mild conditions, the same result extends to the problem with general linear constraints.
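For readers who want to experiment numerically, the unconstrained \(L_2\)–\(L_p\) objective \(\Vert Ax-b\Vert _2^2 + \lambda \Vert x\Vert _p^p\) can be written in a few lines of NumPy. This is a minimal sketch only; the data \(A\), \(b\) and the parameter values below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def l2_lp_objective(x, A, b, lam=0.1, p=0.5):
    """L2-Lp objective: least-squares data fit plus a nonconvex l_p penalty, 0 < p < 1."""
    r = A @ x - b
    return r @ r + lam * np.sum(np.abs(x) ** p)

# Tiny illustrative instance (random data, not from the paper).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
print(l2_lp_objective(np.array([1.0, 0.0, 0.5]), A, b))
```

Because \(0< p < 1\), the penalty term is non-convex and non-Lipschitz at zero, which is exactly the difficulty the paper's analysis addresses.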



Acknowledgements

The authors would like to thank the anonymous referees for their insightful comments and suggestions. Research by the first author is supported by the Program for Innovative Research Team of Shanghai University of Finance and Economics (IRTSHUFE) and by National Natural Science Foundation of China (NSFC) Project 11471205. Research by the third author is supported by IRTSHUFE, the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning, and by NSFC Project 71440014.

Author information

Correspondence to Rongchuan He.

Appendix


Let \(g_1(x) = \lambda \sum _i x_i^p\) and \(g_2(x) = \frac{1}{2}x^T(Q-\beta I)x\). Note that \(g(x) = g_1(x) + g_2(x)\).

Lemma 8

If \(x+d \ge 0\), \(x_i \ge L\) for every index i, and \(0< s \le \nu < 1\), then

$$\begin{aligned} \Vert \nabla g(x+sd) - \nabla g(x)\Vert ^2 \le 2s^2 \{ [\lambda p(1-p)]^2[L(1-\nu )]^{2p-4} + \Vert Q-\beta I\Vert ^2 \} \Vert d\Vert ^2 \end{aligned}$$

Proof

We consider \(g_1(x)\) and \(g_2(x)\) separately.

By the definition of \(g_1(x)\), we have

$$\begin{aligned} \begin{aligned}&\Vert \nabla g_1(x+sd) - \nabla g_1(x)\Vert \\&\quad \le \int _0^1 \Vert \nabla ^2g_1(x+tsd)sd\Vert \;\mathrm {d}t\\&\quad = s\lambda p(1-p)\int _0^1\sqrt{\sum _{i=1}^n(x_i+std_i)^{2p-4}d_i^2}\;\mathrm {d}t. \\ \end{aligned} \end{aligned}$$

Since \(x + tsd = (1-ts)x + ts(x+d) \ge (1-ts)x\) and \(ts \le s\), we have \(x_i + tsd_i \ge (1-s)L\) for every index i, and hence

$$\begin{aligned} \int _0^1\sqrt{\sum _{i=1}^n(x_i+std_i)^{2p-4}d_i^2}\mathrm {d}t \le \left( (1-s)L\right) ^{p-2}\Vert d\Vert . \end{aligned}$$

In conjunction with \(s \le \nu \) and \(0< p < 1\), we get

$$\begin{aligned} \Vert \nabla g_1(x+sd) - \nabla g_1(x)\Vert ^2 \le s^2[\lambda p(1-p)]^2[L(1-\nu )]^{2p-4}\Vert d\Vert ^2. \end{aligned}$$

For \(g_2(x)\), we have

$$\begin{aligned} \Vert \nabla g_2(x+sd) - \nabla g_2(x)\Vert ^2 \le s^2 \Vert Q - \beta I\Vert ^2 \Vert d\Vert ^2. \end{aligned}$$

Thus,

$$\begin{aligned} \begin{aligned}&\Vert \nabla g(x+sd) - \nabla g(x)\Vert ^2 \\&\quad \le 2 \left( \Vert \nabla g_1(x+sd) - \nabla g_1(x)\Vert ^2 + \Vert \nabla g_2(x+sd) - \nabla g_2(x)\Vert ^2\right) \\&\quad \le 2s^2\left\{ [\lambda p(1-p)]^2[L(1-\nu )]^{2p-4} + \Vert Q-\beta I\Vert ^2\right\} \Vert d\Vert ^2.\\ \end{aligned} \end{aligned}$$

\(\square \)

Remark Lemma 8 is a consequence of the fact that the gradient of g(x) is Lipschitz continuous on the region where \(x_i \ge L(1-\nu )\) for every index i.
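Not part of the original text: the bound of Lemma 8 can be sanity-checked numerically on random instances. Everything below (the dimension n, the random symmetric Q, the choice \(\beta = \lambda _{\max }(Q)\), and the values of \(\lambda , p, L, \nu \)) is an illustrative assumption chosen to satisfy the hypotheses.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
lam, p, L, nu = 0.5, 0.5, 0.2, 0.8
M = rng.standard_normal((n, n))
Q = M + M.T                              # symmetric matrix for the quadratic part
beta = np.linalg.eigvalsh(Q).max()       # any beta works for the Lemma 8 bound

def grad_g(x):
    """Gradient of g(x) = lam * sum(x_i^p) + 0.5 * x^T (Q - beta*I) x."""
    return lam * p * x ** (p - 1) + (Q - beta * np.eye(n)) @ x

# The two constants appearing on the right-hand side of Lemma 8.
lip = (lam * p * (1 - p)) ** 2 * (L * (1 - nu)) ** (2 * p - 4)
qnorm2 = np.linalg.norm(Q - beta * np.eye(n), 2) ** 2

for _ in range(100):
    x = L + rng.random(n)                        # x_i >= L
    d = np.maximum(rng.standard_normal(n), -x)   # x + d >= 0
    s = nu * (0.01 + 0.99 * rng.random())        # 0 < s <= nu < 1
    lhs = np.linalg.norm(grad_g(x + s * d) - grad_g(x)) ** 2
    rhs = 2 * s ** 2 * (lip + qnorm2) * (d @ d)
    assert lhs <= rhs + 1e-9
print("Lemma 8 bound verified on 100 random points")
```

A check like this does not prove the lemma, but it is a quick guard against sign or exponent errors when reimplementing the constants.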

Lemma 9

If \(x+d \ge 0\), \(x > 0\), and \(0< s < 1\), then \(d^T\left( \nabla g(x) - \nabla g(x+sd)\right) \le \frac{-s}{1-s} d^T \nabla ^2g(x) d\).

Proof

We consider \(g_1(x)\) and \(g_2(x)\) separately. From the definition of \(g_1(x)\), we have

$$\begin{aligned} d^T(\nabla g_1(x) - \nabla g_1(x+sd)) = \lambda p\sum _i (d_i(x_i^{p-1} - (x_i + sd_i)^{p-1})), \end{aligned}$$

and

$$\begin{aligned} -\frac{s}{1-s}d^T\nabla ^2g_1(x)d = \frac{s}{1-s}\lambda p(1-p)\sum _ix_i^{p-2}d_i^2. \end{aligned}$$

To prove that \(d^T\left( \nabla g_1(x) - \nabla g_1(x+sd)\right) \le \frac{-s}{1-s} d^T \nabla ^2g_1(x) d\), it suffices to show that \(d_i\left( x_i^{p-1} - (x_i + sd_i)^{p-1}\right) \le \frac{s}{1-s}(1-p)x_i^{p-2}d_i^2\) for every index i.

If \(d_i \ge 0\), by the mean-value theorem there exists an \(x_i'\in [x_i, x_i+sd_i]\) such that \(x_i^{p-1} - (x_i+sd_i)^{p-1} = x_i'^{p-2}(1-p)sd_i\). Since \(0< p < 1\) and \(x_i' \ge x_i\) implies \(x_i'^{p-2} \le x_i^{p-2}\), we have

$$\begin{aligned} d_ix_i'^{p-2}(1-p)sd_i \le s(1-p)x_i^{p-2}d_i^2 \le \frac{s}{1-s}(1-p)x_i^{p-2}d_i^2. \end{aligned}$$

If \(d_i < 0\), we let \(v = -\frac{d_i}{x_i}\), so that \(0 < v \le 1\) by \(x_i + d_i \ge 0\). Multiplying both sides by \(\frac{x_i^{1-p}}{-d_i} > 0\), the inequality

$$\begin{aligned} d_i\left( x_i^{p-1}-(x_i+sd_i)^{p-1}\right) \le \frac{s}{1-s}(1-p)x_i^{p-2}d_i^2, \end{aligned}$$

can be simplified as

$$\begin{aligned} (1-sv)^{p-1} - 1 \le \frac{sv}{1-s}(1-p). \end{aligned}$$

Since \(sv \le s\) implies \(1-sv \ge 1-s\), and hence

$$\begin{aligned} \frac{sv}{1-sv}(1-p) \le \frac{sv}{1-s}(1-p), \end{aligned}$$

letting \(u=sv\), it is sufficient to show that \((1-u)^{p-1}-1 \le \frac{u}{1-u}(1-p)\), which is equivalent to \(\omega (u) \le 0\) for the function \(\omega (u) = (1-u)^p - (1-u) - u(1-p)\), \(0< u < 1\). Since \(\omega '(u) = p\left( 1 - (1-u)^{p-1}\right) \le 0\), \(\omega (u)\) is a decreasing function. Hence \(\omega (u) \le \lim _{u \rightarrow 0}\omega (u) = 0\). It follows that \((1-u)^{p-1}-1 \le \frac{u}{1-u}(1-p)\) for all \(0< u < 1\). Thus, we obtain

$$\begin{aligned} pd_i\left( x_i^{p-1}-(x_i+sd_i)^{p-1}\right) \le \frac{s}{1-s}p(1-p)x_i^{p-2}d_i^2. \end{aligned}$$

Multiplying by \(\lambda \) and summing over i yields \(d^T\left( \nabla g_1(x) - \nabla g_1(x+sd)\right) \le \frac{-s}{1-s} d^T \nabla ^2g_1(x) d\).

For \(g_2(x)\), since \(0< s < 1\) and \(g_2\) is concave (so that \(d^T(\beta I - Q)d \ge 0\)), we have

$$\begin{aligned} \begin{aligned}&d^T\left( \nabla g_2(x) - \nabla g_2(x+sd)\right) \\&\quad = s\,d^T(\beta I - Q)d\\&\quad \le \frac{s}{1-s} d^T(\beta I - Q)d\\&\quad = \frac{-s}{1-s}d^T\nabla ^2g_2(x)d. \\ \end{aligned} \end{aligned}$$

Therefore,

$$\begin{aligned} \begin{aligned}&d^T(\nabla g(x) - \nabla g(x+sd))\\&\quad = d^T\left( \nabla g_1(x) - \nabla g_1(x+sd) + \nabla g_2(x) - \nabla g_2(x+sd)\right) \\&\quad \le \frac{-s}{1-s}d^T\nabla ^2g_1(x)d + \frac{-s}{1-s}d^T\nabla ^2g_2(x)d\\&\quad = \frac{-s}{1-s}d^T\nabla ^2g(x)d.\\ \end{aligned} \end{aligned}$$

\(\square \)

Remark Lemma 9 is a consequence of the special structure of the function g(x), which is the sum of an \(L_p\) term and a concave quadratic function.
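Not part of the original text: Lemma 9 can likewise be sanity-checked numerically. The instance below is an illustrative assumption; taking \(\beta = \lambda _{\max }(Q)\) makes the quadratic part concave, as the remark above requires.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
lam, p = 0.5, 0.5
M = rng.standard_normal((n, n))
Q = M + M.T
beta = np.linalg.eigvalsh(Q).max()
B = Q - beta * np.eye(n)                 # Hessian of the concave quadratic part g_2

def grad_g(x):
    """Gradient of g(x) = lam * sum(x_i^p) + 0.5 * x^T B x."""
    return lam * p * x ** (p - 1) + B @ x

def hess_g(x):
    """Hessian of g at x > 0."""
    return np.diag(lam * p * (p - 1) * x ** (p - 2)) + B

for _ in range(100):
    x = 0.1 + rng.random(n)                      # x > 0
    d = np.maximum(rng.standard_normal(n), -x)   # x + d >= 0
    s = 0.01 + 0.98 * rng.random()               # 0 < s < 1
    lhs = d @ (grad_g(x) - grad_g(x + s * d))
    rhs = -s / (1 - s) * (d @ hess_g(x) @ d)
    assert lhs <= rhs + 1e-9
print("Lemma 9 inequality verified on 100 random points")
```

Note that the right-hand side is nonnegative here: both the \(L_p\) term and the concave quadratic contribute \(d^T \nabla ^2 g(x) d \le 0\).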

Lemma 10

If \(\Delta \phi \ge \frac{\beta }{2}\Vert d\Vert ^2\), then

$$\begin{aligned} \Delta \phi - s(\Delta \phi - \frac{\beta }{2}\Vert d\Vert ^2 + \tau \Vert d\Vert ^2) + s^2(\mu \Vert d\Vert ^2) \le (1-s\delta )\Delta \phi , \end{aligned}$$

where \(\tau > 0\), \(\mu > 0\), \(0< \delta < \min \{\frac{2\tau }{\beta },1\}\), and \(0 < s \le \min \{\frac{1}{\mu }(\tau - \frac{\delta \beta }{2}), 1\}\).

Proof

$$\begin{aligned} \begin{aligned}&\Delta \phi - s\left( \Delta \phi - \frac{\beta }{2}\Vert d\Vert ^2 + \tau \Vert d\Vert ^2\right) + s^2\mu \Vert d\Vert ^2 - (1-s\delta )\Delta \phi \\&\quad = - s\left( \Delta \phi - \frac{\beta }{2}\Vert d\Vert ^2 + \tau \Vert d\Vert ^2\right) + s^2\mu \Vert d\Vert ^2 + s\delta \Delta \phi \\&\quad = s \left[ \left( s\mu + \frac{\beta }{2} - \tau \right) \Vert d\Vert ^2 - (1-\delta ) \Delta \phi \right] \\&\quad \le s \left[ \left( s\mu + \frac{\beta }{2} - \tau \right) \Vert d\Vert ^2 - \frac{(1-\delta )\beta }{2}\Vert d\Vert ^2\right] \\&\quad = s \Vert d\Vert ^2 \left( s\mu -\tau + \frac{\delta \beta }{2}\right) , \\ \end{aligned} \end{aligned}$$

where the inequality follows from \(\frac{\beta }{2}\Vert d\Vert ^2 \le \Delta \phi \). Due to \(0< \delta < \frac{2\tau }{\beta }\) and \(s \le \frac{1}{\mu }(\tau - \frac{\delta \beta }{2})\), we have

$$\begin{aligned} s\mu - \tau + \frac{\delta \beta }{2} \le 0. \end{aligned}$$

Therefore, \(\Delta \phi - s(\Delta \phi - \frac{\beta }{2}\Vert d\Vert ^2 + \tau \Vert d\Vert ^2) + s^2(\mu \Vert d\Vert ^2) \le (1-s\delta )\Delta \phi \). \(\square \)
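Not part of the original text: since Lemma 10 is a purely scalar statement, it is straightforward to check over random parameter draws. The positive constants below are arbitrary illustrative choices satisfying the stated conditions.

```python
import random

beta, tau, mu = 2.0, 1.0, 3.0
delta = 0.5 * min(2 * tau / beta, 1.0)           # 0 < delta < min{2*tau/beta, 1}
s_max = min((tau - delta * beta / 2) / mu, 1.0)  # upper bound on the step size s

random.seed(0)
for _ in range(1000):
    d2 = 10 * random.random()                    # plays the role of ||d||^2
    dphi = beta / 2 * d2 + random.random()       # Delta phi >= (beta/2) * ||d||^2
    s = s_max * (0.001 + 0.999 * random.random())
    lhs = dphi - s * (dphi - beta / 2 * d2 + tau * d2) + s ** 2 * mu * d2
    assert lhs <= (1 - s * delta) * dphi + 1e-12
print("Lemma 10 inequality verified on 1000 random parameter draws")
```

Taking \(s = \frac{1}{\mu }(\tau - \frac{\delta \beta }{2})\) and \(\Delta \phi = \frac{\beta }{2}\Vert d\Vert ^2\) makes the inequality tight, matching the proof's final step \(s\mu - \tau + \frac{\delta \beta }{2} \le 0\).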


About this article


Cite this article

Ge, D., He, R. & He, S. An improved algorithm for the \(L_2\)–\(L_p\) minimization problem. Math. Program. 166, 131–158 (2017). https://doi.org/10.1007/s10107-016-1107-2
