An improved algorithm for the \(L_2\)–\(L_p\) minimization problem

  • Full Length Paper
  • Series A
Mathematical Programming

Abstract

In this paper we consider a class of non-Lipschitz and non-convex minimization problems which generalizes the \(L_2\)–\(L_p\) minimization problem. We propose an iterative algorithm that chooses the next iterate based on the local convexity, concavity, and sparsity at its current position. We show that our algorithm finds an \(\epsilon \)-KKT point within \(O(\log {\epsilon ^{-1}})\) iterations from certain initial points. Under mild conditions, the same result extends to the problem with general linear constraints.
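For readers who want to experiment numerically, the unconstrained \(L_2\)–\(L_p\) objective \(\Vert Ax-b\Vert _2^2 + \lambda \Vert x\Vert _p^p\) can be written in a few lines of NumPy. This is a minimal sketch only; the data \(A\), \(b\) and the parameter values below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def l2_lp_objective(x, A, b, lam=0.1, p=0.5):
    """L2-Lp objective: least-squares data fit plus a nonconvex l_p penalty, 0 < p < 1."""
    r = A @ x - b
    return r @ r + lam * np.sum(np.abs(x) ** p)

# Tiny illustrative instance (random data, not from the paper).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
print(l2_lp_objective(np.array([1.0, 0.0, 0.5]), A, b))
```

Because \(0< p < 1\), the penalty term is non-convex and non-Lipschitz at zero, which is exactly the difficulty the paper's analysis addresses.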



Acknowledgements

The authors would like to thank the anonymous referees for their insightful comments and suggestions. Research by the first author is supported by the Program for Innovative Research Team of Shanghai University of Finance and Economics (IRTSHUFE) and by National Natural Science Foundation of China (NSFC) Project 11471205. Research by the third author is supported by IRTSHUFE, the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning, and by NSFC Project 71440014.

Author information

Correspondence to Rongchuan He.

Appendix


Let \(g_1(x) = \lambda \sum _i x_i^p\) and \(g_2(x) = \frac{1}{2}x^T(Q-\beta I)x\). Note that \(g(x) = g_1(x) + g_2(x)\).

Lemma 8

If \(x+d \ge 0\), \(x_i \ge L\) for every index i, and \(0< s \le \nu < 1\), then

$$\begin{aligned} \Vert \nabla g(x+sd) - \nabla g(x)\Vert ^2 \le 2s^2 \{ [\lambda p(1-p)]^2[L(1-\nu )]^{2p-4} + \Vert Q-\beta I\Vert ^2 \} \Vert d\Vert ^2 \end{aligned}$$

Proof

We consider \(g_1(x)\) and \(g_2(x)\) separately.

By the definition of \(g_1(x)\), we have

$$\begin{aligned} \begin{aligned}&\Vert \nabla g_1(x+sd) - \nabla g_1(x)\Vert \\&\quad \le \int _0^1 \Vert \nabla ^2g_1(x+tsd)sd\Vert \;\mathrm {d}t\\&\quad = s\lambda p(1-p)\int _0^1\sqrt{\sum _{i=1}^n(x_i+std_i)^{2p-4}d_i^2}\;\mathrm {d}t. \\ \end{aligned} \end{aligned}$$

Since \(x + tsd = (1-ts)x + ts(x+d) \ge (1-ts)x\) and \(ts \le s\), we have \(x_i + tsd_i \ge (1-s)L\) for every index i, and hence

$$\begin{aligned} \int _0^1\sqrt{\sum _{i=1}^n(x_i+std_i)^{2p-4}d_i^2}\mathrm {d}t \le \left( (1-s)L\right) ^{p-2}\Vert d\Vert . \end{aligned}$$

In conjunction with \(s \le \nu \) and \(0< p < 1\), we get

$$\begin{aligned} \Vert \nabla g_1(x+sd) - \nabla g_1(x)\Vert ^2 \le s^2[\lambda p(1-p)]^2[L(1-\nu )]^{2p-4}\Vert d\Vert ^2. \end{aligned}$$

For \(g_2(x)\), we have

$$\begin{aligned} \Vert \nabla g_2(x+sd) - \nabla g_2(x)\Vert ^2 \le s^2 \Vert Q - \beta I\Vert ^2 \Vert d\Vert ^2. \end{aligned}$$

Thus,

$$\begin{aligned} \begin{aligned}&\Vert \nabla g(x+sd) - \nabla g(x)\Vert ^2 \\&\quad \le 2 \left( \Vert \nabla g_1(x+sd) - \nabla g_1(x)\Vert ^2 + \Vert \nabla g_2(x+sd) - \nabla g_2(x)\Vert ^2\right) \\&\quad \le 2s^2\left\{ [\lambda p(1-p)]^2[L(1-\nu )]^{2p-4} + \Vert Q-\beta I\Vert ^2\right\} \Vert d\Vert ^2.\\ \end{aligned} \end{aligned}$$

\(\square \)

Remark Lemma 8 is a consequence of the fact that the gradient of g(x) is Lipschitz continuous on the region where \(x_i \ge L(1-\nu )\) for every index i.
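Not part of the original text: the bound of Lemma 8 can be sanity-checked numerically on random instances. Everything below (the dimension n, the random symmetric Q, the choice \(\beta = \lambda _{\max }(Q)\), and the values of \(\lambda , p, L, \nu \)) is an illustrative assumption chosen to satisfy the hypotheses.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
lam, p, L, nu = 0.5, 0.5, 0.2, 0.8
M = rng.standard_normal((n, n))
Q = M + M.T                              # symmetric matrix for the quadratic part
beta = np.linalg.eigvalsh(Q).max()       # any beta works for the Lemma 8 bound

def grad_g(x):
    """Gradient of g(x) = lam * sum(x_i^p) + 0.5 * x^T (Q - beta*I) x."""
    return lam * p * x ** (p - 1) + (Q - beta * np.eye(n)) @ x

# The two constants appearing on the right-hand side of Lemma 8.
lip = (lam * p * (1 - p)) ** 2 * (L * (1 - nu)) ** (2 * p - 4)
qnorm2 = np.linalg.norm(Q - beta * np.eye(n), 2) ** 2

for _ in range(100):
    x = L + rng.random(n)                        # x_i >= L
    d = np.maximum(rng.standard_normal(n), -x)   # x + d >= 0
    s = nu * (0.01 + 0.99 * rng.random())        # 0 < s <= nu < 1
    lhs = np.linalg.norm(grad_g(x + s * d) - grad_g(x)) ** 2
    rhs = 2 * s ** 2 * (lip + qnorm2) * (d @ d)
    assert lhs <= rhs + 1e-9
print("Lemma 8 bound verified on 100 random points")
```

A check like this does not prove the lemma, but it is a quick guard against sign or exponent errors when reimplementing the constants.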

Lemma 9

If \(x+d \ge 0\), \(x > 0\), and \(0< s < 1\), then \(d^T\left( \nabla g(x) - \nabla g(x+sd)\right) \le \frac{-s}{1-s} d^T \nabla ^2g(x) d\).

Proof

We consider \(g_1(x)\) and \(g_2(x)\) separately. From the definition of \(g_1(x)\), we have

$$\begin{aligned} d^T(\nabla g_1(x) - \nabla g_1(x+sd)) = \lambda p\sum _i (d_i(x_i^{p-1} - (x_i + sd_i)^{p-1})), \end{aligned}$$

and

$$\begin{aligned} -\frac{s}{1-s}d^T\nabla ^2g_1(x)d = \frac{s}{1-s}\lambda p(1-p)\sum _ix_i^{p-2}d_i^2. \end{aligned}$$

To prove that \(d^T\left( \nabla g_1(x) - \nabla g_1(x+sd)\right) \le \frac{-s}{1-s} d^T \nabla ^2g_1(x) d\), it suffices to show that \(d_i\left( x_i^{p-1} - (x_i + sd_i)^{p-1}\right) \le \frac{s}{1-s}(1-p)x_i^{p-2}d_i^2\) for every index i.

If \(d_i \ge 0\), by the mean-value theorem there exists an \(x_i'\in [x_i, x_i+sd_i]\) such that \(x_i^{p-1} - (x_i+sd_i)^{p-1} = x_i'^{p-2}(1-p)sd_i\). Since \(0< p < 1\) and \(x_i' \ge x_i\) implies \(x_i'^{p-2} \le x_i^{p-2}\), we have

$$\begin{aligned} d_ix_i'^{p-2}(1-p)sd_i \le s(1-p)x_i^{p-2}d_i^2 \le \frac{s}{1-s}(1-p)x_i^{p-2}d_i^2. \end{aligned}$$

If \(d_i < 0\), we let \(v = -\frac{d_i}{x_i}\), so that \(0 < v \le 1\) by \(x_i + d_i \ge 0\). Multiplying both sides by \(\frac{x_i^{1-p}}{-d_i} > 0\), the inequality

$$\begin{aligned} d_i\left( x_i^{p-1}-(x_i+sd_i)^{p-1}\right) \le \frac{s}{1-s}(1-p)x_i^{p-2}d_i^2, \end{aligned}$$

can be simplified as

$$\begin{aligned} (1-sv)^{p-1} - 1 \le \frac{sv}{1-s}(1-p). \end{aligned}$$

Since \(sv \le s\) implies \(1-sv \ge 1-s\), and hence

$$\begin{aligned} \frac{sv}{1-sv}(1-p) \le \frac{sv}{1-s}(1-p), \end{aligned}$$

letting \(u=sv\), it is sufficient to show that \((1-u)^{p-1}-1 \le \frac{u}{1-u}(1-p)\), which is equivalent to \(\omega (u) \le 0\) for the function \(\omega (u) = (1-u)^p - (1-u) - u(1-p)\), \(0< u < 1\). Since \(\omega '(u) = p\left( 1 - (1-u)^{p-1}\right) \le 0\), \(\omega (u)\) is a decreasing function. Hence \(\omega (u) \le \lim _{u \rightarrow 0}\omega (u) = 0\). It follows that \((1-u)^{p-1}-1 \le \frac{u}{1-u}(1-p)\) for all \(0< u < 1\). Thus, we obtain

$$\begin{aligned} pd_i\left( x_i^{p-1}-(x_i+sd_i)^{p-1}\right) \le \frac{s}{1-s}p(1-p)x_i^{p-2}d_i^2. \end{aligned}$$

Multiplying by \(\lambda \) and summing over i yields \(d^T\left( \nabla g_1(x) - \nabla g_1(x+sd)\right) \le \frac{-s}{1-s} d^T \nabla ^2g_1(x) d\).

For \(g_2(x)\), since \(0< s < 1\) and \(g_2\) is concave (so that \(d^T(\beta I - Q)d \ge 0\)), we have

$$\begin{aligned} \begin{aligned}&d^T\left( \nabla g_2(x) - \nabla g_2(x+sd)\right) \\&\quad = s\,d^T(\beta I - Q)d\\&\quad \le \frac{s}{1-s} d^T(\beta I - Q)d\\&\quad = \frac{-s}{1-s}d^T\nabla ^2g_2(x)d. \\ \end{aligned} \end{aligned}$$

Therefore,

$$\begin{aligned} \begin{aligned}&d^T(\nabla g(x) - \nabla g(x+sd))\\&\quad = d^T\left( \nabla g_1(x) - \nabla g_1(x+sd) + \nabla g_2(x) - \nabla g_2(x+sd)\right) \\&\quad \le \frac{-s}{1-s}d^T\nabla ^2g_1(x)d + \frac{-s}{1-s}d^T\nabla ^2g_2(x)d\\&\quad = \frac{-s}{1-s}d^T\nabla ^2g(x)d.\\ \end{aligned} \end{aligned}$$

\(\square \)

Remark Lemma 9 is a consequence of the special structure of the function g(x), which is the sum of an \(L_p\) term and a concave quadratic function.
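Not part of the original text: Lemma 9 can likewise be sanity-checked numerically. The instance below is an illustrative assumption; taking \(\beta = \lambda _{\max }(Q)\) makes the quadratic part concave, as the remark above requires.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
lam, p = 0.5, 0.5
M = rng.standard_normal((n, n))
Q = M + M.T
beta = np.linalg.eigvalsh(Q).max()
B = Q - beta * np.eye(n)                 # Hessian of the concave quadratic part g_2

def grad_g(x):
    """Gradient of g(x) = lam * sum(x_i^p) + 0.5 * x^T B x."""
    return lam * p * x ** (p - 1) + B @ x

def hess_g(x):
    """Hessian of g at x > 0."""
    return np.diag(lam * p * (p - 1) * x ** (p - 2)) + B

for _ in range(100):
    x = 0.1 + rng.random(n)                      # x > 0
    d = np.maximum(rng.standard_normal(n), -x)   # x + d >= 0
    s = 0.01 + 0.98 * rng.random()               # 0 < s < 1
    lhs = d @ (grad_g(x) - grad_g(x + s * d))
    rhs = -s / (1 - s) * (d @ hess_g(x) @ d)
    assert lhs <= rhs + 1e-9
print("Lemma 9 inequality verified on 100 random points")
```

Note that the right-hand side is nonnegative here: both the \(L_p\) term and the concave quadratic contribute \(d^T \nabla ^2 g(x) d \le 0\).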

Lemma 10

If \(\Delta \phi \ge \frac{\beta }{2}\Vert d\Vert ^2\), then

$$\begin{aligned} \Delta \phi - s(\Delta \phi - \frac{\beta }{2}\Vert d\Vert ^2 + \tau \Vert d\Vert ^2) + s^2(\mu \Vert d\Vert ^2) \le (1-s\delta )\Delta \phi , \end{aligned}$$

where \(\tau > 0\), \(\mu > 0\), \(0< \delta < \min \{\frac{2\tau }{\beta },1\}\), and \(0 < s \le \min \{\frac{1}{\mu }(\tau - \frac{\delta \beta }{2}), 1\}\).

Proof

$$\begin{aligned} \begin{aligned}&\Delta \phi - s\left( \Delta \phi - \frac{\beta }{2}\Vert d\Vert ^2 + \tau \Vert d\Vert ^2\right) + s^2\mu \Vert d\Vert ^2 - (1-s\delta )\Delta \phi \\&\quad = - s\left( \Delta \phi - \frac{\beta }{2}\Vert d\Vert ^2 + \tau \Vert d\Vert ^2\right) + s^2\mu \Vert d\Vert ^2 + s\delta \Delta \phi \\&\quad = s \left[ \left( s\mu + \frac{\beta }{2} - \tau \right) \Vert d\Vert ^2 - (1-\delta ) \Delta \phi \right] \\&\quad \le s \left[ \left( s\mu + \frac{\beta }{2} - \tau \right) \Vert d\Vert ^2 - \frac{(1-\delta )\beta }{2}\Vert d\Vert ^2\right] \\&\quad = s \Vert d\Vert ^2 \left( s\mu -\tau + \frac{\delta \beta }{2}\right) , \\ \end{aligned} \end{aligned}$$

where the inequality follows from \(\frac{\beta }{2}\Vert d\Vert ^2 \le \Delta \phi \). Due to \(0< \delta < \frac{2\tau }{\beta }\) and \(s \le \frac{1}{\mu }(\tau - \frac{\delta \beta }{2})\), we have

$$\begin{aligned} s\mu - \tau + \frac{\delta \beta }{2} \le 0. \end{aligned}$$

Therefore, \(\Delta \phi - s(\Delta \phi - \frac{\beta }{2}\Vert d\Vert ^2 + \tau \Vert d\Vert ^2) + s^2(\mu \Vert d\Vert ^2) \le (1-s\delta )\Delta \phi \). \(\square \)
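Not part of the original text: since Lemma 10 is a purely scalar statement, it is straightforward to check over random parameter draws. The positive constants below are arbitrary illustrative choices satisfying the stated conditions.

```python
import random

beta, tau, mu = 2.0, 1.0, 3.0
delta = 0.5 * min(2 * tau / beta, 1.0)           # 0 < delta < min{2*tau/beta, 1}
s_max = min((tau - delta * beta / 2) / mu, 1.0)  # upper bound on the step size s

random.seed(0)
for _ in range(1000):
    d2 = 10 * random.random()                    # plays the role of ||d||^2
    dphi = beta / 2 * d2 + random.random()       # Delta phi >= (beta/2) * ||d||^2
    s = s_max * (0.001 + 0.999 * random.random())
    lhs = dphi - s * (dphi - beta / 2 * d2 + tau * d2) + s ** 2 * mu * d2
    assert lhs <= (1 - s * delta) * dphi + 1e-12
print("Lemma 10 inequality verified on 1000 random parameter draws")
```

Taking \(s = \frac{1}{\mu }(\tau - \frac{\delta \beta }{2})\) and \(\Delta \phi = \frac{\beta }{2}\Vert d\Vert ^2\) makes the inequality tight, matching the proof's final step \(s\mu - \tau + \frac{\delta \beta }{2} \le 0\).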


About this article


Cite this article

Ge, D., He, R. & He, S. An improved algorithm for the \(L_2\)–\(L_p\) minimization problem. Math. Program. 166, 131–158 (2017). https://doi.org/10.1007/s10107-016-1107-2
