Abstract
In this paper we consider a class of non-Lipschitz and non-convex minimization problems which generalize the \(L_2\)–\(L_p\) minimization problem. We propose an iterative algorithm that decides the next iteration based on the local convexity/concavity/sparsity of its current position. We show that our algorithm finds an \(\epsilon \)-KKT point within \(O(\log {\epsilon ^{-1}})\) iterations from certain initial points. The same result is also applied to the problem with general linear constraints under mild conditions.
Similar content being viewed by others
References
Bian, W., Chen, X., Ye, Y.: Complexity analysis of interior point algorithms for non-Lipschitz and nonconvex minimization. Math. Program. 149(1–2), 301–327 (2012)
Bian, W., Chen, X.: Worst-case complexity of smoothing quadratic regularization methods for non-Lipschitzian optimization. SIAM J. Optim. 23(3), 1718–1741 (2013)
Candès, E., Wakin, M., Boyd, S.: Enhancing sparsity by reweighted \(l_1\) minimization. J. Fourier Anal. Appl. 14, 877–905 (2008)
Chartrand, R.: Exact reconstruction of sparse signals via nonconvex minimization. IEEE Signal Process. Lett. 14(10), 707–710 (2007)
Chartrand, R., Staneva, V.: Restricted isometry properties and nonconvex compressive sensing. Inverse Problem 24(3), 035020 (2008)
Chen, C., Li, X., Tolman, C., Wang, S., Ye, Y.: Sparse Portfolio Selection via Quasi-norm Regularization. Working paper (2013)
Chen, X., Xu, F., Ye, Y.: Lower bound theory of nonzero entries in solutions of \(\ell _2\)–\(\ell _p\) minimization. SIAM J. Sci. Comput. 32(5), 2832–2852 (2010)
Chen, X., Ge, D., Wang, Z., Ye, Y.: Complexity of unconstrained \(L_2\)–\(L_p\) minimization. Math. Program. 143(1–2), 371–383 (2014)
Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Soc. 96(456), 1348–1360 (2001)
Foucart, S., Lai, M.J.: Sparsest solutions of under-determined linear systems via \(l_q\) minimization for \(0< q\le 1\). Appl. Comput. Harmonic Anal. 26(3), 395–407 (2009)
Frank, I.E., Freidman, J.H.: A statistical view of some chemometrics regression tools. Technometrics 35(2), 109–148 (1993)
Garey, M.R., Johnson, D.S.: “Strong” NP-completeness results: motivation, examples, and implications. J. Assoc. Comput. Mach. 25(3), 499–508 (1978)
Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, New York (1979)
Ge, D., Jiang, X., Ye, Y.: A note on the complexity of \(L_p\) minimization. Math. Program. 129(2), 285–299 (2011)
Huang, J., Horowitz, J.L., Ma, S.: Asymptotic properties of bridge estimators in sparse high-dimensional regression models. Ann. Stat. 36(2), 587–613 (2008)
Knight, K., Fu, W.J.: Asymptotics for Lasso-type estimators. Ann. Stat. 28(5), 1356–1378 (2000)
Lai, M., Wang, Y.: An unconstrained \(l_q\) minimization with \(0 < q< 1\) for sparse solution of under-determined linear systems. SIAM J. Optim. 21(1), 82–101 (2011)
Liu, Y.F., Ma, S., Dai, Y.H., Zhang, S.: A smoothing SQP framework for a class of composite \(l_q\) minimization over polyhedron. Math. Program. 158(1), 467–500 (2016)
Natarajan, B.K.: Sparse approximate solutions to linear systems. SIAM J. Comput. 24(2), 227–234 (1995)
Vazirani, V.: Approximation Algorithms. Springer, Berlin (2003)
Ye, Y.: On the complexity of approximating a KKT point of quadratic programming. Math. Program. 80(2), 195–211 (1998)
Acknowledgements
The authors would like to thank anonymous referees for their insightful comments and suggestions. Research by the first author is supported by Program for Innovative Research Team of Shanghai University of Finance and Economics (IRTSHUFE) and by National Natural Science Foundation of China (NSFC) Project 11471205. Research by the third author is supported by Program for Innovative Research Team of Shanghai University of Finance and Economics (IRTSHUFE), the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning, and by National Natural Science Foundation of China (NSFC) Project 71440014.
Author information
Authors and Affiliations
Corresponding author
Appendix
Appendix
Let \(g_1(x) = \lambda \sum _i x_i^p\) and \(g_2(x) = \frac{1}{2}x^T(Q-\beta I)x\). Note that \(g(x) = g_1(x) + g_2(x)\).
Lemma 8
If \(x+d \ge 0\), \(x_i \ge L\) for every index i, and \(0< s \le \nu < 1\), then
Proof
We consider \(g_1(x)\) and \(g_2(x)\) separately.
By the definition of \(g_1(x)\), we have
Since \(x_i + tsd_i \ge (1-s)L\), for all index i, we have
In conjunction with \(s \le \nu \) and \(0< p < 1\), we get
For \(g_2(x)\), we have
Thus,
\(\square \)
Remark Lemma 8 is the consequence of that g(x)’s gradient is Lipschitz continuous in the region that \(x_i \ge L(1-\nu )\) for all index i.
Lemma 9
If \(x+d \ge 0\), \(x \ge 0\) and \(0 < s \le 1\), then \(d^T\left( \nabla g(x) - \nabla g(x+sd)\right) \le \frac{-s}{1-s} d^T \nabla ^2g(x) d\).
Proof
We consider \(g_1(x)\) and \(g_2(x)\) separately. From the definition of \(g_1(x)\), we have
and
To prove that \(d^T\left( \nabla g_1(x) - \nabla g_1(x+sd)\right) \le \frac{-s}{1-s} d^T \nabla ^2g_1(x) d\), it suffices to show that \(d_i(x_i^{p-1}) - (x_i + sd_i)^{p-1} \le \frac{s}{1-s}(1-p)x_i^{p-2}d_i^2, \forall i\).
If \(d_i \ge 0\), by the mean-value theorem, there exists a \(x_i'\in [x_i, x_i+sd_i]\), such that \(x_i^{p-1} - (x_i+sd_i)^{p-1} = x_i'^{p-2}(1-p)sd_i\). Since \(0< p < 1\), we have
If \(d_i < 0\), We let \(v = -\frac{d_i}{x_i}, 0 < v \le 1\). By multiplying \(\frac{x_i^{1-p}}{-d_i}\) on both sides,
can be simplified as
Since
letting \(u=sv\), it is sufficient to show that \((1-u)^{p-1}-1 \le \frac{u}{1-u}(1-p)\). Consider function \(\omega (u) = (1-u)^p - (1-u) - u(1-p)\), \(0 < u \le 1\). A simple calculation shows that \(\omega (u)\) is a decreasing function. Hence \(\omega (u) \le \lim _{u \rightarrow 0}\omega (u) = 0\). It follows that \((1-u)^{p-1}-1 \le \frac{u}{1-u}(1-p), \forall 0 < u \le 1\). Thus, we obtain
For \(g_2(x)\), since \(0 < s \le 1\), we have
Therefore,
\(\square \)
Remark Lemma 9 is a result of the special structure of function g(x), which is the sum of \(L_p\) function and a concave quadratic function.
Lemma 10
If \(\Delta \phi \ge \frac{\beta }{2}\Vert d\Vert ^2\), then
where \(\tau > 0\), \(\mu > 0\), \(0< \delta < \min \{\frac{2\tau }{\beta },1\}\), and \(0 < s \le \min \{\frac{1}{u}(\tau - \frac{\delta \beta }{2}), 1\}\)
Proof
where the inequality follows from \(\frac{\beta }{2}\Vert d\Vert ^2 \le \Delta \phi \). Due to \(0< \delta < \frac{2\tau }{\beta }\) and \(s \le \frac{1}{\mu }(\tau - \frac{\delta \beta }{2})\), we have
Therefore, \(\Delta \phi - s(\Delta \phi - \frac{\beta }{2}\Vert d\Vert ^2 + \tau \Vert d\Vert ^2) + s^2(\mu \Vert d\Vert ^2) \le (1-s\delta )\Delta \phi \). \(\square \)
Rights and permissions
About this article
Cite this article
Ge, D., He, R. & He, S. An improved algorithm for the \(L_2\)–\(L_p\) minimization problem. Math. Program. 166, 131–158 (2017). https://doi.org/10.1007/s10107-016-1107-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10107-016-1107-2