Abstract
The standard assumption for proving linear convergence of first order methods for smooth convex optimization is the strong convexity of the objective function, an assumption which does not hold for many practical applications. In this paper, we derive linear convergence rates of several first order methods for solving smooth non-strongly convex constrained optimization problems, i.e. involving an objective function with a Lipschitz continuous gradient that satisfies some relaxed strong convexity condition. In particular, in the case of smooth constrained convex optimization, we provide several relaxations of the strong convexity conditions and prove that they are sufficient for getting linear convergence for several first order methods such as projected gradient, fast gradient and feasible descent methods. We also provide examples of functional classes that satisfy our proposed relaxations of strong convexity conditions. Finally, we show that the proposed relaxed strong convexity conditions cover important applications ranging from solving linear systems, Linear Programming, and dual formulations of linearly constrained convex problems.
Similar content being viewed by others
Notes
This result can be also proved using simple algebraic arguments. More precisely, from Courant-Fischer theorem we know that \(\Vert Ax\Vert \ge \sigma _{\min }(A) \Vert x\Vert \) for all \(x \in \text {Im}(A^T)\). Since we assume that our polyhedron \(\mathcal{P}=\{x: \; Ax=b\}\) is non-empty, then \(x - [x]_\mathcal{P} \in \text {Im}(A^T)\) for all \(x \in \mathbb {R}^n\) (from KKT conditions of \(\min _{z: Az=b} \Vert x - z\Vert ^2\) we have that there exists \(\mu \) such that \(x - [x]_{\mathcal{P}} + A^T \mu =0\)). In conclusion, we get:
$$\begin{aligned} \Vert Ax - b\Vert = \Vert Ax - A [x]_{\mathcal{P}} \Vert \ge \sigma _{\text {min}}(A) \Vert x - [x]_{\mathcal{P}}\Vert = \sigma _{\text {min}}(A) \text {dist}_2(x,\mathcal {P}) \quad \forall x \in \mathbb {R}^n. \end{aligned}$$See Remark 1 below for an example satisfying this condition.
References
Leventhal, D., Lewis, A.S.: Randomized methods for linear constraints: convergence rates and conditioning. Math. Oper. Res. 35(3), 641–654 (2010)
Liu, J., Wright, S., Re, C., Bittorf, V., Sridhar, S.: An asynchronous parallel stochastic coordinate descent algorithm. J. Mach. Learn. Res. 16(1), 285–322 (2015)
Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. Kluwer, Dordrecht (2004)
Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)
Wright, S.: Coordinate descent algorithms. Math. Program. 151(1), 3–34 (2015)
Burke, J.V., Deng, S.: Weak sharp minima revisited Part III: error bounds for differentiable convex inclusions. Math. Program. 116(1–2), 37–56 (2009)
Lewis, A.S., Pang, J.S.: Error bounds for convex inequality systems. In: Chapter In: Generalized Convexity, Generalized Monotonicity–Recent Results. Springer, Berlin (1998)
Yangy, T., Lin, Q.: A stochastic gradient method with linear convergence rate for a class of non-smooth non-strongly convex optimization. Tech. rep. (2015). www.arxiv.org
Luo, Z.-Q., Tseng, P.: Error bounds and convergence analysis of feasible descent methods: a general approach. Ann. Oper. Res. 46(1), 157–178 (1993)
Necoara, I., Clipici, D.: Parallel random coordinate descent method for composite minimization: convergence analysis and error bounds. SIAM J. Optim. 26(1), 197–226 (2016)
Wang, P.W., Lin, C.J.: Iteration complexity of feasible descent methods for convex optimization. J. Mach. Learn. Res. 15(4), 1523–1548 (2014)
Zhang, H., Cheng, L.: Restricted strong convexity and its applications to convergence analysis of gradient type methods in convex optimization. Optim. Lett. 9(5), 961–979 (2015)
Beck, A., Shtern, S.: Linearly convergent away-step conditional gradient for non-strongly convex functions. Math. Program. 164(1–2), 1–27 (2017)
Drusvyatskiy, D., Lewis, A.: Error bounds, quadratic growth, and linear convergence of proximal methods. Tech. rep., (2016). (arXiv:1602.06661)
Zhou, Z., So, A.: A unified approach to error bounds for structured convex optimization problems. Math. Program. 165(2), 689–728 (2017)
Hoffman, A.J.: On approximate solutions of systems of linear inequalities. J. Res. Natl. Bur. Stand. 49(4), 263–265 (1952)
Klatte, D., Thiere, G.: Error bounds for solutions of linear equations and inequalities. Math. Methods Oper. Res. 41(2), 191–214 (1995)
Nesterov, Y.: Gradient methods for minimizing composite functions. Math. Program. 140(1), 125–161 (2013)
O’Donoghue, B., Candes, E.: Adaptive restart for accelerated gradient schemes. Found. Comput. Math. 15(3), 715–732 (2013)
Bubeck, S.: Convex optimization: algorithms and complexity. Found. Trends Mach. Learn. 8(3–4), 231–357 (2015)
O’Donoghue, B., Chu, E., Parikh, N., Boyd, S.: Conic optimization via operator splitting and homogeneous self-dual embedding. J. Optim. Theory Appl. 169(3), 1042–1068 (2016)
Lin, H., Mairal, J., Harchaoui, Z.: A universal Catalyst for first-order optimization. In: Advances in neural information processing systems, 3384–3392 (2015)
Acknowledgements
The research leading to these results has received funding from the Executive Agency for Higher Education, Research and Innovation Funding (UEFISCDI), Romania: PN-III-P4-PCE-2016-0731, project ScaleFreeNet, No. 39/2017.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Necoara, I., Nesterov, Y. & Glineur, F. Linear convergence of first order methods for non-strongly convex optimization. Math. Program. 175, 69–107 (2019). https://doi.org/10.1007/s10107-018-1232-1
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10107-018-1232-1