
An outer–inner linearization method for non-convex and nondifferentiable composite regularization problems

Published: 01 September 2021

Abstract

We propose a new outer–inner linearization method for non-convex statistical learning problems involving non-convex structural penalties and non-convex loss. Many important statistical problems fall into this category, including robust M-estimation, generalized linear models, and various structured learning problems. Our method exploits the fact that many such loss and penalty functions can be represented as compositions of smooth concave functions and nonsmooth convex functions. It linearizes the outer concave functions and solves the resulting convex, but still nonsmooth, subproblems by a special alternating linearization method. We prove that, under mild conditions, the method converges to a stationary point. Finally, numerical examples involving non-convex structural penalties and non-convex loss functions demonstrate the efficacy and accuracy of the method.
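The abstract gives no pseudocode, but the outer linearization step it describes can be illustrated in a familiar special case: a concave sparsity penalty (here MCP) applied to |x_j|, where linearizing the concave outer function at the current iterate yields a weighted l1 subproblem. The sketch below is an assumption-laden illustration, not the authors' algorithm; in particular, a plain proximal-gradient (ISTA) loop stands in for the paper's alternating linearization inner solver, and all function names and parameter values are invented for the example.

```python
import numpy as np

def mcp_deriv(t, lam, gamma):
    # Derivative of the MCP penalty p(t) = lam*t - t^2/(2*gamma) on [0, lam*gamma],
    # constant beyond; p is concave and smooth on t >= 0, so its linearization
    # at the current iterate majorizes it.
    return np.where(t <= lam * gamma, lam - t / gamma, 0.0)

def weighted_lasso_ista(A, b, w, x0, step, n_iter=200):
    # Inner solver (stand-in for alternating linearization):
    # proximal gradient for  min_x 0.5*||Ax - b||^2 + sum_j w_j * |x_j|
    x = x0.copy()
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - b))          # gradient step on the smooth part
        x = np.sign(z) * np.maximum(np.abs(z) - step * w, 0.0)  # soft-thresholding prox
    return x

def outer_inner(A, b, lam=1.0, gamma=3.0, n_outer=10):
    # Outer loop: linearize the concave penalty at x^k, producing weights
    # w_j = p'(|x_j^k|), then solve the resulting convex weighted-l1 subproblem.
    x = np.zeros(A.shape[1])
    step = 1.0 / np.linalg.norm(A, 2) ** 2          # 1/L for the least-squares term
    for _ in range(n_outer):
        w = mcp_deriv(np.abs(x), lam, gamma)
        x = weighted_lasso_ista(A, b, w, x, step)
    return x
```

On noiseless data the scheme behaves like iteratively reweighted l1: the first outer pass is an ordinary lasso, and later passes remove the shrinkage bias on large coefficients because the MCP derivative vanishes beyond lam*gamma.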


          Published In

Journal of Global Optimization, Volume 81, Issue 1 (September 2021), 266 pages

          Publisher

          Kluwer Academic Publishers

          United States

          Publication History

          Published: 01 September 2021
          Accepted: 25 May 2021
          Received: 08 August 2019

          Author Tags

          1. Nonsmooth optimization
          2. Non-convex loss
          3. Non-convex regularization
          4. Machine learning
          5. Statistics

          Qualifiers

          • Research-article
