Minimization of Non-smooth, Non-convex Functionals by Iterative Thresholding

Kristian Bredies¹,
Dirk A. Lorenz² &
Stefan Reiterer¹

1640 Accesses
48 Citations
Explore all metrics

Abstract

Convergence analysis is carried out for a forward-backward splitting/generalized gradient projection method for the minimization of a special class of non-smooth and genuinely non-convex minimization problems in infinite-dimensional Hilbert spaces. The functionals under consideration are the sum of a smooth, possibly non-convex and non-smooth, necessarily non-convex functional. For separable constraints in the sequence space, we show that the generalized gradient projection method amounts to a discontinuous iterative thresholding procedure, which can easily be implemented. In this case we prove strong subsequential convergence and moreover show that the limit satisfies strengthened necessary conditions for a global minimizer, i.e., it avoids a certain set of non-global minimizers. Eventually, the method is applied to problems arising in the recovery of sparse data, where strong convergence of the whole sequence is shown, and numerical tests are presented.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

On convergence of iterative thresholding algorithms to approximate sparse solution for composite nonconvex optimization

Article Open access 06 March 2024

A conjugate gradient like method for p-norm minimization in functional spaces

Article 07 June 2017

Regularized gradient-projection methods for finding the minimum-norm solution of the constrained convex minimization problem

Article Open access 09 January 2017

References

Nocedal, J., Wright, S.: Numerical Optimization. Springer Series in Operations Research and Financial Engineering, 2nd edn. Springer, New York (2006)
MATH Google Scholar
Pytlak, R.: Conjugate Gradient Algorithms in Nonconvex Optimization, Nonconvex Optimization and Its Applications, vol. 89. Springer, Berlin (2009)
Google Scholar
Nikolova, M.: Analysis of the recovery of edges in images and signals by minimizing nonconvex regularized least-squares. Multiscale Modell. Simul. 4(3), 960–991 (2005)
Article MATH MathSciNet Google Scholar
Nikolova, M.: Markovian reconstruction using a GNC approach. IEEE Trans. Image Process. 8(9), 1204–1220 (1999)
Article Google Scholar
Blake, A., Zisserman, A.: Visual Reconstruction. MIT Press, Cambridge (1987)
Google Scholar
Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979)
Article MATH MathSciNet Google Scholar
Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward–backward splitting. Multiscale Model. Simul. 4(4), 1168–1200 (2005)
Article MATH MathSciNet Google Scholar
Bredies, K., Lorenz, D.A.: Linear convergence of iterative soft-thresholding. J. Fourier Anal. Appl. 14(5–6), 813–837 (2008). doi:10.1007/s00041-008-9041-1
Article MATH MathSciNet Google Scholar
Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math. Program. Ser. B. (2011). doi:10.1007/s10107-011-0484-9.
Ito, K., Kunisch, K.: A variational approach to sparsity optimization based on Lagrange multiplier theory. MOBIS SFB-Report 2011–2014 (2011). http://math.uni-graz.at/mobis/publications/SFB-Report-2011-014.pdf.
Chouzenoux, E., Pesquet, J.C., Repetti, A.: Variable metric forward–backward algorithm for minimizing the sum of a differentiable function and a convex function. To appear J. Optim. Theory Appl. (2013). http://www.optimization-online.org/DB_HTML/2013/01/3749.html
Ochs, P., Chen, Y., Brox, T., Pock, T.: iPiano: Inertial proximal algorithm for non-convex optimization. (2013). http://lmb.informatik.uni-freiburg.de/Publications/2013/OB13
Kaplan, A., Tichatschke, R.: Proximal point methods and nonconvex optimization. J. Global Optim. 13, 389–406 (1998)
Article MATH MathSciNet Google Scholar
Lewis, A.S., Luke, D.R., Malick, J.: Local linear convergence for alternating and averaged nonconvex projections. Found. Comput. Math. 9, 485–513 (2009)
Article MATH MathSciNet Google Scholar
Rockafellar, R.T., Wets, R.J.B.: Variational Analysis, Grundlehren der Mathematischen Wissenschaften, vol. 317. Springer, Berlin (1998)
Google Scholar
Penot, J.P.: Proximal mappings. J. Approx. Theory 94, 203–221 (1998)
Article MATH MathSciNet Google Scholar
Hare, W., Sagastizábal, C.: Computing proximal points of nonconvex functions. Math. Program. Ser. B 116(1–2), 221–258 (2009)
Article MATH Google Scholar
Demyanov, V.F., Rubinov, A.M.: Approximate Methods in Optimization Problems. Modern Analytic and Computational Methods in Science and Mathematics, vol. 32. American Elsevier, New York (1970)
Google Scholar
Dunn, J.C.: Global and asymptotic convergence rate estimates for a class of projected gradient processes. SIAM J. Control Optim. 19(3), 368–400 (1981)
Article MATH MathSciNet Google Scholar
Clarke, F.H.: Optimization and Nonsmooth Analysis. CRM Université de Montréal, Montréal (1989)
MATH Google Scholar
Bredies, K., Lorenz, D.A.: Regularization with non-convex separable constraints. Inverse Problems, 25(8), 085011 (2009). doi:10.1088/0266-5611/25/8/085011
Blumensath, T., Yaghoobi, M., Davies, M.: Iterative hard thresholding and $l^0$ regularisation. In: IEEE International Conference on Acoustics, Speech and Signal Processing (2007)
Moulin, P., Liu, J.: Analysis of multiresolution image denoising schemes using generalized Gaussian and complexity priors. IEEE Trans. Inform. Theory 45(3), 909–919 (1999)
Article MATH MathSciNet Google Scholar
Lorenz, D.A.: Non-convex variational denoising of images: interpolation between hard and soft wavelet shrinkage. Curr. Dev. Theory Appl. Wavelets 1(1), 31–56 (2007)
MATH MathSciNet Google Scholar
Lorenz, D.A.: Convergence rates and source conditions for Tikhonov regularization with sparsity constraints. J. Inverse Ill-Posed Problems 16(5), 463–478 (2008). doi:10.1515/JIIP.2008.025
Article MATH MathSciNet Google Scholar
Grasmair, M.: Well-posedness classes for sparse regularisation. Reports of FSP S092-“Industrial Geometry”. University of Vienna, Austria (2010)
Lai, M.J., Wang, J.: An unconstrained $\ell _q$ minimization with $0 < q\le 1$ for sparse solution of underdetermined linear systems. SIAM J. Optim. 21(1), 82–101 (2011)
Sturmfels, B.: What is a Gröbner basis? Notices Am. Math. Soc. 52(10), 1199–1200 (2005)
MATH MathSciNet Google Scholar
Buchberger, B.: Introduction to Gröbner bases. In: Buchberger, B., Winkler, F. (eds.) Gröbner Basis and Applications. London Mathematical Society Lecture Notes Series, vol. 251, pp. 3–31. Cambridge University Press, Cambridge (1998)
Kreuzer, M., Robbiano, L.: Computational Commutative Algebra, vol. 1. Springer, New York (2000)
Book MATH Google Scholar

Download references

Acknowledgments

Kristian Bredies acknowledges support by the SFB Research Center “Mathematical Optimization and Applications in Biomedical Sciences” at the University of Graz. Dirk Lorenz acknowledges support by the DFG under Grant LO 1436/2-1.

Author information

Authors and Affiliations

Institute of Mathematics and Scientific Computing, University of Graz, Heinrichstraße 36, 8010 , Graz, Austria
Kristian Bredies & Stefan Reiterer
Institute for Analysis and Algebra, TU Braunschweig, 38092 , Braunschweig, Germany
Dirk A. Lorenz

Authors

Kristian Bredies
View author publications
You can also search for this author in PubMed Google Scholar
Dirk A. Lorenz
View author publications
You can also search for this author in PubMed Google Scholar
Stefan Reiterer
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Dirk A. Lorenz.

Additional information

Communicated by Hedy Attouch.

Appendix

1.1 8.1 Proof of Lemma 3.2

As $\phi $ is non-decreasing, $\phi '(y) \ge 0$ for all $y > 0$, hence $\lim _{y \rightarrow \infty } \phi '(y) \ge 0$ (with $\infty $ allowed). Thus, each $\rho _s$ is a strictly convex function with $\rho _s(y) \rightarrow \infty $ for $y \rightarrow 0$ and $y \rightarrow \infty $, which implies that a unique minimizer $y_s > 0$ exists.

Next, note that since $\rho _s$ is strictly convex for some $s > 0$, it is locally Lipschitz. Consequently, $\phi '$ is locally Lipschitz and in particular absolutely continuous on compact intervals contained in ${]{0,\infty }[}$. This implies that $\phi '$ is differentiable almost everywhere in ${]{0,\infty }[}$, with $\phi '(y_1) - \phi '(y_0) = \int _{y_0}^{y_1} \phi ''(t) \,\mathrm {d}{t}$ for each $y_1 > y_0 > 0$. Moreover, by assumption $t \mapsto t\phi ''(t)$ is Lebesgue integrable on $[0,y]$ for each $y > 0$, hence Fubini’s theorem is applicable. We compute

$$\begin{aligned} \phi (y) - y\phi '(y) = \phi (0) + \int \limits _0^y (\phi '(\tau ) - \phi '(y)) \,\mathrm {d}{\tau } = - \int \limits _0^y \int \limits _{\tau }^y \phi ''(t) \,\mathrm {d}{t} \,\mathrm {d}{\tau } = -\int \limits _0^y t\phi ''(t) \,\mathrm {d}{t}. \end{aligned}$$

Since $(\phi ')'_-$ is strictly monotonically increasing and coincides with $\phi ''$ where the latter exists, i.e., almost everywhere in ${]{0,\infty }[}$, we get for $y > 0$ and any $z \ge (\phi ')'_-(y)$ that

$$\begin{aligned} - \int \limits _0^y t \phi ''(t) \,\mathrm {d}{t} = \int \limits _0^y t (\phi ')'_-(t) \,\mathrm {d}{t} > -z \int \limits _0^y t \,\mathrm {d}{t} = -\frac{z y^2}{2}. \end{aligned}$$

This already proves $\psi (y) > -z$. For $\psi $ being strictly decreasing, consider $y > 0$ where $\phi ''(y)$ exists and deduce

$$\begin{aligned} \psi '(y) = \frac{2}{y} \left( \frac{2}{y^2} \int \limits _0^y t \phi ''(t) \,\mathrm {d}{t} - \phi ''(y) \right) < 0 \quad \Leftrightarrow \quad \psi (y) > -\phi ''(y), \end{aligned}$$

where the latter has already been established. Hence, $\psi ' < 0$ almost everywhere in ${]{0,\infty }[}$, and as $\psi $ is also locally Lipschitz in ${]{0,\infty }[}$, it is absolutely continuous in each compact interval contained in ${]{0,\infty }[}$. Consequently, one can conclude that $\psi $ is strictly monotonically decreasing. Moreover, $\psi (y) > -(\phi ')'_-(y)$ also implies $\psi (y) \rightarrow \infty $ as $y \rightarrow 0$, since $\phi '(y)$ is bounded around $0$ whenever $(\phi ')'_-(y)$ is bounded around $0$. Finally, from the assumptions $\phi (y) \rightarrow \infty $ and $\phi '(y)/y \rightarrow 0$ as $y \rightarrow \infty $, it follows with L’Hôpital’s rule that $\phi (y)/y^2 \rightarrow 0$. Consequently, $\psi (y) \rightarrow 0$ as $y \rightarrow \infty $ and, together with the above, $\psi : {]{0,\infty }[} \rightarrow {]{0,\infty }[}$ is onto. $\square $

1.2 8.2 Tools from Algebraic Geometry

This section contains some technical lemmas for algebraic equations in order to prove Proposition 5.1. We make here use of the theory of Gröbner bases for solving algebraic equations.

We present the most important notations and definitions, which are needed for the theory here for the sake of convenience and in order to provide a uniform notation. A brief introduction to Gröbner bases can also be found in [28].

Definition 8.1

(Some notations and definitions in polynomial rings)

We denote by $\mathbf {P}$ the polynomial ring $\mathbf {C}[X_1,\ldots ,X_n]$ in $n$ variables with complex coefficients.
A subgroup $I$ of $\mathbf {P}$ with respect to $+$ is called an ideal of $\mathbf {P}$, iff for all $f \in \mathbf {P}$ and all $g \in I$ the product $f g$ is also in $I$. We write $I \vartriangleleft \mathbf {P}$ if $I$ is an ideal of $\mathbf {P}$.
Let $F$ be a subset of $\mathbf {P}$. The set
$$\begin{aligned} \langle F \rangle = \bigcap \limits _{F \subset I \vartriangleleft \mathbf {P}} I \end{aligned}$$
is called the ideal generated by $F$. It holds that
$$\begin{aligned} \langle f_1,\ldots ,f_m\rangle = \left\{ \sum \limits _{i=1}^m {g_i f_i}:{f_i \in F, \,g_i \in \mathbf {P}\text { for } i = 1,\ldots ,m}\right\} . \end{aligned}$$
For $\alpha \in \mathbf {N}^n_0$ the power product $X^{\alpha }$ is defined by $X^{\alpha } := X_1^{\alpha _1} \cdots X^{\alpha _n}_n$.
We denote with $\mathbf {T}\subset \mathbf {P}$ the set $\{{ X^{\alpha }}\ : \ {\alpha \in \mathbf {N}_0^n}\}$ of power products. A power product $X^\alpha $ is mixed if $\alpha _i > 0$ for at least two $i$.
The lexicographical order $<_{{{\mathrm{lex}}}}$ of multi-indices $\alpha , \beta \in \mathbf {N}^n_0$ is defined as
$$\begin{aligned} \alpha <_{{{\mathrm{lex}}}} \beta \quad \Leftrightarrow \quad \exists m > 0: (\forall i < m: \alpha _i = \beta _i) \wedge \alpha _m < \beta _m. \end{aligned}$$
The total degree lexicographical order $\prec $ on the set $\mathbf {T}$ of power products is defined by
$$\begin{aligned}&t_1 \prec t_2 \Leftrightarrow t_1:=X^{\alpha }, t_2 := X^{\beta }, \quad \alpha , \beta \in \mathbf {N}^n_0, \\&\quad |{\alpha }| < |{\beta }| \vee \left( |{\alpha }| = |{\beta }| \wedge \alpha <_{{{\mathrm{lex}}}} \beta \right) \end{aligned}$$
with $|{\alpha }| = \sum _{j=1}^n \alpha _j$.
For a polynomial $f \in \mathbf {P}$ we define the coefficient of the power product $t$ of $f$ as $C(f,t)$.
We define the support of the polynomial $f$ as ${{\mathrm{supp}}}(f) := \{{ t \in \mathbf {T}}\ : \ {C(f,t) \ne 0}\}$.
The degree of the polynomial $f$ is defined by $\deg (f) := \max \ \{{ |{\alpha }|}\ : \ {X^\alpha \in {{\mathrm{supp}}}(f)}\}$, while the degree with respect to $X_i$ is given by $\deg _{X_i}(f) := \max \ \{{ \alpha _i}\ : \ {X^\alpha \in {{\mathrm{supp}}}(f)}\}$.
The leading powerproduct of $f$ is defined by ${{\mathrm{LPP}}}(f) := \max _{\prec }\{{t}\ : \ {t \in {{\mathrm{supp}}}(f)}\}$.
The trailing term of $f$ is the smallest powerproduct in ${{\mathrm{supp}}}(f)$ with respect to $\prec $.

Definition 8.2

(Varieties) Let $I \subset \mathbf {P}$ be a set of polynomials. Then the variety of $I$ is defined as $ V(I) := \{{ c \in \mathbf {C}^n}\ : \ {f(c) = 0 \text { for all } f \in I}\}$.

Definition 8.3

Let $F \subset \mathbf {P}$. The initial ideal with respect to the order $\prec $ is defined by ${{\mathrm{In}}}_{\prec }(F) := \left\langle \{{{{\mathrm{LPP}}}(f)}\ : \ {f \in F}\} \right\rangle $.

Definition 8.4

(Gröbner basis) Let $I \vartriangleleft \mathbf {P}$ be an ideal and $\prec $ the lexicographical total degree order. A finite $G \subset I$ such that ${{\mathrm{In}}}_{\prec }(I) = {{\mathrm{In}}}_{\prec }(G)$ is called a Gröbner basis for $I$ with respect to $\prec $.

We call $G$ a reduced Gröbner basis iff additionally

(i)
each $g \in G$ is normed, i.e., the leading coefficient, $C(g,{{\mathrm{LPP}}}(g))$, is $1$,
(ii)
the set $\{{{{\mathrm{LPP}}}(g)}\ : \ {g \in G}\}$ minimally generates ${{\mathrm{In}}}_{\prec }(I)$, i.e., for all $g\in G$ it holds that ${{\mathrm{In}}}_{\prec }(G{\setminus }\{g\}) \ne {{\mathrm{In}}}_{\prec }(G)$, and
(iii)
no trailing term of any $g\in G$ lies in ${{\mathrm{In}}}_{\prec }(I)$.

One can show that a reduced Gröbner basis exists for each ideal $I$ and that it is unique. We will use the following characterization of Gröbner bases (the proof can be found in [29, Sect. 3.5] or [30, Theorem 2.4.1]).

Theorem 8.1

Let $I \vartriangleleft \mathbf {P}$ be an ideal and let $G \subset I$ be a finite set of normed polynomials. Then, the following statements are equivalent:

(i)
$G$ is a Gröbner basis for $I$ with respect to $\prec $.
(ii)
It holds that $ \{{{{\mathrm{LPP}}}(f)}\ : \ {f \in I}\} = \{{t {{\mathrm{LPP}}}(g)}\ : \ {g \in G, \ t \in \mathbf {T}}\}$.

Theorem 8.2 is also from the theory of Gröbner bases and gives a connection between the finiteness of varieties and the structure of the reduced Gröbner basis (the proof can be found in [29, Sect. 3.6], [30, Proposition 3.7.1]).

Theorem 8.2

Let $I \vartriangleleft \mathbf {P}$ be an ideal, and $G$ the reduced Gröbner basis with respect to $\prec $. Then, the following statements are equivalent:

(i)
$V(I)$ is finite.
(ii)
There exists $g_1,\ldots ,g_n \in G$, and $m_1,\ldots ,m_n \in \mathbf {N}_0$, such that ${{\mathrm{LPP}}}(g_i) = X_i^{m_i}$.
(iii)
The quotient space $\mathbf {P}/I$ is a finite-dimensional $\mathbf {C}$ vector space.

Finally, we have all the tools to obtain the desired results about the solutions of algebraic equations. They are formulated in the following two lemmas.

Lemma 8.1

Let $n \in \mathbf {N}$, $A \in \mathbf {C}^{n \times n}$ be a Hermitian positive definite matrix. The system of quadratic equations ${{\mathrm{diag}}}(X) A X = 0$, has only the trivial solution $X = 0$.

Proof

Let $X$ be a solution of ${{\mathrm{diag}}}(X)AX = 0$. Then, for each $i$, we have $X_i =0$ or $(AX)_i = 0$. Hence, we can rearrange the indices $1,\ldots ,n$ in such a way that for a $1 \le k \le n$, that $X_1,\ldots ,X_k \not = 0$, and $X_{k+1},\ldots ,X_n = 0$. In order to exchange the indices $i$ and $j$, we have to exchange the $i$-th and the $j$-th rows as well as the $i$-th and $j$-th columns of $A$. Since the eigenvalues of $A$ are invariant under those operations, the resulting new matrix $\tilde{A}$ is also positive definite and Hermitian. Since $X_{k+1},\ldots , X_n = 0$, we can eliminate the corresponding rows and columns of the system. Denoting the resulting matrix by $\tilde{\tilde{A}}$ and $\tilde{X} = (X_1,\ldots ,X_k)$, it follows that

$$\begin{aligned} \tilde{\tilde{A}} \tilde{X} = 0. \end{aligned}$$

From the Sylvester criterion we know that $\tilde{\tilde{A}}$ is positive definite as it is a submatrix of $\tilde{A}$. Therefore, $\tilde{X} = 0$, and hence, $X = 0$. $\square $

Lemma 8.2

Let $A \in \mathbf {C}^{n\times n}$ be a positive definite, Hermitian matrix and $b,c,d \in \mathbf {C}^n$. Then, the system

$$\begin{aligned}&{{\mathrm{diag}}}(X) A X + {{\mathrm{diag}}}(X) b + {{\mathrm{diag}}}(X) {{\mathrm{diag}}}(a) c + {{\mathrm{diag}}}(a) d = 0, \\&a_i^r = X_i^{q}, \qquad i=1,\ldots ,n, \end{aligned}$$

with $r > q$ has only finitely many solutions.

Proof

Define the polynomials

$$\begin{aligned} f_i := f(X_1,\ldots ,X_n) = X_i \left( \,\sum \limits _{j=1}^n A_{ij} X_j \right) + X_i b_i + X_i a_i c_i + a_i d_i = 0 \end{aligned}$$

and

$$\begin{aligned} \tilde{f}_i = X_i \left( \sum \limits _{j=1}^n A_{ij} X_j \right) . \end{aligned}$$

Furthermore set $\tilde{I}= \langle \tilde{f}_1,\ldots ,\tilde{f}_n \rangle $.

Let $\tilde{G}$ be the reduced Gröbner basis for $\tilde{I}$ with respect to the total degree lexicographical order $\prec $. From Lemma 8.1 and Theorem 8.2 we know that there are $m_1,\ldots ,m_n \in \mathbf {N}$ such that ${{\mathrm{LPP}}}(\tilde{g}_i) = X_i^{m_i}$ for $i=1,\ldots ,n$. Since $\tilde{G}\subseteq \tilde{I}$ there are $r_{ij} \in \mathbf {P}$, such that

$$\begin{aligned} \tilde{g}_i = \sum \limits _{j=1}^n \tilde{f}_j r_{ij} \end{aligned}$$

(22)

for all $i = 1,\ldots ,n$. Let us now fix some $i \in \{1,\ldots ,n\}$.

Next note that $\tilde{f}_i$ only contains quadratic power products and that $X_i^2 \in {{\mathrm{supp}}}(\tilde{f}_i)$ since $A_{ii} > 0$ by positive definiteness of $A$. Therefore, $\deg (\tilde{f}_i) = 2$. Now split each $r_{ij}$ according to

$$\begin{aligned} s_{ij}&:= \sum \limits _{\begin{array}{c} t \in {{\mathrm{supp}}}(r_{ij}), \\ \deg (t) \le m_i-2 \end{array}} C(r_{ij},t) t&t_{ij}&:= \sum \limits _{\begin{array}{c} t \in {{\mathrm{supp}}}(r_{ij}), \\ \deg (t) > m_i-2 \end{array}} C(r_{ij},t) t. \end{aligned}$$

We claim that $\sum _{j=1}^n t_{ij} \tilde{f}_j = 0$. Assume the opposite and observe that for each $t \in \mathbf {T}$ with $\deg (t) > m_i-2$ and each $j=1,\ldots ,n$ it follows that $C(t\tilde{f}_j, s) = 0$ for each $s \in \mathbf {T}$ with $\deg (s) \le m_i$. Consequently, $C(\sum _{j=1}^n t_{ij} \tilde{f}_j, s) = 0$ for each ${s \in \mathbf {T}}$ with $\deg (s) \le m_i$ implying that $\deg (\sum _{j=1}^n t_{ij} \tilde{f}_j) > m_i$ since $\sum _{j=1}^n t_{ij} \tilde{f}_j \ne 0$. But we also have that

$$\begin{aligned} m_i = \deg (\tilde{g}_i) = \deg \left( \sum _j s_{ij} \tilde{f}_j + \sum _j t_{ij} \tilde{f}_j \right) , \end{aligned}$$

and $\deg \Bigl (\sum _j s_{ij} \tilde{f}_j \Bigr ) \le m_i$, from which the contradiction $\deg (\sum _{j=1}^n t_{ij} \tilde{f}_j) \le m_i$ follows. Hence, $\sum _{j=1}^n t_{ij} \tilde{f}_j = 0$. In particular, $\tilde{g}_i = \sum _{j=1}^n s_{ij} \tilde{f}_j$, so, by possibly replacing $r_{ij}$ with $s_{ij}$, we can achieve in (22) that $\deg (r_{ij}) \le m_i -2$ for each $j=1,\ldots ,n$.

We define the polynomials $h_i := \sum _j r_{ij} f_j$ and consider them elements of the polynomial ring $\tilde{\mathbf {P}} := \mathbf {C}[a_1,\ldots ,a_n,X_1,\ldots ,X_n]$ in $2n$ variables, and extend the order $\prec $ to $\tilde{\mathbf {P}}$, i.e., we consider $a_1\prec \cdots \prec a_n \prec X_1\prec \cdots \prec X_n$. Additionally, we denote by $\tilde{\mathbf {T}}$ the set of power products of $\tilde{\mathbf {P}}$. It follows that

$$\begin{aligned} h_i = \sum \limits _j r_{ij} f_j&= \sum \limits _j r_{ij} (\tilde{f}_j + b_j X_j + c_j X_j a_j + d_j a_j) \\&= \sum \limits _j r_{ij} \tilde{f}_j + \sum \limits _j r_{ij} (b_j X_j + c_j X_j a_j + d_j a_j) \\&= \tilde{g}_i + \sum \limits _j r_{ij} (b_j X_j + c_j X_j a_j + d_j a_j). \end{aligned}$$

Since we assumed that the $b_j$, $c_j$, and $d_j$ are constants, $\deg (r_{ij}) \le m_i - 2$ implies that

$$\begin{aligned} \deg \left( \sum _j r_{ij} (b_j X_j + d_j a_j ) \right) \le m_i - 1, \qquad \deg \left( \sum _j r_{ij} c_j X_j a_j \right) \le m_i \end{aligned}$$

and since $a_j \prec X_i$ for all $j$, we have

$$\begin{aligned} {{\mathrm{LPP}}}(h_i) = {{\mathrm{LPP}}}(\tilde{g}_i) = X_i^{m_i}. \end{aligned}$$

Now we define the ideal $J \vartriangleleft \tilde{\mathbf {P}}$ generated by $f_1,\ldots ,f_n$ and the polynomials $k_j = a_j^r - X_j^{q}$, $j = 1,\ldots ,n$. Clearly, $h_1,\ldots ,h_n \in J$. Also, ${{\mathrm{LPP}}}(h_i) = X_i^{m_i}$ and ${{\mathrm{LPP}}}(k_i) = a_i^r$ for $i=1,\ldots ,n$, the latter since $r > q$.

In order to prove that $V(J)$ is finite we consider the reduced Gröbner basis $G$ for $J$ with respect to $\prec $. By the above, we have $X_i^{m_i},a_i^r \in \{{{\mathrm{LPP}}}(f)\,|\,\,f \in J \}$, and by Theorem 8.1 $\{{{{\mathrm{LPP}}}(f)}\ : \ {f \in J}\} = \{{t g}\ : \ {g \in G, \ t\in \tilde{\mathbf {T}}}\}$ . Hence, for each $i=1,\ldots ,n$, $X_i^{m_i} = t_i {{\mathrm{LPP}}}(g_i)$ and $a_i^r = t_{i-n} {{\mathrm{LPP}}}(g_{i-n})$ for some $g_1,\ldots ,g_{2n}\in G$, and $t_1,\ldots ,t_{2n} \in \tilde{\mathbf {T}}$. This is only possible if there exist $\hat{m}_1,\ldots ,\hat{m}_{2n}$ such that ${{\mathrm{LPP}}}(g_i) = X_i^{\hat{m}_i}$ and ${{\mathrm{LPP}}}(g_{i+n}) = a_i^{\hat{m}_{i+n}}$ for $i=1,\ldots ,n$. This implies by Theorem 8.2 that $V(J) = V(\{{ f_1,\ldots ,f_n,k_1,\ldots ,k_n }\})$ is finite, which is what we wanted to show. $\square $

Rights and permissions

Reprints and permissions

About this article

Cite this article

Bredies, K., Lorenz, D.A. & Reiterer, S. Minimization of Non-smooth, Non-convex Functionals by Iterative Thresholding. J Optim Theory Appl 165, 78–112 (2015). https://doi.org/10.1007/s10957-014-0614-7

Download citation

Received: 07 June 2013
Accepted: 19 June 2014
Published: 11 July 2014
Issue Date: April 2015
DOI: https://doi.org/10.1007/s10957-014-0614-7

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

On convergence of iterative thresholding algorithms to approximate sparse solution for composite nonconvex optimization

A conjugate gradient like method for p-norm minimization in functional spaces

Regularized gradient-projection methods for finding the minimum-norm solution of the constrained convex minimization problem

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Additional information

Appendix

1.1 8.1 Proof of Lemma 3.2

1.2 8.2 Tools from Algebraic Geometry

Definition 8.1

Definition 8.2

Definition 8.3

Definition 8.4

Theorem 8.1

Theorem 8.2

Lemma 8.1

Proof

Lemma 8.2

Proof

Rights and permissions

About this article

Cite this article

Keywords

Mathematics Subject Classification

Subscribe and save

Buy Now

Navigation

Minimization of Non-smooth, Non-convex Functionals by Iterative Thresholding

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

On convergence of iterative thresholding algorithms to approximate sparse solution for composite nonconvex optimization

A conjugate gradient like method for p-norm minimization in functional spaces

Regularized gradient-projection methods for finding the minimum-norm solution of the constrained convex minimization problem

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Additional information

Appendix

Appendix

1.1 8.1 Proof of Lemma 3.2

1.2 8.2 Tools from Algebraic Geometry

Definition 8.1

Definition 8.2

Definition 8.3

Definition 8.4

Theorem 8.1

Theorem 8.2

Lemma 8.1

Proof

Lemma 8.2

Proof

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Subscribe and save

Buy Now

Search

Navigation