Split Bregman algorithms for multiple measurement vector problem

Jian Zou^1,2,
Yuli Fu²,
Qiheng Zhang² &
…
Haifeng Li²

496 Accesses
10 Citations
Explore all metrics

Abstract

The standard sparse representation aims to reconstruct sparse signal from single measurement vector which is known as SMV model. In some applications, the SMV model extend to the multiple measurement vector (MMV) model, in which the signal consists of a set of jointly sparse vectors. In this paper, efficient algorithms based on split Bregman iteration are proposed to solve the MMV problems with both constrained form and unconstrained form. The convergence of the proposed algorithms is also discussed. Moreover, the proposed algorithms are used in magnetic resonance imaging reconstruction. Numerical results show the effectiveness of the proposed algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fusion of sparse reconstruction algorithms for multiple measurement vectors

Article 07 November 2016

A Simple Recovery Framework for Signals with Time-Varying Sparse Support

Exact recovery of sparse multiple measurement vectors by $l_{2,p}$-minimization

Article Open access 10 January 2018

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

References

Bazerque, J. A., & Giannakis, G. B. (2010). Distributed spectrum sensing for cognitive radio networks by exploiting sparsity. IEEE Transactions on Signal Processing, 58(3), 1847–1862.
Article MathSciNet Google Scholar
Berg, E., & Friedlander, M. (2010). Theoretical and empirical results for recovery from multiple measurements. IEEE Transactions on Information Theory, 56(5), 2516–2527.
Article Google Scholar
Berg, E., Schmidt, M., Friedlander ,M. P., & Murphy, K. (2008). Group sparsity via linear-time projection. Technical Report TR-2008-09, Department of Computer Science, University of British Columbia.
Bilen, C., Wang, Y., & Selesnick, I. W. (2012). High-speed compressed sensing reconstruction in dynamic parallel MRI using augmented lagrangian and parallel processing. IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 2(3), 370–379.
Article Google Scholar
Cai, J. F., Osher, S., & Shen, Z. (2009). Split bregman methods and frame based image restoration. Multiscale Modeling & Simulation, 8(2), 337–369.
Article MathSciNet Google Scholar
Chen, J., & Huo, X. (2006). Theoretical results on sparse representations of multiple-measurement vectors. IEEE Transactions on Signal Processing, 54(12), 4634–4643.
Article Google Scholar
Cotter, S. F., Rao, B. D., Engan, K., & Kreutz-Delgado, K. (2005). Sparse solutions to linear inverse problems with multiple measurement vectors. IEEE Transactions on Signal Processing, 53(7), 2477–2488.
Article MathSciNet Google Scholar
Davies, M. E., & Eldar, Y. C. (2012). Rank awareness in joint sparse recovery. IEEE Transactions on Information Theory, 58(2), 1135–1146.
Article MathSciNet Google Scholar
Deng, W., Yin, W., & Zhang, Y. (2011). Group sparse optimization by alternating direction method. TR11-06, Department of Computational and Applied Mathematics, Rice University.
Duarte, M. F., & Eldar, Y. C. (2011). Structured compressed sensing: From theory to applications. IEEE Transactions on Signal Processing, 59(9), 4053–4085.
Article MathSciNet Google Scholar
Eldar, Y. C., Kuppinger, P., & Bolcskei, H. (2010). Block-sparse signals: Uncertainty relations and efficient recovery. IEEE Transactions on Signal Processing, 58(6), 3042–3054.
Article MathSciNet Google Scholar
Goldstein, T., & Osher, S. (2009). The split bregman method for l1-regularized problems. SIAM Journal on Imaging Sciences, 2(2), 323–343.
Article MATH MathSciNet Google Scholar
He, Z., Cichocki, A., Cichocki, R., & Cao, J. (2008). CG-M-FOCUSS and its application to distributed compressed sensing. In: Advances in Neural Networks-ISNN 2008, pp. 237–245.
Huang, J., & Zhang, T. (2010). The benefit of group sparsity. The Annals of Statistics, 38(4), 1978–2004.
Article MATH MathSciNet Google Scholar
Jiang, L., & Yin, H. (2012). Bregman iteration algorithm for sparse nonnegative matrix factorizations via alternating l1-norm minimization. Multidimensional Systems and Signal Processing, 23(3), 315–328.
Article MathSciNet Google Scholar
Lee, D. H., Hong, C. P., & Lee, M. W. (2013). Sparse magnetic resonance imaging reconstruction using the bregman iteration. Journal of the Korean Physical Society, 62(2), 328–332.
Article Google Scholar
Lee, K., Bresler, Y., & Junge, M. (2012). Subspace methods for joint sparse recovery. IEEE Transactions on Information Theory, 58(6), 3613–3641.
Article MathSciNet Google Scholar
Liu, B., King, K., Steckner, M., Xie, J., Sheng, J., & Ying, L. (2009a). Regularized sensitivity encoding (sense) reconstruction using bregman iterations. Magnetic Resonance in Medicine, 61(1), 145–152.
Article Google Scholar
Liu, J., Ji, S., & Ye, J. (2009b). SLEP: Sparse Learning with Efficient Projections. Arizona State University.
Ma, S., Goldfarb, D., & Chen, L. (2011). Fixed point and bregman iterative methods for matrix rank minimization. Mathematical Programming, 128(1–2), 321–353.
Article MATH MathSciNet Google Scholar
Majumdar, A., & Ward, R. (2012) Face recognition from video: An MMV recovery approach. In: 2012 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp. 2221–2224.
Majumdar, A., & Ward, R. (2013). Rank awareness in group-sparse recovery of multi-echo mr images. Sensors, 13(3), 3902–3921.
Article Google Scholar
Majumdar, A., & Ward, R. K. (2011). Joint reconstruction of multiecho mr images using correlated sparsity. Magnetic Resonance Imaging, 29(7), 899–906.
Article Google Scholar
Mishali, M., & Eldar, Y. C. (2008). Reduce and boost: Recovering arbitrary sets of jointly sparse vectors. IEEE Transactions on Signal Processing, 56(10), 4692–4702.
Article MathSciNet Google Scholar
Qin, Z., Scheinberg, K., & Goldfarb, D. (2010). Efficient block-coordinate descent algorithms for the group lasso. Technical Report 2806, Optimization Online.
Smith, D. S., Gore, J. C., Yankeelov, T. E., & Welch, E. B. (2012). Real-time compressive sensing MRI reconstruction using GPU computing and split bregman methods. International Journal of Biomedical Imaging, 2012, 1–6.
Stone, S. S., Haldar, J. P., Tsao, S. C., Sutton, B. P., Liang, Z. P., et al. (2008). Accelerating advanced MRI reconstructions on GPUs. Journal of Parallel and Distributed Computing, 68(10), 1307–1318.
Article Google Scholar
Tropp, J. A. (2006). Algorithms for simultaneous sparse approximation. Part ii: Convex relaxation. Signal Processing, 86(3), 589–602.
Article MATH MathSciNet Google Scholar
Tropp, J. A., Gilbert, A. C., & Strauss, M. J. (2006). Algorithms for simultaneous sparse approximation. Part i: Greedy pursuit. Signal Processing, 86(3), 572–588.
Article MATH Google Scholar
Tseng, P., & Yun, S. (2009). A coordinate gradient descent method for nonsmooth separable minimization. Mathematical Programming, 117(1–2), 387–423.
Article MATH MathSciNet Google Scholar
Xu, J., Feng, X., & Hao, Y. (2012). A coupled variational model for image denoising using a duality strategy and split bregman. Multidimensional Systems and Signal Processing, pp. 1–12. doi:10.1007/s11045-012-0190-7.
Yin, W., Osher, S., Goldfarb, D., & Darbon, J. (2008). Bregman iterative algorithms for $\ell 1$-minimization with applications to compressed sensing. SIAM Journal on Imaging Sciences, 1(1), 143–168.
Article MATH MathSciNet Google Scholar
Yuan, M., & Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(1), 49–67.
Article MATH MathSciNet Google Scholar
Zou, J., Fu, Y., & Xie, S. (2012). A block fixed point continuation algorithm for block-sparse reconstruction. IEEE Signal Processing Letters, 19(6), 364–367.
Article Google Scholar

Download references

Acknowledgments

This work was supported by International cooperation project of Guangdong Natural Science Foundation under Grant No. 2009B050700020, NSFC—Guangdong Union Project under Grant No. U0835003, NSFC under Grant No. 60903170, 61004054, 61104053 and 61103122.

Author information

Authors and Affiliations

School of Information and Mathematics, Yangtze University, Jingzhou, 434020, Hubei, China
Jian Zou
School of Electronic and Information Engineering, South China University of Technology, Guangzhou, 510641, Guangdong, China
Jian Zou, Yuli Fu, Qiheng Zhang & Haifeng Li

Authors

Jian Zou
View author publications
You can also search for this author in PubMed Google Scholar
Yuli Fu
View author publications
You can also search for this author in PubMed Google Scholar
Qiheng Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Haifeng Li
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yuli Fu.

Appendices

Appendix 1: Proof of theorem 1

proof

The main idea of this proof is from the one in Cai et al. (2009). However, this proof is much different due to the matrix variable and the mixed norm.

Since subproblems (17) and (18) are convex, the first order optimality condition of Algorithm 1 gives as follows:

$$\begin{aligned} \left\{ \begin{array}{ll} \mathbf 0 = \lambda \mathbf A ^T (\mathbf A \mathbf X ^{k+1}-\mathbf Y ) + \mu (\mathbf X ^{k+1}-\mathbf Z ^k+\mathbf B ^k), \\ \mathbf 0 = \mathbf P ^{k+1}+ \mu (\mathbf Z ^{k+1}-\mathbf X ^k-\mathbf B ^k), \\ \mathbf B ^{k+1} = \mathbf B ^k + (\mathbf X ^{k+1} - \mathbf Z ^{k+1}),\\ \end{array} \right. \end{aligned}$$

(34)

where $\mathbf P ^{k+1} \in \partial \Vert \mathbf Z \Vert _{2,1} \mid _\mathbf{Z = \mathbf Z ^{k+1} }$, for ease of notations, we just denote $\mathbf P ^{k+1} \in \partial \Vert \mathbf Z ^{k+1} \Vert _{2,1}$ in the remainder of this paper. The subgradient of $\Vert \cdot \Vert _{2,1}$ can be seen in Appendix 3.

Since there exists at least one solution $\mathbf X ^*$ of (3) by the assumption, by the first order optimality condition, $\mathbf X ^*$ must satisfy

$$\begin{aligned} \mathbf P ^* + \mathbf A ^T(\mathbf A \mathbf X ^{*}-\mathbf Y ) = 0, \end{aligned}$$

(35)

where $\mathbf P ^{*} \in \partial \Vert \mathbf Z ^{*} \Vert _{2,1}$ and $\mathbf Z ^{*} = \mathbf X ^{*}$.

Let $\mathbf B ^* = \mathbf P ^*/ \mu $. Then (35) can be formulated as

$$\begin{aligned} \left\{ \begin{array}{ll} \mathbf 0 = \lambda \mathbf A ^T (\mathbf A \mathbf X ^{*}-\mathbf Y ) + \mu (\mathbf X ^{*}-\mathbf Z ^*+\mathbf B ^*), \\ \mathbf 0 = \mathbf P ^{*}+ \mu (\mathbf Z ^{*}-\mathbf X ^*-\mathbf B ^*), \\ \mathbf B ^{*} = \mathbf B ^* + (\mathbf X ^{*} - \mathbf Z ^{*}).\\ \end{array} \right. \end{aligned}$$

(36)

Comparing (36) with (34), $(\mathbf X ^{*}, \mathbf Z ^{*}, \mathbf B ^{*})$ is a fixed point of Algorithm 1. Similar like the paper by Goldstein and Osher (2009), if the unconstrained spilt Bregman iteration converges, it converges to a solution of (3).

Denote the errors by $\mathbf X _e^{k} = \mathbf X ^{k}-\mathbf X ^{*}$, $\mathbf Z _e^{k} = \mathbf Z ^{k}-\mathbf Z ^{*}$ and $\mathbf B _e^{k} = \mathbf B ^{k}-\mathbf B ^{*}$. Subtracting the first equation of (34) by the first equation of (36), we obtain

$$\begin{aligned} \mathbf 0 = \lambda \mathbf A ^T \mathbf A \mathbf X _e^{k+1} + \mu (\mathbf X _e^{k+1}-\mathbf Z _e^k+\mathbf B _e^k). \end{aligned}$$

(37)

Taking the inner product of the left- and right-hand sides with $\mathbf X _e^{k+1}$, we have

$$\begin{aligned} \mathbf 0 = \lambda \Vert \mathbf A \mathbf X _e^{k+1} \Vert _F^2 + \mu \Vert \mathbf X _e^{k+1} \Vert _F^2 - \mu \langle \mathbf Z _e^k, \mathbf X _e^{k+1} \rangle + \mu \langle \mathbf B _e^k, \mathbf X _e^{k+1} \rangle . \end{aligned}$$

(38)

Similarly, subtracting the second equation of (34) by the second equation of (36) and then taking the inner product of the left- and right- hand sides with $\mathbf Z _e^{k+1}$, we have

$$\begin{aligned} \mathbf 0 = \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z ^{k+1}-\mathbf Z ^{*}\rangle + \mu \Vert \mathbf Z _e^{k+1} \Vert _F^2 - \mu \langle \mathbf Z _e^{k+1}, \mathbf X _e^{k+1} \rangle - \mu \langle \mathbf B _e^k, \mathbf Z _e^{k+1} \rangle \end{aligned}$$

(39)

where $\mathbf P ^{k+1} \in \partial \Vert \mathbf Z ^{k+1} \Vert _{2,1}$, $\mathbf P ^{*} \in \partial \Vert \mathbf Z ^{*} \Vert _{2,1}$.

Adding both sides of (38) to the corresponding sides of (39), we have

$$\begin{aligned} \mathbf 0&= \lambda \Vert \mathbf A \mathbf X _e^{k+1} \Vert _F^2 + \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z ^{k+1}-\mathbf Z ^{*}\rangle \nonumber \\&+\, \mu \left( \Vert \mathbf X _e^{k+1} \Vert _F^2 + \Vert \mathbf Z _e^{k+1} \Vert _F^2 - \langle \mathbf X _e^{k+1},\mathbf Z _e^{k+1}+\mathbf Z _e^{k} \rangle + \langle \mathbf B _e^k, \mathbf X _e^{k+1}-\mathbf Z _e^{k+1} \rangle \right) . \end{aligned}$$

(40)

Furthermore, by subtracting the third equation of (34) by the corresponding one in (36), we obtain

$$\begin{aligned} \mathbf B _e^{k+1} = \mathbf B _e^k + \mathbf X _e^{k+1} - \mathbf Z _e^{k+1}. \end{aligned}$$

(41)

Taking square norm of both sides of (41), we obtain

$$\begin{aligned} \Vert \mathbf B _e^{k+1}\Vert _F^2 = \Vert \mathbf B _e^k\Vert _F^2 + \Vert \mathbf X _e^{k+1} - \mathbf Z _e^{k+1} \Vert _F^2 + 2\langle \mathbf B _e^k, \mathbf X _e^{k+1} - \mathbf Z _e^{k+1}\rangle , \end{aligned}$$

(42)

and further

$$\begin{aligned} \langle \mathbf B _e^k, \mathbf X _e^{k+1} - \mathbf Z _e^{k+1}\rangle = \frac{1}{2}(\Vert \mathbf B _e^{k+1}\Vert _F^2 - \Vert \mathbf B _e^k\Vert _F^2 - \Vert \mathbf X _e^{k+1} - \mathbf Z _e^{k+1}\Vert _F^2). \end{aligned}$$

(43)

Substituting (43) into (40), we have

$$\begin{aligned}&\frac{\mu }{2}(\Vert \mathbf B _e^{k}\Vert _F^2 - \Vert \mathbf B _e^{k+1}\Vert _F^2) = \lambda \Vert \mathbf A \mathbf X _e^{k+1} \Vert _F^2 + \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z ^{k+1}-\mathbf Z ^{*}\rangle \nonumber \\&\quad \quad + \mu (\Vert \mathbf X _e^{k+1} \Vert _F^2 + \Vert \mathbf Z _e^{k+1} \Vert _F^2 - \langle \mathbf X _e^{k+1},\mathbf Z _e^{k+1}+\mathbf Z _e^{k} \rangle - \frac{1}{2} \Vert \mathbf X _e^{k+1} - \mathbf Z _e^{k+1}\Vert _F^2) \nonumber \\&\quad = \lambda \Vert \mathbf A \mathbf X _e^{k+1} \Vert _F^2 + \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z ^{k+1}-\mathbf Z ^{*}\rangle \!+\! \frac{\mu }{2} (\Vert \mathbf X _e^{k+1} \!-\! \mathbf Z _e^{k} \Vert _F^2 + \Vert \mathbf Z _e^{k+1} \Vert _F^2 - \Vert \mathbf Z _e^{k}\Vert _F^2).\nonumber \\ \end{aligned}$$

(44)

Summing (44) from $k=0$ to $k=K$, it yields

$$\begin{aligned} \frac{\mu }{2}(\Vert \mathbf B _e^{0}\Vert _F^2 - \Vert \mathbf B _e^{K+1}\Vert _F^2&= \lambda \sum _{k=0}^K \Vert \mathbf A \mathbf X _e^{k+1} \Vert _F^2 + \sum _{k=0}^K \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z ^{k+1}-\mathbf Z ^{*}\rangle \nonumber \\&+ \frac{\mu }{2} (\sum _{k=0}^K \Vert \mathbf X _e^{k+1} - \mathbf Z _e^{k} \Vert _F^2 + \Vert \mathbf Z _e^{K+1} \Vert _F^2 - \Vert \mathbf Z _e^{0}\Vert _F^2). \end{aligned}$$

(45)

Since $\mathbf P ^{k+1} \in \partial \Vert \mathbf X ^{k+1} \Vert _{2,1}$, $\mathbf P ^{*} \in \partial \Vert \mathbf X ^{*} \Vert _{2,1}$ and $\Vert \cdot \Vert _{2,1}$ is convex, for $\forall k$,

$$\begin{aligned} \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z ^{k+1}-\mathbf Z ^{*}\rangle&= \Vert \mathbf Z ^{k+1} \Vert _F^2 - \Vert \mathbf Z ^{*}\Vert _F^2 - \langle \mathbf P ^{*}, \mathbf Z _e^{k+1}-\mathbf Z _e^{*}\rangle \nonumber \\&+ \Vert \mathbf Z ^{*} \Vert _F^2 - \Vert \mathbf Z ^{k+1}\Vert _F^2 - \langle \mathbf P ^{k+1}, \mathbf Z ^{*}-\mathbf Z ^{k+1}\rangle \ge 0. \end{aligned}$$

(46)

Therefore, all terms involved in (45) are nonnegative. We have

$$\begin{aligned}&\frac{\mu }{2}\Vert \mathbf B _e^{0}\Vert _F^2 + \frac{\mu }{2} \Vert \mathbf Z _e^{0}\Vert _F^2 \nonumber \\&\quad \ge \frac{\lambda }{2} \sum _{k=0}^K \Vert \mathbf A \mathbf X _e^{k+1} \Vert _F^2 + \sum _{k=0}^K \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z ^{k+1}-\mathbf Z ^{*}\rangle + \frac{\mu }{2} (\sum _{k=0}^K \Vert \mathbf X _e^{k+1} - \mathbf Z _e^{k} \Vert _F^2 ).\qquad \end{aligned}$$

(47)

From (47) and $\mu >0$, $\lambda >0$, we can get

$$\begin{aligned} \sum _{k=0}^{+ \infty } \Vert \mathbf A \mathbf X _e^{k+1} \Vert _F^2 < + \infty . \end{aligned}$$

(48)

(48) imply that $\lim _{k\rightarrow + \infty } \Vert \mathbf A \mathbf X _e^{k} \Vert _F^2 =0$, and

$$\begin{aligned}&\lim _{k\rightarrow + \infty } \Vert \mathbf A \mathbf X _e^{k} \Vert _F^2 = \lim _{k\rightarrow + \infty } \Vert \mathbf A \mathbf X ^{k} - \mathbf A \mathbf X ^{*}\Vert _F^2\nonumber \\&\quad = \lim _{k\rightarrow + \infty } \langle \mathbf A ^T(\mathbf A \mathbf X ^{k}-\mathbf A \mathbf X ^{*}), \mathbf X ^{k} - \mathbf X ^{*} \rangle = 0. \end{aligned}$$

(49)

Recall the definition of the Bregman distance, for any convex function $J$, we have

$$\begin{aligned} D_J^\mathbf{P }(\mathbf X ,\mathbf Z ) + D_J^\mathbf{Q }(\mathbf Z ,\mathbf X ) = \langle \mathbf Q -\mathbf P , \mathbf X -\mathbf Z \rangle \end{aligned}$$

(50)

where $\mathbf P \in \partial J(\mathbf Z )$ and $\mathbf Q \in \partial J(\mathbf X )$.

Now let $J(\mathbf X ) = \frac{1}{2}\Vert \mathbf{AX-Y }\Vert _F^2$, $\nabla J(\mathbf X ) = \mathbf A ^T(\mathbf{AX-Y })$, from (50) we have

$$\begin{aligned} D_J^{\nabla J(\mathbf X ^{*})}(\mathbf X ^k,\mathbf X ^*) + D_J^{\nabla J(\mathbf X ^{k})}(\mathbf X ^*,\mathbf X ^k) = \langle \mathbf A ^T(\mathbf A \mathbf X ^{k}-\mathbf A \mathbf X ^{*}), \mathbf X ^{k} - \mathbf X ^{*} \rangle = 0. \end{aligned}$$

(51)

(51) together with the nonnegativity of the Bregman distance, implies that

$$\begin{aligned} \lim _{k\rightarrow + \infty } D_J^{\nabla J(\mathbf X ^{*})}(\mathbf X ^k,\mathbf X ^*) =0, \end{aligned}$$

(52)

i.e.,

$$\begin{aligned} \lim _{k\rightarrow + \infty } \frac{1}{2}\Vert \mathbf A \mathbf X ^{k} - \mathbf Y \Vert _F^2 - \frac{1}{2}\Vert \mathbf A \mathbf X ^{*} - \mathbf Y \Vert _F^2 - \langle \mathbf A ^T(\mathbf A \mathbf X ^{*}-\mathbf Y ), \mathbf X ^{k} - \mathbf X ^{*} \rangle = 0. \end{aligned}$$

(53)

Similarly, (47) also leads to

$$\begin{aligned} \sum _{k=0}^{+ \infty } \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z ^{k+1}-\mathbf Z ^{*}\rangle < + \infty , \end{aligned}$$

(54)

and hence

$$\begin{aligned} \lim _{k\rightarrow + \infty } \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z ^{k+1}-\mathbf Z ^{*}\rangle =0. \end{aligned}$$

(55)

Let $J(\mathbf Z ) = \Vert \mathbf Z \Vert _{2,1}$ and $\mathbf P \in \partial \Vert \mathbf Z \Vert _{2,1}$, we can get $\lim _{k\rightarrow + \infty } D_{\Vert \cdot \Vert _{2,1}}^\mathbf{P ^*}(\mathbf Z ^k,\mathbf Z ^*) =0$, i.e.,

$$\begin{aligned} \lim _{k\rightarrow + \infty } \Vert \mathbf Z ^k \Vert _{2,1} - \Vert \mathbf Z ^* \Vert _{2,1} - \langle \mathbf Z ^{k}-\mathbf Z ^{*}, \mathbf P ^{*} \rangle = 0. \end{aligned}$$

(56)

Moreover, since $\mu >0$, (47) also leads to $\sum _{k=0}^{+ \infty } \Vert \mathbf X _e^{k+1} - \mathbf Z _e^{k} \Vert _F^2 < + \infty $, which implies that $\lim _{k\rightarrow + \infty } \Vert \mathbf X _e^{k+1} - \mathbf Z _e^{k} \Vert _F^2 = 0$. By $\mathbf X ^{*} = \mathbf Z ^{*}$, we conclude that

$$\begin{aligned} \lim _{k\rightarrow + \infty } \Vert \mathbf X _e^{k+1} - \mathbf Z _e^{k} \Vert _F^2&= \lim _{k\rightarrow + \infty } \Vert \mathbf X ^{k+1} - \mathbf X ^{*} - \mathbf Z ^{k} + \mathbf Z ^{*} \Vert _F^2 \nonumber \\&= \lim _{k\rightarrow + \infty } \Vert \mathbf X ^{k+1} - \mathbf Z ^{k} \Vert _F^2 = 0. \end{aligned}$$

(57)

Since $\Vert \cdot \Vert _{2,1}$ is continuous, by (56) and (57), for $\mathbf X $, we have

$$\begin{aligned} \lim _{k\rightarrow + \infty } \Vert \mathbf X ^k \Vert _{2,1} - \Vert \mathbf X ^* \Vert _{2,1} - \langle \mathbf X ^{k}-\mathbf X ^{*}, \mathbf P ^{*} \rangle = 0. \end{aligned}$$

(58)

Adding (53) to (58), it yields

$$\begin{aligned} 0&= \lim _{k\rightarrow + \infty } \Vert \mathbf X ^k \Vert _{2,1} + \frac{\lambda }{2} \Vert \mathbf A \mathbf X ^k-\mathbf Y \Vert _F^2 \nonumber \\&\quad -\Vert \mathbf X ^* \Vert _{2,1} - \frac{\lambda }{2} \Vert \mathbf A \mathbf X ^*-\mathbf Y \Vert _F^2 - \langle \mathbf X ^{k}-\mathbf X ^{*}, \mathbf P ^{*}+\mathbf A ^T(\mathbf A \mathbf X ^{*}-\mathbf Y ) \rangle \nonumber \\&= \lim _{k\rightarrow + \infty } \Vert \mathbf X ^k \Vert _{2,1} + \frac{\lambda }{2} \Vert \mathbf A \mathbf X ^k-\mathbf Y \Vert _F^2 - \Vert \mathbf X ^* \Vert _{2,1} - \frac{\lambda }{2} \Vert \mathbf A \mathbf X ^*-\mathbf Y \Vert _F^2, \end{aligned}$$

(59)

where the last equality comes from (35). So (29) is proved.

Next, we prove (30) whenever (3) has a unique solution. It is proved by contradiction.

Define $\varPhi (\mathbf X ) = \Vert \mathbf X \Vert _{2,1} + \Vert \mathbf{AX-Y }\Vert _F^2$. Then $\varPhi (\mathbf X )$ is convex and lower semi-continuous. Since $\mathbf X ^*$ is the unique minimizer, we have $\varPhi (\mathbf X ^*) > \varPhi (\mathbf X )$ for $\mathbf X \ne \mathbf X ^*$.

Now, suppose that (30) does not hold. So, there exists a subsequence $\mathbf X ^{k_i}$ such that $\Vert \mathbf X ^{k_i} - \mathbf X ^{*}\Vert _F > \epsilon $ for some $\epsilon > 0$ and all $i$. Then, $\varPhi (\mathbf X ^{k_i}) > \min \{ \varPhi (\mathbf X ) : \Vert \mathbf X ^{k_i} - \mathbf X ^{*}\Vert _F = \epsilon \}$. Let $\mathbf Z $ be the intersection of $\{ \mathbf X : \Vert \mathbf X - \mathbf X ^{*}\Vert _F = \epsilon \}$ and the line from $\mathbf X ^{*}$ to $\mathbf X ^{k_i}$. Indeed, there exists a positive number $t \in (0,1)$ such that $\mathbf Z = t \mathbf X ^{*} + (1-t) \mathbf X ^{k_i}$. By the convexity of $\varPhi $ and the definition of $\mathbf X ^{*}$, we have

$$\begin{aligned} \varPhi (\mathbf X ^{k_i}) >&= t \varPhi (\mathbf X ^{*}) + (1-t) \varPhi (\mathbf X ^{k_i}) \ge \varPhi (t \mathbf X ^{*} + (1-t) \mathbf X ^{k_i}) = \varPhi (\mathbf Z ) \nonumber \\&\ge \min \{ \varPhi (\mathbf X ) : \Vert \mathbf X ^{k_i} - \mathbf X ^{*}\Vert _F = \epsilon \}. \end{aligned}$$

(60)

Denote ${\tilde{\mathbf{X }}} = \min \{ \varPhi (\mathbf X ) : \Vert \mathbf X ^{k_i} - \mathbf X ^{*}\Vert _F = \epsilon \}$. By (29), we have

$$\begin{aligned} \varPhi (\mathbf X ^*) = \lim _{k\rightarrow + \infty } \varPhi (\mathbf X ^{k_i}) \ge \varPhi ({\tilde{\mathbf{X }}}) > \varPhi (\mathbf X ^*), \end{aligned}$$

(61)

which is a contradiction. So (30) holds whenever (3) has a unique solution. $\square $

Appendix 2: Proof of Theorem 2

proof

Since subproblems (26) and (27) are convex, the first order optimality condition of Algorithm 2 is given as follows:

$$\begin{aligned} \left\{ \begin{array}{ll} \mathbf 0 = \lambda \mathbf A ^T (\mathbf A \mathbf X ^{k+1}-\mathbf Y +\mathbf C ^k) + \mu (\mathbf X ^{k+1}-\mathbf Z ^k+\mathbf B ^k), \\ \mathbf 0 = \mathbf P ^{k+1}+ \mu (\mathbf Z ^{k+1}-\mathbf X ^k-\mathbf B ^k), \\ \mathbf C ^{k+1} = \mathbf C ^k + (\mathbf Y - \mathbf A \mathbf X ^{k+1}),\\ \mathbf B ^{k+1} = \mathbf B ^k + (\mathbf X ^{k+1} - \mathbf Z ^{k+1}),\\ \end{array} \right. \end{aligned}$$

(62)

where $\mathbf P ^{k+1} \in \partial \Vert \mathbf Z ^{k+1} \Vert _{2,1}$.

Let $ L(\mathbf X ,\mathbf W ) = \Vert \mathbf X \Vert _{2,1} + \langle \mathbf W , \mathbf{AX-Y }\rangle $ be the Lagrangian of (2) and $\mathbf W $ be the Lagrangian multiplier. Since there exists at least one solution $\mathbf X ^*$ of (2) by the assumption, the KKT condition ensures the existence of a $\mathbf W ^*$ such that

$$\begin{aligned} \left\{ \begin{array}{ll} \nabla _\mathbf{X } L(\mathbf X ^*,\mathbf W ^*) = \mathbf P ^* + \mathbf A ^T\mathbf W ^* = 0, \\ \nabla _\mathbf{W } L(\mathbf X ^*,\mathbf W ^*) = \mathbf A \mathbf X ^*-\mathbf Y = 0, \\ \end{array} \right. \end{aligned}$$

(63)

where $\mathbf P ^{*} \in \partial \Vert \mathbf X ^{*} \Vert _{2,1}$.

Let $\mathbf B ^* = \mathbf P ^*/ \mu $, $\mathbf C ^* = \mathbf W ^*/ \lambda $. Then (63) can be formulated as

$$\begin{aligned} \left\{ \begin{array}{ll} \mathbf 0 = \lambda \mathbf A ^T (\mathbf A \mathbf X ^{*}-\mathbf Y +\mathbf C ^*) + \mu (\mathbf X ^{*}-\mathbf Z ^*+\mathbf B ^*), \\ \mathbf 0 = \mathbf P ^{*}+ \mu (\mathbf Z ^{*}-\mathbf X ^*-\mathbf B ^*), \\ \mathbf C ^{*} = \mathbf C ^* + (\mathbf Y - \mathbf A \mathbf X ^{*}),\\ \mathbf B ^{*} = \mathbf B ^* + (\mathbf X ^{*} - \mathbf Z ^{*}).\\ \end{array} \right. \end{aligned}$$

(64)

Comparing (64) and (62), $(\mathbf X ^{*}, \mathbf Z ^{*}, \mathbf C ^{*}, \mathbf B ^{*})$ is a fixed point of Algorithm 2.

Denote the errors by $\mathbf X _e^{k} = \mathbf X ^{k}-\mathbf X ^{*}$, $\mathbf Z _e^{k} = \mathbf Z ^{k}-\mathbf Z ^{*}$, $\mathbf B _e^{k} = \mathbf B ^{k}-\mathbf B ^{*}$ and $\mathbf C _e^{k} = \mathbf C ^{k}-\mathbf C ^{*}$.

Similar to the corresponding part of the proof of Theorem 1,

we can get

$$\begin{aligned}&\frac{\lambda }{2}(\Vert \mathbf C _e^{0}\Vert _F^2 - \Vert \mathbf C _e^{K+1}\Vert _F^2) + \frac{\mu }{2}(\Vert \mathbf B _e^{0}\Vert _F^2 - \Vert \mathbf B _e^{K+1}\Vert _F^2)\nonumber \\&\quad = \frac{\lambda }{2} \sum _{k=0}^K \Vert \mathbf A \mathbf X _e^{k+1} \Vert _F^2 + \sum _{k=0}^K \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z ^{k+1}-\mathbf Z ^{*}\rangle \nonumber \\&\quad \quad + \frac{\mu }{2} (\sum _{k=0}^K \Vert \mathbf X _e^{k+1} - \mathbf Z _e^{k} \Vert _F^2 + \Vert \mathbf Z _e^{K+1} \Vert _F^2 - \Vert \mathbf Z _e^{0}\Vert _F^2). \end{aligned}$$

(65)

Note all terms involved in (65) are nonnegative. We have

$$\begin{aligned}&\frac{\lambda }{2}\Vert \mathbf C _e^{0}\Vert _F^2 + \frac{\mu }{2}\Vert \mathbf B _e^{0}\Vert _F^2 + \frac{\mu }{2} \Vert \mathbf Z _e^{0}\Vert _F^2 \nonumber \\&\quad \ge \frac{\lambda }{2} \sum _{k=0}^K \Vert \mathbf A \mathbf X _e^{k+1} \Vert _F^2 + \sum _{k=0}^K \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z ^{k+1}-\mathbf Z ^{*}\rangle + \frac{\mu }{2} (\sum _{k=0}^K \Vert \mathbf X _e^{k+1} - \mathbf Z _e^{k} \Vert _F^2 ).\qquad \end{aligned}$$

(66)

From (66) and $\mu >0$, $\lambda >0$, we can get

$$\begin{aligned} \sum _{k=0}^{+ \infty } \Vert \mathbf A \mathbf X _e^{k+1} \Vert _F^2 < + \infty . \end{aligned}$$

(67)

(67) and $\mathbf A \mathbf X _e^{k+1} = \mathbf A \mathbf X ^{k+1} - \mathbf Y $ leads to the first equation in (31) holding, i.e.,

$$\begin{aligned} \lim _{k\rightarrow + \infty } \Vert \mathbf A \mathbf X ^{k+1} - \mathbf Y \Vert _F^2 =0. \end{aligned}$$

(68)

Similarly, (67) also leads to

$$\begin{aligned} \sum _{k=0}^{+ \infty } \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z _e^{k+1}-\mathbf Z _e^{*}\rangle < + \infty , \end{aligned}$$

(69)

and hence

$$\begin{aligned} \lim _{k\rightarrow + \infty } \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z ^{k+1}-\mathbf Z ^{*}\rangle =0. \end{aligned}$$

(70)

Recall equation (50) and let $J(\mathbf Z )= \Vert \mathbf Z \Vert _{2,1}$, $\mathbf P \in \partial \Vert \mathbf Z \Vert _{2,1}$, we have

$$\begin{aligned} \langle \mathbf P ^{k+1}-\mathbf P ^{*}, \mathbf Z ^{k+1}-\mathbf Z ^{*}\rangle = D_{\Vert \cdot \Vert _{2,1}}^\mathbf{P ^k}(\mathbf Z ^k,\mathbf Z ^*) + D_{\Vert \cdot \Vert _{2,1}}^\mathbf{P ^*}(\mathbf Z ^*,\mathbf Z ^k). \end{aligned}$$

(71)

(71) and (70) imply that $\lim _{k\rightarrow + \infty } D_{\Vert \cdot \Vert _{2,1}}^\mathbf{P ^*}(\mathbf Z ^k,\mathbf Z ^*) + D_{\Vert \cdot \Vert _{2,1}}^\mathbf{P ^k}(\mathbf Z ^*,\mathbf Z ^k) =0$, and together with the nonnegativity of the Bregman distance, implies that

$$\begin{aligned} \lim _{k\rightarrow + \infty } D_{\Vert \cdot \Vert _{2,1}}^\mathbf{P ^*}(\mathbf Z ^k,\mathbf Z ^*) =0, \end{aligned}$$

(72)

i.e.,

$$\begin{aligned} \lim _{k\rightarrow + \infty } \Vert \mathbf Z ^k \Vert _{2,1} - \Vert \mathbf Z ^* \Vert _{2,1} - \langle \mathbf Z ^{k}-\mathbf Z ^{*}, \mathbf P ^{*} \rangle = 0. \end{aligned}$$

(73)

Moreover, since $\mu >0$, (67) also leads to $\sum _{k=0}^{+ \infty } \Vert \mathbf X _e^{k+1} - \mathbf Z _e^{k} \Vert _F^2 < + \infty $, which implies that $\lim _{k\rightarrow + \infty } \Vert \mathbf X _e^{k+1} - \mathbf Z _e^{k} \Vert _F^2 = 0$. By $\mathbf X ^{*} = \mathbf Z ^{*}$, we conclude that

$$\begin{aligned} \lim _{k\rightarrow + \infty } \Vert \mathbf X _e^{k+1} - \mathbf Z _e^{k} \Vert _F^2 = \lim _{k\rightarrow + \infty } \Vert \mathbf X ^{k+1} - \mathbf X ^{*} - \mathbf Z ^{k} + \mathbf Z ^{*} \Vert _F^2 \!=\! \lim _{k\rightarrow \!+\! \infty } \Vert \mathbf X ^{k+1} - \mathbf Z ^{k} \Vert _F^2 = 0.\nonumber \\ \end{aligned}$$

(74)

Since $\Vert \cdot \Vert _{2,1}$ is continuous, by (73) and (74), for $\mathbf X $, we have

$$\begin{aligned} \lim _{k\rightarrow + \infty } \Vert \mathbf X ^k \Vert _{2,1} - \Vert \mathbf X ^* \Vert _{2,1} - \langle \mathbf X ^{k}-\mathbf X ^{*}, \mathbf P ^{*} \rangle = 0. \end{aligned}$$

(75)

Furthermore, since $\mathbf A \mathbf X ^k \rightarrow \mathbf Y $ and $\mathbf A \mathbf X ^* = \mathbf Y $, we have

$$\begin{aligned} \lim _{k\rightarrow + \infty } \langle \mathbf A \mathbf X ^{k} - \mathbf A \mathbf X ^{*}, \mathbf W ^{*} \rangle = 0. \end{aligned}$$

(76)

Summing (75) and (76), it yields

$$\begin{aligned}&0 = \lim _{k\rightarrow + \infty } \Vert \mathbf X ^k \Vert _{2,1} - \Vert \mathbf X ^* \Vert _{2,1} - \langle \mathbf X ^{k} - \mathbf X ^{*}, \mathbf P ^{*} \rangle - \langle \mathbf A \mathbf X ^{k} - \mathbf A \mathbf X ^{*}, \mathbf W ^{*} \rangle \nonumber \\&\quad = \lim _{k\rightarrow + \infty } \Vert \mathbf X ^k \Vert _{2,1} - \Vert \mathbf X ^* \Vert _{2,1} - \langle \mathbf X ^{k} - \mathbf X ^{*}, \mathbf P ^{*}+\mathbf A ^T\mathbf W ^{*} \rangle \nonumber \\&\quad = \lim _{k\rightarrow + \infty } \Vert \mathbf X ^k \Vert _{2,1} - \Vert \mathbf X ^* \Vert _{2,1}, \end{aligned}$$

(77)

where the last equality comes from the first equation of (63). This is the second equation of (31).

Next, we prove (32) whenever (2) has a unique solution. It is proved by contradiction.

Let $\mathbf W ^*$ be the vector in (63). Define

$$\begin{aligned} \varPhi (\mathbf X ) = L(\mathbf X ,\mathbf W ^*) + \Vert \mathbf{AX-Y }\Vert _F^2 = \Vert \mathbf X \Vert _{2,1} + \langle \mathbf W ^*, \mathbf{AX-Y }\rangle + \Vert \mathbf{AX-Y }\Vert _F^2. \end{aligned}$$

(78)

(78) is convex and continuous. Since $(\mathbf X ^*,\mathbf W ^*)$ is a saddle point of $L(\mathbf X ,\mathbf W )$ and $\mathbf A \mathbf X ^*=\mathbf Y $, we have $\varPhi (\mathbf X ) \ge \varPhi (\mathbf X ^*)$. Especially, we have $\varPhi (\mathbf X ) > \varPhi (\mathbf X ^*)$ for $\mathbf X \ne \mathbf X ^*$. The above conclusion can be seen as follows: When $\mathbf X \ne \mathbf X ^*$, if $\mathbf A \mathbf X =\mathbf Y $, then $\varPhi (\mathbf X ) > \varPhi (\mathbf X ^*)$ holds immediately due to the uniqueness of the solution of (2); otherwise, $\Vert \mathbf{AX-Y }\Vert _F^2 > 0 = \Vert \mathbf A \mathbf X ^*-\mathbf Y \Vert _F^2$ and therefore

$$\begin{aligned}&\varPhi (\mathbf X ) = L(\mathbf X ,\mathbf W ^*) + \Vert \mathbf{AX-Y }\Vert _F^2 \ge L(\mathbf X ^*,\mathbf W ^*) + \Vert \mathbf{AX-Y }\Vert _F^2 \nonumber \\&> L(\mathbf X ^*,\mathbf W ^*) + \Vert \mathbf A \mathbf X ^*-\mathbf Y \Vert _F^2 = \varPhi (\mathbf X ^*). \end{aligned}$$

(79)

The remainder of the proof can follows the same line as the one in Theorem 1. $\square $

Appendix 3: Subgradient of $\Vert \mathbf X \Vert _{2,1} $

For $\Vert \mathbf X \Vert _{2,1} = \sum _{j=1}^n \Vert \mathbf X ^j \Vert _{2}$, the subgradient of $\Vert \mathbf X \Vert _{2,1} $ is obtained as

$$\begin{aligned} \frac{ \partial \Vert \mathbf X \Vert _{2,1} }{\partial \mathbf X } =\begin{bmatrix} \frac{ \partial \Vert \mathbf X ^1 \Vert _{2}}{\partial \mathbf X _{11}}&\cdots&\frac{ \partial \Vert \mathbf X ^1 \Vert _{2}}{\partial \mathbf X _{1L}} \\ \vdots&\ddots&\vdots \\ \frac{ \partial \Vert \mathbf X ^n \Vert _{2}}{\partial \mathbf X _{n1}}&\cdots&\frac{ \partial \Vert \mathbf X ^n \Vert _{2}}{\partial \mathbf X _{nL}} \end{bmatrix} =\begin{bmatrix} \frac{ \partial \Vert \mathbf X ^1 \Vert _{2}}{\partial \mathbf X ^{1}} \\ \vdots \\ \frac{ \partial \Vert \mathbf X ^n \Vert _{2}}{\partial \mathbf X ^{n}} \end{bmatrix} , \end{aligned}$$

(80)

where

$$\begin{aligned} \frac{ \partial \Vert \mathbf X ^j \Vert _{2}}{\partial \mathbf X ^{j}} =\left\{ \begin{array}{l@{\quad }l} \mathbf X ^j / \Vert \mathbf X ^j\Vert _2 &{} \text {if} \; \mathbf X ^j \ne \mathbf 0 , \\ \{ \mathbf v \big | \Vert \mathbf v \Vert _2 \le 1, \mathbf v \in \mathbb R ^{L}\} &{} \text {if} \; \mathbf X ^j = \mathbf 0 . \end{array} \right. \end{aligned}$$

(81)

Rights and permissions

Reprints and permissions

About this article

Cite this article

Zou, J., Fu, Y., Zhang, Q. et al. Split Bregman algorithms for multiple measurement vector problem. Multidim Syst Sign Process 26, 207–224 (2015). https://doi.org/10.1007/s11045-013-0251-6

Download citation

Received: 12 May 2013
Revised: 05 August 2013
Accepted: 28 August 2013
Published: 14 September 2013
Issue Date: January 2015
DOI: https://doi.org/10.1007/s11045-013-0251-6

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Fusion of sparse reconstruction algorithms for multiple measurement vectors

A Simple Recovery Framework for Signals with Time-Varying Sparse Support

Exact recovery of sparse multiple measurement vectors by \(l_{2,p}\)-minimization

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Appendices

Appendix 1: Proof of theorem 1

proof

Appendix 2: Proof of Theorem 2

proof

Appendix 3: Subgradient of \(\Vert \mathbf X \Vert _{2,1} \)

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Split Bregman algorithms for multiple measurement vector problem

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Fusion of sparse reconstruction algorithms for multiple measurement vectors

A Simple Recovery Framework for Signals with Time-Varying Sparse Support

Exact recovery of sparse multiple measurement vectors by \(l_{2,p}\)-minimization

Explore related subjects

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Appendices

Appendix 1: Proof of theorem 1

proof

Appendix 2: Proof of Theorem 2

proof

Appendix 3: Subgradient of \(\Vert \mathbf X \Vert _{2,1} \)

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation