Single-Shot Phase Retrieval Via Gradient-Sparse Non-Convex Regularization Integrating Physical Constraints


Abstract

Measurements of light typically capture amplitude information, often overlooking crucial phase details. This oversight underscores the importance of phase retrieval (PR), essential in biomedical imaging, X-ray crystallography, and microscopy, for reconstructing complex signals from intensity-only measurements. Traditional methods frequently fall short, especially in noisy conditions or when restricted to single-shot measurements. To address these challenges, we introduce a novel model that combines non-convex regularization with physical constraints. The model adopts the smoothly clipped absolute deviation (SCAD) function as a sparsity regularization term for gradients, incorporating fundamental constraints on support and absorption to establish an inverse model. Using the alternating direction method of multipliers (ADMM), we break down the problem into manageable sub-problems, implementing SCAD shrinkage in the complex domain and applying Wirtinger gradient projection methods. A thorough convergence analysis validates the stability and robustness of the algorithm. Extensive simulations confirm significant improvements in reconstruction quality compared to existing methods, with evaluations demonstrating superior performance across various noise levels and parameter settings.




Data Availability

All data generated or analysed during this study are included in this published article. The MATLAB code will be made openly available after publication.

Notes

  1. The \((\lambda , \eta )\) pairs are sampled from equidistant nodes in \(\{0.05, 0.075, \ldots , 0.175, 0.2\} \times \{0.05, 0.075, \ldots , 0.225, 0.25\}\).

References

  1. Abbey, B., Nugent, K.A., Williams, G.J., Clark, J.N., Peele, A.G., Pfeifer, M.A., De Jonge, M., McNulty, I.: Keyhole coherent diffractive imaging. Nat. Phys. 4(5), 394–398 (2008). https://doi.org/10.1038/nphys896


  2. Bach, F., Jenatton, R., Mairal, J., Obozinski, G.: Optimization with sparsity-inducing penalties. Found. Trends Mach. Learn. 4(1), 1–106 (2012). https://doi.org/10.1561/2200000015


  3. Bai, C., Zhou, M., Min, J., Dang, S., Yu, X., Zhang, P., Peng, T., Yao, B.: Robust contrast-transfer-function phase retrieval via flexible deep learning networks. Opt. Lett. 44(21), 5141–5144 (2019). https://doi.org/10.1364/OL.44.005141


  4. Barton, J.J.: Removing multiple scattering and twin images from holographic images. Phys. Rev. Lett. 67(22), 3106–3109 (1991). https://doi.org/10.1103/PhysRevLett.67.3106


  5. Bauschke, H.H., Combettes, P.L., Luke, D.R.: Phase retrieval, error reduction algorithm, and Fienup variants: a view from convex optimization. J. Opt. Soc. Am. A 19(7), 1334–45 (2002). https://doi.org/10.1364/josaa.19.001334


  6. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009). https://doi.org/10.1137/080716542


  7. Bioucas-Dias, J.M., Valadao, G.: Phase unwrapping via graph cuts. IEEE Trans. Image Process. 16(3), 698–709 (2007). https://doi.org/10.1109/tip.2006.888351


  8. Bouhamidi, A., Jbilou, K.: Sylvester Tikhonov-regularization methods in image restoration. J. Comput. Appl. Math. 206(1), 86–98 (2007). https://doi.org/10.1016/j.cam.2006.05.028


  9. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011). https://doi.org/10.1561/2200000016


  10. Brandwood, D.: A complex gradient operator and its application in adaptive array theory. In: IEE Proceedings H (Microwaves, Optics and Antennas), vol. 130, pp. 11–16. IET Digital Library (1983)

  11. Candes, E.J., Li, X., Soltanolkotabi, M.: Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans. Inf. Theory 61(4), 1985–2007 (2015). https://doi.org/10.1109/TIT.2015.2399924


  12. Chang, H., Lou, Y., Ng, M.K., Zeng, T.: Phase retrieval from incomplete magnitude information via total variation regularization. SIAM J. Sci. Comput. 38(6), A3672–A3695 (2016). https://doi.org/10.1137/15M1029357


  13. Chen, Y., Candes, E.: Solving random quadratic systems of equations is nearly as easy as solving linear systems. Advances in Neural Information Processing Systems 28 (2015)

  14. Chen, Y., Candès, E.J.: Solving random quadratic systems of equations is nearly as easy as solving linear systems. Commun. Pure Appl. Math. 70(5), 822–883 (2017). https://doi.org/10.1002/cpa.21638


  15. Cho, T.S., Zitnick, C.L., Joshi, N., Kang, S.B., Szeliski, R., Freeman, W.T.: Image restoration by matching gradient distributions. IEEE Trans. Pattern Anal. Mach. Intell. 34(4), 683–694 (2012). https://doi.org/10.1109/TPAMI.2011.166


  16. Delgado, K.K.: The complex gradient operator and the CR-calculus. ECE275A-Lecture Supplement Fall 2006 (2006)

  17. Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96(456), 1348–1360 (2001). https://doi.org/10.1198/016214501753382273


  18. Fienup, J.R.: Reconstruction of an object from the modulus of its Fourier transform. Opt. Lett. 3(1), 27–9 (1978). https://doi.org/10.1364/ol.3.000027


  19. Fienup, J.R.: Phase retrieval algorithms: a comparison. Appl. Opt. 21(15), 2758–69 (1982). https://doi.org/10.1364/AO.21.002758


  20. Fienup, J.R., Crimmins, T.R., Holsztynski, W.: Reconstruction of the support of an object from the support of its autocorrelation. J. Opt. Soc. Am. 72(5), 610–624 (1982). https://doi.org/10.1364/JOSA.72.000610


  21. Fus, F., Yang, Y., Pacureanu, A., Bohic, S., Cloetens, P.: Unsupervised solution for in-line holography phase retrieval using bayesian inference. Opt. Express 26(25), 32847–32865 (2018). https://doi.org/10.1364/OE.26.032847


  22. Gabor, D.: A new microscopic principle. Nature 161(4098), 777–778 (1948). https://doi.org/10.1038/161777a0


  23. Gao, Y., Cao, L.: Iterative projection meets sparsity regularization: towards practical single-shot quantitative phase imaging with in-line holography. Light Adv. Manuf. 4(1), 37–53 (2023). https://doi.org/10.37188/lam.2023.006


  24. Gao, Y., Yang, F., Cao, L.: Pixel super-resolution phase retrieval for lensless on-chip microscopy via accelerated Wirtinger flow. Cells 11(13), 1999 (2022). https://doi.org/10.3390/cells11131999

  25. Gerchberg, R.W.: Phase determination from image and diffraction plane pictures in the electron microscope. Optik 34, 275–284 (1971)


  26. Gerchberg, R.W.: A practical algorithm for the determination of plane from image and diffraction pictures. Optik 35(2), 237–246 (1972)


  27. Ghayem, F., Sadeghi, M., Babaie-Zadeh, M., Chatterjee, S., Skoglund, M., Jutten, C.: Sparse signal recovery using iterative proximal projection. IEEE Trans. Signal Process. 66(4), 879–894 (2018). https://doi.org/10.1109/TSP.2017.2778695


  28. Gong, P., Zhang, C., Lu, Z., Huang, J., Ye, J.: A general iterative shrinkage and thresholding algorithm for non-convex regularized optimization problems. In: Proceedings of the 30th International Conference on Machine Learning, vol. 28(2), pp. 37–45. PMLR (2013)

  29. Goodman, J.W.: Introduction to Fourier optics. Roberts and Company Publishers, Greenwood (2005)


  30. Gray, R.M., Davisson, L.D.: An Introduction to Statistical Signal Processing. Cambridge University Press, Cambridge (2004)


  31. Hjørungnes, A.: Complex-Valued Matrix Derivatives: With Applications in Signal Processing and Communications. Cambridge University Press, Cambridge (2011)


  32. Jaganathan, K., Oymak, S., Hassibi, B.: Sparse phase retrieval: uniqueness guarantees and recovery algorithms. IEEE Trans. Signal Process. 65(9), 2402–2410 (2017). https://doi.org/10.1109/TSP.2017.2656844


  33. Kumar, S.: Phase retrieval with physics informed zero-shot network. Opt. Lett. 46(23), 5942–5945 (2021). https://doi.org/10.1364/OL.433625


  34. Lanza, A., Morigi, S., Sgallari, F.: Constrained TV\(_p\)-\(\ell _2\) model for image restoration. J. Sci. Comput. 68(1), 64–91 (2016). https://doi.org/10.1007/s10915-015-0129-x


  35. Latychevskaia, T., Fink, H.W.: Solution to the twin image problem in holography. Phys. Rev. Lett. 98(23), 233901 (2007). https://doi.org/10.1103/PhysRevLett.98.233901


  36. Li, R., Sudjianto, A.: Analysis of computer experiments using penalized likelihood in Gaussian Kriging models. Technometrics 47(2), 111–120 (2005). https://doi.org/10.1198/004017004000000671


  37. Li, T., Li, F., Qi, H.: A Weibull gradient prior for image restoration. J. Comput. Appl. Math. 439, 115594 (2024). https://doi.org/10.1016/j.cam.2023.115594


  38. Liu, G., Scott, P.D.: Phase retrieval and twin-image elimination for in-line Fresnel holograms. J. Opt. Soc. Am. A 4(1), 159–165 (1987). https://doi.org/10.1364/JOSAA.4.000159


  39. Lou, Y., Zeng, T., Osher, S., Xin, J.: A weighted difference of anisotropic and isotropic total variation model for image processing. SIAM J. Imaging Sci. 8(3), 1798–1823 (2015). https://doi.org/10.1137/14098435x


  40. Lu, C., Tang, J., Yan, S., Lin, Z.: Nonconvex nonsmooth low rank minimization via iteratively reweighted nuclear norm. IEEE Trans. Image Process. 25(2), 829–839 (2016). https://doi.org/10.1109/TIP.2015.2511584


  41. Marchesini, S.: Invited article: a unified evaluation of iterative projection algorithms for phase retrieval. Rev. Sci. Instrum. 78(1), 011301 (2007). https://doi.org/10.1063/1.2403783


  42. Marchesini, S., He, H., Chapman, H.N., Hau-Riege, S.P., Noy, A., Howells, M.R., Weierstall, U., Spence, J.C.: X-ray image reconstruction from a diffraction pattern alone. Phys. Rev. B 68(14), 140101 (2003). https://doi.org/10.1103/PhysRevB.68.140101


  43. Mehranian, A., Rad, H.S., Rahmim, A., Ay, M.R., Zaidi, H.: Smoothly clipped absolute deviation (SCAD) regularization for compressed sensing MRI using an augmented lagrangian scheme. Magn. Reson. Imaging 31(8), 1399–1411 (2013). https://doi.org/10.1016/j.mri.2013.05.010


  44. Metzler, C., Schniter, P., Veeraraghavan, A., Baraniuk, R.: prDeep: Robust phase retrieval with a flexible deep network. In: Proceedings of the 35th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 80, pp. 3501–3510. PMLR (2018)

  45. Metzler, C.A., Maleki, A., Baraniuk, R.G.: BM3D-PRGAMP: Compressive phase retrieval based on BM3D denoising. In: 2016 IEEE International Conference on Image Processing (ICIP), pp. 2504–2508. IEEE (2016). https://doi.org/10.1109/ICIP.2016.7532810

  46. Miao, J., Ishikawa, T., Shen, Q., Earnest, T.: Extending X-ray crystallography to allow the imaging of noncrystalline materials, cells, and single protein complexes. Annu. Rev. Phys. Chem. 59, 387–410 (2008). https://doi.org/10.1146/annurev.physchem.59.032607.093642


  47. Millane, R.P.: Phase retrieval in crystallography and optics. J. Opt. Soc. Am. A 7(3), 394–411 (1990). https://doi.org/10.1364/JOSAA.7.000394

  48. Momey, F., Denis, L., Olivier, T., Fournier, C.: From Fienup’s phase retrieval techniques to regularized inversion for in-line holography: tutorial. J. Opt. Soc. Am. A 36(12), D62–D80 (2019). https://doi.org/10.1364/JOSAA.36.000D62


  49. Nesterov, Y.E.: A method of solving a convex programming problem with convergence rate \({O}\left(k^2\right)\). In: Doklady Akademii Nauk, vol. 269, pp. 543–547. Russian Academy of Sciences (1983)

  50. Ohlsson, H., Yang, A.Y., Dong, R., Sastry, S.S.: Compressive phase retrieval from squared output measurements via semidefinite programming. IFAC Proc. Vol. 45(16), 89–94 (2012). https://doi.org/10.3182/20120711-3-BE-2027.00415


  51. Pacheco, C., McKay, G.N., Oommen, A., Durr, N.J., Vidal, R., Haeffele, B.D.: Adaptive sparse reconstruction for lensless digital holography via PSF estimation and phase retrieval. Opt. Express 30(19), 33433–33448 (2022). https://doi.org/10.1364/OE.458360


  52. Rivenson, Y., Wu, Y., Wang, H., Zhang, Y., Feizi, A., Ozcan, A.: Sparsity-based multi-height phase recovery in holographic microscopy. Sci. Rep. 6(1), 37862 (2016). https://doi.org/10.1038/srep37862


  53. Ru, Y., Li, F., Fang, F., Zhang, G.: Patch-based weighted SCAD prior for compressive sensing. Inf. Sci. 592, 137–155 (2022). https://doi.org/10.1016/j.ins.2022.01.034


  54. Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D 60(1), 259–268 (1992). https://doi.org/10.1016/0167-2789(92)90242-F


  55. Shechtman, Y., Beck, A., Eldar, Y.C.: GESPAR: efficient phase retrieval of sparse signals. IEEE Trans. Signal Process. 62(4), 928–938 (2014). https://doi.org/10.1109/TSP.2013.2297687


  56. Shi, B., Lian, Q., Huang, X., An, N.: Constrained phase retrieval: when alternating projection meets regularization. J. Opt. Soc. Am. B 35(6), 1271–1281 (2018). https://doi.org/10.1364/JOSAB.35.001271


  57. Sidky, E.Y., Chartrand, R., Boone, J.M., Pan, X.: Constrained \({\rm T}p{\rm V}\) minimization for enhanced exploitation of gradient sparsity: application to CT image reconstruction. IEEE J. Transl. Eng. He. 2, 1–18 (2014). https://doi.org/10.1109/JTEHM.2014.2300862


  58. Stefik, M.: Inferring DNA structures from segmentation data. Artif. Intell. 11(1), 85–114 (1978). https://doi.org/10.1016/0004-3702(78)90013-9


  59. Tian, Y., Fienup, J.R.: Phase retrieval with only a nonnegativity constraint. Opt. Lett. 48(1), 135–138 (2023). https://doi.org/10.1364/OL.478581


  60. Uelwer, T., Hoffmann, T., Harmeling, S.: Non-iterative phase retrieval with cascaded neural networks. In: Artificial Neural Networks and Machine Learning - ICANN 2021, pp. 295–306. Springer International Publishing (2021). https://doi.org/10.1007/978-3-030-86340-1_24

  61. Wang, Y., Yin, W., Zeng, J.: Global convergence of ADMM in nonconvex nonsmooth optimization. J. Sci. Comput. 78(1), 29–63 (2019). https://doi.org/10.1007/s10915-018-0757-z


  62. Wang, Y., Zeng, J., Peng, Z., Chang, X., Xu, Z.: Linear convergence of adaptively iterative thresholding algorithms for compressed sensing. IEEE Trans. Signal Process. 63(11), 2957–2971 (2015). https://doi.org/10.1109/TSP.2015.2412915


  63. Wang, Z., Simoncelli, E.P.: Translation insensitive image similarity in complex wavelet domain. In: Proceedings. (ICASSP ’05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005., vol. 2, pp. ii/573–ii/576. IEEE (2005). https://doi.org/10.1109/ICASSP.2005.1415469

  64. Wirtinger, W.: Zur formalen theorie der funktionen von mehr komplexen veränderlichen. Math. Ann. 97(1), 357–375 (1927). https://doi.org/10.1007/BF01447872


  65. Wu, J., Yang, F., Cao, L.: Resolution enhancement of long-range imaging with sparse apertures. Opt. Lasers Eng. 155, 107068 (2022). https://doi.org/10.1016/j.optlaseng.2022.107068


  66. Ye, Q., Wang, L.W., Lun, D.P.K.: Sisprnet: end-to-end learning for single-shot phase retrieval. Opt. Express 30(18), 31937–31958 (2022). https://doi.org/10.1364/OE.464086


  67. You, J., Jiao, Y., Lu, X., Zeng, T.: A nonconvex model with minimax concave penalty for image restoration. J. Sci. Comput. 78(2), 1063–1086 (2019). https://doi.org/10.1007/s10915-018-0801-z


  68. You, J., Liu, H., Pan, Q., Sun, A., Zhang, Z., Shi, X.: Fast response fluorescent probe with a large stokes shift for thiophenol detection in water samples and cell imaging. J. Anal. Test. 7(1), 69–78 (2023). https://doi.org/10.1007/s41664-022-00247-7


  69. Zhang, C.H.: Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 38(2), 894–942 (2010). https://doi.org/10.1214/09-AOS729


  70. Zhang, W., Cao, L., Brady, D.J., Zhang, H., Cang, J., Zhang, H., Jin, G.: Twin-image-free holography: a compressive sensing approach. Phys. Rev. Lett. 121(9), 093902 (2018). https://doi.org/10.1103/PhysRevLett.121.093902


  71. Zhang, Y., Andreas Noack, M., Vagovic, P., Fezzaa, K., Garcia-Moreno, F., Ritschel, T., Villanueva-Perez, P.: Phasegan: a deep-learning phase-retrieval approach for unpaired datasets. Opt. Express 29(13), 19593–19604 (2021). https://doi.org/10.1364/OE.423222


Download references

Funding

This research was supported by the Natural Science Foundation of Shanghai (no. 22ZR1419500), the Science and Technology Commission of Shanghai Municipality (no. 22DZ2229014), and the Fundamental Research Funds for the Central Universities.

Author information


Corresponding author

Correspondence to Fang Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Proof for SCAD Shrinkage in Sect. 4.1

Proof of Theorem 1

Since the SCAD function (9) depends only on |x|, let \(r = |x| \geqslant 0\); then

$$\begin{aligned} \varphi _{\lambda ,\gamma }(x) = \varphi _{\lambda ,\gamma }(r). \end{aligned}$$

By the reverse triangle inequality, we have

$$\begin{aligned}&\frac{1}{2}|x-v|^2+\eta \varphi _{\lambda ,\gamma }(x) \end{aligned}$$
(27)
$$\begin{aligned} = \,&\frac{1}{2}|x-v|^2+\eta \varphi _{\lambda ,\gamma }(r) \end{aligned}$$
(28)
$$\begin{aligned} \geqslant \,&\frac{1}{2}|r-|v||^2+\eta \varphi _{\lambda ,\gamma }(r) , \end{aligned}$$
(29)

where equality holds if and only if \(x = kv\) (\(k>0\)). By Lemma 1, the minimum of (29) is achieved at \(\widehat{r} = \operatorname {Shrink}_{\text {SCAD}} \left( |v|, \gamma , \lambda , \eta \right) \), thus the minimum of (27) is achieved at \(\widehat{x} = \frac{v}{|v|} \widehat{r}\), i.e.,

$$\begin{aligned} \widehat{x} = \frac{v}{|v|} \operatorname {Shrink}_{\text {SCAD}} \left( |v|, \gamma , \lambda , \eta \right) , \end{aligned}$$

which completes the proof. \(\square \)

Proof of Theorem 2

The objective function decomposes componentwise as

$$\begin{aligned} \frac{1}{2}\Vert \varvec{x}-\varvec{v}\Vert _2^2+\eta \Phi _{\lambda ,\gamma }(\varvec{x}) = \sum _{j=1}^{n} \left( \frac{1}{2} \left| \varvec{x}_{(j)}-\varvec{v}_{(j)}\right| ^2+\eta \varphi _{\lambda ,\gamma }\left( \varvec{x}_{(j)}\right) \right) , \end{aligned}$$

Therefore, optimization problem (20) reduces to n independent problems of the form (18):

$$\begin{aligned}\mathop {\text {arg min}}\limits _ {\varvec{x}_{(j)}\,\,\in \,\,\mathbb {C}}\,\,\,\,\,\,{\frac{1}{2} \left| \varvec{x}_{(j)}-\varvec{v}_{(j)}\right| ^2+\eta \varphi _{\lambda ,\gamma }\left( \varvec{x}_{(j)}\right) } , \end{aligned}$$

According to Theorem 1, the solution for each sub-problem is

$$\begin{aligned} \widehat{\varvec{x}}_{(j)}=\operatorname {Shrink}_{\text {SCAD}} \left( \varvec{v}_{(j)}, \gamma , \lambda , \eta \right) . \end{aligned}$$

\(\square \)
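To make the two theorems concrete, the following is a minimal numerical sketch of the complex SCAD shrinkage (it is not the paper's released implementation). It uses the standard SCAD penalty of Fan and Li [17], whereas the paper's definition (9) may be normalized differently, and it solves the scalar magnitude problem by a brute-force grid search rather than the closed-form rule of Lemma 1; the phase-preserving reduction \(\widehat{\varvec{x}}_{(j)} = \frac{\varvec{v}_{(j)}}{|\varvec{v}_{(j)}|} \widehat{r}_{(j)}\) is exactly the content of Theorems 1 and 2.

```python
import numpy as np

def scad_penalty(r, lam, gamma):
    """Standard SCAD penalty phi_{lam,gamma}(r) for r >= 0 (Fan & Li [17])."""
    r = np.asarray(r, dtype=float)
    mid = (2 * gamma * lam * r - r ** 2 - lam ** 2) / (2 * (gamma - 1))
    tail = lam ** 2 * (gamma + 1) / 2
    return np.where(r <= lam, lam * r, np.where(r <= gamma * lam, mid, tail))

def scad_shrink_complex(v, gamma, lam, eta, grid=2000):
    """Phase-preserving SCAD shrinkage of a complex array (Theorems 1-2).
    Each entry v_j is mapped to (v_j/|v_j|) * argmin_r 0.5*(r-|v_j|)^2 + eta*phi(r);
    the scalar magnitude problem is solved here by a grid search for illustration
    (the paper's Lemma 1 provides the closed-form rule)."""
    v = np.asarray(v, dtype=complex)
    mag = np.abs(v)
    phase = np.where(mag > 0, v / np.maximum(mag, 1e-300), 1.0 + 0.0j)
    r = np.linspace(0.0, mag.max() + 1e-12, grid)          # candidate magnitudes
    obj = 0.5 * (r[None, :] - mag.ravel()[:, None]) ** 2 \
        + eta * scad_penalty(r, lam, gamma)[None, :]
    r_hat = r[np.argmin(obj, axis=1)].reshape(mag.shape)
    return phase * r_hat
```

Small entries are thresholded towards zero, while entries with magnitude above \(\gamma \lambda \) are left essentially unchanged, reflecting the near-unbiasedness of the SCAD penalty.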

Calculation of Wirtinger Derivative in Sect. 4.2

Define the function components of the objective function in (23) as follows,

$$\begin{aligned} \begin{gathered} \mathscr {F}(\varvec{x}) = \frac{1}{2}\left\| |\varvec{A} \varvec{x}|-\varvec{y}\right\| _2^2, \\ \mathscr {I}(\varvec{x}) = \frac{\mu }{2}\left\| \nabla \varvec{x} + \varvec{u} \right\| _2^2, \end{gathered} \end{aligned}$$

so that the objective function is \(\mathscr {J}(\varvec{x}) = \mathscr {F}(\varvec{x}) + \mathscr {I}(\varvec{x})\), and its Wirtinger gradient is

$$\begin{aligned} \nabla _{\varvec{x}} \mathscr {J}(\varvec{x}) = \left( \frac{\partial \mathscr {J}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}} = \left( \frac{\partial \mathscr {F}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}} + \left( \frac{\partial \mathscr {I}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}}. \end{aligned}$$
(30)

Define \(\varvec{\theta }: \mathbb {C}^n \rightarrow \mathbb {C}^n\) as \(\varvec{\theta } (\varvec{x}) = \varvec{A} \varvec{x}\), \(\varvec{\varepsilon }: \mathbb {C}^n \rightarrow \mathbb {R}^n\) as \(\varvec{\varepsilon } = |\varvec{\theta }|-\varvec{y}\), then \(\mathscr {F} = \frac{1}{2}\left\| \varvec{\varepsilon } \right\| _2^2\). By the chain rule of real functions and the chain rule for Wirtinger derivatives [31],

$$\begin{aligned} \frac{\partial \mathscr {F}}{\partial \varvec{x}^{\textsf{H}}} = \,&\frac{\partial \mathscr {F}}{\partial \varvec{\varepsilon }^{\textsf{T}}} \frac{\partial \varvec{\varepsilon }}{\partial \varvec{x}^{\textsf{H}}} = \varvec{\varepsilon }^{\textsf{T}} \frac{\partial \varvec{\varepsilon }}{\partial \varvec{x}^{\textsf{H}}} \\ = \,&\varvec{\varepsilon }^{\textsf{T}} \left( \frac{\partial \varvec{\varepsilon }}{\partial \varvec{\theta }^{\textsf{T}}} \frac{\partial \varvec{\theta }}{\partial \varvec{x}^{\textsf{H}}} + \frac{\partial \varvec{\varepsilon }}{\partial \varvec{\theta }^{\textsf{H}}} \frac{\partial \varvec{\theta }^*}{\partial \varvec{x}^{\textsf{H}}}\right) , \end{aligned}$$

by direct calculation we know

$$\begin{aligned} \begin{gathered} \frac{\partial \varvec{\theta }}{\partial \varvec{x}^{\textsf {H}}} = \varvec{0}, \\ \frac{\partial \varvec{\theta }^*}{\partial \varvec{x}^{\textsf {H}}} = \left( \frac{\partial \varvec{\theta }}{\partial \varvec{x}^{\textsf {T}}}\right) ^* = \varvec{A}^*, \\ \frac{\partial \varvec{\varepsilon }}{\partial \varvec{\theta }^{\textsf {H}}} = \frac{1}{2} \operatorname {diag}\left( \frac{\varvec{\theta }}{|\varvec{\theta }|}\right) , \end{gathered} \end{aligned}$$

where \(\operatorname {diag} (\cdot )\) denotes the diagonal matrix formed from the given vector. Thus,

$$\begin{aligned} \begin{aligned} \frac{\partial \mathscr {F}}{\partial \varvec{x}^{\textsf{H}}}&= \frac{1}{2} \varvec{\varepsilon }^{\textsf{T}} \operatorname {diag}\left( \frac{\varvec{\theta }}{|\varvec{\theta }|}\right) \varvec{A}^* \\ &= \frac{1}{2} \left( |\varvec{A} \varvec{x}| - \varvec{y}\right) ^{\textsf{T}} \operatorname {diag}\left( \frac{\varvec{A} \varvec{x}}{|\varvec{A} \varvec{x}|}\right) \varvec{A}^*. \end{aligned} \end{aligned}$$
(31)

Define \(\varvec{\rho }: \mathbb {C}^n \rightarrow \mathbb {C}^n\) as \(\varvec{\rho } (\varvec{x}) = \nabla \varvec{x} + \varvec{u}\), then \(\mathscr {I} = \frac{\mu }{2}\left\| \varvec{\rho } \right\| _2^2\). By the chain rule for Wirtinger derivatives [31], direct calculation gives

$$\begin{aligned} \begin{gathered} \frac{\partial \varvec{\rho }}{\partial \varvec{x}^{\textsf {H}}} = \varvec{0}, \\ \frac{\partial \varvec{\rho }^*}{\partial \varvec{x}^{\textsf {H}}} = \left( \frac{\partial \varvec{\rho }}{\partial \varvec{x}^{\textsf {T}}}\right) ^* = \nabla ^*, \\ \frac{\partial \mathscr {I}}{\partial \varvec{\rho }^{\textsf {H}}} = \frac{\mu }{2} \varvec{\rho }^{\textsf {T}}. \end{gathered} \end{aligned}$$

Thus,

$$\begin{aligned} \frac{\partial \mathscr {I}}{\partial \varvec{x}^{\textsf{H}}} = \frac{\mu }{2} \varvec{\rho }^{\textsf{T}} \nabla ^* = \frac{\mu }{2} \left( \nabla \varvec{x} + \varvec{u}\right) ^{\textsf{T}} \nabla ^*. \end{aligned}$$
(32)

Finally, combining (30), (31), and (32) yields

$$\begin{aligned} \nabla _{\varvec{x}} \mathscr {J}(\varvec{x}) = \,&\left( \frac{\partial \mathscr {F}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}} + \left( \frac{\partial \mathscr {I}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}} \\ = \,&\frac{1}{2} \varvec{A}^{\textsf{H}} \operatorname {diag}\left( \frac{\varvec{A} \varvec{x}}{|\varvec{A} \varvec{x}|}\right) \left( |\varvec{A} \varvec{x}| - \varvec{y}\right) + \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x} + \varvec{u}\right) . \end{aligned}$$
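For reference, the closed form above translates directly into a matrix-free routine. The sketch below is illustrative only: the unitary operator \(\varvec{A}\) is assumed here to be a normalized 2-D FFT standing in for the paper's propagation operator, and \(\nabla \) is implemented as forward differences with the boundary rows that appear in the proof of Lemma 2.

```python
import numpy as np

def A(x):
    """Assumed unitary measurement operator (normalized 2-D FFT as a stand-in)."""
    return np.fft.fft2(x, norm="ortho")

def A_H(z):
    """Adjoint (inverse) of the assumed unitary operator."""
    return np.fft.ifft2(z, norm="ortho")

def grad_op(x):
    """Forward differences; the last row/column difference is taken against zero,
    matching the boundary terms used in the proof of Lemma 2."""
    gv = np.diff(x, axis=0, append=0.0)
    gh = np.diff(x, axis=1, append=0.0)
    return np.stack([gv, gh])

def grad_adj(g):
    """Adjoint of grad_op (the operator is real, so the adjoint is its transpose)."""
    gv, gh = g
    return -np.diff(gv, axis=0, prepend=0.0) - np.diff(gh, axis=1, prepend=0.0)

def wirtinger_grad(x, y, u, mu):
    """Wirtinger gradient of J(x) = 0.5*|| |Ax| - y ||^2 + (mu/2)*||grad(x) + u||^2,
    following the closed form derived above."""
    Ax = A(x)
    mag = np.abs(Ax)
    phase = Ax / np.maximum(mag, 1e-12)            # A x / |A x|, guarded at zero
    gF = 0.5 * A_H(phase * (mag - y))              # (1/2) A^H diag(Ax/|Ax|) (|Ax| - y)
    gI = 0.5 * mu * grad_adj(grad_op(x) + u)       # (mu/2) grad^H (grad x + u)
    return gF + gI
```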

Proof for Convergence Analysis in Sect. 4.3

1.1 Convergence of the ADMM Algorithm

Let us denote

$$\begin{aligned} \mathcal {G} (\varvec{z}, \varvec{x}) = \mathscr {H} (\varvec{z}) + \mathscr {F} (\varvec{x}) + \mathscr {G} (\varvec{x}) \end{aligned}$$

as the objective function of the problem (12).
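For orientation, a heavily hedged sketch of the ADMM iteration analyzed below is given next, reusing the helper functions from the two preceding sketches (scad_shrink_complex, A_H, grad_op, wirtinger_grad). The exact sub-problem forms, the weight with which the SCAD penalty enters \(\mathscr {H}\), and the constraint set \(C\) (modeled here as a support mask combined with a unit-modulus absorption cap) are assumptions made for illustration, not the paper's precise updates.

```python
import numpy as np

def project_C(x, support):
    """Assumed projection onto C: zero outside the support mask and
    modulus clipped to at most 1 (an absorption-type constraint)."""
    x = x * support
    mag = np.abs(x)
    return x * np.where(mag > 1.0, 1.0 / np.maximum(mag, 1e-12), 1.0)

def admm_pr(y, support, lam=0.1, gamma=3.7, mu=1.0, outer=100, inner=20):
    """Sketch of the ADMM loop in scaled dual form: a z-update by the complex SCAD
    shrinkage of Theorem 2 (with assumed weight eta = 1/mu), an x-update by a few
    projected Wirtinger gradient steps on sub-problem (23), and the dual update
    w <- w + grad(x) - z."""
    x = A_H(y.astype(complex))                 # simple back-propagation initializer
    z = grad_op(x)
    w = np.zeros_like(z)
    step = 1.0 / (0.5 + 4.0 * mu)              # conservative step, below 1/L of the smooth term
    for _ in range(outer):
        z = scad_shrink_complex(grad_op(x) + w, gamma, lam, 1.0 / mu)
        u = w - z                              # so that I(x) = (mu/2)*||grad(x) + u||^2
        for _ in range(inner):                 # plain (unaccelerated) Wirtinger gradient projection
            x = project_C(x - step * wirtinger_grad(x, y, u, mu), support)
        w = w + grad_op(x) - z
    return x
```

With this scaled form, \(\varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} = \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)}\), which is the relation used repeatedly in Lemmas 4-6 below.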

Lemma 2

The objective function \(\mathcal {G} (\varvec{z}, \varvec{x})\) is coercive on the feasible domain \(\{(\varvec{z}, \varvec{x}) \mid \nabla \varvec{x} - \varvec{z} = \varvec{0}\}\), i.e., if \(\Vert (\varvec{z}, \varvec{x})\Vert _2 \rightarrow +\infty \), then \(\mathcal {G} (\varvec{z}, \varvec{x}) \rightarrow +\infty \).

Proof

Given that \(\Vert (\varvec{z}, \varvec{x})\Vert _2 = \left( \Vert \varvec{z}\Vert _2^2 + \Vert \varvec{x}\Vert _2^2 \right) ^{\frac{1}{2}}\), as \(\Vert (\varvec{z}, \varvec{x})\Vert _2 \rightarrow +\infty \), either \(\Vert \varvec{z}\Vert _2 \rightarrow +\infty \) or \(\Vert \varvec{x}\Vert _2 \rightarrow +\infty \) must hold.

If \(\Vert \varvec{z}\Vert _2 = \Vert \nabla \varvec{x}\Vert _2 \rightarrow +\infty \), it follows that

$$\begin{aligned} \Vert \nabla \varvec{x}\Vert _2^2 = \,&\sum _{j_1=1}^{n_{1}-1} \sum _{j_2=1}^{n_2}\left| \varvec{X}_{j_1+1, j_2}-\varvec{X}_{j_1, j_2}\right| ^2 +\sum _{j_1=1}^{n_{1}} \sum _{j_2=1}^{n_2-1}\left| \varvec{X}_{j_1, j_2+1}-\varvec{X}_{j_1, j_2}\right| ^2 \\ &+ \sum _{j_2=1}^{n_2} \left| \varvec{X}_{(n_1, j_2)}\right| ^2 + \sum _{j_1=1}^{n_{1}} \left| \varvec{X}_{(j_1, n_2)}\right| ^2 \\ \leqslant \,&2 \sum _{j_1=1}^{n_{1}-1} \sum _{j_2=1}^{n_2}\left( \left| \varvec{X}_{j_1+1, j_2}\right| ^2+\left| \varvec{X}_{j_1, j_2}\right| ^2\right) +2 \sum _{j_1=1}^{n_{1}} \sum _{j_2=1}^{n_2-1}\left( \left| \varvec{X}_{j_1, j_2+1}\right| ^2+\left| \varvec{X}_{j_1, j_2}\right| ^2\right) \\ &+ 2\sum _{j_2=1}^{n_2}\left( \left| \varvec{X}_{(1, j_2)}\right| ^2 + \left| \varvec{X}_{(n_1, j_2)}\right| ^2\right) + 2\sum _{j_1=1}^{n_{1}}\left( \left| \varvec{X}_{(j_1, 1)}\right| ^2+\left| \varvec{X}_{(j_1, n_2)}\right| ^2\right) \\ \leqslant \,&8 \sum _{j_1=1}^{n_{1}} \sum _{j_2=1}^{n_2}\left| \varvec{X}_{j_1, j_2}\right| ^2 \\ = \,&8\Vert \varvec{x}\Vert _2^2, \end{aligned}$$

thus \(\Vert \varvec{x}\Vert _2 \rightarrow +\infty \), where \(\varvec{X}\) denotes the matrix form of the vectorized image \(\varvec{x}\).

If \(\Vert \varvec{x}\Vert _2 \rightarrow +\infty \), then since \(\varvec{A}\) is a unitary linear map, \(\Vert \varvec{A} \varvec{x}\Vert _2 = \Vert \varvec{x}\Vert _2 \rightarrow +\infty \), and consequently \(\mathscr {F} (\varvec{x}) \geqslant \frac{1}{2}\left( \Vert \varvec{A} \varvec{x}\Vert _2 - \Vert \varvec{y}\Vert _2\right) ^2 \rightarrow +\infty \). Given that \(\mathscr {G} (\varvec{x}), \mathscr {H} (\varvec{z}) \geqslant 0\), it follows that \(\mathcal {G} (\varvec{z}, \varvec{x}) \rightarrow +\infty \). \(\square \)

From the proof of Lemma 2, we obtain that \(\Vert \nabla \varvec{x}\Vert _2^2 \leqslant 8 \Vert \varvec{x}\Vert _2^2\), implying that \(\rho \left( \nabla ^{\textsf{H}} \nabla \right) = \Vert \nabla \Vert _2^2 \leqslant 8\). This leads to the following corollary:

Corollary 1

If \(\nabla \) represents the finite difference operator, then

$$\begin{aligned} \rho \left( \nabla ^{\textsf{H}} \nabla \right) \leqslant 8 \end{aligned}$$

holds.
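The bound can be checked numerically on a small grid (a sanity check, not part of the proof); the sketch below builds \(\nabla \) explicitly via Kronecker products of the 1-D forward-difference matrix with the boundary row used above.

```python
import numpy as np

def diff_matrix_1d(n):
    """1-D forward-difference matrix with the boundary row used in Lemma 2:
    (Dx)_i = x_{i+1} - x_i for i < n and (Dx)_n = -x_n."""
    return -np.eye(n) + np.diag(np.ones(n - 1), k=1)

def spectral_radius_grad(n1, n2):
    """rho(grad^H grad) for an n1 x n2 image; the Kronecker ordering only
    relabels pixels and does not affect the spectrum."""
    D1, D2 = diff_matrix_1d(n1), diff_matrix_1d(n2)
    G = np.vstack([np.kron(np.eye(n2), D1), np.kron(D2, np.eye(n1))])
    return np.linalg.eigvalsh(G.T @ G).max()

print(spectral_radius_grad(16, 16))   # slightly below 8, approaching 8 as the grid grows
```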

Recall from (31) that

$$\begin{aligned} \nabla _{\varvec{x}} \mathscr {F} = \left( \frac{\partial \mathscr {F}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}} = \frac{1}{2} \varvec{A}^\textsf{H} \operatorname {diag}\left( \frac{\varvec{A} \varvec{x}}{|\varvec{A} \varvec{x}|}\right) (|\varvec{A} \varvec{x}|-\varvec{y}). \end{aligned}$$

For the SCAD function (10), a direct calculation shows that the j-th component of \(\nabla _{\varvec{z}} \mathscr {H} (\varvec{z})\) is

$$\begin{aligned} \nabla _{\varvec{z}} \mathscr {H} (\varvec{z})_{(j)} = \left\{ \begin{aligned}&\frac{\lambda \varvec{z}_{(j)}}{2 \left| \varvec{z}_{(j)}\right| }, & 0 \leqslant \left| \varvec{z}_{(j)}\right| \leqslant \lambda , \\ &\frac{-\varvec{z}_{(j)}+ \gamma \lambda \frac{\varvec{z}_{(j)}}{\left| \varvec{z}_{(j)}\right| }}{2(\gamma -1)}, & \lambda <\left| \varvec{z}_{(j)}\right| \leqslant \gamma \lambda , \\&0, & \left| \varvec{z}_{(j)}\right| >\gamma \lambda . \end{aligned} \right. \end{aligned}$$
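A direct NumPy transcription of this piecewise expression is given below (a sketch; the handling of the zero-magnitude case is an implementation choice not specified by the formula).

```python
import numpy as np

def scad_wirtinger_grad(z, lam, gamma):
    """Componentwise Wirtinger gradient of the SCAD term H(z), as in the
    piecewise formula above; entries with |z_j| = 0 are mapped to 0."""
    z = np.asarray(z, dtype=complex)
    mag = np.abs(z)
    unit = np.where(mag > 0, z / np.maximum(mag, 1e-300), 0.0 + 0.0j)
    return np.where(
        mag <= lam,
        lam * unit / 2.0,
        np.where(
            mag <= gamma * lam,
            (-z + gamma * lam * unit) / (2.0 * (gamma - 1.0)),
            0.0 + 0.0j,
        ),
    )
```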

As noted in the supplementary information of [23], \(\mathscr {F}\) is Lipschitz differentiable, meaning that its gradient is Lipschitz continuous.

Lemma 3

([23]) The function \(\mathscr {F}\) is Lipschitz differentiable with a Lipschitz constant \(L_{\mathscr {F}} = \frac{1}{2} \rho \left( \varvec{A}^{\textsf{H}} \varvec{A}\right) \).

Lemma 4

There exist positive constants M and \(C_1\), such that

$$\begin{aligned} \left\| \varvec{x}^{(k+1)} - \varvec{x}^{(k)} \right\| _2 \leqslant M \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2 \end{aligned}$$

and \(\nabla ^\textsf{H} \varvec{\omega }^{(k)} = - \frac{2}{\mu } \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k)}\right) \), as well as \(\varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} \in \operatorname {Im} (\nabla )\), moreover

$$\begin{aligned} \left\| \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} \right\| _2 \leqslant C_1 \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

Proof

Since the finite difference operator \(\nabla \) has full column rank, it admits a left inverse \(\nabla ^{-1}\), which is a linear operator between finite-dimensional spaces and hence bounded. Therefore, there exists a constant \(M > 0\) such that

$$\begin{aligned} \left\| \nabla ^{-1} \varvec{u_1} - \nabla ^{-1} \varvec{u_2} \right\| _2 \leqslant M \left\| \varvec{u_1} - \varvec{u_2} \right\| _2, \quad \forall \varvec{u_1}, \varvec{u_2}, \end{aligned}$$

implying

$$\begin{aligned} \left\| \varvec{x}^{(k+1)} - \varvec{x}^{(k)} \right\| _2 \leqslant M \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

Let \(\lambda _{++}\) denote the smallest strictly positive eigenvalue of \(\nabla ^{\textsf{H}} \nabla \). Given that \(\varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} = \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)} \in \operatorname {Im} (\nabla )\) and considering the optimality condition for the update of \(\varvec{x}\) in the algorithm,

$$\begin{aligned} \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) + \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)}+ \varvec{\omega }^{(k)}\right) = \varvec{0}, \end{aligned}$$

where the gradient of the penalty term of the augmented Lagrangian function is similar to the gradient of \(\mathscr {I}\) discussed in Appendix B. From the update of \(\varvec{\omega }\), we have

$$\begin{aligned} \frac{\mu }{2} \nabla ^\textsf{H} \varvec{\omega }^{(k+1)} = - \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) . \end{aligned}$$

Thus,

$$\begin{aligned}&\left\| \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right\| _2\\&\quad \leqslant \, \frac{1}{\sqrt{\lambda _{++}}} \left\| \nabla ^\textsf{H}\left( \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) \right\| _2\\&\quad = \, \frac{1}{\sqrt{\lambda _{++}}} \cdot \frac{2}{\mu } \left\| \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) - \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k)}\right) \right\| _2 \\&\quad \leqslant \, \frac{1}{\sqrt{\lambda _{++}}} \cdot \frac{2}{\mu } L_{\mathscr {F}} \left\| \varvec{x}^{(k+1)} - \varvec{x}^{(k)} \right\| _2\\&\quad \leqslant \, \frac{1}{\sqrt{\lambda _{++}}} \cdot \frac{2}{\mu } L_{\mathscr {F}} M \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

We only need to take \(C_1 = \frac{2 L_{\mathscr {F}} M}{\mu \sqrt{\lambda _{++}}}\). This completes the proof. \(\square \)
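The two quantities introduced here can also be inspected numerically for a small grid, reusing diff_matrix_1d from the sketch after Corollary 1 (again only a sanity check under the stated boundary convention).

```python
import numpy as np

# Build grad for a 16 x 16 image and check the facts used in Lemma 4.
n = 16
D = diff_matrix_1d(n)
G = np.vstack([np.kron(np.eye(n), D), np.kron(D, np.eye(n))])
svals = np.linalg.svd(G, compute_uv=False)

print(svals.min() > 0)          # full column rank, so a left inverse exists
M_bound = 1.0 / svals.min()     # a valid Lipschitz constant M for the left inverse
lam_pp = svals.min() ** 2       # smallest strictly positive eigenvalue of grad^H grad
print(M_bound, lam_pp)
```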

Lemma 5

If \(\mu > \frac{\frac{M^2}{2} L_{\mathscr {F}} +1}{1-C_1^2}\), then

$$\begin{aligned} \mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) - \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \geqslant \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2^2. \end{aligned}$$

Moreover, \(\mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \) is bounded below, and the iteration sequence \(\left\{ \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)} \right\} \) is bounded.

Proof

From the optimality condition for updating \(\varvec{z}\) in the algorithm update process (13), it directly follows that

$$\begin{aligned} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \leqslant \mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) . \end{aligned}$$

Applying the identity

$$\begin{aligned} \Vert \varvec{x}\Vert _2^2-\Vert \varvec{y}\Vert _2^2=\Vert \varvec{x}-\varvec{y}\Vert _2^2+2 \operatorname {Re}\langle \varvec{y}, \varvec{x}-\varvec{y}\rangle , \quad \forall \varvec{x}, \varvec{y} \in \mathbb {C}^n, \end{aligned}$$

we have

$$\begin{aligned}&\mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) - \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \\&\quad = \, \mathscr {F} \left( \varvec{x}^{(k)}\right) - \mathscr {F} \left( \varvec{x}^{(k+1)}\right) + \mu \operatorname {Re} \left\langle \varvec{\omega }^{(k+1)}, \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\rangle \\&\qquad + \frac{\mu }{2} \left\| \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\| _2^2 - \mu \left\| \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} \right\| _2^2\\&\quad = \, \mathscr {F} \left( \varvec{x}^{(k)}\right) - \mathscr {F} \left( \varvec{x}^{(k+1)}\right) + \mu \operatorname {Re} \left\langle \nabla ^{\textsf{H}} \varvec{\omega }^{(k+1)}, \varvec{x}^{(k)} - \varvec{x}^{(k+1)} \right\rangle \\&\qquad + \frac{\mu }{2} \left\| \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\| _2^2 - \mu \left\| \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} \right\| _2^2 \\&\quad = \, \mathscr {F} \left( \varvec{x}^{(k)}\right) - \mathscr {F} \left( \varvec{x}^{(k+1)}\right) - 2 \operatorname {Re} \left\langle \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) , \varvec{x}^{(k)} - \varvec{x}^{(k+1)} \right\rangle \\&\qquad + \mu \left\| \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\| _2^2 - \mu \left\| \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} \right\| _2^2. \end{aligned}$$

Leveraging the Lipschitz differentiability of \(\mathscr {F}\) and its first-order expansion at \(\varvec{x}^{(k+1)}\) [16], and referring to Lemma 4,

$$\begin{aligned}&\mathscr {F} \left( \varvec{x}^{(k)}\right) - \mathscr {F} \left( \varvec{x}^{(k+1)}\right) - 2 \operatorname {Re} \left\langle \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) , \varvec{x}^{(k)} - \varvec{x}^{(k+1)} \right\rangle \\&\quad \geqslant \, - \frac{L_{\mathscr {F}}}{2} \left\| \varvec{x}^{(k)} - \varvec{x}^{(k+1)} \right\| _2^2\\&\quad \geqslant \, - \frac{L_{\mathscr {F}} M^2}{2} \left\| \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\| _2^2. \end{aligned}$$

Therefore,

$$\begin{aligned}&\mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) - \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \\&\quad \geqslant \, - \frac{L_{\mathscr {F}} M^2}{2} \left\| \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\| _2^2 + \mu \left\| \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\| _2^2 - \mu C_1^2 \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2^2 \\&\quad \geqslant \, \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2^2. \end{aligned}$$

Thus,

$$\begin{aligned} \mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) - \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \geqslant \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2^2. \end{aligned}$$

As the image gradient operator \(\nabla \) is column full rank, there exists a unique \(\varvec{x}^\prime = \nabla ^{-1} \varvec{z}^{(k)}\) such that \(\nabla \varvec{x}^\prime - \varvec{z}^{(k)} = \varvec{0}\), thus,

$$\begin{aligned}&\mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \\&\quad = \, \mathcal {G} \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}\right) + \mu \operatorname {Re} \left\langle \varvec{\omega }^{(k)}, \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^\prime \right\rangle + \frac{\mu }{2} \left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2^2\\&\quad = \, \mathcal {G} \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}\right) + \mu \operatorname {Re} \left\langle \nabla ^{\textsf{H}} \varvec{\omega }^{(k)}, \varvec{x}^{(k)} - \varvec{x}^\prime \right\rangle + \frac{\mu }{2} \left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2^2\\&\quad = \, \mathcal {G} \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}\right) + 2 \operatorname {Re} \left\langle \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k)}\right) , \varvec{x}^\prime - \varvec{x}^{(k)} \right\rangle + \frac{\mu }{2} \left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2^2 \\&\quad \geqslant \, \mathscr {H} \left( \varvec{z}^{(k)}\right) + \mathscr {G} \left( \varvec{x}^{\prime }\right) + \mathscr {F} \left( \varvec{x}^{\prime }\right) + \frac{\mu - L_{\mathscr {F}} M^2}{2} \left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2^2\\&\quad \geqslant \, \inf _{\varvec{z}, \varvec{x} \in C} \{ \mathcal {G} (\varvec{z}, \varvec{x}) \mid \nabla \varvec{x} - \varvec{z} = \varvec{0}\} + \frac{\mu - L_{\mathscr {F}} M^2}{2} \left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2^2\\&\quad > \, - \infty . \end{aligned}$$

The last two steps hold because the function \(\mathcal {G}\) is always non-negative, ensuring that a finite lower bound exists and the result does not tend towards negative infinity.

Since \(\mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \) is decreasing and bounded below, \(\mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \) is bounded, with the upper bound controlled by \(\mathcal {L}_\mu \left( \varvec{z}^{(0)}, \varvec{x}^{(0)}, \varvec{\omega }^{(0)}\right) \).

From the inequality established earlier, we have

$$\begin{aligned} \mathcal {G} \left( \varvec{z}^{(k)}, \varvec{x}^\prime \right) + \frac{\mu - L_{\mathscr {F}} M^2}{2} \left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2^2 \leqslant \mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) , \end{aligned}$$

then the left-hand side is bounded above by \(\mathcal {L}_\mu \left( \varvec{z}^{(0)}, \varvec{x}^{(0)}, \varvec{\omega }^{(0)}\right) \) and below by 0. Thus, \(\mathcal {G} \left( \varvec{z}^{(k)}, \varvec{x}^\prime \right) \) and \(\left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2\) are both bounded. Since \(\mathcal {G}\) is coercive as per Lemma 2, \(\left\{ \varvec{z}^{(k)}, \varvec{x}^\prime = \nabla ^{-1} \varvec{z}^{(k)}\right\} \) is bounded. Given that \(\left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2\) is bounded, the sequence \(\left\{ \nabla \varvec{x}^{(k)}\right\} \) is also bounded. Consequently, by the column full rank property of \(\nabla \), the sequence \(\left\{ \varvec{x}^{(k)}\right\} \) is bounded as well.

Then, from Lemma 4 and the Lipschitz gradient property of \(\mathscr {F}\), we deduce that \(\left\{ \nabla ^\textsf{H} \varvec{\omega }^{(k)} \right\} \) is bounded. By considering the telescoping sum for \(\varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} \in \operatorname {Im} (\nabla )\), we conclude that \(\varvec{\omega }^{(k)} - \varvec{\omega }^{(0)} \in \operatorname {Im} (\nabla )\), and thus \(\left\{ \varvec{\omega }^{(k)} \right\} \) is bounded. Therefore, the sequence \(\left\{ \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)} \right\} \) is bounded. \(\square \)

Lemma 6

There exists a constant \(C_2 > 0\) and \(d^{(k+1)} \in \partial \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \) such that

$$\begin{aligned} \left\| d^{(k+1)} \right\| _2 \leqslant C_2 \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

Proof

Initially,

$$\begin{aligned} \nabla _{\varvec{\omega }} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) = \frac{\mu }{2} \left( \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)}\right) = \frac{\mu }{2} \left( \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) , \end{aligned}$$

and according to Lemma 4,

$$\begin{aligned} \left\| \nabla _{\varvec{\omega }} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \right\| _2 \leqslant \frac{\mu }{2} C_1 \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

Additionally,

$$\begin{aligned}&\nabla _{\varvec{z}} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \\&\quad = \, \nabla _{\varvec{z}} \mathscr {H} \left( \varvec{z}^{(k+1)}\right) + \frac{\mu }{2} \left( \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)} + \varvec{\omega }^{(k+1)}\right) \\&\quad = \, \nabla _{\varvec{z}} \mathscr {H} \left( \varvec{z}^{(k+1)}\right) + \frac{\mu }{2} \left( \nabla \varvec{x}^{(k)} - \varvec{z}^{(k+1)} + \varvec{\omega }^{(k)}\right) + \frac{\mu }{2} \left( \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} + \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) . \end{aligned}$$

From the optimality conditions for updating \(\varvec{z}\) in the algorithm update process (14), we know

$$\begin{aligned} \nabla _{\varvec{z}} \mathscr {H} \left( \varvec{z}^{(k+1)}\right) + \frac{\mu }{2} \left( \nabla \varvec{x}^{(k)} - \varvec{z}^{(k+1)} + \varvec{\omega }^{(k)}\right) = \varvec{0}, \end{aligned}$$

Therefore,

$$\begin{aligned} \nabla _{\varvec{z}} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) = \frac{\mu }{2} \left( \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} + \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) . \end{aligned}$$

Hence, by Lemma 4,

$$\begin{aligned} \left\| \nabla _{\varvec{z}} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \right\| _2 \leqslant \frac{\mu }{2} (C_1+1) \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

Moreover,

$$\begin{aligned}&\nabla _{\varvec{x}} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \\&\quad = \, \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) + \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)} + \varvec{\omega }^{(k+1)}\right) \\&\quad = \, \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) + \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)} + \varvec{\omega }^{(k)}\right) + \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) \\&\quad = \, \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) . \end{aligned}$$

Using a method similar to that used in the proof of Lemma 4,

$$\begin{aligned}&\left\| \nabla _{\varvec{x}} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \right\| _2 \\&\quad =\, \frac{\mu }{2} \left\| \nabla ^\textsf{H}\left( \varvec{\omega }^{(k+1)}-\varvec{\omega }^{(k)}\right) \right\| _2\\&\quad = \, \left\| \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) -\nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k)}\right) \right\| _2 \\&\quad \leqslant \, L_{\mathscr {F}} \left\| \varvec{x}^{(k+1)} - \varvec{x}^{(k)} \right\| _2\\&\quad \leqslant \, L_{\mathscr {F}} M \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

Thus, let

$$\begin{aligned} d^{(k+1)} = \frac{\mu }{2} \left( \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} + \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}, \nabla ^{\textsf{H}} \left( \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) , \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) , \end{aligned}$$

we have

$$\begin{aligned} \left\| d^{(k+1)} \right\| _2 \leqslant C_2 \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2, \end{aligned}$$

where \(C_2 = \frac{\mu }{2} (2C_1+1) + L_{\mathscr {F}} M\). \(\square \)

Proof of Theorem 3

Since the sequence \(\left\{ \varvec{x}^{(k)}, \varvec{z}^{(k)}, \varvec{\omega }^{(k)} \right\} \) is bounded, there exists a convergent subsequence and limit point, denoted as \(\left( \varvec{x}^{(k_s)}, \varvec{z}^{(k_s)}, \varvec{\omega }^{(k_s)}\right) \rightarrow \left( \varvec{x}^*, \varvec{z}^*, \varvec{\omega }^*\right) \) as \(s \rightarrow \infty \).

According to Lemma 5, since \(\mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \) is decreasing and bounded below, it converges. Particularly, its subsequence \(\mathcal {L}_\mu \left( \varvec{z}^{(k_s)}, \varvec{x}^{(k_s)}, \varvec{\omega }^{(k_s)}\right) \) also converges. Since C is a closed convex set, \(\mathscr {G} \left( \varvec{x}\right) = I_C \left( \varvec{x}\right) \) is lower semi-continuous, thus \(\mathcal {L}_\mu \) is lower semi-continuous. Therefore, \(\lim _{s \rightarrow \infty } \mathcal {L}_\mu \left( \varvec{z}^{(k_s)}, \varvec{x}^{(k_s)}, \varvec{\omega }^{(k_s)}\right) \geqslant \mathcal {L}_\mu \left( \varvec{z}^*, \varvec{x}^*, \varvec{\omega }^*\right) \). On the other hand, the only discontinuous term in \(\mathcal {L}_\mu \) is \(\mathscr {G}\), so

$$\begin{aligned} \lim _{s \rightarrow \infty } \mathcal {L}_\mu \left( \varvec{z}^{(k_s)}, \varvec{x}^{(k_s)}, \varvec{\omega }^{(k_s)}\right) - \mathcal {L}_\mu \left( \varvec{z}^*, \varvec{x}^*, \varvec{\omega }^*\right) = \limsup _{s \rightarrow \infty } \mathscr {G} \left( \varvec{x}^{(k_s)}\right) - \mathscr {G} \left( \varvec{x}^*\right) . \end{aligned}$$

However, from the optimality condition in the algorithm updates (14) and taking the limit, it is found that \(\limsup _{s \rightarrow \infty } \mathscr {G} \left( \varvec{x}^{(k_s)}\right) - \mathscr {G} \left( \varvec{x}^*\right) \leqslant 0\), thus

$$\begin{aligned} \lim _{s \rightarrow \infty } \mathcal {L}_\mu \left( \varvec{z}^{(k_s)}, \varvec{x}^{(k_s)}, \varvec{\omega }^{(k_s)}\right) = \mathcal {L}_\mu \left( \varvec{z}^*, \varvec{x}^*, \varvec{\omega }^*\right) . \end{aligned}$$

Since \(\mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \) is convergent, by Lemma 5 it is known that \(\left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2 \rightarrow 0\), therefore by Lemma 6 it is known that there exists \(d^{(k)} \in \partial \mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \) such that \(\left\| d^{(k)} \right\| _2 \rightarrow 0\). Particularly, \(\left\| d^{(k_s)} \right\| _2 \rightarrow 0\) as \(s \rightarrow \infty \). Since \(\lim _{s \rightarrow \infty } \mathcal {L}_\mu \left( \varvec{z}^{(k_s)}, \varvec{x}^{(k_s)}, \varvec{\omega }^{(k_s)}\right) = \mathcal {L}_\mu \left( \varvec{z}^*, \varvec{x}^*, \varvec{\omega }^*\right) \), it follows that \(\varvec{0} \in \partial \mathcal {L}_\mu \left( \varvec{z}^*, \varvec{x}^*, \varvec{\omega }^*\right) \). \(\square \)

1.2 Convergence of the AWGP Algorithm

Consider the objective function of problem (23), defined as

$$\begin{aligned} \mathscr {J}(\varvec{x}) = \mathscr {F}(\varvec{x}) + \mathscr {I}(\varvec{x}), \end{aligned}$$

where \(\mathscr {F}(\varvec{x}) = \frac{1}{2}\left\| |\varvec{A} \varvec{x}|-\varvec{y}\right\| _2^2\) and \(\mathscr {I}(\varvec{x}) = \frac{\mu }{2}\left\| \nabla \varvec{x} + \varvec{u} \right\| _2^2\), with \(\varvec{u} = - \varvec{z}^{(k)}+ \varvec{\omega }^{(k)}\).

Lemma 7

The function \(\mathscr {I}\) is Lipschitz differentiable with a Lipschitz constant \(L_{\mathscr {I}} = \frac{\mu }{2} \sqrt{\rho \left( \nabla ^{\textsf{H}} \nabla \right) }\).

Proof

From (32), we know that

$$\begin{aligned} \nabla _{\varvec{x}} \mathscr {I} (\varvec{x}) = \left( \frac{\partial \mathscr {I}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}} = \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x} + \varvec{u}\right) . \end{aligned}$$

For any \(\varvec{x}_1, \varvec{x}_2 \in \mathbb {C}^n\), we have

$$\begin{aligned}&\left\| \nabla _{\varvec{x}} \mathscr {I} (\varvec{x}_1) - \nabla _{\varvec{x}} \mathscr {I} (\varvec{x}_2) \right\| _2\\&\quad = \, \left\| \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x}_1 + \varvec{u}\right) - \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x}_2 + \varvec{u}\right) \right\| _2 \\&\quad = \, \frac{\mu }{2} \left\| \nabla ^{\textsf{H}} \nabla \left( \varvec{x}_1 - \varvec{x}_2\right) \right\| _2\\&\quad \leqslant \, \frac{\mu }{2} \left\| \nabla ^{\textsf{H}} \nabla \right\| _2 \left\| \varvec{x}_1 - \varvec{x}_2 \right\| _2\\&\quad = \, \frac{\mu }{2} \sqrt{\rho \left( \nabla ^{\textsf{H}} \nabla \right) } \left\| \varvec{x}_1 - \varvec{x}_2 \right\| _2. \end{aligned}$$

Therefore, \(\nabla _{\varvec{x}} \mathscr {I}\) is Lipschitz continuous, i.e., \(\mathscr {I} (\varvec{x})\) is Lipschitz differentiable. \(\square \)

Proof of Theorem 4

The gradient projection algorithm is equivalent to a proximal gradient method whose nonsmooth term is an indicator function. Since the constraint set C in problem (23) is a closed convex set, the general results on accelerated proximal gradient methods for convex functions [6] apply: convergence requires only that the objective function \(\mathscr {J}\) be Lipschitz differentiable and that the step size not exceed the reciprocal of the Lipschitz constant.

As demonstrated by Lemma 3 and Lemma 7, \(\mathscr {J}(\varvec{x}) = \mathscr {F}(\varvec{x}) + \mathscr {I}(\varvec{x})\) is Lipschitz differentiable with a Lipschitz constant \(L_{\mathscr {J}} = \max \left\{ L_{\mathscr {F}}, L_{\mathscr {I}}\right\} = \max \left\{ \frac{1}{2} \rho \left( \varvec{A}^{\textsf{H}} \varvec{A}\right) , \frac{\mu }{2} \sqrt{\rho \left( \nabla ^{\textsf{H}} \nabla \right) }\right\} \). \(\square \)
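To connect Theorem 4 with the inner solver, here is an illustrative accelerated variant of the projected Wirtinger gradient step (FISTA-type momentum in the sense of [6] and [49]), reusing wirtinger_grad and project_C from the earlier sketches. The step size used below is a deliberately conservative numerical choice, no larger than the reciprocal of the Lipschitz constant, rather than the exact constant \(L_{\mathscr {J}}\) stated above.

```python
import numpy as np

def awgp(x0, y, u, mu, support, steps=100):
    """Accelerated Wirtinger gradient projection for sub-problem (23):
    FISTA-style extrapolation followed by a projected gradient step with a
    fixed step size (a conservative Lipschitz bound is used)."""
    L = 0.5 + 4.0 * mu                     # conservative bound on the Lipschitz constant of grad J
    x, x_prev, t = x0, x0, 1.0
    for _ in range(steps):
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        v = x + ((t - 1.0) / t_next) * (x - x_prev)      # momentum extrapolation
        x_prev = x
        x = project_C(v - wirtinger_grad(v, y, u, mu) / L, support)
        t = t_next
    return x
```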

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Chen, X., Li, F. Single-Shot Phase Retrieval Via Gradient-Sparse Non-Convex Regularization Integrating Physical Constraints. J Sci Comput 102, 63 (2025). https://doi.org/10.1007/s10915-025-02788-2

