Single-Shot Phase Retrieval Via Gradient-Sparse Non-Convex Regularization Integrating Physical Constraints


Abstract

Measurements of light typically capture amplitude information, often overlooking crucial phase details. This oversight underscores the importance of phase retrieval (PR), essential in biomedical imaging, X-ray crystallography, and microscopy, for reconstructing complex signals from intensity-only measurements. Traditional methods frequently fall short, especially in noisy conditions or when restricted to single-shot measurements. To address these challenges, we introduce a novel model that combines non-convex regularization with physical constraints. The model adopts the smoothly clipped absolute deviation (SCAD) function as a sparsity regularization term for gradients, incorporating fundamental constraints on support and absorption to establish an inverse model. Using the alternating direction method of multipliers (ADMM), we break down the problem into manageable sub-problems, implementing SCAD shrinkage in the complex domain and applying Wirtinger gradient projection methods. A thorough convergence analysis validates the stability and robustness of the algorithm. Extensive simulations confirm significant improvements in reconstruction quality compared to existing methods, with evaluations demonstrating superior performance across various noise levels and parameter settings.




Data Availability

All data generated or analysed during this study are included in this published article. The MATLAB code will be made openly available after publication.

Notes

  1. The \((\lambda , \eta )\) pairs are sampled from equidistant nodes in \(\{0.05, 0.075, \ldots , 0.175, 0.2\} \times \{0.05, 0.075, \ldots , 0.225, 0.25\}\).

References

  1. Abbey, B., Nugent, K.A., Williams, G.J., Clark, J.N., Peele, A.G., Pfeifer, M.A., De Jonge, M., McNulty, I.: Keyhole coherent diffractive imaging. Nat. Phys. 4(5), 394–398 (2008). https://doi.org/10.1038/nphys896


  2. Bach, F., Jenatton, R., Mairal, J., Obozinski, G.: Optimization with sparsity-inducing penalties. Found. Trends Mach. Learn. 4(1), 1–106 (2012). https://doi.org/10.1561/2200000015


  3. Bai, C., Zhou, M., Min, J., Dang, S., Yu, X., Zhang, P., Peng, T., Yao, B.: Robust contrast-transfer-function phase retrieval via flexible deep learning networks. Opt. Lett. 44(21), 5141–5144 (2019). https://doi.org/10.1364/OL.44.005141


  4. Barton, J.J.: Removing multiple scattering and twin images from holographic images. Phys. Rev. Lett. 67(22), 3106–3109 (1991). https://doi.org/10.1103/PhysRevLett.67.3106


  5. Bauschke, H.H., Combettes, P.L., Luke, D.R.: Phase retrieval, error reduction algorithm, and Fienup variants: a view from convex optimization. J. Opt. Soc. Am. A 19(7), 1334–45 (2002). https://doi.org/10.1364/josaa.19.001334


  6. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009). https://doi.org/10.1137/080716542


  7. Bioucas-Dias, J.M., Valadao, G.: Phase unwrapping via graph cuts. IEEE Trans. Image Process. 16(3), 698–709 (2007). https://doi.org/10.1109/tip.2006.888351


  8. Bouhamidi, A., Jbilou, K.: Sylvester Tikhonov-regularization methods in image restoration. J. Comput. Appl. Math. 206(1), 86–98 (2007). https://doi.org/10.1016/j.cam.2006.05.028


  9. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011). https://doi.org/10.1561/2200000016


  10. Brandwood, D.: A complex gradient operator and its application in adaptive array theory. In: IEE Proceedings H (Microwaves, Optics and Antennas), vol. 130, pp. 11–16. IET Digital Library (1983)

  11. Candes, E.J., Li, X., Soltanolkotabi, M.: Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans. Inf. Theory 61(4), 1985–2007 (2015). https://doi.org/10.1109/TIT.2015.2399924


  12. Chang, H., Lou, Y., Ng, M.K., Zeng, T.: Phase retrieval from incomplete magnitude information via total variation regularization. SIAM J. Sci. Comput. 38(6), A3672–A3695 (2016). https://doi.org/10.1137/15M1029357


  13. Chen, Y., Candes, E.: Solving random quadratic systems of equations is nearly as easy as solving linear systems. Advances in Neural Information Processing Systems 28 (2015)

  14. Chen, Y., Candès, E.J.: Solving random quadratic systems of equations is nearly as easy as solving linear systems. Commun. Pure Appl. Math. 70(5), 822–883 (2017). https://doi.org/10.1002/cpa.21638


  15. Cho, T.S., Zitnick, C.L., Joshi, N., Kang, S.B., Szeliski, R., Freeman, W.T.: Image restoration by matching gradient distributions. IEEE Trans. Pattern Anal. Mach. Intell. 34(4), 683–694 (2012). https://doi.org/10.1109/TPAMI.2011.166


  16. Delgado, K.K.: The complex gradient operator and the CR-calculus. ECE275A-Lecture Supplement Fall 2006 (2006)

  17. Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96(456), 1348–1360 (2001). https://doi.org/10.1198/016214501753382273


  18. Fienup, J.R.: Reconstruction of an object from the modulus of its Fourier transform. Opt. Lett. 3(1), 27–9 (1978). https://doi.org/10.1364/ol.3.000027


  19. Fienup, J.R.: Phase retrieval algorithms: a comparison. Appl. Opt. 21(15), 2758–69 (1982). https://doi.org/10.1364/AO.21.002758


  20. Fienup, J.R., Crimmins, T.R., Holsztynski, W.: Reconstruction of the support of an object from the support of its autocorrelation. J. Opt. Soc. Am. 72(5), 610–624 (1982). https://doi.org/10.1364/JOSA.72.000610


  21. Fus, F., Yang, Y., Pacureanu, A., Bohic, S., Cloetens, P.: Unsupervised solution for in-line holography phase retrieval using bayesian inference. Opt. Express 26(25), 32847–32865 (2018). https://doi.org/10.1364/OE.26.032847


  22. Gabor, D.: A new microscopic principle. Nature 161(4098), 777–778 (1948). https://doi.org/10.1038/161777a0


  23. Gao, Y., Cao, L.: Iterative projection meets sparsity regularization: towards practical single-shot quantitative phase imaging with in-line holography. Light Adv. Manuf. 4(1), 37–53 (2023). https://doi.org/10.37188/lam.2023.006


  24. Gao, Y., Yang, F., Cao, L.: Pixel super-resolution phase retrieval for lensless on-chip microscopy via accelerated Wirtinger flow. Cells 11(13), 1999 (2022). https://doi.org/10.3390/cells11131999

  25. Gerchberg, R.W.: Phase determination from image and diffraction plane pictures in the electron microscope. Optik 34, 275–284 (1971)


  26. Gerchberg, R.W.: A practical algorithm for the determination of plane from image and diffraction pictures. Optik 35(2), 237–246 (1972)


  27. Ghayem, F., Sadeghi, M., Babaie-Zadeh, M., Chatterjee, S., Skoglund, M., Jutten, C.: Sparse signal recovery using iterative proximal projection. IEEE Trans. Signal Process. 66(4), 879–894 (2018). https://doi.org/10.1109/TSP.2017.2778695


  28. Gong, P., Zhang, C., Lu, Z., Huang, J., Ye, J.: A general iterative shrinkage and thresholding algorithm for non-convex regularized optimization problems. In: Proceedings of the 30th International Conference on Machine Learning, vol. 28(2), pp. 37–45. PMLR (2013)

  29. Goodman, J.W.: Introduction to Fourier optics. Roberts and Company Publishers, Greenwood (2005)


  30. Gray, R.M., Davisson, L.D.: An Introduction to Statistical Signal Processing. Cambridge University Press, Cambridge (2004)


  31. Hjørungnes, A.: Complex-Valued Matrix Derivatives: With Applications in Signal Processing and Communications. Cambridge University Press, Cambridge (2011)


  32. Jaganathan, K., Oymak, S., Hassibi, B.: Sparse phase retrieval: uniqueness guarantees and recovery algorithms. IEEE Trans. Signal Process. 65(9), 2402–2410 (2017). https://doi.org/10.1109/TSP.2017.2656844


  33. Kumar, S.: Phase retrieval with physics informed zero-shot network. Opt. Lett. 46(23), 5942–5945 (2021). https://doi.org/10.1364/OL.433625


  34. Lanza, A., Morigi, S., Sgallari, F.: Constrained TV\(_p\)-\(\ell _2\) model for image restoration. J. Sci. Comput. 68(1), 64–91 (2016). https://doi.org/10.1007/s10915-015-0129-x


  35. Latychevskaia, T., Fink, H.W.: Solution to the twin image problem in holography. Phys. Rev. Lett. 98(23), 233901 (2007). https://doi.org/10.1103/PhysRevLett.98.233901


  36. Li, R., Sudjianto, A.: Analysis of computer experiments using penalized likelihood in Gaussian Kriging models. Technometrics 47(2), 111–120 (2005). https://doi.org/10.1198/004017004000000671


  37. Li, T., Li, F., Qi, H.: A Weibull gradient prior for image restoration. J. Comput. Appl. Math. 439, 115594 (2024). https://doi.org/10.1016/j.cam.2023.115594


  38. Liu, G., Scott, P.D.: Phase retrieval and twin-image elimination for in-line Fresnel holograms. J. Opt. Soc. Am. A 4(1), 159–165 (1987). https://doi.org/10.1364/JOSAA.4.000159


  39. Lou, Y., Zeng, T., Osher, S., Xin, J.: A weighted difference of anisotropic and isotropic total variation model for image processing. SIAM J. Imaging Sci. 8(3), 1798–1823 (2015). https://doi.org/10.1137/14098435x


  40. Lu, C., Tang, J., Yan, S., Lin, Z.: Nonconvex nonsmooth low rank minimization via iteratively reweighted nuclear norm. IEEE Trans. Image Process. 25(2), 829–839 (2016). https://doi.org/10.1109/TIP.2015.2511584


  41. Marchesini, S.: Invited article: a unified evaluation of iterative projection algorithms for phase retrieval. Rev. Sci. Instrum. 78(1), 011301 (2007). https://doi.org/10.1063/1.2403783


  42. Marchesini, S., He, H., Chapman, H.N., Hau-Riege, S.P., Noy, A., Howells, M.R., Weierstall, U., Spence, J.C.: X-ray image reconstruction from a diffraction pattern alone. Phys. Rev. B 68(14), 140101 (2003). https://doi.org/10.1103/PhysRevB.68.140101


  43. Mehranian, A., Rad, H.S., Rahmim, A., Ay, M.R., Zaidi, H.: Smoothly clipped absolute deviation (SCAD) regularization for compressed sensing MRI using an augmented lagrangian scheme. Magn. Reson. Imaging 31(8), 1399–1411 (2013). https://doi.org/10.1016/j.mri.2013.05.010


  44. Metzler, C., Schniter, P., Veeraraghavan, A., Baraniuk, R.: prDeep: Robust phase retrieval with a flexible deep network. In: Proceedings of the 35th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 80, pp. 3501–3510. PMLR (2018)

  45. Metzler, C.A., Maleki, A., Baraniuk, R.G.: BM3D-PRGAMP: Compressive phase retrieval based on BM3D denoising. In: 2016 IEEE International Conference on Image Processing (ICIP), pp. 2504–2508. IEEE (2016). https://doi.org/10.1109/ICIP.2016.7532810

  46. Miao, J., Ishikawa, T., Shen, Q., Earnest, T.: Extending X-ray crystallography to allow the imaging of noncrystalline materials, cells, and single protein complexes. Annu. Rev. Phys. Chem. 59, 387–410 (2008). https://doi.org/10.1146/annurev.physchem.59.032607.093642


  47. Millane, R.P.: Phase retrieval in crystallography and optics. J. Opt. Soc. Am. A 7(3), 394–411 (1990). https://doi.org/10.1364/JOSAA.7.000394

  48. Momey, F., Denis, L., Olivier, T., Fournier, C.: From Fienup’s phase retrieval techniques to regularized inversion for in-line holography: tutorial. J. Opt. Soc. Am. A 36(12), D62–D80 (2019). https://doi.org/10.1364/JOSAA.36.000D62


  49. Nesterov, Y.E.: A method of solving a convex programming problem with convergence rate \({O}\left(k^2\right)\). In: Doklady Akademii Nauk, vol. 269, pp. 543–547. Russian Academy of Sciences (1983)

  50. Ohlsson, H., Yang, A.Y., Dong, R., Sastry, S.S.: Compressive phase retrieval from squared output measurements via semidefinite programming. IFAC Proc. Vol. 45(16), 89–94 (2012). https://doi.org/10.3182/20120711-3-BE-2027.00415


  51. Pacheco, C., McKay, G.N., Oommen, A., Durr, N.J., Vidal, R., Haeffele, B.D.: Adaptive sparse reconstruction for lensless digital holography via PSF estimation and phase retrieval. Opt. Express 30(19), 33433–33448 (2022). https://doi.org/10.1364/OE.458360


  52. Rivenson, Y., Wu, Y., Wang, H., Zhang, Y., Feizi, A., Ozcan, A.: Sparsity-based multi-height phase recovery in holographic microscopy. Sci. Rep. 6(1), 37862 (2016). https://doi.org/10.1038/srep37862


  53. Ru, Y., Li, F., Fang, F., Zhang, G.: Patch-based weighted SCAD prior for compressive sensing. Inf. Sci. 592, 137–155 (2022). https://doi.org/10.1016/j.ins.2022.01.034


  54. Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D 60(1), 259–268 (1992). https://doi.org/10.1016/0167-2789(92)90242-F


  55. Shechtman, Y., Beck, A., Eldar, Y.C.: GESPAR: efficient phase retrieval of sparse signals. IEEE Trans. Signal Process. 62(4), 928–938 (2014). https://doi.org/10.1109/TSP.2013.2297687


  56. Shi, B., Lian, Q., Huang, X., An, N.: Constrained phase retrieval: when alternating projection meets regularization. J. Opt. Soc. Am. B 35(6), 1271–1281 (2018). https://doi.org/10.1364/JOSAB.35.001271


  57. Sidky, E.Y., Chartrand, R., Boone, J.M., Pan, X.: Constrained \({\rm T}p{\rm V}\) minimization for enhanced exploitation of gradient sparsity: application to CT image reconstruction. IEEE J. Transl. Eng. He. 2, 1–18 (2014). https://doi.org/10.1109/JTEHM.2014.2300862


  58. Stefik, M.: Inferring DNA structures from segmentation data. Artif. Intell. 11(1), 85–114 (1978). https://doi.org/10.1016/0004-3702(78)90013-9


  59. Tian, Y., Fienup, J.R.: Phase retrieval with only a nonnegativity constraint. Opt. Lett. 48(1), 135–138 (2023). https://doi.org/10.1364/OL.478581


  60. Uelwer, T., Hoffmann, T., Harmeling, S.: Non-iterative phase retrieval with cascaded neural networks. In: Artificial Neural Networks and Machine Learning - ICANN 2021, pp. 295–306. Springer International Publishing (2021). https://doi.org/10.1007/978-3-030-86340-1_24

  61. Wang, Y., Yin, W., Zeng, J.: Global convergence of ADMM in nonconvex nonsmooth optimization. J. Sci. Comput. 78(1), 29–63 (2019). https://doi.org/10.1007/s10915-018-0757-z


  62. Wang, Y., Zeng, J., Peng, Z., Chang, X., Xu, Z.: Linear convergence of adaptively iterative thresholding algorithms for compressed sensing. IEEE Trans. Signal Process. 63(11), 2957–2971 (2015). https://doi.org/10.1109/TSP.2015.2412915


  63. Wang, Z., Simoncelli, E.P.: Translation insensitive image similarity in complex wavelet domain. In: Proceedings. (ICASSP ’05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005., vol. 2, pp. ii/573–ii/576. IEEE (2005). https://doi.org/10.1109/ICASSP.2005.1415469

  64. Wirtinger, W.: Zur formalen theorie der funktionen von mehr komplexen veränderlichen. Math. Ann. 97(1), 357–375 (1927). https://doi.org/10.1007/BF01447872


  65. Wu, J., Yang, F., Cao, L.: Resolution enhancement of long-range imaging with sparse apertures. Opt. Lasers Eng. 155, 107068 (2022). https://doi.org/10.1016/j.optlaseng.2022.107068


  66. Ye, Q., Wang, L.W., Lun, D.P.K.: Sisprnet: end-to-end learning for single-shot phase retrieval. Opt. Express 30(18), 31937–31958 (2022). https://doi.org/10.1364/OE.464086


  67. You, J., Jiao, Y., Lu, X., Zeng, T.: A nonconvex model with minimax concave penalty for image restoration. J. Sci. Comput. 78(2), 1063–1086 (2019). https://doi.org/10.1007/s10915-018-0801-z


  68. You, J., Liu, H., Pan, Q., Sun, A., Zhang, Z., Shi, X.: Fast response fluorescent probe with a large stokes shift for thiophenol detection in water samples and cell imaging. J. Anal. Test. 7(1), 69–78 (2023). https://doi.org/10.1007/s41664-022-00247-7


  69. Zhang, C.H.: Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 38(2), 894–942 (2010). https://doi.org/10.1214/09-AOS729


  70. Zhang, W., Cao, L., Brady, D.J., Zhang, H., Cang, J., Zhang, H., Jin, G.: Twin-image-free holography: a compressive sensing approach. Phys. Rev. Lett. 121(9), 093902 (2018). https://doi.org/10.1103/PhysRevLett.121.093902


  71. Zhang, Y., Andreas Noack, M., Vagovic, P., Fezzaa, K., Garcia-Moreno, F., Ritschel, T., Villanueva-Perez, P.: Phasegan: a deep-learning phase-retrieval approach for unpaired datasets. Opt. Express 29(13), 19593–19604 (2021). https://doi.org/10.1364/OE.423222


Download references

Funding

This research was supported by the Natural Science Foundation of Shanghai (no. 22ZR1419500), the Science and Technology Commission of Shanghai Municipality (no. 22DZ2229014), and the Fundamental Research Funds for the Central Universities.

Author information


Corresponding author

Correspondence to Fang Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Proof for SCAD Shrinkage in Sect. 4.1

Proof of Theorem 1

Since the SCAD function (9) depends only on |x|, let \(r = |x| \geqslant 0\); then

$$\begin{aligned} \varphi _{\lambda ,\gamma }(x) = \varphi _{\lambda ,\gamma }(r). \end{aligned}$$

By the reverse triangle inequality, we have

$$\begin{aligned}&\frac{1}{2}|x-v|^2+\eta \varphi _{\lambda ,\gamma }(x) \end{aligned}$$
(27)
$$\begin{aligned} = \,&\frac{1}{2}|x-v|^2+\eta \varphi _{\lambda ,\gamma }(r) \end{aligned}$$
(28)
$$\begin{aligned} \geqslant \,&\frac{1}{2}|r-|v||^2+\eta \varphi _{\lambda ,\gamma }(r) , \end{aligned}$$
(29)

where equality holds if and only if \(x = kv\) (\(k>0\)). By Lemma 1, the minimum of (29) is achieved at \(\widehat{r} = \operatorname {Shrink}_{\text {SCAD}} \left( |v|, \gamma , \lambda , \eta \right) \), thus the minimum of (27) is achieved at \(\widehat{x} = \frac{v}{|v|} \widehat{r}\), i.e.,

$$\begin{aligned} \widehat{x} = \frac{v}{|v|} \operatorname {Shrink}_{\text {SCAD}} \left( |v|, \gamma , \lambda , \eta \right) , \end{aligned}$$

which completes the proof. \(\square \)

Proof of Theorem 2

The objective function decomposes componentwise as

$$\begin{aligned} \frac{1}{2}\Vert \varvec{x}-\varvec{v}\Vert _2^2+\eta \Phi _{\lambda ,\gamma }(\varvec{x}) = \sum _{j=1}^{n} \left( \frac{1}{2} \left| \varvec{x}_{(j)}-\varvec{v}_{(j)}\right| ^2+\eta \varphi _{\lambda ,\gamma }\left( \varvec{x}_{(j)}\right) \right) , \end{aligned}$$

Therefore, optimization problem (20) reduces to n independent problems of the form (18):

$$\begin{aligned}\mathop {\text {arg min}}\limits _ {\varvec{x}_{(j)}\,\,\in \,\,\mathbb {C}}\,\,\,\,\,\,{\frac{1}{2} \left| \varvec{x}_{(j)}-\varvec{v}_{(j)}\right| ^2+\eta \varphi _{\lambda ,\gamma }\left( \varvec{x}_{(j)}\right) } , \end{aligned}$$

According to Theorem 1, the solution for each sub-problem is

$$\begin{aligned} \widehat{\varvec{x}}_{(j)}=\operatorname {Shrink}_{\text {SCAD}} \left( \varvec{v}_{(j)}, \gamma , \lambda , \eta \right) . \end{aligned}$$

\(\square \)
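To make the two theorems concrete, the following is a minimal numerical sketch of the complex SCAD shrinkage (it is not the paper's released implementation). It uses the standard SCAD penalty of Fan and Li [17], whereas the paper's definition (9) may be normalized differently, and it solves the scalar magnitude problem by a brute-force grid search rather than the closed-form rule of Lemma 1; the phase-preserving reduction \(\widehat{\varvec{x}}_{(j)} = \frac{\varvec{v}_{(j)}}{|\varvec{v}_{(j)}|} \widehat{r}_{(j)}\) is exactly the content of Theorems 1 and 2.

```python
import numpy as np

def scad_penalty(r, lam, gamma):
    """Standard SCAD penalty phi_{lam,gamma}(r) for r >= 0 (Fan & Li [17])."""
    r = np.asarray(r, dtype=float)
    mid = (2 * gamma * lam * r - r ** 2 - lam ** 2) / (2 * (gamma - 1))
    tail = lam ** 2 * (gamma + 1) / 2
    return np.where(r <= lam, lam * r, np.where(r <= gamma * lam, mid, tail))

def scad_shrink_complex(v, gamma, lam, eta, grid=2000):
    """Phase-preserving SCAD shrinkage of a complex array (Theorems 1-2).
    Each entry v_j is mapped to (v_j/|v_j|) * argmin_r 0.5*(r-|v_j|)^2 + eta*phi(r);
    the scalar magnitude problem is solved here by a grid search for illustration
    (the paper's Lemma 1 provides the closed-form rule)."""
    v = np.asarray(v, dtype=complex)
    mag = np.abs(v)
    phase = np.where(mag > 0, v / np.maximum(mag, 1e-300), 1.0 + 0.0j)
    r = np.linspace(0.0, mag.max() + 1e-12, grid)          # candidate magnitudes
    obj = 0.5 * (r[None, :] - mag.ravel()[:, None]) ** 2 \
        + eta * scad_penalty(r, lam, gamma)[None, :]
    r_hat = r[np.argmin(obj, axis=1)].reshape(mag.shape)
    return phase * r_hat
```

Small entries are thresholded towards zero, while entries with magnitude above \(\gamma \lambda \) are left essentially unchanged, reflecting the near-unbiasedness of the SCAD penalty.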

Calculation of Wirtinger Derivative in Sect. 4.2

Define the function components of the objective function in (23) as follows,

$$\begin{aligned} \begin{gathered} \mathscr {F}(\varvec{x}) = \frac{1}{2}\left\| |\varvec{A} \varvec{x}|-\varvec{y}\right\| _2^2, \\ \mathscr {I}(\varvec{x}) = \frac{\mu }{2}\left\| \nabla \varvec{x} + \varvec{u} \right\| _2^2, \end{gathered} \end{aligned}$$

so that the objective function is \(\mathscr {J}(\varvec{x}) = \mathscr {F}(\varvec{x}) + \mathscr {I}(\varvec{x})\), and its Wirtinger gradient is

$$\begin{aligned} \nabla _{\varvec{x}} \mathscr {J}(\varvec{x}) = \left( \frac{\partial \mathscr {J}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}} = \left( \frac{\partial \mathscr {F}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}} + \left( \frac{\partial \mathscr {I}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}}. \end{aligned}$$
(30)

Define \(\varvec{\theta }: \mathbb {C}^n \rightarrow \mathbb {C}^n\) as \(\varvec{\theta } (\varvec{x}) = \varvec{A} \varvec{x}\), \(\varvec{\varepsilon }: \mathbb {C}^n \rightarrow \mathbb {R}^n\) as \(\varvec{\varepsilon } = |\varvec{\theta }|-\varvec{y}\), then \(\mathscr {F} = \frac{1}{2}\left\| \varvec{\varepsilon } \right\| _2^2\). By the chain rule of real functions and the chain rule for Wirtinger derivatives [31],

$$\begin{aligned} \frac{\partial \mathscr {F}}{\partial \varvec{x}^{\textsf{H}}} = \,&\frac{\partial \mathscr {F}}{\partial \varvec{\varepsilon }^{\textsf{T}}} \frac{\partial \varvec{\varepsilon }}{\partial \varvec{x}^{\textsf{H}}} = \varvec{\varepsilon }^{\textsf{T}} \frac{\partial \varvec{\varepsilon }}{\partial \varvec{x}^{\textsf{H}}} \\ = \,&\varvec{\varepsilon }^{\textsf{T}} \left( \frac{\partial \varvec{\varepsilon }}{\partial \varvec{\theta }^{\textsf{T}}} \frac{\partial \varvec{\theta }}{\partial \varvec{x}^{\textsf{H}}} + \frac{\partial \varvec{\varepsilon }}{\partial \varvec{\theta }^{\textsf{H}}} \frac{\partial \varvec{\theta }^*}{\partial \varvec{x}^{\textsf{H}}}\right) , \end{aligned}$$

by direct calculation we know

$$\begin{aligned} \begin{gathered} \frac{\partial \varvec{\theta }}{\partial \varvec{x}^{\textsf {H}}} = \varvec{0}, \\ \frac{\partial \varvec{\theta }^*}{\partial \varvec{x}^{\textsf {H}}} = \left( \frac{\partial \varvec{\theta }}{\partial \varvec{x}^{\textsf {T}}}\right) ^* = \varvec{A}^*, \\ \frac{\partial \varvec{\varepsilon }}{\partial \varvec{\theta }^{\textsf {H}}} = \frac{1}{2} \operatorname {diag}\left( \frac{\varvec{\theta }}{|\varvec{\theta }|}\right) , \end{gathered} \end{aligned}$$

where \(\operatorname {diag} (\cdot )\) denotes the diagonal matrix formed from the given vector. Thus,

$$\begin{aligned} \begin{aligned} \frac{\partial \mathscr {F}}{\partial \varvec{x}^{\textsf{H}}}&= \frac{1}{2} \varvec{\varepsilon }^{\textsf{T}} \operatorname {diag}\left( \frac{\varvec{\theta }}{|\varvec{\theta }|}\right) \varvec{A}^* \\ &= \frac{1}{2} \left( |\varvec{A} \varvec{x}| - \varvec{y}\right) ^{\textsf{T}} \operatorname {diag}\left( \frac{\varvec{A} \varvec{x}}{|\varvec{A} \varvec{x}|}\right) \varvec{A}^*. \end{aligned} \end{aligned}$$
(31)

Define \(\varvec{\rho }: \mathbb {C}^n \rightarrow \mathbb {C}^n\) as \(\varvec{\rho } (\varvec{x}) = \nabla \varvec{x} + \varvec{u}\), then \(\mathscr {I} = \frac{\mu }{2}\left\| \varvec{\rho } \right\| _2^2\). By the chain rule for Wirtinger derivatives [31], direct calculation gives

$$\begin{aligned} \begin{gathered} \frac{\partial \varvec{\rho }}{\partial \varvec{x}^{\textsf {H}}} = \varvec{0}, \\ \frac{\partial \varvec{\rho }^*}{\partial \varvec{x}^{\textsf {H}}} = \left( \frac{\partial \varvec{\rho }}{\partial \varvec{x}^{\textsf {T}}}\right) ^* = \nabla ^*, \\ \frac{\partial \mathscr {I}}{\partial \varvec{\rho }^{\textsf {H}}} = \frac{\mu }{2} \varvec{\rho }^{\textsf {T}}. \end{gathered} \end{aligned}$$

Thus,

$$\begin{aligned} \frac{\partial \mathscr {I}}{\partial \varvec{x}^{\textsf{H}}} = \frac{\mu }{2} \varvec{\rho }^{\textsf{T}} \nabla ^* = \frac{\mu }{2} \left( \nabla \varvec{x} + \varvec{u}\right) ^{\textsf{T}} \nabla ^*. \end{aligned}$$
(32)

Finally, combining (30), (31), and (32) yields

$$\begin{aligned} \nabla _{\varvec{x}} \mathscr {J}(\varvec{x}) = \,&\left( \frac{\partial \mathscr {F}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}} + \left( \frac{\partial \mathscr {I}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}} \\ = \,&\frac{1}{2} \varvec{A}^{\textsf{H}} \operatorname {diag}\left( \frac{\varvec{A} \varvec{x}}{|\varvec{A} \varvec{x}|}\right) \left( |\varvec{A} \varvec{x}| - \varvec{y}\right) + \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x} + \varvec{u}\right) . \end{aligned}$$
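For reference, the closed form above translates directly into a matrix-free routine. The sketch below is illustrative only: the unitary operator \(\varvec{A}\) is assumed here to be a normalized 2-D FFT standing in for the paper's propagation operator, and \(\nabla \) is implemented as forward differences with the boundary rows that appear in the proof of Lemma 2.

```python
import numpy as np

def A(x):
    """Assumed unitary measurement operator (normalized 2-D FFT as a stand-in)."""
    return np.fft.fft2(x, norm="ortho")

def A_H(z):
    """Adjoint (inverse) of the assumed unitary operator."""
    return np.fft.ifft2(z, norm="ortho")

def grad_op(x):
    """Forward differences; the last row/column difference is taken against zero,
    matching the boundary terms used in the proof of Lemma 2."""
    gv = np.diff(x, axis=0, append=0.0)
    gh = np.diff(x, axis=1, append=0.0)
    return np.stack([gv, gh])

def grad_adj(g):
    """Adjoint of grad_op (the operator is real, so the adjoint is its transpose)."""
    gv, gh = g
    return -np.diff(gv, axis=0, prepend=0.0) - np.diff(gh, axis=1, prepend=0.0)

def wirtinger_grad(x, y, u, mu):
    """Wirtinger gradient of J(x) = 0.5*|| |Ax| - y ||^2 + (mu/2)*||grad(x) + u||^2,
    following the closed form derived above."""
    Ax = A(x)
    mag = np.abs(Ax)
    phase = Ax / np.maximum(mag, 1e-12)            # A x / |A x|, guarded at zero
    gF = 0.5 * A_H(phase * (mag - y))              # (1/2) A^H diag(Ax/|Ax|) (|Ax| - y)
    gI = 0.5 * mu * grad_adj(grad_op(x) + u)       # (mu/2) grad^H (grad x + u)
    return gF + gI
```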

Proof for Convergence Analysis in Sect. 4.3

1.1 Convergence of the ADMM Algorithm

Let us denote

$$\begin{aligned} \mathcal {G} (\varvec{z}, \varvec{x}) = \mathscr {H} (\varvec{z}) + \mathscr {F} (\varvec{x}) + \mathscr {G} (\varvec{x}) \end{aligned}$$

as the objective function of the problem (12).
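For orientation, a heavily hedged sketch of the ADMM iteration analyzed below is given next, reusing the helper functions from the two preceding sketches (scad_shrink_complex, A_H, grad_op, wirtinger_grad). The exact sub-problem forms, the weight with which the SCAD penalty enters \(\mathscr {H}\), and the constraint set \(C\) (modeled here as a support mask combined with a unit-modulus absorption cap) are assumptions made for illustration, not the paper's precise updates.

```python
import numpy as np

def project_C(x, support):
    """Assumed projection onto C: zero outside the support mask and
    modulus clipped to at most 1 (an absorption-type constraint)."""
    x = x * support
    mag = np.abs(x)
    return x * np.where(mag > 1.0, 1.0 / np.maximum(mag, 1e-12), 1.0)

def admm_pr(y, support, lam=0.1, gamma=3.7, mu=1.0, outer=100, inner=20):
    """Sketch of the ADMM loop in scaled dual form: a z-update by the complex SCAD
    shrinkage of Theorem 2 (with assumed weight eta = 1/mu), an x-update by a few
    projected Wirtinger gradient steps on sub-problem (23), and the dual update
    w <- w + grad(x) - z."""
    x = A_H(y.astype(complex))                 # simple back-propagation initializer
    z = grad_op(x)
    w = np.zeros_like(z)
    step = 1.0 / (0.5 + 4.0 * mu)              # conservative step, below 1/L of the smooth term
    for _ in range(outer):
        z = scad_shrink_complex(grad_op(x) + w, gamma, lam, 1.0 / mu)
        u = w - z                              # so that I(x) = (mu/2)*||grad(x) + u||^2
        for _ in range(inner):                 # plain (unaccelerated) Wirtinger gradient projection
            x = project_C(x - step * wirtinger_grad(x, y, u, mu), support)
        w = w + grad_op(x) - z
    return x
```

With this scaled form, \(\varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} = \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)}\), which is the relation used repeatedly in Lemmas 4-6 below.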

Lemma 2

The objective function \(\mathcal {G} (\varvec{z}, \varvec{x})\) is coercive on the feasible domain \(\{(\varvec{z}, \varvec{x}) \mid \nabla \varvec{x} - \varvec{z} = \varvec{0}\}\), i.e., if \(\Vert (\varvec{z}, \varvec{x})\Vert _2 \rightarrow +\infty \), then \(\mathcal {G} (\varvec{z}, \varvec{x}) \rightarrow +\infty \).

Proof

Given that \(\Vert (\varvec{z}, \varvec{x})\Vert _2 = \left( \Vert \varvec{z}\Vert _2^2 + \Vert \varvec{x}\Vert _2^2 \right) ^{\frac{1}{2}}\), as \(\Vert (\varvec{z}, \varvec{x})\Vert _2 \rightarrow +\infty \), either \(\Vert \varvec{z}\Vert _2 \rightarrow +\infty \) or \(\Vert \varvec{x}\Vert _2 \rightarrow +\infty \) must hold.

If \(\Vert \varvec{z}\Vert _2 = \Vert \nabla \varvec{x}\Vert _2 \rightarrow +\infty \), it follows that

$$\begin{aligned} \Vert \nabla \varvec{x}\Vert _2^2 = \,&\sum _{j_1=1}^{n_{1}-1} \sum _{j_2=1}^{n_2}\left| \varvec{X}_{j_1+1, j_2}-\varvec{X}_{j_1, j_2}\right| ^2 +\sum _{j_1=1}^{n_{1}} \sum _{j_2=1}^{n_2-1}\left| \varvec{X}_{j_1, j_2+1}-\varvec{X}_{j_1, j_2}\right| ^2 \\ &+ \sum _{j_2=1}^{n_2} \left| \varvec{X}_{(n_1, j_2)}\right| ^2 + \sum _{j_1=1}^{n_{1}} \left| \varvec{X}_{(j_1, n_2)}\right| ^2 \\ \leqslant \,&2 \sum _{j_1=1}^{n_{1}-1} \sum _{j_2=1}^{n_2}\left( \left| \varvec{X}_{j_1+1, j_2}\right| ^2+\left| \varvec{X}_{j_1, j_2}\right| ^2\right) +2 \sum _{j_1=1}^{n_{1}} \sum _{j_2=1}^{n_2-1}\left( \left| \varvec{X}_{j_1, j_2+1}\right| ^2+\left| \varvec{X}_{j_1, j_2}\right| ^2\right) \\ &+ 2\sum _{j_2=1}^{n_2}\left( \left| \varvec{X}_{(1, j_2)}\right| ^2 + \left| \varvec{X}_{(n_1, j_2)}\right| ^2\right) + 2\sum _{j_1=1}^{n_{1}}\left( \left| \varvec{X}_{(j_1, 1)}\right| ^2+\left| \varvec{X}_{(j_1, n_2)}\right| ^2\right) \\ \leqslant \,&8 \sum _{j_1=1}^{n_{1}} \sum _{j_2=1}^{n_2}\left| \varvec{X}_{j_1, j_2}\right| ^2 \\ = \,&8\Vert \varvec{x}\Vert _2^2, \end{aligned}$$

thus \(\Vert \varvec{x}\Vert _2 \rightarrow +\infty \), where \(\varvec{X}\) denotes the matrix form of the vectorized image \(\varvec{x}\).

If \(\Vert \varvec{x}\Vert _2 \rightarrow +\infty \), then since \(\varvec{A}\) is a unitary linear map, \(\Vert \varvec{A} \varvec{x}\Vert _2 = \Vert \varvec{x}\Vert _2 \rightarrow +\infty \), and consequently \(\mathscr {F} (\varvec{x}) \geqslant \frac{1}{2}\left( \Vert \varvec{A} \varvec{x}\Vert _2 - \Vert \varvec{y}\Vert _2\right) ^2 \rightarrow +\infty \). Given that \(\mathscr {G} (\varvec{x}), \mathscr {H} (\varvec{z}) \geqslant 0\), it follows that \(\mathcal {G} (\varvec{z}, \varvec{x}) \rightarrow +\infty \). \(\square \)

From the proof of Lemma 2, we obtain that \(\Vert \nabla \varvec{x}\Vert _2^2 \leqslant 8 \Vert \varvec{x}\Vert _2^2\), implying that \(\rho \left( \nabla ^{\textsf{H}} \nabla \right) = \Vert \nabla \Vert _2^2 \leqslant 8\). This leads to the following corollary:

Corollary 1

If \(\nabla \) represents the finite difference operator, then

$$\begin{aligned} \rho \left( \nabla ^{\textsf{H}} \nabla \right) \leqslant 8 \end{aligned}$$

holds.
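The bound can be checked numerically on a small grid (a sanity check, not part of the proof); the sketch below builds \(\nabla \) explicitly via Kronecker products of the 1-D forward-difference matrix with the boundary row used above.

```python
import numpy as np

def diff_matrix_1d(n):
    """1-D forward-difference matrix with the boundary row used in Lemma 2:
    (Dx)_i = x_{i+1} - x_i for i < n and (Dx)_n = -x_n."""
    return -np.eye(n) + np.diag(np.ones(n - 1), k=1)

def spectral_radius_grad(n1, n2):
    """rho(grad^H grad) for an n1 x n2 image; the Kronecker ordering only
    relabels pixels and does not affect the spectrum."""
    D1, D2 = diff_matrix_1d(n1), diff_matrix_1d(n2)
    G = np.vstack([np.kron(np.eye(n2), D1), np.kron(D2, np.eye(n1))])
    return np.linalg.eigvalsh(G.T @ G).max()

print(spectral_radius_grad(16, 16))   # slightly below 8, approaching 8 as the grid grows
```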

Recall from (31) that

$$\begin{aligned} \nabla _{\varvec{x}} \mathscr {F} = \left( \frac{\partial \mathscr {F}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}} = \frac{1}{2} \varvec{A}^\textsf{H} \operatorname {diag}\left( \frac{\varvec{A} \varvec{x}}{|\varvec{A} \varvec{x}|}\right) (|\varvec{A} \varvec{x}|-\varvec{y}). \end{aligned}$$

For the SCAD function (10), a direct calculation shows that the j-th component of \(\nabla _{\varvec{z}} \mathscr {H} (\varvec{z})\) is

$$\begin{aligned} \nabla _{\varvec{z}} \mathscr {H} (\varvec{z})_{(j)} = \left\{ \begin{aligned}&\frac{\lambda \varvec{z}_{(j)}}{2 \left| \varvec{z}_{(j)}\right| }, & 0 \leqslant \left| \varvec{z}_{(j)}\right| \leqslant \lambda , \\ &\frac{-\varvec{z}_{(j)}+ \gamma \lambda \frac{\varvec{z}_{(j)}}{\left| \varvec{z}_{(j)}\right| }}{2(\gamma -1)}, & \lambda <\left| \varvec{z}_{(j)}\right| \leqslant \gamma \lambda , \\&0, & \left| \varvec{z}_{(j)}\right| >\gamma \lambda . \end{aligned} \right. \end{aligned}$$
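A direct NumPy transcription of this piecewise expression is given below (a sketch; the handling of the zero-magnitude case is an implementation choice not specified by the formula).

```python
import numpy as np

def scad_wirtinger_grad(z, lam, gamma):
    """Componentwise Wirtinger gradient of the SCAD term H(z), as in the
    piecewise formula above; entries with |z_j| = 0 are mapped to 0."""
    z = np.asarray(z, dtype=complex)
    mag = np.abs(z)
    unit = np.where(mag > 0, z / np.maximum(mag, 1e-300), 0.0 + 0.0j)
    return np.where(
        mag <= lam,
        lam * unit / 2.0,
        np.where(
            mag <= gamma * lam,
            (-z + gamma * lam * unit) / (2.0 * (gamma - 1.0)),
            0.0 + 0.0j,
        ),
    )
```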

As noted in the supplementary information of [23], \(\mathscr {F}\) is Lipschitz differentiable, meaning that its gradient is Lipschitz continuous.

Lemma 3

([23]) The function \(\mathscr {F}\) is Lipschitz differentiable with a Lipschitz constant \(L_{\mathscr {F}} = \frac{1}{2} \rho \left( \varvec{A}^{\textsf{H}} \varvec{A}\right) \).

Lemma 4

There exist positive constants M and \(C_1\), such that

$$\begin{aligned} \left\| \varvec{x}^{(k+1)} - \varvec{x}^{(k)} \right\| _2 \leqslant M \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2 \end{aligned}$$

and \(\nabla ^\textsf{H} \varvec{\omega }^{(k)} = - \frac{2}{\mu } \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k)}\right) \), as well as \(\varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} \in \operatorname {Im} (\nabla )\), moreover

$$\begin{aligned} \left\| \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} \right\| _2 \leqslant C_1 \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

Proof

Since the finite difference operator \(\nabla \) has full column rank, it admits a left inverse \(\nabla ^{-1}\), which is a linear operator between finite-dimensional spaces and hence bounded. Therefore, there exists a constant \(M > 0\) such that

$$\begin{aligned} \left\| \nabla ^{-1} \varvec{u_1} - \nabla ^{-1} \varvec{u_2} \right\| _2 \leqslant M \left\| \varvec{u_1} - \varvec{u_2} \right\| _2, \quad \forall \varvec{u_1}, \varvec{u_2}, \end{aligned}$$

implying

$$\begin{aligned} \left\| \varvec{x}^{(k+1)} - \varvec{x}^{(k)} \right\| _2 \leqslant M \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

Let \(\lambda _{++}\) denote the smallest strictly positive eigenvalue of \(\nabla ^{\textsf{H}} \nabla \). Given that \(\varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} = \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)} \in \operatorname {Im} (\nabla )\) and considering the optimality condition for the update of \(\varvec{x}\) in the algorithm,

$$\begin{aligned} \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) + \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)}+ \varvec{\omega }^{(k)}\right) = \varvec{0}, \end{aligned}$$

where the gradient of the penalty term of the augmented Lagrangian function is similar to the gradient of \(\mathscr {I}\) discussed in Appendix B. From the update of \(\varvec{\omega }\), we have

$$\begin{aligned} \frac{\mu }{2} \nabla ^\textsf{H} \varvec{\omega }^{(k+1)} = - \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) . \end{aligned}$$

Thus,

$$\begin{aligned}&\left\| \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right\| _2\\&\quad \leqslant \, \frac{1}{\sqrt{\lambda _{++}}} \left\| \nabla ^\textsf{H}\left( \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) \right\| _2\\&\quad = \, \frac{1}{\sqrt{\lambda _{++}}} \cdot \frac{2}{\mu } \left\| \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) - \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k)}\right) \right\| _2 \\&\quad \leqslant \, \frac{1}{\sqrt{\lambda _{++}}} \cdot \frac{2}{\mu } L_{\mathscr {F}} \left\| \varvec{x}^{(k+1)} - \varvec{x}^{(k)} \right\| _2\\&\quad \leqslant \, \frac{1}{\sqrt{\lambda _{++}}} \cdot \frac{2}{\mu } L_{\mathscr {F}} M \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

We only need to take \(C_1 = \frac{2 L_{\mathscr {F}} M}{\mu \sqrt{\lambda _{++}}}\). This completes the proof. \(\square \)
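The two quantities introduced here can also be inspected numerically for a small grid, reusing diff_matrix_1d from the sketch after Corollary 1 (again only a sanity check under the stated boundary convention).

```python
import numpy as np

# Build grad for a 16 x 16 image and check the facts used in Lemma 4.
n = 16
D = diff_matrix_1d(n)
G = np.vstack([np.kron(np.eye(n), D), np.kron(D, np.eye(n))])
svals = np.linalg.svd(G, compute_uv=False)

print(svals.min() > 0)          # full column rank, so a left inverse exists
M_bound = 1.0 / svals.min()     # a valid Lipschitz constant M for the left inverse
lam_pp = svals.min() ** 2       # smallest strictly positive eigenvalue of grad^H grad
print(M_bound, lam_pp)
```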

Lemma 5

If \(\mu > \frac{\frac{M^2}{2} L_{\mathscr {F}} +1}{1-C_1^2}\), then

$$\begin{aligned} \mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) - \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \geqslant \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2^2. \end{aligned}$$

Moreover, \(\mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \) is bounded below, and the iteration sequence \(\left\{ \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)} \right\} \) is bounded.

Proof

From the optimality condition for updating \(\varvec{z}\) in the algorithm update process (13), it directly follows that

$$\begin{aligned} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \leqslant \mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) . \end{aligned}$$

Applying the identity

$$\begin{aligned} \Vert \varvec{x}\Vert _2^2-\Vert \varvec{y}\Vert _2^2=\Vert \varvec{x}-\varvec{y}\Vert _2^2+2 \operatorname {Re}\langle \varvec{y}, \varvec{x}-\varvec{y}\rangle , \quad \forall \varvec{x}, \varvec{y} \in \mathbb {C}^n, \end{aligned}$$

we have

$$\begin{aligned}&\mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) - \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \\&\quad = \, \mathscr {F} \left( \varvec{x}^{(k)}\right) - \mathscr {F} \left( \varvec{x}^{(k+1)}\right) + \mu \operatorname {Re} \left\langle \varvec{\omega }^{(k+1)}, \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\rangle \\&\qquad + \frac{\mu }{2} \left\| \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\| _2^2 - \mu \left\| \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} \right\| _2^2\\&\quad = \, \mathscr {F} \left( \varvec{x}^{(k)}\right) - \mathscr {F} \left( \varvec{x}^{(k+1)}\right) + \mu \operatorname {Re} \left\langle \nabla ^{\textsf{H}} \varvec{\omega }^{(k+1)}, \varvec{x}^{(k)} - \varvec{x}^{(k+1)} \right\rangle \\&\qquad + \frac{\mu }{2} \left\| \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\| _2^2 - \mu \left\| \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} \right\| _2^2 \\&\quad = \, \mathscr {F} \left( \varvec{x}^{(k)}\right) - \mathscr {F} \left( \varvec{x}^{(k+1)}\right) - 2 \operatorname {Re} \left\langle \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) , \varvec{x}^{(k)} - \varvec{x}^{(k+1)} \right\rangle \\&\qquad + \mu \left\| \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\| _2^2 - \mu \left\| \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} \right\| _2^2. \end{aligned}$$

Leveraging the Lipschitz differentiability of \(\mathscr {F}\) and its first-order expansion at \(\varvec{x}^{(k+1)}\) [16], and referring to Lemma 4,

$$\begin{aligned}&\mathscr {F} \left( \varvec{x}^{(k)}\right) - \mathscr {F} \left( \varvec{x}^{(k+1)}\right) - 2 \operatorname {Re} \left\langle \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) , \varvec{x}^{(k)} - \varvec{x}^{(k+1)} \right\rangle \\&\quad \geqslant \, - \frac{L_{\mathscr {F}}}{2} \left\| \varvec{x}^{(k)} - \varvec{x}^{(k+1)} \right\| _2^2\\&\quad \geqslant \, - \frac{L_{\mathscr {F}} M^2}{2} \left\| \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\| _2^2. \end{aligned}$$

Therefore,

$$\begin{aligned}&\mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) - \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \\&\quad \geqslant \, - \frac{L_{\mathscr {F}} M^2}{2} \left\| \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\| _2^2 + \mu \left\| \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^{(k+1)} \right\| _2^2 - \mu C_1^2 \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2^2 \\&\quad \geqslant \, \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2^2. \end{aligned}$$

Thus,

$$\begin{aligned} \mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) - \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \geqslant \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2^2. \end{aligned}$$

As the image gradient operator \(\nabla \) is column full rank, there exists a unique \(\varvec{x}^\prime = \nabla ^{-1} \varvec{z}^{(k)}\) such that \(\nabla \varvec{x}^\prime - \varvec{z}^{(k)} = \varvec{0}\), thus,

$$\begin{aligned}&\mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \\&\quad = \, \mathcal {G} \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}\right) + \mu \operatorname {Re} \left\langle \varvec{\omega }^{(k)}, \nabla \varvec{x}^{(k)} - \nabla \varvec{x}^\prime \right\rangle + \frac{\mu }{2} \left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2^2\\&\quad = \, \mathcal {G} \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}\right) + \mu \operatorname {Re} \left\langle \nabla ^{\textsf{H}} \varvec{\omega }^{(k)}, \varvec{x}^{(k)} - \varvec{x}^\prime \right\rangle + \frac{\mu }{2} \left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2^2\\&\quad = \, \mathcal {G} \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}\right) + 2 \operatorname {Re} \left\langle \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k)}\right) , \varvec{x}^\prime - \varvec{x}^{(k)} \right\rangle + \frac{\mu }{2} \left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2^2 \\&\quad \geqslant \, \mathscr {H} \left( \varvec{z}^{(k)}\right) + \mathscr {G} \left( \varvec{x}^{\prime }\right) + \mathscr {F} \left( \varvec{x}^{\prime }\right) + \frac{\mu - L_{\mathscr {F}} M^2}{2} \left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2^2\\&\quad \geqslant \, \inf _{\varvec{z}, \varvec{x} \in C} \{ \mathcal {G} (\varvec{z}, \varvec{x}) \mid \nabla \varvec{x} - \varvec{z} = \varvec{0}\} + \frac{\mu - L_{\mathscr {F}} M^2}{2} \left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2^2\\&\quad > \, - \infty . \end{aligned}$$

The last two steps hold because the function \(\mathcal {G}\) is always non-negative, ensuring that a finite lower bound exists and the result does not tend towards negative infinity.

Since \(\mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \) is decreasing and bounded below, \(\mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \) is bounded, with the upper bound controlled by \(\mathcal {L}_\mu \left( \varvec{z}^{(0)}, \varvec{x}^{(0)}, \varvec{\omega }^{(0)}\right) \).

From the inequality established earlier, we have

$$\begin{aligned} \mathcal {G} \left( \varvec{z}^{(k)}, \varvec{x}^\prime \right) + \frac{\mu - L_{\mathscr {F}} M^2}{2} \left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2^2 \leqslant \mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) , \end{aligned}$$

then the left-hand side is bounded above by \(\mathcal {L}_\mu \left( \varvec{z}^{(0)}, \varvec{x}^{(0)}, \varvec{\omega }^{(0)}\right) \) and below by 0. Thus, \(\mathcal {G} \left( \varvec{z}^{(k)}, \varvec{x}^\prime \right) \) and \(\left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2\) are both bounded. Since \(\mathcal {G}\) is coercive as per Lemma 2, \(\left\{ \varvec{z}^{(k)}, \varvec{x}^\prime = \nabla ^{-1} \varvec{z}^{(k)}\right\} \) is bounded. Given that \(\left\| \nabla \varvec{x}^{(k)} - \varvec{z}^{(k)}\right\| _2\) is bounded, the sequence \(\left\{ \nabla \varvec{x}^{(k)}\right\} \) is also bounded. Consequently, by the column full rank property of \(\nabla \), the sequence \(\left\{ \varvec{x}^{(k)}\right\} \) is bounded as well.

Then, from Lemma 4 and the Lipschitz gradient property of \(\mathscr {F}\), we deduce that \(\left\{ \nabla ^\textsf{H} \varvec{\omega }^{(k)} \right\} \) is bounded. By considering the telescoping sum for \(\varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)} \in \operatorname {Im} (\nabla )\), we conclude that \(\varvec{\omega }^{(k)} - \varvec{\omega }^{(0)} \in \operatorname {Im} (\nabla )\), and thus \(\left\{ \varvec{\omega }^{(k)} \right\} \) is bounded. Therefore, the sequence \(\left\{ \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)} \right\} \) is bounded. \(\square \)

Lemma 6

There exists a constant \(C_2 > 0\) and \(d^{(k+1)} \in \partial \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \) such that

$$\begin{aligned} \left\| d^{(k+1)} \right\| _2 \leqslant C_2 \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

Proof

Initially,

$$\begin{aligned} \nabla _{\varvec{\omega }} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) = \frac{\mu }{2} \left( \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)}\right) = \frac{\mu }{2} \left( \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) , \end{aligned}$$

and according to Lemma 4,

$$\begin{aligned} \left\| \nabla _{\varvec{\omega }} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \right\| _2 \leqslant \frac{\mu }{2} C_1 \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

Additionally,

$$\begin{aligned}&\nabla _{\varvec{z}} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \\&\quad = \, \nabla _{\varvec{z}} \mathscr {H} \left( \varvec{z}^{(k+1)}\right) + \frac{\mu }{2} \left( \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)} + \varvec{\omega }^{(k+1)}\right) \\&\quad = \, \nabla _{\varvec{z}} \mathscr {H} \left( \varvec{z}^{(k+1)}\right) + \frac{\mu }{2} \left( \nabla \varvec{x}^{(k)} - \varvec{z}^{(k+1)} + \varvec{\omega }^{(k)}\right) + \frac{\mu }{2} \left( \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} + \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) . \end{aligned}$$

From the optimality conditions for updating \(\varvec{z}\) in the algorithm update process (14), we know

$$\begin{aligned} \nabla _{\varvec{z}} \mathscr {H} \left( \varvec{z}^{(k+1)}\right) + \frac{\mu }{2} \left( \nabla \varvec{x}^{(k)} - \varvec{z}^{(k+1)} + \varvec{\omega }^{(k)}\right) = \varvec{0}, \end{aligned}$$

Therefore,

$$\begin{aligned} \nabla _{\varvec{z}} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) = \frac{\mu }{2} \left( \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} + \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) . \end{aligned}$$

Hence, by Lemma 4,

$$\begin{aligned} \left\| \nabla _{\varvec{z}} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \right\| _2 \leqslant \frac{\mu }{2} (C_1+1) \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

Moreover,

$$\begin{aligned}&\nabla _{\varvec{x}} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \\&\quad = \, \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) + \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)} + \varvec{\omega }^{(k+1)}\right) \\&\quad = \, \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) + \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x}^{(k+1)} - \varvec{z}^{(k+1)} + \varvec{\omega }^{(k)}\right) + \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) \\&\quad = \, \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) . \end{aligned}$$

Using a method similar to that used in the proof of Lemma 4,

$$\begin{aligned}&\left\| \nabla _{\varvec{x}} \mathcal {L}_\mu \left( \varvec{z}^{(k+1)}, \varvec{x}^{(k+1)}, \varvec{\omega }^{(k+1)}\right) \right\| _2 \\&\quad =\, \frac{\mu }{2} \left\| \nabla ^\textsf{H}\left( \varvec{\omega }^{(k+1)}-\varvec{\omega }^{(k)}\right) \right\| _2\\&\quad = \, \left\| \nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k+1)}\right) -\nabla _{\varvec{x}} \mathscr {F} \left( \varvec{x}^{(k)}\right) \right\| _2 \\&\quad \leqslant \, L_{\mathscr {F}} \left\| \varvec{x}^{(k+1)} - \varvec{x}^{(k)} \right\| _2\\&\quad \leqslant \, L_{\mathscr {F}} M \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2. \end{aligned}$$

Thus, let

$$\begin{aligned} d^{(k+1)} = \frac{\mu }{2} \left( \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} + \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}, \nabla ^{\textsf{H}} \left( \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) , \varvec{\omega }^{(k+1)} - \varvec{\omega }^{(k)}\right) , \end{aligned}$$

we have

$$\begin{aligned} \left\| d^{(k+1)} \right\| _2 \leqslant C_2 \left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2, \end{aligned}$$

where \(C_2 = \frac{\mu }{2} (2C_1+1) + L_{\mathscr {F}} M\). \(\square \)

Proof of Theorem 3

Since the sequence \(\left\{ \varvec{x}^{(k)}, \varvec{z}^{(k)}, \varvec{\omega }^{(k)} \right\} \) is bounded, there exists a convergent subsequence and limit point, denoted as \(\left( \varvec{x}^{(k_s)}, \varvec{z}^{(k_s)}, \varvec{\omega }^{(k_s)}\right) \rightarrow \left( \varvec{x}^*, \varvec{z}^*, \varvec{\omega }^*\right) \) as \(s \rightarrow \infty \).

According to Lemma 5, since \(\mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \) is decreasing and bounded below, it converges. Particularly, its subsequence \(\mathcal {L}_\mu \left( \varvec{z}^{(k_s)}, \varvec{x}^{(k_s)}, \varvec{\omega }^{(k_s)}\right) \) also converges. Since C is a closed convex set, \(\mathscr {G} \left( \varvec{x}\right) = I_C \left( \varvec{x}\right) \) is lower semi-continuous, thus \(\mathcal {L}_\mu \) is lower semi-continuous. Therefore, \(\lim _{s \rightarrow \infty } \mathcal {L}_\mu \left( \varvec{z}^{(k_s)}, \varvec{x}^{(k_s)}, \varvec{\omega }^{(k_s)}\right) \geqslant \mathcal {L}_\mu \left( \varvec{z}^*, \varvec{x}^*, \varvec{\omega }^*\right) \). On the other hand, the only discontinuous term in \(\mathcal {L}_\mu \) is \(\mathscr {G}\), so

$$\begin{aligned} \lim _{s \rightarrow \infty } \mathcal {L}_\mu \left( \varvec{z}^{(k_s)}, \varvec{x}^{(k_s)}, \varvec{\omega }^{(k_s)}\right) - \mathcal {L}_\mu \left( \varvec{z}^*, \varvec{x}^*, \varvec{\omega }^*\right) = \limsup _{s \rightarrow \infty } \mathscr {G} \left( \varvec{x}^{(k_s)}\right) - \mathscr {G} \left( \varvec{x}^*\right) . \end{aligned}$$

However, from the optimality condition in the algorithm updates (14) and taking the limit, it is found that \(\limsup _{s \rightarrow \infty } \mathscr {G} \left( \varvec{x}^{(k_s)}\right) - \mathscr {G} \left( \varvec{x}^*\right) \leqslant 0\), thus

$$\begin{aligned} \lim _{s \rightarrow \infty } \mathcal {L}_\mu \left( \varvec{z}^{(k_s)}, \varvec{x}^{(k_s)}, \varvec{\omega }^{(k_s)}\right) = \mathcal {L}_\mu \left( \varvec{z}^*, \varvec{x}^*, \varvec{\omega }^*\right) . \end{aligned}$$

Since \(\mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \) is convergent, by Lemma 5 it is known that \(\left\| \nabla \varvec{x}^{(k+1)} - \nabla \varvec{x}^{(k)} \right\| _2 \rightarrow 0\), therefore by Lemma 6 it is known that there exists \(d^{(k)} \in \partial \mathcal {L}_\mu \left( \varvec{z}^{(k)}, \varvec{x}^{(k)}, \varvec{\omega }^{(k)}\right) \) such that \(\left\| d^{(k)} \right\| _2 \rightarrow 0\). Particularly, \(\left\| d^{(k_s)} \right\| _2 \rightarrow 0\) as \(s \rightarrow \infty \). Since \(\lim _{s \rightarrow \infty } \mathcal {L}_\mu \left( \varvec{z}^{(k_s)}, \varvec{x}^{(k_s)}, \varvec{\omega }^{(k_s)}\right) = \mathcal {L}_\mu \left( \varvec{z}^*, \varvec{x}^*, \varvec{\omega }^*\right) \), it follows that \(\varvec{0} \in \partial \mathcal {L}_\mu \left( \varvec{z}^*, \varvec{x}^*, \varvec{\omega }^*\right) \). \(\square \)

1.2 Convergence of the AWGP Algorithm

Consider the objective function of problem (23), defined as

$$\begin{aligned} \mathscr {J}(\varvec{x}) = \mathscr {F}(\varvec{x}) + \mathscr {I}(\varvec{x}), \end{aligned}$$

where \(\mathscr {F}(\varvec{x}) = \frac{1}{2}\left\| |\varvec{A} \varvec{x}|-\varvec{y}\right\| _2^2\) and \(\mathscr {I}(\varvec{x}) = \frac{\mu }{2}\left\| \nabla \varvec{x} + \varvec{u} \right\| _2^2\), with \(\varvec{u} = - \varvec{z}^{(k)}+ \varvec{\omega }^{(k)}\).

Lemma 7

The function \(\mathscr {I}\) is Lipschitz differentiable with a Lipschitz constant \(L_{\mathscr {I}} = \frac{\mu }{2} \sqrt{\rho \left( \nabla ^{\textsf{H}} \nabla \right) }\).

Proof

From (32), we know that

$$\begin{aligned} \nabla _{\varvec{x}} \mathscr {I} (\varvec{x}) = \left( \frac{\partial \mathscr {I}}{\partial \varvec{x}^{\textsf{H}}}\right) ^{\textsf{T}} = \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x} + \varvec{u}\right) . \end{aligned}$$

For any \(\varvec{x}_1, \varvec{x}_2 \in \mathbb {C}^n\), we have

$$\begin{aligned}&\left\| \nabla _{\varvec{x}} \mathscr {I} (\varvec{x}_1) - \nabla _{\varvec{x}} \mathscr {I} (\varvec{x}_2) \right\| _2\\&\quad = \, \left\| \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x}_1 + \varvec{u}\right) - \frac{\mu }{2} \nabla ^{\textsf{H}} \left( \nabla \varvec{x}_2 + \varvec{u}\right) \right\| _2 \\&\quad = \, \frac{\mu }{2} \left\| \nabla ^{\textsf{H}} \nabla \left( \varvec{x}_1 - \varvec{x}_2\right) \right\| _2\\&\quad \leqslant \, \frac{\mu }{2} \left\| \nabla ^{\textsf{H}} \nabla \right\| _2 \left\| \varvec{x}_1 - \varvec{x}_2 \right\| _2\\&\quad = \, \frac{\mu }{2} \sqrt{\rho \left( \nabla ^{\textsf{H}} \nabla \right) } \left\| \varvec{x}_1 - \varvec{x}_2 \right\| _2. \end{aligned}$$

Therefore, \(\nabla _{\varvec{x}} \mathscr {I}\) is Lipschitz continuous, i.e., \(\mathscr {I} (\varvec{x})\) is Lipschitz differentiable. \(\square \)

Proof of Theorem 4

The gradient projection algorithm is equivalent to a proximal gradient method whose nonsmooth term is an indicator function. Since the constraint set C in problem (23) is a closed convex set, the general results on accelerated proximal gradient methods for convex functions [6] apply: convergence requires only that the objective function \(\mathscr {J}\) be Lipschitz differentiable and that the step size not exceed the reciprocal of the Lipschitz constant.

As demonstrated by Lemma 3 and Lemma 7, \(\mathscr {J}(\varvec{x}) = \mathscr {F}(\varvec{x}) + \mathscr {I}(\varvec{x})\) is Lipschitz differentiable with a Lipschitz constant \(L_{\mathscr {J}} = \max \left\{ L_{\mathscr {F}}, L_{\mathscr {I}}\right\} = \max \left\{ \frac{1}{2} \rho \left( \varvec{A}^{\textsf{H}} \varvec{A}\right) , \frac{\mu }{2} \sqrt{\rho \left( \nabla ^{\textsf{H}} \nabla \right) }\right\} \). \(\square \)
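To connect Theorem 4 with the inner solver, here is an illustrative accelerated variant of the projected Wirtinger gradient step (FISTA-type momentum in the sense of [6] and [49]), reusing wirtinger_grad and project_C from the earlier sketches. The step size used below is a deliberately conservative numerical choice, no larger than the reciprocal of the Lipschitz constant, rather than the exact constant \(L_{\mathscr {J}}\) stated above.

```python
import numpy as np

def awgp(x0, y, u, mu, support, steps=100):
    """Accelerated Wirtinger gradient projection for sub-problem (23):
    FISTA-style extrapolation followed by a projected gradient step with a
    fixed step size (a conservative Lipschitz bound is used)."""
    L = 0.5 + 4.0 * mu                     # conservative bound on the Lipschitz constant of grad J
    x, x_prev, t = x0, x0, 1.0
    for _ in range(steps):
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        v = x + ((t - 1.0) / t_next) * (x - x_prev)      # momentum extrapolation
        x_prev = x
        x = project_C(v - wirtinger_grad(v, y, u, mu) / L, support)
        t = t_next
    return x
```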

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Chen, X., Li, F. Single-Shot Phase Retrieval Via Gradient-Sparse Non-Convex Regularization Integrating Physical Constraints. J Sci Comput 102, 63 (2025). https://doi.org/10.1007/s10915-025-02788-2

