Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

Scaled relative graphs: nonexpansive operators via 2D Euclidean geometry

  • Full Length Paper
  • Series A
  • Published:
Mathematical Programming Submit manuscript

Abstract

Many iterative methods in applied mathematics can be thought of as fixed-point iterations, and such algorithms are usually analyzed analytically, with inequalities. In this paper, we present a geometric approach to analyzing contractive and nonexpansive fixed point iterations with a new tool called the scaled relative graph. The SRG provides a correspondence between nonlinear operators and subsets of the 2D plane. Under this framework, a geometric argument in the 2D plane becomes a rigorous proof of convergence.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7

Similar content being viewed by others

Notes

  1. The winding number of a closed curve in the plane around a given point is an integer representing the total number of times that curve travels counterclockwise around the point. A definition, based on the complex analysis and the Cauchy residue theorem can be found in Section 4.1 of [2].

References

  1. Ablowitz, M.J., Fokas, A.S.: Complex Variables: Introduction and Applications, 2nd edn. Cambridge University Press, Cambridge (2003)

    Book  MATH  Google Scholar 

  2. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover, New York (1964)

    MATH  Google Scholar 

  3. Banach, S.: Sur les opérations dans les ensembles abstraits et leur application aux équations intégrales. Fundam. Math. 3(1), 133–181 (1922)

    Article  MATH  Google Scholar 

  4. Banjac, G., Goulart, P.J.: Tight global linear convergence rate bounds for operator splitting methods. IEEE Trans. Autom. Control 63(12), 4126–4139 (2018)

    Article  MathSciNet  MATH  Google Scholar 

  5. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer, Berlin (2017)

    Book  MATH  Google Scholar 

  6. Bauschke, H.H., Noll, D., Phan, H.M.: Linear and strong convergence of algorithms involving averaged nonexpansive operators. J. Math. Anal. Appl. 421(1), 1–20 (2015)

    Article  MathSciNet  MATH  Google Scholar 

  7. Bauschke, H.H., Wang, X.: Firmly nonexpansive and Kirszbraun–Valentine extensions: a constructive approach via monotone operator theory. In: Nonlinear Analysis and Optimization I: Nonlinear Analysis, pp. 55–64. American Mathematics Society (2010)

  8. Bauschke, H.H., Wang, X., Yao, L.: General resolvents for monotone operators: characterization and extension. In: Biomedical Mathematics: Promising Directions in Imaging, Therapy Planning, and Inverse Problems, pp. 57–74. Medical Physics Publishing (2010)

  9. Beck, A.: First-Order Methods in Optimization. Society for Industrial and Applied Mathematics, Philadelphia (2017)

    Book  MATH  Google Scholar 

  10. Bellman, R.: On the theory of dynamic programming. Proc. Natl. Acad. Sci. 38(8), 716–719 (1952)

    Article  MathSciNet  MATH  Google Scholar 

  11. Bolte, J., Nguyen, T.P., Peypouquet, J., Suter, B.W.: From error bounds to the complexity of first-order descent methods for convex functions. Math. Program. 165(2), 471–507 (2017)

    Article  MathSciNet  MATH  Google Scholar 

  12. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)

    Book  MATH  Google Scholar 

  13. Brezis, H., Lions, P.L.: Produits infinis de resolvantes. Israel J. Math. 29(4), 329–345 (1978)

    Article  MathSciNet  MATH  Google Scholar 

  14. Briceño-Arias, L.M., Davis, D.: Forward–backward-half forward algorithm for solving monotone inclusions. SIAM J. Optim. 28(4), 2839–2871 (2018)

    Article  MathSciNet  MATH  Google Scholar 

  15. Bruck, R.E.: On the weak convergence of an ergodic iteration for the solution of variational inequalities for monotone operators in hilbert space. J. Math. Anal. Appl. 61(1), 159–164 (1977)

    Article  MathSciNet  MATH  Google Scholar 

  16. Cauchy, M.A.: Méthode générale pour la résolution des systémes d’équations simultanées. Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences 25, 536–538 (1847)

    Google Scholar 

  17. Combettes, P.L., Yamada, I.: Compositions and convex combinations of averaged nonexpansive operators. J. Math. Anal. Appl. 425(1), 55–70 (2015)

    Article  MathSciNet  MATH  Google Scholar 

  18. Davis, D., Yin, W.: Faster convergence rates of relaxed Peaceman–Rachford and ADMM under regularity assumptions. Math. Oper. Res. 42(3), 783–805 (2017)

    Article  MathSciNet  MATH  Google Scholar 

  19. Deng, W., Yin, W.: On the global and linear convergence of the generalized alternating direction method of multipliers. J. Sci. Comput. 66(3), 889–916 (2015)

    Article  MathSciNet  MATH  Google Scholar 

  20. Dontchev, A.L., Rockafellar, R.T.: Regularity and conditioning of solution mappings in variational analysis. Set-Valued Anal. 12(1), 79–109 (2004)

    Article  MathSciNet  MATH  Google Scholar 

  21. Dontchev, A.L., Rockafellar, R.T.: Implicit Functions and Solution Mappings: A View from Variational Analysis, 2nd edn. Springer, New York (2014)

    Book  MATH  Google Scholar 

  22. Douglas, J., Rachford, H.H.: On the numerical solution of heat conduction problems in two and three space variables. Trans. Am. Math. Soc. 82, 421–439 (1956)

    Article  MathSciNet  MATH  Google Scholar 

  23. Drusvyatskiy, D., Lewis, A.S.: Error bounds, quadratic growth, and linear convergence of proximal methods. Math. Oper. Res. 43(3), 919–948 (2018)

    Article  MathSciNet  MATH  Google Scholar 

  24. Eckstein, J.: Splitting methods for monotone operators with applications to parallel optimization. Ph.D. thesis, MIT (1989)

  25. Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55(1–3), 293–318 (1992)

    Article  MathSciNet  MATH  Google Scholar 

  26. Gabay, D.: Applications of the method of multipliers to variational inequalities. In: Fortin, M., Glowinski, R. (eds.) Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems. North-Holland, Amsterdam (1983)

    Google Scholar 

  27. Giselsson, P.: Lunds Universitet, lecture notes: large-scale convex optimization (2015). http://www.control.lth.se/education/doctorate-program/large-scale-convex-optimization/. Last visited on 1 Dec 2018

  28. Giselsson, P.: Tight global linear convergence rate bounds for Douglas–Rachford splitting. J. Fixed Point Theory Appl. 19(4), 2241–2270 (2017)

    Article  MathSciNet  MATH  Google Scholar 

  29. Giselsson, P., Boyd, S.: Linear convergence and metric selection for Douglas–Rachford splitting and ADMM. IEEE Trans. Autom. Control 62(2), 532–544 (2017)

    Article  MathSciNet  MATH  Google Scholar 

  30. Han, D., Yuan, X.: Convergence analysis of the Peaceman–Rachford splitting method for nonsmooth convex optimization. Optimization Online (2012)

  31. Hannah, R., Yin, W.: Scaled relative graph. UCLA CAM report (2016)

  32. Hiriart-Urruty, J.B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms, vol. 2. Springer, Berlin (1993)

    MATH  Google Scholar 

  33. Horn, R.A., Johnson, C.R.: Topics in Matrix Analysis. Cambridge University Press, Cambridge (1991)

    Book  MATH  Google Scholar 

  34. Huang, X., Ryu, E.K., Yin, W.: Scaled relative graph of normal matrices. arXiv preprint arXiv:2001.02061 (2019)

  35. Huang, X., Ryu, E.K., Yin, W.: Tight coefficients of averaged operators via scaled relative graph. J. Math. Anal. Appl. 490(1), 124211 (2020)

  36. Karimi, H., Nutini, J., Schmidt, M.: Linear convergence of gradient and proximal-gradient methods under the Polyak–Łojasiewicz condition. In: Frasconi, P., Landwehr, N., Manco, G., Vreeken, J. (eds.) Machine Learning and Knowledge Discovery in Databases (KDD), pp. 795–811. Springer, Berlin (2016)

    Chapter  Google Scholar 

  37. Kline, M.: Calculus: An Intuitive and Physical Approach, 2nd edn. Wiley, Hoboken (1977)

    Google Scholar 

  38. Korpelevich, G.M.: The extragradient method for finding saddle points and other problems. Ekon. Mat. Metod. 12, 747–756 (1976)

    MathSciNet  MATH  Google Scholar 

  39. Krasnosel’skii, M.A.: Two remarks on the method of successive approximations. Uspekhi Mat. Nauk 10(1), 123–127 (1955)

    MathSciNet  Google Scholar 

  40. Latafat, P., Patrinos, P.: Asymmetric forward–backward–adjoint splitting for solving monotone inclusions involving three operators. Comput. Optim. Appl. 68(1), 57–93 (2017)

    Article  MathSciNet  MATH  Google Scholar 

  41. Leventhal, D.: Metric subregularity and the proximal point method. J. Math. Anal. Appl. 360(2), 681–688 (2009)

    Article  MathSciNet  MATH  Google Scholar 

  42. Liang, J., Fadili, J., Peyré, G.: Convergence rates with inexact non-expansive operators. Math. Program. 159(1), 403–434 (2016)

    Article  MathSciNet  MATH  Google Scholar 

  43. Lindelöf, E.: Sur l’applications de la méthode des approximations successives aux équations différentielles ordinaires du premier ordre. Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences 118, 454–456 (1894)

    MATH  Google Scholar 

  44. Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979)

    Article  MathSciNet  MATH  Google Scholar 

  45. Malitsky, Y., Tam, M.K.: A forward-backward splitting method for monotone inclusions without cocoercivity. SIAM J. Optim. 30(2), 1451–1472 (2020)

  46. Mann, W.R.: Mean value methods in iteration. Proc. Am. Math. Soc. 4(3), 506–510 (1953)

    Article  MathSciNet  MATH  Google Scholar 

  47. Martinet, B.: Régularisation d’inéquations variationnelles par approximations successives. Revue Française d’Informatique et de Recherche Opérationnelle, Série Rouge 4(3), 154–158 (1970)

    MATH  Google Scholar 

  48. Martinet, B.: Determination approchée d’un point fixe d’une application pseudo-contractante. Comptes Rendus de l’Académie des Sciences, Série A 274, 163–165 (1972)

    MATH  Google Scholar 

  49. Mises, R.V., Pollaczek-Geiringer, H.: Praktische verfahren der gleichungsauflösung. Zeitschrift für Angewandte Mathematik und Mechanik 9(2), 152–164 (1929)

    Article  MATH  Google Scholar 

  50. Morley, F., Morley, F.V.: Inversive Geometry. G. Bell and Sons, London (1933)

    MATH  Google Scholar 

  51. Moursi, W.M., Vandenberghe, L.: Douglas–Rachford splitting for a Lipschitz continuous and a strongly monotone operator. J. Optim. Theory Appl. 183, 179–198 (2019)

  52. Murnaghan, F.D., Wintner, A.: A canonical form for real matrices under orthogonal transformations. Proc. Natl. Acad. Sci. 17(7), 417–420 (1931)

    Article  MATH  Google Scholar 

  53. Necoara, I., Nesterov, Y., Glineur, F.: Linear convergence of first order methods for non-strongly convex optimization. Math. Program. 175(1), 69–107 (2018)

    MathSciNet  MATH  Google Scholar 

  54. Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. Springer, Berlin (2013)

    MATH  Google Scholar 

  55. von Neumann, J.: Functional Operators, Volume II. The Geometry of Orthogonal Spaces. Princeton University Press, Princeton (1950)

    MATH  Google Scholar 

  56. Newton, I.: De analysi per aequationes numero terminorum infinitas. The Royal Society, London (1669)

    Google Scholar 

  57. Ogura, N., Yamada, I.: Non-strictly convex minimization over the fixed point set of an asymptotically shrinking nonexpansive mapping. Numer. Funct. Anal. Optim. 23(1–2), 113–137 (2002)

    Article  MathSciNet  MATH  Google Scholar 

  58. Pedoe, D.: A Course Geometry for Colleges and Universities. Cambridge University Press, Cambridge (1970)

    MATH  Google Scholar 

  59. Picard, E.: Mémoire sur la théorie des équations aux dérivées partielles et la méthode des approximations successives. Journal de Mathématiques Pures et Appliquées 4éme Série 6, 145–210 (1890)

    MATH  Google Scholar 

  60. Riley, K.F., Hobson, M.P., Bence, S.J.: Mathematical Methods for Physics and Engineering, 3rd edn. Cambridge University Press, Cambridge (2006)

    Book  MATH  Google Scholar 

  61. Robinson, S.M.: Some continuity properties of polyhedral multifunctions. In: König, H., Korte, B., Ritter, K. (eds.) Mathematical Programming at Oberwolfach, pp. 206–214. Springer, Berlin (1981)

    Chapter  Google Scholar 

  62. Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14(5), 877–898 (1976)

    Article  MathSciNet  MATH  Google Scholar 

  63. Ryu, E.K., Boyd, S.: Primer on monotone operator methods. Appl. Comput. Math. 15, 3–43 (2016)

    MathSciNet  MATH  Google Scholar 

  64. Ryu, E.K., Taylor, A.B., Bergeling, C., Giselsson, P.: Operator splitting performance estimation: tight contraction factors and optimal parameter selection. SIAM J. Optim. 30(3), 2251–2271 (2020)

  65. Stewart, M.: Some General Theorems of Considerable Use in the Higher Parts of Mathematics. W. Sands, A. Murray, and J. Cochran, London (1746)

    Google Scholar 

  66. Taylor, A.B., Hendrickx, J.M., Glineur, F.: Smooth strongly convex interpolation and exact worst-case performance of first-order methods. Math. Program. 161(1), 307–345 (2017)

    Article  MathSciNet  MATH  Google Scholar 

  67. Trefethen, L., Embree, M.: Spectra and Pseudospectra: The Behavior of Nonnormal Matrices and Operators. Princeton University Press, Princeton (2005)

    Book  MATH  Google Scholar 

  68. Tseng, P.: A modified forward-backward splitting method for maximal monotone mappings. SIAM J. Control Optim. 38(2), 431–446 (2000)

    Article  MathSciNet  MATH  Google Scholar 

  69. Wentworth, G., Smith, D.E.: Plane and Solid Geometry. Ginn and Company, London (1913)

    Google Scholar 

  70. Ye, J., Yuan, X., Zeng, S., Zhang, J.: Variational analysis perspective on linear convergence of some first order methods for nonsmooth convex optimization problems. Optimization (2018) (Online Preprint)

  71. Yuan, X., Zeng, S., Zhang, J.: Discerning the linear convergence of ADMM for structured convex optimization through the lens of variational analysis. JMLR 21(83):1–75, (2020)

  72. Zhang, H.: New analysis of linear convergence of gradient-type methods via unifying error bound conditions. Math. Program. 180(1), 371–416 (2019)

    MathSciNet  MATH  Google Scholar 

Download references

Acknowledgements

We thank Pontus Giselsson for the illuminating discussion on metric subregularity. We thank Minyong Lee and Yeoil Yoon for helpful discussions on inversive geometry and its use in high school mathematics competitions. We thank Xinmeng Huang for the aid in drawing Figure 5. We thank the Erwin Schrödinger Institute and the organizers of its workshop in February 2019, which provided us with fruitful discussions and helpful feedback that materially improved this paper.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wotao Yin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was partially supported by AFOSR MURI FA9550-18-1-0502, NSF Grant DMS-1720237, ONR Grant N000141712162, the New Faculty Startup Fund from Seoul National University, the National Research Foundation of Korea (NRF) Grant funded by the Korean Government (MSIP) [No. 2020R1F1A1A01072877], and the National Research Foundation of Korea (NRF) Grant funded by the Korean Government (MSIP) [No. 2017R1A5A1015626].

Appendices

Further discussion

1.1 The role of maximality

A fixed-point iteration \(x^{k+1}=Tx^k\) becomes undefined if its iterates ever escape the domain of T. This is why we assume the monotone operators are maximal as maximality ensures \({\mathrm {dom}( J_{\alpha A})}=\mathcal {H}\). The results of this work are otherwise entirely independent of the notion of maximality.

In Sect. 2, we define \({\mathcal {M}}\) to contain all monotone operators (maximal or not). This choice is necessary to make \({\mathcal {M}}\) SRG-full. For the other classes \({\mathcal {L}}_{L}\), \({\mathcal {C}}_{\beta }\), \({\mathcal {M}}_{\mu }\), and \({\mathcal {N}}_\theta \), we make no restriction on the domain or maximality so that they can be SRG-full.

1.2 Minkowski-type set notation

Given \(\alpha \in \mathbb {R}\) and sets \(U,V\subseteq \mathcal {H}\), write

$$\begin{aligned} \alpha U=\{\alpha u\,|\,u\in U\},\quad U+V=\{u+v\,|\,u\in U,\,v\in V\},\quad U-V=U+(-V). \end{aligned}$$

Notice that if either U or V is \(\emptyset \), then \(U+V=\emptyset \).

Given \(Z,W\subseteq \mathbb {C}\), write

$$\begin{aligned} Z+W=\{z+w\,|\,z\in Z,\,w\in W\},\quad ZW=\{zw\,|\,z\in Z,\,w\in W\}. \end{aligned}$$

Given \(\alpha \in \mathbb {C}\), \(\alpha \ne 0\), and \(Z\subseteq \overline{\mathbb {C}}\), write

$$\begin{aligned} \alpha Z = \{ \alpha z\,|\,z\in Z\}. \end{aligned}$$

Given a class of operators \({\mathcal {A}}\) and \(\alpha \ne 0\), write

$$\begin{aligned} {\mathcal {A}}^{-1}=\{ A^{-1}\,|\,A\in {\mathcal {A}}\},\qquad \alpha {\mathcal {A}}=\{\alpha A\,|\,A\in {\mathcal {A}}\},\qquad {\mathcal {A}}\alpha =\{A\alpha \,|\,A\in {\mathcal {A}}\}. \end{aligned}$$

Given classes of operators \({\mathcal {A}}\) and \({\mathcal {B}}\) and \(\alpha >0\), write

$$\begin{aligned} {\mathcal {A}}+{\mathcal {B}}&=\{A+B\,|\,A\in {\mathcal {A}},\,B\in {\mathcal {B}},\,A:\mathcal {H}\rightrightarrows \mathcal {H},\,B:\mathcal {H}\rightrightarrows \mathcal {H}\}\\ {\mathcal {A}}{\mathcal {B}}&=\{AB\,|\,A\in {\mathcal {A}},\,B\in {\mathcal {B}},\,A:\mathcal {H}\rightrightarrows \mathcal {H},\,B:\mathcal {H}\rightrightarrows \mathcal {H}\}\\ J_{\alpha {\mathcal {A}}}&=\{J_{\alpha A}\,|\,A\in {\mathcal {A}},\,A:\mathcal {H}\rightrightarrows \mathcal {H}\}. \end{aligned}$$

1.3 SRG-full classes

There is one degenerate case to keep in mind for the sake of rigor. The SRG-full class of operators \({\mathcal {A}}_\text {null}\) represented by \(h(a,b,c)=a+b+|c|\) has \({\mathcal {G}}({\mathcal {A}}_\text {null})=\emptyset \). However, the class \({\mathcal {A}}_\text {null}\) is not itself empty; it contains operators whose graph contains zero or one pair, i.e., \(A\in {\mathcal {A}}_\text {null}\) if and only if we have either a) \({\mathrm {dom}(A)}=\emptyset \) or b) \({\mathrm {dom}(A)}=x\) and \(Ax=\{y\}\) for some \(x,y\in \mathcal {H}\).

Theorem 3 does not apply when the operator classes are not SRG-full. For example, although

$$\begin{aligned} \partial {\mathcal {F}}_{\mu ,L}=\partial {\mathcal {F}}_{\mu ,\infty }\cap \partial {\mathcal {F}}_{0,L} \end{aligned}$$

we have the strict containment

figure ab

Invariant circle number

Let \({\mathcal {A}}\) be an SRG-full class such that \({\mathcal {G}}({\mathcal {A}})\ne \emptyset \) and \({\mathcal {G}}({\mathcal {A}})\ne \overline{\mathbb {C}}\). Define the circle number of \({\mathcal {A}}\) as

$$\begin{aligned} \inf \Big \{k&\in \mathbb {N}\,\Big |\,{\mathcal {G}}({\mathcal {A}})=B_1\cap \dots \cap B_k,\,B_i \text { is a disk or a half-space for } i=1,\dots ,k \Big \}, \end{aligned}$$

which is a positive integer or \(\infty \). For example, \({\mathcal {M}}\cap {\mathcal {L}}_1\) has circle number 2 since

figure ac

In this section, we show that the circle number of an operator class is invariant under certain operations. This is analogous to how the genus or the winding number are topological invariants under homeomorphisms. That it is impossible to continuously deform a donut into a sphere since they have different numbers of holes, an invariant, is a standard argument of topology. The circle number serves as an analogous invariant for operator classes.

Theorem 10

The circle number of an SRG-full operator class is invariant under non-zero pre and post-scalar multiplication, addition by identity, and inversion.

Proof

Let T be a one-to-one mapping from an operator to an operator and let \(T'\) the the corresponding one-to-one mapping from \(\overline{\mathbb {C}}\) to \(\overline{\mathbb {C}}\). In particular, consider the following four cases: first \(T(A)=\alpha A\) and \(T'(z)=\alpha z\), second \(T(A)= A\alpha \) and \(T'(z)=\alpha z\), third \(T(A)= I+A \) and \(T'(z)=1+z\), and fourth \(T(A)= A^{-1} \) and \(T'(z)={\bar{z}}^{-1}\).

If

$$\begin{aligned} {\mathcal {G}}({\mathcal {A}})=B_1\cap \dots \cap B_k, \end{aligned}$$

then

$$\begin{aligned} {\mathcal {G}}(T({\mathcal {A}}))=T'(B_1)\cap \dots \cap T'(B_k), \end{aligned}$$

where \(T'(B_1),\dots ,T'(B_k)\) are each a disk or a half-space. Therefore, the circle number of \(T({\mathcal {A}})\) satisfies

$$\begin{aligned} \inf \Big \{k\,\Big |\,{\mathcal {G}}(T({\mathcal {A}}))=T'(B_1)\cap \dots&\cap T'(B_k) \Big \}\le \inf \Big \{k\,\Big |\,{\mathcal {G}}({\mathcal {A}})=B_1\cap \dots \cap B_k \Big \}. \end{aligned}$$

Since T and \(T'\) are invertible mappings, the argument goes in the other direction as well, and we conclude that the infimums are equal. \(\square \)

Corollary 3

There is no one-to-one mapping from \(\mathcal {M}\) to \(\mathcal {M}\cap \mathcal {L}_L\) constructed via pre and post-scalar multiplication, addition with the identity operator, and operator inversion.

Such one-to-one mappings between operator classes are used for translating a nice result on a simple operator class to another operator class. In [7, 8, 64] the maximal monotone extension theorem was translated to extension theorems of other operator classes. Corollary 3 shows that this approach will not work for \(\mathcal {M}\cap \mathcal {L}_L\), a class of operators considered by the extragradient method [38], forward-backward-forward splitting [68], and other related methods [14, 45]. In fact, [64] shows that certain simple interpolation condition for \({\mathcal {M}}\) fails for \(\mathcal {M}\cap \mathcal {L}_L\).

Fig. 8
figure 8

Spherical triangle inequality: \(|\theta -\varphi |\le \psi \le \theta +\varphi \)

Deferred proofs

Fact 17

(Spherical triangle inequality) Any nonzero \(a,b,c\in \mathcal {H}\) satisfies

$$\begin{aligned}\left| \angle (a,b)-\angle (b,c)\right| \le \angle (a,c)\le \angle (a,b)+\angle (b,c). \end{aligned}$$

Figure 8 illustrates the inequality. We use the spherical triangle inequality in Theorem 7 to argue that there is no need to consider a third dimension and that we can continue the analysis in 2D.

Proof of spherical triangle inequality

Although this result is known, we provide a proof for completeness. Without loss of generality, assume a, b, and c are unit vectors. Let \(\theta =\angle (a,b)\) and \(\varphi =\angle (b,c)\), and without loss of generality, assume \(\theta \ge \varphi \). Then we have

$$\begin{aligned} a=b\cos \theta +u\sin \theta ,\quad c=b\cos \varphi +v\sin \varphi , \end{aligned}$$

where u and v are unit vectors orthogonal to b, and

$$\begin{aligned} \langle a,c\rangle =\cos \theta \cos \varphi +\langle u,v\rangle \sin \theta \sin \varphi . \end{aligned}$$

Since \(\theta ,\varphi \in [0, \pi ]\), we have \(\sin \theta \sin \varphi \ge 0\). Since \(\Vert u\Vert =\Vert v\Vert =1\), we have \(|\langle u,v\rangle | \le 1\). Therefore

$$\begin{aligned} \cos \theta \cos \varphi -\sin \theta \sin \varphi \le \langle a,c\rangle \le \cos \theta \cos \varphi +\sin \theta \sin \varphi . \end{aligned}$$

Since \(\cos (\alpha \pm \beta )=\cos \alpha \cos \beta \mp \sin \alpha \sin \beta \), [2, p. 72, 4.3.17] we have

$$\begin{aligned} \cos (\theta +\varphi )\le \langle a,c\rangle \le \cos (\theta -\varphi ), \end{aligned}$$

and we conclude

$$\begin{aligned} \angle (a,c)=\arccos (\langle a,c\rangle )\in [\theta -\varphi ,\theta +\varphi ]. \end{aligned}$$

\(\square \)

Proof of Fact 7

First consider the case \(\mu <1/(2\beta )\). By Proposition 1 and Theorem 4, we have the geometry

figure ad

To clarify, O is the center of the circle with radius \({\overline{OB}}\) (lighter shade) and C is the center of the circle with radius \({\overline{AC}}={\overline{CB}}\) defining the inner region (darker shade). With two applications of the Pythagorean theorem, we get

$$\begin{aligned} {\overline{OB}}^2&={\overline{OD}}^2+{\overline{DB}}^2 ={\overline{OD}}^2+{\overline{BC}}^2-{\overline{CD}}^2\\&=(1-\alpha \mu )^2+(\alpha /(2\beta ))^2-(\alpha /(2\beta )-\alpha \mu )^2 =1-2\alpha \mu +\alpha ^2\mu /\beta . \end{aligned}$$

Since \(\overline{B'B}\) is a chord of circle O, it is within the circle. Since 2 non-identical circles intersect at at most 2 points, and since A is within circle O, arc is within circle O. Finally, the region bounded by (darker shade) is within circle O (lighter shade).

In the cases \(\mu =1/(2\beta )\) and \(\mu >1/(2\beta )\), we have a slightly different geometry, but the same arguments and calculations hold.

figure ae

The containment holds for R and fails for smaller R. Since \({\mathcal {L}}_R\) is SRG-full by Theorem 2, the containment of the SRG in \(\overline{\mathbb {C}}\) equivalent to the containment of the class. \(\square \)

We quickly state Stewart’s theorem [65], which we use for the Proof of Fact 12. For a triangle \(\triangle ABC\) and Cevian \({\overline{CD}}\) to the side \({\overline{AB}}\),

figure af

the lengths of the line segments satisfy

$$\begin{aligned} {\overline{AD}}\cdot {\overline{CB}}^2 + {\overline{DB}}\cdot {\overline{AC}}^2 = {\overline{AB}}\cdot {\overline{CD}}^2 +{\overline{AD}}\cdot {\overline{DB}}^2 +{\overline{AD}}^2\cdot {\overline{DB}}. \end{aligned}$$

Full Proof of Fact 12 By Proposition 1 and Theorems 4 and 5, we have the geometry

figure ag

A closer look gives us

figure ah

To clarify, B is the center of the circle with radius \({\overline{BA}}\) and C is the center of the circle with radius \({\overline{CA}}\). By Stewart’s theorem [65], we have

$$\begin{aligned} {\overline{OA}}^2&=\frac{{\overline{OC}}\cdot {\overline{AB}}^2+{\overline{BO}}\cdot {\overline{CA}}^2-{\overline{BO}}\cdot {\overline{OC}}\cdot {\overline{BC}}}{{\overline{BC}}}\\&=\frac{ \tfrac{\beta }{\alpha +\beta }\left( 1-\tfrac{\alpha \mu }{1+\alpha \mu }\right) ^2+ \tfrac{\alpha \mu }{1+\alpha \mu } \left( 1 - \tfrac{\beta }{\alpha + \beta } \right) ^2 -\tfrac{\beta }{\alpha +\beta } \tfrac{\alpha \mu }{1+\alpha \mu } \left( \tfrac{\beta }{\alpha +\beta }+\tfrac{\alpha \mu }{1+\alpha \mu }\right) }{\tfrac{\beta }{\alpha +\beta }+\tfrac{\alpha \mu }{1+\alpha \mu }}\\&=1-\frac{4\alpha \mu }{1+2\alpha \mu +\alpha ^2\mu /\beta }. \end{aligned}$$

Since 2 non-identitcal circles intersect at at most 2 points, and since D is within circle B, arc is within circle O. By the same reasoning, arc is within circle O. Finally, the region bounded by (darker shade) is within circle O (lighter shade).

The containment holds for R and fails for smaller R. Since \({\mathcal {L}}_R\) is SRG-full by Theorem 2, the containment of the SRG in \(\overline{\mathbb {C}}\) equivalent to the containment of the class. \(\square \)

Proof of Fact 16

Let

figure ai

First, we show \(Q=[0,1]C=[0,1]Q\). To clarify, [0, 1] is the set of real numbers between 0 and 1 and [0, 1]C and [0, 1]Q are Minkowski products of sets of complex numbers. Given any point \(A\in Q\backslash C\), define \(A'\) as the nonzero intersection of the line extending \({\overline{OA}}\) and circle C. Since A is on the line and inside the circle, the nonzero intersection \(A'\) exists.

figure aj

Since \(A\in \overline{OA'}\subseteq [0,1]C\), we have \(Q\subseteq [0,1]C\). On the other hand, \(Q\supseteq [0,1]C\) follows from noting that given any point \(A'\) on C, the line segment \(\overline{OA'}\) is a chord of the circle C and therefore is within the disk Q. Therefore, \(Q=[0,1]C\). As a corollary, we have \(Q=[0,1]C=([0,1][0,1])C=[0,1]([0,1]C)=[0,1]Q\).

Next, define

$$\begin{aligned} S=\bigcup _{0\le \varphi _1\le 2\pi }S_{\varphi _1}, \qquad S_{\varphi _1}= Q\left( \frac{1}{2}+\frac{1}{2}e^{i\varphi _1}\right) . \end{aligned}$$

In geometric terms, this construction takes a point on the circle C, draws the disk whose diameter is the line segment between this point and the origin, and takes the union of such disks.

figure ak

The dashed circle is the unit circle. The solid circle is C. The shaded circles represent instances of \(S_{\varphi _1}\). We can characterize \({\mathcal {G}}({\mathcal {N}}_{1/2}{\mathcal {N}}_{1/2})\) by analyzing this construction since

$$\begin{aligned} S&=QC=(Q [0,1]) C= Q([0,1]C)\\&=QQ={\mathcal {G}}({\mathcal {N}}_{1/2}){\mathcal {G}}({\mathcal {N}}_{1/2})={\mathcal {G}}({\mathcal {N}}_{1/2}{\mathcal {N}}_{1/2}) \end{aligned}$$

by Proposition 1 and Theorem 7.

We now show \(S=\left\{ re^{i\varphi }\,|\,0\le r\le \cos ^2(\varphi /2)\right\} \). This fact is known and can be analytically derived through the envelope theorem [60, Exercise 5.22]. We provide a geometric proof, which was inspired by [37, Exercise 4.15].

Throughout this proof, we write \({\mathcal {I}}:\overline{\mathbb {C}}\rightarrow \overline{\mathbb {C}}\) for the mapping \({\mathcal {I}}(z)={\bar{z}}^{-1}\). We map S into the inverted space, i.e., we analyze

$$\begin{aligned} {\mathcal {I}}(S)=\bigcup _{0\le \varphi _1\le 2\pi }{\mathcal {I}}(S_{\varphi _1}). \end{aligned}$$
figure al

Again, \({\mathcal {I}}(z)={\bar{z}}^{-1}\). The dashed circle, the unit circle, is mapped onto itself. Circle C, the solid circle, is mapped to \({\mathcal {I}}(C)\), the verticle line going through 1. Each shaded circle \(S_{\varphi _1}\) is mapped to a half-space \({\mathcal {I}}(S_{\varphi _1})\). Let point A be the nonzero intersection between C and the boundary of \(S_{\varphi _1}\). Then point \({\mathcal {I}}(A)\) is the non-infinite intersection between \({\mathcal {I}}(C)\) and the boundary of \({\mathcal {I}}(S_{\varphi _1})\). By construction, \({\overline{OA}}\) is the diameter of \(S_{\varphi _1}\). The (infinite) line containing O, A, and \({\mathcal {I}}(A)\) is mapped onto itself, excluding the origin. Since \({\mathcal {I}}\) is conformal, the right angle at A between the boundary of \(S_{\varphi _1}\) and the diameter \({\overline{OA}}\) is mapped to a right angle between boundary of \({\mathcal {I}}(S_{\varphi _1})\) and \(\overline{O{\mathcal {I}}(A)}\).

figure am

Next, we show that the union of the half-spaces is described by a parabola. Define line D as the vertical line going through 2. Consider any point \(A'\) on the line \({\mathcal {I}}(C)\). Consider the line through \(A'\) perpendicular to \(\overline{OA'}\) and the half-space to the right of the line including \(\infty \). Define point E as the intersection of line D and the line extending \({\overline{OA}}\). Draw a line through point E perpendicular to line D, and define point B as the intersection of this line with the boundary of the half-space. Since E is within the half-space and the boundary of the half-space is not horizontal, this intersection exists and B is to the left of E. Since \(\overline{OA'}=\overline{A'E}\), we have \(\triangle BA'O\cong \triangle BA'E\) by the side-angle-side (SAS) congruence, and \({\overline{OB}}={\overline{BE}}\). The union of all such points corresponding to B forms a parabola. with directrix D, focus (0, 0) and vertex (1, 0).

The boundary of the half-space is tangent to the parabola at B, i.e., the line intersects the parabola at no other point. To see why, consider any other point \(B'\) on the boundary of the half-space. Then \(\overline{OB'}=\overline{B'E}\) by SAS congruence. However, \(\overline{B'E}\) is not perpendicular to D, i.e., \(\overline{B'E}\) is not a horizontal line. Therefore \(\overline{OB'}\) is longer than the distance of \(B'\) to D, and therefore \(B'\) is not on the parabola. Since each half-space is tangent to the parabola at B, all points to the right of B are in \({\mathcal {I}}(S)\) and no points strictly to the left of the parabola are in \({\mathcal {I}}(S)\). Therefore \({\mathcal {I}}(S)\) is characterized by the closed region to the right of the parabola including \(\infty \).

figure an

The region exterior to the circle centered at \(-1\) with radius 2 contains the region towards the right of the parabola. This is easily verified with calculus. The circle with the lighter shade corresponds to \({\mathcal {G}}({\mathcal {N}}_{2/3})\) by Proposition 1. Since \({\mathcal {N}}_{2/3}\) is SRG-full by Theorem 2, strict containment of the SRG in \(\overline{\mathbb {C}}\) implies strict containment of the class. The inverse curve of the parabola with the focus as the center of inversion is known as the cardioid and it has the polar coordinate representation \(r(\varphi )\le \cos ^2(\varphi /2)\). The expression of the Theorem is the region bounded by this curve. \(\square \)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Ryu, E.K., Hannah, R. & Yin, W. Scaled relative graphs: nonexpansive operators via 2D Euclidean geometry. Math. Program. 194, 569–619 (2022). https://doi.org/10.1007/s10107-021-01639-w

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10107-021-01639-w

Keywords

Mathematics Subject Classification

Navigation