Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

Primal-dual path following method for nonlinear semi-infinite programs with semi-definite constraints

  • Full Length Paper
  • Series A
  • Published:
Mathematical Programming Submit manuscript

Abstract

In this paper, we propose a primal-dual path following method for nonlinear semi-infinite semi-definite programs with infinitely many convex inequality constraints, called SISDP for short. A straightforward approach to the SISDP is to use classical methods for semi-infinite programs such as discretization and exchange methods and solve a sequence of (nonlinear) semi-definite programs (SDPs). However, it is often too demanding to find exact solutions of SDPs. In contrast, our approach does not rely on solving SDPs accurately but approximately following a path leading to a solution, which is formed on the intersection of the semi-infinite feasible region and the interior of the semi-definite feasible region. Specifically, we first present a prototype path-following method and show its global weak* convergence to a Karush-Kuhn-Tucker point of the SISDP under some mild assumptions. Next, to achieve fast local convergence, we integrate a two-step sequential quadratic programming method equipped with the Monteiro-Zhang scaling technique into the prototype method. We prove two-step superlinear convergence of the resulting algorithm using Alizadeh-Hareberly-Overton-like, Nesterov-Todd, and Helmberg-Rendle-Vanderbei-Wolkowicz/Kojima-Shindoh-Hara/Monteiro scaling directions. Finally, we conduct some numerical experiments to demonstrate the efficiency of the proposed method through comparison with a discretization method that solves SDPs obtained by finite relaxation of the SISDP.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

Notes

  1. Actually, Jiang et al.  [12] solved linear SDPs with finitely many inequality constraints, which are obtained by discretizing the SISDP.

  2. Here, the metric topology in \(\mathcal {R}^n\times S^m\times \mathcal {R}^M\times T^M\) is the one which is naturally induced from the norm topology in \(\mathcal {R}^n\times S^m\times \mathcal {R}^m\) and the metric topology in \(T^M\).

  3. For each k, by definition, \(\tau _i^k\ne \tau _j^k\) if \(i\ne j\). But, at the limit, \(\tau _i^{*}= \tau _j^{*}\) possibly occurs, since two distinct sequences may converge to the identical point.

  4. We say that \(z^{*}\) is a KKT point of NSDP (3.1) with \(\bar{x}=x^{*}\) if \(\nabla f(x^{*})+\sum _{i=1}^{p(x^{*})} \nabla \hat{g}_i(x^{*})\zeta _i^{*} -(F_j\bullet V_{*})_{j=1}^n=0\), \(0\le \zeta ^{*}\perp \hat{g}(x^{*})\le 0\), and condition (2.2) holds with \(V=V_{*}\).

  5. For \(X\in S^{m}_{++},Y\in S^m\), the eigenvalues of \(X^{-1}Y\) are all real numbers.

  6. If \(x^k\) and \(x^{k+\frac{1}{2}}\) are sufficiently close to \(\delta \)-nondegenerate \(x^{*}\) as supposed in Algorithm 2, such j exists for each i.

  7. There is no theoretical guarantee for global optimality of \(\tau \) thus found. In practice, however, we may expect to have a global optimum by setting N large enough.

References

  1. Alizadeh, F., Goldfarb, D.: Second-order cone programming. Math. Program. 95, 3–51 (2003)

    Article  MathSciNet  MATH  Google Scholar 

  2. Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer-Verlag, New York (2000)

    Book  MATH  Google Scholar 

  3. Bertsekas, D.P., Nedic, A., Ozdaglar, A.E.: Convex Analysis and Optimization. Athena Scientific, Cambridge (2003)

    MATH  Google Scholar 

  4. Correa, R., Ramirez, C.H.: A global algorithm for nonlinear semidefinite programming. SIAM J. Optim. 15, 303–318 (2004)

    Article  MathSciNet  MATH  Google Scholar 

  5. Fazel, M., Hindi, H., and Boyd, S. P.: Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices, In: American Control Conference, 2003. Proceedings of the 2003, 3, 2156-2162. IEEE (2003)

  6. Floudas, C.A., Stein, O.: The adaptive convexification algorithm: a feasible point method for semi-infinite programming. SIAM J. Optim. 18, 1187–1208 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  7. Freund, R.W., Jarre, F., Vogelbusch, C.H.: Nonlinear semidefinite programming: Sensitivity, convergence, and an application in passive reduced-order modeling. Math. Program. 109, 581–611 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  8. Goberna, M.A., López, M.A.: Semi-Infinite Programming: Recent Advances. Kluwer Academic Publishers, Dordrecht (2001)

    Book  MATH  Google Scholar 

  9. Gramlich, G., Hettich, R., Sachs, E.W.: Local convergence of SQP methods in semi-infinite programming. SIAM J. Optim. 5, 641–658 (1995)

    Article  MathSciNet  MATH  Google Scholar 

  10. Hayashi, S. and S.-Y. Wu, S.-Y.: An explicit exchange algorithm for linear semi-infinite programming problems with second-order cone constraint, SIAM J. Optim. 20, 1527–1546 (2009)

  11. Hettich, R., Kortanek, K.O.: Semi-infinite programming: Theory, methods, and applications. SIAM Rev. 35, 380–429 (1993)

    Article  MathSciNet  MATH  Google Scholar 

  12. Jiang, A. and Kwan, H. K.: Minimax design of IIR digital filters using SDP relaxation technique, IEEE Trans. Circuits Syst. I, Reg. Papers, 57, 378–390 (2009)

  13. Konno, H., Kawadai, N., Tuy, H.: Cutting plane algorithms for nonlinear semi-definite programming problems with applications. J. Global Optim. 25, 141–155 (2003)

    Article  MathSciNet  MATH  Google Scholar 

  14. Leibfritz, F., Maruhn, J.H.: A successive SDP-NSDP approach to a robust optimization problem in finance. Comput. Optim. and Appl. 44, 443–466 (2009)

    Article  MathSciNet  MATH  Google Scholar 

  15. Li, D., Qi, L., Tam, J., Wu, S.-Y.: A smoothing Newton method for semi-infinite programming. J. Global Optim. 30, 169–194 (2004)

    Article  MathSciNet  MATH  Google Scholar 

  16. Li, S., Teo, K.L., Yang, X., Wu, S.-Y.: Robust envelope-constrained filter with orthonormal bases and semi-definite and semi-infinite programming. Optim. Eng. 8, 299–319 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  17. Li, S., Wu, S.-Y., Yang, X., Teo, K.-L.: A relaxed cutting plane method for semi-infinite semi-definite programming. Comput. Optim. and Appl. 196, 459–473 (2006)

    Article  MathSciNet  MATH  Google Scholar 

  18. Li, S., Yang, X., Teo, K.L., Wu, S.-Y.: A solution method for combined semi-infinite and semi-definite programming. ANZIAM J. 45, 477–494 (2004)

    Article  MathSciNet  MATH  Google Scholar 

  19. Okuno, T., Fukushima, M.: Local reduction based SQP-type method for semi-infinite programs with an infinite number of second-order cone constraints. J. Global Optim. 60, 25–48 (2014)

    Article  MathSciNet  MATH  Google Scholar 

  20. Okuno, T., Fukushima, M.: An interior point sequential quadratic programming-type method for log-determinant semi-infinite programs. J. Comput. Appl. Math. 376, 112784 (2020)

    Article  MathSciNet  MATH  Google Scholar 

  21. Okuno, T., Hayashi, S., Fukushima, M.: A regularized explicit exchange method for semi-infinite programs with an infinite number of conic constraints. SIAM J. Optim. 22, 1009–1028 (2012)

    Article  MathSciNet  MATH  Google Scholar 

  22. Pereira, A., Costa, M., Fernandes, E.: Interior point filter method for semi-infinite programming problems. Optimization 60, 1309–1338 (2011)

    Article  MathSciNet  MATH  Google Scholar 

  23. Pereira, A., Fernandes, E.: A reduction method for semi-infinite programming by means of a global stochastic approach. Optimization 58, 713–726 (2009)

    Article  MathSciNet  MATH  Google Scholar 

  24. Qi, L., Wu, S.-Y., Zhou, G.: Semismooth Newton methods for solving semi-infinite programming problems. J. Global Optim. 27, 215–232 (2003)

    Article  MathSciNet  MATH  Google Scholar 

  25. Reemtsen, R.: Discretization methods for the solution of semi-infinite programming problems. J. Optim. Theory Appl. 71, 85–103 (1991)

    Article  MathSciNet  MATH  Google Scholar 

  26. Reemtsen, R. and R\(\ddot{\text{u}}\)ckmann, J.: Semi-infinite Programming, Kluwer Academic Publishers, Boston (1998)

  27. Shapiro, A.: First and second order analysis of nonlinear semidefinite programs. Math. Program. 77, 301–320 (1997)

    Article  MathSciNet  MATH  Google Scholar 

  28. Shiu, T., Wu, S.-Y.: Relaxed cutting plane method with convexification for solving nonlinear semi-infinite programming problems. Comput. Optim. and Appl. 53, 1–23 (2012)

    Article  MathSciNet  MATH  Google Scholar 

  29. Stein, O., Steuermann, P.: The adaptive convexification algorithm for semi-infinite programming with arbitrary index sets. Math. Program. 18, 1–25 (2012)

    MathSciNet  MATH  Google Scholar 

  30. Still, G.: Discretization in semi-infinite programming: the rate of convergence. Math. Program. 91, 53–69 (2001)

    Article  MathSciNet  MATH  Google Scholar 

  31. Tanaka, Y., Fukushima, M., Ibaraki, T.: A globally convergent SQP method for semi-infinite nonlinear optimization. J. Comput. Appl. Math. 23, 141–153 (1988)

    Article  MathSciNet  MATH  Google Scholar 

  32. Todd, M.J., Toh, K.C., Tütüncü, R.H.: On the Nesterov-Todd direction in semidefinite programming. SIAM J. Optim. 8, 769–796 (1998)

    Article  MathSciNet  MATH  Google Scholar 

  33. Toh, K. C., Todd, M. J., and T\(\ddot{\text{ u }}\)t\(\ddot{\text{ u }}\)nc\(\ddot{\text{ u }}\), R. H., SDPT3—a MATLAB software package for semidefinite programming, version 2.1, Optim. Methods Softw. 11, 545–581 (1999)

  34. Wang, S., Yuan, Y.: Feasible method for semi-infinite programs. SIAM J. Optim. 25, 2537–2560 (2015)

    Article  MathSciNet  MATH  Google Scholar 

  35. Wolkowicz, H., Saigal, R., Vandenberghe, L.: Handbook of Semidefinite Programming: Theory, Algorithms, and Applications. Springer Science & Business Media, Germany (2012)

    MATH  Google Scholar 

  36. Wei, D., Sestok, C.K., Oppenheim, A.: V: Sparse filter design under a quadratic constraint: low-complexity algorithms. IEEE Trans. Sig. Process 857–870, 61 (2012)

    MATH  Google Scholar 

  37. Wen, F., Chu, L., Liu, P., Qiu, R.: C: A survey on nonconvex regularization-based sparse and low-rank recovery in signal processing, statistics, and machine learning. IEEE Access 69883–69906, 6 (2018)

    Google Scholar 

  38. Wu, S.-P., Boyd, S., and Vandenberghe, L.: FIR filter design via semidefinite programming and spectral factorization, In: Proc. IEEE Conf. on Decision and Control, pp. 271–276 (1996)

  39. Wu, S.-P., Boyd, S., and Vandenberghe, L.: FIR filter design via spectral factorization and convex optimization. Appl. Comput. Control, Signals and Circuits, pp. 215–245, (1999)

  40. Xu, M., Wu, S.-Y., Ye, J.J.: Solving semi-infinite programs by smoothing projected gradient method. Comput. Optim. and Appl. 59, 591–616 (2014)

    Article  MathSciNet  MATH  Google Scholar 

  41. Yamashita, H., Yabe, H.: Local and superlinear convergence of a primal-dual interior point method for nonlinear semidefinite programming. Math. Program. 132, 1–30 (2012)

    Article  MathSciNet  MATH  Google Scholar 

  42. Yamashita, H., Yabe, H.: A survey of numerical methods for nonlinear semidefinite programming. J. Oper. Res. Soc. Japan. 58, 24–60 (2015)

    MathSciNet  MATH  Google Scholar 

  43. Yamashita, H., Yabe, H., Harada, K.: A primal-dual interior point method for nonlinear semidefinite programming. Math. Program. 135, 89–121 (2012)

    Article  MathSciNet  MATH  Google Scholar 

Download references

Acknowledgements

We would like to thank two anonymous reviewers and the associated editor for their valuable comments and suggestions.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Takayuki Okuno.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The work was supported by JSPS KAKENHI Grant Number [15K15943, 20K19748, 20H04145].

Appendix

Appendix

1.1 A.1 Proof of Theorem 2.1

Firstly, note that \(F(x^{*})\bullet V=0,\ F(x^{*})\in S^m_+\) and \(V\in S^m_+\) hold if and only if \(F(x^{*})\circ V=O,\ F(x^{*})\in S^m_+\), and \(V\in S^m_+\).

To show the theorem, we utilize [21, Theorem 2.4] related to the KKT conditions for semi-infinite programs with infinitely many conic constraints. To apply this theorem, we let \(\mathcal {W}:=S^m\times \mathcal {R}\), \(C:=S^m_{+}\times \mathcal {R}_+\), and \(H(x,\tau ):=(F(x),g(x,\tau ))\in \mathcal {W}\). Furthermore, we define the inner product \(\langle \cdot ,\cdot \rangle \) by \(\langle (Y_1,y_1),(Y_2,y_2)\rangle := Y_1\bullet Y_2+y_1y_2\) for any \((Y_1,y_1),(Y_2,y_2)\in \mathcal {W}\). Then, SISDP (1.1) is rewritten as the following semi-infinite program with infinitely many conic constrains:

$$\begin{aligned} \min \ f(x)\ \text{ s.t. } H(x,\tau )\in C\ \ (\tau \in T). \end{aligned}$$
(A.1)

In fact, the SCQ for the SISDP implies the RCQ [21, Definition 2.2], that is, for a given feasible point x, there exists \(d\in \mathcal {R}^n\) such that \(H(x,\tau )+\nabla _xH(x,\tau )^{\top }d\in \mathrm{int}\,C\), which is rewritten as \(F(x+d)\in S^m_{++}\) and \(g(x,\tau )+\nabla _xg(x,\tau )^{\top }d<0\ (\tau \in T)\). This can be ensured by letting \(d:=\bar{x}-x\) with a Slater point \(\bar{x}\) and using the fact that \(F(\bar{x})\in S^m_{++}\) and \(g(x,\tau )+\nabla _xg(x,\tau )^{\top }d\le g(\bar{x},\tau )<0\ (\tau \in T)\) from the convexity of \(g(\cdot ,\tau )\) for any \(\tau \in T\). Hence, by applying [21, Theorem 2.4] to problem (A.1), under the presence of the SCQ, there exist \(p(\le n)\) Lagrange multipliers \(Z_j:=(V_j,y_j)\in \mathcal {W} \ (1\le j\le p)\) and \(\tau _1,\tau _2,\ldots ,\tau _p\in T\) such that

$$\begin{aligned}&\nabla f(x^{*})-\sum _{j=1}^p\nabla _xH(x^{*},\tau _j)^{*}Z_j=0,\end{aligned}$$
(A.2)
$$\begin{aligned}&\langle H(x^{*},\tau _j),Z_j\rangle =0,\ H(x^{*},\tau _j)\in C,\ Z_j\in C^{*},\ (j=1,2,\ldots ,p) \end{aligned}$$
(A.3)

where \(C^{*}\) stands for the dual cone of C and

$$\begin{aligned} \nabla _xH(x^{*},\tau _i)^{*}Z_j= \begin{pmatrix} &{}(F_i\bullet V_j)_{i=1}^n\\ &{}\nabla _xg(x,\tau _j)y_j \end{pmatrix}. \end{aligned}$$

Then, by letting \(V:=\sum _{j=1}^pV_j\in S^m_{+}\), we have

$$\begin{aligned} \sum _{j=1}^p\nabla _xH(x^{*},\tau _i)^{*}Z_j =\begin{pmatrix} &{}\sum _{j=1}^p(F_i\bullet V_j)_{i=1}^n\\ &{}\sum _{j=1}^p\nabla _xg(x,\tau _j)y_j \end{pmatrix} =\begin{pmatrix} &{}(F_i\bullet V)_{i=1}^n\\ &{}\sum _{j=1}^p\nabla _xg(x,\tau _j)y_j \end{pmatrix}. \end{aligned}$$

By noting this fact together with \(C^{*}=C\), the conditions (A.2)-(A.3) yields

$$\begin{aligned}&\nabla f(x^{*})+\sum _{j=1}^p\nabla _xg(x^{*},\tau _j)y_j-{\left( F_i\bullet V\right) _{i=1}^n}=0, \end{aligned}$$
(A.4)
$$\begin{aligned}&{F}(x^{*})\bullet V=0,\ F(x^{*})\in S^m_+,\ V\in S^m_+, \end{aligned}$$
(A.5)
$$\begin{aligned}&\sum _{j=1}^pg(x^{*},\tau _j)y_j=0,\ g(x^{*},\tau _j)\le 0,\ y_j\ge 0\ (j=1,2,\ldots ,p). \end{aligned}$$
(A.6)

Finally, define \(y\in \mathcal {M}_+(T)\) as follows: For any \(\tau \in T\), \(y(\tau ):=y_j\) if \(\tau =\tau _j\) with \(1\le j\le p\) and, otherwise, \(y(\tau ):=0\). Then, by \(\sum _{j=1}^pg(x^{*},\tau _j)y_j=\int _{T}g(x^{*},\tau )dy(\tau )\) and \(\sum _{j=1}^p\nabla _xg(x^{*},\tau _j)y_j=\int _T\nabla _xg(x^{*},\tau )dy(\tau )\), conditions (A.4)-(A.6) are rewritten as the desired KKT conditions (2.1)-(2.3). By definition, \(|\mathrm{supp}(y)|\le p\le n\) holds.

The last assertion can be shown in a manner similar to the the one in the standard convex optimization theory. The proof is complete.

1.2 Proof of Proposition 3.1

Since \(x^{*}\) is \(\delta \)-nondegenerate and feasible for the SISDP, \(\max _{\tau \in T}g(x^{*},\tau )\le 0\) and the number of global maximizers of \(\max _{\tau \in T}g(x^{*},\tau )\) is finite. Denote the set of these global maximizers by \(\mathcal {S}\). As \(w^{*}\) is a KKT point of the SISDP, the KKT conditions (2.1)-(2.3) hold. In particular, by the complementarity condition (2.3), for each \(A\in \mathcal {B}\), we have the implication that \(y^{*}(A)>0\Rightarrow g(x^{*},\tau )=0\ (\tau \in A)\). Hereafter, suppose \(\mathrm{supp}(y^{*})\ne \emptyset \). Then, by definition, for arbitrarily chosen \(s\in \mathrm{supp}(y^{*})\), \(y^{*}(N_{s})>0\) holds for any open neighborhood \(N_s\) of s, which together with the above implication implies that \(g(x^{*},\tau )=0\ (\tau \in N_{s})\). Then, notice that \(\max _{\tau \in T}g(x^{*},\tau )\le 0\) since \(x^{*}\) is feasible to the SISDP. Thus, \(s\in N_s\subseteq \mathcal {S}\) holds for any \(s\in \mathrm{supp}(y^{*})\). This means \(\mathrm{supp}(y^{*})\subseteq \mathcal {S}\), which together with the fact that \(\mathcal {S}\) is a finite discrete set yields that \(\mathrm{supp}(y^{*})\) must be a finite discrete set. From this fact, we conclude that \(y^{*}\) is a discrete measure with finite support. Then, (3.3) readily follows from \(\mathrm{supp}(y^{*})\subseteq \mathcal {S}\subseteq \{\tau _{x^{*}}^1(x^{*}),\tau _{x^{*}}^2(x^{*}),\ldots ,\tau _{x^{*}}^{p(x^{*})}(x^{*})\}\).

Lastly, we show that \(z^{*}\) is a KKT point of the NSDP. Noting (3.3) and the fact that \(y^{*}\) is a discrete measure, we have \(\int _T\nabla _xg(x^{*},\tau )dy^{*}(\tau )=\sum _{i=1}^{p(x^{*})}\nabla _xg(x^{*},\tau _{x^{*}}^i(x^{*}))\zeta ^{*}_i\) and \(\int _Tg(x^{*},\tau )d\zeta ^{*}(\tau )=\sum _{i=1}^{p(x^{*})}g(x^{*},\tau _{x^{*}}^i(x^{*}))\zeta ^{*}_i\). Using this fact together with the KKT conditions (2.1)–(2.3) with \(w^{*}\) replaced by \(\bar{w}\), we obtain

$$\begin{aligned}&\nabla f(x^{*})+\sum _{i=1}^{p(x^{*})}\nabla _xg(x^{*},\tau _{x^{*}}^i(x^{*}))\zeta ^{*}_i-(F_j\bullet V_{*})_{j=1}^n=0,\\&{F}(x^{*})\circ V_{*}=O,\ F(x^{*})\in S^m_+,\ V_{*}\in S^m_+,\\&\sum _{i=1}^{p(x^{*})}g(x^{*},\tau _{x^{*}}^i(x^{*})\zeta ^{*}_i=0,\ g(x^{*},\tau _{x^{*}}^i(x^{*}))\le 0,\ \zeta ^{*}_i\ge 0\ (i=1,2,\ldots ,p(x^{*})). \end{aligned}$$

This means that \(z^{*}\) is a KKT point of the NSDP. \(\square \)

1.3 A.2 Proof of Proposition 3.3

We begin with giving some lemmas that help to show Proposition 3.3.

Lemma A.1

Let \(X\in S^m_+\), \(Y\in S^m\) and \(\mu \ge 0\). Then,

  1. 1.

    \(\Vert XY-YX\Vert _F\le 2\Vert X\circ Y-\mu I\Vert _F\) and

  2. 2.

    \(\Vert \mathcal {L}_X\mathcal {L}_Y-\mathcal {L}_Y\mathcal {L}_X\Vert _2\le \Vert X\circ Y-\mu I\Vert _F.\)

Proof

Using some orthogonal matrix \(\mathcal {O}\in \mathcal {R}^{m\times m}\), we make an eigenvalue decomposition of X: \(\mathcal {O}^{\top }X\mathcal {O}=D\) with \(D\in \mathcal {R}^{m\times m}\) being a diagonal matrix. Denote the i-th diagonal entry of D by \(d_i\ge 0\) for \(i=1,2,\ldots ,m\). Let \({\tilde{Y}}:=\mathcal {O}^{\top }Y\mathcal {O}\) with the (ij)-th entry \({\tilde{y}}_{ij}\) for \(1\le i,j\le m\).

  1. 1.

    We have the desired result from

    $$\begin{aligned} \Vert XY-YX\Vert _F^2&=\Vert \mathcal {O}^{\top }X\mathcal {O}\mathcal {O}^{\top }Y\mathcal {O}-\mathcal {O}^{\top }Y\mathcal {O}\mathcal {O}^{\top }X\mathcal {O}\Vert _F^2\\&=\Vert D{\tilde{Y}}-{\tilde{Y}}D\Vert _F^2\\&=\sum _{1\le i,j\le m}(d_{i}-d_{j})^2{\tilde{y}}_{ij}^2\\&\le \sum _{1\le i\ne j\le m}(d_{i}+d_{j})^2{\tilde{y}}_{ij}^2\\&\le \sum _{1\le i\ne j\le m}(d_{i}+d_{j})^2{\tilde{y}}_{ij}^2+\sum _{i=1}^m(2d_{i}{\tilde{y}}_{ii}-2\mu )^2\\&=\Vert D{\tilde{Y}}+{\tilde{Y}}D-2\mu I\Vert _F^2\\&=\Vert XY+YX-2\mu I\Vert _F^2\\&=4\Vert X\circ Y-\mu I\Vert _F^2, \end{aligned}$$

    where the first inequality follows from \(d_{i}\ge 0\) for \(i=1,2,\ldots ,m\).

  2. 2.

    By direct calculation, we have

    $$\begin{aligned} \Vert \mathcal {L}_X\mathcal {L}_Y-\mathcal {L}_Y\mathcal {L}_X\Vert _2&=\max _{\Vert Z\Vert _F=1}\Vert \mathcal {L}_X\mathcal {L}_YZ-\mathcal {L}_Y\mathcal {L}_XZ\Vert _F\\&=\max _{\Vert Z\Vert _F=1}\frac{\Vert (XY-YX)Z-Z(XY-YX)\Vert _F}{4}\\&\le \frac{\Vert XY-YX\Vert _F}{2}\\&\le \Vert X\circ Y-\mu I\Vert _F, \end{aligned}$$

    where the second inequality follows from item 1.

\(\square \)

Lemma A.2

Let \((X_{*},Y_{*})\in S^m_+\times S^m_+\) satisfy the strict complementarity condition that \(X_{*}{\circ }Y_{*}=O\) and \(X_{*}+Y_{*}\in S^m_{++}\). Let \(\{\mu _r\}\subseteq \mathcal {R}_{++}\) and \(\{(X_r,Y_r)\}\subseteq S^m_{++}\times S^m_{++}\) be sequences such that \(\lim _{r\rightarrow \infty }\mu _r=0\) and \(\lim _{r\rightarrow \infty }(X_r,Y_r)=(X_{*},Y_{*})\). Let spectral decompositions of \(X_{*}\) and \(Y_{*}\) be

$$\begin{aligned} \mathcal {O}_{*}^{\top }X_{*}\mathcal {O}_{*}=\begin{pmatrix} D_{X_{*}}&{}O\\ O&{}O \end{pmatrix},\ \mathcal {O}_{*}^{\top }Y_{*}\mathcal {O}_{*} = \begin{pmatrix} O&{}O\\ O&{}D_{Y_{*}} \end{pmatrix} \end{aligned}$$

using some orthogonal matrix \(\mathcal {O}_{*}\in \mathcal {R}^{m\times m}\) and positive diagonal matrices \(D_{X_{*}}\in S^p_{++}\) and \(D_{Y_{*}}\in S^q_{++}\) with \(p+q=m\). Furthermore, suppose \(p,q>0\) and choose a sequence of orthogonal matrices \(\{\mathcal {O}_{r}\}\subseteq \mathcal {R}^{m\times m}\) such that

$$\begin{aligned} \mathcal {O}_r^{\top }X_r\mathcal {O}_r=\begin{pmatrix} D_{X_{r}}&{}O\\ O&{}E_{X_{r}} \end{pmatrix},\ \lim _{r\rightarrow \infty }\mathcal {O}_r=\mathcal {O}_{*} \end{aligned}$$

with \(D_{X_r}\in \mathcal {R}^{p\times p}\) and \(E_{X_r}\in \mathcal {R}^{q\times q}\) being positive diagonal matrices for \(r\ge 1\). (Notice that \(\lim _{r\rightarrow \infty }E_{X_r}\)=O.) If \(\Vert X_r\circ Y_r-\mu _r I\Vert =o(\mu _r)\), then

$$\begin{aligned} \lim _{r\rightarrow \infty }\frac{1}{\mu _r}E_{X_r}=D_{Y_{*}}^{-1}. \end{aligned}$$
(A.7)

Proof

Let \({\tilde{Y}}_r:=\mathcal {O}_r^{\top }Y_r\mathcal {O}_r\) and \({\tilde{y}}^r_{ii}\) and \(e^r_i\) be the i-th diagonal entry of \({\tilde{Y}}_r\) and \(E_{X_r}\), respectively for any \(i=p+1,p+2,\ldots ,m\). Since \(\Vert X_r\circ Y_r-\mu _rI\Vert _F=o(\mu _r)\) and

$$\begin{aligned} \Vert X_r\circ Y_r-\mu _{r}I\Vert _F&=\left\| \begin{pmatrix} D_{X_{r}}&{}O\\ O&{}E_{X_{r}} \end{pmatrix}\circ {\tilde{Y}}_r-\mu _r I\right\| _F \\&\ge \sqrt{ \sum _{i=p+1}^{m}(e^r_{i}{\tilde{y}}_{ii}^r-\mu _r)^2}, \end{aligned}$$

we have

$$\begin{aligned} 0=\lim _{r\rightarrow \infty }\frac{\sqrt{\sum _{i=p+1}^{m}\left( e^r_i{\tilde{y}}^r_{ii}-\mu _r\right) ^2}}{\mu _r} =\lim _{r\rightarrow \infty }\sqrt{\sum _{i=p+1}^{m}\left( \frac{e^r_i}{\mu _r}{\tilde{y}}^r_{ii}-1\right) ^2}, \end{aligned}$$

which yields \(\lim _{r\rightarrow \infty }\frac{e^r_i}{\mu _r}{\tilde{y}}^r_{ii}=1\) for any \(i=p+1,\ldots ,m\). Notice that, for \(i\ge p+1\), \(\{{\tilde{y}}^r_{ii}\}\) converges to the i-th positive diagonal entry of \(D_{Y_{*}}\). In view of these facts, we obtain (A.7). \(\square \)

Now, we are ready to show Proposition 3.3. For the case where \(X_{*}\in S^m_{++}\), it is easy to prove the desired result. So, we consider the case of \(X_{*}\in S^m_+\setminus S^m_{++}\). Let \(\lambda _r>0\) be the smallest eigenvalue of \(X_r\). Notice that \(\lambda _r\rightarrow 0\ (r\rightarrow \infty )\) and, by Lemma A.2, \(\lim _{r\rightarrow \infty }\frac{\lambda _r}{\mu _r}\) exists and is positive. Thus, we also have

$$\begin{aligned} \lim _{r\rightarrow \infty }\frac{\mu _r}{\lambda _r}>0. \end{aligned}$$
(A.8)

Note that, for any \(X\in S^m\) having m eigenvalues \(\alpha _1\le \alpha _2\le \cdots \le \alpha _m\), the corresponding symmetric linear operator \(\mathcal {L}_X\) has \(m(m+1)/2\) eigenvalues \( \alpha _1,\alpha _2,\ldots ,\alpha _m,\{(\alpha _i+\alpha _j)/2\}_{i\ne j}. \) This fact yields that the maximum eigenvalue of the operator \(\mathcal {L}_{X_r}^{-1}\) is \(\lambda _r^{-1}\). Therefore, we have \(\Vert \mathcal {L}_{X_r}^{-1}\Vert _2=\lambda _r^{-1}\) for any \(r\ge 0\). It then follows that

$$\begin{aligned} \Vert \mathcal {L}_{X_r}\mathcal {L}_{Y_r}\mathcal {L}_{X_r}^{-1}-\mathcal {L}_{Y_r} \Vert _2&\le \Vert \mathcal {L}_{Y_r}\mathcal {L}_{X_r}-\mathcal {L}_{X_r}\mathcal {L}_{Y_r} \Vert _2\Vert \mathcal {L}_{X_r}^{-1}\Vert _2\\&\le \mu _r\Vert \mathcal {L}_{X_r}^{-1}\Vert _2 \frac{\Vert X_r\circ Y_r-\mu _rI \Vert _F}{\mu _r} \\&=\frac{\mu _r}{\lambda _r}\frac{\Vert X_r\circ Y_r-\mu _rI \Vert _F}{\mu _r}, \end{aligned}$$

where the second inequality follows from Lemma A.1. This relation together with (A.8) and \(\Vert X_r\circ Y_r-\mu _rI\Vert _F=O(\mu _r^{1+{\theta }})\) implies \( \Vert \mathcal {L}_{X_r}\mathcal {L}_{Y_r}\mathcal {L}_{X_r}^{-1}-\mathcal {L}_{Y_r} \Vert _2=O(\mu _r^{\theta }).\)

\(\square \)

1.4 A.4 Proof of Proposition 3.4

Define \(U_r(s):=\left( X_r+s\varDelta X_r\right) \circ \left( Y_r+s\varDelta Y_r\right) \) for \(s\in [0,1]\) and each r. By using the fact that \(\Vert X\Vert _F\ge |\lambda _{\min }(X)|\) for any \(X\in S^m\), conditions (3.25)–(3.27) yield that there exists some \(K>0\) such that

$$\begin{aligned}&\lambda _{\min }\left( \varDelta X_r\circ \varDelta Y_r\right) \ge -K\mu _r^2, \end{aligned}$$
(A.9)
$$\begin{aligned}&\lambda _{\min }\left( X_r\circ Y_r\right) \ge \mu _r-K\mu _r^{1+\theta }, \end{aligned}$$
(A.10)
$$\begin{aligned}&\lambda _{\min }\left( Z_r-\hat{\mu }_rI\right) \ge -K\hat{\mu }_r^{1+\widehat{\theta }}. \end{aligned}$$
(A.11)

Then, it holds that

$$\begin{aligned} \lambda _{\min }(U_r(s))&=\lambda _{\min }\left( X_r\circ Y_r+s X_r\circ \varDelta Y_r+s Y_r\circ \varDelta X_r+s^2\varDelta X_r\circ \varDelta Y_r \right) \\&=\lambda _{\min }\left( (1-s)X_r\circ Y_r+s(Z_r-\hat{\mu }_r I)+s\hat{\mu }_r I+s^2\varDelta X_r\circ \varDelta Y_r\right) \\&\ge (1-s)\lambda _{\min }\left( X_r\circ Y_r\right) +s\lambda _{\min }(Z_r-\hat{\mu }_r I)\\&+s\lambda _{\min }(\hat{\mu }_r I)+s^2\lambda _{\min }\left( \varDelta X_r\circ \varDelta Y_r\right) \\&\ge (1-s)\left( \mu _r-K\mu _r^{1+\theta }\right) -sK\hat{\mu }_r^{1+\widehat{\theta }}+s\hat{\mu }_r-s^2 K\mu _r^2\\&=:u_r(s) \end{aligned}$$

for any r sufficiently large and \(s\in [0,1]\), where the first inequality follows from the fact that \(\lambda _{\min }(A+B)\ge \lambda _{\min }(A)+\lambda _{\min }(B)\) for \(A, B\in S^m\) and the second inequality is due to (A.9)–(A.11) and \(s\in [0,1]\). Notice that \(u_r(s)\) is concave and quadratic. Then, for any r sufficiently large, we have \(u_r(s)>0\ (s\in [0,1])\) since \(0<\theta ,\widehat{\theta }<1\), \(\lim _{r\rightarrow \infty }(\mu _r,\hat{\mu }_r)=(0,0)\), and (3.28) imply that \(u_r(0)=\mu _r-K\mu _r^{1+\theta }>0\) and \(u_r(1)=\hat{\mu }_r-K\hat{\mu }_r^{1+\widehat{\theta }}-K\mu _r^2>0\) for sufficiently large r. This means that \(\lambda _{\min }(U_r(s))\ge u_r(s)>0\ (s\in [0,1])\) and therefore

$$\begin{aligned} U_r(s)\in S^m_{++}\ (s\in [0,1]), \end{aligned}$$
(A.12)

from which we can derive \(X_r+\varDelta X_r\in S^m_{++}\) and \(Y_r+\varDelta Y_r\in S^m_{++}\). Actually, for contradiction, suppose that either one of these two conditions is not true. We can assume \(X_r+\varDelta X_r\notin S^m_{++}\) without loss of generality. Recall that \(X_r\in S^m_{++}\). Then, there exists some \(\bar{s}\in (0,1]\) such that \(X_r+\bar{s}\varDelta X_r\in S^m_{+}\setminus S^m_{++}\). Therefore, we can find some nonzero vector \(d\in \mathcal {R}^n\) such that \((X_r+\bar{s}\varDelta X_r)d=0\). From this fact, we readily have

$$\begin{aligned} d^{\top }U_r(\bar{s})d&=\frac{ d^{\top }(X_r+\bar{s}\varDelta X_r)(Y_r+\bar{s}\varDelta Y_r)d+ d^{\top }(Y_r+\bar{s}\varDelta Y_r)(X_r+\bar{s}\varDelta X_r)d }{2}\\&=0, \end{aligned}$$

which contradicts (A.12). Hence, we conclude that \(X_r+\varDelta X_r\in S^m_{++}\) and \(Y_r+\varDelta Y_r\in S^m_{++}\) for all r sufficiently large. The proof is complete. \(\square \)

1.5 A.5 Proof of Lemma 3.1

To begin with, notice that Assumptions B-, C-, and C- yield that for sufficiently large k,

$$\begin{aligned} 0<\zeta ^k_i=\varTheta (1)\ (i\in I_a(x^{*})). \end{aligned}$$
(A.13)

Furthermore, by (3.10), we have

$$\begin{aligned} \zeta ^k_i=0\ (i\in \{1,2,\ldots ,p(x^{*})\}\setminus I_a(x^{*})), \end{aligned}$$
(A.14)

which together with \(\zeta _i^{*}=0\ (i\in \{1,2,\ldots ,p(x^{*})\}\setminus I_a(x^{*}))\) implies \(\Vert \tilde{z}^k-\tilde{z}^{*}\Vert =\Vert z^k-z^{*}\Vert \) for all k sufficiently large. Next, by (A.14), \(z^k\in \widehat{\mathcal {N}}_{\mu _{k-1}}^{\varepsilon _{k-1}}\), and \(\varepsilon _{k-1}=\gamma _1\mu _{k-1}^{1+\alpha }\), it follows that

$$\begin{aligned}&\left\| \nabla f(x^k)+\sum _{i\in I_a(x^{*})}\nabla \hat{g}_i(x^k)\zeta _i^k-(F_j\bullet V_k)_{j=1}^n\right\| =o(\mu _{k-1}),\nonumber \\&\Vert F(x^k)\circ V_k\Vert _F=\varTheta (\mu _{k-1}), \end{aligned}$$
(A.15)
$$\begin{aligned}&\left| \sum _{i\in I_a(x^{*})}\zeta _i^k\hat{g}_i(x^k)\right| =o(\mu _{k-1}),\ \max _{1\le i\le p(x^{*})}(\hat{g}_i(x^k))_+=o(\mu _{k-1}). \end{aligned}$$
(A.16)

Noting \(a=2(a)_+-|a|\) for \(a\in \mathcal {R}\), we have

$$\begin{aligned} \left| \sum _{i\in I_a(x^{*})}\zeta _i^k\hat{g}_i(x^k)\right| = \left| 2\sum _{i\in I_a(x^{*})}\zeta _i^k(\hat{g}_i(x^k))_+-\sum _{i\in I_a(x^{*})}\zeta _i^k\left| \hat{g}_i(x^k)\right| \right| , \end{aligned}$$

which together with (A.13) and (A.16) yields \(|\hat{g}_i(x^k)|=o(\mu _{k-1})\ (i\in I_a(x^{*}))\). From this fact and (A.15), we obtain \(\Vert \varPhi _0(\tilde{z}^k)\Vert =\varTheta (\mu _{k-1})\) and thus \(\mu _{k-1}=\varTheta (\Vert \varPhi _{0}(\tilde{z}^k)\Vert )\).

We next prove \(\mu _{k-1}=\varTheta (\Vert z^k-z^{*}\Vert )\) by showing \(\Vert \varPhi _{0}({\tilde{z}}^k)\Vert =\varTheta (\Vert \tilde{z}^k-\tilde{z}^{*}\Vert )\). It suffices to show that the sequence \(\{\eta _k\}:=\{{\Vert \varPhi _{0}(\tilde{z}^k)\Vert }/{\Vert \tilde{z}^k-\tilde{z}^{*}\Vert }\}\) is bounded above and away from zero. Note that by \(\varPhi _{0}(\tilde{z}^{*})=0\) and Assumption B-,

$$\begin{aligned} \eta _k=\frac{\Vert \varPhi _{0}(\tilde{z}^k)-\varPhi _{0}(\tilde{z}^{*})\Vert }{{\Vert \tilde{z}^k-\tilde{z}^{*}\Vert }}= \left\| \mathcal {J}\varPhi _{0}(\tilde{z}^{*})\frac{\tilde{z}^k-\tilde{z}^{*}}{\Vert \tilde{z}^k-\tilde{z}^{*}\Vert }+\frac{O(\Vert \tilde{z}^k-\tilde{z}^{*}\Vert ^2)}{\Vert \tilde{z}^k-\tilde{z}^{*}\Vert }\right\| . \end{aligned}$$

Obviously, \(\{\eta _k\}\) is bounded from above. To show \(\{\eta _k\}\) is bounded away from zero, suppose to the contrary. Then, without loss of generality, we can assume that \(\lim _{k\rightarrow \infty }\eta _k=0\), and hence there exists some \(d^{*}\) with \(\Vert d^{*}\Vert =1\) such that \(\lim _{k\rightarrow \infty }\frac{\tilde{z}^k-\tilde{z}^{*}}{\Vert \tilde{z}^k-\tilde{z}^{*}\Vert }=d^{*}\) and \(\mathcal {J}\varPhi _0(\tilde{z}^{*})d^{*}=0\). However, this contradicts the nonsingularity of \(\mathcal {J}\varPhi _{0}(\tilde{z}^{*})\) from Assumption B-. We have the desired conclusion. \(\square \)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Okuno, T., Fukushima, M. Primal-dual path following method for nonlinear semi-infinite programs with semi-definite constraints. Math. Program. 199, 251–303 (2023). https://doi.org/10.1007/s10107-022-01827-2

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10107-022-01827-2

Keywords

Mathematics Subject Classification

Navigation