Primal-dual path following method for nonlinear semi-infinite programs with semi-definite constraints

Takayuki Okuno^1,2 &
Masao Fukushima³

371 Accesses
2 Citations
Explore all metrics

Abstract

In this paper, we propose a primal-dual path following method for nonlinear semi-infinite semi-definite programs with infinitely many convex inequality constraints, called SISDP for short. A straightforward approach to the SISDP is to use classical methods for semi-infinite programs such as discretization and exchange methods and solve a sequence of (nonlinear) semi-definite programs (SDPs). However, it is often too demanding to find exact solutions of SDPs. In contrast, our approach does not rely on solving SDPs accurately but approximately following a path leading to a solution, which is formed on the intersection of the semi-infinite feasible region and the interior of the semi-definite feasible region. Specifically, we first present a prototype path-following method and show its global weak* convergence to a Karush-Kuhn-Tucker point of the SISDP under some mild assumptions. Next, to achieve fast local convergence, we integrate a two-step sequential quadratic programming method equipped with the Monteiro-Zhang scaling technique into the prototype method. We prove two-step superlinear convergence of the resulting algorithm using Alizadeh-Hareberly-Overton-like, Nesterov-Todd, and Helmberg-Rendle-Vanderbei-Wolkowicz/Kojima-Shindoh-Hara/Monteiro scaling directions. Finally, we conduct some numerical experiments to demonstrate the efficiency of the proposed method through comparison with a discretization method that solves SDPs obtained by finite relaxation of the SISDP.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Self-concordant inclusions: a unified framework for path-following generalized Newton-type algorithms

Article 30 March 2018

An interior-point algorithm for semidefinite least-squares problems

Article 20 October 2021

A feasible proximal bundle algorithm with convexification for nonsmooth, nonconvex semi-infinite programming

Article 22 September 2021

Notes

Actually, Jiang et al. [12] solved linear SDPs with finitely many inequality constraints, which are obtained by discretizing the SISDP.
Here, the metric topology in $\mathcal {R}^n\times S^m\times \mathcal {R}^M\times T^M$ is the one which is naturally induced from the norm topology in $\mathcal {R}^n\times S^m\times \mathcal {R}^m$ and the metric topology in $T^M$.
For each k, by definition, $\tau _i^k\ne \tau _j^k$ if $i\ne j$. But, at the limit, $\tau _i^{*}= \tau _j^{*}$ possibly occurs, since two distinct sequences may converge to the identical point.
We say that $z^{*}$ is a KKT point of NSDP (3.1) with $\bar{x}=x^{*}$ if $\nabla f(x^{*})+\sum _{i=1}^{p(x^{*})} \nabla \hat{g}_i(x^{*})\zeta _i^{*} -(F_j\bullet V_{*})_{j=1}^n=0$, $0\le \zeta ^{*}\perp \hat{g}(x^{*})\le 0$, and condition (2.2) holds with $V=V_{*}$.
For $X\in S^{m}_{++},Y\in S^m$, the eigenvalues of $X^{-1}Y$ are all real numbers.
If $x^k$ and $x^{k+\frac{1}{2}}$ are sufficiently close to $\delta $-nondegenerate $x^{*}$ as supposed in Algorithm 2, such j exists for each i.
There is no theoretical guarantee for global optimality of $\tau $ thus found. In practice, however, we may expect to have a global optimum by setting N large enough.

References

Alizadeh, F., Goldfarb, D.: Second-order cone programming. Math. Program. 95, 3–51 (2003)
Article MathSciNet MATH Google Scholar
Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer-Verlag, New York (2000)
Book MATH Google Scholar
Bertsekas, D.P., Nedic, A., Ozdaglar, A.E.: Convex Analysis and Optimization. Athena Scientific, Cambridge (2003)
MATH Google Scholar
Correa, R., Ramirez, C.H.: A global algorithm for nonlinear semidefinite programming. SIAM J. Optim. 15, 303–318 (2004)
Article MathSciNet MATH Google Scholar
Fazel, M., Hindi, H., and Boyd, S. P.: Log-det heuristic for matrix rank minimization with applications to Hankel and Euclidean distance matrices, In: American Control Conference, 2003. Proceedings of the 2003, 3, 2156-2162. IEEE (2003)
Floudas, C.A., Stein, O.: The adaptive convexification algorithm: a feasible point method for semi-infinite programming. SIAM J. Optim. 18, 1187–1208 (2007)
Article MathSciNet MATH Google Scholar
Freund, R.W., Jarre, F., Vogelbusch, C.H.: Nonlinear semidefinite programming: Sensitivity, convergence, and an application in passive reduced-order modeling. Math. Program. 109, 581–611 (2007)
Article MathSciNet MATH Google Scholar
Goberna, M.A., López, M.A.: Semi-Infinite Programming: Recent Advances. Kluwer Academic Publishers, Dordrecht (2001)
Book MATH Google Scholar
Gramlich, G., Hettich, R., Sachs, E.W.: Local convergence of SQP methods in semi-infinite programming. SIAM J. Optim. 5, 641–658 (1995)
Article MathSciNet MATH Google Scholar
Hayashi, S. and S.-Y. Wu, S.-Y.: An explicit exchange algorithm for linear semi-infinite programming problems with second-order cone constraint, SIAM J. Optim. 20, 1527–1546 (2009)
Hettich, R., Kortanek, K.O.: Semi-infinite programming: Theory, methods, and applications. SIAM Rev. 35, 380–429 (1993)
Article MathSciNet MATH Google Scholar
Jiang, A. and Kwan, H. K.: Minimax design of IIR digital filters using SDP relaxation technique, IEEE Trans. Circuits Syst. I, Reg. Papers, 57, 378–390 (2009)
Konno, H., Kawadai, N., Tuy, H.: Cutting plane algorithms for nonlinear semi-definite programming problems with applications. J. Global Optim. 25, 141–155 (2003)
Article MathSciNet MATH Google Scholar
Leibfritz, F., Maruhn, J.H.: A successive SDP-NSDP approach to a robust optimization problem in finance. Comput. Optim. and Appl. 44, 443–466 (2009)
Article MathSciNet MATH Google Scholar
Li, D., Qi, L., Tam, J., Wu, S.-Y.: A smoothing Newton method for semi-infinite programming. J. Global Optim. 30, 169–194 (2004)
Article MathSciNet MATH Google Scholar
Li, S., Teo, K.L., Yang, X., Wu, S.-Y.: Robust envelope-constrained filter with orthonormal bases and semi-definite and semi-infinite programming. Optim. Eng. 8, 299–319 (2007)
Article MathSciNet MATH Google Scholar
Li, S., Wu, S.-Y., Yang, X., Teo, K.-L.: A relaxed cutting plane method for semi-infinite semi-definite programming. Comput. Optim. and Appl. 196, 459–473 (2006)
Article MathSciNet MATH Google Scholar
Li, S., Yang, X., Teo, K.L., Wu, S.-Y.: A solution method for combined semi-infinite and semi-definite programming. ANZIAM J. 45, 477–494 (2004)
Article MathSciNet MATH Google Scholar
Okuno, T., Fukushima, M.: Local reduction based SQP-type method for semi-infinite programs with an infinite number of second-order cone constraints. J. Global Optim. 60, 25–48 (2014)
Article MathSciNet MATH Google Scholar
Okuno, T., Fukushima, M.: An interior point sequential quadratic programming-type method for log-determinant semi-infinite programs. J. Comput. Appl. Math. 376, 112784 (2020)
Article MathSciNet MATH Google Scholar
Okuno, T., Hayashi, S., Fukushima, M.: A regularized explicit exchange method for semi-infinite programs with an infinite number of conic constraints. SIAM J. Optim. 22, 1009–1028 (2012)
Article MathSciNet MATH Google Scholar
Pereira, A., Costa, M., Fernandes, E.: Interior point filter method for semi-infinite programming problems. Optimization 60, 1309–1338 (2011)
Article MathSciNet MATH Google Scholar
Pereira, A., Fernandes, E.: A reduction method for semi-infinite programming by means of a global stochastic approach. Optimization 58, 713–726 (2009)
Article MathSciNet MATH Google Scholar
Qi, L., Wu, S.-Y., Zhou, G.: Semismooth Newton methods for solving semi-infinite programming problems. J. Global Optim. 27, 215–232 (2003)
Article MathSciNet MATH Google Scholar
Reemtsen, R.: Discretization methods for the solution of semi-infinite programming problems. J. Optim. Theory Appl. 71, 85–103 (1991)
Article MathSciNet MATH Google Scholar
Reemtsen, R. and R$\ddot{\text{u}}$ckmann, J.: Semi-infinite Programming, Kluwer Academic Publishers, Boston (1998)
Shapiro, A.: First and second order analysis of nonlinear semidefinite programs. Math. Program. 77, 301–320 (1997)
Article MathSciNet MATH Google Scholar
Shiu, T., Wu, S.-Y.: Relaxed cutting plane method with convexification for solving nonlinear semi-infinite programming problems. Comput. Optim. and Appl. 53, 1–23 (2012)
Article MathSciNet MATH Google Scholar
Stein, O., Steuermann, P.: The adaptive convexification algorithm for semi-infinite programming with arbitrary index sets. Math. Program. 18, 1–25 (2012)
MathSciNet MATH Google Scholar
Still, G.: Discretization in semi-infinite programming: the rate of convergence. Math. Program. 91, 53–69 (2001)
Article MathSciNet MATH Google Scholar
Tanaka, Y., Fukushima, M., Ibaraki, T.: A globally convergent SQP method for semi-infinite nonlinear optimization. J. Comput. Appl. Math. 23, 141–153 (1988)
Article MathSciNet MATH Google Scholar
Todd, M.J., Toh, K.C., Tütüncü, R.H.: On the Nesterov-Todd direction in semidefinite programming. SIAM J. Optim. 8, 769–796 (1998)
Article MathSciNet MATH Google Scholar
Toh, K. C., Todd, M. J., and T$\ddot{\text{ u }}$t$\ddot{\text{ u }}$nc$\ddot{\text{ u }}$, R. H., SDPT3—a MATLAB software package for semidefinite programming, version 2.1, Optim. Methods Softw. 11, 545–581 (1999)
Wang, S., Yuan, Y.: Feasible method for semi-infinite programs. SIAM J. Optim. 25, 2537–2560 (2015)
Article MathSciNet MATH Google Scholar
Wolkowicz, H., Saigal, R., Vandenberghe, L.: Handbook of Semidefinite Programming: Theory, Algorithms, and Applications. Springer Science & Business Media, Germany (2012)
MATH Google Scholar
Wei, D., Sestok, C.K., Oppenheim, A.: V: Sparse filter design under a quadratic constraint: low-complexity algorithms. IEEE Trans. Sig. Process 857–870, 61 (2012)
MATH Google Scholar
Wen, F., Chu, L., Liu, P., Qiu, R.: C: A survey on nonconvex regularization-based sparse and low-rank recovery in signal processing, statistics, and machine learning. IEEE Access 69883–69906, 6 (2018)
Google Scholar
Wu, S.-P., Boyd, S., and Vandenberghe, L.: FIR filter design via semidefinite programming and spectral factorization, In: Proc. IEEE Conf. on Decision and Control, pp. 271–276 (1996)
Wu, S.-P., Boyd, S., and Vandenberghe, L.: FIR filter design via spectral factorization and convex optimization. Appl. Comput. Control, Signals and Circuits, pp. 215–245, (1999)
Xu, M., Wu, S.-Y., Ye, J.J.: Solving semi-infinite programs by smoothing projected gradient method. Comput. Optim. and Appl. 59, 591–616 (2014)
Article MathSciNet MATH Google Scholar
Yamashita, H., Yabe, H.: Local and superlinear convergence of a primal-dual interior point method for nonlinear semidefinite programming. Math. Program. 132, 1–30 (2012)
Article MathSciNet MATH Google Scholar
Yamashita, H., Yabe, H.: A survey of numerical methods for nonlinear semidefinite programming. J. Oper. Res. Soc. Japan. 58, 24–60 (2015)
MathSciNet MATH Google Scholar
Yamashita, H., Yabe, H., Harada, K.: A primal-dual interior point method for nonlinear semidefinite programming. Math. Program. 135, 89–121 (2012)
Article MathSciNet MATH Google Scholar

Download references

Acknowledgements

We would like to thank two anonymous reviewers and the associated editor for their valuable comments and suggestions.

Author information

Authors and Affiliations

Faculty of Science and Technology Department of Science and Technology, Seikei University, 3-3-1, Kita, Kichijoji, Musashino, Tokyo, 180-8633, Japan
Takayuki Okuno
Center for Advanced Intelligence Project, RIKEN, Nihonbashi 1-chome Mitsui Building, 15th floor, 1-4-1 Nihonbashi, Chuo-ku, Tokyo, 103-0027, Japan
Takayuki Okuno
The Kyoto College of Graduate Studies for Informatics, Kyoto, 606-8225, Japan
Masao Fukushima

Authors

Takayuki Okuno
View author publications
You can also search for this author in PubMed Google Scholar
Masao Fukushima
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Takayuki Okuno.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The work was supported by JSPS KAKENHI Grant Number [15K15943, 20K19748, 20H04145].

Appendix

1.1 A.1 Proof of Theorem 2.1

Firstly, note that $F(x^{*})\bullet V=0,\ F(x^{*})\in S^m_+$ and $V\in S^m_+$ hold if and only if $F(x^{*})\circ V=O,\ F(x^{*})\in S^m_+$, and $V\in S^m_+$.

To show the theorem, we utilize [21, Theorem 2.4] related to the KKT conditions for semi-infinite programs with infinitely many conic constraints. To apply this theorem, we let $\mathcal {W}:=S^m\times \mathcal {R}$, $C:=S^m_{+}\times \mathcal {R}_+$, and $H(x,\tau ):=(F(x),g(x,\tau ))\in \mathcal {W}$. Furthermore, we define the inner product $\langle \cdot ,\cdot \rangle $ by $\langle (Y_1,y_1),(Y_2,y_2)\rangle := Y_1\bullet Y_2+y_1y_2$ for any $(Y_1,y_1),(Y_2,y_2)\in \mathcal {W}$. Then, SISDP (1.1) is rewritten as the following semi-infinite program with infinitely many conic constrains:

$$\begin{aligned} \min \ f(x)\ \text{ s.t. } H(x,\tau )\in C\ \ (\tau \in T). \end{aligned}$$

(A.1)

In fact, the SCQ for the SISDP implies the RCQ [21, Definition 2.2], that is, for a given feasible point x, there exists $d\in \mathcal {R}^n$ such that $H(x,\tau )+\nabla _xH(x,\tau )^{\top }d\in \mathrm{int}\,C$, which is rewritten as $F(x+d)\in S^m_{++}$ and $g(x,\tau )+\nabla _xg(x,\tau )^{\top }d<0\ (\tau \in T)$. This can be ensured by letting $d:=\bar{x}-x$ with a Slater point $\bar{x}$ and using the fact that $F(\bar{x})\in S^m_{++}$ and $g(x,\tau )+\nabla _xg(x,\tau )^{\top }d\le g(\bar{x},\tau )<0\ (\tau \in T)$ from the convexity of $g(\cdot ,\tau )$ for any $\tau \in T$. Hence, by applying [21, Theorem 2.4] to problem (A.1), under the presence of the SCQ, there exist $p(\le n)$ Lagrange multipliers $Z_j:=(V_j,y_j)\in \mathcal {W} \ (1\le j\le p)$ and $\tau _1,\tau _2,\ldots ,\tau _p\in T$ such that

$$\begin{aligned}&\nabla f(x^{*})-\sum _{j=1}^p\nabla _xH(x^{*},\tau _j)^{*}Z_j=0,\end{aligned}$$

(A.2)

$$\begin{aligned}&\langle H(x^{*},\tau _j),Z_j\rangle =0,\ H(x^{*},\tau _j)\in C,\ Z_j\in C^{*},\ (j=1,2,\ldots ,p) \end{aligned}$$

(A.3)

where $C^{*}$ stands for the dual cone of C and

$$\begin{aligned} \nabla _xH(x^{*},\tau _i)^{*}Z_j= \begin{pmatrix} &{}(F_i\bullet V_j)_{i=1}^n\\ &{}\nabla _xg(x,\tau _j)y_j \end{pmatrix}. \end{aligned}$$

Then, by letting $V:=\sum _{j=1}^pV_j\in S^m_{+}$, we have

$$\begin{aligned} \sum _{j=1}^p\nabla _xH(x^{*},\tau _i)^{*}Z_j =\begin{pmatrix} &{}\sum _{j=1}^p(F_i\bullet V_j)_{i=1}^n\\ &{}\sum _{j=1}^p\nabla _xg(x,\tau _j)y_j \end{pmatrix} =\begin{pmatrix} &{}(F_i\bullet V)_{i=1}^n\\ &{}\sum _{j=1}^p\nabla _xg(x,\tau _j)y_j \end{pmatrix}. \end{aligned}$$

By noting this fact together with $C^{*}=C$, the conditions (A.2)-(A.3) yields

$$\begin{aligned}&\nabla f(x^{*})+\sum _{j=1}^p\nabla _xg(x^{*},\tau _j)y_j-{\left( F_i\bullet V\right) _{i=1}^n}=0, \end{aligned}$$

(A.4)

$$\begin{aligned}&{F}(x^{*})\bullet V=0,\ F(x^{*})\in S^m_+,\ V\in S^m_+, \end{aligned}$$

(A.5)

$$\begin{aligned}&\sum _{j=1}^pg(x^{*},\tau _j)y_j=0,\ g(x^{*},\tau _j)\le 0,\ y_j\ge 0\ (j=1,2,\ldots ,p). \end{aligned}$$

(A.6)

Finally, define $y\in \mathcal {M}_+(T)$ as follows: For any $\tau \in T$, $y(\tau ):=y_j$ if $\tau =\tau _j$ with $1\le j\le p$ and, otherwise, $y(\tau ):=0$. Then, by $\sum _{j=1}^pg(x^{*},\tau _j)y_j=\int _{T}g(x^{*},\tau )dy(\tau )$ and $\sum _{j=1}^p\nabla _xg(x^{*},\tau _j)y_j=\int _T\nabla _xg(x^{*},\tau )dy(\tau )$, conditions (A.4)-(A.6) are rewritten as the desired KKT conditions (2.1)-(2.3). By definition, $|\mathrm{supp}(y)|\le p\le n$ holds.

The last assertion can be shown in a manner similar to the the one in the standard convex optimization theory. The proof is complete.

1.2 Proof of Proposition 3.1

Since $x^{*}$ is $\delta $-nondegenerate and feasible for the SISDP, $\max _{\tau \in T}g(x^{*},\tau )\le 0$ and the number of global maximizers of $\max _{\tau \in T}g(x^{*},\tau )$ is finite. Denote the set of these global maximizers by $\mathcal {S}$. As $w^{*}$ is a KKT point of the SISDP, the KKT conditions (2.1)-(2.3) hold. In particular, by the complementarity condition (2.3), for each $A\in \mathcal {B}$, we have the implication that $y^{*}(A)>0\Rightarrow g(x^{*},\tau )=0\ (\tau \in A)$. Hereafter, suppose $\mathrm{supp}(y^{*})\ne \emptyset $. Then, by definition, for arbitrarily chosen $s\in \mathrm{supp}(y^{*})$, $y^{*}(N_{s})>0$ holds for any open neighborhood $N_s$ of s, which together with the above implication implies that $g(x^{*},\tau )=0\ (\tau \in N_{s})$. Then, notice that $\max _{\tau \in T}g(x^{*},\tau )\le 0$ since $x^{*}$ is feasible to the SISDP. Thus, $s\in N_s\subseteq \mathcal {S}$ holds for any $s\in \mathrm{supp}(y^{*})$. This means $\mathrm{supp}(y^{*})\subseteq \mathcal {S}$, which together with the fact that $\mathcal {S}$ is a finite discrete set yields that $\mathrm{supp}(y^{*})$ must be a finite discrete set. From this fact, we conclude that $y^{*}$ is a discrete measure with finite support. Then, (3.3) readily follows from $\mathrm{supp}(y^{*})\subseteq \mathcal {S}\subseteq \{\tau _{x^{*}}^1(x^{*}),\tau _{x^{*}}^2(x^{*}),\ldots ,\tau _{x^{*}}^{p(x^{*})}(x^{*})\}$.

Lastly, we show that $z^{*}$ is a KKT point of the NSDP. Noting (3.3) and the fact that $y^{*}$ is a discrete measure, we have $\int _T\nabla _xg(x^{*},\tau )dy^{*}(\tau )=\sum _{i=1}^{p(x^{*})}\nabla _xg(x^{*},\tau _{x^{*}}^i(x^{*}))\zeta ^{*}_i$ and $\int _Tg(x^{*},\tau )d\zeta ^{*}(\tau )=\sum _{i=1}^{p(x^{*})}g(x^{*},\tau _{x^{*}}^i(x^{*}))\zeta ^{*}_i$. Using this fact together with the KKT conditions (2.1)–(2.3) with $w^{*}$ replaced by $\bar{w}$, we obtain

$$\begin{aligned}&\nabla f(x^{*})+\sum _{i=1}^{p(x^{*})}\nabla _xg(x^{*},\tau _{x^{*}}^i(x^{*}))\zeta ^{*}_i-(F_j\bullet V_{*})_{j=1}^n=0,\\&{F}(x^{*})\circ V_{*}=O,\ F(x^{*})\in S^m_+,\ V_{*}\in S^m_+,\\&\sum _{i=1}^{p(x^{*})}g(x^{*},\tau _{x^{*}}^i(x^{*})\zeta ^{*}_i=0,\ g(x^{*},\tau _{x^{*}}^i(x^{*}))\le 0,\ \zeta ^{*}_i\ge 0\ (i=1,2,\ldots ,p(x^{*})). \end{aligned}$$

This means that $z^{*}$ is a KKT point of the NSDP. $\square $

1.3 A.2 Proof of Proposition 3.3

We begin with giving some lemmas that help to show Proposition 3.3.

Lemma A.1

Let $X\in S^m_+$, $Y\in S^m$ and $\mu \ge 0$. Then,

1.
$\Vert XY-YX\Vert _F\le 2\Vert X\circ Y-\mu I\Vert _F$ and
2.
$\Vert \mathcal {L}_X\mathcal {L}_Y-\mathcal {L}_Y\mathcal {L}_X\Vert _2\le \Vert X\circ Y-\mu I\Vert _F.$

Proof

Using some orthogonal matrix $\mathcal {O}\in \mathcal {R}^{m\times m}$, we make an eigenvalue decomposition of X: $\mathcal {O}^{\top }X\mathcal {O}=D$ with $D\in \mathcal {R}^{m\times m}$ being a diagonal matrix. Denote the i-th diagonal entry of D by $d_i\ge 0$ for $i=1,2,\ldots ,m$. Let ${\tilde{Y}}:=\mathcal {O}^{\top }Y\mathcal {O}$ with the (i, j)-th entry ${\tilde{y}}_{ij}$ for $1\le i,j\le m$.

1.
We have the desired result from
$$\begin{aligned} \Vert XY-YX\Vert _F^2&=\Vert \mathcal {O}^{\top }X\mathcal {O}\mathcal {O}^{\top }Y\mathcal {O}-\mathcal {O}^{\top }Y\mathcal {O}\mathcal {O}^{\top }X\mathcal {O}\Vert _F^2\\&=\Vert D{\tilde{Y}}-{\tilde{Y}}D\Vert _F^2\\&=\sum _{1\le i,j\le m}(d_{i}-d_{j})^2{\tilde{y}}_{ij}^2\\&\le \sum _{1\le i\ne j\le m}(d_{i}+d_{j})^2{\tilde{y}}_{ij}^2\\&\le \sum _{1\le i\ne j\le m}(d_{i}+d_{j})^2{\tilde{y}}_{ij}^2+\sum _{i=1}^m(2d_{i}{\tilde{y}}_{ii}-2\mu )^2\\&=\Vert D{\tilde{Y}}+{\tilde{Y}}D-2\mu I\Vert _F^2\\&=\Vert XY+YX-2\mu I\Vert _F^2\\&=4\Vert X\circ Y-\mu I\Vert _F^2, \end{aligned}$$
where the first inequality follows from $d_{i}\ge 0$ for $i=1,2,\ldots ,m$.
2.
By direct calculation, we have
$$\begin{aligned} \Vert \mathcal {L}_X\mathcal {L}_Y-\mathcal {L}_Y\mathcal {L}_X\Vert _2&=\max _{\Vert Z\Vert _F=1}\Vert \mathcal {L}_X\mathcal {L}_YZ-\mathcal {L}_Y\mathcal {L}_XZ\Vert _F\\&=\max _{\Vert Z\Vert _F=1}\frac{\Vert (XY-YX)Z-Z(XY-YX)\Vert _F}{4}\\&\le \frac{\Vert XY-YX\Vert _F}{2}\\&\le \Vert X\circ Y-\mu I\Vert _F, \end{aligned}$$
where the second inequality follows from item 1.

$\square $

Lemma A.2

Let $(X_{*},Y_{*})\in S^m_+\times S^m_+$ satisfy the strict complementarity condition that $X_{*}{\circ }Y_{*}=O$ and $X_{*}+Y_{*}\in S^m_{++}$. Let $\{\mu _r\}\subseteq \mathcal {R}_{++}$ and $\{(X_r,Y_r)\}\subseteq S^m_{++}\times S^m_{++}$ be sequences such that $\lim _{r\rightarrow \infty }\mu _r=0$ and $\lim _{r\rightarrow \infty }(X_r,Y_r)=(X_{*},Y_{*})$. Let spectral decompositions of $X_{*}$ and $Y_{*}$ be

$$\begin{aligned} \mathcal {O}_{*}^{\top }X_{*}\mathcal {O}_{*}=\begin{pmatrix} D_{X_{*}}&{}O\\ O&{}O \end{pmatrix},\ \mathcal {O}_{*}^{\top }Y_{*}\mathcal {O}_{*} = \begin{pmatrix} O&{}O\\ O&{}D_{Y_{*}} \end{pmatrix} \end{aligned}$$

using some orthogonal matrix $\mathcal {O}_{*}\in \mathcal {R}^{m\times m}$ and positive diagonal matrices $D_{X_{*}}\in S^p_{++}$ and $D_{Y_{*}}\in S^q_{++}$ with $p+q=m$. Furthermore, suppose $p,q>0$ and choose a sequence of orthogonal matrices $\{\mathcal {O}_{r}\}\subseteq \mathcal {R}^{m\times m}$ such that

$$\begin{aligned} \mathcal {O}_r^{\top }X_r\mathcal {O}_r=\begin{pmatrix} D_{X_{r}}&{}O\\ O&{}E_{X_{r}} \end{pmatrix},\ \lim _{r\rightarrow \infty }\mathcal {O}_r=\mathcal {O}_{*} \end{aligned}$$

with $D_{X_r}\in \mathcal {R}^{p\times p}$ and $E_{X_r}\in \mathcal {R}^{q\times q}$ being positive diagonal matrices for $r\ge 1$. (Notice that $\lim _{r\rightarrow \infty }E_{X_r}$=O.) If $\Vert X_r\circ Y_r-\mu _r I\Vert =o(\mu _r)$, then

$$\begin{aligned} \lim _{r\rightarrow \infty }\frac{1}{\mu _r}E_{X_r}=D_{Y_{*}}^{-1}. \end{aligned}$$

(A.7)

Proof

Let ${\tilde{Y}}_r:=\mathcal {O}_r^{\top }Y_r\mathcal {O}_r$ and ${\tilde{y}}^r_{ii}$ and $e^r_i$ be the i-th diagonal entry of ${\tilde{Y}}_r$ and $E_{X_r}$, respectively for any $i=p+1,p+2,\ldots ,m$. Since $\Vert X_r\circ Y_r-\mu _rI\Vert _F=o(\mu _r)$ and

$$\begin{aligned} \Vert X_r\circ Y_r-\mu _{r}I\Vert _F&=\left\| \begin{pmatrix} D_{X_{r}}&{}O\\ O&{}E_{X_{r}} \end{pmatrix}\circ {\tilde{Y}}_r-\mu _r I\right\| _F \\&\ge \sqrt{ \sum _{i=p+1}^{m}(e^r_{i}{\tilde{y}}_{ii}^r-\mu _r)^2}, \end{aligned}$$

we have

$$\begin{aligned} 0=\lim _{r\rightarrow \infty }\frac{\sqrt{\sum _{i=p+1}^{m}\left( e^r_i{\tilde{y}}^r_{ii}-\mu _r\right) ^2}}{\mu _r} =\lim _{r\rightarrow \infty }\sqrt{\sum _{i=p+1}^{m}\left( \frac{e^r_i}{\mu _r}{\tilde{y}}^r_{ii}-1\right) ^2}, \end{aligned}$$

which yields $\lim _{r\rightarrow \infty }\frac{e^r_i}{\mu _r}{\tilde{y}}^r_{ii}=1$ for any $i=p+1,\ldots ,m$. Notice that, for $i\ge p+1$, $\{{\tilde{y}}^r_{ii}\}$ converges to the i-th positive diagonal entry of $D_{Y_{*}}$. In view of these facts, we obtain (A.7). $\square $

Now, we are ready to show Proposition 3.3. For the case where $X_{*}\in S^m_{++}$, it is easy to prove the desired result. So, we consider the case of $X_{*}\in S^m_+\setminus S^m_{++}$. Let $\lambda _r>0$ be the smallest eigenvalue of $X_r$. Notice that $\lambda _r\rightarrow 0\ (r\rightarrow \infty )$ and, by Lemma A.2, $\lim _{r\rightarrow \infty }\frac{\lambda _r}{\mu _r}$ exists and is positive. Thus, we also have

$$\begin{aligned} \lim _{r\rightarrow \infty }\frac{\mu _r}{\lambda _r}>0. \end{aligned}$$

(A.8)

Note that, for any $X\in S^m$ having m eigenvalues $\alpha _1\le \alpha _2\le \cdots \le \alpha _m$, the corresponding symmetric linear operator $\mathcal {L}_X$ has $m(m+1)/2$ eigenvalues $ \alpha _1,\alpha _2,\ldots ,\alpha _m,\{(\alpha _i+\alpha _j)/2\}_{i\ne j}. $ This fact yields that the maximum eigenvalue of the operator $\mathcal {L}_{X_r}^{-1}$ is $\lambda _r^{-1}$. Therefore, we have $\Vert \mathcal {L}_{X_r}^{-1}\Vert _2=\lambda _r^{-1}$ for any $r\ge 0$. It then follows that

$$\begin{aligned} \Vert \mathcal {L}_{X_r}\mathcal {L}_{Y_r}\mathcal {L}_{X_r}^{-1}-\mathcal {L}_{Y_r} \Vert _2&\le \Vert \mathcal {L}_{Y_r}\mathcal {L}_{X_r}-\mathcal {L}_{X_r}\mathcal {L}_{Y_r} \Vert _2\Vert \mathcal {L}_{X_r}^{-1}\Vert _2\\&\le \mu _r\Vert \mathcal {L}_{X_r}^{-1}\Vert _2 \frac{\Vert X_r\circ Y_r-\mu _rI \Vert _F}{\mu _r} \\&=\frac{\mu _r}{\lambda _r}\frac{\Vert X_r\circ Y_r-\mu _rI \Vert _F}{\mu _r}, \end{aligned}$$

where the second inequality follows from Lemma A.1. This relation together with (A.8) and $\Vert X_r\circ Y_r-\mu _rI\Vert _F=O(\mu _r^{1+{\theta }})$ implies $ \Vert \mathcal {L}_{X_r}\mathcal {L}_{Y_r}\mathcal {L}_{X_r}^{-1}-\mathcal {L}_{Y_r} \Vert _2=O(\mu _r^{\theta }).$

$\square $

1.4 A.4 Proof of Proposition 3.4

Define $U_r(s):=\left( X_r+s\varDelta X_r\right) \circ \left( Y_r+s\varDelta Y_r\right) $ for $s\in [0,1]$ and each r. By using the fact that $\Vert X\Vert _F\ge |\lambda _{\min }(X)|$ for any $X\in S^m$, conditions (3.25)–(3.27) yield that there exists some $K>0$ such that

$$\begin{aligned}&\lambda _{\min }\left( \varDelta X_r\circ \varDelta Y_r\right) \ge -K\mu _r^2, \end{aligned}$$

(A.9)

$$\begin{aligned}&\lambda _{\min }\left( X_r\circ Y_r\right) \ge \mu _r-K\mu _r^{1+\theta }, \end{aligned}$$

(A.10)

$$\begin{aligned}&\lambda _{\min }\left( Z_r-\hat{\mu }_rI\right) \ge -K\hat{\mu }_r^{1+\widehat{\theta }}. \end{aligned}$$

(A.11)

Then, it holds that

$$\begin{aligned} \lambda _{\min }(U_r(s))&=\lambda _{\min }\left( X_r\circ Y_r+s X_r\circ \varDelta Y_r+s Y_r\circ \varDelta X_r+s^2\varDelta X_r\circ \varDelta Y_r \right) \\&=\lambda _{\min }\left( (1-s)X_r\circ Y_r+s(Z_r-\hat{\mu }_r I)+s\hat{\mu }_r I+s^2\varDelta X_r\circ \varDelta Y_r\right) \\&\ge (1-s)\lambda _{\min }\left( X_r\circ Y_r\right) +s\lambda _{\min }(Z_r-\hat{\mu }_r I)\\&+s\lambda _{\min }(\hat{\mu }_r I)+s^2\lambda _{\min }\left( \varDelta X_r\circ \varDelta Y_r\right) \\&\ge (1-s)\left( \mu _r-K\mu _r^{1+\theta }\right) -sK\hat{\mu }_r^{1+\widehat{\theta }}+s\hat{\mu }_r-s^2 K\mu _r^2\\&=:u_r(s) \end{aligned}$$

for any r sufficiently large and $s\in [0,1]$, where the first inequality follows from the fact that $\lambda _{\min }(A+B)\ge \lambda _{\min }(A)+\lambda _{\min }(B)$ for $A, B\in S^m$ and the second inequality is due to (A.9)–(A.11) and $s\in [0,1]$. Notice that $u_r(s)$ is concave and quadratic. Then, for any r sufficiently large, we have $u_r(s)>0\ (s\in [0,1])$ since $0<\theta ,\widehat{\theta }<1$, $\lim _{r\rightarrow \infty }(\mu _r,\hat{\mu }_r)=(0,0)$, and (3.28) imply that $u_r(0)=\mu _r-K\mu _r^{1+\theta }>0$ and $u_r(1)=\hat{\mu }_r-K\hat{\mu }_r^{1+\widehat{\theta }}-K\mu _r^2>0$ for sufficiently large r. This means that $\lambda _{\min }(U_r(s))\ge u_r(s)>0\ (s\in [0,1])$ and therefore

$$\begin{aligned} U_r(s)\in S^m_{++}\ (s\in [0,1]), \end{aligned}$$

(A.12)

from which we can derive $X_r+\varDelta X_r\in S^m_{++}$ and $Y_r+\varDelta Y_r\in S^m_{++}$. Actually, for contradiction, suppose that either one of these two conditions is not true. We can assume $X_r+\varDelta X_r\notin S^m_{++}$ without loss of generality. Recall that $X_r\in S^m_{++}$. Then, there exists some $\bar{s}\in (0,1]$ such that $X_r+\bar{s}\varDelta X_r\in S^m_{+}\setminus S^m_{++}$. Therefore, we can find some nonzero vector $d\in \mathcal {R}^n$ such that $(X_r+\bar{s}\varDelta X_r)d=0$. From this fact, we readily have

$$\begin{aligned} d^{\top }U_r(\bar{s})d&=\frac{ d^{\top }(X_r+\bar{s}\varDelta X_r)(Y_r+\bar{s}\varDelta Y_r)d+ d^{\top }(Y_r+\bar{s}\varDelta Y_r)(X_r+\bar{s}\varDelta X_r)d }{2}\\&=0, \end{aligned}$$

which contradicts (A.12). Hence, we conclude that $X_r+\varDelta X_r\in S^m_{++}$ and $Y_r+\varDelta Y_r\in S^m_{++}$ for all r sufficiently large. The proof is complete. $\square $

1.5 A.5 Proof of Lemma 3.1

To begin with, notice that Assumptions B-, C-, and C- yield that for sufficiently large k,

$$\begin{aligned} 0<\zeta ^k_i=\varTheta (1)\ (i\in I_a(x^{*})). \end{aligned}$$

(A.13)

Furthermore, by (3.10), we have

$$\begin{aligned} \zeta ^k_i=0\ (i\in \{1,2,\ldots ,p(x^{*})\}\setminus I_a(x^{*})), \end{aligned}$$

(A.14)

which together with $\zeta _i^{*}=0\ (i\in \{1,2,\ldots ,p(x^{*})\}\setminus I_a(x^{*}))$ implies $\Vert \tilde{z}^k-\tilde{z}^{*}\Vert =\Vert z^k-z^{*}\Vert $ for all k sufficiently large. Next, by (A.14), $z^k\in \widehat{\mathcal {N}}_{\mu _{k-1}}^{\varepsilon _{k-1}}$, and $\varepsilon _{k-1}=\gamma _1\mu _{k-1}^{1+\alpha }$, it follows that

$$\begin{aligned}&\left\| \nabla f(x^k)+\sum _{i\in I_a(x^{*})}\nabla \hat{g}_i(x^k)\zeta _i^k-(F_j\bullet V_k)_{j=1}^n\right\| =o(\mu _{k-1}),\nonumber \\&\Vert F(x^k)\circ V_k\Vert _F=\varTheta (\mu _{k-1}), \end{aligned}$$

(A.15)

$$\begin{aligned}&\left| \sum _{i\in I_a(x^{*})}\zeta _i^k\hat{g}_i(x^k)\right| =o(\mu _{k-1}),\ \max _{1\le i\le p(x^{*})}(\hat{g}_i(x^k))_+=o(\mu _{k-1}). \end{aligned}$$

(A.16)

Noting $a=2(a)_+-|a|$ for $a\in \mathcal {R}$, we have

$$\begin{aligned} \left| \sum _{i\in I_a(x^{*})}\zeta _i^k\hat{g}_i(x^k)\right| = \left| 2\sum _{i\in I_a(x^{*})}\zeta _i^k(\hat{g}_i(x^k))_+-\sum _{i\in I_a(x^{*})}\zeta _i^k\left| \hat{g}_i(x^k)\right| \right| , \end{aligned}$$

which together with (A.13) and (A.16) yields $|\hat{g}_i(x^k)|=o(\mu _{k-1})\ (i\in I_a(x^{*}))$. From this fact and (A.15), we obtain $\Vert \varPhi _0(\tilde{z}^k)\Vert =\varTheta (\mu _{k-1})$ and thus $\mu _{k-1}=\varTheta (\Vert \varPhi _{0}(\tilde{z}^k)\Vert )$.

We next prove $\mu _{k-1}=\varTheta (\Vert z^k-z^{*}\Vert )$ by showing $\Vert \varPhi _{0}({\tilde{z}}^k)\Vert =\varTheta (\Vert \tilde{z}^k-\tilde{z}^{*}\Vert )$. It suffices to show that the sequence $\{\eta _k\}:=\{{\Vert \varPhi _{0}(\tilde{z}^k)\Vert }/{\Vert \tilde{z}^k-\tilde{z}^{*}\Vert }\}$ is bounded above and away from zero. Note that by $\varPhi _{0}(\tilde{z}^{*})=0$ and Assumption B-,

$$\begin{aligned} \eta _k=\frac{\Vert \varPhi _{0}(\tilde{z}^k)-\varPhi _{0}(\tilde{z}^{*})\Vert }{{\Vert \tilde{z}^k-\tilde{z}^{*}\Vert }}= \left\| \mathcal {J}\varPhi _{0}(\tilde{z}^{*})\frac{\tilde{z}^k-\tilde{z}^{*}}{\Vert \tilde{z}^k-\tilde{z}^{*}\Vert }+\frac{O(\Vert \tilde{z}^k-\tilde{z}^{*}\Vert ^2)}{\Vert \tilde{z}^k-\tilde{z}^{*}\Vert }\right\| . \end{aligned}$$

Obviously, $\{\eta _k\}$ is bounded from above. To show $\{\eta _k\}$ is bounded away from zero, suppose to the contrary. Then, without loss of generality, we can assume that $\lim _{k\rightarrow \infty }\eta _k=0$, and hence there exists some $d^{*}$ with $\Vert d^{*}\Vert =1$ such that $\lim _{k\rightarrow \infty }\frac{\tilde{z}^k-\tilde{z}^{*}}{\Vert \tilde{z}^k-\tilde{z}^{*}\Vert }=d^{*}$ and $\mathcal {J}\varPhi _0(\tilde{z}^{*})d^{*}=0$. However, this contradicts the nonsingularity of $\mathcal {J}\varPhi _{0}(\tilde{z}^{*})$ from Assumption B-. We have the desired conclusion. $\square $

Rights and permissions

Reprints and permissions

About this article

Cite this article

Okuno, T., Fukushima, M. Primal-dual path following method for nonlinear semi-infinite programs with semi-definite constraints. Math. Program. 199, 251–303 (2023). https://doi.org/10.1007/s10107-022-01827-2

Download citation

Received: 01 October 2018
Accepted: 30 April 2022
Published: 04 June 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s10107-022-01827-2

Primal-dual path following method for nonlinear semi-infinite programs with semi-definite constraints

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Self-concordant inclusions: a unified framework for path-following generalized Newton-type algorithms

An interior-point algorithm for semidefinite least-squares problems

A feasible proximal bundle algorithm with convexification for nonsmooth, nonconvex semi-infinite programming

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix

1.1 A.1 Proof of Theorem 2.1

1.2 Proof of Proposition 3.1

1.3 A.2 Proof of Proposition 3.3

Lemma A.1

Proof

Lemma A.2

Proof

1.4 A.4 Proof of Proposition 3.4

1.5 A.5 Proof of Lemma 3.1

Rights and permissions

About this article

Cite this article

Keywords

Mathematics Subject Classification

Subscribe and save

Buy Now

Navigation

Primal-dual path following method for nonlinear semi-infinite programs with semi-definite constraints

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Self-concordant inclusions: a unified framework for path-following generalized Newton-type algorithms

An interior-point algorithm for semidefinite least-squares problems

A feasible proximal bundle algorithm with convexification for nonsmooth, nonconvex semi-infinite programming

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix

Appendix

1.1 A.1 Proof of Theorem 2.1

1.2 Proof of Proposition 3.1

1.3 A.2 Proof of Proposition 3.3

Lemma A.1

Proof

Lemma A.2

Proof

1.4 A.4 Proof of Proposition 3.4

1.5 A.5 Proof of Lemma 3.1

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Subscribe and save

Buy Now

Search

Navigation