Abstract
Deterministic branch-and-bound algorithms for continuous global optimization often visit a large number of boxes in the neighborhood of a global minimizer, resulting in the so-called cluster problem (Du and Kearfott in J Glob Optim 5(3):253–265, 1994). This article extends previous analyses of the cluster problem in unconstrained global optimization (Du and Kearfott 1994; Wechsung et al. in J Glob Optim 58(3):429–438, 2014) to the constrained setting based on a recently-developed notion of convergence order for convex relaxation-based lower bounding schemes. It is shown that clustering can occur both on nearly-optimal and nearly-feasible regions in the vicinity of a global minimizer. In contrast to the case of unconstrained optimization, where at least second-order convergent schemes of relaxations are required to mitigate the cluster problem when the minimizer sits at a point of differentiability of the objective function, it is shown that first-order convergent lower bounding schemes for constrained problems may mitigate the cluster problem under certain conditions. Additionally, conditions under which second-order convergent lower bounding schemes are sufficient to mitigate the cluster problem around a global minimizer are developed. Conditions on the convergence order prefactor that are sufficient to altogether eliminate the cluster problem are also provided. This analysis reduces to previous analyses of the cluster problem for unconstrained optimization under suitable assumptions.
Similar content being viewed by others
References
Adjiman, C.S., Floudas, C.A.: Rigorous convex underestimators for general twice-differentiable problems. J. Glob. Optim. 9(1), 23–40 (1996)
Bazaraa, M.S., Sherali, H.D., Shetty, C.M.: Nonlinear Programming: Theory and Algorithms, 3rd edn. Wiley, Hoboken (2013)
Bompadre, A., Mitsos, A.: Convergence rate of McCormick relaxations. J. Glob. Optim. 52(1), 1–28 (2012)
Bompadre, A., Mitsos, A., Chachuat, B.: Convergence analysis of Taylor models and McCormick–Taylor models. J. Glob. Optim. 57(1), 75–114 (2013)
Bonnans, J.F., Ioffe, A.: Second-order sufficiency and quadratic growth for nonisolated minima. Math. Oper. Res. 20(4), 801–817 (1995)
Clarke, F.H.: Optimization and Nonsmooth Analysis, Classics in Applied Mathematics, vol. 5. SIAM, Philadelphia (1990)
Du, K., Kearfott, R.B.: The cluster problem in multivariate global optimization. J. Glob. Optim. 5(3), 253–265 (1994)
Floudas, C.A., Pardalos, P.M., Adjiman, C., Esposito, W.R., Gümüs, Z.H., Harding, S.T., Klepeis, J.L., Meyer, C.A., Schweiger, C.A.: Handbook of test problems in local and global optimization. In: Nonconvex Optimization and Its Applications, 1st edn, vol. 33. Springer, Berlin (1999)
Goldsztejn, A., Domes, F., Chevalier, B.: First order rejection tests for multiple-objective optimization. J. Glob. Optim. 58(4), 653–672 (2014)
Hijazi, H., Liberti, L.: Constraint qualification failure in action. Oper. Res. Lett. 44(4), 503–506 (2016)
Horst, R., Tuy, H.: Global Optimization: Deterministic Approaches, 3rd edn. Springer, Berlin (1996)
Ioffe, A.: On sensitivity analysis of nonlinear programs in Banach spaces: the approach via composite unconstrained optimization. SIAM J. Optim. 4(1), 1–43 (1994)
Kearfott, R.B., Du, K.: The cluster problem in global optimization: the univariate case. In: Validation Numerics, Computing Supplementum, vol. 9, pp. 117–127. Springer, Berlin (1993)
Khan, K.A., Watson, H.A.J., Barton, P.I.: Differentiable McCormick relaxations. J. Glob. Optim. 67(4), 687–729 (2017)
Krawczyk, R., Nickel, K.: Die zentrische form in der Intervallarithmetik, ihre quadratische Konvergenz und ihre Inklusionsisotonie. Computing 28(2), 117–137 (1982)
Mayer, G.: Epsilon-inflation in verification algorithms. J. Comput. Appl. Math. 60(1), 147–169 (1995)
McCormick, G.P.: Computability of global solutions to factorable nonconvex programs: Part I—convex underestimating problems. Math. Program. 10(1), 147–175 (1976)
Moore, R.E., Kearfott, R.B., Cloud, M.J.: Introduction to Interval Analysis. Society for Industrial and Applied Mathematics, Philadelphia (2009)
Najman, J., Bongartz, D., Tsoukalas, A., Mitsos, A.: Erratum to: multivariate McCormick relaxations. J. Glob. Optim. 68(1), 219–225 (2017)
Najman, J., Mitsos, A.: Convergence analysis of multivariate McCormick relaxations. J. Glob. Optim. 66(4), 597–628 (2016)
Neumaier, A.: Complete search in continuous global optimization and constraint satisfaction. Acta Numer. 13, 271–369 (2004)
Rudin, W.: Principles of Mathematical Analysis, 3rd edn. McGraw-Hill, New York (1976)
Schöbel, A., Scholz, D.: The theoretical and empirical rate of convergence for geometric branch-and-bound methods. J. Glob. Optim. 48(3), 473–495 (2010)
Scholtes, S.: Introduction to Piecewise Differentiable Equations. SpringerBriefs in Optimization, 1st edn. Springer, New York (2012)
Scholz, D.: Theoretical rate of convergence for interval inclusion functions. J. Glob. Optim. 53(4), 749–767 (2012)
Tsoukalas, A., Mitsos, A.: Multivariate McCormick relaxations. J. Glob. Optim. 59(2–3), 633–662 (2014)
Van Iwaarden, R.J.: An improved unconstrained global optimization algorithm. Ph.D. thesis, University of Colorado at Denver (1996)
Wechsung, A.: Global optimization in reduced space. Ph.D. thesis, Massachusetts Institute of Technology (2014)
Wechsung, A., Schaber, S.D., Barton, P.I.: The cluster problem revisited. J. Glob. Optim. 58(3), 429–438 (2014)
Acknowledgements
The authors would like to thank three anonymous reviewers and an associate editor for suggestions which helped improve the readability of this article, and Dr. Johannes Jäschke for helpful discussions (in particular, for bring references [5] and [12] to our attention). The first author would also like to thank Jaichander Swaminathan for helpful discussions regarding the proof of Theorem 2.
Author information
Authors and Affiliations
Corresponding author
Additional information
The authors gratefully acknowledge financial support from BP. This work was conducted as a part of the BP-MIT conversion research program.
Appendix: Proofs
Appendix: Proofs
1.1 Proof of Lemma 4
Lemma 4 Consider Problem (P). Suppose \(\mathbf {x}^*\) is nonisolated and f is differentiable at \(\mathbf {x}^*\). Then \(\forall \theta > 0\), \(\exists \alpha > 0\) such that
Proof
We proceed by contradiction. Define
and note that \(L(\alpha )\) is monotonically nonincreasing on \((0,+\infty )\). Suppose \(\exists \theta > 0\) such that \(\forall \alpha > 0\), we have \(L(\alpha ) \le L^* -\theta \). Consider a sequence \(\{\alpha _k\} \rightarrow 0\) with \(\alpha _k > 0\), and a corresponding sequence \(\{\mathbf {d}_k\}\) such that
The existence of \(\mathbf {d}_k\) follows from the assumption that \(L(\alpha ) \le L^* -\theta \), \(\forall \alpha > 0\). Since \({||}\mathbf {d}_k{||}_1 = 1\), \(\forall k\), we have the existence of \(\mathbf {d}^* \in \mathbb {R}^{n_x}\) with \(\mathbf {d}^* = \underset{k_q \rightarrow \infty }{\lim } \mathbf {d}_{k_q}\) and \({||}\mathbf {d}^*{||}_1 = 1\) for some convergent subsequence \(\{\mathbf {d}_{k_q}\}\). Furthermore, \(\mathbf {d}^* \in T(\mathbf {x}^*)\) and \({\nabla f(\mathbf {x}^*)}^\mathrm{T} \mathbf {d}^* \le L^* -\displaystyle \frac{\theta }{2}\), since \(\forall k_q\) we have \({\nabla f(\mathbf {x}^*)}^\mathrm{T} \mathbf {d}_{k_q} \le L^* -\displaystyle \frac{\theta }{2}\), which contradicts the definition of \(L^*\). \(\square \)
1.2 Proof of Proposition 2
Proposition 2 Consider Problem (P) with \(m_E \ge 1\). Suppose \(\mathbf {x}^*\) is nonisolated, f is differentiable at \(\mathbf {x}^*\), functions \(h_k\), \(k = 1,\ldots ,m_E\), are differentiable at \(\mathbf {x}^*\), and \(\mathscr {A}(\mathbf {x}^*) = \emptyset \). Furthermore, suppose there exist multipliers \(\varvec{\lambda }^* \in \mathbb {R}^{m_E}\) corresponding to the equality constraints such that \((\mathbf {x}^*,\mathbf {0},\varvec{\lambda }^*)\) is a KKT point. Then
Proof
Since \((\mathbf {x}^*,\mathbf {0},\varvec{\lambda }^*)\) is a KKT point, we have
From the assumption that \(\mathbf {x}^*\) is a nonisolated feasible point, we have that the set \(\left\{ \mathbf {d} : {||}\mathbf {d}{||}_1 = 1, \mathbf {d}\in T(\mathbf {x}^*) \right\} \) is nonempty. Additionally, we have
where \(\mathscr {L}(\mathbf {x}^*)\) denotes the linearized cone at \(\mathbf {x}^*\) (see, for instance, [2]). Consequently, for each \(\mathbf {d}\in T(\mathbf {x}^*)\) with \({||}\mathbf {d}{||}_1 = 1\), we have \({\nabla f(\mathbf {x}^*)}^\mathrm{T} \mathbf {d}= 0\). \(\square \)
1.3 Proof of Theorem 2
Theorem 2 Suppose the assumptions of Lemma 5 hold. Let \(\delta = \left( \displaystyle \frac{\varepsilon }{\tau ^*}\right) ^{\frac{1}{\beta ^*}}\), \(r = \displaystyle \frac{2\varepsilon }{L}\).
-
1.
If \(\delta \ge 2r\), let \(N = 1\).
-
2.
If \(\displaystyle \frac{2r}{m-1} > \delta \ge \displaystyle \frac{2r}{m}\) for some \(m \in \mathbb {N}\) with \(m \le n_x\) and \(2 \le m \le 5\), then let
$$\begin{aligned} N = \displaystyle \sum _{i=0}^{m-1}{2^i \left( {\begin{array}{c}n_x\\ i\end{array}}\right) } + 2n_x{\lceil }\displaystyle \frac{m-3}{3}{\rceil }. \end{aligned}$$ -
3.
Otherwise, let
$$\begin{aligned} N= & {} {\lceil }2 \left( \tau ^* \right) ^{\frac{1}{\beta ^*}} \varepsilon ^{\left( 1 - \frac{1}{\beta ^*} \right) } L^{-1}{\rceil }^{n_x-1} \left( {\lceil }2 \left( \tau ^* \right) ^{\frac{1}{\beta ^*}} \varepsilon ^{\left( 1 - \frac{1}{\beta ^*} \right) } L^{-1}{\rceil } + 2n_x {\lceil }\left( \tau ^* \right) ^{\frac{1}{\beta ^*}} \varepsilon ^{\left( 1 - \frac{1}{\beta ^*} \right) } L^{-1}{\rceil } \right) . \end{aligned}$$
Then, N is an upper bound on the number of boxes of width \(\delta \) required to cover \(\hat{X}_{5}\).
Proof
This proof is rederived based on Corollary 2.1 in [28] and the proof of Lemma 3 in [29]. Note that the condition in the second case is corrected to ‘\({2 \le m \le 5}\)’ as opposed to ‘\({2 \le m \le 6}\)’ in [28].
From Lemma 5, we have \(\hat{X}_{5} = \left\{ \mathbf {x}\in \mathscr {N}^1_{\hat{\alpha }}(\mathbf {x}^*) : L{||}\mathbf {x}- \mathbf {x}^*{||}_1 \le 2\varepsilon \right\} \subset \Big \{\mathbf {x}:{||}\mathbf {x}- \mathbf {x}^*{||}_1 \le \displaystyle \frac{2\varepsilon }{L}\Big \} {=}{:} \tilde{B}\). Therefore, an upper bound on the number of boxes of width \(\delta \) required to cover \(\hat{X}_5\) can be obtained by conservatively estimating the number of boxes of width \(\delta \) required to cover \(\tilde{B}\). In what follows, we will assume without loss of generality that \(\mathbf {x}^* = \mathbf {0}\).
-
1.
Suppose \(\delta \ge 2r\). Consider the box \(B_{\delta }\) of width \(\delta \) centered at \(\mathbf {x}^* = \mathbf {0}\). We have
$$\begin{aligned} \mathbf {x}\in \tilde{B} \implies {||}\mathbf {x}{||}_1 \le \displaystyle \frac{2\varepsilon }{L} \implies {||}\mathbf {x}{||}_{\infty } \le \displaystyle \frac{2\varepsilon }{L} = r \le \displaystyle \frac{\delta }{2} \implies \mathbf {x}\in B_{\delta }, \end{aligned}$$where we have used the fact that \({||}\mathbf {x}{||}_{\infty } \le {||}\mathbf {x}{||}_1\), \(\forall \mathbf {x}\in \mathbb {R}^{n_x}\). Therefore, \(B_{\delta }\) is sufficient to cover \(\tilde{B}\).
-
2.
Suppose \(m \le n_x\) with \(m \in \{2,\ldots ,5\}\) and \(\delta \ge \frac{2r}{m}\). Place a box \(B_{\delta }\) of width \(\delta \) centered at \(\mathbf {x}^* = \mathbf {0}\) (the condition on \(\delta \) ensures that \(B_{\delta }\) intersects the boundary of \(\tilde{B}\)). Let
$$\begin{aligned} E_i := \left\{ \mathbf {e}\in \mathbb {R}^{n_x} : e_j \in \left\{ -\displaystyle \frac{\delta }{2}, 0, \displaystyle \frac{\delta }{2}\right\} , \, \forall j \in \{1,\ldots ,n_x\}, \,\, \displaystyle \sum _{j=1}^{n_x}{I_0(e_j)} = i \right\} , \end{aligned}$$where \(I_0:\mathbb {R}\rightarrow \{0,1\}\) is defined as \(I_0(x) := {\left\{ \begin{array}{ll} 0, &{}\text{ if } x = 0 \\ 1, &{} \text{ otherwise } \end{array}\right. }\), denote the set of midpoints of the \((n_x-i)\)-dimensional faces of \(B_{\delta }\) (each element of \(E_i\) has exactly i nonzero components, each of which is \(\pm \frac{\delta }{2}\)). Note that \({|}*{|}{E_i} = 2^i \left( {\begin{array}{c}n_x\\ i\end{array}}\right) \), \(\forall i \in \{1,\ldots ,n_x\}\). Under the assumption \(\delta \ge \frac{2r}{m}\), we will show that, in addition to \(B_{\delta }\), it is sufficient to place one box beside \(B_{\delta }\) along the directions in \(E_1, \ldots , E_{m-1}\) when \(m = 2\) or \(m = 3\), and two boxes beside \(B_{\delta }\) along the directions in \(E_1\) and one box beside \(B_{\delta }\) along the directions in \(E_2, \ldots , E_{m-1}\) when \(m = 4\) or \(m = 5\) in order to cover \(\tilde{B}\). First, we show that we need not place any boxes beside \(B_{\delta }\) along the directions in \(E_{m}, \ldots , E_{n_x}\). Let \(\mathbf {e}\in E_i\) with \( i \in \{m,\ldots ,n_x\}\). We have \({||}\mathbf {e}{||}_1 = \frac{\delta }{2}i \ge \frac{i}{m}r \ge r\), which implies \(\mathbf {e}\in \partial \tilde{B} \cup \tilde{B}^{\text {C}}\) (where \(\partial \tilde{B}\) denotes the boundary of \(\tilde{B}\)). Consequently, boxes placed beside \(B_{\delta }\) along the directions in \(E_{m}, \ldots , E_{n_x}\) do not intersect the interior of \(\tilde{B}\) and are not required to cover \(\tilde{B}\). Suppose \(\delta \ge \frac{2r}{m}\), and let \(\mathbf {e}\in E_i\) for some \(i \in \{1,\ldots ,m-1\}\). The distance from \(\mathbf {e}\), which is the midpoint of an \((n-i)\)-dimensional face of \(B_{\delta }\), to \(\frac{2r}{\delta i}\mathbf {e}\), which is a point on the boundary of \(\tilde{B}\) in the direction \(\mathbf {e}\), in the \(\infty \)-norm is \(\frac{r}{i} - \frac{\delta }{2} \le \frac{r}{i} - \frac{r}{m}\). If this distance is less than \(\delta \) for each \(i \in \{1,\ldots ,m-1\}\), then one box beside \(B_{\delta }\) along the directions in \(E_1, \ldots , E_{m-1}\) is sufficient to cover \(\tilde{B}\). This amounts to requiring
$$\begin{aligned}&\qquad \quad \,\,\displaystyle \frac{r}{i} - \displaystyle \frac{r}{m} \le \displaystyle \frac{2r}{m}, \, \forall i \in \{1,\ldots , m-1\} \iff m \le 3i, \, \forall i \in \{1,\ldots , m-1\}\\&\iff m = 2 \,\, \text {or} \,\, m = 3. \end{aligned}$$Note that if \(m = 4\) or \(m = 5\), we still have \(m \le 3i\), \(\forall i \in \{2,\ldots , m-1\}\). Additionally, \(\frac{r}{1} - \frac{r}{m} \le \frac{4r}{m} \le 2\delta \) in such cases. Therefore, when \(m = 4\) or \(m = 5\), two boxes along the directions in \(E_1\) and one box along the directions in \(E_2, \ldots , E_{m-1}\) are sufficient to cover \(\tilde{B}\).
-
3.
If the previous assumptions on \(\delta \) are not satisfied, a box of width \(\delta \) centered at \(\mathbf {x}^*\) may not intersect \(\partial \tilde{B}\). To estimate the number of boxes of width \(\delta \) required to cover \(\tilde{B}\), we first estimate the number of boxes, \(N_r\), of width \(r = \frac{2\varepsilon }{L}\) required to cover \(\tilde{B}\) using the previous analysis, and then estimate the number of boxes of width \(\delta \) required to cover the intersection of these \(N_r\) boxes with \(\tilde{B}\).
The number of boxes of width r required to cover \(\tilde{B}\) is \(N_r := 1 + 2n_x\), where ‘1’ corresponds to the box centered at \(\mathbf {x}^* = \mathbf {0}\), and ‘\({2n_x}\)’ corresponds to the boxes along the directions in \(E_1\). Note that \(E_1\) is now defined as
$$\begin{aligned} E_1 := \left\{ \mathbf {e}\in \mathbb {R}^{n_x} : e_j \in \left\{ -\displaystyle \frac{r}{2}, 0, \displaystyle \frac{r}{2}\right\} , \, \forall j \in \{1,\ldots ,n_x\}, \, \, \displaystyle \sum _{j=1}^{n_x}{I_0(e_j)} = 1 \right\} \end{aligned}$$since \(\tilde{B}\) is first covered using boxes of width r. The box of width r centered at \(\mathbf {x}^*\) can be covered using \({\lceil }\frac{r}{\delta }{\rceil }^{n_x}\) boxes of width \(\delta \). Note that the entire volume of the \(2n_x\) boxes along the directions in \(E_1\) need not be covered using boxes of width \(\delta \) since parts of those boxes have no intersection with \(\tilde{B}\). To estimate the extent to which each of the \(2n_x\) boxes need to be covered with boxes of width \(\delta \), we compute the distance between any \(\mathbf {e}\in E_1\) (which is a midpoint of a one-dimensional face of the box of width r centered at \(\mathbf {x}^*\)) and \(\frac{2r}{r \times 1}\mathbf {e}= 2\mathbf {e}\) (which is a point on the boundary of \(\tilde{B}\) in the direction \(\mathbf {e}\)) in the \(\infty \)-norm. This distance turns out to be equal to \(\frac{r}{2}\). This implies at most half the volumes of the \(2n_x\) boxes need to be covered using boxes of width \(\delta \), which yields the estimate of \(2n_x {\lceil }\frac{r}{\delta }{\rceil }^{n_x - 1} {\lceil }\frac{r}{2\delta }{\rceil }\) boxes of width \(\delta \) that are required to cover the \(2n_x\) boxes of width r along the directions in \(E_1\). \(\square \)
1.4 Proof of Lemma 7
Lemma 7 Consider Problem (P). Suppose \(\mathbf {x}^*\) is nonisolated, f is locally Lipschitz continuous on X and directionally differentiable at \(\mathbf {x}^*\), and \(\exists \alpha > 0\) such that
Then, \(\exists \hat{\alpha } \in (0,\alpha ]\) such that the region \(\mathscr {N}^1_{\hat{\alpha }}(\mathbf {x}^*) \cap X_5\) can be conservatively approximated by
Proof
Let \(\mathbf {x}= \mathbf {x}^* + t\mathbf {d}\in \mathscr {N}^1_{\alpha }(\mathbf {x}^*) \cap \mathscr {F}(X)\) with \({||}\mathbf {d}{||}_1 = 1\) and \(t = {||}\mathbf {x}- \mathbf {x}^*{||}_1 > 0\). We have (see Theorem 3.1.2 in [24])
where Step 2 follows from the directional differentiability of f at \(\mathbf {x}^*\). Consequently, there exists \(\hat{\alpha } \in (0,\alpha ]\) such that for all \(\mathbf {x}= \mathbf {x}^* + t\mathbf {d}\in \mathscr {F}(X)\) with \({||}\mathbf {d}{||}_1 = 1\) and \(t \in [0,\hat{\alpha })\):
Therefore, \(\forall \mathbf {x}\in \mathscr {N}^1_{\hat{\alpha }}(\mathbf {x}^*) \cap X_5\) we have \(\mathbf {x}= \mathbf {x}^* + t\mathbf {d}\in \mathscr {F}(X)\) with \({||}\mathbf {d}{||}_1 = 1\) and \(t = {||}\mathbf {x}- \mathbf {x}^*{||}_1 < \hat{\alpha }\), and
\(\square \)
1.5 Proof of Lemma 10
Lemma 10 Consider Problem (P), and suppose the assumptions of Lemma 8 hold. Then \(\exists \hat{\alpha } \in (0,\alpha ]\) and constants \(\rho _1, \rho _2 \ge 0\) such that the region \(\mathscr {N}^2_{\hat{\alpha }}(\mathbf {x}^*) \cap X_5\) can be conservatively approximated by
Proof
The result trivially follows from Lemma 8 when \(\nabla f(\mathbf {x}^*) = \mathbf {0}\).
Suppose \(\nabla f(\mathbf {x}^*) \ne \mathbf {0}\). From Lemma 8, we have
Suppose we represent each \(\mathbf {x}\in \mathscr {N}^2_{\hat{\alpha }}(\mathbf {x}^*) \cap \mathscr {F}(X)\) by \(\mathbf {x}:= \mathbf {x}^* + \beta _1 \nabla f(\mathbf {x}^*) + \beta _2 \mathbf {d}\), where \(\beta _1, \beta _2 \in \mathbb {R}\) and \(\mathbf {d}\perp \nabla f(\mathbf {x}^*)\) with \({||}\mathbf {d}{||} = 1\). Consider the case when \(\beta _1 \ge 0\). We have
Therefore, \(\forall \mathbf {x}\in \mathscr {N}^2_{\hat{\alpha }}(\mathbf {x}^*) \cap X_5\) with \(\mathbf {x}= \mathbf {x}^* + \beta _1 \nabla f(\mathbf {x}^*) + \beta _2 \mathbf {d}\), \(\beta _1 \ge 0, \beta _2 \in \mathbb {R}\) and \(\mathbf {d}\perp \nabla f(\mathbf {x}^*)\) with \({||}\mathbf {d}{||} = 1\), we have
where the last step uses the fact that \(\hat{\alpha }\) is chosen such that \(o\left( \beta ^2_1 + \beta ^2_2\right) \ge -\displaystyle \frac{\gamma }{2} ( \beta ^2_1 {||}\nabla f(\mathbf {x}^*){||}^2 + \beta ^2_2 )\) (see Eq. 1). Note that \(\beta _1 \le \displaystyle \sqrt{\displaystyle \frac{2\varepsilon }{\gamma {||}\nabla f(\mathbf {x}^*){||}^2}}\) and \({|}*{|}{\beta _2} \le \displaystyle \sqrt{\displaystyle \frac{2\varepsilon }{\gamma }}\) follow from Eq. 2. The right hand side of Eq. 3 is \(O(\varepsilon )\) since \(\beta _1 = \beta _2 = O(\displaystyle \sqrt{\varepsilon })\), thereby establishing the existence of \(\rho _2 \ge 0\).
Next, suppose \(\beta _1 \le 0\). From the assumptions of Lemma 8, we have for each \(\mathbf {x}\in \mathscr {N}^2_{\hat{\alpha }}(\mathbf {x}^*) \cap X_5\) with \(\mathbf {x}= \mathbf {x}^* + \beta _1 \nabla f(\mathbf {x}^*) + \beta _2 \mathbf {d}\), \(\beta _1 \le 0, \beta _2 \in \mathbb {R}\) and \(\mathbf {d}\perp \nabla f(\mathbf {x}^*)\) with \({||}\mathbf {d}{||} = 1\):
and \(\beta _1 \ge -\displaystyle \sqrt{\displaystyle \frac{2\varepsilon }{\gamma {||}\nabla f(\mathbf {x}^*){||}^2}}\), \({|}*{|}{\beta _2} \le \displaystyle \sqrt{\displaystyle \frac{2\varepsilon }{\gamma }}\) from Eq. 2. The left hand side of Eq. 4 is \(O(\varepsilon )\) since \(\beta _1 = \beta _2 = O(\displaystyle \sqrt{\varepsilon })\), thereby establishing the existence of \(\rho _1 \ge 0\). \(\square \)
1.6 Proof of Proposition 4
Proposition 4 Consider Problem (P) with \(m_E \ge 1\). Suppose \(\mathbf {x}^*\) is a nonisolated constrained minimizer, f is differentiable at \(\mathbf {x}^*\), functions \(h_k\), \(k = 1,\ldots ,m_E\), are differentiable at \(\mathbf {x}^*\), and \(\mathscr {A}(\mathbf {x}^*) = \emptyset \). Furthermore, suppose there exist multipliers \(\varvec{\lambda }^* \in \mathbb {R}^{m_E}\) corresponding to the equality constraints such that \((\mathbf {x}^*,\mathbf {0},\varvec{\lambda }^*)\) is a KKT point. Then \(\not \exists \alpha > 0\), \(\mathscr {D}_0\) such that the assumptions of Lemma 11 are satisfied.
Proof
Since \((\mathbf {x}^*,\mathbf {0},\varvec{\lambda }^*)\) is a KKT point, we have
From the assumption that \(\mathbf {x}^*\) is a nonisolated feasible point, we have that the set \(\left\{ \mathbf {d} : {||}\mathbf {d}{||}_1 = 1, \mathbf {d}\in T(\mathbf {x}^*) \right\} \) is nonempty. Additionally, we have from the proof of Proposition 2 that for each \(\mathbf {d}\in T(\mathbf {x}^*)\) with \({||}\mathbf {d}{||}_1 = 1\), \({\nabla f(\mathbf {x}^*)}^\mathrm{T} \mathbf {d}= 0\) and \({\nabla h_k(\mathbf {x}^*)}^\mathrm{T} \mathbf {d}= 0, \forall k \in \{1,\ldots ,m_E\}\).
Assume, by way of contradiction that \(\exists \alpha > 0\) and a set \(\mathscr {D}_0\) satisfying the assumptions of Lemma 11. Consequently, \(\exists L_f, L_I > 0\) such that
and
Since \(\exists \mathbf {d}\in T(\mathbf {x}^*)\) with \({||}\mathbf {d}{||}_1 = 1\) such that \({\nabla f(\mathbf {x}^*)}^\mathrm{T} \mathbf {d}= 0\) and \({\nabla h_k(\mathbf {x}^*)}^\mathrm{T} \mathbf {d}= 0, \forall k \in \{1,\ldots ,m_E\}\), we have that the set
is nonempty. All that remains to reach a contradiction is to show that \(\exists \bar{\mathbf {d}} \in S \cap \mathscr {D}_I\).
From the above arguments, we have the existence of \(\bar{\mathbf {d}} \in S\), \(\bar{k} \in \{1,\ldots ,m_E\}\) such that \({|}*{|}{{\nabla h_{\bar{k}}(\mathbf {x}^*)}^\mathrm{T} \bar{\mathbf {d}}} \in (0, L_I)\), since the assumption \(L_I > 0\) implies all of the equality constraint gradients \(\nabla h_k(\mathbf {x}^*)\), \(k \in \{1,\ldots ,m_E\}\), cannot simultaneously be \(\mathbf {0}\). Since \({\nabla h_{\bar{k}}(\mathbf {x}^*)}^\mathrm{T} \bar{\mathbf {d}} \ne 0\), we have \(\bar{\mathbf {d}} \not \in T(\mathbf {x}^*)\) (this follows from the arguments made in the proof of Proposition 2). Consequently, \(\exists t \in (0,\alpha )\) such that \((\mathbf {x}^*+t\bar{\mathbf {d}}) \in \mathscr {N}^1_{\alpha }(\mathbf {x}^*) \cap \left( \mathscr {F}({X})\right) ^{\text {C}} \implies \bar{\mathbf {d}} \in \mathscr {D}_I\). This implies that either \(\bar{\mathbf {d}} \in \mathscr {D}_0\), or \(\bar{\mathbf {d}} \in \mathscr {D}_I \backslash \mathscr {D}_0\), which contradicts the definition of \(L_f\) or \(L_I\) since \({\nabla f(\mathbf {x}^*)}^\mathrm{T} \bar{\mathbf {d}} < L_f\) and \({|}*{|}{{\nabla h_k(\mathbf {x}^*)}^\mathrm{T} \bar{\mathbf {d}}} < L_I, \forall k \in \{1,\ldots ,m_E\}\). \(\square \)
Rights and permissions
About this article
Cite this article
Kannan, R., Barton, P.I. The cluster problem in constrained global optimization. J Glob Optim 69, 629–676 (2017). https://doi.org/10.1007/s10898-017-0531-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10898-017-0531-z
Keywords
- Cluster problem
- Global optimization
- Constrained optimization
- Branch-and-bound
- Convergence order
- Convex relaxation
- Lower bounding scheme