Journal of Computational and Applied Mathematics 177 (2005) 67 – 88

www.elsevier.com/locate/cam

An interior-point method for solving box-constrained underdetermined nonlinear systems

J.B. Francisco^a,1, N. Krejić^b,∗,2, J.M. Martínez^a,3

^a Department of Applied Mathematics, IMECC-UNICAMP, University of Campinas, CP 6065, 13081-970 Campinas SP, Brazil
^b Department of Mathematics and Informatics, University of Novi Sad, Trg Dositeja Obradovića 4, 21000 Novi Sad,
Serbia and Montenegro
Received 2 April 2004; received in revised form 26 August 2004

Abstract
A method introduced recently by Bellavia, Macconi and Morini for solving square nonlinear systems with bounds
is modified and extended to cope with the underdetermined case. The algorithm produces a sequence of interior
iterates and is based on globalization techniques due to Coleman and Li. Global and local convergence results are
proved and numerical experiments are presented.
© 2004 Elsevier B.V. All rights reserved.
MSC: 65H10

Keywords: Underdetermined nonlinear systems; Newton method; Global convergence; Quadratic convergence

∗ Corresponding author. Tel.: +38121458136; fax: +38121350458.


E-mail addresses: juliano@ime.unicamp.br (J.B. Francisco), natasa@im.ns.ac.yu (N. Krejić), martinez@ime.unicamp.br
(J.M. Martínez).
1 This author was supported by PRONEX-Optimization 76.79.1008-00, FAPESP (Grant 02/00702-0) and CNPq.
2 This author was supported by a visiting professor grant of FAPESP (Grant 99/03102-0).
3 This author was supported by PRONEX-Optimization 76.79.1008-00, FAPESP (Grant 01/04597-4), CNPq and FAEP-
UNICAMP.

0377-0427/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.cam.2004.08.013

1. Introduction

The feasible set of practical nonlinear programming problems is usually represented by a set of equa-
tions subject to bounds on the variables. Many traditional [1,10,14–16,18–22] and modern practical
optimization methods (see [12,13]) exploit this structure and require specific algorithms for restoring
feasibility at every iteration. Usually, the feasible region is nonempty and points that are interior with
respect to the bounds exist. Therefore it is important to develop procedures that solve bound-constrained
underdetermined sets of equations more efficiently than ordinary bound-constrained minimization
algorithms do.
With the aim of solving square (n × n) nonlinear systems subject to bounds, Bellavia et al. [2,3]
developed an algorithm that adapts most of the ideas of the bound-constrained optimization solvers of
Coleman and Li [4,5] to the resolution of systems of equations. In the present paper we modify and extend
the algorithm of Bellavia, Macconi and Morini for dealing with the underdetermined case.
So, the problem considered here is to solve
F(x) = 0,   x ∈ Ω,                                                        (1)

where

Ω = {x ∈ R^n | l ≤ x ≤ u},                                                (2)

l_i ∈ R ∪ {−∞}, u_i ∈ R ∪ {∞} and l_i < u_i for i = 1, . . . , n. We assume that F: Ω → R^m (m ≤ n) is
continuously differentiable on an open set that contains Ω.
The resolution of unconstrained underdetermined nonlinear systems has been addressed using the quasi-
Newton approach in [11,23] and using the Levenberg–Marquardt approach in [7]. Some ideas of these
papers are incorporated in the present work. In particular, we will take the Newton-pseudoinverse (normal-
flow) direction to generate a trial point whenever possible. Quasi-Newton minimum-norm directions have
been used in [11,23]. Methods for solving underdetermined systems based on the pseudoinverse can be
globalized using ordinary trust regions or line-search procedures (see, for example [6,17]). When bounds
are present, Euclidian trust regions are not so suitable. In practice the affine-scaling trust regions introduced
by Coleman and Li are able to deal with boxes in a more sensible way. When the current point is close
to the boundary (but not to the solution), the Coleman–Li trust-region strategy generally forces a large
step, a feature that is generally required for efficient and practical optimization methods. It is important
to observe that the method that will be introduced in this paper can be applied to the case Ω = R^n. In this
case it reduces to a standard trust-region globalization of the normal-flow method for underdetermined
nonlinear systems.
A brief description of a typical iteration of the new method follows. Given the current interior iterate
x^k, the Coleman–Li trust region with radius Δ ≥ Δ_min > 0 is defined. The Cauchy point within this trust
region and the normal-flow Newtonian step are computed. If the Newtonian step satisfies the trust-region
constraint and the decrease of the quadratic model defined by the Newtonian step is large enough when
compared to the decrease of the quadratic model defined by the Cauchy step (preserving feasibility), the
feasible multiple of the Newtonian step defines a trial point. Otherwise, a different trial point within the
trust region and satisfying sufficient decrease of the model is defined. (A feasible multiple of the Cauchy
step is an admissible trial point.) The predicted reduction of the quadratic model and the actual reduction
of the objective function are computed at the trial point as in standard trust-region methods. If the actual
reduction is large enough when compared to the predicted reduction, the trial point is accepted and a new

iteration begins. Otherwise, the trust-region radius is reduced. The philosophy of the method is to use the
Newtonian direction as frequently as possible, taking advantage of its good local convergence properties.
However, the globalization strategy is based on Coleman–Li trust regions, which are known to take into
account efficiently the distance between the current point and the boundary.
Global convergence of the sequence generated by our method towards stationary points of the squared
norm of the system will be proved. Under suitable assumptions we will also show that, if a limit point is
interior, the whole sequence converges quadratically to this point. Although the convergence results are
not restricted to interior points, the result related to interior limit points is stronger than the one related
to limit points on the boundary. The reason is that we have in mind the application of the method as
a subalgorithm for finding feasible points in nonlinear programming algorithms [12,13]. In these cases
(usually m < n), interior points are preferred as basic points to start the optimality phase of the algorithms.
If interior-point solutions do not exist, usual constraint qualifications are not satisfied and the nonlinear
programming method generally fails. In other words, if all the stationary points of the squared norm of
the system are solutions (interior or not) of the system, the limit points of the sequence will be solutions
(perhaps noninterior). But we certainly prefer that the whole sequence does not converge to the boundary,
due to the main applications of the method. A slow convergence to the boundary is better since it allows
us to take a more interior inexact solution which, in general, is well suited for the application.
A method due to Kanzow et al. [9] is used here for numerical comparisons. This method applies
to the solution of nonlinear systems with convex constraints. A first method introduced in [9] needs
the resolution of rather hard subproblems (even in the bound-constrained case) but the second method,
which is the one that we use for comparisons, has a fairly simple implementation. In this method a
Levenberg–Marquardt step (that does not take into account the constraints at all) is computed at each
iteration and then projected on the feasible region. If the norm of the system at the projected point is a
fraction of the norm of the system at the current point, the projected Levenberg–Marquardt point is taken
as new iterate. If the projected Levenberg–Marquardt point generates a descent direction, this direction is
used for a line search. Otherwise, the method uses a projected gradient procedure to reduce the objective
function. Global convergence in the sense that every limit point is stationary can be obtained using
standard arguments and, under fairly weak assumptions, local and quadratic convergence is proved in [9].
This paper is organized as follows. Algorithm 2.1 is presented in Section 2 and convergence results are
given in Section 3. The algorithm is tested against the projected Levenberg–Marquardt algorithm [9] and
the numerical results are presented in Section 4. Finally some conclusions are drawn in Section 5.
Notation: A† denotes the Moore–Penrose pseudoinverse of the matrix A. G_k means G(x^k). ‖·‖ denotes
the Euclidian norm of vectors and its subordinate matrix norm. ‖·‖_p denotes the p-norm. The ith coordinate
of a point will be denoted [x]_i. If there is no place for confusion, we will also denote x_i = [x]_i. Given
v ∈ R^n, diag(v) denotes the diagonal n × n matrix whose diagonal entries are v_1, . . . , v_n. B(y, r) is
the Euclidian ball with center y and radius r. Int(Ω) denotes the interior of Ω. J(x) is the Jacobian matrix
of F at the point x. For all z ∈ R^n, P_Ω(z) denotes the Euclidian projection on Ω.
We denote f_k = f(x^k), ∇f_k = ∇f(x^k), F_k = F(x^k).

2. Description of the algorithm


Let us define the natural merit function associated with (1):

f(x) = (1/2)‖F(x)‖².                                                      (3)



As is well known,

∇f(x) = J(x)^T F(x).
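For illustration, a minimal Python sketch of the merit function (3) and its gradient follows (the paper's own implementation was written in Matlab; the NumPy code and the names merit_and_gradient, F, J below are ours and purely illustrative):

    import numpy as np

    def merit_and_gradient(F, J, x):
        """Merit function f(x) = 0.5 * ||F(x)||^2 of (3) and its gradient
        grad f(x) = J(x)^T F(x).  F and J are user-supplied callables returning
        the residual vector (shape (m,)) and the Jacobian matrix (shape (m, n))."""
        Fx = F(x)
        Jx = J(x)
        return 0.5 * float(Fx @ Fx), Jx.T @ Fx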
The quadratic model at each iteration is given by
m_k(p) = (1/2)‖J(x^k)p + F(x^k)‖²
       = (1/2)‖F(x^k)‖² + F(x^k)^T J(x^k)p + (1/2)p^T J(x^k)^T J(x^k)p.   (4)
At the kth iteration, the trust region is defined by
‖D_k p‖ ≤ Δ,

where Δ > 0 is the trust-region radius and D_k ≡ D(x^k) is the affine-scaling matrix introduced below.
Given x ∈ Int(Ω), the affine-scaling matrix D(x) is defined in the following way:

D(x) = diag(|v_1(x)|^{−1/2}, . . . , |v_n(x)|^{−1/2}),                    (5)
where

v_i(x) = x_i − u_i   if ∇f(x)_i < 0 and u_i < ∞,
         x_i − l_i   if ∇f(x)_i ≥ 0 and l_i > −∞,
         −1          if ∇f(x)_i < 0 and u_i = ∞,                          (6)
         1           if ∇f(x)_i ≥ 0 and l_i = −∞.
Observe that D(x) is not defined on the boundary of Ω but D(x)^{−1} can be extended continuously to
it. This extension will also be denoted D(x)^{−1}. So,

D(x)^{−1} = diag(|v_1(x)|^{1/2}, . . . , |v_n(x)|^{1/2}),

for all x ∈ Ω, where v_i(x) is given by (6). Clearly, D(x)^{−1} is continuous for all x ∈ Ω.
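As an illustration of (5)-(6), the following Python sketch computes v(x) and the diagonal of D(x)^{−1} (a sketch only; the function and variable names are ours, not the paper's):

    import numpy as np

    def scaling_vector(x, lower, upper, grad):
        """Componentwise v(x) as in (6); lower/upper may contain +/- infinity."""
        v = np.empty_like(x, dtype=float)
        for i in range(x.size):
            if grad[i] < 0.0:
                v[i] = x[i] - upper[i] if np.isfinite(upper[i]) else -1.0
            else:
                v[i] = x[i] - lower[i] if np.isfinite(lower[i]) else 1.0
        return v

    def D_inverse_diag(x, lower, upper, grad):
        """Diagonal of D(x)^{-1} = diag(|v_i(x)|^{1/2}), well defined on all of the box."""
        return np.sqrt(np.abs(scaling_vector(x, lower, upper, grad)))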
From now on we will consider the optimization problem associated with (1):

min f(x)   s.t. x ∈ Ω.                                                    (7)
The first-order optimality conditions of (7) are

∇f(x)_i = 0   if l_i < x_i < u_i,
∇f(x)_i ≤ 0   if x_i = u_i,                                               (8)
∇f(x)_i ≥ 0   if x_i = l_i.

Points that satisfy these conditions are called stationary. In [5, Lemma 2.3] the following lemma has
been proved.

Lemma 2.1. Let x ∈ Ω. Then D(x)^{−1}∇f(x) = 0 if and only if x is stationary.



Given an iterate x^k ∈ Int(Ω) and a direction p ∈ R^n the definitions below are necessary to find a new
interior iterate.
Define

λ(p) = arg max{t ≥ 0 | x^k + tp ∈ Ω}.                                     (9)

Observe that, if λ(p) ≤ 1, one has that x^k + p ∉ Int(Ω) and the step must be reduced. Otherwise, x^k + p ∈
Int(Ω). For given θ ∈ (0, 1), we define

α(p) = 1                            if λ(p) > 1,
       max{θ, 1 − ‖p‖} λ(p)         otherwise.                            (10)

Defining

σ(p) = α(p) p,                                                            (11)

we see that x^k + σ(p) ∈ Int(Ω). In our approach we need to generate interior points because we want to
define trust regions with enough space for decreasing the objective function. So, since x^k + p might be
infeasible or on the boundary, σ(p) is a safely interior multiple of p.
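A minimal Python sketch of the map p → σ(p) defined by (9)-(11) follows (names and the default value of θ, taken from the experiments in Section 4, are illustrative; the iterate x is assumed to be strictly interior):

    import numpy as np

    def interior_step(x, p, lower, upper, theta=0.99995):
        """Return sigma(p) = alpha(p) * p as in (9)-(11), so that x + sigma(p)
        remains strictly inside the box."""
        with np.errstate(divide="ignore", invalid="ignore"):
            t_upper = np.where(p > 0, (upper - x) / p, np.inf)
            t_lower = np.where(p < 0, (lower - x) / p, np.inf)
        lam = min(np.min(t_upper), np.min(t_lower))    # lambda(p) of (9)
        if lam > 1.0:
            alpha = 1.0                                # x + p is already interior
        else:
            alpha = max(theta, 1.0 - np.linalg.norm(p)) * lam
        return alpha * p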
The scaled steepest descent direction d^k is defined by

d^k = −D_k^{−2}∇f(x^k).                                                   (12)

Given a trust-region radius Δ > 0, the Cauchy point will be defined as the vector p_C^k that minimizes the
model m_k (4) along d^k restricted to the trust region. So,

p_C^k = τ_k d^k = −τ_k D_k^{−2}∇f_k,                                      (13)

where

τ_k = arg min_{τ>0} {m_k(τ d^k) | ‖τ D_k d^k‖ ≤ Δ}.

Note that x^k + p_C^k might not belong to Ω.


The Cauchy steplength τ_k can be computed explicitly, as stated in the proposition below. See the proof
in [17, p. 70].

Proposition 2.1. Let p_C^k be the Cauchy direction (13). Then,

τ_k = min{ ‖D_k^{−1}∇f_k‖² / ‖J_k D_k^{−2}∇f_k‖² ,  Δ / ‖D_k^{−1}∇f_k‖ }.
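A Python sketch of the truncated Cauchy step (12)-(13) with the explicit steplength of Proposition 2.1 follows (illustrative names; d_inv is the diagonal of D_k^{−1}, and ∇f_k is assumed nonzero since otherwise the algorithm stops):

    import numpy as np

    def cauchy_step(grad, J, d_inv, delta):
        """Return p_C = tau * d with d = -D^{-2} grad f, using Proposition 2.1."""
        d = -(d_inv ** 2) * grad                      # scaled steepest-descent direction (12)
        scaled_norm = np.linalg.norm(d_inv * grad)    # ||D^{-1} grad f||
        curvature = np.linalg.norm(J @ d) ** 2        # ||J D^{-2} grad f||^2
        tau = scaled_norm ** 2 / curvature if curvature > 0 else np.inf
        tau = min(tau, delta / scaled_norm)           # trust-region truncation
        return tau * d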

For given p ∈ R^n we define, as is usual in trust-region methods, the actual and the predicted reductions
ared(p) and pred(p) by

ared(p) = f(x^k) − f(x^k + σ(p)),
pred(p) = m_k(0) − m_k(σ(p)).

A step p will be admissible if σ(p) produces a sufficient reduction with respect to the reduction attained
by the truncated Cauchy step σ(p_C^k). In other words, acceptance of p is subject to the condition

ρ_C^k(p) = pred(p) / pred(p_C^k) = [m_k(0) − m_k(σ(p))] / [m_k(0) − m_k(σ(p_C^k))] ≥ β₁,      (14)

where β₁ ∈ (0, 1) is given. In Section 4 (implementation features), we explain how we compute p in
practice. Observe that the trivial choice p = p_C^k is clearly admissible. By (14), we consider that a step
is admissible if it produces a sufficient reduction in the value of the quadratic model with respect to the
reduction attained by the truncated Cauchy step. This is a classical requirement that appears in different
ways in numerical optimization. The actual reduction of the objective function, in turn, will be required
to be of the same order as the reduction predicted by the search direction and, so, it will be proportional
to the reduction predicted by the Cauchy direction. This implies that, in the worst case, the method will
behave as a Cauchy-like method.
Each trial point x^k + σ(p) will be accepted if and only if the actual reduction provided by p is big
enough in comparison to the predicted reduction. This condition is

ρ_f^k(p) = ared(p) / pred(p) = [f(x^k) − f(x^k + σ(p))] / [m_k(0) − m_k(σ(p))] ≥ β₂,          (15)

where β₂ ∈ (0, 1) is given.
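For illustration, the two ratios (14)-(15) can be computed as in the Python sketch below (the helper model_value implements the quadratic model (4); all names are ours, and the step arguments are assumed to be the safely interior steps σ(p) and σ(p_C^k)):

    import numpy as np

    def model_value(J, F, p):
        """Quadratic model m_k(p) = 0.5 * ||J p + F||^2 of (4)."""
        r = J @ p + F
        return 0.5 * float(r @ r)

    def acceptance_ratios(f_x, f_trial, J, F, sigma_p, sigma_pC):
        """Ratios rho_C^k of (14) and rho_f^k of (15).  f_x and f_trial are the
        merit values at x^k and at the trial point x^k + sigma(p)."""
        zero = np.zeros_like(sigma_p)
        pred = model_value(J, F, zero) - model_value(J, F, sigma_p)
        pred_cauchy = model_value(J, F, zero) - model_value(J, F, sigma_pC)
        rho_C = pred / pred_cauchy      # admissibility test: rho_C >= beta_1
        rho_f = (f_x - f_trial) / pred  # acceptance test:    rho_f >= beta_2
        return rho_C, rho_f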
The choice of the search direction will require an additional definition. We define the Newtonian
minimum-norm direction p_N^k by

p_N^k = −J(x^k)† F(x^k).                                                  (16)
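The direction (16) can be obtained, for instance, from a minimum-norm least-squares solve, as in the sketch below (the paper's implementation uses a QR factorization, see Section 4; the SVD-based numpy.linalg.lstsq call here is only an illustrative alternative):

    import numpy as np

    def newton_min_norm(J, F):
        """Minimum-norm Newtonian (normal-flow) direction p_N = -J^dagger F of (16).
        numpy.linalg.lstsq returns the minimum-norm least-squares solution, which
        equals -J^T (J J^T)^{-1} F when J has full row rank."""
        p, *_ = np.linalg.lstsq(J, -F, rcond=None)
        return p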
Now we are able to define the main algorithm.

Algorithm 2.1. Let x^0 ∈ Int(Ω), Δ_min > 0, Δ > Δ_min and θ, β₁, β₂, ω₀, ω₁ ∈ (0, 1) such that ω₀ < ω₁.
Set k ← 0.
Step 1: Compute J_k, F_k and D_k.
Step 2: If ∇f(x^k) = 0 terminate the execution of the algorithm. In this case x^k is a stationary point
of min_{x∈Ω} f(x).
Step 3: Compute

p_N^k = −J(x^k)† F(x^k).

Step 4: If

‖D_k p_N^k‖ ≤ Δ   and   ρ_C^k(p_N^k) ≥ β₁,                                (17)

define p^k = p_N^k.
Else, find p^k such that ‖D_k p^k‖ ≤ Δ and ρ_C^k(p^k) ≥ β₁, with p_C^k given by (13).
Step 5: If

ρ_f^k(p^k) ≥ β₂                                                           (18)

compute x^{k+1} = x^k + σ(p^k) and go to Step 6.
Else, choose Δ_new ∈ [ω₀Δ, ω₁Δ], set Δ ← Δ_new and go to Step 4.
Step 6: Set k ← k + 1, choose Δ > Δ_min and go to Step 1.

3. Convergence results

In order to prove the convergence results for Algorithm 2.1 we will use the following assumptions.

H1. The sequence {x^k} generated by the algorithm is bounded. (A sufficient condition for this is that the
level set

{x ∈ Ω | f(x) ≤ f(x^0)}

is bounded.)
H2. For all x, y in an open, bounded and convex set L that contains the whole sequence generated by the
algorithm and all the points of the form x^k + σ(p^k), we have that

‖J(x) − J(y)‖ ≤ 2γ₀‖x − y‖.                                               (19)

(Clearly, a sufficient condition for this is that (19) holds for all x, y ∈ Ω.)
H3. J(x) has full rank m for all x ∈ L.
Note that since F is continuously differentiable we also have that there exists γ₁ > 0 such that for all
x, y ∈ L

‖F(x) − F(y)‖ ≤ γ₁‖x − y‖.                                                (20)

Clearly, by H1 and H3, there exists γ > 0 such that

‖J(x)†‖ = ‖J(x)^T (J(x)J(x)^T)^{−1}‖ ≤ γ                                   (21)

for all x ∈ L.
Moreover, by H2, we have that for all x, y ∈ L,

‖F(x) − F(y) − J(y)(x − y)‖ = ‖∫₀¹ [J(y + t(x − y)) − J(y)](x − y) dt‖
                            ≤ (∫₀¹ 2γ₀ t dt) ‖x − y‖²
                            = γ₀‖x − y‖².                                  (22)
The proof of the following lemma is essentially the one of Lemma 3.1 of [2].

Lemma 3.1. Assume that H2 holds and suppose that p ∈ R^n is such that

‖J(x^k)σ(p) + F(x^k)‖ ≤ ‖F(x^k)‖.

Then

|ared(p) − pred(p)| ≤ ε^k(p)‖σ(p)‖²,

where

ε^k(p) = γ₀‖F(x^k)‖ + (1/2)γ₀²‖σ(p)‖².

Proof. By the Mean-Value Theorem we have

F(x^k + σ(p)) = F(x^k) + ∫₀¹ J(x^k + tσ(p))σ(p) dt.

Define

s_k = ∫₀¹ (J(x^k + tσ(p)) − J(x^k))σ(p) dt.

Note that ‖s_k‖ ≤ γ₀‖σ(p)‖². So,

2|ared(p) − pred(p)| = 2|m_k(σ(p)) − f(x^k + σ(p))|
                     = | ‖F(x^k) + J(x^k)σ(p)‖² − ‖F(x^k) + J(x^k)σ(p) + s_k‖² |
                     ≤ 2‖F(x^k) + J(x^k)σ(p)‖ ‖s_k‖ + ‖s_k‖²
                     ≤ 2γ₀‖F(x^k)‖ ‖σ(p)‖² + γ₀²‖σ(p)‖⁴.

Therefore,

|ared(p) − pred(p)| ≤ ε^k(p)‖σ(p)‖²

and the thesis follows straightforwardly. □

Now we state a technical lemma. Its proof is quite similar to the ones of [2, Lemma 3.3] and [5, Lemma
3.1], with small modifications.

Lemma 3.2. If p^k is such that ρ_C^k(p^k) ≥ β₁, then

pred(p^k) ≥ (1/2)β₁‖D_k^{−1}∇f_k‖ min{ ‖D_k^{−1}∇f_k‖ / ‖D_k^{−1}J_k^T J_k D_k^{−1}‖ , Δ , θ‖D_k^{−1}∇f_k‖ / ‖∇f_k‖_∞ },    (23)

where θ is the constant used in (10).

The next lemma says that the algorithm is well defined. Its proof follows closely the lines of Lemma
3.4 of [2].

Lemma 3.3. Assume that H2 is fulfilled. If ∇f_k ≠ 0, then the loop that defines an iteration of Algorithm
2.1 finishes after a finite number of cycles.

Proof. It suffices to prove that, after a finite number of step reductions, the algorithm finds a direction
p^k such that ρ_f^k(p^k) ≥ β₂ holds for some Δ small enough.
Assume that

Δ ≤ min{ ‖D_k^{−1}∇f_k‖ / ‖D_k^{−1}J_k^T J_k D_k^{−1}‖ , θ‖D_k^{−1}∇f_k‖ / ‖∇f_k‖_∞ }.

So by Lemma 3.2,

Δ ≤ C̃_k pred(p^k),

where C̃_k = 2/(β₁‖D_k^{−1}∇f_k‖). Since ‖D_k p^k‖ ≤ Δ, we get

‖σ(p^k)‖ ≤ ‖p^k‖ ≤ ‖D_k^{−1}‖Δ ≤ ‖D_k^{−1}‖ C̃_k pred(p^k).                (24)

As m_k(σ(p^k)) ≤ m_k(0), by Lemma 3.1, Eq. (24) and the fact that ‖σ(p^k)‖ ≤ ‖D_k^{−1}‖Δ imply:

|ared(p^k) − pred(p^k)| ≤ ε^k(p^k)‖σ(p^k)‖²
                        ≤ ε^k(p^k)‖D_k^{−1}‖² C̃_k Δ pred(p^k).

Thus,

|ρ_f^k(p^k) − 1| ≤ ε^k(p^k)‖D_k^{−1}‖² C̃_k Δ.

But

ε^k(p^k) ≤ γ₀‖F_k‖ + (1/2)γ₀²‖D_k^{−1}‖²Δ²,

so

lim_{Δ→0} |ρ_f^k(p^k) − 1| = 0.

Therefore, there exists Δ* such that ρ_f^k(p^k) ≥ β₂ for all Δ ≤ Δ*. Taking

Δ ≤ min{ Δ*, ‖D_k^{−1}∇f_k‖ / ‖D_k^{−1}J_k^T J_k D_k^{−1}‖ , θ‖D_k^{−1}∇f_k‖ / ‖∇f_k‖_∞ },

the condition ρ_f^k(p^k) ≥ β₂ is satisfied and the proof is complete. □

The following lemma, whose proof is straightforward, will be used in the convergence proof.

Lemma 3.4. Assume that H1 holds. Then, there exists c_D > 0 such that

‖D(x^k)^{−1}‖ ≤ c_D.

Now we give the global convergence result.

Theorem 3.1. Assume that H1 and H2 are fulfilled and that the algorithm generates an infinite sequence
{x^k}. Then all the limit points are stationary points of

min_{x∈Ω} f(x).                                                           (25)

Proof. Let x* ∈ Ω be a limit point of {x^k}. Then there exists N₁, an infinite subset of {0, 1, 2, . . .}, such
that x^k → x* for k ∈ N₁.
First we are going to prove that

lim_{k∈N₁} ‖D_k^{−1}∇f_k‖ = 0.                                            (26)

Assume that (26) is not true. Then there exists ε > 0 such that ‖D_k^{−1}∇f_k‖ ≥ ε for an infinite set of indices
N₂ ⊆ N₁.
By Lemma 3.2 and (15) we deduce that

ared(p^k) = f(x^k) − f(x^{k+1}) ≥ β₂ pred(p^k)
          ≥ (1/2)β₁β₂‖D_k^{−1}∇f_k‖ min{ ‖D_k^{−1}∇f_k‖ / ‖D_k^{−1}J_k^T J_k D_k^{−1}‖ , Δ_k , θ‖D_k^{−1}∇f_k‖ / ‖∇f_k‖_∞ },    (27)

for all k ∈ N₁. By H1, there exists c_g such that ‖J(x)^T J(x)‖ ≤ c_g and ‖∇f(x)‖_∞ ≤ c_g for all x ∈ L.
Since D(x)^{−1} is bounded in L, there exists c_f > 0 such that ‖D_k^{−1}J_k^T J_k D_k^{−1}‖ ≤ c_f for all k ∈ N₁. Then
by (27),

f(x^k) − f(x^{k+1}) ≥ (1/2)β₁β₂ ε min{ ε/c_f , Δ_k , θε/c_g },            (28)

for all k ∈ N₂. Since {f(x^k)}_{k∈{0,1,2,...}} is monotone nonincreasing and bounded below,

lim_{k→∞} (f(x^k) − f(x^{k+1})) = 0.

Let Δ_k be the final trust-region radius defined at iteration k. By (28), we have that

lim_{k∈N₂} Δ_k = 0.

Now, at each iteration k the first trial trust-region radius is strictly greater than Δ_min. Thus, the fact that
Δ_k → 0 for k ∈ N₂ implies that, for k ∈ N₂ large enough, there exist Δ̄_k and p̄^k ≡ p̄(Δ̄_k) such that
lim_{k∈N₂} Δ̄_k = 0, ρ_C^k(p̄^k) ≥ β₁, ‖D_k p̄^k‖ ≤ Δ̄_k and ρ_f^k(p̄^k) < β₂.
By Lemma 3.4, we have that ‖D_k^{−1}‖ ≤ c_D. So,

‖σ(p̄^k)‖ ≤ ‖p̄^k‖ ≤ c_D‖D_k p̄^k‖ ≤ c_D Δ̄_k.

By Lemma 3.1, we have:

|ared(p̄^k) − pred(p̄^k)| ≤ ε^k(p̄^k) c_D² Δ̄_k² = κ_k Δ̄_k,

where κ_k = ε^k(p̄^k) c_D² Δ̄_k. Moreover, by Lemma 3.2 and the contradiction hypothesis, for k ∈ N₂ large
enough we have that

pred(p̄^k) ≥ (1/2)β₁ε Δ̄_k,

so

|ared(p̄^k) − pred(p̄^k)| ≤ (2κ_k/(β₁ε)) pred(p̄^k)

and, therefore,

|ρ_f^k(p̄^k) − 1| ≤ 2κ_k/(β₁ε),

for k ∈ N₂ large enough.
Since {ε^k(p̄^k)}_k is bounded, it follows that lim_{k∈N₂} κ_k = 0 and

lim_{k∈N₂} |ρ_f^k(p̄^k) − 1| = 0.

This contradicts the fact that ρ_f^k(p̄^k) < β₂ for k ∈ N₂ large enough. Therefore,

lim_{k∈N₁} ‖D_k^{−1}∇f_k‖ = 0.

By Lemma 2.1 the proof is complete. □

Corollary 3.1. Under the assumptions of Theorem 3.1 and H3, if a limit point x* belongs to Int(Ω) then
F(x*) = 0.

Proof. By Theorem 3.1, x* is a stationary point of the problem (7). But, since x* ∈ Int(Ω),

J(x*)^T F(x*) = 0.

Then, by H3, F(x*) = 0. □

The following auxiliary lemma will be used in the quadratic convergence proof.

Lemma 3.5. Let z ∈ Int(Ω). Then there exist r > 0 and D₁ > 0 such that ‖D(x)‖ < D₁ for all x ∈
B(z, r) ⊂ Int(Ω).

Proof. Since z ∈ Int(Ω), there exists r ∈ (0, 1] such that B(z, 2r) ⊂ Int(Ω). Define D₁ = 1/r. Then,
for all x ∈ B(z, r),

|l_i − x_i|, |u_i − x_i| > r   for i = 1, . . . , n.

So, by (5), it follows that ‖D(x)‖ < 1/r = D₁. □
In the following lemmas we prove that the minimum-norm Newtonian step p_N^k is necessarily accepted
in a neighborhood of an interior solution x*.

Lemma 3.6. Assume that H1–H3 are fulfilled, K is an infinite sequence of indices such that

lim_{k∈K} x^k = x* ∈ Int(Ω)

and

F(x*) = 0.

Then there exists k₀ ∈ {0, 1, 2, . . .} such that for k ≥ k₀, k ∈ K, the Newton step p_N^k given by (16) satisfies
(17) and (18).

Proof. By the continuity of F,

lim_{k∈K} ‖F_k‖ = ‖F(x*)‖ = 0.                                            (29)

Since {x^k}_{k∈K} is bounded and x* ∈ Int(Ω), by Lemma 3.5 there exists c̄_D > 0 such that ‖D_k‖ ≤ c̄_D
for all k ∈ K. So,

‖D_k p_N^k‖ ≤ ‖D_k‖ ‖J(x^k)† F_k‖ ≤ c̄_D γ‖F_k‖

and

lim_{k∈K} ‖D_k p_N^k‖ = 0.

Then, there exists k₁ ∈ {0, 1, 2, . . .} such that for k ≥ k₁, k ∈ K,

‖D_k p_N^k‖ ≤ Δ_min.

By (29) and (21), we have that

lim_{k∈K} ‖p_N^k‖ ≤ γ lim_{k∈K} ‖F_k‖ = 0.

By (9) and (10), since x* ∈ Int(Ω), there exists k₂ ∈ {0, 1, 2, . . .} such that λ(p_N^k) > 1 for k ≥ k₂, k ∈ K.
Consequently, α(p_N^k) = 1 and σ(p_N^k) = p_N^k for k ≥ k₂, k ∈ K.
Define k̄ = max{k₁, k₂}. Note that, for all k ≥ k̄, k ∈ K,

pred(p_N^k) = m_k(0) − m_k(p_N^k) = (1/2)‖F_k‖²                            (30)

and

pred(p_C^k) ≤ m_k(0) = (1/2)‖F_k‖².

So ρ_C^k(p_N^k) ≥ β₁.
Now, since lim_{k∈K} ‖F_k‖ = lim_{k∈K} ‖p_N^k‖ = 0, defining ε^k(p) as in Lemma 3.1 we have

lim_{k∈K} ε^k(p_N^k) = 0.

Since ‖J_k p_N^k + F_k‖ = 0 ≤ ‖F_k‖, by Lemma 3.1 we have that

|ared(p_N^k) − pred(p_N^k)| ≤ ε^k(p_N^k)‖p_N^k‖².

Dividing by pred(p_N^k) and using (30), we obtain

| ared(p_N^k)/pred(p_N^k) − 1 | ≤ 2ε^k(p_N^k)‖p_N^k‖² / ‖F_k‖²,

for all k ≥ k̄, k ∈ K.
But ‖p_N^k‖ ≤ γ‖F_k‖, so for k ≥ k̄,

|ρ_f^k(p_N^k) − 1| ≤ 2γ²ε^k(p_N^k).

Since ε^k(p_N^k) → 0, this implies that

lim_{k∈K} ρ_f^k(p_N^k) = 1.

Therefore, there exists k₀ ≥ k̄ such that ρ_f^k(p_N^k) ≥ β₂ for all k ≥ k₀, k ∈ K. This completes the proof. □

Lemma 3.7. Assume that H1–H3 are fulfilled, the interior point x* is a limit point of the sequence {x^k}
and

F(x*) = 0.

Then, there exists ε > 0 such that, whenever ‖x^k − x*‖ ≤ ε, the Newton step p_N^k satisfies (17) and (18).

Proof. Assume that the thesis is not true. Then, for all ε > 0 there exists an iterate x^k (k depending on ε)
such that ‖x^k − x*‖ ≤ ε and at least one of the conditions (17)–(18) is not fulfilled by p_N^k. We consider
two possibilities:
(a) There exists ε > 0 such that the number of iterates for which ‖x^k − x*‖ ≤ ε and at least one of the
conditions (17)–(18) is not fulfilled by p_N^k is finite.
(b) For all ε > 0 there exist infinitely many iterates such that ‖x^k − x*‖ ≤ ε and at least one of the
conditions (17)–(18) is not fulfilled by p_N^k.
Consider Case (a). Let x^k̄ be the closest iterate to x* such that ‖x^k̄ − x*‖ ≤ ε and at least one of
the conditions (17)–(18) is not fulfilled by p_N^k̄. Take ε̄ = ‖x^k̄ − x*‖/2. Then all the iterates that satisfy
‖x^k − x*‖ ≤ ε̄ are such that p_N^k fulfills (17)–(18). This is precisely the thesis of the Lemma.
So, we only need to consider Case (b). Let us construct a subsequence of {x^k} as follows:
x^{j(1)} is such that ‖x^{j(1)} − x*‖ ≤ 1 and at least one of the conditions (17)–(18) is not fulfilled by p_N^{j(1)};
For k > 1, x^{j(k)} is such that ‖x^{j(k)} − x*‖ ≤ 1/k, at least one of the conditions (17)–(18) is not fulfilled
by p_N^{j(k)} and j(k) > j(k − 1). The fact that there exist infinitely many iterates whose distance to x* is
smaller than 1/k and not satisfying at least one of the conditions (17)–(18) guarantees that it is possible
to choose j(k) > j(k − 1).
By construction, the sequence {x^{j(k)}} is a subsequence of {x^k}, converges to x* and each one of its
elements does not satisfy at least one of the conditions (17)–(18). This contradicts Lemma 3.6. □

Lemma 3.8. Assume that H1–H3 hold and let x* ∈ Int(Ω) be such that F(x*) = 0. Let ε > 0 be as given by
the thesis of Lemma 3.7. Then there exists δ > 0 such that if x^{k₀} ∈ B(x*, δ) the sequence {x^k} generated
by the main algorithm satisfies

x^k ∈ B(x*, ε)

and

x^{k+1} = x^k + p_N^k

for all k ≥ k₀.

Proof. Define

c₁ = γγ₀,

δ = min{ ε/(2(1 + 2γγ₁)) , 1/(2c₁γγ₁) },                                  (31)

where γ₁ is defined in (20).
Let x^{k₀} ∈ B(x*, δ). We are going to prove by induction that for all k ≥ k₀,

x^k ∈ B(x*, ε).                                                           (32)

Since δ ≤ ε, (32) is obviously true for k = k₀.
Assume, as inductive hypothesis, that

x^j ∈ B(x*, ε),   j = k₀, k₀ + 1, . . . , k.                              (33)

We wish to prove that x^{k+1} ∈ B(x*, ε). By Lemma 3.7 and the inductive hypothesis,

x^{j+1} = x^j + p_N^j = x^j − J(x^j)† F(x^j),   j ∈ {k₀, k₀ + 1, . . . , k}.   (34)

By (21),

‖p_N^j‖ ≤ γ‖F(x^j)‖                                                       (35)

for all j ∈ {k₀, k₀ + 1, . . . , k}.
Since J(x^j)J(x^j)† = I_{m×m}, by (22) we have that

‖F(x^{j+1})‖ = ‖F(x^{j+1}) − F(x^j) − J(x^j)p_N^j‖ ≤ γ₀‖p_N^j‖²

for all j ∈ {k₀, k₀ + 1, . . . , k}.
Then, by (35),

‖p_N^j‖ ≤ γγ₀‖p_N^{j−1}‖² = c₁‖p_N^{j−1}‖²                                 (36)

for all j ∈ {k₀ + 1, . . . , k}.
Now, by the inductive hypothesis and (34),

‖x^{k+1} − x*‖ = ‖x^k + p_N^k − x*‖ ≤ ‖x^k − x*‖ + ‖p_N^k‖
             = ‖x^{k−1} + p_N^{k−1} − x*‖ + ‖p_N^k‖
             ≤ ‖x^{k−1} − x*‖ + ‖p_N^{k−1}‖ + ‖p_N^k‖
             ...
             ≤ ‖x^{k₀} − x*‖ + Σ_{j=k₀}^{k} ‖p_N^j‖.                       (37)

By (36),

‖p_N^j‖ ≤ c₁‖p_N^{j−1}‖² ≤ c₁ c₁² ‖p_N^{j−2}‖^{2²}
        ...
        ≤ c₁ c₁² · · · c₁^{2^{j−k₀−1}} ‖p_N^{k₀}‖^{2^{j−k₀}}
        = c₁^{Σ_{i=0}^{j−k₀−1} 2^i} ‖p_N^{k₀}‖^{2^{j−k₀}},                 (38)

for all j ∈ {k₀ + 1, . . . , k}.
But, by (20) and (21),

‖p_N^{k₀}‖ = ‖J(x^{k₀})† F(x^{k₀})‖ ≤ γ‖F(x^{k₀})‖ = γ‖F(x^{k₀}) − F(x*)‖ ≤ γγ₁‖x^{k₀} − x*‖.

Since Σ_{i=0}^{j−k₀−1} 2^i = 2^{j−k₀} − 1, by (37) and (38) we obtain

‖x^{k+1} − x*‖ ≤ ‖x^{k₀} − x*‖ + Σ_{j=k₀}^{k} c₁^{2^{j−k₀}−1} ‖p_N^{k₀}‖^{2^{j−k₀}}
             ≤ ‖x^{k₀} − x*‖ + Σ_{j=k₀}^{k} c₁^{2^{j−k₀}−1} (γγ₁)^{2^{j−k₀}} ‖x^{k₀} − x*‖^{2^{j−k₀}}
             ≤ δ + γγ₁δ Σ_{j=k₀}^{k} (c₁γγ₁δ)^{2^{j−k₀}−1}.                (39)

So, by (31) and (39),

‖x^{k+1} − x*‖ ≤ δ + γγ₁δ Σ_{j=k₀}^{k} (1/2)^{2^{j−k₀}−1}
             ≤ δ (1 + γγ₁ Σ_{j=k₀}^{k} (1/2)^{j−k₀})
             ≤ δ (1 + 2γγ₁)
             ≤ ε/2.

Therefore x^{k+1} ∈ B(x*, ε). This completes the proof. □

Now we can state our final convergence result.

Theorem 3.2. Assume that the assumptions H1–H3 hold and that x* ∈ Int(Ω) is a limit point of the
sequence {x^k} generated by Algorithm 2.1. Then F(x*) = 0 and x^k converges quadratically to x*.

Proof. The fact that F(x*) = 0 follows from J(x*)^T F(x*) = 0 and Corollary 3.1. By Lemma 3.8 there
exists δ > 0 such that if ‖x^{k₀} − x*‖ ≤ δ then x^{k+1} = x^k + p_N^k for all k ≥ k₀. Since x* is a limit point of
{x^k} it follows that x^{k+1} = x^k + p_N^k for all k large enough. So by Walker and Watson [23, Theorem 2.1],
the sequence converges quadratically to some solution. Since x* is a limit point, the sequence converges
to x*. This completes the proof. □

4. Numerical experiments

We tested Algorithm 2.1 against a projected Levenberg–Marquardt method (called KYF from now on)
for solving nonlinear systems on convex sets proposed recently by Kanzow, Yamashita and Fukushima.
See Algorithm 3.12 of [9].
The KYF method is described below.

Algorithm 4.1 (KYF). Choose x^0 ∈ Ω, ρ > 0, β, σ, γ ∈ (0, 1).
Set k ← 0.
Step 1: If F(x^k) = 0, terminate the execution of the algorithm.
Step 2: Define μ_k = ‖F_k‖² and compute d_U^k by

(J(x^k)^T J(x^k) + μ_k I) d_U = −J(x^k)^T F_k.                            (40)

Step 3: If ‖F(P_Ω(x^k + d_U^k))‖ ≤ γ‖F_k‖ define x^{k+1} = P_Ω(x^k + d_U^k), set k ← k + 1 and go to Step 1.
Else, go to Step 4.
Step 4: Define s_LM^k = P_Ω(x^k + d_U^k) − x^k. If ∇f(x^k)^T s_LM^k ≤ −ρ‖s_LM^k‖^p, compute t_k = max{β^i | i = 0, 1, . . .}
such that

f(x^k + t_k s_LM^k) ≤ f(x^k) + σ t_k ∇f(x^k)^T s_LM^k,

define x^{k+1} = x^k + t_k s_LM^k, set k ← k + 1 and go to Step 1. Otherwise, go to Step 5.
Step 5: Compute t^k = max{β^i | i = 0, 1, . . .} such that

f(x^k(t^k)) ≤ f(x^k) + σ∇f(x^k)^T (x^k(t^k) − x^k),

where x^k(t) = P_Ω(x^k − t∇f(x^k)). Define x^{k+1} = x^k(t^k), set k ← k + 1 and go to Step 1.
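For illustration, a Python sketch of the projected Levenberg–Marquardt trial point computed at Steps 2-3 follows (the line searches of Steps 4-5 are omitted; the names are ours and purely illustrative):

    import numpy as np

    def kyf_trial_point(x, J, F, lower, upper):
        """Solve (J^T J + mu_k I) d_U = -J^T F with mu_k = ||F||^2, cf. (40),
        and project the full step onto the box, i.e. return P_Omega(x + d_U)."""
        mu = float(F @ F)
        d_U = np.linalg.solve(J.T @ J + mu * np.eye(x.size), -(J.T @ F))
        return np.clip(x + d_U, lower, upper)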

We implemented Algorithm KYF with different values for the regularization parameter μ: 10^{−1}, 10^{−5}
and 10^{−7}. Proceeding as in [9], another update formula for μ_k, called here μ*, was tried too: starting with
μ₀ = (1/2)10^{−8}‖F(x^0)‖, the formula is given by

μ_k = min{ μ_{k−1} , ‖F(x^k)‖² }.

This choice does not change the theoretical properties of the algorithm. The remaining parameters were the
ones used in [9] (β = 0.9, γ = 0.99995, ρ = 10^{−8}, p = 2.1 and σ = 10^{−4}). The algorithms were implemented
in Matlab 5.3.
We used test problems described in [7] and a set of problems defined by feasible sets of nonlinear
programming problems in the book of Hock and Schittkowski [8]. These problems are of the form
F(x) = 0,   x ∈ Ω,

where Ω is a box and F: R^n → R^m. Table 1 shows the problem data in the following way:
Column 1: Number of the problem;
Column 2: Source of the problem;
Columns 3 and 4: dimensions m and n.

Table 1
Problems data

Problem   Source               m     n

1         Problem 46 in [8]    2     5
2         Problem 53 in [8]    3     5
3         Problem 56 in [8]    4     7
4         Problem 63 in [8]    2     3
5         Problem 75 in [8]    3     4
6         Problem 77 in [8]    2     5
7         Problem 79 in [8]    3     5
8         Problem 81 in [8]    3     5
9         Problem 87 in [8]    4     6
10        Problem 107 in [8]   6     9
11        Problem 109 in [8]   6     9
12        Problem 111 in [8]   3     10
13        Problem 2 in [9]     150   300
14        Problem 4 in [9]     150   300
15        —                    1     2
Problems 1, 3, 6 and 7 were, originally, unconstrained. So, we introduced artificial bounds 0 ≤ x_i ≤ 2.5
for all i. Therefore, only Problems 5, 13 and 14 do not have bound constraints. In Problems 10 and
11 the variables x_1 and x_2 are unbounded above, and in Problem 4 all the variables are unbounded
above. Problem 13 is a linear system and Problem 14 is a quadratic system. We also tested both methods
for the problem defined by

F(x_1, x_2) = x_2 − (1/100) x_1,   with −∞ ≤ x_1 ≤ ∞ and x_2 ≥ 0,

which was called Problem 15.


Let us enumerate some implementation features:

1. For both methods the Jacobian matrix was approximated by finite differences.
2. For both methods x^k was accepted as a solution if ‖F(x^k)‖ ≤ 10^{−6}.
3. We allowed each method to perform 5000 iterations and 10 000 function evaluations (not considering
the evaluations used to approximate the Jacobian).
4. Algorithm 2.1 was also prepared to stop when Δ ≤ 10^{−8} or when ‖D_k^{−1}∇f(x^k)‖ < 10^{−10}, but this never
happened in our experiments. If ‖D_k^{−1}‖_∞ ≤ 7.45 × 10^{−155} we say that D_k^{−1} is numerically singular
and we stop the process.
5. The parameters for Algorithm 2.1 were: Δ = ‖D(x^0)^{−1}∇f(x^0)‖, Δ_min = 5 × 10^{−4}, θ = 0.99995,
β₁ = 0.1, β₂ = 0.25 and ω₁ = 0.25.
6. The choice of Δ_new at Step 5 of Algorithm 2.1 was

Δ_new = min{ ω₁Δ , (1/2)‖D_k σ(p^k)‖ }.

7. At Step 6, if ρ_f^k(p^k) ≥ 0.75 then

Δ = max{ Δ_min , Δ , 2‖D_k σ(p^k)‖ },

otherwise

Δ = max{ Δ_min , Δ }.

8. At Step 4 of Algorithm 2.1 we need to find a direction p^k satisfying ρ_C^k(p^k) ≥ β₁ and ‖D_k p^k‖ ≤ Δ.
With this objective, we used the dogleg method (see [17]); a sketch of the computation is given after this
list. The dogleg direction p_d^k is computed as follows:

p_d^k = −Δ D_k^{−2}∇f_k / ‖D_k^{−1}∇f_k‖      if ‖D_k p_C^k‖ ≥ Δ,
p_d^k = p_C^k + (τ − 1)(p_N^k − p_C^k)        otherwise,

where τ is the positive solution of

‖D_k (p_C^k + (τ − 1)(p_N^k − p_C^k))‖² = Δ².

If p_d^k does not satisfy the condition ρ_C^k(p_d^k) ≥ β₁, we choose p^k = p_C^k. In this way we guarantee that
a step satisfying (14) is necessarily obtained.
9. For finding the minimum-norm Newton direction we used the QR factorization.
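A Python sketch of the dogleg computation described in item 8 follows (illustrative names; d_inv holds the diagonal of D_k^{−1}, so the diagonal of D_k is its reciprocal, and the iterate is assumed to be interior):

    import numpy as np

    def dogleg_direction(grad, d_inv, p_cauchy, p_newton, delta):
        """Scaled dogleg direction p_d of implementation feature 8."""
        d = 1.0 / d_inv                              # diagonal of D_k
        if np.linalg.norm(d * p_cauchy) >= delta:
            # the scaled Cauchy direction already reaches the trust-region boundary
            return -delta * (d_inv ** 2) * grad / np.linalg.norm(d_inv * grad)
        c = d * p_cauchy
        s = d * (p_newton - p_cauchy)
        # solve ||c + t s||^2 = delta^2 for t = tau - 1 (positive root)
        a = float(s @ s)
        b = 2.0 * float(c @ s)
        c0 = float(c @ c) - delta ** 2               # negative, so exactly one positive root
        t = (-b + np.sqrt(b * b - 4.0 * a * c0)) / (2.0 * a)
        return p_cauchy + t * (p_newton - p_cauchy)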

The strictly feasible initial approximation x^0 was chosen to be x^0 = (l + u)/2 except in the following
situations.

1. In Problems 2 and 8 we took x^0 = l + (1/4)(u − l), because the initial point above is a stationary point
of min_{x∈Ω} f(x).
2. In Problems 13 and 14 we took x^0 = 150(1, . . . , 1)^T, which is one of the initial guesses suggested in
[9].
3. In Problem 15 we considered x^0 = (−1/2, 1/2).

In Table 2 we present the results of Algorithm 2.1 for this set of problems. The fourth column shows
the norm of F evaluated at the respective initial approximation. The last column shows the number
of iterations in which the projected gradient direction (p_C^k), the dogleg direction (p_d^k) and the Newton
direction (p_N^k) were accepted, respectively. The notation MAXIT indicates that the allowed number of
iterations was exhausted, MAXFUN indicates that the allowed number of evaluations was exhausted and
SINGUL indicates that D_k^{−1} is nearly singular.
In Table 3 we compare Algorithm 2.1 with Algorithm 4.1. In the sixth column we display the number
of Levenberg–Marquardt (LM), line-search (LS) and projected gradient (GP) directions accepted by
Algorithm 4.1, respectively.

Table 2
Numerical results, Algorithm 2.1

Problem  (m, n)      Iterations  ‖F(x^0)‖        ‖F(x*)‖          Evaluations  (p_C^k)/(p_d^k)/(p_N^k)

1        (2,5)       5           3.2             5.3 × 10^{−10}   6            0/0/5
2        (3,5)       1           1.0 × 10        3.2 × 10^{−15}   2            0/0/1
3        (4,7)       5           4.4             1.1 × 10^{−12}   6            0/0/5
4        (2,3)       5           6.4 × 10        2.0 × 10^{−9}    6            0/0/5
5        (3,4)       7           2.7 × 10^{3}    9.9 × 10^{−10}   10           0/2/5
6        (2,5)       4           4.4             4.3 × 10^{−7}    5            0/0/4
7        (3,5)       3           1.6             1.4 × 10^{−7}    4            0/0/3
8        (3,5)       308         1.1 × 10        9.9 × 10^{−7}    315          290/5/13
9        (4,6)       SINGUL      4.6 × 10^{3}    4.4 × 10^{3}     6            4/0/1
10       (6,9)       229         5.8             9.9 × 10^{−7}    230          225/1/3
11       (6,9)       MAXIT       5.2 × 10^{4}    2.9 × 10^{3}     5001         4997/0/3
12       (3,10)      16          9.5             8.8 × 10^{−7}    17           0/0/16
13       (150,300)   2           6.5 × 10^{3}    0.0              3            0/0/2
14       (150,300)   11          2.7 × 10^{5}    8.3 × 10^{−12}   12           0/0/11
15       (1,2)       53          5.1 × 10^{−1}   5.8 × 10^{−14}   54           51/0/2

5. Conclusions

We introduced a new algorithm for solving nonlinear systems of equations with bounded variables.
Our main motivation is the “feasibility phase” of nonlinear programming algorithms based on periodic
restoration. In this case, most solutions are interior points. Moreover, one is often interested only in
interior solutions. With this in mind, our algorithm may be interpreted as a globalization of the normal-
flow method for solving underdetermined systems. The strategy for globalization is essentially the one
introduced in [5] and adapted in [2] for the resolution of bounded square nonlinear systems. A limited
number of numerical experiments show that the algorithm behaves as expected. In most iterations, it
reduces to the normal-flow method and, when it does not, the global strategy is able to lead the iterates to
a solution. For most cases, quadratic convergence was “observed” in practice, in the vicinity of interior
solutions. The KYF algorithm [9] can also be considered a globalization of the normal-flow method, at
least when the regularization parameter is very small, which corresponds to the situation in which the
best results were observed. In general, the rather simple globalization strategy of KYF, based on the
projected gradient, is not as efficient as the interior-point strategy used in Algorithm 2.1. Nevertheless,
it must be mentioned that KYF was introduced with the aim of solving more general problems than the
ones considered in this paper. Since many nonlinear programming problems are large, both in the number
of constraints and in the number of variables, future research will consider the extension of this type of
algorithm in order to deal with sparsity of the Jacobian and to consider the possibility of using
iterative linear solvers for computing the Newtonian normal-flow direction. Moreover, the difficulty of
evaluating the Jacobian leads one to analyze the extension of underdetermined quasi-Newton methods
[11,23] to the bound-constrained case.

Table 3
Comparison of Algorithm KYF with Algorithm 2.1

                   Algorithm KYF                                                  Algorithm 2.1

Problem  μ         Iterations  ‖F(x*)‖          Evaluations  LM/LS/GP    Iterations  ‖F(x*)‖          Evaluations

1        10^{−1}   5           6.3 × 10^{−10}   6            5/0/0
         10^{−5}   5           5.3 × 10^{−10}   6            5/0/0       5           5.3 × 10^{−10}   6
         10^{−7}   5           5.3 × 10^{−10}   6            5/0/0
         μ*        5           5.3 × 10^{−10}   6            5/0/0

2        10^{−1}   7           1.5 × 10^{−12}   8            7/0/0
         10^{−5}   2           6.5 × 10^{−11}   3            2/0/0       1           3.2 × 10^{−15}   2
         10^{−7}   2           1.2 × 10^{−12}   3            2/0/0
         μ*        2           1.9 × 10^{−14}   3            2/0/0

3        10^{−1}   4           3.1 × 10^{−11}   5            4/0/0
         10^{−5}   5           1.1 × 10^{−12}   6            5/0/0       5           1.1 × 10^{−12}   6
         10^{−7}   5           1.1 × 10^{−12}   6            5/0/0
         μ*        5           1.1 × 10^{−12}   6            5/0/0

4        10^{−1}   6           1.8 × 10^{−9}    7            6/0/0
         10^{−5}   5           2.8 × 10^{−8}    6            5/0/0       5           2.0 × 10^{−9}    6
         10^{−7}   5           3.0 × 10^{−8}    6            5/0/0
         μ*        5           3.0 × 10^{−8}    6            5/0/0

5        10^{−1}   MAXIT       2.1 × 10^{3}     9996         5/4995/0
         10^{−5}   31          5.8 × 10^{−10}   33           30/1/0      7           9.9 × 10^{−10}   10
         10^{−7}   13          5.6 × 10^{−11}   86           7/6/0
         μ*        20          1.0 × 10^{−11}   152          8/12/0

6        10^{−1}   4           3.3 × 10^{−7}    5            4/0/0
         10^{−5}   4           4.3 × 10^{−7}    5            4/0/0       4           4.3 × 10^{−7}    5
         10^{−7}   4           4.3 × 10^{−7}    5            4/0/0
         μ*        4           4.3 × 10^{−7}    5            4/0/0

7        10^{−1}   3           1.1 × 10^{−7}    4            3/0/0
         10^{−5}   3           1.4 × 10^{−7}    4            3/0/0       3           1.4 × 10^{−7}    4
         10^{−7}   3           1.4 × 10^{−7}    4            3/0/0
         μ*        3           1.4 × 10^{−7}    4            3/0/0

8        10^{−1}   27          1.6 × 10^{−10}   482          9/10/8
         10^{−5}   21          1.6 × 10^{−7}    472          5/9/7       308         9.9 × 10^{−7}    315
         10^{−7}   25          2.0 × 10^{−11}   618          5/11/9
         μ*        21          3.1 × 10^{−8}    489          5/9/7

9        10^{−1}   MAXIT       4.4 × 10^{3}     9999         2/4998/0
         10^{−5}   237         8.4 × 10^{−7}    238          237/0/0     SINGUL      4.4 × 10^{3}     6
         10^{−7}   93          8.8 × 10^{−7}    94           93/0/0
         μ*        92          9.1 × 10^{−7}    93           92/0/0

10       10^{−1}   310         1.0 × 10^{−6}    313          309/1/0
         10^{−5}   255         9.9 × 10^{−7}    258          254/1/0     229         9.9 × 10^{−7}    230
         10^{−7}   245         9.6 × 10^{−7}    249          244/1/0
         μ*        245         9.7 × 10^{−7}    249          244/1/0

11       10^{−1}   MAXIT       4.7 × 10^{4}     9999         2/4998/0
         10^{−5}   565         9.9 × 10^{−7}    566          565/0/0     MAXIT       2.9 × 10^{3}     5001
         10^{−7}   49          1.3 × 10^{4}     MAXFUN       1/4/44
         μ*        49          1.3 × 10^{4}     MAXFUN       1/4/44

12       10^{−1}   25          6.1 × 10^{−7}    26           25/0/0
         10^{−5}   16          8.8 × 10^{−7}    17           16/0/0      16          8.8 × 10^{−7}    17
         10^{−7}   16          8.8 × 10^{−7}    17           16/0/0
         μ*        50          9.8 × 10^{−7}    51           50/0/0

13       10^{−1}   MAXIT       5.9 × 10^{3}     5001         0/5000/0
         10^{−5}   13          0.0              14           13/0/0      2           0.0              3
         10^{−7}   4           3.0 × 10^{−15}   5            4/0/0
         μ*        6           1.7 × 10^{−8}    7            6/0/0

14       10^{−1}   MAXIT       2.5 × 10^{5}     5001         0/5000/0
         10^{−5}   20          4.9 × 10^{−8}    21           20/0/0      11          8.3 × 10^{−12}   12
         10^{−7}   11          9.6 × 10^{−11}   12           11/0/0
         μ*        29          6.2 × 10^{−8}    30           29/0/0

15       10^{−1}   MAXIT       3.0 × 10^{−3}    5001         5000/0/0
         10^{−5}   MAXIT       3.0 × 10^{−3}    5001         5000/0/0    53          5.8 × 10^{−14}   54
         10^{−7}   MAXIT       3.0 × 10^{−3}    5001         5000/0/0
         μ*        MAXIT       3.0 × 10^{−3}    5001         5000/0/0

Acknowledgements

We are indebted to an anonymous referee whose comments helped us a lot in improving the first
version of this paper.

References

[1] J. Abadie, J. Carpentier, Generalization of the Wolfe reduced-gradient method to the case of nonlinear constraints, in: R.
Fletcher (Ed.), Optimization, Academic Press, New York, 1968, pp. 37–47.

[2] S. Bellavia, M. Macconi, B. Morini, An affine scaling trust-region approach to bound-constrained nonlinear systems, Appl.
Numer. Math. 44 (2003) 257–280.
[3] S. Bellavia, M. Macconi, B. Morini, STRSCNE: a scaled trust-region solver for constrained nonlinear equations, Comput.
Optim. Appl. 28 (2004) 31–50.
[4] T.F. Coleman, Y. Li, On the convergence of interior-reflective Newton methods for nonlinear minimization subject to
bounds, Math. Programming 67 (1994) 189–224.
[5] T.F. Coleman, Y. Li, An interior trust region approach for nonlinear minimization subject to bounds, SIAM J. Optim. 6
(1996) 418–445.
[6] A.R. Conn, N.I.M. Gould, Ph.L. Toint, Trust-Region Methods, MPS-SIAM Series on Optimization, SIAM, Philadelphia,
2000.
[7] N. Dan, N. Yamashita, M. Fukushima, Convergence properties of inexact Levenberg–Marquardt method under local error
bound conditions, Optim. Methods Software 17 (2002) 605–626.
[8] W. Hock, K. Schittkowski, Test Examples for Nonlinear Programming Codes, Lecture Notes in Economics and
Mathematical Systems, Springer, 1981.
[9] C. Kanzow, N. Yamashita, M. Fukushima, Levenberg–Marquardt methods for constrained nonlinear equations with strong
local convergence properties, J. Comput. Appl. Math. 172 (2004) 375–397.
[10] L.S. Lasdon, Reduced gradient methods, in: M.J.D. Powell (Ed.), Nonlinear Optimization 1981, Academic Press, New
York, 1982, pp. 235–242.
[11] J.M. Martínez, Quasi-Newton methods for solving underdetermined nonlinear simultaneous equations, J. Comput. Appl.
Math. 34 (1990) 171–190.
[12] J.M. Martínez, Inexact restoration method with Lagrangian tangent decrease and new merit function for nonlinear
programming, J. Optim. Theory Appl. 111 (2001) 39–58.
[13] J.M. Martínez, E.A. Pilotta, Inexact restoration algorithms for constrained optimization, J. Optim. Theory Appl. 104 (2000)
135–163.
[14] A. Miele, H.Y. Huang, J.C. Heideman, Sequential gradient-restoration algorithm for the minimization of constrained
functions, ordinary and conjugate gradient version, J. Optim. Theory Appl. 4 (1969) 213–246.
[15] A. Miele, A.V. Levy, E.E. Cragg, Modifications and extensions of the conjugate-gradient restoration algorithm for
mathematical programming problems, J. Optim. Theory Appl. 7 (1971) 450–472.
[16] A. Miele, E.M. Sims, V.K. Basapur, Sequential gradient-restoration algorithm for mathematical programming problems
with inequality constraints, Part 1, theory, Rice University, Aero-Astronautics Report No. 168, 1983.
[17] J. Nocedal, S. Wright, Numerical optimization, in: Springer Series in Operations Research, Springer, New York, Inc., 1999.
[18] M. Rom, M. Avriel, Properties of the sequential gradient-restoration algorithm (SGRA), Part 1: introduction and comparison
with related methods, J. Optim. Theory Appl. 62 (1989) 77–98.
[19] M. Rom, M. Avriel, Properties of the sequential gradient-restoration algorithm (SGRA), Part 2: convergence analysis, J.
Optim. Theory Appl. 62 (1989) 99–126.
[20] J.B. Rosen, The gradient projection method for nonlinear programming, Part 1, linear constraints, SIAM J. Appl. Math. 8
(1960) 181–217.
[21] J.B. Rosen, The gradient projection method for nonlinear programming, Part 2, nonlinear constraints, SIAM J. Appl. Math.
9 (1961) 514–532.
[22] J.B. Rosen, Two-phase algorithm for nonlinear constraint problems, in: O.L. Mangasarian, R.R. Meyer, S.M. Robinson
(Eds.), Nonlinear Programming 3, Academic Press, London, New York, 1978, pp. 97–124.
[23] H.F. Walker, L.T. Watson, Least-change secant update methods for underdetermined systems, SIAM J. Numer. Anal. 27
(5) (1990) 1227–1262.
