Journal of Computational and Applied Mathematics 213 (2008) 509–520

www.elsevier.com/locate/cam

A new trust region method for unconstrained optimization✩


Zhen-Jun Shi a,b,∗, Jin-Hua Guo b
a College of Operations Research and Management, Qufu Normal University, Rizhao, Shandong 276826, PR China
b Department of Computer and Information Science, University of Michigan, Dearborn, MI 48128-1491, USA

Received 24 October 2005; received in revised form 29 January 2007

✩ This work was supported in part by NSF CNS-0521142, USA.
∗ Corresponding author. College of Operations Research and Management, Qufu Normal University, Rizhao, Shandong 276826, PR China.
E-mail addresses: zjshi@qrnu.edu.cn, zjshi@umd.umich.edu (Z.-J. Shi), jinhua@umich.edu (J.-H. Guo).
0377-0427/$ - see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.cam.2007.01.027

Abstract
In this paper, we propose a new trust region method for unconstrained optimization problems. The new method automatically adjusts the trust region radius of the related subproblem at each iteration and possesses strong global convergence under mild conditions. We also analyze the global linear convergence and the local superlinear and quadratic convergence rates of the new method. Numerical results show that the new trust region method is effective and efficient in practical computation.
© 2007 Elsevier B.V. All rights reserved.
MSC: 90C30; 49M37; 65K05
Keywords: Unconstrained optimization; Trust region method; Global convergence; Convergence rate

1. Introduction
Consider an unconstrained optimization problem

min f(x),  x ∈ R^n,   (1)

where R^n is an n-dimensional Euclidean space and f: R^n → R^1 is a continuously differentiable function.


Most methods for solving (1) are iterative, such as line search and trust region methods [1]. Letting x_0 be an initial point, we try to find an efficient algorithm that generates a sequence {x_k} converging to a minimizer or at least a stationary point, say x* (for which ∇f(x*) = 0). For convenience, we denote g(x) = ∇f(x), f_k = f(x_k) and g_k = g(x_k), respectively.
Throughout the paper, ‖·‖ denotes the Euclidean norm and B_k denotes a symmetric matrix approximating H_k = ∇²f(x_k), provided that ∇²f(x_k) is available.
Existing methods for solving (1) can be divided into two classes: line search methods and trust region methods. The former carry out a line search procedure at each iteration, while the latter solve a trust region subproblem at each step. The basic idea of trust region methods is as follows. At each iterate x_k


(supposing that it is not a stationary point), a trial step is usually obtained by solving the following subproblem:

min_{d∈R^n} m_k(d) = g_k^T d + (1/2) d^T B_k d,  s.t. ‖d‖ ≤ Δ_k,   (2)

where Δ_k is a trust region radius. A merit function is normally used to test whether the trial step is accepted or the trust region radius needs to be adjusted. If the trial step is accepted at some iteration, we call that iteration an effective iteration; otherwise we call it an ineffective iteration.
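To make the ratio test concrete, the following is a minimal sketch of one generic trust-region iteration in Python. The helper names are ours, the subproblem is solved only approximately at the Cauchy point, and the shrink/expand constants (0.5, 2.0) are common illustrative choices rather than values used by the method of this paper.

```python
import numpy as np

def generic_tr_iteration(f, grad, B, x, delta, mu=0.25):
    # Approximately solve min m(d) = g^T d + 0.5 d^T B d, s.t. ||d|| <= delta,
    # via the Cauchy point (minimizer of the model along -g inside the ball).
    g = grad(x)
    t = delta / np.linalg.norm(g)          # step length reaching the boundary
    curvature = g @ B @ g
    if curvature > 0:
        t = min(t, (g @ g) / curvature)    # interior minimizer along -g, if any
    d = -t * g
    predicted = -(g @ d + 0.5 * d @ B @ d) # -m(d) > 0 whenever g != 0
    actual = f(x) - f(x + d)
    if actual / predicted >= mu:           # effective iteration: accept step
        return x + d, 2.0 * delta
    return x, 0.5 * delta                  # ineffective: reject, shrink radius
```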
In comparison with quasi-Newton methods, trust region methods converge to a point which is not only a stationary point but also satisfies second-order necessary conditions. Because of their strong convergence and robustness, trust region methods have been studied by many authors [3–6,9,17], and various convergence properties are given in the literature [2,8,10,12,18]. A monograph devoted to trust region methods presents many variants of the approach together with extensive theoretical analysis; see [1] and the references therein.
It is obvious that the trust region radius Δ_k in the above subproblem is independent of g_k and B_k, and we do not know in advance whether the radius Δ_k suits the whole algorithm. This can increase the number of subproblems that must be solved and thus decrease the efficiency of these methods. In fact, we want to reduce the number of ineffective iterations so as to reduce the number of subproblems solved in reaching a given precision.
Sartenaer [11] presented a strategy which can automatically determine an initial trust region radius. The basic idea is to determine a maximal initial radius through repeated trials along the direction −g_k so as to guarantee sufficient agreement between the model and the objective function. Zhang et al. [21] presented another strategy for determining the trust region radius. Their basic idea originates from the following subproblem in artificial neural network research [19]:

min m_k(d) = g_k^T d + (1/2) d^T B_k d
s.t. −λ_k ≤ d_i ≤ λ_k,  i = 1, 2, . . . , n,

where λ_k = c^p ‖g_k‖/b_k, 0 < c < 1, b_k = min(‖B_k‖, 1), and p is a nonnegative integer.
Therefore, instead of adjusting Δ_k, one adjusts p at each iteration. Motivated by this technique, they solved the trust region subproblem (2) with

Δ_k = c^p ‖g_k‖ ‖B̄_k^{-1}‖,   (3)

and gave a globally convergent adaptive trust region method [20], where c ∈ (0, 1), p is a nonnegative integer, and B̄_k = B_k + iI is a positive definite matrix for some i.
However, their method needs to estimate ‖B̄_k‖ or ‖B̄_k^{-1}‖ at each iteration, which incurs additional computational cost. As a result, a simple adaptive trust region method was proposed [14], in which the trust region radius is computed by

Δ_k = c^p ‖g_k‖³/(g_k^T B̄_k g_k),   (4)

where c ∈ (0, 1), B̄_k is a positive definite matrix and p is a nonnegative integer. An analysis of general adaptive trust region methods was given in [13,15,16]. Of course, we should try to find more effective and implementable trust region methods so as to decrease the number of trust region subproblems solved at each iteration and improve the numerical performance of the related methods.
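As a concrete comparison of the two adaptive radii, here is a short sketch of formulas (3) and (4); it is our own illustration (function names and the sample values c = 0.75, p = 0 are assumptions), and it highlights that (4) avoids forming any matrix norm or inverse.

```python
import numpy as np

def radius_zhang(g, B_bar_inv, c=0.75, p=0):
    # Formula (3): Delta_k = c^p ||g_k|| ||B_bar_k^{-1}||  (Zhang et al. [20,21]).
    return c**p * np.linalg.norm(g) * np.linalg.norm(B_bar_inv, 2)

def radius_adaptive(g, B_bar, c=0.75, p=0):
    # Formula (4): Delta_k = c^p ||g_k||^3 / (g_k^T B_bar_k g_k)  (see [14]);
    # only one matrix-vector product, no norm or inverse of B_bar_k needed.
    return c**p * np.linalg.norm(g)**3 / (g @ B_bar @ g)
```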
In this paper, we present a new trust region method for unconstrained optimization problems and develop its convergence properties under mild conditions. At each iteration, the new method automatically generates a suitable initial trust region radius based on the current iterative information. In comparison with other trust region methods, the new trust region model is more consistent with the objective function at the current iterate. The global convergence and the local superlinear and quadratic convergence rates of the new method are proved under mild conditions. Numerical results also show that the new trust region method is effective and efficient in practical computation: the number of trust region subproblems solved is reduced substantially, and the efficiency of the trust region method is greatly improved.
The rest of this paper is organized as follows. In the next section, we introduce the new trust region method and give
some simple properties. In Sections 3 and 4, the global convergence, linear convergence rate, local superlinear and
quadratic convergence rate are investigated. Numerical results are given in Section 5. Conclusions and future research
are summarized in Section 6.


2. New trust region method


In this section we will describe a new trust region method and give some simple properties.
Given μ ∈ (0, 1) and β ∈ (0, 1), let B_k be an approximation to H_k and let q_k satisfy

−g_k^T q_k/(‖g_k‖ ‖q_k‖) ≥ τ,   (5)

with τ ∈ (0, 1].


Set

s_k = −g_k^T q_k/(q_k^T B̄_k q_k),   (6)

in which B̄_k is generated by the procedure q_k^T B̄_k q_k = q_k^T B_k q_k + i‖q_k‖², where i is the smallest nonnegative integer such that

q_k^T B̄_k q_k = q_k^T B_k q_k + i‖q_k‖² > 0.
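In code, this generating procedure and the radius factor s_k can be transcribed directly. The following is a small illustrative sketch (the function names are ours, not from the paper):

```python
import numpy as np

def b_bar_quadratic_form(B, q):
    """Return q^T B_bar q, where B_bar = B + i*I and i is the smallest
    nonnegative integer making the quadratic form positive."""
    i = 0
    while q @ B @ q + i * (q @ q) <= 0:
        i += 1                      # terminates: any i > ||B|| already works
    return q @ B @ q + i * (q @ q)

def initial_radius_factor(g, B, q):
    # Formula (6): s_k = -g^T q / (q^T B_bar q); positive whenever (5) holds.
    return -(g @ q) / b_bar_quadratic_form(B, q)
```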
The new trust region method at the current iterate x_k needs to solve the subproblem

min_{d∈R^n} m_k(d) = g_k^T d + (1/2) d^T B_k d,  s.t. ‖d‖ ≤ Δ‖q_k‖.   (7)

Let Δ_k be the largest Δ in {s_k, βs_k, β²s_k, . . .} such that the solution d_k to (7) satisfies

r_k = [f_k − f(x_k + d_k)]/(−m_k(d_k)) ≥ μ.   (8)

The following is the new trust region method.


Algorithm A. Step 0: Set 0 < μ < 1, ε ≥ 0, 0 < β < 1, x_0 ∈ R^n, and k := 0.
Step 1: If ‖g_k‖ ≤ ε then stop; otherwise go to Step 2.
Step 2: Choose q_k to satisfy (5) and set Δ = s_k as defined in (6).
Step 3: Solve (7) to obtain d_k and set x̄_{k+1} = x_k + d_k.
Step 4: If (8) holds then set x_{k+1} = x̄_{k+1} and go to Step 5; otherwise set Δ := βΔ and go to Step 3.
Step 5: Update B_k to B_{k+1}, set k := k + 1 and go to Step 1.
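For readability, here is a sketch of Algorithm A in Python. It is our own illustration under stated assumptions: it uses the choice q_k = −g_k (so (5) holds with τ = 1), updates B_k by the BFGS formula as in Section 5, and replaces the exact solution of subproblem (7) with a simple Cauchy-point step; a production code would use a more accurate subproblem solver.

```python
import numpy as np

def solve_subproblem(g, B, radius):
    # Cauchy-point approximation to (7): minimize the model along -g in the ball.
    t = radius / np.linalg.norm(g)
    curvature = g @ B @ g
    if curvature > 0:
        t = min(t, (g @ g) / curvature)
    return -t * g

def algorithm_a(f, grad, x0, mu=0.01, beta=0.75, eps=1e-9, max_iter=500):
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                      # initial Hessian approximation
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:        # Step 1: stopping test
            break
        q = -g                              # Step 2: q_k satisfying (5)
        i = 0                               # smallest i with q^T B_bar q > 0
        while q @ B @ q + i * (q @ q) <= 0:
            i += 1
        delta = -(g @ q) / (q @ B @ q + i * (q @ q))   # Delta = s_k from (6)
        while True:                         # Steps 3-4: backtrack on Delta
            d = solve_subproblem(g, B, delta * np.linalg.norm(q))
            predicted = -(g @ d + 0.5 * d @ B @ d)
            if (f(x) - f(x + d)) / predicted >= mu:    # ratio test (8)
                break                       # accepted for small enough Delta
            delta *= beta
        y = grad(x + d) - g                 # Step 5: BFGS update of B_k
        Bd = B @ d
        if y @ d > 1e-12 and d @ Bd > 0:
            B += np.outer(y, y) / (y @ d) - np.outer(Bd, Bd) / (d @ Bd)
        x = x + d
    return x
```

With q_k = −g_k this sketch corresponds to the variant labelled TRS in Section 5; replacing q = -g by q = -np.linalg.solve(B, g) would give the TRN variant.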
We make the following assumptions.
(H1) The objective function f(x) is bounded below on R^n and g(x) = ∇f(x) is uniformly continuous on an open convex set Ω that contains the level set L(x_0) = {x ∈ R^n | f(x) ≤ f(x_0)}, where x_0 ∈ R^n is given.
(H2) {B_k} is uniformly bounded, i.e., there exists an M such that ‖B_k‖ ≤ M for all k.

Remark 2.1. If f(x) is twice continuously differentiable and the level set L(x_0) is bounded, then (H1) implies that {∇²f(x)} is uniformly continuous and bounded on an open bounded convex set Ω that contains L(x_0). Hence there exists an L such that ‖∇²f(x)‖ ≤ L, and by the mean value theorem we have

‖g(x) − g(y)‖ ≤ L‖x − y‖,  ∀x, y ∈ Ω.

Moreover, if g(x) is Lipschitz continuous on Ω, then (H1) certainly holds. Therefore, Assumption (H1) is weaker than those used in the literature [15,16,20,21].

Remark 2.2. (H2) implies that {B̄_k} is also uniformly bounded, which can be seen from the generating procedure of B̄_k. In fact, if ‖B_k‖ ≤ M for all k, then q_k^T B_k q_k + i‖q_k‖² > 0 already holds for any i > M, so the smallest such i satisfies i ≤ M + 1 and hence ‖B̄_k‖ = ‖B_k + iI‖ ≤ 2M + 1.


Lemma 2.3. For all k we have

−m_k(d_k) ≥ −m_k(Δq_k) ≥ −(1/2)Δ g_k^T q_k,

where d_k is an optimal solution of (7) with respect to Δ ≤ s_k.

Proof. It is obvious that d = Δq_k is a feasible solution to (7). Then

−m_k(d_k) ≥ −m_k(Δq_k) = −(Δ g_k^T q_k + (1/2)Δ² q_k^T B_k q_k)
 ≥ −(Δ g_k^T q_k + (1/2)Δ² q_k^T B̄_k q_k)
 ≥ −(Δ g_k^T q_k + (1/2)Δ s_k q_k^T B̄_k q_k) = −(1/2)Δ g_k^T q_k.  □

3. Global convergence

Theorem 3.1. If (H1) and (H2) hold and ε = 0, then Algorithm A either stops at a stationary point of (1) or generates an infinite sequence {x_k} such that

lim_{k→∞} (−g_k^T q_k/‖q_k‖) = 0.   (9)

Proof. Suppose that Algorithm A generates an infinite sequence {x_k} but that (9) fails. Then there exist ε_0 > 0 and an infinite subset K ⊆ {0, 1, 2, . . .} such that

−g_k^T q_k/‖q_k‖ ≥ ε_0,  ∀k ∈ K.   (10)

(H2) implies that there exists an M_0 > 0 such that

‖B̄_k‖ ≤ M_0,  ∀k.   (11)

Thus

q_k^T B̄_k q_k ≤ M_0‖q_k‖²,  ∀k.   (12)

Let K_1 = {k ∈ K | Δ_k = s_k} and K_2 = {k ∈ K | Δ_k < s_k}. Since K_1 ∪ K_2 = K is an infinite subset of {0, 1, 2, . . .}, at least one of K_1 and K_2 must be infinite; we show that neither can be, and thus obtain a contradiction to (10).

Assume first that K_1 is an infinite subset of K. By Lemma 2.3 and (12), for k ∈ K_1 we have

f_k − f(x_k + d_k) ≥ −μ m_k(d_k) ≥ −(μ/2)Δ_k g_k^T q_k = −(μ/2)s_k g_k^T q_k
 = (μ/2)(g_k^T q_k)²/(q_k^T B̄_k q_k) ≥ (μ/2M_0)(−g_k^T q_k/‖q_k‖)².


By (H1), (10) and the above inequality, we have

+∞ > Σ_{k=0}^{∞} (f_k − f_{k+1}) ≥ Σ_{k∈K} (f_k − f_{k+1}) ≥ Σ_{k∈K_1} (f_k − f_{k+1})
 ≥ (μ/2M_0) Σ_{k∈K_1} (−g_k^T q_k/‖q_k‖)² ≥ Σ_{k∈K_1} με_0²/(2M_0) = +∞.

This contradiction shows that K_1 cannot be an infinite subset of K.


In the case k ∈ K_2, Lemma 2.3 implies that

f_k − f_{k+1} ≥ −(μ/2)Δ_k g_k^T q_k = (μ/2)Δ_k‖q_k‖ (−g_k^T q_k/‖q_k‖).

By (H1) and (10) we have

+∞ > Σ_{k=0}^{∞} (f_k − f_{k+1}) ≥ Σ_{k∈K_2} (f_k − f_{k+1})
 ≥ Σ_{k∈K_2} (μ/2)Δ_k‖q_k‖ (−g_k^T q_k/‖q_k‖) ≥ (με_0/2) Σ_{k∈K_2} Δ_k‖q_k‖.

If K_2 is an infinite subset, then

lim_{k∈K_2, k→∞} Δ_k‖q_k‖ = 0.   (13)

Suppose that d̄_k is an optimal solution of (7) with respect to Δ̄_k = Δ_k/β (k ∈ K_2). Then the following inequality holds:

[f_k − f(x_k + d̄_k)]/(−m_k(d̄_k)) < μ,  k ∈ K_2,   (14)

and ‖d̄_k‖ ≤ Δ̄_k‖q_k‖, k ∈ K_2. By (13) we have

lim_{k∈K_2, k→∞} Δ̄_k‖q_k‖ = 0.   (15)

By (H1), Lemma 2.3, (10) and (15), we have

|[f_k − f(x_k + d̄_k)]/(−m_k(d̄_k)) − 1| = |[f_k − f(x_k + d̄_k) + m_k(d̄_k)]/(−m_k(d̄_k))|
 = O(‖d̄_k‖²)/(−m_k(d̄_k)) ≤ O(Δ̄_k²‖q_k‖²)/(−(1/2)Δ̄_k g_k^T q_k)
 = O(Δ̄_k‖q_k‖)/[(1/2)(−g_k^T q_k/‖q_k‖)] ≤ O(Δ̄_k‖q_k‖)/((1/2)ε_0) → 0  (k ∈ K_2, k → +∞),


which contradicts (14) for sufficiently large k ∈ K_2. This contradiction shows that there is no infinite subset K of {0, 1, 2, . . .} for which (10) holds. Therefore, (9) holds.  □
Corollary 3.2. If the conditions of Theorem 3.1 hold and q_k satisfies (5), then

lim_{k→∞} ‖g_k‖ = 0.

Proof. Since q_k satisfies (5), by Theorem 3.1 we have

τ‖g_k‖ ≤ [−g_k^T q_k/(‖g_k‖ ‖q_k‖)] ‖g_k‖ = −g_k^T q_k/‖q_k‖ → 0  (k → ∞).

This shows the conclusion.  □

Theorem 3.3. Assume that the conditions of Corollary 3.2 hold and that Algorithm A generates an infinite sequence {x_k} with x_k → x* (k → +∞), where g(x*) = 0, H(x*) = ∇²f(x*) is a positive definite matrix, and H(x) = ∇²f(x) is continuous on a neighborhood N(x*, δ) of x*. Then {x_k} converges to x* at least linearly.

Proof. Since H(x*) is a positive definite matrix and H(x) is continuous on N(x*, δ), it follows from the mean value theorem that there exist L > 0 and δ_0 ≤ δ such that

‖g(x) − g(y)‖ ≤ L‖x − y‖,  ∀x, y ∈ N(x*, δ_0).

Let K_1 = {k | Δ_k = s_k} and K_2 = {k | Δ_k < s_k}. If k ∈ K_1, we can prove, as in the proof of Theorem 3.1, that

f_k − f_{k+1} ≥ (μ/2M_0)(−g_k^T q_k/‖q_k‖)²,  k ∈ K_1.   (16)

In the case k ∈ K_2, we have Δ_k/β ≤ s_k. Suppose that d̄_k is an optimal solution of (7) with respect to Δ̄_k = Δ_k/β, k ∈ K_2. Then (14) holds; therefore,

f(x_k + d̄_k) − f_k > μ m_k(d̄_k),  k ∈ K_2.

Applying the mean value theorem to the left-hand side, there exists θ_k ∈ (0, 1) such that

g(x_k + θ_k d̄_k)^T d̄_k > μ[g_k^T d̄_k + (1/2)d̄_k^T B_k d̄_k] ≥ μ[g_k^T d̄_k − (1/2)d̄_k^T B̄_k d̄_k],  k ∈ K_2.

By (H2), {B̄_k} is uniformly bounded, say ‖B̄_k‖ ≤ M_0, and thus

g(x_k + θ_k d̄_k)^T d̄_k > μ[g_k^T d̄_k − (1/2)M_0‖d̄_k‖²],  k ∈ K_2.

By combining the above inequality with Lemma 2.3, for k ∈ K_2 we have

L Δ̄_k²‖q_k‖² + (1/2)M_0 Δ̄_k²‖q_k‖² ≥ L‖d̄_k‖² + (1/2)M_0‖d̄_k‖²
 ≥ ‖g(x_k + θ_k d̄_k) − g_k‖ ‖d̄_k‖ + (1/2)M_0‖d̄_k‖²
 ≥ −(1 − μ)g_k^T d̄_k
 = −(1 − μ)[g_k^T d̄_k + (1/2)d̄_k^T B_k d̄_k] + (1/2)(1 − μ)d̄_k^T B_k d̄_k
 ≥ −(1 − μ)m_k(d̄_k) − (1/2)(1 − μ)d̄_k^T B̄_k d̄_k
 ≥ −(1 − μ)m_k(Δ̄_k q_k) − (1/2)(1 − μ)M_0 Δ̄_k²‖q_k‖²
 ≥ −(1/2)(1 − μ)Δ̄_k g_k^T q_k − (1/2)(1 − μ)Δ̄_k² M_0‖q_k‖².


Thus

Δ̄_k ≥ [(1 − μ)/(2L + M_0)] (−g_k^T q_k/‖q_k‖²),  ∀k ∈ K_2.   (17)

By noting that

(f_k − f_{k+1})/(−m_k(d_k)) ≥ μ,

and by Lemma 2.3 and (17), we have

f_k − f_{k+1} ≥ [μβ(1 − μ)/(4L + 2M_0)] (−g_k^T q_k/‖q_k‖)²,  k ∈ K_2.   (18)

By letting

η = min{μ/(2M_0), μβ(1 − μ)/(4L + 2M_0)},

and by (16), (18) and (5), for all k we have

f_k − f_{k+1} ≥ η (−g_k^T q_k/‖q_k‖)² = η [−g_k^T q_k/(‖g_k‖ ‖q_k‖)]² ‖g_k‖² ≥ ητ² ‖g_k‖².

By letting η_0 = ητ², we have

f_k − f_{k+1} ≥ η_0 ‖g_k‖².   (19)

The rest of the proof is similar to that of Theorem 4.1 of [13] and is omitted here.  □


4. Convergence rate

Theorem 4.1. Assume that the conditions of Corollary 3.2 hold and Algorithm A generates an infinite sequence {x_k} with x_k → x* (k → +∞). Suppose that H(x) = ∇²f(x) is continuous on a neighborhood N(x*, δ) of x* and that H(x*) and B_k are uniformly positive definite matrices such that

lim_{k→∞} ‖[B_k − H(x*)]q_k‖/‖q_k‖ = 0,   (20)

where q_k = −B_k^{-1} g_k. Then {x_k} converges to x* superlinearly.

Proof. Because B̄_k = B_k for sufficiently large k, it is obvious that s_k = 1, and we assert that d_k = q_k is a feasible solution to the subproblem

min_{d∈R^n} m_k(d) = g_k^T d + (1/2) d^T B_k d,  s.t. ‖d‖ ≤ ‖q_k‖.


In the sequel, we prove that

[f(x_k) − f(x_k + d_k)]/(−m_k(d_k)) ≥ μ

for sufficiently large k. By (20) we have

g_k + H(x*)d_k = o(‖d_k‖),

i.e.,

d_k = −H(x*)^{-1} g_k + o(‖d_k‖),

thus

‖d_k‖ ≤ ‖H(x*)^{-1}‖ ‖g_k‖ + o(‖d_k‖),

and so

‖g_k‖/‖d_k‖ ≥ 1/‖H(x*)^{-1}‖ + o(‖d_k‖)/‖d_k‖.

By Corollary 3.2 we have ‖g_k‖ → 0 (k → +∞); therefore ‖d_k‖ → 0 (k → +∞). By Lemma 2.3 and noting that −g_k^T q_k = q_k^T B_k q_k, we have

−m_k(d_k) ≥ (g_k^T q_k)²/(2 q_k^T B̄_k q_k) = q_k^T B_k q_k/2.   (21)

By (20) and noting that q_k = −B_k^{-1} g_k = d_k, we have

f(x_k + d_k) − f(x_k) − m_k(d_k) = o(‖d_k‖²).   (22)

By (21), (22) and the uniform positive definiteness of B_k, we have

|[f(x_k) − f(x_k + d_k)]/(−m_k(d_k)) − 1| = |f(x_k) − f(x_k + d_k) + m_k(d_k)|/(−m_k(d_k))
 ≤ o(‖d_k‖²)/((1/2) q_k^T B_k q_k) ≤ o(‖d_k‖²)/‖q_k‖² = o(‖d_k‖²)/‖d_k‖² → 0  (k → +∞),

which implies that

[f(x_k) − f(x_k + d_k)]/(−m_k(d_k)) > μ

for sufficiently large k. Therefore, x_{k+1} = x_k + d_k = x_k + q_k for sufficiently large k, and the new trust region method reduces to a standard quasi-Newton method. By combining this with (20), we can complete the proof; see also [1,3,8].  □
Theorem 4.2. Assume that the conditions of Corollary 3.2 hold and Algorithm A generates an infinite sequence {x_k} with x_k → x* (k → +∞). Suppose that H(x) is Lipschitz continuous and uniformly positive definite on a neighborhood N(x*, δ) of x*, B_k = H(x_k) and q_k = −B_k^{-1} g_k. Then {x_k} converges to x* quadratically.


Proof. Since B_k = H(x_k) implies that all conditions of Theorem 4.1 hold, we have d_k = q_k → 0 (k → ∞). Therefore, there exists a k̄ such that

x_k + t q_k ∈ N(x*, δ),  ∀k ≥ k̄, t ∈ [0, 1].   (23)

Because H(x) is Lipschitz continuous on the neighborhood N(x*, δ) of x*, there must exist an L(δ) > 0 such that

‖H(x) − H(y)‖ ≤ L(δ)‖x − y‖,  ∀x, y ∈ N(x*, δ).   (24)

By Theorem 4.1, the new trust region method reduces to Newton's method x_{k+1} = x_k + q_k = x_k − H(x_k)^{-1} g_k for sufficiently large k. Thereby, {x_k} converges to x* quadratically; the remainder of the proof can be found in [1,3,8].  □
Theorem 4.3. Assume that (H1) and (H2) hold, B_k = H_k, and Algorithm A generates an infinite sequence {x_k}. If {x_k} converges to x*, then H(x*) is a positive semi-definite matrix, i.e., x* satisfies the second-order necessary condition.

Proof. Denote by λ_k^1 and λ^1 the smallest eigenvalues of H_k and H(x*), respectively. Let z_k be a normalized eigenvector (‖z_k‖ = 1) of H_k corresponding to the eigenvalue λ_k^1 with g_k^T z_k ≤ 0; then H_k z_k = λ_k^1 z_k. Suppose that H(x*) is not a positive semi-definite matrix. Then λ^1 < 0, and thus λ_k^1 < 0 for sufficiently large k.

Because ‖Δ_k‖q_k‖ z_k‖ = Δ_k‖q_k‖, the point Δ_k‖q_k‖ z_k is a feasible solution to (7). Therefore,

−m_k(Δ_k‖q_k‖ z_k) = −(Δ_k‖q_k‖ g_k^T z_k + (1/2)Δ_k²‖q_k‖² z_k^T B_k z_k) ≥ −(1/2)Δ_k²‖q_k‖² z_k^T B_k z_k
 = −(1/2)Δ_k²‖q_k‖² z_k^T H_k z_k = −(1/2)Δ_k²‖q_k‖² λ_k^1 ‖z_k‖² = −(1/2)Δ_k²‖q_k‖² λ_k^1.   (25)

By (25) and (8) we have

f_k − f_{k+1} ≥ −μ m_k(d_k) ≥ −μ m_k(Δ_k‖q_k‖ z_k) ≥ −(1/2)μ Δ_k²‖q_k‖² λ_k^1.

Since {f(x_k)} is a monotonically decreasing sequence bounded from below, we have

f_k − f_{k+1} → 0  (k → +∞),

and thus Δ_k²‖q_k‖² λ_k^1 → 0 (k → +∞). Noting that λ_k^1 → λ^1 < 0 (k → ∞), we have

lim_{k→∞} Δ_k‖q_k‖ = 0.   (26)

From the definition of Algorithm A, we can observe that the solution d̄_k to

min_{d∈R^n} m_k(d) = g_k^T d + (1/2) d^T B_k d,  s.t. ‖d‖ ≤ Δ̄_k‖q_k‖,

with Δ̄_k = Δ_k/β cannot be accepted by the algorithm; i.e., if we set x̄_{k+1} = x_k + d̄_k, then

[f(x_k) − f(x̄_{k+1})]/(−m_k(d̄_k)) < μ.   (27)

On the other hand, noting that B_k = H_k, by Taylor expansion we have

|f(x_k) − f(x̄_{k+1}) + m_k(d̄_k)| = o(‖d̄_k‖²).   (28)

By noting that −m_k(d̄_k) ≥ −m_k(Δ̄_k‖q_k‖ z_k), and by (25) (with Δ_k replaced by Δ̄_k), (26) and (28), we can obtain

|[f(x_k) − f(x̄_{k+1})]/(−m_k(d̄_k)) − 1| ≤ o(‖d̄_k‖²)/(−(1/2)Δ̄_k²‖q_k‖² λ_k^1) ≤ o(Δ̄_k²‖q_k‖²)/(−(1/2)Δ̄_k²‖q_k‖² λ_k^1).


Since λ_k^1 → λ^1 < 0 (k → +∞), we get

|[f(x_k) − f(x̄_{k+1})]/(−m_k(d̄_k)) − 1| → 0  (k → +∞),

and thus

[f(x_k) − f(x̄_{k+1})]/(−m_k(d̄_k)) > μ

for sufficiently large k, which contradicts (27). This shows that λ^1 ≥ 0, and thus H(x*) is a positive semi-definite matrix.  □
5. Numerical results

In the new trust region method, q_k has wide latitude: it only needs to satisfy (5). The choice q_k = −g_k recovers the recently proposed adaptive trust region method [14], while q_k = −B̄_k^{-1} g_k yields a new trust region method. The new method with q_k = −g_k and with q_k = −B̄_k^{-1} g_k is denoted by TRS and TRN, respectively. The original trust region method with upper bound Δ̄ = 100 and μ = 0.01 is denoted by TRO [8, p. 68]. We chose the parameters c = 0.75, β = 0.75, μ = 0.01 and ε = 10⁻⁹, and {B_k} was updated by the BFGS formula. If we set B_k ≡ I and q_k = −g_k, we obtain another new trust region method, denoted by TRI. Zhang's adaptive trust region method [21] is denoted by TRZ. We chose the 18 test problems and their initial points from the literature [7]; for example, P5 means problem No. 5 in that reference. The stopping criterion is

‖g_k‖ ≤ 10⁻⁹,

and the numerical results are summarized in Table 1.
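The variants above differ only in how q_k is chosen. As a minimal sketch (our own illustration; τ = 0.1 is an assumed sample tolerance), the TRS/TRN direction choices and the angle test (5) might read:

```python
import numpy as np

def q_trs(g, B_bar):
    return -g                             # TRS: steepest-descent direction

def q_trn(g, B_bar):
    return -np.linalg.solve(B_bar, g)     # TRN: quasi-Newton direction

def satisfies_condition_5(g, q, tau=0.1):
    # Condition (5): -g^T q / (||g|| ||q||) >= tau.
    return -(g @ q) >= tau * np.linalg.norm(g) * np.linalg.norm(q)
```

For q = −g the quotient in (5) equals 1, so the test holds for any τ ≤ 1; for q = −B̄⁻¹g with B̄ symmetric positive definite, it holds with τ no smaller than the reciprocal of the condition number of B̄.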
In Table 1, each entry gives three numbers: the first is the number of iterations, and the second and third are the numbers of function evaluations and gradient evaluations, respectively. P refers to the test problem, and the final row gives the total CPU time (in seconds) for solving all 18 test problems. As the table shows, it is difficult to choose an adequate upper bound Δ̄ on the trust region radius in the original trust region method, which may lead to a slowly convergent method.
Table 1
The number of iterations, function and gradient evaluations, and CPU time

P    n   TRS        TRN        TRI        TRZ        TRO
1    2   25/62/25   17/36/17   28/62/28   26/59/29   36/56/56
2    2   21/42/21   16/34/16   27/58/27   32/32/32   42/57/52
3    2   15/43/15   12/23/12   16/52/16   22/33/22   20/78/60
4    2   15/55/16   15/15/15   17/68/17   34/64/34   28/75/68
5    2   26/46/26   21/26/21   26/62/26   37/87/37   32/78/72
6    2   18/49/18   15/15/15   23/63/23   35/38/35   28/85/78
7    3   43/46/43   36/36/36   52/65/52   61/72/61   48/67/67
8    3   48/62/48   35/35/35   53/58/53   64/68/64   83/75/72
9    3   46/56/46   37/37/37   63/68/63   66/74/66   85/85/81
10   3   28/34/28   23/23/23   25/50/25   56/61/56   65/72/64
11   3   46/62/46   38/38/38   53/57/53   59/78/59   72/83/78
12   3   46/68/46   37/37/37   65/75/65   69/72/69   83/88/83
13   4   56/58/56   45/45/45   68/74/68   81/89/81   91/95/91
14   4   38/45/38   29/29/29   43/48/43   68/78/68   86/92/86
15   4   76/76/76   59/59/59   85/88/85   97/99/97   94/98/94
16   4   29/36/29   18/18/18   43/63/43   78/84/78   93/98/93
17   5   93/98/93   79/79/79   98/98/98   85/96/85   92/123/93
18   6   38/46/38   26/26/26   38/64/38   54/59/54   63/82/78

CPU (s)  118        59         155        124        195

In the new trust region method, by contrast, this difficulty is resolved: the trust region radius is chosen and adjusted automatically at each step.
Table 1 shows that TRN appears to be the best of the tested trust region methods, since it uses the least total CPU time for solving all 18 test problems; TRS is second in numerical performance. These preliminary numerical experiments indicate that the new trust region method is a promising method for optimization problems.
The numerical results also show that the number of subproblems solved decreases substantially for the same precision. The reason is that an adequate initial trust region radius is chosen at each step. In general, if an adequate trust region radius is chosen at each iteration, then the number of subproblems solved is reduced and the efficiency of the related trust region method improves. In Algorithm A, a suitable initial trust region radius is chosen and adjusted automatically at each iteration, which gives the new trust region method good numerical performance on practical problems.
In particular, when q_k = −B_k^{-1} g_k the new trust region method reduces to a quasi-Newton method, and when q_k = −H(x_k)^{-1} g_k it reduces to Newton's method. This also indicates that the new trust region method is a promising method.
6. Conclusions and future research

In this paper, we presented a new class of adaptive trust region methods for unconstrained optimization problems and investigated their global convergence. In these new methods, the trust region radius is adjusted automatically according to the current iterative information and is computed by a simple formula. As we have seen, different choices of q_k determine different adaptive trust region methods. If we take q_k = −g_k, then the new method reduces to the recently proposed method [14]. If we take q_k = −B_k^{-1} g_k with B_k^{-1} available, then we obtain further interesting convergence properties of these new methods, such as superlinear and quadratic convergence rates. Other choices of q_k yield many different adaptive trust region methods. Numerical results show that the new trust region methods are effective and efficient in practical computation.
For future research, we should choose B̄_k and q_k by other approaches and conduct numerical experiments in which q_k is taken as other special directions satisfying (5). Moreover, the main task in trust region methods is to solve the quadratic subproblem; we should investigate how to solve the subproblems efficiently.
Acknowledgments
The first author is very grateful to Prof. Xiang-Sun Zhang of the Chinese Academy of Sciences for many meaningful discussions on this topic. The authors would like to thank the two anonymous referees and the editor for many valuable comments and suggestions that greatly improved the paper.
References
[1] A.R. Conn, N.I.M. Gould, Ph.L. Toint, Trust-Region Methods, SIAM, Philadelphia, PA, 2000.
[2] N.Y. Deng, Y. Xiao, F.J. Zhou, Nonmonotonic trust region algorithm, J. Optim. Theory Appl. 76 (1993) 259–285.
[3] R. Fletcher, Practical Methods of Optimization, vol. 1: Unconstrained Optimization, Wiley, New York, 1980.
[4] J.H. Fu, W.Y. Sun, Nonmonotone adaptive trust-region method for unconstrained optimization problems, Appl. Math. Comput. 163 (2005) 489–504.
[5] D.J. Higham, Trust region algorithms and timestep selection, SIAM J. Numer. Anal. 37 (1999) 194–210.
[6] J.J. Moré, Recent developments in algorithms and software for trust region methods, in: A. Bachem, M. Grötschel, B. Korte (Eds.), Mathematical Programming: The State of the Art, Springer, Berlin, 1983, pp. 258–287.
[7] J.J. Moré, B.S. Garbow, K.E. Hillstrom, Testing unconstrained optimization software, ACM Trans. Math. Software 7 (1981) 17–41.
[8] J. Nocedal, S.J. Wright, Numerical Optimization, Springer, New York, 1999.
[9] M.J.D. Powell, Convergence properties of a class of minimization algorithms, in: O.L. Mangasarian, R.R. Meyer, S.M. Robinson (Eds.), Nonlinear Programming, vol. 2, Academic Press, New York, 1975, pp. 1–27.
[10] M.J.D. Powell, On the global convergence of trust region algorithms for unconstrained optimization, Math. Programming 29 (1984) 297–303.
[11] A. Sartenaer, Automatic determination of an initial trust region in nonlinear programming, SIAM J. Sci. Comput. 18 (6) (1997) 1788–1803.
[12] G.A. Schultz, R.B. Schnabel, R.H. Byrd, A family of trust-region-based algorithms for unconstrained minimization with strong global convergence properties, SIAM J. Numer. Anal. 22 (1985) 47–67.
[13] Z.J. Shi, Convergence of line search methods for unconstrained optimization, Appl. Math. Comput. 157 (2004) 393–405.
[14] Z.J. Shi, H.Q. Wang, A new self-adaptive trust region method for unconstrained optimization, Technical Report, College of Operations Research and Management, Qufu Normal University, 2004.
[15] Z.J. Shi, X.S. Zhang, Convergence analysis of adaptive trust region methods, Technical Report, Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, 2004.
[16] Z.J. Shi, X.S. Zhang, From line search method to trust region method, in: Lecture Notes in Operations Research, vol. 5, 2005, pp. 156–170.
[17] W.Y. Sun, Nonmonotone trust region method for solving optimization problems, Appl. Math. Comput. 156 (2004) 159–174.
[18] J.Z. Zhang, C.X. Xu, Trust region dogleg path algorithms for unconstrained minimization, Ann. Oper. Res. 87 (1999) 407–418.
[19] X.S. Zhang, Trust region method in neural network, Acta Math. Appl. Sinica 12 (1) (1996) 1–10.
[20] X.S. Zhang, Z.W. Chen, J.L. Zhang, A self-adaptive trust region method for unconstrained optimization, Oper. Res. Trans. 5 (1) (2001) 53–62.
[21] X.S. Zhang, J.L. Zhang, L.Z. Liao, An adaptive trust region method and its convergence, Sci. China 45 (2002) 620–631.
