Basics of Optimal Control Theory

Nutan Kumar Tomar

Department of Mathematics

Indian Institute of Technology Patna

Part of MA531: Control Theory

Lecture 1
A Glimpse of Calculus of Variations (CoV)
required for
Optimal Control Theory

Basics from Calculus

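$$\frac{d}{dx}\int_a^x f(t)\,dt = f(x)$$
$$\frac{d}{dx}\int_a^b f(x,t)\,dt = \int_a^b \frac{\partial f}{\partial x}\,dt$$
$$\frac{d}{dx}\int_{\phi_1(x)}^{\phi_2(x)} f(x,t)\,dt = \int_{\phi_1(x)}^{\phi_2(x)} \frac{\partial f}{\partial x}\,dt + f(x,\phi_2(x))\frac{d\phi_2}{dx} - f(x,\phi_1(x))\frac{d\phi_1}{dx}$$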

Fundamental Lemma of CoV

Let $f \in C[a,b]$. Let

$$\int_a^b f(t)\,g(t)\,dt = 0$$

for any $g \in C[a,b]$. Then

$f \equiv 0$ on $[a,b]$.

Basics from Calculus

Increment: $\Delta f \equiv \Delta f(t^*, \Delta t) := f(t^* + \Delta t) - f(t^*)$

Differential of $f$: expanding in a Taylor series about $t^*$,
$$\Delta f = f(t^*) + \underbrace{\left(\frac{df}{dt}\right)_* \Delta t}_{df} + \underbrace{\frac{1}{2!}\left(\frac{d^2 f}{dt^2}\right)_* (\Delta t)^2}_{d^2 f} + \cdots - f(t^*)$$
$$\Delta f = df + d^2 f + \cdots$$

A function $f(t)$ is said to have a relative optimum at the point $t^*$ if $\exists\, \epsilon > 0$ such that
$|t - t^*| < \epsilon \implies \Delta f$ has the same sign (positive or negative).
$\Delta f = f(t) - f(t^*) \ge 0 \Rightarrow f(t^*)$ is a local minimum.
$\Delta f = f(t) - f(t^*) \le 0 \Rightarrow f(t^*)$ is a local maximum.

(Necessary condition:) for an optimum, $df = 0$.

(Sufficient condition:) for a minimum, $d^2 f > 0$.
(Sufficient condition:) for a maximum, $d^2 f < 0$.

Function Vs. Functional

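Increment: $\Delta J := J(x^*(t) + \delta x(t)) - J(x^*(t))$

$$\Delta J = J(x^*(t)) + \underbrace{\left(\frac{\partial J}{\partial x}\right)_* \delta x(t)}_{\delta J} + \underbrace{\frac{1}{2!}\left(\frac{\partial^2 J}{\partial x^2}\right)_* (\delta x(t))^2}_{\delta^2 J} + \cdots - J(x^*(t))$$
$$\Delta J = \delta J + \delta^2 J + \cdots, \qquad \delta J = \text{first variation}$$

Example:
$$J(x(t)) = \int_{t_0}^{t_f} [2x^2(t) + 3x(t) + 4]\,dt \quad\Rightarrow\quad \delta J = \int_{t_0}^{t_f} [4x(t) + 3]\,\delta x(t)\,dt$$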
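The first-variation example can be checked numerically. The sketch below is only an illustration: the nominal curve $x^*(t) = t$, the perturbation $\delta x(t) = \epsilon \sin(\pi t)$ and the interval $[0,1]$ are assumptions chosen for the check, not part of the lecture. For this quadratic integrand the remainder $\Delta J - \delta J$ should equal $\int 2(\delta x)^2 dt = \epsilon^2$.

```matlab
% Assumed data for the check (not from the slides)
xs = @(t) t;                      % nominal curve x*(t)
ep = 1e-2;                        % size of the variation
dx = @(t) ep*sin(pi*t);           % variation delta x(t)

V = @(x) 2*x.^2 + 3*x + 4;        % integrand of J from the example
J = @(x) integral(@(t) V(x(t)), 0, 1, 'RelTol', 1e-12, 'AbsTol', 1e-14);

incrJ  = J(@(t) xs(t) + dx(t)) - J(xs);                  % increment Delta J
firstV = integral(@(t) (4*xs(t) + 3).*dx(t), 0, 1, ...
                  'RelTol', 1e-12, 'AbsTol', 1e-14);     % first variation delta J
remainder = incrJ - firstV;                              % second-order part

fprintf('Delta J = %.6e, delta J = %.6e, remainder = %.3e\n', ...
        incrJ, firstV, remainder);
```

Shrinking `ep` by a factor of 10 should shrink `remainder` by a factor of 100, which is the sense in which $\delta J$ is the linear part of $\Delta J$.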

Function Vs. Functional

A functional $J(x(t))$ is said to have a relative optimum at $x^*(t)$ if $\exists\, \epsilon > 0$ such that
$|x(t) - x^*(t)| < \epsilon \implies \Delta J$ has the same sign (positive or negative).
$\Delta J = J(x) - J(x^*) \ge 0 \Rightarrow J(x^*)$ is a local minimum.
$\Delta J = J(x) - J(x^*) \le 0 \Rightarrow J(x^*)$ is a local maximum.

Fundamental Theorem of CoV: (Necessary condition) For $x^*(t)$ to be a candidate for an optimum, the variation of $J$ must be zero on $x^*(t)$, i.e. $\delta J(x^*(t), \delta x(t)) = 0$ for all admissible values of $\delta x(t)$.
(Sufficient condition:) for a minimum, $\delta^2 J > 0$.
(Sufficient condition:) for a maximum, $\delta^2 J < 0$.

Function Vs. Functional

derivative of variation = variation of derivative

$$\frac{d}{dt}\delta x(t) = \frac{d}{dt}[x(t) - x^*(t)] = \frac{d}{dt}x(t) - \frac{d}{dt}x^*(t) = \delta \dot{x}(t).$$

integral of variation = variation of integral

$$\int_{t_0}^{t_f} \delta x(t)\,dt = \int_{t_0}^{t_f} (x(t) - x^*(t))\,dt = \int_{t_0}^{t_f} x(t)\,dt - \int_{t_0}^{t_f} x^*(t)\,dt = \delta\left(\int_{t_0}^{t_f} x(t)\,dt\right).$$

Variational Problems

Problem - I (fixed-fixed boundary condition)

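$$J(x(t)) = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt,$$

where $x(t_0) = x_0$ (fixed)

$x(t_f) = x_f$ (fixed)

Necessary Condition: Euler-Lagrange Equation

Suppose $x^*(t)$ solves the problem. Then
$$\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* = 0$$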

Variational Problems
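$$J(x^*(t) + \delta x(t)) - J(x^*(t)) = \int_{t_0}^{t_f} V(x^* + \delta x, \dot{x}^* + \delta\dot{x}, t)\,dt - \int_{t_0}^{t_f} V(x^*, \dot{x}^*, t)\,dt$$
This gives
$$\delta J(x^*(t), \delta x(t)) = \int_{t_0}^{t_f}\left[\left(\frac{\partial V}{\partial x}\right)_* \delta x + \left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta\dot{x}\right]dt$$
$$= \int_{t_0}^{t_f}\left[\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_*\right]\delta x\,dt + \left[\left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta x\right]_{t_0}^{t_f}$$

Now make sure that $\delta J(x^*, \delta x(t)) = 0$ for arbitrary $\delta x(t)$. With both end points fixed, $\delta x(t_0) = \delta x(t_f) = 0$, so the boundary term vanishes and the fundamental lemma of CoV applied to the integral gives the Euler-Lagrange equation.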

Variational Problems

Problem - II (fixed-partially fixed)

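$$J(x(t)) = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt,$$

where $x(t_0) = x_0$ (fixed)

$t_f$ is fixed but $x(t_f)$ is free.

Necessary Condition
Suppose $x^*(t)$ solves the problem. Then
$$\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* = 0 \quad \text{(E-L Equation)}$$
$$\left(\frac{\partial V}{\partial \dot{x}}\right)_{*\,t_f} \delta x(t_f) = 0 \quad \text{(Transversality Condition)}$$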

Variational Problems

Problem - III (fixed-free)

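$$J(x(t)) = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt,$$

where $x(t_0) = x_0$ (fixed)

$t_f$ and $x(t_f)$ both are free.

Necessary Condition
Suppose $x^*(t)$ solves the problem. Then
$$\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* = 0 \quad \text{(E-L Equation)}$$
$$\left(\frac{\partial V}{\partial \dot{x}}\right)_{*\,t_f} \delta x_f + \left[V - \dot{x}\frac{\partial V}{\partial \dot{x}}\right]_{*\,t_f} \delta t_f = 0 \quad \text{(Transversality Condition)}$$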

Variational Problem

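$$J(x^*(t) + \delta x(t)) - J(x^*(t)) = \int_{t_0}^{t_f + \delta t_f} V(x^* + \delta x, \dot{x}^* + \delta\dot{x}, t)\,dt - \int_{t_0}^{t_f} V(x^*, \dot{x}^*, t)\,dt$$
$$= \int_{t_0}^{t_f} V(x^* + \delta x, \dot{x}^* + \delta\dot{x}, t)\,dt - \int_{t_0}^{t_f} V(x^*, \dot{x}^*, t)\,dt + \underbrace{\int_{t_f}^{t_f + \delta t_f} V(x^* + \delta x, \dot{x}^* + \delta\dot{x}, t)\,dt}_{\text{end-time term}}$$

This gives
$$\delta J(x^*(t), \delta x(t)) = \int_{t_0}^{t_f}\left[\left(\frac{\partial V}{\partial x}\right)_* \delta x + \left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta\dot{x}\right]dt + \text{end-time term}$$
$$= \int_{t_0}^{t_f}\left[\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_*\right]\delta x\,dt + \left[\left(\frac{\partial V}{\partial \dot{x}}\right)_* \delta x\right]_{t_f} + \text{end-time term}$$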

Variational Problem

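The end-time integral is evaluated by the mean value theorem:
$$\int_{t_f}^{t_f + \delta t_f} V(x^* + \delta x, \dot{x}^* + \delta\dot{x}, t)\,dt = \delta t_f\, V(x^* + \delta x, \dot{x}^* + \delta\dot{x}, t)\big|_{t_f + \theta\,\delta t_f}, \quad 0 < \theta < 1$$
$$\approx \delta t_f\, V(x^*, \dot{x}^*, t)\big|_{t_f + \theta\,\delta t_f} \approx \delta t_f\, V(x^*, \dot{x}^*, t)\big|_{t_f}$$

The variation $\delta x(t_f)$ at the nominal final time and the variation $\delta x_f$ of the end point are related through the slope of the perturbed curve:
$$\dot{x}(t_f) + \delta\dot{x}(t_f) \approx \frac{\delta x_f - \delta x(t_f)}{\delta t_f}$$
$$\Rightarrow\ \delta x_f = \delta x(t_f) + \{\dot{x}(t_f) + \delta\dot{x}(t_f)\}\,\delta t_f$$
$$\Rightarrow\ \delta x(t_f) = \delta x_f - \dot{x}(t_f)\,\delta t_f \quad \text{(to first order)}$$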

Variational Problems

Problem - IV
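$$J(x(t)) = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt,$$

where $x(t_0) = x_0$ (fixed)

$t_f$ and $x(t_f)$ lie on a given curve $\theta(t)$.

Necessary Condition
Suppose $x^*(t)$ solves the problem. Then
$$\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* = 0 \quad \text{(E-L Equation)}$$
$$\left[V + (\dot{\theta} - \dot{x})\frac{\partial V}{\partial \dot{x}}\right]_{*\,t_f} \delta t_f = 0 \quad \text{(Transversality Condition)}$$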

Variational Problems

Free-free transversality condition:

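$$\left(\frac{\partial V}{\partial \dot{x}}\right)_{*\,t_f} \delta x_f + \left[V - \dot{x}\frac{\partial V}{\partial \dot{x}}\right]_{*\,t_f} \delta t_f = 0 \quad \text{(Transversality Condition)}$$

Note that, since the end point moves along $\theta(t)$,
$$\delta x_f = \left(\frac{d\theta}{dt}\right)_{t_f} \delta t_f$$

Then the transversality condition becomes
$$\left[V + (\dot{\theta} - \dot{x})\frac{\partial V}{\partial \dot{x}}\right]_{*\,t_f} \delta t_f = 0$$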

Variational Problems: Summary

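General Problem (fixed-free)

$$J(x(t)) = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt,$$
where $x(t_0) = x_0$ (fixed)
$t_f$ and $x(t_f)$ both are free.

The E-L condition must be fulfilled.
All other transversality conditions are special cases of the following condition.

Necessary Condition
Suppose $x^*(t)$ solves the problem. Then
$$\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* = 0 \quad \text{(E-L Equation)}$$
$$\left(\frac{\partial V}{\partial \dot{x}}\right)_{*\,t_f} \delta x_f + \left[V - \dot{x}\frac{\partial V}{\partial \dot{x}}\right]_{*\,t_f} \delta t_f = 0 \quad \text{(Transversality Condition)}$$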

Variational Problems: Summary

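General Condition (fixed-free case):

$$\left(\frac{\partial V}{\partial \dot{x}}\right)_{*\,t_f} \delta x_f + \left[V - \dot{x}\frac{\partial V}{\partial \dot{x}}\right]_{*\,t_f} \delta t_f = 0 \quad \text{(Transversality Condition)}$$

Special cases:
$$\begin{cases}
\text{No condition}, & \text{fixed-fixed};\\[4pt]
\left(\dfrac{\partial V}{\partial \dot{x}}\right)_{*\,t_f} = 0, & \text{fixed-}(t_f \text{ fixed}, x_f \text{ free});\\[8pt]
\left[V - \dot{x}\dfrac{\partial V}{\partial \dot{x}}\right]_{*\,t_f} = 0, & \text{fixed-}(t_f \text{ free}, x_f \text{ fixed});\\[8pt]
\left[V + (\dot{\theta} - \dot{x})\dfrac{\partial V}{\partial \dot{x}}\right]_{*\,t_f} = 0, & \text{fixed-}(t_f \text{ and } x(t_f) \text{ lie on } \theta(t)).
\end{cases}$$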

Example 1

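Optimize: $J = \int_0^1 (\dot{x}^2(t) + x(t))\,dt$ with $x(0) = 2$, $x(1) = 3$

E-L Equation

$$\frac{\partial V}{\partial x} - \frac{d}{dt}\frac{\partial V}{\partial \dot{x}} = 0 \ \Rightarrow\ 1 - \frac{d}{dt}[2\dot{x}(t)] = 0 \ \Rightarrow\ 2\ddot{x}(t) = 1$$
$$\Rightarrow\ \dot{x}(t) = \tfrac{1}{2}t + c_1 \ \Rightarrow\ x(t) = \tfrac{1}{4}t^2 + c_1 t + c_2$$

Boundary conditions

$$x(0) = c_2 = 2, \qquad x(1) = \tfrac{1}{4} + c_1 + c_2 = 3 \ \Rightarrow\ c_1 = 1 - \tfrac{1}{4} = \tfrac{3}{4}$$

Hence, $x^*(t) = \tfrac{1}{4}t^2 + \tfrac{3}{4}t + 2$.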
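As a cross-check of Example 1, the E-L equation $2\ddot{x} = 1$ with $x(0) = 2$, $x(1) = 3$ can be handed to a boundary value solver. This is a minimal sketch using MATLAB's `bvp4c`; the mesh and the constant initial guess are arbitrary choices made here, not part of the lecture.

```matlab
% First-order form with y = [x; xdot]; E-L equation gives xddot = 1/2
odefun  = @(t, y) [y(2); 0.5];
bcfun   = @(ya, yb) [ya(1) - 2; yb(1) - 3];     % x(0) = 2, x(1) = 3
solinit = bvpinit(linspace(0, 1, 5), [2; 0]);   % crude initial guess (assumption)
sol     = bvp4c(odefun, bcfun, solinit);

t      = linspace(0, 1, 11);
ynum   = deval(sol, t);                         % row 1: x(t), row 2: xdot(t)
xexact = t.^2/4 + 3*t/4 + 2;                    % closed form from Example 1
maxErr = max(abs(ynum(1, :) - xexact));
fprintf('max |x_num - x_exact| = %.2e\n', maxErr);
```

The same script covers Example 2 if the second boundary residual is replaced by `yb(2)`, i.e. $\dot{x}(1) = 0$.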

Example 2

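Optimize $J = \int_0^1 (\dot{x}^2(t) + x(t))\,dt$ with $x(0) = 2$, $x(1)$ is free

E-L Equation

$$\frac{\partial V}{\partial x} - \frac{d}{dt}\frac{\partial V}{\partial \dot{x}} = 0 \ \Rightarrow\ 1 - \frac{d}{dt}[2\dot{x}(t)] = 0 \ \Rightarrow\ 2\ddot{x}(t) = 1$$
$$\Rightarrow\ \dot{x}(t) = \tfrac{1}{2}t + c_1 \ \Rightarrow\ x(t) = \tfrac{1}{4}t^2 + c_1 t + c_2$$

Boundary/Transversality conditions

$$x(0) = c_2 = 2$$
$$\frac{\partial V}{\partial \dot{x}}\Big|_{t=1}\delta x(1) = 0 \ \Rightarrow\ \frac{\partial V}{\partial \dot{x}}\Big|_{t=1} = 0 \ \Rightarrow\ 2\dot{x}(t)\big|_{t=1} = 0 \ \Rightarrow\ \left(\tfrac{1}{2}t + c_1\right)\Big|_{t=1} = 0 \ \Rightarrow\ c_1 = -\tfrac{1}{2}$$

Hence, $x^*(t) = \tfrac{1}{4}t^2 - \tfrac{1}{2}t + 2$.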

Example 3
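Optimize $J = \int_0^{t_f} \sqrt{1 + \dot{x}^2}\,dt$ with $x(0) = 0$ and $(t_f, x_f)$ lie on $\theta(t) = -5t + 15$.

E-L Equation

$$\frac{\partial V}{\partial x} - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right) = 0 \ \Rightarrow\ -\frac{d}{dt}\left(\frac{\dot{x}}{\sqrt{1 + \dot{x}^2}}\right) = 0$$
$$\Rightarrow\ -\frac{\ddot{x}\sqrt{1 + \dot{x}^2} - \dfrac{2\dot{x}^2\ddot{x}}{2\sqrt{1 + \dot{x}^2}}}{1 + \dot{x}^2} = 0 \ \Rightarrow\ \ddot{x}(1 + \dot{x}^2) - \dot{x}^2\ddot{x} = 0 \ \Rightarrow\ \ddot{x} = 0 \ \Rightarrow\ x = c_1 t + c_2$$

Boundary/Transversality conditions

$$x(0) = c_2 = 0$$
$$\left[V + (\dot{\theta} - \dot{x})\frac{\partial V}{\partial \dot{x}}\right]_{t = t_f} = 0 \ \Rightarrow\ \left[\sqrt{1 + \dot{x}^2} + (-5 - c_1)\frac{\dot{x}}{\sqrt{1 + \dot{x}^2}}\right]_{t = t_f} = 0$$
$$\Rightarrow\ [1 + \dot{x}^2 - 5\dot{x} - c_1\dot{x}]_{t = t_f} = 0 \ \Rightarrow\ 1 + c_1^2 - 5c_1 - c_1^2 = 0 \ \Rightarrow\ c_1 = \tfrac{1}{5}$$

Hence, $x^*(t) = \tfrac{1}{5}t$.
To find $t_f$:
$$\tfrac{1}{5}t_f = -5t_f + 15 \ \Rightarrow\ \tfrac{26}{5}t_f = 15 \ \Rightarrow\ t_f = \tfrac{75}{26}$$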
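Example 3 admits a geometric cross-check: every extremal through the origin is a straight line, so $J$ is simply the distance from $(0,0)$ to the end point on $\theta(t)$. The sketch below minimizes that distance directly with `fminbnd`; the search interval $[0, 3]$ is an assumption made here (it contains the point where the target line meets the $t$-axis).

```matlab
theta = @(t) -5*t + 15;                 % target line from Example 3
len   = @(tf) hypot(tf, theta(tf));     % length of the straight path to (tf, theta(tf))
tfNum = fminbnd(len, 0, 3);             % assumed search interval
slope = theta(tfNum)/tfNum;             % slope c1 of the optimal straight line

fprintf('tf = %.4f (75/26 = %.4f), slope = %.4f\n', tfNum, 75/26, slope);
```

The recovered slope is the negative reciprocal of the slope of $\theta$, i.e. the optimal path meets the target line at a right angle, which is what the transversality condition encodes for this arc-length integrand.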
Variational Problems: Sufficient Conditions

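$$J(x^*(t) + \delta x(t)) - J(x^*(t)) = \int_{t_0}^{t_f} V(x^* + \delta x, \dot{x}^* + \delta\dot{x}, t)\,dt - \int_{t_0}^{t_f} V(x^*, \dot{x}^*, t)\,dt$$

This gives
$$\delta^2 J = \frac{1}{2}\int_{t_0}^{t_f}\left[\left(\frac{\partial^2 V}{\partial x^2}\right)_* (\delta x)^2 + 2\left(\frac{\partial^2 V}{\partial x\,\partial \dot{x}}\right)_* \delta x\,\delta\dot{x} + \left(\frac{\partial^2 V}{\partial \dot{x}^2}\right)_* (\delta\dot{x})^2\right]dt$$
$$= \frac{1}{2}\int_{t_0}^{t_f}\begin{bmatrix}\delta x & \delta\dot{x}\end{bmatrix}\begin{bmatrix}\dfrac{\partial^2 V}{\partial x^2} & \dfrac{\partial^2 V}{\partial x\,\partial\dot{x}}\\[8pt] \dfrac{\partial^2 V}{\partial x\,\partial\dot{x}} & \dfrac{\partial^2 V}{\partial\dot{x}^2}\end{bmatrix}_* \begin{bmatrix}\delta x\\ \delta\dot{x}\end{bmatrix}dt$$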

Variational Problems in multidimensional space

General (fixed-free) case

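Optimize
$$J(x(t)) = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt, \quad x(t) \in \mathbb{R}^n$$

where $x(t_0) = x_0$ (fixed)

$t_f$ and $x(t_f)$ both are free.

Necessary Condition
Suppose $x^*(t)$ solves the problem. Then
$$\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* = 0 \quad \text{(E-L Equation)}$$
$$\left[\left(\frac{\partial V}{\partial \dot{x}}\right)_*^T\right]_{t_f} \delta x_f + \left[V - \dot{x}^T\left(\frac{\partial V}{\partial \dot{x}}\right)_*\right]_{t_f} \delta t_f = 0 \quad \text{(Transversality Condition)}$$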

Variational Problems in multidimensional space

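General Condition (fixed-free case):

$$\left[\left(\frac{\partial V}{\partial \dot{x}}\right)_*^T\right]_{t_f} \delta x_f + \left[V - \dot{x}^T\left(\frac{\partial V}{\partial \dot{x}}\right)_*\right]_{t_f} \delta t_f = 0 \quad \text{(Transversality Condition)}$$

Special cases:
$$\begin{cases}
\text{No condition}, & \text{fixed-fixed};\\[4pt]
\left[\left(\dfrac{\partial V}{\partial \dot{x}}\right)_*^T\right]_{t_f} = 0, & \text{fixed-}(t_f \text{ fixed}, x_f \text{ free});\\[8pt]
\left[V - \dot{x}^T\left(\dfrac{\partial V}{\partial \dot{x}}\right)_*\right]_{t_f} = 0, & \text{fixed-}(t_f \text{ free}, x_f \text{ fixed}).
\end{cases}$$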

Constrained Variational Problems
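Optimize: $J(x(t)) = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt, \quad x(t) \in \mathbb{R}^n$
subject to
$$g(x(t), \dot{x}(t)) = 0$$
where $x(t_0) = x_0$ (fixed); $t_f$ and $x(t_f)$ are also fixed.
Method of Lagrange multipliers:

$$L = L(x(t), \dot{x}(t), \lambda, t) = V(x(t), \dot{x}(t), t) + \lambda^T g(x(t), \dot{x}(t))$$

and optimize a new
$$J(x(t)) = \int_{t_0}^{t_f} L(x(t), \dot{x}(t), \lambda, t)\,dt, \quad x(t) \in \mathbb{R}^n,\ \lambda \in \mathbb{R}^m$$

Necessary Condition
Suppose $x^*(t)$ solves the problem. Then
$$\left(\frac{\partial L}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right)_* = 0 \quad \text{(E-L Equation)}$$
$$\left(\frac{\partial L}{\partial \lambda}\right)_* - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\lambda}}\right)_* = 0 \quad \text{(E-L Equation)} \ \Rightarrow\ g = 0$$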

Constrained Variational Problems

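Optimize: $J(x(t)) = \int_{t_0}^{t_f} V(x_1(t), x_2(t), \dot{x}_1(t), \dot{x}_2(t), t)\,dt, \quad x(t) \in \mathbb{R}^2$
subject to
$$g(x_1(t), x_2(t), \dot{x}_1(t), \dot{x}_2(t)) = 0$$
where $x(t_0) = x_0$ (fixed); $t_f$ and $x(t_f)$ are also fixed.
Method of Lagrange multipliers:
$$L = L(x_1(t), x_2(t), \dot{x}_1(t), \dot{x}_2(t), \lambda, t) = V + \lambda g$$
and optimize a new
$$J(x(t)) = \int_{t_0}^{t_f} L(x_1(t), x_2(t), \dot{x}_1(t), \dot{x}_2(t), \lambda, t)\,dt$$

Now consider $J(x^*(t) + \delta x(t)) - J(x^*(t))$ and find

$$\delta J = \int_{t_0}^{t_f}\left[\left(\frac{\partial L}{\partial x_1}\right)_* \delta x_1 + \left(\frac{\partial L}{\partial \dot{x}_1}\right)_* \delta\dot{x}_1 + \left(\frac{\partial L}{\partial x_2}\right)_* \delta x_2 + \left(\frac{\partial L}{\partial \dot{x}_2}\right)_* \delta\dot{x}_2\right]dt$$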

Constrained Variational Problems

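$$\delta J = \int_{t_0}^{t_f}\left[\left(\frac{\partial L}{\partial x_1}\right)_* \delta x_1 + \left(\frac{\partial L}{\partial \dot{x}_1}\right)_* \delta\dot{x}_1 + \left(\frac{\partial L}{\partial x_2}\right)_* \delta x_2 + \left(\frac{\partial L}{\partial \dot{x}_2}\right)_* \delta\dot{x}_2\right]dt$$
$$= \int_{t_0}^{t_f}\left[\left(\frac{\partial L}{\partial x_1}\right)_* - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}_1}\right)_*\right]\delta x_1\,dt + \left[\left(\frac{\partial L}{\partial \dot{x}_1}\right)_* \delta x_1\right]_{t_0}^{t_f}$$
$$+ \int_{t_0}^{t_f}\left[\left(\frac{\partial L}{\partial x_2}\right)_* - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}_2}\right)_*\right]\delta x_2\,dt + \left[\left(\frac{\partial L}{\partial \dot{x}_2}\right)_* \delta x_2\right]_{t_0}^{t_f}$$

Now make sure that $\delta J = 0$ for arbitrary $\delta x_1(t)$. Remember! $\delta x_1(t)$ and $\delta x_2(t)$ are not independent of each other, because the constraint $g = 0$ links them.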

Lecture 2
Optimal Control Problems

Optimal Control Problems

Find a $u(t) \in \mathbb{R}^m$ such that $u(t)$, with the corresponding trajectory of

$$\dot{x} = f(x, u, t); \quad x(t) \in \mathbb{R}^n,\ f : \mathbb{R}^n \times \mathbb{R}^m \times [t_0, t_f] \to \mathbb{R}^n$$
where $x(t_0) = x_0$ (fixed), optimizes the performance index
$$J(u(t)) = S(x_f, t_f) + \int_{t_0}^{t_f} V(x, u, t)\,dt$$

Optimal Control Problems

Some special cases of the performance index

$$J(u(t)) = S(x_f, t_f) + \int_{t_0}^{t_f} V(x, u, t)\,dt \quad \text{s/t}\ \dot{x} = f(x, u, t);\ x(t_0) = x_0.$$

1. Minimum time problem

$$J = \int_{t_0}^{t_f} dt = t_f - t_0$$

2. Minimum control efforts problem

$$J = \frac{1}{2}\int_{t_0}^{t_f} u^T u\,dt = \frac{1}{2}\int_{t_0}^{t_f} \|u\|^2\,dt$$
In general, $J = \frac{1}{2}\int_{t_0}^{t_f} u^T R u\,dt \quad (R > 0)$

3. State tracking about a fixed state trajectory $C$ with minimum control efforts problem

$$J = \frac{1}{2}\int_{t_0}^{t_f} [(x - C)^T Q (x - C) + u^T R u]\,dt \quad (Q \ge 0,\ R > 0)$$

Optimal Control Problems

Some special cases of the performance index

$$J(u(t)) = S(x_f, t_f) + \int_{t_0}^{t_f} V(x, u, t)\,dt$$

4. State tracking about the zero state trajectory with minimum control efforts problem

$$J = \frac{1}{2}\int_{t_0}^{t_f} [x^T Q x + u^T R u]\,dt \quad (Q \ge 0,\ R > 0)$$

5. Minimum control efforts problem for finding the final state $x_f$ such that $x_f$ reaches close to a constant $C$
$$J = (x_f - C)^T S_f (x_f - C) + \frac{1}{2}\int_{t_0}^{t_f} u^T R u\,dt \quad (S_f \ge 0,\ R > 0)$$

Optimal Control Problems

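Find a $u(t) \in \mathbb{R}^m$ such that the trajectory of

$$\dot{x} = f(x, u, t); \quad x(t) \in \mathbb{R}^n,\ f : \mathbb{R}^n \times \mathbb{R}^m \times [t_0, t_f] \to \mathbb{R}^n$$
with $x(t_0) = x_0$ (fixed) optimizes the performance index
$$J(u(t)) = S(x_f, t_f) + \int_{t_0}^{t_f} V(x, u, t)\,dt = S(x_0, t_0) + \int_{t_0}^{t_f}\left[V(x, u, t) + \frac{d}{dt}S(x, t)\right]dt$$
$$\left(\because\ \int_{t_0}^{t_f} \frac{d}{dt}S(x, t)\,dt = S(x_f, t_f) - S(x_0, t_0)\right)$$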

Optimal Control Problems

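Since $S(x_0, t_0)$ is a fixed constant, it does not affect the optimizer. Hence the general optimal control problem is

$$J(u(t)) = \int_{t_0}^{t_f}\left[V(x, u, t) + \frac{d}{dt}S(x, t)\right]dt$$
such that
$\dot{x} = f(x, u, t)$; $x(t_0) = x_0$ (fixed), $t_f$ and $x_f$ are free.

Now using a Lagrange multiplier $\lambda(t)$, the problem becomes

Optimize: $J(u(t)) = \int_{t_0}^{t_f}\left[V + \frac{dS}{dt} + \lambda^T (f - \dot{x})\right]dt$ with $x(t_0) = x_0$ and $t_f$, $x_f$ free.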

Optimal Control Problems

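Optimize: $J(u(t)) = \int_{t_0}^{t_f}\left[V + \frac{dS}{dt} + \lambda^T (f - \dot{x})\right]dt$ with $x(t_0) = x_0$ and $t_f$, $x_f$ free.

$$H \equiv H(x(t), u(t), \lambda(t), t) = V(x, u, t) + \lambda^T(t) f(x, u, t) \quad \text{(Hamiltonian)}$$

$$L \equiv L(x(t), \dot{x}(t), u(t), \lambda(t), t) = H + \frac{dS}{dt} - \lambda^T(t)\dot{x} \quad \text{(Lagrangian)}$$

Then the problem becomes

Optimize: $J = \int_{t_0}^{t_f} L(x, \dot{x}, u, \lambda, t)\,dt$ with $x(t_0) = x_0$ and $t_f$, $x_f$ free.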

Recall Basic frame-work

General (fixed-free) case

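Optimize
$$J(x(t)) = \int_{t_0}^{t_f} V(x(t), \dot{x}(t), t)\,dt, \quad x(t) \in \mathbb{R}^n$$

where $x(t_0) = x_0$ (fixed)

$t_f$ and $x(t_f)$ both are free.

Necessary Condition
Suppose $x^*(t)$ solves the problem. Then
$$\left(\frac{\partial V}{\partial x}\right)_* - \frac{d}{dt}\left(\frac{\partial V}{\partial \dot{x}}\right)_* = 0 \quad \text{(E-L Equation)}$$
$$\left[\left(\frac{\partial V}{\partial \dot{x}}\right)_*^T\right]_{t_f} \delta x_f + \left[V - \dot{x}^T\left(\frac{\partial V}{\partial \dot{x}}\right)_*\right]_{t_f} \delta t_f = 0 \quad \text{(Transversality Condition)}$$

Optimize: $J = \int_{t_0}^{t_f} L(x, \dot{x}, u, \lambda, t)\,dt$ with $x(t_0) = x_0$ and $t_f$, $x_f$ free.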

Optimal Control Problems

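Optimize: $J = \int_{t_0}^{t_f} L(x, \dot{x}, u, \lambda, t)\,dt$ with $x(t_0) = x_0$ and $t_f$, $x_f$ free, where
$$L = H + \frac{dS}{dt} - \lambda^T(t)\dot{x}, \qquad H = V(x, u, t) + \lambda^T(t) f(x, u, t)$$

Necessary Condition (E-L Equation)

$$\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} = 0 \ \Rightarrow\ \dot{\lambda} = -\frac{\partial H}{\partial x}$$
$$\frac{\partial L}{\partial u} - \frac{d}{dt}\frac{\partial L}{\partial \dot{u}} = 0 \ \Rightarrow\ \frac{\partial H}{\partial u} = 0$$
$$\frac{\partial L}{\partial \lambda} - \frac{d}{dt}\frac{\partial L}{\partial \dot{\lambda}} = 0 \ \Rightarrow\ \frac{\partial H}{\partial \lambda} - \dot{x} = 0 \ \Rightarrow\ f - \dot{x} = 0 \ \Rightarrow\ \dot{x} = f$$

Necessary Condition (Transversality Condition)

$$\left[\left(\frac{\partial L}{\partial \dot{x}}\right)^T\right]_{t_f} \delta x_f + \left[L - \dot{x}^T\frac{\partial L}{\partial \dot{x}}\right]_{t_f} \delta t_f = 0 \ \Rightarrow\ \left[\left(\frac{\partial S}{\partial x} - \lambda\right)^T\right]_{t_f} \delta x_f + \left[H + \frac{\partial S}{\partial t}\right]_{t_f} \delta t_f = 0$$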

Optimal Control Problems
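Proof: $\dfrac{\partial L}{\partial x} - \dfrac{d}{dt}\dfrac{\partial L}{\partial \dot{x}} = 0 \ \Rightarrow\ \dot{\lambda} = -\dfrac{\partial H}{\partial x}$

Recall $L = H + \frac{dS}{dt} - \lambda^T(t)\dot{x}$, $H = V(x, u, t) + \lambda^T(t) f(x, u, t)$, and
$$\frac{dS}{dt} = \frac{\partial S}{\partial t} + \left(\frac{\partial S}{\partial x}\right)^T \dot{x}$$

$$\frac{\partial L}{\partial x} = \frac{\partial}{\partial x}\left[H + \frac{dS}{dt} - \lambda^T \dot{x}\right] = \frac{\partial H}{\partial x} + \frac{\partial^2 S}{\partial x\,\partial t} + \frac{\partial^2 S}{\partial x^2}\dot{x}$$
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} = \frac{d}{dt}\left[\frac{\partial}{\partial \dot{x}}\left\{\frac{\partial S}{\partial t} + \left(\frac{\partial S}{\partial x}\right)^T \dot{x}\right\} - \lambda\right] = \frac{d}{dt}\left[\frac{\partial S}{\partial x} - \lambda\right] \quad \left(\because\ \frac{\partial S}{\partial x} \equiv \frac{\partial S}{\partial x}(x, t)\right)$$
$$= \frac{\partial^2 S}{\partial t\,\partial x} + \frac{\partial^2 S}{\partial x^2}\dot{x} - \dot{\lambda}$$

Subtracting, the second-derivative terms of $S$ cancel and $\frac{\partial H}{\partial x} + \dot{\lambda} = 0$ remains.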

Optimal Control Problems

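Proof: $\left[\left(\frac{\partial L}{\partial \dot{x}}\right)^T\right]_{t_f} \delta x_f + \left[L - \dot{x}^T\frac{\partial L}{\partial \dot{x}}\right]_{t_f} \delta t_f = 0 \ \Rightarrow\ \left[\left(\frac{\partial S}{\partial x} - \lambda\right)^T\right]_{t_f} \delta x_f + \left[H + \frac{\partial S}{\partial t}\right]_{t_f} \delta t_f = 0$

Recall $L = H + \frac{dS}{dt} - \lambda^T(t)\dot{x}$, $H = V(x, u, t) + \lambda^T(t) f(x, u, t)$, and
$$\frac{dS}{dt} = \frac{\partial S}{\partial t} + \left(\frac{\partial S}{\partial x}\right)^T \dot{x}$$

$$\left[\left(\frac{\partial L}{\partial \dot{x}}\right)^T\right]_{t_f} \delta x_f + \left[L - \dot{x}^T\left(\frac{\partial S}{\partial x} - \lambda\right)\right]_{t_f} \delta t_f$$
$$= \left[\left(\frac{\partial S}{\partial x} - \lambda\right)^T\right]_{t_f} \delta x_f + \left[H + \frac{\partial S}{\partial t} + \left(\frac{\partial S}{\partial x}\right)^T \dot{x} - \lambda^T \dot{x} - \dot{x}^T\frac{\partial S}{\partial x} + \dot{x}^T\lambda\right]_{t_f} \delta t_f$$
$$= \left[\left(\frac{\partial S}{\partial x} - \lambda\right)^T\right]_{t_f} \delta x_f + \left[H + \frac{\partial S}{\partial t}\right]_{t_f} \delta t_f$$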

Optimal Control Problems

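Optimize: $J = \int_{t_0}^{t_f} L(x, \dot{x}, u, \lambda, t)\,dt$ with $x(t_0) = x_0$ and $t_f$, $x_f$ free, where
$$L = H + \frac{dS}{dt} - \lambda^T(t)\dot{x}, \qquad H = V(x, u, t) + \lambda^T(t) f(x, u, t)$$

Necessary Condition (E-L Equations)

$$\dot{\lambda} = -\frac{\partial H}{\partial x} \quad \text{(Costate (adjoint) Equation)}$$
$$\frac{\partial H}{\partial u} = 0 \quad \text{(Optimal control Equation)}$$
$$\dot{x} = \frac{\partial H}{\partial \lambda} = f \quad \text{(State Equation)}$$

Boundary Conditions
$$\left[\left(\frac{\partial S}{\partial x} - \lambda\right)^T\right]_{t_f} \delta x_f + \left[H + \frac{\partial S}{\partial t}\right]_{t_f} \delta t_f = 0 \quad \text{(Transversality Condition)}$$
$$x(t_0) = x_0 \quad \text{(Given initial condition)}$$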

Optimal Control Problems

Special cases of

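Boundary Conditions
$$\left[\left(\frac{\partial S}{\partial x} - \lambda\right)^T\right]_{t_f} \delta x_f + \left[H + \frac{\partial S}{\partial t}\right]_{t_f} \delta t_f = 0 \quad \text{(Transversality Condition)}$$
$$x(t_0) = x_0 \quad \text{(Given initial condition)}$$

Case 1: $t_f$ is fixed, $x_f$ is free

$$\left(\frac{\partial S}{\partial x} - \lambda\right)_{t_f} = 0 \quad \text{(Transversality Condition)}$$

Case 2: $x_f$ is fixed, $t_f$ is free

$$\left(H + \frac{\partial S}{\partial t}\right)_{t_f} = 0 \quad \text{(Transversality Condition)}$$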


Optimal Control Problems: Algorithm and challenges

Step 1. Solve the optimal control equation (a set of m algebraic equations) for u.

Step 2. Substitute u from Step 1 into the state and costate equations (each of them is a set of n differential equations).
Step 3. Solve the state and costate system with all the boundary conditions.
The boundary conditions are split between the initial and the final time, so this is a two-point boundary value problem (TPBVP).
Solving a TPBVP demands a good numerical scheme.
The algorithm provides an open loop optimal control.

Example 4

Find optimal control to optimize

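$$J = \frac{1}{2}\left\|x_f - \begin{bmatrix}5\\ 2\end{bmatrix}\right\|^2 + \frac{1}{2}\int_{t_0}^{t_f} u^2(t)\,dt$$

subject to
$$\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix}x_2\\ -x_2 + u\end{bmatrix}; \quad x(0) = \begin{bmatrix}0\\ 0\end{bmatrix} \text{ and } t_f = 2$$

$$H = \frac{1}{2}u^2 + \begin{bmatrix}\lambda_1 & \lambda_2\end{bmatrix}\begin{bmatrix}x_2\\ -x_2 + u\end{bmatrix} = \frac{1}{2}u^2 + \lambda_1 x_2 + \lambda_2(-x_2 + u)$$

Costate equation:
$$\dot{\lambda} = -\frac{\partial H}{\partial x} \ \Rightarrow\ \begin{bmatrix}\dot{\lambda}_1\\ \dot{\lambda}_2\end{bmatrix} = -\begin{bmatrix}0\\ \lambda_1 - \lambda_2\end{bmatrix} = \begin{bmatrix}0\\ -\lambda_1 + \lambda_2\end{bmatrix} = \begin{bmatrix}0 & 0\\ -1 & 1\end{bmatrix}\begin{bmatrix}\lambda_1\\ \lambda_2\end{bmatrix}$$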

Example 4 cont...

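Optimal control Equation:

$$\frac{\partial H}{\partial u} = 0 \ \Rightarrow\ u + \lambda_2 = 0 \ \Rightarrow\ u(t) = -\lambda_2(t)$$

State equation:
$$\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix}x_2\\ -x_2 + u\end{bmatrix} = \begin{bmatrix}x_2\\ -x_2 - \lambda_2\end{bmatrix}$$

Boundary Conditions:
$$\left[\left(\frac{\partial S}{\partial x} - \lambda\right)^T\right]_{t_f} = 0 \ \Rightarrow\ \begin{bmatrix}\lambda_1(2)\\ \lambda_2(2)\end{bmatrix} = \begin{bmatrix}x_1(2) - 5\\ x_2(2) - 2\end{bmatrix} \quad\text{and}\quad \begin{bmatrix}x_1(0)\\ x_2(0)\end{bmatrix} = \begin{bmatrix}0\\ 0\end{bmatrix}$$

State and Costate system:

$$\begin{bmatrix}\dot{x}_1\\ \dot{x}_2\\ \dot{\lambda}_1\\ \dot{\lambda}_2\end{bmatrix} = A\begin{bmatrix}x_1\\ x_2\\ \lambda_1\\ \lambda_2\end{bmatrix}; \quad \text{where } A = \begin{bmatrix}0 & 1 & 0 & 0\\ 0 & -1 & 0 & -1\\ 0 & 0 & 0 & 0\\ 0 & 0 & -1 & 1\end{bmatrix}$$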

Example 4 cont...

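Solve state and costate equations (with $c_1 = \lambda_1(0)$, $c_2 = \lambda_2(0)$ unknown):

$$\begin{bmatrix}x_1(2)\\ x_2(2)\\ \lambda_1(2)\\ \lambda_2(2)\end{bmatrix} = \left[e^{At}\right]_{t=2}\begin{bmatrix}x_1(0)\\ x_2(0)\\ \lambda_1(0)\\ \lambda_2(0)\end{bmatrix} = \begin{bmatrix}1 & 0.86 & 1.63 & -2.76\\ 0 & 0.14 & 2.76 & -3.63\\ 0 & 0 & 1 & 0\\ 0 & 0 & -6.39 & 7.39\end{bmatrix}\begin{bmatrix}0\\ 0\\ c_1\\ c_2\end{bmatrix}$$

By substituting $\begin{bmatrix}\lambda_1(2)\\ \lambda_2(2)\end{bmatrix} = \begin{bmatrix}x_1(2) - 5\\ x_2(2) - 2\end{bmatrix}$ and rearranging the equations for the unknowns $x_1(2)$, $x_2(2)$, $c_1$ and $c_2$, obtain

$$\begin{bmatrix}1 & 0 & -1.63 & 2.76\\ 0 & 1 & -2.76 & 3.63\\ 1 & 0 & -1 & 0\\ 0 & 1 & 6.39 & -7.39\end{bmatrix}\begin{bmatrix}x_1(2)\\ x_2(2)\\ c_1\\ c_2\end{bmatrix} = \begin{bmatrix}0\\ 0\\ 5\\ 2\end{bmatrix} \ \Rightarrow\ \begin{bmatrix}x_1(2)\\ x_2(2)\\ c_1\\ c_2\end{bmatrix} = \begin{bmatrix}2.30\\ 1.33\\ -2.70\\ -2.42\end{bmatrix}$$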

Example 4 cont...

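Optimal control is: $u(t) = -\lambda_2(t)$ where

$$\begin{bmatrix}x_1(t)\\ x_2(t)\\ \lambda_1(t)\\ \lambda_2(t)\end{bmatrix} = \left[e^{At}\right]\begin{bmatrix}x_1(0)\\ x_2(0)\\ c_1\\ c_2\end{bmatrix}, \qquad \text{here } A = \begin{bmatrix}0 & 1 & 0 & 0\\ 0 & -1 & 0 & -1\\ 0 & 0 & 0 & 0\\ 0 & 0 & -1 & 1\end{bmatrix}$$
$$= \begin{bmatrix}1 & 1 - e^{-t} & \frac{1}{2}(e^t - e^{-t}) - t & 1 - \frac{1}{2}(e^t + e^{-t})\\ 0 & e^{-t} & -1 + \frac{1}{2}(e^t + e^{-t}) & \frac{1}{2}(e^{-t} - e^t)\\ 0 & 0 & 1 & 0\\ 0 & 0 & 1 - e^t & e^t\end{bmatrix}\begin{bmatrix}0\\ 0\\ -2.70\\ -2.42\end{bmatrix}$$

The last row gives $\lambda_2(t) = -2.70(1 - e^t) - 2.42\,e^t = -2.70 + 0.28\,e^t$. Thus, since $u = -\lambda_2$, the optimal control is

$$u(t) = 2.70 - 0.28\,e^t, \quad 0 \le t \le 2.$$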
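The numbers in Example 4 can be reproduced with a few lines of MATLAB. This is a sketch of one way to organize the computation, not necessarily the way the slides obtained them: because $x(0) = 0$, only the right-hand blocks of $e^{A t_f}$ act on the unknown $\lambda(0) = (c_1, c_2)^T$, and the terminal condition $\lambda(t_f) = x(t_f) - (5, 2)^T$ becomes a $2 \times 2$ linear system. The variable names are chosen here for illustration.

```matlab
A  = [0 1 0 0; 0 -1 0 -1; 0 0 0 0; 0 0 -1 1];   % state-costate matrix of Example 4
tf = 2;
target = [5; 2];

Phi   = expm(A*tf);                  % transition matrix over [0, tf]
Phi12 = Phi(1:2, 3:4);               % maps lambda(0) to x(tf)      (x(0) = 0)
Phi22 = Phi(3:4, 3:4);               % maps lambda(0) to lambda(tf)

% lambda(tf) = x(tf) - target  =>  (Phi22 - Phi12)*c = -target
c  = (Phi22 - Phi12) \ (-target);    % c = [c1; c2] = lambda(0)
xf = Phi12*c;                        % final state x(tf)

% Optimal control u(t) = -lambda_2(t), evaluated for a scalar time t
u = @(t) -[0 0 0 1]*expm(A*t)*[0; 0; c];

fprintf('c = [%.4f %.4f], x(2) = [%.4f %.4f], u(0) = %.4f, u(2) = %.4f\n', ...
        c(1), c(2), xf(1), xf(2), u(0), u(2));
```

The control comes out positive on $[0, 2]$, as expected for a system that has to be driven from the origin toward the target $(5, 2)$.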

Example 4 cont...
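Optimal control is: $u(t) = -\lambda_2(t) = 2.70 - 0.28\,e^t$, $0 \le t \le 2$.
Optimal state/costate are
$$z(t) = \begin{bmatrix}x_1(t)\\ x_2(t)\\ \lambda_1(t)\\ \lambda_2(t)\end{bmatrix} = e^{At}\begin{bmatrix}0\\ 0\\ -2.70\\ -2.42\end{bmatrix}$$
with $A$ the matrix of the state and costate system.
Thus, $H = \frac{1}{2}u^2 + \lambda_1 x_2 + \lambda_2(-x_2 + u)$ along the optimal path:

A = [0 1 0 0;0 -1 0 -1;0 0 0 0;0 0 -1 1];
syms t;                          % requires the Symbolic Math Toolbox
eAt = expm(A*t);
z0 = [0; 0;-2.70;-2.42];
z = eAt*z0;                      % z = [x1; x2; lambda1; lambda2]
% u = -z(4); the part after "[z(3)" is reconstructed from the formula for H above
H = 0.5*z(4)*z(4) + [z(3) z(4)]*[z(2); -z(2)-z(4)];
H = simplify(H)                  % should not depend on t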

One result to check the optimal control

If H is not an explicit function of time t, then H is constant along the optimal path.


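Necessary Condition (E-L Equations), with $L = H + \frac{dS}{dt} - \lambda^T(t)\dot{x}$ and $H = V(x, u, t) + \lambda^T(t) f(x, u, t)$:

$$\dot{\lambda} = -\frac{\partial H}{\partial x} \quad \text{(Costate (adjoint) Equation)}$$
$$\frac{\partial H}{\partial u} = 0 \quad \text{(Optimal control Equation)}$$
$$\dot{x} = f = \frac{\partial H}{\partial \lambda} \quad \text{(State Equation)}$$

$$\frac{dH}{dt} = \frac{\partial H}{\partial t} + \left(\frac{\partial H}{\partial x}\right)^T \dot{x} + \left(\frac{\partial H}{\partial u}\right)^T \dot{u} + \left(\frac{\partial H}{\partial \lambda}\right)^T \dot{\lambda}$$
$$= \frac{\partial H}{\partial t} - \dot{\lambda}^T \dot{x} + \dot{x}^T \dot{\lambda} + \left(\frac{\partial H}{\partial u}\right)^T \dot{u} = \frac{\partial H}{\partial t}$$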

Linear Quadratic Optimal Control problem

Optimize: $J(u(t)) = \frac{1}{2}(x_f - C_f)^T S_f (x_f - C_f) + \frac{1}{2}\int_{t_0}^{t_f} [(x - C)^T Q (x - C) + u^T R u]\,dt$
subject to $\dot{x} = Ax + Bu$ with $x(t_0) = x_0$ (fixed),
$t_f$ is fixed and finite, and $x(t_f)$ is free.
$S_f, Q \ge 0$ and $R > 0$

Q: Error weighting matrix

R: Control weighting matrix
$S_f$: Terminal cost weighting matrix
State Regulator Problem (Linear Quadratic Regulator Problem): the case $C = 0$, $C_f = 0$
Finite and Infinite Time Horizon Problem
$S_f = 0$ in the infinite-time horizon state regulator problem

Recall - Optimal Control Problems - Basic Frame-work

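Optimize: $J = \int_{t_0}^{t_f} L(x, \dot{x}, u, \lambda, t)\,dt$ with $x(t_0) = x_0$ and $t_f$, $x_f$ free, where
$$L = H + \frac{dS}{dt} - \lambda^T(t)\dot{x}, \qquad H = V(x, u, t) + \lambda^T(t) f(x, u, t)$$

Necessary Condition (E-L Equations)

$$\dot{\lambda} = -\frac{\partial H}{\partial x} \quad \text{(Costate (adjoint) Equation)}$$
$$\frac{\partial H}{\partial u} = 0 \quad \text{(Optimal control Equation)}$$
$$\dot{x} = \frac{\partial H}{\partial \lambda} = f \quad \text{(State Equation)}$$

Boundary Conditions
$$\left[\left(\frac{\partial S}{\partial x} - \lambda\right)^T\right]_{t_f} \delta x_f + \left[H + \frac{\partial S}{\partial t}\right]_{t_f} \delta t_f = 0 \quad \text{(Transversality Condition)}$$
$$x(t_0) = x_0 \quad \text{(Given initial condition)}$$

Recall - Optimal Control Problems - Basic Frame-work

Special cases of the boundary conditions above:

Case 1: $t_f$ is fixed, $x_f$ is free

$$\left(\frac{\partial S}{\partial x} - \lambda\right)_{t_f} = 0 \quad \text{(Transversality Condition)}$$

Case 2: $x_f$ is fixed, $t_f$ is free

$$\left(H + \frac{\partial S}{\partial t}\right)_{t_f} = 0 \quad \text{(Transversality Condition)}$$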

Linear Quadratic Regulator (LQR) problem - I

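Optimize: $J(u(t)) = \frac{1}{2}x_f^T S_f x_f + \frac{1}{2}\int_{t_0}^{t_f} [x^T Q x + u^T R u]\,dt$
subject to $\dot{x} = Ax + Bu$ with $x(t_0) = x_0$ (fixed),
$t_f$ is fixed and finite, and $x(t_f)$ is free.
$S_f, Q \ge 0$ and $R > 0$

$$H \equiv H(x(t), u(t), \lambda(t), t) = \frac{1}{2}[x^T Q x + u^T R u] + \lambda^T [Ax + Bu] \quad \text{(Hamiltonian)}$$

Costate equation:

$$\dot{\lambda} = -\frac{\partial H}{\partial x} \ \Rightarrow\ \dot{\lambda} = -[Qx + A^T \lambda]$$

Optimal control Equation:

$$\frac{\partial H}{\partial u} = 0 \ \Rightarrow\ Ru + B^T \lambda = 0 \ \Rightarrow\ u = -R^{-1} B^T \lambda$$

LQR problem - I

State equation:

$$\dot{x} = \frac{\partial H}{\partial \lambda} \ \Rightarrow\ \dot{x} = Ax + Bu$$

Boundary Conditions:
$$x(t_0) = x_0 \quad\text{and}\quad \lambda(t_f) = \left(\frac{\partial S}{\partial x}\right)_{t_f} = S_f\, x(t_f)$$

State and Costate system:

$$\begin{bmatrix}\dot{x}\\ \dot{\lambda}\end{bmatrix} = \begin{bmatrix}A & -E\\ -Q & -A^T\end{bmatrix}\begin{bmatrix}x\\ \lambda\end{bmatrix}; \quad \text{where } E = BR^{-1}B^T$$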

LQR problem - I

Take: $\lambda(t) = P(t)x(t)$,

which implies $\dot{\lambda}(t) = \dot{P}(t)x(t) + P(t)\dot{x}(t)$. Substitute values from the state and costate system and obtain

$$-Qx - A^T \lambda = \dot{P}(t)x(t) + P(t)(Ax - BR^{-1}B^T \lambda)$$

Substitute $\lambda = P(t)x$ and obtain

$$-Qx - A^T P(t)x = \dot{P}(t)x + P(t)(Ax - BR^{-1}B^T P(t)x) = \dot{P}(t)x + P(t)Ax - P(t)BR^{-1}B^T P(t)x$$

which implies

$$\dot{P}(t)x + P(t)Ax - P(t)BR^{-1}B^T P(t)x + Qx + A^T P(t)x = 0$$
$$\Rightarrow\ [\dot{P}(t) + P(t)A + A^T P(t) - P(t)BR^{-1}B^T P(t) + Q]\,x(t) = 0$$

Since this must hold for every initial state $x_0$, and hence for arbitrary $x(t)$,

$$\dot{P} + PA + A^T P - PBR^{-1}B^T P + Q = 0 \quad \text{Differential Riccati Equation (DRE).}$$

Moreover, $P(t_f)x(t_f) = \lambda(t_f) = S_f\, x(t_f) \ \Rightarrow\ P(t_f) = S_f$

LQR problem - I

Optimize: $J(u(t)) = \frac{1}{2}x_f^T S_f x_f + \frac{1}{2}\int_{t_0}^{t_f} [x^T Q x + u^T R u]\,dt$
subject to $\dot{x} = Ax + Bu$ with $x(t_0) = x_0$ (fixed),
$t_f$ is fixed and finite, and $x(t_f)$ is free.
$S_f, Q \ge 0$ and $R > 0$
Here, $H \equiv H(x(t), u(t), \lambda(t), t) = \frac{1}{2}[x^T Q x + u^T R u] + \lambda^T [Ax + Bu]$ (Hamiltonian)
Algorithm to solve the above problem:

Step 1. Solve the DRE: $\dot{P} + PA + A^T P - PBR^{-1}B^T P + Q = 0$ with condition $P(t_f) = S_f$, backward from $t_f$ to $t_0$.

Step 2. For the optimal state $x^*$, solve
$$\dot{x} = Ax + Bu = Ax - BR^{-1}B^T \lambda(t) = Ax - BR^{-1}B^T P x(t) = [A - BR^{-1}B^T P]\,x(t)$$
with $x(t_0) = x_0$.

Step 3. The optimal control is $u^*(t) = -R^{-1}B^T P x^*(t)$.

LQR problem - I: Some Important Features

1 The Riccati matrix $P(t)$ is a time-varying matrix which depends on $A$, $B$, $S_f$, $Q$, $R$ but does not depend on $x_0$.
2 $P(t)$ is symmetric positive definite (SPD).
3 Sufficient Test: $u^*$ is a minimum if $R$ is positive definite.
4 Sufficient Test: $u^*$ is a maximum if $R$ is negative definite.
5 Computation of $P$ is independent of $x^*$ and $u^*$ (see the Algorithm: Step 1 is independent of Steps 2 and 3).
6 The algorithm provides $u^*$ as a linear function of $x^*$ (closed loop controller).
7 The LQR problem can also be solved by an open loop controller.
Computation of $P(t)$:
In general: solve the DRE by well-known solvers for systems of ODEs.
Under some conditions an analytic solution of the DRE is also available. The important ingredient is the following property of the state-costate system matrix.
$$\begin{bmatrix}\dot{x}\\ \dot{\lambda}\end{bmatrix} = \begin{bmatrix}A & -E\\ -Q & -A^T\end{bmatrix}\begin{bmatrix}x\\ \lambda\end{bmatrix} = \Delta\begin{bmatrix}x\\ \lambda\end{bmatrix}; \quad \text{where } E = BR^{-1}B^T$$

If $\mu$ is an eigenvalue of $\Delta$ then $-\mu$ is also an eigenvalue of $\Delta$.

LQR problem - I: Analytic Solution of DRE
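Consider the state-costate system:
$$\begin{bmatrix}\dot{x}\\ \dot{\lambda}\end{bmatrix} = \begin{bmatrix}A & -E\\ -Q & -A^T\end{bmatrix}\begin{bmatrix}x\\ \lambda\end{bmatrix} = \Delta\begin{bmatrix}x\\ \lambda\end{bmatrix}; \quad \text{where } E = BR^{-1}B^T$$
$$\Delta = WDW^{-1}, \quad \text{where } D = \begin{bmatrix}-M & 0\\ 0 & M\end{bmatrix},\ W = \begin{bmatrix}W_{11} & W_{12}\\ W_{21} & W_{22}\end{bmatrix}.$$
Take $\begin{bmatrix}w\\ z\end{bmatrix} = W^{-1}\begin{bmatrix}x\\ \lambda\end{bmatrix}$. Then
$$\begin{bmatrix}x\\ \lambda\end{bmatrix} = W\begin{bmatrix}w\\ z\end{bmatrix} = \begin{bmatrix}W_{11} & W_{12}\\ W_{21} & W_{22}\end{bmatrix}\begin{bmatrix}w\\ z\end{bmatrix} \qquad (1)$$
$$\begin{bmatrix}\dot{w}\\ \dot{z}\end{bmatrix} = D\begin{bmatrix}w\\ z\end{bmatrix} = \begin{bmatrix}-M & 0\\ 0 & M\end{bmatrix}\begin{bmatrix}w\\ z\end{bmatrix} \qquad (2)$$

From (2), obtain

$$\begin{bmatrix}w(t)\\ z(t)\end{bmatrix} = \begin{bmatrix}e^{-M(t - t_f)} & 0\\ 0 & e^{M(t - t_f)}\end{bmatrix}\begin{bmatrix}w(t_f)\\ z(t_f)\end{bmatrix} \ \Rightarrow\ \begin{bmatrix}w(t_f)\\ z(t)\end{bmatrix} = \begin{bmatrix}e^{M(t - t_f)} & 0\\ 0 & e^{M(t - t_f)}\end{bmatrix}\begin{bmatrix}w(t)\\ z(t_f)\end{bmatrix} \qquad (3)$$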

LQR problem - I: Analytic Solution of DRE

From (1) and the fact that $\lambda(t_f) = S_f\, x(t_f)$, obtain

$$W_{21}w(t_f) + W_{22}z(t_f) = S_f\, x(t_f) = S_f[W_{11}w(t_f) + W_{12}z(t_f)] \quad \text{(from (1) again)}$$

which implies

$$z(t_f) = -[W_{22} - S_f W_{12}]^{-1}[W_{21} - S_f W_{11}]\,w(t_f) = T_1 w(t_f), \quad \text{where } T_1 = -[W_{22} - S_f W_{12}]^{-1}[W_{21} - S_f W_{11}]$$

From (3), obtain

$$z(t) = e^{M(t - t_f)}z(t_f) = e^{M(t - t_f)}T_1 w(t_f) = e^{M(t - t_f)}T_1 e^{M(t - t_f)}w(t) \quad \text{(from (3) again)}$$
$$= T_2 w(t), \quad \text{where } T_2 = e^{-M(t_f - t)}\,T_1\,e^{-M(t_f - t)} \qquad (4)$$

LQR problem - I: Analytic Solution of DRE

From (1) and the fact that $\lambda(t) = P(t)x(t)$, obtain

$$W_{21}w(t) + W_{22}z(t) = P(t)x(t) = P(t)[W_{11}w(t) + W_{12}z(t)] \quad \text{(from (1) again)}$$

Finally, from (4), obtain

$$P(t) = [W_{21} + W_{22}T_2][W_{11} + W_{12}T_2]^{-1}$$
$$T_2 = e^{-M(t_f - t)}\,T_1\,e^{-M(t_f - t)}$$
$$T_1 = -[W_{22} - S_f W_{12}]^{-1}[W_{21} - S_f W_{11}]$$

Example 5 (LQR problem - I)

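Optimize:
$$J = \frac{1}{2}\left[x_1^2(5) + x_1(5)x_2(5) + 2x_2^2(5)\right] + \frac{1}{2}\int_0^5 \left[2x_1^2(t) + 6x_1(t)x_2(t) + 5x_2^2(t) + 0.25u^2(t)\right]dt$$

subject to
$$\dot{x}_1(t) = x_2(t), \quad \dot{x}_2(t) = -2x_1(t) + x_2(t) + u(t), \quad x_1(0) = 2 \text{ and } x_2(0) = -3$$
Here, we have
$$A = \begin{bmatrix}0 & 1\\ -2 & 1\end{bmatrix},\ B = \begin{bmatrix}0\\ 1\end{bmatrix},\ S_f = \begin{bmatrix}1 & 0.5\\ 0.5 & 2\end{bmatrix},\ Q = \begin{bmatrix}2 & 3\\ 3 & 5\end{bmatrix},\ x_0 = \begin{bmatrix}2\\ -3\end{bmatrix},\ R = \frac{1}{4},\ t_0 = 0, \text{ and } t_f = 5.$$

Step 1. Solve the DRE: $\dot{P} + PA + A^T P - PBR^{-1}B^T P + Q = 0$ with condition $P(t_f) = S_f$, backward from $t_f$ to $t_0$.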

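Example 5 (LQR problem - I)

$\dot{P} + PA + A^T P - PBR^{-1}B^T P + Q = 0$ with condition $P(t_f) = S_f$, from $t_f$ to $t_0$:

$$\begin{bmatrix}\dot{p}_{11} & \dot{p}_{12}\\ \dot{p}_{12} & \dot{p}_{22}\end{bmatrix} = -\begin{bmatrix}p_{11} & p_{12}\\ p_{12} & p_{22}\end{bmatrix}\begin{bmatrix}0 & 1\\ -2 & 1\end{bmatrix} - \begin{bmatrix}0 & -2\\ 1 & 1\end{bmatrix}\begin{bmatrix}p_{11} & p_{12}\\ p_{12} & p_{22}\end{bmatrix}$$
$$+ \begin{bmatrix}p_{11} & p_{12}\\ p_{12} & p_{22}\end{bmatrix}\begin{bmatrix}0\\ 1\end{bmatrix} 4 \begin{bmatrix}0 & 1\end{bmatrix}\begin{bmatrix}p_{11} & p_{12}\\ p_{12} & p_{22}\end{bmatrix} - \begin{bmatrix}2 & 3\\ 3 & 5\end{bmatrix}$$

$$\begin{bmatrix}p_{11}(5) & p_{12}(5)\\ p_{12}(5) & p_{22}(5)\end{bmatrix} = \begin{bmatrix}1 & 0.5\\ 0.5 & 2\end{bmatrix}$$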

Example 5 (LQR problem - I)

$\dot{P} + PA + A^T P - PBR^{-1}B^T P + Q = 0$ with condition $P(t_f) = S_f$, from $t_f$ to $t_0$. Written out entry by entry:

$$\dot{p}_{11}(t) = 4p_{12}^2(t) + 4p_{12}(t) - 2; \quad p_{11}(5) = 1$$
$$\dot{p}_{12}(t) = -p_{11}(t) - p_{12}(t) + 2p_{22}(t) + 4p_{12}(t)p_{22}(t) - 3; \quad p_{12}(5) = 0.5$$
$$\dot{p}_{22}(t) = -2p_{12}(t) - 2p_{22}(t) + 4p_{22}^2(t) - 5; \quad p_{22}(5) = 2$$

Solve the above system of nonlinear differential equations backward in time.

Step 2. For the optimal state $x^*$, solve
$$\dot{x} = Ax + Bu = Ax - BR^{-1}B^T \lambda(t) = Ax - BR^{-1}B^T P x(t) = [A - BR^{-1}B^T P]\,x(t)$$
with $x(0) = x_0$.

Step 3. The optimal control is $u^*(t) = -R^{-1}B^T P x^*(t)$.
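Step 1 can also be carried out by direct numerical integration instead of the analytic formula. The sketch below integrates the matrix DRE of Example 5 backward with `ode45` by passing a decreasing time span; the tolerances and the helper names `Pdot` and `rhs` are choices made here for illustration.

```matlab
A  = [0 1; -2 1];   B = [0; 1];
Q  = [2 3; 3 5];    R = 0.25;
Sf = [1 0.5; 0.5 2];
tf = 5;

E    = B*(R\B');                                     % E = B*R^{-1}*B'
Pdot = @(P) -(P*A + A'*P - P*E*P + Q);               % DRE solved for dP/dt
rhs  = @(t, p) reshape(Pdot(reshape(p, 2, 2)), 4, 1);% column-stacked form for ode45

opts   = odeset('RelTol', 1e-8, 'AbsTol', 1e-10);
[t, p] = ode45(rhs, [tf 0], Sf(:), opts);            % [tf 0]: integrate backward in time

P0 = reshape(p(end, :), 2, 2);                       % P(0)
K0 = R\(B'*P0);                                      % feedback gain at t = 0, u = -K0*x
disp(P0)
```

Because the horizon is long compared with the closed-loop time constants, `P0` should come out close to the constant solution of the algebraic Riccati equation treated in LQR problem - II.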

Example 5 (LQR problem - I)

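SC = [0 1 0 0;-2 1 0 -4;-2 -3 0 2;-3 -5 -1 -1];   % state-costate matrix [A -E; -Q -A']
Sf = [1 0.5;0.5 2];
[W1,D] = eig(SC);
[~,idx] = sort(real(diag(D)));                    % eig guarantees no ordering: sort ascending
W1 = W1(:,idx); D = D(idx,idx);
W = W1*[1 0 0 0;0 1 0 0;0 0 0 1;0 0 1 0];         % swap columns 3,4 so that D = diag(-M, M)
W11 = W(1:2,1:2);
W12 = W(1:2,3:4);
W21 = W(3:4,1:2);
W22 = W(3:4,3:4);
M1 = D(1:2,1:2);                                  % M1 = -M (stable eigenvalues)
T1 = -inv(W22-Sf*W12)*(W21-Sf*W11);               % does not depend on t
tspan = 0:0.1:5;
n = length(tspan);
p11 = zeros(1,n);
p12 = zeros(1,n);
p22 = zeros(1,n);
for j = 1:n
    eM1 = expm(M1*(5-tspan(j)));                  % e^{-M(tf-t)}
    T2 = eM1*T1*eM1;
    P = (W21+W22*T2)*inv(W11+W12*T2);
    p11(j) = P(1,1);
    p12(j) = P(1,2);
    p22(j) = P(2,2);
end
plot(tspan,p11,tspan,p12,tspan,p22)
legend('p_{11}','p_{12}','p_{22}')
title('Solution of DRE')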

Example 5 (LQR problem - I)

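clear;
[tx,x] = ode45(@state, [0,5], [2;-3]);
plot(tx,x)
legend('x_1','x_2')
title('optimal state')

% Save as state.m, or keep at the end of the script (R2016b or later).
function dx = state(t,x)
SC = [0 1 0 0;-2 1 0 -4;-2 -3 0 2;-3 -5 -1 -1];
A = SC(1:2,1:2);
B = [0;1];
Rinv = 4;
Sf = [1 0.5;0.5 2];
[W1,D] = eig(SC);
[~,idx] = sort(real(diag(D)));                    % stable eigenvalues first
W1 = W1(:,idx); D = D(idx,idx);
W = W1*[1 0 0 0;0 1 0 0;0 0 0 1;0 0 1 0];
W11 = W(1:2,1:2);
W12 = W(1:2,3:4);
W21 = W(3:4,1:2);
W22 = W(3:4,3:4);
M1 = D(1:2,1:2);
eM1 = expm(M1*(5-t));
T1 = -inv(W22-Sf*W12)*(W21-Sf*W11);
T2 = eM1*T1*eM1;
P = (W21+W22*T2)*inv(W11+W12*T2);                 % P(t) from the analytic solution
dx = (A-B*Rinv*B'*P)*x;                           % closed-loop dynamics
end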

Example 5 (LQR problem - I)

clear;
SC = [0 1 0 0;-2 1 0 -4;-2 -3 0 2;-3 -5 -1 -1];
B = [0;1];
Rinv = 4;
Sf = [1 0.5;0.5 2];
[W1,D] = eig(SC);
[~,idx] = sort(real(diag(D)));                    % stable eigenvalues first
W1 = W1(:,idx); D = D(idx,idx);
W = W1*[1 0 0 0;0 1 0 0;0 0 0 1;0 0 1 0];
W11 = W(1:2,1:2);
W12 = W(1:2,3:4);
W21 = W(3:4,1:2);
W22 = W(3:4,3:4);
M1 = D(1:2,1:2);
T1 = -inv(W22-Sf*W12)*(W21-Sf*W11);
tspan = 0:0.1:5;
n = length(tspan);
[tx,x] = ode45(@state, [0,5], [2;-3]);            % state() as defined for the optimal state
xs = interp1(tx,x,tspan);                         % optimal state on the output grid
u = zeros(1,n);
for j = 1:n
    eM1 = expm(M1*(5-tspan(j)));
    T2 = eM1*T1*eM1;
    P = (W21+W22*T2)*inv(W11+W12*T2);
    K = Rinv*B'*P;                                % feedback gain K(t) = R^{-1}*B'*P(t)
    u(j) = -K*xs(j,:)';
end
plot(tspan,u)
title('optimal control')

LQR problem - II

Optimize: $J(u(t)) = \frac{1}{2}\int_0^{\infty} [x^T Q x + u^T R u]\,dt$
subject to $\dot{x} = Ax + Bu$ with $x(0) = x_0$ (fixed).
The horizon is infinite ($t_f \to \infty$), so there is no terminal cost ($S_f = 0$).
$Q \ge 0$ and $R > 0$
Here, $H \equiv H(x(t), u(t), \lambda(t), t) = \frac{1}{2}[x^T Q x + u^T R u] + \lambda^T [Ax + Bu]$ (Hamiltonian)
Assumption: System is controllable.
Recall: Algorithm to solve the LQR Problem-I:

Step 1. Solve the DRE: $\dot{P} + PA + A^T P - PBR^{-1}B^T P + Q = 0$ with condition $P(t_f) = S_f$, backward from $t_f$ to $t_0$.

Step 2. For the optimal state $x^*$, solve
$$\dot{x} = Ax + Bu = Ax - BR^{-1}B^T \lambda(t) = Ax - BR^{-1}B^T P x(t) = [A - BR^{-1}B^T P]\,x(t)$$
with $x(0) = x_0$.

Step 3. The optimal control is $u^*(t) = -R^{-1}B^T P x^*(t)$.

LQR problem - II

Optimize: $J(u(t)) = \frac{1}{2}\int_0^{\infty} [x^T Q x + u^T R u]\,dt$
subject to $\dot{x} = Ax + Bu$ with $x(0) = x_0$ (fixed).
The horizon is infinite ($t_f \to \infty$), so there is no terminal cost ($S_f = 0$).
$Q \ge 0$ and $R > 0$
Here, $H \equiv H(x(t), u(t), \lambda(t), t) = \frac{1}{2}[x^T Q x + u^T R u] + \lambda^T [Ax + Bu]$ (Hamiltonian)
Assumption: System is controllable.

Algorithm to solve the LQR Problem-II:

Step 1. Solve the algebraic Riccati equation (ARE): $PA + A^T P - PBR^{-1}B^T P + Q = 0$.

Step 2. For the optimal state $x^*$, solve
$$\dot{x} = Ax + Bu = Ax - BR^{-1}B^T \lambda(t) = Ax - BR^{-1}B^T P x(t) = [A - BR^{-1}B^T P]\,x(t)$$
with $x(0) = x_0$.

Step 3. The optimal control is $u^*(t) = -R^{-1}B^T P x^*(t)$.

Example 6 (LQR problem - II)

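Optimize:
$$J = \frac{1}{2}\int_0^{\infty} \left[2x_1^2(t) + 6x_1(t)x_2(t) + 5x_2^2(t) + 0.25u^2(t)\right]dt$$
subject to
$$\dot{x}_1(t) = x_2(t), \quad \dot{x}_2(t) = -2x_1(t) + x_2(t) + u(t), \quad x_1(0) = 2 \text{ and } x_2(0) = -3$$
Here, we have
$$A = \begin{bmatrix}0 & 1\\ -2 & 1\end{bmatrix},\ B = \begin{bmatrix}0\\ 1\end{bmatrix},\ Q = \begin{bmatrix}2 & 3\\ 3 & 5\end{bmatrix},\ x_0 = \begin{bmatrix}2\\ -3\end{bmatrix},\ R = \frac{1}{4}, \text{ and } t_0 = 0.$$

Step 1. Solve the ARE: $PA + A^T P - PBR^{-1}B^T P + Q = 0$.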

Example 6 (LQR problem - II)

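$PA + A^T P - PBR^{-1}B^T P + Q = 0$:
$$\begin{bmatrix}0 & 0\\ 0 & 0\end{bmatrix} = -\begin{bmatrix}p_{11} & p_{12}\\ p_{12} & p_{22}\end{bmatrix}\begin{bmatrix}0 & 1\\ -2 & 1\end{bmatrix} - \begin{bmatrix}0 & -2\\ 1 & 1\end{bmatrix}\begin{bmatrix}p_{11} & p_{12}\\ p_{12} & p_{22}\end{bmatrix}$$
$$+ \begin{bmatrix}p_{11} & p_{12}\\ p_{12} & p_{22}\end{bmatrix}\begin{bmatrix}0\\ 1\end{bmatrix} 4 \begin{bmatrix}0 & 1\end{bmatrix}\begin{bmatrix}p_{11} & p_{12}\\ p_{12} & p_{22}\end{bmatrix} - \begin{bmatrix}2 & 3\\ 3 & 5\end{bmatrix}$$

Here $P$ is constant, so no terminal condition is imposed. Entry by entry:

$$0 = 4p_{12}^2 + 4p_{12} - 2;$$
$$0 = -p_{11} - p_{12} + 2p_{22} + 4p_{12}p_{22} - 3;$$
$$0 = -2p_{12} - 2p_{22} + 4p_{22}^2 - 5;$$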

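Example 6 (LQR problem - II)

A = [0 1;-2 1];
B = [0;1]; Q = [2 3;3 5];
R = [0.25];
E = B*inv(R)*B';
P = are(A,E,Q);    % are is a legacy Control System Toolbox function; care(A,B,Q,R) or icare(A,B,Q,R) is the newer interface

$$P = \begin{bmatrix}1.7363 & 0.3660\\ 0.3660 & 1.4729\end{bmatrix}$$

Step 2. For the optimal state $x^*$, solve

$\dot{x} = [A - BR^{-1}B^T P]\,x(t)$ with $x(0) = x_0$.

Step 3. The optimal control is $u^*(t) = -R^{-1}B^T P x^*(t)$.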
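If the Control System Toolbox is not available, the ARE of Example 6 can be solved from the eigenvectors of the state-costate matrix $\Delta$. The sketch below uses the standard stable-subspace construction (a technique swapped in here, not the `are` call of the slide): on the stable invariant subspace $\lambda = Px$, so $P = W_{21}W_{11}^{-1}$ with the eigenvectors of the left-half-plane eigenvalues stacked as $[W_{11}; W_{21}]$.

```matlab
A = [0 1; -2 1];   B = [0; 1];
Q = [2 3; 3 5];    R = 0.25;

E     = B*(R\B');                       % E = B*R^{-1}*B'
Delta = [A -E; -Q -A'];                 % state-costate (Hamiltonian) matrix
[W, D] = eig(Delta);

mu     = diag(D);                       % eigenvalues come in pairs +mu, -mu
stable = real(mu) < 0;                  % select the n = 2 stable ones
W11 = W(1:2, stable);
W21 = W(3:4, stable);

P   = real(W21/W11);                    % P = W21*inv(W11)
K   = R\(B'*P);                         % constant feedback gain, u = -K*x
res = norm(P*A + A'*P - P*E*P + Q);     % residual of the ARE
Acl = A - B*K;                          % closed-loop matrix of Step 2

disp(P); fprintf('ARE residual = %.2e\n', res);
```

The closed-loop matrix `Acl` has exactly the stable eigenvalues selected above, which is why this choice of subspace yields the stabilizing solution.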

Lecture 3
Constrained Optimal Control Problems

Constrained Optimal Control Problems

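Find a $u(t) \in \mathbb{R}^m$ such that $u(t)$, with the corresponding trajectory of

$$\dot{x} = f(x, u, t); \quad x(t) \in \mathbb{R}^n,\ f : \mathbb{R}^n \times \mathbb{R}^m \times [t_0, t_f] \to \mathbb{R}^n$$
where $x(t_0) = x_0$ (fixed), optimizes the performance index
$$J(u(t)) = S(x_f, t_f) + \int_{t_0}^{t_f} V(x, u, t)\,dt$$
subject to the control constraint
$$\|u(t)\| \le U, \quad\text{or}\quad -U_i \le u_i(t) \le U_i.$$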

Constrained Optimal Control Problems
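Optimize: $J = \int_{t_0}^{t_f} L(x, \dot{x}, u, \lambda, t)\,dt$ with $x(t_0) = x_0$ and $t_f$, $x_f$ free, where
$$L = H + \frac{dS}{dt} - \lambda^T(t)\dot{x}, \qquad H = V(x, u, t) + \lambda^T(t) f(x, u, t)$$
Moreover, $\|u(t)\| \le U$, or $-U_i \le u_i(t) \le U_i$.
Necessary Condition (E-L Equations/Pontryagin Minimum Principle)

$$\dot{\lambda} = -\frac{\partial H}{\partial x} \quad \text{(Costate (adjoint) Equation)}$$
The optimal control equation $\frac{\partial H}{\partial u} = 0$ no longer applies (it is struck out on the slide) and is replaced by
$$H(x, u^*, \lambda) = \min_{\|u(t)\| \le U} H(x, u, \lambda)$$
$$\dot{x} = \frac{\partial H}{\partial \lambda} = f \quad \text{(State Equation)}$$

Boundary Conditions
$$\left[\left(\frac{\partial S}{\partial x} - \lambda\right)^T\right]_{t_f} \delta x_f + \left[H + \frac{\partial S}{\partial t}\right]_{t_f} \delta t_f = 0 \quad \text{(Transversality Condition)}$$
$$x(t_0) = x_0 \quad \text{(Given initial condition)}$$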

Constrained Optimal Control Problems

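Optimize: $J = \int_{t_0}^{t_f} L(x, \dot{x}, u, \lambda, t)\,dt$ with $x(t_0) = x_0$ and $t_f$, $x_f$ free, where
$$L = H + \frac{dS}{dt} - \lambda^T(t)\dot{x}, \qquad H = V(x, u, t) + \lambda^T(t) f(x, u, t)$$
Moreover, $\|u(t)\| \le U$, or $-U_i \le u_i(t) \le U_i$.

$$\delta J = \int_{t_0}^{t_f}\left\{\left(\frac{\partial H}{\partial x} + \dot{\lambda}(t)\right)^T \delta x(t) + \left(\frac{\partial H}{\partial u}\right)_*^T \delta u(t) + \left(\frac{\partial H}{\partial \lambda} - \dot{x}(t)\right)^T \delta\lambda(t)\right\}dt$$
$$+ \left[\left(\frac{\partial S}{\partial x} - \lambda\right)^T\right]_{t_f} \delta x_f + \left[H + \frac{\partial S}{\partial t}\right]_{t_f} \delta t_f$$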

Constrained Optimal Control Problems

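Optimize: $J = \int_{t_0}^{t_f} L(x, \dot{x}, u, \lambda, t)\,dt$ with $x(t_0) = x_0$ and $t_f$, $x_f$ free, where
$$L = H + \frac{dS}{dt} - \lambda^T(t)\dot{x}, \qquad H = V(x, u, t) + \lambda^T(t) f(x, u, t)$$
Moreover, $\|u(t)\| \le U$, or $-U_i \le u_i(t) \le U_i$.

When the state equation, the costate equation and the boundary conditions hold, only the control term remains:
$$\delta J(u^*(t), \delta u(t)) = \int_{t_0}^{t_f}\left(\frac{\partial H}{\partial u}\right)_*^T \delta u(t)\,dt$$

$$\Delta J(u^*(t), \delta u(t)) = J(u) - J(u^*) \ge 0 \quad \text{(for minimum)}$$
$$= \delta J(u^*(t), \delta u(t)) + \text{HOT}$$
$$= \delta J(u^*(t), \delta u(t)) \quad \text{(neglecting HOT)}$$
$$= \int_{t_0}^{t_f}\left(\frac{\partial H}{\partial u}\right)_*^T \delta u(t)\,dt$$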

Constrained Optimal Control Problems

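$$\Delta J(u^*(t), \delta u(t)) = J(u) - J(u^*) \ge 0 \quad \text{(for minimum)}$$
$$= \delta J(u^*(t), \delta u(t)) + \text{HOT}$$
$$= \delta J(u^*(t), \delta u(t)) \quad \text{(neglecting HOT)}$$
$$= \int_{t_0}^{t_f}\left(\frac{\partial H}{\partial u}\right)_*^T \delta u(t)\,dt = \int_{t_0}^{t_f} \{\Delta H\}\,dt = \int_{t_0}^{t_f} [H(x, u^* + \delta u, \lambda, t) - H(x, u^*, \lambda, t)]\,dt$$

Now make sure that $\Delta J(u^*, \delta u(t)) \ge 0$ for arbitrary 'admissible' $\delta u(t)$. This gives that

$$H(x, u^*, \lambda, t) \le H(x, u^* + \delta u, \lambda, t) \quad \forall \text{ admissible } \delta u$$
$$H(x, u^*, \lambda, t) = \min_{\|u(t)\| \le U} H(x, u, \lambda, t)$$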

Revisit - Constrained Optimal Control Problems
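Optimize: $J = \int_{t_0}^{t_f} L(x, \dot{x}, u, \lambda, t)\,dt$ with $x(t_0) = x_0$ and $t_f$, $x_f$ free, where
$$L = H + \frac{dS}{dt} - \lambda^T(t)\dot{x}, \qquad H = V(x, u, t) + \lambda^T(t) f(x, u, t)$$
Moreover, $\|u(t)\| \le U$, or $-U_i \le u_i(t) \le U_i$.

Necessary Condition (E-L Equations/Pontryagin Minimum Principle)

$$\dot{\lambda} = -\frac{\partial H}{\partial x} \quad \text{(Costate (adjoint) Equation)}$$
$$H(x, u^*, \lambda, t) = \min_{\|u(t)\| \le U} H(x, u, \lambda, t) \quad \text{(optimal control condition)}$$
$$\dot{x} = \frac{\partial H}{\partial \lambda} = f \quad \text{(State Equation)}$$

Boundary Conditions
$$\left[\left(\frac{\partial S}{\partial x} - \lambda\right)^T\right]_{t_f} \delta x_f + \left[H + \frac{\partial S}{\partial t}\right]_{t_f} \delta t_f = 0 \quad \text{(Transversality Condition)}$$
$$x(t_0) = x_0 \quad \text{(Given initial condition)}$$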

Revisit - Constrained Optimal Control Problems

H(x, u ∗ , λ, t) = min H(x, u, λ, t)

is valid for both constrained and unconstrained problems.

The conditions are still necessary conditions.
(Additional necessary condition:) If tf is fixed and H is not an explicit function of
time t, then H is constant along the optimal path.
(Additional necessary condition:) If tf is free and H is not an explicit function of
time t, then H is zero along the optimal path.
The sufficient condition that $\left(\dfrac{\partial^2 H}{\partial u^2}\right)_* > 0$ is only valid for unconstrained problems.

Example 7

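Optimize $H = u^2 - 6u + 7$, $|u| \le 2$
If there is no constraint then the optimizer follows from
$\dfrac{\partial H}{\partial u} = 0 \Rightarrow 2u - 6 = 0 \Rightarrow u^* = 3$
But this value of $u$ is outside the constraint set.
$$H(u^*) = \min_{|u| \le 2} H(u)$$

$H$ is a parabola with vertex at $u = 3$, so it is decreasing on the whole interval $[-2, 2]$ and its minimum there is attained at the right end point:

$$u^* = 2, \qquad H(u^*) = -1$$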
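Example 7 is small enough to check by brute force. The sketch below compares two routes: projecting the unconstrained stationary point onto $[-2, 2]$, which is valid here only because $H$ is convex in the scalar $u$, and a plain grid search over the admissible set. The grid size is an arbitrary choice.

```matlab
H = @(u) u.^2 - 6*u + 7;

uUnc  = 3;                           % stationary point from dH/du = 0
uStar = min(max(uUnc, -2), 2);       % projection onto |u| <= 2 (convex scalar case only)

ugrid = linspace(-2, 2, 401);        % admissible controls on a grid
[Hmin, k] = min(H(ugrid));           % brute-force minimum of H over the grid
uGrid = ugrid(k);

fprintf('projection: u* = %g, grid: u* = %g, H(u*) = %g\n', uStar, uGrid, Hmin);
```

For a nonconvex $H$, or a vector $u$ with a norm bound, the projection shortcut is not justified and the minimization over the admissible set has to be done directly.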

Time optimal control problem (TOCP)

Minimize the time taken for an LTI system

ẋ = Ax + Bu

to go from an arbitrary initial condition x(t0 ) = x0 to the desired final state xf . Here
control is constrained by
ku(t)k ≤ U.
Without loss of generality we take
xf = 0 (Time optimal regulator problem)
ku(t)k ≤ 1, i.e. −1 ≤ ui ≤ 1.
Assumption: System is controllable.

Optimize $J = \int_{t_0}^{t_f} dt$
subject to $\dot{x} = Ax + Bu$
$x(t_0) = x_0$; $x(t_f) = 0$; $t_0$ fixed and $t_f$ is free

$$H = 1 + \lambda^T [Ax + Bu]$$

Time optimal control problem (TOCP)

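Optimize $J = \int_{t_0}^{t_f} dt$
subject to $\dot{x} = Ax + Bu$
$x(t_0) = x_0$; $x(t_f) = 0$; $t_0$ fixed and $t_f$ is free

Here: $H = 1 + \lambda^T [Ax + Bu]$

State Equation:
$$\dot{x} = \frac{\partial H}{\partial \lambda} \ \Rightarrow\ \dot{x} = Ax + Bu^*$$
Costate Equation:
$$\dot{\lambda} = -\frac{\partial H}{\partial x} \ \Rightarrow\ \dot{\lambda} = -A^T \lambda$$
Boundary Conditions: $x(t_0) = x_0$; $x(t_f) = 0$ and $t_f$ is free gives ($S = 0$, $\delta x_f = 0$)
$$\left[\left(\frac{\partial S}{\partial x} - \lambda\right)^T\right]_{t_f} \delta x_f + \left[H + \frac{\partial S}{\partial t}\right]_{t_f} \delta t_f = 0 \ \Rightarrow\ H(x, u^*, \lambda)\big|_{t_f} = 0$$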

Time optimal control problem (TOCP)

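Optimize $J = \int_{t_0}^{t_f} dt$
subject to $\dot{x} = Ax + Bu$
$x(t_0) = x_0$; $x(t_f) = 0$; $t_0$ fixed and $t_f$ is free

Here: $H = 1 + \lambda^T [Ax + Bu]$

Optimal Control Condition:

$$H(x, u^*, \lambda) = \min_{|u_i| \le 1} H(x, u, \lambda)$$
$$\Rightarrow\ 1 + \lambda^T [Ax + Bu^*] = \min_{|u_i| \le 1}\left(1 + \lambda^T [Ax + Bu]\right)$$
$$\Rightarrow\ (u^*)^T B^T \lambda = \min_{|u_i| \le 1} u^T B^T \lambda$$
$$\Rightarrow\ (u^*)^T q = \min_{|u_i| \le 1} u^T q, \quad \text{where } q = B^T \lambda$$

which implies, componentwise,
$$u^* = \begin{cases}-1, & \text{if } q > 0;\\ +1, & \text{if } q < 0.\end{cases}$$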
Time optimal control problem (TOCP)

Optimize $J = \int_{t_0}^{t_f} dt$
subject to $\dot{x} = Ax + Bu$
$x(t_0) = x_0$; $x(t_f) = 0$; $t_0$ fixed and $t_f$ is free

Therefore, Optimal Control Condition:

$$u_j^* = -\mathrm{sgn}\{q_j\} = -\mathrm{sgn}\{b_j^T \lambda\},$$

where $b_j$ is the $j$-th column of $B$ (Bang-Bang Control)


Time optimal control problem (TOCP): Some Results

1 The necessary and sufficient condition for the TOCP to be normal is that the system
is completely controllable.
2 The necessary and sufficient condition for the TOCP to be singular is that the
system is not completely controllable.
3 For the normal-TOCP the developed control is the unique minimizer.
4 For the normal-TOCP, if A has all n eigenvalues real, then u ∗ can switch between
−1 and +1 at most (n-1) times

Time optimal control problem (TOCP): Algorithm

Minimize $J = \int_{t_0}^{t_f} dt$
subject to $\dot{x} = Ax + Bu$
$x(t_0) = x_0$; $x(t_f) = 0$; $t_0$ fixed and $t_f$ is free

1 Solve the costate equation: $\dot{\lambda} = -A^T \lambda \Rightarrow \lambda(t) = e^{-A^T t}\lambda(0)$.
2 Remember! $\lambda(0)$ is not known, so assume a value for $\lambda(0)$.
3 Evaluate: $u_j^* = -\mathrm{sgn}\{q_j\} = -\mathrm{sgn}\{b_j^T \lambda\}$.
4 Solve the state dynamics: $\dot{x} = Ax + Bu^*$ with the given $x_0$.
5 Monitor the solution $x(t)$ for a $t_f$ such that $x(t_f) = 0$; otherwise change your guess for $\lambda(0)$.
The algorithm is tedious. Instead, look for a closed loop controller:

Example 8: Normal-TOCP

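Minimize $J = \int_{t_0}^{t_f} dt$
subject to
$$\dot{x}_1(t) = x_2(t), \quad \dot{x}_2(t) = u(t), \quad x(t_0) = x_0, \quad |u| \le 1.$$

Here: $H = 1 + \lambda_1 x_2 + \lambda_2 u$
State Equation:
$$\dot{x} = \frac{\partial H}{\partial \lambda} \ \Rightarrow\ \begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix}0 & 1\\ 0 & 0\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + \begin{bmatrix}0\\ 1\end{bmatrix}u^*$$
Costate Equation:
$$\dot{\lambda} = -\frac{\partial H}{\partial x} \ \Rightarrow\ \begin{bmatrix}\dot{\lambda}_1\\ \dot{\lambda}_2\end{bmatrix} = -\begin{bmatrix}0 & 0\\ 1 & 0\end{bmatrix}\begin{bmatrix}\lambda_1\\ \lambda_2\end{bmatrix}$$
Boundary Conditions: $x(t_0) = x_0$; $x(t_f) = 0$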

Example 8: Normal-TOCP

Optimal Control Condition: H(x, u ∗ , λ) = min H(x, u, λ) gives


H(x, u ∗ , λ) = min H(x, u, λ)


⇒ 1 + λ1 x2 + λ2 u ∗ = min (1 + λ1 x2 + λ2 u)

⇒ λ2 u ∗ = min λ2 u

⇒ u ∗ = −sgn{λ2 }

Solve Costate equations:

λ̇1 = 0 ⇒ λ1 (t) = λ1 (0)

λ̇2 = −λ1 ⇒ λ2 (t) = −λ1 (0)t + λ2 (0)


Example 8: Normal-TOCP

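Since $\lambda_2(t)$ is linear in $t$, it changes sign at most once. Thus the possible control sequences are: {+1}, {−1}, {+1, −1} or {−1, +1}.

Problem: $\lambda(0)$ is not known. Questions:
How can we determine $u^*$ from $x$?
If $u^*$ switches, then when?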
Example 8: Normal-TOCP
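Solve the state equation and draw its phase plane diagram:

$$\dot{x} = \frac{\partial H}{\partial \lambda} \ \Rightarrow\ \begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix}0 & 1\\ 0 & 0\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + \begin{bmatrix}0\\ 1\end{bmatrix}u^* = \begin{bmatrix}0 & 1\\ 0 & 0\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} + \begin{bmatrix}0\\ 1\end{bmatrix}U; \quad (U = \pm 1)$$

which implies
$$x_2(t) = x_{20} + Ut, \qquad x_1(t) = x_{10} + x_{20}t + \tfrac{1}{2}Ut^2$$
Eliminate $t = \dfrac{x_2(t) - x_{20}}{U}$ from the above solution:
$$x_1(t) = x_{10} - \tfrac{1}{2}Ux_{20}^2 + \tfrac{1}{2}Ux_2^2(t); \quad \left(U = \pm 1 = \tfrac{1}{U}\right)$$

If $U = +1$: $\quad t = x_2(t) - x_{20}, \quad x_1(t) = C_1 + \tfrac{1}{2}x_2^2(t)$

If $U = -1$: $\quad t = -x_2(t) + x_{20}, \quad x_1(t) = C_2 - \tfrac{1}{2}x_2^2(t)$

where $C_1 = x_{10} - \tfrac{1}{2}x_{20}^2$ and $C_2 = x_{10} + \tfrac{1}{2}x_{20}^2$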
Example 8: Normal-TOCP

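Switch Curve:
$$\gamma_- = \left\{(x_{10}, x_{20}) : x_{10} = -\tfrac{1}{2}x_{20}^2,\ x_{20} \ge 0\right\}$$
$$\gamma_+ = \left\{(x_{10}, x_{20}) : x_{10} = \tfrac{1}{2}x_{20}^2,\ x_{20} \le 0\right\}$$
$$\gamma = \gamma_- \cup \gamma_+ = \left\{(x_{10}, x_{20}) : x_{10} = -\tfrac{1}{2}x_{20}|x_{20}|\right\}$$

Phase Plane Regions:

$$R_+ = \left\{(x_{10}, x_{20}) : x_{10} < -\tfrac{1}{2}x_{20}|x_{20}|\right\}$$
$$R_- = \left\{(x_{10}, x_{20}) : x_{10} > -\tfrac{1}{2}x_{20}|x_{20}|\right\}$$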

Example 8: Normal-TOCP

Optimal Control Law:

u ∗ = u ∗ (x1 (t), x2 (t)) = +1; (x1 (t), x2 (t)) ∈ γ+ ∪ R+

u ∗ = u ∗ (x1 (t), x2 (t)) = −1; (x1 (t), x2 (t)) ∈ γ− ∪ R−
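The closed-loop law of Example 8 is easy to try out in simulation. The sketch below applies it to the double integrator with a forward Euler step; the initial state $(1, 0)$, the step size and the stopping tolerance are assumptions made for the illustration. For this start the hand calculation gives one switch at $t = 1$ (on $\gamma_+$) and a minimum time $t_f = 2$.

```matlab
x   = [1; 0];        % assumed initial state (x10, x20)
dt  = 1e-3;          % Euler step (assumption)
tol = 1e-2;          % stop once the state is this close to the origin
t = 0;  nsw = 0;  uPrev = 0;

while norm(x) > tol && t < 10
    s = x(1) + 0.5*x(2)*abs(x(2));   % s = 0 on gamma, s < 0 in R+, s > 0 in R-
    if s ~= 0
        u = -sign(s);                % R+: u = +1,  R-: u = -1
    else
        u = -sign(x(2));             % gamma+ (x2 < 0): u = +1, gamma- (x2 > 0): u = -1
    end
    if uPrev ~= 0 && u ~= uPrev
        nsw = nsw + 1;               % count control switches
    end
    uPrev = u;
    x = x + dt*[x(2); u];            % Euler step of x1' = x2, x2' = u
    t = t + dt;
end

fprintf('reached |x| = %.3e at t = %.3f with %d switch(es)\n', norm(x), t, nsw);
```

With a finite step the trajectory only tracks the switch curve approximately, so the state ends near the origin rather than exactly on it; a smaller `dt` tightens the agreement with $t_f = 2$.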

Optimal Control Systems - D.S. Naidu (CRC Press)

Optimal Control Theory: An Introduction - D. E. Kirk (Dover Publications)

Thank you.

