
Basics of Optimal Control Theory

Nutan Kumar Tomar


Department of Mathematics

Indian Institute of Technology Patna

Part of MA531: Control Theory

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 1 / 91


Lecture 1
A Glimpse of Calculus of Variations (CoV)
required for
Optimal Control Theory

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 2 / 91


Basics from Calculus

d/dx ∫_a^x f(t) dt = f(x)

d/dx ∫_a^b f(x, t) dt = ∫_a^b ∂f/∂x dt

d/dx ∫_{φ1(x)}^{φ2(x)} f(x, t) dt = ∫_{φ1(x)}^{φ2(x)} ∂f/∂x dt + f(x, φ2(x)) dφ2/dx − f(x, φ1(x)) dφ1/dx
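The general Leibniz rule above can be checked symbolically; a minimal sketch, assuming the Symbolic Math Toolbox is available (the integrand f = x·t² and the limits φ1 = x, φ2 = x² are arbitrary illustrative choices, not part of the lecture):

% Symbolic check of the general Leibniz rule (illustrative f, phi1, phi2).
syms x t
f    = x*t^2;            % example integrand f(x,t)
phi1 = x;  phi2 = x^2;   % example limits phi1(x), phi2(x)
lhs = diff(int(f, t, phi1, phi2), x);
rhs = int(diff(f, x), t, phi1, phi2) ...
      + subs(f, t, phi2)*diff(phi2, x) - subs(f, t, phi1)*diff(phi1, x);
simplify(lhs - rhs)      % returns 0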

Fundamental Lemma of CoV

Let f ∈ C[a, b] and let

∫_a^b f(t)g(t) dt = 0

for every g ∈ C[a, b]. Then

f ≡ 0 on [a, b].

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 3 / 91


Basics from Calculus

Increment: Δf ≡ Δf(t*, Δt) := f(t* + Δt) − f(t*)

Differential of f:

Δf = [ f(t*) + (df/dt)|* Δt + (1/2!)(d²f/dt²)|* (Δt)² + ··· ] − f(t*)

Δf = df + d²f + ···     (df: differential, d²f: second-order term)

Definition
A function f(t) is said to have a relative optimum at a point t* if ∃ ε > 0 such that
|t − t*| < ε ⟹ Δf has the same sign (positive or negative).
Δf = f(t) − f(t*) ≥ 0 ⇒ f(t*) is a local minimum.
Δf = f(t) − f(t*) ≤ 0 ⇒ f(t*) is a local maximum.

(Necessary condition:) for an optimum, df = 0.

(Sufficient condition:) for a minimum, d²f > 0.
(Sufficient condition:) for a maximum, d²f < 0.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 4 / 91


Function Vs. Functional

Increment: ΔJ := J(x*(t) + δx(t)) − J(x*(t))

ΔJ = [ J(x*(t)) + (∂J/∂x)|* δx(t) + (1/2!)(∂²J/∂x²)|* (δx(t))² + ··· ] − J(x*(t))

ΔJ = δJ + δ²J + ···     (δJ: first variation, δ²J: second variation)

Example:

J(x(t)) = ∫_{t0}^{tf} [2x²(t) + 3x(t) + 4] dt

δJ = ∫_{t0}^{tf} [4x(t) + 3] δx(t) dt

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 5 / 91


Function Vs. Functional

Definition
A functional J(x(t)) is said to have a relative optimum at x*(t) if ∃ ε > 0 such that
|x(t) − x*(t)| < ε ⟹ ΔJ has the same sign (positive or negative).
ΔJ = J(x) − J(x*) ≥ 0 ⇒ J(x*) is a local minimum.
ΔJ = J(x) − J(x*) ≤ 0 ⇒ J(x*) is a local maximum.

Fundamental Theorem of CoV: (Necessary condition) For x*(t) to be a candidate
for an optimum, the variation of J must vanish on x*(t), i.e.
δJ(x*(t), δx(t)) = 0 for all admissible δx(t).
(Sufficient condition:) for a minimum, δ²J > 0.
(Sufficient condition:) for a maximum, δ²J < 0.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 6 / 91


Function Vs. Functional

derivative of a variation = variation of the derivative:

d/dt δx(t) = d/dt [x(t) − x*(t)]
           = d/dt x(t) − d/dt x*(t)
           = δẋ(t).

integral of a variation = variation of the integral:

∫_{t0}^{tf} δx(t) dt = ∫_{t0}^{tf} (x(t) − x*(t)) dt
                     = ∫_{t0}^{tf} x(t) dt − ∫_{t0}^{tf} x*(t) dt
                     = δ( ∫_{t0}^{tf} x(t) dt ).

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 7 / 91


Variational Problems

Problem - I (fixed-fixed boundary condition)


Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,

where x(t0) = x0 (fixed)
      x(tf) = xf (fixed)

Necessary Condition: Euler-Lagrange Equation

Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 8 / 91


Variational Problems
J(x*(t) + δx(t)) − J(x*(t)) = ∫_{t0}^{tf} V(x* + δx, ẋ* + δẋ, t) dt − ∫_{t0}^{tf} V(x*, ẋ*, t) dt

This gives

δJ(x*(t), δx(t)) = ∫_{t0}^{tf} [ (∂V/∂x)* δx + (∂V/∂ẋ)* δẋ ] dt

                 = ∫_{t0}^{tf} [ (∂V/∂x)* − d/dt (∂V/∂ẋ)* ] δx dt + [ (∂V/∂ẋ)* δx ]_{t0}^{tf}

Now make sure that δJ(x*, δx(t)) = 0 for arbitrary δx(t).

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 9 / 91


Variational Problems

Problem - II (fixed-partially fixed)


Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,

where x(t0) = x0 (fixed)
      tf is fixed but x(tf) is free.

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0    (E-L Equation)

(∂V/∂ẋ)*|_{tf} δx(tf) = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 10 / 91


Variational Problems

Problem - III (fixed-free)


Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,

where x(t0) = x0 (fixed)
      tf and x(tf) both are free.

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0    (E-L Equation)

(∂V/∂ẋ)*|_{tf} δxf + [ V − ẋ (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 11 / 91


Variational Problem

J(x*(t) + δx(t)) − J(x*(t)) = ∫_{t0}^{tf+δtf} V(x* + δx, ẋ* + δẋ, t) dt − ∫_{t0}^{tf} V(x*, ẋ*, t) dt

                            = ∫_{t0}^{tf} V(x* + δx, ẋ* + δẋ, t) dt − ∫_{t0}^{tf} V(x*, ẋ*, t) dt
                              + ∫_{tf}^{tf+δtf} V(x* + δx, ẋ* + δẋ, t) dt

This gives

δJ(x*(t), δx(t)) = ∫_{t0}^{tf} [ (∂V/∂x)* δx + (∂V/∂ẋ)* δẋ ] dt + (extra term from the integral over [tf, tf + δtf])

                 = ∫_{t0}^{tf} [ (∂V/∂x)* − d/dt (∂V/∂ẋ)* ] δx dt + (∂V/∂ẋ)*|_{tf} δx(tf)
                   + (extra term from the integral over [tf, tf + δtf])

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 12 / 91


Variational Problem

∫_{tf}^{tf+δtf} V(x* + δx, ẋ* + δẋ, t) dt
    = δtf · V(x* + δx, ẋ* + δẋ, t)|_{tf + θδtf} ;   0 < θ < 1
    ≈ δtf · V(x*, ẋ*, t)|_{tf + θδtf}
    ≈ δtf · V(x*, ẋ*, t)|_{tf}

ẋ(tf) + δẋ(tf) ≈ (δxf − δx(tf)) / δtf
⇒ δxf = δx(tf) + {ẋ(tf) + δẋ(tf)} δtf
⇒ δx(tf) = δxf − ẋ(tf) δtf    (to first order)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 13 / 91


Variational Problems

Problem - IV
Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,

where x(t0) = x0 (fixed)
      tf and x(tf) lie on a given curve θ(t).

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0    (E-L Equation)

[ V + (θ̇ − ẋ) (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 14 / 91


Variational Problems

Free-free transversality condition:

(∂V/∂ẋ)*|_{tf} δxf + [ V − ẋ (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

Note that, on the curve θ(t),

δxf = (dθ/dt)|_{tf} δtf

Then the transversality condition becomes

[ V + (θ̇ − ẋ) (∂V/∂ẋ) ]*|_{tf} δtf = 0

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 15 / 91


Variational Problems: Summary

General Problem (fixed-free)

Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,

where x(t0) = x0 (fixed)
      tf and x(tf) both are free.

The E-L condition must be fulfilled. All other transversality conditions are
special cases of the following condition.

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0    (E-L Equation)

(∂V/∂ẋ)*|_{tf} δxf + [ V − ẋ (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 16 / 91


Variational Problems: Summary

General Condition (fixed-free case):

(∂V/∂ẋ)*|_{tf} δxf + [ V − ẋ (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

Special cases:
  No condition,                              fixed-fixed;
  (∂V/∂ẋ)*|_{tf} = 0,                        fixed-(tf fixed, xf free);
  [ V − ẋ (∂V/∂ẋ) ]*|_{tf} = 0,              fixed-(tf free, xf fixed);
  [ V + (θ̇ − ẋ) (∂V/∂ẋ) ]*|_{tf} = 0,        fixed-(tf and x(tf) lie on θ(t)).

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 17 / 91


Example 1

Optimize: J = ∫_0^1 (ẋ²(t) + x(t)) dt with x(0) = 2, x(1) = 3

Solution:

E-L Equation:

∂V/∂x − d/dt (∂V/∂ẋ) = 0
⇒ 1 − d/dt [2ẋ(t)] = 0
⇒ 2ẍ(t) = 1
⇒ ẋ(t) = (1/2)t + c1
⇒ x(t) = (1/4)t² + c1 t + c2

Boundary conditions:

x(0) = c2 = 2
x(1) = 1/4 + c1 + c2 = 3
⇒ c1 = 1 − 1/4 = 3/4

Hence, x*(t) = (1/4)t² + (3/4)t + 2.
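A quick symbolic cross-check of this boundary-value problem, assuming the Symbolic Math Toolbox is available (the ODE below is just the E-L equation 2ẍ = 1 derived above):

syms x(t)
ode  = 2*diff(x,t,2) == 1;              % E-L equation from above
cond = [x(0) == 2, x(1) == 3];          % fixed-fixed boundary conditions
xsol = dsolve(ode, cond)                % gives t^2/4 + (3*t)/4 + 2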

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 18 / 91


Example 2

Optimize J = ∫_0^1 (ẋ²(t) + x(t)) dt with x(0) = 2, x(1) is free

Solution:

E-L Equation:

∂V/∂x − d/dt (∂V/∂ẋ) = 0
⇒ 1 − d/dt [2ẋ(t)] = 0
⇒ 2ẍ(t) = 1
⇒ ẋ(t) = (1/2)t + c1
⇒ x(t) = (1/4)t² + c1 t + c2

Boundary/Transversality conditions:

x(0) = c2 = 2
(∂V/∂ẋ)|_{t=1} δx(1) = 0 ⇒ (∂V/∂ẋ)|_{t=1} = 0
⇒ 2ẋ(t)|_{t=1} = 0
⇒ (1/2)t + c1 |_{t=1} = 0
⇒ c1 = −1/2

Hence, x*(t) = (1/4)t² − (1/2)t + 2.
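The same symbolic check works for this free-endpoint case, with the transversality condition ẋ(1) = 0 in place of x(1) = 3 (again assuming the Symbolic Math Toolbox):

syms x(t)
Dx   = diff(x,t);
ode  = 2*diff(x,t,2) == 1;              % E-L equation
cond = [x(0) == 2, Dx(1) == 0];         % transversality: 2*xdot(1) = 0
xsol = dsolve(ode, cond)                % gives t^2/4 - t/2 + 2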

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 19 / 91


Example 3
Optimize J = ∫_0^{tf} √(1 + ẋ²) dt with x(0) = 0 and (tf, xf) on θ(t) = −5t + 15.
Solution:

E-L Equation:

∂V/∂x − d/dt (∂V/∂ẋ) = 0
⇒ − d/dt ( ẋ / √(1 + ẋ²) ) = 0
⇒ − [ ẍ√(1 + ẋ²) − ẋ·(2ẋẍ)/(2√(1 + ẋ²)) ] / (1 + ẋ²) = 0
⇒ ẍ(1 + ẋ²) − ẋ²ẍ = 0
⇒ ẍ = 0
⇒ x = c1 t + c2

Boundary/Transversality conditions:

x(0) = c2 = 0
[ V + (θ̇ − ẋ) ∂V/∂ẋ ]_{t=tf} = 0
⇒ [ √(1 + ẋ²) + (−5 − c1) ẋ/√(1 + ẋ²) ]_{t=tf} = 0
⇒ [ 1 + ẋ² − 5ẋ − c1 ẋ ]_{t=tf} = 0
⇒ 1 + c1² − 5c1 − c1² = 0
⇒ c1 = 1/5

Hence, x*(t) = (1/5)t.
To find tf:
(1/5)tf = −5tf + 15 ⇒ (26/5)tf = 15 ⇒ tf = 75/26
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 20 / 91
Variational Problems: Sufficient Conditions

J(x*(t) + δx(t)) − J(x*(t)) = ∫_{t0}^{tf} V(x* + δx, ẋ* + δẋ, t) dt − ∫_{t0}^{tf} V(x*, ẋ*, t) dt

This gives

δ²J = (1/2) ∫_{t0}^{tf} [ (∂²V/∂x²)* (δx)² + 2 (∂²V/∂x∂ẋ)* δx δẋ + (∂²V/∂ẋ²)* (δẋ)² ] dt

    = (1/2) ∫_{t0}^{tf} [δx  δẋ] [ ∂²V/∂x²   ∂²V/∂x∂ẋ ;  ∂²V/∂x∂ẋ   ∂²V/∂ẋ² ]* [δx; δẋ] dt

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 21 / 91


Variational Problems in multidimensional space

General (fixed-free) case

Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,   x(t) ∈ R^n

where x(t0) = x0 (fixed)
      tf and x(tf) both are free.

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0    (E-L Equation)

[ (∂V/∂ẋ)^T ]*|_{tf} δxf + [ V − ẋ^T (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 22 / 91


Variational Problems in multidimensional space

General Condition (fixed-free case):

[ (∂V/∂ẋ)^T ]*|_{tf} δxf + [ V − ẋ^T (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

Special cases:
  No condition,                          fixed-fixed;
  [ (∂V/∂ẋ)^T ]*|_{tf} = 0,              fixed-(tf fixed, xf free);
  [ V − ẋ^T (∂V/∂ẋ) ]*|_{tf} = 0,        fixed-(tf free, xf fixed);

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 23 / 91


Constrained Variational Problems
Optimize: J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,   x(t) ∈ R^n
subject to
    g(x(t), ẋ(t)) = 0
where x(t0) = x0 (fixed); tf and x(tf) are also fixed.
Method of Lagrange multipliers: define

L = L(x(t), ẋ(t), λ, t) = V(x(t), ẋ(t), t) + λ^T g(x(t), ẋ(t))

and optimize the new functional

J(x(t)) = ∫_{t0}^{tf} L(x(t), ẋ(t), λ, t) dt,   x(t) ∈ R^n, λ ∈ R^m

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂L/∂x)* − d/dt (∂L/∂ẋ)* = 0    (E-L Equation)

(∂L/∂λ)* − d/dt (∂L/∂λ̇)* = 0    (E-L Equation) ⇒ g = 0

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 24 / 91


Constrained Variational Problems

Optimize: J(x(t)) = ∫_{t0}^{tf} V(x1(t), x2(t), ẋ1(t), ẋ2(t), t) dt,   x(t) ∈ R²
subject to
    g(x1(t), x2(t), ẋ1(t), ẋ2(t)) = 0
where x(t0) = x0 (fixed); tf and x(tf) are also fixed.
Method of Lagrange multipliers:
    L = L(x1(t), x2(t), ẋ1(t), ẋ2(t), λ, t) = V + λg
and optimize the new functional

J(x(t)) = ∫_{t0}^{tf} L(x1(t), x2(t), ẋ1(t), ẋ2(t), λ, t) dt

Now consider J(x*(t) + δx(t)) − J(x*(t)) and find

δJ = ∫_{t0}^{tf} [ (∂L/∂x1)* δx1 + (∂L/∂ẋ1)* δẋ1 + (∂L/∂x2)* δx2 + (∂L/∂ẋ2)* δẋ2 ] dt

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 25 / 91


Constrained Variational Problems

δJ = ∫_{t0}^{tf} [ (∂L/∂x1)* δx1 + (∂L/∂ẋ1)* δẋ1 + (∂L/∂x2)* δx2 + (∂L/∂ẋ2)* δẋ2 ] dt

   = ∫_{t0}^{tf} [ (∂L/∂x1)* − d/dt (∂L/∂ẋ1)* ] δx1 dt + [ (∂L/∂ẋ1)* δx1 ]_{t0}^{tf}

     + ∫_{t0}^{tf} [ (∂L/∂x2)* − d/dt (∂L/∂ẋ2)* ] δx2 dt + [ (∂L/∂ẋ2)* δx2 ]_{t0}^{tf}

Now make sure that δJ = 0 for arbitrary δx1(t). Remember: δx1(t) and δx2(t) are not
both independent (they are linked through the constraint g = 0).

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 26 / 91


Lecture 2
Optimal Control Problems

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 27 / 91


Optimal Control Problems

Find a u(t) ∈ R^m such that u(t), together with the corresponding trajectory of

ẋ = f(x, u, t);   x(t) ∈ R^n,  f : R^n × R^m × [t0, tf] → R^n

where x(0) = x0 (fixed), optimizes the performance index

J(u(t)) = S(xf, tf) + ∫_{t0}^{tf} V(x, u, t) dt

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 28 / 91


Optimal Control Problems

Some special cases of the performance index

J(u(t)) = S(xf, tf) + ∫_{t0}^{tf} V(x, u, t) dt
s/t ẋ = f(x, u, t);  x(t0) = x0.

1. Minimum time problem

J = ∫_{t0}^{tf} dt = tf − t0

2. Minimum control effort problem

J = (1/2) ∫_{t0}^{tf} u^T u dt = (1/2) ∫_{t0}^{tf} ‖u‖² dt

In general, J = (1/2) ∫_{t0}^{tf} u^T R u dt   (R > 0)

3. State tracking about a fixed state trajectory C with minimum control effort

J = (1/2) ∫_{t0}^{tf} [ (x − C)^T Q (x − C) + u^T R u ] dt   (Q ≥ 0, R > 0)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 29 / 91


Optimal Control Problems

Some special cases of the performance index

J(u(t)) = S(xf, tf) + ∫_{t0}^{tf} V(x, u, t) dt

4. State tracking about the zero state trajectory with minimum control effort

J = (1/2) ∫_{t0}^{tf} [ x^T Q x + u^T R u ] dt   (Q ≥ 0, R > 0)

5. Minimum control effort problem for driving the final state xf close to a constant C

J = (xf − C)^T Sf (xf − C) + (1/2) ∫_{t0}^{tf} u^T R u dt   (Sf ≥ 0, R > 0)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 30 / 91


Optimal Control Problems

Find a u(t) ∈ R^m such that the trajectory of

ẋ = f(x, u, t);   x(t) ∈ R^n,  f : R^n × R^m × [t0, tf] → R^n

with x(0) = x0 (fixed) optimizes the performance index

J(u(t)) = S(xf, tf) + ∫_{t0}^{tf} V(x, u, t) dt

        = S(x0, t0) + ∫_{t0}^{tf} [ V(x, u, t) + d/dt S(x, t) ] dt

          (∵ ∫_{t0}^{tf} d/dt S(x, t) dt = S(xf, tf) − S(x0, t0))

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 31 / 91


Optimal Control Problems

Hence the general optimal control problem is

Optimize
    J(u(t)) = ∫_{t0}^{tf} [ V(x, u, t) + d/dt S(x, t) ] dt
such that
    ẋ = f(x, u, t);  x(0) = x0 (fixed), tf and xf are free.

Now, using Lagrange multipliers, the problem becomes

Optimize: J(u(t)) = ∫_{t0}^{tf} [ V + dS/dt + λ^T (f − ẋ) ] dt with x(0) = x0 and tf, xf free.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 32 / 91


Optimal Control Problems

Optimize: J(u(t)) = ∫_{t0}^{tf} [ V + dS/dt + λ^T (f − ẋ) ] dt with x(0) = x0 and tf, xf free.

Take

H ≡ H(x(t), u(t), λ(t), t) = V(x, u, t) + λ^T(t) f(x, u, t)    (Hamiltonian)

and

L ≡ L(x(t), ẋ(t), u(t), λ(t), t) = H + dS/dt − λ^T(t) ẋ    (Lagrangian)

Then the problem becomes

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt with x(0) = x0 and tf, xf free.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 33 / 91


Recall Basic frame-work

General (fixed-free) case

Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,   x(t) ∈ R^n

where x(t0) = x0 (fixed)
      tf and x(tf) both are free.

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0    (E-L Equation)

[ (∂V/∂ẋ)^T ]*|_{tf} δxf + [ V − ẋ^T (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt with x(0) = x0 and tf, xf free.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 34 / 91


Optimal Control Problems

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)

Necessary Condition (E-L Equation)

∂L/∂x − d/dt (∂L/∂ẋ) = 0  ⇒  λ̇ = −∂H/∂x

∂L/∂u − d/dt (∂L/∂u̇) = 0  ⇒  ∂H/∂u = 0        (since ∂L/∂u̇ = 0)

∂L/∂λ − d/dt (∂L/∂λ̇) = 0  ⇒  ∂H/∂λ − ẋ = 0  ⇒  f − ẋ = 0  ⇒  ẋ = f   (since ∂L/∂λ̇ = 0)

Necessary Condition (Transversality Condition)

[ (∂L/∂ẋ)^T ]_{tf} δxf + [ L − ẋ^T (∂L/∂ẋ) ]_{tf} δtf = 0
⇒ [ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 35 / 91


Optimal Control Problems
   
Proof:  ∂L/∂x − d/dt (∂L/∂ẋ) = 0  ⇒  λ̇ = −∂H/∂x

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)
                                                       dS/dt = ∂S/∂t + (∂S/∂x)^T ẋ

∂L/∂x = ∂/∂x [ H + dS/dt − λ^T ẋ ]
      = ∂H/∂x + ∂²S/∂x∂t + (∂²S/∂x²) ẋ

d/dt (∂L/∂ẋ) = d/dt [ ∂/∂ẋ { ∂S/∂t + (∂S/∂x)^T ẋ } − λ ]
             = d/dt [ ∂S/∂x − λ ]        (∵ ∂S/∂x ≡ (∂S/∂x)(x, t))
             = ∂²S/∂t∂x + (∂²S/∂x²) ẋ − λ̇

Subtracting, the E-L equation gives ∂H/∂x + λ̇ = 0, i.e. λ̇ = −∂H/∂x.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 36 / 91


Optimal Control Problems

Proof:  [ (∂L/∂ẋ)^T ]_{tf} δxf + [ L − ẋ^T (∂L/∂ẋ) ]_{tf} δtf = 0
        ⇒ [ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)
                                                       dS/dt = ∂S/∂t + (∂S/∂x)^T ẋ

[ (∂L/∂ẋ)^T ]_{tf} δxf + [ L − ẋ^T (∂S/∂x − λ) ]_{tf} δtf

  = [ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t + (∂S/∂x)^T ẋ − λ^T ẋ − ẋ^T (∂S/∂x) + ẋ^T λ ]_{tf} δtf

  = [ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 37 / 91


Optimal Control Problems

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)

Necessary Condition (E-L Equations)

λ̇ = −∂H/∂x    (Costate (adjoint) Equation)

∂H/∂u = 0    (Optimal control Equation)

ẋ = f    (State Equation)

Boundary Conditions

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0    (Transversality Condition)

x(t0) = x0    (Given initial condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 38 / 91


Optimal Control Problems

Special cases of

Boundary Conditions

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0    (Transversality Condition)

x(t0) = x0    (Given initial condition)

Case 1: tf is fixed, xf is free

[ ∂S/∂x − λ ]_{tf} = 0    (Transversality Condition)

Case 2: xf is fixed, tf is free

[ H + ∂S/∂t ]_{tf} = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 39 / 91


Optimal Control Problems: Algorithm and challenges

Step 1. Solve the optimal control equation (a set of m algebraic equations).

Step 2. Substitute the value of u from Step 1 into the state and costate equations (each of
them contains n differential equations).
Step 3. Solve the state and costate system with all the boundary conditions.
    Boundary conditions are split (TPBVP).
    Solving the TPBVP demands a good numerical scheme.
    The algorithm provides an open-loop optimal control.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 40 / 91


Example 4

Find the optimal control to optimize

J = (1/2) ‖ xf − [5; 2] ‖² + (1/2) ∫_{t0}^{tf} u²(t) dt

subject to

[ẋ1; ẋ2] = [x2; −x2 + u];   x(0) = [0; 0] and tf = 2

Solution:

H = (1/2) u² + [λ1  λ2] [x2; −x2 + u] = (1/2) u² + λ1 x2 + λ2 (−x2 + u)

Costate equation:

λ̇ = −∂H/∂x  ⇒  [λ̇1; λ̇2] = −[0; λ1 − λ2] = [0; −λ1 + λ2] = [0 0; −1 1] [λ1; λ2]

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 41 / 91


Example 4 cont...

Optimal control equation:

∂H/∂u = 0 ⇒ u + λ2 = 0 ⇒ u(t) = −λ2(t)

State equation:

[ẋ1; ẋ2] = [x2; −x2 + u] = [x2; −x2 − λ2]

Boundary conditions:

[ ∂S/∂x − λ ]_{tf} = 0  ⇒  [λ1(2); λ2(2)] = [x1(2) − 5; x2(2) − 2]   and   [x1(0); x2(0)] = [0; 0]

State and costate system:

[ẋ1; ẋ2; λ̇1; λ̇2] = A [x1; x2; λ1; λ2];   where A = [0 1 0 0; 0 −1 0 −1; 0 0 0 0; 0 0 −1 1]

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 42 / 91


Example 4 cont...

Solve the state and costate equations:

[x1(2); x2(2); λ1(2); λ2(2)] = [e^{At}]_{t=2} [x1(0); x2(0); λ1(0); λ2(0)]
                             = [1 0.86 1.63 −2.76; 0 0.14 2.76 −3.63; 0 0 1 0; 0 0 −6.39 7.39] [0; 0; c1; c2]

(here c1 = λ1(0), c2 = λ2(0)).

Substituting [λ1(2); λ2(2)] = [x1(2) − 5; x2(2) − 2] and rearranging the equations for the unknowns
x1(2), x2(2), c1 and c2, obtain

[1 0 −1.63 2.76; 0 1 −2.76 3.63; 1 0 −1 0; 0 1 6.39 −7.39] [x1(2); x2(2); c1; c2] = [0; 0; 5; 2]
⇒ [x1(2); x2(2); c1; c2] = [2.30; 1.33; −2.70; −2.42]
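As a cross-check, the same TPBVP can also be handed to a numerical boundary-value solver; a minimal sketch using MATLAB's bvp4c (the zero initial guess is an arbitrary choice that happens to be adequate for this linear problem):

% z = [x1; x2; lam1; lam2]; optimal control u = -lam2.
odefun = @(t,z) [ z(2);
                 -z(2) - z(4);            % x2dot = -x2 + u with u = -lam2
                  0;                      % lam1dot = 0
                 -z(3) + z(4) ];          % lam2dot = -lam1 + lam2
bcfun  = @(za,zb) [ za(1);                % x1(0) = 0
                    za(2);                % x2(0) = 0
                    zb(3) - (zb(1) - 5);  % lam1(2) = x1(2) - 5
                    zb(4) - (zb(2) - 2)]; % lam2(2) = x2(2) - 2
solinit = bvpinit(linspace(0,2,20), [0;0;0;0]);
sol = bvp4c(odefun, bcfun, solinit);
tt = linspace(0,2,100);
zz = deval(sol, tt);
u  = -zz(4,:);                            % optimal control u*(t) = -lam2(t)
plot(tt, u); xlabel('t'); ylabel('u(t)');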

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 43 / 91


Example 4 cont...

The optimal control is u(t) = −λ2(t), where

[x1(t); x2(t); λ1(t); λ2(t)] = e^{At} [x1(0); x2(0); c1; c2]      (A = [0 1 0 0; 0 −1 0 −1; 0 0 0 0; 0 0 −1 1])

e^{At} = [1   1 − e^{−t}   (1/2)(e^t − e^{−t}) − t   1 − (1/2)(e^t + e^{−t});
          0   e^{−t}       −1 + (1/2)(e^t + e^{−t})  (1/2)(e^{−t} − e^t);
          0   0            1                          0;
          0   0            1 − e^t                    e^t]

so [x1(t); x2(t); λ1(t); λ2(t)] = e^{At} [0; 0; −2.70; −2.42].

Thus λ2(t) = −2.70 + 0.28 e^t, and the optimal control is

u(t) = −λ2(t) = 2.70 − 0.28 e^t,   0 ≤ t ≤ 2.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 44 / 91


Example 4 cont...
Optimal control: u(t) = −λ2(t) = 2.70 − 0.28 e^t,  0 ≤ t ≤ 2.
The optimal state/costate are

[x1(t); x2(t); λ1(t); λ2(t)] = [1   1 − e^{−t}   (1/2)(e^t − e^{−t}) − t   1 − (1/2)(e^t + e^{−t});
                                0   e^{−t}       −1 + (1/2)(e^t + e^{−t})  (1/2)(e^{−t} − e^t);
                                0   0            1                          0;
                                0   0            1 − e^t                    e^t] [0; 0; −2.70; −2.42]

Thus, H = (1/2)u² + λ1 x2 + λ2(−x2 + u) along the optimal path:

clear;
A = [0 1 0 0;0 -1 0 -1;0 0 0 0;0 0 -1 1];   % state-costate system matrix
syms t;
eAt = expm(A*t);                             % symbolic matrix exponential
z0 = [0; 0; -2.70; -2.42];                   % [x1(0); x2(0); lam1(0); lam2(0)]
z = eAt*z0;                                  % z(t) = [x1; x2; lam1; lam2]
% H = 0.5*u^2 + lam1*x2 + lam2*(-x2 + u) with u = -lam2:
H = 0.5*z(4)*z(4) + [z(3) z(4)]*[z(2); -z(2)-z(4)];
ezplot(H,[0,2]);                             % H is constant along the optimal path

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 45 / 91


One result to check the optimal control

Theorem
If H is not an explicit function of time t, then H is constant along the optimal path.

Proof:

Necessary Condition (E-L Equations)

λ̇ = −∂H/∂x    (Costate (adjoint) Equation)          L = H + dS/dt − λ^T(t) ẋ
                                                     H = V(x, u, t) + λ^T(t) f(x, u, t)
∂H/∂u = 0    (Optimal control Equation)

ẋ = f = ∂H/∂λ    (State Equation)

dH/dt = ∂H/∂t + ẋ^T (∂H/∂x) + u̇^T (∂H/∂u) + λ̇^T (∂H/∂λ)
      = ∂H/∂t + ẋ^T (∂H/∂x + λ̇) + u̇^T (∂H/∂u)        (∵ ∂H/∂λ = ẋ)
      = ∂H/∂t

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 46 / 91


Linear Quadratic Optimal Control problem

Optimize: J(u(t)) = (1/2)(xf − Cf)^T Sf (xf − Cf) + (1/2) ∫_{t0}^{tf} [ (x − C)^T Q (x − C) + u^T R u ] dt
subject to ẋ = Ax + Bu with x(0) = x0 (fixed),
tf fixed and finite, and x(tf) free.
Sf, Q ≥ 0 and R > 0

Q: Error Weighted Matrix
R: Control Weighted Matrix
Sf: Terminal Cost Weighted Matrix
State Regulator Problem (Linear Quadratic Regulator Problem)
Finite and Infinite Time Horizon Problems
Sf = 0 in the infinite-time horizon state regulator problem

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 47 / 91


Recall - Optimal Control Problems - Basic Frame-work

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)

Necessary Condition (E-L Equations)

λ̇ = −∂H/∂x    (Costate (adjoint) Equation)

∂H/∂u = 0    (Optimal control Equation)

ẋ = f    (State Equation)

Boundary Conditions

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0    (Transversality Condition)

x(t0) = x0    (Given initial condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 48 / 91


Recall - Optimal Control Problems - Basic Frame-work

Special cases of

Boundary Conditions

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0    (Transversality Condition)

x(t0) = x0    (Given initial condition)

Case 1: tf is fixed, xf is free

[ ∂S/∂x − λ ]_{tf} = 0    (Transversality Condition)

Case 2: xf is fixed, tf is free

[ H + ∂S/∂t ]_{tf} = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 49 / 91


Linear Quadratic Regulator (LQR) problem - I

Optimize: J(u(t)) = (1/2) xf^T Sf xf + (1/2) ∫_{t0}^{tf} [ x^T Q x + u^T R u ] dt
subject to ẋ = Ax + Bu with x(0) = x0 (fixed),
tf fixed and finite, and x(tf) free.
Sf, Q ≥ 0 and R > 0

Take

H ≡ H(x(t), u(t), λ(t), t) = (1/2)[ x^T Q x + u^T R u ] + λ^T [Ax + Bu]    (Hamiltonian)

Costate equation:

λ̇ = −∂H/∂x  ⇒  λ̇ = −[ Qx + A^T λ ]

Optimal control equation:

∂H/∂u = 0  ⇒  Ru + B^T λ = 0  ⇒  u = −R^{−1} B^T λ

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 50 / 91


LQR problem - I

State equation:

ẋ = ∂H/∂λ  ⇒  ẋ = Ax + Bu

Boundary conditions:

x(0) = x0   and   λ(tf) = (∂S/∂x)|_{tf} = Sf x(tf)

State and costate system:

[ẋ; λ̇] = [A  −E; −Q  −A^T] [x; λ];   where E = B R^{−1} B^T

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 51 / 91


LQR problem - I

Take λ(t) = P(t) x(t),

which implies λ̇(t) = Ṗ(t) x(t) + P(t) ẋ(t). Substituting values from the state and
costate system, obtain

−Qx − A^T λ = Ṗ(t) x(t) + P(t)(Ax − B R^{−1} B^T λ)

Substitute λ = P(t)x and obtain

−Qx − A^T P(t)x = Ṗ(t)x + P(t)(Ax − B R^{−1} B^T P(t)x)
                = Ṗ(t)x(t) + P(t)Ax − P(t) B R^{−1} B^T P(t)x

which implies

Ṗ(t)x + P(t)Ax − P(t) B R^{−1} B^T P(t)x + Qx + A^T P(t)x = 0

⇒ [ Ṗ(t) + P(t)A + A^T P(t) − P(t) B R^{−1} B^T P(t) + Q ] x(t) = 0

Since x(t) is not identically zero,

Ṗ + PA + A^T P − P B R^{−1} B^T P + Q = 0    (Differential Riccati Equation (DRE))

Moreover, P(tf) x(tf) = λ(tf) = Sf x(tf)  ⇒  P(tf) = Sf

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 52 / 91


LQR problem - I

Optimize: J(u(t)) = (1/2) xf^T Sf xf + (1/2) ∫_{t0}^{tf} [ x^T Q x + u^T R u ] dt
subject to ẋ = Ax + Bu with x(0) = x0 (fixed),
tf fixed and finite, and x(tf) free.
Sf, Q ≥ 0 and R > 0

Here, H ≡ H(x(t), u(t), λ(t), t) = (1/2)[ x^T Q x + u^T R u ] + λ^T [Ax + Bu]    (Hamiltonian)

Algorithm to solve the above problem:

Step 1. Solve the DRE: Ṗ + PA + A^T P − P B R^{−1} B^T P + Q = 0 with condition
P(tf) = Sf, from tf to t0.

Step 2. For the optimal state x*, solve:

ẋ = Ax + Bu = Ax − B R^{−1} B^T λ(t) = Ax − B R^{−1} B^T P x(t) = [A − B R^{−1} B^T P] x(t)
with x(0) = x0.

Step 3. Optimal control: u*(t) = −R^{−1} B^T P x*(t).

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 53 / 91


LQR problem - I: Some Important Features

1. The Riccati matrix P(t) is a time-varying matrix that depends on A, B, Sf, Q, R but
   does not depend on x0.
2. P(t) is symmetric positive definite (SPD).
3. Sufficient test: u* is a minimum if R is positive definite.
4. Sufficient test: u* is a maximum if R is negative definite.
5. Computation of P is independent of x* and u* (see the algorithm: Step 1 is
   independent of Steps 2 and 3).
6. The algorithm provides u* as a linear function of x* (closed-loop controller).
7. The LQR problem can also be solved by an open-loop controller.
Computation of P(t):
In general: solve the DRE by well-known solvers for systems of ODEs.
Under some conditions, an analytic solution is also available for the DRE. Important here is the
following property of the state-costate system matrix:

[ẋ; λ̇] = [A  −E; −Q  −A^T] [x; λ] = ∆ [x; λ];   where E = B R^{−1} B^T

If µ is an eigenvalue of ∆, then −µ is also an eigenvalue of ∆.
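This ± symmetry of the spectrum is easy to verify numerically; a small sketch (the matrices are those of Example 5 below, used here purely as an illustration):

A = [0 1; -2 1];  B = [0; 1];  Q = [2 3; 3 5];  R = 0.25;
E = B*(1/R)*B';
Delta = [A, -E; -Q, -A'];        % state-costate system matrix
sort(eig(Delta))                 % eigenvalues appear in +/- pairs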

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 54 / 91


LQR problem - I: Analytic Solution of DRE
Consider the state-costate system:

[ẋ; λ̇] = [A  −E; −Q  −A^T] [x; λ] = ∆ [x; λ];   where E = B R^{−1} B^T

∆ = W D W^{−1},  where D = [−M  0; 0  M],  W = [W11  W12; W21  W22].

Take [w; z] = W^{−1} [x; λ]. Then

[x; λ] = W [w; z] = [W11  W12; W21  W22] [w; z]                                   (1)

[ẇ; ż] = D [w; z] = [−M  0; 0  M] [w; z]                                          (2)

From (2), obtain

[w(t); z(t)] = [e^{−M(t−tf)}  0; 0  e^{M(t−tf)}] [w(tf); z(tf)]

⇒ [w(tf); z(t)] = [e^{M(t−tf)}  0; 0  e^{M(t−tf)}] [w(t); z(tf)]                   (3)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 55 / 91


LQR problem - I: Analytic Solution of DRE

From (1) and the fact that λ(tf) = Sf x(tf), obtain

W21 w(tf) + W22 z(tf) = Sf x(tf)
                      = Sf [ W11 w(tf) + W12 z(tf) ]    (from (1) again)

which implies

z(tf) = −[W22 − Sf W12]^{−1} [W21 − Sf W11] w(tf)
      = T1 w(tf),   where T1 = −[W22 − Sf W12]^{−1} [W21 − Sf W11]

From (3), obtain

z(t) = e^{M(t−tf)} z(tf) = e^{M(t−tf)} T1 w(tf)
     = e^{M(t−tf)} T1 e^{M(t−tf)} w(t)    (from (3) again)
     = T2 w(t),                                                                   (4)
where T2 = e^{−M(tf−t)} T1 e^{−M(tf−t)}

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 56 / 91


LQR problem - I: Analytic Solution of DRE

From (1) and the fact that λ(t) = P(t) x(t), obtain

W21 w(t) + W22 z(t) = P(t) x(t)
                    = P(t) [ W11 w(t) + W12 z(t) ]    (from (1) again)

Finally, from (4), obtain

P(t) = [W21 + W22 T2] [W11 + W12 T2]^{−1}

where
    T2 = e^{−M(tf−t)} T1 e^{−M(tf−t)}
and
    T1 = −[W22 − Sf W12]^{−1} [W21 − Sf W11]

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 57 / 91


Example 5 (LQR problem - I)

J = (1/2) [ x1²(5) + x1(5)x2(5) + 2x2²(5) ]
    + (1/2) ∫_0^5 [ 2x1²(t) + 6x1(t)x2(t) + 5x2²(t) + 0.25u²(t) ] dt

subject to  ẋ1(t) = x2(t)
            ẋ2(t) = −2x1(t) + x2(t) + u(t)
            x1(0) = 2 and x2(0) = −3

Here, we have A = [0 1; −2 1],  B = [0; 1],  Sf = [1 0.5; 0.5 2],  Q = [2 3; 3 5],
x0 = [2; −3],  R = 1/4,  t0 = 0, and tf = 5.

Step 1. Solve the DRE: Ṗ + PA + A^T P − P B R^{−1} B^T P + Q = 0 with condition
P(tf) = Sf, from tf to t0.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 58 / 91


Example 5 (LQR problem - I)

Ṗ + PA + A^T P − P B R^{−1} B^T P + Q = 0 with condition P(tf) = Sf, from tf to t0:

[ṗ11  ṗ12; ṗ12  ṗ22] = −[p11  p12; p12  p22][0  1; −2  1] − [0  −2; 1  1][p11  p12; p12  p22]
                        + [p11  p12; p12  p22][0; 1]·4·[0  1][p11  p12; p12  p22] − [2  3; 3  5]

with

[p11(5)  p12(5); p12(5)  p22(5)] = [1  0.5; 0.5  2]

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 59 / 91


Example 5 (LQR problem - I)

Ṗ + PA + A^T P − P B R^{−1} B^T P + Q = 0 with condition P(tf) = Sf, from tf to t0:

ṗ11(t) = 4p12²(t) + 4p12(t) − 2;                                     p11(5) = 1
ṗ12(t) = −p11(t) − p12(t) + 2p22(t) + 4p12(t)p22(t) − 3;             p12(5) = 0.5
ṗ22(t) = −2p12(t) − 2p22(t) + 4p22²(t) − 5;                          p22(5) = 2

Solve the above system of nonlinear differential equations backward in time (a numerical sketch follows below).

Step 2. For the optimal state x*, solve:

ẋ = Ax + Bu = Ax − B R^{−1} B^T λ(t) = Ax − B R^{−1} B^T P x(t) = [A − B R^{−1} B^T P] x(t)
with x(0) = x0.

Step 3. Optimal control: u*(t) = −R^{−1} B^T P x*(t).
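One way to carry out the backward integration in Step 1 is sketched below with ode45, using the matrices of this example (P is packed into a vector only so that a standard ODE solver can handle it):

A  = [0 1; -2 1];  B = [0; 1];
Q  = [2 3; 3 5];   R = 0.25;  Sf = [1 0.5; 0.5 2];  tf = 5;
dre = @(t,p) reshape( -( reshape(p,2,2)*A + A'*reshape(p,2,2) ...
          - reshape(p,2,2)*B*(1/R)*B'*reshape(p,2,2) + Q ), 4, 1);
% ode45 accepts a decreasing time span, so integrate from tf down to 0.
[tP, Pvec] = ode45(dre, [tf 0], reshape(Sf,4,1));
plot(tP, Pvec); xlabel('t'); ylabel('entries of P(t)');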

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 60 / 91


Example 5 (LQR problem - I)

clear;
% State-costate matrix Delta and terminal weight for Example 5
SC = [0 1 0 0; -2 1 0 -4; -2 -3 0 2; -3 -5 -1 -1];
Sf = [1 0.5; 0.5 2];
[W1, D] = eig(SC);
% Reorder eigenvectors so that the first two columns correspond to -M
W = W1*[1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0];
W11 = W(1:2,1:2);
W12 = W(1:2,3:4);
W21 = W(3:4,1:2);
W22 = W(3:4,3:4);
M1 = D(1:2,1:2);          % M1 = -M
tspan = 0:0.1:5;
n = length(tspan);
p11 = zeros(1,n);         % store the entries of P(t)
p12 = zeros(1,n);
p22 = zeros(1,n);
j = 1;
for t = 0:0.1:5
    eM1 = expm(M1*(5-t));                      % e^{-M(tf - t)}
    T1 = -inv(W22-Sf*W12)*(W21-Sf*W11);
    T2 = eM1*T1*eM1;
    P = (W21+W22*T2)*inv(W11+W12*T2);          % analytic P(t)
    p11(j) = P(1,1);
    p12(j) = P(1,2);
    p22(j) = P(2,2);
    j = j+1;
end
figure
plot(tspan,p11,'g',tspan,p12,'b',tspan,p22,'r')
title('Solution of DRE')
xlabel('t-value')
ylabel('P-value')
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 61 / 91


Example 5 (LQR problem - I)

clear;
[tx,x] = ode45('state', [0,5], [2;-3]);   % integrate the closed-loop state equation
plot(tx,x)
title('optimal state')
xlabel('t-value')
ylabel('x-value')

% Function 'state' below (e.g., saved as state.m on the path):
function dx = state(t,x)
SC = [0 1 0 0; -2 1 0 -4; -2 -3 0 2; -3 -5 -1 -1];
A = SC(1:2,1:2);
B = [0;1];
Rinv = 4;                                 % R^{-1}
Sf = [1 0.5; 0.5 2];
[W1, D] = eig(SC);                        % needed for the analytic P(t) below
W = W1*[1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0];
W11 = W(1:2,1:2);
W12 = W(1:2,3:4);
W21 = W(3:4,1:2);
W22 = W(3:4,3:4);
M1 = D(1:2,1:2);
eM1 = expm(M1*(5-t));
T1 = -inv(W22-Sf*W12)*(W21-Sf*W11);
T2 = eM1*T1*eM1;
P = (W21+W22*T2)*inv(W11+W12*T2);         % P(t) at the current time
dx = [A-B*Rinv*B'*P]*x;                   % xdot = (A - B R^{-1} B' P) x
end

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 62 / 91


Example 5 (LQR problem - I)

clear;
SC = [0 1 0 0; -2 1 0 -4; -2 -3 0 2; -3 -5 -1 -1];
B = [0;1];
Rinv = 4;
Sf = [1 0.5; 0.5 2];
[W1,D] = eig(SC);
W = W1*[1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0];
W11 = W(1:2,1:2);
W12 = W(1:2,3:4);
W21 = W(3:4,1:2);
W22 = W(3:4,3:4);
M1 = D(1:2,1:2);
tspan = 0:0.1:5;
n = length(tspan);
u = zeros(1,n);
% Optimal state trajectory (uses the function 'state' defined earlier)
[tx,x] = ode45('state', [0,5], [2;-3]);
xs = interp1(tx,x,tspan);                 % resample the state on tspan
j = 1;
for t = 0:0.1:5
    eM1 = expm(M1*(5-t));
    T1 = -inv(W22-Sf*W12)*(W21-Sf*W11);
    T2 = eM1*T1*eM1;
    P = (W21+W22*T2)*inv(W11+W12*T2);
    K = Rinv*B'*P;                        % time-varying feedback gain R^{-1} B' P(t)
    u(j) = -K*[xs(j,:)]';                 % u*(t) = -K(t) x*(t)
    j = j+1;
end
figure
plot(tspan,u,'r')
title('optimal control')
xlabel('t-value')
ylabel('u-value')

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 63 / 91


LQR problem - II

Optimize: J(u(t)) = (1/2) ∫_0^∞ [ x^T Q x + u^T R u ] dt
subject to ẋ = Ax + Bu with x(0) = x0 (fixed),
tf → ∞ (infinite time horizon) and x(tf) free.
Q ≥ 0 and R > 0

Here, H ≡ H(x(t), u(t), λ(t), t) = (1/2)[ x^T Q x + u^T R u ] + λ^T [Ax + Bu]    (Hamiltonian)

Assumption: the system is controllable.
Recall: Algorithm to solve LQR Problem-I:

Step 1. Solve the DRE: Ṗ + PA + A^T P − P B R^{−1} B^T P + Q = 0 with condition
P(tf) = Sf, from tf to t0.

Step 2. For the optimal state x*, solve:

ẋ = Ax + Bu = Ax − B R^{−1} B^T λ(t) = Ax − B R^{−1} B^T P x(t) = [A − B R^{−1} B^T P] x(t)
with x(0) = x0.

Step 3. Optimal control: u*(t) = −R^{−1} B^T P x*(t).


N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 64 / 91
LQR problem - II

Optimize: J(u(t)) = (1/2) ∫_0^∞ [ x^T Q x + u^T R u ] dt
subject to ẋ = Ax + Bu with x(0) = x0 (fixed),
tf → ∞ (infinite time horizon) and x(tf) free.
Q ≥ 0 and R > 0

Here, H ≡ H(x(t), u(t), λ(t), t) = (1/2)[ x^T Q x + u^T R u ] + λ^T [Ax + Bu]    (Hamiltonian)

Assumption: the system is controllable.

Algorithm to solve LQR Problem-II:

Step 1. Solve the ARE: PA + A^T P − P B R^{−1} B^T P + Q = 0.

Step 2. For the optimal state x*, solve:

ẋ = Ax + Bu = Ax − B R^{−1} B^T λ(t) = Ax − B R^{−1} B^T P x(t) = [A − B R^{−1} B^T P] x(t)
with x(0) = x0.

Step 3. Optimal control: u*(t) = −R^{−1} B^T P x*(t).

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 65 / 91


Example 6 (LQR problem - II)

J = (1/2) ∫_0^∞ [ 2x1²(t) + 6x1(t)x2(t) + 5x2²(t) + 0.25u²(t) ] dt

subject to  ẋ1(t) = x2(t)
            ẋ2(t) = −2x1(t) + x2(t) + u(t)
            x1(0) = 2 and x2(0) = −3

Here, we have A = [0 1; −2 1],  B = [0; 1],  Q = [2 3; 3 5],
x0 = [2; −3],  R = 1/4, and t0 = 0.

Step 1. Solve the ARE: PA + A^T P − P B R^{−1} B^T P + Q = 0.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 66 / 91


Example 6 (LQR problem - II)

PA + A^T P − P B R^{−1} B^T P + Q = 0:

[0  0; 0  0] = −[p11  p12; p12  p22][0  1; −2  1] − [0  −2; 1  1][p11  p12; p12  p22]
               + [p11  p12; p12  p22][0; 1]·4·[0  1][p11  p12; p12  p22] − [2  3; 3  5]

i.e.,

0 = 4p12² + 4p12 − 2
0 = −p11 − p12 + 2p22 + 4p12 p22 − 3
0 = −2p12 − 2p22 + 4p22² − 5

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 67 / 91


Example 6 (LQR problem - II)

clear;
A = [0 1; -2 1];
B = [0;1];
Q = [2 3; 3 5];
R = [0.25];
E = B*inv(R)*B';
P = are(A,E,Q);        % algebraic Riccati equation solver (Control System Toolbox)

which gives

P = [1.73663  0.3660; 0.3660  1.4729]

Step 2. For the optimal state x*, solve:

ẋ = [A − B R^{−1} B^T P] x(t) with x(0) = x0.

Step 3. Optimal control: u*(t) = −R^{−1} B^T P x*(t).
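A minimal sketch completing Steps 2 and 3 in code (the Control System Toolbox is assumed for are; the 10 s simulation horizon is an arbitrary illustrative choice):

A = [0 1; -2 1];  B = [0; 1];  Q = [2 3; 3 5];  R = 0.25;
P = are(A, B*(1/R)*B', Q);            % Step 1: solve the ARE
K = (1/R)*B'*P;                        % constant state-feedback gain
[t, x] = ode45(@(t,x) (A - B*K)*x, [0 10], [2; -3]);   % Step 2: optimal state
u = -(K*x')';                          % Step 3: u*(t) = -K x*(t)
plot(t, x, t, u); legend('x_1','x_2','u');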

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 68 / 91


Lecture 3
Constrained Optimal Control Problems

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 69 / 91


Constrained Optimal Control Problems

Find a u(t) ∈ R^m such that u(t), together with the corresponding trajectory of

ẋ = f(x, u, t);   x(t) ∈ R^n,  f : R^n × R^m × [t0, tf] → R^n

where x(0) = x0 (fixed), optimizes the performance index

J(u(t)) = S(xf, tf) + ∫_{t0}^{tf} V(x, u, t) dt

Moreover, the control is constrained:
‖u(t)‖ ≤ U,   or   −Ui ≤ ui(t) ≤ Ui.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 70 / 91


Constrained Optimal Control Problems
Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)
Moreover
‖u(t)‖ ≤ U,   or   −Ui ≤ ui(t) ≤ Ui.

Necessary Condition (E-L Equations / Pontryagin Minimum Principle)

λ̇ = −∂H/∂x    (Costate (adjoint) Equation)

∂H/∂u = 0 (Optimal control Equation) is replaced by  H(x, u*, λ) = min_{‖u(t)‖≤U} H(x, u, λ)

ẋ = ∂H/∂λ = f    (State Equation)

Boundary Conditions

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0    (Transversality Condition)

x(t0) = x0    (Given initial condition)


N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 71 / 91
Constrained Optimal Control Problems

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)
Moreover
‖u(t)‖ ≤ U,   or   −Ui ≤ ui(t) ≤ Ui.

δJ = ∫_{t0}^{tf} { (∂H/∂x + λ̇(t))^T δx(t) + (∂H/∂u)*^T δu(t) + (∂H/∂λ − ẋ(t))^T δλ(t) } dt
     + [ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 72 / 91


Constrained Optimal Control Problems

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)
Moreover
‖u(t)‖ ≤ U,   or   −Ui ≤ ui(t) ≤ Ui.

δJ(u*(t), δu(t)) = ∫_{t0}^{tf} { (∂H/∂u)*^T δu(t) } dt

Now

∆J(u*(t), δu(t)) = J(u) − J(u*) ≥ 0    (for a minimum)
                 = δJ(u*(t)) + HOT
                 = δJ(u*(t), δu(t))    (neglecting HOT)
                 = ∫_{t0}^{tf} { (∂H/∂u)*^T δu(t) } dt

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 73 / 91


Constrained Optimal Control Problems

∆J(u*(t), δu(t)) = J(u) − J(u*) ≥ 0    (for a minimum)
                 = δJ(u*(t)) + HOT
                 = δJ(u*(t), δu(t))    (neglecting HOT)
                 = ∫_{t0}^{tf} { (∂H/∂u)*^T δu(t) } dt
                 = ∫_{t0}^{tf} {∆H} dt
                 = ∫_{t0}^{tf} [ H(x, u* + δu, λ, t) − H(x, u*, λ, t) ] dt

Now make sure that ∆J(u*, δu(t)) ≥ 0 for arbitrary 'admissible' δu(t). This gives

H(x, u*, λ, t) ≤ H(x, u* + δu, λ, t)   ∀ admissible δu

Hence

H(x, u*, λ, t) = min_{‖u(t)‖≤U} H(x, u, λ, t)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 74 / 91


Revisit - Constrained Optimal Control Problems
Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)
Moreover
‖u(t)‖ ≤ U,   or   −Ui ≤ ui(t) ≤ Ui.

Necessary Condition (E-L Equations / Pontryagin Minimum Principle)

λ̇ = −∂H/∂x    (Costate (adjoint) Equation)

H(x, u*, λ, t) = min_{‖u(t)‖≤U} H(x, u, λ, t)    (Optimal control condition)

ẋ = f    (State Equation)

Boundary Conditions

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0    (Transversality Condition)

x(t0) = x0    (Given initial condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 75 / 91


Revisit - Constrained Optimal Control Problems

The PMP condition

H(x, u*, λ, t) = min_{‖u(t)‖≤U} H(x, u, λ, t)

is valid for both constrained and unconstrained problems.

The conditions are still necessary conditions.
(Additional necessary condition:) If tf is fixed and H is not an explicit function of
time t, then H is constant along the optimal path.
(Additional necessary condition:) If tf is free and H is not an explicit function of
time t, then H is zero along the optimal path.

The sufficient condition (∂²H/∂u²)* > 0 is only valid for unconstrained problems.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 76 / 91


Example 7

Optimize H = u² − 6u + 7,  |u| ≤ 2
If there were no constraint, the optimizer would follow
∂H/∂u = 0 ⇒ 2u − 6 = 0 ⇒ u* = 3.
But this value of u is certainly outside the constraints.
Think!
H(u*) = min_{|u|≤2} H(u)

Since H is a parabola with vertex at u = 3, H is decreasing on [−2, 2], so

u* = 2
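A one-line numerical confirmation (fminbnd returns a value at, or numerically just inside, the boundary u = 2):

H = @(u) u.^2 - 6*u + 7;
ustar = fminbnd(H, -2, 2)     % approximately 2: the constrained minimizer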

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 77 / 91


Time optimal control problem (TOCP)

Minimize the time taken for an LTI system

ẋ = Ax + Bu

to go from an arbitrary initial condition x(t0) = x0 to the desired final state xf. Here the
control is constrained by
‖u(t)‖ ≤ U.
Without loss of generality we take
    xf = 0 (time-optimal regulator problem)
    ‖u(t)‖ ≤ 1, i.e. −1 ≤ ui ≤ 1.
Assumption: the system is controllable.

Optimize J = ∫_{t0}^{tf} dt
subject to ẋ = Ax + Bu
x(t0) = x0;  x(tf) = 0;  t0 fixed and tf free

Here,
H = 1 + λ^T [Ax + Bu]

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 78 / 91


Time optimal control problem (TOCP)

Optimize J = ∫_{t0}^{tf} dt
subject to ẋ = Ax + Bu
x(t0) = x0;  x(tf) = 0;  t0 fixed and tf free

Here: H = 1 + λ^T [Ax + Bu]

State Equation:

ẋ = ∂H/∂λ  ⇒  ẋ = Ax + Bu*

Costate Equation:

λ̇ = −∂H/∂x  ⇒  λ̇ = −A^T λ

Boundary Conditions: x(t0) = x0; x(tf) = 0 and tf free gives

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0  ⇒  H(x, u*, λ) = 0

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 79 / 91


Time optimal control problem (TOCP)

Optimize J = ∫_{t0}^{tf} dt
subject to ẋ = Ax + Bu
x(t0) = x0;  x(tf) = 0;  t0 fixed and tf free

Here: H = 1 + λ^T [Ax + Bu]

Optimal Control Condition: H(x, u*, λ) = min_{‖u(t)‖≤1} H(x, u, λ) gives

1 + λ^T [Ax + Bu*] = min_{‖u(t)‖≤1} ( 1 + λ^T [Ax + Bu] )
⇒ (u*)^T B^T λ = min_{‖u(t)‖≤1} u^T B^T λ
⇒ (u*)^T q = min_{‖u(t)‖≤1} u^T q,   where q = B^T λ

which implies

u* = −1, if q > 0;
u* = +1, if q < 0.
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 80 / 91
Time optimal control problem (TOCP)

Optimize J = ∫_{t0}^{tf} dt
subject to ẋ = Ax + Bu
x(t0) = x0;  x(tf) = 0;  t0 fixed and tf free

Therefore, Optimal Control Condition:

u_j* = −sgn{q_j} = −sgn{b_j^T λ},

where b_j is the j-th column of B (Bang-Bang Control)

Normal-TOCP
Singular-TOCP
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 81 / 91


Time optimal control problem (TOCP): Some Results

1. The necessary and sufficient condition for the TOCP to be normal is that the system
   is completely controllable.
2. The necessary and sufficient condition for the TOCP to be singular is that the
   system is not completely controllable.
3. For the normal-TOCP, the developed control is the unique minimizer.
4. For the normal-TOCP, if A has all n eigenvalues real, then u* can switch between
   −1 and +1 at most (n − 1) times.
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 82 / 91


Time optimal control problem (TOCP): Algorithm

Minimize J = ∫_{t0}^{tf} dt
subject to ẋ = Ax + Bu
x(t0) = x0;  x(tf) = 0;  t0 fixed and tf free

1. Solve the costate equation: λ̇ = −A^T λ ⇒ λ(t) = e^{−A^T t} λ(0).
2. Remember: λ(0) is not known, so assume a guess for λ(0).
3. Evaluate: u_j* = −sgn{q_j} = −sgn{b_j^T λ}.
4. Solve the state dynamics ẋ = Ax + Bu* with the given x0.
5. Monitor the solution x(t) for a tf such that x(tf) = 0; otherwise change your guess
   for λ(0).
This algorithm is tedious! Instead, we look for a closed-loop controller:

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 83 / 91


Example 8: Normal-TOCP

Minimize J = ∫_{t0}^{tf} dt
subject to  ẋ1(t) = x2(t)
            ẋ2(t) = u(t)
            x(t0) = x0,  x(tf) = [0; 0]
            |u| ≤ 1.

Here: H = 1 + λ1 x2 + λ2 u

State Equation:

ẋ = ∂H/∂λ  ⇒  [ẋ1; ẋ2] = [0 1; 0 0][x1; x2] + [0; 1] u*

Costate Equation:

λ̇ = −∂H/∂x  ⇒  [λ̇1; λ̇2] = −[0 0; 1 0][λ1; λ2]

Boundary Conditions: x(t0) = x0;  x(tf) = 0

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 84 / 91


Example 8: Normal-TOCP

Optimal Control Condition: H(x, u*, λ) = min_{|u(t)|≤1} H(x, u, λ) gives

1 + λ1 x2 + λ2 u* = min_{|u(t)|≤1} ( 1 + λ1 x2 + λ2 u )
⇒ λ2 u* = min_{|u(t)|≤1} λ2 u
⇒ u* = −sgn{λ2}

Solve the costate equations:

λ̇1 = 0   ⇒ λ1(t) = λ1(0)
λ̇2 = −λ1 ⇒ λ2(t) = −λ1(0) t + λ2(0)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 85 / 91


Example 8: Normal-TOCP

Since λ2(t) is linear in t, it changes sign at most once; thus the possible control sequences are
{+1}, {−1}, {+1, −1}, or {−1, +1}.

Problem: λ(0) is not known. Questions:
How can we determine u* from x?
If u* switches, then when?
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 86 / 91
Example 8: Normal-TOCP
Solve the state equation and draw its phase-plane diagram:

ẋ = ∂H/∂λ  ⇒  [ẋ1; ẋ2] = [0 1; 0 0][x1; x2] + [0; 1] u* = [0 1; 0 0][x1; x2] + [0; 1] U;   (U = ±1)

which implies

x2(t) = x20 + U t
x1(t) = x10 + x20 t + (1/2) U t²

Eliminate t from the above solution:

t = (x2(t) − x20) / U

x1(t) = x10 − (1/2) U x20² + (1/2) U x2²(t);   (U = ±1 = 1/U)

If U = +1:                              If U = −1:
t = x2(t) − x20                         t = −x2(t) + x20
x1(t) = C1 + (1/2) x2²(t)               x1(t) = C2 − (1/2) x2²(t)

C1 = x10 − (1/2) x20²  and  C2 = x10 + (1/2) x20²
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 87 / 91
Example 8: Normal-TOCP

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 88 / 91


Example 8: Normal-TOCP

Switch Curve:

γ− = { (x10, x20) : x10 = −(1/2) x20²,  x20 ≥ 0 }

γ+ = { (x10, x20) : x10 = (1/2) x20²,  x20 ≤ 0 }

γ = γ− ∪ γ+ = { (x10, x20) : x10 = −(1/2) x20 |x20| }

Phase-Plane Regions:

R+ = { (x10, x20) : x10 < −(1/2) x20 |x20| }

R− = { (x10, x20) : x10 > −(1/2) x20 |x20| }

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 89 / 91


Example 8: Normal-TOCP

Optimal Control Law:

u ∗ = u ∗ (x1 (t), x2 (t)) = +1; (x1 (t), x2 (t)) ∈ γ+ ∪ R+


u ∗ = u ∗ (x1 (t), x2 (t)) = −1; (x1 (t), x2 (t)) ∈ γ− ∪ R−
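A minimal sketch of this switching law implemented as state feedback, simulated with a simple fixed-step Euler loop (the initial state, step size, and horizon are illustrative; a fixed step is used because the discontinuous control makes adaptive solvers awkward here):

dt = 1e-3;  N = 10000;
x  = [3; 1];                         % illustrative initial state
X  = zeros(2, N);
for k = 1:N
    s = x(1) + 0.5*x(2)*abs(x(2));   % switch function; gamma is s = 0
    if s > 0
        u = -1;                      % (x1,x2) in R- (or on gamma-)
    elseif s < 0
        u = +1;                      % (x1,x2) in R+ (or on gamma+)
    else
        u = -sign(x(2));             % exactly on the switch curve
    end
    x = x + dt*[x(2); u];            % double-integrator dynamics
    X(:,k) = x;
end
plot(X(1,:), X(2,:)); xlabel('x_1'); ylabel('x_2');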

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 90 / 91


References

Optimal Control Systems - D.S. Naidu (CRC Press)


Optimal Control Theory: An Introduction - D. E. Kirk (Dover Publications)

Thank you.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 91 / 91
