
Basics of Optimal Control Theory

Nutan Kumar Tomar


Department of Mathematics

Indian Institute of Technology Patna

Part of MA531: Control Theory

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 1 / 91


Lecture 1
A Glimpse of Calculus of Variations (CoV)
required for
Optimal Control Theory

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 2 / 91


Basics from Calculus

d/dx ∫_a^x f(t) dt = f(x)

d/dx ∫_a^b f(x, t) dt = ∫_a^b ∂f/∂x dt

d/dx ∫_{φ1(x)}^{φ2(x)} f(x, t) dt = ∫_{φ1(x)}^{φ2(x)} ∂f/∂x dt + f(x, φ2(x)) dφ2/dx − f(x, φ1(x)) dφ1/dx
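The general Leibniz rule above can be checked symbolically; a minimal sketch, assuming the Symbolic Math Toolbox is available (the integrand f = x·t² and the limits φ1 = x, φ2 = x² are arbitrary illustrative choices, not part of the lecture):

% Symbolic check of the general Leibniz rule (illustrative f, phi1, phi2).
syms x t
f    = x*t^2;            % example integrand f(x,t)
phi1 = x;  phi2 = x^2;   % example limits phi1(x), phi2(x)
lhs = diff(int(f, t, phi1, phi2), x);
rhs = int(diff(f, x), t, phi1, phi2) ...
      + subs(f, t, phi2)*diff(phi2, x) - subs(f, t, phi1)*diff(phi1, x);
simplify(lhs - rhs)      % returns 0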

Fundamental Lemma of CoV

Let f ∈ C[a, b] and let

∫_a^b f(t)g(t) dt = 0

for every g ∈ C[a, b]. Then

f ≡ 0 on [a, b].

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 3 / 91


Basics from Calculus

Increment: Δf ≡ Δf(t*, Δt) := f(t* + Δt) − f(t*)

Differential of f:

Δf = [ f(t*) + (df/dt)|* Δt + (1/2!)(d²f/dt²)|* (Δt)² + ··· ] − f(t*)

Δf = df + d²f + ···     (df: differential, d²f: second-order term)

Definition
A function f(t) is said to have a relative optimum at a point t* if ∃ ε > 0 such that
|t − t*| < ε ⟹ Δf has the same sign (positive or negative).
Δf = f(t) − f(t*) ≥ 0 ⇒ f(t*) is a local minimum.
Δf = f(t) − f(t*) ≤ 0 ⇒ f(t*) is a local maximum.

(Necessary condition:) for an optimum, df = 0.

(Sufficient condition:) for a minimum, d²f > 0.
(Sufficient condition:) for a maximum, d²f < 0.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 4 / 91


Function Vs. Functional

Increment: ΔJ := J(x*(t) + δx(t)) − J(x*(t))

ΔJ = [ J(x*(t)) + (∂J/∂x)|* δx(t) + (1/2!)(∂²J/∂x²)|* (δx(t))² + ··· ] − J(x*(t))

ΔJ = δJ + δ²J + ···     (δJ: first variation, δ²J: second variation)

Example:

J(x(t)) = ∫_{t0}^{tf} [2x²(t) + 3x(t) + 4] dt

δJ = ∫_{t0}^{tf} [4x(t) + 3] δx(t) dt

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 5 / 91


Function Vs. Functional

Definition
A functional J(x(t)) is said to have a relative optimum at x*(t) if ∃ ε > 0 such that
|x(t) − x*(t)| < ε ⟹ ΔJ has the same sign (positive or negative).
ΔJ = J(x) − J(x*) ≥ 0 ⇒ J(x*) is a local minimum.
ΔJ = J(x) − J(x*) ≤ 0 ⇒ J(x*) is a local maximum.

Fundamental Theorem of CoV: (Necessary condition) For x*(t) to be a candidate
for an optimum, the variation of J must vanish on x*(t), i.e.
δJ(x*(t), δx(t)) = 0 for all admissible δx(t).
(Sufficient condition:) for a minimum, δ²J > 0.
(Sufficient condition:) for a maximum, δ²J < 0.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 6 / 91


Function Vs. Functional

derivative of a variation = variation of the derivative:

d/dt δx(t) = d/dt [x(t) − x*(t)]
           = d/dt x(t) − d/dt x*(t)
           = δẋ(t).

integral of a variation = variation of the integral:

∫_{t0}^{tf} δx(t) dt = ∫_{t0}^{tf} (x(t) − x*(t)) dt
                     = ∫_{t0}^{tf} x(t) dt − ∫_{t0}^{tf} x*(t) dt
                     = δ( ∫_{t0}^{tf} x(t) dt ).

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 7 / 91


Variational Problems

Problem - I (fixed-fixed boundary condition)


Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,

where x(t0) = x0 (fixed)
      x(tf) = xf (fixed)

Necessary Condition: Euler-Lagrange Equation

Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 8 / 91


Variational Problems
J(x*(t) + δx(t)) − J(x*(t)) = ∫_{t0}^{tf} V(x* + δx, ẋ* + δẋ, t) dt − ∫_{t0}^{tf} V(x*, ẋ*, t) dt

This gives

δJ(x*(t), δx(t)) = ∫_{t0}^{tf} [ (∂V/∂x)* δx + (∂V/∂ẋ)* δẋ ] dt

                 = ∫_{t0}^{tf} [ (∂V/∂x)* − d/dt (∂V/∂ẋ)* ] δx dt + [ (∂V/∂ẋ)* δx ]_{t0}^{tf}

Now make sure that δJ(x*, δx(t)) = 0 for arbitrary δx(t).

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 9 / 91


Variational Problems

Problem - II (fixed-partially fixed)


Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,

where x(t0) = x0 (fixed)
      tf is fixed but x(tf) is free.

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0    (E-L Equation)

(∂V/∂ẋ)*|_{tf} δx(tf) = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 10 / 91


Variational Problems

Problem - III (fixed-free)


Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,

where x(t0) = x0 (fixed)
      tf and x(tf) both are free.

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0    (E-L Equation)

(∂V/∂ẋ)*|_{tf} δxf + [ V − ẋ (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 11 / 91


Variational Problem

J(x*(t) + δx(t)) − J(x*(t)) = ∫_{t0}^{tf+δtf} V(x* + δx, ẋ* + δẋ, t) dt − ∫_{t0}^{tf} V(x*, ẋ*, t) dt

                            = ∫_{t0}^{tf} V(x* + δx, ẋ* + δẋ, t) dt − ∫_{t0}^{tf} V(x*, ẋ*, t) dt
                              + ∫_{tf}^{tf+δtf} V(x* + δx, ẋ* + δẋ, t) dt

This gives

δJ(x*(t), δx(t)) = ∫_{t0}^{tf} [ (∂V/∂x)* δx + (∂V/∂ẋ)* δẋ ] dt + (extra term from the integral over [tf, tf + δtf])

                 = ∫_{t0}^{tf} [ (∂V/∂x)* − d/dt (∂V/∂ẋ)* ] δx dt + (∂V/∂ẋ)*|_{tf} δx(tf)
                   + (extra term from the integral over [tf, tf + δtf])

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 12 / 91


Variational Problem

∫_{tf}^{tf+δtf} V(x* + δx, ẋ* + δẋ, t) dt
    = δtf · V(x* + δx, ẋ* + δẋ, t)|_{tf + θδtf} ;   0 < θ < 1
    ≈ δtf · V(x*, ẋ*, t)|_{tf + θδtf}
    ≈ δtf · V(x*, ẋ*, t)|_{tf}

ẋ(tf) + δẋ(tf) ≈ (δxf − δx(tf)) / δtf
⇒ δxf = δx(tf) + {ẋ(tf) + δẋ(tf)} δtf
⇒ δx(tf) = δxf − ẋ(tf) δtf    (to first order)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 13 / 91


Variational Problems

Problem - IV
Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,

where x(t0) = x0 (fixed)
      tf and x(tf) lie on a given curve θ(t).

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0    (E-L Equation)

[ V + (θ̇ − ẋ) (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 14 / 91


Variational Problems

Free-free transversality condition:

(∂V/∂ẋ)*|_{tf} δxf + [ V − ẋ (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

Note that, on the curve θ(t),

δxf = (dθ/dt)|_{tf} δtf

Then the transversality condition becomes

[ V + (θ̇ − ẋ) (∂V/∂ẋ) ]*|_{tf} δtf = 0

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 15 / 91


Variational Problems: Summary

General Problem (fixed-free)

Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,

where x(t0) = x0 (fixed)
      tf and x(tf) both are free.

The E-L condition must be fulfilled. All other transversality conditions are
special cases of the following condition.

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0    (E-L Equation)

(∂V/∂ẋ)*|_{tf} δxf + [ V − ẋ (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 16 / 91


Variational Problems: Summary

General Condition (fixed-free case):

(∂V/∂ẋ)*|_{tf} δxf + [ V − ẋ (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

Special cases:
  No condition,                              fixed-fixed;
  (∂V/∂ẋ)*|_{tf} = 0,                        fixed-(tf fixed, xf free);
  [ V − ẋ (∂V/∂ẋ) ]*|_{tf} = 0,              fixed-(tf free, xf fixed);
  [ V + (θ̇ − ẋ) (∂V/∂ẋ) ]*|_{tf} = 0,        fixed-(tf and x(tf) lie on θ(t)).

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 17 / 91


Example 1

Optimize: J = ∫_0^1 (ẋ²(t) + x(t)) dt with x(0) = 2, x(1) = 3

Solution:

E-L Equation:

∂V/∂x − d/dt (∂V/∂ẋ) = 0
⇒ 1 − d/dt [2ẋ(t)] = 0
⇒ 2ẍ(t) = 1
⇒ ẋ(t) = (1/2)t + c1
⇒ x(t) = (1/4)t² + c1 t + c2

Boundary conditions:

x(0) = c2 = 2
x(1) = 1/4 + c1 + c2 = 3
⇒ c1 = 1 − 1/4 = 3/4

Hence, x*(t) = (1/4)t² + (3/4)t + 2.
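A quick symbolic cross-check of this boundary-value problem, assuming the Symbolic Math Toolbox is available (the ODE below is just the E-L equation 2ẍ = 1 derived above):

syms x(t)
ode  = 2*diff(x,t,2) == 1;              % E-L equation from above
cond = [x(0) == 2, x(1) == 3];          % fixed-fixed boundary conditions
xsol = dsolve(ode, cond)                % gives t^2/4 + (3*t)/4 + 2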

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 18 / 91


Example 2

Optimize J = ∫_0^1 (ẋ²(t) + x(t)) dt with x(0) = 2, x(1) is free

Solution:

E-L Equation:

∂V/∂x − d/dt (∂V/∂ẋ) = 0
⇒ 1 − d/dt [2ẋ(t)] = 0
⇒ 2ẍ(t) = 1
⇒ ẋ(t) = (1/2)t + c1
⇒ x(t) = (1/4)t² + c1 t + c2

Boundary/Transversality conditions:

x(0) = c2 = 2
(∂V/∂ẋ)|_{t=1} δx(1) = 0 ⇒ (∂V/∂ẋ)|_{t=1} = 0
⇒ 2ẋ(t)|_{t=1} = 0
⇒ (1/2)t + c1 |_{t=1} = 0
⇒ c1 = −1/2

Hence, x*(t) = (1/4)t² − (1/2)t + 2.
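The same symbolic check works for this free-endpoint case, with the transversality condition ẋ(1) = 0 in place of x(1) = 3 (again assuming the Symbolic Math Toolbox):

syms x(t)
Dx   = diff(x,t);
ode  = 2*diff(x,t,2) == 1;              % E-L equation
cond = [x(0) == 2, Dx(1) == 0];         % transversality: 2*xdot(1) = 0
xsol = dsolve(ode, cond)                % gives t^2/4 - t/2 + 2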

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 19 / 91


Example 3
Optimize J = ∫_0^{tf} √(1 + ẋ²) dt with x(0) = 0 and (tf, xf) on θ(t) = −5t + 15.
Solution:

E-L Equation:

∂V/∂x − d/dt (∂V/∂ẋ) = 0
⇒ − d/dt ( ẋ / √(1 + ẋ²) ) = 0
⇒ − [ ẍ√(1 + ẋ²) − ẋ·(2ẋẍ)/(2√(1 + ẋ²)) ] / (1 + ẋ²) = 0
⇒ ẍ(1 + ẋ²) − ẋ²ẍ = 0
⇒ ẍ = 0
⇒ x = c1 t + c2

Boundary/Transversality conditions:

x(0) = c2 = 0
[ V + (θ̇ − ẋ) ∂V/∂ẋ ]_{t=tf} = 0
⇒ [ √(1 + ẋ²) + (−5 − c1) ẋ/√(1 + ẋ²) ]_{t=tf} = 0
⇒ [ 1 + ẋ² − 5ẋ − c1 ẋ ]_{t=tf} = 0
⇒ 1 + c1² − 5c1 − c1² = 0
⇒ c1 = 1/5

Hence, x*(t) = (1/5)t.
To find tf:
(1/5)tf = −5tf + 15 ⇒ (26/5)tf = 15 ⇒ tf = 75/26
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 20 / 91
Variational Problems: Sufficient Conditions

J(x*(t) + δx(t)) − J(x*(t)) = ∫_{t0}^{tf} V(x* + δx, ẋ* + δẋ, t) dt − ∫_{t0}^{tf} V(x*, ẋ*, t) dt

This gives

δ²J = (1/2) ∫_{t0}^{tf} [ (∂²V/∂x²)* (δx)² + 2 (∂²V/∂x∂ẋ)* δx δẋ + (∂²V/∂ẋ²)* (δẋ)² ] dt

    = (1/2) ∫_{t0}^{tf} [δx  δẋ] [ ∂²V/∂x²   ∂²V/∂x∂ẋ ;  ∂²V/∂x∂ẋ   ∂²V/∂ẋ² ]* [δx; δẋ] dt

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 21 / 91


Variational Problems in multidimensional space

General (fixed-free) case

Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,   x(t) ∈ R^n

where x(t0) = x0 (fixed)
      tf and x(tf) both are free.

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0    (E-L Equation)

[ (∂V/∂ẋ)^T ]*|_{tf} δxf + [ V − ẋ^T (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 22 / 91


Variational Problems in multidimensional space

General Condition (fixed-free case):

[ (∂V/∂ẋ)^T ]*|_{tf} δxf + [ V − ẋ^T (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

Special cases:
  No condition,                          fixed-fixed;
  [ (∂V/∂ẋ)^T ]*|_{tf} = 0,              fixed-(tf fixed, xf free);
  [ V − ẋ^T (∂V/∂ẋ) ]*|_{tf} = 0,        fixed-(tf free, xf fixed);

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 23 / 91


Constrained Variational Problems
Optimize: J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,   x(t) ∈ R^n
subject to
    g(x(t), ẋ(t)) = 0
where x(t0) = x0 (fixed); tf and x(tf) are also fixed.
Method of Lagrange multipliers: define

L = L(x(t), ẋ(t), λ, t) = V(x(t), ẋ(t), t) + λ^T g(x(t), ẋ(t))

and optimize the new functional

J(x(t)) = ∫_{t0}^{tf} L(x(t), ẋ(t), λ, t) dt,   x(t) ∈ R^n, λ ∈ R^m

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂L/∂x)* − d/dt (∂L/∂ẋ)* = 0    (E-L Equation)

(∂L/∂λ)* − d/dt (∂L/∂λ̇)* = 0    (E-L Equation) ⇒ g = 0

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 24 / 91


Constrained Variational Problems

Optimize: J(x(t)) = ∫_{t0}^{tf} V(x1(t), x2(t), ẋ1(t), ẋ2(t), t) dt,   x(t) ∈ R²
subject to
    g(x1(t), x2(t), ẋ1(t), ẋ2(t)) = 0
where x(t0) = x0 (fixed); tf and x(tf) are also fixed.
Method of Lagrange multipliers:
    L = L(x1(t), x2(t), ẋ1(t), ẋ2(t), λ, t) = V + λg
and optimize the new functional

J(x(t)) = ∫_{t0}^{tf} L(x1(t), x2(t), ẋ1(t), ẋ2(t), λ, t) dt

Now consider J(x*(t) + δx(t)) − J(x*(t)) and find

δJ = ∫_{t0}^{tf} [ (∂L/∂x1)* δx1 + (∂L/∂ẋ1)* δẋ1 + (∂L/∂x2)* δx2 + (∂L/∂ẋ2)* δẋ2 ] dt

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 25 / 91


Constrained Variational Problems

δJ = ∫_{t0}^{tf} [ (∂L/∂x1)* δx1 + (∂L/∂ẋ1)* δẋ1 + (∂L/∂x2)* δx2 + (∂L/∂ẋ2)* δẋ2 ] dt

   = ∫_{t0}^{tf} [ (∂L/∂x1)* − d/dt (∂L/∂ẋ1)* ] δx1 dt + [ (∂L/∂ẋ1)* δx1 ]_{t0}^{tf}

     + ∫_{t0}^{tf} [ (∂L/∂x2)* − d/dt (∂L/∂ẋ2)* ] δx2 dt + [ (∂L/∂ẋ2)* δx2 ]_{t0}^{tf}

Now make sure that δJ = 0 for arbitrary δx1(t). Remember: δx1(t) and δx2(t) are not
both independent (they are linked through the constraint g = 0).

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 26 / 91


Lecture 2
Optimal Control Problems

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 27 / 91


Optimal Control Problems

Find a u(t) ∈ R^m such that u(t), together with the corresponding trajectory of

ẋ = f(x, u, t);   x(t) ∈ R^n,  f : R^n × R^m × [t0, tf] → R^n

where x(0) = x0 (fixed), optimizes the performance index

J(u(t)) = S(xf, tf) + ∫_{t0}^{tf} V(x, u, t) dt

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 28 / 91


Optimal Control Problems

Some special cases of the performance index

J(u(t)) = S(xf, tf) + ∫_{t0}^{tf} V(x, u, t) dt
s/t ẋ = f(x, u, t);  x(t0) = x0.

1. Minimum time problem

J = ∫_{t0}^{tf} dt = tf − t0

2. Minimum control effort problem

J = (1/2) ∫_{t0}^{tf} u^T u dt = (1/2) ∫_{t0}^{tf} ‖u‖² dt

In general, J = (1/2) ∫_{t0}^{tf} u^T R u dt   (R > 0)

3. State tracking about a fixed state trajectory C with minimum control effort

J = (1/2) ∫_{t0}^{tf} [ (x − C)^T Q (x − C) + u^T R u ] dt   (Q ≥ 0, R > 0)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 29 / 91


Optimal Control Problems

Some special cases of the performance index

J(u(t)) = S(xf, tf) + ∫_{t0}^{tf} V(x, u, t) dt

4. State tracking about the zero state trajectory with minimum control effort

J = (1/2) ∫_{t0}^{tf} [ x^T Q x + u^T R u ] dt   (Q ≥ 0, R > 0)

5. Minimum control effort problem for driving the final state xf close to a constant C

J = (xf − C)^T Sf (xf − C) + (1/2) ∫_{t0}^{tf} u^T R u dt   (Sf ≥ 0, R > 0)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 30 / 91


Optimal Control Problems

Find a u(t) ∈ R^m such that the trajectory of

ẋ = f(x, u, t);   x(t) ∈ R^n,  f : R^n × R^m × [t0, tf] → R^n

with x(0) = x0 (fixed) optimizes the performance index

J(u(t)) = S(xf, tf) + ∫_{t0}^{tf} V(x, u, t) dt

        = S(x0, t0) + ∫_{t0}^{tf} [ V(x, u, t) + d/dt S(x, t) ] dt

          (∵ ∫_{t0}^{tf} d/dt S(x, t) dt = S(xf, tf) − S(x0, t0))

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 31 / 91


Optimal Control Problems

Hence the general optimal control problem is

Optimize
    J(u(t)) = ∫_{t0}^{tf} [ V(x, u, t) + d/dt S(x, t) ] dt
such that
    ẋ = f(x, u, t);  x(0) = x0 (fixed), tf and xf are free.

Now, using Lagrange multipliers, the problem becomes

Optimize: J(u(t)) = ∫_{t0}^{tf} [ V + dS/dt + λ^T (f − ẋ) ] dt with x(0) = x0 and tf, xf free.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 32 / 91


Optimal Control Problems

Optimize: J(u(t)) = ∫_{t0}^{tf} [ V + dS/dt + λ^T (f − ẋ) ] dt with x(0) = x0 and tf, xf free.

Take

H ≡ H(x(t), u(t), λ(t), t) = V(x, u, t) + λ^T(t) f(x, u, t)    (Hamiltonian)

and

L ≡ L(x(t), ẋ(t), u(t), λ(t), t) = H + dS/dt − λ^T(t) ẋ    (Lagrangian)

Then the problem becomes

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt with x(0) = x0 and tf, xf free.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 33 / 91


Recall Basic frame-work

General (fixed-free) case

Optimize

J(x(t)) = ∫_{t0}^{tf} V(x(t), ẋ(t), t) dt,   x(t) ∈ R^n

where x(t0) = x0 (fixed)
      tf and x(tf) both are free.

Necessary Condition
Suppose x*(t) solves the problem. Then

(∂V/∂x)* − d/dt (∂V/∂ẋ)* = 0    (E-L Equation)

[ (∂V/∂ẋ)^T ]*|_{tf} δxf + [ V − ẋ^T (∂V/∂ẋ) ]*|_{tf} δtf = 0    (Transversality Condition)

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt with x(0) = x0 and tf, xf free.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 34 / 91


Optimal Control Problems

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)

Necessary Condition (E-L Equation)

∂L/∂x − d/dt (∂L/∂ẋ) = 0  ⇒  λ̇ = −∂H/∂x

∂L/∂u − d/dt (∂L/∂u̇) = 0  ⇒  ∂H/∂u = 0        (since ∂L/∂u̇ = 0)

∂L/∂λ − d/dt (∂L/∂λ̇) = 0  ⇒  ∂H/∂λ − ẋ = 0  ⇒  f − ẋ = 0  ⇒  ẋ = f   (since ∂L/∂λ̇ = 0)

Necessary Condition (Transversality Condition)

[ (∂L/∂ẋ)^T ]_{tf} δxf + [ L − ẋ^T (∂L/∂ẋ) ]_{tf} δtf = 0
⇒ [ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 35 / 91


Optimal Control Problems
   
Proof:  ∂L/∂x − d/dt (∂L/∂ẋ) = 0  ⇒  λ̇ = −∂H/∂x

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)
                                                       dS/dt = ∂S/∂t + (∂S/∂x)^T ẋ

∂L/∂x = ∂/∂x [ H + dS/dt − λ^T ẋ ]
      = ∂H/∂x + ∂²S/∂x∂t + (∂²S/∂x²) ẋ

d/dt (∂L/∂ẋ) = d/dt [ ∂/∂ẋ { ∂S/∂t + (∂S/∂x)^T ẋ } − λ ]
             = d/dt [ ∂S/∂x − λ ]        (∵ ∂S/∂x ≡ (∂S/∂x)(x, t))
             = ∂²S/∂t∂x + (∂²S/∂x²) ẋ − λ̇

Subtracting, the E-L equation gives ∂H/∂x + λ̇ = 0, i.e. λ̇ = −∂H/∂x.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 36 / 91


Optimal Control Problems

Proof:  [ (∂L/∂ẋ)^T ]_{tf} δxf + [ L − ẋ^T (∂L/∂ẋ) ]_{tf} δtf = 0
        ⇒ [ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)
                                                       dS/dt = ∂S/∂t + (∂S/∂x)^T ẋ

[ (∂L/∂ẋ)^T ]_{tf} δxf + [ L − ẋ^T (∂S/∂x − λ) ]_{tf} δtf

  = [ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t + (∂S/∂x)^T ẋ − λ^T ẋ − ẋ^T (∂S/∂x) + ẋ^T λ ]_{tf} δtf

  = [ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 37 / 91


Optimal Control Problems

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)

Necessary Condition (E-L Equations)

λ̇ = −∂H/∂x    (Costate (adjoint) Equation)

∂H/∂u = 0    (Optimal control Equation)

ẋ = f    (State Equation)

Boundary Conditions

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0    (Transversality Condition)

x(t0) = x0    (Given initial condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 38 / 91


Optimal Control Problems

Special cases of

Boundary Conditions

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0    (Transversality Condition)

x(t0) = x0    (Given initial condition)

Case 1: tf is fixed, xf is free

[ ∂S/∂x − λ ]_{tf} = 0    (Transversality Condition)

Case 2: xf is fixed, tf is free

[ H + ∂S/∂t ]_{tf} = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 39 / 91


Optimal Control Problems: Algorithm and challenges

Step 1. Solve the optimal control equation (a set of m algebraic equations).

Step 2. Substitute the value of u from Step 1 into the state and costate equations (each of
them contains n differential equations).
Step 3. Solve the state and costate system with all the boundary conditions.
    Boundary conditions are split (TPBVP).
    Solving the TPBVP demands a good numerical scheme.
    The algorithm provides an open-loop optimal control.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 40 / 91


Example 4

Find the optimal control to optimize

J = (1/2) ‖ xf − [5; 2] ‖² + (1/2) ∫_{t0}^{tf} u²(t) dt

subject to

[ẋ1; ẋ2] = [x2; −x2 + u];   x(0) = [0; 0] and tf = 2

Solution:

H = (1/2) u² + [λ1  λ2] [x2; −x2 + u] = (1/2) u² + λ1 x2 + λ2 (−x2 + u)

Costate equation:

λ̇ = −∂H/∂x  ⇒  [λ̇1; λ̇2] = −[0; λ1 − λ2] = [0; −λ1 + λ2] = [0 0; −1 1] [λ1; λ2]

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 41 / 91


Example 4 cont...

Optimal control equation:

∂H/∂u = 0 ⇒ u + λ2 = 0 ⇒ u(t) = −λ2(t)

State equation:

[ẋ1; ẋ2] = [x2; −x2 + u] = [x2; −x2 − λ2]

Boundary conditions:

[ ∂S/∂x − λ ]_{tf} = 0  ⇒  [λ1(2); λ2(2)] = [x1(2) − 5; x2(2) − 2]   and   [x1(0); x2(0)] = [0; 0]

State and costate system:

[ẋ1; ẋ2; λ̇1; λ̇2] = A [x1; x2; λ1; λ2];   where A = [0 1 0 0; 0 −1 0 −1; 0 0 0 0; 0 0 −1 1]

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 42 / 91


Example 4 cont...

Solve the state and costate equations:

[x1(2); x2(2); λ1(2); λ2(2)] = [e^{At}]_{t=2} [x1(0); x2(0); λ1(0); λ2(0)]
                             = [1 0.86 1.63 −2.76; 0 0.14 2.76 −3.63; 0 0 1 0; 0 0 −6.39 7.39] [0; 0; c1; c2]

(here c1 = λ1(0), c2 = λ2(0)).

Substituting [λ1(2); λ2(2)] = [x1(2) − 5; x2(2) − 2] and rearranging the equations for the unknowns
x1(2), x2(2), c1 and c2, obtain

[1 0 −1.63 2.76; 0 1 −2.76 3.63; 1 0 −1 0; 0 1 6.39 −7.39] [x1(2); x2(2); c1; c2] = [0; 0; 5; 2]
⇒ [x1(2); x2(2); c1; c2] = [2.30; 1.33; −2.70; −2.42]
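As a cross-check, the same TPBVP can also be handed to a numerical boundary-value solver; a minimal sketch using MATLAB's bvp4c (the zero initial guess is an arbitrary choice that happens to be adequate for this linear problem):

% z = [x1; x2; lam1; lam2]; optimal control u = -lam2.
odefun = @(t,z) [ z(2);
                 -z(2) - z(4);            % x2dot = -x2 + u with u = -lam2
                  0;                      % lam1dot = 0
                 -z(3) + z(4) ];          % lam2dot = -lam1 + lam2
bcfun  = @(za,zb) [ za(1);                % x1(0) = 0
                    za(2);                % x2(0) = 0
                    zb(3) - (zb(1) - 5);  % lam1(2) = x1(2) - 5
                    zb(4) - (zb(2) - 2)]; % lam2(2) = x2(2) - 2
solinit = bvpinit(linspace(0,2,20), [0;0;0;0]);
sol = bvp4c(odefun, bcfun, solinit);
tt = linspace(0,2,100);
zz = deval(sol, tt);
u  = -zz(4,:);                            % optimal control u*(t) = -lam2(t)
plot(tt, u); xlabel('t'); ylabel('u(t)');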

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 43 / 91


Example 4 cont...

The optimal control is u(t) = −λ2(t), where

[x1(t); x2(t); λ1(t); λ2(t)] = e^{At} [x1(0); x2(0); c1; c2]      (A = [0 1 0 0; 0 −1 0 −1; 0 0 0 0; 0 0 −1 1])

e^{At} = [1   1 − e^{−t}   (1/2)(e^t − e^{−t}) − t   1 − (1/2)(e^t + e^{−t});
          0   e^{−t}       −1 + (1/2)(e^t + e^{−t})  (1/2)(e^{−t} − e^t);
          0   0            1                          0;
          0   0            1 − e^t                    e^t]

so [x1(t); x2(t); λ1(t); λ2(t)] = e^{At} [0; 0; −2.70; −2.42].

Thus λ2(t) = −2.70 + 0.28 e^t, and the optimal control is

u(t) = −λ2(t) = 2.70 − 0.28 e^t,   0 ≤ t ≤ 2.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 44 / 91


Example 4 cont...
Optimal control: u(t) = −λ2(t) = 2.70 − 0.28 e^t,  0 ≤ t ≤ 2.
The optimal state/costate are

[x1(t); x2(t); λ1(t); λ2(t)] = [1   1 − e^{−t}   (1/2)(e^t − e^{−t}) − t   1 − (1/2)(e^t + e^{−t});
                                0   e^{−t}       −1 + (1/2)(e^t + e^{−t})  (1/2)(e^{−t} − e^t);
                                0   0            1                          0;
                                0   0            1 − e^t                    e^t] [0; 0; −2.70; −2.42]

Thus, H = (1/2)u² + λ1 x2 + λ2(−x2 + u) along the optimal path:

clear;
A = [0 1 0 0;0 -1 0 -1;0 0 0 0;0 0 -1 1];   % state-costate system matrix
syms t;
eAt = expm(A*t);                             % symbolic matrix exponential
z0 = [0; 0; -2.70; -2.42];                   % [x1(0); x2(0); lam1(0); lam2(0)]
z = eAt*z0;                                  % z(t) = [x1; x2; lam1; lam2]
% H = 0.5*u^2 + lam1*x2 + lam2*(-x2 + u) with u = -lam2:
H = 0.5*z(4)*z(4) + [z(3) z(4)]*[z(2); -z(2)-z(4)];
ezplot(H,[0,2]);                             % H is constant along the optimal path

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 45 / 91


One result to check the optimal control

Theorem
If H is not an explicit function of time t, then H is constant along the optimal path.

Proof:

Necessary Condition (E-L Equations)

λ̇ = −∂H/∂x    (Costate (adjoint) Equation)          L = H + dS/dt − λ^T(t) ẋ
                                                     H = V(x, u, t) + λ^T(t) f(x, u, t)
∂H/∂u = 0    (Optimal control Equation)

ẋ = f = ∂H/∂λ    (State Equation)

dH/dt = ∂H/∂t + ẋ^T (∂H/∂x) + u̇^T (∂H/∂u) + λ̇^T (∂H/∂λ)
      = ∂H/∂t + ẋ^T (∂H/∂x + λ̇) + u̇^T (∂H/∂u)        (∵ ∂H/∂λ = ẋ)
      = ∂H/∂t

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 46 / 91


Linear Quadratic Optimal Control problem

Optimize: J(u(t)) = (1/2)(xf − Cf)^T Sf (xf − Cf) + (1/2) ∫_{t0}^{tf} [ (x − C)^T Q (x − C) + u^T R u ] dt
subject to ẋ = Ax + Bu with x(0) = x0 (fixed),
tf fixed and finite, and x(tf) free.
Sf, Q ≥ 0 and R > 0

Q: Error Weighted Matrix
R: Control Weighted Matrix
Sf: Terminal Cost Weighted Matrix
State Regulator Problem (Linear Quadratic Regulator Problem)
Finite and Infinite Time Horizon Problems
Sf = 0 in the infinite-time horizon state regulator problem

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 47 / 91


Recall - Optimal Control Problems - Basic Frame-work

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)

Necessary Condition (E-L Equations)

λ̇ = −∂H/∂x    (Costate (adjoint) Equation)

∂H/∂u = 0    (Optimal control Equation)

ẋ = f    (State Equation)

Boundary Conditions

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0    (Transversality Condition)

x(t0) = x0    (Given initial condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 48 / 91


Recall - Optimal Control Problems - Basic Frame-work

Special cases of

Boundary Conditions

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0    (Transversality Condition)

x(t0) = x0    (Given initial condition)

Case 1: tf is fixed, xf is free

[ ∂S/∂x − λ ]_{tf} = 0    (Transversality Condition)

Case 2: xf is fixed, tf is free

[ H + ∂S/∂t ]_{tf} = 0    (Transversality Condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 49 / 91


Linear Quadratic Regulator (LQR) problem - I

Optimize: J(u(t)) = (1/2) xf^T Sf xf + (1/2) ∫_{t0}^{tf} [ x^T Q x + u^T R u ] dt
subject to ẋ = Ax + Bu with x(0) = x0 (fixed),
tf fixed and finite, and x(tf) free.
Sf, Q ≥ 0 and R > 0

Take

H ≡ H(x(t), u(t), λ(t), t) = (1/2)[ x^T Q x + u^T R u ] + λ^T [Ax + Bu]    (Hamiltonian)

Costate equation:

λ̇ = −∂H/∂x  ⇒  λ̇ = −[ Qx + A^T λ ]

Optimal control equation:

∂H/∂u = 0  ⇒  Ru + B^T λ = 0  ⇒  u = −R^{−1} B^T λ

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 50 / 91


LQR problem - I

State equation:

ẋ = ∂H/∂λ  ⇒  ẋ = Ax + Bu

Boundary conditions:

x(0) = x0   and   λ(tf) = (∂S/∂x)|_{tf} = Sf x(tf)

State and costate system:

[ẋ; λ̇] = [A  −E; −Q  −A^T] [x; λ];   where E = B R^{−1} B^T

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 51 / 91


LQR problem - I

Take λ(t) = P(t) x(t),

which implies λ̇(t) = Ṗ(t) x(t) + P(t) ẋ(t). Substituting values from the state and
costate system, obtain

−Qx − A^T λ = Ṗ(t) x(t) + P(t)(Ax − B R^{−1} B^T λ)

Substitute λ = P(t)x and obtain

−Qx − A^T P(t)x = Ṗ(t)x + P(t)(Ax − B R^{−1} B^T P(t)x)
                = Ṗ(t)x(t) + P(t)Ax − P(t) B R^{−1} B^T P(t)x

which implies

Ṗ(t)x + P(t)Ax − P(t) B R^{−1} B^T P(t)x + Qx + A^T P(t)x = 0

⇒ [ Ṗ(t) + P(t)A + A^T P(t) − P(t) B R^{−1} B^T P(t) + Q ] x(t) = 0

Since x(t) is not identically zero,

Ṗ + PA + A^T P − P B R^{−1} B^T P + Q = 0    (Differential Riccati Equation (DRE))

Moreover, P(tf) x(tf) = λ(tf) = Sf x(tf)  ⇒  P(tf) = Sf

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 52 / 91


LQR problem - I

Optimize: J(u(t)) = (1/2) xf^T Sf xf + (1/2) ∫_{t0}^{tf} [ x^T Q x + u^T R u ] dt
subject to ẋ = Ax + Bu with x(0) = x0 (fixed),
tf fixed and finite, and x(tf) free.
Sf, Q ≥ 0 and R > 0

Here, H ≡ H(x(t), u(t), λ(t), t) = (1/2)[ x^T Q x + u^T R u ] + λ^T [Ax + Bu]    (Hamiltonian)

Algorithm to solve the above problem:

Step 1. Solve the DRE: Ṗ + PA + A^T P − P B R^{−1} B^T P + Q = 0 with condition
P(tf) = Sf, from tf to t0.

Step 2. For the optimal state x*, solve:

ẋ = Ax + Bu = Ax − B R^{−1} B^T λ(t) = Ax − B R^{−1} B^T P x(t) = [A − B R^{−1} B^T P] x(t)
with x(0) = x0.

Step 3. Optimal control: u*(t) = −R^{−1} B^T P x*(t).

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 53 / 91


LQR problem - I: Some Important Features

1. The Riccati matrix P(t) is a time-varying matrix that depends on A, B, Sf, Q, R but
   does not depend on x0.
2. P(t) is symmetric positive definite (SPD).
3. Sufficient test: u* is a minimum if R is positive definite.
4. Sufficient test: u* is a maximum if R is negative definite.
5. Computation of P is independent of x* and u* (see the algorithm: Step 1 is
   independent of Steps 2 and 3).
6. The algorithm provides u* as a linear function of x* (closed-loop controller).
7. The LQR problem can also be solved by an open-loop controller.
Computation of P(t):
In general: solve the DRE by well-known solvers for systems of ODEs.
Under some conditions, an analytic solution is also available for the DRE. Important here is the
following property of the state-costate system matrix:

[ẋ; λ̇] = [A  −E; −Q  −A^T] [x; λ] = ∆ [x; λ];   where E = B R^{−1} B^T

If µ is an eigenvalue of ∆, then −µ is also an eigenvalue of ∆.
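This ± symmetry of the spectrum is easy to verify numerically; a small sketch (the matrices are those of Example 5 below, used here purely as an illustration):

A = [0 1; -2 1];  B = [0; 1];  Q = [2 3; 3 5];  R = 0.25;
E = B*(1/R)*B';
Delta = [A, -E; -Q, -A'];        % state-costate system matrix
sort(eig(Delta))                 % eigenvalues appear in +/- pairs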

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 54 / 91


LQR problem - I: Analytic Solution of DRE
Consider the state-costate system:

[ẋ; λ̇] = [A  −E; −Q  −A^T] [x; λ] = ∆ [x; λ];   where E = B R^{−1} B^T

∆ = W D W^{−1},  where D = [−M  0; 0  M],  W = [W11  W12; W21  W22].

Take [w; z] = W^{−1} [x; λ]. Then

[x; λ] = W [w; z] = [W11  W12; W21  W22] [w; z]                                   (1)

[ẇ; ż] = D [w; z] = [−M  0; 0  M] [w; z]                                          (2)

From (2), obtain

[w(t); z(t)] = [e^{−M(t−tf)}  0; 0  e^{M(t−tf)}] [w(tf); z(tf)]

⇒ [w(tf); z(t)] = [e^{M(t−tf)}  0; 0  e^{M(t−tf)}] [w(t); z(tf)]                   (3)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 55 / 91


LQR problem - I: Analytic Solution of DRE

From (1) and the fact that λ(tf) = Sf x(tf), obtain

W21 w(tf) + W22 z(tf) = Sf x(tf)
                      = Sf [ W11 w(tf) + W12 z(tf) ]    (from (1) again)

which implies

z(tf) = −[W22 − Sf W12]^{−1} [W21 − Sf W11] w(tf)
      = T1 w(tf),   where T1 = −[W22 − Sf W12]^{−1} [W21 − Sf W11]

From (3), obtain

z(t) = e^{M(t−tf)} z(tf) = e^{M(t−tf)} T1 w(tf)
     = e^{M(t−tf)} T1 e^{M(t−tf)} w(t)    (from (3) again)
     = T2 w(t),                                                                   (4)
where T2 = e^{−M(tf−t)} T1 e^{−M(tf−t)}

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 56 / 91


LQR problem - I: Analytic Solution of DRE

From (1) and the fact that λ(t) = P(t) x(t), obtain

W21 w(t) + W22 z(t) = P(t) x(t)
                    = P(t) [ W11 w(t) + W12 z(t) ]    (from (1) again)

Finally, from (4), obtain

P(t) = [W21 + W22 T2] [W11 + W12 T2]^{−1}

where
    T2 = e^{−M(tf−t)} T1 e^{−M(tf−t)}
and
    T1 = −[W22 − Sf W12]^{−1} [W21 − Sf W11]

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 57 / 91


Example 5 (LQR problem - I)

J = (1/2) [ x1²(5) + x1(5)x2(5) + 2x2²(5) ]
    + (1/2) ∫_0^5 [ 2x1²(t) + 6x1(t)x2(t) + 5x2²(t) + 0.25u²(t) ] dt

subject to  ẋ1(t) = x2(t)
            ẋ2(t) = −2x1(t) + x2(t) + u(t)
            x1(0) = 2 and x2(0) = −3

Here, we have A = [0 1; −2 1],  B = [0; 1],  Sf = [1 0.5; 0.5 2],  Q = [2 3; 3 5],
x0 = [2; −3],  R = 1/4,  t0 = 0, and tf = 5.

Step 1. Solve the DRE: Ṗ + PA + A^T P − P B R^{−1} B^T P + Q = 0 with condition
P(tf) = Sf, from tf to t0.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 58 / 91


Example 5 (LQR problem - I)

Ṗ + PA + A^T P − P B R^{−1} B^T P + Q = 0 with condition P(tf) = Sf, from tf to t0:

[ṗ11  ṗ12; ṗ12  ṗ22] = −[p11  p12; p12  p22][0  1; −2  1] − [0  −2; 1  1][p11  p12; p12  p22]
                        + [p11  p12; p12  p22][0; 1]·4·[0  1][p11  p12; p12  p22] − [2  3; 3  5]

with

[p11(5)  p12(5); p12(5)  p22(5)] = [1  0.5; 0.5  2]

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 59 / 91


Example 5 (LQR problem - I)

Ṗ + PA + A^T P − P B R^{−1} B^T P + Q = 0 with condition P(tf) = Sf, from tf to t0:

ṗ11(t) = 4p12²(t) + 4p12(t) − 2;                                     p11(5) = 1
ṗ12(t) = −p11(t) − p12(t) + 2p22(t) + 4p12(t)p22(t) − 3;             p12(5) = 0.5
ṗ22(t) = −2p12(t) − 2p22(t) + 4p22²(t) − 5;                          p22(5) = 2

Solve the above system of nonlinear differential equations backward in time (a numerical sketch follows below).

Step 2. For the optimal state x*, solve:

ẋ = Ax + Bu = Ax − B R^{−1} B^T λ(t) = Ax − B R^{−1} B^T P x(t) = [A − B R^{−1} B^T P] x(t)
with x(0) = x0.

Step 3. Optimal control: u*(t) = −R^{−1} B^T P x*(t).
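One way to carry out the backward integration in Step 1 is sketched below with ode45, using the matrices of this example (P is packed into a vector only so that a standard ODE solver can handle it):

A  = [0 1; -2 1];  B = [0; 1];
Q  = [2 3; 3 5];   R = 0.25;  Sf = [1 0.5; 0.5 2];  tf = 5;
dre = @(t,p) reshape( -( reshape(p,2,2)*A + A'*reshape(p,2,2) ...
          - reshape(p,2,2)*B*(1/R)*B'*reshape(p,2,2) + Q ), 4, 1);
% ode45 accepts a decreasing time span, so integrate from tf down to 0.
[tP, Pvec] = ode45(dre, [tf 0], reshape(Sf,4,1));
plot(tP, Pvec); xlabel('t'); ylabel('entries of P(t)');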

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 60 / 91


Example 5 (LQR problem - I)

clear;
% State-costate matrix Delta and terminal weight for Example 5
SC = [0 1 0 0; -2 1 0 -4; -2 -3 0 2; -3 -5 -1 -1];
Sf = [1 0.5; 0.5 2];
[W1, D] = eig(SC);
% Reorder eigenvectors so that the first two columns correspond to -M
W = W1*[1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0];
W11 = W(1:2,1:2);
W12 = W(1:2,3:4);
W21 = W(3:4,1:2);
W22 = W(3:4,3:4);
M1 = D(1:2,1:2);          % M1 = -M
tspan = 0:0.1:5;
n = length(tspan);
p11 = zeros(1,n);         % store the entries of P(t)
p12 = zeros(1,n);
p22 = zeros(1,n);
j = 1;
for t = 0:0.1:5
    eM1 = expm(M1*(5-t));                      % e^{-M(tf - t)}
    T1 = -inv(W22-Sf*W12)*(W21-Sf*W11);
    T2 = eM1*T1*eM1;
    P = (W21+W22*T2)*inv(W11+W12*T2);          % analytic P(t)
    p11(j) = P(1,1);
    p12(j) = P(1,2);
    p22(j) = P(2,2);
    j = j+1;
end
figure
plot(tspan,p11,'g',tspan,p12,'b',tspan,p22,'r')
title('Solution of DRE')
xlabel('t-value')
ylabel('P-value')
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 61 / 91


Example 5 (LQR problem - I)

clear;
[tx,x] = ode45('state', [0,5], [2;-3]);   % integrate the closed-loop state equation
plot(tx,x)
title('optimal state')
xlabel('t-value')
ylabel('x-value')

% Function 'state' below (e.g., saved as state.m on the path):
function dx = state(t,x)
SC = [0 1 0 0; -2 1 0 -4; -2 -3 0 2; -3 -5 -1 -1];
A = SC(1:2,1:2);
B = [0;1];
Rinv = 4;                                 % R^{-1}
Sf = [1 0.5; 0.5 2];
[W1, D] = eig(SC);                        % needed for the analytic P(t) below
W = W1*[1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0];
W11 = W(1:2,1:2);
W12 = W(1:2,3:4);
W21 = W(3:4,1:2);
W22 = W(3:4,3:4);
M1 = D(1:2,1:2);
eM1 = expm(M1*(5-t));
T1 = -inv(W22-Sf*W12)*(W21-Sf*W11);
T2 = eM1*T1*eM1;
P = (W21+W22*T2)*inv(W11+W12*T2);         % P(t) at the current time
dx = [A-B*Rinv*B'*P]*x;                   % xdot = (A - B R^{-1} B' P) x
end

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 62 / 91


Example 5 (LQR problem - I)

clear;
SC = [0 1 0 0; -2 1 0 -4; -2 -3 0 2; -3 -5 -1 -1];
B = [0;1];
Rinv = 4;
Sf = [1 0.5; 0.5 2];
[W1,D] = eig(SC);
W = W1*[1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0];
W11 = W(1:2,1:2);
W12 = W(1:2,3:4);
W21 = W(3:4,1:2);
W22 = W(3:4,3:4);
M1 = D(1:2,1:2);
tspan = 0:0.1:5;
n = length(tspan);
u = zeros(1,n);
% Optimal state trajectory (uses the function 'state' defined earlier)
[tx,x] = ode45('state', [0,5], [2;-3]);
xs = interp1(tx,x,tspan);                 % resample the state on tspan
j = 1;
for t = 0:0.1:5
    eM1 = expm(M1*(5-t));
    T1 = -inv(W22-Sf*W12)*(W21-Sf*W11);
    T2 = eM1*T1*eM1;
    P = (W21+W22*T2)*inv(W11+W12*T2);
    K = Rinv*B'*P;                        % time-varying feedback gain R^{-1} B' P(t)
    u(j) = -K*[xs(j,:)]';                 % u*(t) = -K(t) x*(t)
    j = j+1;
end
figure
plot(tspan,u,'r')
title('optimal control')
xlabel('t-value')
ylabel('u-value')

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 63 / 91


LQR problem - II

Optimize: J(u(t)) = (1/2) ∫_0^∞ [ x^T Q x + u^T R u ] dt
subject to ẋ = Ax + Bu with x(0) = x0 (fixed),
tf → ∞ (infinite time horizon) and x(tf) free.
Q ≥ 0 and R > 0

Here, H ≡ H(x(t), u(t), λ(t), t) = (1/2)[ x^T Q x + u^T R u ] + λ^T [Ax + Bu]    (Hamiltonian)

Assumption: the system is controllable.
Recall: Algorithm to solve LQR Problem-I:

Step 1. Solve the DRE: Ṗ + PA + A^T P − P B R^{−1} B^T P + Q = 0 with condition
P(tf) = Sf, from tf to t0.

Step 2. For the optimal state x*, solve:

ẋ = Ax + Bu = Ax − B R^{−1} B^T λ(t) = Ax − B R^{−1} B^T P x(t) = [A − B R^{−1} B^T P] x(t)
with x(0) = x0.

Step 3. Optimal control: u*(t) = −R^{−1} B^T P x*(t).


N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 64 / 91
LQR problem - II

Optimize: J(u(t)) = (1/2) ∫_0^∞ [ x^T Q x + u^T R u ] dt
subject to ẋ = Ax + Bu with x(0) = x0 (fixed),
tf → ∞ (infinite time horizon) and x(tf) free.
Q ≥ 0 and R > 0

Here, H ≡ H(x(t), u(t), λ(t), t) = (1/2)[ x^T Q x + u^T R u ] + λ^T [Ax + Bu]    (Hamiltonian)

Assumption: the system is controllable.

Algorithm to solve LQR Problem-II:

Step 1. Solve the ARE: PA + A^T P − P B R^{−1} B^T P + Q = 0.

Step 2. For the optimal state x*, solve:

ẋ = Ax + Bu = Ax − B R^{−1} B^T λ(t) = Ax − B R^{−1} B^T P x(t) = [A − B R^{−1} B^T P] x(t)
with x(0) = x0.

Step 3. Optimal control: u*(t) = −R^{−1} B^T P x*(t).

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 65 / 91


Example 6 (LQR problem - II)

J = (1/2) ∫_0^∞ [ 2x1²(t) + 6x1(t)x2(t) + 5x2²(t) + 0.25u²(t) ] dt

subject to  ẋ1(t) = x2(t)
            ẋ2(t) = −2x1(t) + x2(t) + u(t)
            x1(0) = 2 and x2(0) = −3

Here, we have A = [0 1; −2 1],  B = [0; 1],  Q = [2 3; 3 5],
x0 = [2; −3],  R = 1/4, and t0 = 0.

Step 1. Solve the ARE: PA + A^T P − P B R^{−1} B^T P + Q = 0.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 66 / 91


Example 6 (LQR problem - II)

PA + A^T P − P B R^{−1} B^T P + Q = 0:

[0  0; 0  0] = −[p11  p12; p12  p22][0  1; −2  1] − [0  −2; 1  1][p11  p12; p12  p22]
               + [p11  p12; p12  p22][0; 1]·4·[0  1][p11  p12; p12  p22] − [2  3; 3  5]

i.e.,

0 = 4p12² + 4p12 − 2
0 = −p11 − p12 + 2p22 + 4p12 p22 − 3
0 = −2p12 − 2p22 + 4p22² − 5

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 67 / 91


Example 6 (LQR problem - II)

clear;
A = [0 1; -2 1];
B = [0;1];
Q = [2 3; 3 5];
R = [0.25];
E = B*inv(R)*B';
P = are(A,E,Q);        % algebraic Riccati equation solver (Control System Toolbox)

which gives

P = [1.73663  0.3660; 0.3660  1.4729]

Step 2. For the optimal state x*, solve:

ẋ = [A − B R^{−1} B^T P] x(t) with x(0) = x0.

Step 3. Optimal control: u*(t) = −R^{−1} B^T P x*(t).
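A minimal sketch completing Steps 2 and 3 in code (the Control System Toolbox is assumed for are; the 10 s simulation horizon is an arbitrary illustrative choice):

A = [0 1; -2 1];  B = [0; 1];  Q = [2 3; 3 5];  R = 0.25;
P = are(A, B*(1/R)*B', Q);            % Step 1: solve the ARE
K = (1/R)*B'*P;                        % constant state-feedback gain
[t, x] = ode45(@(t,x) (A - B*K)*x, [0 10], [2; -3]);   % Step 2: optimal state
u = -(K*x')';                          % Step 3: u*(t) = -K x*(t)
plot(t, x, t, u); legend('x_1','x_2','u');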

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 68 / 91


Lecture 3
Constrained Optimal Control Problems

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 69 / 91


Constrained Optimal Control Problems

Find a u(t) ∈ R^m such that u(t), together with the corresponding trajectory of

ẋ = f(x, u, t);   x(t) ∈ R^n,  f : R^n × R^m × [t0, tf] → R^n

where x(0) = x0 (fixed), optimizes the performance index

J(u(t)) = S(xf, tf) + ∫_{t0}^{tf} V(x, u, t) dt

Moreover, the control is constrained:
‖u(t)‖ ≤ U,   or   −Ui ≤ ui(t) ≤ Ui.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 70 / 91


Constrained Optimal Control Problems
Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)
Moreover
‖u(t)‖ ≤ U,   or   −Ui ≤ ui(t) ≤ Ui.

Necessary Condition (E-L Equations / Pontryagin Minimum Principle)

λ̇ = −∂H/∂x    (Costate (adjoint) Equation)

∂H/∂u = 0 (Optimal control Equation) is replaced by  H(x, u*, λ) = min_{‖u(t)‖≤U} H(x, u, λ)

ẋ = ∂H/∂λ = f    (State Equation)

Boundary Conditions

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0    (Transversality Condition)

x(t0) = x0    (Given initial condition)


N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 71 / 91
Constrained Optimal Control Problems

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)
Moreover
‖u(t)‖ ≤ U,   or   −Ui ≤ ui(t) ≤ Ui.

δJ = ∫_{t0}^{tf} { (∂H/∂x + λ̇(t))^T δx(t) + (∂H/∂u)*^T δu(t) + (∂H/∂λ − ẋ(t))^T δλ(t) } dt
     + [ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 72 / 91


Constrained Optimal Control Problems

Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)
Moreover
‖u(t)‖ ≤ U,   or   −Ui ≤ ui(t) ≤ Ui.

δJ(u*(t), δu(t)) = ∫_{t0}^{tf} { (∂H/∂u)*^T δu(t) } dt

Now

∆J(u*(t), δu(t)) = J(u) − J(u*) ≥ 0    (for a minimum)
                 = δJ(u*(t)) + HOT
                 = δJ(u*(t), δu(t))    (neglecting HOT)
                 = ∫_{t0}^{tf} { (∂H/∂u)*^T δu(t) } dt

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 73 / 91


Constrained Optimal Control Problems

∆J(u*(t), δu(t)) = J(u) − J(u*) ≥ 0    (for a minimum)
                 = δJ(u*(t)) + HOT
                 = δJ(u*(t), δu(t))    (neglecting HOT)
                 = ∫_{t0}^{tf} { (∂H/∂u)*^T δu(t) } dt
                 = ∫_{t0}^{tf} {∆H} dt
                 = ∫_{t0}^{tf} [ H(x, u* + δu, λ, t) − H(x, u*, λ, t) ] dt

Now make sure that ∆J(u*, δu(t)) ≥ 0 for arbitrary 'admissible' δu(t). This gives

H(x, u*, λ, t) ≤ H(x, u* + δu, λ, t)   ∀ admissible δu

Hence

H(x, u*, λ, t) = min_{‖u(t)‖≤U} H(x, u, λ, t)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 74 / 91


Revisit - Constrained Optimal Control Problems
Optimize: J = ∫_{t0}^{tf} L(x, ẋ, u, λ, t) dt          L = H + dS/dt − λ^T(t) ẋ
with x(0) = x0 and tf, xf free.                        H = V(x, u, t) + λ^T(t) f(x, u, t)
Moreover
‖u(t)‖ ≤ U,   or   −Ui ≤ ui(t) ≤ Ui.

Necessary Condition (E-L Equations / Pontryagin Minimum Principle)

λ̇ = −∂H/∂x    (Costate (adjoint) Equation)

H(x, u*, λ, t) = min_{‖u(t)‖≤U} H(x, u, λ, t)    (Optimal control condition)

ẋ = f    (State Equation)

Boundary Conditions

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0    (Transversality Condition)

x(t0) = x0    (Given initial condition)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 75 / 91


Revisit - Constrained Optimal Control Problems

The PMP condition

H(x, u*, λ, t) = min_{‖u(t)‖≤U} H(x, u, λ, t)

is valid for both constrained and unconstrained problems.

The conditions are still necessary conditions.
(Additional necessary condition:) If tf is fixed and H is not an explicit function of
time t, then H is constant along the optimal path.
(Additional necessary condition:) If tf is free and H is not an explicit function of
time t, then H is zero along the optimal path.

The sufficient condition (∂²H/∂u²)* > 0 is only valid for unconstrained problems.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 76 / 91


Example 7

Optimize H = u² − 6u + 7,  |u| ≤ 2
If there were no constraint, the optimizer would follow
∂H/∂u = 0 ⇒ 2u − 6 = 0 ⇒ u* = 3.
But this value of u is certainly outside the constraints.
Think!
H(u*) = min_{|u|≤2} H(u)

Since H is a parabola with vertex at u = 3, H is decreasing on [−2, 2], so

u* = 2
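A one-line numerical confirmation (fminbnd returns a value at, or numerically just inside, the boundary u = 2):

H = @(u) u.^2 - 6*u + 7;
ustar = fminbnd(H, -2, 2)     % approximately 2: the constrained minimizer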

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 77 / 91


Time optimal control problem (TOCP)

Minimize the time taken for an LTI system

ẋ = Ax + Bu

to go from an arbitrary initial condition x(t0) = x0 to the desired final state xf. Here the
control is constrained by
‖u(t)‖ ≤ U.
Without loss of generality we take
    xf = 0 (time-optimal regulator problem)
    ‖u(t)‖ ≤ 1, i.e. −1 ≤ ui ≤ 1.
Assumption: the system is controllable.

Optimize J = ∫_{t0}^{tf} dt
subject to ẋ = Ax + Bu
x(t0) = x0;  x(tf) = 0;  t0 fixed and tf free

Here,
H = 1 + λ^T [Ax + Bu]

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 78 / 91


Time optimal control problem (TOCP)

Optimize J = ∫_{t0}^{tf} dt
subject to ẋ = Ax + Bu
x(t0) = x0;  x(tf) = 0;  t0 fixed and tf free

Here: H = 1 + λ^T [Ax + Bu]

State Equation:

ẋ = ∂H/∂λ  ⇒  ẋ = Ax + Bu*

Costate Equation:

λ̇ = −∂H/∂x  ⇒  λ̇ = −A^T λ

Boundary Conditions: x(t0) = x0; x(tf) = 0 and tf free gives

[ (∂S/∂x − λ)^T ]_{tf} δxf + [ H + ∂S/∂t ]_{tf} δtf = 0  ⇒  H(x, u*, λ) = 0

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 79 / 91


Time optimal control problem (TOCP)

Optimize J = ∫_{t0}^{tf} dt
subject to ẋ = Ax + Bu
x(t0) = x0;  x(tf) = 0;  t0 fixed and tf free

Here: H = 1 + λ^T [Ax + Bu]

Optimal Control Condition: H(x, u*, λ) = min_{‖u(t)‖≤1} H(x, u, λ) gives

1 + λ^T [Ax + Bu*] = min_{‖u(t)‖≤1} ( 1 + λ^T [Ax + Bu] )
⇒ (u*)^T B^T λ = min_{‖u(t)‖≤1} u^T B^T λ
⇒ (u*)^T q = min_{‖u(t)‖≤1} u^T q,   where q = B^T λ

which implies

u* = −1, if q > 0;
u* = +1, if q < 0.
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 80 / 91
Time optimal control problem (TOCP)

Optimize J = ∫_{t0}^{tf} dt
subject to ẋ = Ax + Bu
x(t0) = x0;  x(tf) = 0;  t0 fixed and tf free

Therefore, Optimal Control Condition:

u_j* = −sgn{q_j} = −sgn{b_j^T λ},

where b_j is the j-th column of B (Bang-Bang Control)

Normal-TOCP
Singular-TOCP
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 81 / 91


Time optimal control problem (TOCP): Some Results

1. The necessary and sufficient condition for the TOCP to be normal is that the system
   is completely controllable.
2. The necessary and sufficient condition for the TOCP to be singular is that the
   system is not completely controllable.
3. For the normal-TOCP, the developed control is the unique minimizer.
4. For the normal-TOCP, if A has all n eigenvalues real, then u* can switch between
   −1 and +1 at most (n − 1) times.
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 82 / 91


Time optimal control problem (TOCP): Algorithm

Minimize J = ∫_{t0}^{tf} dt
subject to ẋ = Ax + Bu
x(t0) = x0;  x(tf) = 0;  t0 fixed and tf free

1. Solve the costate equation: λ̇ = −A^T λ ⇒ λ(t) = e^{−A^T t} λ(0).
2. Remember: λ(0) is not known, so assume a guess for λ(0).
3. Evaluate: u_j* = −sgn{q_j} = −sgn{b_j^T λ}.
4. Solve the state dynamics ẋ = Ax + Bu* with the given x0.
5. Monitor the solution x(t) for a tf such that x(tf) = 0; otherwise change your guess
   for λ(0).
This algorithm is tedious! Instead, we look for a closed-loop controller:

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 83 / 91


Example 8: Normal-TOCP

Minimize J = ∫_{t0}^{tf} dt
subject to  ẋ1(t) = x2(t)
            ẋ2(t) = u(t)
            x(t0) = x0,  x(tf) = [0; 0]
            |u| ≤ 1.

Here: H = 1 + λ1 x2 + λ2 u

State Equation:

ẋ = ∂H/∂λ  ⇒  [ẋ1; ẋ2] = [0 1; 0 0][x1; x2] + [0; 1] u*

Costate Equation:

λ̇ = −∂H/∂x  ⇒  [λ̇1; λ̇2] = −[0 0; 1 0][λ1; λ2]

Boundary Conditions: x(t0) = x0;  x(tf) = 0

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 84 / 91


Example 8: Normal-TOCP

Optimal Control Condition: H(x, u*, λ) = min_{|u(t)|≤1} H(x, u, λ) gives

1 + λ1 x2 + λ2 u* = min_{|u(t)|≤1} ( 1 + λ1 x2 + λ2 u )
⇒ λ2 u* = min_{|u(t)|≤1} λ2 u
⇒ u* = −sgn{λ2}

Solve the costate equations:

λ̇1 = 0   ⇒ λ1(t) = λ1(0)
λ̇2 = −λ1 ⇒ λ2(t) = −λ1(0) t + λ2(0)

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 85 / 91


Example 8: Normal-TOCP

Since λ2(t) is linear in t, it changes sign at most once; thus the possible control sequences are
{+1}, {−1}, {+1, −1}, or {−1, +1}.

Problem: λ(0) is not known. Questions:
How can we determine u* from x?
If u* switches, then when?
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 86 / 91
Example 8: Normal-TOCP
Solve the state equation and draw its phase-plane diagram:

ẋ = ∂H/∂λ  ⇒  [ẋ1; ẋ2] = [0 1; 0 0][x1; x2] + [0; 1] u* = [0 1; 0 0][x1; x2] + [0; 1] U;   (U = ±1)

which implies

x2(t) = x20 + U t
x1(t) = x10 + x20 t + (1/2) U t²

Eliminate t from the above solution:

t = (x2(t) − x20) / U

x1(t) = x10 − (1/2) U x20² + (1/2) U x2²(t);   (U = ±1 = 1/U)

If U = +1:                              If U = −1:
t = x2(t) − x20                         t = −x2(t) + x20
x1(t) = C1 + (1/2) x2²(t)               x1(t) = C2 − (1/2) x2²(t)

C1 = x10 − (1/2) x20²  and  C2 = x10 + (1/2) x20²
N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 87 / 91
Example 8: Normal-TOCP

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 88 / 91


Example 8: Normal-TOCP

Switch Curve:

γ− = { (x10, x20) : x10 = −(1/2) x20²,  x20 ≥ 0 }

γ+ = { (x10, x20) : x10 = (1/2) x20²,  x20 ≤ 0 }

γ = γ− ∪ γ+ = { (x10, x20) : x10 = −(1/2) x20 |x20| }

Phase-Plane Regions:

R+ = { (x10, x20) : x10 < −(1/2) x20 |x20| }

R− = { (x10, x20) : x10 > −(1/2) x20 |x20| }

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 89 / 91


Example 8: Normal-TOCP

Optimal Control Law:

u ∗ = u ∗ (x1 (t), x2 (t)) = +1; (x1 (t), x2 (t)) ∈ γ+ ∪ R+


u ∗ = u ∗ (x1 (t), x2 (t)) = −1; (x1 (t), x2 (t)) ∈ γ− ∪ R−
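A minimal sketch of this switching law implemented as state feedback, simulated with a simple fixed-step Euler loop (the initial state, step size, and horizon are illustrative; a fixed step is used because the discontinuous control makes adaptive solvers awkward here):

dt = 1e-3;  N = 10000;
x  = [3; 1];                         % illustrative initial state
X  = zeros(2, N);
for k = 1:N
    s = x(1) + 0.5*x(2)*abs(x(2));   % switch function; gamma is s = 0
    if s > 0
        u = -1;                      % (x1,x2) in R- (or on gamma-)
    elseif s < 0
        u = +1;                      % (x1,x2) in R+ (or on gamma+)
    else
        u = -sign(x(2));             % exactly on the switch curve
    end
    x = x + dt*[x(2); u];            % double-integrator dynamics
    X(:,k) = x;
end
plot(X(1,:), X(2,:)); xlabel('x_1'); ylabel('x_2');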

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 90 / 91


References

Optimal Control Systems - D.S. Naidu (CRC Press)


Optimal Control Theory: An Introduction - D. E. Kirk (Dover Publications)

Thank you.

N. K. Tomar (IIT Patna) Basics of Optimal Control Theory 91 / 91
