Embedded Implementation of Model Predictive Control

This
document is downloaded from DR‑NTU (https://dr.ntu.edu.sg)
Nanyang Technological University, Singapore.
Embedded implementation of model predictive
control
Wu, Xuepei.
2011
http://hdl.handle.net/10356/46832
Nanyang Technological University
Downloaded on 13 Apr 2023 21:05:14 SGT

ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
NANYANG
TECHNOLOGICAL
UNIVERSITY
EMBEDDED IMPLEMENTATION
OF
MODEL PREDICTIVE CONTROL
WU XUEPEI
SCHOOL OF ELECTRICAL AND ELECTRONIC ENGINEERING
A DISSERTATION SUBMITTED IN PARTIAL FULFILMENT OF

THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE IN COMPUTER CONTROL & AUTOMATION
2009
Acknowledgements
I would like to express my deep gratitude to my supervisor Associate Professor Dr.
Ling Keck Voon, Nanyang Technological University, for his constructive suggestions
and helpful criticism throughout the whole period of the project. His enthusiastic
concern on the research progress kept pushing me closer and closer to my objective.
I would also like to sincerely appreciate the help of Project Officer Ms. Yue Siew
Peng. The project would not been successfully finished in due time without her
professional guidance and assistance.
Appreciations are also dedicated to Laboratory Executives in Control Engineering Lab,

Mr. Chan Hian Kuan and Ms. Lim-Tan Geok Lan, for their kindly help on the
management of the laboratory utilities and technical supports.
Last but not least, I would like to thank my parents and my friends for their supports
and encouragements.
ii
Summary
Model Predictive Control (MPC) refers to a type of computer control technology that
utilizes an explicit process model to predict the future response of a plant. It has
become an established control technology in the petrochemical industry, and it is
currently being increasingly applied in many other sectors. MPC outperforms other
control strategies through its ability to deal with constraints. Constrained MPC can be
formulated as a Quadratic Programming (QP) problem, which therefore is central to
MPC. Interior Point Method (IPM) and Active Set Method (ASM) are two schemes
commonly employed to solve QP problems.
MPC has been successfully implemented and verified on Field Programmable Gate
Array (FPGA) or General Purpose Processor (GPP). Alternatively, this dissertation
begins to explore implementation of constrained MPC on fixed-point and floating-
point 32-bit Digital Signal Processor (DSP). A prototyping environment suitable for
investigating implementation issues to bring MPC onto DSP is introduced.
Implementation of IPM and ASM is described in detail and computational
performance of those two methods are presented and analyzed.
The resulting MPC on DSP is demonstrated through a simulated aircraft control

example and it is shown that DSP could be used to implement a reasonably sized
constrained MPC controller.
Table of Contents
Acknowledgements i
Summary ii
List of Figures vi
List of Tables viii
1. Introduction 1
1.1 Motivation 1
1.2 Objectives 2
1.3 Major Contribution of the Thesis 2
2. Constrained Model Predictive Control 5

2.1 Introduction 5
2.2 Receding Horizon Strategy 6
2.3 Principles of Constrained MPC 9
2.4 Solving QP problems 13
2.4.1 Interior Point Method 14
2.4.2 Active Set Method 17
3. Prototyping MPC on DSP 21

3.1 Introduction 21
3.2 Embedded Prototyping Platform 21
3.2.1 TMS320F2812 Processor 22
3.2.2 F2812 Prototyping Board 23
3.2.3 TMS320C6713 Processor 24
3.2.4 C6713 Prototyping Board 25
3.3 Establishing Prototyping Environment 26

3.3.1 F2812 External RAM Expansion 26
3.3.2 Memory Map Configuration 27
3.3.3 Development Software Tool 30
3.4 Matrix Operations on DSP 31
3.4.1 Matrix Multiplication 32
3.4.2 Matrix Inversion 33
3.5 Implementation of IPM Algorithm 36
3.5.1 Formulation of IPM Implementation 36
3.5.2 Implementing IPM Algorithm on MATLAB 37
3.5.3 Implementing IPM Algorithm on DSP 38
3.5.4 Computational Complexity of IPM Algorithm 42
3.5.5 Storage Requirement of IPM Algorithm 42
3.6 Implementation of ASM Algorithm 42
3.6.1 Formulation of ASM Implementation 42
3.6.2 Implementing ASM Algorithm on MATLAB 43
3.6.3 Implementing ASM Algorithm on DSP 43
3.6.4 Computational Complexity of ASM Algorithm 45
3.6.5 Storage Requirement of ASM Algorithm 46
3.7 Testing QP problems 46
3.7.1 Examining a Single QP problem 46
3.7.2 Test Methodology 49
Simulation Results 56
4.1 Convergence Speed 56
4.1.1 Computational Time 56
4.1.2 Number of Iterations 59
4.2 Storage 61
4.3 Numerical Error 61
5. Aircraft Case Study 63

5.1 Introduction 63
5.2 Aircraft Model 63
5.3 Simulation Results 64
5.3.1 System Response 64
5.3.2 Convergence Speed 67
5.3.3 Storage 70
6. Conclusion and Recommendations 71

6.1 Conclusion 71
6.2 Recommendations for Further Research 71
Bibliography 73
Appendices 76
List of Figures
Figure 1.1 Computational time per QP problem using IPM 3

Figure 1.2 Computational time per QP problem using floating-point DSP 4
Figure 2.1 Set-point improvement available from the use of MPC 6
Figure 2.2 The basic idea of receding horizon 7
Figure 2.3 The boundary between controller and plant 11
Figure 3.1 Prototyping of MPC on a Chip 21
Figure 3.2 F2812 eZdsp prototyping board 23
Figure 3.3 C6713 DSK prototyping board 25
Figure 3.4 External RAM expansion circuit 27
Figure 3.5 CCS IDE interface 30
Figure 3.6 Compiler and linker options for F2812 eZdsp 31
Figure 3.7 Compiler and linker options for C6713 DSK 31
Figure 3.8 Debugging flowchart 50
Figure 3.9 Information of the processors and boards 51
Figure 4.1 Computational time per QP problem using IPM on TMS320F2812 56
Figure 4.2 Computational time per QP problem using ASM on TMS320F2812 57
Figure 4.3 Computational time per QP problem using IPM on TMS320C6713 57
Figure 4.4 Computational time per QP problem using ASM on TMS320C6713 58
Figure 4.5 Average number of iterations using IPM on TMS320F2812 59
Figure 4.6 Average number of iterations using ASM on TMS320F2812 60
Figure 4.7 Average number of iterations using IPM on TMS320C6713 60
Figure 4.8 Average number of iterations using ASM on TMS320C6713 60
Figure 5.1 Simulation configuration 64
Figure 5.2 Response of Case 2 with IPM algorithm 65
Figure 5.3 Response of Case 3 with IPM algorithm 65
Figure 5.4 Response of Case 2 with ASM algorithm 66
Figure 5.5 Response of Case 3 with ASM algorithm 66
Figure 5.6 Iterations and computational time oflPM algorithm on F2812 68

Figure 5.7 Iterations and computational time of ASM algorithm on F2812 68
Figure 5.8 Iterations and computational time of IPM algorithm on C6713 69
Figure 5.9 Iterations and computational time of ASM algorithm on C6713 69
List of Tables
Table 3.1 Initialized sections 28

Table 3.2 Uninitialized sections 28
Table 3.3 Worst case computation complexity for IPM algorithm 42
Table 3.4 Worst case computation complexity for ASM algorithm 46
Table 4.1 Storage usage for solving a QP problem using IPM and ASM 61
Table 5.1 Average computational time per aircraft QP problem 70
Table 5.2 Storage usage of aircraft QP problem 70
1. Introduction
1.1 Motivation
Micro-Electro-Mechanical System (MEMS) is the integration of mechanical elements,

sensors, actuators, and electronics on a common silicon substrate through
microfabrication technology. The development of MEMS enables miniaturization of
machines which traditionally occupy a large volume space into chip scale level and
therefore make possible the realization of complete Systems-on-a-Chip (SoC).
MEMS have enjoyed substantial growth in the past decade. Industry analysts estimate
intelligent microsystems will have $100 Billion annual market within the next decade
[15]. There will be increasing demands for advanced control systems to be embedded
into microsystems. The only advanced control technology which has made a
significant impact on industrial control engineering is Model Predictive Control
(MPC).
MPC has now been applied in a wide variety of application areas including process
control, aerospace, vehicles, logistics and networks. The reason which is crucial to the
success of MPC in these applications is that the most profitable operation is often
obtained when a process in running at a constraint or even at more than one constraint,
and being able to take account of constraints is exactly a unique strength of MPC.
Another factor cited often for the practical success of MPC is its natural ability to
handle multivariable problems.
For processes with slow dynamics and low sampling rates, MPC is typically
implemented on a dedicated computer. For systems with fast dynamics such as those
in MEMS, an embedded MPC would be an appropriate controller implementation
since the size and the application precludes the use of a dedicated computer. MPC
implementation based on Field Programmable Gate Array (FPGA) and General
Purpose Processor (GPP) has been proposed in [1] and [11] respectively.
Digital Signal Processor (DSP) is a potential candidate for an embedded platform to

implement MPC. It has become the foundation of the digital revolution and has been
widely applied in real-time industrial applications such as motor controls and power
electronics (see [17], [18] and [27]). So here comes a question: is there a way to bring
MPC onto a single DSP chip, so that it could be applied in DSP-based microscale
devices? This is the major motivation of this project.
1.2 Objectives
MPC deals with constraints by reducing control law to an optimization problem,
which is so-called Quadratic Programming (QP) problem. Computational complexity
can become an issue as the on-line optimization is highly computationally demanding,
especially when applying MPC to embedded applications where computational
resource may be a major constraint. Two optimization schemes, Interior Point method
(IPM) and Active Set method (ASM), are most commonly employed as QP solver.
There are several objectives along this line of research:
1. To implement and verify the IPM and ASM algorithms on DSP processors.
2. To compare the computational performances between fixed-point and floating-
point hardware implementation of IPM and ASM algorithms.
3. To compare the performance of IPM versus ASM using fixed-point and floating-
point DSP.
4. To demonstrate the resulting MPC on a chip through a simulated aircraft control
example.
1.3 Major Contribution of the Thesis
The major contributions of the thesis can be concluded as follows:
1. Coded the IPM and ASM algorithms in C programs and verified them on
TMS320F2812 and TMS320C6713 as the fixed-point and floating-point DSP,
respectively.
2. Compared the computational performance of IPM and ASM by testing both

algorithms for solving different sizes of QP problems, by varying the number of
inequality constraints and decision variables within a certain range, for their
computational time and number of iterations. In addition, analyzed the
computational complexity and investigated the convergence speed of both
algorithms.
3. Verified the applicability of MPC with an aircraft control example by employing

the DSP implementation of the QP solver as the key component of the MPC
controller. It has been shown that DSP could be used to implement a reasonably
sized constrained MPC controller.
We summarize the results obtained in this thesis in Figures 1.1 and 1.2, where m c and
nv denote the number of inequality constraints and the number of decision variables,
respectively.
Computational time as a function ol mc on Manor Point Mathod Computational time at a function of mc on Intertor Pant Method
Number of Constraints mc Number of Constraints mc
Figure 1.1 Computational time per QP problem using IPM

(Left: fixed-point DSP, Right:floating-pointDSP)
Figure 1.1 illustrates the comparison between fixed-point and floating-point DSP
implementation of IPM algorithm. It can be evidently seen that TMS320C6713 DSP
takes great advantage of its floating-point architecture and therefore performs much
faster operations than TMS320F2812. As a result, we conclude that DSP with
floating-point architecture is more appropriate for real-time industrial applications of
MPC, where the computational speed is substantially considered.
Computational time as a function o f m c on Interior Point Method Computational time as a function ol mc on Active Set Method
— • — nv=3 A
•T —*—
nv=5
nv=7
•
1 •1 •1 1
V " " l
E 1 1 1
nv=9
-JK
> '
j
| 1 0 0 -
•
-*•
nv«ll
nv=13 - r- - - r- - - l
- 1 - -
1 1
o
' >
& --1*-'
E
, ^ 1 1
I 50 * ^ - - > — " * • - '

-* -. L - i- - - 1
Jfr • -%
1 i p f
Com
* ^ i t l ,
#1
m •
—% '
• jr^
30 40 50 60 70 80 90
... -V -- - |v
100
_s 110 120
Number of Constraints mc Number of Constraints mc
Figure 1.2 Computational time per QP problem using floating-point DSP

(Left: IPM, Right: ASM)
We have also compared the computational complexity and speed of convergence of

the two QP solvers we employed in this project. It can be found from Figure 1.2 that,
in general, ASM gives lower complexity and converges faster when mc and nv are
small. Otherwise, IPM should be a better choice due to its scalability.
2. Constrained Model Predictive Control
2.1 Introduction
Model Predictive Control (MPC) is the only advanced control technology which has
made a significant impact on industrial control engineering. It originated in the
petrochemical process industry, but its use has now been broadly pioneered in high
bandwidth applications, such as ships [32], aerospace [33], vehicles [34], logistics
[35], networks [36], and also microscale devices [11].
Constraints are always present in applications and the main advantage of MPC is the
ability to take account of constraints. A plant is not expected to operate exactly at the
real limits of its capabilities because of the unexpected disturbances from various
sources. But the better the control system is at dealing with such disturbances, the
closer the controller can operate to the constraints. Figure 2.1 illustrates three
hypothetical probability distributions of some controlled output of a plant, and the
constraint beyond which the output should not stray. Distribution (a) shows the
relatively large variance which results from the use of a relatively badly tuned linear
controller, assuming that the plant behaves approximately linearly, and that the
disturbances have a Gaussian distribution. In order to have an acceptably low
probability of violating the constraint, the set-point has to be set relatively far away
from the constraint, and hence the plant operates far away from the optimal point for
the great majority of the time. Distribution (b) shows the smaller variance of the
controlled outputs, which might be achieved by the use of linear optimal control. The
variance has been reduced, thus allowing the set-point to be significantly closer to the
constraint. Since the control law is linear, the distribution remains Gaussian.
Distribution (c) shows the effect of employing predictive control. With the ability of
being aware of the constraint, the controller reacts very differently in response to a
disturbance which pushes the output towards the constraint from that which pushes it
away from it. The controller is therefore nonlinear, and the distribution of the output
becomes significantly unsymmetrical with the possibility to operate the plant with a
set-point very close to the constraint, while retaining an acceptably small probability
of violating the constraint.
Figure 2.1 Set-point improvement available from the use of MPC
Because of the high computational requirements needed for optimizations associated

with MPC, it has primarily been applied to plants in the process industry, with slow
dynamics.
2.2 Receding Horizon Strategy
MPC adopts a receding horizon control strategy. Figure 2.2 shows the basic idea of
the receding horizon concept. A single-input single-output (SISO) discrete time plant
is discussed in this presentation.
The set-point trajectory at any time t, which is the trajectory that the output should
follow ideally, is denoted as s ( t ) . Distinct from the set-point trajectory is the
reference trajectory. The reference trajectory does not necessarily have to coincide
with the set-point. It starts at the current output y(/c) and defines an ideal trajectory
along which the plant should return to the set-point trajectory, e.g. after a disturbance
occurs. The notation r(k + i\k) indicates the predetermined reference trajectory
depends on the conditions at current time k. However, if the knowledge of future
references is unknown, a commonly made simplification is to assume a constant
reference from time k and onwards. Here we assume that the current output y(k) can
be measured with the past inputs u{k — 1), u(k — 2),..., but not on the input u(k).
Constraint
k + NP Time
^ Constraint
Input
k k + Nu k + NP Time
Figure 2.2 The basic idea of receding horizon
Now we consider the simplest case, assuming that the predicted output is required to
be equal to the reference trajectory only at the end of the prediction horizon, namely at
time k + NP. In other words, we only have a single coincidence point at time k + NP.
Also, we assume the optimal input trajectory, which drives the future output of plant
to the reference, remains constant over the prediction horizon, i.e. u(k\k) =
u(k + l\k) = ••• = u(k + NP — l\k). In this case, the receding horizon idea is to
compute an optimal input trajectory {u(k\k),u(k + l|/c), ...,u(/c + NP — l\k)} and
choose one parameter u(k\k) such as to bring the plant output at the end of the
prediction horizon, namely y(k + NP\k) to the required value r(k + NP\k).
Once a future input trajectory has been chosen, only the first element of that trajectory
is actually applied as input to the plant. That is, we set u(/c) = u(k\k), where u(/c)
denotes the actual input applied at time k.
Then the whole cycle of output measurement, prediction and input trajectory
determination is repeated. After one sampling interval, a new reference trajectory
r(k + i\k + 1) (i = 2,3,...) is defined; predictions are made over the horizon k +
1 + NP, with i = 1,2,..., NP ; a new input trajectory u(k + 1 + i\k + 1), with i =
0,2, ...,NP — 1 is chosen; and finally the next input u(fe + 1) = u(k + l\k + 1) is
applied to the plant. Since the prediction horizon remains of the same length NP as
before, but slides along by one sampling interval at each step, this way of controlling a
plant is called a receding horizon strategy.
In the simplest case considered above, since there is only one equation at coincidence
point k + NP , y(k + NP\k) = r(k + NP\k) to be satisfied and one parameter to
choose for the future input trajectory, there is a unique solution. However, in more
common situations there are several coincidence points in the prediction horizon (even
all the points k + \,k + 2, ...,k + NP are coincidence points). Therefore the internal
system of the controller, which is used to predict the behavior of the plant, is
overdetermined, i.e. there are usually more coincidence points than parameters. This
leads to the condition that there are more equations to be satisfied than the number of
available variables and the impossibility of choosing the future input trajectory such
that the predicted output coincides with the reference input at all the coincidence
points. We therefore have to be satisfied with an approximate solution. Most
commonly, this approximate solution is the least-squares (LS) solution.
If there are constraints on the inputs and/or outputs, then the simple linear LS solution
has to be replaced by a constrained LS solution. With constraints, it is no longer
possible to obtain a closed-form solution and some form of iterative optimization
algorithm must be employed. If the constraints are in the form of linear inequalities,
then we have a quadratic programming (QP) problem.
Now suppose that we allow a more complicated future input trajectory. The input is
allowed to change over the next Nu steps. For instance, control horizon Nu = 3 in
Figure 2.2, the input is assumed to vary over the first three steps of the prediction
horizon, but to remain constant thereafter, i.e. u(k + 2\k) — u(k + 3\k) = ••• =
u(k + NP — l\k) or the change in the input Au(/c + i\k) = 0 with i = 3,4, ...,NP , so
that there are three parameters to choose: u(k\k), u(k + l\k) and u(/t + 2\k) .
Generally speaking, /Vu is chosen to be considerably smaller than NP. And we have to
choose u(k\k),u(k + l\k), ...,u(k + Nu — l|/c) and assume that u(k + Nu - l\k) =
u(k + Nu\k) — ••• — u(fc + NP — l\k). Again, we only implement the first element
of the future input trajectory.
2.3 Principles of Constrained MPC
Constrained MPC can be formulated as a QP problem. Suppose that we have a

discrete linear time-invariant (LTI) plant in the discrete-time state-space form,
x(k + 1) = APx(k) + BPu(k)

U J
y(k) = CPx(k)
where y(/c) £ IRm, u(k) E Rl and x(k) E M.n represent the system output, input and
states respectively. The index k counts time steps. In most cases, DP is set to zero as
the control signal u(k) is not able to affect the output y(k) directly with existing
system delay. The process output can be predicted by iterating the model,
y(k + l\k) = CPx(k + l|/c) = CPAPx{k) + CPBPu(k\k) (2)
Similarly,
y(k + 2\k) = CPx{k + 2|fc) = CPApx(k) + CPAPBPu(k\k) + CPBPuQ<. + \\k) (3)
And in general,
y(k + i\k) = CPx(k + i\k) = CPAPx{k) + Y CPAp~xBpu(k + i - j\k) (4)

7= 1
10
State-space model has the advantage that it can be used for multivariable processes in
a straightforward manner.
If the predicted outputs are collected into a vector, we obtain an expression for the
vector of predicted outputs,
f Kk +1) • Lp/lp CpBp 0 0 U(fc)

9(k + 2) — LpApDp CpBp 0
x(k) + u(k + 1) (5)
NP-NU.
$(k + NPl .CpAp . dp/ip Dp Cp/lp Dp CpPnA
p
|u(fc + Nu - 1)
where NP and Nu are the prediction and control horizons, respectively.
The above plant model (1) expresses the plant state x in terms of the values of the
input u. However, in predictive control it is often convenient to regard the discrete-
time integration from Au to u as being included in the plant dynamics. As a result, we
consider the required changes in the control signal from one time step to the next:
Au(/c) = u(k) — u(/c — 1). The general aim is that the future output y within the
considered horizons should follow a reference signal r and, at the same time, the
control effort Au necessary for doing so should also be penalized. For many industrial
applications, the plants generally have some physical limitations. Such limitations are
the constraints to system input, output or state variables.
Therefore the constrained MPC problem is to minimize the cost function,

Np Nu-\
2
4>(y, Au) = V ||y(fc + i\k) - r(k + i\k)\\ +/iV ||Au(/c + i|/c)|| 2 (6)
i=i i=0
subject to linear inequality constraints on the system outputs, inputs and states,
(7)
U
MIN ^ 7 u " — UMAX (8)
AuMIN < JAuAu < AuMAX (9)
h is the control weighting which expresses the relative importance between the
tracking errors (y — r) and control effort Au.
11
Figure 2.3 shows the real controller producing the signal u which is passed to the real
plant, and the MPC controller producing the signal Au which is passed to the MPC
plant.
MPC Plant
(A.B.C)
MPC Au I u Real Plant

-i H
Controller I \Ap, Bp, Cpj
i
I
"T~"
Real Controller
Figure 2.3 The boundary between controller and plant
Now we try to derive the MPC control law by using a state-space model with an
incremental input Au rather than the input u.
From the state-space model (1), we have
x(k + 1)
u(k)
(10)
or
Ax(k + 1)
(11)
where Ax(/c) = x(fc) - x(k - 1).
Ax(/c + 1)1 r AP 01 rAx(/c) BP

Au(fc)
y(/c + l)J-[Cp/4p /J[y(/c)J + CpBp
(12)
12
To include the incremental input Au in a state-space model of the MPC plant, we

define the state vector
Ax(k + 1 )
(13)
m= y(k +1)
Then the state-space models have the form
f(k + 1) = i4f (fc) + BAu{k)

(14)
y(k) = caw
where
AP 0 BP
,B ,C = [0 I],
CpBp
It can be shown that the predictions based on the augmented state-space model is
r y(k +1) 1 ' CA CB 0 0 Au(/c)

9(k + 2) CA2 CAB CB 0 Au(fc + 1)
= f(fc) + (15)
N N X N 2 Np Nu
y(k + w P ). .CA >'. .CA <- B CA <-- B CA ~ B. Au(k + NU-1)
or in vector-matrix notation
y = Vxttk) + Vuz (16)
with
y = [y(fc + D y(fe + 2) ... y(k + NP)]T,

z = [Au(k) Au(k + 1) ... Au(k + NU + 1)] T ,
CA
CA2
Vx =
CANr
0 0
CAB 0
CANr-xB CANr~2B ••• CANP~N"B.
Hence the cost function in (6) can be further rewritten as,

13
1
<p(z)=-zTQz + crz (17)
The cost function is now represented in quadratic form with the constraints in the
form of linear inequalities.
In this thesis, for the convenience of hardware implementation, all constraints are
rearranged into the unique form of input constraints.
Using the prediction result in (16), we can rewrite (7) as follows
h v„z<\-::-]-\
yMAxJy1 _
. [ /;
Vxx(k) (18)
-h ~yMINJ i~Jy
In this way, the inequality constraints (7 - 9) can be written compactly as
Jz < g (19)
In the above equations, Q = 2*PuTlFu + 2/ is a nv x nv symmetric positive semi-
definite matrix so-called Hessian matrix, where nv = Nul is the number of decision
variables, and c = 2 v P u T ( x F z x' c — r) is nv x 1 in size. / and g have the sizes mc x nv
and mc x 1, respectively, where mc is the total number of inequality constraints.
Equations (17) and (19) tell us that the QP problem is bound up with matrices Q,c,J
andg.
In this way, the constrained MPC problem is converted into a problem of minimizing
an objective function subjecting to constraints as it is in (17) and (19), which is a
standard optimization problem known as the QP problem. A linear programming (LP)
is the special case of a QP when Q = 0, so that the objective function is linear rather
than quadratic.
2.4 Solving QP problems
Basically, the fundamental and major component of MPC controller is the solving of
QP problem as it results in optimal input trajectory for the control process. Therefore,
the ability of solving the QP problem on-line becomes critical when applying MPC to
real-time systems with prescribed response time and embedded applications with
limited computational resource. In the above section, a constrained MPC optimization
problem reduces to a typical QP problem. And there exist many methods in the
literature for solving QP problems. Among these methods, Interior Point method (IPM)
14
and Active Set method (ASM) have been proved to be powerful approaches for
solving QP problems, therefore have been the most commonly used. The capacity of
both methods for MPC applications have been investigated in [13], [25], [27]. In this
section, the effective on-line computation algorithms basing on IPM and ASM are
described.
Since Q > 0, the QP problem in (17) and (19) is convex. As a result of the QP
problem being convex, the necessary and sufficient conditions for z to be the global
optimum are given by the Karush-Kuhn-Tucker (KKT) conditions: there must exist
vectors (Lagrange multipliers) A > 0 and (slack variable) t > 0, such that
Qz+JJA = -c (20)
-Jz-t = -g (21)
tTA = 0 (22)
Equations (20) and (21) are called feasibility conditions while the equation (22) is
called complementary condition.
2.4.1 Interior Point Method
IPM is one of the most popular methods for solving QP problems. The basic idea of
this method is derived from the KKT conditions. The infeasible IPM perturbs the
complementary condition in (22) with the following scalar
Hk=^-^- (23)
mc
where k is the iteration sequence and mc is the number of inequality constraints in (4).
As the iteration goes on, the infeasibility and fik is gradually reduced to zero.
According to the infeasible interior point framework introduced by [13], an optimal

control signal can be computed by solving the convex QP problem using the following
steps:
Step 1:
Choose an initial condition (z°, A0, t°) with positive (A0, t°).
15
Step 2:
At the kth iteration step, solve for the increments (Az k , AAk, At k ) with
(24)
2 J
and
At = -tk + A-x(onke - TAX) (25)
Here r = -(Aky1TK and
'% tk "1"
A = ,T = ,e =
^mc tk .1.
a = 0.25 is a pre-defined constant.
In addition,
r{ = -Qzk-JTAk-c (26)
k k /c _1
r = -Jz + g — cr/i A e (27)
Step 3:
Increment the variables by
(z k + 1 , A* + \ t k + 1 ) = (zk, Xk, tk) + ak(bzk, AAk, At k ) (28)
Step 4:
Judge the convergence. If the iteration converges, stop the process and the optimal
control zk+1 is obtained; otherwise, go back to Step 2 with (z f c + \ A k+1 , t k + 1 ) and
continue the iteration process.
It can be seen that solving such a QP problem is an iterative process and the main
computational load is in solving equation (24) at each iteration.
Equation (24) can be solved either as

16
Azk = Q-lrk _ Q-ljTtfk <• ^
or
f Azk = «? -fr-yrw -; T r-v 2 k )

t AAk = r-1r2fc-r-1;Azk LJ
Both schemes above need to inverse a matrix. For (29), the matrix to be inverted at
each iteration is T — JQ~X]T, which has dimension m c x m c . In (30), the matrix
is Q — / T r _ 1 / > which has dimension nv x nv. T — JQ_1JT has a special structure in
that only the diagonal elements change at each iteration. In most practical MPC
applications, mc is generally much larger than nv. In such a situation, (30) is more
attractive for embedded implementation and is adopted for the rest of the thesis.
The pseudo-code for IPM algorithm is presented below:
01: Select an initial condition (z°, A0, t°) with (A°,t o )>0and select a positive
termination threshold 6.
02 for k = 0,1,2,... do
03 A <- diag(Ak) and T *- diag(t k )
04 Solve for Azk with ((? - / T r - 1 / ) A z k = rx* -]Tr~xr^
05 Calculate AAk = r~lr£ - r^jAz"- and At" = -tk + A^âê - TAX)
06 a1 <- 1 and a2 *- 1
07 if tk + arAtk < 0 and Ak + cr2AAk < 0 then
08 a-L^ a-!- 0.001 and az<â2- 0.001
09 end if
10 a - m t a f ^ j t a . 0.001).. max ){a2,0.001})
11 (z k + 1 ,A k + 1 ,t k + 1 ) <- (z k ,A k ,t k ) + ak(Azk,AAk,Atk)
12 ^k+i <_ (tT)k+ix/c+imc-i
13 if/zk+1 < 5 then

14 Terminate with solution z, = z k + 1
15 end if
16 end for
17
Here, we have
?/(At) = {i = 1,2 m c : At(i)<0},

T/(AA) = {i = 1,2 m c : AA(i) < 0}
The sub-optimality of z k is measured by the quantity juk = (t T ) k X. k m c ~ 1 , where tk

and Ak are iterates for the slack variables and Lagrange multipliers. As the iteration
continues, fj.k is gradually reduced to zero and zk converges to the optimal solution.
In Line 4 of the above 1PM algorithm, a linear system in the form of Ax = b has to be
solved, where A is a nv x nv square matrix and b is column vector with nv
components. In this project, it is solved by Gauss-Jordan elimination with pivoting.
2.4.2 Active Set Method
ASM is also most commonly used as a general purpose QP solver (see [2] and [17]). It
solves the QP problem (17) and (19) via identifying the active set of constraints w for
its solution z,. The method begins with an initial guess iv° the active set. In the next
iteration, w is refined by deleting a constraint from vv° or adding a constraint tow 0 . In
this way, a guess of the active set is refined iteratively until the exact active set is
found.
Step 1:
Initial a feasible point z° and let its initial working set w° to be zero.
Step 2:
Now we consider in more detail the minimization of the cost function at the kth
iteration. The new cost is
d>(zk + Az k ) = - (z k + Az k ) T Q(z k + Az k ) + cT(zk + Az k )

1
= 0 ( z k ) + -Az k (?Az + (c T + (z k ) T Q)Az k (31)
subject to
Av*Azk = 0 (32)
18
So the optimization problem to be solved becomes the minimization of -AzkQAz +

k T fe k k
(c ) Az , where c — Qz + c.
Thus the KK.T conditions for this method are:
QAzk+j\X = -ck (33)
-JwkAzk =0 (34)
These equations can be represented in matrix form,
Az k ' -ck'
. A . = . 0 .
(35)
Jwk denotes the matrix obtained by removing the £th row in step 5. Solve the above
systems of linear equations for Azk and A.
Step 3:
If Azk = 0 does not solve (31) and (32), go to step 5 to check if global optimum has
reached. Otherwise, continue.
Step 4:
Suppose that at the kth iteration we have the feasible solution zk. The ASM then find
an improved solution zk + Azk which minimizes the cost (31) without worrying
about the inactive inequality constraints. If this new solution is feasible, i.e. if/(z f c +
Azk) < g, then it is accepted as the next iteration: zk+1 = zk + Az. If not, choose a
step ak, where
fat-/(**!)
an min 1,' min
iSDk [ JtAzk JI
(36)
Ji denotes the ith row of/, and from/ for all i g wk.
With 0 < ak < 1, the new solution becomes z k + 1 = zk + aAz, and a new constraint
becomes active, defined by the index which achieves the minimum in (36), and this
index is added to the active set w.
19
Step 5:
Judge the convergence by evaluating those elements of Lagrange multipliers A

corresponding to inequality constraints. If they are all nonnegative then the global
solution has been found, terminate withz, = zk. Otherwise, remove the inequality
constraint that corresponds to the most negative Lagrange multiplier (say the ith row)
from the active set to reduce the cost function and go to step 2.
The following is the pseudo-code of ASM algorithm for embedded implementation.

01: Compute a feasible initial point z°.
02: Let the initial active set w° be empty.
03: for k = 0,1,2, ...do
Azk -c-Qzk^ for(Azk,A)
04: Solve
X J 0
05 ifAzk = Othen
06 if all entries of X > 0 then
07 Terminate with solution z» = zk
08 else
09 Remove from wk and index that corresponds to the smallest entry
of A,and then zk+1 «- zk
10: end if
11: else
12: D*~{itw*:Jtte*>O,*0-<l}
13 ifDk is empty then
14 zk+1 «- zk + Az and wk+1 «- wk
15 else
16: minf 9i~Jiz' } and zk + i zk + akAz

17 Create wk+1 by adding one element of Dk to wk
18 end if
19 end if
20 end for
20
As in the IPM algorithm, a linear system formulated as Ax = b in Line 4 of the above

ASM algorithm has to be solved. Here A is a square matrix of order mc + nv and b is
a (mc + n„) x 1 column. In this project, it is solved by Gauss-Jordan elimination with
pivoting, and the specific implementation of this method is introduced in Section 3.3.2.
21
3. Prototyping MPC on DSP
3.1 Introduction
MPC controller is an easy-to-implement control law and it has been successfully

realized and verified on Field Programmable Gate Array (FPGA) in [1] and [3].
Alternatively, a real-time embedded implementation using a General Purpose
Processor (GPP) is demonstrated in [11] and [15], where a high-speed 32-bit
microprocessor is utilized. In this section, implementation of IPM and ASM using 32-
bit Digital Signal Processor (DSP) is proposed and described in detail.
The optimization algorithm models are developed and verified first on

MATLAB/Simulink, and then the M-codes are translated into C/C++ codes for
implementing the algorithms on the fixed-point and floating-point DSP development
boards.
MPC implemented
in MATLAB
®- Plant simulated
in MATLAB
MPC implemented
on DSP
-o
Figure 3.1 Prototyping of MPC on a Chip
The development software we adopted in this project is MATLAB R2007b (from The
Math Works Inc.) and Code Composer Studio v3.3 (from Texas Instruments Inc.).
3.2 Embedded Prototyping Platform
The embedded platforms we employed in achieving constraint MPC on a chip include

a TMS320F2812 eZdsp™ prototyping board and a TMS320C6713 DSK prototyping
board (both from Spectrum Digital Inc.), for DSP implementation with fixed-point
and floating-point arithmetic architecture respectively. The DSP development board is
22
a stand-alone module permitting software engineers to evaluate certain characteristics

of the DSP to determine processor applicability to design requirements. Evaluators
can create software to execute onboard or expand the system in a variety of ways.
3.2.1 TMS320F2812 Processor
The TMS320F2812 (abbreviated as F2812) DSP is high-performance solution for

real-time control application. Because it is a 32-bit fixed-point processor, numbers are
represented in integer format and only fixed-point arithmetic can be handled.
Nevertheless, resorting to floating-point emulation by the CCS compiler, floating-
point operations can be converted to fixed-point format and executed relatively slowly
on F2812. F2812 operates at maximum 150 million instructions per second (MIPS)
and delivers a 32><32 multiply and accumulate (MAC) in a single 6.67 ns instruction
cycle.
In contrast with GPP and its von Neumann architecture, DSP architectures are
designed for repetitive multiply, add and store operations that step sequentially
through data values stored in consecutive memory locations. MAC is a common
operation in computing that computes the product of two numbers and adds that
product to an accumulator. This operation involves at least two operands and for this
reason DSP processors usually support multiple memory accesses in the same
instruction cycle. To be able to fetch two operands and an instruction from memory
into the central processing unit (CPU) in a single instruction cycle, the DSP processor
is equipped with two independent bus systems: program bus and data bus. Texas
Instruments calls this kind of processor a "modified Harvard architecture". This
architecture usually permits the program bus to be used also for access of operands, i.e.
the processor is capable of reading operands not only from data memory but also from
program memory.
The F2812 memory bus architecture contains a program read bus, data read bus and
data write bus. The program read bus consists of 22 address lines and 32 data lines.
The data read and write busses consist of 32 address lines and 32 data lines each. The
32-bit-wide data busses enable single cycle 32-bit operations. On F2812, a 32><32
23
hardware multiplier is immediately followed by a 32-bit arithmetic logic unit (ALU),

so that the accumulation operation is performed in the immediate next clock cycle
after the multiplication. The multiplier and the ALU can be used in parallel through
pipelining to simultaneously execute a 32x32 MAC operation on F2812 in a single
instruction cycle.
Instruction pipelining is often used in DSP architecture to increase instruction

throughput and save execution time by executing several instructions concurrently. It
does not completely cancel out idle time in a CPU but making those modules work in
parallel improves program execution significantly.
F2812 processor has an 8-stage protected pipeline to maximize the throughput. DSP
can ideally achieve one clock cycle per instruction on average with pipelining.
Without pipelining, the number of clock cycles it takes to execute an instruction is 8.
Therefore this pipelining enables the F2812 to execute at high speeds without
resorting to expensive high-speed memories. This protected pipeline prevents a write
to and a read from the same location from occurring out of sequence and therefore
assures the order of results is as written in source code.
3.2.2 F2812 Prototyping Board
Figure 3.2 F2812 eZdsp prototyping board

24
The F2812 eZdsp prototyping board (Figure 3.2) is adopted as the embedded platform
for carrying out the fixed-point DSP implementation of IPM and ASM.
Key Features of the F2812 eZdsp:
1. TMS320F2812 fixed-point Digital Signal Processor at 150 MHz

2. 30 MHz clock
3. 36 KB on-chip single cycle access RAM
4. 256 KB words on-chip Flash memory
5. 128 KB off-chip SRAM memory
6. 2 Expansion Connectors (analog, I/O)
7. Onboard embedded IEEE 1149.1 JTAG emulation connector
The F2812 eZdsp incorporates a standard parallel port interface that supports ECP,
EPP, and SPP8 communications and has direct access to the integrated JTAG
interface. In this project, the parallel port is used for exchanging data, like matrices
Q, c,J and g, between PC and DSP.
3.2.3 TMS320C6713 Processor
The TMS320C6713 (abbreviated as C6713) is a 32-bit floating-point processor

capable of performing both integer and floating-point operations. With IEEE 754
floating-point standard (1-bit sign, 8-bit exponent and 23-bit mantissa) on C6713, a
much greater range of numbers can be represented comparing with a fixed-point
processor.
Unlike F2812, C6713 is based on the very-long-instruction-word (VLIW) architecture,

which is very well suited for numerically intensive algorithms such as IPM and ASM.
This architecture was further developed from Harvard architecture and introduced by
Texas Instruments. VLIW architecture consists of multiple ALUs running in parallel,
performing multiple instructions during a single clock cycle. With the C6713
operating at 225 MHz (4.44 ns cycle time per instruction), the processor is capable of
fetching eight 32-bit instructions every 4.44 ns, which equates to 1800 MIPS.
25
C6713 CPU contains eight independent functional units: six ALUs (two of the six
perform fixed-point operations only) and two multipliers. Two MACs can be ideally
achieved per instruction cycle, for a total of 450 million MACs per second (MMACS).
With six of the eight functional units capable of handling floating-point operations, it
is possible to perform 1350 million floating-point operations per second (MFLOPS).
The C6713 pipeline can dispatch eight parallel instructions every cycle.
3.2.4 C6713 Prototyping Board
TMS320C6713 DSK prototyping board (see Figure 3.3) is an excellent platform to

develop and run highly computationally demanding algorithms for the C6713
processor.
Figure 3.3 C6713 DSK prototyping board
Key Features of the C6713 DSK:
1. TMS320C6713 Floating-point Digital Signal Processor at 225 MHz

2. 16 MB on-board synchronous DRAM
3. 512 KB non-volatile Flash memory
4. 4 user accessible LEDs and DIP switches
26
5. Software board configuration through registers implemented in CPLD

6. JTAG emulation through on-board JTAG emulator with USB host interface or
external emulator
A USB interface to the embedded JTAG emulation logic on the C6713 DSK is
provided and this allows for code development and debug without the use of an
external emulator. In this project we use USB serial port for the communication
between PC and DSP.
3.3 Establishing Prototyping Environment
In the above section, we have adopted the DSP development boards for implementing
IPM and ASM algorithms. Some configurations have to be done on both hardware
and software level to make the development board work properly. In this section, a
rapid prototyping environment suitable for exploring the implementation of MPC is
set up.
3.3.1 F2812 External RAM Expansion

The IPM and ASM algorithms involve great amount of floating-point matrix
calculation with high-dimension, which requires large storage of RAM. For F2812
eZdsp, 36 KB on-chip RAM is not sufficient for storing all the matrices of a QP
problem when the size of the problem reaches a certain level. Luckily, a 128 KB off-
chip static RAM IS61LV6416-12T is provided on the development board, which
overcomes the previous limitation.
An XINTF zone is a region in the F2812 memory map that is directly connected to the
external interface. Each XINTF zone can be individually configured with unique read
and write access timing and each has an associated zone chip-select signal.
Zone 6 has a chip-select signal XZCS6AND7 (shared with Zone 7) that is connected
with the chip-select signal of external SRAM IS61LV6416-12T (see Figure 3.4). As a
result, in this project, Zone 6 is set as the external RAM region and toggled when an
access is made to Zone 6.
27
- < C \ A [ 0 . 181
— ' ; X U | 0 . I5|
SRAM
Q a
a a
AO >> DO
7
8
XEO
XD1
/
A] Dl
9 XD2 /
A2 D2
10 XD3 /
A3 D3
13 XEW /
A4 D4
14 XD5 /
A5 ES
15 XD6 /
A6 D6
16 XET7
/
A7 D7
29 XE» /
A8 E*
30 XD9 /
All EN
31 XD10
/
A12 D10
32 XD11
/
AI3 Dll
35 XDI2
/
AI4 DI2
36 XD13
/
AI5 DI3
37 XDI4
/
Aid D14
38 XD15
/
AI7 1)1 J /
A9
XA17
All)
XZCSSAMJ7n>>- QL
XWE»>>- W_E
XRI)n > > - OE ] ^ CIO
Ipn 0.01UF1~ J~0.01uF
BLE
</> Vl
>>
IS6ILV6416-I2T
Figure 3.4 External RAM expansion circuit
However, F2812 is not allowed to access the external program bus and the data bus at
the same time. Compared to a single cycle for internal access to two 32-bit operands,
it takes at least 4 cycles to do the same with external memory.
For C6713 DSK prototyping board, matrices can be completely saved in on-chip
RAM of TMS320C6713 processor. Hence external RAM expansion is not necessary
for this project, though it is provided on C6713 DSK.
3.3.2 Memory Map Configuration
The DSP compiler creates multiple portions of code and data called sections. These
sections are categorized into two different groups: initialized and uninitialized. The
initialized group of sections is composed of all code, constants, and initialization
tables. Table 3.1 shows the initialized sections produced by the compiler.
28
Table 3.1 Initialized sections

Name Contents Restrictions
.cinit Tables for explicitly initialized global and Program
static variables
.const Global and static const variables that are Low 64K data
explicitly initialized and string literals
.econst Far constant variables Anywhere in data
.pimt Tables for global object constructors Program
.switch Tables for implementing switch statements Program (with -mt option)
Data (without -mt option)
.text Executable code and constants Program
The uninitialized group of sections is composed of variables, the stack, and malloc
memory. Table 3.2 shows the uninitialized sections produced by the compiler.
Table 3.2 Uninitialized sections
Name Contents Restrictions
.bss Global and static variables Low 64K data
.ebss Far global/static variables Anywhere in data
.stack Stack space Low 64K data
.sysmem Memory for malloc functions Low 64K data
.esysmem Memory for far malloc functions Anywhere in data
In this project, matrices defined as global arrays are mainly stored in .cinit and .ebss
sections, and matrices defined as local arrays are stored in .stack section. Generally
global variables should be carefully used because they can potentially be modified
from anywhere and any part of the program may depend on it. However, in this
application the stack might not be capable of storing all the local arrays, so some of
the matrices should be moved to global position.
Once the compiler has generated these sections the linker takes the individual sections
from each source file and combines them to create an output section. The linker
command file (.cmd) is used to tell the linker where to allocate these sections.
Because of external RAM expansion, .cinit and .ebss sectors are assigned to Zone 6.
29
Hence, for F2812 eZdsp, the linker command file F2812_EzDSP_RAM_lnk.cmd

should be rewritten as below:
// FILE: F2812._EZDSP_RAM_1nk.cmd, Linker Command File
{
PAGE 0 :
RAMMO : origin = 0x000000 , length = 0x000400
BEGIN : origin = O X 3 F 8 0 0 0 , length = 0x000002
PRAMHO : origin = 0 X 3 F 8 0 0 2 , length = OxOOOFFE
RESET : origin = 0x3FFFC0 , length = 0x000002
ZONE6 : origin = 0x100000 , length = 0x005000
PAGE 1 :
RAMMl : origin = 0x000400,, length = 0x000400
DRAMHO : origin = 0x3f9000 , length = 0x001000
ZONE6 : origin = 0x105000,, length = 0x009000
}
SECTIONS
{
codestart : > BEGIN, PAGE = 0
ramfuncs : > PRAMHO PAGE = 0
.text : > PRAMHO, PAGE = 0
.cinit : > ZONE6, PAGE = 0
.pi nit : > PRAMHO, PAGE = 0
.switch : > RAMMO, PAGE = 0
.reset : > RESET, PAGE = 0, TYPE = DSECT
.stack : > RAMMl, PAGE = 1
.ebss : > ZONE6, PAGE = 1
.econst : > DRAMHO, PAGE = 1
.esysmem : > DRAMHO, PAGE = 1
}
The linker command file for C6713 DSK is shown below. It can be seen that, in this
project, only on-chip RAM is employed to store variables.
// FILE: C6713dsk.cmd, Linker Command File
MEMORY
{
IVECS: org=0h, len=0x220
I RAM: org=0x00000220, len=0x0002FDE /*internal memory*/
SDRAM: org=0x80000000, len=0x0010000 /*external memory*/
FLASH: org=0x90000000, len=0x0002000 /*flash memory*/
}
SECTIONS
{
.EXT_RAI^l :> SDRAM
.vector;i :> IVECS /*in vector file*/
.text :> IRAM /*Created by C Compiler*/
.bss :> IRAM
.cinit :> IRAM
.stack :> IRAM
.sysmem :> IRAM
.const :> IRAM
.switch :> IRAM
.far :> IRAM
30
.cio :> IRAM

.csldata :> IRAM
3.3.3 Development Software Tool
Code Composer Studio (CCS or CCStudio) provides an easy-to-use integrated

development environment (IDE), bringing together the assembler, C/C++ compiler,
linker, debugger, simulators, emulation drivers and so on. Figure 3.5 shows the
software development interface.
< /F2S12 e/dsp/cpu I! TMS320C28xx - Code Composer Sludii
Ed* View Project Debug GEL Option Profle Took DSP/BIOS Window Heb
a ef a %> "Tj H « S, H % U 3 ¥? ia 10 n /* • % *
p* 3lDebug 3 # I 4 »1
& eV Q | 93 [U Q B « i P
f S T
JGELftes
•include "DSP281x_Device.h"
•include <stdio.h>
•include <math.h>
' PSPStflx Hescierfile Include file
13
j j Projects
J j I P M p * (Debug)
•define NV 1
K _ J Dependent Projects
+ _ J Documents
•define MC Jl
<? ^ J DSP/BI05 Config
•define MAXITER 100
1 Generated Ftes
•define TOLMIU B.00001
1) • _JInckjoe
•define DZTOL 0.005
I Ubreries
rt - J Source
•define TUNE 0."Of/,
•define OETOL le-S
2 J ] D5PZ81x_CpuTimers.c
J ] D5P2fllxJ)er"«JUsr.c
J ] DSP28]x_GtabarVanabteDefs c
•define IMTJLAM 0.75
J ] D5P28ix_Gpto.c
^define INITJT B.25
j r ] DSP2filx_PieCtrr.c
J ] D5PZfllx_Pievect.c
float Q[cJV][NV];
J$ DSP281x_SysCb1.c
float c[NV];
Jtl man.t
float J[MC][rJV];
J ] D5P281x_r1eaders_nonB10S.cmd
float giMCJ;
i ] F2812.EzD5P_RAM_Hc.cmd
f l o a t l[UV):
lot I t e r ;
float Exectime;
[*! File View |/»Bookmadts )
[tmaln.c] " D : M : C S t u d i o _ v 3 . 3 \ C 2 0 0 0 \ c g t o o l s ^ i n V ; 1 2 0 0 0 " - 3 - f r " D : / C C S t u d i o _ v 3 . 3/MyProjectsÎPM/Debug" - i "D:/CCStur__J
[Linking...] "D:\CCStudio_v3.3sC2000\cgtaols\bin\cl2000" -@"Debug.lkf"

<Linking>
Build Complete,
0 E r r o r s , 0 W a r n i n g s , 0 Remarks.
umnvBujw/ ILiJJ iT
% • . H A L T E D : s/w Fie: D:KCStudto_v3.3\HyPro)*«sUPM\5owce\main.e Lnl.Cdl
Figure 3.5 CCS IDE interface
Complier and linker in CCS should be correctly configured for DSP applications. The
proper configurations of the complier and linker for F2812 eZdsp and C6713 DSK
running in real-time mode are shown in Figure 3.6 and 3.7.
31
Build Options for IPM.pjt (Debug) I ? | X I Build Options for IPM.pjt (Debug)
General Compile! '• Luker DspBtosE jiider | Link Order \ Geneial j Compile! Lriuar | OspBiosBulder ] Link Order j
g-ii"D:\aSludo_vi3\M)«o|ecK 1 JPM'>E)«>ja'
Category: Category
Target Vernon: |C28w(-v2B}jJ Suppress Banner [-q]
Libr anes
Generate Debug Info: JFul SymboSc Debug lg) 3 Advanced Output Module: |Absokrte Executable (a) 3
Opt Level: fNori Output Flename (o): \Deoug\IPM out
jd
Progjam Level Opt: None
~B Map Flename |-m]:
Autonrt Modet
\Debug\IPM map
Run-TneAutoriitiafeationt-c) g
Heap See (-heap):
Fat Heap Stack l-larhaapl: |
Stack Size (-stack). 10x400
F l Value (A |
Code Entry Point |-e>
Help Help
Figure 3.6 Compiler and linker options for F2812 eZdsp

Build Options for IPM.pjt (Debug) ? X ! 1 Huild Options for IPM.pjt (Debug)
General Comptei | Linker | DspBiosBuJder j Link Order |
Categoty: basic Category: Basic

Taiget Version: C671x(-mv6710) j j f~ Suppress Banner (-q|
Advanced
Feedback Generate Debug Into: [FUI Symbolic Debug J-g)_£j Output Module:
• zi
Fin Gpt Speed v* Size: j Speed Most Critical (no -ms) JLf Output Filename |-ot |ADebug\IPM.out
Awembly
Pais* Opt Level [None 3 Map Fiename (-ml: jADebugUPM map
Preprocessor
Diagnostics Program Level Opt.: [None 3 AutornilMooel ! Run-Time Automibaltzation ( oljj
Heap Size (-heapfc
|
Stack Size (stack): 10x400
F4 Value [I)
Code Entry Point ( - e l f
] He!c
J Help
Figure 3.7 Compiler and linker options for C6713 DSK
3.4 Matrix Operations on DSP
MPC is an optimization-based control scheme that requires abundant matrix

operations for the calculation of the optimal control moves. As a result, the main
computational bottleneck for QP implementation on DSP is matrix computations.
32
3.4.1 Matrix Multiplication

In C language programming, the FOR loop is used to perform a series of repeated
addition, subtraction, multiplication, division, element inversion and initialization. In
general, nested FOR loops are needed for arithmetic on two dimension matrices. The
source code shown below is for a (mc x n v ) x (n„ x 1) matrix multiplication
example using two different schemes.
#define M MC
#define R NV
#define N 1
f l o a t matC[M][N];
/ / Scheme 1 : using 2-dimentional array references

void m a t r i x M u l t i p l i c a t i o n l C f l o a t matA[M][R].float matB[R][N])
{
int i , j , k ;
f o r ( i = 0 ; i<M; i++)
{
for ( j = 0 ; j<N; j++)
{
m a t c [ i ] [ j ] = 0;
f o r (k=0; k<R; k++)
{
matC[i][j]+= matA[i][k] * matB[k][j];
}
}
}
}
// Scheme 2: using arrays of pointers

void matrixMultipiication2(float matA[M][R],float matB[R][N])
{
int i,j,k;
float *ap,*bp,*cp;
for (i=0; i<M; i++)
{
cp = matc[i];
bp = matB[0];
33
for (j=0; j<N; j++)

{
ap = matA[i];
*cp = (*ap++) * (*bp++);
for (k=l; k<R; k++)
{
*cp+= (*ap++) * CmatB[k][j]);
}
cp++;
}
}
}
This matrix multiplication requires mc x n„floating-pointarithmetic operations, with

each arithmetic operation requiring a cumulative 32x32 MAC operation.
The calculation required by the complier to access a particular element in a 2-

dimentional matrix (Scheme 1 in the code above) takes less instructions and less
execution time than accessing the same matrix with pointers (Scheme 2 in the above
code) when n„ is small. However, Scheme 2 is more efficient and runs faster than
Scheme 1 when nv is large, depending on the compiler and the speed of pointer
operations on the DSP. In this project, Scheme 1 is actually adopted for matrix
multiplication.
3.4.2 Matrix Inversion
As mentioned in Section 2.2, a linear system in the form of Ax — b has to be solved in

order to obtain the future input trajectory. But this internal system is almost always
overdetermined, so it is not expected to be solved exactly. Instead, it can be solved in
the LS sense. A theoretical approach to solving the overdetermined system begins by
multiplying both sides by AT:
ATAx = ATb (37)
This reduces the system to a square system. If ArA is nonsingular (or invertible), then
x = (ATAy1ATb (38)
34
In MATLAB this solution x is computed as

x = inv(A'*A)*A'*b
However, there are several undesirable aspects to this approach. Using a matrix
inverse to solve a system of equations is more work and less accurate than solving the
system directly by Gaussian or Gauss-Jordan elimination, especially when A is large.
In MATLAB, the solution to the linear equation Ax = b can also be found by

"backslash" operator:
x = A\b
Also, in Line 4 of both IPM and ASM algorithms, Az is derived by solving a linear
system. Since the matrix A of this linear system to be solved is square, the "backslash"
"operator employs Gaussian elimination method to obtain the solution for both IPM
and ASM.
The C-code for the linear equation solver using Gauss-Jordan elimination in this
project is shown below:
void s o l v e L i n E q ( f l o a t *A, f l o a t *b)
{
int i , i r , ic, j , jc;
i n t NUM, a c t s i z e ;
f l o a t temp, temp2;
NUM = NV;
a c t s i z e = NV;
f o r ( i = 0 ; i<NUM; i++)
{
i f ( A [ i * a c t s i z e + i ] == 0)
{
f o r ( j = i ; j<NUM; j++)
{
temp2 = f a b s ( A [ j * a c t s i z e + i ] ) ;
i f ( t e m p 2 >= GETOL)
{
f o r ( j c = 0 ; jc<NUM; jc++)
{
35
temp = A [ i * a c t s i z e + j c ] ;
A[i*actsize+jc] =
A[j*actsize+jc];
A [ j * a c t s i z e + j c ] = temp;
}
temp = b [ i ] ;
b[i] = b[j];
b [ j ] = temp;
break;
}
}
}
temp = A [ i + i * a c t s i z e ] ;
A [ i + i * a c t s i z e ] = 1.0;
f o r ( j = i + l ; j<NUM; j++)
{
A[i*actsize+j] = A[i*actsize+j]/temp;
}
b [ i ] = b[i]/temp;
f o r ( i r = 0 ; ir<NUM; i r + + )
{
if(ir != i )
{
temp = A [ i r * a c t s i z e + i ] ;
A [ i r * a c t s i z e + i ] = 0;
f o r ( i c = ( i + l ) ; ic<NUM; ic++)
{
A[ir*actsize+ic] = A[ir*actsize+ic]
- temp * A [ i * a c t s i z e + i c ] ;
}
b [ i r ] = b [ i r ] - temp * b [ i ] ;
}
}
}
}
Let Ma,Mm and Md denote the number of addition, multiplication and division
required for Gauss-Jordan elimination with pivoting in the C-code above. Notice that
36
floating-point subtraction operations are treated as addition operations. For n = 1,2,...

we have
Ma(n) = 0 . 5 n ( n - l ) ( n + l)
Mm(n) = 0 . 5 n ( n - l ) ( n + l)
Md{n) = 0.5n(n + 1)
3.5 Implementation of IPM Algorithm
In Line 11 of IPM algorithm, the optimal solution z, Lagrange multiplier A and slack
variable t are updated at every iteration of the optimization using the following
equations,
' Azk = (Q -yTr-V)_1(rifc -fr-1^)

AAfc = r^r? - r-x]Lzk (39)
k Atk = -tk + tCx(onke - TAXk)
The calculation computing these increments is implemented in M-code first, and
translated to C/C++ code for DSP realization and compiled and optimized in CCS
thereafter.
3.5.1 Formulation of IPM Implementation

1. Solving Azk
Rewrite equation (30),
Az* = «? -/T-VrVi* -JTr-lr£)

with
rk = -Jzk + g - anktCxe
For the convenience of implementation, we substitute rk = —Qzk — }TXk — c and

r = -(A k )- 1 7 k into (30),
Azk = 0? +JTAk(Tky1jy1(-Qzk -]TXk - c-JThk{Tkyxr^)
= (Q +JTAk(Tky1jr1 [~Qzk - c -JT(Ake - A^T")-1^)) (40)

37
2. Solving AAk
Rewrite equation (30),
AAk = r - 1 ^ - r~VAz k
Substituting r2k = -]zk + g- onkA~le and T = -(Aky1Tk,

AAk = -Ak(Tk)-1(-Jzli +9- o-MkA_1e - / A z k )
= -(Tky1(d-Jzk + g)Ak - anke - AkJAzk)
= -(Jky1(i-)zk + g)Ak - anke) + (Tky1AkJAzk (41)
3. Solving Atk
Recall equation (25),
Atk = - t k + A-V/* k e - TAX)
With (30), this equation can be rewritten as

Atk = - t k + # - y ( z k + Azk)
= -tk +g-Jzk-JAzk (42)
3.5.2 Implementing IPM Algorithm on MATLAB
MATLAB/Simulink software from The Mathworks provides an excellent computation

and simulation environment for designing and implementing control algorithms. The
MPC algorithm for obtaining the solution of QP problems is first prototyped in
MATLAB code and simulated and verified in the MATLAB/Simulink environment.
The following shows the M-code for solving QP problem with IPM in one iteration.
sigma = 0.25;
mc = size(g,1);
lambda = 0.75 * ones(mc,1);
t = sigma * ones(mc,l);
miu = t' * lambda / mc;
z = zeros(size(Q,1),1) ;
% E n d o f step 1 of IP Solver
38
% Step 2 of IP Solver
invt = ones(mc,1)./t;
vtmp = sigma * miu * ones(mc, 1);
% r2 = diag(lambda) * (-J * z + g) - vtmp;
sla = J * z;
sib = g - sla;
sic = diag(lambda) * sib;
r2 = sic - vtmp;
% A = Q + (J» * diag(invt)) * (diag(lambda) * J);

s2a = diag(invt) * diag(lambda);
s2b = J' * s2a;
s2c = s2b * J;
A = Q + s2c;
% dz = inv(A) * (-Q * z - c - J' * (lambda - diag(invt) *

r2) ) ;
s3a = diag(invt) * r2;
s3b = lambda - s3a;
s3c = J' * s3b;
s3d = -c - s3c;
s3e = Q * z;
s3f = s3d - s3e;
B = s3f;
dz = inv(A) * B;
% dz = solveLinEq(A, B ) ;
% dlambda = -invt.*(r2 - diag(lambda) * J * dz);

s4a = J * dz;
s4b = s2a * s4a;
dlambda = s4b - s3a;
% dt = -t + g - J * (z + dz);
s5a = sib - t;
dt = s5a - s4a;
% E n d o f s t e p 2 of IP Solver
3.5.3 Implementing IPM Algorithm on DSP

The prototype IPM in the form of M-code above is translated into C/C++ code for
DSP realization. The DSP implementation in this project is coded in C programming
language. IPM can be alternatively coded in C++ programming language as C++
compiler is provided in CCS. However, we found that the C++ program takes more
time to execute on DSP and much less efficient than C program.
#define I N I T _ T 0.25
#define INIT_LAM 0.75
/ / The f o l l o w i n g code i s p a r t i a l l y from void lPM_funct(void)
/ / Step 1 o f IP Solver
f o r ( i = 0 ; i<MC; i++)
{
lam[i] = INIT_LAM;
t [ i ] = INIT_T;
}
f o r ( j = 0 ; j<NV; j++)
{
z [ j ] = 0;
}
miu = INIT_LAM * INIT_T;
/ / Step 2 o f IP Solver
f o r ( i = 0 ; i<MC; i++)
{
invt[i] = l / t [ i ] ;
}
vtmp = I N I T _ T * miu;
for (ir=0; ir<MC; ir++)

{
tpvarl = 0 ; // sla
f o r ( i c = 0 ; ic<NV; ic++)
{
tpvarl = tpvarl + J [ i r ] [ i c ] * z [ i c ] ; / / sla
}
slb[ir] = g[ir] - tpvarl;
tpvar3 = l a m [ i r ] * s l b [ i r ] ; / / sic
r 2 [ i r ] = tpvar3 - vtmp;
f o r ( i r = 0 ; ir<MC; i r + + )
40
{
s2a[ir] = i n v t [ i r ] * lam[ir];
}
f o r ( j r = 0 ; jr<NV; j r + + )
{
f o r ( j c = 0 ; jc<MC; jc++)
{
s2b[jr][jc] = 3[jc][jr] * s2a[jc];
}
}
f o r ( j r = 0 ; jr<NV; j r + + )
{
for (ic=0; ic<NV; ic++)
{
tpvarl=0;
for(jc=0; jc<MC; jc++)
{
tpvarl = tpvarl + s2b[jr][jc] * J[jc] [ i c ] ;
}
s2c[jr][ic] = tpvarl;
}
}
f o r ( i r = 0 ; ir<NV; i r++)
{
f o r ( i c = 0 ; ic<NV; ic++)
{
matA[ir][ic] = Q[ir][ic] + s2c[ir][ic];
}
}
f o r ( i r = 0 ; ir<MC; i r + + )
{
s3a[ir] = i n v t [ i r ] * r 2 [ i r ] ;
s3b[ir] = lam[ir] - s3a[ir];
}
f o r ( j r = 0 ; jr<NV; j r + + )
{
tpvarl = 0;
{
t p v a r l = t p v a r l + 3 [ j c ] [ j r ] * s3b[jc]; / / s3c
}
s3d[jr] = - c [ j r ] - t p v a r l ;
}
f o r ( j r = 0 ; jr<NV; jr++)
{
tpvar2 = 0;
for(jc=0; jc<NV; jc++)

{
tpvar2 = tpvar2 + Q [ j r ] [ j c ] * z [ j c ] ; / / s3e
}
s3f[jr] = s3d[jr] - tpvar2; // matB
}
solveLinEq(matA[0], s 3 f ) ; / / Line 4
f o r ( i r = 0 ; ir<NV; ir++)
{
dz[ir] = s3f[ir];
}
f o r ( i r = 0 ; ir<MC; ir++)
{
tpvarl = 0;
for(ic=0; ic<NV; ic++)

{
t p v a r l = t p v a r l + J [ i r ] [ i c ] * d z [ i c ] ; / / s4a
}
tpvar2 = s2a[ir] * tpvarl; // s4b

dlam[ir] = tpvar2 - s3a[ir];
tpvar3 = slb[ir] - t[irl; // s5a

42
d t [ i r ] = tpvar3 - t p v a r l ;
}
3.5.4 Computational Complexity of IPM Algorithm
MPC relies on solving constrained optimization problems on-line. The computational

load for such problems is much greater than that for traditional control algorithms.
Computational complexity consequently can become an issue when applying MPC to
complex systems with fast response time or to embedded applications where
computational resource is a major constraint.
Table 3.3 Worst case computation complexity for IPM algorithm
IPM algorithm Number of operations per iteration
Addition nl(mc + 1) + n„(3mc + 1) + 51mc + Ma(nv)
Multiplication nllmc + 1) + n„(4mc + 1) + 27mc + 2 + Mm(n„)
Division mc + 1 + Md{nv)
The expressions in Table 3.3 (see Appendix C for details) are derived for the worst
case scenarios and give the number of floating-point arithmetic operations executed
by one iteration of the IPM algorithm, parameterized by the number of decision
variables nv, and the number of inequality constraints mc. In this analysis, we
separated out the number of arithmetic operations for the solving of the linear system.
3.5.5 Storage Requirement of IPM Algorithm
For IPM algorithm, the number of 32-bit floating-point variables needed to be stored
on DSP is:
Storage = 3n£ + n„(2m c + 5) + l l m c + 14
Note that the storage given in the above equation does not include variables used for
variable swapping, loop indices, counters and flags.
3.6 Implementation of ASM Algorithm
3.6.1 Formulation of ASM Implementation
For the convenience of implementation, here we rearrange equation (35) in step 2 of

ASM algorithm to find (Azk, A) by solving a linear system in the form of Ax = b,
43
-1
\Azk] Q Jwk -c - Qzk
[ X\ Jwk 0 [ 0 J
<2o ô
3.6.2 Implementing ASM Algorithm on MATLAB

The M-codes for implementing ASM is partially presented below. In this project we
use the "backslash" operator to solve (43).
% Step 1 of AS Solver
nv = size(J,2);
mc = size(J,1);
z = zeros(nv,1);
% End of Step 1 of AS Solver
% Step 2 of AS Solver
Wk = [];
JWk = J(Wk,:);
mwork = length(Wk);
ck = c + Q*z;
QO = [Q J W k ' ; JWk z e r o s ( m w o r k , m w o r k ) ] ;
PO = [ - c k ; z e r o s ( m w o r k , 1 ) ] ;
d z l a m = QO\PO;
dz = d z l a m ( l : n v ) ;
lambda = dzlam(nv+1:end) ;
% End of S t e p 2 of AS S o l v e r
3.6.3 Implementing ASM Algorithm on DSP

The ASM implementation on DSP is also coded in C language. Note that the M-codes
in Section 3.6.2 and the following C-codes are only used for solving (Az^.A) and not
representative of how active set works in the algorithm. Refer to Appendix B for the
complete codes of ASM implementation.
/ / The f o l l o w i n g code i s p a r t i a l l y from v o i d ASM_funct(void)
/ / Step 1 of AS Solver
f o r ( j = 0 ; j<MC; j++)
{
lambda[j] = 0;
w k [ j ] = 0;
Dk[j] = 0;
}
/ / Step 2 of IP solver
f o r ( i r = 0 ; ir<NV; i r + + )
{
temp2 = 0;
f o r ( i c = 0 ; ic<NV; ic++)
{
tempi = Q [ i r ] [ i c ] * z [ i c ] ;
temp2 = temp2 + t e m p i ;
}
c k [ i r ] = c [ i r ] + temp2;
}
countAct=0;
for(j=0; j<MC; j++)
{
if(Wk[j]==l)
countAct++;
}
f o r ( i r = 0 ; ir<NV; i r + + )
{
f o r ( i c = 0 ; ic<NV; ic++)
{
QO[ir][ic] = Q [ i r ] [ i c ] ;
}
PO[ir] = - c k [ i r ] ;
}
temp3 = 0;
f o r ( j = 0 ; j<MC; j++)
{
if(wk[j]==l)
{
f o r ( j c = 0 ; jc<NV; jc++)
{
QO[NV+temp3][jc] = J [ j ] [ j c ] ;
QO[jc][NV+temp3] = J [ j J [ j c ] ;
45
}
P0[NV+temp3] = 0;
temp3++;
}
}
temp2 = 0;
f o r ( j r = N V ; jr<(countAct+NV); jr++)
{
f o r ( j c = N V ; jc<(countAct+NV); jc++)
{
QO[jr][jc] = 0;
}
}
solveLinEq(QO[0], PO, countAct, NV, NV+MC); / / Line 4
f o r ( i = 0 ; i<NV; i++)
{
dz[i] = PO[i];
}
temp3 = 0;
f o r ( j = 0 ; j<MC; j++)
{
i f Cwk[j] == 1)
{
lambda[j] = PO[NV+temp3];
temp3++;
}
else
lambda[j] = 0;
}
3.6.4 Computational Complexity of ASM Algorithm

Table 3.4 shows the number of floating-point arithmetic operations executed in one
iteration of the ASM algorithm. Refer to Appendix C for details.
46
Since the constraints associated with the minimum and maximum cannot be active
simultaneously, the total number of active constraints cannot exceed mc/2. Again, the
analysis in Table 3.4 is based on the worst case scenarios for ASM algorithm. For
example, we assume that the number of active constraints in the ASM algorithm is
always m c / 2 so that the matrix Jwk in Line 4 always hasm c /2 x n v entries. But, in
reality ]wk may have a smaller size.
Table 3.4 Worst case computation complexity for ASM algorithm
ASM algorithm Number of operations per iteration
Addition n£ + nv(2mc + 1) - mc+Ma(nv + 0.5m c )
Multiplication n$ + nv(2mc + 1) + Mm(nv + 0.5m c )
Division mc + Md(nv + Q.5mc)
3.6.5 Storage Requirement of ASM Algorithm
The number of 32-bit floating-point variables needed to be stored on DSP for

implementing ASM is:
Storage = 2n£ + n v (3m c + 5) + ml + 2m c + 6
3.7 Testing QP problems
3.7.1 Examining a Single QP problem
For a single QP problem, we are interested to investigate its execution time, number
of iterations and storage usage of the computation.
Timing performance can be measured by hardware timer on DSP or the profile clock
in CCS. The timer value can be read to get a measurement of clock counts and
execution time.
CPU-Timer 0 is for general use on F2812. This timer is a 32-bit countdown counter
which is decremented at the CPU clock speed divided by the prescale value setting. In
this project the counter frequency is set to be equal to F2812 CPU clock (150 MHz).
# i n c l u d e "DSP281x_Device.h" / / DSP281x H e a d e r f i l e include F i l e

#include <stdio.h>
# i n c l u d e <math.h>
#define NV 3
47
#define MC 31
#define MAXITER 100
float Q[NV][NV];
float c[NV];
float J[MC][NV];
float g[MC];
f l o a t z[NV]; / / solution
int iter; / / number of i t e r a t i o n s
f l o a t Exectime; / / execution time
void Timing(void)
{
long t i m e r v a l u e ;
timervalue = CpuTimerORegs.TIM.all;
timervalue = OXFFFFFFFF - t i m e r v a l u e ;
Exectime = timervalue / 150; / / us
}
v o i d main ( v o i d )
{
InitSysCtrlO;
initCpuTimersO ;
StartCpuTimerOO;
I P M _ f u n c t ( ) ; / / or ASM_funct();
StopCpuTimerOO ;
TimingO ;
}
There are two 32-Bit general-purpose Timers on C6713, Timer 0 and Timer 1. In this
project, Timer 1 is adopted to measure clock counts and computational time, operating
at a quarter of C6713 CPU clock.
/ / #include " c 6 7 1 3 d s k i n i t . h "
#include <stdio.h>
#include <math.h>
#include " c s l . h "
#include " c s l _ t i m e r . h "
#define NV 3
#defnne MC 31
#define MAXITER 100
f l o a t Q[NV][NV];
f l o a t c[NV];
float :[MC][NV];
f l o a t g[MC];
f l o a t z[NV]; / / solution
int Iter; / / number of iterations
f l o a t Exectime; / / execution time
s t a t i c Uint32 Timercontrol = TIMER_CTL_RMK(

TIMER_CTL_INVINP_NO,
TIMER_CTL_CLKSRC_CPUOVR4, // CPU clock / 4
TIMER_CTL_CP_PULSE,
TIMER_CTL_HLD_YES,
TIMER_CTL_GO_NO,
TIMER_CTL_PWID_ONE,
TIMER_CTL_DATOUT_0,
TIMER_CTL_INVOUT_NO,
TIMER_CTL_FUNC_GPIO
);
s t a t i c TlMER_Handle hTimerl;
void main(void)
{
long timervalue = 0;
long timervalue_start = 0;
long timervalue_end = 0;
void TimerEventHandler(void);
extern far void vectors();
hTimerl = TIMER_open(TIMER_DEVl, TIMER_OPEN_RESET);
TlMER_confi gArgs(hTi merl,
Timercontrol, // use predefined control value
OxFFFFFFFF, // set period
0x00000000 // start count value at zero
);
TlMER_start(hTimerl);
49
t i m e r v a l u e _ s t a r t = TIMER_getCount(hTimerl);
lPM_funct(); / / or ASM_funct();
timervalue_end = TIMER_getCount(hTimerl);
timervalue = timervalue_end - t i m e r v a l u e _ s t a r t ;
Exectime = timervalue / 56.25; / / us
TlMER_pause(hTimerl);
TiMER_close(hTimerl);
}
Memory usage configuration of every section is amply reported by .map file created
by CCS.
3.7.2 Test Methodology
One hundred QP problems for each set of

{nvlmc) e{3,5,7 13} x {31,41,51,..., 111}
are randomly generated on a PC using MATLAB by the method proposed in [17] and
saved as a .mat file consisting of matrices Q, c,J and g.
QP problems generated by this method always have the solutions (1,1, ...,1) and a
feasible point (0,0, ...,0). Therefore, the initial point z° = (0,0, ...,0) is used for both
of the IPM and ASM algorithms.
Also, the number of active constraints at the solution never exceeds the number of
decision variables. This is to ensure that the Lagrangian multipliers always exist at the
solution. Finally, the QP problems are scaled such that the eigenvalues of Q lie in the
range from 0.5 to 20.
Due to the large number of the test sets (5400), an automatic test methodology is
introduced in this section to substantially streamline testing process and reduce testing
time.
In this project we use a MATLAB component Embedded IDE Link CC 3 to carry out
our test methodology. This component provides a collection of methods that use the
application program interface (API) of CCS to communicate between MATLAB
software and a processor in CCS to build a connection.
50
Start testing
Set up connection
between PC and DSP
Build and load CCS project

of solving QP problems
Initialize Q, c,J and g on DSP by

sending the matrices from PC to CCS
Run DSP program to

solve a single QP problem
NO
Read solution z, computational time and

number of iterations from CCS to PC
NO
Figure 3.8 Debugging flowchart
With Embedded IDE Link CC, we can use MATLAB and/or Simulink software to
interactively analyze, profile and debug the execution behavior of processor-specific
code for solving within CCS. Figure 3.8 shows how Embedded IDE Link CC is used
to verify QP problems in this project.
51
Step 1:
Set up connection between PC and DSP by entering the following command at the
MATLAB software prompt:
ccsboardinfo
A list of the processors and boards that CCS recognizes as installed on the PC is
returned. Alternatively, the following code can be called to set up:
[boardnum,procnum] = boardprocsel;
MATLAB returns a window (see Figure 3.9) with the information of the processors
and boards installed on the PC.
SelectingBoardnum Et Procnum C j H Pk) I •> Selecting Boardnum a Procnum
Code Composer Studio(tm) has configured a single processsor Code Composer Studio(tm) has configured a single processsor
Boa>dF2812eZdsp BoardC8?13DSK
Proc: cpu_0 Proc: C P U J
Type: TMS320F280C Type: T M S 3 2 0 C 6 K 1 X
This device w i be used for the Code Composer Link T his device wJ be used for the Code Composer Link
OK CK
Figure 3.9 Information of the processors and boards
Issue the command "ticcs" to set up the connection between MATLAB and CCS that
refers to selected board and processor:
cc = ticcs('boardnum',boardnum,'procnum',procnum);
This connection is a MATLAB link object that is saved as a variable "cc" and directs
actions in MATLAB to the processor. If CCS was not running before creating the
object, CCS launches and runs in the background. Enter the code below to force CCS
to be visible on the PC.
visible(cc,1);
Step 2:
A connection to a processor (F2812 or C6713) and board (F2812 eZdsp or C6713

DSK) has been established. Now the CCS project of IPM or ASM is needed to be
built and loaded either in CCS IDE or MATLAB with the following commands.
52
[filename, pathname] = uigetfile('*.pjt','select •pjt file');

% projfile.= fullfile(filename,pathname);
% projpath = fileparts(projfile);
cd(cc,pathname);
open(cc,filename); % open the project
build(cc,'all'); % build in CCS
[filename, pathname] - uigetfile('*.out','select .out file');
cd(cc,pathname);
load(cc,filename); % load C program
Step 3:
Initialize global arrays Q, c,J and g in the C program on DSP by sending the matrices
from MATLAB to CCS.
The CCS is able to work in MATLAB environment and accesses the variables
declared in the C program through the "createobj" command. Command "createobj" is
such a method supported by Embedded IDE Link CC that creates a MATLAB
embedded object to represent a C variable in the program in CCS. Variables can be
read or modified in a C-like way working with the embedded object that "createobj"
returns. Here we create embedded objects for global arrays Q, c,J, g and variable "z",
"Exectime" and "Iter" in the C program.
Q_obj = c r e a t e o b j ( c c , ' Q ' ) ;
c_obj = c r e a t e o b j ( c c , ' c ' ) ;
J_obj = c r e a t e o b j ( c c , ' J ' ) ;
g_obj = c r e a t e o b j ( c c , ' g ' ) ;
z_obj - c r e a t e o b j (cc, ' z ' ) ;
Exectime_obj = c r e a t e o b j ( c c , ' E x e c t i m e ' ) ;
Iter_obj = createobj(cc, ' Iter ' ) ;
Matrices Q,c,J andg are loaded from MATLAB workspace in form of .mat files to
CCS by "write" function.
write(Q_obj,Q);
write(c_obj,c);
write(J_obj,J);
write(g_obj,g);
Step 4:
Run the DSP program to solve a single QP problem.

run(cc);
while (isrunning(cc) == 1);
53
end
Step 5:
Work with the CCS project from MATLAB software by reading results ("z",
"Exectime" and "Iter") from the processor.
iteration = read(Iter_obj);
z_ip = read(z_obj);
executiontime = read(Exectime_obj);
Step 6:
Compare the DSP solution of the QP problem with MATLAB QP solver "quadprog".
If the difference between the two solutions is within a tolerance we have defined, e.g.
0.001, we consider the DSP solution is accurate. Then we get the number of satisfied
solution per 100 QP problems, average computational time and average number of
iterations.
Step 7:
Objects that created by Embedded IDE Link CC software have COM handles to CCS.
Closing MATLAB software removes these COM handles automatically. Alternatively,
project closes in CCS with the following command.
close(cc,projfile,'project')
The following table shows the sample test suite for DSP implementation of constraint
MPC.
clear all
TOL = le-3; i acceptable accuracy
N = 100; % how many test sets
nv = 3; % size of QP
mc = 31;
% set up communication between MATLAB and CCS
[boardnum,procnum] = boardprocsel;
cc = ticcs('boardnum',boardnum,'procnum',procnum);
[filename, pathname] = uigetfile('*.mat','select QcJg.mat');

cd(pathname);
Q obj = createobj(cc,'Q');
z_obj = createobj(cc, ' z');

Exectime_obj = createobj(cc,'Exectime');
Iter_obj = createobj(cc, 'Iter');
for i= 1:N
restart(cc); % restart the program on DSP
% move the program counter to the beginning of main
goto(cc,'main');
run (cc, 'main');
load(filename); % load Q,c,J,g matrices to MATLAB

% write values to variables Q, c, J and g in C code
write(Q_obj,Q);
run(cc); % start MPC algorithm performed on DSP

% wait for the QP problem solved
while (isrunning(cc) == 1);
end
i save solution z from CCS to MATLAB
z_ip = read(z_obj);
for j = l:nv
z (N,j) = z_ip(j);
end
% save execution time from CCS to MATLAB
executiontime = read(Exectime_obj);
% save number of iterations from CCS to MATLAB
iteration = read(Iter_obj);
z_qp = quadprog(Q,c,J,g); % use MATLAB QP solver

err - max(abs(z_qp - z_ip));
if (err < TOL)
NCorrect = NCorrect + 1;
exectime_all = exectime_all + executiontime;
iter_all = iter_all + iteration;
end
filename = filename + 1; % next QcJg.mat
end
% average execution time

55
exectime_aver = exectime_all/NCorrect;
iter_aver = iter_all/NCorrect; % average number of iteration
It can be seen that with Embedded IDE Link CC, QP problem verification is
automated simply by writing MATLAB software scripts.
56
4. Simulation Results
4.1 Convergence Speed
The performance we pay most attention to among simulation results is the

convergence speed for obtaining the solution to QP problems using both of IPM and
ASM. For each QP problem, the computational time and number of iterations which
reflect the convergence speed are recorded. The average computational time is given
by the total computational time of accurately solved QP problems divided by the
number of successes in each one hundred QP problems. The average number of
iterations is computed in the same fashion.
4.1.1 Computational Time

First we compare the average computational time between fixed-point and floating-
point DSP implementation. Figure 4.1 represents the average computational time, as a
function ofm c andn„, the IPM algorithm spends on fixed-point DSP and Figure 4.2
shows the ASM one. The average computational time of IPM algorithm in Figure 4.1
and 4.2 appears to increase linearly with m c at a fixed nv or n„ at a fixed m c .
Computational time as a function of mc on Interior Pant Method Computational time at a function of nv on Interior Point Method
Number of Constraint* mc Number of Decision Variable* nv
Figure 4.1 Computational time per QP problem using IPM on TMS320F2812

57
Computational time as a function of mc on Active Set Method Computational time as a (unction of nv on Active Set Method
30 40 50 60 70 80 90 100 110 120

Number ol Constraints mc
Figure 4.2 Computational time per QP problem using ASM on TMS320F2812
Figure 4.3 and 4.4 shows the computational time of C6713 implementation. It can be
evidently seen that C6713 DSP takes great advantage of its floating-point architecture
and higher CPU frequency and therefore performs much faster operations than F2812.
Also, since the on-chip RAM of C6713 is used (rather than the off-chip RAM of
F2812), there is no significant delay when accessing data. As a result, DSP with
advanced architecture is more appropriate for real-time industrial applications of MPC
where the computational speed is substantially considered.
Computational time as a function o( mc on Interiof Point Method Computational time as a function of nvon Interior Point Method
Number of Decision. Variables nv
Figure 4.3 Computational time per QP problem using IPM on TMS320C6713

58
Computation* urns a t a tunctktn ol mc on Active Set Method Computational time as a (unction 01 mac Active Set Method
Figure 4.4 Computational time per Q? problem using ASM on TMS320C6713
ASM (Figure 4.2 and 4.4) takes a shorter computational time than IPM (Figure 4.1
and 4.3) when the number of decision variables nv is small. However, as nv grows,
the computational time ASM requires increases faster than IPM and therefore is
longer than IPM after nv reaches a certain value.
If we ignore the arithmetic operations for solving the linear systems in both IPM and
ASM algorithms, the IPM algorithm requires approximately nv2mc + 50m c more
additions and nv2mc + 27 mc more multiplications than the ASM algorithm. The
extra operations are used for matrix multiplications in forming the linear system in
Line 4 of the IPM algorithm (i.e., forming the matrix/ 7 /" - 1 / and vectors rx andr 2 ).
Also, recall the number of addition, multiplication and division required for Gauss-
Jordan elimination with pivoting discussed in Section 3.4.2.
Ma(n) = 0 . 5 n ( n - l ) ( n + l )
Mm(n) = 0 . 5 n ( n - l ) ( n + l )
Md(n) = 0.5n(n + 1)
ASM may need to solve a system of nv + mc/2 linear equations, i.e. n = nv + mc/2.
By contrast, IPM only needs to solve a system of nv equations. The number of
operations required by IPM algorithm to solve a linear system is less sensitive to nv
than that required by ASM algorithm. For the reasons above, as nv grows, the
computational load for ASM increases faster than IPM.
59
4.1.2 Number of Iterations
Next we examine the average number of iterations required to solve a single QP

problem.
For IPM algorithm (Figure 4.5 and 4.7), the average number of iterations lies between
12 and 16 for most values of (n„,m c ) and is insensitive to nv andm c . For ASM
(Figure 4.6 and 4.8) algorithm, the average number of iteration grows linearly with nv
andm c , and almost reaches 55 for the largest QP problem size. From the above
comparison we conclude that IPM has a better scalability than ASM, as the number of
inequality constraints mc and decision variables nv weakly affect the number of
iterations required by the IPM to solve one QP problem.
There is only a small distinction between the fixed-point and floating-point DSP
performance on the number of iterations. This is due to the different accuracy between
fixed-point and floating-point representation. The loss of precision on fixed-point DSP
forces an earlier termination of the iteration as compared to the floating-point DSP.
Average number ot iterations as a unction of m C on Interior Point Method number ol iteratkine M • function of nv on Interior Point Method
—•— nv*3
nv«5 - t-
• • - » • - n*«7 - - -A- - - - -'
& 15.5
• nv=11
nv=9
" _ L - _ i_ _ _ L
\-R\\--\
—*—m*13
v - § i •V L
;/: *•- " -V ""
13.5
-£--\/ *
• -.^-^y - - - ,
r~ r
1/ r
r
/A
1
13 - r . . ,-
12 5 - *•
100 110 120

Number o* Constraint• rr>c
Figure 4.5 Average number of iterations using IPM on TMS320F2812

60
Average number of iterations as a function olmcwi Active Set Method
Figure 4.6 Average number of iterations using ASM on TMS320F2812

Average number of iterations as a function of mc on Interior Point Method Average number of iterations as a function of nv on Interior Point Method
— • — nv=3
nv=5
• nv=7
rtv=9
• nv=11
— * — nv=13
I 13
30 40 50 60 70 80 90 100 110 120

Number of Constraints mc
Figure 4.7 Average number of iterations using IPM on TMS320C6713

Average number of iterations as a function of mc on Active Set Method Average number of iterations as a function of nv on Active Set Method
— A — mc=31
mcMI
—W~ - mc=51
1 mc=81
- mc=7l
S 40 — 4 — mc*B1
I • mc=91
••• W W I
- mc«111
10 11 12 13
Number of Decision Variables nv
Figure 4.8 Average number of iterations using ASM on TMS320C6713

61
4.2 Storage
Table 4.1 shows the storage usage of F2812 processor for solving the QP problem of
size mc = 111 and n„ = 13. Mapping of the variables to the sections are done
automatically by the C compiler of CCS. In Table 4.1 we only take storage usage
of .cinit, .ebss and .stack sections into account.
Table 4.1 Storage usage for solving a QP problem using IPM and ASM
1PM ASM
29426B 81258B
As seen from Table, the ASM algorithm requires more storage, since it needs to solve
a system of nv + mc/2 linear equations. For the largest generated QP problem, the
memory requirement is around 80 KB, which greatly exceeds the storage of on-chip
RAM on F2812 (36 KB). In contrast, the IPM algorithm only needs to solve a system
of nv linear equations. Therefore, the ASM algorithm needs a larger memory for
storing the matrices that are needed during the solving of the linear equations.
4.3 Numerical Error
The DSP implementation does not always give the accurate solution mainly due to the
finite wordlength effects of embedded microprocessors. The solutions that return from
DSP are compared to the results that are solved by "quadprog" function in MATLAB.
The DSP results within ±0.001 tolerances are acceptable.
Most commonly, the round-off error, which is the difference between the calculated
approximation of a number and its exact mathematical value, occurs and propagates
through the iterations during the DSP implementation, leading IPM or ASM to
terminate at a point where the objective function value deviates severely from the
optimal value.
Note that not all the inaccurate solutions are caused by round-off error. In some
occasions of our DSP implementation, overflow occurs which eventually cause
unboundedness of the iterations or instability to the IPM or ASM algorithms.
62
Based on fixed-point processor F2812, for 19 out of 5400 randomly generated QP

problems, the IPM algorithm results in an unbounded zk. An explanation for the
failure cases is that l/tk has to be computed in order to form the inverse of T in each
k
iteration. Recall that t with i = 1,2, ...,m c is the slack variable for the ith inequality
constraint. Hence, if the ith constraint is active at the solution, tk would approach to
zero as iteration goes on. Then taking the inverse 1/tf may exceed the capacity of the
wordlength of the fixed-point representation and therefore cause severe numerical
error. Implementation of IPM on single precision systems and the associated difficulty
of numerical error have been mentioned in [28].
Compared to the IPM algorithm, the failure rate for the ASM algorithm is lower,
which is about 18 out of 5400 QP problems for fixed-point solution. Again, the failure
cases are mainly due to overflow, which usually occurs when the linear system is
solved for a very small Azk. Unfortunately, the convergence of the ASM algorithm
indeed requires Azk being close to a zero vector, which potentially leads to failure.
For C6713 implementation, the numerical error appears less frequent (13 out of 5400
for IPM algorithm and none is detected for ASM algorithm) because the maximum
representable number is enormously larger.
63
5. Aircraft Case Study
5.1 Introduction
To verify the DSP implementation of MPC, the Cessna Citation 500 Aircraft Model
from Example 2.7 in [2] is currently examined as the actual controlled system by the
embedded model predictive controller implemented on TMS320F2812 and
TMS320C6713. In this simulation we only verify Case 2 and Case 3 of Example 2.7
in [2].
5.2 Aircraft Model
The aircraft model has the following continuous-time state-space form

-1.2822 0 0.98 °1 -0.3"
0 0 1 0 0
A = -5.4293 ,B =
0 -1.8366 0 -17
-128.2 128.2 0 0. . 0 .
0 1 0 0 0
C = 0 0 0 1 ,D = 0
.-128.2 128.2 0 0. .0.
The model has the elevator angle (rad) as its only input, and the pitch angle (rad),
altitude (m) and altitude rate (m/sec) as outputs. The elevator angle is limited to ±15°
(±0.262 rad), and the elevator slew rate is limited to ±307sec (±0.524 rad/sec). These
are limits imposed by the equipment design and cannot be exceeded. For passenger
comfort the pitch angle is limit to ±20° (±0.349 rad).
A MPC controller is designed with sampling interval of 0.5 sec, prediction horizon NP
= 10 and control horizon Nu = 3. The following constraints
ju| < 0.262, \Au\ < 0.524, and \yl\ < 0.349
are also included. This translates to a QP problem of 3 unknowns (mc = 3) with 32

constraints (nv = 32) that the DSP has to compute on-line at every sampling time.
The aircraft model is simulated in MATLAB/Simulink on a PC and controlled by the

DSP implementation of QP solver on TMS320F2812 or TMS320C6713 prototyping
64
board (see Figure 5.1). The plant and controller interact through parallel port (for
F2812) or USB cable (for C6713).
Host (MATLAB platform) Target (DSP board)

__ Parallel port / USB ^
Test Suite DSP
Running implemented Running plant model

constrained MPC in MATLAB
controller on DSP
Figure 5.1 Simulation configuration
5.3 Simulation Results
5.3.1 System Response

First we examine the response of the aircraft system to a step change of 400m in the
altitude set-point at time 0, with set-points for pitch angle and altitude rate held at 0,
and the reference trajectory equal to the set-point: r(k + i\k) — [0,400,0] r fork > 0
(Case 2). Note that in this case the altitude rate was unconstrained. Figure 5.2 shows
the response of Case 2 derived from IPM implementing on F2812, corresponds to
Figure 2.7 in [2].
From Figure 5.2, it can be seen that the pitch angle constraint is active for much of the
transient, which results in the altitude rate being constant for part of the transient.
Next, a constraint on the altitude rate with a limit of 30m/sec is introduced and the
number of constraints is now increased to 52 (Case 3). Again, the IPM algorithm
realized on F2812 is able to solve the QP problem without violating the given
constraints. Figure 5.3 shows the response to 400m step change in altitude set-point,
with the added constraint, corresponds to Figure 2.8 in [2]. The altitude rate constraint
is active during most of the control process while the pitch angle constraint is only
active at the beginning of the process. The pitch angle constraint is briefly active near
the beginning of the transient, but that the altitude rate constraint is active for most of
the time.
Pitch angle and constraint Altitude and set-point

!
20 , 400
/ A 1:.
fi
g- 300
9
2 . 10
•
"S 200
fi
0 !1 { <
100 •--/{-----j
'
-5 0
10 15 20 0 10 15 20
Time (sec) Time (sec)
Altitude rate and constraint Elevator angle and constraint
-. 50 - - - - - J. t -r- — ~ -
<u in
| 40
e
vator a ngle (d
f 30
4 rk.
r^
C3
2 20 .!...[..
/•+ .4...
„ . : \
•
3
I 10
1 LU
V_
5 10 15 20 5 10 15 20
Figure 5.2 Response of Case 2 with IPM algorithm

20 400
h'
', •
_ 15
9
2- 10 j —
i 300
\--j/-\- -
s
-c 200
x± 5
0-
0 ; \. 100 ..../..-.A :
/ ,
-5 0 —£_ 1 1 1
10 15 20 > 5 10 15 20
-& 20
I L..i
u - -
\ !
I !
- 1 " . •
i•
v
g
10
o p-i uu
i:
_ . „ „ \ J - - o ' > i
ra
; i ^ j -10 .-- -T , ,--
i ' i
ill
5 10 15 20 5 10 15 20
Figure 5.3 Response of Case 3 with IPM algorithm

20 400 ly-
/
1 I 1
15
I , 1, -§• 300
•' i .
• 7 i
10 • /
o i /
| 200 • , - - - - .
5l . . . . L' i
; \ 3
100
0 / >
-5 n
10 15 20 10 15 20
50 _ i _ L —
40 10
30
20 / I
a>
c _^J^__
o
ro
10
0 f-'r-h" UJ
-10
1
~ T ~
' »
!
1
5 10 15 20 5 10 15 20
Figure 5.4 Response o f Case 2 with A S M algorithm
— —; J_. 1 -
400
/V 300
y-
I 1
:
_ L
It.Vj o A
I : : \i
"S 200
100 -/
/
0
10 15 20 0 5 10 15 20
\ 30 10
\ :
1s 20 /
f pJ U^
i i
(U f 1
•*->
5 i
0) 10
-o i ro
&
3
~ 0f
<
i
1 -10
I
5 10 15 20 5 10 15 20
Figure 5.5 Response of Case 3 with ASM algorithm

67
The MPC controller using ASM algorithm as the QP solver gives almost the same
response for Case 2 and Case 3 as compared with IPM algorithm (see Figure 5.4 and
5.5). However, it requires less computational time as compared with that using IPM
algorithm, which is introduced in the next section.
5.3.2 Convergence Speed
Figure 5.6 presents the number of iterations and computational time recorded at every
sampling instant IPM requires for solving the aircraft QP problem on F2812 operating
at 150 MHz. For Case 2, it can be seen that when the pitch angle constraint becomes
active from 4 sec to 9 sec, both the number of iterations and computational time
increase dramatically. After the transient they fall to a relatively low number. The
most difficult QP problem occurred at about 5 sec. For Case 3, the most difficult QP
problem happens at 9 sec, which takes more than 30 iterations to solve. ASM takes
less computation cost to solve the aircraft QP problem and runs about 14 and 10 times
faster in Case 2 and Case 3, respectively (see Figure 5.7).
For floating-point DSP C6713, the clock frequency for both implementations of the
IPM and ASM algorithms are set to 225 MHz.
Comparing the results of F2812 and C6713 implementations, it can be seen that
although the trends are similar, the execution time needed for the latter was reduced
tremendously (see Figure 5.8 and 5.9). C6713 gives a better result not only because of
a higher clock frequency but, more importantly, its architecture.
The aircraft model simulation again proves that ASM is much more effective than
IPM to handle the QP problem and therefore offers time saving when the number of
decision variables is small.
Interior Point (Case 2) Interior Point (Case 3)
30 _; „ . A :.
14 . J LL
a . 25 * -/ -\ -J
c
§ 12 , ^ L 1 J 1-. --. o
2 | 20
9
- 10 / L
£
* ~ i A A A i 15
8 1 _jV- ___!_. V 10 y v y ^ _ l j _ \
10 15 20 5 10 15 20
x -jo Interior Point (Case 2) 105lnterior Point (Case 3)
: 'VIEE:
V
J\S \ \ !
10 5 10 15 20
Figure 5.6 Iterations and computational time of IPM algorithm on F2812
Active Set (Case 2) Active Set (Case 3)
V
J .
PG • ' v \J
0 5 10 15 20
Active Set (Case 2) x io" Active Set (Case 3)
w
3- 3
o
E
52
CO
o J \/v ; I rj\ ;
11
Q.
E
5 10 15 20 8 ol 5 10 15 20
Figure 5.7 Iterations and computational time of ASM algorithm on F2812

5 10 15 20 5 10 15 20
5 10 15 5 10 15 20
Figure 5.8 Iterations and computational time of IPM algorithm on C6713
15 1 1 1
1"
: v \J
5 10 15 20 5 10 15 20
5 10 15 5 10 15 20
Figure 5.9 Iterations and computational time of ASM algorithm on C6713

70
Table 5.1 lists the average computation time for solving a QP problem in the aircraft
example.
Table 5.1 Average computational time per aircraft QP problem
Aircraft QP problem IPM ASM
F2812 Case 2 47.20 ms 33.72 ms
(150 MHz) Case 3 84.66 ms 82.12 ms
C6713 Case 2 2.54 ms 0.36 ms
(225 MHz) Case 3 4.91 ms 0.72 ms
5.3.3 Storage
The required memory size can be estimated by the equations given in Section 3.5.5
and 3.6.5. Table 5.2 shows the storage usage of F2812 implementation for both Case 2
and Case 3. Again, we only consider the usage of .cinit, .ebss and .stack sections in
F2812 processor. Because the ASM algorithm needs to solve linear systems of larger
size, more RAM storage are used.
Table 5.2 Storage usage of aircraft QP problem
"Aircraft QP problem IPM ASM~
Case 2 4820 B 8754 B
Case 3 6724 B 16690 B
71
6. Conclusion and Recommendations
6.1 Conclusion
In this paper, we explored the applicability of the "MPC on a Chip" idea using both
fixed-point and floating-point DSP. Interior Point method and Active Set method with
dense matrix formulation, was employed to solve the QP problem. A rapid
prototyping environment suitable for exploring the various implementation issues to
bring MPC onto DSP was described. Simulation tests show that the DSP could be
used to implement a reasonably sized constrained MPC controller and floating-point
architecture DSP provide much faster computational speed than fixed-point. We have
also found that, in general, ASM should perform better than IPM when the numbers of
inequality constraints and decision variables are small. Otherwise, IPM should be a
better choice due to its scalability. The instability was mainly due to numerical error,
which was found to be more serious in IPM than in ASM in current implementations.
6.2 Recommendations for Further Research
The following directions are worthy of further pursuit:
1. Further effort should be directed at investigating the possible parallelizing of MPC

computations to take advantage of the available pipelining on the DSP, e.g. a
parallel matrix multiplication algorithm, which outperforms normal matrix
multiplication that usually takes 3 FOR loops to complete, is proposed in [20].
Also, exploring new microprocessor architectures which are more suitable for
MPC implementation is needed for further work (see [36] and [37]).
2. In this project, we have used MATLAB/Simulink and CCS to take a MPC solution
from design to embedded implementation. Further effort should also be directed at
achieving a higher level of automation in implementing embedded MPC
technology. This would facilitate the embedded system community to explore the
design space available in realizing a customized embedded MPC design.
72
3. In this dissertation, we have used Gauss-Jordan elimination to solve the linear

systems in both IPM and ASM algorithms. This method may not be the most
efficient method for solving linear systems. Since the major portion of computation
cost by IPM or ASM is spent on solving the linear system, it is worthwhile to see if
a more efficient method can be adopted (see [14]). Also, we have observed that
numerical error often occurs when the linear systems are being solved. The
numerical error may be alleviated by, say, a minimum residual method, for solving
the linear systems.
4. We may adopt the method of [13] to formulate our MPC problem into a QP
problem. In this method, the states and inputs during the prediction horizon are
kept as variables, in order to get a banded Q. For large QP problems, this may
results in lesser storage, and faster solution of the linear system of equations in
both IPM and ASM.
5. In MPC applications, constraints are usually imposed to restrict the minimum and
maximum values of the inputs, outputs, or states. This may result in a sparse/ with
special structures. It is meaningful to see if these structures can be exploited to
reduce the storage and computational complexity of solving the linear systems.
73
Bibliography
[1] K. V. Ling, S. P. Yue and J. M. Maciejowski, "A FPGA implementation of model predictive
control," in Proceedings of the 2006 American Control Conference, Minneapolis, Minnesota,
USA, pp. 1930-1935, June 2006.
[2] J. M. Maciejowski, Predictive Control with Constraints. Prentice Hall, 2002.
[3] M. H. He and K. V. Ling, "Model predictive control on a chip," in the 5th International
Conference on Control & Automation, Budapest, Hungary, pp. 528-532, June 2005.
[4] K. V. Ling, EE6225 Process Control - Model Predictive Control. School of Electrical and
Electronic Engineering, Nanyang Technological University, Lecture Notes, 2009.
[5] Y. Y. Su, "Embedded model predictive control on a microcontroller," School of Electrical
and Electronic Engineering, Nanyang Technological University, Bachelor Thesis, 2008.
[6] K. V. Ling, B. F. Wu and J. M. Maciejowski, "Embedded model predictive control (MPC)
using a FPGA," in Proceedings of the 17th IF AC World Congress, Seoul, Korea, pp. 15250—
15255, July 2008.
[7] M. H. He and C. Chen, "An effective online computation scheme for constrained model
predictive control in embedded systems", in Proceedings of the 6th World Congress on
Intelligent Control and Automation, Dalian, China, pp. 6699-6703, June 2006.
[8] D. W. Clarke, C. Mohtadi, P. S. Tuffs. "Generalized Predictive Control - Part I. The Basic
Algorithm", in Automatica, vol 23, no. 2, pp. 137-148, 1987.
[9] M. D. Adams, EE6401 Advanced Digital Signal Processing - DSP Architectures &
Applications. School of Electrical and Electronic Engineering, Nanyang Technological
University, Lecture Notes, 2009.
[10] R. Chassaing, Digital Signal Processing and Applications with the C6713 and C6416 DSK.
John Wiley & Sons, 2005.
[11] L. G. Bleris and M. V. Kothare, "Implementation of model predictive control for glucose
regulation on a general purpose microprocessor," in Proceedings of the 44th IEEE
Conference on Decision and Control and the 2005 European Control Conference, pp. 5162—
5167, December 2005.
[12] R. Fletcher, Practical Methods of Optimization, 2nd Edition. John Wiley & Sons, 1987.
[13] S. J. Wright, "Applying new optimization algorithms to model predictive control," in
Proceedings of Chemical Process Control V, pp. 147-155, 1997.
74
[14] M. H. He, C. Chen, and X. Y. Zhang, "FPGA implementation of a recursive rank one
updating matrix inversion algorithm for constrained MPC", in Proceedings of the 6th World
Congress on Intelligent Control and Automation, Dalian, China, pp. 733-737, June 2006.
[15] L. G. Bleris and M. V. Kothare, "Real-time implementation of model predictive control," in
2005 American Control Conference, Portland, OR, USA, pp. 4166-4171, June 2005.
[16] E. F. Camacho and C. Bordons, Model Predictive Control, 2nd Edition. Springer, New York,
1999.
[17] M. L. Lenard and M. Minkoff, "Randomly generated test problems for positive definite
quadratic programming," in A CM Transactions on Mathematical Software, vol. 10, no. 1, pp.
86-96, March 1984.
[18] K. S. Low, K. Y. Chiun and K. V. Ling, "A DSP-based servo system using generalized
predictive control", in Proceedings of the Power Conversion Conference, vol. 1, pp. 507-512,
August 1997.
[19] K. S. Low, K. Y. Chiun and K. V. Ling, "Evaluating generalized predictive control for a
brushless DC drive", in IEEE Transactions on Power Electronics, vol. 13, no. 6, pp. 1191-
1198, November 1998.
[20] J. Lei, X. Y. Jin and R. Wang, "Parallel computation of matrix multiplication and DSP
implement", in Chinese Journal of Sensors and Actuators, vol. 19, no. 3, pp. 737-740,
June 2006.
[21] L. Wanhammar, DSP Integrated Circuits. Academic Press, 1999.
[22] K. S. Low, EE6203 Computer Control Systems. School of Electrical and Electronic
Engineering, Nanyang Technological University, Lecture Notes, 2008.
[23] A. Bemporad, "Model predictive control design: new trends and tools", in Proceedings of the
45th IEEE Conference on Decision & Control, San Diego, CA, USA, pp. 6678-6683,
December 2006.
[24] J. Moreno and M. Medina, "Matrix computations on arrays of DSPs", in Proceedings of the
International Conference on Application Specific Array Processors, pp. 496-510, August
1992.
[25] C. V. Rao, S. J. Wright, and J. B. Rawlings, "Application of interior-point methods to model
predictive control," in Journal of Optimization Theory and Applications, vol. 99, no. 3, pp.
723-757, December 1998.
[26] B. J. Falkowski, EE6404 VLSI Digital Signal Processors. School of Electrical and Electronic
Engineering, Nanyang Technological University, Lecture Notes, 2009.
75
[27] R. A. Bartlett, L. T. Biegler, J. Backstrom, and V. Gopal, "Quadratic programming

algorithms for large-scale model predictive control," in Journal of Process Control, vol. 12,
no. 7, pp. 775-795, 2002.
[28] J. H. Jung and D. P. O'Leary, "Implementing an interior point method for linear programs on
a CPU-GPU system," in Electronic Transactions on Numerical Analysis, vol. 28, pp. 174-
189,2008.
[29] W. F. Zhang, G. Feng, Y. F. Liu and B. Wu, "DSP Implementation of predictive control
strategy for Power Factor Correction (PFC)," in IEEE Applied Power Electronics Conference
and Exposition, vol. 1, pp. 67-73, 2004.
[30] J. Ledin, Embedded Control Systems in C/C++. CMPBooks, 2004.
[31] C. Y. Chan, EE6204 System Analysis - Optimization Techniques. School of Electrical and
Electronic Engineering, Nanyang Technological University, Lecture Notes, 2009.
[32] T. Perez, G. C. Goodwin, and C. W. Tzeng, "Model predictive rudder rolls stabilization
control for ships," in Proceedings of 5 th IF AC Conference on Manoeuvring and Control of
Marine Craft, 2000.
[33] A. Richards and J. P. How, "Model predictive control of vehicle maneuvers with guaranteed
completion time and robust feasibility," in Proceedings of the 2003 American Control
Conference, vol. 5, pp. 4034-4040, 2003.
[34] M. Morari, M. Baotic, and F. Borrelli, "Hybrid systems modeling and control," in European
Journal of Control, vol. 9, no. 2-3, pp. 177-189, 2003.
[35] E. Perea-Lopez, B. E. Ydstie and I. E. Grossmann, "A model predictive control strategy for
supply chain optimization," in Computers and Chemical Engineering, pp. 1201-1218, 2003.
[36] S. E. Mansour, B. Robertson, G. Kember, B. Phillips, "Using model predictive control for
real-time control over the Internet," in Canadian Conference on Electrical and Computer
Engineering, vol. 2, pp. 827-830, 2004.
[37] L. G. Bleris, P. D. Vouzis, M. G. Arnold and M. V. Kothare, "A Co-Processor FPGA
platform for the implementation of real-time model predictive control," in Proceedings of the
2006 American Control Conference, Minneapolis, Minnesota, USA, pp. 1912-1917, June
2006.
[38] L. Girija, A. S. Spanias, "Analysis of the Matrix Processing (MxP) architecture", in IEEE
International Conference on Acoustics, Speech and Signal Processing, Hawaii, pp. 377-380,
April 2007.
Appendix A
Convergence speed forfixed-pointimplementation
Table A-l Average clock counts required for solving a Q P problem using IPM on F2812
3 5 7 9 11 13
31 13797853 18936527 26218541 36997416 51866880 65037126
41 20435137 28381309 34892819 47797509 63502507 83262465
51 25867456 33043079 44586086 59702064 80109739 103617405
61 29864966 42687985 52810508 76455564 96409138 122451235
71 36086726 46297408 59203099 81532970 111342914 143318136
81 37697675 53187466 70147453 105101578 126198488 161467577
91 41022641 63647704 87476307 102193782 139847092 178972484
101 46660405 66424786 104672777 116030306 147603654 194742431
111 50659585 74676578 99281591 133955255 175129920 224182491
Table A-2 Average clock counts required for solving a QP problem using ASM on F2812
3 5 7 9 11 13
31 1096329 4803448 13372395 29937525 58217088 101437463
41 1679198 5925268 15778016 36168170 67879598 117489399
51 1819867 7627291 17881232 40929306 85837137 132890493
61 2221177 8004680 21344732 48352998 92164862 161311198
71 2616860 8840456 24044840 48926358 95669258 186250817
81 3127185 9517796 26348607 58382932 106347844 188507334
91 3131138 10823537 25533085 63165119 116920182 220249978
101 3834078 11799732 29240875 63975058 123574877 225826853
111 3791100 13803990 31699587 68062115 130550858 222387360

2. Convergence speed forfloating-pointimplementation
Table A-3 Average clock counts required for solving a QP problem using IPM on C6713
mc N. 3 5 7 9 11 13
31 223362 327021 485541 698555 1026495 1319486
41 337883 532960 659432 967190 1333458 1727019
51 437332 577139 857746 1196534 1646348 2184117
61 519220 727105 999801 1479253 1993709 2561645
71 588887 798391 1139294 1681616 2243711 2957859
81 617449 928469 1331698 2123291 2602858 3399915
91 664344 1118555 1584085 2024714 2859242 3753563
101 797521 1214113 1967442 2332240 3059648 4065026
111 846226 1345739 2011041 2646408 3645791 4844615
Table A-4 Average clock counts required for solving a QP problem using ASM on C6713
3 5 7 9 11 13
31 45637 196524 473025 965472 1759766 2914390
41 63329 221079 545153 1166411 2274348 3595286
51 91242 286008 702702 1346189 2694191 4129255
61 92202 309639 746308 1644847 2936327 5085445
71 112316 343487 858227 1805607 3330366 5609135
81 121551 408243 994039 1998569 3856859 6229703
91 143339 462549 950101 2212311 4198506 7176345
101 151507 450685 1070820 2211026 4380624 7431059
111 175480 502667 1209319 2368904 4644510 7908928

78
Appendix B
1. M-code for implementing IPM algorithm

function [ z,iter,miu]=InteriorPointMethod(Q,c,J,g)
% - - s o l v i n g q u a d r a t i c programming problem with i n t e r i o r p o i n t method

Q
I—cost function: min(z)=l/2*z'*Q*z+c'*z such that J*z<=g

o
%--return value: z
% using eqn.22 in the ICCA2005 paper

% last modified: Jan 04, 2 007
global SINGLE
MAXITER = 100
TOLmiu = le-5
DzTOL = 0.005 default
sigma = 0.25;
tune = 0.995;
% initial value for iteration parameters

SEARCHING = 1; iteration = 0;
mc = size(g,1);
lambda = 0.75 * ones(mc,l);
t = sigma * ones(mc,l);
miu = t' * lambda/mc;
z = zeros(size(Q,1),1);
while (SEARCHING && iteration < MAXITER)

iteration = iteration + 1;
i Step 2 of Interior Point Solver
invt = ones(mc,l) ./ t;
vtmp = sigma * miu * ones(mc,l);
% r2 = diag(lambda)*(-J*z+g)-vtmp;
sla = J * z;
sib = g - sla;
sic = diag(lambda) * sib;
r2 = sic - vtmp;
% A = Q+(J'*diag(invt))*(diag(lambda)*J);
s2a = diag(invt) * diag(lambda);
s2b = J1 * s2a;
s2c = s2b * J;
A = Q + s2c;
79
% dz = inv{A)*<-Q*z-c-J'*(lambda-diag(invt)*r2));
s3a = diag(invt) * r2;
s3b = lambda - s3a;
s3c = J' * s3b;
s3d = -c - s3c;
s3e = Q * z;
s3f = s3d - s3e;
B = s3f;
dz = inv(A) * B;
% dz = lineqsol(A, B ) ;
% dlambda = -invt. *(r2-diag(lambda)*J*dz) ;

s4a = J * dz;
s4b = s2a * s4a;
dlambda = s4b - s3a;
% dt = -t+g-J*(z+dz);
s5a = sib - t;
dt = s5a - s4a;
% End of Step 2 of Interior Point Solver
% update parameters
alpha_tmp_l = 1; tflag = 1;
while (tflag && alpha_tmp_l>0)
if (t + alpha_tmp_l*dt >=0)
tflag = 0;
else
alphatmpl = alpha_tmp_l - 0.001;
end
end % while
alpha_tmp_l = max(alpha_tmp_l, 0.001) ;
alpha_tmp_2 = 1; tflag = 1;
while (tflag && alpha_tmp_2>0)
if (lambda + alpha_tmp_2*dlambda >=0)
tflag = 0;
else
alpha_tmp_2 = alpha_tmp_2 - 0.001;
end
end *, while
alpha_tmp_2 = max(alpha_tmp_2, 0.001);
alpha_max = min(alpha_tmp_l,alpha_tmp_2);
alpha = tune * alpha_max;
z = z + alpha * dz;
lambda = lambda + alpha * dlambda;
t = t + alpha * dt;
% adjust infeasibility gap

miu = lambda' * t / mc;
tz'; dz'];
% termination criteria
if (abs(miu) < TOLmiu),
SEARCHING = 0;
disp ('Interior point method converged: [iteration mu]
disp([iteration, miu]);
if ( max(abs(dz)) > max(DzTOL*abs(z)) )
SEARCHING = 1;
disp('az too big, continue searching...')
end
end
end % while
if iteration == MAXITER,
disp('IP: Maximum iteration reached, mu = '),miu;
end
iter = iteration;
2. M-code for implementing ASM algorithm

function [z,iter]=ActiveSetMethod(Q,c, J,g)
'--solving quadratic programming problem with active set meth
'i—cost function: min (z) =l/2*z ' *Q*z+c' *z such that J*z<=g
%—return value: z
"s last modified: Oct 20, 2006
MAXITER = 100;
Tol = le-1;
iter = 0;
nv = size(J,2);
mc = size(J,1);
% Line 2: initial point

z = zeros(nv,1);
if sum(J*z-g>Tol)>0
error ('Infeasible initial point')
end
% Line 3: assume active set. is empty
Wk = [];
% main loop
while iter<=MAXITER
iter = iter + 1;
% Line 4
JWk = J(Wk,:);
i number of working constraints
mwork = length(Wk);
ck = c + Q * z;
QO = [Q JWk'; JWk zeros(mwork,mwork) ];
PO = [-ck; zeros (mwork,1)];
dzlam = QO\PO;
dz = dzlam(1:nv);
lam = dzlam(nv+1:end);
% Line 5: if dz=0
if norm(dz)<Tol
% Line 6: if all components of lam are positive or
if isempty(find(lam<0))
% Line 1: global optimum found
return
else
% Line 9
[minlam,removelnd] = min(lam);
Wk = [Wk(l:removeInd-l) Wk(removelnd+l:end) ] ;
disp('Line 9')
end
else
% Line 12
i ini t iaiize DK
Dk = [];
I indices not in Wk
nWk = setdiff(l:mc,Wk) ;
Jdz = J * dz;
gJz = g - J * z;
% create Dk
for i = 1:length(nWk)
if Jdz(nWk(i))>0 && gJz(nWk(i))/Jdz(nWk (i))<l
Dk = [Dk nWk(i)];
end
end
* Line 13
if isempty(Dk)
''.' u i n e i -i
z = z + dz;
d i s p ( ' L i n e 14')
else
% Line 16: update z
[alpha,ind] = min(gJz(Dk)./Jdz (Dk) ) ;
z = z + alpha * dz;
*• L i n e 1 7 : u p d a t e Wk
Wk = [Wk D k ( i n d ) ] ;
d i s p ( ' L i n e 16')
end
end
82
3. C-code for implementing IPM algorithm on F2812 eZdsp

# i n c l u d e "DSP281x_Device.h" / / DSP281x H e a d e r f i l e Include File
#include < s t d i o . h >
^ i n c l u d e <math.h>
#define NV 3
#define MC 32
# d e f i n e MAXITER 100
#define TOLMiu 0.00001

#define DZTOL 0.005
#define TUNE 0.995
#def-ine GETOL l e - 8
# d e f i n e I N I T _ L A M 0.75
# d e f i n e I N I T _ T 0.25
float Q[NV][NV];
float C[NV];
f l o a t J[MC][NV];
f l o a t g[MC];
f l o a t z[NV];
int Iter;
f l o a t Exectime;
float i n v t [ M C ] , r2[MC];
float dz[NV], dt[MC], dlam[MC];
float slb[MC], s2a[MC], s2b[NV][MC], s2c[NV][NV];
float matA[NV][NV], s3a[MC], s3b[MC], s3d[NV], s 3 f [ N V ] ;
float lam[MC], t [ M C ] ;
void Timing(void)
long t i m e r v a l u e ;
t i m e r v a l u e = CpuTimerORegs.TlM.all;
t i m e r v a l u e = OXFFFFFFFF - timervalue;
Exectime - t i m e r v a l u e / 150; / / us
}
void solvelineq(float *A, float *b)
int i, ir, ic, j, jc;
int NUM, actsize;
float temp, temp2;
NUM = NV;
actsize = NV;
for(i=0; i<NUM; i++)
{
i f ( A [ i * a c t s i z e + i ] == 0)
{
f o r ( j = i ; j<NUM; j + + )
temp2 = fabs(A[j*actsize+i]);
if(temp2 >= GETOL)
for(jc=0; jc<NUM; jc++)
temp = A [ i * a c t s i z e + i c ] ;
A[i*actsize+jc] = A[j*actsize+jc];
temp = b [ i ] ;
83
bCi] = b [ j ] ;
b [ j ] = temp;
break;
temp = A [ i + i * a c t s i z e ] ;
A [ i + i * a c t s i z e ] = 1.0;
f o r ( j = i + l ; j<NUM; j++)
b[i] = b[i]/temp;
f o r ( i r = 0 ; ir<NUM; i r++)
{
i f C i r != i )
{
A [ i r * a c t s i z e + i ] = 0;
f o r ( i c = ( i + l ) ; ic<NUM; ic++)
{
A [ i r * a c t s i z e + i c ] = A [ i r * a c t s i z e + i c ] - temp
*A[i*actsize+ic];
}
b [ i r ] = b [ i r ] - temp*b[i];
}
}
}
}
v o i d lPM_funct(void)
{
int i, j , ic, jc, ir, jr;
i n t expint;
f l o a t miu, vtmp, pmiu;
/ / f l o a t s4b[MC], s5a[MC];
f l o a t a l p h a l , alpha2, alphal_bs, alpha2_bs, alpha_max, alpha;
f l o a t t p v a r l , tpvar2, tpvar3;
f l o a t max_z, max_dz;
// initialization
I t e r = 0;
alphal = 1 . 0 ;
alpha2 = 1.0;
alphal_bs = 0 . 5 ;
alpha2_bs = 0 . 5 ;
expint = 1 ;
f o r ( i = 0 ; i<MC; i++)
{
lam[i] = INIT_LAM;
t [ i ] = INIT_T;
}
f o r ( j = 0 ; j<NV; j++)
z [ j ] = 0;
miu = INIT_LAM * INIT_T;

/ / begin i t e r a t i o n
do{
lter++;
f o r ( i = 0 ; i<MC; i++)
{
invt[i] = l / t [ i ] ;
vtmp = I N I T _ T * miu;
f o r ( i r = 0 ; ir<MC; i r + + )
{
t p v a r l = 0;
f o r ( i c = 0 ; IC<NV; ic++)
{
tpvarl = tpvarl + J [ i r ] [ i c ] * z [ i c ] ;
slb[ir] = g[ir] - tpvarl;

tpvar3 = l a m [ i r ] * s l b [ i r ] ;
r 2 [ i r ] = tpvar3 - vtmp;
f o r ( i r = 0 ; ir<MC; i r + + )
{
s2a[ir] = i n v t [ i r ] * 1am[ir];
}
f o r ( j r = 0 ; jr<NV; j r + + )
f o r ( j c = 0 ; jc<MC; jc++)
s2b[jr][jc] = J[jc][jr] * s2a[jc];
}
f o r ( j r = 0 ; jr<NV; j r + + )
for (ic=0; ic<NV; ic++)
{
tpvarl = 0;
tpvarl = tpvarl + s2b[jr][jc] * J[jc][ic];
s2c[jr][ic] = tpvarl;
}
f o r ( i r = 0 ; ir<NV; i r + + )
f o r ( i c = 0 ; ic<NV; ic++)
{
matA[ir][ic] = Q[ir][ic] + s2c[ir] [ i c ] ;
}
}
f o r ( i r = 0 ; ir<MC; i r + + )
{
s3a[ir] = i n v t f i r ] * r 2 [ i r ] ;
s 3 b [ i r ] = 1am[ir] - s 3 a [ i r ] ;
f o r ( j r = 0 ; jr<NV; j r + + )
t p v a r l = 0;
f o r ( j c = 0 ; jc<MC; jc++)
tpvarl = tpvarl + J [ j c ] [ j r ] * s3b[jc];
s3d[jr] = - c [ j r ] - tpvarl;
f o r ( j r = 0 ; jr<NV; j r + + )
tpvar2 = 0;
f o r ( j c = 0 ; jc<NV; jc++)
{
tpvar2 = tpvar2 + Q [ j r ] [ j c ] * z[jc];
s 3 f [ j r ] = s 3 d [ j r ] - tpvar2;
solvelineq(niatA[0], s 3 f ) ;
f o r ( i r = 0 ; ir<NV; i r + + )
{
dz[ir] = s3f[ir];
}
f o r ( i r = 0 ; ir<MC; i r + + )
{
t p v a r l = 0;
f o r ( i c = 0 ; ic<NV; ic++)
{
tpvarl = tpvarl + J [ i r ] [ i c ] * d z [ i c ] ;
}
tpvar2
tpvar2 = = s2aLnrj
s 2 a [ i r ] ** tt p
p vv a
a rr ll ;;
d l a m [ i r ] = tpvar2 - s 3 a [ i r ] ;
tpvar3 = s l b [ i r ] - t [ i r ] ;
d t [ i r ] = tpvar3 - t p v a r l ;
}
/ / end of step 2
alphal = 1 . 0 ;
alphal_bs = 0 . 5 ;
alpha2 = 1.0;
alpha2_bs = 0 . 5 ;
i r = 0;
do{
tpvarl = t [ i r ] + d t [ i r ] ;
i f ( t p v a r l < 0)
do{
exp"int++;
tpvar2 = alphal_bs * d t [ i r ] ;
tpvar3 = t [ i r ] + t p v a r 2 ;
i f ( t p v a r 3 < 0)
{
alphal_bs = alphal_bs - (float)
pow((double)2,(double) -expint);
86
else
{
alphaL_bs = alphal_bs + ( f l o a t )
pow((double)2,(double) -expint);
} w h i l e ( e x p i n t < 10);
tpvar2 = a l p h a l - alphal_bs;
i f ( t p v a r 2 > 0)
a l p h a l = alphaL_bs;
ir++;
expint = 1 ;
alphal_bs = 0 . 5 ;
}
else
{
i r++;
expint = 1 ;
alphal_bs = 0 . 5 ;
}
else
{
ir++;
}
} w h i l e ( i r < MC);
ir=0;
do{
tpvarl = lam[ir] + dlam[ir];
i f ( t p v a r l < 0)
do{
expint++;
tpvar2 = alpha2_bs * d l a m [ i r ] ;
tpvar3 = l a m [ i r ] + t p v a r 2 ;
i f ( t p v a r 3 < 0)
alpha2_bs = alpha2_bs - pow(2,-
expint);
}
else
{
expint); alpha2_bs = alpha2_bs + pow(2,-
}
} w h i l e ( e x p i n t < 10);
tpvar2 = alpha2 - alpha2_bs;
i f ( t p v a r 2 > 0)
alpha2 = alpha2_bs;
i r++;
expint = 1 ;
alpha2_bs = 0 . 5 ;
}
else
{
i r++;
expint = 1 ;
alpha2_bs = 0 . 5 ;
}
else
{
ir++;
}
} w h i l e ( i r < MC);
tpvar3 = a l p h a l - alpha2;
i f (tpvar3 > 0)
alpha_max = alpha2;
else
alpha_max = a l p h a l ;
alpha = TUNE * alpha_max;
f o r ( i r = 0 ; ir<NV; i r + + )
{
z [ i r ] = z [ i r ] + alpha*dz[ir];
f o r ( i r = 0 ; ir<MC; i r + + )
{
lam[ir] = lam[ir] + alpha*dlam[ir];
t [ i r ] = t [ i r ] + alpha*dt[ir];
t p v a r l = 0;
f o r ( j r = 0 ; jr<MC; j r + + )
tpvarl = tpvarl + l a m [ j r ] * t [ j r ] ;
miu = t p v a r l / MC;
pmiu = f a b s ( m i u ) ;
t p v a r l = pmiu - TOLMIU;
i f ( t p v a r l < 0)
max_dz = f a b s ( d z [ 0 ] ) ;
f o r ( i r = l ; ir<NV; i r + + )
{
tpvar2 = max_dz - f a b s ( d z [ i r ] ) ;
i f ( t p v a r 2 < 0)
max_dz = f a b s ( d z [ i r ] ) ;
}
}
max_z = DZTOL * f a b s ( z [ 0 ] ) ;
f o r ( i r = l ; ir<NV; i r + + )
{
tpvar3 = max_z - D Z T O L * f a b s ( z [ i r ] ) ;
i f ( t p v a r 3 < 0)
max_z = D Z T O L * f a b s ( z [ i r ] ) ;
}
}
tpvar2 = max_dz - max_z;
if(tpvar2 < 0)
{
break;
}
88
}while(lter < MAXITER);

}
v o i d main(void)
{
initSysCtrlO;
/ / initGpioO ;
/ / DINT;
/ / initPieCtrlQ;
// IER = 0x0000;
// IFR = 0x0000;
/ / initPievectTableO;
/ / initPeripheralsO ;
initcpuTimersO;
startCpuTimerOO;
lPM_funct();
StopCpuTimerOO;
TimingO;
4. C-code for implementing ASM algorithm on C6713 DSK

/ / #include "C6713dskinit.h"
#idelude <stdio.h>
# i n c l u d e <math.h>
#include "csl.h"
#include "csl_timer.h"
# d e f i n e NV 3
# d e f i n e MC 32
#define MAXITER 100
# d e f i n e TOL l e - 3
# d e f i n e GETOL l e - 8
# d e f i n e smallnum l e - 3
f l o a t Q[NV] [ N V ] ;
f l o a t c[NV];
float J[MC][NV];
float g[MC];
float Z[NV];
int Iter;
f l o a t Exectime;
f l o a t lambda[MC];
f l o a t Mmat[NV][NV], dmat[NV];
i n t wk [ M C ] ;
i n t Dk[MC];
f l o a t dz[NV], ck[NV];
f l o a t QO[NV+MC][NV+MC] , PO[NV+MC];
static U i n t 3 2 T i m e r C o n t r o l = TIMER_CTL_RMK(
TIMER_CTL_INVINP_NO,
TIMER_CTL_CLKSRC_CPUOVR4, / / CPU c l o c k / 4
TIMER_CTL_CP_PULSE,
TIMER_CTL_HLD_YES,
TIMER_CTL_GO_NO,
TIMER_CTL_PWID_ONE,
TIMER_CTL_DATOUT_0,
TIMER_CTL_INVOUT_NO,
TIMER_CTL_FUNC_GPIO
);
s t a t i c TlMER_Handle h T i m e r l ;
v o i d s o l v e l i n e q ( f l o a t *A, f l o a t * b , i n t countAct, i n t r r , i n t a c t s i z e )
int i , i r , ic, j , jc;
i n t NUM;
f l o a t temp, temp2;
NUM = rr+countAct;
f o r ( i = 0 ; i<NUM; i++)
if(fabs(A[i*actsize+i]) < smallnum)
for(j=i+l; J<NUM; j++)
{
temp2 = fabs(A[j*actsize+i]);
if(temp2 >= GETOL)
{
for(jc=0; jc<NUM; jc++)
temp = A [ i * a c t s i z e + j c ] ;
A[i*actsize+qc] = A [ j * a c t s i z e + j c ] ;
temp = b [ i ]
b[i] = b[j]
b [ j ] = temp
break;
temp = A [ i + i * a c t s i z e ] ;
A [ i + i * a c t s i z e ] = 1.0;
for(j=i+l; J<NUM; j++)
}
b [ i ] = b[i]/temp;
f o r ( i r = 0 ; ir<NUM; i r + + )
if(ir != i )
{
A [ i r * a c t s i z e + i ] = 0;
f o r ( i c = ( i + l ) ; ic<NUM; ic++)
{
A [ i r * a c t s i z e + i c ] = A [ i r * a c t s i z e + i c ] - temp
*A[i*actsize+ic];
}
b [ i r ] = b [ i r ] - temp*b[i];
}
}
}
}
v o i d LTSQ(void)
{
int i, j, jr;
float tempi, sum;
90
// 3'*J
for(i=0; i<NV; i++) // rows of J'
{
for(j=0; j<NV; j++) // cols of J
sum = 0;
for(jr=0; jr<MC; jr++) // rows of j
tempi = J [ j r ] [ i ] * 3 [ j r ] [ j ] ;
sum = sum + tempi;
}
Mmat[i] [ j ] = sum;
}
/ / J'*g
f o r ( i = 0 ; i<NV; i++) / / rows of 3'
{
sum = 0;
f o r ( j r = 0 ; jr<MC; j r + + ) / / rows of J
tempi = J [ j r ] [ i ] * g [ j r ] ;
sum = sum + tempi;
}
d m a t [ i ] = sum;
}
s o l v e l i n e q ( M m a t [ 0 ] , dmat, 0, NV, NV);
f o r ( i = 0 ; i<NV; i++)
{
z [ i ] = dmat[i];
}
}
void ASM_funct(void)
{
i n t j r , j c , j , i r , i c , i , minindex, i f s , temp3;
i n t neglam, actno, alpwk[MC], minalpindex, countAct;
f l o a t tempi, temp2, tsum, mini am, a l p l , minalp;
i n t nochangeflag, Dkcount;
// initialization
i = 0 ; i c = 0 ; i r = 0 ; j = 0 ; j c = 0 ; j r = 0 ; l t e r = 0 ; minlam=0;
/ / check i f g i s a l l p o s i t i v e
i f s = 1; / / i n i t i a l solution is feasible
f o r ( i = 0 ; i<MC; i++)
{
i f ( g [ i ] < 0)
i f s = 0 ; / / i n i t i a l s o l u t i o n i s not f e a s i b l e
i f ( i f s; == 1)
{
f o r ( i = 0 ; ; i<NV ; i++)
{
z[i] = 0 i
}
}
else
{
LTSQO ;
}
f o r ( j = 0; j<MC; j++)
/ / a l p w k [ j ] = 0;
lambda[j] = 0;
WkM] = 0;
D k [ j ] = 0;
/ / begin i t e r a t i o n
do{
Iter++;
/ / ck = c + Q*z
f o r ( i r = 0 ; ir<NV; i r + + )
{
temp2 = 0;
f o r ( i c = 0 ; ic<NV; ic++)
{
tempi = Q [ i r ] [ i c ] * z [ i c ] ;
temp2 = temp2 + tempi;
c k [ i r ] = c [ i r ] + temp2;
countAct = 0;
for(j=0; j<MC; j++)
if(Wk[j]==l)
countAct++;
}
/ / form augmented matrix and solve f o r dz and dlam
/ / QO = [ Q JWk'l PO = [ -ck ]
// ] JWk 0 ] [ 0 ]
f o r ( i r = 0 ; ir<NV; i r + + )
{
f o r ( i c = 0 ; ic<NV; ic++)
{
QO[ir][ic] = Q [ i r ] [ i c ] ;
}
PO[ir] = - c k [ i r ] ;
}
temp3 = 0;
f o r ( j = 0 ; j<MC; j + + )
if(Wk[j]==l)
f o r ( j c = 0 ; jc<NV; jc++)
QO[NV+temp3][jc] = J [ j ] [ j c ] ;
QO[jc][NV+temp3] = J [ J ] [ J C ] ;
PO[NV+temp3] = 0;
temp3++;
temp2 = 0;
f o r ( j r = N V ; jr<(countAct+NV); j r + + )
for(jc=NV; jc<(countAct+NV); jc++)
Q O [ j r ] [ j c ] = 0;
}
solvenneq(QO[0], PO, countAct, NV, NV+MC) ;

f o r ( i = 0 ; i<NV; i++)
{
dz[i] = PO[i];
}
temp3 = 0;
f o r ( j = 0 ; j<MC; j++)
if (Wk[j] == 1)
lambda[j] = PO[NV+temp3];
temp3++;
else
lambda[j] = 0;
nochangeflag = 1 ;
f o r ( i = 0 ; i<NV; i++)
if ( f a b s ( d z [ i ] ) > TOL)
nochangeflag = 0;
}
i f ( n o c h a n g e f l a g == 1)
neglam = 0; actno = 0; mini am = 0;
/ / remove c o n s t r a i n t from a c t i v e set
f o r ( j = 0 ; j<MC; j++)
{
i f ( w k [ j ] == 1) / / i f c o n s t r a i n t i s a c t i v e
{
actno++;
if(lambda[j]<0)
neglam++;
i fCIambda[j]<mi nlam)
mini am = l a m b d a [ j ] ;
minindex = j ;
}
}
}
wk[minindex] = 0;
if Cneglam == 0)
break;
}
else
{ minalp = 1; Dkcount = 0;
// find alpha and add constraint to active set
forCj=0; j<MC; j++)
i f ( w k [ j ] == 0) / / i f c o n s t r a i n t i s inactive
tsum=0;
f o r ( j c =0; jc<NV; jc++)
{
tempi = 3 [ j ] [ j c ] * d z [ j c ] ;
tsum = tsum + tempi;
temp2=0;
f o r ( j c =0; jc<NV; jc++)
tempi = J [ j ] [ j c ] * z [ j c ] ;
temp2 = temp2 + tempi;
a l p l = (g[j]-temp2)/tsum;
if(tsum > 0 && a l p l < 1)
{
Dk[j] = 1 ;
Dkcount++;
i f ( a l p l < minalp)
{
minalp = a l p l ;
minalpindex = j ;
}
}
else
Dk[j] = 0;
}
else
o k [ j ] = 0;
}
if(Dkcount == 0)
{
for (i=0; i<NV; i++)
{
Z[i] = z[i] + dz[i];
}
}
else
i / / update z
for (i=0; i<NV; i++)
{
z [ i ] = z [ i ] + minalp*dz[i];
/ / update active constraints
Wk[minalpindex] = 1;
}
}while(iter < MAXITER);
d main(void)
long timervalue = 0;
long timervalue_start = 0;
long timervalue_end = 0;
void TimerEventHandler(void);
extern far void vectorsO;
hTimerl = TIMER_open(TIMER_DEVl, TIMER_OPEN_RESET);
TlMER_confi gArgs(nTi merl,
TimerControl, // use predefined control value
OxFFFFFFFF, // set period
0x00000000 // start count value at zero
);
TIMER_start(hTimerl);
timervalue_start = TlMER_getcount(hTimerl);
94
ASM_funct();
timervalue_end = TlMER_getCount(hTimerl);
timervalue = timerva~lue_end - t i m e r v a l u e _ s t a r t ;
Exectime = timervalue / 56.25; / / us
TlMER_pause(hTimerl);
TiMER_close(hTimerl);
95
Appendix C
1. Analysis of computational complexity for IPM algorithm

Table C-l Computational analysis of IPM algorithm
Operations Addition / Multiplication Division
Subtraction
T-i
mc
k
vtmp = on 1
k k k
r = A (-Jz + g) - vtmp
k
sla =]z mc(nv - 1) nvmc
sib = g — sla mc
k
sic = A slb mc
k
r = sic — vtmp mc
T 1 k
A= Q+J (T- A )J
s2a = T_1A* mc
s2b =/s2a nvmc
T
s2c=/ s2b nl{mc - 1) nlmc
2
A = Q + s2c n
k l k T 1
Az = A~ {-Qz - c -y (A" - T- ^"))
1
s3a = T- ^ mc
k
s3b = X — s3a mc
T
s3c =/ s3b nv(mc - 1) nvmc
s3d = — c — s3c nv
s3e = Qz k
nv(nv - 1) n2
B = s3f = s3d — s3e nv
k 1
Az =A~ B Ma Mm Md
k l k k 1
AX = T~ A ]Az - T' ^
k
s4a =]Az mc(nv - 1) nvmc
s4b = s2as4a mc
k
AA = s4b — s3a mc
k k k k
At = -t + (g-Jz )-JAz
fc k
At = s l b - t - s 4 a 2mc
k k
tpvarl = t + At mc
k
tpvar2 = abslAt 10mc
96
tpvar3 = tk + tpvar2 10mc

absi = «bsi ~ 0.001 10mc
«i - <*bsl mc
k k
tpvarl = X + AA mc
fc
tpvar2 = abs2AA 10mc
k
tpvar3 = X + tpvar2 10mc
«bs2 = «ft52 - 0-001 10mc
«2 - a&S2 mc
ai-«2 1
k
a" = 0.995a 1
z fc+l _ zk _j_ akfozk
"v «v
Ak+1 _ Ak + akAAk
mc mc
t k+l _ tk + akAtk mc mc
^k+1 _ Xk+ltk+l/mc
mc — 1 mc 1
2. Analysis of computational complexity for ASM algorithm
Table C-2 Computational analysis of ASM algorithm

Operations Addition / Multiplication Division
Subtraction
ck = c + (?zk
tempi = Qzk nv(nv -1) nl
k
c = c + tempi nv
k
Az = Qo/P0 Ma Mm Md
k k
a" = {g-Jz )/JAz
tsum = JAzk mc(nv - 1) nvmc
k
temp2 =Jz mc(nv - 1) nvmc
k
a = (g - temp2)/tsum mc mc
k+1 k k
z «- z + a Az nv nv

Embedded Implementation of Model Predictive Control

Uploaded by

Copyright:

Available Formats

Embedded Implementation of Model Predictive Control

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Embedded Implementation of Model Predictive Control

Uploaded by

Copyright:

Available Formats

This

Downloaded on 13 Apr 2023 21:05:14 SGT

SCHOOL OF ELECTRICAL AND ELECTRONIC ENGINEERING

A DISSERTATION SUBMITTED IN PARTIAL FULFILMENT OF

Appreciations are also dedicated to Laboratory Executives in Control Engineering Lab,

The resulting MPC on DSP is demonstrated through a simulated aircraft control

List of Tables viii

2. Constrained Model Predictive Control 5

3. Prototyping MPC on DSP 21

3.3 Establishing Prototyping Environment 26

5. Aircraft Case Study 63

6. Conclusion and Recommendations 71

Figure 1.1 Computational time per QP problem using IPM 3

Figure 5.6 Iterations and computational time oflPM algorithm on F2812 68

Table 3.1 Initialized sections 28

Micro-Electro-Mechanical System (MEMS) is the integration of mechanical elements,

Digital Signal Processor (DSP) is a potential candidate for an embedded platform to

1.3 Major Contribution of the Thesis

The major contributions of the thesis can be concluded as follows:

2. Compared the computational performance of IPM and ASM by testing both

3. Verified the applicability of MPC with an aircraft control example by employing

Number of Constraints mc Number of Constraints mc

Figure 1.1 Computational time per QP problem using IPM

I 50 * ^ - - > — " * • - '

Figure 1.2 Computational time per QP problem using floating-point DSP

We have also compared the computational complexity and speed of convergence of

2. Constrained Model Predictive Control

Figure 2.1 Set-point improvement available from the use of MPC

Because of the high computational requirements needed for optimizations associated

2.2 Receding Horizon Strategy

Figure 2.2 The basic idea of receding horizon

2.3 Principles of Constrained MPC

Constrained MPC can be formulated as a QP problem. Suppose that we have a

x(k + 1) = APx(k) + BPu(k)

y(k + l\k) = CPx(k + l|/c) = CPAPx{k) + CPBPu(k\k) (2)

y(k + 2\k) = CPx{k + 2|fc) = CPApx(k) + CPAPBPu(k\k) + CPBPuQ<. + \\k) (3)

y(k + i\k) = CPx(k + i\k) = CPAPx{k) + Y CPAp~xBpu(k + i - j\k) (4)

f Kk +1) • Lp/lp CpBp 0 0 U(fc)

where NP and Nu are the prediction and control horizons, respectively.

Therefore the constrained MPC problem is to minimize the cost function,

MPC Au I u Real Plant

Figure 2.3 The boundary between controller and plant

From the state-space model (1), we have

where Ax(/c) = x(fc) - x(k - 1).

Ax(/c + 1)1 r AP 01 rAx(/c) BP

To include the incremental input Au in a state-space model of the MPC plant, we

Then the state-space models have the form

f(k + 1) = i4f (fc) + BAu{k)

r y(k +1) 1 ' CA CB 0 0 Au(/c)

y = Vxttk) + Vuz (16)

y = [y(fc + D y(fe + 2) ... y(k + NP)]T,

CANr-xB CANr~2B ••• CANP~N"B.

Hence the cost function in (6) can be further rewritten as,

Using the prediction result in (16), we can rewrite (7) as follows

In this way, the inequality constraints (7 - 9) can be written compactly as

2.4 Solving QP problems

2.4.1 Interior Point Method

According to the infeasible interior point framework introduced by [13], an optimal

Here r = -(Aky1TK and