Embedded Implementation of Model Predictive Control
Embedded Implementation of Model Predictive Control
Embedded Implementation of Model Predictive Control
document is downloaded from DR‑NTU (https://dr.ntu.edu.sg)
Nanyang Technological University, Singapore.
Embedded implementation of model predictive
control
Wu, Xuepei.
2011
http://hdl.handle.net/10356/46832
Nanyang Technological University
NANYANG
TECHNOLOGICAL
UNIVERSITY
EMBEDDED IMPLEMENTATION
OF
MODEL PREDICTIVE CONTROL
WU XUEPEI
2009
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
Acknowledgements
I would like to express my deep gratitude to my supervisor Associate Professor Dr.
Ling Keck Voon, Nanyang Technological University, for his constructive suggestions
and helpful criticism throughout the whole period of the project. His enthusiastic
concern on the research progress kept pushing me closer and closer to my objective.
I would also like to sincerely appreciate the help of Project Officer Ms. Yue Siew
Peng. The project would not been successfully finished in due time without her
professional guidance and assistance.
Last but not least, I would like to thank my parents and my friends for their supports
and encouragements.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
ii
Summary
Model Predictive Control (MPC) refers to a type of computer control technology that
utilizes an explicit process model to predict the future response of a plant. It has
become an established control technology in the petrochemical industry, and it is
currently being increasingly applied in many other sectors. MPC outperforms other
control strategies through its ability to deal with constraints. Constrained MPC can be
formulated as a Quadratic Programming (QP) problem, which therefore is central to
MPC. Interior Point Method (IPM) and Active Set Method (ASM) are two schemes
commonly employed to solve QP problems.
MPC has been successfully implemented and verified on Field Programmable Gate
Array (FPGA) or General Purpose Processor (GPP). Alternatively, this dissertation
begins to explore implementation of constrained MPC on fixed-point and floating-
point 32-bit Digital Signal Processor (DSP). A prototyping environment suitable for
investigating implementation issues to bring MPC onto DSP is introduced.
Implementation of IPM and ASM is described in detail and computational
performance of those two methods are presented and analyzed.
Table of Contents
Acknowledgements i
Summary ii
List of Figures vi
1. Introduction 1
1.1 Motivation 1
1.2 Objectives 2
1.3 Major Contribution of the Thesis 2
Simulation Results 56
4.1 Convergence Speed 56
4.1.1 Computational Time 56
4.1.2 Number of Iterations 59
4.2 Storage 61
4.3 Numerical Error 61
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
Bibliography 73
Appendices 76
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
List of Figures
List of Tables
1. Introduction
1.1 Motivation
MEMS have enjoyed substantial growth in the past decade. Industry analysts estimate
intelligent microsystems will have $100 Billion annual market within the next decade
[15]. There will be increasing demands for advanced control systems to be embedded
into microsystems. The only advanced control technology which has made a
significant impact on industrial control engineering is Model Predictive Control
(MPC).
MPC has now been applied in a wide variety of application areas including process
control, aerospace, vehicles, logistics and networks. The reason which is crucial to the
success of MPC in these applications is that the most profitable operation is often
obtained when a process in running at a constraint or even at more than one constraint,
and being able to take account of constraints is exactly a unique strength of MPC.
Another factor cited often for the practical success of MPC is its natural ability to
handle multivariable problems.
For processes with slow dynamics and low sampling rates, MPC is typically
implemented on a dedicated computer. For systems with fast dynamics such as those
in MEMS, an embedded MPC would be an appropriate controller implementation
since the size and the application precludes the use of a dedicated computer. MPC
implementation based on Field Programmable Gate Array (FPGA) and General
Purpose Processor (GPP) has been proposed in [1] and [11] respectively.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
1.2 Objectives
MPC deals with constraints by reducing control law to an optimization problem,
which is so-called Quadratic Programming (QP) problem. Computational complexity
can become an issue as the on-line optimization is highly computationally demanding,
especially when applying MPC to embedded applications where computational
resource may be a major constraint. Two optimization schemes, Interior Point method
(IPM) and Active Set method (ASM), are most commonly employed as QP solver.
There are several objectives along this line of research:
1. To implement and verify the IPM and ASM algorithms on DSP processors.
2. To compare the computational performances between fixed-point and floating-
point hardware implementation of IPM and ASM algorithms.
3. To compare the performance of IPM versus ASM using fixed-point and floating-
point DSP.
4. To demonstrate the resulting MPC on a chip through a simulated aircraft control
example.
1. Coded the IPM and ASM algorithms in C programs and verified them on
TMS320F2812 and TMS320C6713 as the fixed-point and floating-point DSP,
respectively.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
We summarize the results obtained in this thesis in Figures 1.1 and 1.2, where m c and
nv denote the number of inequality constraints and the number of decision variables,
respectively.
Computational time as a function ol mc on Manor Point Mathod Computational time at a function of mc on Intertor Pant Method
Figure 1.1 illustrates the comparison between fixed-point and floating-point DSP
implementation of IPM algorithm. It can be evidently seen that TMS320C6713 DSP
takes great advantage of its floating-point architecture and therefore performs much
faster operations than TMS320F2812. As a result, we conclude that DSP with
floating-point architecture is more appropriate for real-time industrial applications of
MPC, where the computational speed is substantially considered.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
Computational time as a function o f m c on Interior Point Method Computational time as a function ol mc on Active Set Method
— • — nv=3 A
•T —*—
nv=5
nv=7
•
1 •1 •1 1
V " " l
E 1 1 1
nv=9
-JK
> '
j
| 1 0 0 -
•
-*•
nv«ll
nv=13 - r- - - r- - - l
- 1 - -
1 1
o
' >
& --1*-'
E
, ^ 1 1
Jfr • -%
1 i p f
Com
* ^ i t l ,
#1
m •
—% '
• jr^
30 40 50 60 70 80 90
... -V -- - |v
100
_s 110 120
Number of Constraints mc Number of Constraints mc
2.1 Introduction
Model Predictive Control (MPC) is the only advanced control technology which has
made a significant impact on industrial control engineering. It originated in the
petrochemical process industry, but its use has now been broadly pioneered in high
bandwidth applications, such as ships [32], aerospace [33], vehicles [34], logistics
[35], networks [36], and also microscale devices [11].
Constraints are always present in applications and the main advantage of MPC is the
ability to take account of constraints. A plant is not expected to operate exactly at the
real limits of its capabilities because of the unexpected disturbances from various
sources. But the better the control system is at dealing with such disturbances, the
closer the controller can operate to the constraints. Figure 2.1 illustrates three
hypothetical probability distributions of some controlled output of a plant, and the
constraint beyond which the output should not stray. Distribution (a) shows the
relatively large variance which results from the use of a relatively badly tuned linear
controller, assuming that the plant behaves approximately linearly, and that the
disturbances have a Gaussian distribution. In order to have an acceptably low
probability of violating the constraint, the set-point has to be set relatively far away
from the constraint, and hence the plant operates far away from the optimal point for
the great majority of the time. Distribution (b) shows the smaller variance of the
controlled outputs, which might be achieved by the use of linear optimal control. The
variance has been reduced, thus allowing the set-point to be significantly closer to the
constraint. Since the control law is linear, the distribution remains Gaussian.
Distribution (c) shows the effect of employing predictive control. With the ability of
being aware of the constraint, the controller reacts very differently in response to a
disturbance which pushes the output towards the constraint from that which pushes it
away from it. The controller is therefore nonlinear, and the distribution of the output
becomes significantly unsymmetrical with the possibility to operate the plant with a
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
set-point very close to the constraint, while retaining an acceptably small probability
of violating the constraint.
MPC adopts a receding horizon control strategy. Figure 2.2 shows the basic idea of
the receding horizon concept. A single-input single-output (SISO) discrete time plant
is discussed in this presentation.
The set-point trajectory at any time t, which is the trajectory that the output should
follow ideally, is denoted as s ( t ) . Distinct from the set-point trajectory is the
reference trajectory. The reference trajectory does not necessarily have to coincide
with the set-point. It starts at the current output y(/c) and defines an ideal trajectory
along which the plant should return to the set-point trajectory, e.g. after a disturbance
occurs. The notation r(k + i\k) indicates the predetermined reference trajectory
depends on the conditions at current time k. However, if the knowledge of future
references is unknown, a commonly made simplification is to assume a constant
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
reference from time k and onwards. Here we assume that the current output y(k) can
be measured with the past inputs u{k — 1), u(k — 2),..., but not on the input u(k).
Constraint
k + NP Time
^ Constraint
Input
k k + Nu k + NP Time
Now we consider the simplest case, assuming that the predicted output is required to
be equal to the reference trajectory only at the end of the prediction horizon, namely at
time k + NP. In other words, we only have a single coincidence point at time k + NP.
Also, we assume the optimal input trajectory, which drives the future output of plant
to the reference, remains constant over the prediction horizon, i.e. u(k\k) =
u(k + l\k) = ••• = u(k + NP — l\k). In this case, the receding horizon idea is to
compute an optimal input trajectory {u(k\k),u(k + l|/c), ...,u(/c + NP — l\k)} and
choose one parameter u(k\k) such as to bring the plant output at the end of the
prediction horizon, namely y(k + NP\k) to the required value r(k + NP\k).
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
Once a future input trajectory has been chosen, only the first element of that trajectory
is actually applied as input to the plant. That is, we set u(/c) = u(k\k), where u(/c)
denotes the actual input applied at time k.
Then the whole cycle of output measurement, prediction and input trajectory
determination is repeated. After one sampling interval, a new reference trajectory
r(k + i\k + 1) (i = 2,3,...) is defined; predictions are made over the horizon k +
1 + NP, with i = 1,2,..., NP ; a new input trajectory u(k + 1 + i\k + 1), with i =
0,2, ...,NP — 1 is chosen; and finally the next input u(fe + 1) = u(k + l\k + 1) is
applied to the plant. Since the prediction horizon remains of the same length NP as
before, but slides along by one sampling interval at each step, this way of controlling a
plant is called a receding horizon strategy.
In the simplest case considered above, since there is only one equation at coincidence
point k + NP , y(k + NP\k) = r(k + NP\k) to be satisfied and one parameter to
choose for the future input trajectory, there is a unique solution. However, in more
common situations there are several coincidence points in the prediction horizon (even
all the points k + \,k + 2, ...,k + NP are coincidence points). Therefore the internal
system of the controller, which is used to predict the behavior of the plant, is
overdetermined, i.e. there are usually more coincidence points than parameters. This
leads to the condition that there are more equations to be satisfied than the number of
available variables and the impossibility of choosing the future input trajectory such
that the predicted output coincides with the reference input at all the coincidence
points. We therefore have to be satisfied with an approximate solution. Most
commonly, this approximate solution is the least-squares (LS) solution.
If there are constraints on the inputs and/or outputs, then the simple linear LS solution
has to be replaced by a constrained LS solution. With constraints, it is no longer
possible to obtain a closed-form solution and some form of iterative optimization
algorithm must be employed. If the constraints are in the form of linear inequalities,
then we have a quadratic programming (QP) problem.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
Now suppose that we allow a more complicated future input trajectory. The input is
allowed to change over the next Nu steps. For instance, control horizon Nu = 3 in
Figure 2.2, the input is assumed to vary over the first three steps of the prediction
horizon, but to remain constant thereafter, i.e. u(k + 2\k) — u(k + 3\k) = ••• =
u(k + NP — l\k) or the change in the input Au(/c + i\k) = 0 with i = 3,4, ...,NP , so
that there are three parameters to choose: u(k\k), u(k + l\k) and u(/t + 2\k) .
Generally speaking, /Vu is chosen to be considerably smaller than NP. And we have to
choose u(k\k),u(k + l\k), ...,u(k + Nu — l|/c) and assume that u(k + Nu - l\k) =
u(k + Nu\k) — ••• — u(fc + NP — l\k). Again, we only implement the first element
of the future input trajectory.
where y(/c) £ IRm, u(k) E Rl and x(k) E M.n represent the system output, input and
states respectively. The index k counts time steps. In most cases, DP is set to zero as
the control signal u(k) is not able to affect the output y(k) directly with existing
system delay. The process output can be predicted by iterating the model,
Similarly,
And in general,
10
State-space model has the advantage that it can be used for multivariable processes in
a straightforward manner.
If the predicted outputs are collected into a vector, we obtain an expression for the
vector of predicted outputs,
The above plant model (1) expresses the plant state x in terms of the values of the
input u. However, in predictive control it is often convenient to regard the discrete-
time integration from Au to u as being included in the plant dynamics. As a result, we
consider the required changes in the control signal from one time step to the next:
Au(/c) = u(k) — u(/c — 1). The general aim is that the future output y within the
considered horizons should follow a reference signal r and, at the same time, the
control effort Au necessary for doing so should also be penalized. For many industrial
applications, the plants generally have some physical limitations. Such limitations are
the constraints to system input, output or state variables.
subject to linear inequality constraints on the system outputs, inputs and states,
(7)
U
MIN ^ 7 u " — UMAX (8)
AuMIN < JAuAu < AuMAX (9)
h is the control weighting which expresses the relative importance between the
tracking errors (y — r) and control effort Au.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
11
Figure 2.3 shows the real controller producing the signal u which is passed to the real
plant, and the MPC controller producing the signal Au which is passed to the MPC
plant.
MPC Plant
(A.B.C)
Now we try to derive the MPC control law by using a state-space model with an
incremental input Au rather than the input u.
x(k + 1)
u(k)
(10)
or
Ax(k + 1)
(11)
12
Ax(k + 1 )
(13)
m= y(k +1)
AP 0 BP
,B ,C = [0 I],
CpBp
It can be shown that the predictions based on the augmented state-space model is
or in vector-matrix notation
with
13
1
<p(z)=-zTQz + crz (17)
The cost function is now represented in quadratic form with the constraints in the
form of linear inequalities.
In this thesis, for the convenience of hardware implementation, all constraints are
rearranged into the unique form of input constraints.
h v„z<\-::-]-\
yMAxJy1 _
. [ /;
Vxx(k) (18)
-h ~yMINJ i~Jy
Jz < g (19)
In the above equations, Q = 2*PuTlFu + 2/ is a nv x nv symmetric positive semi-
definite matrix so-called Hessian matrix, where nv = Nul is the number of decision
variables, and c = 2 v P u T ( x F z x' c — r) is nv x 1 in size. / and g have the sizes mc x nv
and mc x 1, respectively, where mc is the total number of inequality constraints.
Equations (17) and (19) tell us that the QP problem is bound up with matrices Q,c,J
andg.
In this way, the constrained MPC problem is converted into a problem of minimizing
an objective function subjecting to constraints as it is in (17) and (19), which is a
standard optimization problem known as the QP problem. A linear programming (LP)
is the special case of a QP when Q = 0, so that the objective function is linear rather
than quadratic.
Basically, the fundamental and major component of MPC controller is the solving of
QP problem as it results in optimal input trajectory for the control process. Therefore,
the ability of solving the QP problem on-line becomes critical when applying MPC to
real-time systems with prescribed response time and embedded applications with
limited computational resource. In the above section, a constrained MPC optimization
problem reduces to a typical QP problem. And there exist many methods in the
literature for solving QP problems. Among these methods, Interior Point method (IPM)
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
14
and Active Set method (ASM) have been proved to be powerful approaches for
solving QP problems, therefore have been the most commonly used. The capacity of
both methods for MPC applications have been investigated in [13], [25], [27]. In this
section, the effective on-line computation algorithms basing on IPM and ASM are
described.
Since Q > 0, the QP problem in (17) and (19) is convex. As a result of the QP
problem being convex, the necessary and sufficient conditions for z to be the global
optimum are given by the Karush-Kuhn-Tucker (KKT) conditions: there must exist
vectors (Lagrange multipliers) A > 0 and (slack variable) t > 0, such that
Qz+JJA = -c (20)
-Jz-t = -g (21)
tTA = 0 (22)
Equations (20) and (21) are called feasibility conditions while the equation (22) is
called complementary condition.
IPM is one of the most popular methods for solving QP problems. The basic idea of
this method is derived from the KKT conditions. The infeasible IPM perturbs the
complementary condition in (22) with the following scalar
Hk=^-^- (23)
mc
where k is the iteration sequence and mc is the number of inequality constraints in (4).
As the iteration goes on, the infeasibility and fik is gradually reduced to zero.
Step 1:
Choose an initial condition (z°, A0, t°) with positive (A0, t°).
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
15
Step 2:
At the kth iteration step, solve for the increments (Az k , AAk, At k ) with
(24)
2 J
and
At = -tk + A-x(onke - TAX) (25)
'% tk "1"
A = ,T = ,e =
^mc tk .1.
In addition,
r{ = -Qzk-JTAk-c (26)
k k /c _1
r = -Jz + g — cr/i A e (27)
Step 3:
Step 4:
Judge the convergence. If the iteration converges, stop the process and the optimal
control zk+1 is obtained; otherwise, go back to Step 2 with (z f c + \ A k+1 , t k + 1 ) and
continue the iteration process.
It can be seen that solving such a QP problem is an iterative process and the main
computational load is in solving equation (24) at each iteration.
16
or
Both schemes above need to inverse a matrix. For (29), the matrix to be inverted at
each iteration is T — JQ~X]T, which has dimension m c x m c . In (30), the matrix
is Q — / T r _ 1 / > which has dimension nv x nv. T — JQ_1JT has a special structure in
that only the diagonal elements change at each iteration. In most practical MPC
applications, mc is generally much larger than nv. In such a situation, (30) is more
attractive for embedded implementation and is adopted for the rest of the thesis.
01: Select an initial condition (z°, A0, t°) with (A°,t o )>0and select a positive
termination threshold 6.
02 for k = 0,1,2,... do
03 A <- diag(Ak) and T *- diag(t k )
04 Solve for Azk with ((? - / T r - 1 / ) A z k = rx* -]Tr~xr^
05 Calculate AAk = r~lr£ - r^jAz"- and At" = -tk + A^^a^e - TAX)
06 a1 <- 1 and a2 *- 1
07 if tk + arAtk < 0 and Ak + cr2AAk < 0 then
08 a-L^ a-!- 0.001 and az<^a2- 0.001
09 end if
10 a - m t a f ^ j t a . 0.001).. max ){a2,0.001})
11 (z k + 1 ,A k + 1 ,t k + 1 ) <- (z k ,A k ,t k ) + ak(Azk,AAk,Atk)
12 ^k+i <_ (tT)k+ix/c+imc-i
17
Here, we have
In Line 4 of the above 1PM algorithm, a linear system in the form of Ax = b has to be
solved, where A is a nv x nv square matrix and b is column vector with nv
components. In this project, it is solved by Gauss-Jordan elimination with pivoting.
ASM is also most commonly used as a general purpose QP solver (see [2] and [17]). It
solves the QP problem (17) and (19) via identifying the active set of constraints w for
its solution z,. The method begins with an initial guess iv° the active set. In the next
iteration, w is refined by deleting a constraint from vv° or adding a constraint tow 0 . In
this way, a guess of the active set is refined iteratively until the exact active set is
found.
Step 1:
Initial a feasible point z° and let its initial working set w° to be zero.
Step 2:
Now we consider in more detail the minimization of the cost function at the kth
iteration. The new cost is
subject to
Av*Azk = 0 (32)
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
18
-JwkAzk =0 (34)
Az k ' -ck'
. A . = . 0 .
(35)
Jwk denotes the matrix obtained by removing the £th row in step 5. Solve the above
systems of linear equations for Azk and A.
Step 3:
If Azk = 0 does not solve (31) and (32), go to step 5 to check if global optimum has
reached. Otherwise, continue.
Step 4:
Suppose that at the kth iteration we have the feasible solution zk. The ASM then find
an improved solution zk + Azk which minimizes the cost (31) without worrying
about the inactive inequality constraints. If this new solution is feasible, i.e. if/(z f c +
Azk) < g, then it is accepted as the next iteration: zk+1 = zk + Az. If not, choose a
step ak, where
fat-/(**!)
an min 1,' min
iSDk [ JtAzk JI
(36)
Ji denotes the ith row of/, and from/ for all i g wk.
With 0 < ak < 1, the new solution becomes z k + 1 = zk + aAz, and a new constraint
becomes active, defined by the index which achieves the minimum in (36), and this
index is added to the active set w.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
19
Step 5:
12: D*~{itw*:Jtte*>O,*0-<l}
13 ifDk is empty then
14 zk+1 «- zk + Az and wk+1 «- wk
15 else
20
21
3.1 Introduction
MPC implemented
in MATLAB
®- Plant simulated
in MATLAB
MPC implemented
on DSP
-o
The development software we adopted in this project is MATLAB R2007b (from The
Math Works Inc.) and Code Composer Studio v3.3 (from Texas Instruments Inc.).
22
In contrast with GPP and its von Neumann architecture, DSP architectures are
designed for repetitive multiply, add and store operations that step sequentially
through data values stored in consecutive memory locations. MAC is a common
operation in computing that computes the product of two numbers and adds that
product to an accumulator. This operation involves at least two operands and for this
reason DSP processors usually support multiple memory accesses in the same
instruction cycle. To be able to fetch two operands and an instruction from memory
into the central processing unit (CPU) in a single instruction cycle, the DSP processor
is equipped with two independent bus systems: program bus and data bus. Texas
Instruments calls this kind of processor a "modified Harvard architecture". This
architecture usually permits the program bus to be used also for access of operands, i.e.
the processor is capable of reading operands not only from data memory but also from
program memory.
The F2812 memory bus architecture contains a program read bus, data read bus and
data write bus. The program read bus consists of 22 address lines and 32 data lines.
The data read and write busses consist of 32 address lines and 32 data lines each. The
32-bit-wide data busses enable single cycle 32-bit operations. On F2812, a 32><32
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
23
F2812 processor has an 8-stage protected pipeline to maximize the throughput. DSP
can ideally achieve one clock cycle per instruction on average with pipelining.
Without pipelining, the number of clock cycles it takes to execute an instruction is 8.
Therefore this pipelining enables the F2812 to execute at high speeds without
resorting to expensive high-speed memories. This protected pipeline prevents a write
to and a read from the same location from occurring out of sequence and therefore
assures the order of results is as written in source code.
24
The F2812 eZdsp prototyping board (Figure 3.2) is adopted as the embedded platform
for carrying out the fixed-point DSP implementation of IPM and ASM.
The F2812 eZdsp incorporates a standard parallel port interface that supports ECP,
EPP, and SPP8 communications and has direct access to the integrated JTAG
interface. In this project, the parallel port is used for exchanging data, like matrices
Q, c,J and g, between PC and DSP.
25
C6713 CPU contains eight independent functional units: six ALUs (two of the six
perform fixed-point operations only) and two multipliers. Two MACs can be ideally
achieved per instruction cycle, for a total of 450 million MACs per second (MMACS).
With six of the eight functional units capable of handling floating-point operations, it
is possible to perform 1350 million floating-point operations per second (MFLOPS).
The C6713 pipeline can dispatch eight parallel instructions every cycle.
26
A USB interface to the embedded JTAG emulation logic on the C6713 DSK is
provided and this allows for code development and debug without the use of an
external emulator. In this project we use USB serial port for the communication
between PC and DSP.
In the above section, we have adopted the DSP development boards for implementing
IPM and ASM algorithms. Some configurations have to be done on both hardware
and software level to make the development board work properly. In this section, a
rapid prototyping environment suitable for exploring the implementation of MPC is
set up.
An XINTF zone is a region in the F2812 memory map that is directly connected to the
external interface. Each XINTF zone can be individually configured with unique read
and write access timing and each has an associated zone chip-select signal.
Zone 6 has a chip-select signal XZCS6AND7 (shared with Zone 7) that is connected
with the chip-select signal of external SRAM IS61LV6416-12T (see Figure 3.4). As a
result, in this project, Zone 6 is set as the external RAM region and toggled when an
access is made to Zone 6.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
27
- < C \ A [ 0 . 181
— ' ; X U | 0 . I5|
SRAM
Q a
a a
AO >> DO
7
8
XEO
XD1
/
A] Dl
9 XD2 /
A2 D2
10 XD3 /
A3 D3
13 XEW /
A4 D4
14 XD5 /
A5 ES
15 XD6 /
A6 D6
16 XET7
/
A7 D7
29 XE» /
A8 E*
30 XD9 /
All EN
31 XD10
/
A12 D10
32 XD11
/
AI3 Dll
35 XDI2
/
AI4 DI2
36 XD13
/
AI5 DI3
37 XDI4
/
Aid D14
38 XD15
/
AI7 1)1 J /
A9
XA17
All)
XZCSSAMJ7n>>- QL
XWE»>>- W_E
XRI)n > > - OE ] ^ CIO
Ipn 0.01UF1~ J~0.01uF
BLE
</> Vl
>>
IS6ILV6416-I2T
However, F2812 is not allowed to access the external program bus and the data bus at
the same time. Compared to a single cycle for internal access to two 32-bit operands,
it takes at least 4 cycles to do the same with external memory.
For C6713 DSK prototyping board, matrices can be completely saved in on-chip
RAM of TMS320C6713 processor. Hence external RAM expansion is not necessary
for this project, though it is provided on C6713 DSK.
The DSP compiler creates multiple portions of code and data called sections. These
sections are categorized into two different groups: initialized and uninitialized. The
initialized group of sections is composed of all code, constants, and initialization
tables. Table 3.1 shows the initialized sections produced by the compiler.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
28
The uninitialized group of sections is composed of variables, the stack, and malloc
memory. Table 3.2 shows the uninitialized sections produced by the compiler.
Table 3.2 Uninitialized sections
Name Contents Restrictions
.bss Global and static variables Low 64K data
.ebss Far global/static variables Anywhere in data
.stack Stack space Low 64K data
.sysmem Memory for malloc functions Low 64K data
.esysmem Memory for far malloc functions Anywhere in data
In this project, matrices defined as global arrays are mainly stored in .cinit and .ebss
sections, and matrices defined as local arrays are stored in .stack section. Generally
global variables should be carefully used because they can potentially be modified
from anywhere and any part of the program may depend on it. However, in this
application the stack might not be capable of storing all the local arrays, so some of
the matrices should be moved to global position.
Once the compiler has generated these sections the linker takes the individual sections
from each source file and combines them to create an output section. The linker
command file (.cmd) is used to tell the linker where to allocate these sections.
Because of external RAM expansion, .cinit and .ebss sectors are assigned to Zone 6.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
29
The linker command file for C6713 DSK is shown below. It can be seen that, in this
project, only on-chip RAM is employed to store variables.
// FILE: C6713dsk.cmd, Linker Command File
MEMORY
{
IVECS: org=0h, len=0x220
I RAM: org=0x00000220, len=0x0002FDE /*internal memory*/
SDRAM: org=0x80000000, len=0x0010000 /*external memory*/
FLASH: org=0x90000000, len=0x0002000 /*flash memory*/
}
SECTIONS
{
.EXT_RAI^l :> SDRAM
.vector;i :> IVECS /*in vector file*/
.text :> IRAM /*Created by C Compiler*/
.bss :> IRAM
.cinit :> IRAM
.stack :> IRAM
.sysmem :> IRAM
.const :> IRAM
.switch :> IRAM
.far :> IRAM
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
30
a ef a %> "Tj H « S, H % U 3 ¥? ia 10 n /* • % *
p* 3lDebug 3 # I 4 »1
& eV Q | 93 [U Q B « i P
f S T
JGELftes
•include "DSP281x_Device.h"
•include <stdio.h>
•include <math.h>
' PSPStflx Hescierfile Include file
13
j j Projects
J j I P M p * (Debug)
•define NV 1
K _ J Dependent Projects
+ _ J Documents
•define MC Jl
<? ^ J DSP/BI05 Config
•define MAXITER 100
1 Generated Ftes
•define TOLMIU B.00001
1) • _JInckjoe
•define DZTOL 0.005
I Ubreries
rt - J Source
•define TUNE 0."Of/,
•define OETOL le-S
2 J ] D5PZ81x_CpuTimers.c
J ] D5P2fllxJ)er"«JUsr.c
J ] DSP28]x_GtabarVanabteDefs c
•define IMTJLAM 0.75
J ] D5P28ix_Gpto.c
^define INITJT B.25
j r ] DSP2filx_PieCtrr.c
J ] D5PZfllx_Pievect.c
float Q[cJV][NV];
J$ DSP281x_SysCb1.c
float c[NV];
Jtl man.t
float J[MC][rJV];
J ] D5P281x_r1eaders_nonB10S.cmd
float giMCJ;
i ] F2812.EzD5P_RAM_Hc.cmd
f l o a t l[UV):
lot I t e r ;
float Exectime;
[*! File View |/»Bookmadts )
Build Complete,
0 E r r o r s , 0 W a r n i n g s , 0 Remarks.
umnvBujw/ ILiJJ iT
% • . H A L T E D : s/w Fie: D:KCStudto_v3.3\HyPro)*«sUPM\5owce\main.e Lnl.Cdl
Complier and linker in CCS should be correctly configured for DSP applications. The
proper configurations of the complier and linker for F2812 eZdsp and C6713 DSK
running in real-time mode are shown in Figure 3.6 and 3.7.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
31
Build Options for IPM.pjt (Debug) I ? | X I Build Options for IPM.pjt (Debug)
General Compile! '• Luker DspBtosE jiider | Link Order \ Geneial j Compile! Lriuar | OspBiosBulder ] Link Order j
g-ii"D:\aSludo_vi3\M)«o|ecK 1 JPM'>E)«>ja'
Category: Category
Target Vernon: |C28w(-v2B}jJ Suppress Banner [-q]
Libr anes
Generate Debug Info: JFul SymboSc Debug lg) 3 Advanced Output Module: |Absokrte Executable (a) 3
Opt Level: fNori Output Flename (o): \Deoug\IPM out
jd
Progjam Level Opt: None
~B Map Flename |-m]:
Autonrt Modet
\Debug\IPM map
Run-TneAutoriitiafeationt-c) g
Heap See (-heap):
Fat Heap Stack l-larhaapl: |
Stack Size (-stack). 10x400
F l Value (A |
Code Entry Point |-e>
Help Help
] He!c
J Help
32
33
This reduces the system to a square system. If ArA is nonsingular (or invertible), then
x = (ATAy1ATb (38)
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
34
However, there are several undesirable aspects to this approach. Using a matrix
inverse to solve a system of equations is more work and less accurate than solving the
system directly by Gaussian or Gauss-Jordan elimination, especially when A is large.
x = A\b
Also, in Line 4 of both IPM and ASM algorithms, Az is derived by solving a linear
system. Since the matrix A of this linear system to be solved is square, the "backslash"
"operator employs Gaussian elimination method to obtain the solution for both IPM
and ASM.
The C-code for the linear equation solver using Gauss-Jordan elimination in this
project is shown below:
void s o l v e L i n E q ( f l o a t *A, f l o a t *b)
{
int i , i r , ic, j , jc;
i n t NUM, a c t s i z e ;
f l o a t temp, temp2;
NUM = NV;
a c t s i z e = NV;
f o r ( i = 0 ; i<NUM; i++)
{
i f ( A [ i * a c t s i z e + i ] == 0)
{
f o r ( j = i ; j<NUM; j++)
{
temp2 = f a b s ( A [ j * a c t s i z e + i ] ) ;
i f ( t e m p 2 >= GETOL)
{
f o r ( j c = 0 ; jc<NUM; jc++)
{
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
35
temp = A [ i * a c t s i z e + j c ] ;
A[i*actsize+jc] =
A[j*actsize+jc];
A [ j * a c t s i z e + j c ] = temp;
}
temp = b [ i ] ;
b[i] = b[j];
b [ j ] = temp;
break;
}
}
}
temp = A [ i + i * a c t s i z e ] ;
A [ i + i * a c t s i z e ] = 1.0;
f o r ( j = i + l ; j<NUM; j++)
{
A[i*actsize+j] = A[i*actsize+j]/temp;
}
b [ i ] = b[i]/temp;
f o r ( i r = 0 ; ir<NUM; i r + + )
{
if(ir != i )
{
temp = A [ i r * a c t s i z e + i ] ;
A [ i r * a c t s i z e + i ] = 0;
f o r ( i c = ( i + l ) ; ic<NUM; ic++)
{
A[ir*actsize+ic] = A[ir*actsize+ic]
- temp * A [ i * a c t s i z e + i c ] ;
}
b [ i r ] = b [ i r ] - temp * b [ i ] ;
}
}
}
}
Let Ma,Mm and Md denote the number of addition, multiplication and division
required for Gauss-Jordan elimination with pivoting in the C-code above. Notice that
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
36
Mm(n) = 0 . 5 n ( n - l ) ( n + l)
Md{n) = 0.5n(n + 1)
In Line 11 of IPM algorithm, the optimal solution z, Lagrange multiplier A and slack
variable t are updated at every iteration of the optimization using the following
equations,
37
2. Solving AAk
AAk = r - 1 ^ - r~VAz k
3. Solving Atk
Recall equation (25),
Atk = - t k + A-V/* k e - TAX)
The following shows the M-code for solving QP problem with IPM in one iteration.
sigma = 0.25;
mc = size(g,1);
lambda = 0.75 * ones(mc,1);
t = sigma * ones(mc,l);
miu = t' * lambda / mc;
z = zeros(size(Q,1),1) ;
% E n d o f step 1 of IP Solver
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
38
% Step 2 of IP Solver
invt = ones(mc,1)./t;
vtmp = sigma * miu * ones(mc, 1);
% r2 = diag(lambda) * (-J * z + g) - vtmp;
sla = J * z;
sib = g - sla;
sic = diag(lambda) * sib;
r2 = sic - vtmp;
% dt = -t + g - J * (z + dz);
s5a = sib - t;
dt = s5a - s4a;
% E n d o f s t e p 2 of IP Solver
#define I N I T _ T 0.25
#define INIT_LAM 0.75
/ / Step 1 o f IP Solver
f o r ( i = 0 ; i<MC; i++)
{
lam[i] = INIT_LAM;
t [ i ] = INIT_T;
}
f o r ( j = 0 ; j<NV; j++)
{
z [ j ] = 0;
}
/ / Step 2 o f IP Solver
f o r ( i = 0 ; i<MC; i++)
{
invt[i] = l / t [ i ] ;
}
vtmp = I N I T _ T * miu;
f o r ( i r = 0 ; ir<MC; i r + + )
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
40
{
s2a[ir] = i n v t [ i r ] * lam[ir];
}
f o r ( j r = 0 ; jr<NV; j r + + )
{
f o r ( j c = 0 ; jc<MC; jc++)
{
s2b[jr][jc] = 3[jc][jr] * s2a[jc];
}
}
f o r ( j r = 0 ; jr<NV; j r + + )
{
for (ic=0; ic<NV; ic++)
{
tpvarl=0;
for(jc=0; jc<MC; jc++)
{
tpvarl = tpvarl + s2b[jr][jc] * J[jc] [ i c ] ;
}
s2c[jr][ic] = tpvarl;
}
}
f o r ( i r = 0 ; ir<NV; i r++)
{
f o r ( i c = 0 ; ic<NV; ic++)
{
matA[ir][ic] = Q[ir][ic] + s2c[ir][ic];
}
}
f o r ( i r = 0 ; ir<MC; i r + + )
{
s3a[ir] = i n v t [ i r ] * r 2 [ i r ] ;
s3b[ir] = lam[ir] - s3a[ir];
}
f o r ( j r = 0 ; jr<NV; j r + + )
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
{
tpvarl = 0;
for(jc=0; jc<MC; jc++)
{
t p v a r l = t p v a r l + 3 [ j c ] [ j r ] * s3b[jc]; / / s3c
}
s3d[jr] = - c [ j r ] - t p v a r l ;
}
f o r ( j r = 0 ; jr<NV; jr++)
{
tpvar2 = 0;
solveLinEq(matA[0], s 3 f ) ; / / Line 4
f o r ( i r = 0 ; ir<NV; ir++)
{
dz[ir] = s3f[ir];
}
f o r ( i r = 0 ; ir<MC; ir++)
{
tpvarl = 0;
42
d t [ i r ] = tpvar3 - t p v a r l ;
}
The expressions in Table 3.3 (see Appendix C for details) are derived for the worst
case scenarios and give the number of floating-point arithmetic operations executed
by one iteration of the IPM algorithm, parameterized by the number of decision
variables nv, and the number of inequality constraints mc. In this analysis, we
separated out the number of arithmetic operations for the solving of the linear system.
For IPM algorithm, the number of 32-bit floating-point variables needed to be stored
on DSP is:
Note that the storage given in the above equation does not include variables used for
variable swapping, loop indices, counters and flags.
43
-1
\Azk] Q Jwk -c - Qzk
[ X\ Jwk 0 [ 0 J
<2o ^o
% Step 2 of AS Solver
Wk = [];
JWk = J(Wk,:);
mwork = length(Wk);
ck = c + Q*z;
QO = [Q J W k ' ; JWk z e r o s ( m w o r k , m w o r k ) ] ;
PO = [ - c k ; z e r o s ( m w o r k , 1 ) ] ;
d z l a m = QO\PO;
dz = d z l a m ( l : n v ) ;
lambda = dzlam(nv+1:end) ;
% End of S t e p 2 of AS S o l v e r
/ / Step 1 of AS Solver
f o r ( j = 0 ; j<MC; j++)
{
lambda[j] = 0;
w k [ j ] = 0;
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
Dk[j] = 0;
}
/ / Step 2 of IP solver
f o r ( i r = 0 ; ir<NV; i r + + )
{
temp2 = 0;
f o r ( i c = 0 ; ic<NV; ic++)
{
tempi = Q [ i r ] [ i c ] * z [ i c ] ;
temp2 = temp2 + t e m p i ;
}
c k [ i r ] = c [ i r ] + temp2;
}
countAct=0;
for(j=0; j<MC; j++)
{
if(Wk[j]==l)
countAct++;
}
f o r ( i r = 0 ; ir<NV; i r + + )
{
f o r ( i c = 0 ; ic<NV; ic++)
{
QO[ir][ic] = Q [ i r ] [ i c ] ;
}
PO[ir] = - c k [ i r ] ;
}
temp3 = 0;
f o r ( j = 0 ; j<MC; j++)
{
if(wk[j]==l)
{
f o r ( j c = 0 ; jc<NV; jc++)
{
QO[NV+temp3][jc] = J [ j ] [ j c ] ;
QO[jc][NV+temp3] = J [ j J [ j c ] ;
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
45
}
P0[NV+temp3] = 0;
temp3++;
}
}
temp2 = 0;
f o r ( j r = N V ; jr<(countAct+NV); jr++)
{
f o r ( j c = N V ; jc<(countAct+NV); jc++)
{
QO[jr][jc] = 0;
}
}
f o r ( i = 0 ; i<NV; i++)
{
dz[i] = PO[i];
}
temp3 = 0;
f o r ( j = 0 ; j<MC; j++)
{
i f Cwk[j] == 1)
{
lambda[j] = PO[NV+temp3];
temp3++;
}
else
lambda[j] = 0;
}
46
Since the constraints associated with the minimum and maximum cannot be active
simultaneously, the total number of active constraints cannot exceed mc/2. Again, the
analysis in Table 3.4 is based on the worst case scenarios for ASM algorithm. For
example, we assume that the number of active constraints in the ASM algorithm is
always m c / 2 so that the matrix Jwk in Line 4 always hasm c /2 x n v entries. But, in
reality ]wk may have a smaller size.
Table 3.4 Worst case computation complexity for ASM algorithm
ASM algorithm Number of operations per iteration
Addition n£ + nv(2mc + 1) - mc+Ma(nv + 0.5m c )
Multiplication n$ + nv(2mc + 1) + Mm(nv + 0.5m c )
Division mc + Md(nv + Q.5mc)
For a single QP problem, we are interested to investigate its execution time, number
of iterations and storage usage of the computation.
Timing performance can be measured by hardware timer on DSP or the profile clock
in CCS. The timer value can be read to get a measurement of clock counts and
execution time.
CPU-Timer 0 is for general use on F2812. This timer is a 32-bit countdown counter
which is decremented at the CPU clock speed divided by the prescale value setting. In
this project the counter frequency is set to be equal to F2812 CPU clock (150 MHz).
47
#define MC 31
#define MAXITER 100
float Q[NV][NV];
float c[NV];
float J[MC][NV];
float g[MC];
f l o a t z[NV]; / / solution
int iter; / / number of i t e r a t i o n s
f l o a t Exectime; / / execution time
void Timing(void)
{
long t i m e r v a l u e ;
timervalue = CpuTimerORegs.TIM.all;
timervalue = OXFFFFFFFF - t i m e r v a l u e ;
Exectime = timervalue / 150; / / us
}
v o i d main ( v o i d )
{
InitSysCtrlO;
initCpuTimersO ;
StartCpuTimerOO;
I P M _ f u n c t ( ) ; / / or ASM_funct();
StopCpuTimerOO ;
TimingO ;
}
There are two 32-Bit general-purpose Timers on C6713, Timer 0 and Timer 1. In this
project, Timer 1 is adopted to measure clock counts and computational time, operating
at a quarter of C6713 CPU clock.
/ / #include " c 6 7 1 3 d s k i n i t . h "
#include <stdio.h>
#include <math.h>
#include " c s l . h "
#include " c s l _ t i m e r . h "
#define NV 3
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
#defnne MC 31
#define MAXITER 100
f l o a t Q[NV][NV];
f l o a t c[NV];
float :[MC][NV];
f l o a t g[MC];
f l o a t z[NV]; / / solution
int Iter; / / number of iterations
f l o a t Exectime; / / execution time
s t a t i c TlMER_Handle hTimerl;
void main(void)
{
long timervalue = 0;
long timervalue_start = 0;
long timervalue_end = 0;
void TimerEventHandler(void);
extern far void vectors();
hTimerl = TIMER_open(TIMER_DEVl, TIMER_OPEN_RESET);
TlMER_confi gArgs(hTi merl,
Timercontrol, // use predefined control value
OxFFFFFFFF, // set period
0x00000000 // start count value at zero
);
TlMER_start(hTimerl);
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
49
t i m e r v a l u e _ s t a r t = TIMER_getCount(hTimerl);
lPM_funct(); / / or ASM_funct();
timervalue_end = TIMER_getCount(hTimerl);
timervalue = timervalue_end - t i m e r v a l u e _ s t a r t ;
Exectime = timervalue / 56.25; / / us
TlMER_pause(hTimerl);
TiMER_close(hTimerl);
}
Memory usage configuration of every section is amply reported by .map file created
by CCS.
are randomly generated on a PC using MATLAB by the method proposed in [17] and
saved as a .mat file consisting of matrices Q, c,J and g.
QP problems generated by this method always have the solutions (1,1, ...,1) and a
feasible point (0,0, ...,0). Therefore, the initial point z° = (0,0, ...,0) is used for both
of the IPM and ASM algorithms.
Also, the number of active constraints at the solution never exceeds the number of
decision variables. This is to ensure that the Lagrangian multipliers always exist at the
solution. Finally, the QP problems are scaled such that the eigenvalues of Q lie in the
range from 0.5 to 20.
Due to the large number of the test sets (5400), an automatic test methodology is
introduced in this section to substantially streamline testing process and reduce testing
time.
In this project we use a MATLAB component Embedded IDE Link CC 3 to carry out
our test methodology. This component provides a collection of methods that use the
application program interface (API) of CCS to communicate between MATLAB
software and a processor in CCS to build a connection.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
50
Start testing
Set up connection
between PC and DSP
NO
NO
With Embedded IDE Link CC, we can use MATLAB and/or Simulink software to
interactively analyze, profile and debug the execution behavior of processor-specific
code for solving within CCS. Figure 3.8 shows how Embedded IDE Link CC is used
to verify QP problems in this project.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
51
Step 1:
Set up connection between PC and DSP by entering the following command at the
MATLAB software prompt:
ccsboardinfo
A list of the processors and boards that CCS recognizes as installed on the PC is
returned. Alternatively, the following code can be called to set up:
[boardnum,procnum] = boardprocsel;
MATLAB returns a window (see Figure 3.9) with the information of the processors
and boards installed on the PC.
SelectingBoardnum Et Procnum C j H Pk) I •> Selecting Boardnum a Procnum
Code Composer Studio(tm) has configured a single processsor Code Composer Studio(tm) has configured a single processsor
Boa>dF2812eZdsp BoardC8?13DSK
Proc: cpu_0 Proc: C P U J
Type: TMS320F280C Type: T M S 3 2 0 C 6 K 1 X
This device w i be used for the Code Composer Link T his device wJ be used for the Code Composer Link
OK CK
Issue the command "ticcs" to set up the connection between MATLAB and CCS that
refers to selected board and processor:
cc = ticcs('boardnum',boardnum,'procnum',procnum);
This connection is a MATLAB link object that is saved as a variable "cc" and directs
actions in MATLAB to the processor. If CCS was not running before creating the
object, CCS launches and runs in the background. Enter the code below to force CCS
to be visible on the PC.
visible(cc,1);
Step 2:
52
Step 3:
Initialize global arrays Q, c,J and g in the C program on DSP by sending the matrices
from MATLAB to CCS.
The CCS is able to work in MATLAB environment and accesses the variables
declared in the C program through the "createobj" command. Command "createobj" is
such a method supported by Embedded IDE Link CC that creates a MATLAB
embedded object to represent a C variable in the program in CCS. Variables can be
read or modified in a C-like way working with the embedded object that "createobj"
returns. Here we create embedded objects for global arrays Q, c,J, g and variable "z",
"Exectime" and "Iter" in the C program.
Q_obj = c r e a t e o b j ( c c , ' Q ' ) ;
c_obj = c r e a t e o b j ( c c , ' c ' ) ;
J_obj = c r e a t e o b j ( c c , ' J ' ) ;
g_obj = c r e a t e o b j ( c c , ' g ' ) ;
z_obj - c r e a t e o b j (cc, ' z ' ) ;
Exectime_obj = c r e a t e o b j ( c c , ' E x e c t i m e ' ) ;
Iter_obj = createobj(cc, ' Iter ' ) ;
Matrices Q,c,J andg are loaded from MATLAB workspace in form of .mat files to
CCS by "write" function.
write(Q_obj,Q);
write(c_obj,c);
write(J_obj,J);
write(g_obj,g);
Step 4:
53
end
Step 5:
Work with the CCS project from MATLAB software by reading results ("z",
"Exectime" and "Iter") from the processor.
iteration = read(Iter_obj);
z_ip = read(z_obj);
executiontime = read(Exectime_obj);
Step 6:
Compare the DSP solution of the QP problem with MATLAB QP solver "quadprog".
If the difference between the two solutions is within a tolerance we have defined, e.g.
0.001, we consider the DSP solution is accurate. Then we get the number of satisfied
solution per 100 QP problems, average computational time and average number of
iterations.
Step 7:
Objects that created by Embedded IDE Link CC software have COM handles to CCS.
Closing MATLAB software removes these COM handles automatically. Alternatively,
project closes in CCS with the following command.
close(cc,projfile,'project')
The following table shows the sample test suite for DSP implementation of constraint
MPC.
clear all
TOL = le-3; i acceptable accuracy
N = 100; % how many test sets
nv = 3; % size of QP
mc = 31;
% set up communication between MATLAB and CCS
[boardnum,procnum] = boardprocsel;
cc = ticcs('boardnum',boardnum,'procnum',procnum);
cd(pathname);
Q obj = createobj(cc,'Q');
for i= 1:N
restart(cc); % restart the program on DSP
% move the program counter to the beginning of main
goto(cc,'main');
run (cc, 'main');
55
exectime_aver = exectime_all/NCorrect;
iter_aver = iter_all/NCorrect; % average number of iteration
It can be seen that with Embedded IDE Link CC, QP problem verification is
automated simply by writing MATLAB software scripts.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
56
4. Simulation Results
57
Computational time as a function of mc on Active Set Method Computational time as a (unction of nv on Active Set Method
Figure 4.3 and 4.4 shows the computational time of C6713 implementation. It can be
evidently seen that C6713 DSP takes great advantage of its floating-point architecture
and higher CPU frequency and therefore performs much faster operations than F2812.
Also, since the on-chip RAM of C6713 is used (rather than the off-chip RAM of
F2812), there is no significant delay when accessing data. As a result, DSP with
advanced architecture is more appropriate for real-time industrial applications of MPC
where the computational speed is substantially considered.
Computational time as a function o( mc on Interiof Point Method Computational time as a function of nvon Interior Point Method
58
Computation* urns a t a tunctktn ol mc on Active Set Method Computational time as a (unction 01 mac Active Set Method
ASM (Figure 4.2 and 4.4) takes a shorter computational time than IPM (Figure 4.1
and 4.3) when the number of decision variables nv is small. However, as nv grows,
the computational time ASM requires increases faster than IPM and therefore is
longer than IPM after nv reaches a certain value.
If we ignore the arithmetic operations for solving the linear systems in both IPM and
ASM algorithms, the IPM algorithm requires approximately nv2mc + 50m c more
additions and nv2mc + 27 mc more multiplications than the ASM algorithm. The
extra operations are used for matrix multiplications in forming the linear system in
Line 4 of the IPM algorithm (i.e., forming the matrix/ 7 /" - 1 / and vectors rx andr 2 ).
Also, recall the number of addition, multiplication and division required for Gauss-
Jordan elimination with pivoting discussed in Section 3.4.2.
Ma(n) = 0 . 5 n ( n - l ) ( n + l )
Mm(n) = 0 . 5 n ( n - l ) ( n + l )
Md(n) = 0.5n(n + 1)
ASM may need to solve a system of nv + mc/2 linear equations, i.e. n = nv + mc/2.
By contrast, IPM only needs to solve a system of nv equations. The number of
operations required by IPM algorithm to solve a linear system is less sensitive to nv
than that required by ASM algorithm. For the reasons above, as nv grows, the
computational load for ASM increases faster than IPM.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
59
For IPM algorithm (Figure 4.5 and 4.7), the average number of iterations lies between
12 and 16 for most values of (n„,m c ) and is insensitive to nv andm c . For ASM
(Figure 4.6 and 4.8) algorithm, the average number of iteration grows linearly with nv
andm c , and almost reaches 55 for the largest QP problem size. From the above
comparison we conclude that IPM has a better scalability than ASM, as the number of
inequality constraints mc and decision variables nv weakly affect the number of
iterations required by the IPM to solve one QP problem.
There is only a small distinction between the fixed-point and floating-point DSP
performance on the number of iterations. This is due to the different accuracy between
fixed-point and floating-point representation. The loss of precision on fixed-point DSP
forces an earlier termination of the iteration as compared to the floating-point DSP.
Average number ot iterations as a unction of m C on Interior Point Method number ol iteratkine M • function of nv on Interior Point Method
—•— nv*3
nv«5 - t-
• • - » • - n*«7 - - -A- - - - -'
& 15.5
• nv=11
nv=9
" _ L - _ i_ _ _ L
\-R\\--\
—*—m*13
v - § i •V L
;/: *•- " -V ""
13.5
-£--\/ *
• -.^-^y - - - ,
r~ r
1/ r
r
/A
1
13 - r . . ,-
12 5 - *•
60
— • — nv=3
nv=5
• nv=7
rtv=9
• nv=11
— * — nv=13
I 13
— A — mc=31
mcMI
—W~ - mc=51
1 mc=81
- mc=7l
S 40 — 4 — mc*B1
I • mc=91
••• W W I
- mc«111
10 11 12 13
Number of Decision Variables nv
61
4.2 Storage
Table 4.1 shows the storage usage of F2812 processor for solving the QP problem of
size mc = 111 and n„ = 13. Mapping of the variables to the sections are done
automatically by the C compiler of CCS. In Table 4.1 we only take storage usage
of .cinit, .ebss and .stack sections into account.
Table 4.1 Storage usage for solving a QP problem using IPM and ASM
1PM ASM
29426B 81258B
As seen from Table, the ASM algorithm requires more storage, since it needs to solve
a system of nv + mc/2 linear equations. For the largest generated QP problem, the
memory requirement is around 80 KB, which greatly exceeds the storage of on-chip
RAM on F2812 (36 KB). In contrast, the IPM algorithm only needs to solve a system
of nv linear equations. Therefore, the ASM algorithm needs a larger memory for
storing the matrices that are needed during the solving of the linear equations.
The DSP implementation does not always give the accurate solution mainly due to the
finite wordlength effects of embedded microprocessors. The solutions that return from
DSP are compared to the results that are solved by "quadprog" function in MATLAB.
The DSP results within ±0.001 tolerances are acceptable.
Most commonly, the round-off error, which is the difference between the calculated
approximation of a number and its exact mathematical value, occurs and propagates
through the iterations during the DSP implementation, leading IPM or ASM to
terminate at a point where the objective function value deviates severely from the
optimal value.
Note that not all the inaccurate solutions are caused by round-off error. In some
occasions of our DSP implementation, overflow occurs which eventually cause
unboundedness of the iterations or instability to the IPM or ASM algorithms.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
62
Compared to the IPM algorithm, the failure rate for the ASM algorithm is lower,
which is about 18 out of 5400 QP problems for fixed-point solution. Again, the failure
cases are mainly due to overflow, which usually occurs when the linear system is
solved for a very small Azk. Unfortunately, the convergence of the ASM algorithm
indeed requires Azk being close to a zero vector, which potentially leads to failure.
For C6713 implementation, the numerical error appears less frequent (13 out of 5400
for IPM algorithm and none is detected for ASM algorithm) because the maximum
representable number is enormously larger.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
63
5.1 Introduction
To verify the DSP implementation of MPC, the Cessna Citation 500 Aircraft Model
from Example 2.7 in [2] is currently examined as the actual controlled system by the
embedded model predictive controller implemented on TMS320F2812 and
TMS320C6713. In this simulation we only verify Case 2 and Case 3 of Example 2.7
in [2].
The model has the elevator angle (rad) as its only input, and the pitch angle (rad),
altitude (m) and altitude rate (m/sec) as outputs. The elevator angle is limited to ±15°
(±0.262 rad), and the elevator slew rate is limited to ±307sec (±0.524 rad/sec). These
are limits imposed by the equipment design and cannot be exceeded. For passenger
comfort the pitch angle is limit to ±20° (±0.349 rad).
A MPC controller is designed with sampling interval of 0.5 sec, prediction horizon NP
= 10 and control horizon Nu = 3. The following constraints
ju| < 0.262, \Au\ < 0.524, and \yl\ < 0.349
64
board (see Figure 5.1). The plant and controller interact through parallel port (for
F2812) or USB cable (for C6713).
From Figure 5.2, it can be seen that the pitch angle constraint is active for much of the
transient, which results in the altitude rate being constant for part of the transient.
Next, a constraint on the altitude rate with a limit of 30m/sec is introduced and the
number of constraints is now increased to 52 (Case 3). Again, the IPM algorithm
realized on F2812 is able to solve the QP problem without violating the given
constraints. Figure 5.3 shows the response to 400m step change in altitude set-point,
with the added constraint, corresponds to Figure 2.8 in [2]. The altitude rate constraint
is active during most of the control process while the pitch angle constraint is only
active at the beginning of the process. The pitch angle constraint is briefly active near
the beginning of the transient, but that the altitude rate constraint is active for most of
the time.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
0 !1 { <
100 •--/{-----j
'
-5 0
10 15 20 0 10 15 20
Time (sec) Time (sec)
Altitude rate and constraint Elevator angle and constraint
-. 50 - - - - - J. t -r- — ~ -
<u in
| 40
e
vator a ngle (d
f 30
4 rk.
r^
C3
2 20 .!...[..
/•+ .4...
„ . : \
•
3
I 10
1 LU
V_
5 10 15 20 5 10 15 20
Time (sec) Time (sec)
_ 15
9
2- 10 j —
i 300
\--j/-\- -
s
-c 200
x± 5
0-
0 ; \. 100 ..../..-.A :
/ ,
-5 0 —£_ 1 1 1
10 15 20 > 5 10 15 20
Time (sec) Time (sec)
Altitude rate and constraint Elevator angle and constraint
-& 20
I L..i
u - -
\ !
I !
- 1 " . •
i•
v
g
10
o p-i uu
i:
_ . „ „ \ J - - o ' > i
ra
; i ^ j -10 .-- -T , ,--
i ' i
ill
5 10 15 20 5 10 15 20
Time (sec) Time (sec)
20 400 ly-
/
1 I 1
15
I , 1, -§• 300
•' i .
• 7 i
10 • /
o i /
| 200 • , - - - - .
5l . . . . L' i
; \ 3
100
0 / >
-5 n
10 15 20 10 15 20
Time (sec) Time (sec)
Altitude rate and constraint Elevator angle and constraint
50 _ i _ L —
40 10
30
20 / I
a>
c _^J^__
o
ro
10
0 f-'r-h" UJ
-10
1
~ T ~
' »
!
1
5 10 15 20 5 10 15 20
Time (sec) Time (sec)
— —; J_. 1 -
400
/V 300
y-
I 1
:
_ L
It.Vj o A
I : : \i
"S 200
100 -/
/
0
10 15 20 0 5 10 15 20
Time (sec) Time (sec)
Altitude rate and constraint Elevator angle and constraint
\ 30 10
\ :
1s 20 /
f pJ U^
i i
(U f 1
•*->
5 i
0) 10
-o i ro
&
3
~ 0f
<
i
1 -10
I
5 10 15 20 5 10 15 20
Time (sec) Time (sec)
67
The MPC controller using ASM algorithm as the QP solver gives almost the same
response for Case 2 and Case 3 as compared with IPM algorithm (see Figure 5.4 and
5.5). However, it requires less computational time as compared with that using IPM
algorithm, which is introduced in the next section.
Figure 5.6 presents the number of iterations and computational time recorded at every
sampling instant IPM requires for solving the aircraft QP problem on F2812 operating
at 150 MHz. For Case 2, it can be seen that when the pitch angle constraint becomes
active from 4 sec to 9 sec, both the number of iterations and computational time
increase dramatically. After the transient they fall to a relatively low number. The
most difficult QP problem occurred at about 5 sec. For Case 3, the most difficult QP
problem happens at 9 sec, which takes more than 30 iterations to solve. ASM takes
less computation cost to solve the aircraft QP problem and runs about 14 and 10 times
faster in Case 2 and Case 3, respectively (see Figure 5.7).
For floating-point DSP C6713, the clock frequency for both implementations of the
IPM and ASM algorithms are set to 225 MHz.
Comparing the results of F2812 and C6713 implementations, it can be seen that
although the trends are similar, the execution time needed for the latter was reduced
tremendously (see Figure 5.8 and 5.9). C6713 gives a better result not only because of
a higher clock frequency but, more importantly, its architecture.
The aircraft model simulation again proves that ASM is much more effective than
IPM to handle the QP problem and therefore offers time saving when the number of
decision variables is small.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
30 _; „ . A :.
14 . J LL
a . 25 * -/ -\ -J
c
§ 12 , ^ L 1 J 1-. --. o
2 | 20
9
- 10 / L
£
* ~ i A A A i 15
8 1 _jV- ___!_. V 10 y v y ^ _ l j _ \
10 15 20 5 10 15 20
Time (sec) Time (sec)
x -jo Interior Point (Case 2) 105lnterior Point (Case 3)
: 'VIEE:
V
J\S \ \ !
10 5 10 15 20
Time (sec) Time (sec)
V
J .
PG • ' v \J
0 5 10 15 20
Time (sec) Time (sec)
Active Set (Case 2) x io" Active Set (Case 3)
w
3- 3
o
E
52
CO
o J \/v ; I rj\ ;
11
Q.
E
5 10 15 20 8 ol 5 10 15 20
Time (sec) Time (sec)
5 10 15 20 5 10 15 20
Time (sec) Time (sec)
Interior Point (Case 2) Interior Point (Case 3)
5 10 15 5 10 15 20
Time (sec) Time (sec)
15 1 1 1
1"
: v \J
5 10 15 20 5 10 15 20
Time (sec) Time (sec)
Active Set (Case 2) Active Set (Case 3)
5 10 15 5 10 15 20
Time (sec) Time (sec)
70
Table 5.1 lists the average computation time for solving a QP problem in the aircraft
example.
Table 5.1 Average computational time per aircraft QP problem
Aircraft QP problem IPM ASM
F2812 Case 2 47.20 ms 33.72 ms
(150 MHz) Case 3 84.66 ms 82.12 ms
C6713 Case 2 2.54 ms 0.36 ms
(225 MHz) Case 3 4.91 ms 0.72 ms
5.3.3 Storage
The required memory size can be estimated by the equations given in Section 3.5.5
and 3.6.5. Table 5.2 shows the storage usage of F2812 implementation for both Case 2
and Case 3. Again, we only consider the usage of .cinit, .ebss and .stack sections in
F2812 processor. Because the ASM algorithm needs to solve linear systems of larger
size, more RAM storage are used.
Table 5.2 Storage usage of aircraft QP problem
"Aircraft QP problem IPM ASM~
Case 2 4820 B 8754 B
Case 3 6724 B 16690 B
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
71
6.1 Conclusion
In this paper, we explored the applicability of the "MPC on a Chip" idea using both
fixed-point and floating-point DSP. Interior Point method and Active Set method with
dense matrix formulation, was employed to solve the QP problem. A rapid
prototyping environment suitable for exploring the various implementation issues to
bring MPC onto DSP was described. Simulation tests show that the DSP could be
used to implement a reasonably sized constrained MPC controller and floating-point
architecture DSP provide much faster computational speed than fixed-point. We have
also found that, in general, ASM should perform better than IPM when the numbers of
inequality constraints and decision variables are small. Otherwise, IPM should be a
better choice due to its scalability. The instability was mainly due to numerical error,
which was found to be more serious in IPM than in ASM in current implementations.
2. In this project, we have used MATLAB/Simulink and CCS to take a MPC solution
from design to embedded implementation. Further effort should also be directed at
achieving a higher level of automation in implementing embedded MPC
technology. This would facilitate the embedded system community to explore the
design space available in realizing a customized embedded MPC design.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
72
4. We may adopt the method of [13] to formulate our MPC problem into a QP
problem. In this method, the states and inputs during the prediction horizon are
kept as variables, in order to get a banded Q. For large QP problems, this may
results in lesser storage, and faster solution of the linear system of equations in
both IPM and ASM.
5. In MPC applications, constraints are usually imposed to restrict the minimum and
maximum values of the inputs, outputs, or states. This may result in a sparse/ with
special structures. It is meaningful to see if these structures can be exploited to
reduce the storage and computational complexity of solving the linear systems.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
73
Bibliography
[1] K. V. Ling, S. P. Yue and J. M. Maciejowski, "A FPGA implementation of model predictive
control," in Proceedings of the 2006 American Control Conference, Minneapolis, Minnesota,
USA, pp. 1930-1935, June 2006.
[2] J. M. Maciejowski, Predictive Control with Constraints. Prentice Hall, 2002.
[3] M. H. He and K. V. Ling, "Model predictive control on a chip," in the 5th International
Conference on Control & Automation, Budapest, Hungary, pp. 528-532, June 2005.
[4] K. V. Ling, EE6225 Process Control - Model Predictive Control. School of Electrical and
Electronic Engineering, Nanyang Technological University, Lecture Notes, 2009.
[5] Y. Y. Su, "Embedded model predictive control on a microcontroller," School of Electrical
and Electronic Engineering, Nanyang Technological University, Bachelor Thesis, 2008.
[6] K. V. Ling, B. F. Wu and J. M. Maciejowski, "Embedded model predictive control (MPC)
using a FPGA," in Proceedings of the 17th IF AC World Congress, Seoul, Korea, pp. 15250—
15255, July 2008.
[7] M. H. He and C. Chen, "An effective online computation scheme for constrained model
predictive control in embedded systems", in Proceedings of the 6th World Congress on
Intelligent Control and Automation, Dalian, China, pp. 6699-6703, June 2006.
[8] D. W. Clarke, C. Mohtadi, P. S. Tuffs. "Generalized Predictive Control - Part I. The Basic
Algorithm", in Automatica, vol 23, no. 2, pp. 137-148, 1987.
[9] M. D. Adams, EE6401 Advanced Digital Signal Processing - DSP Architectures &
Applications. School of Electrical and Electronic Engineering, Nanyang Technological
University, Lecture Notes, 2009.
[10] R. Chassaing, Digital Signal Processing and Applications with the C6713 and C6416 DSK.
John Wiley & Sons, 2005.
[11] L. G. Bleris and M. V. Kothare, "Implementation of model predictive control for glucose
regulation on a general purpose microprocessor," in Proceedings of the 44th IEEE
Conference on Decision and Control and the 2005 European Control Conference, pp. 5162—
5167, December 2005.
[12] R. Fletcher, Practical Methods of Optimization, 2nd Edition. John Wiley & Sons, 1987.
[13] S. J. Wright, "Applying new optimization algorithms to model predictive control," in
Proceedings of Chemical Process Control V, pp. 147-155, 1997.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
74
[14] M. H. He, C. Chen, and X. Y. Zhang, "FPGA implementation of a recursive rank one
updating matrix inversion algorithm for constrained MPC", in Proceedings of the 6th World
Congress on Intelligent Control and Automation, Dalian, China, pp. 733-737, June 2006.
[15] L. G. Bleris and M. V. Kothare, "Real-time implementation of model predictive control," in
2005 American Control Conference, Portland, OR, USA, pp. 4166-4171, June 2005.
[16] E. F. Camacho and C. Bordons, Model Predictive Control, 2nd Edition. Springer, New York,
1999.
[17] M. L. Lenard and M. Minkoff, "Randomly generated test problems for positive definite
quadratic programming," in A CM Transactions on Mathematical Software, vol. 10, no. 1, pp.
86-96, March 1984.
[18] K. S. Low, K. Y. Chiun and K. V. Ling, "A DSP-based servo system using generalized
predictive control", in Proceedings of the Power Conversion Conference, vol. 1, pp. 507-512,
August 1997.
[19] K. S. Low, K. Y. Chiun and K. V. Ling, "Evaluating generalized predictive control for a
brushless DC drive", in IEEE Transactions on Power Electronics, vol. 13, no. 6, pp. 1191-
1198, November 1998.
[20] J. Lei, X. Y. Jin and R. Wang, "Parallel computation of matrix multiplication and DSP
implement", in Chinese Journal of Sensors and Actuators, vol. 19, no. 3, pp. 737-740,
June 2006.
[21] L. Wanhammar, DSP Integrated Circuits. Academic Press, 1999.
[22] K. S. Low, EE6203 Computer Control Systems. School of Electrical and Electronic
Engineering, Nanyang Technological University, Lecture Notes, 2008.
[23] A. Bemporad, "Model predictive control design: new trends and tools", in Proceedings of the
45th IEEE Conference on Decision & Control, San Diego, CA, USA, pp. 6678-6683,
December 2006.
[24] J. Moreno and M. Medina, "Matrix computations on arrays of DSPs", in Proceedings of the
International Conference on Application Specific Array Processors, pp. 496-510, August
1992.
[25] C. V. Rao, S. J. Wright, and J. B. Rawlings, "Application of interior-point methods to model
predictive control," in Journal of Optimization Theory and Applications, vol. 99, no. 3, pp.
723-757, December 1998.
[26] B. J. Falkowski, EE6404 VLSI Digital Signal Processors. School of Electrical and Electronic
Engineering, Nanyang Technological University, Lecture Notes, 2009.
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
75
Appendix A
Table A-l Average clock counts required for solving a Q P problem using IPM on F2812
3 5 7 9 11 13
Table A-2 Average clock counts required for solving a QP problem using ASM on F2812
3 5 7 9 11 13
Table A-3 Average clock counts required for solving a QP problem using IPM on C6713
mc N. 3 5 7 9 11 13
Table A-4 Average clock counts required for solving a QP problem using ASM on C6713
3 5 7 9 11 13
78
Appendix B
%--return value: z
global SINGLE
MAXITER = 100
TOLmiu = le-5
DzTOL = 0.005 default
sigma = 0.25;
tune = 0.995;
mc = size(g,1);
lambda = 0.75 * ones(mc,l);
t = sigma * ones(mc,l);
miu = t' * lambda/mc;
z = zeros(size(Q,1),1);
% r2 = diag(lambda)*(-J*z+g)-vtmp;
sla = J * z;
sib = g - sla;
sic = diag(lambda) * sib;
r2 = sic - vtmp;
% A = Q+(J'*diag(invt))*(diag(lambda)*J);
s2a = diag(invt) * diag(lambda);
s2b = J1 * s2a;
s2c = s2b * J;
A = Q + s2c;
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
79
% dz = inv{A)*<-Q*z-c-J'*(lambda-diag(invt)*r2));
s3a = diag(invt) * r2;
s3b = lambda - s3a;
s3c = J' * s3b;
s3d = -c - s3c;
s3e = Q * z;
s3f = s3d - s3e;
B = s3f;
dz = inv(A) * B;
% dz = lineqsol(A, B ) ;
% dt = -t+g-J*(z+dz);
s5a = sib - t;
dt = s5a - s4a;
% update parameters
alpha_tmp_l = 1; tflag = 1;
while (tflag && alpha_tmp_l>0)
if (t + alpha_tmp_l*dt >=0)
tflag = 0;
else
alphatmpl = alpha_tmp_l - 0.001;
end
end % while
alpha_tmp_l = max(alpha_tmp_l, 0.001) ;
alpha_tmp_2 = 1; tflag = 1;
while (tflag && alpha_tmp_2>0)
if (lambda + alpha_tmp_2*dlambda >=0)
tflag = 0;
else
alpha_tmp_2 = alpha_tmp_2 - 0.001;
end
end *, while
alpha_tmp_2 = max(alpha_tmp_2, 0.001);
alpha_max = min(alpha_tmp_l,alpha_tmp_2);
alpha = tune * alpha_max;
z = z + alpha * dz;
lambda = lambda + alpha * dlambda;
t = t + alpha * dt;
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
% termination criteria
if (abs(miu) < TOLmiu),
SEARCHING = 0;
disp ('Interior point method converged: [iteration mu]
disp([iteration, miu]);
if ( max(abs(dz)) > max(DzTOL*abs(z)) )
SEARCHING = 1;
disp('az too big, continue searching...')
end
end
end % while
if iteration == MAXITER,
disp('IP: Maximum iteration reached, mu = '),miu;
end
iter = iteration;
'i—cost function: min (z) =l/2*z ' *Q*z+c' *z such that J*z<=g
%—return value: z
MAXITER = 100;
Tol = le-1;
iter = 0;
nv = size(J,2);
mc = size(J,1);
JWk = J(Wk,:);
i number of working constraints
mwork = length(Wk);
ck = c + Q * z;
QO = [Q JWk'; JWk zeros(mwork,mwork) ];
PO = [-ck; zeros (mwork,1)];
dzlam = QO\PO;
dz = dzlam(1:nv);
lam = dzlam(nv+1:end);
% Line 5: if dz=0
if norm(dz)<Tol
% Line 6: if all components of lam are positive or
if isempty(find(lam<0))
% Line 1: global optimum found
return
else
% Line 9
[minlam,removelnd] = min(lam);
Wk = [Wk(l:removeInd-l) Wk(removelnd+l:end) ] ;
disp('Line 9')
end
else
% Line 12
i ini t iaiize DK
Dk = [];
I indices not in Wk
nWk = setdiff(l:mc,Wk) ;
Jdz = J * dz;
gJz = g - J * z;
% create Dk
for i = 1:length(nWk)
if Jdz(nWk(i))>0 && gJz(nWk(i))/Jdz(nWk (i))<l
Dk = [Dk nWk(i)];
end
end
* Line 13
if isempty(Dk)
''.' u i n e i -i
z = z + dz;
d i s p ( ' L i n e 14')
else
% Line 16: update z
[alpha,ind] = min(gJz(Dk)./Jdz (Dk) ) ;
z = z + alpha * dz;
*• L i n e 1 7 : u p d a t e Wk
Wk = [Wk D k ( i n d ) ] ;
d i s p ( ' L i n e 16')
end
end
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
82
float Q[NV][NV];
float C[NV];
f l o a t J[MC][NV];
f l o a t g[MC];
f l o a t z[NV];
int Iter;
f l o a t Exectime;
float i n v t [ M C ] , r2[MC];
float dz[NV], dt[MC], dlam[MC];
float slb[MC], s2a[MC], s2b[NV][MC], s2c[NV][NV];
float matA[NV][NV], s3a[MC], s3b[MC], s3d[NV], s 3 f [ N V ] ;
float lam[MC], t [ M C ] ;
void Timing(void)
long t i m e r v a l u e ;
t i m e r v a l u e = CpuTimerORegs.TlM.all;
t i m e r v a l u e = OXFFFFFFFF - timervalue;
Exectime - t i m e r v a l u e / 150; / / us
}
void solvelineq(float *A, float *b)
int i, ir, ic, j, jc;
int NUM, actsize;
float temp, temp2;
NUM = NV;
actsize = NV;
for(i=0; i<NUM; i++)
{
i f ( A [ i * a c t s i z e + i ] == 0)
{
f o r ( j = i ; j<NUM; j + + )
temp2 = fabs(A[j*actsize+i]);
if(temp2 >= GETOL)
for(jc=0; jc<NUM; jc++)
temp = A [ i * a c t s i z e + i c ] ;
A[i*actsize+jc] = A[j*actsize+jc];
A [ j * a c t s i z e + j c ] = temp;
temp = b [ i ] ;
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
83
bCi] = b [ j ] ;
b [ j ] = temp;
break;
temp = A [ i + i * a c t s i z e ] ;
A [ i + i * a c t s i z e ] = 1.0;
f o r ( j = i + l ; j<NUM; j++)
A[i*actsize+j] = A[i*actsize+j]/temp;
b[i] = b[i]/temp;
f o r ( i r = 0 ; ir<NUM; i r++)
{
i f C i r != i )
{
temp = A [ i r * a c t s i z e + i ] ;
A [ i r * a c t s i z e + i ] = 0;
f o r ( i c = ( i + l ) ; ic<NUM; ic++)
{
A [ i r * a c t s i z e + i c ] = A [ i r * a c t s i z e + i c ] - temp
*A[i*actsize+ic];
}
b [ i r ] = b [ i r ] - temp*b[i];
}
}
}
}
v o i d lPM_funct(void)
{
int i, j , ic, jc, ir, jr;
i n t expint;
f l o a t miu, vtmp, pmiu;
/ / f l o a t s4b[MC], s5a[MC];
f l o a t a l p h a l , alpha2, alphal_bs, alpha2_bs, alpha_max, alpha;
f l o a t t p v a r l , tpvar2, tpvar3;
f l o a t max_z, max_dz;
// initialization
I t e r = 0;
alphal = 1 . 0 ;
alpha2 = 1.0;
alphal_bs = 0 . 5 ;
alpha2_bs = 0 . 5 ;
expint = 1 ;
f o r ( i = 0 ; i<MC; i++)
{
lam[i] = INIT_LAM;
t [ i ] = INIT_T;
}
f o r ( j = 0 ; j<NV; j++)
z [ j ] = 0;
/ / begin i t e r a t i o n
do{
lter++;
f o r ( i = 0 ; i<MC; i++)
{
invt[i] = l / t [ i ] ;
vtmp = I N I T _ T * miu;
f o r ( i r = 0 ; ir<MC; i r + + )
{
t p v a r l = 0;
f o r ( i c = 0 ; IC<NV; ic++)
{
tpvarl = tpvarl + J [ i r ] [ i c ] * z [ i c ] ;
f o r ( i r = 0 ; ir<MC; i r + + )
{
s2a[ir] = i n v t [ i r ] * 1am[ir];
}
f o r ( j r = 0 ; jr<NV; j r + + )
f o r ( j c = 0 ; jc<MC; jc++)
s2b[jr][jc] = J[jc][jr] * s2a[jc];
}
f o r ( j r = 0 ; jr<NV; j r + + )
for (ic=0; ic<NV; ic++)
{
tpvarl = 0;
for(jc=0; jc<MC; jc++)
tpvarl = tpvarl + s2b[jr][jc] * J[jc][ic];
s2c[jr][ic] = tpvarl;
}
f o r ( i r = 0 ; ir<NV; i r + + )
f o r ( i c = 0 ; ic<NV; ic++)
{
matA[ir][ic] = Q[ir][ic] + s2c[ir] [ i c ] ;
}
}
f o r ( i r = 0 ; ir<MC; i r + + )
{
s3a[ir] = i n v t f i r ] * r 2 [ i r ] ;
s 3 b [ i r ] = 1am[ir] - s 3 a [ i r ] ;
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
f o r ( j r = 0 ; jr<NV; j r + + )
t p v a r l = 0;
f o r ( j c = 0 ; jc<MC; jc++)
s3d[jr] = - c [ j r ] - tpvarl;
f o r ( j r = 0 ; jr<NV; j r + + )
tpvar2 = 0;
f o r ( j c = 0 ; jc<NV; jc++)
{
s 3 f [ j r ] = s 3 d [ j r ] - tpvar2;
solvelineq(niatA[0], s 3 f ) ;
f o r ( i r = 0 ; ir<NV; i r + + )
{
dz[ir] = s3f[ir];
}
f o r ( i r = 0 ; ir<MC; i r + + )
{
t p v a r l = 0;
f o r ( i c = 0 ; ic<NV; ic++)
{
tpvarl = tpvarl + J [ i r ] [ i c ] * d z [ i c ] ;
}
tpvar2
tpvar2 = = s2aLnrj
s 2 a [ i r ] ** tt p
p vv a
a rr ll ;;
d l a m [ i r ] = tpvar2 - s 3 a [ i r ] ;
tpvar3 = s l b [ i r ] - t [ i r ] ;
d t [ i r ] = tpvar3 - t p v a r l ;
}
/ / end of step 2
alphal = 1 . 0 ;
alphal_bs = 0 . 5 ;
alpha2 = 1.0;
alpha2_bs = 0 . 5 ;
i r = 0;
do{
tpvarl = t [ i r ] + d t [ i r ] ;
i f ( t p v a r l < 0)
do{
exp"int++;
tpvar2 = alphal_bs * d t [ i r ] ;
tpvar3 = t [ i r ] + t p v a r 2 ;
i f ( t p v a r 3 < 0)
{
alphal_bs = alphal_bs - (float)
pow((double)2,(double) -expint);
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
86
else
{
alphaL_bs = alphal_bs + ( f l o a t )
pow((double)2,(double) -expint);
} w h i l e ( e x p i n t < 10);
tpvar2 = a l p h a l - alphal_bs;
i f ( t p v a r 2 > 0)
a l p h a l = alphaL_bs;
ir++;
expint = 1 ;
alphal_bs = 0 . 5 ;
}
else
{
i r++;
expint = 1 ;
alphal_bs = 0 . 5 ;
}
else
{
ir++;
}
} w h i l e ( i r < MC);
ir=0;
do{
tpvarl = lam[ir] + dlam[ir];
i f ( t p v a r l < 0)
do{
expint++;
tpvar2 = alpha2_bs * d l a m [ i r ] ;
tpvar3 = l a m [ i r ] + t p v a r 2 ;
i f ( t p v a r 3 < 0)
alpha2_bs = alpha2_bs - pow(2,-
expint);
}
else
{
expint); alpha2_bs = alpha2_bs + pow(2,-
}
} w h i l e ( e x p i n t < 10);
tpvar2 = alpha2 - alpha2_bs;
i f ( t p v a r 2 > 0)
alpha2 = alpha2_bs;
i r++;
expint = 1 ;
alpha2_bs = 0 . 5 ;
}
else
{
i r++;
expint = 1 ;
alpha2_bs = 0 . 5 ;
}
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
else
{
ir++;
}
} w h i l e ( i r < MC);
tpvar3 = a l p h a l - alpha2;
i f (tpvar3 > 0)
alpha_max = alpha2;
else
alpha_max = a l p h a l ;
alpha = TUNE * alpha_max;
f o r ( i r = 0 ; ir<NV; i r + + )
{
z [ i r ] = z [ i r ] + alpha*dz[ir];
f o r ( i r = 0 ; ir<MC; i r + + )
{
lam[ir] = lam[ir] + alpha*dlam[ir];
t [ i r ] = t [ i r ] + alpha*dt[ir];
t p v a r l = 0;
f o r ( j r = 0 ; jr<MC; j r + + )
tpvarl = tpvarl + l a m [ j r ] * t [ j r ] ;
miu = t p v a r l / MC;
pmiu = f a b s ( m i u ) ;
t p v a r l = pmiu - TOLMIU;
i f ( t p v a r l < 0)
max_dz = f a b s ( d z [ 0 ] ) ;
f o r ( i r = l ; ir<NV; i r + + )
{
tpvar2 = max_dz - f a b s ( d z [ i r ] ) ;
i f ( t p v a r 2 < 0)
max_dz = f a b s ( d z [ i r ] ) ;
}
}
max_z = DZTOL * f a b s ( z [ 0 ] ) ;
f o r ( i r = l ; ir<NV; i r + + )
{
tpvar3 = max_z - D Z T O L * f a b s ( z [ i r ] ) ;
i f ( t p v a r 3 < 0)
max_z = D Z T O L * f a b s ( z [ i r ] ) ;
}
}
tpvar2 = max_dz - max_z;
if(tpvar2 < 0)
{
break;
}
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
88
#include "csl.h"
#include "csl_timer.h"
# d e f i n e NV 3
# d e f i n e MC 32
#define MAXITER 100
# d e f i n e TOL l e - 3
# d e f i n e GETOL l e - 8
# d e f i n e smallnum l e - 3
f l o a t Q[NV] [ N V ] ;
f l o a t c[NV];
float J[MC][NV];
float g[MC];
float Z[NV];
int Iter;
f l o a t Exectime;
f l o a t lambda[MC];
f l o a t Mmat[NV][NV], dmat[NV];
i n t wk [ M C ] ;
i n t Dk[MC];
f l o a t dz[NV], ck[NV];
f l o a t QO[NV+MC][NV+MC] , PO[NV+MC];
static U i n t 3 2 T i m e r C o n t r o l = TIMER_CTL_RMK(
TIMER_CTL_INVINP_NO,
TIMER_CTL_CLKSRC_CPUOVR4, / / CPU c l o c k / 4
TIMER_CTL_CP_PULSE,
TIMER_CTL_HLD_YES,
TIMER_CTL_GO_NO,
TIMER_CTL_PWID_ONE,
TIMER_CTL_DATOUT_0,
TIMER_CTL_INVOUT_NO,
TIMER_CTL_FUNC_GPIO
);
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
s t a t i c TlMER_Handle h T i m e r l ;
v o i d s o l v e l i n e q ( f l o a t *A, f l o a t * b , i n t countAct, i n t r r , i n t a c t s i z e )
int i , i r , ic, j , jc;
i n t NUM;
f l o a t temp, temp2;
NUM = rr+countAct;
f o r ( i = 0 ; i<NUM; i++)
if(fabs(A[i*actsize+i]) < smallnum)
for(j=i+l; J<NUM; j++)
{
temp2 = fabs(A[j*actsize+i]);
if(temp2 >= GETOL)
{
for(jc=0; jc<NUM; jc++)
temp = A [ i * a c t s i z e + j c ] ;
A[i*actsize+qc] = A [ j * a c t s i z e + j c ] ;
A [ j * a c t s i z e + j c ] = temp;
temp = b [ i ]
b[i] = b[j]
b [ j ] = temp
break;
temp = A [ i + i * a c t s i z e ] ;
A [ i + i * a c t s i z e ] = 1.0;
for(j=i+l; J<NUM; j++)
A[i*actsize+j] = A[i*actsize+j]/temp;
}
b [ i ] = b[i]/temp;
f o r ( i r = 0 ; ir<NUM; i r + + )
if(ir != i )
{
temp = A [ i r * a c t s i z e + i ] ;
A [ i r * a c t s i z e + i ] = 0;
f o r ( i c = ( i + l ) ; ic<NUM; ic++)
{
A [ i r * a c t s i z e + i c ] = A [ i r * a c t s i z e + i c ] - temp
*A[i*actsize+ic];
}
b [ i r ] = b [ i r ] - temp*b[i];
}
}
}
}
v o i d LTSQ(void)
{
int i, j, jr;
float tempi, sum;
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
90
// 3'*J
for(i=0; i<NV; i++) // rows of J'
{
for(j=0; j<NV; j++) // cols of J
sum = 0;
for(jr=0; jr<MC; jr++) // rows of j
tempi = J [ j r ] [ i ] * 3 [ j r ] [ j ] ;
sum = sum + tempi;
}
Mmat[i] [ j ] = sum;
}
/ / J'*g
f o r ( i = 0 ; i<NV; i++) / / rows of 3'
{
sum = 0;
f o r ( j r = 0 ; jr<MC; j r + + ) / / rows of J
tempi = J [ j r ] [ i ] * g [ j r ] ;
sum = sum + tempi;
}
d m a t [ i ] = sum;
}
s o l v e l i n e q ( M m a t [ 0 ] , dmat, 0, NV, NV);
f o r ( i = 0 ; i<NV; i++)
{
z [ i ] = dmat[i];
}
}
void ASM_funct(void)
{
i n t j r , j c , j , i r , i c , i , minindex, i f s , temp3;
i n t neglam, actno, alpwk[MC], minalpindex, countAct;
f l o a t tempi, temp2, tsum, mini am, a l p l , minalp;
i n t nochangeflag, Dkcount;
// initialization
i = 0 ; i c = 0 ; i r = 0 ; j = 0 ; j c = 0 ; j r = 0 ; l t e r = 0 ; minlam=0;
/ / check i f g i s a l l p o s i t i v e
i f s = 1; / / i n i t i a l solution is feasible
f o r ( i = 0 ; i<MC; i++)
{
i f ( g [ i ] < 0)
i f s = 0 ; / / i n i t i a l s o l u t i o n i s not f e a s i b l e
i f ( i f s; == 1)
{
f o r ( i = 0 ; ; i<NV ; i++)
{
z[i] = 0 i
}
}
else
{
LTSQO ;
}
f o r ( j = 0; j<MC; j++)
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
/ / a l p w k [ j ] = 0;
lambda[j] = 0;
WkM] = 0;
D k [ j ] = 0;
/ / begin i t e r a t i o n
do{
Iter++;
/ / ck = c + Q*z
f o r ( i r = 0 ; ir<NV; i r + + )
{
temp2 = 0;
f o r ( i c = 0 ; ic<NV; ic++)
{
tempi = Q [ i r ] [ i c ] * z [ i c ] ;
temp2 = temp2 + tempi;
c k [ i r ] = c [ i r ] + temp2;
countAct = 0;
for(j=0; j<MC; j++)
if(Wk[j]==l)
countAct++;
}
/ / form augmented matrix and solve f o r dz and dlam
/ / QO = [ Q JWk'l PO = [ -ck ]
// ] JWk 0 ] [ 0 ]
f o r ( i r = 0 ; ir<NV; i r + + )
{
f o r ( i c = 0 ; ic<NV; ic++)
{
QO[ir][ic] = Q [ i r ] [ i c ] ;
}
PO[ir] = - c k [ i r ] ;
}
temp3 = 0;
f o r ( j = 0 ; j<MC; j + + )
if(Wk[j]==l)
f o r ( j c = 0 ; jc<NV; jc++)
QO[NV+temp3][jc] = J [ j ] [ j c ] ;
QO[jc][NV+temp3] = J [ J ] [ J C ] ;
PO[NV+temp3] = 0;
temp3++;
temp2 = 0;
f o r ( j r = N V ; jr<(countAct+NV); j r + + )
for(jc=NV; jc<(countAct+NV); jc++)
Q O [ j r ] [ j c ] = 0;
}
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
nochangeflag = 1 ;
f o r ( i = 0 ; i<NV; i++)
if ( f a b s ( d z [ i ] ) > TOL)
nochangeflag = 0;
}
i f ( n o c h a n g e f l a g == 1)
neglam = 0; actno = 0; mini am = 0;
/ / remove c o n s t r a i n t from a c t i v e set
f o r ( j = 0 ; j<MC; j++)
{
i f ( w k [ j ] == 1) / / i f c o n s t r a i n t i s a c t i v e
{
actno++;
if(lambda[j]<0)
neglam++;
i fCIambda[j]<mi nlam)
mini am = l a m b d a [ j ] ;
minindex = j ;
}
}
}
wk[minindex] = 0;
if Cneglam == 0)
break;
}
else
{ minalp = 1; Dkcount = 0;
// find alpha and add constraint to active set
forCj=0; j<MC; j++)
i f ( w k [ j ] == 0) / / i f c o n s t r a i n t i s inactive
tsum=0;
f o r ( j c =0; jc<NV; jc++)
{
tempi = 3 [ j ] [ j c ] * d z [ j c ] ;
tsum = tsum + tempi;
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
temp2=0;
f o r ( j c =0; jc<NV; jc++)
tempi = J [ j ] [ j c ] * z [ j c ] ;
temp2 = temp2 + tempi;
a l p l = (g[j]-temp2)/tsum;
if(tsum > 0 && a l p l < 1)
{
Dk[j] = 1 ;
Dkcount++;
i f ( a l p l < minalp)
{
minalp = a l p l ;
minalpindex = j ;
}
}
else
Dk[j] = 0;
}
else
o k [ j ] = 0;
}
if(Dkcount == 0)
{
for (i=0; i<NV; i++)
{
Z[i] = z[i] + dz[i];
}
}
else
i / / update z
for (i=0; i<NV; i++)
{
z [ i ] = z [ i ] + minalp*dz[i];
/ / update active constraints
Wk[minalpindex] = 1;
}
}while(iter < MAXITER);
d main(void)
long timervalue = 0;
long timervalue_start = 0;
long timervalue_end = 0;
void TimerEventHandler(void);
extern far void vectorsO;
hTimerl = TIMER_open(TIMER_DEVl, TIMER_OPEN_RESET);
TlMER_confi gArgs(nTi merl,
TimerControl, // use predefined control value
OxFFFFFFFF, // set period
0x00000000 // start count value at zero
);
TIMER_start(hTimerl);
timervalue_start = TlMER_getcount(hTimerl);
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
94
ASM_funct();
timervalue_end = TlMER_getCount(hTimerl);
timervalue = timerva~lue_end - t i m e r v a l u e _ s t a r t ;
Exectime = timervalue / 56.25; / / us
TlMER_pause(hTimerl);
TiMER_close(hTimerl);
ATTENTION: The Singapore Copyright Act applies to the use of this document. Nanyang Technological University Library
95
Appendix C
96