Thesis 1

Mathematics of Planet Earth 2
Michael Y. Li
An Introduction
to Mathematical
Modeling
of Infectious
Diseases
Mathematics of Planet Earth
Volume 2
Series editors
Ken Golden, The University of Utah, USA

Mark Lewis, University of Alberta, Canada
Yasumasa Nishiura, Tohoku University, Japan
Malcolm Sambridge, The Australian National University, Australia
Joseph Tribbia, National Center for Atmospheric Research, USA
Jorge Passamani Zubelli, Instituto Nacional de Matemática Pura e Aplicada,
Brazil
Springer’s Mathematics of Planet Earth series provides a variety of well-

written books of a variety of levels and styles, highlighting the fundamental
role played by mathematics in a huge range of planetary contexts on a global
scale. Climate, ecology, sustainability, public health, diseases and epidemics,
management of resources and risk analysis are important elements. The math-
ematical sciences play a key role in these and many other processes relevant
to Planet Earth, both as a fundamental discipline and as a key component
of cross-disciplinary research. This creates the need, both in education and
research, for books that are introductory to and abreast of these develop-
ments.
More information about this series at http://www.springer.com/series/13771

Michael Y. Li
An Introduction
to Mathematical
Modeling of Infectious
Diseases
Michael Y. Li
Mathematical and Statistical Sciences
University of Alberta
Edmonton, AB
Canada
Mathematics of Planet Earth

ISBN 978-3-319-72121-7 ISBN 978-3-319-72122-4 (eBook)
https://doi.org/10.1007/978-3-319-72122-4
Library of Congress Control Number: 2017960197

Mathematics Subject Classification (2010): 34D05, 34D20, 34D23, 92D25, 92D30
© Springer International Publishing AG 2018
This work is subject to copyright. All rights are reserved by the Publisher, whether the
whole or part of the material is concerned, specifically the rights of translation, reprint-
ing, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any
other physical way, and transmission or information storage and retrieval, electronic adap-
tation, computer software, or by similar or dissimilar methodology now known or hereafter
developed.
The use of general descriptive names, registered names, trademarks, service marks, etc.
in this publication does not imply, even in the absence of a specific statement, that such
names are exempt from the relevant protective laws and regulations and therefore free for
general use.
The publisher, the authors and the editors are safe to assume that the advice and informa-
tion in this book are believed to be true and accurate at the date of publication. Neither the
publisher nor the authors or the editors give a warranty, express or implied, with respect
to the material contained herein or for any errors or omissions that may have been made.
The publisher remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Printed on acid-free paper
This Springer imprint is published by Springer Nature

The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
Mathematical modeling has been increasingly recognized in the

public health community as an important research tool for infec-
tious diseases control. Mathematical models are widely used for
evaluating the effectiveness of control strategies for TB, HIV,
Influenza, West Nile virus, and the emerging Zika virus, for
cost-effect analysis of vaccination programs, and for controlling
hospital-acquired infections. Students with modeling skills are
finding rewarding careers in public health services and private
health industry.
This book is intended to provide its reader with essential mod-
eling skills and methodology for the study of infectious diseases
through a one-semester modeling course or directed individual
studies. The book is primarily written for upper undergraduate
and beginning graduate students in mathematical sciences who
have an interest in mathematical modeling of infectious diseases.
Mathematical theories and techniques are explained in the context
of epidemic models without a sacrifice of mathematical rigor. The
basic prerequisites are advanced Calculus and an understanding
of matrix theory. Beginning students can learn advanced theory
of differential equations in the concrete setting of model anal-
ysis. Students who have taken courses in advanced differential
equations will be able to see how abstract theories are applied to
real-world problems. Materials in the book are also accessible to
public health science students with analytical skills. Several grad-
uate students in public health sciences have participated in the
v
vi Preface
author’s modeling courses and went on to write MSc and PhD

theses using modeling as the main research tool.
This book has grown from a set of lecture notes the author
has been using for a disease modeling course at the University
of Alberta. The course was taught in the second semester of the
academic year to prepare students for their research on modeling
projects during the summer. Each student was given a disease
modeling project to work on during the course. Students were
required to submit a proposal for their projects by the midterm
and complete a written project report at the end of the semester,
which comprised the main part of the assessment. Each student
also gave a 20-minute oral presentation about the project. Exer-
cises in the book provided students with opportunities to practice
modeling skills they learned during the course. Students from pub-
lic health or biological sciences were each paired with a mathemat-
ics student to collaborate on a project for cooperative learning.
The highlight of the course has always been the term projects,
through which students were able to synthesize the knowledge
they gain in the course. In many occasions, term projects were
further expanded into a full-fledged MSc thesis or formed the
foundation of a PhD dissertation.
The book starts with an introductory chapter (Chapter 1) on
the basic process of setting up an epidemic model using differential
equations. The chapter contains in-depth discussions of key con-
cepts in infectious disease epidemiology such as the basic repro-
duction number, incidence rates, latency, immunity, demography,
route of transmission, heterogeneity, vaccination, and quarantine.
Mathematical descriptions of these key concepts are given and
used as building blocks for ever more complex epidemic models.
This chapter provides a soft entry into modeling for public health
science students and an introductory encounter with infectious
disease epidemiology for mathematics students.
Chapter 2 focuses on model analysis with the intention to intro-
duce students to standard mathematical approaches and impor-
tant mathematical concepts such as stability, bifurcation, and
threshold phenomenon, all explained in the context of epidemic
models. The author has carefully selected five classic epidemic
models to demonstrate five important mathematical approaches
Preface vii
to model analysis: first integrals and level curves, phase-line anal-

ysis, phase-plane analysis, reduction of dimension using homo-
geneity, and monotone dynamical systems.
Chapter 3 contains the abstract theory of differential equa-
tions that is used in model analysis in Chapter 2. It provides a
rigorous theoretical reference for important concepts and math-
ematical results used in Chapter 2. It is recommended that the
materials in this chapter are explained when they are used in the
corresponding sections of Chapter 2.
Chapter 4 deals with the issue of confronting models with
data and nonlinear least-squares methods for parameter esti-
mation from data. A detailed presentation of the linear least-
squares method and the Gauss–Newton method for nonlinear
least squares is given and demonstrated with examples. For prac-
tical applications in epidemic models, algorithms for nonlinear
least-squares estimation using Matlab routines are described.
Matlab codes are included to help students start to use them
for their projects.
Chapter 5 provides some special topic for further reading and
can be used as additional teaching materials at the instructor’s
discretion. The first section is a higher dimensional SEIR model
with its full analysis. Materials in this section help to reinforce
that techniques for the analysis of simple models in Chapter 2
are also equally useful for complex models. The second section
covers an in-host model for the viral dynamics of the infection of
Human T-cell Lymphotropic Virus Type I (HTLV-I) of CD4+ T
cells. Through discussions in this section, students will learn that
modeling skills for epidemic models at the population level can be
applied to model disease processes at a microbic level. An inter-
esting phenomenon called “backward bifurcation” is introduced
in this simple in-host model.
On a personal note, I would like to thank all the students
who have participated in my disease modeling course during the
past 10 years at the University of Alberta, whose enthusiasm has
been the inspiration for writing this book. A special mention is
given to my graduate students Hongbin Guo, Rebecca De Boer,
Michael Akinwumi, Betsy Varughese, Zhimin Su, Alison Muscat,
Kim Simmons, Lordlin Owusu, Weston Roda, Jeanette Amissah,
viii Preface
and Frank Lee, whose feedbacks really helped with the develop-
ment of materials in the book. Many of these students are working
as modelers in public health services and private sectors. A special
thanks to Rebecca De Boer and Weston Roda for their assistance
with Matlab codes and for teaching Matlab tutorials in my mod-
eling classes.
I want to express my deep gratitude to my family for their
unwavering support of my work, especially to my wife Audrey for
her constant encouragement, and to my daughter Jessica for proof
reading the manuscripts. I wish to thank many of my mentors,
James Muldowney, Paul Waltman, Fred Brauer, Pauline van den
Driessche, and Gail Wolkowicz, for their guidance and support for
my research.
A special thanks to Donna Chernyk of Springer for all her work
and help during the publication process of the book.
Michael Y. Li
Edmonton, Alberta
October 15, 2017
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
1 Important Concepts in Mathematical Modeling

of Infectious Diseases . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Mathematical Modeling of Infectious Diseases:
Issues and Approaches . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Deterministic Epidemic Models: Compartmental
Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 An Example: Kermack–McKendrick Model . . . . . . . 8
1.4 Important Concepts in Compartmental Epidemic
Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2 Five Classic Epidemic Models and Their

Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.1 Kermack–McKendrick Model . . . . . . . . . . . . . . . . . . . 37
2.2 A Model for Diseases with No Immunity . . . . . . . . . 46
2.3 A Model with Demography . . . . . . . . . . . . . . . . . . . . 52
2.4 An SIR Model with Varying Total Population:
Homogeneous Systems . . . . . . . . . . . . . . . . . . . . . . . . . 60
2.5 Ross–MacDonald Model for Malaria: A Monotone
System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3 Basic Mathematical Tools and Techniques . . . . . . 79

3.1 Stability of Equilibrium Solutions . . . . . . . . . . . . . . 79
3.2 Stability Analysis by Linearization . . . . . . . . . . . . . . 81
ix
x Contents
3.3 Stability Analysis Using Lyapunov Functions . . . . 82

3.4 Stability of Periodic Solutions:
The Floquet Theory . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.5 Global Dynamics of 1-Dimensional Systems:
Phase-Line Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.6 Global Dynamics of 2-Dimensional Systems:
Phase-Plane Analysis . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.7 Uniform Persistence . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.8 Metzler Matrices and Monotone Systems . . . . . . . . . 98
4 Parameter Estimation and Nonlinear

Least-Squares Methods . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.1 Curve-Fitting and Linear Least-Squares
Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.2 Nonlinear Least-Squares Problem . . . . . . . . . . . . . . . 111
4.3 Parameter Estimation for Epidemic Models . . . . . . 118
5 Special Topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

5.1 Higher Dimensional Models: SEIR Models . . . . . . . . 126
5.2 In-host Models and Backward Bifurcation . . . . . . . . 136
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Chapter 1
Important Concepts in
Mathematical Modeling
of Infectious Diseases
1.1 Mathematical Modeling of

Infectious Diseases: Issues and
Approaches
A disease is infectious if the causative agent, whether a virus, bac-
terium, protozoa, or toxin, can be passed from one host to another
through modes of transmission such as direct physical contact,
airborne droplets, water or food, disease vectors, or mother to
newborn.
The objective of a mathematical model of an infectious
disease is to describe the transmission process of the disease, which
can be defined generally as follows: when infectious individuals
are introduced into a population of susceptibles, the disease is
passed to other individuals through its modes of transmission thus
spreading in the population. An infected individual may remain
asymptomatic at the early stage of infection, only later devel-
oping clinical symptoms and being diagnosed as a disease case.
If the number of cases rises above the usual average within a
short period of time, a disease outbreak occurs. When the disease
spreads quickly to many people, it is an epidemic. Infected indi-
viduals recover from infection, either through treatment or due
to the action of the immune system, and gain various degrees of
acquired immunity against reinfection. When the pool of suscep-
tible individuals is sufficiently depleted, new infections will cease
© Springer International Publishing AG 2018 1
M. Y. Li, An Introduction to Mathematical Modeling
of Infectious Diseases, Mathematics of Planet Earth 2,
https://doi.org/10.1007/978-3-319-72122-4_1
2 Important Concepts
and the epidemic slows down and stops. If fresh susceptibles are
added to the population, either from birth or migration, or if rein-
fection occurs easily, the epidemic may persist and the infection
may remain in the population over a long period of time. In this
case, the disease is said to be endemic in the population. If the
disease spreads spatially on a global scale to many countries and
continents, a pandemic occurs. The 1918 Spanish flu that spread to
all continents and killed over 50 million people is a classic exam-
ple of a global pandemic. With modern air travel, many emerging
and reemerging infectious diseases have an increasing potential to
cause a global pandemic.
Facing an imminent epidemic, public health authorities will be
looking for answers to the following important questions:
(1) How severe will the epidemic be? The severity can be mea-
sured in two different ways:
(a) Total number of infected people who may require med-
ical care.
(b) Maximum number of infected people at any given time.
(2) How long will it last? When will it peak? What will be its
time course?
(3) How effective will quarantine or vaccination be?
(4) What quantity of vaccine or anti-viral drugs should be
stockpiled?
(5) What are effective measures to contain, control, and eradi-
cate an endemic disease?
Partial answers may be obtained using a variety of approaches.

Mathematical modeling has proven to be an important tool in
assisting public health authorities to make informed decisions.
This brings us to the obvious question: why is mathematical
modeling of infectious diseases useful? Part of the answer is that
traditional methods using experimental and statistical approaches
may not be adequate for various reasons:
1.1 Modeling Infectious Diseases 3
(1) Infectious diseases often affect a large population of individ-

uals over a large geographic area. Experiments conducted
in laboratories are often inadequate simply because of the
huge difference in scale.
(2) For infectious diseases in humans, large-scale experiments
may be impossible or unethical.
(3) Existing data sets about the disease may not be complete
or accurate enough for the statistical analysis to be reliable.
For instance, there will be little or no disease surveillance
data available on infected people who are asymptomatic.
(4) Since repeated experiments of disease outbreaks in a pop-
ulation are rarely possible, disease surveillance data often
represents a single outbreak (sample) with a large amount
of information. Statistical analysis of high-dimensional data
from small samples is a challenge since statistics theory
relies on large samples.
Mathematical modeling can provide an understanding of the

underlying mechanisms of disease transmission and spread, help
to pinpoint key factors in the disease transmission process, suggest
effective control and preventive measures, and provide an estimate
for the severity and potential scale of the epidemic. Put it sim-
ply, mathematical modeling should become part of the toolbox of
public health research and decision-making.
The next question we would like to answer is: what is the gen-
eral process of mathematical modeling? Generally speaking, the
modeling process involves the following six stages:
(1) Make assumptions about the disease transmission process

based on the best available biological knowledge on the
pathogenesis of the infection and epidemiology of the
disease.
(2) Set up mathematical models for the transmission process
based on these assumptions. This usually starts from draw-
ing the transfer diagram and then deriving the mathemat-
ical equations.
(3) Perform mathematical analysis on the model to understand
all possible qualitatively distinct model outcomes. This is
typically done by applying existing mathematical theories

on stability and bifurcations in conjunction with numerical
simulations.
(4) Interpret the mathematical findings within the modeling
context. These interpretations form our understanding of
the disease transmission process entailed by the set of
assumptions made in Step (1).
(5) Collect available disease data from public health agencies
and from research publications. Validate the model using
data.
(6) Improve the model by modifying the earlier assumptions in
Step (1), and produce a more accurate understanding of the
disease process.
It is important to keep in mind that a mathematical model is

an approximation of the real disease process. It is a mathematical
translation of our hypotheses about disease transmission. Another
important role for mathematical models is hypothesis testing: by
comparing the model outcomes with existing knowledge or data of
the disease, we can use the model to test various hypotheses about
the disease. Compared to experimental approaches, the modeling
approach has the advantage of saving enormous amounts of time
and resources.
We should caution that a mathematical model is not a magic
bullet. There are often many difficulties associated with math-
ematical modeling. The following is a list of important issues
involved in the mathematical modeling process:
(1) Due to our limited knowledge about an infectious disease,

realistic assumptions about its transmission process are not
always possible. Various degrees of simplification need to be
made. Very often, our assumptions are simply hypotheses.
When interpreting findings from the model, we always need
to keep these limitations in mind.
(2) Model validation using disease data is important because it
provides a test of our modeling hypotheses. This may not
always be possible or may be difficult to do depending on
the availability and quality of data.
1.1 Modeling Infectious Diseases 5
(3) Mathematical analysis of the model may be limited by exist-

ing mathematical theory.
There is always a trade-off in mathematical modeling between

more realistic and therefore more complex models and our abil-
ity to analyze the model mathematically and obtain useful infor-
mation for interpretation. Advancement in mathematical theory
and methodology often allows us to successfully use more realistic
models. When using mathematical models to analyze or interpret
disease data, it is not always true that a more realistic model
will do better. Part of the reason is that more realistic models
incorporate a greater degree of biological complexity and hence
introduce more model parameters. With the same dataset, it may
be more difficult to estimate all the parameter values for a more
complex model compared to a simpler model. The result might
be a greater degree of uncertainty in model outcomes.
There are three general approaches to mathematical modeling
of infectious diseases:
(1) Statistical models. These models are very data oriented and
are constructed to deal with a specific set of data.
• Advantages: statistical models are widely used in epi-
demiology and public health research.
• Drawbacks: statistical models require large samples of
data.
(2) Deterministic models. These are typically models using dif-
ferential and difference equations of various forms. The key
assumption is that the sizes of the susceptible and infectious
populations are continuous functions of time. The mod-
els describe the dynamic interrelations among the rates of
change and population sizes.
• Advantages: mathematical theories for this type of model
are more mature in comparison to stochastic models; the
derivation of mathematical models are less data depen-
dent in comparison to statistical models; and mathemat-
ical models are suited for making predictions.
• Drawbacks: these models are not expected to be valid

if the population sizes are very small, in which case
stochastic disturbances become non-negligible.
(3) Stochastic models. In this type of model, disease infec-
tion is treated as a stochastic process. The models describe
dynamic interrelations of its probability distributions.
• Advantages: stochastic models are suitable to deal with
small population groups such as a small community or
a single hospital, or where a few infected individuals are
highly active and have a high number of infectious con-
tacts.
• Drawbacks: mathematical analysis of stochastic models
is difficult due to a lack of mathematical machinery;
model analysis largely relies on observations from a huge
number of numerical simulations.
This book is an introduction to deterministic models of infec-

tious diseases, their mathematical analysis, and model calibration
from disease data.
1.2 Deterministic Epidemic Models:

Compartmental Approach
In this section, we explain how to set up a mathematical model
for the transmission process of an infectious disease using a com-
partmental approach. We first partition the host population into
mutually exclusive groups – compartments – according to the nat-
ural history of the disease. For a simple infectious disease, possible
compartments may be:
S : susceptible hosts, I : infectious hosts, R : recovered hosts.
Then, we illustrate the transmission process schematically in a

carton, called a transfer diagram, shown in Figure 1.1. In the trans-
fer diagram, the arrows indicate movements of individuals among
1.2 Compartmental Models 7
compartments. The term “removal” includes loss of individuals

through death or out-migration.
loss of immunity
new new
susceptibles infections recovery
S I R
removal removal removal
Figure 1.1: Transfer diagram for an SIR compartment model.
The goal of modeling is to track the number of hosts in each

of the three compartments at any given time t, and we denote
these numbers by S(t), I(t), and R(t) accordingly. To set up the
compartmental model, we consider a small time interval [t, t + Δt]
and the net change in the number of individuals in each compart-
ment. In the transfer diagram, the arrows indicate the direction of
individual movement. The net change of the number of hosts in a
compartment is the number coming into the compartment minus
the number leaving the compartment during the time interval.
Applying this principle to each compartment, we arrive at the
following equations:
ΔS(t) = new susceptibles + transfer from R − new infections

− removal from S
(1.1)
ΔI(t) = new infections − transfer into R − removal from I
ΔR(t) = transfer from I − transfer into S − removal from R .
If we divide both sides of these equations by Δt and let Δt → 0,

then the left-hand side will be the derivatives S (t), I (t), and
R (t), since
ΔS(t) S(t + Δt) − S(t)

= → S (t), as Δt → 0,
Δt Δt
and similar relations hold for I (t) and R (t). The terms on the
right-hand side will become instantaneous rates of incidence,
recovery, and removal. We thus have the following differential

equations:
S (t) = influx of new susceptibles + transfer rate from R − incidence rate
− removal rate from S

I (t) = incidence rate − transfer rate into R − removal rate from I
R (t) = transfer rate from I − transfer rate into S − removal rate from R .
(1.2)
If we express all the terms on the right-hand side as functions
of S(t), I(t), and R(t), we will obtain a system of differential
equations for S(t), I(t), and R(t), which will form our mathe-
matical model. It is important to note that how these terms are
expressed as functions of S(t), I(t), and R(t) is based on our
hypotheses regarding the biological processes of disease trans-
mission and population transfer among compartments. Therefore,
different hypotheses will give rise to different forms of the model,
and may lead to different model outcomes. If data is available to
verify our model outcomes, then the model can be used to test the
validity of our hypotheses about the disease transmission process.
In the next section, we will see how basic hypotheses can be made
in order to derive one of the classic epidemic models.
1.3 An Example:
Kermack–McKendrick Model
To demonstrate how various rates in equation (1.2) may depend
on S(t), I(t), and R(t), we make the following hypotheses about
the transmission process of an infectious disease and its host pop-
ulation:
(1) Transmission occurs horizontally through direct contact

between hosts.
(2) Mixing of individual hosts is homogeneous and thus the Law
of Mass Action holds: the number of contacts between hosts
from different compartments depends only on the number
1.3 Kermack–McKendrick Model 9
of hosts in each compartment. In particular, the incidence

rate – number of new infections per unit time – can be
expressed as λI(t)S(t), where λ is called the transmission
coefficient.
(3) Rate of transfer from a compartment is proportional to the
population size of the compartment. For instance, the rate
of transfer from I to R, the recovery rate, can be written as
γI(t), for some rate constant γ.
(4) Infected individuals become infectious upon infection with
no latency period.
(5) There is no loss of immunity and no possibility of reinfec-
tion. This implies that the transfer rate from R back to S
is zero.
(6) There is no input of new susceptibles and no removal from
any compartments. The influx of new susceptibles is zero,
and so are the removal rates from all compartments.
(7) The total host population remains a constant. This is a
direct result of the previous assumption, but we state it here
explicitly to emphasize the fact. The dynamics of epidemic
models can be more complicated if the total population
varies with time.
Although they may appear very restrictive, these assumptions

are quite plausible for a disease spread within the student popula-
tion on a campus, where mixing occurs mainly in classes, cafete-
rias, libraries, and other public places. Based on these assump-
tions, the transfer diagram for a conceptual model shown in
Figure 1.1 can be translated into an explicit model as shown in
Figure 1.2.
λIS γI
S I R
Figure 1.2: Transfer diagram for an simple SIR model.
Substituting all terms in (1.2) by our mathematical descrip-

tions, we obtain the following system of differential equations:
dS
= −λIS (1.3)
dt
dI
= λIS − γI (1.4)
dt
dR
= γI, (1.5)
dt
with initial conditions
S(0) = S0 > 0, I(0) = I0 > 0, R(0) = 0.
In the model, functions S(t), I(t), and R(t) are variables. Since
they denote the number of people, they are expected to take non-
negative values. Constants λ and γ are model parameters, and
they are assumed to be nonnegative since they denote rate con-
stants. If the values of model parameters λ and γ are known, then
for each set of initial conditions S0 and I0 , model (1.3)–(1.5) has
a unique solution (S(t), I(t), R(t)) that produces a prediction for
the time course of the epidemic for t > 0. Here t = 0 marks the
beginning of the epidemic.
Even when the values of parameters are not known, some sim-
ple observations about the solutions to system (1.3)–(1.5) can be
made that hold for all or a large set of possible parameter values.
1. Let N (t) = S(t) + I(t) + R(t) denote the total host popu-
lation. Then, by adding equations (1.3)–(1.5), we have
dN
= 0,
dt
and thus the total population N (t) = N0 = S0 + I0 remains
a constant.
2. From equation (1.3) we obtain
dS
≤ 0.
dt
Therefore, S(t) is always decreasing. In particular,
S(t) ≤ S0 .
1.4 Important Concepts 11
3. Rewrite equation (1.4) as

dI
= (λS − γ)I.
dt
Then, we have the following two cases:

dI
(1) If S0 < γ
λ
, then dt
< 0. Since S(t) ≤ S0 < λγ , we know
t=0
I (t) < 0 for all t ≥ 0, and thus I(t) strictly decreases.
As a result, no epidemics can occur in this case.
(2) If S0 > λγ , then S(t) > λγ for t ∈ [0, t̄) for some t̄ > 0.
This implies I (t) > 0 and thus I(t) strictly increases
for t ∈ [0, t̄). As a result, an epidemic happens.
This demonstrates the well-known threshold phenomenon:
there is a threshold value λγ , which S0 must exceed for an
epidemic to occur. This threshold value is often called the
critical community size for an epidemic.
We see that it is possible to derive properties of solutions

without knowing specific values of model parameters. Our
analysis can identify parameter regions for distinct model
outcomes. This is the power and often the objective of math-
ematical analysis. We will revisit this model in greater detail
in the next chapter.
1.4 Important Concepts in

Compartmental Epidemic Models
In this section, we discuss some important epidemiological con-
cepts used in compartmental modeling, demonstrate how to
describe them in mathematical terms, and explore the assump-
tions used for deriving a specific mathematical term, such as the
disease incidence λI(t)S(t) introduced in the previous section.
Through these discussions, students in mathematics can learn
how mathematical terms in a model correspond to epidemiolog-
ical concepts, and students in public health sciences can learn
how mathematics is used to quantify epidemiological concepts.
An understanding of the assumptions behind the mathematical

terms in the model can help students consider alternative ways to
describe them and learn how to generalize a model.
1.4.1 Transfer Rates

For the simple Kermack–McKendrick model described in the pre-
vious section, we assumed that the recovery rate, or the rate of
transfer from compartment I to R, is given by γI. This is equiv-
alent to assuming the following:
(H) the fraction of the infectious population that recovers per
unit time is a constant γ.
Proportional transfer rates as assumed in (H) are often used for
transfers between compartments in simple compartmental mod-
els. However, we need to understand that this is only one of
the many assumptions we can make about population transfers.
In fact, our assumption that
recovery rate is in proportion to
the size of the infectious popula- rN (t )
N (t)
tion is by no means universal. In
the following, we develop a bet-
ter mathematical understand-
Figure 1.3: A compartment C.
ing of the proportional transfer
rate and consider other possible
alternatives.
Consider a general compartment C of total population size
N (t), where individuals leave the compartment at a rate rN (t)
(r > 0). Then the size N (t) satisfies
dN (t)
= −rN (t), r > 0,
dt
and thus N (t) = N0 e−rt , or
N (t)
= e−rt . (1.6)
N0
Therefore, e−rt gives the fraction of the population that remains
in the compartment C. In probability terms, e−rt is the probabil-
ity of an individual entering C at time t = 0 and remaining in C

at time t > 0. Since we are interested in the population transfer
out of C, we consider
⎧
⎨1 − e−rt , t ≥ 0,
F (t) = ⎩ (1.7)
0, t < 0,
which gives the fraction of the population that has left C during
the time period [0, t), or the probability of an individual who has
left C during [0, t). Here we see that F (t) has the characteristics of
a probability distribution. In fact, if we let X denote the random
variable of the residence time of an individual in compartment C,
i.e., the time period from entrance to exit, we see that
F (t) = Prob [ X ≤ t ]. (1.8)
In other words, F (t) is the probability distribution function of

the individual residence time in C, and it satisfies the standard
properties of a probability distribution function:
(1) F (t) ≥ 0
(2) F (t) → 0, as t → −∞
(3) F (t) → 1, as t → +∞.
F
1
0.8
0.6
0.4
0.2
-40 -20 20 40 t
Figure 1.4: A probability distribution function.

Now we see that the assumption of proportional exit rate is

the same as the following:
(H ) the residence time of an individual in compartment C has
an exponential distribution.
We can also describe the random variable X in terms of the
probability density function f (t) = dtd F (t), namely:
⎧
⎨re−rt , t ≥ 0,
f (t) = ⎩ (1.9)
0, t < 0,
which has the following properties:
(1) f (t) ≥ 0
∞
(2) −∞ f (t)dt = 1
t
(3) F (t) = Prob [ X ≤ t ] = −∞ f (s)ds.
An example of a probability density function with an exponen-

tial distribution is shown in Figure 1.5. The probability density
f (t) gives the proportion of individuals with residence time t. We
see from the graph of f (t) in Figure 1.5 that a larger propor-
tion of individuals have a small residence time and hence exit the
compartment shortly after entry, while a smaller proportion of
individuals exit after a longer time. This is the hallmark of the
exponential distribution.
Figure 1.5: A probability density function.

The expected value, also called the mean value, of X is

∞
1
E[X] = tf (t)dt = . (1.10)
−∞ r
Therefore, the mean residence time, under the proportional exit
rate assumption, is 1r .
For transfers from compartment I to R, the residence time is
the period between time of infection and time of recovery, which
is the infectious period. The following transfer diagram:
γI
I / R
is equivalent to assuming that the infectious period of individuals

has an exponential distribution
⎧
⎨1 − e−γt , t ≥ 0,
F (t) = ⎩
0, t < 0,
and γ1 is the mean infectious period. Similarly, in the transfer

diagram
R
δR / S
the residence time is the immune period – the period recovered

individuals are protected from reinfection – and proportionate
rate δR assumes that the immune periods of individuals have an
exponential distribution
⎧
⎨1 − e−δt , t ≥ 0,
F1 (t) =
⎩0, t < 0,
and the mean immune period is 1δ .

A natural question is: how do we derive the model equations
if individual infectious periods have a general probability distri-
bution function F (t)?
We will revisit the SIR model (1.3)–(1.5) with this general

assumption. We see that the change in assumption for the infec-
tious period does not impact the S equation (1.3). We need to
re-derive equations for I(t) and R(t) based on this new assump-
tion. Let the residence time X have the probability distribution
function F (t) = Prob[X ≤ t] as defined in (1.8). We consider the
associated survival function
G(t) = 1 − F (t) = Prob[X > t].
For any given time τ > 0, G(t − τ ) is the fraction of individuals

who become infected at time τ > 0 and are still infectious at time
t (t > τ ). Therefore, the number of individuals who are infected
at time τ and remain infectious at time t is given by
λI(τ )S(τ )G(t − τ ).
Therefore, at time t, the number of individuals who have accu-

mulated in the I compartment since τ = 0 is
t
I(t) = I0 (t) + λI(τ )S(τ )G(t − τ ) dτ, (1.11)
0
where I0 (t) is the cumulative number of individuals who are

already infected at τ = 0 and remain infectious at time t > 0.
Similarly, the R equation is given by
t
R(t) = R0 (t) + λI(τ )S(τ )(1 − G(t − τ )) dτ. (1.12)
0
We see that our new SIR model (1.3), (1.11) and (1.12) with gen-
erally distributed infectious periods is a system of differential-
integral equations.
In the special case when F (t) = 1 − e−γt is an exponential dis-
tribution, we have G(t) = e−γt , and from (1.11)
t
I(t) = I0 (t) + λI(τ )S(τ )e−γ(t−τ ) dτ,
0
and I0 (t) = I(0)e−γt by (1.6). Therefore,

t
I (t) = I0 (t) + λI(t)S(t) − γ λI(τ )S(τ )e−γτ dτ
0
= I0 (t) + λI(t)S(t) − γ(I(t) − I0 (t)).
Since I0 (t) = −γI0 (t), we obtain the original equation (1.3)
I (t) = λI(t)S(t) − γI(t).
Similarly, we have from (1.12)
R (t) = γI(t).
This confirms our claim that the SIR model in the form of ODE
(1.3)–(1.5) assumes an exponentially distributed infectious period.
Let us take another special case when infectious periods have
the following survival function:
⎧
⎨1
if − ∞ < t < ω,
G(t) = ⎩ (1.13)
0 otherwise.
Correspondingly, F (t) = 1 − G(t) is the Heaviside function at

t = ω, for some constant ω > 0. Then, the equation for I(t)
becomes t
I(t) = I0 (t) + λI(τ )S(τ ) dτ,
t−ω
and thus, for t > ω,
I (t) = I0 (t) + λI(t)S(t) − λI(t − ω)S(t − ω).
Since I0 (t) = 0 for t > ω, we arrive at a differential equation with

a time delay ω,
I (t) = λI(t)S(t) − λI(t − ω)S(t − ω). (1.14)
Similarly, for t > ω,

t−ω
R(t) = R0 (t) + λI(τ )S(τ ) dτ,
0
and
R (t) = λI(t − ω)S(t − ω). (1.15)
Here we have used the fact that R0 (t) = 1 for t > ω. We see
here that the delayed differential equation model (1.3), (1.14) and
(1.15) is the result of assuming that the infectious periods have a
Heaviside distribution function, or Dirac delta density function,
at t = ω.
Exercises.
(1) Show that ⎧

⎨0 t<0
f (t) = ⎩ −γt
γe t≥0
is a probability density function on (−∞, ∞), and that the
probability mean of an exponentially distributed random
variable X ≥ 0 is given by
∞
1
E[X] = γ te−γt dt = .
0 γ
(2) Compute the probability mean for the distribution function
given in (1.13).
(3) For the transfer diagram in Figure 1.2, derive the equations
for I(t) and R(t) when the infectious period has a gen-
eral distribution function P (t). Prove that, when P (t) =
1 − e−γt for t ≥ 0 and P (t) = 0 for t < 0, the equations
become differential equations (1.4)–(1.5).
1.4.2 Modeling Disease Incidence

Disease incidence is the rate at which new infections occur;
namely, the number of newly infected individuals (both reported
cases and asymptomatic individuals) per unit time. This should be
distinguished from disease prevalence, which measures the number
of infected individuals at time t.
1. Simple incidence forms.

The incidence term in the Kermack–McKendrick model in
Section 1.3 is given by βIS, which is often called simple mass-
action incidence, or bilinear incidence. It is based on the Law of
Mass Action, first derived for chemical kinetics in 1864 by Waage
and Guldberg [1, 41, 59], which states that the rate of chemical
reaction is proportional to the density of the reactants:
A + B ⇐⇒ A B
rate = k [A] [B].
The underlying assumptions of the Law of Mass Action are:
(1) Homogeneous mixing so that contact rate only depends on

the density of reactants.
(2) Conservation of total mass.
(3) Low reactant densities.
We see that, for contacts among individuals (human, animals,

or cells), these assumptions are at best crude approximations.
Modification of any of these assumptions will lead to different
incidence forms.
2. The effects of saturation.
If a host population is saturated with infectious individuals,
the rate of new infections will only be determined by the number
of susceptibles S, and homogeneous mixing is no longer valid.
To describe incidence rate in this situation, we can learn from
standard theory of enzyme kinetics. The reaction rate v0 has a
nonlinear dependence on the concentration [A] of the substrate A
given by the Michaelis–Menten equation:
vmax [A]
v0 = . (1.16)
K + [A]
In this equation, vmax is the maximum reaction rate, and K is the
Michaelis–Menten constant [44, 48].
v0
vmax
vmax
2
K [A]
Figure 1.6: Michaelis–Menten reaction curve.

We see from the graph of the Michaelis–Menten reaction that
when substrate concentration [A] is large, reaction rate v0 satu-
rates at the constant Vmax .
The function
ax
f (x) = (1.17)
K +x
is often called the Monod function, or Holling’s type II function
in ecology literature [5, 45, 48, 55]. The saturation effect on the
disease incidence can be modeled using the Monod function to
give
βIS
, (1.18)
K +I
so that when the population of I is large, the incidence rate is
approximately βS. Other incidence forms can be
βI m S
, m > 0. (1.19)
K + Im
Similarly, saturation of susceptibles may be modeled by
βIS n
, n > 0. (1.20)
K + Sn
3. Varying total population size.

Another way to derive disease incidence is the following: let
S(t), I(t), R(t), and N (t) = S(t) + I(t) + R(t) denote the sizes
of the susceptible, infectious, recovered, and total population,

respectively. Let λ be the average per capita contact number
among individuals per unit time, and p the probability that a
contact will produce an infection. Then the incidence is given by
S(t)
pλ · · I(t),
N (t)
which can be interpreted as
average number of probability that a

effective contacts contact is made total number of
· ·
produced by each with a susceptible infectious individuals
infectious individual individual
If we combine the probability p with the contact number λ, so that

λ is the per capita effective contact number, then the incidence is
given by
λ
IS. (1.21)
N
When the total population size N is a constant, then the incidence
form in (1.21) is the same as the simple mass-action incidence form
with β = λ/N . However, when N (t) varies with t, things become
more complicated. We consider the following two simple cases:
Case 1. The effective contact number λ is independent of the total

population size. In this case, the incidence is given by
λI(t)S(t)
, (1.22)
N (t)
where λ is a constant. This form is called the proportion-
ate incidence or standard incidence.
Case 2. The effective contact number is proportional to the total
population size, namely
λ(N ) = βN
for a constant β. Then the incidence is again of the simple

mass-action form
βI(t)S(t). (1.23)
Suitability of these different incidence forms has been dis-
cussed in many research articles [6, 13, 26, 27, 66].
We examine the different assumptions behind these two inci-

dence forms. The difference lies in the assumption on how con-
tact rate varies with the total population size. It is reasonable to
assume that the contact rate is in proportion to the total popu-
lation density. Therefore, the difference between incidence forms
(1.22) and (1.23) lies in the assumption on how total population
density changes as the total population size varies.
Assumption 1: population density is independent of population
size. This is likely the case in a rural population; as population
size increases, rural towns tend to expand to maintain a constant
population density.
Assumption 2: population density is in proportion to population
size. This is likely the case for a large urban population where
the city is confined in space; an increase in population size will
proportionally increase the population density.
In this sense, proportionate incidence (1.22) is more suitable
for a population in a rural setting, whereas the bilinear incidence
(1.23) is suitable for a large urban population.
Above discussions are applicable to diseases such as influenza
and tuberculosis, for which the transmission occurs through air-
borne droplets from coughing. We note that an exception is
sexually transmitted diseases in which the transmission occurs
between sexual partners. In this case, the contact number may
not vary with the population density. The standard incidence λIS N
is commonly used for sexually transmitted diseases.
1.4.3 Demography: Birth, Death, and

Population Growth
To incorporate demographic factors into the Kermack–
McKendrick models (1.3)–(1.5), we need to make various assump-
tions on the birth, death, and growth of the host population, the
simplest of which is the proportional rate assumption that the
birth or death rate is proportional to the population size. A model

that incorporates these assumptions is depicted in the diagram in
Figure 1.7 with the corresponding system of differential equations
(1.24)
S (t) = bN (t) − λI(t)S(t) − d1 S(t)
I (t) = λI(t)S(t) − (γ + d2 )I(t)
(1.24)
R (t) = γI(t) − d3 R(t)
N (t) = S(t) + I(t) + R(t)
Here b is the natural birth rate constant, and d1 , d2 , and d3 are
death rate constants for compartments S, I, and R, respectively.
Rate constant d2 may include both natural and disease-caused
death. If we add the first three equations in (1.24), we obtain
bN λIS γI
S I R
d1 S d2 I d3 R
Figure 1.7: Transfer diagram for an SIR model with birth and
death.
N (t) = bN (t) − d1 S(t) − d2 I(t) − d3 R(t).

In general, when di are different, this implies that N (t) will vary
with time. In the special case when d1 = d2 = d3 = d, we have
N (t) = (b − d)N (t),
and thus
N (t) = N0 e(b−d)t .
If b > d, N (t) → ∞ exponentially as t → ∞; if b < d, N (t) → 0
exponentially as t → ∞; and if b = d, N (t) ≡ N0 , a constant.
Exponential growth and decay of the total population are often
modified by the logistic growth,
N 2 (t)
N (t) = (b − d)N (t) − , (1.25)
K
where b, d are natural birth and death rates, respectively, and

(b − d)K is the carrying capacity for the population. The logistic
growth can be incorporated into the Kermack–McKendrick model
as follows:
N 2 (t)
S (t) = bN (t) − − λI(t)S(t) − dS(t)
K
(1.26)
I (t) = λI(t)S(t) − (γ + d)I(t)
R (t) = γI(t) − dR(t).
For more discussions on epidemic models with density-dependent

demographics, we refer the reader to [18].
Another source of new susceptible individuals is migration or
immigration. A simple way of incorporating the immigration into
a model is to assume that the number of new immigrants is a
constant A per unit time. For instance, we may have the transfer
diagram with immigration as shown in Figure 1.8, and derive the
corresponding differential equations as before. We also note that,
rates d1 S, d2 I, and d3 R can also be interpreted as removal rates,
which can include removal of individuals from a compartment due
to death and out-migration.
A λIS γI
S I R
d1 S d2 I d3 R
Figure 1.8: Transfer diagram for an SIR model with migra-

tion/immigration and removal.
Exercises.
(1) Derive the system of differential equations for the model

with transfer diagram in Figure 1.8.
(2) Consider the case that new migrants into the population
consist of both susceptible and infectious individuals. Draw
a transfer diagram that represents this situation and derive
the corresponding differential equations.
1.4.4 Disease Latency: Latent and

Incubation Periods
Disease infection begins with the transmission of the pathogen
from one host to another. After pathogens invade the host body,
they need to be able to evade or overcome the host immune
response and be able to multiply or replicate. When the pathogens
accumulate sufficiently in large numbers and reach the target
organs, they begin to cause sufficient damage to the host body
so that the host becomes symptomatic and the host is capable
of transmitting the pathogens to others. For some infections, the
symptoms may appear after the host becomes contagious. The
period from time of infection to time of onset of symptoms is
called the incubation period. The period from time of infection to
time of being contagious or infectious is called the latent period.
The period during which the host is infectious is called the infec-
tious period. See the illustration in Figure 1.9 for an example of
relations between these periods. During the latent period, a host
may or may not show symptoms, but the host is not capable of
transmitting pathogens to other hosts. In Figure 1.9, the onset of
the symptoms is shown to occur after the host becomes infectious.
We note that the reverse could also happen, and the incubation
period can be shorter than the latent period for certain diseases.
To incorporate the disease latency in a mathematical model,
we need to make some basic assumptions about the latency of
the disease. The simplest is to divide the infected compartment
into two compartments: a latent compartment E and an infectious
compartment I, and assume that the transfer from E to I satis-
fies the proportional rate assumption, namely, given by E with
rate constant . From our discussions in Section 1.4.1, we know
that this is equivalent to assuming that individual latency has an
exponential distribution, and 1/ is the mean latent period. We
have a new transfer diagram as shown in Figure 1.10, which leads
to a system of differential equations (1.27).
Beginning of End of
infectiousness infectiousness
Onset of
Time of Infection symptoms Recovery
Latent period Infectious period
Incubation period
Figure 1.9: A graphical illustration of incubation, latent and infec-

tious periods.
bN λIS γI
S E I R
d1 S d2 E d3 I d4 R
Figure 1.10: Transfer diagram for an SEIR model.
S (t) = bN (t) − λI(t)S(t) − d1 S(t)

E (t) = λI(t)S(t) − ( + d2 )E(t)
I (t) = E(t) − (γ + d3 )I(t) (1.27)
R (t) = γI(t) − d4 R(t)
N (t) = S(t) + E(t) + I(t) + R(t).
This is an SEIR model. Variations of the model can be derived
when different incidence and demographic terms are used.
1.4.5 Acquired Immunity

When an infected host recovers from an infection, it usually main-
tains a certain degree of immunity against reinfection from the
same strain of pathogen. If the infection has caused a humoral
immune response, antibodies produced by the host usually remain
in the body for a period of time and guard the body from the same
antigens. Memory T lymphocytes in the immune system have the
ability to recognize the specific antigen against which they are
specifically created, and persist long past the primary infection
for the purpose of mounting a much quicker immune response
when the same pathogenic antigen is recognized. Mild immune
responses can also be induced by inoculation or immunization so

that the body is pre-stocked with the correct antibodies or mem-
ory T cells to fight the infection when it occurs.
Without exposure to reinfection, immunity against a specific
disease will wane and eventually disappear. Certain diseases such
as measles are known to cause permanent immunity in humans
so that no reinfection occurs after recovery. In terms of compart-
mental models, loss of immunity results in a transfer of recovered
individuals back to the susceptible compartment, as illustrated
in Figure 1.11. We assume that the transfer rate is proportional
to the size of the compartment. Other assumptions on the rate of
transfer can be made according to our discussions in Section 1.4.1.
δR
R S
Figure 1.11: Loss of immunity resulting in individuals in compart-

ment R moving into S.
Exercise. Draw the transfer diagram for an SEIRS model with
disease latency and temporary immunity. Derive the differential
equations. You may want to consider different incidence forms.
1.4.6 Routes of Transmission: Horizontal and

Vertical
In the Kermack–McKendrick model, the infection is assumed to
occur by direct contact of an infectious and a susceptible host.
This is often called horizontal transmission. Other routes of trans-
missions exist for many diseases. One example is vertical transmis-
sion, in which the pathogens are passed to a newborn directly from
an infected mother. Example of diseases that can be transmitted
vertically include HIV/AIDS, Chagas disease, and Hepatitis B.
To model vertical transmission, we assume that a fraction p of
newborns from the infected population becomes infected at birth,
and the remaining fraction (1 − p) are susceptible. The transfer
diagram in Figure 1.12 illustrates a disease spread with both hor-
izontal and vertical transmissions.
bN − pbI λIS γI
S I R
pbI
Figure 1.12: Transfer diagram for an SIR model with vertical

transmission.
In Figure 1.12, bN is the total number of newborns with natu-

ral birth rate b, pbI is the number of newborns who are infected
at birth, bN − pbI is the number of healthy but susceptible new-
borns. The resulting model is described by the following system
of differential equations:
S (t) = bN (t) − pbI(t) − λI(t)S(t)

I (t) = pbI(t) + λI(t)S(t) − γI(t) (1.28)
R (t) = γI(t).
A good reference for mathematical modeling of vertically trans-

mitted diseases is the book by Busenberg and Cooke [8].
Exercise. Consider an SEIR model with both horizontal and ver-

tical transmission. What assumptions can one make about the
newborns of mothers from the E and I compartments? Should
the infected newborns enter the E compartment or the I com-
partment, or both? Think of the possibilities. Draw transfer dia-
grams according to different assumptions you make, and derive
the corresponding differential equations.
1.4.7 Heterogeneity: Age and Social Groups,

Differential Infectivity, Multiple Hosts,
and Disease Vectors
The Kermack–McKendrick model considers a single homogeneous
host population. In reality, host populations are far from homoge-
neous. Many characteristics of the host population can contribute
to heterogeneity. Age is an important source of heterogeneity;
mixing is often preferential to individuals in the same age group,
vulnerability to a particular disease and the case fatality rate

generally vary greatly across different age groups. Certain social
groups may have much higher incidence of sexually transmitted
diseases than the general population. Different ethnic groups may
have different susceptibilities to certain diseases. Another impor-
tant factor for heterogeneity in disease transmission is the spatial
spread; an infectious disease typically starts from an outbreak
at one location and spreads elsewhere. For vector-borne diseases
and diseases that involve several hosts, there exists heterogeneity
among the disease vectors, intermediate hosts and human hosts.
These factors need to be incorporated into a mathematical model
to represent realistically the disease transmission process. In this
section, we will discuss how to incorporate age structures into our
basic model.
Let S(a, t), I(a, t), and R(a, t) be the number of individuals
who are susceptible, infectious, and recovered at age a and time
t, respectively. Here a, t are considered as independent variables.
The rate of change of the population in each compartment should
account for changes due to time, as we have seen in Section 1.2,
and changes due to aging, which are described as partial deriva-
tives with respect to a: ∂S(a,t)
∂a
, ∂I(a,t)
∂a
, and ∂R(a,t)
∂a
, respectively. The
Kermack–McKendrick model with age structure becomes a sys-
tem of integral partial differential equations:
∞
∂S(a, t) ∂S(a, t)
+ = −S(a, t) λ(a, b)I(b, t)db
∂t ∂a 0
∞
∂I(a, t) ∂I(a, t)
+ = S(a, t) λ(a, b)I(b, t)db − γ(a)I(a, t) (1.29)
∂t ∂a 0
∂R(a, t) ∂R(a, t)
+ = γ(a)I(a, t).
∂t ∂a
Birth terms will appear as boundary conditions at a = 0:
∞
S(0, t) = [b(a)N (t, a) − p(a)b(a)I(t, a) ]da, S(a, 0) = ψ(a),
a
0∞
I(0, t) = p(a)b(a)I(t, a) da, I(a, 0) = φ(a),
(1.30)
a0
R(0, t) = 0, R(0, a) = ξ(a).
Here, parameter b(a) denotes the birth rate for populations at age
a, and p(a) is the fraction of infectious newborns from populations
at age a. The individuals are assumed to become child-bearing at
age a0 . The term p(a)b(a)I(t, a) in the boundary conditions is due
to vertical transmission, and (ψ(a), φ(a), ξ(a)) denotes the initial
age distribution. The age-dependent force of infection is given by
∞
λ(a, t) = λ(a, b)I(b, t)db,
0
where λ(a, b) is the transmission coefficients from the subpopula-

tion at age b to the subpopulation at age a.
For discussions and analysis of age-structured epidemic models,
see [32, 33, 63].
1.4.8 Disease Control and Prevention

Measures: Immunization and
Quarantine
Apart from medical treatment, two of the most effective and
widely used prevention and control measures for infectious dis-
eases are immunization and quarantine.
An effective vaccine can protect an otherwise susceptible host
against possible infection. By immunizing a large portion of the
susceptible host population before or at the early phase of a dis-
ease outbreak, we can reduce the initial number S0 of suscepti-
bles to a level that is below the threshold βγ , and by our thresh-
old result in Section 1.3, a full-blown epidemic can be prevented.
From a compartmental modeling viewpoint, vaccination moves
susceptible hosts directly to the recovered compartment without
going through the I compartment. If we assume that a fraction p
of all susceptibles is vaccinated per unit time, then we arrive at
the transfer diagram in Figure 1.13 and a system of differential
equations (1.31).
S = bN − λIS − dS − pS
I = λIS − (d + γ)I
(1.31)
R = pS + γI − dR
N = S + I + R.
pS
bN λIS γI
S I R
dS dI dR
Figure 1.13: Transfer diagram for an SIR model with vaccination.
One of the issues with vaccination is that some vaccines can be

“leaky” or imperfect; namely, the immune protection of hosts is
not perfect and immunized hosts can still get infected, albeit with
a smaller probability than unvaccinated hosts. To model a leaky
vaccine, we add a new compartment V of vaccinated hosts and an
additional incidence term αλV S due to the infection of vaccinated
individuals. We obtain the transfer diagram in Figure 1.14.
λIS γI
S I R
pS αλIV
Figure 1.14: A vaccination model with a leaky vaccine.
In Figure 1.14, 0 < α ≤ 1 denotes the reduced probability of

transmission due to immune protection from the imperfect vac-
cine. A set of differential equations can be readily derived based
on the transfer diagram.
Quarantine is a measure that isolates infectious individuals and
hence prevents them from infecting others. Through quarantine,
we can slow down and even stop the transmission process. We
introduce a new compartment Q for the quarantined individuals,
and assume that quarantine is carried out in such a way that

a fraction 0 < p ≤ 1 of infectious individuals will be isolated. A
simple transfer diagram is depicted in Figure 1.15, where δ is the
rate constant for the recovery of quarantined individuals.
λIS γI
S I R
δQ
pI
Figure 1.15: An SIR model with quarantine.
Exercise. Derive systems of differential equations according to

transfer diagrams depicted in Figure 1.14 and Figure 1.15.
1.4.9 The Basic Reproduction Number R0

The basic reproduction number R0 , also called the basic repro-
ductive number or the basic reproductive ratio, is the single most
important parameter in epidemic modeling. It measures the aver-
age number of secondary infections caused by a single infectious
individual in an entirely susceptible population during the mean
infectious period.
In the context of the Kermack–McKendrick model, R0 can be
expressed as
1
λ · S0 ·
γ
average number of initial mean

effective contacts of a · susceptible · infectious
single infectious host population period
Using R0 , the threshold phenomenon described in Section 1.3 can

be expressed as follows:
If R0 < 1, then an epidemic will not occur;

If R0 > 1, an epidemic will occur.
We will see in later chapters that threshold results in this form

occur even in more complex epidemic models.
b λIS γI
S E I R
bS bE bI bR
For the SEIR model with the transfer diagram in Figure 1.16,
the basic reproduction number is given by
1
R0 = λ · · ,
+b γ+b
average number of probability of a newly mean

effective contacts of a · infected host surviving · infectious .
single infectious host the latent period period
1
Note that the mean infectious period γ+b is understood as the
mean period an individual remains alive and infectious. We also
note that the initial susceptible population does not appear in R0
in this case. The reason is that, in this model, the constant total
population N = S + E + I + R is scaled to 1.
When models get more complex, R0 may be harder to derive
directly from the transfer diagram. Other methods for deriving
R0 exist. Most of them are based on the stability analysis of the
disease-free equilibrium. In the later chapters, we will illustrate
various ways to derive and interpret R0 .
For more detailed description and discussions on the basic
reproduction number for epidemic models, we refer the reader
to [2, 7, 14, 15, 27, 57, 60]. A practical approach to the compu-
tation of R0 for complex epidemic models is given in [64].
Chapter 2
Five Classic Epidemic

Models and Their
Analysis
In this chapter, we present some standard mathematical meth-

ods for the analysis of compartmental epidemic models. We have
chosen five classic epidemic models to demonstrate these methods.
We start from the basic Kermack–McKendrick model and progres-
sively expand it to a model with demography, and then introduce
the Ross–MacDonald model for malaria. Each model is chosen
to illustrate a specific mathematical approach for model analysis:
the method of first integrals and level curves, the phase-line analy-
sis, phase-plane analysis, reduction of dimension using homogene-
ity, and monotone dynamical systems. The general mathematical
theories applied in this chapter are provided in Chapter 3 for
reference and in-depth learning. Students in mathematics have a
chance to learn these general theories in the setting of epidemic
models and see how abstract theories of differential equations are
applied to real-world problems. Students in public health and bio-
logical sciences will be able to learn the basic model analysis and
gain exposure to some abstract mathematical concepts such as
stability and bifurcations explained in the context of epidemiol-
ogy, as well as to the theory of modern differential equations.
Section 2.1 contains analysis of the Kermack–McKendrick
model introduced in Section 1.3. The main mathematical tech-
nique presented is the method of first integrals and how to obtain

https://doi.org/10.1007/978-3-319-72122-4_2
36 Five Classic Examples
orbits of solutions from level curves of first integrals. Epidemio-

logical concepts discussed include the basic reproduction number,
final size formula, and threshold phenomenon for epidemics.
Section 2.2 discusses a simple model for diseases without
immunity. The resulting model equation is a 1-dimensional differ-
ential equation. We present the method of phase-line analysis to
illustrate important concepts of equilibria, stability, bifurcation,
and bifurcation diagrams.
Section 2.3 deals with a model with demography which leads
to a 2-dimensional system of ordinary differential equations. We
present the standard phase-plane analysis, which is based on the
classical Poincaré–Bendixson theory, one of the most beautiful
theories in differential equations. The techniques used include
local stability analysis by linearization, global stability using the
method of Lyapunov functions, and global convergence using the
Bendixson–Dulac criteria and the Poincaré–Bendixson Theorem.
Section 2.4 investigates a model with the proportionate inci-
dence term and exponential birth and death terms. The resulting
system of equations is 3-dimensional and does not allow us to
apply the phase-plane analysis directly. The homogeneous prop-
erty satisfied by the model system can be used to reduce the
dimension by one. We show how to derive a system of equations
for the fractional variables that becomes 2-dimensional. Math-
ematically, this amounts to projecting the original system onto
a hyperplane or sphere. We show how a complete understand-
ing of the projected system can help us to analyze the original
3-dimensional system.
Section 2.5 deals with the classic Ross–MacDonald model for
malaria. We will show how to derive a model for vector-borne
diseases. The mathematical techniques used in this section are
part of the monotone dynamical systems theory.
The models discussed in this chapter are kept simple and analy-
sis mostly elementary. Some of the same mathematical approaches
demonstrated using these simple models are used for analysis of
more complex models in Chapter 5.
2.1 Kermack–McKendrick Model

In this section we carry out detailed mathematical analysis and
derive important properties of solutions to the Kermack–
McKendrick model,
dS
= −βIS
dt
dI
= βIS − γI (2.1)
dt
dR
= γI,
dt
with initial conditions S(0) = S0 > 0, I(0) = I0 > 0, and R(0) =
R0 ≥ 0.
2.1.1 Simple Properties of Solutions

Property 1. Model (2.1) is well posed.
By well-posedness we mean that nonnegative initial conditions
lead to nonnegative solutions, namely, S0 ≥ 0, I0 ≥ 0, and R0 ≥ 0
imply S(t) ≥ 0, I(t) ≥ 0, and R(t) ≥ 0 for t ≥ 0. Another way to
describe this is that the nonnegative cone of R3 ,
R3+ = { (S, I, R) ∈ R3 | S ≥ 0, I ≥ 0, R ≥ 0 }, (2.2)
is positively invariant with respect to (2.1).

Positive invariance of R3+ can be verified by examining the
direction of the vector field (−βIS, βIS − γI, γI)T of (2.1) on
each coordinate plane. We want to show the vector field is either
tangent to the coordinate or pointing to the interior of R3+ .
(1) On the SR-plane: I = 0 on this plane, and

dI
= 0.
dt I=0
This shows that the vector field on the SR-plane is tangent to
the SR-plane. This tangency also implies that the SR-plane itself
is invariant – solutions starting on the SR-plane remain on the
same plane. Biologically, this says that if there is no infection at

the beginning, there remains no infection.
(2) On the IR-plane: S = 0 on this plane, and

dS
= 0.
dt S=0
Therefore, the IR-plane is also invariant. No solutions in the inte-
rior of R3+ can escape through the IR-plane or the SR-plane.
(3) On the SI-plane: R = 0 on this plane, and

dR
= γI ≥ 0.
dt R=0
Therefore, the vector field on the SI-plane points to the interior
of R3+ . No solutions can escape the interior through the SI-plane.
We thus have shown that all solutions starting in R3+ remain in
R3+ for t > 0.
Property 2. Total population is constant.
Let N (t) = S(t) + I(t) + R(t), N0 = S0 + I0 + R0 . Adding all
three equations in (2.1) we obtain
N (t) = 0 for all t ≥ 0,
which implies N (t) = N0 for all t ≥ 0.

Property 3. Solutions to (2.1) exist for t ∈ [0, +∞).
From Properties 1 and 2 we know that solutions (S(t), I(t),
R(t)) are bounded in their maximal interval of existence. Funda-
mental theory of differential equations tells us that solutions can
be extended for all positive time.
Property 4. Limit limt→∞ (S(t), I(t), R(t)) = (S(∞), I(∞), R(∞))
exists.
From (2.1) we know
S (t) = −βI(t)S(t) ≤ 0.
Therefore, S(t) is decreasing and bounded below by 0, and thus

S(∞) ≥ 0 exists. Similarly,
R (t) = γI(t) ≥ 0
and R(t) is increasing and bounded above by N0 . Therefore,

R(∞) ≥ 0 exists. From
I(t) = N0 − S(t) − R(t)
we know that I(∞) = N0 − S(∞) − R(∞) ≥ 0 exists.

Property 5. S0 > 0 and I0 > 0 imply 0 < S(∞) < S0 and
I(∞) = 0.
Without loss of generality we may assume that R0 = 0, since
otherwise, we may choose N̄0 = N0 − R0 . Then S(t) > 0, I(t) > 0,
and R(t) > 0 for t > 0. Dividing the equations for S and R, we
have
dS β
= − S.
dR γ
Solving this equation for S we obtain
β β
S(R) = S0 e− γ R ≥ S0 e− γ N0 > 0. (2.3)
Therefore, 0 < S(∞) < S0 .

From the S equation and the fact that S(∞) and I(∞) exist,
we know that
lim S (t) = −βI(∞) S(∞) exists.

t→∞
Furthermore, limt→∞ S (t) = 0. Otherwise, if limt→∞ S (t) =

α < 0, then S (t) < α/2 for t ≥ T, for sufficiently large time T ,
and thus
α 2S(T )
S(t) < S(T ) + (t − T ) < 0 for t > T − .
2 α
This contradicts S(t) > 0 for all t > 0. Therefore, S(∞)I(∞) = 0.
Since S(∞) > 0, we know I(∞) = 0.
We can draw two important biological conclusions based on
these properties:
(1) The disease eventually dies out (I(∞) = 0). The model
describes an epidemic disease.
(2) There is always a positive fraction S(∞)/S0 of susceptibles

that escape infection at the end of the epidemic. The epi-
demic does not stop because of complete exhaustion of sus-
ceptibles.
The second conclusion begs the question: why does an epidemic

first rise, peak, and then decline? We address this question in the
next subsection.
2.1.2 Phase Portrait in the SI-Plane:

Epidemic Curves
For a better understanding of the behaviors of solutions, we try to
view their orbits in the SI-plane. Consider only the S, I equations:
dS
= −βIS
dt (2.4)
dI
= βIS − γI
dt
in the first quadrant of the SI-plane. Dividing the two equations
we obtain
dI γ ρ
= −1 + = −1 + , (2.5)
dS βS S
where ρ = γ/β is the threshold number for S0 . Integrating (2.5)
we obtain
φ(S, I) = I + S − ρ log S = C, (2.6)
where C is an integration constant and can be determined from
S0 and I0 . The function φ(S, I) = I + S − ρ log S is called a first
integral of system (2.4), since it satisfies
d S (t)
φ(S(t), I(t)) = I (t) + S (t) − ρ = −γI(t) + ρβI(t) = 0
dt S(t)
for all t. Therefore, function φ(S, I) remains a constant along
a solution (S(t), I(t)) of (2.4). This implies that trajectories
(S(t), I(t)) of system (2.4) are given by the family of level curves
φ(S, I) = C of the first integral φ(S, I), or equivalently defined by
equation (2.6). This gives us a way to plot trajectories of (2.4) by
plotting level curves of a first integral φ(S, I) for different values

of C. These curves are depicted in Figure 2.1.
We can observe several characteristics of the phase trajectories.
(1) The maximum value Imax of I is achieved when S = ρ, the

threshold value. This fact is also clear from the equation of
I, since I = 0 if and only if S = ρ, and S = ρ is a critical
point for I.
(2) If S0 < ρ, then I(t) decreases monotonically as t increases
and the epidemic declines and no new outbreaks occur.
(3) If S0 > ρ, then I(t) initially increases monotonically while
S(t) decreases, I(t) peaks when S = ρ, and then as S(t)
decreases below ρ, I(t) decreases to 0. This gives the rise–
peak–decline cycle of an epidemic.
Figure 2.1: Family of epidemic curves.

Clearly, cases (2) and (3) confirm the threshold phenomenon we

have observed in Section 1.3 and demonstrate the important role
threshold parameter ρ = γ/β plays in determining the fate of an
epidemic.
The threshold condition S0 > ρ can also be interpreted in the
equivalent form S0 /ρ > 1, or
β
S0 > 1.
γ
From Section 1.4.9, we know that the expression β S0 /γ repre-
sents the average number of secondary infections during an mean
infectious period in an initially susceptible population S0 , which
is the basic reproduction number R0 . Therefore, the conclusions
in (2) and (3) can be restated as the following threshold result:
no outbreaks will occur if the basic reproduction number R0 < 1;
and an initial case will lead to an outbreak if R0 > 1.
An important epidemiological message from the mathematical
theory is that, in the natural history of an epidemic, the decline
after peak happens when the available susceptible population
declines below the threshold value ρ = γ/β. Interventions that
decrease the transmission coefficient β by reducing chances of con-
tact or increase γ by shortening the infectious period will impact
the threshold value, and in turn change the timing of the peak.
2.1.3 Final Size Formula and the Severity

of an Epidemic
Recall equation (2.6)
φ(S, I) = S + I − ρ log S = C.
Assume that I0 ≈ 0. From
φ(S0 , 0) = φ(S∞ , 0),
we obtain
S0 − S∞ = ρ(log S0 − log S∞ ), (2.7)
and
S0 − S∞
ρ= . (2.8)
log S0 − log S∞
Note that S∞ gives the number of susceptible individuals who
escape the epidemic and can be used as an indicator of the severity
of the epidemic. Equation (2.7) is often called the final size equa-
tion. If the basic reproduction number R0 = Sρ0 = βγ S0 is known
for a disease, then equation (2.7) can be used to estimate the
final size S∞ . On the other hand, after an epidemic, the final size
S∞ is known, and equation (2.8) can be used to estimate the
basic reproduction number R0 , which can give an estimate of the
transmission rate β.
The severity of an epidemic can also be measured in terms of
the cumulative number of infected individuals, also called the size
of an epidemic:
size of epidemic = S0 − S∞ . (2.9)
If we know that R(0) = 0 and I(0) ≈ 0, then we have
S0 − S∞ = R∞ . (2.10)
This relation is clear from biological interpretation. It can also be

derived mathematically. Dividing the equations for S and R in
(2.1) we have
dS
−ρ = dR.
S
Integrating both sides and using the final size formula (2.7) we
obtain
ρ(log S0 − log S∞ ) = R∞ = S0 − S∞ .
The third way to measure the severity of an epidemic is the
maximum number of infected individuals, Imax , which can be used
to predict whether there will be sufficient hospital capacity. From
relation (2.6) we obtain
C = I0 + S0 − ρ log S0 = N0 − ρ log S0 ,
and thus
I = −S + ρ log S + N0 − ρ log S0 . (2.11)

Since dI
dS
= 0 ⇐⇒ S = ρ, we can derive
Imax = I(ρ) = −ρ + ρ log ρ + N0 − ρ log S0 . (2.12)

Once the basic reproduction number R0 = S0
ρ
is estimated, Imax
can be estimated from this relation.
2.1.4 Kermack–McKendrick Threshold

Theorem
To arrive at their now famous threshold theorem, Kermack and
McKendrick obtained approximate solutions of model (2.1). Using
S + I + R = N0 , the R equation in (2.1) can be rewritten as
dR
= γ(N0 − R − S).
dt
Using relation (2.3), S(R) = S0 e−R/ρ , we obtain
dR R
= γ(N0 − R − S0 e− ρ ). (2.13)
dt
To obtain an approximation solution to (2.13), we use Taylor
expansion of e−R/ρ at R = 0 up to the second order and obtain
an approximation of equation (2.13):

dR S0 S0
= γ N 0 − S0 + − 1 R − 2 R2 . (2.14)
dt ρ 2ρ
Solving this equation we obtain

ρ2 S 0 1
R(t) = − 1 + α tanh αγt − φ , (2.15)
S0 ρ 2
where 1
S0 2 2S0 I0 2
α= −1 + ,
ρ ρ2
(2.16)
1 S0
φ = tanh−1 −1 .
α ρ
Therefore
dR γα2 ρ2 1
= sech2 αγt − φ . (2.17)
dt 2S0 2
If I0 ≈ 0 and S0 > ρ, then we have from (2.16), α = S0
ρ
− 1, and
from (2.15)

ρ2 S 0 γ S0
R(t) = −1 1 + tanh ( − 1)t − φ . (2.18)
S0 ρ 2 ρ
Letting t → ∞ and using limt→∞ tanh(at + b) = 1, we obtain an
approximate value for the size R∞ = S0 − S∞ of the epidemic

ρ
R∞ = 2ρ 1 − . (2.19)
S0
If we write S0 = ρ + ν for ν > 0, then

ρ ν
R∞ = 2ρ 1 − = 2ρ
ρ+ν ρ+ν
ρ
= 2ν ≈ 2ν.
ρ+ν
Namely, the size of the epidemic is roughly 2ν. Therefore,
S∞ = S0 − R∞ = ρ + ν − 2ν = ρ − ν.
This relation leads to the threshold theorem of Kermack–

McKendrick.
Theorem 2.1.1 (Threshold Theorem)
(1) An epidemic occurs if and only if S0 exceeds the threshold ρ.

(2) If S0 = ρ + ν, ν > 0, then after the epidemic, the number
of susceptible individuals is reduced by an amount approxi-
mately 2ν, namely, S∞ ≈ ρ − ν.
If all cases in an epidemic terminate in death, then R can be

considered the removed class, and dRdt
will be the rate for disease
death. This can be fitted to the weekly death register data. Ker-
mack and McKendrick used relation (2.17) to fit the data from the
Bubonic plague in Bombay in 1905–06 and obtained a very good
fit using a simple model. See [48], page 325, for a demonstration
of the fitting.
Exact solutions to (2.13) were obtained by Kendall in 1956. In
present days, numerical solutions can be easily obtained using a
computer software and are used to fit data.
Exercises.
(1) Solve equation (2.13) and find its exact solutions (hint: try
separation of variables).
(2) Numerically solve equation (2.13) and its second-order
approximation (2.14) using a mathematical software pack-
age. Compare solutions of the two equations that originate
from the same initial point. Describe what you observe
about the differences of the two solutions. Can you explain
the differences you observed? Based on these numerical sim-
ulations, what conclusions can you draw regarding the accu-
racy of the second-order approximation of equations (2.13)?
Based on your observations, what can you say about the
validity of statement (2) of Theorem 2.1.1?
2.2 A Model for Diseases with No

Immunity
Consider an infectious disease that confers no immunity. Ignor-
ing demography and latency, we have the transfer diagram in
Figure 2.2.
2.2 A Model for Diseases with No Immunity 47
Figure 2.2: Transfer diagram for an SIS model.
Infected individuals return to the susceptible compartment

upon recovery, and the natural history of the infection can be
illustrated as S → I → S. This type of model is called an SIS
model. The model equations are:
S (t) = −βIS + γI
(2.20)
I (t) = βIS − γI.
Initial conditions are S(0) = S0 and I(0) = I0 . As we have seen

in previous sections, N = S + I = N0 = S0 + I0 . In this model,
there will be two possible outcomes for an initial outbreak: either
the outbreak terminates or the disease becomes endemic. To see
this, we replace S = N0 − I in the second equation and obtain
I = βI(N0 − I) − γI = (βN0 − γ − βI)I,
or
I
I = (βN0 − γ)I 1 − . (2.21)
N0 − γ
β
Let
I
f (I) = (βN0 − γ)I 1 − . (2.22)
N0 − γ
β
Then the differential equation (2.21) can be written as
I = f (I). (2.23)
Properties of solutions to (2.23) can be largely determined from

the graph of f (I), which is depicted in Figure 2.3, using a pro-
cedure called phase-line analysis. The basic ideas of phase-line
analysis are as follows: since solutions of equation (2.23) can only
move along a straight line and cannot self-intersect, we know that

I(t) is either monotonically increasing (I (t) > 0) or monotoni-
cally decreasing (I (t) < 0). From (2.23) we also know that the
sign of I (t) can be determined from the graph of f : in an interval
on the I-axis where f (I) > 0, we know that I (t) > 0 and that
I(t) is increasing, and solutions move to the right; in an interval
where f (I) < 0, I(t) is decreasing and solutions move to the left.
These situations are depicted in Figure 2.3 in which the arrows on
the I-axis denote the direction solutions are moving. Equilibria
of (2.23) are solutions that do not depend on t (I (t) = 0), and
hence are points where f (I) = 0, or where the graph of f inter-
sects the I-axis. If the arrows on either side of an equilibrium I ∗
are moving toward I ∗ , then solutions near I ∗ are convergent to I ∗ .
In this case, the equilibrium I ∗ is said to be asymptotically stable.
On the other hand, if the arrows are moving away from I ∗ , then
I ∗ is an unstable equilibrium.
Applying the phase-line analysis to (2.21) with f (I) given in
(2.22), we see that (2.21) has two equilibria: I1∗ = 0 and I2∗ =
N0 − βγ = N0 − ρ. Whether I2∗ is positive depends on the sign of
N0 − βγ . If N0 < βγ , then the equation only has one equilibrium
I ∗ = 0 in the nonnegative I-axis and it is asymptotically stable. If
N0 > βγ , both equilibria I1∗ and I2∗ exist in the nonnegative I-axis.
Equilibrium I1∗ = 0 is unstable and I2∗ is asymptotically stable.
See Figure 2.3 (a) and (b). We summarize these results in the
next proposition.
Proposition 2.2.1
(1) If βN
γ
0
< 1, then, for any 0 < I0 < N0 , I(t) → 0 monotoni-
cally as t → ∞.
(2) If βN
γ
0
> 1, then, for any 0 < I0 < N0 , I(t) → N0 − γ
β
mono-
tonically as t → ∞.
Proof is left as an exercise.

Figure 2.3: Graphical demonstration of phase-line analysis.
Recall that R0 = βN γ
0
denotes the basic reproduction number.
Proposition 2.2.1 can be reinterpreted as follows for solutions
(S(t), I(t)) of (2.20):
(1) If R0 < 1, then (S(t), I(t)) → (N0 , 0) as t → ∞.
(2) If R0 > 1, then (S(t), I(t)) → ( βγ , N0 − βγ ) as t → ∞.
The equilibria P0 = (N0 , 0) and P ∗ = ( βγ , N0 − βγ ) of model

(2.20) are called the disease-free equilibrium and the endemic
equilibrium, respectively. Let
Γ = {(S, I) ∈ R2+ | S + I = N0 }
be the feasible region for model (2.20). Summarizing the preced-

ing discussions, we arrive at the next threshold theorem.
Theorem 2.2.2
(1) For each N0 > 0, there are two possible equilibria in the
feasible region Γ: the disease-free equilibrium P0 = (N0 , 0)
and the endemic equilibrium P ∗ = ( βγ , N0 − βγ ).
(2) If R0 < 1, then all solutions in Γ converge to P0 .
(3) If R0 > 1, then all solutions with I0 > 0 converge to P ∗ .
Biological interpretations:
(1) The basic reproduction number R0 serves as a threshold

parameter that determines the outcome of an initial out-
break.
(2) For the same value of R0 , the outcomes are qualitatively
the same for all initial conditions.
(3) When R0 < 1, the disease dies out. From the monotonicity
of S(t) and I(t), we know that no epidemic will develop.
(4) Because ( βγ , N0 − βγ ) is an equilibrium, S(t) stays only on
one side of βγ , and thus I (t) = I(t)(βS(t) − γ) does not
change sign. As a consequence, I(t) is monotone. See Figure
2.4. This indicates that an SIS model is not capable to catch
the rise and fall pattern of a typical disease outbreak.
In the preceding discussions, we notice that, when the
value of R0 increases from R0 < 1 to R0 > 1, behaviors of solu-
tions to system (2.20) in the feasible region undergo qualitative
changes in both the number of equilibria and their stability. We
say that system (2.20) undergoes a bifurcation and R = 1 is the
bifurcation value.
A simple geometric mechanism for this bifurcation can be seen
by observing the changes in the parabola defined in (2.22) as we
vary N0 from N0 < ρ/β (R0 < 1) to N0 > ρ/β (R0 > 1), and how
the equilibrium I2∗ = N0 − ρ/β increases from left of the origin
when N0 < ρ/β (R0 < 1), merges with I1∗ = 0 when N0 = ρ/β
(R0 = 1), and then emerges right of the origin when N0 > ρ/β
(R0 > 1). The graph of f (I) also explains the changes in stability
that accompany the changes in the number of equilibria.
In the next section, we continue the discussion of bifurcation
and show how to represent it graphically in a bifurcation diagram.
Figure 2.4: Time plots of I(t).
Exercises.
(1) Prove the results in Proposition 2.2.1.

(2) Consider an ODE in Rn ,
x = f (x)
and assume that f is a C 1 function. Suppose a solution

x(t) → x̄ as t → ∞. Show that x̄ is an equilibrium, namely,
f (x̄) = 0.
(3) Vary N0 from N0 < ρ/β to N0 > ρ/β and observe the
changes in the parabola defined in (2.22). In the process,
determine changes to the equilibrium I2∗ = N0 − ρ/β and to
the stability of both I1∗ = 0 and I2∗ .
(4) Apply the phase-line analysis to an SIS with demography
as depicted in Figure 2.5.
Figure 2.5: Transfer diagram for an SIS model with demography.
2.3 A Model with Demography

Consider the following modified Kermack–McKendrick model with
demography:
Here, the birth and death processes are assumed to have the
same rate constant b, and the disease is not fatal. The total pop-
ulation is kept as a constant, which is already scaled to 1. The
differential equations for the model are:
Figure 2.6: Transfer diagram for a model with demography.
S = b − βIS − bS
I = βIS − γI − bI (2.24)
R = γI − bR.
The feasible region for model (2.24) is:
G = {(S, I, R) ∈ R3+ | S + I + R = 1}.
We would like to know if the threshold theorem in the previ-

ous section still holds for this model with the basic reproduction
number R0 = γ+b β
(see Section 1.4.9). Because of the conservation
law S + I + R = 1, we will be able to reduce the number of equa-
tions by 1. System (2.24) is in fact a 2-dimensional system. To
2.3 A Model with Demography 53
simplify our analysis, we can ignore the R equation since the

first two equations in (2.24) do not contain R. Once behaviors
of (S(t), I(t)) are known, those of R(t) can be readily obtained
from R = 1 − S − I. For this reason, we can consider the following
equivalent system:
(2.25)
I = βIS − γI − bI
in a 2-dimensional feasible region
Γ = {(S, I) ∈ R2+ | 0 ≤ S + I ≤ 1}.
2.3.1 Disease-Free and Endemic Equilibria

Based on our experience in Section 2.3, longtime outcomes of
the disease are manifested in the form of equilibria, or solutions
that do not change in time. To find equilibria of (2.25), we set
S = I = 0 and obtain a system of algebraic equations
b − βIS − bS = 0
(2.26)
βIS − γI − bI = 0.
Solving these equations, we obtain two possible equilibria: P0 =

(1, 0), the disease-free equilibrium, and P ∗ = (S ∗ , I ∗ ), the endemic
equilibrium, where
b+γ b[β − (b + γ)]
S∗ = , I∗ = .
β β(b + γ)
Note that P ∗ falls outside of the feasible region Γ if R0 = β

b+γ
< 1,
and coincides with P0 when R0 = 1.
Proposition 2.3.1 System (2.25) has two possible equilibria.
(1) If R0 ≤ 1, then P0 = (1, 0) is the only equilibrium in Γ.

(2) If R0 > 1, then both P0 and the endemic equilibrium P ∗
exist in Γ.
2.3.2 Local Stability Analysis of Equilibria

Local stabilities of P0 and P ∗ determine the disease outcomes
when initial conditions are close to those at the equilibrium. A
standard method for investigating local stability is by the method
of linearization, as explained in Section 3.2 of Chapter 3.
1. Stability of P0 . The Jacobian matrix of (2.25) at P0 is

−b −β
J(P0 ) = .
0 β − (b + γ)
Because of the upper triangularity of the matrix, its eigenvalues

are the same as the diagonal elements, λ1 = −b, λ2 = β − (b + γ).
Therefore, P0 is locally asymptotically stable if λ2 < 0, or if
R0 < 1, and P0 is a saddle (unstable) if R0 > 1. When R0 = 1,
λ2 = 0. In this case, P0 is no longer a hyperbolic equilibrium, and
the method of linearization is not applicable. The stability of P0
when R0 = 1 will be dealt with using the method of Lyapunov
functions in Section 3.3.
2. Stability of P ∗ . The Jacobian matrix at P ∗ = (S ∗ , I ∗ ) is

∗ −βI ∗ − b −βS ∗
J(P ) = .
βI ∗ 0
Finding eigenvalues of this matrix is more involved than that of

J(P0 ). We will use the Routh–Hurwitz conditions in Section 3.2
to determine the stability. Straightforward calculation leads to
b
tr(J(P ∗ )) = −βI ∗ − b = − < 0, (2.27)
S∗
det(J(P ∗ )) = βI ∗ S ∗ > 0, if R0 > 1. (2.28)
By the Routh–Hurwitz criteria, we know that P ∗ is locally asymp-

totically stable if and only if R0 > 1, or as long as it exists in Γ.
2.3.3 Bifurcation Digram

We see in Sections 2.3.1 and 2.3.2 that the number of equilibria in
Γ, together with their local stability, is determined by the values
of R0 . The model (2.25) undergoes a bifurcation at R0 = 1, as we

have seen in Section 2.2. This bifurcation can be depicted graph-
ically in a bifurcation diagram, as shown in Figure 2.7. In the
bifurcation diagram, the horizontal axis is the values of the bifur-
cation parameter R0 , with the bifurcation value R0 = 1 marked,
and the vertical axis is for values of I ∗ at an equilibrium. A stable
equilibrium is represented by a solid line and an unstable one by
a dashed line.
We make the following observations about the bifurcation dia-
gram:
(1) For 0 < R0 < 1, there is only one equilibrium, P0 , and it is

asymptotically stable. We mark P0 by a solid line.
Figure 2.7: Bifurcation diagram for a model with demography.
(2) R0 = 1 is a bifurcation value; as R0 increases across 1, P0

continues to exist and a new equilibrium P ∗ comes to exis-
tence in the feasible region Γ. Furthermore, P0 loses its sta-
bility when R0 > 1 and P ∗ gains stability as it comes into
existence. We mark P0 by a dashed line and P ∗ by a solid

curve.
The type of bifurcation shown in Figure 2.7 is called a transcrit-

ical bifurcation, a very common bifurcation that occurs in many
epidemic models.
2.3.4 Establishing the Global Stability of P0 :

the Method of Lyapunov–LaSalle
As we have seen in Section 2.2, when R0 < 1, all solutions con-
verge to the disease-free equilibrium P0 . We show that the same
result holds for model (2.25). In this case, we say that P0 is glob-
ally stable in the feasible region Γ. The method we use to show
the global stability is the method of Lyapunov–LaSalle. Consider
a Lyapunov function
L(S, I) = I.
Its derivative along a solution (S(t), I(t)) is
dL dI
= = I(βS − γ − b) ≤ I(β − γ − b)
dt dt (2.29)
≤ 0, if R0 ≤ 1.
Therefore, LaSalle’s Invariance Principle (Theorem 3.3.4, Section

3.3) implies that all limit points of solutions to (2.25) belong to
the largest invariance set in
dL
K = {(S, I) ∈ Γ | = 0}.
dt
From (2.29) we know that dL dt
= 0 if and only if either (1) I = 0
or (2) R0 = 1 and S = β . In case (2), we necessarily have
γ+b
S = γ+b
β
, I = I ∗ . In case (1), any solution of (2.25) staying in the
set where I = 0 will satisfy S = b − bS, and thus S(t) → 1 as
t → ∞. In both cases, the only compact invariant set in the set
K is the singleton {P0 }. This implies that all solutions in Γ con-
verge to P0 . We can also show that the existence of the Lyapunov
function L and the global convergence imply the local stability of
P0 . Therefore, we have shown that P0 is globally stable in Γ when

R0 ≤ 1.
2.3.5 Establishing the Global Stability of P ∗ :

Phase-Plane Analysis
Assuming that R0 > 1, we will show that all solutions with I0 > 0
converge to the endemic equilibrium P ∗ . For this goal, we will use
the Poincaré–Bendixson theory through the following steps:
Step 1. Show that system (2.25) has no nonconstant periodic
solutions using the Bendixson–Dulac criteria (Theorem
3.6.5, Section 3.6).
Step 2. By the Poincaré–Bendixson Theorem (Theorem 3.6.2,
Section 3.6) and the fact that P ∗ is the only equilibrium
◦
in the interior Γ of Γ, all solutions with I0 > 0 must have
P ∗ as an omega limit point, and thus can get arbitrarily
close to P ∗ .
Step 3. Since P ∗ is locally asymptotically stable, any solution
that gets sufficiently close to P ∗ must converge to P ∗ .
Therefore, all solutions with I0 > 0 converge to P ∗ , and
◦
P ∗ is globally stable in Γ.
It only remains to show that (2.25) has no nonconstant periodic

solutions. Write (2.25) as
S = P (S, I)
(2.30)
I = Q(S, I).
1
Let α(S, I) = I
be a Dulac multiplier. Then

∂ ∂ ∂ b bS ∂
(αP ) + (αQ) = − βS − + (βS − b − γ)
∂S ∂I ∂S I I ∂I
b ◦
= −β − < 0, in Γ.
I
◦
Therefore, the Bendixson–Dulac condition holds in Γ and no non-
◦
constant periodic solutions exist in Γ.
Summarizing the analyses in previous sections, we obtain a
threshold theorem.
Theorem 2.3.2
(1) If R0 ≤ 1, the disease-free equilibrium is the only equilib-

rium in the feasible region Γ, and it is globally stable in Γ.
(2) If R0 > 1, then P0 is unstable, and the unique endemic equi-
◦
librium P ∗ is globally stable in the interior Γ.
Behaviors of solutions can be shown graphically in the

SI-space, where each solution (S(t), I(t)) can be regarded as a
parametric curve, called an orbit, starting from its initial point
(S(0), I(0)). When the limiting behaviors of all orbits are shown,
we obtain the phase portrait of system (2.24). The phase por-
traits of system (2.24) in both cases (a) and (b), as established in
Theorem 2.3.2, are depicted in Figure 2.8. The type of analysis
employed in this section to establish the phase portraits is called
phase-plane analysis.
Figure 2.8: Phase portraits of an SIR model.

Exercises.
(1) Derive the characteristic equations for the eigenvalues of

the Jacobian matrix J(P ∗ ) in Section 2.3.3, and use the
quadratic formula to show that both eigenvalues are either
negative real numbers or complex eigenvalues with nega-
tive real parts. This is a direct method to show asymptotic
stability without using the Routh–Hurwitz conditions.
(2) In general, convergence of all solutions to an equilibrium
does not necessarily imply local stability of the equilibrium.
Show that, for model (2.24) when R0 = 1, the facts that
I (t) ≤ 0 along all solutions and that all solutions converge
to as t → ∞ imply the local stability of the disease-free
equilibrium P0 .
(3) Apply the phase-plane analysis to prove a threshold theo-
rem for the SIRS model depicted in the transfer diagram in
Figure 2.9, and draw the bifurcation diagram. Compare the
basic reproduction number for this model that is derived in
this section.
(4) Apply the phase-plane analysis to prove a threshold theo-
rem for the SIR model with a nonlinear incidence as shown
in the transfer
√ diagram in Figure 2.10. Note that, because
of the I term, the partial derivatives of the vector field
are not continuous at I = 0, and the linearization at the
disease-free equilibrium P0 is not well defined. What other
methods beside the linearization can you use to establish
the local stability of P0 ?
Figure 2.9: Transfer diagram for an SIRS model.

Figure 2.10: Transfer diagram for an SIR model with nonlinear

incidence.
2.4 An SIR Model with Varying Total

Population: Homogeneous
Systems
Consider an SIR model whose transfer diagram is depicted in
Figure 2.11.
In the model, N = S + I + R denotes the total population. We
assume that the birth process is in the form bN and the incidence
is the standard incidence λIS
N
. The modeling equations are:
λIS
S = bN − − dS
N
λIS (2.31)
I = − γI − dI
N
R = γI − dR.
The total population N satisfies
N = (b − d)N. (2.32)
Figure 2.11: Transfer diagram for an SIR model.

2.4 Models with Varying Total Population 61
If b = d, then N (t) varies with time t. Thus (2.31), unlike (2.24),

cannot be directly reduced to a 2-dimensional system. However,
(2.31) is a homogeneous system of degree 1. This property allows
us to transform it into a system that is reducible to 2 dimensions
and can be analyzed using the phase-plane analysis in Section 2.3.
We first establish a general framework for homogeneous systems.
A system of differential equations
x = f (x), x ∈ Rn , (2.33)
is said to be homogeneous of degree 1 if f (x) satisfies
f (λx) = λ f (x), λ > 0, x ∈ Rn . (2.34)
Functions that satisfy (2.34) are called homogeneous of degree 1,

and examples of this type of function include
f (x) = Ax, where A is an n × n matrix (2.35)

x1 x2
f (x1 , x2 ) = , (x1 , x2 ) ∈ R2 . (2.36)
x1 + x2
Proposition 2.4.1 (Euler Identity) Suppose that f (x) is
homogeneous of degree 1, namely, f (x) satisfies (2.34). Then

n
∂f (x)
xi = f (x), for x ∈ Rn . (2.37)
i=1 ∂xi
Proof. Differentiating (2.34) with respect to λ we obtain

n
∂f (λx)
xi = f (x).
i=1 ∂xi
Setting λ = 1 we obtain the Euler Identity.

Suppose that (2.33) represents a biological model for which
the nonnegative cone Rn+ is positively invariant. Then the homo-
geneity of f allows us to introduce a set of new variables
xi
yi = , i = 1, 2, · · · , n,
N
where N = x1 + x2 + · · · + xn = 0, and xi ≥ 0. Then y = (y1 , · · · ,

yn ) ∈ Rn+ \ {0} satisfies
x x 1 x
n
y = − 2 N = f (x) − 2 fi (x)
N N N N i=1

x x n
x
=f − fi (by (2.34))
N N i=1 N

n
= f (y) − y fi (y).
i=1
Namely,

n
y = f (y) − y fi (y), (2.38)
i=1
and

n
yi = 1. (2.39)
i=1
System (2.38) is the projection of (2.33) onto the hyperplane

(simplex) ni=1 yi = 1, so it is of dimension n − 1.

Back to our model (2.31). It is straightforward to verify that
function
T
λIS λIS
f (S, I, R) = bN − − dS, − (d + γ)I, γI − dR ,
N N
with N = S + I + R, is homogeneous of degree 1. Let
S I R
s= , i= , r= . (2.40)
N N N
Then, we can write a system for the fractional variables (s, i, r)
following (2.38), or derive the system by direct differentiation.
For instance,

S S 1 λIS S
s = − 2 N = bN − − dS − 2 (b − d)N
N N N N N
= b − λis − ds − s(b − d) = b − λis − bs.
Equations for i and r can be derived similarly. Therefore, we

obtain the projected system
s = b − λis − bs
i = λis − (b + γ)i (2.41)
r = γi − br,
with
s + i + r = 1. (2.42)
Note that system (2.41) has the same form as system (2.24). We
can apply the same phase-plane analysis as in Section 2.3 to arrive
at the next threshold result. Let
λ
R0 = ,
b+γ
and
Γ = {(s, i, r) ∈ R3+ | s + i + r = 1}.

Note that Γ is a subset of the hyperplane s + i + r = 1 and its
◦
interior Γ is also a subset of the hyperplane.
Theorem 2.4.2 The global dynamics of the projected system
(2.41) are completely determined by the basic reproduction number
R0 . More specifically:
(1) If R0 ≤ 1, the disease-free equilibrium P0 = (1, 0, 0) is the

only equilibrium in the feasible region Γ, and it is globally
stable in Γ;
(2) If R0 > 1, then P0 is unstable and the endemic equilibrium

∗ b + γ b[λ − (b + γ)] γ[λ − (b + γ)]
P = , ,
λ λ(b + γ) λ(b + γ)
◦
is globally stable in the interior Γ.
With the dynamics of the projected system (2.41) completely

determined in Theorem 2.4.2, we can turn to the original sys-
tem (2.31). First, we note that, at any equilibrium of (2.31), we
necessarily have N (t) = 0. If b = d, this implies N = 0, and thus
S = I = R = 0, which is biologically irrelevant. Therefore, system
(2.31) has no biologically relevant equilibria! How do solutions to
(2.31) behave? In general, they either exponentially go to zero or
exponentially go to infinity, as we see below.
Case I. b = d. In this case, N (t) = 0 and thus N (t) is a constant,
which can be scaled to 1. Thus (2.31) reduces to (2.41).
Case II. b < d. In this case, N (t) → 0 exponentially as t → ∞.
Therefore, S(t), I(t), and R(t) converge to 0 exponentially as
t → ∞ since 0 ≤ S(t), I(t), R(t) ≤ N (t).
Case III. b > d. In this case, N (t) → ∞ exponentially as t → ∞.
We consider two subcases.
(IIIa) If R0 < 1, then, as t → ∞, NS(t)

(t)
= s(t) → 1, and thus
S(t) → ∞. From the I equation in (2.31) we obtain
S
I (t) = [λ − (d + γ)]I + (
− 1)I.
N
Since NS − 1 → 0 exponentially, the behavior of I(t) is
determined by the principal part
I (t) = [λ − (d + γ)]I.
Therefore,
(1) I(t) → 0 exponentially if R1 = λ
d+γ
< 1;
(2) I(t) → ∞ exponentially if R1 > 1 (while I(t)

N (t)
→0
exponentially).
In case (1), I(t) → 0 exponentially, and thus R(t) is deter-

mined by the principal part of the R equation
R = −dR.
This implies that R(t) → 0 if R1 < 1. In case (2), there

exists a,
, T > 0 such that I(t) ≥ aet for t ≥ T . From the
R equation in (2.31),
R = γI − dR
we obtain, for t > T ,

t
R(t) = R(0)e−dt + γe−dt I(τ )edτ dτ
0
t
aγ
≥ aγedt e(+d)τ dτ = et − e−d(t−T )+T
T d+
→ ∞, as t → ∞.
Therefore, S(t) → ∞, I(t) → ∞, and R(t) → ∞ as t → ∞

if R1 > 1.
(IIIb) If R0 > 1, then, as t → ∞,
S(t) b+γ I(t)
= s(t) → s∗ = > 0, = i(t) → i∗ > 0, and
N (t) λ N (t)
(2.43)
R(t)
= r(t) → r∗ > 0. (2.44)
N (t)
Therefore, S(t) → ∞, I(t) → ∞, and R(t) → ∞ as t → ∞.

Table 2.1: Summary of results for model (2.31). GAS = globally

asymptotically stable, US = unstable, and DNE = does not exist.
b, d R0 ac S, I, R, N s, i, r P0 b P ∗c
N (t) = N0 , s(t) → 1,
b=d R0 ≤ 1 S(t) = N0 s(t), i(t) → 0, P0 GAS P ∗ DNE
I(t) = N0 i(t), r(t) → 0.
R(t) = N0 r(t).
s(t) → s∗ ,
R0 > 1 i(t) → i∗ , P0 US P ∗ GAS
r(t) → r∗ .
N (t) → 0,
s(t) → 1,
S(t) → 0,
b<d R0 ≤ 1 i(t) → 0, P0 GAS P ∗ DNE
I(t) → 0,
r(t) → 0.
R(t) → 0.
s(t) → s∗ ,
R0 > 1 i(t) → i∗ , P0 US P ∗ GAS
r(t) → r∗ .
N (t) → ∞,
s(t) → 1,
S(t) → ∞,
b>d R0 ≤ 1 R1 ≤ 1d i(t) → 0, P0 GAS P ∗ DNE
I(t) → 0,
r(t) → 0.
R(t) → ∞.
N (t) → ∞,
s(t) → s∗ ,
S(t) → ∞,
R1 > 1 i(t) → i∗ ,
I(t) → ∞,
r(t) → r∗ .
R(t) → ∞.
N (t) → ∞,
s(t) → s∗ ,
S(t) → ∞,
R0 > 1 i(t) → i∗ , P0 US P ∗ GAS
I(t) → ∞,
r(t) → r∗ .
R(t) → ∞.
a
R0 = (b+γ)
λ
.
b
P0 = (1, 0, 0).
c
P ∗ = (s∗ , i∗ , r∗ ).
d
R1 = (d+γ)
λ
.
These results are summarized in Table 2.1. We note that the

dynamics of the projected system (2.41) are determined by the
basic reproduction number R0 and the behaviors of solutions of
the original system (2.31) are determined by both R0 and R1 .
To summarize, for model (2.31), disease outcomes are better

described in terms of fractional variables, rather than the popu-
lation size. In particular, we have the following conclusions:
(1) If R0 = b+γ
λ
≤ 1, then the disease always dies out from the
population in the sense that the fraction of the population
that is infected (usually called the disease prevalence) goes
to zero as t → ∞.
(2) If R0 > 1, then any initial outbreak will lead to an endemic
disease because the infectious fraction approaches a positive
constant as t → ∞.
Exercises.
(1) A function x → f (x) ∈ Rn defined for x ∈ Rn is homoge-

neous of degree δ > 0 if f satisfies
f (λx) = λδ f (x), λ > 0, x ∈ Rn . (2.45)
Derive the Euler Identity in this case. Give several examples

of functions that are homogeneous of degree δ and verify
that they satisfy the Euler Identity you have derived.
(2) Let f : Rn → Rn be homogeneous of degree δ > 0. Let
φ : Rn → R be a scalar-valued function that is homogeneous
of degree 1. Assume that φ is positive definite, namely:
φ(x) ≥ 0 and φ(x) = 0 if and only if x = 0. (2.46)
Let x(t) = 0 be a solution of the differential equation

x = f (x), and
x(t)
y(t) = .
φ(x(t))
Derive the differential equation satisfied by y(t), which is
called the projected equation of x = f (x) onto the sphere
φ(x) = 1.
(3) Using the method described in this section, analyze the

models described in Figures 2.12 and 2.13.
Figure 2.12: Transfer diagram for an SIR model with disease-

caused death.
Figure 2.13: Transfer diagram for an SIRS model.
2.5 Ross–MacDonald Model for

Malaria: A Monotone System
Malaria is caused by parasitic protozoans of the genus Plasmod-
ium. The protozoans are transmitted to people through bites
of infected female mosquitoes when they are taking a blood
meal. While living in humans and mosquitoes, the protozoans
go through long and complex life cycles, each of which is critical
for their survival and transmission. Since its transmission criti-
cally depends on the disease vector mosquitoes, malaria is a typ-
ical example of a vector-borne disease. The transmission of the
Plasmodium protozoans between humans and mosquitoes occurs
through mosquito bites: when a healthy (female) mosquito bites
an infected human, she takes in the protozoans with the blood
and becomes infected; when an infected mosquito bites a healthy
2.5 Ross–MacDonald Model 69
human, a small amount of saliva is injected into the human body

from the salivary gland of the mosquito, which malaria protozoans
have invaded, and the protozoans enter the human body. In the
human body, the protozoans first migrate into the liver where
they mature and release their young into the bloodstream; there
they will be picked up by mosquitoes taking a blood meal, and
the infection cycle continues.
2.5.1 Modeling Malaria Transmission

The earliest work of mathematical modeling of malaria transmis-
sion dynamics was done in the early 1900 by R. Ross [52], who
later was awarded the Nobel Prize for Medicine for his medi-
cal research on malaria. Ross models were further developed by
G. MacDonald [42]. To formulate the Ross–MacDonald model for
malaria transmission, we consider a human and a mosquito pop-
ulation, and assume, for simplicity, that both populations have
a constant size H and V , respectively. Let Sh and Ih denote the
susceptible and infectious humans, and Sv and Iv the susceptible
and infectious female mosquitoes. Here, we only consider female
mosquitoes because only female mosquitoes take blood meals.
Important parameters used in the model are listed in Table 2.2.
When applying the model or its variations to malaria control,
it is often useful to keep in mind the relative scales of values of
the parameters. For instance, γv is typically very small since it is
known that mosquitoes infected with malaria rarely recover. Also,
human birth and death rates μh are relatively small compared to
other parameters such as human recovery rate γh . Typically, the
mean life expectancy for humans 1/μh is between 60 and 85 years,
while the mean recovery time for humans from malaria without
serious complications is 1–2 weeks after treatment is given.
Table 2.2: Parameters in the Ross–MacDonald model for malaria

transmission.
a: biting rate (number of humans bitten per mosquito in a unit time)
b1 : probability of human infection occurring from an infectious bite
b2 : probability of mosquito infection occurring from an infectious bite
μh : human birth and death rates
μv : mosquitoes birth and death rates
γh : human recovery rate
γv : mosquito recovery rate
The transmission process of malaria between the human and

mosquito populations is depicted in Figure 2.14.
The malaria incidence in the human population is calculated
as
Sh
a · b1 · · Iv ,
H
which is interpreted as follows: the average number a of infectious
bites per unit time multiplies the probability b1 of an infectious
bite will produce a human infection, multiplies the probability
Sh
H
of a bite is on a susceptible human, and then multiplies the
number IV of infectious female mosquitoes.
Figure 2.14: Transfer diagram for Ross–MacDonald model. Solid

arrows indicate transfer. Dashed arrows indicate cross infection.
To derive the malaria incidence in the mosquito population,

let ã denote the mosquito biting frequency, namely, the average
number of mosquitoes that bit a human within a unit time. Then
a and ã are related in the relation
a V = ã H. (2.47)
Using parameter ã, the incidence in the mosquito population is

given by
Sv
ã · b2 · · Ih .
V
Replacing ã by the biting rate a using relation (2.47), we obtain
the malaria incidence in the mosquito population
Ih
a · b2 · Sv .
H
Using the transfer diagram, we can derive the following differ-
ential equations:
S h Iv
Sh = μh H − ab1 − μ h S h + γ h Ih
H
S h Iv
Ih = ab1 − μ h Ih − γ h Ih
H (2.48)
S v Ih
Sv = μv V − ab2 − μ v S v + γ v Iv
H
S v Ih
Iv = ab2 − μ v Iv − γv Iv .
H
Based on our analysis in the previous section, we see that model
(2.48) is a homogeneous system of degree 1. Using fractional vari-
ables
Sh Ih Sv Iv V
sh = , ih = , sv = , iv = , and m = ,
H H V V H
we can derive the following system:
sh = μh − ab1 msh iv − μh sh + γh ih

ih = ab1 msh iv − μh ih − γh ih
(2.49)
sv = μv − ab2 sv ih − μv sv + γv iv
iv = ab2 sv ih − μv iv − γv iv .
Since sh + ih = 1 and sv + iv = 1, we may choose to keep only two

of the four variables, say ih and iv , which are the prevalence in
the human and mosquito populations, respectively. Let x = ih and
y = iv , then sh = 1 − x and sv = 1 − y. Substituting into (2.49)
we obtain the following equivalent system:
x = amb1 y(1 − x) − γ1 x
(2.50)
y = ab2 x(1 − y) − γ2 y.
Here, we have set new parameters γ1 = μh + γh and γ2 = μv + γv .

We consider system (2.50) in the following bounded region:
Γ = {(x, y) ∈ R2+ | 0 ≤ x ≤ 1, 0 ≤ y ≤ 1}. (2.51)
By examining the direction of the vector field on the boundary

of Γ, we can verify that Γ is positively invariant with respect to
system (2.50).
2.5.2 Equilibria and the Basic Reproduction

Number
System (2.50) has two possible equilibria: the disease-free equi-
librium, P0 = (0, 0), which corresponds to the absence of infec-
tious individuals in both human and mosquito populations, and
an endemic equilibrium P ∗ = (x∗ , y ∗ ), where x∗ , y ∗ are given by
a2 mb1 b2 − γ1 γ2 a2 mb1 b2 − γ1 γ2
x∗ = , y∗ = .
ab2 (amb1 + γ1 ) amb1 (ab2 + γ2 )
We note that x∗ , y ∗ > 0 if and only if the following condition holds:

a2 mb1 b2
R0 = > 1, (2.52)
γ1 γ2
where R0 is the basic reproduction number for malaria transmis-
sion. Heuristically, it can be interpreted as follows. A primary
human case has a recovery rate γ1 , and the average infectious
period is γ11 . During this time, the average number of mosquito
bites from the susceptible fraction of mosquitoes is γa1 , which gives
a total of ab
γ1
2
infected mosquitoes. Each of these mosquitoes will
survive for an average time γ12 and produce a total of γa2 bites,
which will lead to a total of amb γ2
1
secondary human infections.
Therefore, the average total of secondary human infections from
a single primary human case is ab 2 amb1
γ1 γ2
, giving the basic repro-
duction number in (2.52).
Local stability analysis can be carried out at each of the two
equilibria, using techniques in Section 2.3.2. We arrive at the
following result (details of verification are left as an exercise):
Theorem 2.5.1 Let R0 be defined in (2.52).
(1) If R0 ≤ 1, then system (2.50) has only the disease-free equi-

librium P0 = (0, 0), and P0 is globally asymptotically stable
in Γ.
(2) If R0 > 1, then P0 is unstable, and a unique endemic equi-
librium P ∗ = (x∗ , y ∗ ) exists and is globally asymptotically
stable in the interior of Γ.
2.5.3 Monotonicity and Global Dynamics

The global stability of the disease-free equilibrium P0 when R0 ≤ 1
and of the endemic equilibrium P ∗ when R0 > 1 can be estab-
lished using the same analysis as in Section 2.3, using a Lyapunov
function and the Poincaré–Bendixson theory. In this section, we
show that system (2.50) is a monotone system, which allows us to
use the machinery developed for monotone systems in Chapter 3
and establish the global stability.
We first show that system (2.50) is a monotone system by the

definition in Section 3.8 of Chapter 3. The Jacobian matrix of
(2.50) is

−γ1 − amb1 y amb1 (1 − x)
J(x, y) = .
ab2 (1 − y) −γ2 − ab2 x
We see that the off-diagonal terms are both positive, and thus
J(x, y) is a Metzler matrix. System (2.50) is a monotone system
and is irreducible in the interior of Γ, and its solutions preserve
the order defined by the positive quadrant.
Next, we verify that system (2.50) is strictly sublinear using
the definition in Section 3.8. Let 0 < λ < 1 and
F (x, y) = (P (x, y), Q(x, y)) = (amb1 y(1 − x) − γ1 x, ab2 x(1 − y) − γ2 y)
be the vector field of system (2.50). Then
P (λx, λy) = amb1 λy(1 − λx) − γ1 λx

= λ[amb1 y(1 − λx) − γ1 x]
> λ[amb1 y(1 − x) − γ1 x] = λP (x, y).
Similarly, we can verify that Q(λx, λy) > λQ(x, y), for (x, y) ∈ Γ
and x > 0, y > 0. This implies that F (x, y) is strictly sublinear.
We can thus apply Theorem 3.8.5 to prove the global stability of
P0 when R0 ≤ 1, and that of P ∗ when R0 > 1.
2.5.4 Exercises
Gonorrhea is a sexually transmitted infection (STI). Gonorrhea
is caused by the bacterium Neisseria gonorrheae and spread
between people through sexual contact. Individuals who have had
gonorrhea and received treatment can get reinfected. The Center
for Disease Control and Prevention (CDC) estimated that about
400,000 cases of gonorrhea were reported in the U.S. in 2015.
A mathematical model for gonorrhea transmission dynamics was
formulated and analyzed in Lajmanovich and Yorke [36]. The
model includes cross infections among a finite number of groups.
(1). Since gonorrhea infection yields little immune protection

against reinfection, the transmission dynamics fit an SIS model.
(a) Explain the assumptions made in the transfer diagram in

Figure 2.15 and use the diagram to derive a single popula-
tion model for gonorrhea transmission.
Figure 2.15: Transfer diagram for a single population gonorrhea

model.
(b) Show that the total population N = S + I remains a con-

stant.
S
(c) Derive the system for the fractional variables s = N
and
i = NI .
(d) Discuss the feasible region of the system and its equilibria.
(e) Derive the basic reproduction number R0 for the system
from its biological definition and show that the R0 you have
derived is a threshold parameter: if R0 ≤ 1, the system has
only the disease-free equilibrium, and if R0 > 1, the system
has a unique endemic equilibrium.
(f) Using the relation s + i = 1, reduce the system of two equa-
tions to a single equation for the variable i, and use the
phase-line analysis to discuss the local and global stabil-
ity of each equilibrium. Verify that if R0 < 1 then the
disease-free equilibrium is globally asymptotically stable;
if R0 > 1, then the disease-free equilibrium is unstable, and
the endemic equilibrium is globally asymptotically stable.
(2). Sexual contacts within a population are not homogeneous;

some people have more sexual partners than others. One way
to model heterogeneous mixing is to divide the population into

groups and allow a greater degree of within-group contacts than
intergroup contacts.
(a) Explain the assumptions that are made in the transfer

diagram shown in Figure 2.16 for a two-group gonorrhea
model, and use the transfer diagram to derive the model
equations.
(b) Repeat the discussions (b)–(e) in the previous exercise for
the new model you have derived.
Figure 2.16: Transfer diagram for a two-group gonorrhea model.

Here, λ1 = λ11 I1 + λ21 I2 and λ2 = λ12 I1 + λ22 I2 . Solid arrows
indicate transfer. Dashed arrows indicate cross infection.
(c) Using the relations s1 + i1 = 1 and s2 + i2 = 1 for fractional

variables sk = Sk /Nk and ik = Ik /Nk , k = 1, 2, reduce the
4-dimensional system to a 2-dimensional system for the
prevalence variables i1 and i2 .
(d) Carry out local stability analysis for each equilibrium using
linearization. Verify that the value of R0 determines the
stability of equilibria: if R0 < 1 then the disease-free equi-
librium is locally asymptotically stable; if R0 > 1, then the
disease-free equilibrium is unstable and the endemic equi-
librium is locally asymptotically stable.
(e) Prove that the disease-free equilibrium is globally stable

in the feasible region when R0 ≤ 1, using the method of
Lyapunov functions and the LaSalle Invariance Principle.
(f) Prove that the endemic equilibrium is globally asymptot-
ically stable in the interior of the feasible region when
R0 > 1, using the Poincaré–Bendixson Theorem and the
Bendixson–Dulac criteria.
(g) Prove the global stability of the endemic equilibrium (when
R0 > 1) by constructing a global Lyapunov function. Hint:
you may start by considering V (I1 , I2 ) = c1 (I1 − I1∗ log I1 ) +
c2 (I2 − I2∗ log I2 ), for suitable constants c1 , c2 > 0.
(h) Show that the reduced system for i1 and i2 is a monotone
system and is sublinear. Prove the global stability of the
disease-free equilibrium (when R0 ≤ 1) and of the endemic
equilibrium (when R0 > 1) using the properties of mono-
tone systems.
(3). Draw a transfer diagram for an n-group gonorrhea model,

where n ≥ 1 is an arbitrary integer. Derive a system of differen-
tial equations based on your transfer diagram. Derive a system for
the fractional variables, and then reduce the system to a system
for the n prevalence variables. Try to repeat the discussions in
previous exercises on the equilibria and local stability. Can you
derive the basic reproduction number R0 ? Can you show that, if
R0 > 1, there exists a unique endemic equilibrium? How would
you go about proving the global stability of the disease-free equi-
librium and the endemic equilibrium? Which of the methods in
previous exercises are applicable?
(4). Verify the local stability results for the Ross–MacDonald
model in Theorem 2.5.1.
Chapter 3
Basic Mathematical Tools

and Techniques
This chapter contains basic concepts and theories for differential

equations, focusing on different approaches and methodologies for
stability analysis. They provide the mathematical tools needed
for model analysis described in earlier chapters. The ideal way to
learn the materials in this chapter is through their applications in
Chapter 2. For students who are interested in more in-depth read-
ing on the theories of ordinary differential equations and finding
more examples, exercises, as well as proofs, classical textbooks
of Coddington and Levinson [11], Hale [24], Hartman [25], and
Piccinini, Stampacchia and Vidossich [51] are excellent sources.
3.1 Stability of Equilibrium Solutions

Consider an open set D in the phase space Rn , and a function
f ∈ C 1 (D → Rn ), called a vector field. A system of differential
equations can be defined as
x = f (x). (3.1)
A solution to (3.1) in an interval I ⊂ R is a differentiable function

ϕ : I → Rn such that
ϕ (t) = f (ϕ(t)).

https://doi.org/10.1007/978-3-319-72122-4_3
80 Basic Tools
When the vector field f (x) is smooth (C 1 ), the fundamental the-

ory of differential equations ensures that, for each initial point
x0 ∈ D, a unique solution x(t, x0 ) exists in an interval I = (−a, a)
such that x(0, x0 ) = x0 . We say such a solution starts from the ini-
tial point x0 . A solution can be extended to its maximal interval
of existence. If a solution x(t, x0 ) remains in a compact subset
of D throughout its maximal interval of existence (α, ω), then it
exists for all t ∈ R. The orbit of a solution x(t, x0 ) is the set
γ(x0 ) = {x(t, x0 ) : t ∈ (α, ω)}.
A solution x(t) is called an equilibrium, or steady state, if it

is a constant for all t, namely, x(t) = x̄ for t ∈ R. In this case,
x̄ satisfies f (x̄) = 0 since x (t) ≡ 0. A periodic solution x(t) of
period T > 0 satisfies x(t + T ) = x(t) for all t ∈ R, and its orbit
γ = {x(t) : 0 ≤ t < T } is a simple closed smooth curve. For this
reason, a periodic orbit is also called a closed orbit. For equilibria
and periodic solutions, we are interested in their stability proper-
ties. Intuitively, an equilibrium x̄ is stable if any solution starting
close to x̄ remains close to x̄.
Definition. An equilibrium x̄ of system (3.1) is
(1) stable, if for each -neighborhood N (x̄, ) of x̄, there exists a

δ-neighborhood N (x̄, δ) of x̄ such that x0 ∈ N (x̄, δ) implies
x(t, x0 ) ∈ N (x̄, ) for all t ≥ 0;
(2) asymptotically stable, if x̄ is stable and if there exists
b-neighborhood N (x̄, b) such that x0 ∈ N (x̄, b) implies
x(t, x0 ) → x̄ as t → ∞.
In the above definition, an asymptotically stable equilibrium
x̄ is said to attract points in a neighborhood N (x̄, b). The set of
points that are attracted by x̄ is an open set and is called the
basin of attraction of x̄.
3.2 Stability by Linearization 81
3.2 Stability Analysis by

Linearization
A standard method of stability analysis is linearization. Let x̄ = 0
be an equilibrium, namely, f (0) = 0, and thus f (x) can be written
in Taylor expansion as
f (x) = A x + F (x), (3.2)
where the matrix

∂f
A= (0)
∂x
is the Jacobian matrix of f at 0, and
F (x) = f (x) − Ax.
Therefore F (0) = 0 and ∂F ∂x

(0) = 0. In the expansion (3.2), Ax is
the linearization of f at 0 and F (x) is the higher order term. The
linearized system of (3.1) at the equilibrium 0 is
y = Ay. (3.3)
The next theorem is a standard result for the method of lineariza-

tion.
Theorem 3.2.1 Let A and F be given in (3.2). If y = 0 is
asymptotically stable for the linearized system (3.3), then the equi-
librium x̄ is asymptotically stable for the nonlinear system (3.1).
By Theorem 3.2.1, it is sufficient to investigate the asymptotic
stability of an equilibrium for the linearized system.
Theorem 3.2.2 The solution y = 0 is asymptotically stable for
the linear system (3.3) if all eigenvalues of A have negative real
parts.
Verifying that an n × n matrix has n eigenvalues with negative
real parts can be a challenging task when n is large, especially if
entries of A contain non-numerical parameters. An algorithm by
Routh–Hurwitz can be used to derive a set of necessary and suf-
ficient conditions, typically called the Routh–Hurwitz conditions.
For n = 2, 3, the Routh–Hurwitz conditions are as follows:
82 Basic Tools
(1) All eigenvalues of a 2 × 2 matrix A have negative real parts

if and only if
tr(A) < 0 and det(A) > 0. (3.4)
(2) All eigenvalues of a 3 × 3 matrix A have negative real parts

if and only if
tr(A) < 0, det(A) < 0, and tr(A) a2 − det(A) < 0,

(3.5)
where a2 denotes the sum of 2 × 2 principal minors of A.
3.3 Stability Analysis Using

Lyapunov Functions
Let U ⊂ Rn be a neighborhood of 0 and V ∈ C 1 (U → R) a real-
valued function. The gradient vector of V (x) is

∂V ∂V
grad V (x) = ,··· , .
∂x1 ∂xn
Let f (x) be the vector field of system (3.1). The derivative of V

in the direction of f is defined as a dot product
∗
V (x) = grad V (x) · f (x).
∗
V (x) is also called the Lyapunov derivative with respect to system
(3.1). The function V (x) is called a Lyapunov function of system
∗
(3.1) near an equilibrium x = 0 if V (x) ≤ 0 for x ∈ U. Let x(t) be
a solution of system (3.1) that stays in U , then
d ∗
V (x(t)) = grad V (x(t)) · x (t) = grad V (x(t)) · f (x(t)) = V (x(t)) ≤ 0.
dt
Therefore, V (x(t)) decreases along a solution of system (3.1) in

a neighborhood of x = 0. Theorems 3.3.1–3.3.3 describe a direct
method of establishing stability using a Lyapunov function.
3.3 Stability Using Lyapunov Functions 83
Theorem 3.3.1 Suppose that a function V (x) exists such that
(1) V (x) ≥ 0 for x ∈ U and V (x) = 0 if and only if x = 0;

∗
(2) V (x) ≤ 0 for x ∈ U .
Then the equilibrium x = 0 of system (3.1) is locally stable.

(1) V (x) ≥ 0 for x ∈ U and V (x) = 0 if and only if x = 0;

∗ ∗
(2) V (x) ≤ 0 for x ∈ U and V (x) = 0 if and only if x = 0.
Then the equilibrium x = 0 of system (3.1) is locally asymptoti-

cally stable.
A function V satisfying assumption (1) in the above theorems
is called positive definite at x = 0. Similarly, the Lyapunov deriva-
∗
tive V (x) in the assumption (2) of Theorem 3.3.2 is negative def-
inite at x = 0. If an equilibrium x̄ = 0, we may consider a change
of variables x̃ = x − x̄ so that system (3.1) becomes
x̃ = g(x̃),
where g(x̃) = f (x̃ + x̄). Then g(0) = f (x̄) = 0 and we have shifted
the equilibrium x̄ to 0, and Theorems 3.3.1 and 3.3.2 will be appli-
cable. The next result deals with the instability of equilibrium
x = 0.
(1) V (0) = 0 and there exists sequence xn → 0 such that

V (xn ) < 0 for all n;
∗ ∗
(2) V (x) ≤ 0 for x ∈ U and V (x) = 0 if and only if x = 0.
Then the equilibrium x = 0 of system (3.1) is unstable.

Let G ⊂ Rn be an open set. A function V (x) is said to be a
Lyapunov function with respect to G if
84 Basic Tools
∗
V (x) ≤ 0, for x ∈ G.
∗
Let K be the largest invariant subset in the set {x ∈ G : V (x) = 0}.
Since V (x(t)) decreases along a solution x(t, x0 ) of system (3.1),
the omega-limit set of the solution,
ω(x0 ) = {x ∈ G : there exists tn → ∞ such that x(tn , x0 ) → x1 as n → ∞},
∗
is contained in the set where V (x) = 0. Since omega-limit sets
are invariant, we know ω(x0 ) must be contained in the largest
invariant subset K. This is the well-known LaSalle’s Invariance
Principle.
Theorem 3.3.4 If a solution x(t, x0 ) stays entirely in G for
t ≥ 0, then its omega-limit set ω(x0 ) ∩ G ⊂ K.
Corollary 3.3.5 If K contains a single point x̄, then x̄ is an
equilibrium and solutions that stay entirely in G for t ≥ 0 converge
to x̄ as t → ∞.
If we also know that x̄ is locally stable, and that all solutions
starting in G remain in G (in this case, G is said to be posi-
tively invariant), then Corollary 3.3.5 implies that x̄ is globally
asymptotically stable in the region G.
3.4 Stability of Periodic Solutions:

The Floquet Theory
Let x = p(t) be a nonconstant periodic solution of period T of
system (3.1), and
γ = {p(t) : 0 ≤ t < T }
be its orbit. Stability analysis of a periodic orbit is more involved

than that of an equilibrium. First of all, we make an observation
that the notion of Lyapunov stability we gave for equilibria in
Section 3.1 is not appropriate for periodic solutions. The phase
portrait of a nonlinear pendulum equation
3.4 Stability of Periodic Solutions 85
x (t) + sin x(t) = 0 (3.6)
consists of a family of periodic orbits arranged in concentric circles

centered at the origin (see Figure 3.1). Each of these orbits should
be considered “stable.” However, each periodic orbit has a different
period. Therefore, for two periodic solutions p(t), q(t) whose orbits
are close to each other, |p(t) − q(t)| will not necessarily be small
for all t, and thus neither can be stable according to the definition
in Section 3.1. This situation necessitates the concept of orbital
stability, in the sense that the distance between two orbits remains
close while |p(t) − q(t)| may not.
Definition. A periodic solution x = P (t) is said to be
(1) orbitally stable if for all > 0, there exists δ > 0 such that
|x0 − p(0)| < δ implies d(x(t, x0 ), γ) < , where d(x, γ) =
miny∈γ |x − y| is the distance from x to γ.
(2) orbitally asymptotically stable if (a) it is orbitally stable and
(b) there exists δ1 > 0 such that |x0 − p(0)| < δ1 implies
d(x(t, x0 ), γ) → 0 as t → ∞. In this case, the periodic orbit
is called a limit cycle.
15
10
-3 -2 -1 1 2 3
-5
-10
-15
Figure 3.1: A family of nested periodic orbits of the pendulum

equation; these orbits all have the same periods if the pendulum
equation is linear, but different periods if the pendulum equation
is nonlinear, as in equation (3.6).
86 Basic Tools
By these definitions, the periodic orbits in Figure 3.1 are

orbitally stable but not orbitally asymptotically stable, since the
distance between any two periodic orbits remains a constant as t
varies.
To analyze the stability of a periodic solution x = p(t), we also
use the method of linearization and consider the linearized system
of (3.1) along x = p(t),
∂f
y (t) = (p(t)) y(t). (3.7)
∂x
We note that system (3.7) is a linear system with periodic coeffi-
cients. Structures of the solution space of this type of system are
described in the Floquet theory.
Theorem 3.4.1 (Floquet Theory)
(1) A fundamental matrix of system (3.7) can be written in the

form
Y (t) = P (t)eLt , (3.8)
where P (t) is a n × n matrix-valued function periodic in
t with period T , P (0) = In×n , and L is a n × n constant
matrix;
(2) Eigenvalues of L are called Floquet exponents, and eigen-
values of Y (T ) = eLT are called Floquet multipliers;
(3) System (3.7) has a nonconstant periodic solution of period
T if and only if a Floquet multiplier is equal to 1, or equiv-
alently, a Floquet exponent is equal to 0.
We can verify by direct differentiation that y(t) = p (t) is a

nonconstant periodic solution to system (3.7). Therefore, by The-
orem 3.4.1-(3), one of its Floquet multipliers is 1, and we can write
all Floquet multipliers of x = p(t) as
1, λ1 , . . . , λn−1 . (3.9)
A fundamental result in the theory of nonlinear differential equa-

tions states that x = p(t) is asymptotically stable if the remain-
3.5 Phase-Line Analysis 87
ing n − 1 Floquet multipliers, λ1 , . . . , λn−1 , all have modulus less

than 1.
Theorem 3.4.2 The periodic solution x = p(t) is orbitally
asymptotically stable if the Floquet multipliers λ1 , . . . , λn−1 in
(3.9) have modulus less than 1.
While Theorem 3.4.2 is a fundamental stability result, we point
out that estimation of Floquet multipliers is not an easy task. A
method of Poincaré for 2-dimensional systems will be given in
Section 3.6. A generalization to higher dimensional systems was
developed by J. S. Muldowney [47].
3.5 Global Dynamics of 1-Dimensional

Systems: Phase-Line Analysis
Consider a scalar differential equation
x = f (x), (3.10)
where f ∈ C 1 (R → R) is a real-valued function. Stability analysis

of this class of equations can be done using a graphical method
called the phase-line analysis. Suppose the graph of f is as shown
in Figure 3.2. Then, each intersection of the graph with the x-
axis is a zero of f (x), and thus is an equilibrium of equation
(3.10). If between two consecutive zeros of f , the graph is above
the x-axis, and thus f (x) > 0, the a solution x(t, x0 ) with x0 in
this interval satisfies x (t) = f (x(t)) > 0, and x(t, x0 ) is monoton-
ically increasing and converges to the equilibrium on the right as
t → ∞. Similarly, if the graph of f is below the x-axis, then
x(t, x0 ) converges to the equilibrium on the left as t → ∞. Also
observe that, if the derivative f (x̄) < 0 at an equilibrium x̄, then
the slope of the tangent line to the graph at x̄ is negative, solu-
tions near x̄ will converge to x̄, and hence x̄ is stable. On the other
hand, if f (x̄) > 0, then x̄ is unstable since solutions near x̄ move
away from x̄ in this case. These observations are summarized in
the next result.
88 Basic Tools
Proposition 3.5.1
(1) An equilibrium x̄ of equation (3.10) satisfies f (x̄) = 0;

(2) If f (x) > 0 for x ∈ (a, b), then solutions starting in (a, b)
monotonically increase; if f (x) < 0 for x ∈ (a, b), then solu-
tions starting in (a, b) monotonically decrease;
(3) If f (x̄) < 0 then the equilibrium x̄ is stable; if f (x̄) > 0
then x̄ is unstable.
f(x)
f(x)
-2 0 3 x
Figure 3.2: A graphical demonstration of phase-line analysis.
As an example, let us consider the logistic equation for popu-

lation growth
N (t)
N (t) = rN (t) 1 − , (3.11)
K
where r > 0 is the intrinsic growth rate and K is the carrying
capacity. Let
N (t)
f (N ) = rN (t) 1 − .
K
The graph of f (N ) intersects the N -axis at two equilibria N = 0
and N = K. Assume that r > 0. Then the graph is above the
N -axis between 0 and K, and below the N -axis outside [0, K],
see Figure 3.3.
3.5 Phase-Line Analysis 89
By Proposition 3.5.1, we know that, if r > 0 then equilibrium

N = 0 is unstable and N = K is stable. Furthermore, solutions
N (t, N0 ) with 0 < N0 < K increase monotonically and converge
to N = K as t → ∞, and solutions N (t, N0 ) with N0 > K decrease
monotonically and converge to N = K as t → ∞. This allows us
to produce the time plots for solutions N (t, N0 ) for different N0 ,
see Figure 3.4.
f(N)
0 K N
Figure 3.3: Phase-line analysis for the logistic equation when

r > 0. The equilibrium N = 0 is unstable and the equilibrium at
the carrying capacity N = K is stable.
t
0 2 4 6 8 10 12
Figure 3.4: Time plots of solutions to the logistic equation when

r > 0. The equilibrium N = 0 is unstable since nearby solutions
move away from it, while the equilibrium N = K is asymptotically
stable since all nearby solutions converge to it as t → ∞.
90 Basic Tools
3.6 Global Dynamics of 2-Dimensional

Systems: Phase-Plane Analysis
We consider a 2-dimensional system of differential equations
x = f (x), (3.12)
where f ∈ C 1 (R2 → R2 ). We assume that all solutions to system

(3.12) exist for all time t ≥ 0 and investigate their asymptotic
behaviors as t → ∞. We first describe some important concepts:
Definition. Let x(t, x0 ) be a solution to system (3.12).
(1) The positive semi-orbit of x(t, x0 ) is γ + (x0 ) = {x(t, x0 )

| t ≥ 0};
(2) The negative semi-orbit of x(t, x0 ) is γ − (x0 ) = {x(t, x0 )
| t ≤ 0};
(3) The orbit of x(t, x0 ) is γ(x0 ) = {x(t, x0 ) | t ∈ R} = γ + (x0 ) ∪
γ − (x0 ).
Definition. Limit sets.

(1) The ω-limit set of x(t, x0 ) is
ω(x0 ) = {x | there exists tn → ∞ such that x(tn , x0 ) → x};

(3.13)
(2) The α-limit set of x(t, x0 ) is
α(x0 ) = {x | there exists tn → −∞ such that x(tn , x0 ) → x}.

(3.14)
3.6 Phase-Plane Analysis 91
Definition. A subset K ⊂ R2 is:
(1) positively invariant, if x0 ∈ K =⇒ x(t, x0 ) ∈ K, t ≥ 0;

(2) negatively invariant, if x0 ∈ K =⇒ x(t, x0 ) ∈ K, t ≤ 0;
(3) invariant, if it is both positively and negatively invariants,
namely, x0 ∈ K =⇒ x(t, x0 ) ∈ K, t ∈ R.
Theorem 3.6.1 Properties of limit sets.
(1) A limit set is closed.

(2) If γ + (x0 ) (or γ − (x0 )) is bounded, then ω(x0 ) (or α(x0 )) is
nonempty, compact, and connected.
(3) Limit sets are invariant.
To investigate asymptotic behaviors of a solution x(t, x0 ), we

try to characterize its limit sets. Examples of limit sets include
a single equilibrium, a periodic orbit, a homoclinic orbit, or a
heteroclinic cycle (Figure 3.5). A limit set in higher dimensional
systems can also be very complicated: the Lorenz attractor in R3 is
a well-known example and is closely related to chaotic dynamics.
The qualitative theory of differential equations aims to classify
all limit sets in a given system. In this respect, some of the best
theories were developed for 2-dimensional systems, largely due to
the work of Poincaré and Bendixson. These results are collectively
known as the Poincaré–Bendixson theory.
Figure 3.5: A limit set can be a periodic orbit, a homoclinic orbit

to an equilibrium, or a heteroclinic cycle consisting multiple equi-
libria and their connecting orbits.
Theorem 3.6.2 (Poincaré–Bendixson Theorem) Let D ⊂ R2

be open and f ∈ C 1 (D → R2 ). Assume that
92 Basic Tools
(1) γ + (x0 ) ⊂ K ⊂ D and K is compact;

(2) ω(x0 ) contains no equilibria.
Then ω(x0 ) is a periodic orbit.

The Poincaré–Bendixson Theorem is often used to prove the
existence of periodic orbits.
Corollary 3.6.3 Assume that
(1) K ⊂ D is compact;
(2) K contains no equilibria;
(3) K contains a semi-orbit.
Then K contains a nonconstant periodic orbit.
x0
K
x(t, x0)
Figure 3.6: A bounded orbit x(t, x0 ) in a positively invariant

annulus K that contains no equilibria. The omega-limit set of
x(t, x0 ) is a periodic orbit.
We note that the Poincaré–Bendixson Theorem as stated here
does not provide any information on ω(x0 ) if ω(x0 ) contains equi-
libria. In the case that all equilibria of the system are hyper-
bolic (therefore isolated), if ω(x0 ) is not a single equilibrium, then
it consists of a finite number of equilibria together with their
3.6 Phase-Plane Analysis 93
connecting orbits. See P. Hartman’s ODE book [25] for related

results. In cases where equilibria are non-hyperbolic, a limit set
can contain infinitely many equilibria.
Exercise. Construct a planar system such that:
(a) the unit circle consists entirely of equilibria, and

(b) the omega-limit set of each nonequilibrium positive semi-
orbit is the unit circle.
Theorem 3.6.4 (Poincaré’s Stability Condition) Assume

that n = 2. A T -periodic solution p(t) of system (3.12) is orbitally
asymptotically stable if
T
div f (p(t)) dt < 0. (3.15)
0
Proof. Let λ1 = 1 and λ2 be the two Floquet multipliers of x =

p(t), as discussed in Section 3.5. Let Y (t) be the fundamental
matrix of the linearized system with respect to x = p(t). Then
λ1 , λ2 are eigenvalues of Y (T ). By the Liouville’s formula,
T T
tr ∂f (p(t))dt divf (p(t))dt
λ2 = λ1 λ2 = det Y (T ) = e 0 ∂x =e 0 .
Therefore, Poincaré’s condition implies 0 < λ2 < 1, and hence the

orbital asymptotic stability of p(t), by Theorem 3.4.2.
Exercise. Suppose that n = 2, system (3.12) has only one equi-
librium x̄, and all solutions are forwardly bounded (for t ≥ 0).
Assume that, for any periodic orbit γ, the condition

divf < 0
γ
holds.
Show the following:
(a) If x̄ is asymptotically stable, then the system has no peri-

odic orbits and x̄ is globally stable.
(b) If x̄ is unstable, then the system has a unique periodic orbit
and it is a limit cycle.
94 Basic Tools
Theorem 3.6.5 (Bendixson’s Negative Criterion) Let

D ⊂ R2 be a simply connected region. If
div f (x) < 0, (or > 0) x ∈ D, (3.16)
then no periodic orbits can lie entirely in D.

Proof. Let x = (u, v) and f (x) = (P (u, v), Q(u, v)). Suppose a
periodic orbit
γ = {(u(t), v(t)) : 0 ≤ t < T } ⊂ D.
Let G be the region enclosed in the interior of γ. Under the right

orientation of γ, Green’s Theorem implies
T
∂P ∂Q
0> + dudv = (P dv − Qdu) = (P v − Qu )dt

G ∂u ∂v γ 0
T
= (P Q − P Q)dt = 0,
0
and the contradiction establishes the theorem.

Remarks.
(a) Bendixson’s condition requires that div(f ) does not change
sign in D.
(b) From the proof, we can see that Bendixson’s condition
rules out periodic orbits, homoclinic orbits, and heteroclinic
cycles.
Exercise. Construct an example to show that Bendixson’s crite-

rion as stated in Theorem 3.6.5 is no longer true in Rn for n > 2,
namely, the negative divergence condition may not be able to rule
out periodic orbits in dimensions higher than 2.
However, suitable conditions (not necessarily using the diver-
gence) can be derived in higher dimensions that rule out peri-
odic orbits. We refer interested readers to the work of Li and
Muldowney [40].
Exercise. Show that if Bendixson’s condition holds in an annular
region D ∈ R2 , then D can contain at most one periodic orbit. Can
you generalize this result?
3.7 Uniform Persistence 95
Corollary 3.6.6 (Dulac’s Criteria) Assume that D ⊂ R2 is

simply connected. If there exists scalar-valued function α(x) such
that
div (αf )(x) < 0 (or > 0) x ∈ D,
then D contains no periodic orbits.
Proof. Follow the same proof of Theorem 3.6.5, apply the Green’s
Theorem to (αP, αQ).
3.7 Uniform Persistence

Persistence is an important concept in population biology. It cap-
tures the longtime survival of a species, even when the population
size of the species is quite low at times. For epidemic models, per-
sistence of the model refers to a situation where the disease is
endemic in a population. In this section, we formulate the con-
cept of uniform persistence and present a theorem for establishing
uniform persistence, adapted from the work of J. Hofbauer and
J. So [29]. A more general result can be found in [17].
Consider a differential equation in Rn
x = f (x), x ∈ Rn . (3.17)
We assume that D ⊂ Rn is an open subset, and function

f : D→Rn is such that the solution x(t, x0 ) to (3.17) with initial
condition x(0, x0 ) = x0 exists for t ≥ 0 and is unique, for x0 ∈ D.
In the context of population models, we assume D is posi-
tively invariant with respect to solutions of (3.17), namely, x0 ∈ D
implies that x(t, x0 ) ∈ D for t ≥ 0. Let ∂D denote the boundary
of D. Typically, for an epidemic model, D = Rn+ , the nonnegative
orthant of Rn , and the boundary ∂D consists of the intersections
of the coordinate hyperplanes with Rn+ . The boundary ∂D may
not be positively invariant for most epidemic models.
System (3.17) is said to be uniformly persistent (see [9, 56,
61]) if there exists constant 0 > 0 such that all solutions x(t, x0 )
with x0 ∈ D satisfy
96 Basic Tools
lim inf d(x(t, x0 ), ∂D) > 0 . (3.18)

t→∞
Here d(x, K) = min{|x − y| | y ∈ K} denotes the distance from

a point x to a subset K. Weaker notions of persistence have
been defined and used in the literature of population dynamics
(see [61]).
Let M ⊂ D be a compact invariant set, namely, M is a compact
subset of D and x0 ∈ M implies that the solution x(t, x0 ) stays
in M for t ∈ R. The stable set of the set M is defined as
W s (M ) = {x0 ∈ D | d(x(t, x0 ), M ) → 0, as t → ∞}.
A global attractor A in D for system (3.17) is the largest compact

invariant set in D.
The next result is a special case of Theorem 4.1 of [29].
Theorem 3.7.1 Assume that D is positively invariant and con-
tains a global attractor A. Then, system (3.17) is uniformly per-
sistent if and only if
(1) the largest compact invariant set M in ∂D is isolated in A,

and
(2) W s (M ) ⊂ ∂D.
As an example, we consider the reduced SIR model (2.25)
I = βIS − γI − bI.
Its feasible region
Γ = {(S, I) ∈ R2+ | 0 ≤ S + I ≤ 1}
is bounded and positively invariant. The open set D will be under-

◦
stood as the interior Γ of Γ and its boundary consists of the union
of the S-axis, I-axis, and line S + I = 1. The definition of uniform
persistence given above can be interpreted in this setting as the
following: there exists 0 > 0 such that
3.7 Uniform Persistence 97
lim inf S(t) > 0 , lim inf I(t) > 0 , lim inf 1 − S(t) − I(t) > 0 . (3.19)
t→∞ t→∞ t→∞
The disease is endemic if (2.25) is uniformly persistent, since in

this case the infected subpopulation I(t) will remain above a pos-
itive level 0 for all sufficiently large times.
We would like to show that the model is uniformly persistent
when R0 > 1 using Theorem 3.7.1. We note that, being bounded
and positively invariant, Γ contains a global attractor A = ∅. The
largest compact invariant set on the boundary of Γ is the sin-
gleton {P0 }, which is also isolated, i.e., there is a neighborhood
of P0 that contains no other full orbit. Therefore, the model is
uniformly persistent if the stable set W s (P0 ) is contained in the
boundary of Γ, or equivalently, if P0 repels toward the interior of
Γ. From Theorem 2.3.2 we know this can only happen if R0 > 1.
This repelling property can be established by showing that the
direction of the eigenvector of the Jacobian matrix J(P0 ) for the
positive eigenvalue (when R0 > 1) is transversal to the S-axis.
Another way to establish this repelling property is to use the
same Lyapunov function in Section 2.3.4:
L(S, I) = I.
Its derivative along a solution (S(t), I(t)) of the model, as calcu-

lated in Section 2.3.4, is
dL dI 1
= = I(βS − γ − b) = βI(S − ).
dt dt R0
Note that P0 = (1, 0), and 1/R0 < 1 if R0 > 1. If we take ini-
tial conditions (S0 , I0 ) in the interior of Γ and sufficiently close
to P0 , namely, I0 > 0 and S0 sufficiently close to 1, we have
S0 − 1/R0 > 0 and thus dL dt
> 0 for small t > 0. This implies that
I(t) will increase for small t > 0 and P0 repels in the direction of
the interior of Γ. This gives the next result.
Theorem 3.7.2 The SIR model (2.25) is uniformly persistent
if and only if R0 > 1.
98 Basic Tools
3.8 Metzler Matrices and Monotone

Systems
Let M be an n × n real matrix. We say that M is a Metzler
matrix if all off-diagonal entries are nonnegative. Metzler matri-
ces were introduced in the economics literature after L. Metzler,
whose work provided the essential development of the theory. The
importance of Metzler matrices is well recognized in other fields
including biology and engineering. Development of Metzler matri-
ces also coincides with that of M -matrices. A matrix M is an
M -matrix if and only if −M is Metzler.
For two real matrices A = (aij ) and B = (bij ), we say that
(a) A ≤ B if and only if aij ≤ bij for all 1 ≤ i, j ≤ n.

(b) A < B if and only if aij < bij for all 1 ≤ i, j ≤ n.
A matrix A is said to be nonnegative if A ≥ 0, and positive if

A > 0. Such a partial order can also be defined for vectors in Rn ,
and we can define nonnegative vectors and positive vectors in a
similar manner.
Letting λ1 , · · · λn be the eigenvalues of A, the spectral radius
of A is defined as
ρ(A) = max{|λi | | i = 1, · · · , n}, (3.20)
namely, the largest modulus of eigenvalues of A. The stability

modulus of A is defined as
s(A) = max{Re(λi ) | i = 1, · · · , n}, (3.21)
namely, the largest real part of eigenvalues of A. If s(A) < 0, we

say that A is stable. In other words, a matrix is stable if all its
eigenvalues have negative real parts.
A permutation matrix is a binary matrix that has exactly one
entry equal to 1 in each row and each column and 0 elsewhere.
A permutation matrix P represents a specific permutation of n
elements and, when used to multiply another matrix A, can pro-
duce the permutation in the rows or columns of A. A nonnegative
3.8 Monotone Systems 99
matrix A is reducible if, for some permutation matrix P ,

A 0
P AP = 1
T
,
A2 A3
and A1 , A3 are square matrices. Otherwise, A is irreducible. Irre-

ducibility of A can be checked using the directed graph associated
with A. The directed graph G(A) associated with A = (aij ) has
vertices {1, 2, · · · , n} with a directed arc (i, j) from i to j if and
only if aij = 0. It is strongly connected if any two distinct vertices
are joined by an oriented path. Matrix A is irreducible if and only
if G(A) is strongly connected.
The next result is the well-known Perron–Frobenius Theorem.
Theorem 3.8.1 (Perron–Frobenius Theorem) Let A ≥ 0 be
an n × n real matrix.
(1) The spectral radius ρ(A) is an eigenvalue of A with respect

to a nonnegative eigenvector;
(2) If A is also irreducible, then ρ(A) is a simple eigenvalue,
and the associated eigenvector is positive.
The next result is the Perron–Frobenius Theorem for Metzler

matrices.
Theorem 3.8.2 (Perron–Frobenius Theorem for Metzler
Matrices) Let A be an n × n Metzler matrix. Then the stability
modulus s(A) is an eigenvalue of A with respect to a nonnegative
eigenvector. Furthermore, if A is also irreducible, then s(A) is a
simple eigenvalue with a positive eigenvector.
Some useful properties of Metzler matrices are given in the
next theorem.
Theorem 3.8.3 Let A be an n × n Metzler matrix. Then the
following statements are equivalent.
(1) A is stable;
(2) A is nonsingular and −A−1 ≥ 0;
100 Basic Tools
(3) For each b > 0, there exists x > 0 such that Ax + b = 0;

(4) There exists x ≥ 0 such that Ax < 0;
(5) There exists x > 0 such that Ax < 0.
Metzler matrices are related to the concept of monotone
dynamical systems. A mapping T : Rn → Rn is said to be mono-
tone if, for x, y ∈ Rn ,
x≤y =⇒ T (x) ≤ T (y).
It is strongly monotone if, for x, y ∈ Rn ,
x ≤ y and x = y =⇒ T (x) < T (y).
Let D ∈ Rn be a convex subset. Let f : D → Rn be a C 1 vector

field. Consider a system of differential equations in Rn
x = f (x). (3.22)
Let x(t, x0 ) be the unique solution to system (3.22) such that

x(0, x0 ) = x0 . The flow ϕt generated by system (3.22) is defined
as ϕt (x0 ) = x(t, x0 ). System (3.22) is said to be monotone or
strongly monotone if its flow ϕt is monotone or strongly mono-
tone, respectively.
Theorem 3.8.4 Let f be a C 1 vector field in Rn defined on a
convex subset D ⊂ Rn . Then system (3.22) is monotone if and
only if, for each x ∈ D, the Jacobian matrix ∂f ∂x
(x) is Metzler.
∂f
Furthermore, if ∂x (x) is also irreducible, then (3.22) is strongly
monotone.
A mapping f : Rn → Rn is said to be sublinear if
0 < λ < 1, x ≥ 0 =⇒ f (λx) ≥ λ f (x),
and strictly sublinear if
0 < λ < 1, x > 0 =⇒ f (λx) > λ f (x).

3.8 Monotone Systems 101
The next theorem is a useful stability result for monotone sys-

tems.
Theorem 3.8.5 Let f : Rn → Rn be a C 1 vector field. Assume
(a) The nonnegative orthant Rn+ is positively invariant with
respect to system (3.22).
(b) System (3.22) is strongly monotone;
(c) f (x) is strictly sublinear;
(d) Solutions to system (3.22) are bounded for t ≥ 0.
Then:
(1) If f (0) = 0, then either all solutions in Rn+ tend to the equi-
librium 0 or there exists an equilibrium p > 0 that is globally
asymptotically stable in the region Rn+ \ {0};
(2) If f (0) > 0, then there exists equilibrium p > 0 that is glob-
ally asymptotically stable in Rn+ .
Note: Section 3.8 is adapted from lecture notes of Professor

Gauthier Sallet of Université Paul Verlaine, France.
Chapter 4
Parameter Estimation
and Nonlinear
Least-Squares Methods
In this chapter, we deal with the problem of parameter estima-

tion. We have seen from Chapter 2 that the outcomes of an
epidemic model critically depend on the values of the model
parameters. While mathematical analysis of models we have dis-
cussed in Chapter 2 is very useful for understanding asymptotic
behaviors and longtime qualitative outcomes, when models are
confronted with disease data the problem is often for finite time,
and an accurate estimation of parameter values is essential for
reliable quantitative predictions within a finite time interval.
Certain parameters such as birth rates, natural death rates,
and recovery rates can be estimated directly from population and
epidemiological data. For instance, if the mean life expectancy at
birth of the population under study is 70 years, then its natural
death rate can be approximated as d = 1/70 ≈ 0.0143, if the time
unit is years. Similarly, if the mean infectious period of the disease
is 6 months and the time unit is years, then the recovery rate is
γ = 1/0.5 = 2. The time unit is important in such a conversion.
If the time unit is months in these examples, then the natural
death rate is d = 1/(70 × 12) ≈ 0.00119, and the recovery rate is
γ = 1/6 ≈ 0.1667.
Some parameters are not so easily estimated directly from data.
Most notable is the transmission coefficient β in the incidence
term βIS. If reliable yearly incidence or prevalence data for the
disease is available, then we can obtain an estimation of β by best
fitting the model outcome to the given data. For simple models,
https://doi.org/10.1007/978-3-319-72122-4_4
104 Nonlinear Least-Squares
this may be done by manually adjusting the values of β to get

a satisfactory fit. For more complicated models, or for estima-
tion of multiple parameters, a systematic approach for the fitting
is desirable. This is often achieved through the nonlinear least-
squares method. We will first introduce the classical method of
linear least squares for curve fitting, and then examine a non-
linear version of the curve-fitting problem. With a basic under-
standing of these simple problems, we will then discuss nonlinear
least-squares problems associated with parameter estimation of
epidemic models.
4.1 Curve-Fitting and Linear

Least-Squares Problem
Curve-fitting problems. Given data points (x1 , y1 ), · · · ,
(xn , yn ), we consider the following curve-fitting problems:
(1) Find a straight line y = ax + b that “best fits” all the data
points.
(2) Find a m-th order polynomial
y = am xm + am−1 xm−1 + · · · + a1 x + a0
that “best fits” all data points.
(3) Find a curve of form
y = a0 f0 (x) + a1 f1 (x) + · · · + am fm (x)
that “best fits” all data points. Here f0 (x), f1 (x), · · · , fm (x)
are given functions.
We see that Problem (1) is a special case of Problem (2), and

Problem (2) is a special case of Problem (3) with fi (x) = xk ,
k = 0, 1, · · · , m. For simplicity of presentation, we will use Prob-
lem (1) as an example. Problems (2) and (3) can be dealt with
the same way.
Least-squares fitting. Suppose that y = a1 x + a0 is a line of the
best fitting, also called a line of regression. We need to clarify the
4.1 Linear Least-Squares Problem 105
meaning of best fitting. When n > 2, there is little hope for a line
to pass through more than two data points, or in other words, for
all the following equations to hold simultaneously:
y1 = a1 x 1 + a0
y 2 = a1 x 2 + a0
···
y n = a1 x n + a0 .
Instead, we look for a best-fitting line that minimizes the total

error
d(a0 , a1 ) = [y1 − (a1 x1 + a0 )]2 + · · · + [yn − (a1 xn + a0 )]2 . (4.1)
data prediction error

x1 y1 a1 x 1 + a0 y1 − (a1 x1 + d0 )
x2 y2 a1 x 2 + a0 y2 − (a1 x2 + d0 )
··· ··· ··· ···
xn yn a1 x n + a0 yn − (a1 xn + d0 )
Table 4.1: Data, predictions, and errors.
From Table 4.1, we see that d(a0 , a1 ) measures the total error
between data yi and prediction a1 xi + a0 for i = 1, · · · , n. This
problem is also a standard minimization problem: we look for a
pair (â0 , â1 ) at which the function d(a0 , a1 ) achieves a minimum.
The choice of the Euclidean norm for the measure of error gives
rise to the term “least squares”. It ensures that d(a0 , a1 ) is a dif-
ferentiable function. It also allows the utilization of Euclidean dot
product and orthogonal projection.
A mathematical formulation of the least-squares prob-
lem. Let
⎡ ⎤ ⎡ ⎤
y1 1 x1
⎢ . ⎥ ⎢. . ⎥ a
b=⎢ ⎥
⎣ .. ⎦ , A=⎢ ⎥
⎣ .. .. ⎦ , x= 0 .
a1
yn 1 xn
Then a least-squares solution (x̂) = (â0 , â1 )T satisfies
b − Ax̂ ≤ b − Ax (4.2)
for all x ∈ R2 . Here · denotes the Euclidean norm and the

superscript T denotes the transposition of matrices and vectors.
The expression in the error term (4.1) is in fact the square of the
Euclidean norm b − Ax2 , and we use the fact that b − Ax is
minimized if and only if b − Ax2 is minimized.
More generally, we allow Am×n be an m × n matrix, and
b ∈ Rm . Then x̂ ∈ Rn is a least-squares solution of Ax = b if x̂
satisfies
b − Ax̂ ≤ b − Ax (4.3)
for all x ∈ Rn .
A least-squares solution x̂ can be found based on geometrical
observations. First we note that the set
col(A) = {Ax : x ∈ Rn }
is the column space col(A) of matrix A. Therefore, minimizing

d = b − Ax is equivalent to finding the distance from vector b to
the subspace col(A). From geometry, we know that such distance
is achieved at the orthogonal projection of b onto the subspace
col(A),
b̂ = Projcol(A) b.
This is illustrated in Figure 4.1. Therefore, the least-squares solu-
tion necessarily satisfies
Ax̂ = b̂. (4.4)
It would be desirable to find a solution x̂ of (4.4) without
having to find the projection b̂. Again, from geometry, we observe
that b − b̂ is orthogonal to the subspace col(A), and thus (b −
Ax̂) ⊥ col(A), see Figure 4.1. In terms of the dot product, we
have
(b − Ax̂) · all columns of A = 0.
Written in matrix form, this relation becomes
AT (b − Ax̂) = 0.
Based on these observations, we know that a least-squares solution

x̂ necessarily satisfies the following normal system:
AT Ax̂ = AT b. (4.5)
Theorem 4.1.1 A least-squares solution x̂ satisfies the normal
system (4.5).
Figure 4.1: A geometric illustration of the least-squares solution.

A least-squares solution x̂ may not be unique. However, when
solutions to system (4.4) or (4.5) are unique, the least-squares
solution must be unique.
Theorem 4.1.2 The following statements are equivalent.
(1) The least-squares solution is unique for each b ∈ Rm .

(2) The columns of A are linearly independent (A has f ull
rank).
(3) Matrix AT A is invertible, and x̂ = (AT A)−1 AT b.
Example 1. Find the line y = a0 + a1 x that best fits the data
points (2, 1), (5, 2), (7, 3), and (8, 3).
Solution. In this case,
⎡ ⎤ ⎡ ⎤
1 2 1
⎢1 5⎥ ⎢2⎥ a
A=⎢
⎢
⎥
⎥, b=⎢ ⎥
⎢ ⎥, x= 0 .
⎣1 7⎦ ⎣3⎦ a1
1 8 3
The normal system AT Ax = AT b becomes

⎡ ⎤ ⎡ ⎤
1 2 1
1 1 1 1 ⎢ ⎥
⎢1 5⎥ a0 1 1 1 1 ⎢ ⎥
⎢2⎥
⎢ ⎥ = ⎢ ⎥,
2 5 7 8 ⎣1 7⎦ a1 2 5 7 8 ⎣3⎦
1 8 3
namely,
4 22 a0 9
= .
22 142 a1 57
Therefore
−1 ⎡ ⎤
2
a0 4 22 9 1 142 −22 9 ⎣ 7 ⎦.
= = =
a1 22 142 57 84 −22 4 57 5
14
This gives a0 = 27 , a1 = 5
14
, and the least-squares line
2 5
y= + x.
7 14
See Figure 4.2.
Example 2. Find a quadratic curve that best fits the data points
(2, 1), (−1, 5), (6, 2), (4, −1).
Solution. A quadratic curve has the form
y = a 2 x 2 + a 1 x + a0 .
We want to find coefficients (â0 , â1 , â2 ) such that
d = [(â0 + 2â1 + 4â2 ) − 1]2 + [(â0 + (−1)â1 + â2 ) − 5]2

+ [(â0 + 6â1 + 36â2 ) − 2]2 + [(â0 + 4â1 + 16â2 ) − (−1)]2
Figure 4.2: The least-squares line for Example 1.

is the smallest among all choices of (a0 , a1 , a2 ). Let
⎡ ⎤
1 x1 x21 ⎡ ⎤ ⎡ ⎤ ⎡ ⎤
1 2 4 y1 1 ⎡ ⎤
⎢ ⎥ a0
⎢1 x22 ⎥ ⎢ ⎥ ⎢y2 ⎥ ⎢ 5 ⎥
⎥ ⎢1 −1 1 ⎥ ,
x2
A=⎢
⎢1 2 ⎥ = ⎣1 ⎦ b=⎢ ⎥ ⎢ ⎥
⎣y3 ⎦ = ⎣ 2 ⎦ , x = ⎣a1 ⎦ .
⎣ x3 x3 ⎦ 6 36
a2
1 x4 x24 1 4 16 y4 −1
Then, the normal system AT Ax = AT b becomes

⎡ ⎤ ⎡ ⎤
⎡ ⎤ 1 2 4 ⎡ ⎤ ⎡ 1 ⎤
1 1 1 1 ⎢ ⎥ a0 1 1 1 1 ⎢ ⎥
⎢ ⎥ ⎢1 −1 1 ⎥ ⎢ ⎥ ⎢ ⎥⎢ 5 ⎥
⎣2 −1 6 4 ⎦⎢ ⎥ ⎣a ⎦ = ⎣2 −1 6 4 ⎦ ⎢ ⎥,
⎣1 6 36⎦ 1 ⎣ 2 ⎦
4 1 36 16 a2 4 1 36 16
1 4 16 −1
namely, ⎡ ⎤⎡ ⎤ ⎡ ⎤
4 11 57 a0 7
⎢ ⎥⎢ ⎥ ⎢ ⎥
⎣11 57 287 ⎦ ⎣a1 ⎦ = ⎣ 5 ⎦ .
57 287 1569 a2 65
The system has a unique solution
4871 12515 1853
â0 = ≈ 2.9719, â1 = − ≈ −1.909, â2 = ≈ 0.2826,
1639 6556 6556
and the unique least-squares quadratic curve for the given data
points is
y = 0.2826x2 − 1.909x + 2.9719.
See Figure 4.3.
Figure 4.3: The least-squares parabola for Example 2.
We note that, in Example 2, we use nonlinear best-fit functions.

Why do we call it a linear least-squares problem? The important
characteristic of linear least-squares problems is that a best-fit
function takes the form of linear combinations of basis functions,
and finding the best-fit function means finding the best choice
of coefficients (or parameters). In certain cases when the best-fit
function has a nonlinear dependence on parameters, the method
for linear least-squares problems can still be applied after a suit-
able transformation.
Example 3. Find the least-squares function of form
x(t) = a0 ea1 t , t > 0, a0 > 0
for the data points
(t1 , x1 ), (t2 , x2 ), · · · , (tn , xn ), x1 , x2 , · · · , xn > 0.
Solution. Let
y(t) = ln x = ln a0 + a1 t,
and
b0 = ln a0 , b1 = a1 , y1 = ln x1 , ··· , yn = ln xn .
4.2 Nonlinear Least-Squares Problem 111
Then, we can first solve the following linear least-squares problem:

Find the least-squares curve of form
y(t) = b0 + b1 t, t>0
for data points (t1 , y1 ), · · · , (tn , yn ).

This problem can be solved using the linear least-squares method
as in Example 1 to produce a least-squares solution (b̂0 , b̂1 ). The
best-fit curve to the original problem is then given by
x(t) = â0 eâ1 t ,
where â0 = eb̂0 , â1 = b̂1 .
4.2 Nonlinear Least-Squares Problem

Let θ = (θ1 , · · · , θm ) be a multidimensional parameter. Consider a
family of scalar-valued curves y = f (x, θ) that depend on
parameter θ.
Nonlinear least-squares fitting: Given data points
(x1 , y1 ), (x2 , y2 ), · · · , (xn , yn ),
find a parameter value θ̂ such that the curve y = f (x, θ̂) minimizes
the squared sum of errors (SSE):

n
SSE(θ) = (yi − f (xi , θ))2 . (4.6)
i=1
The reader should think about why the method of linear least-
squares problems discussed in the previous section will not work
for this problem. We will treat SSE(θ) as a smooth function of θ
and employ multivariable calculus to find its minimum. In fact, a
minimum of SSE(θ) in Rm must be achieved at a critical point;
namely, where
∂SSE(θ)
= 0, j = 1, · · · , m. (4.7)
∂θj
Using the chain rule for differentiation, we rewrite system (4.7)

as
n
∂f
[yi − f (xi , θ)] − (xi , θ) = 0, j = 1, · · · , m. (4.8)
i=1 ∂θj
This system may still be nonlinear in θ through nonlinear depen-

dence in f (xi , θ). Numerical schemes are used to find an approx-
imate solution.
Starting from an initial guess θ(0) of θ, we define a sequence
of approximations {θ(k) } inductively, by the process of Newton’s
method. We expand f (xi , θ) by its first-order Taylor polynomial
at θ(k)

n
∂f (xi , θ(k) )
f (xi , θ) ≈ f (xi , θ (k)
)+ (θs − θs(k) ).
s=1 ∂θs
Let
∂f (xi , θ(k) )
J= = (Jij )
∂θj
be the Jacobian matrix at θ(k) . We note that Jij in the above
relation depends on k, and for simplicity of notation, we choose
to suppress the dependence on k. Then equation (4.8) can be
approximated by
n

m
yi − f (xi , θ(k) ) − Jis (θs − θs(k) ) [−Jij ] = 0, j = 1, · · · , m. (4.9)
i=1 s=1
Let
Δyi = yi − f (xi , θ(k) ),
(k+1) (k)
Δθj = θj − θj .
Then equation (4.9) can be rewritten as
n
Jis Jij Δθs(k+1) = Jij Δyi , (4.10)
i=1 s=1 i=1
from which we can solve for Δθ(k+1) and define
θ(k+1) = θ(k) + Δθ(k+1) , for k = 0, 1, 2, · · · .
In matrix form equation (4.10) can be written as
J T JΔθ = J T Δy, (4.11)
System (4.11) is called the normal system.

The following iteration scheme for the nonlinear least-squares
method is called the Gauss–Newton Method:
(1) Choose an initial value θ(0) .

(2) Solve the normal equation (4.11) for Δθ(1) .
(3) Update θ by θ(1) = θ(0) + Δθ(1) .
(4) Repeat the iteration until convergence is achieved (when
difference θ(k+1) − θ(k) is below margin of error.)
There are many other methods for nonlinear least-squares prob-
lems aimed at improving efficiency and rate of convergence.
If f (x, θ) is a linear function of θ, for example
f (x, θ) = α(x) · θ,
where α(x) = (α1 (x), · · · , αm (x)) and “·” denotes the dot product
in Rm , then we would expect that the Gauss–Newton method
leads to the linear least-squares method. Indeed,

m
f (xi , θ) = α(xi ) · θ = αj (xi )θi , i = 1, · · · , n,
j=1
and

n
SSE(θ) = (yi − α(xi ) · θ)2 .
i=1
Therefore,
∂SSE
n
= (yi − α(xi ) · θ)(−αj (xi ))
∂θj i=1
(4.12)

n
=− αj (xi )(yi − α(xi ) · θ), j = 1, · · · , m.
i=1
Let A = (αj (xi )). Then equation (4.12) can be written in matrix
form as
AT y − AT A θ = 0,
which gives the normal equation
A T A θ = AT y
of the linear least-squares problem.
Example 4. Given data
(1, 4.6), (2, 8.82), (3, 16), (4, 31.3), (5, 58.5),
Find the best-fit curve x = a0 ea1 t .
Solution 1. Using transformation
y = ln x, b0 = ln a0 , b1 = a1 ,
we obtain
y = b0 + b1 t
and the new data set
(1, 1.526), (2, 2.177), (3, 2.773), (4, 3.444), (5, 4.069).
We will apply the linear least-squares method. Let

⎡ ⎤ ⎡ ⎤
1 1 1.526
⎢ ⎥ ⎢ ⎥
⎢1 2⎥ ⎢2.177⎥
⎢ ⎥ ⎢ ⎥ b0
A=⎢
⎢1 3⎥ ⎢ ⎥
⎥ , b = ⎢2.773⎥ , y = b .
⎢
⎣1 4⎥
⎦
⎢
⎣3.444⎦
⎥ 1
1 5 4.069
The normal equation AT Ay = AT b becomes


5 15 b0 13.989
= .
15 55 b0 48.32
Solving this system we obtain

−1
b0 5 15 13.989 0.892
= = .
b0 15 55 48.32 0.635
This gives
a0 = e0.892 = 2.44, a1 = b1 = 0.635,
and the least-squares curve is
x = 2.44e0.635t .
Solution 2. We will use the Gauss–Newton method to solve the

nonlinear least-squares problem directly. Consider the nonlinear
function
f (t, a) = a0 ea1 t , a = (a0 , a1 ), x = (1, 2, · · · , 5)T .
The Jacobian matrix is

⎡ ⎤
ea1 a0 ea1
⎢ 2a1 ⎥
⎢e
⎢ 3a
2a0 e2a1 ⎥
⎥
J(x, a) = ⎢
⎢e
1
3a0 e3a1 ⎥
⎥
⎢ 4a1
⎣e 4a0 e4a1 ⎥
⎦
e5a1 5a0 e5a1
and the normal equation J T JΔa = J T Δy becomes

e2a1 + e4a1 + e6a1 + e8a1 + e10a1 a0 (e2a1 + 2e4a1 + 3e6a1 + 4e8a1 + 5e10a1 ) Δa1
a0 (e2a1 + 2e4a1 + 3e6a1 + 4e8a1 + 5e10a1 ) a20 (e2a1 + 4e4a1 + 9e6a1 + 16e8a1 + 25e10a1 ) Δa2
⎡ ⎤
Δy1

e a1 e2a1 e3a1 e4a1 e5a1 ⎢Δy 2
⎥
=
a0 e3a1 2a0 e2a1 3a0 e3a1 4a0 e4a1 5a0 e5a1 ⎣Δy3 ⎦ .
Δy4
Δy5
Here
(k)
Δa1 a − a0
= 0 (k) ,
Δa2 a1 − a1
and
(k)
(k)
Δyi = yi − a0 ea1 , 1 ≤ i ≤ 5.
Choosing an initial vector
(0)
a0 1
(0) =
a1 1
and using the iteration scheme

⎡ (k) (k) ⎤
y − a0 ea0
⎢ 1 ⎥
(k+1) (k) ⎢y − a(k) ea(k) ⎥
⎢ 2 0
0 ⎥
a0 a0
−1 T
⎢ (k) ⎥
T ⎢ (k) a ⎥
(k+1) = (k) + (J J) J (k) (k) ⎢y3 − a0 e 0 ⎥
a1 a1 (a0 , a1 ) ⎢ ⎥
⎢y − a(k) ea(k)
0 ⎥
⎣ 4 0 ⎦
(k) (k)
y5 − a0 ea0
we obtain, for k = 0,
(1)

a0 1
(1) = 1
a1
⎡ 1.882 ⎤
−1
2.718 7.389 20.086 54.598 148.413 ⎢
1.431 ⎥
+
25472.8 123383 ⎢ −4.086 ⎥
133383 602214 2.718 14.778 60.257 218.393 742.066 ⎣ ⎦
−23.298
−89.913

1.386
= .
0.801
Iterating again we obtain, for k = 1,

(2)

a0 1.386
(2) = 0.801
a1
⎡ 1.511 ⎤
−1
2.228 4.965 11.064 24.653 54.934 ⎢ ⎥
1.936
+
3777.53 24873.7 ⎢ 0.661 ⎥
24873.7 166018 3.089 13.768 46.017 136.716 380.802 ⎣ ⎦
−2.879
−17.66

2.104
= .
0.651
Subsequent iterations can be calculated in the same way to obtain

(3)
(4)
(5)

a0 2.428 a0 2.431 a0 2.431
(3) = , (4) = , (5) = , ··· .
a1 0.635 a1 0.636 a1 0.636
We see that if the margin of error is three decimal points, then

we have achieved the required accuracy at the 4th iteration and
obtain
a0 ≈ 2.431, a1 ≈ 0.636.
Comparing with the answers in Solution 1, we see that they agree
up to the first decimal point. The resulting curves are shown in
Figure 4.4. The two curves are virtually indistinguishable.
Figure 4.4: The least-squares curve for Example 4.

4.3 Parameter Estimation for

Epidemic Models
Suppose that our epidemic model is described by the initial value
problem of a system of differential equations:
x = f (x, θ), x ∈ Rd , t ∈ [0, tmax ],

(4.13)
x(0) = x0 .
Here, θ ∈ Rm is an m-dimensional parameter, and [0, tmax ] is the

finite time interval in which we investigate the epidemic. The
disease data is often given at discrete observation time points
t1 , t2 , · · · , tp ∈ [0, tmax ] in the form
(t1 , g(x(1) )), (t2 , g(x(2) )), · · · , (tp , g(x(p) )), x(i) ∈ Rd . (4.14)
The function g(x) : Rd → Rn represents measurable quantities of

the state variable x, also called the observables. To fit with data,
it is natural that we only consider values of the solution x(t, θ) at
these observational time points:
x(t, θ) ≈ (x(t1 , θ), x(t2 , θ), · · · , x(tp , θ))T .
From the fundamental theory of differential equations we know

that, if the vector field f (x, θ) is a smooth function (having contin-
uous partial derivatives) with respect to (x, θ), then the solution
x(t, θ) has a dependence on θ with the same order of smooth-
ness as f . In the above notation, the dependence of the solution
on initial condition x0 is understood and suppressed. In the fol-
lowing discussion, we will keep x0 fixed and discuss fitting the
parameter θ. In many applications, the initial conditions are not
always known and need to be fitted. Since the solution x(t, x0 , θ)
is a diffeomorphism with respect to the initial conditions x0 , we
can consider x0 as part of the parameter θ. We denote the data
points as
y = (g(x(1) ), g(x(2) ), · · · , g(x(p) )), x(i) ∈ Rd .

4.3 Parameter Estimation for Epidemic Models 119
Then the squared sum of errors (SSE) between our solution and
the data can be measured by

p
SSE(θ) = d(g(x(t, θ)), y)2 = ||g(x(ti , θ)) − g(x(i) )||2 . (4.15)
i=1
We note here that in the above expression, it is important to

measure differences in quantities of the same type. Since the data
is given as the observable quantities g(x), we need to compare the
observable part of the model solution g(x(t, θ)) with the data. This
is often referred to as “comparing apples to apples, and oranges to
oranges.” We also note that ||g(x) − g(y)|| in the above expression
denotes the Euclidean norm of the n-dimensional vector g(x) −
g(y), namely,

n
||g(x) − g(y)||2 = |gi (x) − gi (y)|2 .
i=1
In the least-squares fitting, we look for a value θ̂ of model

parameter θ such that SSE(θ) is the minimum. Such a problem is
clearly a nonlinear least-squares problem, since the dependence of
a solution x(t, θ) on the parameter θ is through a highly nonlinear
system of differential equations.
We can apply the Gauss–Newton method to the least-squares
parameter fitting problem. The differential equations can be dis-
cretized to give a system of difference equations for the solution.
The normal equation for the difference equation can be derived
and then solved by Gauss–Newton iteration. When the number of
equations in (4.13) is large, this requires too much human effort
for code writing. An alternative to this approach is to take advan-
tage of the powerful direct search routines of mathematical soft-
ware packages such as Matlab, Maple, and Mathematica. We will
explain this approach using a simple SIR model as an example,
together with the Matlab codes for each step.
Several Matlab functions can be used for parameter estima-
tion. Matlab function lsqcurvefit requires the following inputs: the
model equation, an initial guess for the parameters to be fitted,

and the time points and data points. It then solves the nonlinear
least-squares problem directly. Matlab function nlinfit is another
nonlinear regression routine that uses an iterative least-squares
estimation with an initial value for the parameters. Matlab func-
tion fminsearch is commonly used for parameter estimation. We
first define the sum of squares of errors SSE(θ) between the model
output and data. With an initial guess θ0 of the unknown param-
eters, the epidemic model can be numerically solved to produce a
value for SSE(θ0 ). The fminsearch routine takes the least-squares
error function SSE(θ) and an initial guess of the parameter value,
and uses a direct search routine to find a minimum value of least-
squares error. To ensure the minimum value returned by fmin-
search is not just a local minimum, the process can be repeated
with several choices of initial guess.
4.3.1 An Example of Using Matlab for

Parameter Estimation
We will use the Kermack–McKendrick SIR model for demonstra-
tion:
S = −λIS
I = λIS − γI (4.16)

R = γI.
Our objective is to estimate the parameters λ and γ by fitting the
model to disease data.
The first step is to input the model equation into Matlab by

defining a function, SIRModel, to be called on later for fitting.
The function takes three inputs: the time variable t ∈ R, state
variable y ∈ R3 , and parameter vector par ∈ R2 .
f u n c t i o n f=SIRmodel ( t , y , par )
% L a b e l t h e p a r a m e t e r s and v a r i a b l e s
lambda = par ( 1 ) ;
gamma = par ( 2 ) ;
S=y ( 1 ) ;
I=y ( 2 ) ;
R=y ( 3 ) ;
N=S+I+R ;
% Input the d i f f e r e n t i a l equations
Sdot =-lambda * I *S ;
I d o t=lambda * I *S - gamma* I ;
Rdot=gamma* I ;
f =[ Sdot I d o t Rdot ] ’ ;
end
We use the ODE Solver ODE45 of Matlab to solve the differential

equation for a given set of initial conditions (IC), by defining a
function SIRSol.
f u n c t i o n s o l=SIRSol ( par , IC , t )
% d i s p ( num2str ( par ) )
DeHandle=@( t , y ) SIRModel ( t , y , par ) ;
[ ~ , Y]= ode45 ( DeHandle , t , IC ) ;
s o l=Y ’ ;
end
The second step is create the data. For demonstration pur-

poses, we will artificially generate data at 20 time points from the
model with prescribed parameter values, and add random noise to
the generated data. To mimic real disease situations, we assume
that only data from the infected population I (prevalence) and
the total population N = S + I + R are given.
First take 20 randomly chosen time points:

numpts=20;
t d a t a =[0 s o r t (20* rand ( 1 , numpts ) ) ] ;
Then we generate normally distributed noise:

width = 0 . 1 ;
ndataSIR =20*[ 0 normrnd ( 0 , width , [ 1 , numpts ] ) ;
0 normrnd ( 0 , width , [ 1 , numpts ] ) ;
0 normrnd ( 0 , width , [ 1 , numpts ] ) ] ;
We add the noise term to the model outputs with λ = 0.01 and
γ = 0.1 and initial conditions S(0) = 50, I(0) = 1, and R(0) = 0
to produce a set of artificial data:
lambda = 0 . 0 1 ;
gamma= 0 . 1 ;
par =[ lambda gamma ] ;
IC =[50 1 0 ] ;
SIRData=SIRSol ( par , IC , t d a t a )+ndataSIR ;
We will only use the data for I(t) and N (t):
SIRData =[0 1 0 ; 1 1 1 ] * SIRData ;
We can visualize the data (shown in Figure 4.5) for I(t) and N (t):
figure ;
p l o t ( t d a t a , SIRData ( 1 , : ) , ’ r * ’ ) ;
h o l d on ;
p l o t ( t d a t a , SIRData ( 2 , : ) , ’ o ’ ) ;
Figure 4.5: Data on I and N = S + I + R that are artificially

created by adding noise to solution curves.
In the last step, we define least-squares error terms:
S I R p a r S o l= @( par , t ) [ 0 1 0 ; 1 1 1 ] ∗ SIRSol ( [ par ( 1 ) par ( 2 ) ] ,

IC , t ) ;
SumSquaresSIR = @( par ) sum(sum( ( S I R p a r S o l ( par , t d a t a )−SIRData
) .^2) ) ;
and call fminsearch with an initial guess for the parameter values
λ = 8, γ = 0.02:
[ SIRtheta , f v a l , e x i t f l a g ] = f m i n s e a r c h ( SumSquaresSIR , [ 8
0.02]) ;
S I R s o l=S I R p a r S o l ( SIRtheta , t s o l ) ;
Then we visualize the solutions in comparison to data, as shown

in Figure 4.6, using:
figure ;
p l o t ( t d a t a , SIRData , ’ . ’ ) ;
h o l d on ;
p l o t ( t s o l , SIRsol , ’ - - ’ ) ;
The results of the best-fit values for the parameters are λ =

0.0099658 and γ = 0.10063, which are very close to the preas-
signed values λ = 0.01 and γ = 0.1. Fitting results can be visual-
ized in Figure 4.6. Solutions to the model with the best-fit param-
eter values are shown in Figure 4.7.
Figure 4.6: Fitting results of solutions I(t) and N (t) = S(t) +

I(t) + R(t) to the artificial data.
Figure 4.7: Solutions (S(t), I(t), R(t)) of the model using best-fit
parameter values.
Exercises.
(1) Try out the Matlab codes in the example and reproduce the
best-fit parameter values and the graphics. You may want
to choose a different set of values for λ and γ to generate the
data and see what results you get. Another good practice is
to pick several initial guesses for the fminsearch to repeat
the fitting process, and see if you obtain the same set of
best-fit parameters.
(2) Suppose that we only have data for I(t) available for us.
How would you modify the least-squares error term to
reflect this change? Can you modify the Matlab codes and
fit the best-fit parameter values for both λ and γ?
Chapter 5
Special Topics
In this chapter, we select some materials for further study. In

Section 5.1, an SEIR model for measles is presented and ana-
lyzed. This is an example of higher dimensional models, models
having more than two modeling equations. For the local stabil-
ity analysis, we demonstrate how the Routh–Hurwitz conditions
can be used to show that the eigenvalues of a 3 × 3 matrix have
negative real parts. The proof of global stability of the endemic
equilibrium using a Lyapunov function is adapted from the work
of Korobenikov and Maini [34].
In-host models describe the infection of different cell types by
viruses, bacteria, and other pathogens in the body, typically in
the peripheral blood. Whereas these models describe infections
on a microscopic level, the modeling equations strongly resemble
epidemic models we have derived in earlier chapters. As a result,
the machinery we have learned in this book can also be applied to
analyze this type of models. In Section 5.2, we present a simple
in-host model for the infection of CD4+ T cells by the Human
T-cell Lymphotropic Virus type I (HTLV-I) in the peripheral
blood. This is one of the simplest disease models that have the
phenomenon of backward bifurcation. We will explain how a back-
ward bifurcation occurs, its difference from the forward bifurca-
tion we have discussed in Section 2.3.3, as well as its implications
for the infection process and disease control.

https://doi.org/10.1007/978-3-319-72122-4_5
126 Special Topics
5.1 Higher Dimensional Models: SEIR

Models
Let us consider an infectious disease that has latency and causes
a permanent immunity in its host, and whose transmission is
described by the transfer diagram in Figure 5.1.
As usual, S, E, I, and R denote the subpopulations that are

susceptible, exposed, infectious, and recovered, respectively. As
we have seen in Section 1.4.4, this type of model is called an
SEIR model. One of the diseases that can be described by an SEIR
model is measles, whose latent period is typically about two weeks.
The parameters and γ are understood in such a way that 1/
and 1/γ are the mean latent and infectious period, respectively.
The parameter μ denotes the natural birth rate, which we assume
to be equal to the natural death rate to keep the total population
constant.
5.1.1 SEIR Models

Let S(t), E(t), I(t), and R(t) denote the number of hosts in the
corresponding compartments. Based on our assumptions and the
transfer diagram, we can derive the following system of differential
equations for the model:
S = μ − λIS − μS
E = λIS − ( + μ)E
(5.1)
I = E − (γ + μ)I
R = γI − μR.
5.1 SEIR Models 127
We investigate solutions with initial conditions S(0) = S0 ≥ 0,

E(0) = E0 , I(0) = I0 , and R(0) = R0 .
First, we show that the total population remains a constant.
If we add all four equations in (5.1) we obtain
(S + E + I + R) = μ − μ(S + E + I + R).
If we let N (t) = S(t) + E(t) + I(t) + R(t) and N0 = S0 + E0 +

I0 + R0 , then N (t) satisfies
(1 − N ) = −μ(1 − N ),
which can be solved to yield
1 − N (t) = (1 − N0 )e−μt , t ∈ R.
We can draw two conclusions based on this relation:
(1) If N0 = 1 then N (t) = 1 for all t, and thus the total popu-
lation remains a constant 1.
(2) If N0 = 1 then N (t) → 1 exponentially as t → ∞.
Based on these conclusions, we can deduce that all the limit sets
(as t → ∞) of solutions to (5.1) are contained in the hyperplane
S + E + I + R = 1, which is also invariant. If we are interested in
the limiting behaviors of solutions, it suffices to restrict our atten-
tion to the invariant plane S + E + I + R = 1. The invariance of
the plane means that the total population N (t) remains 1 for all
t if N (0) = 1. As we have seen in Chapter 2, the constant total
population N has been scaled to 1 for simplicity.
The conservation of the total population, S + E + I + R = 1,
allows us to simplify system (5.1) by reducing the number of vari-
ables by 1. For instance, we can substitute
R(t) = 1 − S(t) − E(t) − I(t) (5.2)
into the system and only consider the equations for S, E, and I.
Once the behaviors of S(t), E(t), and I(t) are understood, the
behaviors of R(t) can readily be obtained from the linear
128 Special Topics
relation (5.2). In fact, for system (5.1), since the first three equa-
tions do not contain the variable R, we may simply first ignore
the R equation and consider the following 3-dimensional system
of differential equations:
S = μ − λIS − μS
E = λIS − ( + μ)E (5.3)
I = E − (γ + μ)I.
In the rest of the section, we focus our mathematical analysis on

system (5.3). Our results can be readily translated to those for
system (5.1).
We first derive the natural feasible region of system (5.3). We
note that the nonnegative orthant of R3 ,
R3+ = {(S, E, I) | S ≥ 0, E ≥ 0, I ≥ 0},
is positively invariant for system (5.3). This can be checked by ver-

ifying that the vector field of (5.3) on the boundary of R3+ is either
tangential to the boundary or pointing to the interior of R3+ . Sim-
ilarly, we can show that the nonnegative orthant R4+ is positively
invariant for system (5.1). Biologically speaking, the positive
invariance of R4+ is equivalent to the following statement: non-
negative initial conditions give rise to nonnegative solutions for
t ≥ 0, a required property considering that S(t), E(t), I(t), R(t)
all represent numbers of individuals and should be nonnegative.
In this sense, we say that the model is well-posed.
As it turns out, there is no need to consider all initial conditions
in R3+ . Noting that S(t) + E(t) + I(t) + R(t) = 1 and R(t) ≥ 0,
we know that
S(t) + E(t) + I(t) ≤ 1 for all t.
Therefore we only need to consider initial conditions in the

bounded region
Γ = {(S, E, I) ∈ R3+ | S + E + I ≤ 1}. (5.4)

5.1 SEIR Models 129
By considering the direction of the vector field on the plane

S + E + I = 1, we can verify that Γ is positively invariant for
system (5.3) and solutions starting in Γ will remain in Γ. We will
investigate system (5.3) in the feasible region Γ.
5.1.2 Equilibria and the Basic Reproduction

Number
The coordinates of an equilibrium (S, E, I) of system (5.3) satisfy
the following equations:
μ − λIS − μS = 0
λIS − ( + μ)E = 0 (5.5)
E − (γ + μ)I = 0.
The disease-free equilibrium P0 = (1, 0, 0) always exists. An

endemic equilibrium P ∗ = (S ∗ , E ∗ , I ∗ ), with S ∗ , E ∗ , I ∗ > 0, sat-
isfies
( + μ)(γ + μ) γ+μ ∗ μ(1 − S ∗ )
S∗ = , E∗ = I , I∗ = . (5.6)
λ λS ∗
We note that equilibrium P ∗ does not always belong to the
◦
feasible region Γ, especially its interior Γ, since the relations
0 ≤ S ∗ , E ∗ , I ∗ ≤ 1 have to be satisfied. It is then clear that
◦ λ
P∗ ∈ Γ ⇐⇒ S ∗ < 1 ⇐⇒ > 1.
( + μ)(γ + μ)
Using the interpretation of the basic reproduction number in
Section 1.4.9, we see that
λ
R0 = (5.7)
( + μ)(γ + μ)
is the basic reproduction number for model (5.3). We arrive at
the next proposition.
130 Special Topics
Proposition 5.1.1
(1) If R0 ≤ 1, then system (5.3) has only the disease-free equi-

librium P0 = (1, 0, 0) in Γ.
(2) If R0 > 1, then system (5.3) has two equilibria: the disease-
free equilibrium P0 and a unique endemic equilibrium P ∗ =
(S ∗ , E ∗ , I ∗ ), whose coordinates are given in (5.6).
5.1.3 Local Stability Analysis

In the previous section, we have seen that the basic reproduction
number serves as a threshold parameter in determining the num-
ber of equilibria in system (5.3). We will show in this section that
R0 also determines the local stability of the equilibria.
Theorem 5.1.2
(1) If R0 < 1, then the disease-free equilibrium P0 = (1, 0, 0) is

locally asymptotically stable in Γ.
(2) If R0 > 1, then the disease-free equilibrium P0 becomes
unstable and the endemic equilibrium P ∗ is asymptotically
stable.
We will apply the method of linearization to the stability anal-

ysis in both cases.
Proof. The Jacobian matrix of system (5.3) at P0 = (1, 0, 0) is
⎡ ⎤
−μ 0 −λ
⎢
J(P0 ) = ⎣ 0 − − μ λ ⎥ ⎦.
0 −γ − μ
The characteristic equation det(pI3×3 − J(P0 )) = 0 is a cubic

equation
(p + μ)[p2 + ( + γ + 2μ)p + ( + μ)(γ + μ) − λ] = 0.
Therefore, one eigenvalue is p1 = −μ < 0. The remaining two

eigenvalues p2 and p3 satisfy
5.1 SEIR Models 131
p2 + ( + γ + 2μ)p + ( + μ)(γ + μ) − λ = 0.
Since
p2 + p3 = −( + γ + 2μ) < 0,
p2 p3 = ( + μ)(γ + μ) − λ) = ( + μ)(γ + μ)(1 − R0 ),
we know that, if R0 < 1, then p2 p3 > 0, and p2 and p3 are either

both real numbers of the same sign, or complex numbers that are
conjugates of each other. The condition p2 + p3 < 0 implies that
they are either both negative real numbers or complex numbers
with negative real parts. Therefore P0 is asymptotically stable by
Theorems 3.2.1 and 3.2.2 in Chapter 3. If R0 > 1, then p2 p3 < 0,
and thus p2 and p3 have opposite signs, and P0 is unstable.
◦
Assume that R0 > 1 so that P ∗ ∈ Γ. The Jacobian matrix at
P ∗ is ⎡ ⎤
−λI ∗ − μ 0 −λS ∗
J(P ∗ ) = ⎢
⎣ λI ∗ − − μ λS ∗ ⎥ ⎦.
0 −γ − μ
The characteristic equation of J(P ∗ ) is more complicated than
that of J(P0 ). We will use the Routh–Hurwitz conditions in
Section 3.2 of Chapter 3.
It is easy to see that tr(J(P ∗ )) = −λI ∗ − − γ − 3μ < 0.
Direct calculation gives
det(J(P ∗ )) = −(λI ∗ + μ)( + μ)(γ + μ) − λI ∗ λS ∗ + (λI ∗ + μ)λS ∗
= −(λI ∗ + μ)( + μ)(γ + μ) + μλS ∗
= −(λI ∗ + μ)( + μ)(γ + μ) + μ( + μ)(γ + μ)
= −λI ∗ ( + μ)(γ + μ) < 0.
The sum of 2 × 2 principal minors of J(P ∗ ) is

a2 = (λI ∗ + μ)( + μ) + (λI ∗ + μ)(γ + μ) + ( + μ)(γ + μ) − λS ∗
= (λI ∗ + μ)( + γ + 2μ),
since ( + μ)(γ + μ) − λS ∗ = 0. Therefore,
tr(J(P ∗ )) · a2 = −(λI ∗ + + γ + 3μ)(λI ∗ + μ)( + γ + 2μ)

< −λI ∗ ( + μ)(γ + μ) = det(J(P ∗ )).
132 Special Topics
We have verified all the Routh–Hurwitz conditions and proved

that P ∗ is asymptotically stable when R0 > 1.
Remark. Careful readers might have noticed that, in Theorem
5.1.2, we have neglected the case R0 = 1. We note that, when
R0 = 1, E ∗ = I ∗ = 0 and the two equilibria P0 and P ∗ coin-
cide. Furthermore, the two eigenvalues p2 , p3 of J(P0 ) satisfy
p2 p3 = ( + μ)(γ + μ)(1 − R0 ) = 0, and thus at least one eigen-
value of J(P0 ) is zero. In this case, P0 is not a hyperbolic equi-
librium, and the method of linearization will not directly yield a
nonlinear stability result of P0 . As we will see in the next section,
the stability of P0 when R0 = 1 can be established using a Lya-
punov function.
5.1.4 The Global Stability of the

Disease-Free Equilibrium When
R0 ≤ 1: LaSalle’s Invariance Principle
Theorem 5.1.3 The disease-free equilibrium P0 = (1, 0, 0) of

(5.3) is globally asymptotically stable in Γ if R0 ≤ 1.
Proof. Consider function L = E + ( + μ)I. Then the derivative

of L along a solution (S(t), I(t), R(t)) is
L = I [ λS − ( + μ)(γ + μ) ]
(5.8)
= ( + μ)(γ + μ) I [ R0 S − 1 ] ≤ 0 since S ≤ 1.
If L = 0 then either I = 0 or R0 S = 1. If R0 = 1, then the sec-

ond relation implies S = 1, E = I = 0. The set G = { (S, E, I) ∈
Γ | L = 0 } ⊂ { (S, E, I) ∈ Γ | I = 0 }. A solution (S(t), E(t), I(t))
that stays in G satisfies equation S = μ − μS, and thus S(t) → 1
as t → ∞. It follows that E(t), I(t) → 0 as t → ∞ since S(t) +
E(t) + I(t) ≤ 1. The largest invariant set in G is then the sin-
gleton {P0 }. The global asymptotic stability of P0 when R0 ≤ 1
follows from LeSalle’s Invariance Principle (Theorem 3.3.4 and
Corollary 3.3.5, Chapter 3).
5.1 SEIR Models 133
5.1.5 Uniform Persistence and Endemicity

of a Disease When R0 > 1
In this section, we show that a disease persists and becomes
endemic in the host population when the basic reproduction
number R0 > 1. This is done by establishing that system (5.3)
is uniformly persistent in the feasible region Γ when R0 > 1.
System (5.3) is said to be uniformly persistent if there exists a
constant 0 < 0 < 1 such that any solution (S(t), E(t), I(t)) with
◦
(S(0), E(0), I(0)) ∈ Γ satisfies:
lim inf S(t) > 0 , lim inf E(t) > 0 , lim inf I(t) > 0 ,
t→∞ t→∞ t→∞
(5.9)
lim inf 1 − S(t) − E(t) − I(t) > 0 .
t→∞
The disease is endemic if (5.3) is uniformly persistent, since in this

case the infected subpopulations E(t) and I(t) will remain above
a positive level 0 > 0 after a long time. Weaker notions of persis-
tence have been defined and used in the literature of population
dynamics (see [61]). One may choose to define endemicity of the
disease using one of the weaker notions of persistence. However, as
the following result shows, persistence of (5.3) in any reasonable
sense is equivalent to the uniform persistence as defined above.
Proposition 5.1.4 System (5.3) is uniformly persistent in Γ if
and only if R0 > 1.
Proof. The necessity of R0 > 1 follows from Theorem 5.1.3 and

the fact that the asymptotic stability of P0 precludes persistence.
To establish sufficiency of the condition R0 > 1, we will apply the
uniform persistence result Theorem 3.7.1, from Chapter 3. Using
the same argument as in the proof of Theorem 3.7.2 in Section
3.7, we see that the maximal invariant set M on the boundary
∂Γ is the singleton {P0 } and is an isolated invariant set. To show
that, when R0 > 1, the disease-free equilibrium P0 repels in the
direction of the interior of Γ, we use the Lyapunov function L and
relation (5.8) in the preceding section:
L = ( + μ)(γ + μ) I ( R0 S − 1 ).
134 Special Topics
Here we see that, when R0 > 1, L > 0 for (S, E, I) in the interior
of Γ if S is sufficiently close to 1. As a result, solutions in the
interior of Γ starting sufficiently close to P0 leave a neighborhood
of P0 . This verifies the condition of Theorem 3.7.1 and system
(5.3) is uniformly persistent in Γ when R0 > 1.
Theorem 5.1.3 and Proposition 5.1.4 establish the basic repro-
duction number R0 as a sharp threshold parameter; if R0 ≤ 1 the
disease always dies out irrespective of initial conditions, and if
R0 > 1 the disease remains endemic in the population as long as
it is initially present.
5.1.6 The Global Stability of the Endemic

Equilibrium When R0 > 1
Theorem 5.1.5 Suppose that R0 > 1. Then the unique endemic
equilibrium P ∗ is globally asymptotically stable in the interior
of Γ.
Proof. Let P ∗ = (S ∗ , E ∗ , I ∗ ) be the endemic equilibrium.
Consider a function
S E
V (S, E, I) = (S − S ∗ ) − S ∗ log + (E − E ∗ ) − E ∗ log ∗

S ∗

E
+μ I
+ (I − I ∗ ) − I ∗ log ∗ .
I
We first show that V (S, E, I) ≥ 0 in the interior of Γ and V (S, E,
I) = 0 only at P ∗ . For x∗ > 0, let f (x) = x − x∗ − x∗ log xx∗ . Then
f (x∗ ) = 0 and f (x) = 1 − x∗ /x. Therefore f (x) < 0 if x < x∗ and
f (x) > 0 if x > x∗ . This implies that f (x) has an absolute mini-
mum 0 at x = x∗ in the interval (0, ∞). This property shows that
V (S, E, I) is positive definite with respect to point P ∗ .
5.1 SEIR Models 135
The Lyapunov derivative of V along solutions of (5.3) is

• S∗ E∗ + μ I ∗
V = S − S + E − E + I − I
S E I
S∗ E∗
= μ − μS − (μ − λIS − μS) − ( + μ)E − (λS − ( + μ)E)
S E
+μ + μ I∗
+ (E − (γ + μ)I) − (E − (γ + μ)I)
I
∗ ∗
S λISE ( + μ)(γ + μ)
= μ − μS − μ − λIS ∗ − μS ∗ − + ( + μ)E ∗ − I
S E
EI ( + μ)(γ + μ) ∗
− ( + μ) ∗ + I .
I
•
While the expressions in V seem complex, we can simplify them
using the equilibrium relations satisfied by P ∗ :
μ = λI ∗ S ∗ + μS ∗ , λI ∗ S ∗ = ( + μ)E ∗ , E ∗ = (γ + μ)I ∗ .
(5.10)
These relations are obtained by setting the derivatives in system
•
(5.3) to 0. Substituting (5.10) into V , we obtain
• μS ∗ S E∗ I ∗
∗E I
V = μ − μS − + μS ∗ + λI ∗ S ∗ ∗ − ( + μ)E
S S E I∗ E I∗
( + μ)(γ + μ) ∗
+ ( + μ)E ∗ + I

∗ S S∗ ∗ ∗ S∗ S E∗ I E I∗
= μS 2 − ∗ − + λI S 3 − − ∗ −
S S S S E I ∗ E∗ I
≤ 0, for all (S, E, I) in the interior of Γ.
The last inequality follows from the inequality for the arithmetic
and geometric means, and the same inequality implies that we
•
must have S = S ∗ , E/E ∗ = I/I ∗ if V = 0. Letting S = S ∗ in the
first equation of system (5.3), we obtain I = I ∗ , and thus E = E ∗ .
•
This implies that V is negative definite with respect to P ∗ . This
proves the global stability of P ∗ when it exists.
136 Special Topics
5.2 In-host Models and Backward

Bifurcation
We have seen in previous chapters how mathematical models
can be used to describe disease transmission on the scale of an
epidemic. We will show in this section that the same modeling
approach can also be applied to describe disease processes at a
microscopic level, such as the spread of a viral infection among
a population of target cells. The resulting models are commonly
called in-host models. It is interesting to note that, although the
in-host models are intended for processes at a very different scale
from that of epidemic models, the resulting differential equations
are very similar. This allows us to apply the mathematical machin-
ery and our modeling intuitions developed in previous chapters to
the analysis of in-host models.
Just as an understanding of infectious disease epidemiology is
important for modeling epidemics, a sufficient understanding of
cell biology, cell physiology, and immunology is crucial for in-host
modeling. It often requires an extensive study of medical literature
on the molecular biology and pathogenesis of the infection process
we set out to model. A good place to start is often recent review
articles on the subject in medical journals.
In this section, we will use the infection of CD4 T cells by
Human T-cell Lymphotropic Virus Type 1 (HTLV-I) as an exam-
ple. HTLV-I was the first retrovirus linked to human diseases and
consequently one of the most studied retroviruses in medical lit-
erature. We will show that a 2-dimensional model for HTLV-I
infection can demonstrate a very interesting phenomenon of back-
ward bifurcation. This presentation is adapted from the publica-
tion [19]. For further readings on this topic, we refer the reader
to the book of Nowak and May [49] and research papers of the
author and his collaborators [20, 37, 39].
At the time this book is written, in-host modeling is one of
the most active areas of research in mathematical biology, and
there are many opportunities for interdisciplinary collaborations
between modelers and medical researchers.
5.2 In-host Models of Viral Dynamics 137
5.2.1 Infection of T Cells by HTLV-I

Human T-cell Lymphotropic Virus Type 1 (HTLV-I) infection is
responsible for several diseases such as Adult T-cell Leukemia/
Lymphoma (ATL) and HTLV-I Associated Myelopathy (HAM).
As a retrovirus, HTLV-I uses a viral protein called reverse tran-
scriptase to synthesize DNA segments (provirus) that integrate
into the host DNA. The main target for the viral infection is the
CD4+ T-lymphocyte population. HTLV-I shares the same tar-
get cells and replication mechanism as Human Immunodeficiency
Virus Type 1 (HIV-1), which causes Acquired Immune Deficiency
Syndrome (AIDS). There are many differences between HTLV-I
and HIV-I. Unlike HIV-1, which can break free from host cells and
infect other T cells, cell-free HTLV-I does not trigger infection.
Cell-to-cell contact is normally required to transmit the infec-
tion between CD4+ T cells. Like other retroviruses, the HTLV-I
provirus can also be vertically transmitted to the daughter cells
of an infected cell during mitosis.
The proportion of provirus-containing cells in HTLV-I infection
is remarkably high; typically between 0.1% and 10% of peripheral
blood mononuclear cells harbor HTLV-I provirus. It requires a
considerable level of viral replication to obtain such a proviral
load. During mitotic division, proviruses in an HTLV-I infected
T cell are replicated by the host DNA polymerase and the rate
of mutation is very low. In contrast, the action of viral reverse
transcriptase is error-prone and accumulates sequence variations
rapidly. HTLV-I exhibits an extraordinary genetic stability, while
HIV-1 is known to be highly mutative. This apparent discrepancy
between a high proviral load and low genetic variability suggests
that mitotic transmission may play a crucial role in the persistence
of the HTLV-I infection. For more detailed discussions on the
biology of the HTLV-I infection and transmission, we refer the
reader to selected medical studies [3, 4, 10, 21, 46] and books on
modeling viral dynamics [49, 65].
We will derive a mathematical model to describe HTLV-I infec-
tion among a population of CD4+ T cells based on biological
138 Special Topics
assumptions, and investigate the role of mitotic transmission in

viral persistence.
5.2.2 An In-host Model for HTLV-I Infection

To model the HTLV-I infection of CD4+ T cells, we partition the
CD4+ T-cell population into uninfected and infected classes. Let
x(t), y(t) denote the number of uninfected and infected cells at
time t, respectively. We assume that the growth of T cells through
mitotic division obeys the law of logistic growth, and is described
by ν1 x(t)[1 − (x(t) + y(t))/K], where ν1 is the growth rate con-
stant and K is the level at which mitotic division of CD4+ T
cells stops. Infected cells retain most of the cellular functionalities.
Their division is assumed to be similar to that of the uninfected
cells: ν2 y(t)[1 − (x(t) + y(t))/K], with a growth rate constant ν2 .
The horizontal transmission of HTLV-I occurs through cell-to-
cell contact between infected and uninfected cells. Since we are
modeling the infection process in the peripheral blood, where the
cells are sufficiently mixed, a bilinear incidence form βx(t)y(t) is
commonly used, where β is the transmission coefficient. Newly
infected CD4+ T cells face a strong immune response by antibod-
ies and cytotoxic lymphocytes. We crudely incorporate the effects
of immune responses into the model by assuming that only a frac-
tion σ of cells newly infected by direct contact escape the immune
system actions. Here 0 ≤ σ ≤ 1. We assume that the body gener-
ates CD4+ T cells at a constant rate λ and newly generated cells
are uninfected. The removal rate of uninfected CD4+ T cells is a
constant μ1 , and may include loss due to natural death and from
other sources. The removal rate for infected cells, μ2 , may include
loss due to natural causes and immune responses.
The model is described by the following system of differential
equations:

x+y
x = λ + ν1 x 1 − − μ1 x − βxy

K (5.11)
x+y
y = σβxy + ν2 y 1 − − μ2 y.
K
Adding the two equations and using N = x + y, we obtain

x+y
(x + y) ≤ λ + ν(x + y) 1 − − μ(x + y),
K
where ν = max{ν1 , ν2 } and μ = min{μ1 , μ2 }. It follows that
lim sup(x(t) + y(t)) ≤ N ,

t→∞
where N = N is the positive root of the quadratic equation

ν 2
λ + (ν − μ)N − N = 0.
K
Thus a feasible region for (5.11) is
Γ = {(x, y) ∈ R2+ | x + y ≤ N }. (5.12)
It can be shown that Γ is positively invariant with respect to

(5.11).
We present a detailed mathematical analysis of the bifurca-
tion and global dynamics of (5.11) in Γ. Since the system is
2-dimensional, the phase-plane analysis in Chapter 3 can be
applied for the mathematical analysis. We will use graphical
method to demonstrate the occurrence of the backward bifurca-
tion: multiple stable equilibria exist for an open set of parameter
values when the basic reproduction number is below one.
5.2.3 Equilibria and Backward Bifurcation

Bifurcation analysis can be done by first examining the changes
in the number of equilibria as we vary a model parameter. An
equilibrium (x, y) of model (5.11) satisfies

x+y
0 = λ + ν1 x 1 − − μ1 x − βxy

K (5.13)
x+y
0 = σβxy + ν2 y 1 − − μ2 y.
K
140 Special Topics
The infection-free equilibrium P0 = (x0 , 0) exists for all parameter

values, where x0 > 0 is the positive root of the polynomial
ν1 2
f1 (x) = λ + (ν1 − μ1 )x − x. (5.14)
K
A chronic-infection equilibrium P = (x, y) satisfies (5.13) with
y > 0. From the second equation of (5.13) we obtain

K ν2
y= (σβ − )x + (ν2 − μ2 ) . (5.15)
ν2 K
Substituting (5.15) into the first equation of (5.13), we conclude
that a chronic-infection equilibrium satisfies
f1 (x) = f2 (x),
where f1 is defined in (5.14) and

1 ν1
f2 (x) = + β [(σKβ − ν2 ) x + K(ν2 − μ2 )] x. (5.16)
ν2 K
The number of chronic-infection equilibria can be analyzed geo-
metrically by examining intersections of the graphs of f1 and f2 .
System (5.11) can have more than one chronic-infection equilib-
rium only if f2 is concave down and has a positive root. This
happens if and only if the following condition holds:
σKβ < ν2 and μ2 < ν2 . (5.17)
Since our primary interest is to explore the occurrence of back-

ward bifurcation in system (5.11), we will assume condition (5.17)
holds throughout the section. Evidence from medical research lit-
erature can be used to show that the parameter range defined by
(5.17) is biologically sound.
To ensure that the graphs of f1 and f2 intersect, we require the
following technical condition:
2
μ2 ν1
(ν1 + βK)2 1 − > (ν1 − μ1 )2 + 4λ , (5.18)
ν2 K
which is derived from the condition f2 (x0 ) > f1 (x0 ), see
Figure 5.2 (d).
Define
4λβν22 − [(ν1 − μ1 )ν2 − (ν1 + Kβ)(ν2 − μ2 )]2
σ0 = (5.19)
4λβν2 (ν1 + Kβ)
and
ν2 (ν2 − μ2 )
σc = − . (5.20)
Kβ βx0
As parameter σ varies, the number of chronic equilibria changes
and the changes can be summarized in the following four cases:
(1) If 0 < σ < σ0 , the graphs of f1 and f2 do not intersect and

there is no chronic-infection equilibrium. See Figure 5.2 (a).
(2) If σ = σ0 , the graphs of f1 and f2 are tangent and there is
a unique chronic-infection equilibrium. See Figure 5.2 (b).
(3) If σ0 < σ < σc , there are two chronic-infection equilibria,
P∗ = (x∗ , y∗ ) and P ∗ = (x∗ , y ∗ ), x∗ < x∗ . These are the two
intersections of graphs of f1 and f2 in the first quadrant.
See Figure 5.2 (c).
(4) If σ ≥ σc there is a unique chronic-infection equilibrium
P∗ = (x∗ , y∗ ) in the feasible region Γ. In this case, graphs
of f1 and f2 have only one intersection in the first quad-
rant. The value σc is such that x∗ = x0 . Also the condition
(5.18) implies that |f1 (x0 )| < |f2 (x0 )| when σ = σc , so that
the graph of f2 is above that of f1 to the left of and close
to x0 . See Figure 5.2 (d).
142 Special Topics
(a) 1.2
f (b) 1.2 f
1 1
0.8 0.8 x* = x*
f1
0.6 0.6
0.4
f2 0.4
0.2 0.2
x0 x0
-0.5 0.5 1 1.5 2 2.5 3 3.5 x -0.5 0.5 1 1.5 2 2.5 3 3.5 x
-0.2 -0.2
(c) 1.2 f (d) 1.2 f

1 x* x*
1
0.8
0.8
0.6 0.6
0.4 0.4
0.2 0.2
x* x0 x0 = x*
-0.5 0.5 1 1.5 2 2.5 3 3.5
x -0.5 0.5 1 1.5 2 2.5 3 3.5
x
-0.2 -0.2
Figure 5.2: A geometric analysis of changes in the number of equi-

libria as parameter σ is varied. (a) 0 < σ < σ0 , (b) σ = σ0 , (c)
σ0 < σ < σc , and (d) σ ≥ σc .
Results of the preceding analysis are summarized in the next

result. The corresponding bifurcation diagram is shown in
Figure 5.3.
Theorem 5.2.1 System (5.11) always has the infection-free equi-
librium P0 = (x0 , 0). Assume that conditions (5.17) and (5.18) are
satisfied. Then the number of chronic-infection equilibria is deter-
mined by σ. More specifically,
(1) If 0 < σ < σ0 , there are no chronic-infection equilibria.
(2) If σ = σ0 , there is a unique chronic-infection equilibrium.
(3) If σ0 < σ < σc , there are two chronic-infection equilibria P∗
and P ∗ .
(4) If σ ≥ σc , there is a unique chronic-infection equilibrium P∗ .
Similar to the basic reproduction number for epidemics, the

basic reproduction number for HTLV-I infection is the average
number of secondary infections caused by a single infected CD4+
T cell introduced in an entirely susceptible CD4+ T-cell popula-

tion, over its entire infectious period 1/μ2 . For model (5.11), the
basic reproduction number R0 is given as follows:

1 x0
R0 = R0 (σ) = σβx0 + ν2 1 − . (5.21)
μ2 K
Here we emphasize the dependence of R0 (σ) on the HTLV-I
parameter σ. The first term in the brackets describes the
per-unit-time secondary infections through cell-to-cell contact,
and the second term those through mitotic division. From its
definition, we see that R0 (σ) is an increasing function of σ.
Furthermore, R0 (σc ) = 1 by (5.20) and (5.21).
y*
y*
0 σ0 σc σ
Figure 5.3: Bifurcation diagram for backward bifurcation.
The bifurcation diagram in Figure 5.3 shows the standard fea-

tures of a backward bifurcation: multiple chronic-infection equilib-
ria exist when the basic reproduction number is below unity. Such
a bifurcation possesses certain catastrophic behaviors:
(1) When R0 increases through 1, the number of infected cells

has a sudden jump. This may result in a sudden explosion
of the infected cell population.
(2) When R0 decreases through 1, the level of chronic-infection
remains high.
144 Special Topics
This is in sharp contrast to the standard forward bifurcation in

virus-host and epidemic models. A serious complication associated
with a backward bifurcation is the following: lowering the basic
production number R0 below unity may no longer be a viable
control measure, hence different prevention and control measures
may have to be considered.
5.2.4 Stability of Equilibria

Proposition 5.2.2 The infection-free equilibrium P0 = (x0 , 0)
is locally asymptotically stable if R0 (σ) < 1 and is unstable if
R0 (σ) > 1.
Proof. The Jacobian matrix of (5.11) at P0 = (x0 , 0) is

x0 ν 1 x0
ν1 1 − − − μ1 − ν1Kx0 − βx0
J(P0 ) = K K .
0 μ2 (R0 (σ) − 1)

One of the eigenvalues is ν1 1 − xK0 − ν1Kx0 − μ1 = f1 (x0 ) < 0, by
(5.14) and the graph of f1 . The other is μ2 (R0 (σ) − 1), which has
the same sign as R0 (σ) − 1. This establishes the proposition.
Proposition 5.2.3
(1) If σ0 < σ < σc , then P∗ is locally asymptotically stable
whereas P ∗ is a saddle.
(2) If σ > σc , then P∗ is locally asymptotically stable.
Proof. The Jacobian matrix J(P ) of (5.11) at a chronic-infection

equilibrium point P = (x, y) is

x+y
ν1 (1 − − νK1 x − μ1 − βy
) − νK1 x − βx
K .
ν2
(σβ − K )y σβx + ν2 (1 − x+y K
)− ν2 y
K
− μ2
By the equilibrium equation (5.13),

x+y ν1 x
tr J(P ) = ν1 1 − K − K − μ1 − βy

x+y ν2 y ν1 x ν2 y
+ σβx + ν2 1 − K − K − μ2 = − λx − K − K < 0.
Thus tr J(P ) < 0 for both P = P∗ and P = P ∗ . Also, using the

definition of f1 and f2 in (5.14) and (5.16), respectively, we obtain

2 y ν1
det J(P ) = f1 (x) − νK1 + β y − νK + K + β σβ − νK2 xy

= y − νK2 f1 (x) + νK2 νK1 + β y + νK1 + β σβ − νK2 x

= y νK2 f1 (x) + νK1 + β (ν2 − μ2 ) + 2 νK1 + β σβ − νK2 x
= y νK2 (f2 (x) − f1 (x)).
Note that f2 (x) − f1 (x) > 0 when x = x∗ and f2 (x) − f1 (x) < 0
when x = x∗ . We know that P∗ is locally asymptotically stable
whenever it exists, and P ∗ is a saddle whenever it exists. This
establishes Proposition 5.2.3.
The stability properties of equilibria are indicated in the bifur-
cation diagram in Figure 5.3, where solid lines indicate stable
equilibria, and dotted lines indicate unstable equilibria. Note that
R0 = R0 (σ) increases with σ and R0 (σc ) = 1. Thus σ0 < σ <
σc ⇐⇒ R0 (σ0 ) < R0 (σ) < 1, and σ > σc ⇐⇒ R0 (σ) > 1. We
see that R0 (σ) still behaves as a threshold parameter in that if
R0 (σ) < 1, the infection-free equilibrium P0 is stable, whereas if
R0 (σ) > 1, P0 becomes unstable and a unique chronic-infection
equilibrium P∗ is stable. However, the peculiarity of the back-
ward bifurcation as shown in Figure 5.3 is that a stable chronic-
infection equilibrium coexists with the stable infection-free
equilibrium P0 when R0 (σ) is in the parameter range R0 (σ0 ) <
R0 (σ) < 1, or equivalently when σ0 < σ < σc .
5.2.5 Global Dynamics

The coexisting local attractors P0 and P ∗ established by the sta-
bility analysis in the previous section and their basins of attrac-
tion are further described in the next result, which establishes the
global dynamics of (5.11).
Theorem 5.2.4 Assume that conditions (5.17) and (5.18) are
satisfied. Then
(1) When 0 < R0 < R0 (σ0 ), the infection-free equilibrium P0 is

globally asymptotically stable in Γ;
146 Special Topics
(2) When R0 (σ0 ) < R0 < 1, system (5.11) has two attractors
in Γ, the infection-free equilibrium P0 and the chronic-
◦
infection P∗ . Their basins of attraction in Γ are separated
by the stable manifolds of the saddle point P ∗ ;
(3) When R0 > 1, the unique chronic-infection equilibrium P∗
◦
is globally asymptotically stable in Γ.
◦
Proof. We first rule out periodic orbits in Γ using Dulac’s criteria
(Corollary 3.6.6 in Chapter 3). We choose a Dulac multiplier
α(x, y) = 1/xy. Let (P (x, y), Q(x, y)) denote the right-hand side
of (5.11). We have

∂(αP ) ∂(αQ) λ ν1 ν2
+ =− 2 + + < 0,
∂x ∂y x y Ky Kx
for all x > 0, y > 0. Thus (5.11) has no periodic orbits in Γ. By the
Poincaré–Bendixson Theorem (Theorem 3.6.2, Chapter 3), we
conclude that all solutions in Γ converge to a single equilibrium.
This establishes the claims in the theorem.
A phase portrait of (5.11) for case (2) of Theorem 5.2.4 is
numerically generated using Mathematica and shown in
Figure 5.4. Three equilibria are marked as dots, with P0 on the
x-axis and P∗ sitting between P ∗ and P0 . The heteroclinic orbits
from P∗ to P ∗ and P0 are clearly identifiable, as are the basin
boundaries, which are the stable manifolds of the saddle point
P∗ . In this case, R0 (σ) < 1, but the fate of an initial infection
critically depends on the initial conditions. If the initial point
lies in the basin of attraction of P0 , then the infection will clear,
whereas the infection will persist if the initial point lies in the
basin of attraction of P∗ . This behavior does not occur in virus-
host models that have only forward bifurcation.
10
6 P*
P*
2
P0
x
2.5 5 7.5 10 12.5 15 17.5 20
Figure 5.4: Phase portrait showing the coexistence of two stable

equilibria when R0 (σ0 ) < R(σ) < 1.
5.2.6 Simulation Results

We would like to interpret the mathematical results in previous
sections in the biological context of HTLV-I infection.
Clinical evidence has led to the hypothesis that HTLV-I infec-
tion consists of two steps: a transient phase of reverse transcrip-
tion and a phase of persistent multiplication of infected CD4+ T
cells. When an infected T cell multiplies, proviruses can be passed
to the genome of the daughter cells as a form of vertical transmis-
sion. This two-step process is proposed as an explanation for the
observed high proviral load and low genetic variability in HTLV-I
infected T cells [46].
Model (5.11) for the infection of CD4+ T cells by HTLV-I is
derived based on the biological assumptions of the two-step pro-
cess hypothesis. The model incorporates both horizontal transmis-
sion through cell-to-cell contact and vertical transmission through
mitotic division of infected T cells. We also assume that a fraction
σ of the infected cells survives the immune system attack after the
error-prone reverse transcription process. The genetic stability of
HTLV-I suggests that the fraction σ should be very low (σ 1).
Also, the high proviral load in HTLV-I infection suggests that the
rate of mitotic division should be high (ν2 > μ2 ). The bifurca-
tion diagram of model (5.11) predicts persistent infection for an
148 Special Topics
extended range of the basic reproduction number R0 > R0 (σ0 ).

This verifies the hypothesis on the two-step process.
What might come as a surprise is that the same bifurcation
diagram shows that the model undergoes a backward bifurcation
as σ increases: a stable chronic-infection equilibrium P∗ exists
when the basic reproduction number R0 (σ) is below unity for an
open range of parameter values. The global dynamics for the cor-
responding parameter range show that P∗ and the infection-free
equilibrium P0 are coexisting attractors whose basins of attraction
partition the feasible region. Even when the basic reproduction
number R0 (σ) < 1, whether an infection persists or dies out crit-
ically depends on if the initial point lies in the basin of attraction
of P∗ or that of P0 , respectively. This clearly demonstrates that
the catastrophic behavior accompanies the backward bifurcation.
Clinical data in the literature can be used to show that
the parameter region for backward bifurcation defined by (5.17)
and (5.18) is realistic for human HTLV-I infection. According
to [50], the human body produces CD4+ T cells at a rate of
λ = 25 cells/mm3 and CD4+ T cells have a natural death rate
μ2 ≈ μ1 ≈ 0.03 day−1 . In the absence of HTLV-I infection, the
number of CD4+ T cells is expected to be constant and has a
normal value around 1000/mm3 ([50]), and a carrying capac-
ity constant K = 1150/mm3 . This gives us x0 ≈ 1000/mm3 , or
f1 (1000) ≈ 0. Using (5.14) we can then estimate the prolifera-
tion constant for the CD4+ T cells as ν2 ≈ ν1 ≈ 0.038 day−1 . To
get an estimate of β, we use the data in Hoshino et al. [31] on
10 cell lines derived from ATL patients. It was reported in [31]
that in a culture of 5 × 106 cells/ml, within 243 days 47% of
the total cell population showed significant amounts of retro-
viral genetic materials. This information allows a rough esti-
mate of β ≈ 1.03 × 10−3 mm3 /cells/day. One can verify that, with
these parameter values, condition (5.18) holds, and condition
(5.18) gives a range (0, 0.032) of σ for backward bifurcation to
occur. Furthermore, σc = 0.024, σ0 = 0.00007, and R( σ0 ) = 0.169.
Therefore, infection is able to persist at steady state if R0 > 0.169.
Numerical simulations of the model using these parameter values
are carried out using Mathematica, see Figure 5.5. The simula-
tions show that persistent infection will produce a proviral load
in the range of 10–20%, which is consistent with the clinical data.
(a) (b)
1000 800
x(t) 700
800 600
600 500
400
y(t)
400 300
200
200 x(t)
y(t) 100
t t
200 400 600 800 200 400 600 800
(c)
800
700
600
500
x(t)
400
300
200
y(t)
100
t
200 400 600 800
Figure 5.5: Mathematica simulations with parameter values μ1 =

μ2 = 0.03, ν1 = ν2 = 0.038, λ = 25, K = 1150, β = 0.00103, x0 =
1000. (a) and (b) show that, when σ = 0.015 and R0 = 0.68,
the fate of the infection depends on the initial condition due to
the bistability. (c) shows that, when σ = 0.03 and R0 = 1.17,
infection persists even when y(0) = 50. The persistent infection
shows a proviral load level of 16%.
5.2.7 Biological Interpretations

Several biological mechanisms have been shown to lead to back-
ward bifurcation in epidemic models, see [12, 16, 22, 23, 35, 43,
58]. These include imperfect immunity, a vaccine that is leaky,
and behavioral responses to perceived disease risk. In many of
these models, backward bifurcation occurs due to an asymmetry
among different contact groups, multiple routes for transmission,
150 Special Topics
or density dependence in disease incidence. The results in this

section indicate that backward bifurcation can be a byproduct of
the two-step process of HTLV-I infection.
In a standard forward bifurcation, when R0 passes through 1,
the level of chronic infection remains low. In contrast, as demon-
strated in the bifurcation diagram in Figure 5.3, a backward bifur-
cation results in the following catastrophic effects:
(1) When R0 increases through 1, the infected T-cell population

may experience a sudden explosion.
(2) When R0 decreases through 1, the level of chronic-infection
remains high.
As a result, the standard infection-control measure of lowering

R0 to below 1 is no longer viable; R0 needs to be below R0 (σ0 )
to achieve infection control, which may be very difficult. Other
infection-control measures need to be investigated. Based on our
model and analysis, such measures may include:
(1) Lowering the level of chronic infection y∗ to safe levels;
(2) Increasing the value R0 (σ0 ) to be close to 1, thus reducing
the parameter range for backward bifurcation to occur.
(3) Identifying basin of attractions and basin boundaries as
shown in Figure 5.4. This may help to design tests to screen
for development of chronic infection.
In conclusion, outcomes of modeling studies in this section

suggest that backward bifurcation may be intrinsic to HTLV-I
infection dynamics.
Bibliography
1. H.I. Abrash, C.M. Gulberg, Studies concerning affinity. J. Chem. Ed.

63, 1044–1047 (1986)
2. R.M. Anderson, R.M. May, Infectious Diseases of Humans: Dynamics
and Control (Oxford University Press, Oxford, 1991)
3. C.R.M. Bangham, The immune response to HTLV-I. Curr. Opin.
Immunol. 12, 397–402 (2000)
4. C.R.M. Bangham, S.E. Hall, K.J.M. Jeffery, A.M. Vine, A. Witkover,
M.A. Nowak, K. Usuku, M. Osame, Genetic control and dynamics of the
cellular immune response to the human T-cell leukemia virus, HTLV-I.
Philos. Trans. R. Soc. Lond. B Biol. Sci. 354, 691–700 (1999)
5. M. Begon, J.L. Harper, C.R. Townsend, Ecology: Individuals, Popula-
tions, and Communities, 3rd edn. (Blackwell Science, Cambridge, 1996)
6. M. Begon, M. Bennett, R.G. Bowers, N.P. French, S.M. Hazel, J. Turner,
A clarification of transmission terms in host-microparasite models: num-
bers, densities and areas. Epidemiol. Infect. 129, 147–153 (2002)
7. F. Brauer, C. Castillo-Chavez, Mathematical Models in Population Biol-
ogy and Epidemiology (Springer, New York, 2001)
8. S. Busenberg, K.L. Cooke, Vertically Transmitted Diseases, Biomathe-
matics, vol. 23 (Springer, Berlin, 1993)
9. G.J. Butler, P. Waltman, Persistence in dynamical systems. Proc. Am.
Math. Soc. 96, 425–430 (1986)
10. A.J. Cann, I.S.Y. Chen, Human T-cell leukemia virus types I and II, in
Fields Virology, 3rd edn., ed. by B.N. Fieds, D.M. Knipe, P.M. Howley,
et al. (Lippincott-Raven Publishers, Philadelphia, 1996), pp. 1849–1880
11. E.A. Coddington, N. Levinson, Theory of Ordinary Differential Equa-
tions (McGraw-Hill Education, New York, 1985)
12. R. de Boer, M.Y. Li, Density dependence in disease incidence and its
impacts on transmission dynamics. Can. Appl. Math. Q. 19, 195–218
(2011)
https://doi.org/10.1007/978-3-319-72122-4
152 Bibliography
13. M.C.M. de Jong, O. Diekmann, H. Heesterbeek, How does transmis-

sion of infection depend on population size? in Epidemic Models: Their
Structure and Relation to Data, ed. by D. Mollison, Publications of the
Newton Institute, vol. 5 (Cambridge University, Cambridge, 1995), pp.
84–94
14. O. Diekmann, J.A.P. Heesterbeek, Mathematical Epidemiology of Infec-
tious Diseases (Wiley, New York, 2000)
15. O. Diekmann, J.A.P. Heesterbeek, J.A.J. Metz, On the definition and
computation of the basic reproduction ratio R0 in models for infec-
tious diseases in heterogeneous populations. J. Math. Biol. 28, 365–382
(1990)
16. J. Dushoff, W. Huang, C. Castillo-Chávez, Backwards bifurcations and
catastrophe in simple models of fatal diseases. J. Math. Biol. 36, 227–
248 (1998)
17. H.I. Freedman, M.X. Tang, S.G. Ruan, Uniform persistence and flows
near a closed positively invariant set. J. Dyn. Differ. Equ. 6, 583–600
(1994)
18. L.Q. Gao, H.W. Hethcote, Disease transmission models with density
dependent demographics. J. Math. Biol. 30, 717–731 (1992)
19. H. Gomez-Acevedo, M.Y. Li, Backward bifurcation in a model for HTLV-
I infection of CD4+ T cells. Bull. Math. Biol. 67, 101–114 (2005)
20. H. Gomez-Acevedo, M.Y. Li, S. Jacobson, Multi-stability in a model for
CTL response to HTLV-I infection and its implications to HAM/TSP
development and prevention. Bull. Math. Biol. 72, 681–696 (2010)
21. C. Grant, K. Barmak, T. Alefantis, J. Yao, S. Jacobson, B. Wigdahl,
Human T cell leukemia virus type I and neurologic disease: Events in
bone marrow, peripheral blood, and central nervous system during nor-
mal immune surveillance and neuroinflammation. J. Cellular Phys. 190,
133–159 (2002)
22. D. Greenhalgh, O. Diekmann, M.C.M. de Jong, Subcritical endemic
steady states in mathematical models for animal infections with incom-
plete immunity. Math. Biosci. 165, 1–25 (2000)
23. K.P. Hadeler, P. van den Driessche, Backward bifurcation in epidemic
control. Math. Biosci. 146, 15–35 (1997)
24. J.K. Hale, Ordinary Differential Equations (Dover Publications, New
York, 2009)
25. P. Hartman, Ordinary Differential Equations, Classics in Applied Mathe-
matics, Society for Industrial and Applied Mathematics, 2nd edn. (2002)
26. J.A.P. Heesterbeek, The law of mass-action in epidemiology: a historical
perspective, in Ecological Paradigms Lost: Routes of Theory Change,
ed. by B.E. Beisner, (Elsevier Academic Press, Amsterdam, 2005), pp.
81–104
27. H.W. Hethcote, The mathematics of infectious diseases. SIAM Rev. 42,
599–653 (2000)
Bibliography 153
28. M.W. Hirsch, Systems of differential equations which are competitive or

cooperative IV: structural stability in three dimensional systems. SIAM
J. Math. Anal. 21, 1225–1234 (1990)
29. J. Hofbauer, J. So, Uniform persistence and repellers for maps. Proc.
Am. Math. Soc. 107, 1137–1142 (1989)
30. C.S. Holling, Some characteristics of simple types of predation and par-
asitism. Can. Entomol. 91, 385–398 (1959)
31. H. Hoshino, H. Esumi, M. Miwa, M. Shimoyama, K. Minato, K. Tobinai
et al., Establishment and characterization of 10 cell lines derived from
patients with adult T-cell leukemia. Proc. Nat. Acad. Sci. USA 80,
6061–6065 (1983)
32. M. Iannelli, Mathematical Theory of Age-Structured Population Dynam-
ics, Appl. Math. Monogr. C.N.R., vol. 7, Giardini Editori e Stampatori,
Pisa (1995)
33. H. Inaba, Threshold and stability results for an age-structured epidemic
model. J. Math. Biol. 28, 411–434 (1990)
34. A. Korobeinikov, P.K. Maini, A Lyapunov function and global proper-
ties for SIR and SEIR epidemiological models with nonlinear incidence.
Math. Biosci. Eng. 1, 57–60 (2004)
35. C.M. Kribs-Zaleta, J. Velasco-Hernández, A simple vaccination model
with multiple endemic states. Math. Biosci. 164, 183–201 (2000)
36. A. Lajmanovich, J.A. Yorke, A deterministic model for gonorrhea in a
nonhomogeneous population. Math. Biosci 28, 221–236 (1976)
37. J. Lang, M.Y. Li, Stable and transient periodic oscillations in a math-
ematical model for CTL response to HTLV-I infection. J. Math. Biol.
65, 181–199 (2012)
38. J.P. LaSalle, The Stability of Dynamical Systems. Regional Conference
Series in Applied Mathematics (SIAM, Philadelphia, 1976)
39. M.Y. Li, A. Lim, Modelling the role of tax expression in HTLV-I persis-
tence in vivo. Bull. Math. Biol. 73, 3008–3029 (2011)
40. Y. Li, J.S. Muldowney, On Bendixson’s criterion. J. Differ. Equ. 106,
27–39 (1994)
41. E.W. Lund, Guldberg and Waage and the law of mass action. J. Chem.
Ed. 42, 548–550 (1965)
42. G. MacDonald, The Epidemiology and Control of Malaria (Oxford Uni-
versity Press, London, 1957)
43. M. Martcheva, H.R. Thieme, Progression age enhanced backward bifur-
cation in an epidemic model with super-infection. J. Math. Biol. 46,
385–424 (2003)
44. L. Michaelis, M.L. Menten, Die kinetik der invertinwirkung. Biochem. Z
49, 333–369 (1913)
45. J. Monod, The growth of bacterial cultures. Ann. Rev. Microbiol. 3,
371–394 (1949)
46. F. Mortreux, A.-S. Gabet, E. Wattel, Molecular and cellular aspects of
HTLV-I associated leukemogenesis in vivo. Leukemia 17, 26–38 (2003)
154 Bibliography
47. J.S. Muldowney, Compound matrices and ordinary differential equa-

tions. Rocky Mountain J. Math. 20, 857–872 (1990)
48. J.D. Murray, Mathematical Biology: I. An Introduction, 3rd edn.
(Springer, New York, 2002)
49. M.A. Nowak, R. May, Virus Dynamics: Mathematical Principles of
Immunology and Virology (Oxford University Press, Oxford, 2001)
50. P.W. Nelson, J.D. Murray, A.S. Perelson, A model of HIV-1 pathogenesis
that includes an intracellular delay. Math. Biosci. 163, 201–215 (2000)
51. L.C. Piccinini, G. Stampacchia, G. Vidossich, Ordinary Differential
Equations in Rn (Springer, New York, 1984)
52. R. Ross, The Prevention of Malaria (John Murray, London, 1910)
53. H.L. Smith, Monotone Dynamical Systems, An Introduction to the The-
ory of Competitive and Cooperative Systems (American Mathematical
Society, Providence, 1995)
54. H.L. Smith, Periodic orbits of competitive and cooperative systems. J.
Differ. Equ. 65, 361–373 (1986)
55. H.L. Smith, P. Waltman, The Theory of the Chemostat: Dynamics of
Microbial Competition (Cambridge University Press, Cambridge, 1995)
56. H.R. Thieme, Epidemic and demographic interaction in the spread of
potentially fatal diseases in growing populations. Math. Biosci. 111, 99–
130 (1992)
57. H.R. Thieme, Mathematics in Population Biology (Princeton University
Press, Princeton, 2003)
58. P. van den Driessche, J. Watmough, A simple SIS epidemic model with
a backward bifurcation. J. Math. Biol. 40, 525–540 (2000)
59. P. Waage, C.M. Guldberg, Studies concerning affinity, Forhandlinger:
Videnskabs - Selskabet i Christinia (Norwegian Academy of Science and
Letters): 35 (1864)
60. P. Waltman, Deterministic Threshold Models in the Theory of Epi-
demics. Lecture Notes in Biomathematics, vol. 1 (Springer, Berlin, 1974)
61. P. Waltman, A brief survey of persistence, in Delay Differential
Equations and Dynamical Systems, ed. by S. Busenberg, M. Martelli
(Springer, New York, 1991)
62. L. Wang, M.Y. Li, D. Kirschner, Mathematical analysis of the global
dynamics of a model for HTLV-I infection and ATL progression. Math.
Biosci. 179, 207–217 (2002)
63. G.F. Webb, Theory of Age Dependent Population Dynamics (Marcel
Dekker, New York, 1985)
64. P. van den Driessche, J. Watmough, Reproduction numbers and sub-
threshold endemic equilibria for compartmental models of disease trans-
mission. Math Biosci. 180, 29–48 (2002)
65. D. Wodarz, M.A. Nowak, C.R.M. Bangham, The dynamics of HTLV-I
and the CTL response. Immun. Today 20, 220–227 (1999)
66. J. Zhou, H.W. Hethcote, Population size dependent incidence in models
for diseases without immunity. J. Math. Biol. 32, 809–834 (1994)
Index
A E
Acquired immunity, 26 Endemic, 2
Age structure, 29 Epidemic, 1, 39
severity, 43
Epidemic curves, 41
B
Epidemic models
Basic reproduction number R0 ,
age structured, 29
32, 33, 43, 49, 52, 73, 129,
deterministic, 5, 6
142
in-host, 136
Bendixson’s criterion, 94
statistical, 5
Bifurcation, 50
stochastic, 6
backward, 139, 143, 149
time delayed, 18
transcritical, 56
vector–host transmission, 70
Bifurcation diagram, 55, 143
Equilibrium, 48, 53, 80
chronic-infection, 140
C disease-free, 49, 72
Contact number, 21 endemic, 49, 72
effective, 21 infection-free, 140
Critical community size, 11
F
D Final size equation, 43
Disease control First integrals, 40
quarantine, 31 Floquet Theory, 86
vaccination, 30
Disease incidence, 18
Disease outbreak, 1 I
Disease prevalence, 18 Immune period, 15
Dulac criteria, 57, 95, 146 mean, 15

https://doi.org/10.1007/978-3-319-72122-4
156 Index
Incidence form, 19 Phase-line analysis, 47, 87

bilinear, 19 Phase-plane analysis, 57, 58, 63,
nonlinear, 20 139
proportionate, 21 Poincaré–Bendixson Theorem, 91
saturation, 20 Positive invariance, 37, 91
standard, 21 Probability density function, 14
Incidence rate, 9 exponential, 14
Incubation period, 25 Probability distribution function,
Infectious period, 15, 25 13
mean, 15 exponential, 14
general, 15
Heavidide, 17
L Probability mean, 15, 18
LaSalle’s Invariance Principle, 56,
84
Latent period, 25 R
mean, 25 Recovery rate, 9, 12
Law of Mass Action, 8, 19 Residence time, 13
Least squares Routh–Hurwitz conditions, 54, 81
linear, 104
Matlab, 119, 120
S
nonlinear, 111, 115, 119
Stability
Limit set, 90
asymptotic, 48, 80
Logistic growth, 23, 88, 138
global, 56, 84
Lyapunov function, 56, 82
local, 54, 80
orbital, 85
M unstable, 48
Matrix Stability analysis
Jacobian, 54, 74, 81, 112, 115, linearization method, 81
130, 131, 144 Lyapunov function method, 82
Metzler, 74, 98, 99
T
N Threshold Theorem, 45, 50, 58,
Normal system, 107, 113 64, 73, 130
Time delays, 17
Transfer diagram, 6
O Transmission
Orbit, 90 horizontal, 27
vertical, 27, 30
Transmission coefficient, 9, 30
P
Pandemic, 2
Parameter estimation, 103 U
Periodic solution, 80 Uniform persistence, 95, 96, 133

Thesis 1

Uploaded by

Copyright:

Available Formats

Thesis 1

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Thesis 1

Uploaded by

Copyright:

Available Formats

What is the book about?

What is the book about?

What topics are covered in the book?

What topics are covered in the book?

Mathematics of Planet Earth 2

Ken Golden, The University of Utah, USA

Springer’s Mathematics of Planet Earth series provides a variety of well-

More information about this series at http://www.springer.com/series/13771

Mathematics of Planet Earth

Library of Congress Control Number: 2017960197

This Springer imprint is published by Springer Nature

Mathematical modeling has been increasingly recognized in the

author’s modeling courses and went on to write MSc and PhD

to model analysis: ﬁrst integrals and level curves, phase-line anal-

1 Important Concepts in Mathematical Modeling

2 Five Classic Epidemic Models and Their

3 Basic Mathematical Tools and Techniques . . . . . . 79

3.3 Stability Analysis Using Lyapunov Functions . . . . 82

4 Parameter Estimation and Nonlinear

5 Special Topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

1.1 Mathematical Modeling of

Partial answers may be obtained using a variety of approaches.

(1) Infectious diseases often aﬀect a large population of individ-

Mathematical modeling can provide an understanding of the

(1) Make assumptions about the disease transmission process

typically done by applying existing mathematical theories

It is important to keep in mind that a mathematical model is

(1) Due to our limited knowledge about an infectious disease,

(3) Mathematical analysis of the model may be limited by exist-

There is always a trade-oﬀ in mathematical modeling between

• Drawbacks: these models are not expected to be valid

This book is an introduction to deterministic models of infec-

1.2 Deterministic Epidemic Models:

S : susceptible hosts, I : infectious hosts, R : recovered hosts.

Then, we illustrate the transmission process schematically in a

compartments. The term “removal” includes loss of individuals

Figure 1.1: Transfer diagram for an SIR compartment model.

The goal of modeling is to track the number of hosts in each

ΔS(t) = new susceptibles + transfer from R − new infections

If we divide both sides of these equations by Δt and let Δt → 0,

ΔS(t) S(t + Δt) − S(t)

recovery, and removal. We thus have the following diﬀerential

S (t) = inﬂux of new susceptibles + transfer rate from R − incidence rate

− removal rate from S

(1) Transmission occurs horizontally through direct contact

of hosts in each compartment. In particular, the incidence

Although they may appear very restrictive, these assumptions

Figure 1.2: Transfer diagram for an simple SIR model.

Substituting all terms in (1.2) by our mathematical descrip-

S(0) = S0 > 0, I(0) = I0 > 0, R(0) = 0.

3. Rewrite equation (1.4) as

We see that it is possible to derive properties of solutions

1.4 Important Concepts in

An understanding of the assumptions behind the mathematical

1.4.1 Transfer Rates

ity of an individual entering C at time t = 0 and remaining in C

F (t) = Prob [ X ≤ t ]. (1.8)

In other words, F (t) is the probability distribution function of

Figure 1.4: A probability distribution function.