ENG2005 Lecture Book

School of Mathematics
ENG2005
Advanced Engineering
Mathematics
Lecture notes
Dr. Alina Donea, Dr. Anthony Lun, Mr. John McCloughan, Assoc. Prof. Michael Page
and the ENG2005 Development Group
Clayton Campus
Malaysia Campus
Semester 1 and 2, 2021
Australia Malaysia South Africa Italy India monash.edu/science

Contents
0.1 Assumed background knowledge and skills for ENG2005 . . . . . . . . . . . . . . . . . . . 7
1 Matrix Algebra 11
1.1 Matrix Algebra: Assumed Background Knowledge . . . . . . . . . . . . . . . . . . . . . . 12
1.1.1 Linear systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.1.2 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.1.3 Matrices - Gaussian elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.1.4 Matrices - definitions and operations . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.1.5 Matrices - special matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.1.6 Matrices - Properties of matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.1.7 Matrix inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.1.8 Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.2 Systems of Linear Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.2.1 Systems of m linear equations with n unknowns . . . . . . . . . . . . . . . . . . . . 18
1.2.2 n × n matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.2.3 m × n matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.2.4 Solution space of a linear system of equations . . . . . . . . . . . . . . . . . . . . . 21
1.2.5 Systems of linear equations - special matrices . . . . . . . . . . . . . . . . . . . . . 23
1.2.6 Systems of linear equations - LU-decomposition . . . . . . . . . . . . . . . . . . . . 25
1.3 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.3.1 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.3.2 Finding eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.3.3 Eigenvectors corresponding to real distinct eigenvalues . . . . . . . . . . . . . . . . 30
1.3.4 Eigenvectors corresponding to real repeated eigenvalues . . . . . . . . . . . . . . . 31
1.3.5 Eigenvectors corresponding to complex eigenvalues . . . . . . . . . . . . . . . . . . 33
1.4 Cartesian tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.4.1 The Stress Tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.4.2 Other second-order tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.4.3 Einstein summation convention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.4.4 Other important tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
1.4.5 Transformation of a tensor to a new coordinate system . . . . . . . . . . . . . . . . 41
2
ENG2005 Advanced Engineering Mathematics Monash University
2 Multivariable Calculus 43
2.1 Multivariable Calculus: Assumed Background Knowledge . . . . . . . . . . . . . . . . . . 44
2.1.1 Multivariable functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.1.2 Partial derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.1.3 The gradient vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.1.4 Gradient vector in spherical coordinates . . . . . . . . . . . . . . . . . . . . . . . . 46
2.1.5 Two-variable function extrema . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.2 Derivatives of multivariable functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.2.1 Partial derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.2.2 The chain rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.3 Double Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.3.1 Double Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.3.2 The Order of Integration for Double Integrals . . . . . . . . . . . . . . . . . . . . . 55
2.3.3 Applications of Double Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
2.3.4 Change of Coordinate System in Two Dimensions . . . . . . . . . . . . . . . . . . 59
2.3.5 Double Integrals in Polar Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.4 Triple Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
2.4.1 Triple integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
2.4.2 The order of integration for triple integrals . . . . . . . . . . . . . . . . . . . . . . 67
2.4.3 Applications of triple integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
2.4.4 Change of coordinate system in three-dimensions . . . . . . . . . . . . . . . . . . . 70
2.4.5 Triple integrals in cylindrical coordinates . . . . . . . . . . . . . . . . . . . . . . . 71
2.4.6 Triple integrals in spherical coordinates . . . . . . . . . . . . . . . . . . . . . . . . 73
3 Vector Calculus 76
3.1 Vector Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.1.1 Scalar Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.1.2 Vector Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.2 The Del Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.2.1 The Del Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.2.2 The Gradient of a Scalar Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.2.3 The Divergence of a Vector Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.2.4 The Laplacian Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.2.5 The Curl of a Vector Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.3 Tangent Vectors and Arc Length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.3.1 Tangent to a curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.3.2 Arc length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
3.4 Line Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
3.4.1 Line Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
3.4.2 Conservative Vector Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
3
3.4.3 Conservation of Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

3.4.4 Other Forms of Line Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
3.5 Surface Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
3.5.1 Surface integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
3.5.2 Defining a curved surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
3.5.3 Surface area of a curved surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
3.5.4 Surface integral of scalar fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
3.5.5 Surface integral of vector fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.5.6 Integrating a scalar field across a curved surface . . . . . . . . . . . . . . . . . . . 116
3.6 The Divergence Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
3.7 Stokes’ Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
3.7.1 Stokes’ Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
3.7.2 Independence of path for integration . . . . . . . . . . . . . . . . . . . . . . . . . . 122
3.7.3 Green’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4 Fourier Series 128

4.1 Periodic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
4.1.1 Periodic functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
4.1.2 Trigonometric sums . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
4.1.3 Convergence of trigonometric sums . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
4.2 Fourier Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
4.2.1 Fourier series of 2π-periodic functions . . . . . . . . . . . . . . . . . . . . . . . . . 137
4.2.2 Fourier series for functions of arbitrary period . . . . . . . . . . . . . . . . . . . . . 144
4.2.3 Convergence of a Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
4.3 Fourier Cosine and Sine Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
4.3.1 Even and odd functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
4.3.2 Even and odd functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
4.3.3 Half-range expansions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
4.4 Applications of Fourier Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
4.4.1 Forced oscillations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
4.4.2 Harmonics and energy of periodic functions . . . . . . . . . . . . . . . . . . . . . . 156
5 Ordinary Differential Equations 159

5.1 Ordinary Differential Equations: Assumed Background Knowledge . . . . . . . . . . . . . 160
5.1.1 Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
5.1.2 First order ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
5.1.3 Second order linear ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
5.2 Higher Order Homogeneous Linear ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
5.2.1 Linear independence of functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
5.2.2 Homogeneous nth -order ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
4
5.2.3 Homogeneous 2nd order linear ODEs with constant coefficients . . . . . . . . . . . 167
5.2.4 Higher order homogeneous linear ODEs with constant coefficients . . . . . . . . . . 171
5.3 Non-Homogeneous nth -Order Linear ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . 173
5.3.1 Variation of parameters - First order linear ODEs . . . . . . . . . . . . . . . . . . 173
5.3.2 Variation of parameters - Second-order linear ODEs . . . . . . . . . . . . . . . . . 174
5.3.3 Variation of parameters - nth -order linear ODEs . . . . . . . . . . . . . . . . . . . 177
5.4 Systems of Linear Ordinary Differential Equations . . . . . . . . . . . . . . . . . . . . . . 178
5.4.1 Systems of homogeneous 1st -order linear ODEs . . . . . . . . . . . . . . . . . . . . 178
5.4.2 A homogeneous system of linear ODEs to a homogeneous nth -order linear ODE . . 185
5.4.3 A homogeneous nth -order linear ODE to a homogeneous system of 1st order linear
ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
5.5 Boundary Value Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
5.5.1 Boundary value problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
5.5.2 The eigenvalue-eigenfunction problem . . . . . . . . . . . . . . . . . . . . . . . . . 189
6 Partial Differential Equations 193

6.1 Introduction to Partial Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . 194
6.1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
6.1.2 Solutions of PDEs - and how they are different to solutions of ODEs . . . . . . . . 195
6.1.3 Solutions to PDEs - determining the arbitrary function . . . . . . . . . . . . . . . 196
6.1.4 Solutions to first order PDEs - separation of variables . . . . . . . . . . . . . . . . 199
6.1.5 Solutions to second order PDEs - superposition principle . . . . . . . . . . . . . . . 200
6.2 The Advection Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
6.2.1 The advection equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
6.3 The Heat Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
6.3.1 Heat equation - temperature fixed at the ends of the rod . . . . . . . . . . . . . . . 205
6.3.2 Heat equation - perfectly insulated at the ends of the rod . . . . . . . . . . . . . . 208
6.4 Laplace’s Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
6.4.1 Two dimensional Laplace equation - boundary conditions . . . . . . . . . . . . . . 212
6.4.2 Two dimensional Laplace equation - four non-homogeneous boundary conditions . 215
6.5 Laplace’s Equation in Polar Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
6.5.1 The Laplace equation in polar coordinates . . . . . . . . . . . . . . . . . . . . . . . 216
6.5.2 Laplace’s equation on a disc—setup of problem . . . . . . . . . . . . . . . . . . . . 217
6.5.3 Laplace’s equation on a disc—separation of variables . . . . . . . . . . . . . . . . . 218
6.5.4 Laplace’s equation on a disc—general solution . . . . . . . . . . . . . . . . . . . . . 218
6.5.5 Laplace’s equation on a half disc - a boundary value problem . . . . . . . . . . . . 221
6.6 The Wave Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
6.6.1 One dimensional wave equation - D’Alembert’s solution . . . . . . . . . . . . . . . 224
6.6.2 One dimensional wave equation - separation of variables . . . . . . . . . . . . . . . 226
6.6.3 One dimensional wave equation - boundary and initial conditions . . . . . . . . . . 226
5
6.6.4 One dimensional wave equation - an example . . . . . . . . . . . . . . . . . . . . . 226

6.6.5 The one dimensional wave equation - physical interpretation . . . . . . . . . . . . . 227
6.7 Further Examples and Applications of PDEs . . . . . . . . . . . . . . . . . . . . . . . . . 229
6.7.1 Poisson’s equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
6.7.2 Helmholtz equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
6.7.3 Euler-Bernoulli beam equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
6.7.4 The heat conduction equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
6.7.5 Schrödinger’s equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
6
0.1 Assumed background knowledge and skills for ENG2005
To undertake ENG2005 you will need to have some basic mathematics knowledge, including some
simple calculus, and be competent at key algebraic skills and graphical techniques. In some cases that
material will be revised briefly before it is used in ENG2005 but more generally it is advisable to spend
time in the first week reviewing all of the material listed below, most of which has been covered in or
before the prerequisite ENG1005 Engineering Mathematics or its equivalents.
Note that this is fundamental material that you are expected to know and understand this material
prior to undertaking ENG2005 and without the assistance of electronic calculators or other aids. If
it is some time since you completed your first year studies, or you are uncertain of any of the listed
material for other reasons, it is strongly recommended that you revise these concepts immediately - for
example, using a suitable engineering mathematics textbook or any introductory book on mathematics
or calculus in the library.
Numbers, arithmetic, algebra and logic
I The concepts of natural numbers (N), integers (Z), rational numbers (Q), irrational numbers,
real numbers (R) and complex numbers (C).
I The laws of arithmetic for addition, subtraction, multiplication and division of real and complex
numbers.
I Simple set theory and notation, including set membership (∈), union (∪), intersection (∩), subsets
(⊂, ⊆) and the empty set (∅).
I The meaning of and notation used for closed and open intervals of real numbers.
I Correct use of inequalities, their manipulation and their equivalents in terms of interval notation.
I The manipulation of algebraic expressions, including correct use of brackets, expansion of prod-
ucts, simplification of expressions involving fractions and simple factorisations.
X
I Use of the ‘sigma’ ( ) notation for summations (series), and the meaning of the factorial
function f (n) = n! for n ∈ N ∪ {0}.
I An appreciation of basic logic, including correct use of the logical relations ‘and’, ‘or’ and ‘not’,
and the meaning and correct usage of the implication symbol (=⇒).
Geometry and trigonometry
I A recognition of basic geometry and terminology, including for common one-, two- and three-
dimensional coordinates systems, objects, shapes and solids.
I An understanding of circular functions; triangular geometry, trigonometry, angles and key related
results, including the sine and cosine laws, the ‘angle sum theorem’ and the Pythagorean identity.
I An appreciation of the symmetry of geometrical objects.
Functions and graphs

I The concept of a ‘function’, what is meant by its domain and range, and how to identify or
determine those. The meaning of the terms ‘independent’ and ‘dependent’ variables.
I The ability to sketch graphs√ of each of the following elementary functions’: xn for any n ∈ Z,
sin(x), cos(x) and tan(x), x for small integers, ax , loge (x) (ln(x)) and log10 (x), the absolute
n
value function |x|. Also simple polynomials including linear, quadratic and cubic functions.
I The ability to transform the graphs of the functions above with simple horizontal and vertical
scalings and translations, for example, Af (kx + b)+B for simple functions f and constant values
of A, k, b, B.
I The difference between a variable and a constant, including where the constant is not specified
as a particular numerical value (also known as a ‘parameter’).
I Understanding of how to add, subtract, multiply and divide functions, including how those
operations affect the domain of the resulting function. Recognition of a rational function, and
when it exists.
I The meaning of composition of functions, and how to both write and evaluate them.
I Understanding of the terms extremum, minimum, maximum and inflection point, and what is
meant by an ‘increasing’ and ‘decreasing’ function.
I How to solve y = f (x) algebraically when f is linear, quadratic or exponential function, including
cases where no real-value solution exists.
I The basic properties of sin(x), cos(x) and tan(x), including their definition, their symmetry
properties, the relationship between them, and their exact values when x is an integer multiple
π π
of or .
6 4
I The basic properties (or ‘laws’) of exponential and logarithmic functions, including a0 , a1 , ax+y ,
axy and their equivalents in terms of logarithms.
I An understanding of hyperbolic functions; the basic properties of sinh(x), cosh(x) and tanh(x),
including their definition, their symmetry properties, the relationship between them, including
identities involving hyperbolic functions.
Calculus
I The concept of a limit of a function, and how it may differ from the value of that function at
the corresponding point (where that exists). The determination of limits of simple functions,
including polynomials and rational functions.
I Understanding of the concept of a continuous function at a point and in a domain.
I The meaning of, and formal definition of, the derivative of a function at a point, both geometri-
cally and algebraically. An understanding of what is meant by the derivative function, and some
dy
common notations for that. An understanding that the derivative notation is not a fraction
dx
and therefore cannot be split into ‘bits’ dy and dx.
I The derivative functions of the ‘elementary functions’: xn , sin(x) , cos(x), ax , , loge (x) (ln(x))
and log10 (x), and where they do or do not exist.
I The basic properties of the derivative function, including how to calculate the derivatives of
additions, multiples, quotients and compositions of elementary functions.
8
I How to perform implicit differentiation on equations of the form g(x, y) = 0 to find the derivative
dy
.
dx
I The concept of a definite integral of a function over a given finite interval, including its properties
and its relationship to areas of some simple two-dimensional shapes.
I Understanding of the ‘indefinite integral’ and how it differs from the concept of an ‘anti-derivative
function. Anti-derivative functions of: xn , sin(x), cos(x), ax .
I How to solve an integral by method of substitution or method of integration by parts.
I The concept of an ‘improper integral’, and how to determine by mathematical rigour whether
an improper integral converges or diverges.
I Understanding of sequences and series, fundamentals of convergence/divergence, and finding
Taylor series for applications such as error analysis.
Complex numbers
I The concepts of complex numbers.

I How to graphically represent complex numbers in the Argand diagram.
I Understanding of complex number algebra operations in Cartesian form: addition, multiplication
and using the conjugate.
I The concepts of representing complex numbers in polar form; modulus and argument, multipli-
cation and division in polar form.
I The concepts of representing complex numbers in exponential form; multiplication and division
in exponential form, and the Euler formula.
I Understanding of De Moivre’s theorem and applying De Moivre’s theorem to find the roots and
powers of complex numbers.
I Understanding the relationship of trigonometric functions to hyperbolic functions via the expo-
nential form of complex numbers.
Vectors
I Understanding of the concept of a vector in two- and three-dimensional space and how to scale,
add and subtract them geometrically,
I Understanding of vector algebra, how to find the length of a vector, how to find the angle
between two vectors using the dot product, and how to find a vector which is perpendicular to
two non-parallel vectors using the cross product.
I Understanding of how to find scalar-projections and vector-projections.
I Understanding of the vector equations of lines and planes in three-dimensional space.
Linear algebra
I An appreciation of the concept of a matrix, what is meant by ‘row’ and ‘column’, and how to
identify the element of a matrix using row and column numbers.
9
I Understanding of matrix algebra operations; addition, multiplication, transpose, trace and de-
terminant.
I The concept of how to find the inverse of an n × n-matrix.

I Understanding of how to apply Gaussian elimination to a matrix equation Ax = b and what
the solution set means both algebraically and geometrically, and the concept of an ‘inconsistent
system’.
I Understanding of how to find eigenvalues of an n × n-matrix, and how to find the corresponding
eigenvectors for n-distinct real eigenvalues.
Ordinary Differential Equations
I The concept of the terms ‘ordinary differential equations’, ‘first order’, ‘second order’, ’constant
coefficient’, ’linear/non-linear’, ’homogeneous/non-homogeneous’, ‘initial condition’ and ’bound-
ary value’.
I How to solve a first order ordinary differential equations, including by separable variables and
integration factors.
I How to solve second order constant coefficient ordinary differential equations which has real
roots.
I The concept of seeking a numerical solution to a first order ordinary differential equation.
Laplace Transforms
I The formal definition of the Laplace transform.

I How to find Laplace transforms of the ‘elementary functions’: xn , sin(x) , cos(x) and ex .
I How to find the inverse of Laplace transforms.

I The shifting theorems of Laplace transforms.
I The Laplace transform of derivatives and derivatives of Laplace transforms.
I The concept of the unit step-function, the Dirac-delta function, convolution functions, and how
to use them with Laplace transforms.
I The application of Laplace transfroms to ordinary differential equations.
Multivariable calculus
I The concept of a limit of a multivariable function, and how it may differ from the value of that
function at the corresponding point (where that exists).
I Understanding of the concept of a continuous multivariable function at a point and in a domain.
I The meaning of, and formal definition of, the partial derivatives of a multivariable function at a
point, both geometrically and algebraically. An understanding of what is meant by the gradient
vector and the directional derivative, and some common notations for that.
I Understanding how to find critical points of a multivariable function, and how to classify the
critical points of a two-variable function.
10
SCHOOL OF MATHEMATICS
ENG2005 Advanced Engineering Mathematics
Chapter 1
Matrix Algebra
1 Version: Monday 22nd February, 2021
11
1.1 Matrix Algebra: Assumed Background Knowledge
The following assumed knowledge is a revision of material covered in the prerequisite ENG1005 En-
gineering Mathematics (or equivalent units) on matrix algebra that you are expected to know and
understand. At the discretion of the lecturer some of this may be briefly revised during the initial
lecture on matrix algebra. However, more generally it is advisable for you to spend time on all of the
material listed below prior to the matrix algebra lectures.
1.1.1 Linear systems
Consider a linear system of three equations with three unknowns
a11 x + a12 y + a13 z = b1

a21 x + a22 y + a23 z = b2
a31 x + a32 y + a33 z = b3
for real constants a11 , a12 , a13 , a21 , a22 , a23 , a31 , a32 , a33 , b1 , b2 , b3 . This system of linear equations can
be geometrically thought of as a collection of the three planes in R3 . The question is then, how do the
three planes intersect, if at all?
Very broadly speaking, there are three cases:
1. The three planes do not intersect.

2. The three planes may intersect along a line.
3. The three planes may intersect at one and only one point.
Example 1.1.1 Consistent and under-determined system

The linear system of equations
2x + 3y − z = 1
0x − 5y + 5z = −1
0x + 0y + 0z = 0
is a consistent and under-determined system since the third equation gives us no information
about the solutions to the system of linear equations. The second equation gives −5y + 5z = −1. If we
1
let y = t for a parameter t ∈ R then the second equation gives z = + t. Then substituting y = t and
5
1 1
z = + t into the first equation 2x + 3y − z = 1 gives x = − t. Therefore, we have all the solutions
5 5
described by
1 1
(x, y, z) = , 0, + t (−1, 1, 1) for t ∈ R
5 5
which we recognise as the vector equation of a line.
Example 1.1.2 Inconsistent system

The linear system of equations
2x + 3y − z = 1
0x + −5y + 5z = −1
0x + 0y + 0z = −2
is an inconsistent system since the third equation gives us a contradictory statement 0 = −2. This
implies that there are no values for x, y, z that solve the system of linear equations.
1.1.2 Matrices
The previous system of linear equations can be written as the matrix equation
Ax = b
where A is the 3 × 3 matrix  

a11 a12 a13
A =  a21 a22 a23 
a31 a32 a33
The matrix A is the coefficient matrix. The column vector
 
x
x= y 
z
is the solution vector representing the unknown values x, y and z. The column vector
 
b1
b =  b2 
b3
represents the values on the right hand side of the linear equations.
If b = 0 then Ax = 0 and the corresponding system of linear equations is said to be homogeneous.
The matrix equation can written as the augmented matrix

 
a11 a12 a13 b1
[ A| b] =  a21 a22 a23 b2 
a31 a32 a33 b3
1.1.3 Matrices - Gaussian elimination
To find the solution vector x, which is equivalent to solving the system of linear equations, we apply a
sequence of row reductions called Gaussian elimination.
1. Use row-operations to eliminate elements below the diagonal element.
2. Use row-operations to eliminate elements above the diagonal element.

3. If possible, re-scale each row so that each diagonal element is equal to 1.
4. The right hand side of the augmented matrix is now the solution of the system of equations.
13
In first year, you were encouraged to stop after step 1, where you would have an upper triangular
matrix and then solve the last equation for the remaining variable, then substitute this variable into the
equation directly above to find the next variable, and so on. This process is Gaussian elimination
with back-substitution.
Note that it is also possible to apply step 1 to obtain a lower triangular matrix, and then solve the first
equation for the remaining variable, then substitute that variable into the equation directly below to
find the next variable, and so. This process is Gaussian elimination with foward-substitution.
Possible row operations are:
I Swap row i with row j: Ri ←→ Rj .

I Multiply row i by a constant k: Ri = kRi .
I Addition of row i and row j to give a new row i: Ri = Ri + Rj .
1.1.4 Matrices - definitions and operations

I We have the equality of two matrices A and B
A=B
if and only if A and B have the same size, and all entries in A are equal to those entries in B,
that is, aij = bij for all i and j.
I The transpose AT of a matrix A is the matrix such that the columns and rows of A are “swapped”,
that is, aij 7→ aji for all i and j.
Example 1.1.3
The transpose of a 3 × 3-matrix A is
 T  
a11 a12 a13 a11 a21 a31
 a21 a22 a23  =  a12 a22 a32 
a31 a32 a33 a13 a23 a33
The transpose of a row vector v is

 
T v1
v1 v2 v3 =  v2 
v3
I Matrix addition
C =A+B
can be performed if A and B are the same size, that is, both matrices are of size n × m. Matrix
addition is defined element-wise cij = aij + bij for each and every i and j.
I Scalar multiplication
C = kA
is defined element-wise cij = kaij for each and every i and j, for constant k.
I Multiplication of matrices
C = AB
is defined if and only if A is an n × p matrix and B is an p × m matrix. The resulting matrix has
size n × m. The cij element is defined by the dot product of the transpose of the ith row vector
in A with the j th column vector in B.
14
Example 1.1.4
The c21 element for the matrix C which is the multiplication of a 3 × 3-matrix A by a 3 × 3-matrix
B is defined as
    
b11
 c21  =  a21 a22 a23   b21 
b31
 
T b11
= a21 a22 a23 ·  b21 
b31
   
a21 b11
=  a22  ·  b21 
a23 b31
= a21 b11 + a22 b21 + a23 b31 .
I If a row operation is applied to an augmented matrix [ A| u] to give the augmented matrix [ B| v]

then the augmented matrix [ A| u] is similar to the augmented matrix [ B| v], and is written as
[ A| u] ∼ [ B| v]. This is different to equality, they are not the same matrix, but both augmented
matrices give the same solutions x, y and z.
I The determinant of a square matrix:

a b
– If A is a 2 × 2 matrix A = then
c d
det(A) = ad − bc
– For an n × n matrix A:
1. create an (n − 1) × (n − 1) sub-matrix Sij of A by deleting ith row and j th column of A,
2. then define
1+1 1+2 1+3 1+n
det(A) = (−1) a11 det(S11 )+(−1) a12 det(S12 )+(−1) a13 det(S13 )+· · ·+(−1) a1n det(S1n )
3. if the sub-matrix Sij is a 2 × 2-matrix then use the determinant definition for 2 × 2-
matrices, otherwise return to step 1 to calculate det(Sij ).
This describes expanding the determinant along a row. We could also expand along a column.
Thus to compute det(A) we want to rewrite det(A) as a series of 2 × 2-determinants.
The determinant expanded along any row or any column will always give the same value.
Note that occasionally det(A) is written as |A|; be careful with this notation we could have
det(A) > 0, det(A) = 0 or det(A) < 0.
1.1.5 Matrices - special matrices

I The identity matrix is an n × n matrix with zeros everywhere and ones down the main diagonal,
and satisfies IA = AI = A for any n × n-matrix A.
Example 1.1.5
The 3 × 3-identity matrix is  
1 0 0
I= 0 1 0 
0 0 1
I The zero matrix is any n × m-matrix full of zeros.
I A symmetric matrix is any n × n-matrix A for which A = AT .
I A skew-symmetric matrix (or anti-symmetric matrix) is any n × n-matrix A for which A = −AT .
15
1.1.6 Matrices - Properties of matrices

I Care needs to be taken with matrix multiplication: in general AB is not equal to BA. Therefore,
the order of matrix multiplication is very important.
I If AB = BA then we say A and B are commutative.
I (AB) C = A (BC).
T
I AT = A.
T
I (AB) = B T AT .
1.1.7 Matrix inverse
Given an n × n matrix A, the inverse matrix A−1 is defined as a matrix which satisfies
A−1 A = AA−1 = I.
Note the following:
1
I A−1 does not mean .
A
I A−1 exists if and only if det(A) 6= 0.
Given the matrix equation Ax = b for an n × n coefficient matrix A, if the inverse matrix A−1 exists
then by left-hand matrix multiplication, we can find the solution vector
A−1 Ax = A−1 b
Ix = A−1 b
x = A−1 b
If A−1 does not exist then Ax = b does not have a unique solution, that is, there are infinitely many
solutions or no solution.
If the inverse exists then we can find the inverse by Gaussian elimination:
I Start with the augmented matrix [ A| I].

I Apply a sequence of row operations to obtain [I |B ]
I The matrix B is the inverse matrix A−1 . Note that you should always check that B satisfies
AB = BA = I before writing B = A−1 .
1.1.8 Eigenvalues
If A is a square matrix and v is a non-zero column vector satisfying the matrix equation
Av = λv
then we say that the matrix A has eigenvalue λ with corresponding eigenvector v.
The eigenvalues λ of an n × n matrix A are the solutions of the characteristic equation
det(A − λI) = 0
16
If A is an n × n matrix, then this equation will be a polynomial of degree n in λ. The eigenvalues may
be real distinct, real repeated or complex numbers.
If n × n matrix A has n distinct real eigenvalues then we saw in the first year lectures that we can
find the corresponding eigenvector vk to each eigenvalue λk (for k = 1, . . . , n) by solving the matrix
equation
(A − λk I) vk = 0
We also saw in first year that once we have all these n eigenvectors that the eigenvectors are orthogonal.
17
1.2 Systems of Linear Equations
In first year during the matrix algebra section we were introduced to the concept of systems of n linear
equations with n unknowns and the corresponding matrices, and the algorithmic method of Gaussian
elimination to find solutions to the system of equations. In particular, we considered 2 × 2 and 3 × 3
matrices. In this section we want to extend those concepts to systems of m linear equations with n
unknowns where m and n could be small values (such as 3 or 4) as in the examples in this section, or
very large values (such as thousands) which occurs in computational work. Furthermore, we need to
be aware that in many applications it is often the case that n 6= m, whether n < m or n > m plays an
important role in the seeking of solutions to the system of linear equations.
1.2.1 Systems of m linear equations with n unknowns
As the number of equations and unknowns in a system of linear equations increases, so does the
complexity of the algebra in finding solutions.
Consider a linear system of m equations with n unknowns x1 , x2 , . . . , xn :
a11 x1 + a12 x2 + · · · + a1n xn = b1

a21 x1 + a22 x2 + · · · + a2n xn = b2
.. .
. = ..
am1 x1 + am2 x2 + · · · + amn xn = bm
where aij and bi are real constants (i = 1, 2, . . . , m and j = 1, 2, . . . , n).
This system of linear equations can then be written as the augmented matrix form
 
a11 a12 . . . a1n b1
 a21 a22 . . . a2n b2 
[ A| b] =  .
 
.. .. 
 .. . . 
am1 am2 . . . amn bm
1.2.2 n × n matrices
In first year matrix algebra we saw a number of results for n × n-matrices that we summarise here:
Theorem Equivalent statements for n × n matrices
If A is an n × n matrix then the following statements are equivalent:
1. det(A) 6= 0
2. A−1 exists
3. Ax = 0 has only the trivial solution; x = 0

4. Ax = b has exactly one solution
When working with 3 × 3 matrices we used a geometrical interpretation to describe the system of
linear equations represented by Ax = b, where each equation represented a plane in three-dimensional
space. Then the geometric interpretation of the solution to the system of linear equations represented
by Ax = b was the manner in which the three planes intersected.
When Ax = b has exactly one solution, this corresponds to the three planes intersecting at one point
in R3 which all three planes have in common.
If det(A) = 0 then Ax = b does not have a unique solution, that is, there is either
I no solution; geometrically this could mean the planes are, for example, all parallel, or
I infinitely many solutions; geometrically all three planes intersect along one line in R3 .
When working with n × n matrices for n > 3 we can no longer try to interpret the system of linear
equations represented by Ax = b geometrically. However, the mathematical interpretation still holds
true; if det(A) 6= 0 then there is a unique solution, while if det(A) = 0 there is either no solution or
infinitely many solutions.
If a solution exists then we apply Gaussian elimination to the augmented matrix [ A| b].
1.2.3 m × n matrices
If Ax = b represents a system of m linear equations with n unknowns x1 , x2 , . . . , xn (where n 6= m) we

cannot calculate det(A) to get an idea of how many, if any, solutions there will be. However, we can
still apply Gaussian elimination to the augmented matrix [ A| b].
Definition: Row echelon form of a matrix
A matrix is in row echelon form when
I the leading non-zero coefficient of each row, called the pivot entry, has zeros below it, and
I the pivot entries of following rows are located in columns further to the right.
I any rows which have no pivot, and therefore consist entirely of zeros, must be the final rows in
the matrix.
Definition: Basic and free variables
For a matrix in row echelon form, a column containing a pivot entry is a called a pivot column.
For a matrix in row echelon form
I a variable corresponding to a pivot column is called a basic variable.
I a variable which is not a basic variable is called a free variable.
Although row echelon forms of a matrix are not unique, all row echelon forms of a matrix A have
I the same number of zero rows, and
19
I the pivot column always occurs in the same position.
Example 1.2.1
Consider the matrix A  
1 2 3
 3 7 −4 
2 9 7
By the sequence of row operations:

1
R2 = R2 − 3R1 , R3 = R3 − 2R1 , R3 = R3 − 5R2 and R3 = R3
66
we obtain a row echelon form for A:  
1 2 3
 0 1 −13 
0 0 1
The pivot columns are column 1, column 2 and column 3. Therefore, x1 , x2 and x3 are basic variables,
and there are no free variables.
Example 1.2.2
0 0 −2 0 12
 3 6 −15 9 42 
2 4 −5 6 −1

1 1
R1 ←→ R2 , R1 = R1 , R3 = R3 − 2R1 , R2 = − R2 and R3 = R3 − 5R2
3 2
we obtain a row echelon form for A:
 
1 2 −5 3 14
 0 0 1 0 −6 
0 0 0 0 1
while x2 and x4 are free variables.
Example 1.2.3
1 0 −1 0
 2 1 0 8 
 
 0 1 −2 0 
1 −1 −2 −6

1
R2 = R2 − 2R1 , R4 = R4 − R1 , R3 = R3 − R2 , R4 = R4 + R2 , R3 = − R3 and R4 = R4 − R3
4
1 0 −1 0
 0 1 2 8 
 
 0 0 1 2 
0 0 0 0
while x4 is a free variable.
20
Example 1.2.4
1 3
 2
 −1 

 3 −2 
5 −8

1 1 1
R2 = R2 −2R1 , R3 = R3 −3R1 , R4 = R4 −5R1 , R2 = − R2 , R3 = − R3 −R2 , and R4 = − R4 −R2
7 11 23
1 3
 0 1 
 
 0 0 
0 0
The pivot columns are column 1 and column 2. Therefore, x1 and x2 are basic variables and there are
no free variables.
1.2.4 Solution space of a linear system of equations
As we saw in first year, a system of three linear equations with three unknowns can have no solution,
a unique solution or infinitely many solutions. It may have been stated in first year that the solutions
together form a set of vectors, or a solution space. It then follows that the solution space may be empty,
have one vector, or infinitely many vectors respectively. This concept of a set of solution vectors can
be extended to any system of m linear equations with n unknowns.
Definition: Solution space of a linear system of equations
Let Ax = b represent a system of m linear equations with n unknowns x1 , x2 , . . . , xn . If there

exist solutions to Ax = b then the solution space is the set of all vectors x ∈ Rn which satisfy
Ax = b.
Definition: Consistent and inconsistent
Let Ax = b represent a system of m linear equations with n unknowns.

If there exist solutions to Ax = b then the system is called consistent.
If no solutions exist to Ax = b then the system is called inconsistent.
A couple of nice results which we should be aware of:
Theorem Solution space for homogeneous linear system of equations
Let Ax = 0 represent a system of m linear equations with n unknowns. If n > m then the solution
space will contain non-trivial solutions.
21
Theorem Solution space for non-homogeneous linear system of equations
Let Ax = b represent a system of m linear equations with n unknowns.
I If n < m (overdetermined system) then there exists a vector b such that there is no solution
to Ax = b, that is, the solution space is the empty set.
I If n > m (underdetermined system) then for every b the system Ax = b has either no solution
or infinitely many solutions.
Example 1.2.5
Consider the augmented matrix [ A| b]
 
1 2 3 6
 3 7 −4 6 
2 9 7 18
which row reduces to  

1 2 3 6
 0 1 −13 −12 
0 0 1 1
This is a consistent system with one solution. The solution space is
 
 1 
 1  .
1
 
Example 1.2.6
 
0 0 −2 0 12 4
 3 6 −15 9 42 6 
2 4 −5 6 −1 1

1 2 −5 3 14 2
 0 0 1 0 −6 −2 
0 0 0 0 1 7
This is a consistent system with infinitely many solutions. The solution space is
        

 x1 104 −2 −3 

 x2   0 


     1   0  


   
 x3  =  40  + s  0  + t  0  : s, t ∈ R
       


  x4   0   0   1  


 
x5 7 0 0
 
22
Example 1.2.7
 
1 3 7
 2
 −1 7 
 3 −2 10 
5 −8 −11

1 3 7
 0 1 1 
 
 0 0 0 
0 0 1
The first two rows show that the two lines have the point (x, y) = (4, 1) in common. The third row shows
that the corresponding equation is consistent with the first two equations, therefore (x, y) = (4, 1) lies
on the third line also. However, the system of all four equations is inconsistent. Therefore, the fourth
line does not pass through the point (x, y) = (4, 1). Hence, the solution space for the corresponding
system of four equations is the empty set ∅, or explicitly { }.
1.2.5 Systems of linear equations - special matrices
Definition: Sparse matrices
An n × n system of linear equations is called sparse if only a relatively small number of its matrix
elements aij are non-zero.
It would be wasteful to use general analytical or numerical methods of linear algebra on such problems,
because most of the arithmetic operations devoted to solving the set of equations or inverting the matrix
involve zero operands.
Example 1.2.8
The matrix  
0 0 2 0 6 0

 0 0 1 −3 0 0 


 2 −1 0 0 0 2 


 0 7 0 0 4 0 

 0 0 2 1 0 0 
0 0 0 8 0 1
is sparse. It only has 13 non-zero elements.
23
Example 1.2.9
The 10 × 10 matrix  
0 2 3 0 5 0 7 0 0 0

 11 0 13 0 0 0 17 0 19 0 


 0 0 23 0 0 0 0 0 29 0 


 31 0 0 0 0 0 37 0 0 0 


 41 0 43 0 0 0 47 0 0 0 


 0 0 53 0 0 0 0 0 59 0 


 61 0 0 0 0 0 67 0 0 0 


 71 0 73 0 0 0 0 0 79 0 

 0 0 83 0 0 0 0 0 89 0 
0 0 0 0 0 0 97 0 0 0
is a sparse matrix. It only has 25 non-zero elements. The sparsity is 75% while the density is 25%.
Large sparse matrices are common in attempting to find numerical solutions. There are sparse matrices
that have certain patterns that you may see in applications, such as tridiagonal matrices, banded (with
bandwidth M ) matrices and block tridiagonal matrices.
Definition: Tridiagonal matrices
A sparse system of linear equations is called tridiagonal, if it has non-zero elements only on the
diagonal, the plus one diagonal and the minus one diagonal.
Example 1.2.10
The following 5 × 5-matrix  
3 4 0 0 0

 −1 2 1 0 0 


 0 5 4 −2 0 

 0 0 3 1 2 
0 0 0 1 5
is a tridiagonal matrix.
The tridiagonal matrix is a common sparse matrix occurring in applications such as numerical methods
for solving PDEs.
Example 1.2.11
The following 10 × 10-matrix
 
2 −1 0 0 0 0 0 0 0 0
 −1 2 −1 0 0 0 0 0 0 0 
 
 0
 −1 2 −1 0 0 0 0 0 0 

 0
 0 − 2 −1 0 0 0 0 0 

 0
 0 0 −1 2 −1 0 0 0 0 

 0
 0 0 0 −1 2 −1 0 0 0 

 0
 0 0 0 0 −1 2 −1 0 0 

 0
 0 0 0 0 0 −1 2 −1 0 

 0 0 0 0 0 0 0 −1 2 −1 
0 0 0 0 0 0 0 0 −1 2
is a tridiagonal matrix set up using finite difference method to solve a PDE problem. In practice, this
would be an N × N -matrix for large N .
24
1.2.6 Systems of linear equations - LU-decomposition
Gaussian elimination is a handy method for solving Ax = b where A is an n × n-matrix for small
values of n. However, as alluded to in the previous example, in applications n = N can be large. The
computational effort for Gaussian elimination is of order N 3 , that is, in general there will something
like N 3 additions and multiplications in the process.
To put this in perspective, assume we have a computer that can perform 106 addition and multiplication
operations per second, then for N = 1000 the computational run time to apply Gaussian elimination
will take of order 1000 seconds (about 17 minutes). While, for a simple increase in the size of N to
N = 10 000, the computational run time to apply Gaussian elimination will take of order 106 seconds
(about 11.5 days!) And we have only discussed the effects of large N on computational time, we haven’t
considered the storage and round off error questions.
We are going to have to construct a list of methods that we can use for large N that reduces our
concerns about issues such as computational time. Here we introduce one method in which we factorise
the coefficient matrix of A.
Definition: LU -decomposition
The LU -decomposition of an n × n-matrix A is
A = LU
where L is a lower triangular matrix and U is an upper triangular matrix.
In many applications where linear systems appear, one needs to solve Ax = b for many different vectors
b. For instance, a structure may need to be tested under several different loads, not just one. In the
example of a truss, the loading in such a problem is usually represented by the vector b. Gaussian
elimination with pivoting is the most efficient and accurate way to solve a linear system. Most of the
work in this method is spent on the matrix A itself. If we need to solve several different systems with
the same A, and A is big, then we would like to avoid repeating the steps of Gaussian elimination on A
for every different b. This can be accomplished by the LU -decomposition, which in effect records the
steps of Gaussian elimination.
Example 1.2.12
The following A = LU is the LU -decomposition of A
    
3 1 5 1 0 0 3 1 5
 3 4 8  =  1 1 0  0 3 3 
6 11 3 2 3 1 0 0 −16
How did we find the matrices L and U ?

We are going to assume that no row swapping is necessary then starting with A, we row reduce (Rj =
Rj − αRi ) A to the upper triangular matrix U . Simultaneously, starting with the identity matrix, for
each row reduction we substitute α into the position we just made zero in the equivalent A matrix,
which will result in a lower triangular matrix L.
I Begin with matrix A and the 3 × 3-identity matrix I.

   
3 1 5 1 0 0
A =  3 4 8  and I =  0 1 0 
6 11 3 0 0 1
25
I R2 = R2 − R1 gives  
3 1 5
A∼ 0 3 3 
6 11 3
and substituting α = 1 into the identity matrix at position (2, 1),
 
1 0 0
 1 1 0 
0 0 1
I R3 = R3 − 2R1 gives    
3 1 5 3 1 5
 0 3 3 ∼ 0 3 3 
6 11 3 0 9 −7
and substituting α = 2 into the previously adjusted identity matrix at position (3, 1) gives
 
1 0 0
 1 1 0 
2 0 1
I R3 = R3 − 3R2 gives    
3 1 5 3 1 5
 0 3 3 ∼ 0 3 3 
0 9 −7 0 0 −16
and substituting α = 3 into the previously adjusted identity matrix at position (3, 2) gives
 
1 0 0
 1 1 0 
2 3 1
Therefore,    
3 1 5 1 0 0
U = 0 3 3  and L =  1 1 0 
0 0 −16 2 3 1
gives the LU -decomposition.
You should check that the matrix multiplication LU is equal to A.
Now we consider how to use the LU -decomposition of A in solving a linear system Ax = b. Finding
the solution of Ax = b, using the LU -decomposition, is a two step process. Since A = LU then we are
really solving LU x = b.
1. Write y = U x, then solve Ly = b for y by forward substitution.

2. Solve U x = y for x by back substitution.
26
Example 1.2.13
Use LU -decomposition to find the solution of the system of linear equations
3x1 + x2 + 5x3 = 10
3x1 + 4x2 + 8x3 = 19
6x1 + 11x2 + 3x3 = 31
The first step is to solve Ly = b for y

    
1 0 0 y1 10
 1 1 0   y2  =  19 
2 3 1 y3 31
Forward substitution gives y1 = 10, then y2 = 9 and then y3 = −16.

The second step is to solve U x = y for x
    
3 1 5 x1 10
 0 3 3   x2  =  9 
0 0 −16 x3 −16
Back substitution gives x3 = 1, then x2 = 2 and then x3 = 1.

Hence the solution space to the system of linear equations is
   
 x1 1 
 x2  =  2 
x3 1
 
Once the LU -decomposition of A is known, the solution of Ax = b is reduced to solving two linear
systems Ly = b and U x = y where L and U are triangular matrices. This offers a greatly simplified
solution process through forward substitution for the former and back substitution for the latter. The
LU -decomposition of an n × n matrix A only needs to be found once. Note that for n = N the
computational effort is roughly 31 N 3 . In comparison, Gaussian elimination requires N 3 computational
operations for each b.
27
1.3 Eigenvalues and Eigenvectors
In first year during the matrix algebra section we were introduced to the concept of eigenvalues and the
corresponding eigenvectors. We saw examples of n × n matrices with n real distinct eigenvalues and
how to find the corresponding eigenvectors. In the matrix algebra problem sets we also saw that an
n × n matrix could have repeated real eigenvalues or even complex eigenvalues but we did not attempt
to find the eigenvectors. In this section, we will start with a short recap of what we do know about
eigenvectors and eigenvalues. We will then proceed to find eigenvectors for repeated real eigenvalues
and complex eigenvalues.
1.3.1 Eigenvalues and Eigenvectors
Recall that when the determinant of an n × n system of equations is zero, we can only deduce that the
system has no solution or infinitely many solutions.
A homogeneous system of equations, Ax = 0, always has the trivial solution x = 0. If det(A) = 0 then
the system must have infinitely many solutions.
Definition: Eigenvalues and Eigenvectors
Let A be an n × n matrix with real entries and v be a non-zero n × 1 vector. If Av = λv then λ

is an eigenvalue of A with corresponding eigenvector v.
I It must be emphasised that an eigenvector v cannot be the zero vector 0.
I It is possible to have a zero eigenvalue.

I Eigenvalues can be real distinct, real repeated or complex scalars.
1.3.2 Finding eigenvalues
It is tempting to rewrite the equation Av = λv as (A − λ) v = 0, but this cannot possibly be correct,

since A is an n × n matrix while λ is a scalar (or to make the error obvious λ is an 1 × 1-matrix) and
therefore A − λ does not mathematically make sense.
Instead we can rewrite Av = λv as (A − λI) v = 0 where I is the n × n identity matrix. This is now a
homogeneous system of equations and we know the trivial solution v = 0 is always available, however
the definition of an eigenvector specifies that v is a non-zero vector. Thus we require an infinite number
of solutions for the homogeneous system which happens precisely when det(A − λI) = 0.
Definition: The Characteristic equation
The eigenvalues λ of an n × n matrix A are the solutions of the polynomial equation
det(A − λI) = 0
This is called the characteristic equation of A.

I If A is an n × n matrix, then the characteristic equation will be an nth -degree polynomial.
I It is possible for the roots of the polynomial to be repeated so we should not assume that λ1 , λ2 ,
. . . , λn are necessarily distinct.
I If A has real entries then the characteristic equation has real coefficients.
I If A has real entries, and a root of the characteristic equation is a complex number α + iβ then
there will be another root which is the complex conjugate α − iβ. Thus we will have two complex
eigenvalues λ1 = α + iβ and λ2 = α − iβ.
Example 1.3.1 Real distinct eigenvalues

Find the eigenvalues of the matrix  
1 2 1
A= 6 −1 0 
−1 −2 −1
The eigenvalues of A are given by the characteristic equation det(A − λI) = 0. We have
 
1−λ 2 1
det(A − λI) = det 6 −1 − λ 0 
−1 −2 −1 − λ

6 −1 − λ 1−λ 2
= det + (−1 − λ) det
−1 −2 6 −1 − λ
= λ3 + λ2 − 12λ
Therefore the characteristic equation is
λ3 + λ2 − 12λ = 0
and then the eigenvalues are λ1 = −4, λ2 = 0 and λ3 = 3.
Example 1.3.2 Real repeated eigenvalues

3 1 1
A= 1 3 1 
1 1 3
 
3−λ 1 1
det(A − λI) = det 1 3−λ 1 
1 1 3−λ

3−λ 1 1 1 1 3−λ
= (3 − λ) det − det + det
1 3−λ 1 3−λ 1 1
= λ3 − 9λ2 + 24λ − 20
λ3 − 9λ2 + 24λ − 20 = 0
and then the eigenvalues are λ1 = 2, λ2 = 2 and λ3 = 5.
29
Example 1.3.3 Complex eigenvalues

2 8 0
A =  −1 −2 0 
0 0 1
 
2−λ 8 0
det(A − λI) = det −1 −2 − λ 0 
0 0 1−λ

2−λ 8
= (1 − λ) det
−1 −2 − λ
2

= (λ − 1) λ + 4
(λ − 1) λ2 + 4 = 0

and then the eigenvalues are λ1 = 1, λ2 = 2i and λ3 = −2i.
1.3.3 Eigenvectors corresponding to real distinct eigenvalues
Example 1.3.4
We found the eigenvalues of the matrix
 
1 2 1
A= 6 −1 0 
−1 −2 −1
are λ1 = −4, λ2 = 0 and λ3 = 3. Find the corresponding eigenvectors.

The corresponding eigenvector to λ1 = −4 is given by (A + 4I) v1 = 0, that is,
 
5 2 1
 6 3 0  v1 = 0.
−1 −2 3
Row reduce the matrix

   
5 2 1 0 −8 16
 6 3 0  R1 = R1 + 5R3 :  6 3 0 
−1 −2 3 −1 −2 3
 
0 −8 16
R2 = R2 + 6R3 :  0 −9 18 
−1 −2 3
 
0 0 0
R1 = 9R1 + 8R2 :  0 −9 18 
−1 −2 3
Row 2 gives −9y + 18z = 0. Let z = s for parameter s ∈ R then y = 2s.

Thus row 3 gives −x − 2 (2s) + 3 (s) = 0, then x = −s. Therefore, letting s = 1,
 
−1
λ1 = −4, v1 =  2  .
1
30
The corresponding eigenvector to λ2 = 0 is given by (A − 0I) v2 = 0, that is,

 
1 2 1
 6 −1 0  v2 = 0.
−1 −2 −1

   
1 2 1 1 2 1
 6 −1 0  R3 = R3 + R1 :  6 −1 0 
−1 −2 −1 0 0 0
 
1 2 1
R2 = R2 − 6R1 :  0 −13 −6 
0 0 0
Row 2 gives −13y − 6z = 0. Let y = s for parameter s ∈ R then z = −13s/6.

Thus row 1 gives x + 2 (s) + − 13
6 s = 0, then x = s/6. Therefore, letting s = 6,
 
1
λ2 = 0, v2 =  6  .
−13

 
−2 2 1
 6 −4 0  v3 = 0.
−1 −2 −4

   
−2 2 1 0 6 9
 6 −4 0  R1 = R1 − 2R3 :  6 −4 0 
−1 −2 −4 −1 −2 −4
 
0 6 9
R2 = R2 + 6R3 :  0 −16 −24 
−1 −2 −4
 
0 0 0
R1 = 16R1 + 6R2 :  0 −16 −24 
−1 −2 −4
Row 2 gives −16y − 24z = 0. Let

z = s for parameter s ∈ R then y = −3s/2.
Thus row 3 gives −x − 2 − 23 s − 4 (s) = 0, then x = −s. Therefore, letting s = −2,
 
2
λ3 = 3, v3 =  3  .
−2
1.3.4 Eigenvectors corresponding to real repeated eigenvalues
If the eigenvalues of the matrix A are real and distinct, then it can be shown that the corresponding
eigenvectors are linearly independent. Recall, if the only solution to au + bv = 0 is a = b = 0, then u
and v are linearly independent.
If, however, the eigenvalues are real and repeated, it may or may not be possible to find n linearly
independent eigenvectors. Consider an example where given two repeated eigenvalues we can find two
corresponding linearly independent eigenvectors.
31
Example 1.3.5
 
3 1 1
A= 1 3 1 
1 1 3
are λ1 = 2, λ2 = 2 and λ3 = 5.
It is left as an exercise to show that the corresponding eigenvector to λ3 = 5 is
 
1
v3 =  1 
1
The corresponding eigenvector to λ1 = 2 or λ2 = 2 is given by (A − 2I) v = 0, that is,

 
1 1 1
 1 1 1  v = 0.
1 1 1
Any of the three rows gives x + y + z = 0. Let y = s and z = t for parameters s, t ∈ R then x = −s − t.
Therefore, we have    
−1 −1
v = s 1  + t 0 .
0 1
Hence, we have  
−1
λ1 = 2, v1 =  1 
0
and  
−1
λ2 = 2, v2 =  0 
1
When we have an n × n matrix A we expect A to have n eigenvalues (some of which may be repeated)
and n eigenvectors.
We define the algebraic multiplicity of an eigenvalue to be the number of times it is a root of the
characteristic equation. We define the geometric multiplicity of an eigenvalue to be the number of
linearly independent eigenvectors for that eigenvalue. In the previous example, λ = 2 has an algebraic
multiplicity of 2 since we had λ1 = 2 and λ2 = 2, and a geometric multiplicity of 2 as we were able to
find two linearly independent eigenvectors.
However, it is possible for the geometric multiplicity of an eigenvalue to be less than the algebraic multi-
plicity of the eigenvalue, in which case we need to construct the remaining eigenvector(s) corresponding
to that eigenvalue such that the geometric multiplicity and algebraic multiplicity for that eigenvalue
are the same. In this situation we must search for additional solutions. This can be done with the help
of the generalised eigenvector . This is beyond the scope of this unit.
32
1.3.5 Eigenvectors corresponding to complex eigenvalues
Theorem Complex eigenvalues and eigenvectors
Let A be an n × n matrix with real entries. If λ = α + iβ is an eigenvalue of A then the complex

conjugate λ̄ = α − iβ is an eigenvalue of A. Furthermore, if v is the corresponding eigenvector to
λ then the complex conjugate v̄ is a corresponding eigenvector to the eigenvalue λ̄.
Example 1.3.6
 
2 8 0
A =  −1 −2 0 
0 0 1
are λ1 = 1, λ2 = 2i and λ3 = −2i.
It is left as an exercise to show that the corresponding eigenvector for λ1 = 1 is

 
0
v1 =  0 
1
The corresponding eigenvector to λ2 = 2i is given by (A − (2i) I) v2 = 0, that is,

 
2 − 2i 8 0
 −1 −2 − 2i 0 
0 0 1 − 2i

   
2 − 2i 8 0 0 0 0
 −1 −2 − 2i 0  R1 = R1 + (2 − 2i) R2 :  −1 −2 − 2i 0 
0 0 1 − 2i 0 0 1 − 2i
Row 3 gives (1 − 2i) z = 0, that is, z = 0.

Row 2 gives −x − (2 + 2i) y = 0. Let y = s for parameter s ∈ R then x = (−2 − 2i) s.
Therefore, we have  
−2 − 2i
λ2 = 2i, v2 =  1 .
0
It follows that  
−2 + 2i
λ3 = −2i, v3 =  1 .
0
When we solve systems of linear first order ordinary differential equations we will sometimes find
complex eigenvectors. We will then discuss how to interpret these when searching for solutions to a
system in the real vector space rather than a complex vector space.
33
1.4 Cartesian tensors
§1.4 NOT COVERED IN 2021

In this section we will discuss mathematical objects, called tensors, that represent physical quantities.
We will concentrate our attention on three-dimensional space R3 , and on objects called Cartesian
tensors; these are tensors that are conceptually based in the Cartesian coordinate system.
Tensors are referred to by their order (or rank) which in effect corresponds to the number of subscripts
needed to specify a particular element of a tensor. (We will discuss subscripts further in the section
below about the ‘Einstein summation notation’.) For example,
I A zero-order Cartesian tensor is a scalar. A scalar is just a number. Scalars represent physical
quantities such as the temperature at any particular point in space.
I A first-order Cartesian tensor is a one-dimensional array of three numbers or a vector. A vector is
an object with magnitude (length) and direction. Many physical phenomena can be expressed by
vectors, such as an electric field, where a positively charged particle moves along the field. Note
that not all vectors are first-order Cartesian tensors.
I A second-order Cartesian tensor looks like a typical square matrix (3 × 3). Stress, strain, thermal
conductivity, magnetic susceptibility and electrical permittivity are all second-order tensors. Note
that not all 3 × 3 matrices are second-order Cartesian tensors.
I A third-order Cartesian tensor would look like a three-dimensional array: a ‘cubic lattice’ (3 ×
3 × 3). Piezoelectricity is described by a third-order tensor.
I A fourth-order Cartesian tensor is a four-dimensional array of numbers (3 × 3 × 3 × 3). The
elasticity of single crystals is described by a fourth-order tensor.
Before proceeding it is worthwhile to reflect on what we mean by a vector in these contexts, noting
that the quantitative description of physical processes cannot depend upon the choice of coordinate
system. A typical example of a vector is the relative position of some point from a given origin in
three-dimensional space. Other examples might include the instantaneous velocity of an object, or the
strength and direction of a force (also in three dimensions relative to some given coordinate system).
Each of these examples represents more than simply a set of three numbers arranged in a row or column,
as really the vector quantity is not altered by the way we might choose to measure it, based on the
precise coordinate system we happen to have chosen. In two different coordinate systems the same
position vector will be represented as two different sets of three numbers, but both refer to the same
vector.
Tensors are the same, in that they can be represented in different ways in different coordinate systems,
but essentially are the same ‘object’ and must be independent of the coordinate system. An example
of this might be how a particular type of material responds to an imposed force. In ENG2005 we
concentrate on second-order tensors which, as noted above, can look like a 3 × 3 matrix in a Cartesian
coordinate system.
In a general coordinate system things become more complicated to the extent that two types of tensors
have to be introduced. These are contravariant tensor and covariant tensor . General relativity and
differential geometry extensively use contravariant and covariant tensors, but the beauty of the Cartesian
coordinate system is that these two types of tensors coincide, and thus there is no need to distinguish
between them here.
A simple, almost trivial fact that you should already be aware of is that scalars such as temperature,
density, pressure, are unaffected by coordinate transformations. The beauty of using tensors in mathe-
matics and engineering is that an equation written in tensor form is valid in any coordinate system.
1.4.1 The Stress Tensor
A typical example, and in fact the first example, of a second-order tensor with engineering application
is the Cauchy stress tensor.
Stress is defined as force per unit area. If we take a cube of material and subject it to an arbitrary
load we can measure its deformation in various directions. Stress might have several sources: it can
be a reaction to external forces, temperature variation, or electromagnetic fields (e.g. piezoelectric
devices). Stress can also arise during material fabrication, due to micro–structural phenomena. To
control the stress (stress management), engineers use the structural geometry of the device, material
properties, thermal management, and different manufacturing techniques. These measurements will
form a second-order tensor, often known simply as ‘the stress tensor’.
The stress has a dependence on the chosen plane we analyse and the plane will have a unique normal
direction. Consider the following diagram, where the small arrows represent the normal direction to
the plane and the large arrows represent the normal and tangential components of the stress:
A body under a load (a) can have different stress configurations at a point, depending on the considered
plane (b and c). However, they must describe the same physical phenomenon and, in general, are not
in the same direction as the normal to the plane.
Therefore a proper description of the stress at a point of a body cannot be accomplished by a single
vector and the information about the chosen plane should be included. Since each point in the body is
under static equilibrium (no net force in the absence of any body forces), only nine stress components
from three planes are needed to describe the stress state at a given point P. So, the stress tensor has
order two. These nine components can be organised into the matrix or tensor:
 
σxx σxy σxz
σ =  σyx σyy σyz 
σzx σzy σzz
The units which are typically used for the stress tensor are [σ]= N/mm2 = MPa.
The subscript notation used for the nine stress components has the following meaning: first subscript
keeps track of the plane the components of stress act on, second subscript keeps track of the direction:
I The stress tensor σij is the component of stress in the i-direction on a surface with a normal in
the j-direction.
35
Augustin-Louis Cauchy realised that the force vector T has a linear relationship with the surface normal
n = n1 ex + n2 ey + n3 ez . This relationship is dependent on the stress tensor σ, which is independent of
the chosen surface and describes only the stress state at a specific point in a body, and is defined as:
   
T1 n1
 T2  = σ  n2  .
T3 n3
Intuitively, this can be seen if one imagines shrinking the following cube to a point.
If the cube is infinitesimally small, the forces across each face will be uniform. Any stress tensor may be
broken into two parts: normal and shear tensor. If the cube is to remain stationary the normal forces
on opposite faces must be equal in magnitude and opposite in direction. Similarly, the shear tractions
which would tend to rotate it must balance each other and angular momentum must be conserved.
Therefore σxy = σyx , etc.
Principal stresses
The cube can be orientated in such a way that the major stress acting on it, is normal to one of the
planes, and therefore no shear stresses are caused, only normal stresses (named also principal stresses).
This causes the stress tensor to be reduced to
 
σxx 0 0
 0 σyy 0 
0 0 σzz
Symmetry
An important property of the stress tensor is that it is symmetric; σij = σji . Since the stress tensor
is a second-order object (matrix), the symmetry eases the mathematical handling of the tensor. It is
possible to prove that a symmetric matrix has real eigenvalues and orthogonal eigenvectors.
1.4.2 Other second-order tensors
Two examples of second-order tensors which you may come across later are:
36
Viscous stress tensor
Consider a fluid with velocity v and define the second-order tensor

(
1 if i = j,
δij =
0 if i 6= j.
We will come across this later. Then the second-order tensor

∂vi ∂vj
σij = −P δij + η + . (1.1)
∂xj ∂xi
is called the stress tensor of fluid flow. It is very useful for evaluating forces on surfaces and therefore
deriving the governing Navier–Stokes equations, and for describing a variety of boundary conditions in
fluid dynamics.
The scalar function P represents the pressure which acts normal to a fluid particle,
∂vi ∂vj
while the ∂x j
+ ∂xi contributes to the internal friction within the fluid. The scalar η is the dynamic
viscosity measured in kg m−1 s−1 .
Ohm’s law
If the electric current j flowing through a material is not parallel to the electric field E applied to
the material, such as a material made up of alternating layers of a conductor and insulator, then an
alternative version of Ohm’s law is
ji = σik Ek (1.2)
where σik is the second order conductivity tensor.
1.4.3 Einstein summation convention
We can now introduce a common subscript notation (or abbreviation) that is used when describing and
manipulating tensors mathematically. Note that you will not be expected to use this in other sections
of ENG2005, unless advised otherwise, but it may arise in your subsequent engineering studies.
Example 1.4.1
Consider the following examples:
1. Recall that the dot product of two vectors u = u1 i + u2 j + u3 k and v = v1 i + v2 j + v3 k is defined

as
u · v = u1 v1 + u2 v2 + u3 v3
2. In the linear systems section we discussed linear systems of three equations with three unknowns
a11 x1 + a12 x2 + a13 x3 = b1
a21 x1 + a22 x2 + a23 x3 = b2
a31 x1 + a32 x2 + a33 x3 = b3
3. The trace of the coefficient matrix A in the linear system of equations Ax = b is
tr(A) = a11 + a22 + a33
4. The transpose of the coefficient matrix A in the linear system of equations Ax = b is

 
a11 a21 a31
AT =  a12 a22 a32 
a13 a23 a33
37
5. The determinant of the coefficient matrix A in the linear system of equations Ax = b is
det (A) = a11 (a22 a33 − a23 a32 ) − a12 (a21 a33 − a23 a31 ) − a13 (a21 a32 − a22 a31 )
Each of these examples clearly exhibit a pattern in subscripts, even if the pattern for the determinant
is not so obvious. For the moment we will examine the first four examples.
Example 1.4.2
Clearly, the first four of the previous examples could be written as:
1. The dot product of two vectors u and v is defined as

3
X
u·v = ui vi
i=1
3
X
b1 = a1j xj
j=1
3
X
b2 = a2j xj
j=1
3
X
b3 = a3j xj
j=1
or simplified even further,

3
X
bi = aij xj where i = 1, 2, 3
j=1
3. The trace of the coefficient matrix A is

3
X
tr(A) = aii
i=1
4. The transpose of the coefficient matrix A with components aij for i = 1, 2, 3 and j = 1, 2, 3 is
AT = [aji ] where i = 1, 2, 3 and j = 1, 2, 3
Observe in each example that there is a summation over the subscript that appears twice in the equation,
we shall call these dummy subscripts, while in the example of the linear system of equations the
subscript i is not repeated and can vary over the values of 1, 2, 3, we call such subscripts free subscripts.
This leads us to the following convention:
Definition: Einstein summation convention
I subscripts can take on the values 1, 2, 3.

I a subscript repeated exactly twice in any term
Pof an expression indicates a summation over that
subscript. Note that means we can drop the sign.
I no subscript should appear more than twice.
38
For each of the following examples, observe what the results mathematically are, and the number of
corresponding free subscripts:
Example 1.4.3
1. The dot product of two vectors u and v is defined as
u · v = u i vi
This is a scalar, and there are no free subscripts.
bi = aij xj
This gives a vector b, and there is one free subscript.
3. The trace of the coefficient matrix A is
tr(A) = aii
This is a scalar, and there are no free subscripts.

4. The transpose of the coefficient matrix A with components aij for i = 1, 2, 3 and j = 1, 2, 3 is
AT = aji
This is a matrix, and there are two free subscripts.

5. The expression aij bjj cjk does not make sense, note the four j subscripts. Whoever wrote this
expression might have actually meant aij bjk ckl which is
3 3
!
X X
aij bjk ckl = aij bjk ckl
j=1 k=1
3 3
!
X X
= aij bjk ckl
j=1 k=1
= A (BC)
that is, the left hand multiplication of a matrix B with matrix C, followed by the left hand
multiplication of matrix A with matrix (BC). This results in a matrix, and the expression has
two free variables i and l.
6. We could even use this subscript notation for expressions involving derivatives, such as
∂f
∂xi
∂f ∂f ∂f
Since there are no repeated subscripts this represents three entities , and . These
∂x1 ∂x2 ∂x3
are clearly the first, second and third components of the gradient vector of f , ∇f , we met in first
year. This is a vector, and there is one free variable i
df df du
7. We could try to write the chain rule = as
dx du dx
∂f ∂ui
∂ui ∂x
39
but if you write this out explicitly remembering that i = 1, 2, 3 and we have to sum over repeated
subscripts, i here, then this is actually
∂f du1 ∂f du2 ∂f du3
+ +
∂u1 dx ∂u2 dx ∂u3 dx
We have inadvertently found the chain rule of a three variable function f (u1 , u2 , u3 ) where each
variable depends upon the one variable x; u1 = u1 (x), u2 = u2 (x), u3 = u3 (x). In the multivariable
calculus section, we will meet the chain rule of an n-variable function f whose variables depend
upon m variables. The chain rule results in a scalar expression, and there are no free variables
8. We could even use this summation notation to write a single-term expression, such as
∂ui
=0
∂xi
Since the subscript i are repeated, we sum over i to give
∂u1 ∂u2 ∂u3
+ + =0
∂x1 ∂x2 ∂x3
This expression has no free subscript, and for the moment you are told that the expression
represents a scalar. We will meet this formally in the vector calculus section.
1.4.4 Other important tensors
It is also pertinent to introduce two important and very useful subscripted entities
The Kronecker delta tensor
This is a very common quantity used in conjunction with the summation notation.
Definition: Kronecker delta

(
1 if i = j
δij = (1.3)
0 if i 6= j
You could think of this as the 3 × 3-identity matrix

 
1 0 0
 0 1 0 
0 0 1
That means, for example, if we evaluate δij uj explicitly for each i
δ1j uj = δ11 u1 + δ12 u2 + δ13 u3 = u1

δ2j uj = δ21 u1 + δ22 u2 + δ23 u3 = u2
δ3j uj = δ31 u1 + δ32 u2 + δ33 u3 = u3
we see that δij uj = ui . Similarly for ajk we can show that δij ajk = aik . This is called the substitution
property.
40
Levi-Civita symbol or alternating symbol
This is a third-order tensor but it can arise in some common circumstances, including in the calculation
of determinants and vector (cross) products.
Definition: Levi-Civitia’s epsilon



 +1 if ijk = 123, 231, 312 (cyclic permutation)


ijk = −1 if ijk = 321, 132, 213 (anti-cyclic permutation)



0, otherwise.

This has 27 elements, so we will not explicitly write it out. It is presented here so you are aware of
this symbol as it can be very powerful if you are using tensor analysis. For the moment, we note that
the Levi-Civitia’s epsilon symbol allows us to express the determinant of a 3 × 3-matrix A in subscript
notation
det (A) = ijk a1i a2j a3k
Lastly, we note that the tensor subscript notation can be used to simplify the process of deriving vector
identities and to write equations such as the Navier-Stokes equations in a compact form.
Example 1.4.4
The vector product of two vectors u × v leads to a new vector
 
w1
w =  w2 
w3
 
u2 v3 − u3 v2
=  u3 v1 − u1 v3 
u1 v2 − u2 v1
Using the summation convention, the components of w are wi = ijk uj vk .
1.4.5 Transformation of a tensor to a new coordinate system
As noted in the introduction, for a tensor to have physical meanings it must be independent of the
adopted coordinate system. Sometimes for reasons of convenience, we need to represent a tensor in a
specific coordinate system. This transformation (rotation) of a tensor into a new coordinate system is
a common problem in rock mechanics and in continuum mechanics, chemical engineering, etc., and can
be done via orthogonal transformations. The behaviour of quantities under orthogonal transformations
of the coordinate system is the basis of Cartesian tensors. We want to formulate equations in such a
way that they are in-dependent of the specific coordinate system.
For example, if we return to the stress tensor of the previous section, sometimes this tensor could require
us to do a lot of calculations. To make things easier it can be rotated into another stress tensor by a
suitable change of axes.
For every stress state, we can rotate the axes, so that the only non-zero components of the stress tensor
are the ones along the diagonal. That is, there are no shear stress components, only normal stress
components. (See the link: Representing stress as a tensor)
41
A general linear transformation of coordinates is
x0i = aij xj
where aij is a transformation tensor (sometimes a matrix).

If we impose the condition that both the original and the primed reference frames are orthonormal, i.e.,
have orthogonal unit normals, then the transformation is named orthogonal.
We have not changed the stress state, and we have not moved or changed the material, we have simply
rotated the axes we are using and are looking at the stress state seen with respect to these new axes.
42
Chapter 2
Multivariable Calculus
43
2.1 Multivariable Calculus: Assumed Background Knowledge
The following assumed knowledge is a revision of material covered in the prerequisite ENG1005 En-
gineering Mathematics (or equivalent units) on multivariable calculus that you are expected to know
and understand. At the discretion of the lecturer some of this may be briefly revised during the initial
lecture on multivariable calculus. However, more generally it is advisable for you to spend time on all
of the material listed below prior to the multivariable calculus lectures.
2.1.1 Multivariable functions
Definition:
A multivariable function f (x1 , x2 , . . . , xn ) is a mapping of a subset, D, of Rn into a subset, R,

of R. The function maps each n-tuple (x1 , x2 , . . . , xn ) ∈ D to a unique real value in R.
In most applications we use two, three or four variable functions, such as u(x, t) representing the heat
distribution along a thin rod over time, or P (x, y, z, t) representing pressure in a three dimensional
space over time.
2.1.2 Partial derivatives
The first order partial derivative of f (x1 , x2 , . . . , xn ) with respect to the variable xk (for a fixed k =
1, 2, . . . , n) is

∂f f (x1 , . . . , xk + ∆xk , . . . xn ) − f (x1 , . . . , xk , . . . xn )
= lim .
∂xk ∆xk −→0 ∆xk
∂ d
Notice the use of the symbol rather than . This is to remind us that in computing this
∂xk dxk
derivative all the other n − 1 variables x1 , . . . , xk−1 , xk+1 , . . . , xn are held constant.
Given a n-variable function f (x1 , x2 , . . . , xn ) we can define the second order partial derivatives
∂2f

∂ ∂
=
∂xi ∂xj ∂xi ∂xj
for i = 1, 2, . . . , n and j = 1, 2, . . . , n.
Example 2.1.1
Given a two-variable function f (x, y) there are four second order partial derivatives of f :
∂2f ∂2f ∂2f ∂2f
2
, , and
∂x ∂x∂y ∂y∂x ∂y 2
√
The two-variable function f (x, y) = y sin x2 has first order partial derivatives
∂f √ ∂f 1
= 2x y cos x2 , = √ sin x2

∂x ∂y 2 y
and second order partial derivatives
∂2f √ √ ∂2f x ∂2f ∂2f 1
= 2 y cos x2 − 4x2 y sin x2 , = √ cos x2 = 2

2
, 2
=− 3 sin x
∂x ∂x∂y y ∂y∂x ∂y 4 (y) 2
Theorem Clairaut’s Theorem
In general , if f (x, y) is a twice-differentiable function, then the order in which its mixed partial
derivatives are calculated does not matter. Each ordering will yield the same function. For a
function of two variables this means
∂2f ∂2f
=
∂x∂y ∂y∂x
I For most multivariable functions that we use in applications and modelling, we do indeed find
∂2f ∂2f
that = . However, there are some functions for which this equality does not hold true
∂x∂y ∂y∂x
as they fail specific assumptions in the theorem alluded to above.
I Clairaut’s theorem holds true for functions of n-variables.
Let z = f (x, y) be a differentiable function. The tangent plane to z = f (x, y) at the point (x, y) =
(a, b) is given by
˜ ∂f ∂f
f (x, y) = f (a, b) + (x − a) + (y − b)
∂x (a,b) ∂y (a,b)
2.1.3 The gradient vector
For a function f (x, y, z) the gradient vector is
∂f ∂f ∂f
∇f = i+ j+ k
∂x ∂y ∂z
The directional derivative of a function f at a point (x, y, z) = (a, b, c) in the direction u is given by
Du (a, b, c) = û · (∇f (a, b, c))
Gradient vector in cylindrical coordinates
Here we have
x = r cos(θ) , y = r sin(θ) and z = z
p
where r = x2 + y 2 represents the distance from the cylinder axis to the cylindrical surface.
45
The cylindrical coordinate vectors are
er = cos(θ) i + sin(θ) j + 0k
eθ = − sin(θ) i + cos(θ) + 0k
ez = 0i + 0j + k
where er points in the direction of increasing r-values, eθ points in the direction of increasing θ-values
and ez points in the direction of increasing z-values.
The gradient vector of a function f (r, θ, z) is
∂f 1 ∂f ∂f
∇f = er + eθ + ez .
∂r r ∂θ ∂z
2.1.4 Gradient vector in spherical coordinates
Here we have
x = r sin(φ) cos(θ) , y = r sin(φ) sin(θ) and z = r cos(φ)
p
where r = x2 + y 2 + z 2 represents the distance from the origin to the spherical surface.
The spherical coordinate vectors are
er = cos(θ) sin(φ) i + sin(θ) sin(φ) j + cos(φ) k

eφ = cos(φ) cos(θ) i + sin(θ) cos(φ) j + sin(φ) k
eθ = − sin(θ) i + cos(θ) j + 0k
where er points in the direction of increasing r-values, eθ points in the direction of increasing θ-values
and eφ points in the direction of increasing φ-values.
The gradient vector of f (r, φ, θ) is
∂f 1 ∂f 1 ∂f
∇f = er + eφ + eθ .
∂r r ∂φ r sin(φ) ∂θ
2.1.5 Two-variable function extrema
If ∇f = 0 at a point P : (x, y) = (a, b) then P is a critical point for f .

Let (x, y) = (a, b) be an interior point of the domain for the function f and suppose that the first and
second partial derivatives of f exist and are continuous on some circular disc with (x, y) = (a, b) at its
46
centre and contained in the domain of f . Assume that (x, y) = (a, b) is a critical point of f .
Define 
∂ 2 f ∂ 2 f

 ∂x2 ∂y∂x (a,b)  
D(a, b) = det 

2
(a,b) 2

 ∂ f ∂ f


∂x∂y (a,b) ∂y 2 (a,b)
∂ 2 f ∂ 2 f

1. If D(a, b) > 0 and < 0 or < 0, then (x, y) = (a, b) is a local maximum.
∂x2 (a,b) ∂y 2 (a,b)
∂ 2 f ∂ 2 f

2. If D(a, b) > 0 and > 0 or > 0, then (x, y) = (a, b) is a local minimum.
∂x2 (a,b) ∂y 2 (a,b)
3. If D(a, b) < 0, then (x, y) = (a, b) is a saddle point.

4. If D(a, b) = 0, then this test is inconclusive.
47
2.2 Derivatives of multivariable functions
In first year we started to extend the basic concepts, such as convergence, limits, continuity and deriva-
tives in calculus for single variable functions to calculus of several variables. For example, we can use
Newton’s law of cooling to model the temperature T = T (t) of a cup of coffee over time t. However,
we could model the temperature over time throughout the volume of the cup of coffee, thus requiring
four independent variables; x, y, z, t. Therefore, we would have T = T (x, y, z, t). In this section we
will continue that journey into multivariable calculus.
2.2.1 Partial derivatives

df
Recall that for a function of a single variable, say f = f (x), the derivative of f is the function
dx

df f (x + ∆x) − f (x)
= lim .
dx ∆x−→0 ∆x
Recall that occasionally this derivative may be denoted as f 0 (x), which makes it explicit that the
derivative is a function of x itself.
For a function of several variables, say f = f (x, y, z, t), consider the first order partial derivatives of f
with respect to any one of the independent variables:
I The first order partial derivative of f with respect to x is the function

∂f f (x + ∆x, y, z, t) − f (x, y, z, t)
= lim
∂x ∆x−→0 ∆x
where y, z, t have all been held constant.

I The first order partial derivative of f with respect to y is the function

∂f f (x, y + ∆y, z, t) − f (x, y, z, t)
= lim
∂y ∆y−→0 ∆y
where x, z, t have all been held constant.
I The first order partial derivative of f with respect to z is the function

∂f f (x, y, z + ∆z, t) − f (x, y, z, t)
= lim
∂z ∆z−→0 ∆z
where x, y, t have all been held constant.

I The first order partial derivative of f with respect to t is the function

∂f f (x, y, z, t + ∆t) − f (x, y, z, t)
= lim
∂t ∆z−→0 ∆t
where x, y, z have all been held constant.

Example 2.2.1
Consider the four-variable function
f (x, y, z, t) = x2 + yz t

then the first order partial derivatives of f are

∂f ∂f ∂f ∂f
= 2xt, = zt, = yt, = x2 + yz
∂x ∂y ∂z ∂t
2.2.2 The chain rule
Recall that in single-variable calculus, if z = f (y) and y = g(x) are two functions of a single variable,
then the derivative of the composite function f ◦ g(x) = f (g(x)) is
dz df dy
=
dx dy dx
It is important to remember; we are not cancelling out the “dy” differentials, since derivatives are not
fractions. This is more obvious when written in Newton’s notation
z 0 (x) = f 0 (y) g 0 (x)
We can extend the chain rule to a function of n-variables, where each variable depends upon m-variables.
Definition: Chain rule
Let f (x1 , x2 , . . . , xn ) be a differentiable function dependent upon n variables, where each vari-
able xk (u1 , u2 , . . . , um ) is a differentiable function dependent upon m variables. The first order
derivative of f with respect to uk is given by the chain rule
n
∂f X ∂f ∂xi
=
∂uk i=1
∂xi ∂uk
that is,
∂f ∂f ∂x1 ∂f ∂x2 ∂f ∂xn
= + + ··· +
∂uk ∂x1 ∂uk ∂x2 ∂uk ∂xn ∂uk
I again; we are not cancelling out the “∂xi ” symbol, since derivatives are not fractions.
I if a specific function depends upon only one variable then the partial derivative is just an ordinary
∂
derivative, for example x3 (u1 ) only depends upon one variable u1 , thus we write x3 (u1 ) =
∂u1
dx3
.
du1
49
Example 2.2.2
2
Let z = f (y) = y 2 and y = g(x) = sin(x), then the composite function f ◦g(x) is z = f (g(x)) = (sin(x)) .
By chain rule the derivative of z with respect to x is
dz df dy
=
dx dy dx
d 2 d
= y sin(x)
dy dx
= (2y) (cos(x))
= 2 sin(x) cos(x) .
Now, consider a function of three independent variables w = f (x, y, z) and let x = x(t), y = y(t),
z = z(t) be functions of a single variable t. The composite function w = f (x(t) , y(t) , z(t)) = F (t) is
a function of the (single) independent variable t. Using the chain rule, the derivative of the function
w = F (t) with respect to t is
dF ∂f dx ∂f dy ∂f dz
= + + .
dt ∂x dt ∂y dt ∂z dt
Example 2.2.3
1
Let w = f (x, y, z) = xy 2 + x2 z, and let x(t) = t2 , y(t) = t, z(t) = then applying the chain rule, the
t
derivative of w with respect to t is
dw ∂f dx ∂f dy ∂f dz
= + +
dt ∂x dt ∂y dt ∂z dt
∂ 2 d ∂ 2 d ∂ 2 d 1
= xy + x2 z t2 + xy + x2 z t + xy + x2 z
∂x dt ∂y dt ∂z dt t

1
= y 2 + 2xz (2t) + (2xy) (1) + x2 − 2

t
2 1
2 2 1
(2t) + 2 t (t) (1) + t2
2

= (t) + 2 t − 2
t t
= 4t3 + 3t2 .
Alternatively, the composite function is

w = f (x(t) , y(t) , z(t))

2
2 1
2 2
= t (t) + t
t
= t4 + t3
and then the derivative of w with respect to t is
dw d 4
= t + t3
dt dt
= 4t3 + 3t2 .
The chain rule for functions of several variables can get more complicated. Consider, for example, w =
f (x, y, z), which is a function of three independent variables, and assume that x = x(u, v), y = y(u, v)
and z = z(u, v), are functions of two independent variables u and v. Hence the composite function
w = f (x(u, v) , y(u, v) , z(u, v)) = F (u, v) is a function of the independent variables u and v. Using the
chain rule, the partial derivative of the function w = F (u, v) with respect to u is
∂F ∂f ∂x ∂f ∂y ∂f ∂z
= + +
∂u ∂x ∂u ∂y ∂u ∂z ∂u
50
and the partial derivative of the function w = F (u, v) with respect to v is
∂F ∂f ∂x ∂f ∂y ∂f ∂z
= + +
∂v ∂x ∂v ∂y ∂v ∂z ∂v
Example 2.2.4
Let w = f (x, y) = x2 − y 2 , and x(r, θ) = r cos(θ) and y(r, θ) = r sin(θ). Using the chain rule, the partial
derivative of the function w = F (r, θ) with respect to r is
∂F ∂f ∂x ∂f ∂y
= +
∂r ∂x ∂r ∂y ∂r
∂ 2
∂ ∂ 2 ∂
= x − y2 r cos(θ) + x − y2 r sin(θ)
∂x ∂r ∂y ∂r
= (2x) (cos(θ)) + (−2y) (sin(θ))
= (2r cos(θ)) (cos(θ)) + (−2r sin(θ)) (sin(θ))
= 2r cos2 (θ) − sin2 (θ)

= 2r cos(2θ)
and the partial derivative of the function w = F (r, θ) with respect to θ is
∂F ∂f ∂x ∂f ∂y
= +
∂θ ∂x ∂θ ∂y ∂θ
∂ 2 ∂ ∂ 2 ∂
= x − y2 r cos(θ) + x − y2 r sin(θ)
∂x ∂θ ∂y ∂θ
= (2x) (−r sin(θ)) + (−2y) (r cos(θ))
= (2r cos(θ)) (−r sin(θ)) + (−2r sin(θ)) (r cos(θ))
= −4r2 cos(θ) sin(θ)
= −2r2 sin(2θ)
Alternatively, the composite function is
w = f (x(r, θ) , y(r, θ))

2 2
= (r cos(θ)) − (r sin(θ))
= r2 cos2 (θ) − sin2 (θ)

= r2 cos(2θ)
Therefore, the derivative of w = F (r, θ) with respect to r is
∂w ∂ 2
= r cos(2θ)
∂r ∂r
= 2r cos(2θ)
and the derivative of w = F (r, θ) with respect to θ is
∂w ∂ 2
= r cos(2θ)
∂θ ∂θ
= −2r2 sin(2θ)
The chain rule allows us to determine how a partial derivative in one coordinate system, such as
Cartesian coordinates, is described in another coordinate system, such as polar coordinates.
51
Example 2.2.5
Find the first order partial derivatives of a function u(x, y) with respect to x and y in terms of the polar
coordinates r and θ.
The polar coordinates are defined by
x(r, θ) = r cos(θ) and y(r, θ) = r sin(θ)
where
y
r2 = x2 + y 2 and tan(θ) =
x
Using the chain rule, the first order partial derivative of u with respect to x is given by
∂u ∂u ∂r ∂u ∂θ
= +
∂x ∂r ∂x ∂θ ∂x
and the first order partial derivative of u with respect to y is given by
∂u ∂u ∂r ∂u ∂θ
= +
∂y ∂r ∂y ∂θ ∂y
Applying implicit differentiation we can find the first order partial derivative of r with respect to x and
the first order partial derivative of r with respect to y.
∂ 2 ∂ 2 ∂ 2 ∂ 2
r = x + y2 r = x + y2
∂x ∂x ∂y ∂y
∂r ∂r
2r = 2x 2r = 2y
∂x ∂y
∂r x ∂r y
= =
∂x r ∂y r
∂r ∂r
= cos(θ) = sin(θ)
∂x ∂y
Applying implicit differentiation we can find the first order partial derivative of θ with respect to x and
the first order partial derivative of θ with respect to y.
∂ ∂ y ∂ ∂ y
tan(θ) = tan(θ) =
∂x ∂x x ∂y ∂y x
∂θ y ∂θ 1
sec2 (θ) =− 2 sec2 (θ) =
∂x x ∂y x
∂θ y ∂θ 1
=− 2 2 =
∂x x sec (θ) ∂y x sec2 (θ)
∂θ y ∂θ 1
=− 2 =
x 1 + tan2 (θ)

2

∂x x 1 + tan (θ) ∂y
∂θ y ∂θ 1
=− = 2
∂x 2
x 1+ y 2 ∂y x 1+ y
x x
∂θ y ∂θ x
=− 2 = 2
∂x r ∂y r
∂θ sin(θ) ∂θ cos(θ)
=− =
∂x r ∂y r
Hence, by using the chain rule, the first order partial derivative of u with respect to x is given by
∂u ∂u sin(θ) ∂u
= cos(θ) −
∂x ∂r r ∂θ
∂u ∂u cos(θ) ∂u
= sin(θ) +
∂y ∂r r ∂θ
52
2.3 Double Integrals
2.3.1 Double Integrals
For the moment, we will assume that the region of integration is in the xy-plane, that is, a region of
the two-dimensional plane. To define a region in the plane, we need to be able to define the “edges” of
the region.
Example 2.3.1
The rectangular region in the xy-plane
which can be described as
(x, y) ∈ R2 : −1 ≤ x ≤ 2, 0 ≤ y ≤ 1 .

R=
Example 2.3.2
The region bounded by the curves y = x2 and x = y 2 in the xy-plane
could be described as √
(x, y) ∈ R2 : 0 ≤ x ≤ 1, x2 ≤ y ≤

R= x
or equivalently as
√
(x, y) ∈ R2 : 0 ≤ y ≤ 1, y 2 ≤ x ≤

R= y
Example 2.3.3
The region bounded by the lines x = 0, y = 4 and the curve y = x2 in the xy-plane
could be described as
(x, y) ∈ R2 : 0 ≤ x ≤ 2, x2 ≤ y ≤ 4

R=
or equivalently as
√
(x, y) ∈ R2 : 0 ≤ y ≤ 4, 0 ≤ x ≤

R= y
Example 2.3.4
The region bounded by the lines y = x, y = 2x and x = 2 in the xy-plane
could be described as
(x, y) ∈ R2 : 0 ≤ x ≤ 2, x ≤ y ≤ 2x

R=
or equivalently as R = R1 ∪ R2 where R is split into two regions
n y o n y o
R1 = (x, y) ∈ R2 : 0 ≤ y ≤ 2, ≤ x ≤ y and R2 = (x, y) ∈ R2 : 2 ≤ y ≤ 4, ≤x≤2
2 2
54
Definition: Double Integrals
Let R be a bounded region in the xy-plane, which is completely contained within a rectangular
region D. We divide D into M × N uniform small rectangles each with area ∆A = ∆x∆y and
centre points (xi , yj ). Then define the function
(
f (xi , yj ), if (xi , yj ) is in R,
F (xi , yj ) = .
0 otherwise.
Then the double integral of f (x, y) over the region R is given by

  
ZZ M X
X N
f (x, y) dA = lim  lim  F (xi , yj ) ∆A ,
R M →∞ N →∞
i=1 j=1
where the limits exist as both ∆x → 0 and ∆y → 0.
2.3.2 The Order of Integration for Double Integrals
Theoretically, if the integrand and the region of integration are “suitably behaved”, then the order of
integration does not matter. However, in practice the order of integration often matters in deriving an
analytic solution.
In Cartesian coordinates, a double integral may be expressed as either the iterated integrals
ZZ Z x=b Z y=g2(x) !

f (x, y) dA = f (x, y) dy dx
R x=a y=g1(x)
where g1 and g2 are single-variable functions of x, or the iterated integrals

ZZ Z y=d Z x=h2(y) !

f (x, y) dA = f (x, y) dx dy
R y=c x=h1(y)
where h1 and h2 are single-variable functions of y.

When deciding what order of integration to try, you should always first try sketch the bounded region.
Example 2.3.5
Consider the double integral ZZ
x + 2y dA
R
over the region bounded by the curves y = x2 and x = y 2 in the xy-plane. We could evaluate this
double integral by first integrating over y
√
R = (x, y) ∈ R2 : 0 ≤ x ≤ 1, x2 ≤ y ≤ x

and then the double integral becomes

√ !
ZZ Z x=1 Z y= x
x + 2y dA = x + 2y dy dx
R x=0 y=x2
Z x=1 h iy=√x
= xy + y 2 dx
x=0 y=x2
Z x=1
3
= x 2 + x − x3 − x4 dx
x=0
9
= .
20
55
For this double integral, the order of integration is irrelevant, the double integral could be evaluated by
integrating over x first
√
R = (x, y) ∈ R2 : 0 ≤ y ≤ 1, y 2 ≤ x ≤ y


√ !
ZZ Z y=1 Z x= y
x + 2y dA = x + 2y dx dy
R y=0 x=y 2
Z y=1 h
1 ix=√y
= x2 + 2xy dy
y=0 2 x=y 2
Z y=1
1 3 1
= y + 2y 2 − y 4 − 2y 3 dx
y=0 2 2
9
= .
20
Example 2.3.6
(4 − x − y) dA
R
over the region bounded by the lines y = x, y = 2x and x = 2 in the xy-plane. We could evaluate this
double integral by first integrating over y
R = (x, y) ∈ R2 : 0 ≤ x ≤ 2, x ≤ y ≤ 2x


ZZ Z 2 Z 2x
(4 − x − y) dA = 4 − x − y dy dx
R 0 x
Z 2 2x !
1 2
= 4y − xy − y dx
0 2 x
Z 2
5 2
= 4x − x dx
0 2
4
= .
3
We could evaluate this double integral by changing the order of integration to first integrate over x
which would require splitting R into two regions
n y o n y o
R1 = (x, y) ∈ R2 : 0 ≤ y ≤ 2, ≤ x ≤ y and R2 = (x, y) ∈ R2 : 2 ≤ y ≤ 4, ≤x≤2
2 2
ZZ ZZ ZZ
(4 − x − y) dA = 4 − x − y dx dy + 4 − x − y dx dy
R R1 R2
Z 2 Z y ! Z 4 Z 2 !

= 4 − x − y dx dy + 4 − x − y dx dy.
y y
0 2 2 2
This would involve a lot more work and is left as an exercise.
56
Example 2.3.7
xy dA
R
over the region R
We could evaluate this double integral by first integrating over y which would require splitting R into
two regions R1 ∪ R2 where
√ √
R1 = (x, y) ∈ R2 : 0 ≤ x ≤ 1, − x ≤ y ≤ x

and √
R2 = (x, y) ∈ R2 : 1 ≤ x ≤ 4, x − 2 ≤ y ≤ x


ZZ ZZ ZZ
(xy) dA = xy dy dx + xy dy dx
R R1 R2
Z 1 Z √x ! Z 4 Z √x !
= √
xy dy dx + xy dy dx.
0 − x 1 x−2
Instead we can evaluate the double integral by changing the order of integration to first integrate over
x. For this situation the region can be described as
R = (x, y) ∈ R2 : −1 ≤ y ≤ 2, y 2 ≤ x ≤ y + 2 .


ZZ Z 2 Z y+2
xy dA = xy dx dy
R −1 y2
Z 2 !
y+2
1 2
= x y dy
−1 2 y2
1 2 3
Z
= y + 4y 2 + 4y − y 5 dy
2 −1
45
= .
8
57
Example 2.3.8
sin(y)
dA
R y
over the region R
If we attempt to integrate with respect to y first then the region is described as
R = (x, y) ∈ R2 : 0 ≤ x ≤ 1, x ≤ y ≤ 1

and the double integral becomes

ZZ Z 1 Z 1
sin(y) sin(y)
dA = dy dx.
R y 0 x y
sin(y)
However we would have to integrate with respect to y, therefore it is best to change the order
y
of integration. We will evaluate this double integral by first integrating with respect to x, where the
region is described as
R = (x, y) ∈ R2 : 0 ≤ y ≤ 1, 0 ≤ x ≤ y


ZZ Z 1 Z y
sin(y) sin(y)
dA = dx dy
R y 0 0 y
Z 1 y
sin(y)
= x dy
0 y 0
Z 1
= sin(y) dy
0
= 1 − cos(1) .
2.3.3 Applications of Double Integrals
58
Example 2.3.9
If z = f (x, y) represents the height above a region R in the xy-plane, then the double integral represents
the volume V under the surface z = f (x, y) above the region R.
ZZ
V = f (x, y) dA.
R
If z = f (x, y) = 1, then the double integral represents the area A of region R

ZZ
A= 1 dA
R
which is also written as ZZ

A= dA.
R
Example 2.3.10
Suppose a lamina occupies a region R in the xy-plane and its density at a point (x, y) is given by the
continuous function ρ(x, y), then the mass m of the lamina R is given by
ZZ
m= ρ(x, y) dA
R
and then the centre of mass (x̄, ȳ) for the lamina is given by
ZZ ZZ
1 1
x̄ = xρ(x, y) dA and ȳ = yρ(x, y) dA.
m R m R
The moment of inertia of the lamina about the x-axis is

ZZ
Ix = y 2 ρ(x, y) dA
R
and the moment of inertia of the lamina about the y-axis is

ZZ
Iy = x2 ρ(x, y) dA.
R
2.3.4 Change of Coordinate System in Two Dimensions
Occasionally, instead of changing the order of integration, it may be worthwhile to change the coordinate
system in an attempt to make the integration simpler. While the process of changing coordinates appears
simple, one needs to be careful to ensure the evaluation of the double integral is correct! Consider the
following motivational example.
59
Example 2.3.11
x2 + y 2 dA
R
over the region R
We will evaluate this double integral by first integrating over y which requires splitting R into two
regions R1 ∪ R2 where
R1 = (x, y) ∈ R2 : 0 ≤ x ≤ 1, −x ≤ y ≤ x

and
R2 = (x, y) ∈ R2 : 1 ≤ x ≤ 2, x − 2 ≤ y ≤ 2 − x


ZZ Z 1 Z x Z 2 Z 2−x
2 2 2 2 2 2
x + y dA = x +y dy dx + x +y dy dx
R 0 −x 1 x−2
8 1 3 8 2 3
Z Z
x − 3x2 + 3x − 2 dx

= x dx +
3 0 3 1
8
= .
3
Consider the change of coordinate system
u = x + y, v = x − y
then the boundaries of the diamond in the xy-plane become lines in the uv-plane:
y = −x for 0 ≤ x ≤ 1 then u = 0 for 0 ≤ v ≤ 2

y = x for 0 ≤ x ≤ 1 then v = 0 for 0 ≤ u ≤ 2
y = 2 − x for 1 ≤ x ≤ 2 then u = 2 for 0 ≤ v ≤ 2
y = x − 2 for 1 ≤ x ≤ 2 then v = 2 for 0 ≤ u ≤ 2
and the transformed region is
60
The inverse transformation is

1 1
x= (u + v) , y = (u − v)
2 2
then the integrand f (x, y) = x2 + y 2 becomes
2 2
1 1
f (u, v) = (u + v) + (u − v)
2 2
1 2
u + v2

=
2
and the double integral becomes
Z 2Z 2 Z 2
2 !
1 2 1 1 3
u + v2 2

dv du = u v+ v du
0 0 2 2 0 3 0
1 2
Z
8
= 2u2 + du
2 0 3
2
1 2 3 8
= u + u
2 3 3 0
16
= .
3
However, this does not agree with the value of the double integral evaluated in the xy-coordinate system.
“What went wrong?”
Observe that the area of the diamond region R is 2 units2 while the area of the square region R0 is 4
units2 .
Recall from single variable integral calculus, the goal of substitution was to replace complicated integrals
with integrals that were easier to evaluate. Substitution accomplishes this by application of the chain
Z b
rule. Given the definite integral f (x) dx, if x is a function of some variable u, that is, x = x(u)
a
with the end points x(α) = a and x(β) = b then by applying the chain rule the integral can be written
as Z b Z β
dx
f (x) dx = f (x(u)) du.
a α du
For the purpose of being able to apply an analogous method of substitution to double integrals com-
Z β Z 2Z 2
dx 1 2 2

pare f (x(u)) du to the double integral u +v dv du where f (x(u, v) , y(u, v)) =
α du 0 0 2
1 2 dx
u + v 2 . We notice that in the single integral there is the

term but there is no equivalent term
2 du
in the double integral.
The Jacobian in Two Dimensions
We assume that the region R in the xy-plane can be related to the region R0 in the uv-plane by a
one-to-one transformation
x = x(u, v) , y = y(u, v) .
A function f (x, y) defined on the region R in the xy-plane can be written as f (x(u, v) , y(u, v)) defined
on the region R0 in the uv-plane.
61
Definition: The Jacobian
If x = x(u, v) and y = y(u, v) have continuous first order partial derivatives then the Jacobian of
the coordinate transformation x = x(u, v) , y = y(u, v) is
 ∂x ∂x 
 ∂u ∂v 
J(u, v) = det  .
∂y ∂y
∂u ∂v
The analogous method to substitution for double integrals is

ZZ ZZ
f (x, y) dx dy = f (x(u, v) , y(u, v)) |J(u, v)| du dv.
R R0
I J(u, v) is occasionally called the Jacobian determinant.

I The Jacobian is sometimes denoted as
∂(x, y)
J(u, v) = .
∂(u, v)
I Note that in the double integral we use the absolute value of the Jacobian; |J(u, v)|.
I The factor |J(u, v)| can be thought of as a measure of how much the coordinate transformation
is expanding or contracting the (surface) area as R0 is transformed into R.
Example 2.3.12 [Previous example continued]

x2 + y 2 dA
R
over the region R
The chosen change of coordinate system was given by

1 1
x = (u + v) , y = (u − v)
2 2
then the Jacobian is

∂ 1 ∂ 1
 
 ∂u 2 (u + v) ∂v 2
(u + v) 
J(u, v) = det 
 ∂ 1 
∂ 1 
(u − v) (u − v)
∂u 2 ∂v 2
 1 1 
 2 2 
= det
1 1

−
2 2
1
=− .
2
62
Therefore, the double integral becomes

Z 2Z 2 2 !
1 2
Z
1 2 2 1
2 1 3
u + v − dv du = u v+ v du
0 0 2 2 4 0 3 0
1 2
Z
2 8
= 2u + du
4 0 3
2
1 2 3 8
= u + u
4 3 3 0
8
= .
3
which does agree with the value we found when evaluating the double integral in the xy-plane.
1
In this example, J(u, v) = − , you may wonder about the significance of the negative for this Jacobian.
2
The negative indicates that the orientation of R in the xy-plane is the opposite of the orientation of R0
in the uv-plane. To see this, consider traversing the edge of the diamond in a counterclockwise direction
by starting at (x, y) = (0, 0) then head to (x, y) = (1, −1), (x, y) = (2, 0), (x, y) = (1, 1) and back to
(x, y) = (0, 0). Under the transformation u = x + y and v = x − y the corresponding journey around
the edge of the square would be starting at (u, v) = (0, 0) then head to (u, v) = (0, 2), (u, v) = (2, 2),
(u, v) = (2, 0) and back to (u, v) = (0, 0); which traverses the square in a clockwise direction.
2.3.5 Double Integrals in Polar Coordinates
Double integrals are sometimes easier to evaluate in polar coordinates, given by the transformation
with inverse transformation

p y
r(x, y) = x2 + y 2 and θ(x, y) = arctan .
x
The corresponding Jacobian to the polar coordinates transformation is

 ∂ ∂ 
r cos(θ) r cos(θ)
 ∂r ∂θ
J(r, θ) = det

∂ ∂

r sin(θ) r sin(θ)
∂r ∂θ

cos(θ) −r sin(θ)
= det
sin(θ) r cos(θ)
= r cos2 (θ) + r sin2 (θ)
= r.
Therefore, converting a double integral to polar coordinates becomes

ZZ ZZ
f (x, y) dx dy = f (r cos(θ) , r sin(θ)) r dr dθ.
R R0
63
Example 2.3.13
2

+y 2
ex dA
R
over the region bounded by the lines x = 0, y = 0, the circle x2 + y 2 = 1 and the circle x2 + y 2 = 4 in
the xy-plane. The region can be described in terms of polar coordinates as
n πo
R = (r, θ) ∈ R2 : 1 ≤ r ≤ 2, 0 ≤ θ ≤
2
π
!
ZZ Z 2 Z 2
x2 +y 2 r2
e dx dy = e r dθ dr
R 1 0
Z 2 h i π2
r2
= re θ dr
1 0
Z 2
π 2

= rer dr
2 1
2
π 1 r2
= e
2 2 1
πe 3
= e −1 .
4
Example 2.3.14
Find the area of a semi-disc of radius 2 with centre at (x, y) = (0, 2).
By the area formula, the answer is simply 2π units2 . This would be nasty to derive in Cartesian
coordinates, so we convert to polar coordinates.
The equation of the semi-circle is
2
x2 + (y − 2) = 4 for x ≥ 0.
Converting to polar coordinates gives

2
r2 cos2 (θ) + (r sin(θ) − 2) = 4 for r cos(θ) ≥ 0.
This reduces to
r2 − 4r sin(θ) = 0 for cos(θ) ≥ 0.
The equation r2 − 4r sin(θ) = 0 has solutions r = 0 and 4 sin(θ) = 0, and the restriction cos(θ) ≥ 0
π
gives 0 ≤ θ ≤ .
2
64
Therefore, the region described in polar coordinates is

n πo
R = (r, θ) ∈ R2 : 0 ≤ r ≤ 4 sin(θ) , 0 ≤ θ ≤ .
2
Hence, the area of the semi-disc is given by

ZZ
A= 1 dA
R
π
!
Z 2
Z 4 sin(θ)
= r dr dθ
0 0
Z π 4 sin(θ) !
2 1 2
= r dθ
0 2 0
Z π
2
8 sin2 (θ) dθ

=
0
Z π
2
= 4 − 4 cos(2θ) dθ
0
h i π2
= 4θ − 2 sin(2θ)
0
= 2π
that is, the area of the semi-disc is 2π units2 .
65
2.4 Triple Integrals
2.4.1 Triple integrals
To define a region of a three-dimensional space, we need to state the surfaces that are the boundaries
of the region.
Example 2.4.1
A cubic region in xyz-space can be described by its boundary surfaces as
V = (x, y, z) ∈ R3 : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, 0 ≤ z ≤ 1 .

Example 2.4.2
A finite cylindrical region in xyz-space bounded by the cylinder surface x2 + y 2 = 1, the z = −1 plane
and the z = 2 plane could be described as
V = (x, y, z) ∈ R3 : 0 ≤ x2 + y 2 ≤ 1, −1 ≤ z ≤ 2 .

We could write an equivalent form of the three dimensional region as

p p
V = (x, y, z) ∈ R3 : −1 ≤ z ≤ 2, −1 ≤ x ≤ 1, − 1 − x2 ≤ y ≤ 1 − x2 .

Example 2.4.3
The ball in xyz-space bounded by the sphere with radius 2 and centre at the origin could be described
as
V = (x, y, z) ∈ R3 : 0 ≤ x2 + y 2 + z 2 ≤ 4 .

We could write one equivalent form of the three dimensional region as

p p p p
V = (x, y, z) ∈ R3 : −2 ≤ x ≤ 2, − 4 − x2 ≤ y ≤ 4 − x2 , − 4 − x2 − y 2 ≤ z ≤ 4 − x2 − y 2 .

Example 2.4.4
Consider the tetrahedron region in xyz-space bounded by the x = 0 plane, the y = 0 plane, the z = 0
plane and the 3x + 2y + z = 6 plane.
Here are three possible descriptions:
This three dimensional region could be described by

3 3
V = (x, y, z) ∈ R : 0 ≤ x ≤ 2, 0 ≤ y ≤ − x + 3, 0 ≤ z ≤ 6 − 3x − 2y
2
or equivalently could be described by

3 2
V = (x, y, z) ∈ R : 0 ≤ y ≤ 3, 0 ≤ x ≤ − x + 2, 0 ≤ z ≤ 6 − 3x − 2y
3
or equivalently could be described by

3 1 3 1
V = (x, y, z) ∈ R : 0 ≤ z ≤ 6, 0 ≤ x ≤ − z + 2, 0 ≤ y ≤ 3 − x − z .
3 2 2
It is left as an exercise to find the three other equivalent descriptions.
Definition: Triple Integrals
Let V be a bounded three dimensional region in the xyz-space, which is completely bounded by a
cubic region E. We divide E into L × M × N × suitably small rectangular prisms, each with volume
∆V = ∆x∆y∆z and centre points (xi , yj , zk ). Then define the function
(
f (xi , yj , zk ), if (xi , yj , zk ) is in V,
F (xi , yj , zk ) = .
0 otherwise.
The triple integral of f (x, y, z) over the volume region V is given by

   
ZZZ ML X
X M X N
f (x, y, z) dV = lim  lim  lim  F (xi , yj , zk ) ∆V 
V L→∞ M →∞ N →∞
i=1 j=1 k=1
where the limits exist as ∆x → 0, ∆y → 0 and ∆z → 0.
2.4.2 The order of integration for triple integrals
We evaluate a triple integral by evaluating three iterated integrals. As with double integrals, there is a
geometric procedure for finding the limits of integration for these single integrals. In practice, the order
of integration often matters and should be selected carefully in deriving an analytic solution.
In Cartesian coordinates, a triple integral may be expressed in six possible iterated integral forms, such
as, ! !
ZZZ Z x=b Z y=g2(x) Z h2(x,y)
f (x, y, z) dV = f (x, y, z) dz dy dx.
V x=a y=g1(x) h1(x,y)
Example 2.4.5
Consider the triple integral ZZZ
x + y + z dV
V
over the tetrahedron region V in xyz-space bounded by the x = 0 plane, the y = 0 plane, the z = 0
plane and the 3x + 2y + z = 6 plane.
In this example, the order of integration is not relevant, therefore we arbitrarily choose the description

3 3
V = (x, y, z) ∈ R : 0 ≤ x ≤ 2, 0 ≤ y ≤ − x + 3, 0 ≤ z ≤ 6 − 3x − 2y
2
67
and then the triple integral becomes

!
ZZZ Z 2 Z − 23 x+3 Z 6−3x−2y
x + y + z dV = x + y + z dz dy dx
V 0 0 0
Z 2 Z − 32 x+3
6−3x−2y ! !
1 2
= xz + yz + z dy dx
0 0 2 0
Z − 23 x+3 !
Z 2
1
= 3x2 + 2xy − 24x − 12y + 36 dy dx
2 0 0
1 2 2
Z
− 3 x+3
= 3x y + xy 2 − 24xy − 6y 2 + 36y 0 2 dx
2 0
9 2 3
Z
=− x − 10x2 + 28x − 24 dx
8 0
2
9 1 4 10 3
=− x − x + 14x2 − 24x
8 4 3 0
33
= .
2
It is left as a long and tedious exercise to show that the triple integral with the other five descriptions
can all be evaluated and give the same answer.
At this point you may ask: If there are six possible orders of integration for a given triple integral,
then how do we know which option should be used to evaluate the triple integral? The answer is; as
with double integrals, either the geometry of the bounded region or the integrand itself may suggest
the more preferable orders of integration. You should always try sketch the bounded three dimensional
region first.
Example 2.4.6
Consider the triple integral ZZZ
y dV
V
over the three dimensional region bounded by the x = 0 plane, the y = 0 plane, the y + z = 2 plane
and the parabolic cylinder x = 4 − y 2 . If we sketch the three dimensional region
then we may decide to use the parabolic cylinder as one of the√surfaces for the most inner integral, we
could either use x = h2 (y, z) = 4 − y 2 + 0z or y = h2 (x, z) = 4 − x + 0z. Since the square root may
make the second integration messy we will select x = h2 (y, z) = 4 − y 2 and we note that this surface
68
is “above” the yz-plane, that is, x = h1 (y, z) = 0. The projection (or “shadow”) of x = h2 (y, z) onto
x = h1 (y, z) is the triangle area in the yz-plane:
which is easily described by
R = (y, z) ∈ R2 : 0 ≤ y ≤ 2, 0 ≤ z ≤ 2 − y .

Thus the triple integral becomes

4−y 2
! !
ZZZ Z 2 Z 2−y Z
y dV = y dx dz dy
V 0 0 0
Z 2 Z 2−y h i4−y2
= yx dz dy
0 0 0
Z 2 Z 2−y
3

= 4y − y dz dy
0 0
Z 2 h i2−y
3
= 4yz − y z dy
0 0
Z 2
= y 4 − 2y 3 − 4y 2 + 8y dy
0
2
1 5 1 4 4 3
= y − y − y + 4y 2
5 2 3 0
56
= .
15
2.4.3 Applications of triple integrals
Example 2.4.7
If f (x, y, z) = 1, then the triple integral represents the volume V of region V
ZZZ
V = 1 dx dy dz
V
which is also written as ZZZ

V = dx dy dz.
V
69
Example 2.4.8
Suppose the density at a point (x, y, z) of a three dimensional region V in the xyz-space is given by the
continuous function ρ(x, y, z), then the mass m of the solid V is given by
ZZZ
m= ρ(x, y, z) dx dy dz
V
and then the centre of mass (x̄, ȳ, z̄) for the solid is given by
ZZZ ZZZ ZZZ
1 1 1
x̄ = xρ(x, y, z) dV, ȳ = yρ(x, y, z) dV and z̄ = zρ(x, y, z) dV.
m V m V m V
The moment of inertia of the solid about the x-axis is

ZZZ
y 2 + z 2 ρ(x, y, z) dV

Ix =
V
the moment of inertia of the solid about the y-axis is
ZZZ
x2 + z 2 ρ(x, y, z) dV

Iy =
V
and the moment of inertia of the solid about the z-axis is
ZZZ
x2 + y 2 ρ(x, y, z) dV

Iz =
V
2.4.4 Change of coordinate system in three-dimensions
The choice of coordinate system for triple integrals is typically dictated by the geometry of the three
dimensional region. When Cartesian coordinates are not suitable we may consider other coordinate
systems such as cylindrical polar coordinates and spherical coordinates. After our discussion of changing
coordinate systems in two dimensions we should be aware that given a coordinate transformation
x = x(u, v, w), y = y(u, v, w) and z = z(u, v, w), in general,
ZZZ ZZZ
f (x, y, z) dx dy dz 6= f (x(u, v, w) , y(u, v, w) , z(u, v, w)) du dv dw.
V V0
The Jacobian in three-dimensions
We assume that the three dimensional region V in xyz-space can be related to the region V 0 in the
uvw-space by a one-to-one transformation
x = x(u, v, w) , y = y(u, v, w) , z = z(u, v, w) .
A function f (x, y, z) defined in the three dimensional region V in the xyz-space can be written as
f (x(u, v, w) , y(u, v, w) , z(u, v, w)) defined in the three dimensional region V 0 in the uvw-space.
Definition: The Jacobian
If x = x(u, v, w), y = y(u, v, w) and z = z(u, v, w) have continuous first order partial derivatives
then the Jacobian of the coordinate transformation x = x(u, v, w), y = y(u, v, w) and z =
z(u, v, w) is
∂x ∂x ∂x
 
 ∂u ∂v ∂w  
 
 ∂y ∂y ∂y  
J(u, v, w) = det 
 ∂u
.
 ∂v ∂w  

 ∂z ∂z ∂z 
∂u ∂v ∂w
70
Therefore, the analogous method to substitution for triple integrals is

ZZZ ZZZ
f (x, y, z) dx dy dz = f (x(u, v, w) , y(u, v, w) , z(u, v, w)) |J(u, v, w)| du dv dw.
V V0
I The Jacobian is sometimes denoted as

∂(x, y, z)
J(u, v, w) = .
∂(u, v, w)
I Note that in the triple integral we use the absolute value of the Jacobian; |J(u, v, w)|.
I The factor |J(u, v, w)| can be thought of as a measure of how much the coordinate transformation
is expanding or contracting the the three dimensional region as V 0 is transformed into V.
Many problems in engineering and science involve geometries that could be described in terms of
cylinders or spheres, for example,
I fluid flow along a pipe,

I fluid flow through a conical funnel,
I magnetic fields in the Sun,
2.4.5 Triple integrals in cylindrical coordinates
The cylindrical coordinates are given by the transformation

x(r, θ, z) = r cos(θ) , y(r, θ, z) = r sin(θ) and z(r, θ, z) = Z
p y
r(x, y, z) = x2 + y 2 , θ(x, y, z) = arctan and Z(x, y, z) = z.
x
The corresponding Jacobian to the polar coordinates transformation is

∂ ∂ ∂
 
 ∂r r cos(θ) r cos(θ) r cos(θ)
 ∂θ ∂Z 

 ∂ ∂ ∂ 
J(r, θ, Z) = det
 ∂r r sin(θ) r sin(θ) r sin(θ) 
 ∂θ ∂Z 

 ∂ ∂ ∂ 
Z Z Z
 ∂r ∂θ ∂Z
cos(θ) −r sin(θ) 0
= det sin(θ) r cos(θ) 0 
0 0 1
= r cos2 (θ) + r sin2 (θ)
= r.
71
Therefore, converting a triple integral to cylindrical coordinates becomes

ZZZ ZZZ
f (x, y, z) dx dy dz = f (r cos(θ) , r sin(θ) , Z) r dr dθ dz.
V V0
Example 2.4.9
Find the volume of a quarter of a cylinder of radius 2 units and height of 6 units.
By the volume formula, the answer is simply 6π units3 . To derive this in Cartesian coordinates would
require the volume integral
Z 6 Z 2 Z √4−x2 ! ! Z 6 Z 2 p
1 dy dx dz = 2
4 − x dx dz.
0 0 0 0 0
This volume integral would be easier to evaluate in cylindrical coordinates, the volume region could be
described as n π o
V = (r, θ, Z) ∈ R3 : 0 ≤ r ≤ 2, 0 ≤ θ ≤ , 0 ≤ Z ≤ 6
2
ZZZ
V = 1 dV
V
Z π2 Z 6 Z 2
= r dr dz dθ
0 0 0
Z π2 Z 6 2 ! !
1 2
= r dz dθ
0 0 2 0
Z π2 Z 6
= 2 dz dθ
0 0
Z π h i
2 6
= 2z dθ
0 0
Z π
2
= 12 dθ
0
h i π2
= 12θ
0
= 6π
that is, the volume of the quarter cylinder is 6π units3 .
72
Example 2.4.10
Find the volume of a cone of radius a units and height of h units.
To derive this in Cartesian coordinates would require the volume integral

Z a Z √a2 −x2 Z h
! !
V = √ √ 1 dz dy dx
h
−a − a2 −x2 a x2 +y 2
which would be a lot of work requiring a number of substitutions. This volume integral would be easier
to evaluate in cylindrical coordinates, the volume region could be described as

3 h
V = (r, θ, Z) ∈ R : 0 ≤ r ≤ a, 0 ≤ θ ≤ 2π, r ≤ Z ≤ h
a
ZZZ
V = 1 dV
V
! !
Z 2π Z a Z h
= r dz dr dθ
h
0 0 ar
Z 2π Z a h ih
= rz h
dr dθ
0 0 ar
Z 2π Z a
h 2
= hr − r dr dθ
0 0 a
Z 2π a
h 2 h
= r − r3 dθ
0 2 3a 0
Z 2π 2
ha
= dθ
0 6
2 2π
ha
= θ
6 0
ha2 π
=
3
ha2 π
that is, the volume of the cone is units3 .
3
2.4.6 Triple integrals in spherical coordinates
The spherical polar coordinates are given by the transformation

x(r, φ, θ) = r sin(φ) cos(θ) , y(r, φ, θ) = r sin(φ) sin(θ) and z(r, φ, θ) = r cos(φ)
73

!
p y z
r(x, y, z) = x2 + y2 + z2, θ(x, y, z) = arctan and φ(x, y, z) = arccos p .
x x2 + y 2 + z 2
The corresponding Jacobian to the spherical coordinates transformation is

∂ ∂ ∂
 
r sin(φ) cos(θ) r sin(φ) cos(θ) r sin(φ) cos(θ)
 ∂r ∂φ ∂θ 
 
 ∂ ∂ ∂ 
 ∂r r sin(φ) sin(θ)
J(r, θ, φ) = det r sin(φ) sin(θ) r sin(φ) sin(θ)
 
 ∂φ ∂θ 

 ∂ ∂ ∂ 
r cos(φ) r cos(φ) r cos(φ)
∂r ∂φ ∂θ
 
sin(φ) cos(θ) r cos(φ) cos(θ) −r sin(φ) sin(θ)
= det sin(φ) sin(θ) r cos(φ) sin(θ) r sin(φ) cos(θ) 
cos(φ) −r sin(φ) 0

r cos(φ) cos(θ) −r sin(φ) sin(θ)
= cos(φ) det
r cos(φ) sin(θ) r sin(φ) cos(θ)

sin(φ) cos(θ) −r sin(φ) sin(θ)
+ r sin(φ) det
sin(φ) sin(θ) r sin(φ) cos(θ)
= r2 cos2 (φ) sin(φ) + r2 sin3 (φ)
= r2 sin(φ) .
Therefore, converting a triple integral to spherical coordinates becomes

ZZZ ZZZ
f (x, y, z) dx dy dz = f (r sin(φ) cos(θ) , r sin(φ) sin(θ) , r cos(φ)) r2 sin(φ) dr dφ dθ
V V0
Example 2.4.11
Find the volume of a sphere of radius R units.
To derive this in Cartesian coordinates would require the volume integral
Z R Z √R2 −x2 Z √R2 −(x2 +y2 ) ! !
V = √ √ 2 2 2 1 dz dy dx
2 2
−R − R −x − R −(x +y )
which would be a lot of work requiring a number of substitutions. This triple integral would be easier
to evaluate in spherical coordinates, the volume region could be described as
V = (r, φ, θ) ∈ R3 : 0 ≤ r ≤ R, 0 ≤ φ ≤ π, 0 ≤ θ ≤ 2π

74

ZZZ
V = 1 dV
V
! !
Z 2π Z π Z R
2
= r sin(φ) dr dφ dθ
0 0 0
Z 2π Z π R ! !

1 3
= r sin(φ) dφ dθ
0 0 3 0
Z 2π Z π 3
R
= sin(φ) dφ dθ
0 0 3
Z 2π π
R3
= − cos(φ) dθ
0 3 0
Z 2π 3
2R
= dθ
0 3
3 2π
2R
= θ
3 0
4πR3
=
3
4πR3
that is, the volume of the sphere is units3 .
3
75
Chapter 3
Vector Calculus
76
3.1 Vector Functions
3.1.1 Scalar Fields
Definition: Scalar field
A scalar field (or scalar function) is a function f defined in a region of either two-dimensional
space (R2 ) or three-dimensional space (R3 ) which assigns a unique scalar value (that is, a real
number) to each point P in the domain of f
f = f (P ) = f (r)
where r denotes the position vector from the origin to the point P .
Any point P in R3 with Cartesian coordinates (x, y, z) has a corresponding position vector r = xi +
yj + zk, and then the scalar field can be written as
f = f (P ) = f (r) = f (x, y, z) .
Example 3.1.1
Atmospheric pressure at points on the surface of the Earth is a scalar field.
Example 3.1.2
The temperature at points throughout the volume of the Sun is a scalar field.
Example 3.1.3
The distance function p
d(x, y, z) = x2 + y 2 + z 2
describing the distance from the origin to any three-dimensional point P : (x, y, z) is a scalar field.
Example 3.1.4
The function
f (x, y) = x2 + y 2
is a (two-dimensional) scalar field.
Definition: Level curves
Let f (x, y) be a two-dimensional scalar function. A level curve (or contour) of height c is the
set of all points (x, y) such that f (x, y) = c.
Geometrically, a level curve of height c is the intersection of the z = c plane with the surface z = f (x, y).
Example 3.1.5
Consider the scalar field
f (x, y) = x2 + y 2 .
The surface represented by this function is
The surface f(x,y) = x2+y2
50
40
30
z
20
10
0
5
5
0
0
y −5 −5
x
The level curves of f (x, y) are concentric circles x2 + y 2 = c for c ≥ 0.
Level Curves for x2+y2 = c for values c = 1,4,9,16,25

5
25 25
4 16
3 9
16
25
2
9
25
4
1
16
1
4
0
y
9
25
1
9
−1
16
−2 4
25
16
−3 9
−4 25 16
−5 25
−6 −4 −2 0 2 4 6
x
78
Example 3.1.6
Consider the scalar function
f (x, y) = x2 − y 2 .
The surface represented by this function is
The surface f(x,y) = x2−y2
30
20
10
0
z
−10
−20
−30
5
5
0
0
y −5 −5
x
The level curves of f (x, y) are rectangular hyperbolae x2 − y 2 = c for any c ∈ R.

Level Curves for x2−y2 = c for values c = −9, −4,−1,1,4,9
5
−4
−9 −4 1
−
4 1 − −9 1
1
−9
9
3
−4 −4 4
4
2 −1
−1
1
1
0
y
−1
4
−1
−2 1 −1 1
−4 −4
−3 −9
−1
9
−4 4 −9 4
−9 −1 1
1 −4 −4
−5
−5 −4 −3 −2 −1 0 1 2 3 4 5
x
Definition: Level surface
Let f (x, y, z) be a three-dimensional scalar function. A level surface of value c is the set of all
points (x, y, z) such that f (x, y, z) = c.
79
Example 3.1.7
f (x, y, z) = x2 + y 2 − z.
The level surfaces of f (x, y, z) are nested paraboloids x2 + y 2 − z = c for any c ∈ R.
Level surfaces for x2+y2−z=c for values c = −2,0,2
2
z
−1
−2
−2 −2
0 0
2 2
x
y
80
3.1.2 Vector Fields
Definition: Vector field
A vector field, or vector-valued function, F defined in a region of either two-dimensional space

(R2 ) or three-dimensional space (R3 ) assigns a unique vector to each point P in the domain of F,
such that
F = F(P ) = F(r) ,
where r denotes the position vector from the origin to the point P .
Any point P in R3 with Cartesian coordinates (x, y, z) has a corresponding position vector r = xi +
yj + zk, and then the vector field can be written as
F = F(r)
= F(x, y, z)
= F1 (x, y, z) i + F2 (x, y, z) j + F3 (x, y, z) k
= F1 (r) i + F2 (r) j + F3 (r) k.
Definition: Field line
A curve in either R2 or R3 represented by the parameterisation r(t) is a field line of a vector field
F if at each point P on the curve, the tangent vector to the curve is parallel to the vector field F.
Given the vector field F(r) = F1 (r) i + F2 (r) j + F3 (r) k, the field lines of F are represented by the
position vector r = xi + yj + zk, whose components satisfy the differential equations
dr
= F(r)
dt
with the prescribed initial condition r(t0 ) = r0 . Equivalently, the field lines are the solution of the
following simultaneous ordinary differential equations
dx dy dz
= F1 (x, y, z) , = F2 (x, y, z) , = F3 (x, y, z)
dt dt dt
with the prescribed initial conditions x(t0 ) = x0 , y(t0 ) = y0 , z(t0 ) = z0 .
Example 3.1.8
The velocity of a fluid flow along a pipe is a vector field.
Example 3.1.9
The magnetic field of a bar magnet is a vector field.
Example 3.1.10
The gravitational field of the earth is a vector field.
81
Example 3.1.11
The vector field of a body rigidly rotating counter-clockwise with constant angular velocity ω = ωk,
for ω > 0, is defined by v = ω × r;
v(x, y, z) = ω × r
 
i j k
= det 0 0 ω 
x y z
= −ωyi + ωxj
= ω (−yi + xj)
where P (x, y, z) is a point in the body. Hence the velocity v is a vector field independent of z.
vector field v = −ω y i + ω x j
4
0
y
−1
−2
−3
−4
−4 −3 −2 −1 0 1 2 3 4
x
Example 3.1.12
Given a two-dimensional vector field F(x, y) = 2i + 3xj,
1. sketch the vector field,
2. sketch a selection of field lines of F(x, y), and

3. find the equation of the field line that passes through the point (x, y) = (2, 5).
82
The vector field is
vector field F = 2 i + 3x j
6
0
y
−2
−4
−6
−6 −4 −2 0 2 4 6
x
with a selection of field lines
Feild lines of F = 2 i + 3x j
6
0
y
−2
−4
−6
−6 −4 −2 0 2 4 6
x
where the magenta coloured line corresponds to the field line which passes through the point (x, y) =
(2, 5).
The field lines of the two-dimensional vector field F(x, y) = 2i + 3xj are given by solving the differential
equation
dr
= 2i + 3xj
dt
83
that is, the simultaneous ordinary differential equations

dx dy
= 2, = 3x.
dt dt
Since the tangent vector to the field line is parallel to the vector field then
dy 3x
= .
dx 2
Integrating with respect to x gives
3 2
y(x) = x +C
4
for arbitrary constant C. For the field line passing through the point (x, y) = (2, 5);
3 2
5= (2) + C =⇒ C = 2.
4
Hence, the field line passing through the point (x, y) = (2, 5) is
3 2
y(x) = x + 2.
4
Example 3.1.13
Given a two-dimensional vector field F(x, y) = x2 i + y 2 j,
1. sketch the vector field,

2. sketch a selection of field lines of F(x, y), and
3. find the equation of the field line that passes through the point (x, y) = (−2, 2).
84
The vector field is
Vector field F = x2 i + y2 j
6
0
y
−2
−4
−6
−6 −4 −2 0 2 4 6
x
with a selection of field lines
Field lines of F = x2 i + y2 j
6
0
y
−2
−4
−6
−6 −4 −2 0 2 4 6
x
where the magenta coloured line corresponds to the field line which passes through the point (x, y) =
(−2, 2).
The field lines of the two-dimensional vector field F(x, y) = x2 i + y @ j are obtained by solving the
differential equation
dr
= x2 i + y 2 j
dt
that is, the simultaneous ordinary differential equations
dx dy
= x2 , = y2 .
dt dt
85
Since the tangent vector to the field line is parallel to the vector field then
dy y2
= 2.
dx x
Cx
y(x) =
x+C
for arbitrary constant C. For the field line passing through the point (x, y) = (−2, 2);
−2C
2= =⇒ C = 1.
C −2
Hence, the field line passing through the point (x, y) = (−2, 2) is
x
y(x) = .
x+1
86
3.2 The Del Operator
3.2.1 The Del Operator
Definition: The Del Operator
The del operator is the quantity written as

∂ ∂ ∂
∇=i +j +k
∂x ∂y ∂z
The del operators interaction with scalar fields and vector fields is crucial in vector calculus and the
subsequent applications of vector calculus. Before we look at this further there are a number of points
we need to emphasise:
I The symbol “∇” is usually called del or on some occasions nabla.

I The del operator is an operator+, like the derivative or integral, and a three-dimensional vector.
I In the above definition, the partial derivatives are written after the unit vectors to avoid the
misinterpretation that the derivatives are operating on the unit vectors.
3.2.2 The Gradient of a Scalar Field
Definition: The Gradient of a Scalar Field
Given a three-dimensional scalar function f (x, y, z), the gradient of f is defined as
∂f ∂f ∂f
∇f = i+ j+ k.
∂x ∂y ∂z
I ∇f is read as “grad of f ” and on some occasions is written as grad f .

I ∇f is a vector field : The del operator takes a scalar field and returns a vector field.
I For a two-dimensional scalar field f (x, y) the gradient of f is a two-dimensional vector
– with a direction that is perpendicular to the level curves f (x, y) = c,
– pointing in the direction of greatest increase of f , and thus
– has magnitude equal to the greatest increase in the rate of change of f .
I For a three-dimensional scalar field f (x, y, z) the gradient of f is a three-dimensional vector
– with a direction that is perpendicular to the level surfaces f (x, y, z) = c,
– pointing in the direction of greatest increase of f , thus

– has magnitude equal to the greatest increase in the rate of change of f , and
– is often referred to as a normal vector to the surface f (x, y, z) = c.
Example 3.2.1
Consider the scalar field f (x, y) = x2 + y 2 . The gradient of f is the vector field
∇f = 2xi + 2yj + 0k.
Consider the gradient of f at a number of points:
If (x, y) = (2, 1) then (∇f )(2, 1) = 4i + 2j

If (x, y) = (1, 0) then (∇f )(1, 0) = 2i
If (x, y) = (−2, 3) then (∇f )(−2, 3) = −4i + 6j
If (x, y) = (−3, −3) then (∇f )(−3, −3) = −6i − 6j
To show the gradient vector of f is perpendicular to the level curves x2 + y 2 = c2 we proceed as follows
1. write the parameterisation for the level curves,

2. find the tangent vector to the level curves,
3. write the gradient of f in terms of the parameterisation, and
4. show the tangent vector is perpendicular to the gradient vector.
The parameterisation of x2 + y 2 = c2 is
r(t) = c cos(t) i + c sin(t) j + 0k, for 0 ≤ t < 2π
then the tangent vector is

dr
= −c sin(t) i + c cos(t) j
dt
and the gradient vector is parameterised as
(∇f )(r) = 2c cos(t) i + 2c sin(t) j.
The dot product of the tangent vector and gradient vector gives
dr
· ∇f = (−c sin(t) i + c cos(t) j) · (2c cos(t) i + 2c sin(t) j)
dt
= 0.
Hence, the gradient vector of f is perpendicular to the level curves x2 + y 2 = c2 .
88
Example 3.2.2
mM G
f (r) =
r
mM G
=p
x + y2 + z2
2
where m and M are the masses of two objects separated by a distance of r = |r| and G is the gravitational
constant. The gradient of f is
! ! !
∂ mM G ∂ mM G ∂ mM G
∇f = p i+ p j+ p k
∂x x2 + y 2 + z 2 ∂y x2 + y 2 + z 2 ∂z x2 + y 2 + z 2
mM Gx mM Gy mM Gz
=− 3 i − 3 j + − 3 k
(x2 + y 2 + z 2 ) 2 (x2 + y 2 + z 2 ) 2 (x2 + y 2 + z 2 ) 2
mM G
=− r̂
r2
mM G
which you may recognise as Newton’s law of gravitation g = − r̂.
r2
The Gradient Vector in Other Coordinate Systems
The gradient vector of a function f (r, θ, z) in cylindrical coordinates is
∂f 1 ∂f ∂f
∇f = er + eθ + ez .
∂r r ∂θ ∂z
The gradient vector of f (r, φ, θ) in spherical coordinates is
∂f 1 ∂f 1 ∂f
∇f = er + eφ + eθ .
∂r r ∂φ r sin(φ) ∂θ
3.2.3 The Divergence of a Vector Field
Definition: The Divergence of a Vector Field
Given a three-dimensional vector function F(x, y, z) = F1 (x, y, z) i + F2 (x, y, z) j + F3 (x, y, z) k, the

divergence of F is defined as
∂F1 ∂F2 ∂F3
∇·F= + + .
∂x ∂y ∂z
I ∇ · F on some occasions is written as div F.
I ∇ · F is a scalar field : The del operator takes a vector field and returns a scalar field.
I In fluid dynamics, ∇·v is a measure of compressibility of the fluid flow v. In particular, if ∇·v = 0
then the fluid flow is said to be incompressible.
89
I In electromagnetism, ∇ · B is a measure of flux of the magnetic field B. In particular, if ∇ · B = 0

then the magnetic field is said to be solenoidal.
I Be warned: The divergence of F
∂F1 ∂F2 ∂F3
∇·F= + +
∂x ∂y ∂z
is a scalar field and is not the same as the operator
∂ ∂ ∂
F · ∇ = F1 + F2 + F3
∂x ∂y ∂z
Not only are these two quantities mathematically different, they also have different physical in-
terpretations.
Example 3.2.3
Consider a fluid flow rotating counter-clockwise with constant angular velocity ω = ωk, for ω > 0, then
the fluid velocity is v = −ωyi + ωxj. The divergence of v is
∂ ∂ ∂
∇·v = − ωy + ωx + 0
∂x ∂y ∂z
= 0.
Hence the fluid velocity v is incompressible, physically this means the fluid velocity is preserved at each
point in the fluid flow.
Example 3.2.4
Consider the fluid velocity field v = xyi + 0j + 0k. The divergence of v is
∂ ∂ ∂
∇·v = xy + 0 + 0
∂x ∂y ∂z
= y.
For the region y > 0, ∇ · v > 0, and the fluid is expanding (diverging).
For the region y < 0, ∇ · v < 0, and the fluid is contracting (converging)
Vector field v = xy i + 0 j + 0 k
6
0
y
−2
−4
−6
−6 −4 −2 0 2 4 6
x
90
Here are a number of interesting web pages that consider the physical significance of the divergence of
a vector field:
I The idea of the divergence of a vector field

I Subtleties about divergence
I Flows of vector fields
Divergence of a Vector Field in Other Coordinate Systems
The divergence of a vector field F(r, θ, z) = Fr er + Fθ eθ + Fz ez in cylindrical coordinates is

1 ∂ 1 ∂F
θ ∂Fz
∇·F= rFr + + .
r ∂r r ∂θ ∂z
The divergence of a vector field F(r, φ, θ) = Fr er + Fφ eφ + Fθ eθ in spherical coordinates is

1 ∂ 2 1 ∂Fθ 1 ∂
∇·F= 2 r Fr + + sin(φ) Fφ .
r ∂r r sin(φ) ∂θ r sin(φ) ∂φ
3.2.4 The Laplacian Operator
Given a three-dimensional scalar function f (x, y, z) which is twice-differentiable, we can generate the
vector field F = ∇f and then can generate the scalar field ∇ · F, that is,

∂ ∂ ∂ ∂f ∂f ∂f
∇ · (∇f ) = i +j +k · i +j +k
∂x ∂y ∂z ∂x ∂y ∂z
2 2 2
∂ f ∂ f ∂ f
= + 2 + 2.
∂x2 ∂y ∂z
Definition: The Laplacian Operator
The Laplacian operator is the quantity written as
∂2 ∂2 ∂2
∇2 = + +
∂x2 ∂y 2 ∂z 2
I The Laplacian operator takes a scalar field and returns a scalar field
∇2 f = ∇ · (∇f )
∂2f ∂2f ∂2f
= 2
+ 2 + 2.
∂x ∂y ∂z
I The Laplacian is an important quantity in many physical applications, such as heat transfer and
wave motion.
I If ∇2 f = 0 then f is said to be a harmonic function.
I The Laplacian operator can take a vector field F(x, y, z) = F1 (x, y, z) i+F2 (x, y, z) j+F3 (x, y, z) k
and return a vector field
∇2 F = ∇2 F1 i + ∇2 F2 j + ∇2 F3 k.
91
Example 3.2.5
Consider the scalar field f (x, y, z) = x2 y. The Laplacian of f is
2
∂2 ∂2 2

∂
∇2 f = + + x y
∂x2 ∂y 2 ∂z 2
∂2 2 ∂2 2 ∂2 2
= x y + x y + x y
∂x2 ∂y 2 ∂z 2
= 2y.
Example 3.2.6
mM G
f (r) =
r
mM G
=p
x + y2 + z2
2
where m and M are the masses of two objects separated by a distance of r = |r| and G is the gravitational
constant. The gradient of f is
g = ∇f
mM G
=− r̂
r2
Then the Laplacian of f is the divergence of g, that is,
∇2 f = ∇ · g
!
∂ ∂ ∂ mM Gx mM Gy mM Gz
= i +j +k · − 3 i − 3 j + − 3 k
∂x ∂y ∂z (x2 + y 2 + z 2 ) 2 (x2 + y 2 + z 2 ) 2 (x2 + y 2 + z 2 ) 2
! ! !!
∂ x ∂ y ∂ z
= mM G − 3 + − 3 + − 3
∂x (x2 + y 2 + z 2 ) 2 ∂y (x2 + y 2 + z 2 ) 2 ∂z (x2 + y 2 + z 2 ) 2
!
−2x2 + y 2 + z 2 x2 − 2y 2 + z 2 x2 + y 2 − 2z 2
= mM G 5 + 5 + 5
(x2 + y 2 + z 2 ) 2 (x2 + y 2 + z 2 ) 2 (x2 + y 2 + z 2 ) 2
=0
3.2.5 The Curl of a Vector Field
Definition: The Curl of a Vector Field
Given a three-dimensional vector function F(x, y, z) = F1 (x, y, z) i + F2 (x, y, z) j + F3 (x, y, z) k, the

curl of F is defined as
i j k
 
 
∇ × F = det ∂ ∂ ∂   .
 
 ∂x ∂y ∂z 
F1 F2 F3
I ∇ × F on some occasions is written as curl F.
92
I ∇ × F is a vector field : The del operator takes a vector field and returns a vector field.
I ∇ × F is a local property describing the tendency of a particle to “twist” or “rotate” in a vector

field.
I If ∇ × F = 0 everywhere then the vector field is said to be irrotational.
There is a relationship between ∇ × F and the rotation of a vector field.
Example 3.2.7
Consider a vector field v = yi + 0j + 0k which could represent a shear fluid flow .
Vector field v = y i + 0 j + 0 k
6
0
y
−2
−4
−6
−6 −4 −2 0 2 4 6
x
The curl of v is
i j k
 
 
∇ × v = det ∂ ∂ ∂ 
 

 ∂x ∂y ∂z 
y 0 0

∂

= 0i + 0j + 0 − y k
∂y
= −k.
Then a particle at any point in the velocity field will experience clockwise rotation about an axis parallel
to k.
93
Example 3.2.8
There is a relationship between the curl and the angular velocity of a vector field.
Consider a fluid flow rotating counter-clockwise with constant angular velocity ω = ωk, for ω > 0, then
the fluid velocity is v = −ωyi + ωxj. The curl of v is
i j k
 
 
∇ × v = det ∂ ∂ ∂ 
 

 ∂x ∂y ∂z 
−ωy ωx 0

∂ ∂
= 0i + 0j + ωx − − ωy k
∂x ∂y
= 2ωk.
Hence, a particle at any point in the fluid flow v will experience counter-clockwise curl about an axis
parallel to k with magnitude twice that of the magnitude of the angular velocity.
Exercise 2.1
For any scalar field f verify the vector identity
∇ × (∇f ) = 0.
Hence, we can conclude that any field which can be written in the form F = ∇f is irrotational.
Here is an interesting web page which considers the physical significance of the curl of a vector field:
I The idea of the curl of a vector field
Curl of a Vector Field in Other Coordinate Systems
The curl of a vector field F(r, θ, z) = Fr er + Fθ eθ + Fz ez in cylindrical coordinates is given by

1 ∂F
1 ∂Fz ∂Fθ ∂Fr ∂Fz 1 ∂ r
∇×F= − er + − eθ + rFθ − ez .
r ∂θ ∂z ∂z ∂r r ∂r r ∂θ
The curl of a vector field F(r, φ, θ) = Fr er + Fθ eθ + Fφ eφ in spherical coordinates is

1 ∂F
1 ∂ 1 ∂Fφ 1 ∂Fr 1 ∂ 1 ∂ r
∇×F = sin(φ) Fθ − er + − rFθ eφ + rFφ − eθ .
r sin(φ) ∂φ r sin(φ) ∂θ r sin(φ) ∂θ r ∂r r ∂r r ∂φ
94
3.3 Tangent Vectors and Arc Length
3.3.1 Tangent to a curve
At any given point on a simple smooth curve C, there exists a unique tangent, that is, there exists
a unique straight line touching the curve at that point. Let such a simple smooth curve C be given
parametrically by the position vector r(t) = x(t) i + y(t) j + z(t) k as a vector-valued function of the
parameter t. The tangent vector to the curve C at a point P , specified by a t-value, is defined to be
the vector
dr r(t + ∆t) − r(t)
= lim
dt ∆t−→0 ∆t
dr
if the limit exists and dt 6= 0. The corresponding unit tangent vector is
1 dr
T = dr
dt
dt
The tangent line to the curve C at a point P , specified by t = t0 , is

dr
q(w) = r(t0 ) + w
dt t=t0
where w is a parameter used explicitly for the tangent line parameterisation.
Example 3.3.1
y2
Find the vector equation of the tangent line to the ellipse x2 + = 1 at the point P : (x, y, z) =
√ 4
3
2 , 1, 0 .
A parameterisation for the ellipse is

r(t) = cos(t) i + 2 sin(t) j + 0k for 0 ≤ t ≤ 2π.
The tangent vector to the ellipse is
dr
= − sin(t) i + 2 cos(t) j + 0k
dt
√
3
The point (x, y, z) = 2 , 1, 0 is given by
√
cos(t) = 23 π
=⇒ t = .
2 sin(t) = 1 6
√
Thus, the tangent vector at the point (x, y, z) = 23 , 1, 0 is
√

dr 1
= − i + 3j + 0k
dt t= π
2
6
√
Hence, the tangent line to the ellipse at the point (x, y, z) = 23 , 1, 0 is
√ !
√

3 1
q(w) = i + j + 0k + w − i + 3j + 0k
2 2
for parameter w ∈ R.
3.3.2 Arc length
Definition: Arc length
Let C be a simple smooth curve C with parametric representation r(t) = x(t) i + y(t) j + z(t) k for
a ≤ t ≤ b, then the length of the curve, or arc length, is defined by
Z b r !
dr dr
`= · dt
a dt dt
or equivalently
Z b
dr
`= dt
dt
a
Example 3.3.2
Let the curve C be the parabola y = x2 from (x, y, z) = (0, 0, 0) to (x, y, z) = (1, 1, 0). Find the arc
length of C.
A parameterisation for C is
r(t) = ti + t2 j + 0k for 0 ≤ t ≤ 1.
The tangent vector to C is
dr
= i + 2tj + 0k
dt
therefore,

dr
= |i + 2tj + 0k|
dt
p
= 1 + 4t2
The arc length is given by Z 1 p

`= 1 + 4t2 dt
0
Z p
1 1
Consider the indefinite integral 1 + 4t2 dt, let t = 2 sinh(u) then 1 = 2 cosh(u) du
dt , thus
s 2
Z p Z
1 1 du
1 + 4t2 dt = 1+4 sinh(u) cosh(u) dt
2 2 dt
Z
1
= cosh2 (u) du
2
Z
1 1
= + cosh(2u) du
4 4
1 1
= u + sinh(2u) + C
4 8
1 1
= u + sinh(u) cosh(u) + C
4 4
1 1 p
= sinh−1 (2t) + t 1 + 4t2 + C
4 2
1 p 1 p
= loge 2t + 4t2 + 1 + t 1 + 4t2 + C
4 2
96
for arbitrary constant C. Hence, the arc length of C is

s 2
Z p Z
1 1 du
1 + 4t2 dt = 1+4 sinh(u) cosh(u) dt
2 2 dt
p
1 1 p 1
= t 1 + 4t2 + loge 2t + 4t2 + 1
2 4 0
√
5 1 √
= + loge 2 + 5
2 4
That is, the arc length of C is approximately 1.47894 units.
Example 3.3.3
Consider the circular helix C with parameterisation
r(t) = 1.5 cos(2πt) i + 1.5 sin(2πt) j + tk for 0 ≤ t ≤ 3
The tangent vector is

dr
= −3π sin(2πt) i + 3π cos(2πt) j + k
dt
The arc length of this helix curve is
Z 3 r !
dr dr
`= · dt
0 dt dt
Z 3 p
= 9π 2 + 1 dt
0
p Z 3
= 9π 2 + 1 1 dt
0
p
= 3 9π 2 + 1
that is, the arc length of C is approximately 28.43304 units.
97
3.4 Line Integrals
3.4.1 Line Integrals
As a motivational example for introducing line integrals, consider a particle moving along a curve C in
R3 with parametric representation r(t) for a ≤ t ≤ b. If the particle is acted on by a force represented
by the vector field F(r) then what is the total amount of work done as the particle moves along the
curve C?
Firstly, curve C can be approximately
represented by N straight lines which have direction vectors
dr
parallel to the tangent vector .
dt ti
The amount of work done on one line segment is
!
dr
∆W = F(r(ti )) · ∆t
dt ti
Then the approximate amount of work done along the curve is

N
X dr
W ≈ F(r(ti )) · ∆t.
i=1
dt ti
Hence, the total amount of work done as the particle moves along the curve C is
N !
X dr
W = lim F(r(ti )) · ∆t
N −→∞
i=1
dt ti
Z b
dr
= F(r(t)) · dt
a dt
I On occasions this line integral is written as

Z
W = F · dr.
C
I If C is closed curve then we write I

W = F · dr.
C
Example 3.4.1
Find the work done by the force field F = (x − y) i + (x + y) j + (z + 2x)k in moving a particle along
the part of the line from (x, y, z) = (0, 0, 0) to (x, y, z) = (1, 2, 5).
Let C represent the part of the line from (x, y, z) = (0, 0, 0) to (x, y, z) = (1, 2, 5), then a parameterisation
of C is
r(t) = ti + 2tj + 5tk for 0 ≤ t ≤ 1
dr
= i + 2j + 5k
dt
and the force field can be written as
F(r(t)) = −ti + 3tj + 7tk.
Therefore we have
dr
F(r(t)) · = 40t.
dt
Hence, the work done is
Z
W = F · dr
C
Z 1
dr
= F(r(t)) · dt
0 dt
Z 1
= 40t dt
0
h i1
= 20t
0
= 20.
Example 3.4.2
Find the work done by the force field F = (x − y) i + (x + y) j + (z + 2x)k in moving a particle along
the part of the parabola from (x, y, z) = (0, 0, 0) to (x, y, z) = (1, 2, 5) given by the intersection of the
paraboloid z = x2 + y 2 and the plane y = 2x.
Let C represent the part of the parabola from (x, y, z) = (0, 0, 0) to (x, y, z) = (1, 2, 5) given by the
intersection of the paraboloid z = x2 + y 2 and the plane y = 2x, then a parameterisation of C is
r(t) = ti + 2tj + 5t2 k for 0 ≤ t ≤ 1

dr
= i + 2j + 10tk
dt
F(r(t)) = −ti + 3tj + (5t2 + 2t)k.
Therefore we have
dr
F(r(t)) · = 5t + 20t2 + 50t3 .
dt
99

Z
W = F · dr
C
Z 1
dr
= F(r(t)) · dt
0 dt
Z 1
= 5t + 20t2 + 50t3 dt
0
1
5 2 20 3 50 4
= t + t + t
2 3 4 0
65
= .
3
Observe that we have found different values for the line integral for different paths C then we say in
this case the line integral for this vector field is dependent on path.
Example 3.4.3
Find the work done by the force field F = (x − y) i + (x + y) j + (z + 2x)k in moving a particle once
counterclockwise around the circle x2 + y 2 = 4 in the z = 1 plane.
Let C represent the path “once counterclockwise around the circle x2 + y 2 = 4 in the z = 1 plane”, then
a parameterisation of C is
r(t) = 2 cos(t) i + 2 sin(t) j + k for 0 ≤ t ≤ 2π

dr
= −2 sin(t) i + 2 cos(t) j + 0k
dt
F(r(t)) = 2 (cos(t) − sin(t)) i + 2 (cos(t) + sin(t)) j + (1 + 2 cos(t))k.
Therefore we have
dr
F(r(t)) · = 4.
dt
I
W = F · dr
C
Z 2π
dr
= F(r(t)) · dt
0 dt
Z 2π
= 4 dt
0
h i2π
= 4t
0
= 8π.
100
3.4.2 Conservative Vector Fields
Definition: Conservative Vector Fields
Suppose that F is a vector field that is continuous in an open connected region D, then F is a
conservative vector field if and only if the value of the line integral
Z
dr
F· dt
C dt
is independent of path, that is, the value of the line integral is the same for any path C with the
same terminal points inside the region D.
The vector field F is a conservative vector field if

I
I F · dr = 0 for every closed path C;
C
I there exists a scalar field f such that F = ∇f , in which case f is called a potential;
I ∇ × F = 0.
Example 3.4.4
Examples of conservative fields include gravitational, electrostatic and magnetic fields.
Frictional forces such as air resistance are not conservative fields.
101
3.4.3 Conservation of Energy
Consider a continuous vector field F moving a particle with mass m along a path C with parameterisation
dr d2 r
r(t) then the velocity is v(t) = and the acceleration is a(t) = 2 as acceleration. Newton’s second
dt dt
law states a force F can be related to acceleration by F = ma at each point on a path C,
F = mr00
Then the work done is
Z
W = F · dr
C
Z b
= mr00 · r0 dt
a
Z b
1 d 0 0
=m r ·r dt
a 2 dt
Z b
1 d 0 2
=m |r | dt
a 2 dt
m h 0 2 ib
= |r |
2 a
m 2 2

= |r (b)| − |r0 (a)|
0
2
m 2 m 2
= |v(b)| − |v(a)| .
2 2
m 2
The quantity |v(t)| is the kinetic energy of the particle. Then we can write
2
W = KE(b) − KE(a) ,
that is, the work done by the force field F along the path C is the difference in kinetic energy at the
end points of C.
If we assume F is a conservative vector field then there exists a scalar field f such that F = ∇f , then
Z
W = F · dr
C
Z b
dr
= F· dt
a dt
Z b
dr
= ∇f · dt
a dt
Z b
∂f dx ∂f dy ∂f dz
= + + dt
a ∂x dt ∂y dt ∂z dt
Z b
df
= dt
a dt
h ib
= f (t)
a
= f (b) − f (a) .
The potential energy of an object at a point (x, y, z) is defined as PE(x, y, z) = −f (x, y, z), such that
F = −∇PE, therefore
W = PE(a) − PE(b) .
Hence, for a particle moving along a path C from point r(a) to point r(b) under the influence of a
conservative vector field
KE(b) − KE(a) = PE(a) − PE(b) ,
102
which gives the Conservation of Energy;
PE(a) + KE(a) = PE(b) + KE(b) ,
that is, the sum of the kinetic energy and potential energy of the particle remains constant.
3.4.4 Other Forms of Line Integrals

Z
In applications, the line integral F · dr is the most common type of line integral. However, you should
C
be aware that there are two other forms of line integrals
Z
f (x, y, z) dr
C
for a scalar field f , and Z

F × dr
C
for a vector field F.
Z Z Z
Note that the line integral F · dr results in a scalar while f (x, y, z) dr and F × dr results in
C C C
a vector.
Example 3.4.5
Evaluate the line integral Z
x + y 2 dr
C
where C is the quarter circle from (x, y) = (1, 0) to (x, y) = (0, 1) in the z = 0 plane.
A possible parameterisation of C is
π
r(t) = cos(t) i + sin(t) j + 0k for 0 ≤ t ≤
2
dr
= − sin(t) i + cos(t) j + 0k
dt
and the scalar field can be written as
f (x, y, z) = x + y 2 = cos(t) + sin2 (t) .
Therefore we have
Z Z
2 dr

2

x + y dr = x+y dt
C C dt
Z π2 ! Z π
!
2
2 2

= cos(t) + sin (t) (− sin(t)) dt i + cos(t) + sin (t) (cos(t)) dt j
0 0
Z π
! Z π
!
2 2
3 2 2
= − cos(t) sin(t) − sin (t) dt i + cos (t) + sin (t) cos(t) dt j
0 0
Z π
! Z π !
2
2
2 1 1 2
= − cos(t) sin(t) − 1 − cos (t) sin(t) dt i + + cos(2t) + sin (t) cos(t) dt j
0 0 2 2

7 1 π
=− i+ + j.
6 3 4
103
3.5 Surface Integrals
3.5.1 Surface integrals
We now know how to integrate a function over a flat region in a plane, but what if the function is
defined over a curved surface? In many applications in engineering and science the region of integration
will not be flat, for example
I calculating the total force on the wall of a dam,

I calculating the total electric charge over a spherical shell.
In this topic we also wish to establish the flux integral which is concerned with the flux (or transmission)
of some field lines across a curved surface. For example, in fluid dynamics the flux of water through a
porous surface.
3.5.2 Defining a curved surface
When integrating over a flat region, we simply defined

the boundary of the region. As an example,
a
rectangle defined in the xy-plane can be written as R = (x, y) ∈ R2 : −1 ≤ x ≤ 2, 0 ≤ y ≤ 1 . There
was no need for any formal parametric representations, even when the flat region required a splitting
into two or more regions.
Recall that surfaces which can be expressed in the form of a function, for example z = f (x, y), are rather
restrictive because of the uniqueness of the image of a point in the domain of a function. Surfaces that
are expressed in the form of an equation g(x, y, z) = 0, are more diverse in nature because they may be
‘multi-valued’.
We need two independent variables to cover a 2-dimensional space in the parametric variables, so a
two-dimensional surface in three-dimensional space can be represented parametrically as the position
vector of a function of two independent variables u and v
r(u, v) = x(u, v) i + y(u, v) j + z(u, v) k,
with some bounds defining the domain for the parameters u and v.
When we finally get to integrating over a surface, we will need to know the normal vector to the surface
at any given point:
I If the surface is parametrically represented by r(u, v) then we can define the tangent vectors to
∂r ∂r
the surface as and . Therefore, a normal vector to the surface can be defined as the cross
∂u ∂v
product of the tangent vectors
∂r ∂r
N= × . (3.1)
∂u ∂v
3.5.3 Surface area of a curved surface
Consider a curved surface S defined parametrically as
with some bounds defining the domain for the parameters u and v. We could approximate the surface
area of S by dividing the surface into a grid of n × m small parallelogram like regions. Consider a small
parallelogram like region on the curved surface S with vertices described by r(ui , vj ), r(ui + ∆u, vj ),
r(ui , vj + ∆v) and r(ui + ∆u, vj + ∆v) respectively. The surface area of this region is approximately
the area of a parallelogram with these vertices, that is,
∆S = ∆x∆y ≈ |(r(ui + ∆u, vj ) − r(ui , vj )) × (r(ui , vj + ∆v) − r(ui , vj ))|

∂r ∂r
= ∆u ×
∆v
∂u ∂v

∂r ∂r
= × ∆u∆v (3.2)
∂u ∂v
= |N(ui , vj )| ∆u∆v.
The first quantity on line (3.2) is the Jacobian |J(u, v)|, as in §2.3.4. The last line follows from the
equation for the normal (3.1).
Then the surface area of S is approximately the summation of the approximate area for all of these
small parallelograms, that is,
Xn Xm
A≈ |N(ui , vj )| ∆u∆v.
i=1 j=1
If we let ∆u → 0 and ∆v → 0 then we have the surface area of S:

ZZ ZZ
A= 1 dS = |N| du dv.
S R
Definition: Surface Area
The surface area of a (curved) surface S parametrically defined as r(u, v) = x(u, v) i + y(u, v) j +
z(u, v) k is given by Z Z
∂r ∂r
A= ∂u × ∂v du dv

R
where R is the region in the uv-plane which was given by the parameterisation r(u, v) of the
(curved) surface S.
Example 3.5.1

Let S be the part of the plane 3x + 2y + z = 6 restricted to the first octant, that is, (x, y, z) ∈ R3 :
x ≥ 0, y ≥ 0, z ≥ 0 . Find the surface area of S.
A parameterisation of S is
3
r(u, v) = ui + vj + (6 − 3u − 2v) k for 0 ≤ u ≤ 2, 0 ≤ v ≤ − u + 3.
2
105
The normal to the plane is N = 3i + 2j + k, which can be shown using

∂r ∂r
N= ×
∂u ∂v 
i j k
 
 ∂ ∂ ∂ 
= det
 u v 6 − 3u − 2v 
 ∂u ∂u ∂u 

 ∂ ∂
∂ 
u v 6 − 3u − 2v
 ∂v ∂v
 ∂v
i j k
= det 1 0 −3 
0 1 −2
= 3i + 2j + k.
and we have
|N| = |3i + 2j + k|
√
= 14
Therefore, the surface area of S is
Z Z
∂r ∂r
A= ∂u × ∂v du dv

R
Z 2 Z 3− 23 u √ !
= 14 dv du
0 0
√ Z 2
3
= 14 3 − u du
0 2
2
√

3
= 14 3u − u2
4 0
√
= 3 14.
Example 3.5.2
Find the area of the elliptical surface formed by the intersection of the plane y + z = 2 with the
cylindrical volume x2 + y 2 ≤ 1.
The elliptical surface S could be parametrically represented by
r(r, θ) = r cos(θ) i + r sin(θ) j + (2 − r sin(θ)) k for 0 ≤ r ≤ 1, 0 ≤ θ ≤ 2π.
Then a normal vector to the surface is
∂r ∂r
N= ×
∂r ∂θ 
i j k
 
 ∂ ∂ ∂ 
= det
 r cos(θ) r sin(θ) 2 − r sin(θ) 
 ∂r ∂r ∂r 

 ∂ ∂ ∂ 
r cos(θ) r sin(θ) 2 − r sin(θ)
 ∂θ ∂θ ∂θ

i j k
 
= det cos(θ) sin(θ) − sin(θ)
 

 
−r sin(θ) r cos(θ) −r cos(θ)
= 0i + rj + rk
106
and we have
|N| = |0i + rj + rk|
√
= 2r2
√
= 2r.

Z Z
∂r ∂r
A= ∂r × ∂θ dθ dr

R
Z 1 Z 2π √
= 2r dθ dr
0 0
√ Z 1
= 2 2π r dr
0
1
√

1 2
= 2 2π r
2 0
√
= 2π.
Example 3.5.3
Find the area of the paraboloid z = 9 − x2 − y 2 that lies above the xy-plane.
We need to represent the surface S parametrically. One such representations is
r(r, θ) = r cos(θ) i + r sin(θ) j + 9 − r2 k for 0 ≤ r ≤ 3, 0 ≤ θ ≤ 2π.


∂r ∂r
N= ×
∂r ∂θ 
i j k
 
 ∂ ∂ ∂ 
= det
 r cos(θ) r sin(θ) 9 − r2 
 ∂r ∂r ∂r 

 ∂ ∂ ∂ 
r cos(θ) r sin(θ) 9 − r2
 ∂θ ∂θ  ∂θ
i j k
 
= det cos(θ) sin(θ) −2r 
 
 
−r sin(θ) r cos(θ) 0
= 2r2 cos(θ) i + 2r2 sin(θ) j + rk
and we have
|N| = 2r2 cos(θ) i + 2r2 sin(θ) j + rk

p
= 4r4 + r2
p
= r 4r2 + 1
107

Z Z
∂r ∂r
A= ×
∂r ∂θ dθ dr

R
Z 3 Z 2π p
= 2
r 4r + 1 dθ dr
0 0
Z 3 p
= 2π r 4r2 + 1 dr
0
3
1 3
= 2π 4r2 + 1 2
12 0
π √
= 37 37 − 1 .
6
Example 3.5.4
LetS be the part of the sphere with radius
a and centre at the origin restricted to the first octant, that
is, (x, y, z) ∈ R3 : x ≥ 0, y ≥ 0, z ≥ 0 . Find the surface area of S.
π π
r(φ, θ) = a sin(φ) cos(θ) i + a sin(φ) sin(θ) j + a cos(φ) k for 0 ≤ φ ≤ , 0≤θ≤ .
2 2
∂r ∂r
N= ×
∂φ ∂θ
 
i j k
 
 ∂ ∂ ∂ 
= det
 a sin(φ) cos(θ) a sin(φ) sin(θ) a cos(φ) 
 ∂φ ∂φ ∂φ 

 ∂ ∂ ∂ 
a sin(φ) cos(θ) a sin(φ) sin(θ) a cos(φ)
∂θ ∂θ ∂θ
 
i j k
= det a cos(φ) cos(θ) a cos(φ) sin(θ) −a sin(φ)
 

−a sin(φ) sin(θ) a sin(φ) cos(θ) 0
= a2 sin2 (φ) cos(θ) i + a2 sin2 (φ) sin(θ) j + a2 cos(φ) sin(φ) k

and we have
|N| = a2 sin2 (φ) cos(θ) i + a2 sin2 (φ) sin(θ) j + a2 cos(φ) sin(φ) k

q
= a4 sin2 (φ)
= a2 sin(φ)
Z Z
∂r ∂r
A= ×
∂φ ∂θ dθ dφ

R
Z π2 Z π2 !

2
= a sin(φ) dθ dφ
0 0
π
πa2
Z 2
= sin(φ) dφ
2 0
πa2
= .
2
108
3.5.4 Surface integral of scalar fields
One obvious generalisation of double integrals over regions in the plane which we saw in the multivariable
chapter is to consider a double integral of a scalar field over any surface.
The surface integral of a three dimensional scalar field f (x, y, z) across a curved surface S is given by
ZZ
f (x, y, z) dS
S
where dS is a scalar surface element.
Example 3.5.5
Let S be the entire xy-plane. Evaluate the surface integral
ZZ

exp −x2 − y 2 dS.
S
r(r, θ) = r cos(θ) i + r sin(θ) j + 0k for 0 ≤ r < ∞, 0 ≤ θ < 2π.
Then we have
ZZ

Z ∞ Z ∞

exp −x2 − y 2 dS = exp −x2 − y 2 dx dy
S −∞ −∞
Z ∞ Z 2π

2
= exp −r r dθ dr
0 0
Z ∞
= 2π r exp −r2 dr
0
!
Z N
2
= 2π lim r exp −r dr for 0 < N < ∞
N −→∞ 0
N !
1
− exp −r2

= 2π lim
N −→∞ 2 0

1 1
= 2π lim − exp −N 2 +

N −→∞ 2 2
= π.
Z ∞
2
Note that an implication of this previous example is that; if I = e−x dx then
−∞
Z ∞ 2
2
I2 = e−x dx
−∞
Z ∞ Z ∞
2 2
= e−x dx e−x dx
−∞ −∞
then a change of variables x = y for the second integral gives

Z ∞ Z ∞
2 −x2 −y 2
I = e dx e dy
−∞ −∞
Z ∞ Z ∞
2 2
= e−x −y dx dy
−∞ −∞
2
Therefore, by the previous example I = π and then
Z ∞
2
√
e−x dx = π.
−∞
109
3.5.5 Surface integral of vector fields
One of the most important applications of surface integrals is the flux of a vector field across a surface
S. We assume S is a two-sided (orientable) surface which has a normal vector defined at each point on
the surface. Furthermore, if S is a closed surface then by convention we choose the outward pointing
normal vector .
Definition: Flux Integral
The flux of a three dimensional vector field F across a curved surface S is given by
ZZ
F · n̂ dS
S
where dS is a scalar surface element and n̂ the unit normal to the surface S.
If the surface S is defined parametrically as
with some bounds defining the domain for the parameters u and v then the flux integral becomes
ZZ ZZ
N(u, v)
F · n̂ dS = F(r(u, v)) · |N(u, v)| du dv
S |N(u, v)|
Z ZR
= F(r(u, v)) · N(u, v) du dv
R
where the parameterisation maps the curved surface S onto the flat region R in the uv-plane. Note
that N is a normal vector to the surface S but is not necessarily a unit vector.
Example 3.5.6 Flux through an open surface

Given the vector field
F = yi + 2j + xzk
evaluate the flux integral of F across the curved surface
S = (x, y, z) ∈ R3 : y = x2 , 0 ≤ x ≤ 2, 0 ≤ z ≤ 3 .

We need to represent the surface S parametrically. One such representation is to let x = u and z = v,
that is,
r(u, v) = ui + u2 j + vk for 0 ≤ u ≤ 2, 0 ≤ v ≤ 3
then a normal vector to the surface is
∂r ∂r
N= ×
∂u ∂v 
i j k
 
 ∂ ∂ 2 ∂ 
= det
 (u) u (v) 
 ∂u ∂u ∂u 

 ∂ ∂ 2 ∂ 
(u) u (v)
 ∂v ∂v
 ∂v
i j k
= det 1 2u 0 
0 0 1
= 2ui − j + 0k
110
and the vector field is

F = u2 i + 2j + uvk.
Hence, the flux of F across the curved surface S is
ZZ ZZ
F · n̂ dA = F(r(u, v)) · N(u, v) du dv
S R
Z 3 Z 2
2

= u i + 2j + uvk · (2ui − j + 0k) du dv
0 0
Z 3 Z 2
3
= 2u − 2 du dv
0 0
= 12.
Example 3.5.7 Flux through a closed surface

The orientation of a closed surface is important when evaluating a flux integral across it. If we wish
to find the flux across a closed surface then we need to use the outward pointing normal vector to the
surface.
Consider a vector field defined as F = (2x − z) i + x2 yj + xz 2 k representing fluid flowing through a box
defined by 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 and 0 ≤ z ≤ 1. Calculate the net flux across the surface of the box.
The surface of the box has six sides, so we must break the flux integral into six separate flux integrals.
Furthermore, since the box is closed we will need to use the outward pointing normal for each side.
Define surface 1, S1 , as the side defined by x = 0, 0 ≤ y ≤ 1 and 0 ≤ z ≤ 1. Visually, the outward
pointing normal to the surface S1 is N = −i. Note that if we parameterise the surface then we must
be careful in our choice of parameterisation to ensure we can find the outward pointing normal to the
surface. For S1 , the parameterisation we could use is
r(u, v) = 0i + vj + uk for 0 ≤ u ≤ 1, 0 ≤ v ≤ 1
then the vector field becomes

F = −ui + 0j + 0k
Given this parameterisation, the normal is given by
∂r ∂r
N= ×
∂u ∂v 
i j k
 
 ∂ ∂ ∂ 
= det
 (0) (v) (u) 
 ∂u ∂u ∂u 

 ∂ ∂ ∂ 
(0) (v) (u)
 ∂v ∂v
 ∂v
i j k
= det 0 0 1 
0 1 0
= −i + 0j + 0k
which indeed is a normal vector pointing outward from the box.

Note that using the parameterisation r(u, v) = 0i + uj + vk for 0 ≤ u ≤ 1, 0 ≤ v ≤ 1 would give N = i
which is a normal vector pointing into the box.
111
Therefore, the flux across surface 1 is

ZZ ZZ
F · n̂ dS = F(r(u, v)) · N(u, v) du dv
S1 R
Z 1 Z 1
= (−ui + 0j + 0k) · (−i + 0j + 0k) dv du
0 0
Z 1 Z 1
= u dv du
0 0
Z 1
= u du
0
1
= .
2
This means the flux through S1 is in the direction of the normal to the surface, that is, the flux is out
of the box through surface S1 .
Define surface 2, S2 , as the side defined by x = 1, 0 ≤ y ≤ 1 and 0 ≤ z ≤ 1. Visually, the outward

pointing normal to the surface S2 is N = i. To parameterise S2 we could use
r(u, v) = i + uj + vk for 0 ≤ u ≤ 1, 0 ≤ v ≤ 1
and then the vector field becomes

F = (2 − v) i + uj + v 2 k
Therefore, evaluating the flux integral across S2 gives

ZZ ZZ
S2 R
Z 1 Z 1
2

= (2 − v) i + uj + v k · (i + 0j + 0k) du dv
0 0
Z 1 Z 1
= 2 − v du dv
0 0
Z 1
= 2 − v du
0
3
= .
2
Define surface 3, S3 , as the side defined by 0 ≤ x ≤ 1, y = 0 and 0 ≤ z ≤ 1. Visually, the outward

pointing normal to the surface S3 is N = −j. To parameterise S3 we could use
r(u, v) = ui + 0j + vk for 0 ≤ u ≤ 1, 0 ≤ v ≤ 1

F = (2u − v) i + 0j + uv 2 k
ZZ ZZ
S3 R
Z 1 Z 1
2

= (2u − v) i + 0j + uv k · (0i − j + 0k) du dv
0 0
Z 1 Z 1
= 0 du dv
0 0
= 0.
112
This means there is no flux through S3 .
Define surface 4, S4 , as the side defined by 0 ≤ x ≤ 1, y = 1 and 0 ≤ z ≤ 1. Visually, the outward

pointing normal to the surface S4 is N = j. To parameterise S4 we could use
r(u, v) = vi + 0j + uk for 0 ≤ u ≤ 1, 0 ≤ v ≤ 1

F = (2v − u) i + v 2 j + vu2 k
ZZ ZZ
S4 R
Z 1 Z 1
2 2

= (2v − u) i + v j + vu k · (0i + j + 0k) du dv
0 0
Z 1 Z 1
= v 2 du dv
0 0
Z 1
= v 2 dv
0
1
= .
3
Define surface 5, S5 , as the side defined by 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 and z = 0. Visually, the outward

pointing normal to the surface S5 is N = −k. To parameterise S5 we could use
r(u, v) = vi + uj + 0k for 0 ≤ u ≤ 1, 0 ≤ v ≤ 1

F = 2vi + v 2 uj + 0k
ZZ ZZ
S5 R
Z 1 Z 1
2vi + v 2 uj + 0k · (0i + 0j − k) du dv

=
0 0
Z 1 Z 1
= 0 du dv
0 0
= 0.
Define surface 6, S6 , as the side defined by 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 and z = 1. Visually, the outward

pointing normal to the surface S6 is N = k. To parameterise S6 we could use
r(u, v) = ui + vj + k for 0 ≤ u ≤ 1, 0 ≤ v ≤ 1

F = (2v − u) i + v 2 j + vu2 k
113

ZZ ZZ
S6 R
Z 1 Z 1
2

= (2u − 1) i + u vj + uk · (0i + 0j + k) du dv
0 0
Z 1 Z 1
= u dv du
0 0
Z 1
= u du
0
1
= .
2
Hence, the net flux across the surface of the box is

ZZ 1 3 1 1
F · n̂ dS = + + 0 + + 0 +
S 2 2 3 2
17
= .
6
Therefore, we can conclude that the net flux through the box is positive, that is, the flow diverges out
of the box.
Example 3.5.8
Let S be the surface enclosing the upper half ball of radius 2 and centre at the origin. Find the flux
across the surface S given the vector field F = 7xi + 0j − (z + 1) k.
The surface S has two sides; the upper hemisphere and the disc of radius 2 with centre at the origin in
the xy-plane.
Define surface 1, S1 , as the disc of radius 2 with centre at the origin in the xy-plane. A possible
parameterisation of S1 could be
r(r, θ) = r cos(θ) i + r sin(θ) j + 0k for 0 ≤ r ≤ 2, 0 ≤ θ ≤ 2π
F = 7r cos(θ) i + 0j − k
and the normal to S1 is
∂r ∂r
N= ×
∂r ∂θ 
i j k
 
 ∂ ∂ ∂ 
= det
 (r cos(θ)) (r sin(θ)) (0) 
 ∂r ∂r ∂r 

 ∂ ∂ ∂ 
(r cos(θ)) (r sin(θ)) (0)
 ∂θ ∂θ  ∂θ
i j k
 
= det cos(θ) sin(θ) 0 
 
 
−r sin(θ) r cos(θ) 0
= 0i + 0j + rk.
However, this is the inward pointing normal, and so we choose the outward pointing normal to be
∂r ∂r
N= × = 0i + 0j − rk.
∂θ ∂r
114
Therefore, the flux across S1 is given by

ZZ ZZ
F · n̂ dS = F(r(r, θ)) · N(r, θ) dr dθ
S1 R
Z 2 Z 2π
= (7r cos(θ) i + 0j − k) · (0i + 0j − rk) dθ dr
0 0
Z 2 Z 2π
= r dθ dr
0 0
Z 2
= 2π r dr
0
= 4π.
This means the flux through S1 is directed outward.
Define surface 2, S2 , as the upper hemisphere of radius 2 with centre at the origin. A possible parame-
terisation of S2 could be
π
r(φ, θ) = 2 sin(φ) cos(θ) i + 2 sin(φ) sin(θ) j + 2 cos(φ) k for 0 ≤ φ ≤ , 0 ≤ θ ≤ 2π.
2
∂r ∂r
N= ×
∂φ ∂θ
 
i j k
 
 ∂ ∂ ∂ 
= det
 2 sin(φ) cos(θ) 2 sin(φ) sin(θ) 2 cos(φ) 
 ∂φ ∂φ ∂φ 

 ∂ ∂ ∂ 
2 sin(φ) cos(θ) 2 sin(φ) sin(θ) 2 cos(φ)
∂θ ∂θ ∂θ
 
i j k
= det 2 cos(φ) cos(θ) 2 cos(φ) sin(θ) −2 sin(φ)
 

−2 sin(φ) sin(θ) 2 sin(φ) cos(θ) 0
2 2
= 4 sin (φ) cos(θ) i + 4 sin (φ) sin(θ) j + 4 cos(φ) sin(φ) k.
The vector field becomes
F = 14 sin(φ) cos(θ) i + 0j − (2 cos(φ) + 1) k.
115
Therefore, the flux across S2 is given by

ZZ ZZ
F · n̂ dS = F(r(φ, θ)) · N(φ, θ) dφ dθ
S2 R
π
Z 2
Z 2π
= 56 sin3 (φ) cos2 (θ) − 8 cos2 (φ) sin(φ) − 4 cos(φ) sin(φ) dθ dφ
0 0
π
Z 2
Z 2π
3 2
= 28 sin (φ) (1 + cos(2θ)) − 8 cos (φ) sin(φ) − 4 cos(φ) sin(φ) dθ
dφ
0 0
Z π2 2π !
1
= 28 sin3 (φ) θ + sin(2θ) − 8 cos2 (φ) sin(φ) θ − 4 cos(φ) sin(φ) θ dφ
0 2 0
Z π2
= 56π sin3 (φ) − 16π cos2 (φ) sin(φ) − 8π cos(φ) sin(φ) dφ
0
Z π
2
56π 1 − cos2 (φ) sin(φ) − 16π cos2 (φ) sin(φ) − 8π cos(φ) sin(φ) dφ

=
0
Z π
2
= 56π sin(φ) − 72π cos2 (φ) sin(φ) − 8π cos(φ) sin(φ) dφ
0
h i π2 h i π2 h i π2
= −56π cos(φ) + 24π cos3 (φ) + 4π cos2 (φ)
0 0 0
= 28π.
Hence, the net flux across the surface S is

ZZ ZZ ZZ
F · n̂ dS = F · n̂ dS + F · n̂ dS
S S1 S2
= 4π + 28π
= 32π.
3.5.6 Integrating a scalar field across a curved surface
When introducing double integrals we saw how to integrate a scalar field f (x, y) over a flat region R in
the xy-plane. ZZ
f (x, y) dx dy
R
Integrating a scalar field g(r) over a curved surface S in xyz-space is similarly evaluated
ZZ ZZ
g(r) dx dy = g(r(u, v)) |N(u, v)| du dv.
S R
Example 3.5.9
π
Let S be the part of the sphere with radius 1 and centre at the origin restricted to 0 ≤ φ ≤ and
4
0 ≤ θ ≤ 2π. Find the mass of S if the density function across the surface is ρ(x, y, z) = z.
π
r(φ, θ) = sin(φ) cos(θ) i + sin(φ) sin(θ) j + cos(φ) k for 0 ≤ φ ≤ , 0 ≤ θ ≤ 2π
4
and then the density function becomes
ρ(φ, θ) = cos(φ) .
116
A normal vector to the surface is
N = sin2 (φ) cos(θ) i + sin2 (φ) sin(θ) j + cos(φ) sin(φ) k
and then
|N| = sin(φ) .
The mass of the surface S is given by
ZZ
m= ρ(x, y, z) dS
Z ZS
= ρ(φ, θ) |N(φ, θ)| dθ dφ
R
Z π4 Z 2π
= sin(φ) cos(φ) dθ dφ
0 0
Z π
4
= 2π sin(φ) cos(φ) dφ
0
π4
1
= 2π sin2 (φ)
2 0
π
= .
2
Example 3.5.10
Let S be the sphere with radius 1 and centre at the origin. Calculate the moment of inertia of S about
the z-axis if the surface has a uniform density.
r(φ, θ) = sin(φ) cos(θ) i + sin(φ) sin(θ) j + cos(φ) k for 0 ≤ φ ≤ π, 0 ≤ θ ≤ 2π
A normal vector to the surface is
N = sin2 (φ) cos(θ) i + sin2 (φ) sin(θ) j + cos(φ) sin(φ) k
and then
|N| = sin(φ) .
The moment of inertia of S about the z-axis is given by
ZZ

I= x2 + y 2 ρ dS
Z ZS
= sin2 (φ) (1) |N(φ, θ)| dθ dφ
R
Z π Z 2π
3
= sin (φ) dθ dφ
0 0
Z π
= 2π sin3 (φ) dφ
Z0 π
1 − cos2 (φ) sin(φ) dφ

= 2π
0 π
1
= 2π cos3 (φ) − cos(φ)
3 0
8π
= .
3
117
3.6 The Divergence Theorem
We have seen that calculating surface integrals may involve a great deal of work. There is another
method that we could use. If we have a closed surface, then the divergence theorem gives a rela-
tionship between a surface integral and a volume integral. The divergence theorem is often associated
with either Carl Friedrich Gauss or Michel Ostrogradski; the latter developed the general theorem by
the late 1820s but since it was written in Russian it was not initially acknowledged, while the former
developed special cases of the theorem through the 1830s.
Definitions
I A surface is orientable if there exists a continuous unit normal n̂ at every point on the surface.
∂r ∂r
I A surface r(u, v) is smooth if and are continuous at every point (u, v) on the surface and
∂u ∂v
∂r ∂r
× 6= 0 at every point (u, v) on the surface.
∂u ∂v
I A surface is piecewise smooth if it is made up of finitely many smooth surfaces.
I A surface is closed if it is the boundary surface of some volume in three-dimensional space.
Theorem The Divergence Theorem
Let S be a closed surface which encloses a volume V. If F is a three dimensional vector field with
components which have continuous first order partial derivatives throughout V then
ZZ ZZZ
F · n̂ dS = ∇ · F dV
S V
where n̂ is an outward pointing normal to the surface S.

Example 3.6.1
Consider a the vector field defined as F = (2x − z) i + x2 yj + xz 2 k representing fluid flowing through a
box defined by 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 and 0 ≤ z ≤ 1. Calculate the net flux across the surface of the
box.
The flux across the surface is
ZZ ZZZ
F · n̂ dA = ∇ · F dV
S V
Z 1 Z 1 Z 1
= 2 + x2 + 2xz dy dz dx
0 0 0
Z 1 Z 1
2
= 2 + x + 2xz dz dx
0 0
Z 1 h i1
= 2z + x2 z + xz 2 dx
0 0
Z 1
= 2 + x2 + x dx
0
1
1 3 1 2
= 2x + x + x
3 2 0
17
=
6
which is the same value we found in the flux integral section but with much less working required.
Example 3.6.2
Let S be the surface enclosing the upper half ball of radius 2 and centre at the origin. Find the flux
across the surface S given the vector field F = 7xi + 0j − (z + 1) k.
We note that ∇ · F = 6 then the flux across S is
ZZ ZZZ
F · n̂ dA = ∇ · F dV
S
Z Z ZV
= 6 dV
V
ZZZ
=6 1 dV
V 3
1 4π2
=6
2 3
= 32π
4π 3
where the volume of the ball formula V = r has been used.
3
Note that this agrees with the value found in the flux integral section while, again, involving much less
work.
119
3.7 Stokes’ Theorem
3.7.1 Stokes’ Theorem
Stokes’ theorem gives a relationship between a surface integral and a line integral.
Definitions
I A curve is closed if it is the boundary curve of some surface in three-dimensional space.

dr dr
I A curve r(t) is smooth if is continuous at every point t on the curve and 6= 0 at every
dt dt
point t on the curve.
I A curve is piecewise smooth if it is made up of finitely many smooth curves.
I Let C be a closed piecewise smooth curve bounding an orientated surface S. The curve C is
dr
positively orientated if the vector defined by n × points “into”, or “over” the surface S.
dt
Another way to think about positive orientation is; if you are walking along the curve C with
your head pointing in the same direction as n then your outstretched left arm should point over
the surface.
Theorem Stokes’ Theorem
Let S be a surface bounded by a positively orientated curve C. If F is a three dimensional vector

field with components which have continuous first order partial derivatives throughout S then
I ZZ
dr
F· dt = (∇ × F) · n̂ dS.
C dt S
Example 3.7.1
Consider a vector field F = yi − xj + zk. Let S be the hemispherical surface x2 + y 2 + z 2 = a2 , z ≥ 0.
Verify Stokes’ theorem.
A possible parameterisation of the surface, S, could be
π
r(θ, φ) = a sin(φ) cos(θ) i + a sin(φ) sin(θ) j + a cos(φ) k for 0 ≤ θ ≤ 2π, 0 ≤ φ ≤ .
2
N = a2 sin2 (φ) cos(θ) i + a2 sin2 (φ) sin(θ) j + a2 cos(φ) sin(φ) k.
The curl of F is
i j k
 
 
∇ × F = det ∂ ∂ ∂ 
 

 ∂x ∂y ∂z 

y −x z
= 0i + 0j − 2k.
Therefore, the surface integral for Stokes’ theorem gives

ZZ Z π2 Z 2π
(∇ × F) · n̂ dS = − 2a2 cos(φ) sin(φ) dθ dφ
S 0 0
Z π
2
2
= −4a π cos(φ) sin(φ) dφ
0
π2
1
= −4a2 π sin2 (φ)
2 0
= −2a2 π.
The curve C bounding the surface S is the circle x2 + y 2 = a2 in the z = 0 plane. A possible
parameterisation of the curve, C, could be
r(t) = a cos(t) i + a sin(t) j + 0k for 0 ≤ t ≤ 2π.
then
dr
= −a sin(t) i + a cos(t) j + 0k
dt
and the vector field becomes
F(r(t)) = a sin(t) i − a cos(t) j + 0k.
Therefore, the line integral for Stokes’ theorem gives
I Z 2π
dr
F· dt = (a sin(t) i − a cos(t) j + 0k) · (−a sin(t) i + a cos(t) j + 0k) dt
C dt 0
Z 2π
= − a2 dt
0
= −2a2 π.
Note that any surface S bounded by the curve C will give the same result. Consider the last example
for another surface.
Example 3.7.2
Consider a the vector field defined as F = yi − xj + zk. Let C be the closed positively orientated curve
with parameterisation
r(t) = a cos(t) i + a sin(t) j + 0k for 0 ≤ t ≤ 2π.
We have already seen that the line integral for Stokes’ theorem gives
I
dr
F· dt = −2a2 π
C dt
and that the curl of F is

∇ × F = 0i + 0j − 2k.
Verify Stokes’ theorem for the surface, S, which is the disc surface x2 + y 2 = a2 in the z = 0 plane.
A possible parameterisation of the surface, S, could be
r(u, v) = u cos(v) i + u sin(v) j + 0k for 0 ≤ u ≤ a, 0 ≤ v ≤ 2π.
N = 0i + 0j + uk.
121
Therefore, the surface integral for Stokes’ theorem gives

ZZ Z a Z 2π
(∇ × F) · n̂ dS = − 2u dv du
S 0 0
Z a
= 2π − 2u du
0
h ia
= −2π u2
0
2
= −2a π.
Example 3.7.3
Consider a vector field defined as F = zi + xj + yk. Let C be the curve defined
I by the intersection of
2 2 dr
the cylinder x + y = 1 and the plane y + z = 2. Evaluate the line integral F· dt.
C dt
The curl of F is
i j k
 
 
∇ × F = det ∂ ∂ ∂ 
 

 ∂x ∂y ∂z 
z x y
= i + j + k.
The surface S bounded by the curve C is part of the plane y + z = 2 and therefore the unit normal
vector is
1
n̂ = √ (0i + j + k) .
2
Hence, Stokes’ theorem gives
I ZZ
dr
F· dt = (∇ × F) · n̂ dS
C dt
Z ZS
1
= (i + j + k) · √ (0i + j + k) dS
S 2
√ Z Z
= 2 1 dS
S
= 2π
√
where the area of the ellipse is 2π as seen in the surface integral section.
3.7.2 Independence of path for integration

I Recall that if F is a vector field that is continuous in an open simply connected
Z region D, then F
is a conservative vector field if and only if the value of the line integral F · dr is independent
C
of path.
I Let C be any closed smooth curve bounding some surface S within an open simply connected
region D.
Let ∇ × F = 0 for every point in D. By Stokes’ theorem
I ZZ
dr
F· dt = (∇ × F) · n̂ dS
C dt
Z ZS
= 0 dS
S
= 0.
122
I
dr
Let F· dt = 0. Assume that ∇×F 6= 0 such that ∇×F points in the same direction as the
C dt
unit normal to the surface at each point on the surface, ∇ × F = g(x, y, z) n̂ where g(x, y, z) > 0.
Stokes’ theorem gives
I
dr
0= F· dt
dt
ZCZ
= (∇ × F) · n̂ dS
Z ZS
= g(x, y, z) n̂ · n̂ dS
Z ZS
= g(x, y, z) dS.
S
ZZ I
dr
Since g(x, y, z) > 0 then g(x, y, z) dS > 0 which implies F· dt > 0 contradicting
SI C dt
dr
the original assumption that F· dt = 0. Therefore, the assumption ∇ × F 6= 0 must be
C dt
false, that is, ∇ × F = 0.
I
Hence, F · dr = 0 if and only if ∇ × F = 0.
C
I
I Assume F · dr = 0 for any closed smooth curve C within an open connected region D. Divide
C
C intoZ two curves;ZC1 which starts at A and ends Z at B, and C2Zwhich starts at B and ends at A,
then F · dr + F · dr = 0 which implies F · dr = − F · dr. Since C is an arbitrary
C1 C2 Z Z C1 C2
closed smooth curve, then F · dr = − F · dr for any smooth curve C1 starting at A and
Z C1 C2
ending at B. Hence, F · dr is independent of path from A to B.

C
I The vector identity ∇ × (∇f ) = 0 for any scalar field f (x, y, z) implies that ∇ × F = 0 if and only
if there exists a scalar field f (x, y, z) such that F = ∇f .
I If there exists a scalar field f (x, y, z) such that F = ∇f then

Z Z
dr dr
F· dt = ∇f · dt
C dt dt
ZC
∂f ∂f ∂f dx dy dz
= i+ j+ k · i+ j+ k dt
C ∂x ∂y ∂z dt dt dt
Z
∂f dx ∂f dy ∂f dz
= + + dt
C ∂x dt ∂y dt ∂z dt
Z
df
= dt
C dt
h it=B
= f (x(t) , y(t) , z(t)) .
t=A
In summary, we have:
123
Theorem Conservative vector fields
Let F be a three dimensional vector field with components which have continuous first order partial
derivatives throughout an open connected region D. Then the following statements are equivalent:
I F is a conservative vector field;

Z
I F · dr is independent of path C inside the region D;
C
I ∇ × F = 0 at all points throughout the region D;
I there exists a scalar field f (x, y, z) such that F = ∇f for all (x, y, z) throughout the region D;
I
I F · dr = 0 for any smooth closed curve C inside the region D.
C
Example 3.7.4
Z
Consider a vector field F = 4xi + 2yj + 2zk. Evaluate the line integral F · dr for an arbitrary smooth
C
curve starting at (x, y, z) = (0, 0, 0) and ending at (x, y, z) = (1, 2, 3).
The curl of F is
i j k
 
 
∇ × F = det ∂ ∂ ∂ 
 

 ∂x ∂y ∂z 
4x 2y 2z
= 0i + 0j + 0k.
Therefore, F is a conservative vector field and there exists a scalar field f (x, y, z) such that F = ∇f ,
that is,
∂f ∂f ∂f
= 4x, = 2y and = 2z.
∂x ∂y ∂z
∂f
Integrating = 4x with respect to x gives
∂x
f (x, y, z) = 2x2 + g(y, z)
∂f ∂g
for an arbitrary function g of y and z. Differentiating this function with respect to y gives =
∂y ∂y
∂f
and then comparing this with = 2y gives
∂y
∂g
= 2y.
∂y
Integrating this equation with respect to y gives
g(y, z) = y 2 + h(z)
for an arbitrary function h of z, and then we have
f (x, y, z) = 2x2 + y 2 + h(z) .
124
∂f dh ∂f
Differentiating this function with respect to z gives = and then comparing this with = 2z
∂z dz ∂z
gives
dh
= 2z.
dz
Integrating this equation with respect to z gives
h(z) = z 2 + C
for an arbitrary constant C, and thus we have
f (x, y, z) = 2x2 + y 2 + z 2 + C.
Hence, since F is a conservative vector field, the line integral becomes

Z Z
df
F · dr = dt
C C dt
h i(1,2,3)
= 2x2 + y 2 + z 2
(0,0,0)
= 15.
3.7.3 Green’s Theorem
Green’s theorem is a special case of Stokes’ theorem restricted to the xy-plane. Consider a vector field
defined as F = P (x, y) i + Q(x, y) j + 0k. Let S = R be a flat region in the z = 0 plane bounded by a
positively orientated curve C.
For the surface integral of Stokes’ theorem; the curl of F is
i j k
 
 
∇ × F = det
 ∂ ∂ ∂ 

 ∂x ∂y ∂z 
P (x, y) Q(x, y) 0

∂Q ∂P
= 0i + 0j + − k.
∂x ∂y
The unit normal to the surface is n̂ = k then the surface integral of Stokes’ theorem becomes
ZZ ZZ
∂Q ∂P
(∇ × F) · n̂ dS = − dA
S R ∂x ∂y
For the line integral of Stokes’ theorem; let r = x(t) i + y(t) j + 0k be a parameterisation of the curve
C then the line integral becomes
I I
dr dx dy
F· dt = (P (x, y) i + Q(x, y) j + 0k) · i+ j + 0k dt
C dt C dt dt
I
dx dy
= P (x, y) + Q(x, y) dt.
C dt dt
Therefore, we arrive at the result called Green’s theorem:
125
Theorem Green’s Theorem
Let C be a piecewise smooth closed curve that bounds the region R in the xy-plane. If F =
P (x, y) i + Q(x, y) j where P (x, y) and Q(x, y) have continuous first order partial derivatives
throughout R then
I ZZ
dx dy ∂Q ∂P
P (x, y) + Q(x, y) dt = − dA.
C dt dt R ∂x ∂y
Note that some texts will write

I I
dx dy
P (x, y) + Q(x, y) dt = P (x, y) dx + Q(x, y) dy.
C dt dt C
Example 3.7.5
Calculate the work done by the force F = (y − sin(x)) i + cos(x) j in moving a particle around the
triangle with vertices (x, y) = (0, 0), (x, y) = π2 , 0 and (x, y) = π2 , 1 .
I
Direct evaluation of the work integral F · dr would require performing three line integrals, this is left
C
as an exercise.
We observe that P (x, y) = y −sin(x) and Q(x, y) = cos(x) have continuous first order partial derivatives
throughout the region
π 2
R = (x, y) ∈ R2 : 0 ≤ x ≤ , 0 ≤ y ≤ x
2 π
therefore applying Green’s theorem gives
I I
F · dr = P (x, y) dx + Q(x, y) dy
C C
ZZ
∂Q ∂P
= − dA
R ∂x ∂y
Z π2 Z π2 x !
∂ ∂
= cos(x) − y − sin(x) dy dx
0 0 ∂x ∂y
Z π2 Z π2 x !

= − sin(x) − 1 dy dx
0 0
π
Z 2
h i π2 x
= − sin(x) y − y dx
0 0
Z π
2 2
=− x sin(x) + x dx
π
(0 π )
2 h i π2 Z π2
1 2 2
=− − x cos(x) + cos(x) dx + x
π 0 0 2 0
π2
2 1
=− −x cos(x) + sin(x) + x2
π 2 0
π2

2
=− 1+
π 8
2 π
=− − .
π 4
126
Example 3.7.6 A puzzling exercise for the student

Consider the vector field
−y x
F= i+ 2 j + 0k.
x2 + y 2 x + y2
I
1. Evaluate the line integral P (x, y) dx + Q(x, y) dy if
C
(a) the path C1 is the unit circle traversed counterclockwise starting and ending at (x, y) =
(1, 0).
(b) the path C2 is the unit circle traversed clockwise starting and ending at (x, y) = (1, 0).
Make sure your parameterisation does give clockwise motion!
ZZ
∂Q ∂P
2. Evaluate the double integral − dA over the region
R ∂x ∂y
R = (x, y) ∈ R2 : x2 + y 2 ≤ 1 .

3. Given these results, explain why this does not present a contradictory example of the path inde-
pendence result implied by Green’s Theorem.
Example 3.7.7 An application of Green’s theorem: The area of an ellipse

x2 y2
The elliptical curve + = 1 for a > 0 and b > 0 can be parameterised as
a2 b2
r(θ) = a cos(θ) i + b sin(θ) j + 0k for 0 ≤ θ ≤ 2π.
Note that
dr
= −a sin(θ) i + b cos(θ) j + 0k.
dθ
Let R be the elliptical region bounded by the ellipse, then the area of the elliptical region is
ZZ
A= 1 dA.
R
We could use Green’s theorem here if we can find functions P (x, y) and Q(x, y) such that
∂Q ∂P
− =1
∂x ∂y
which can be satisfied by
1 1
P (x, y) = − y and Q(x, y) = x.
2 2
Therefore, by Green’s theorem we have
ZZ
A= (1) dA
R
ZZ
∂ 1 ∂ 1
= x − − y dA
R ∂x 2 ∂y 2
I
1 dx 1 dy
= − y + x dθ
C 2 dθ 2 dθ
Z 2π
1 1
= − (b sin(θ)) (−a sin(θ)) + (a cos(θ)) (b cos(θ)) dθ
0 2 2
Z 2π
1
= ab dθ
2 0
= abπ.
127
Chapter 4
Fourier Series
128
4.1 Periodic Functions
4.1.1 Periodic functions
Consider some “real world” phenomena
I position of hands on an analogue watch,

I position of a swinging pendulum,
I phases of the moon,
I position of the planets,
I waves in fluids,
I alternating electric current.
Each demonstrates a repetitive behaviour, for example a planet will pass through a specific position on
its orbit at specific equally spaced times. These phenomena are said to be “periodic”.
Definition: Periodic function
A function y = f (x) is periodic if for some fixed positive real number N
f (x + N ) = f (x) for all x.
The smallest positive real N to satisfy this is the period P
f (x + P ) = f (x) for all x.
Thus the function repeats itself
. . . = f (x − 2P ) = f (x − P ) = f (x) = f (x + P ) = f (x + 2P ) = . . .
The frequency is the number of times the function repeats in a unit interval: if P is the period, then
thefrequency is 1/P , and the circular frequency is 2π/P .
Example 4.1.1
The sine function f (x) = sin(x)
and the cosine function f (x) = cos(x)
are two functions that you are familiar with that are periodic. These functions repeat themselves over
a period of 2π, that is, P = 2π.
In general, periodic functions do not have to be continuous, let alone smooth. Furthermore, the period
of periodic functions can be of any length.
Example 4.1.2
Define the periodic function
sin(2x) for 0 ≤ x ≤ π2

f (x) = and f (x) = f (x + π)
0 for π2 ≤ x ≤ π
which is represented by the graph
and has a period of π units.
Example 4.1.3
Define the periodic function
f (x) = x2 for − 1 ≤ x ≤ 1 and f (x) = f (x + 2)
and has a period of 2 units.
130
Example 4.1.4
Define the periodic function, the sawtooth function,
f (x) = x for − π ≤ x ≤ π and f (x) = f (x + 2π)
and has a period of 2π units.
Example 4.1.5
Define the periodic function, a step function,

 −π for − π < x < − π2
f (x) = π for − π2 < x < π2 and f (x) = f (x + 2π)
−π for π2 < x < π

and has a period of 2π units.
131
4.1.2 Trigonometric sums
2π
Observe that the functions such as sin(2x) which has period π units, sin(3x) which has period units
3
π
and sin(10x) which has period units, all repeat over 2π units.
5
What happens if we sum two sine functions?
Example 4.1.6
The function sin(x) + sin(2x) represented by the graph
has a period of 2π units.

What happens if we sum sine functions but use coefficients that are not all equal to one?
Example 4.1.7
The function
2
2 sin(x) − sin(2x) + sin(3x)
3
represented by the graph
has a period of 2π units.

And if we keep adding more sine terms?
132
Example 4.1.8
Consider the function
2 1 2
2 sin(x) − sin(2x) + sin(3x) − sin(4x) + sin(5x)
3 2 5
and the function

2 1 2 1 2 1 2
2 sin(x) − sin(2x) + sin(3x) − sin(4x) + sin(5x) − sin(6x) + sin(7x) − sin(8x) + sin(9x)
3 2 5 3 7 4 9
Both are 2π-periodic functions.

We could do the same using cosine functions instead of sine functions. To make it interesting, we will
not include cosines of some frequencies (which corresponds mathematically to having some coefficients
equal to zero).
133
Example 4.1.9
The function
4
4 cos(x) − cos(3x)
3
is represented by the graph
The function
4 4
4 cos(x) − cos(3x) + cos(5x)
3 5
The function
4 4 4 4
4 cos(x) − cos(3x) + cos(5x) − cos(7x) + cos(9x)
3 5 7 9
134
All three functions are 2π-periodic.

We can make a number of observations from these examples.
1. Each example illustrates a 2π-periodic function.
2. The more terms added to the sine partial sum and the more terms added to the cosine partial
sum, the more each graph looks like the periodic functions we saw in the previous section.
3. It would appear that if we continue to add more and more terms to the sine summation and cosine
summation, then the corresponding graphs would converge to the graphs of the periodic functions
we saw in the previous section.
4. The choice of coefficients was deliberate. We could have used any numbers for the coefficients, but
we would not have obtained the periodic graphs above. (By now you may be wondering; “how
did we choose those specific coefficients?”)
4.1.3 Convergence of trigonometric sums
Recall from first year that we saw how to represent exponential, trigonometric and hyperbolic functions
∞
X
as a power series, f (x) = cn xn , for example;
n=0
∞
1 2 1 X 1 n
exp(x) = 1 + x + x + x3 + . . . = x
2! 3! n=0
n!
∞ 2n+1
1 3 1 X (−1)
sin(x) = x − x + x5 − . . . = x2n+1
3! 5! n=0
(2n + 1)!
∞
1 2 1 X 1 2n
cosh(x) = 1 + x + x4 + . . . = x
2! 4! n=0
2n!
Power series comprise a basis set of power functions
1, x, x2 , x3 , . . .

and to determine the coefficients Taylor’s formula gave us

1 (n)
cn = f (0) .
n!
135
The examples above suggest we could have a similar progression from trigonometric polynomials to a
infinite trigonometric series as we did from polynomials to an infinite power series which could be used
to represent periodic functions using a basis set of trigonometric functions

cos(x) , sin(x) , cos(2x) , sin(2x) , cos(3x) , sin(3x) , . . . .
Although superficially similar, the progressions are profoundly different. Power series theory has been
well established since the early days of calculus, while there still remains unresolved fundamental issues
for infinite trigonometric series.
Recall that a power series either converges everywhere, or on an interval centered at x = 0, or nowhere
except at x = 0. An infinite trigonometric series can converge on quite bizarre sets which led the
nineteenth century German mathematician Georg Cantor to formulate modern set theory. This was
crucial in the establishment of the foundations of modern mathematics.
Furthermore, when a power series converges, it converges to an analytic function, which is infinitely
differentiable, and whose derivatives are represented by the power series obtained by term-wise dif-
ferentiation. In contrast, infinite trigonometric series may converge, not only to periodic continuous
functions, but also to a wide variety of discontinuous functions and, when suitably interpreted, to gen-
eralised functions like the delta function (which we met in Laplace transforms in first year). This means
termwise differentiation of a infinite trigonometric series is a non-trivial issue.
136
4.2 Fourier Series
4.2.1 Fourier series of 2π-periodic functions
Fourier series are used to represent a given periodic function as a infinite trigonometric series
Definition: Fourier Series of a 2π-periodic function
The Fourier series of a 2π-periodic function f (x) defined on the interval −π < x < π is defined
as the infinite trigonometric series
∞
X
f (x) = a0 + an cos(nx) + bn sin(nx) .
n=1
If this series converges then the series representation of the function f (x) will be 2π-periodic. We will
consider convergence later.
If we recall the trigonometric identities
1
cos(A) cos(B) = (cos(A − B) + cos(A + B)) ,
2
1
sin(A) sin(B) = (cos(A − B) − cos(A + B)) ,
2
1
sin(A) cos(B) = (sin(A + B) + sin(A − B)) ,
2
then we can return to the question of how to determine the coefficients of the Fourier series, that is,
a0 , an and bn .
For determining a0 , integrate the Fourier series of f (x) with respect to x on the interval −π < x < π
∞
Z π Z π !
X
f (x) dx = a0 + (an cos(nx) + bn sin(nx)) dx
−π −π n=1
Z π Z π ∞
X Z π Z π
f (x) dx = a0 1 dx + an cos(nx) dx + bn sin(nx) dx
−π −π n=1 −π −π
Z π ∞
X
f (x) dx = a0 (2π) + an (0) + bn (0)
−π n=1
Therefore, the a0 coefficient is defined as

Z π
1
a0 = f (x) dx.
2π −π
The first coefficient a0 is simply the average value of the function f (x) over the length of the period,
2π.
To determine the an coefficients for any n ∈ N, multiply the Fourier series of f (x) by cos(mx) for
some fixed m ∈ N and then integrate with respect to x on the interval −π < x < π.
∞
Z π Z π !
X
f (x) cos(mx) dx = a0 cos(mx) + an cos(nx) cos(mx) + bn sin(nx) cos(mx) dx
−π −π n=1
Z π Z π
f (x) cos(mx) dx = a0 cos(mx) dx
−π −π
∞
X Z π Z π
+ an cos(nx) cos(mx) dx + bn sin(nx) cos(mx) dx
n=1 −π −π
The first term on the right hand side gives

Z π
a0 cos(mx) dx = 0 for any m ∈ N.
−π
The second term on the right hand side;
I If n 6= m then, applying the first trigonometric identity given above, gives

Z π
1 π
Z
cos(nx) cos(mx) dx = cos((n − m) x) + cos((n + m) x) dx
−π 2 −π
iπ iπ
1 1 h 1 h
= sin((n − m) x) + sin((n + m) x)
2 n−m −π n+m −π

1 1 1 1
= (0) + (0)
2 n−m 2n+m
= 0.
I If n = m then
Z π Z π
cos(nx) cos(nx) dx = cos2 (nx) dx
−π −π
Z π
1 1
= + cos(2nx) dx
−π 2 2
π
1 1
= x+ sin(2nx)
2 2n −π

1 1
= 2π + (0)
2 2n
= π.
Thus the second term on the right hand side gives

Z π
an π for n = m
an cos(nx) cos(mx) dx =
−π 0 6 m
for n =
138
The third term on the right hand side;
I If n 6= m then, applying the third trigonometric identity given above, gives

Z π
1 π
Z
sin(nx) cos(mx) dx = sin((n + m) x) + sin((n − m) x) dx
−π 2 −π
iπ iπ
1 1 h 1 h
=− cos((n + m) x) + cos((n − m) x)
2 n+m −π n−m −π

1 1 1
=− (0) + (0)
2 n+m n−m
= 0.
I If n = m then
Z π Z π 1
sin(nx) cos(nx) dx = sin(2nx) dx
−π −π 2
1 1 h iπ
=− cos(2nx)
2 2n −π
1 1
=− (0)
2 2n
= 0.
Thus the third term on the right hand side gives

Z π
bn sin(nx) cos(mx) dx = 0 for all n, m ∈ N.
−π
Therefore, the an coefficient is defined as

Z π
1
an = f (x) cos(nx) dx.
π −π
To determine the bn coefficients for any n ∈ N, multiply the Fourier series of f (x) by sin(mx) for
some fixed m ∈ N and then integrate with respect to x on the interval −π < x < π
∞
Z π Z π !
X
f (x) sin(mx) dx = a0 sin(mx) + (an cos(nx) sin(mx) + bn sin(nx) sin(mx)) dx
−π −π n=1
Z π Z π
f (x) sin(mx) dx = a0 sin(mx) dx
−π −π
∞
X Z π Z π
+ an cos(nx) sin(mx) dx + bn sin(nx) sin(mx) dx
n=1 −π −π
The first term on the right hand side gives

Z π
a0 sin(mx) dx = 0 for any m ∈ N.
−π
139
The second term on the right hand side;
I If n 6= m then, applying the fourth trigonometric identity given above, gives

Z π
1 π
Z
cos(nx) sin(mx) dx = sin((n + m) x) − sin((n − m) x) dx
−π 2 −π

1 1 π 1 π
=− [cos((n + m) x)]−π − [cos((n − m) x)]−π
2 n+m n−m

1 1 1 1
=− (0) − (0)
2 n+m 2n−m
= 0.
I If n = m then
Z π Z π
1
cos(nx) sin(nx) dx = sin(2nx) dx
−π −π 2
1 1 h iπ
=− cos(2nx)
2 2n −π
1 1
=− (0)
2 2n
= 0.
Thus the second term on the right hand side gives

Z π
an cos(nx) sin(mx) dx = 0 for all n, m ∈ N.
−π
The third term on the right hand side;
I If n 6= m then, applying the second trigonometric identity given above, gives

Z π
1 π
Z
sin(nx) sin(mx) dx = cos((n − m) x) − cos((n + m) x) dx
−π 2 −π
iπ iπ
1 1 h 1 h
= sin((n − m) x) − sin((n + m) x)
2 n−m −π n+m −π

1 1 1 1
= (0) − (0)
2 n−m 2n+m
= 0.
I If n = m then
Z π Z π
sin(nx) sin(nx) dx = sin2 (nx) dx
−π −π
Z π
1 1
= − cos(2nx) dx
−π 2 2
π
1 1
= x− sin(2nx)
2 2n −π
1
= (2π)
2
= π.
Thus the third term on the right hand side gives

Z π
bn π for n = m
bn sin(nx) sin(mx) dx =
−π 0 6 m
for n =
140
Therefore, the bn coefficient is defined as

Z π
1
bn = f (x) sin(nx) dx.
π −π
Definition: Euler formulae for the Fourier coefficients
Given a 2π-periodic function f (x) defined on the interval −π < x < π, the Fourier coefficients
are
Z π
1
a0 = f (x) dx
2π −π
1 π
Z
an = f (x) cos(nx) dx for n ∈ N
π −π
1 π
Z
bn = f (x) sin(nx) dx for n ∈ N
π −π
Example 4.2.1
Determine the Fourier series for the step function

 −π for − π < x < − π2
f (x) = π for − π2 < x < π2 and f (x) = f (x + 2π)
−π for π2 < x < π

The a0 coefficient is given by

Z π
1
a0 = f (x) dx
2π −π
!
Z −π Z π Z π
1 2 2
= − π dx + π dx + π dx
2π −π −π
2
π
2
−π π
h iπ
1 i 2
h i 2
h
= − πx + πx + − πx π
2π −π −π
2 2
= 0.
The an coefficient is given by

1 π
Z
an = f (x) cos(nx) dx
π −π
Z − π2 Z π2 Z π !
1
= − π cos(nx) dx + π cos(nx) dx + − π cos(nx) dx
π −π −π2
π
2
1

πh i− π2 πh i π2 πh iπ
= − sin(nx) + sin(nx) π − sin(nx) π
π n −π n −2 n 2
1 π h π i π h π π i π h π i
= − sin −n −0 + sin n − sin −n − 0 − sin n
π n 2 n 2 2 n 2
4 nπ
= sin
n 2
where we have used the symmetry property sin(−θ) = − sin(θ). Note that

π  0 for n = 2, 4, 6, . . .
sin n = 1 for n = 1, 5, 9, . . .
2
−1 for n = 3, 7, 11, . . .

141
The bn coefficient is given by
1 π
Z
bn = f (x) sin(nx) dx
π −π
Z − π2 Z π2 Z π !
1
= − π sin(nx) dx + π sin(nx) dx + − π sin(nx) dx
π −π −π2
π
2

1 πh i− π2 πh i π2 πh iπ
= cos(nx) − cos(nx) π + cos(nx) π
π n −π n −2 n 2
1 π
h nπ i π h nπ nπ i π h nπ i
= cos − − cos(−nπ) − cos − cos − + cos(nπ) − cos
π n 2 n 2 2 n 2
=0
where we have used the symmetry property cos(−θ) = cos(θ).

Hence, the Fourier series for f (x) is
∞
X 4 nπ
f (x) = sin cos(nx) ,
n=1
n 2
that is,
4 4 4
f (x) = 4 cos(x) − cos(3x) + cos(5x) − cos(7x) + . . .
3 5 7
Example 4.2.2
Determine the Fourier series for the sawtooth function
f (x) = x for − π ≤ x ≤ π and f (x) = f (x + 2π)

Z π
1
a0 = f (x) dx
2π −π
Z π
1
= x dx
2π −π
π
1 1 2
= x
2π 2 −π
= 0.
1 π
Z
an = f (x) cos(nx) dx
π −π
1 π
Z
= x cos(nx) dx.
π −π
This definite integral will require integration by parts:
142
dv du 1
Let u(x) = x and = cos(nx) then = 1 and v(x) = sin(nx), therefore
dx dt n
Z −π
1
an = x cos(nx) dx
π −π
iπ Z π
1 hx 1
= sin(nx) − sin(nx) dx
π n −π −π n
iπ π !
1 hx 1
= sin(nx) − − 2 cos(nx)
π n −π n −π
π
1 x 1
= sin(nx) + 2 cos(nx)
π n n −π
= 0.
1 π
Z
bn = f (x) sin(nx) dx
π −π
1 π
Z
= x sin(nx) dx.
π −π

dv du 1
Let u(x) = x and = sin(nx) then = 1 and v(x) = − cos(nx), therefore
dx dt n
1 −π
Z
an = x sin(nx) dx
π −π
iπ Z π
1 h x 1
= − cos(nx) − − cos(nx) dx
π n −π −π n
iπ π !
1 h x 1
= − cos(nx) − − 2 sin(nx)
π n −π n −π
π
1 x 1
= − cos(nx) + 2 sin(nx)
π n n −π

1 2π
= − cos(nπ)
π n
2 n
= − (−1)
n
2 n+1
= (−1) .
n

∞
X 2 n+1
f (x) = (−1) sin(nx) ,
n=1
n
that is,
2 1
f (x) = 2 sin(x) − sin(2x) + sin(3x) − sin(4x) + . . .
3 2
143
4.2.2 Fourier series for functions of arbitrary period
Our two previous examples both had period of 2π. But what if we had a function with period P 6= 2π?
How do we write the Fourier series for a function with an arbitrary period of 2L? This is quite easy to
do, as we shall see.
If we start with a 2π-periodic function then we have f (x) = f (x + 2π). We want to find a transformation
such that the function is 2L-periodic;

2πL
f (x + 2π) = f x +
L

π L
=f x + 2L .
L π
L π
π

If we introduce the coordinate transformation X = x then we have f L X =f L (X + 2L) and we
π
can find the Fourier series and Fourier coefficients for a 2L-periodic function.
Definition: Fourier series for an arbitrary period function
The Fourier series of a 2L-periodic function f (x) defined on the interval −L < x < L is defined
as the infinite trigonometric series
∞
X nπ nπ
f (x) = a0 + an cos x + bn sin x
n=1
L L
where the Fourier coefficients are

Z L
1
a0 = f (x) dx
2L −L
1 L
Z nπ
an = f (x) cos x dx for n ∈ N
L −L L
1 L
Z nπ
bn = f (x) sin x dx for n ∈ N
L −L L
144
Example 4.2.3
Determine the Fourier series for the function
f (x) = x + 1 for − 2 < x ≤ 2 and f (x) = f (x + 4) .
First we sketch the graph of the function
This function has a period of 4 units, then L = 2.

Z 2
1
a0 = f (x) dx
4 −2
1 2
Z
= x + 1 dx
4 −2
i2
1 h1 2
= x +x
4 2 −2
= 1.

Z 2
1 nπ
an = f (x) cos x dx
2 −2 2
1 2
Z nπ
= (x + 1) cos x dx.
2 −2 2

dv nπ du 2 nπ
Let u(x) = x + 1 and = cos x then = 1 and v(x) = sin x , therefore
dx 2 dt nπ 2
1 2
Z nπ
an = f (x) cos x dx
2 −2 2
1 2
Z nπ
= (x + 1) cos x dx
2 −2 2
!
1

2 nπ 2 2
Z 2
nπ
= (x + 1) sin x − sin x dx
2 nπ 2 −2 nπ −2 2
nπ i2
1 2 h 4 h nπ i2
= (x + 1) sin x + 2 2 cos x
2 nπ 2 −2 n π 2 −2
= 0.
145

Z 2
1 nπ
bn = f (x) sin x dx
2 −2 2
1 2
Z nπ
= (x + 1) sin x dx
2 −2 2

dv nπ du 2 nπ
Let u(x) = x + 1 and = sin x then = 1 and v(x) = − cos x , therefore
dx 2 dt nπ 2
1 2
Z nπ
bn = f (x) sin x dx
2 −2 2
1 2
Z nπ
= (x + 1) sin x dx
2 −2 2
nπ i2 Z 2 nπ
1 2 h 2
= − (x + 1) cos x + cos x dx
2 nπ 2 −2 nπ −2 2
!
1 2 h nπ i2 2

2 nπ 2
= − (x + 1) cos x + sin x
2 nπ 2 −2 nπ nπ 2 −2

1 2
= − [3 cos(nπ) − (−1) cos(−nπ)] + 0
2 nπ
4 n
=− (−1)
nπ
4 n+1
= (−1) .
nπ
∞ nπ
X 4 n+1
f (x) ∼ 1 + (−1) sin x .
n=1
nπ 2
Consider the partial summation

N nπ
X 4 n+1
SN = 1 + (−1) sin x for finite N.
n=1
nπ 2
For N = 2 we have
4 π
S2 (x) = 1 + sin x − 2π sin(πx)
π 2
which has graph
146
For N = 5 we have

4 π 4 3π 4 5π
S5 (x) = 1 + sin x − 2π sin(πx) + sin x − π sin(2πx) + sin x
π 2 3π 2 5π 2
which has graph
For N = 10 the partial sum function S10 (x) has the graph
4.2.3 Convergence of a Fourier series
We have stated that the Fourier series of a function f (x) defined on −L < x < L is
∞
X nπ nπ
f (x) ∼ a0 + an cos x + bn sin x
n=1
L L

Z L
1 L 1 L
Z Z
1 nπ nπ
a0 = f (x) dx, an = f (x) cos x dx, bn = f (x) sin x dx.
2L −L L −L L L −L L
The choice of function f cannot be completely arbitrary since the formulae for the Fourier coefficients
must be well-defined and finite. Even if the Fourier coefficients are finite, the resulting Fourier series may
not converge. And then, even if the Fourier series converges there is no guarantee that it will converge
to the original function f (x). Because of these potential complications we often use the symbol ∼ for
“represented by” instead of the equality symbol.
In the previous example we saw that the function
f (x) = x + 1 for − 2 < x ≤ 2 and f (x) = f (x + 4) .
has the Fourier series representation
∞ nπ
X 4 n+1
f (x) ∼ 1 + (−1) sin x .
n=1
nπ 2
147
Applying standard series convergence tests such as ratio test are inconclusive. If we did know that the
series converges it is not obvious if it will converge to the original f (x). In fact, this Fourier series
representation does not converge for all real x. Observe that at x = 2, all but the very first term of the
Fourier series representation are zero, and thus the Fourier series converges to 1 at x = 2. This does
not agree with the value of the function at x = 2, that is, f (2) = 3.
Given the N th -partial sum of a Fourier series
N
X nπ nπ
SN (x) = a0 + an cos x + bn sin x for finite N
n=1
L L
then the Fourier series converges at a point x = x0 if and only if the limit of the partial sums at that
point exists:
lim (SN (x)) = F (x0 )
x−→x0
which may or may not agree with the value of the original function f (x0 ).
Theorem Convergence conditions for a Fourier series
df
Let f and be continuous except at a number of finite points, and only have jump discontinuities
dx
at these points, in the interval −L < x < L. The Fourier series representation of f converges to
f (xc ) at points of continuity xc . At a point of discontinuity, xd , the Fourier series will converge
to the average value of the left hand and right hand limits of f at xd , namely
1
lim (f (xd − ε) + f (xd + ε))
2 ε→0
In the previous example we saw that the function
f (x) = x + 1 for − 2 ≤ x ≤ 2 and f (x) = f (x + 4) .
has the Fourier series representation

∞ nπ
X 4 n+1
f (x) ∼ 1 + (−1) sin x .
n=1
nπ 2
and we saw the graphs for the partial sum functions S2 (x), S5 (x) and S10 (x). Consider a larger value
of N ; for N = 20 the partial sum function S20 (x) has the graph
Considering the graphs for S2 (x), S5 (x), S10 (x) and S20 (x) we observe that away from the points of
jump discontinuity, the Fourier series does indeed appear to be converging to f (x). However, closer to
a point of discontinuity the convergence of the Fourier series appears to be slower. Furthermore, near
148
the points of discontinuity there are overshoots and undershoots in the Fourier series representation,
and these do not appear to decrease as N increases, instead they become narrow sharp spikes. This
behaviour which is a manifestation of the subtle non-uniform convergence of a Fourier series is known
as the Gibbs phenomenon.
149
4.3 Fourier Cosine and Sine Series
4.3.1 Even and odd functions
Definition: Odd and even functions
A function f is an even function if

f (−x) = f (x)
for all x.
A function f is an odd function if
f (−x) = −f (x)
for all x.
Example 4.3.1 Even functions

Graphically, even functions have a reflection symmetry about the y-axis.
Consider f (x) = x2 we have
2
f (−x) = (−x)
=x
= f (x)
then f (x) = x2 is an even function.

Other examples of even functions include the even degree power functions 1, x4 , x6 , . . ., the cosine
functions cos(nx) and the hyperbolic cosine functions cosh(αx).
Example 4.3.2 Odd functions

Graphically, odd functions have a π radians rotation symmetry about the origin (or equivalently, a
reflection about the y-axis followed by a reflection about the x-axis).
Consider f (x) = x3 we have
3
f (−x) = (−x)
3
= (−1) x3
= −1 x3

= −f (x)
then f (x) = x3 is an odd function.

Other examples of odd functions include the odd degree power functions x, x5 , x7 , . . ., the sine functions
sin(nx) and the hyperbolic sine functions sinh(αx).
Example 4.3.3 Functions with no symmetry

There are functions that have neither even or odd symmetry.
Consider f (x) = 2x we have
1
f (−1) =
2
while
f (1) = 2
thus f (−1) 6= f (1) and therefore the exponential function f (x) = 2x has no symmetry.
Other examples of functions that do not have symmetry include ax , x + 1 and x2 + sin(x). However, a
function with no symmetry can be written as the summation of an even function and an odd function.
Example 4.3.4
Consider the Fourier series representation of a function f (x) defined as
∞
X nπ nπ
f (x) ∼ a0 + an cos x + bn sin x .
n=1
L L
Since a0 and cos nπ nπ

L x are even functions while sin L x are odd functions then we can write the
Fourier series as the sum of an even function
∞
X nπ
F (x) = a0 + an cos x
n=1
L
and an odd function

∞
X nπ
G(x) = bn sin x .
n=1
L
A list of properties for even and odd functions:
I If f (x) is an even function and g(x) is an even function then the product f (x) g(x) is an even
function.
I If f (x) is an even function and g(x) is an odd function then the product f (x) g(x) is an odd
function.
I If f (x) is an odd function and g(x) is an odd function then the product f (x) g(x) is an even
function.
Z L Z L
I If f (x) is an even function then f (x) dx = 2 f (x) dx.
−L 0
Z L
I If f (x) is an odd function then f (x) dx = 0.
−L
151
4.3.2 Even and odd functions
In view of the foregoing list of properties, if f is an even function then the Fourier coefficients become
Z L
1
a0 = f (x) dx
2L −L
1 L
Z
= f (x) dx
L 0
1 L
Z nπ
an = f (x) cos x dx
L −L L
2 L
Z nπ
= f (x) cos x dx
L 0 L
1 L
Z nπ
bn = f (x) sin x dx
L −L L
=0
while if f is an odd function then the Fourier coefficients become

Z L
1
a0 = f (x) dx
2L −L
=0
Z L
1 nπ
an = f (x) cos x dx
L −L L
=0
Z L
1 nπ
bn = f (x) sin x dx
L
−L L
2 L
Z nπ
= f (x) sin x dx
L 0 L
In summary, we have
152
Definition: Fourier cosine and sine series
The Fourier series of an even function f (x) on −L < x < L is the Fourier cosine series
∞
X nπ
f (x) = a0 + an cos x
n=1
L

Z L
1
a0 = f (x) dx
L 0
Z L
2 nπ
an = f (x) cos x dx for n ∈ N
L 0 L
The Fourier series of an odd function f (x) on −L < x < L is the Fourier sine series
∞
X nπ
f (x) = bn sin x
n=1
L

Z L
2 nπ
bn = f (x) sin x dx for n ∈ N
L 0 L
Example 4.3.5

−k for − π < x < 0
f (x) = and f (x) = f (x + 2π)
k for 0 < x < π
We can show graphically (or mathematically) that f (x) is an odd function. We can then state that
f (x) is an odd function, then we can state that a0 = 0 and an = 0 for all n ∈ N.
The bn coefficients are
Z π
2
bn = k sin(nx) dx
π 0
2k 1 h iπ
=− cos(nx)
π n 0
2k
=− (cos(nπ) − 1) .
nπ
Hence, the Fourier sine series for f (x) is
∞
X 2k
f (x) = (1 − cos(nπ)) sin(nx) .
n=1
nπ
4.3.3 Half-range expansions
Throughout the preceding discussions it was assumed that the function f was defined on an interval
−L < x < L symmetric about the origin. However, we will find in many instances that a function
153
will only be defined on an interval 0 < x < L and then we will need to extend the function to include
−L < x < 0. We will find that this will be of use in solving partial differential equations.
Let y = f (x) be a function defined on the interval 0 < x < L.
I Reflect the graph of y = f (x) about the y-axis onto −L < x < 0, and then extend this to an
even periodic function of period 2L. The resulting function feven (x) is called the even periodic
extension, and has a Fourier cosine series expansion.
I Rotate the graph of y = f (x) about the origin by π radians onto −L < x < 0, and then extend
this to an odd periodic function of period 2L. The resulting function fodd (x) is called the odd
periodic extension, and has a Fourier sine series expansion.
Example 4.3.6
f (x) = x for 0 < x < 1.
I If we perform an even periodic extension of f then we would have

x for 0 < x < 1
f (x) = and f (x + 2) = f (x) .
−x for − 1 < x < 0
Since this function is even then bn = 0 for all n ∈ N.

For the a0 coefficient, we have
Z 1
a0 = x dx
0
1
1 2
= x
2 0
1
=
2
and for the an coefficient, we have
Z 1
an = 2 x cos(nπx) dx
0
h i1 Z 1
1 1
=2 x sin(nπx) − sin(nπx) dx
nπ 0 nπ 0
h i1 i1
1 1 h
=2 x sin(nπx) + 2 2 cos(nπx)
nπ 0 n π 0
2
= 2 2 (cos(nπ) − 1)
n π
2 n
= 2 2 ((−1) − 1) .
n π
Therefore, the Fourier cosine series of f is
∞
1 X 2 n
f (x) = + ((−1) − 1) cos(nπx) .
2 n=1 n2 π 2
I If we perform an odd periodic extension of f then we would have
f (x) = x for − 1 < x < 1 and f (x + 2) = f (x) .
154
Since this function is odd then a0 = 0 and an = 0 for all n ∈ N.

For the bn coefficient, we have
Z 1
bn = 2 x sin(nπx) dx
0
i1 Z 1
1 h 1
=2 − x cos(nπx) + cos(nπx) dx
nπ 0 nπ 0
i1 i1
1 h 1 h
=2 − x cos(nπx) + 2 2 sin(nπx)
nπ 0 n π 0
2
=− cos(nπ)
nπ
2 n
=− (−1) .
nπ
Therefore, the Fourier sine series of f is
∞
X 2 n+1
f (x) = (−1) sin(nπx) .
n=1
nπ
155
4.4 Applications of Fourier Series
As mentioned earlier, Fourier series and the associated analysis have a wide range of applications. Our
focus has primarily been on developing Fourier series as a tool we will use in solving partial differential
equations.
We will briefly explore some of the other uses for Fourier series here.
4.4.1 Forced oscillations
Consider a dynamical system governed by a second-order linear ordinary differential equation
d2 y dy
m +c + ky = r(t)
dt2 dt
where m, c, k are given constants. This ODE can be used in modelling an electrical circuit or vibrations
in a system. In first year we developed solutions to this type of ODE for simple functions r(t).
The right hand side of this equation, r(t), represents the external forcing on a system. For example,
for a system which is a vibrating mass on a spring, r(t) would represent a driving force causing an
oscillatory vertical motion. If r(t) is too complicated, we may find it difficult to determine an analytical
solution. Assume that the forcing is periodic (for example, the forcing by the tide) then we could find
the Fourier series representation of r(t)
∞
X nπ nπ
r(t) ∼ a0 + an cos t + bn sin t .
n=1
L L
We could say the forcing comes in at certain frequencies or modes. For any mode, we can get the
solution of the governing ODE
yn (t) = An cos(nt) + Bn sin(nt) .
These solutions represent that frequency with which the system responds. We can determine the
coefficients An and Bn by substituting the solution into the ODE and solving for them. The overall
X∞
solution to the system will then simply be the sum y(t) = yn (t). The interesting aspect of this
n=0
problem is that the system has preferred modes or frequencies with which to oscillate. Energy can come
in over a range of frequencies and be converted to the frequencies preferred by the system.
4.4.2 Harmonics and energy of periodic functions
Let f (x) be a 2L-periodic function with a Fourier series representation

∞
X nπ nπ
n=1
L L
valid for all x except at finitely many points where there are jump discontinuous.
nπ nπ
The function an cos x + bn sin x is called the nth -harmonic of f (x). The Fourier series is
L L
thus a decomposition of f (x) into the sum of a constant term and harmonics. The first harmonic
π π
a1 cos x + b1 sin x is also called the fundamental harmonic. The nth -harmonic can be ex-
L L
pressed as a sine curve:
nπ nπ nπ
an cos x + bn sin x = An sin x + δn
L L L

p an
where An = a2n + b2n is the amplitude and δn = arctan is the phase of the nth -harmonic.
bn
157
We can define the following quantities representing the energies of f (x):
Z L
1
2

I E= (f (x)) dx is called the total energy of f (x);
L −L
1 L
Z nπ nπ 2
I En = an cos x + bn sin x dx is the energy of the nth -harmonic of f (x);
L −L L L
and, in particular,
1 L a0 2 a2
Z
I E0 = dx = 0 is the energy of the constant term of f (x).
L −L 2 2

The sequence of energies E0 , E1 , E2 , . . . is called the energy spectrum of f (x). We have the
following theorem giving the relationship between the total energy and the energy spectrum of f (x).
Theorem Energy Theorem
Let f (x) be a 2L-periodic function such that it is piecewise continuous on the interval −L < x < L
and has left-hand and right-hand derivatives at every x. Let
∞
X nπ nπ
n=1
L L
be the Fourier series representation of f (x). Then
1. En = a2n + b2n = A2n .

2. (Parseval’s identity) E = E0 + E1 + E2 + . . .
158
Chapter 5
Ordinary Differential Equations
159
5.1 Ordinary Differential Equations: Assumed Background
Knowledge
The following assumed knowledge is a revision of material covered in the prerequisite ENG1005 Engi-
neering Mathematics (or equivalent units) on ordinary differential equations that you are expected to
know and understand. At the discretion of the lecturer some of this may be briefly revised during the
initial lecture on differential equations. However, more generally it is advisable for you to spend time
on all of the material listed below prior to the differential equations lectures.
5.1.1 Nomenclature
I A differential equation is an equation containing the derivatives of one or more dependent
variables with respect to one or more independent variables.
I An ordinary differential equation (ODE) is an equation containing the derivatives of one
dependent variable with respect to one independent variable.
I A partial differential equation (PDE) is an equation containing the derivatives of one or more
dependent variable with respect to more than one independent variable.
I A differential equation is linear if the dependent variable, and corresponding derivatives, do not
occur as products, raised to powers or in non-linear functions.
I The order of a differential equation is the order of the highest derivative.
Example 5.1.1
dy x
1. The equation = − is a first-order non-linear ODE. It is non-linear since it can be written as
dx y
dy dy
y = −x which has a multiplication of y(x) with .
dx dx
dy
2. The equation x − 4y = x6 ex is a first-order linear ODE.
dx
d2 θ g
3. The simple pendulum equation 2
+ sin(θ) = 0 is a non-linear second-order ODE. It is non-
dt l
linear because of the sine function of the dependent variable θ(t).
3
d2 y dy
4. The equation + = x is a non-linear second-order ODE. It is non-linear because of the
dx2 dx
dy
cubic function of the derivative .
dx
d3 f d2 f
5. The Blasius equation + f = 0 is a non-linear third-order ODE. It is non-linear because
dx3 dx2
d2 f
of the multiplication of f (x) with the derivative .
dx2
∂2y ∂4y
6. The vibrating cantilevered beam equation m + EI = 0 is a linear fourth-order PDE.
∂t2 ∂x4
∂u ∂u
7. The shock wave equation +u = 0 is a non-linear first-order PDE. It is non-linear because
∂x ∂y
∂u
of the multiplication of u(x, y) with .
∂y
5.1.2 First order ODEs
Separable first order ODEs
If an ODE can be written in the form

dy
= f (x) g(y)
dx
then the ODE is said to be separable.
Example 5.1.2
dy x dy 1
Consider = − which is separable as it can be written as = −x , and thus
dx y dx y
dy
y = −x
dx
Z Z
dy
y dx = − x dx
dx
Z Z
d 1 2
y dx = − x dx
dx 2
1 2 1
y = − x2 + C
2 2
for arbitrary constant C. The ODE describes motion in concentric circles x2 + y 2 = C for arbitrary
constant C.
Linear first order ODEs - integration factors
The solution of a linear first-order ODE

dy
+ p(x) y = q(x)
dx
can be found by constructing an integration factor I(x) and rewriting the ODE as
d
I(x) y = I(x) q(x)
dx
Z
where the integration factor is given by I(x) = exp p(x) dx .
Example 5.1.3
dy
Consider the first-order linear ODE x − 4y = x6 ex . Rewrite the ODE in the “standard form”
dx

dy 4
+ − y = x5 ex
dx x
4
then we have p(x) = − and q(x) = x5 ex . The corresponding integration factor is
x
Z
4
I(x) = exp − dx
x
= exp(−4 loge (x))
= exp loge x−4

= x−4 .
161
Multiplying the standard form of the ODE by the integration factor gives

dy 4
x−4 + x−4 − y = x−4 x5 ex
dx x
that is,
d −4
x y = xex .
dx
Integrate with respect to x Z Z
d −4
x y dx = xex dx.
dx
The left hand side is just the application of the fundamental theorem of calculus. The right hand side
dv du
will require integration by parts. Let u = x and = ex then = 1 and v = ex therefore,
dx dx
Z
x−4 y = xex − ex dx
x−4 y = xex − ex + C
for arbitrary constant C. Hence, the general solution of the ODE is
y = x5 ex − x4 ex + Cx4 .
5.1.3 Second order linear ODEs
In ENG1005 we briefly discussed finding the solutions to second order linear ODEs with constant
coefficients. In the following sections we will review this theory and see how it can be extended to
nth -order linear ODEs with constant coefficients.
Given a constant coefficient 2nd -order homogeneous ODE
d2 y dy
a 2
+b + cy = 0
dx dx
first solve the characteristic equation
aλ2 + bλ + c = 0
for λ.
1. If λ1 6= λ2 and λ1 , λ2 ∈ R then
y(x) = c1 eλ1 x + c2 eλ2 x
for arbitrary constants c1 and c2 .
2. λ1 = λ2 and λ1 , λ2 ∈ R then
y(x) = (c1 + c2 x) eλx
3. If λ = α ± iβ then
y(x) = eαx (c1 cos(βx) + c2 sin(βx))
162
Given a constant coefficient 2nd -order non-homogeneous ODE
d2 y dy
a +b + cy = r(x)
dx2 dx
We first find the homogeneous solution, yh (x), corresponding to the 2nd -order homogeneous ODE given
by r(x) = 0. Then we seek a particular solution, yp (x), corresponding to the 2nd -order non-homogeneous
ODE. Finally, the addition yh (x) + yp (x) gives the general solution to the 2nd -order non-homogeneous
ODE. In order to find a particular solution to the non-homogeneous ODE we apply the method of
Undetermined coefficients: In this we try choose yp (x) to be of a similar form to r(x) but involving
unknown constants (ki for i = 0, 1, 2, . . . , n) which are determined by substituting the assumed form of
yp (x) into the non-homogeneous ODE.
I If r(x) ∝ eax then try yp (x) = keax .
I If r(x) ∝ xn then try yp (x) = k0 + k1 x + k2 x2 + . . . + kn xn .

I If r(x) ∝ cos(ωx) or r(x) ∝ sin(ωx) then try yp (x) = k1 cos(ωx) + k2 sin(ωx).
I If r(x) ∝ eax cos(ωx) or r(x) ∝ eax sin(ωx) then try yp (x) = eax (k1 cos(ωx) + k2 sin(ωx)).

I If r(x) ∝ eax xn then try yp (x) = eax k0 + k1 x + k2 x2 + . . . + kn xn .
163
5.2 Higher Order Homogeneous Linear ODEs
5.2.1 Linear independence of functions
The underlying structure of the solution space for a linear ODE has the fundamental property that
the scalar multiple of a solution gives another solution of the linear ODE and the addition of any two
solutions gives another solution of the linear ODE. This property is often referred to as the linearity
property or superposition property, which requires the solutions to be linear independent.
Definition: Linear independent functions

A set of functions f1 (x) , f2 (x) , . . . , fn (x) is said to be linearly independent if
c1 f1 (x) + c2 f2 (x) + . . . + cn fn (x) = 0 if and only if c1 = c2 = . . . = cn = 0.
If any of the
constants c1 , c2 , . . . , c
n fail to be zero then the set of
functions f1 (x) , f2 (x) , . . . , fn (x) is said to be linearly dependent.
Example 5.2.1
Let f (x) = x, g(x) = sin(2x) and h(x) = sin(x) cos(x).
The functions f and g are linearly independent since ax + b sin(2x) = 0 if and only if a = b = 0.
The functions g and h are linear dependent since a sin(2x) + b sin(x) cos(x) = 0 is true for a = 1 and
b = −2.
There is an easy way to check linear independence of n functions:
Definition: The Wronskian
Let f1 (x) , f2 (x) , . . . , fn (x) be n functions which have n − 1 continuous derivatives for all x ∈ R.
The determinant
 
f1 f2 ... fn
 df1 df2 dfn 
 ... 
 dx dx dx 
W (f1 , f2 , . . . , fn ) = det .. .. .. .. 
. . . .
 
 
 dn−1 f dn f2 dn−1 fn 
1
...
dxn−1 dxn dxn−1
is called the Wronskian of the functions f1 , f2 , . . . , fn .
Theorem Criteria for linearly independent functions

The set of functions f1 (x) , f2 (x) , . . . , fn (x) is linearly independent if and only if there exists
an x ∈ R such that
W (f1 , f2 , . . . , fn ) 6= 0 at x.
Example 5.2.2
Let f (x) = x, g(x) = sin(2x) and h(x) = sin(x) cos(x).
The Wronskian of f and g is
" #!
f g
W (x, sin(2x)) = det df dg
dx dx
x sin(2x)
= det
1 2 cos(2x)
= 2x cos(2x) − sin(2x) .
The Wronskian W (x, sin(2x)) = 2x cos(2x) − sin(2x) is equal to zero at a number of points (verify
with a graph) but is not equal to zero for all x ∈ R. Hence f (x) = x and g(x) = sin(2x) are linearly
independent.
The Wronskian of g and h is
" #!
g h
W (sin(2x) , sin(x) cos(x)) = det dg dh
dx dx
sin(2x) cos(x) sin(x)
= det
2 cos(2x) cos2 (x) − sin2 (x)
= sin(2x) cos2 (x) − sin2 (x) − 2 cos(2x) cos(x) sin(x)

= sin(2x) cos(2x) − cos(2x) sin(2x)

= 0.
Since W (sin(2x) , sin(x) cos(x)) = 0 for all x ∈ R then we conclude that g(x) = sin(2x) and h(x) =
sin(x) cos(x) are linearly dependent.
Example 5.2.3
Let f (x) = eαx and g(x) = eβx for real constants α and β. The Wronskian of f and g is
" #!
f g
αx βx

W e ,e = det df dg
dx dx
eαx eβx

= det
αeαx βeβx
= (β − α) e(α+β)x .

The Wronskian W eαx , eβx 6= 0 for all x ∈ R if and only if α 6= β. Hence f (x) = eαx and g(x) = eβx
are linearly independent whenever α 6= β.
5.2.2 Homogeneous nth -order ODEs
For the following discussion we assume that
165
1. ak (x) for k = 0, 1, 2, . . . , n are single continuous real variable functions on some real interval I,
2. an (x) 6= 0 for all x ∈ I, and
3. f (x) is a continuous real variable function on I.
I A non-homogeneous nth -order linear ODE with real coefficients takes the form
dn y dn−1 y dy
an (x) n
+ an−1 (x) n−1
+ . . . + a1 (x) + a0 (x) y(x) = f (x)
dx dx dx
On occasions these are called inhomogeneous ODEs.
I If f (x) = 0 then
dn y dn−1 y dy
an (x) n
+ an−1 (x) n−1 + . . . + a1 (x) + a0 (x) y(x) = 0
dx dx dx
is a homogeneous nth -order linear ODE with real coefficients.
I The trivial solution y(x) = 0 is always a solution of a homogeneous nth -order linear ODE with
real coefficients.
I A homogeneous nth -order linear ODE with real coefficients will have a fundamental solution
set
y1 (x) , y2 (x) , . . . , yn (x)
of n linearly independent solutions yk (x) for k = 1, 2, . . . , n.
Theorem Superposition principle - homogeneous linear ODEs

Given the fundamental solution set y1 (x) , y2 (x) , . . . , yn (x) of a homogeneous nth -order linear
ODE with real coefficients the linear combination
y(x) = c1 y1 (x) + c2 y2 (x) + . . . + cn yn (x)
for arbitrary constants c1 , c2 , . . . , cn , is also a solution of the homogeneous nth -order linear ODE
with real coefficients. To determine all n arbitrary constants we will need n initial and boundary
conditions.
Consider an nth -order linear ODE with real constant coefficients

dn y dn−1 y dy
an n
+ an−1 n−1 + . . . + a1 + a0 y(x) = f (x)
dx dx dx
Equations of this form are common throughout the physical sciences and engineering. The method
for seeking the general solution to equations of this form requires two parts; finding the homogeneous
solution yh (x) and finding the particular solution yp (x).
If the equation is a homogeneous linear ODE then the trival solution y = 0 is always a solution. The
nth -order linear ODE with real constant coefficients has the corresponding characteristic equation
an λn + an−1 λn−1 + . . . + a1 λ + a0 = 0
which has n roots λ1 , λ2 , . . . , λn . There are three main cases to consider:
I all roots are real and distinct

I some roots are complex conjugates
I some real roots are repeated
166
5.2.3 Homogeneous 2nd order linear ODEs with constant coefficients
In this section we will review the material on second order homogeneous linear ODEs that was seen in
first year. You are expected to know this material.
Example 5.2.4
The vertical motion (in centimetres) of a mass (m) attached to a spring (with spring constant k) over
time t is described by the simple harmonic equation
d2 y k
2
+ y(t) = 0.
dt m
Since this is a homogeneous second order linear ODE we already know there will be two linearly
independent solutions y1 (t) and y2 (t) which will give the general solution y(t) = c1 y1 (t) + c2 y2 (t) for
arbitrary constants c1 and c2 . To determine the values of c1 and c2 we will need to know two conditions
characterising the system; for example, if we know the spring was initially at rest stretched down 10
centimetres from the spring-mass equilibrium point then the initial conditions (since they occur at
time t = 0) for the system would be

dy
y(0) = −10 and = 0.
dt t=0
Consider the homogeneous second order linear ODEs with constant coefficients
d2 y dy
a +b + cy(x) = 0
dx2 dx
where a, b and c are real numbers. Assuming a solution of this ODE has the form y(x) = eλx for some
unknown value of λ, then we have the corresponding characteristic equation to the homogeneous second
order linear ODE with constant coefficients could have;
aλ2 + bλ + c = 0
The characteristic equation has two real, distinct roots
If aλ2 + bλ + c = 0 has two real, distinct roots λ1 and λ2 then y1 (x) = eλ1 x and y2 (x) = eλ2 x are
linearly independent solutions of the homogeneous second order linear ODE with constant coefficients.
Therefore, the general solution is
y(x) = c1 eλ1 x + c2 eλ2 x
Example 5.2.5
Find the solution of the second order homogeneous linear ODE
d2 y dy
2
+3 + 2y(t) = 0
dt dt
with the two initial conditions given by

dy
y(0) = −1 and = 5.
dt t=0
167
Assuming y(t) = eλt then the characteristic equation corresponding to the ODE is
λ2 + 3λ + 2 = 0
which has solutions λ1 = −2 and λ2 = −1. Therefore, the general solution (GS) of the ODE is
y(t) = c1 e−2t + c2 e−t .
We note that the first derivative of the general solution with respect to t is
dy
= −2c1 e−2t − c2 e−t .
dt
Applying the first initial condition (IC) gives

IC: y(0) = −1
=⇒ c1 + c2 = −1
GS: y(0) = c1 + c2
and applying the second initial condition gives


dy
IC: =5 

dt t=0

=⇒ −2c1 − c2 = 5
dy
GS: = −2c1 − c2


dt t=0

Therefore, to find the values of c1 and c2 we need to solve the system of equations
c1 + c2 = −1
−2c1 − c2 = 5
which has solutions c1 = −4 and c2 = 3.

Hence, the solution of the initial value problem is
y(t) = −4e−2t + 3e−t .
Note that the solution passes over the t-axis once and approaches 0 as t approaches infinity. With
different initial conditions, the solution need not pass over the axis at all. This case is sometimes called
overdamped.
168
The charateristic equation has two real, repeated roots
If aλ2 + bλ + c = 0 has two real, repeated roots λ1 = λ2 = −b/2a then y1 (x) = eλ1 x and y2 (x) = eλ1 x
would be linearly dependent. A homogeneous second order linear ODE with constant coefficients
should have two linearly independent solutions, but we only have one solution y1 (x) = eλ1 x . We could
try y(x) = xeλ1 x as a possible solution;
d2 d λ1 x
a 2 xeλ1 x + b + cxeλ1 x = a λ21 xeλ1 x + 2λ1 eλ1 x + b λ1 xeλ1 x + eλ1 x + cxeλ1 x

xe
dx dx
= aλ21 + bλ1 + c xeλ1 x + (2aλ1 + b) eλ1 x

= (0) xeλ1 x + (2aλ1 + b) eλ1 x

b
= 2a − + b eλ1 x
2a
= 0.
Therefore, y2 (x) = xeλ1 x is a second solution. Note that the Wronskian of y1 (x) and y2 (x) is
" #!
y1 y2
λ1 x λ1 x

W e , xe = det dy1 dy2
dx dx
eλ1 x xeλ1 x

= det
λ1 eλ1 x λ1 xeλ1 x + eλ1 x
= λ1 xe2λ1 x + e2λ1 x − xe2λ1 x
= e2λ1 x .

In this case W eλ1 x , xeλ1 x 6= 0 for all x ∈ R, therefore y1 (x) = eλ1 x and y2 (x) = xeλ1 x are linearly
independent solutions of the ODE. Hence, the general solution is
y(x) = c1 eλ1 x + c2 xeλ1 x
Example 5.2.6
d2 y dy
+2 + y(t) = 0
dt2 dt
subject to the two initial conditions

dy
y(0) = 3 and = 5.
dt t=0
The characteristic equation corresponding to the ODE is

λ2 + 2λ + 1 = 0
then λ1 = −1 and λ2 = −1. Therefore, the general solution of the ODE is
y(t) = c1 e−t + c2 te−t .
Applying the first initial condition gives c1 = 3 and the second initial condition gives c2 = 8.
Thus, the solution of the initial value problem is
y(t) = 3e−t + 8te−t .
169
Note that the solution is sometimes called critically damped as it is similar to the example for real
distinct roots case, but the damping is just sufficient to suppress vibrations.
The characteristic equation has complex roots
Here there are no real solutions to the characteristic equation, instead there are two complex conjugate
solutions λ1 = α + βi and λ2 = α − βi. Therefore, the general solution is
y(x) = Ae(α+βi)x + Be(α−βi)x .
However, this solution involves complex solutions when we would rather real solutions. We can rewrite
this general solution using Euler’s equation eiθ = cos(θ) + i sin(θ) to give
y(x) = Ae(α+βi)x + Be(α−βi)x

= eαx Aeiβx + Be−iβx

= eαx (A (cos(βx) + i sin(βx)) + B (cos(−βx) + i sin(−βx)))

= eαx (A (cos(βx) + i sin(βx)) + B (cos(βx) − i sin(βx)))
= eαx ((A + B) cos(βx) + i (A − B) sin(βx))
= eαx (c1 cos(βx) + c2 sin(βx))
for arbitrary constants c1 = A + B and c2 = iA − iB.

It is left as an exercise to show that the Wronskian W (eαx cos(βx) , eαx sin(βx)) 6= 0 for all x ∈ R,
therefore y1 (x) = eαx cos(βx) and y2 (x) = eαx sin(βx) are linearly independent solutions of the ODE.
In summary, if the auxillary equation gives complex roots λ1 = α + βi and λ2 = α − βi then the general
solution in the real vector space is
y(x) = eαx (c1 cos(βx) + c2 sin(βx))
170
Example 5.2.7
d2 y dy
2
+2 + 10y(t) = 0
dt dt
subject to the two initial conditions

dy
y(0) = −2 and = 0.
dt t=0
The characteristic equation corresponding to the ODE is

λ2 + 2λ + 10 = 0
which has solutions λ1 = −1 + 3i and λ2 = −1 − 3i. Therefore, the general solution of the ODE is
y(t) = e−t (c1 cos(3t) + c2 sin(3t)) .
Applying the first initial condition gives c1 = −2 and applying the second initial condition gives c2 =
−2/3.

2
y(t) = e−t −2 cos(3t) − sin(3t) .
3
Note that while the solution is damped and y(t) will approach 0 as t approaches infinity, the solution
oscillates about y = 0. This is sometimes called an underdamped system.
5.2.4 Higher order homogeneous linear ODEs with constant coefficients
The discussion for solving homogeneous 2nd order linear ODEs with constant coefficients easily extends
to higher order ODEs.
Consider a third order example here.
171
Example 5.2.8
Find the solution of the initial value problem
d3 y d2 y dy
3
+ 2 +4 + 4y(t) = 0
dt dt dt
subject to the three initial conditions
d2 y

dy
y(0) = 1, = 0 and 2 =2
dt t=0 dt t=0
Assuming y(t) = eλt then the corresponding characteristic equation is

λ3 + λ2 + 4λ + 4 = 0
The auxillary equation has solutions λ1 = −1, λ2 = 2i and λ3 = −2i.
Therefore, the general solution (GS) of the ODE is
y(t) = c1 e−t + c2 cos(2t) + c3 sin(2t)
for arbitrary constants c1 , c2 and c3 .
Applying the first initial condition (IC) gives

IC: y(0) = 1
=⇒ c2 = 1 − c1
GS: y(0) = c1 + c2
Then the solution becomes
y(t) = c1 e−t + (1 − c1 ) cos(2t) + c3 sin(2t)
We note that the first derivative of the general solution with respect to t is
dy
= −c1 e−t − 2 (1 − c1 ) sin(2t) + 2c3 cos(2t) .
dt
Then applying the second initial condition gives


dy
IC: =0 

dt t=0

=⇒ c1 = 2c3
dy
GS: = −c1 + 2c3


dt t=0

Then the solution becomes

y(t) = 2c3 e−t + (1 − 2c3 ) cos(2t) + c3 sin(2t)
We note that the second derivative of the general solution with respect to t is
d2 y
= 2c3 e−t − 4 (1 − 2c3 ) cos(2t) − 4c3 sin(2t) .
dt2
Applying the third initial condition gives

d2 y

IC: =2


dt2 t=0
 3
2 =⇒ c3 =
d y 5
GS: = 2c3 e−t − 4 (1 − 2c3 )


dt2 t=0


6 −t 1 3
y(t) = e − cos(2t) + sin(2t)
5 5 5
172
5.3 Non-Homogeneous nth -Order Linear ODEs
In first year lectures on non-homogeneous ODEs we used the method of undetermined coefficients to
find the particular solution. You may have even used the reduction of order method. In this section
we develop another method to find particular solutions of non-homogeneous nth -order linear ODEs.
Variation of parameters has a distinct advantage over the method of undetermined coefficients in that
it will always result in a particular solution provided the solutions to the associated homogeneous ODE
can be found. Variation of parameters is not restricted to a specific table of functions like the method
of undetermined coefficients (see the ODEs Assumed Background Knowledge section). Furthermore,
variation of parameters can be applied to non-homogeneous nth -order linear ODEs which do not have
constant coefficients. As an example, you could apply variation of parameters to Cauchy-Euler ODEs.
5.3.1 Variation of parameters - First order linear ODEs
Consider a non-homogeneous 1st -order linear ODE

dy
+ p(x) y(x) = r(x)
dx
If y1 (x) is the solution of the homogeneous 1st -order linear ODE
dy
+ p(x) y(x) = 0
dx
then we seek a solution to the non-homogeneous ODE of the form
yp (x) = u(x) y1 (x)
where u(x) is a function to be determined. If this function yp (x) is a solution of the non-homogeneous
equation then we have
dyp
r(x) = + p(x) yp (x)
dx
d
= u(x) y1 (x) + p(x) (u(x) y1 (x))
dx
dy1 du
= u(x) + p(x) y1 (x) + y1 (x)
dx dx
Since y1 (x) is the solution of the homogeneous ODE then this equation reduces to
du du r(x)
y1 (x) = r(x) or =
dx dx y1 (x)
Integrating with respect to x gives the function u(x)

Z
r(x)
u(x) = dx
y1 (x)
and hence determines the function yp (x) = u(x) y1 (x).

This is best illustrated with an example.
Example 5.3.1
Find the general solution of the first order ODE
dy
x2 + xy = 1.
dx
Write the ODE in the standard form for variation of parameters

dy 1 1
+ y= 2
dx x x
The homogeneous ODE

dy 1
+ y = 0.
dx x
is a first order separable ODE and has general solution
1
yh (x) = C
x
for arbitrary constant C.
1
We have y1 (x) = then the function u(x) satisfies
x
du r(x) 1
= =
dx y1 (x) x
and then u(x) = loge (|x|).
Therefore, the particular solution yp (x) = u(x) y1 (x) is
1
yp (x) = loge (|x|)
x
Hence, the general solution y(x) = yh (x) + yp (x) to the non-homogeneous ODE is
1 1
y(x) = C + loge (|x|)
x x
5.3.2 Variation of parameters - Second-order linear ODEs
The method of variation of parameters can be extended to second-order ODEs but with some differences.
Consider a non-homogeneous 2nd -order linear ODE
d2 y dy
+ p(x) + q(x) y(x) = r(x)
dx2 dx
If y1 (x) and y2 (x) are linearly independent solutions of the homogeneous 2nd -order linear ODE
d2 y dy
+ p(x) + q(x) y(x) = 0
dx2 dx
then we seek a solution to the non-homogeneous ODE of the form
yp (x) = u(x) y1 (x) + v(x) y2 (x)
174
where u(x) and v(x) are functions to be determined. If this function yp (x) is a solution of the non-
homogeneous equation then it can be shown that
2 2
d2 u

d y1 dy1 d y2 dy2
r(x) = u(x) 2
+ p(x) + q(x) y1 (x) + v(x) 2
+ p(x) + q(x) y 2 (x) + y1 (x) 2
dx dx dx dx dx
2

du dy1 d v dv dy2 du dv dy1 du dy2 dv
+ + y2 (x) 2 + + p(x) y1 (x) + y2 (x) + + .
dx dx dx dx dx dx dx dx dx dx dx
Since y1 (x) and y2 (x) are linearly independent solutions of the homogeneous ODE then this equation
reduces to

d du dv du dv dy1 du dy2 dv
y1 (x) + y2 (x) + p(x) y1 (x) + y2 (x) + + = r(x) .
dx dx dx dx dx dx dx dx dx
If we make the assumption

du dv
y1 (x) + y2 (x) =0
dx dx
then the original ODE reduces to
dy1 du dy2 dv
+ = r(x)
dx dx dx dx
du dv
We now have a linear system of equations in terms of and which has augmented matrix
dx dx
" #
y1 (x) y2 (x) 0
dy1 dy2
r(x)
dx dx
Provided the Wronskian W (y1 , y2 ) 6= 0 this system can be solved in terms of u and v to give
du −y2 (x) r(x) dv y1 (x) r(x)

= and =
dx W (y1 , y2 ) dx W (y1 , y2 )
Hence, integrate each first order ODE with respect to x to obtain the functions u(x) and v(x), that is,
Z Z
−y2 (x) r(x) y1 (x) r(x)
u(x) = dx and v(x) = dx
W (y1 , y2 ) W (y1 , y2 )
which will determine the particular solution
yp (x) = u(x) y1 (x) + v(x) y2 (x)
to the non-homogeneous ODE.
175
Example 5.3.2
Find the general solution of the second order ODE
d2 y dy
2
−3 + 2y(x) = x.
dx dx
The homogeneous ODE

d2 y dy
−3 + 2y(x) = 0.
dx2 dx
has corresponding characteristic equation
λ2 − 3λ + 2 = 0
which has solutions λ1 = 1 and λ2 = 2. Therefore, the homogeneous solution is
yh (x) = c1 ex + c2 e2x

We seek the particular solution of the form
yp (x) = u(x) y1 (x) + v(x) y2 (x)

= u(x) ex + v(x) e2x
with the two conditions

du dv
y1 (x) + y2 (x) =0
dx dx
dy1 du dy2 dv
+ = r(x)
dx dx dx dx
We have y1 (x) = ex and y2 (x) = e2x then the functions u(x) and v(x) satisfy
du −y2 (x) r(x)

= = −xe−x
dx W (y1 , y2 )
dv y1 (x) r(x)
= = xe−2x
dx W (y1 , y2 )
which by integration by parts gives
u(x) = xe−x + e−x

1 1
v(x) = − xe−2x − e−2x
2 4
Therefore, the particular solution yp (x) = u(x) y1 (x) + v(x) y2 (x) is
1 3
yp (x) = x+
2 4
Hence, the general solution to the non-homogeneous ODE is

1 3
y(x) = c1 ex + c2 e2x + x +
2 4
176
Example 5.3.3
In this example you will consider an ODE that we would not be able to use method of undetermined
coefficients to find the particular solution of the ODE.
It is left as an exercise to use the method of variation of parameters to show the general solution of the
second order ODE
d2 y
+ y(x) = sec(x) .
dx2
is
y(x) = c1 cos(x) + c2 sin(x) + loge (|cos(x)|) cos(x) + x sin(x)
5.3.3 Variation of parameters - nth -order linear ODEs
The method for second-order ODEs can be extended to nth -order ODEs using a similar approach. In
this case, given a non-homogeneous nth -order linear ODE
dn y dn−1 y dy
n
+ pn−1 (x) n−1 + . . . + p1 (x) + p0 y(x) = r(x)
dx dx dx
If y1 (x) , y2 (x) , . . . , yn (x) are linearly independent solutions of the associated homogeneous ODE then
we can seek the particular solution to the non-homogeneous ODE of the form
yp (x) = u1 (x) y1 (x) + u2 (x) y2 (x) + . . . + un (x) yn (x)
where uk (x) (for k = 1, 2, . . . , n) are functions to be determined subject to the n-conditions
du1 du2 dun

y1 (x) + y2 (x) + . . . + yn (x) =0
dx dx dx
dy1 du1 dy2 du2 dyn dun
+ + ... + (x) =0
dx dx dx dx dx dx
.. ..
. .
dn−2 y1 du1 dn−2 y2 du2 dn−2 yn dun
n−2
+ + ... + (x) =0
dx dx dxn−2 dx dxn−2 dx
dn−1 y1 du1 dn−1 y2 du2 dn−1 yn dun
+ + ... + (x) = r(x)
dxn−1 dx dxn−1 dx dx n−1 dx
This leads to a system of n equations for u1 (x) , u2 (x) , . . . , un (x) which in principle can be solved in
the same way as the second-order systems in the previous section.
177
5.4 Systems of Linear Ordinary Differential Equations
5.4.1 Systems of homogeneous 1st -order linear ODEs
First we consider a single tank mixing problem, which will be revision for some of you from previous
studies.
Example 5.4.1
Consider a tank containing 1000 L of pure water. A salt solution of concentration 1 mg/L enters the
tank at a rate of 10 L/s. The mixture is continuously mixed and flows out at 12 L/s. We would like to
know the mass of salt in the tank at any time after the salt solution begins to enter the tank.
The rate of accumulation is equal to the difference between the rate of inflow and the rate of outflow.
Clearly, in this example, the tank will eventually be empty. The change in volume V (t) of the tank
contents can be described by
dV
= Vin − Vout
dt
= 10 − 12
= −2
which means the tank is losing a volume of solution at a rate of 2 L/s. This is a simple first order ODE
with solution
V (t) = −2t + C
for arbitrary constant. Given the tank was full at the start then we have an initial condition to apply

IC: V (0) = 1000
=⇒ C = 1000.
GS: V (0) = −2 (0) + C
Therefore,
V (t) = −2t + 1000
and we can find that the tank will be empty V (t) = 0 after 500 seconds.
M
The concentration of salt is given by C = , where M is the mass of salt, then M = CV . The change
V
in mass is described by
dM
= Min − Mout
dt
M (t)
= (1) (10) − (12)
1000 − 2t
6M (t)
= 10 −
500 − t
which gives a linear first order ODE
dM 6
+ M (t) = 10
dt 500 − t
which has the integration factor

Z
6
I(t) = exp dt
500 − t
= exp(−6 loge (500 − t))
−6
= (500 − t) .
The ODE becomes

d −6

−6
(500 − t) M (t) = 10 (500 − t)
dt
and integrating with respect to t gives
−6 −5
(500 − t) M (t) = 2 (500 − t) +C
6
M (t) = 2 (500 − t) + C (500 − t)
for arbitrary constant C. Since the water was initially pure, that is, M (0) = 0 then
6 1000
0 = 1000 + C (500) =⇒ C = − 6.
(500)
Hence, the amount of salt at any time in the tank is described by

1000 6
M (t) = 2 (500 − t) − 6 (500 − t) for 0 < t < 500.
(500)
Now we consider a motivational example of a two tank mixing problem to introduce the systems of
linear ODEs.
179
Example 5.4.2
Consider two tanks containing 100 L of water each. Initially the first tank contains pure water while the
second tank also contains 300 g of salt. At the exact same moment two tubes are installed connecting
the two tanks; one tube pumps 4 L/min from tank 1 to tank 2, and the other tube pumps 4 L/min
from tank 2 to tank 1. Both tanks are continuously mixed. We would like to know the mass of salt in
each tank at any time after the salt solution begins to circulate between tanks.
Clearly the volume in each tank does not change over time. Let M1 (t) represent the mass of salt in
tank 1 at any time t > 0 and M2 (t) represent the mass of salt in tank 2 at any time t > 0.
We can write the initial conditions: M1 (0) = 0 and M2 (0) = 300.
The change in mass of salt in tank 1 is described by

dM1 M2 (t) M1 (t)
= (4) − (4)
dt 100 100
1 1
= M2 (t) − M1 (t)
25 25
and the change in mass of salt in tank 2 is described by

dM2 M1 (t) M2 (t)
= (4) − (4)
dt 100 100
1 1
= M1 (t) − M2 (t)
25 25
Therefore, we have a system of two coupled linear first order ODEs;
dM1 1 1
= M2 (t) − M1 (t)
dt 25 25
dM2 1 1
= M1 (t) − M2 (t)
dt 25 25
How are we going to find M1 (t) and M2 (t)?
The first step in solving such a system is to write the system in matrix form:
 
dM1
 dt
 = −0.04 0.04 M1
dM2 0.04 −0.04 M2
dt
In matrix notation this can be written as

dm M1 −0.04 0.04
= Am where m = , and A = .
dt M2 0.04 −0.04
This is a first order homogeneous system. Furthermore, we could now write the initial condition
as
0
m(0) = .
300
180
Homogeneous systems with distinct eigenvalues
Theorem General solution - first order homogeneous systems
Suppose we have a first order homogeneous linear system of ODEs:

dx
= Ax
dt
where  
x1 (t)
x =  ... 
 
xn (t)
is the solution vector and A is an n × n constant matrix.
Let λ1 , λ2 , . . . , λn be n distinct eigenvalues of the coefficient A with corresponding linearly inde-

pendent eigenvectors v1 , v2 , . . . , vn then each function
xk (t) = eλk t vk for k = 1, 2, . . . , n
is a solution of the homogeneous system. Therefore, the general solution of the first order homo-
geneous linear system of ODEs is
x(t) = c1 eλ1 t v1 + c2 eλ2 t v2 + . . . + cn eλn t vn
for arbitrary constants c1 , c2 , . . . , cn .
Let λ1 , λ2 , . . . , λn be distinct real eigenvalues of A with corresponding eigenvectors v1 , v2 , . . . , vn , that

is, Avk = λk vk for k = 1, 2, . . . , n.
Observe that if x(t) = c1 eλ1 t v1 + c2 eλ2 t v2 + . . . + cn eλn t vn then
dx
= c1 λ1 eλ1 t v1 + c2 λ2 eλ2 t v2 + . . . + cn λn eλn t vn
dt
= c1 eλ1 t (λ1 v1 ) + c2 eλ2 t (λ2 v2 ) + . . . + cn eλn t (λn vn )
= c1 eλ1 t (Av1 ) + c2 eλ2 t (Av2 ) + . . . + cn eλn t (Avn )
= A c1 eλ1 t v1 + c2 eλ2 t v2 + . . . + cn eλn t vn

= Ax
dx
that is, x(t) is a solution of the first order homogeneous solution = Ax.
dt
Example 5.4.3
We now return to seeking the solution of the homogeneous first order system for the two tank mixing
problem:
dM1 1 1
= M2 (t) − M1 (t)
dt 25 25
dM2 1 1
= M1 (t) − M2 (t)
dt 25 25
subject to the initial conditions
M1 (0) = 0, M2 (0) = 300.
181
Write the homogeneous system in matrix form

dm −0.04 0.04 0
= m with m(0) =
dt 0.04 −0.04 300

−0.04 − λ 0.04
det(A − λI) = det
0.04 −0.04 − λ
2 2
= (−0.04 − λ) + (0.04)
= λ2 + 0.08λ

λ2 + 0.08λ = 0
and then the eigenvalues are λ1 = −0.08 and λ2 = 0.
The corresponding eigenvector to λ1 = −0.08 is given by (A + 0.08I) m1 = 0, that is,

0.04 0.04
m1 = 0.
0.04 0.04
Row 1 and 2 both give 0.04M1 + 0.04M2 = 0. Let M2 = s for parameter s ∈ R then M1 = −s.
Therefore,
−1
λ1 = −0.08, m1 = .
1
The corresponding eigenvector to λ2 = 0 is given by (A + 0I) m2 = 0, that is,

−0.04 0.04
m2 = 0.
0.04 −0.04
Row 1 gives −0.04M1 + 0.04M2 = 0. Let M2 = s for parameter s ∈ R then M1 = s.

Therefore,
1
λ2 = 0, m1 = .
1
Thus, the general solution to the first order homogeneous system is

−0.08t −1 0t 1
m(t) = c1 e + c2 e (5.1)
1 1
for arbitrary constants c1 and c2 . Apply the initial condition to find the constants

−1 1 c1 0
= =⇒ c1 = 150, c2 = 150 (5.2)
1 1 c2 300
Hence, the solution to the mixing problem is

−0.08t −1 0t 1
m(t) = 150e + 150e (5.3)
1 1
that is, the mass in tank 1 is

M1 (t) = −150e−0.08t + 150, t > 0 (5.4)
and the mass in tank 2 is
M2 (t) = 150e−0.08t + 150, t > 0 (5.5)
182
Homogeneous systems with repeated eigenvalues
As discussed in the eigenvalues and eigenvectors section, if an n × n matrix A has an eigenvalue λ of

algebraic multiplicity m, it may or may not be possible to find m linearly independent eigenvectors.
Example 5.4.4
Find the general solution of the homogeneous first order system:
dx
= 3x(t) + y(t) + z(t)
dt
dy
= x(t) + 3y(t) + z(t)
dt
dz
= x(t) + y(t) + 3z(t)
dt

 
3 1 1
dx 
= 1 3 1 x
dt
1 1 3
We saw this coefficient matrix as an example in the eigenvalues and eigenvectors section.
We found the eigenvalues of A to be λ1 = 2, λ2 = 2 and λ3 = 5. We found an algebraic multiplicity of

two for λ = 2
The corresponding eigenvectors, respectively, are
     
−1 −1 1
v1 =  1  , v2 =  0  and v3 =  1 
0 1 1
Therefore, the geometric multiplicity for λ = 2 is also two. In this case the algebraic multiplicity and
geometric multiplicity are equal.
183

It is left as an exercise to confirm that v1 , v2 , v3 is a linearly independent set. Hence, the general
solution of the homogeneous system is
     
−1 −1 1
x(t) = c1 e2t  1  + c2 e2t  0  + c3 e5t  1 
0 1 1
that is,
x(t) = −c1 e2t − c2 e2t + c3 e5t , y(t) = c1 e2t + c3 e5t , z(t) = c2 e2t + c3 e5t .
It is possible for the geometric multiplicity of an eigenvalue to be less than the algebraic multiplicity of
the eigenvalue. In such a case we would need to construct the remaining eigenvector(s) corresponding
to that eigenvalue such that the geometric multiplicity and algebraic multiplicity are equal for that
eigenvalue. This is beyond the scope of this unit and left for your future studies.
A homogeneous system with complex eigenvalues
Theorem Complex eigenvalues
Suppose we have a first order homogeneous linear system of ODEs:

dx
= Ax
dt
where A is the coefficient matrix with real entries. If λ1 = α + iβ is an eigenvalue of A with
corresponding eigenvector v1 then each function
x1 (t) = eλ1 t v1 and x2 (t) = eλ1 t v1
is a complex solution of the homogeneous system.
Example 5.4.5
It is left as an exercise to find the general solution of the homogeneous first-order system
dx
= x(t) + y(t)
dt
dy
= −4x(t) + y(t)
dt
In applications it is usually desirable to find the general solution in terms of real functions instead of
complex functions. You will require Euler’s formula to reduce the complex solutions to real solutions.
Hence, the general solution to the homogeneous system is
x(t) = c1 et cos(2t) + c2 et sin(2t)

y(t) = −2c1 et sin(2t) + 2c2 et cos(2t)
184
5.4.2 A homogeneous system of linear ODEs to a homogeneous nth -order

linear ODE
It is possible to convert from a homogeneous system of linear ODEs with n × n coefficient matrix A to
one homogeneous nth -order linear ODE. This is best illustrated with an example.
Example 5.4.6
Consider the homogeneous first order system:
dx
= x(t) + y(t)
dt
dy
= −4x(t) + y(t)
dt
Differentiate the first ODE with respect to t

d dx dx dy
= +
dt dt dt dt
then substituting in the second ODE gives

d dx dx
= + (−4x(t) + y(t))
dt dt dt
and rearranging the first equation to set up an equation for y(t)
dx
y(t) = − x(t)
dt
which then gives
d dx dx dx
= + −4x(t) + − x(t) .
dt dt dt dt
Therefore, we have a homogeneous second order linear ODE for x(t)
d2 x dx
−2 + 5x(t) = 0
dt2 dt
Assuming x(t) = eλt then the characteristic equation corresponding to the ODE is
λ2 − 2λ + 5 = 0
which has solutions λ1 = 1 − 2i and λ2 = 1 + 2i. Therefore, the general solution of the ODE for x(t) is
for arbitrary constants c1 and c2 . Substituting this into the equation of y(t) gives
dx
y(t) = − x(t)
dt
d
c1 et cos(2t) + c2 et sin(2t) − c1 et cos(2t) + c2 et sin(2t)

=
dt
= −2c1 et sin(2t) + c1 et cos(2t) + 2c2 et cos(2t) + c2 et sin(2t) − c1 et cos(2t) + c2 et sin(2t)

= 2c2 et cos(2t) − 2c1 et sin(2t)
Hence, the general solution of the homogeneous system is

y(t) = 2c2 et cos(2t) − 2c1 et sin(2t)
which agrees with the general solution given in the previous example.
185
5.4.3 A homogeneous nth -order linear ODE to a homogeneous system of 1st

order linear ODEs
Example 5.4.7
Find the solution of third order ODE
d2 u du
+3 − 10u(t) = 0.
dt2 dt
Instead of finding the auxillary equation we could convert this to a homogeneous system of first order
ODEs. Let
du
x = u and y =
dt
then the first derivative of x with respect to t gives
dx du
=
dt dt
=y
the first derivative of y with respect to t gives
dy d2 u
= 2
dt dt
du
= −3 + 10u
dt
= 10x − 3y
Therefore, we have the homogeneous system

dx
=y
dt
dy
= 10x − 3y
dt

dx 0 1
= x
dt 10 −3

−λ 1
det(A − λI) = det
10 −3 − λ
= −λ (−3 − λ) − 10
= λ2 + 3λ − 10
λ2 + 3λ − 10 = 0
and then the eigenvalues are λ1 = −5 and λ2 = 2.

The corresponding eigenvector to λ1 = −5 is given by (A + 5I) v1 = 0, that is,

5 1
v1 = 0.
10 2
186
Row 1 gives 5x + y = 0. Let x = s for parameter s ∈ R then y = −5s.

Therefore,
1
λ1 = −5, v1 =
−5

−2 1
v2 = 0.
10 −5
Row 1 gives −2x + y = 0. Let x = s for parameter s ∈ R then y = 2s.

Therefore,
1
λ2 = 2, v2 =
2
The general solution of the homogeneous system is

1 1
x(t) = c1 e−5t + c2 e2t
−5 2
which gives
x(t) = c1 e−5t + c2 e2t

y(t) = −5c1 e−5t + 2c2 e2t
Since we defined x = u, then we have the solution of the second order ODE is
u(t) = c1 e−5t + c2 e2t
187
5.5 Boundary Value Problems
5.5.1 Boundary value problems
In this section we will discuss boundary value problems for linear ODEs. Here we shall concentrate on
second order ODEs but the theory can be extended to nth -order linear ODEs.
Consider the homogeneous second order linear ODE with real coefficients
d2 y dy
c2 (x) + c1 (x) + c0 (x) y(x) = 0
dx2 dx
defined on the closed interval [a, b], each c0 (x), c1 (x) and c2 (x) is continuous on [a, b] and c2 (x) 6= 0 for
all x ∈ [a, b]. Since this is a second order ODE, there exists two linearly independent solutions which
give a general solution y(x) = c1 y1 (x) + c2 y2 (x). To completely specify a solution to this problem we
need two conditions to determine the arbitrary constants.
So far in first and second year we have considered initial value problems; an ODE with two conditions
specified at the same x-value, that is,

dy
y(a) = k1 and = k2
dx x=a
These two conditions are called the initial conditions.
In contrast a boundary value problem is an ODE with one condition specified at one end of the
interval and another condition specified at the other end of the interval, for example,
y(a) = k1 and y(b) = k2
These two conditions are called the boundary conditions.

It should be noted that initial value problems and boundary value problems are fundamentally different.
Example 5.5.1
The homogeneous second order linear ODE
d2 y dy
−2 + 5y(x) = 0
dx2 dx
has general solution
y(x) = ex (c1 cos(2x) + c2 sin(2x))
1. Consider the boundary value problem

d2 y dy
−2 + 5y(x) = 0
dx2 dx
subject to the boundary conditions
y(0) = 0, y(1) = 3
Applying the first boundary condition gives

BC: y(0) = 0
=⇒ c1 = 0
GS: y(0) = c1
then the general solution reduces to
y(x) = c2 ex sin(2x)
Applying the second boundary condition gives

BC: y(1) = 3 3
=⇒ c2 =
GS: y(1) = c2 e sin(2) e sin(2)
Hence, the boundary value problem has the unique solution
3
y(x) = ex sin(2x)
e sin(2)

d2 y dy
−2 + 5y(x) = 0, y(0) = 0, y(π) = 0
dx2 dx
Applying the first boundary condition gives

BC: y(0) = 0
=⇒ c1 = 0
GS: y(0) = c1
then the general solution reduces to
Applying the second boundary condition gives

BC: y(π) = 0
GS: y(π) = c2 e sin(2π)
Since sin(2π) = 0 this gives us no information about c2 .

Hence, the boundary value problem with homogeneous boundary conditions has infinitely many
solutions

d2 y dy
2
−2 + 5y(x) = 0, y(0) = 0, y(π) = 1
dx dx
Applying the first boundary condition gives c1 = 0.
Applying the second boundary condition gives c2 e sin(2π) = 1, that is, 0 = 1 which is a contra-
diction.
Hence, the boundary value problem has no solution.
5.5.2 The eigenvalue-eigenfunction problem
For the remainder of this section we will concentrate on a special case of boundary value problems; the
eigenvalue-eigenfunction problem
d2 y
+ λy(x) = 0
dx2
subject to the homogeneous boundary conditions

dy
α y(a) + α =0

 1
 2
dxx=a
dy
 β1 y(b) + β2 =0


dx x=b
189
where λ is a real constant.
Recall from the matrix algebra section that given a square matrix A, if Av = λv, where v 6= 0, then λ
is an eigenvalue of A with corresponding eigenvector v. We have an analogous definition for boundary
value problems of this form.
A non-zero solution u(x) of the above boundary value problem is called an eigenfunction and the
corresponding λ is the eigenvalue. The solution y(x) ≡ 0 is called the trivial solution, and is not an
eigenfunction.
Eigenvalues are related to resonant frequencies— we’ll see exactly this equation arising from a model
of a vibrating string, in Chapter 6. They are of broad use when solving partial differential equations.
We will examine a couple of specific cases that will be useful later.
The eigenvalue–eigenfunction problem only has a (non-zero) solution for a very specific set of values λ.
These depend on the length of the domain, and the constants from the boundary conditions α1 , α2 , β1 ,
and β2 .
Example 5.5.2
Find the eigenvalues and eigenfunctions of the boundary value problem
d2 y
+ λy(x) = 0
dx2
y(0) = 0, y(L) = 0
for some L > 0. When the value of the function is zero on the boundaries, it is called Dirichlet boundary
conditions.
The characteristic equation here is s2 + λ = 0. We note that since λisreal, s must be purely complex
OR purely real: if s = a + ib, then λ = −s2 = (a2 − b2 ) + 2abi, and therefore either a or b must be zero.
We examine three possibilities:
I If λ = 0 then the ODE reduces to

d2 y
=0
dx2
which, by integrating twice with respect to x, has general solution
y(x) = c1 x + c2

Applying the first boundary condition gives y(0) = c2 and thus c2 = 0.
Applying the second boundary condition gives y(L) = c1 L and thus c1 = 0 since L > 0.
Therefore, the solution for λ = 0 is
yλ=0 (x) = 0.
This is the trivial solution, but it is not an eigenfunction (since eigenfunctions cannot be identically
zero).
I If λ < 0, then λ = −ω 2 for some real ω 6= 0, and the characteristic equation s2 + λ = s2 − ω 2 = 0
has two solutions, +ω and −ω.
Then the general solution is
y(x) = c1 e−ωx + c2 eωx
190

Applying the first boundary condition gives y(0) = c1 + c2 and thus c1 = −c2 .
Applying the second boundary condition gives y(L) = 2c2 sinh(ωL). We note that sinh(ωL) = 0
is only true if ω = 0. However, we assumed λ = +ω 2 > 0, that is, ω 6= 0. Thus we have to
conclude that c2 = 0.
Therefore the only solution is yλ<0 (x) = 0, which trivially solves the boundary value problem,
but once again is not an eigenfunction.
I If λ > 0, suppose that λ = ω 2 , for ω real. The characteristic equation s2 + λ = s2 + ω 2 = 0 has
two solutions, iω and −iω, and so the the general solution is
y(x) = c1 cos(ωx) + c2 sin(ωx)
Applying the first boundary condition gives y(0) = c1 and thus c1 = 0.
Applying the second boundary condition gives y(L) = c2 sin(ωL) = 0.
There are two possibilities here: either c2 = 0 or sin(ωL) = 0. In the first case, the general
solution is once again the trivial solution y(x) = 0, which is not an eigenfunction.
nπ
In the second case, sin(ωL) = 0 if and only if ωL = nπ for n ∈ Z, and so it is only when ω =
L
for n ∈ Z that we have corresponding eigenfunctions, y(x) = c sin(nπx/L). Here c is arbitrary.
Note that n here appears only as n2 , so instead of allowing n to be any integer, we restrict it to
the positive integers. In the case that n = 0, this would imply that λ = 0, which we’ve already
ruled out. Hence we take n ∈ N (that is, n = 1, 2, 3, . . . ).
Thus we find that this problem has eigenvalues
nπ 2
λn = for n ∈ N
L
with corresponding eigenfunctions
nπ
yn (x) = c sin x .
L
Example 5.5.3
Find the eigenvalues and eigenfunctions of the boundary value problem
d2 y
+ λy(x) = 0
dx2

dy dy
= 0, = 0.
dx x=0 dx x=L
When the derivative of the function is zero on the boundary, it is called a Neumann boundary condition.
The characteristic equation is s2 + λ = 0. Depending on the sign of λ, we have the same possibilities
as in the previous example for the general solution. We examine them in turn.
I If λ = 0 then the general solution is

y(x) = c1 x + c0

dy
Applying the first boundary condition gives = c1 and thus c1 = 0.
dx x=0
Applying the second boundary condition gives us no further information.
Therefore for any c0 6= 0, y(x) = c0 is an eigenfunction with eigenvalue λ = 0.
191
I If λ < 0, let λ = −ω 2 , then the ODE becomes

d2 y
− ω 2 y(x) = 0
dx2
and the general solution is
y(x) = c1 e−ωx + c2 eωx

dy
Applying the first boundary condition gives = ω (c2 − c1 ) and thus c2 = c1 since ω 6= 0.
dx x=0

dy
Applying the second boundary condition gives = 2c2 ω sinh(ωL) and thus c2 = 0 since
dx x=L
ω 6= 0.
Therefore, the solution is
yλ<0 (x) = 0.
This is not an eigenfunction.
I If λ > 0, suppose λ = ω 2 , then the ODE becomes
d2 y
+ ω 2 y(x) = 0
dx2
which has general solution
y(x) = c1 cos(ωx) + c2 sin(ωx)

dy
Applying the first boundary condition gives = ωc2 and thus c2 = 0 since ω 6= 0.
dx x=0

dy
Applying the second boundary condition gives = −ωc1 sin(ωL). We note that sin(ωL) = 0
dx x=L
nπ
is true if and only if ω = for n ∈ Z.
L
Therefore we have the eigenvalues
nπ 2
λn = for n ∈ N
L
with corresponding eigenfunctions
nπ
yn (x) = c cos x .
L
In summary, for the eigenvalue-eigenfunction problem

d2 y
+ λy(x) = 0 for x ∈ [0, L]
dx2

dy dy
Boundary conditions y(0) = 0, y(L) = 0 = 0, =0
dx x=0 dx x=L
n2 π 2 n2 π 2
Eigenvalues λn = , n = 1, 2, 3, . . . λ0 = 0, λn = , n = 1, 2, 3, . . .
L2 L2
nπx nπx
Corresponding
yn = c sin y0 = c0 , yn = c cos
eigenfunctions L L
192
Chapter 6
Partial Differential Equations
193
6.1 Introduction to Partial Differential Equations
6.1.1 Introduction
A partial differential equation (PDE) is an equation involving partial derivatives, that is, the
dependent variable is defined in terms of more than one independent variable. Many natural laws of
physics are written in terms of PDEs as they describe physical phenomena by relating space and time
derivatives. Some of these include Maxwell’s equations, Newton’s law of cooling, Newton’s equations
of motion, Schrödinger’s equation, the beam distortion equations, shock wave equations. The partial
derivatives represent physical quantities such as velocity, force, flux, friction, current and so forth.
Example 6.1.1
The (two-dimensional) advection equation
∂u ∂u ∂u
+ + =0
∂x ∂y ∂t
describes the motion of some quantity, (such as pollutant concentration or temperature) represented by
the scalar function u(x, y, t) over the two spatial dimensions x, y and over time t. Therefore, u depends
upon the three variables x, y and t.
Example 6.1.2
The (one-dimensional) heat conduction equation is described by

∂T ∂T 1 ∂ ∂T
+V = κ
∂t ∂x ρc ∂x ∂x
where V is the speed, ρ is the density, c is the specific heat and κ is thermal conductivity. The dependent
variable T (x, t) representing heat distribution over one spatial dimension and over time is dependent
upon the two variable x and t.
Example 6.1.3
The (three-dimensional) Laplace equation in cylindrical coordinates
1 ∂2u ∂2u

1 ∂ ∂u
r + 2 2 + 2 =0
r ∂r ∂r r ∂θ ∂z
describes the steady state heat distribution u(r, θ, z) throughout a cylinder, where u is dependent upon
the independent spatial variables r, θ and z.
Example 6.1.4
A simple (one-dimensional) traffic-flow equation
∂u ∂u
+ 2u =0
∂t ∂x
describes the car density u(x, t), where u is dependent upon the independent spatial variable x and
time t.
∂u
This PDE is non-linear because of the multiplication of u with . In this course we will not attempt
∂x
to solve non-linear PDEs.
Example 6.1.5
The Navier-Stokes’ equations for an incompressible fluid with out body forces (such as gravitational
acceleration) is described by
∂u
ρ + (u · ∇) u = −∇P + η∇2 u
∂t
where ρ is density, P is pressure and η is the dynamic viscosity. The dependent variable u is a vector
which has components u1 (x, y, z, t), u2 (x, y, z, t), u3 (x, y, z, t) each dependent upon the independent
variable x, y, z and t. This means there are a total of four PDEs (the three PDEs in this vector
equation plus the PDE representing incompressibility ∇ · u = 0) to determine the fluid flow. In this
course we will not attempt to solve PDEs of a vector, the analysis of the simplified versions of the
Navier-Stokes’ equations is left for fluid dynamics courses.
6.1.2 Solutions of PDEs - and how they are different to solutions of ODEs
In this section we will compare examples of simple ODEs and simple PDEs which look similar, and the
corresponding solutions.
Example 6.1.6
Consider the first-order ODE
du
= 2 where u = u(x) .
dx
Integration with respect to x gives the general solution
u(x) = 2x + C
for arbitrary constant C. This is clearly all possible solutions. If we substitute u(x) = 2x + C back
into the original ODE we do indeed get the equality to two, particularly given that the derivative of a
constant is zero.
That last sentence in the previous example is important when we consider multivariable functions.
When we differentiate a multivariable function u(x, y, z) with respect to x we hold all other variables
constant. In the case of u(x, y, z) we would then hold y and z constant.
Example 6.1.7
Consider the PDE
∂u
= 2 where u = u(x, y) .
∂x
Integration with respect to x gives the general solution
u(x, y) = 2x + f (y)
for an arbitrary function of y.

If we differentiate this function with respect to x we have
∂u ∂
= 2x + f (y)
∂x ∂x
∂ ∂
= 2x + f (y)
∂x ∂x
= 2.
Therefore, u(x, y) = 2x + f (y) is a solution of the PDE.
195
Example 6.1.8
Consider the ODE
d2 u
= 1 where u = u(x) .
dx2
Integrating once with respect to x gives
du
=x+A
dx
and then integrating with respect to x again gives
1 2
u(x) = x + Ax + B
2
for arbitrary constants A and B.
Consider the PDE
∂2u
= 1 where u = u(x, y, z) .
∂x2
Integrating once with respect to x gives
∂u
= x + f (y, z)
∂x
and then integrating with respect to x again gives
1 2
u(x, y, z) = x + xf (y, z) + g(y, z)
2
for arbitrary functions f and g of independent variables y and z.
Example 6.1.9
Find the general solution of the PDE
∂2u
= 0 where u = u(x, y)
∂y∂x
that is,
∂ ∂u
= 0 where u = u(x, y)
∂y ∂x
Integrating once with respect to y gives
∂u
= h(x)
∂x
for arbitrary function h(x). Now integrating with respect to x gives
Z
u(x, y) = h(x) dx + g(y)
Z
for arbitrary functions g(y). Define the function f (x) = h(x) dx. Hence, the general solution is
u(x, y) = f (x) + g(y)
for arbitrary functions.
6.1.3 Solutions to PDEs - determining the arbitrary function
Conversely, we can construct ODEs and PDEs to eliminate arbitrary constants and arbitrary functions
respectively.
196
Example 6.1.10
Given the single variable function
u(x) = Ce3x
Differentiating once with respect to x gives
du
= 3Ce3x
dx
and then given that u = Ce3x we have a first order ODE
du
= 3u.
dx
du
Therefore, u(x) = Ce3x is the general solution of the linear first order ODE = 3u.
dx
Example 6.1.11
Given the function
u(x, y) = x2 + y 2
The first order partial derivatives are
∂u ∂u
= 2x and = 2y
∂x ∂y
We can write the linear combination
y (2x) − x (2y) = 0
∂u ∂u
y −x =0
∂x ∂y
Now notice that if we have the function
u(x, y) = sin x2 + y 2

The first order partial derivatives are

∂u ∂u
= 2x cos x2 + y 2 and = 2y cos x2 + y 2

∂x ∂y
y 2x cos x2 + y 2 − x 2y cos x2 + y 2

=0
thus
∂u ∂u
y −x =0
∂x ∂y
In fact, any function that we choose of the form
u(x, y) = f x2 + y 2

will have first order partial derivatives

∂u df ∂u df
= 2x and = 2y
∂x dv ∂y dv
where v = x2 + y 2 .
197

df df
y 2x − x 2y =0
dv dv
therefore
∂u ∂u
y −x =0
∂x ∂y
Hence, the first order linear PDE

∂u ∂u
y −x =0
∂x ∂y
has general solution
u(x, y) = f x2 + y 2

for arbitrary function f of the argument x2 + y 2 .
To determine the explicit form of an arbitrary function in the general solution to a PDE we need
boundary conditions and/or initial conditions.
Example 6.1.12
Find the solution of the first order linear PDE
∂u ∂u
y −x = 0 for x ∈ (0, ∞) and y ∈ R
∂x ∂y
with boundary condition at x = 0:
2
u(0, y) = e−y for all y ∈ (−∞, ∞)
We know that the general solution to this PDE is
u(x, y) = f x2 + y 2

for arbitrary function f x2 + y 2 .
Applying the boundary condition we have
2
BC: u(0, y) = e−y 2
=⇒ f y 2 = e−y

GS: u(0, y) = f y 2
To make the next step more obvious, let G = y 2 then the rule for f is
f (G) = e−G
Applying this rule to the function in the general solution gives
u(x, y) = f x2 + y 2

2
+y 2 )
= e−(x
where we substituted G = x2 + y 2 into the rule we found from the boundary condition for f .
Clearly, given a PDE and defining the boundary condition at a single point (for example, at (x, y) =
(0, 0)), like we did for ODEs, will not be enough to determine the explicit form of an arbitrary functions
in general solutions to PDEs. In the previous example, we defined the boundary condition along the
x = 0 line. A boundary condition for a PDE must be defined along a curve.
198
6.1.4 Solutions to first order PDEs - separation of variables
This method is widely used for finding solutions to PDEs. Suppose we have a PDE for a function
u(x, y). The strategy behind separation of variables is to assume a solution of the form
u(x, y) = F (x) G(y)
where F is function of only x and G is a function of only y.
Example 6.1.13
Find the solution of the first order linear PDE
∂u ∂u
=4 where u(0, y) = 4e−y for all y ∈ R
∂x ∂y
Assume the general solution is of the form
u(x, y) = F (x) G(y)
then we have
∂ ∂
F (x) G(y) = 4 F (x) G(y)
∂x ∂y
dF dG
G(y) = 4F (x)
dx dy
1 dF 4 dG
=
F (x) dx G(y) dy
We note that the left hand side of this last equation is a function of only x while the right hand side
of this equation is a function of only y. A function of x equals a function of y when both functions are
equal to the same constant. Thus we introduce the separation constant λ:
1 dF 4 dG
=λ=
F (x) dx G(y) dy
Therefore, we have two first order linear ODEs

1 dF
=λ
F (x) dx
1 dG 1
= λ
G(y) dy 4
The general solution to

1 dF
=λ
F (x) dx
is
F (x) = Aeλx
for arbitrary constant A.
The general solution to
1 dG 1
= λ
G(y) dy 4
is
1
G(y) = Be 4 λy
for arbitrary constant B.
199
Therefore, the general solution of the PDE is
u(x, y) = F (x) G(y)

1
= Aeλx Be 4 λy
= Ceλ(x+ 4 y)
1
Applying the boundary condition gives
u(0, y) = 4e−y

BC: 1
1 =⇒ Ce 4 λy = 4e−y
GS: u(0, y) = Ce 4 λy
Therefore, C = 4 and λ = −4.

Hence, the solution is
u(x, y) = 4e−4(x+ 4 y)
1
6.1.5 Solutions to second order PDEs - superposition principle
For the remainder of this topic we will concentrate on three homogeneous linear second order PDEs;
the heat equation, Laplace’s equation and the wave equation. We only consider these three because all
linear second order PDEs fall into one of three types; parabolic PDEs, elliptic PDEs and hyperbolic
PDEs. The heat equation, Laplace’s equation and wave equation are the simplest examples of each
type, respectively. Knowing which type of PDE has appeared in your model of a physical system can
be instructive for determining analytic and numerical solutions.
Furthermore, given a homogeneous linear PDE we have an analogous result to that the superposition
principle for homogeneous linear ODEs:
Theorem Superposition principle - homogeneous linear PDEs
If u1 , u2 , . . . , un are solutions of a homogeneous linear PDE then the linear combination
u = c1 u1 + c2 u2 + . . . + cn un
for arbitrary constants c1 , c2 , . . . , cn , is also a solution of the homogeneous linear PDE.
We saw that a homogeneous linear nth -order ODE will have n solutions and the linear combination of
the n solutions with n arbitrary constants is the general solution of the homogeneous linear nth -order
ODE. This is not the case for PDEs. A homogeneous linear nth -order PDE can have infinitely many
solutions and therefore the general solution is the infinite summation of these solutions with infinitely
many arbitrary constants.
To determine the arbitrary constants for the general solution of a PDE we will need initial conditions
and boundary conditions.
200
Example 6.1.14
For the one-dimensional wave equation
∂2u 2
2∂ u
= c
∂t2 ∂x2
representing the vibration in a string of length L we will need four conditions.
If the ends of the string are fixed then there will be no displacement of the endpoints, therefore the
boundary conditions are
u(0, t) = 0 and u(L, t) = 0 for all t > 0
Before we release the string, the string is at rest and may have a displacement described with a shape
f (x), therefore the initial conditions are

∂u
= 0 and u(x, 0) = f (x) for all 0 < x < L
∂t t=0
In context of this problem we would assume that f (x) is a continuous function such that f (0) = 0 and
f (L) = 0 and the boundary conditions and the initial conditions are consistent.
Example 6.1.15
For the one-dimensional heat equation
∂u ∂2u
=κ 2
∂t ∂x
representing the temperature in a thin rod of length L we will need three conditions.
If the ends of the rod are perfectly insulated then there will be no flux of heat through the endpoints,
therefore the boundary conditions are

∂u ∂u
= 0 and = 0 for all t > 0
∂x x=0 ∂x x=L
Initially the rod may be heated such that the temperature distribution over the rod is described by
f (x), therefore the initial condition is
u(x, 0) = f (x) for all 0 < x < L
On a boundary we can assign one value of the following three conditions:
1. Dirichlet condition: The value of u is specified at the boundary.

For example, the fixed end points in the wave equation example above.
∂u ∂u
2. Neumann condition: The value of is specified at the boundary, where represents the
∂xi ∂xi
derivative perpendicular to the boundary. An example of this is the perfectly insulated end points
in the heat equation example above.
∂u
3. Robin condition: The value of + ku, for constant k, is specified at the boundary.
∂xi
We will use the Dirichlet conditions and the Neumann conditions in our analysis of the heat equation,
Laplace’s equation and the wave equation.
201
6.2 The Advection Equation
6.2.1 The advection equation
Consider the one-dimensional advection equation

∂u ∂u
+V =0
∂t ∂x
where V is a constant.
We will consider the question; along what curves x(t) is the solution u(x, t) constant?
Consider the derivative of u along a curve x(t). From the chain rule
d ∂u dx ∂
u(x(t) , t) = + u(x(t) , t) .
dt ∂t dt ∂x
If we have
dx
=V
dt
then
d ∂u ∂
u(x(t) , t) = +V u(x(t) , t)
dt ∂t ∂x
∂u ∂u
= +V
∂t ∂x
=0
that is, u is constant. Hence, we can conclude that

dx
=V
dt
defines the curves along which the solution u(x, t) is constant. We call these curves characteristic
curves and this method of seeking a general solution to a first order linear PDE is called the method
of characteristics.
Example 6.2.1
Find the general solution of the advection equation
∂u ∂u
+ =0
∂t ∂x
Here we have V (x, t) = 1. The characteristic curves are those curves along which u is constant and are
described by the differential equation
dx
=1
dt
Integrating with respect to t gives the general solution of the characteristic curves
x(t) = t + C

The characteristic curves are curves along which u is constant, that is,
u(x, t) = f (C)
for some arbitrary function. Since x = t + C then we have the x − t is a constant; C = x − t. Therefore,
the general solution to the advection equation is
u(x, t) = f (x − t)
Consider the initial condition

2
u(x, 0) = e−x for − ∞ < x < ∞
2
then f (x) = e−x and the solution of the initial value problem is
2
u(x, t) = e−(x−t)
which describes motion of the initial curve to the right as time increases.
Example 6.2.2
Find the general solution of the advection equation
∂u ∂u
−2 = 0.
∂t ∂x
Here we have V (x, t) = −2. The characteristic curves are those curves along which u is constant and
are described by the differential equation
dx
= −2
dt
Integrating with respect to t gives the general solution of the characteristic curves
x(t) = −2t + C
The characteristic curves are curves along which u is constant, that is,
u(x, t) = g(C)
for some arbitrary function g. Since x = −2t + C then we have the x + 2t is a constant; C = x + 2t.
Therefore, the general solution to the advection equation is
u(x, t) = g(x + 2t)
Consider the initial condition

2
u(x, 0) = e−x for − ∞ < x < ∞
2
then g(x) = e−x and the solution of the initial value problem is
2
u(x, t) = e−(x+2t)
which describes motion of the initial curve to the left as time increases.
203
204
6.3 The Heat Equation
The three dimensional heat equation or diffusion equation for u = u(x, y, z, t) has the form
1 ∂u
∇2 u =
κ ∂t
where κ is the diffusivity constant. In general, this PDE applies when the physical model is simply
representing the redistribution of a conserved quantity (such as heat) in space over time.
In this section we will concentrate on the one-dimensional heat equation; we consider a thin (one-
dimensional) rod that is insulated laterally so that the heat can only flow along the rod (in the x-
direction). We will seek the solutions of the heat equation by the separation of variables method.
When solving a PDE using separation of variables it is important to apply the homogeneous bound-
ary/initial conditions as soon as mathematically possible, as part of the separation of variables process.
Once all of the possible solutions are found using separation of variables, apply the superposition prin-
ciple to sum the solutions. Applying the non-homogeneous boundary/initial conditions to the general
solution, for the purpose of determining any arbitrary constants, should always be the final step.
6.3.1 Heat equation - temperature fixed at the ends of the rod
Let the ends of the rod (x = 0 and x = L) be held at zero temperature for all time (t > 0) and let the
initial heat distribution be given along the rod as f (x).
Therefore, we would like to find the solutions of the one-dimensional heat equation
∂u ∂2u
= κ 2 for x ∈ (0, L) , t ∈ (0, ∞)
∂t ∂x
u(0, t) = u(L, t) = 0 for all t ∈ (0, ∞)
and the initial condition

u(x, 0) = f (x) for all x ∈ (0, L)
Assume a solution of the form

u(x, t) = F (x) T (t)
then substituting into the heat equation gives
∂ ∂2
F (x) T (t) = κ 2 F (x) T (t)
∂t ∂x
dT d2 F
F (x) = κT (t) 2
dt dx
1 1 dT 1 d2 F
=
κ T (t) dt F (x) dx2
The left hand side of the last equation is a function of only t while the right hand side is a function of
only x, thus the equality can only occur if both are equal to the same constant. Introduce the separation
constant λ ∈ R:
1 1 dT 1 d2 F
= −λ =
Therefore, we have a second order linear ODE in terms of x and a first order linear ODE in terms of t
d2 F
+ λF (x) = 0
dx2
1 dT
= −κλ
T (t) dt
At this point we note that we have homogeneous boundary conditions. Using the assumed separation
form the boundary condition u(0, t) = 0 for all t ∈ (0, ∞) becomes
F (0) T (t) = 0 for all t ∈ (0, ∞)
The trivial case T (t) = 0 will give the trivial general solution u(x, t) = 0 for all x ∈ (0, L) and t ∈ (0, ∞).
Therefore F (0) = 0. Using the assumed separation form the boundary condition u(L, t) = 0 for all
t ∈ (0, ∞) becomes
F (L) T (t) = 0 for all t ∈ (0, ∞)
Therefore F (L) = 0.
Therefore, we have
d2 F
+ λF (x) = 0, F (0) = 0, F (L) = 0
dx2
1 dT
= −κλ
T (t) dt
In the boundary value problem section we found that

d2 F
+ λF (x) = 0, F (0) = 0, F (L) = 0
dx2
only has non-trivial solutions for λ > 0; the eigenvalues being
nπ 2
λn = for n ∈ N
L
and corresponding eigenfunctions
nπ
Fn (x) = kn sin x for n ∈ N
L
for arbitrary constants kn (n ∈ N).
The first order ODE then becomes
1 dTn nπ 2
= −κ for n ∈ N
Tn (t) dt L
which is nπ 2
d
loge (|Tn (t)|) = −κ for n ∈ N
dt L
Integrating with respect to t and algebraically rearranging gives
nπ 2
Tn (t) = cn exp −κ t for n ∈ N
L
for arbitrary constants cn (n ∈ N).
Therefore, all possible solutions to the heat equation with zero temperature boundaries
un (x, t) = Fn (x) Tn (t)
nπ 2 nπ
= Bn exp −κ t sin x for n ∈ N
L L
206
for arbitrary constants Bn = kn cn for n ∈ N.

Hence, the general solution to the heat equation with zero temperature boundaries is
∞
X
u(x, t) = un (x, t)
n=1
X∞ nπ 2
nπ
= Bn exp −κ t sin x
n=1
L L
Notice the decaying exponential part of this solution, this implies that u −→ 0 as t −→ ∞.
To completely specify the solution to this heat equation problem we need to determine the coefficients
Bn for all n ∈ N by applying the initial condition

 IC: u(x, 0) = f (x)



X∞ nπ

 GS: u(x, 0) = Bn sin x


n=1
L
that is,
∞
X nπ
f (x) = Bn sin x
n=1
L
As an example, consider the heat equation on a rod of length L = 1 with zero temperature end points
and an initial condition u(x, 0) = sin(πx) then
∞
X
sin(πx) = Bn sin (nπx)
n=1
which gives B1 = 1 and Bn = 0 for all n ∈ N\ {21}. Hence, the solution corresponding to the initial
condition is
2
u(x, t) = exp −κ (π) t sin (πx)
Consider the plot of this solution with κ = 1
Solution for heat equation for κ = 1
0.8
0.6
u(x,t)
0.4
0.2
0
0 0
0.05 0.2
0.1 0.4
0.15 0.6
0.2 0.8
0.25 1 x
t
207
Note that as t increases (left to right) the solution u(x, t) decays towards zero.
0.8
0.6
u(x,t)
0.4
0.2
0
0
0.5 0.25
0.2
0.15
0.1
1 0.05
0
x t
6.3.2 Heat equation - perfectly insulated at the ends of the rod
Let the ends of the rod (x = 0 and x = L) be perfectly insulated for all time (t > 0), that is, there is
zero flux through the ends of the rod. Let the initial heat distribution be given along the rod as f (x).
Therefore, we would like to find the solutions of the one-dimensional heat equation
∂u ∂2u
= κ 2 for x ∈ (0, L) , t ∈ (0, ∞)
∂t ∂x

∂u ∂u
= = 0 for all t ∈ (0, ∞)
∂x (0,t) ∂x (L,t)
and the initial condition

u(x, 0) = f (x) for all x ∈ (0, L)

u(x, t) = F (x) T (t)
then substituting into the heat equation gives
1 1 dT 1 d2 F
=
The left hand side of the last equation is a function of only t while the right hand side is a function of
only x, thus the equality can only occur if both are equal to the same constant. Introduce the separation
constant λ ∈ R:
1 1 dT 1 d2 F
= −λ =
208
Therefore, we have a second order linear ODE in terms of x and a first order linear ODE in terms of t
d2 F
+ λF (x) = 0
dx2
1 dT
= −κλ
T (t) dt
At this point we note that we have homogeneous boundary conditions. Using the assumed separation
form the boundary condition u(0, t) = 0 for all t ∈ (0, ∞) becomes

dF
T (t) = 0 for all t ∈ (0, ∞)
dx x=0
dF
Therefore = 0. Using the assumed separation form the boundary condition u(L, t) = 0 for all
dx x=0
t ∈ (0, ∞) becomes
dF
T (t) = 0 for all t ∈ (0, ∞)
dx x=L
dF
Therefore = 0.
dx x=L
Therefore, we have
d2 F

dF dF
+ λF (x) = 0 subject to = 0 and =0
dx2 dx x=0 dx x=L
1 dT
= −κλ
T (t) dt
In the boundary value problem section we found that

d2 F

dF dF
+ λF (x) = 0 subject to = 0 and =0
dx2 dx x=0 dx x=L
only has non-trivial solutions for λ = 0 and λ > 0; the eigenvalues are
nπ 2
λ0 = 0 and λn = for n ∈ N
L
and corresponding eigenfunctions
nπ
F0 = k0 and Fn (x) = kn cos x for n ∈ N
L
for arbitrary constants k0 and kn (n ∈ N).
For λ0 = 0 the first order ODE becomes
1 dT0
=0
T0 (t) dt
which has the constant solution T0 (t) = c0 for an arbitrary constant c0 .
For λ > 0 the first order ODE becomes
1 dTn nπ 2
= −κ for n ∈ N
Tn (t) dt L
which is nπ 2
d
loge (|Tn (t)|) = −κ for n ∈ N
dt L
209
Integrating with respect to t and algebraically rearranging gives

nπ 2
Tn (t) = cn exp −κ t for n ∈ N
L
for arbitrary constants cn (n ∈ N).

Therefore, all possible solutions to the heat equation with perfectly insulated boundaries are
nπ 2 nπ
u0 (x, t) = A0 and un (x, t) = An exp −κ t cos x for n ∈ N
L L
where A0 = k0 c0 and An = kn cn for n ∈ N.

Hence, the general solution to the heat equation with perfectly insulated boundaries is
∞
X
u(x, t) = un (x, t)
n=0
∞
X nπ 2 nπ
= A0 + An exp −κ t cos x
n=1
L L
To completely specify the solution to this heat equation problem we need to determine the coefficients
An for all n ∈ N by applying the initial condition

 IC: u(x, 0) = f (x) ∞

X nπ
 GS: u(x, 0) = A0 +
 An cos
L
x
n=1
that is,
∞
X nπ
f (x) = A0 + An cos x
n=1
L
As an example, consider a rod with length L = 1 and the initial condition u(x, 0) = x for x ∈ (0, 1)
then
∞
X
x = A0 + An cos (nπx)
n=1
This was an example in the Fourier cosine and sine series section, and we found that taking the even
periodic extension of f (x) = x onto (−1, 0) gives the Fourier cosine series
∞
1 X 2 n
f (x) = + ((−1) − 1) cos(nπx) .
2 n=1 n2 π 2
Therefore,
1 2 n
A0 = and An = 2 2 ((−1) − 1)
2 n π
Hence, the solution to the heat equation for a rod of length one unit with perfectly insulated end points
and an initial condition u(x, 0) = x for x ∈ (0, 1) is
∞
1 X 2 n

2

u(x, t) = + 2 2
((−1) − 1) exp −κ (nπ) t cos (nπx)
2 n=1 n π
Consider the plot of this solution with κ = 1
210
0.8
0.6
u(x,t)
0.4
0.2
0
1
0.5 0
0.05
0.1
0.15
0 0.2
0.25
x t
As t increases (right to left in the above plot) the solution u(x, t) decays towards 21 . The next plot
shows the temperature distribution starting along the line f (x) = x and then decays towards the line
u = 21 .

1
0.9
0.8
0.7
0.6
u(x,t)
0.5
0.4
0.3
0.2
0.1
0
1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0
x
211
6.4 Laplace’s Equation
The three-dimensional Laplace equation for u = u(x, y, z) has the form
∇2 u = 0
Comparing the three-dimensional Laplace equation to the three-dimensional heat equation we find that
∂u
= 0. This implies that while the heat equation models the redistribution in space over time of a
∂t
conserved quantity, Laplace’s equation is not dependent upon time. Typically solutions of Laplace’s
equation represent equilibrium or steady states in a system.
In this section we will concentrate on the two-dimensional Laplace equation; we consider a thin (two-
dimensional) plate that is insulated on either face. This means we will have four sides each of which
will require a boundary condition. We can seek the solutions of Laplace’s equation by the separation
of variables method.
As mentioned in the heat equation section; when solving a PDE using separation of variables it is
important to apply the homogeneous boundary/initial conditions as soon as mathematically possible,
while leaving the non-homogeneous boundary/initial conditions to be applied to the general solution
for the purpose of evaluating arbitrary constants.
6.4.1 Two dimensional Laplace equation - boundary conditions
In Cartesian coordinates, Laplace’s equation for u = u(x, y) is
∂2u ∂2u
+ 2 = 0.
∂x2 ∂y
on a rectangular region
(x, y) ∈ R2 : 0 ≤ x ≤ L, 0 ≤ y ≤ W

If u(x, y) represents the temperature across the rectangular plate, each side of the rectangular region
should have a standard boundary condition; either a fixed temperature distribution or is insulated, that
is,

∂u
u(0, y) = f1 (y) or =0
∂x (0,y)

∂u
u(L, y) = f2 (y) or =0
∂x (L,y)

∂u
u(x, 0) = g1 (x) or =0
∂x (x,0)

∂u
u(x, W ) = g2 (x) or =0
∂x (x,W )
Here we concentrate on the Laplace equation with either homogeneous boundary conditions on two
parallel sides, for example f1 (y) = 0 and f2 (y) = 0, or three homogeneous boundary conditions.
Example 6.4.1
Find the solution of Laplace’s equation on the square plate
∂2u ∂2u
+ 2 = 0 for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1
∂x2 ∂y
with the boundary conditions

 u(0, y) = 0 and u(1, y) = 0 for y ∈ (0, 1)
u(x, 0) = 0 for x ∈ (0, 1)
u(x, 1) = sin(πx) for x ∈ (0, 1)

The Laplace equation coupled with these four Dirichlet conditions is called a Dirichlet problem.
Here we will note the key steps and leave the working as an exercise.
Assuming a solution of the form
u(x, y) = F (x) G(y)
gives a boundary value problem in terms of x
d2 F
+ λF (x) = 0, F (0) = 0 and F (1) = 0
dx2
for x ∈ (0, 1) and the second order ODE in terms of y
d2 G
− λG(y) = 0
dy 2
for y ∈ (0, 1).

2
The boundary value problem for F has eigenvalues λn = (nπ) (n ∈ N) and corresponding eigenfunctions
Fn (x) = cn sin(nπx) for x ∈ (0, 1).
The corresponding solution to the ODE for G is Gn (y) = An cosh(nπy) + Bn sinh(nπy); the boundary
condition u(x, 0) = 0 implies that An = 0 for all n.
Therefore, un (x, y) = Bn sinh(nπy) sin(nπx) for arbitrary constants Bn (n ∈ N) are solutions of

Laplace’s equation given the homogeneous boundary conditions. By superposition principle, the general
solution is
∞
X
u(x, y) = Bn sinh(nπy) sin(nπx)
n=1
1
The non-homogeneous boundary condition gives Bn = 0 for n ∈ N\ {1} and B1 = . Hence, the
sinh(π)
solution to this Dirichlet problem is
1
u(x, y) = sinh(πy) sin(πx)
sinh(π)
for 0 < x < 1 and 0 < y < 1.
213
1
Consider the plot of the solution u(x, y) = sinh(πy) sin(πx)
sinh(π)
Solution for the Laplace equation on the plate 0 < x < 1 and 0 < y < 1
0.8
0.6
u
0.4
0.2
0
0 0.2
0.4
0 0.6
0.2 x
0.4 0.8
0.6 1
y 0.8
1
Note that the boundary conditions for all four sides are satisfied.
214
6.4.2 Two dimensional Laplace equation - four non-homogeneous boundary

conditions
As suggested by the previous example, a Dirichlet problem on a rectangular region with homogeneous
boundary conditions on two parallel boundaries can be easily solved by separation of variables. However,
separation of variables does not work for a Dirichlet problem on a rectangular region with all non-
homogeneous boundary conditions. Clearly, this type of problem is going to turn up in applications, so
we need to find a way around this.
Consider
∂2u ∂2u
+ 2 = 0 for 0 ≤ x ≤ L and 0 ≤ y ≤ W
∂x2 ∂y
subject to
u(0, y) = f1 (y) and u(L, y) = f2 (y) for y ∈ (0, W )

u(x, 0) = g1 (x) and u(x, W ) = g2 (x) for x ∈ (0, L)
where f1 (y) 6= 0 and f2 (y) 6= 0 for some y ∈ (0, W ), and g1 (x) 6= 0 and g2 (x) 6= 0 for some x ∈ (0, L).
The superposition principle provides us with an approach; we can define the function v(x, y) satisfying
∂2v ∂2v
2
+ 2 = 0 for 0 ≤ x ≤ L and 0 ≤ y ≤ W
∂x ∂y
subject to
v(0, y) = 0 and v(L, y) = 0 for y ∈ (0, W )

v(x, 0) = g1 (x) and v(x, W ) = g2 (x) for x ∈ (0, L)
and define a function w(x, y) satisfying
∂2w ∂2w
+ = 0 for 0 ≤ x ≤ L and 0 ≤ y ≤ W
∂x2 ∂y 2
subject to
w(0, y) = f1 (y) and u(L, y) = f2 (y) for y ∈ (0, W )

w(x, 0) = 0 and w(x, W ) = 0 for x ∈ (0, L)
If v(x, y) is a solution of the first Dirichlet problem and w(x, y) is a solution of the second Dirichlet
problem, then u(x, y) = v(x, y) + w(x, y) is a solution of Laplace’s equation and u(x, y) will satisfy all
four non-homogeneous boundary conditions, for example:
u(0, y) = v(0, y) + w(0, y)

= 0 + f1 (y)
= f1 (y)
Hence, solving the two Dirichlet problems and then adding the corresponding solutions provides the
solution to the original Dirichlet problem with all non-homogeneous boundary conditions.
215
6.5 Laplace’s Equation in Polar Coordinates
6.5.1 The Laplace equation in polar coordinates
Let u(x, y) be a twice-differentiable function satisfying Laplace’s equation
∂2u ∂2u
+ 2 = 0.
∂x2 ∂y
The polar coordinates are defined by
where
y
r2 = x2 + y 2 and tan(θ) =
x
Recall in the chain rule section of the multivariable calculus topic we showed by implicit differentiation
that
∂r ∂r
= cos(θ) = sin(θ)
∂x ∂y
∂θ sin(θ) ∂θ cos(θ)
=− =
∂x r ∂y r
and by the chain rule the first order partial derivative of u with respect to x is given by
∂u ∂u sin(θ) ∂u
= cos(θ) −
∂x ∂r r ∂θ
∂u ∂u cos(θ) ∂u
= sin(θ) +
∂y ∂r r ∂θ
Applying the chain rule a second time, the second order partial derivative of u with respect to x is given
by
∂2u

∂ ∂u
=
∂x2 ∂x ∂x

∂ ∂u ∂r ∂ ∂u ∂θ
= +
∂r ∂x ∂x ∂θ ∂x ∂x

∂ ∂u sin(θ) ∂u ∂r ∂ ∂u sin(θ) ∂u ∂θ
= cos(θ) − + cos(θ) −
∂r ∂r r ∂θ ∂x ∂θ ∂r r ∂θ ∂x
∂ 2 u sin(θ) ∂u sin(θ) ∂ 2 u

= cos(θ) 2 + − (cos(θ))
∂r r2 ∂θ r ∂r∂θ
∂2u cos(θ) ∂u sin(θ) ∂ 2 u

∂u sin(θ)
+ − sin(θ) + cos(θ) − − −
∂r ∂θ∂r r ∂θ r ∂θ2 r
∂ 2 u 2 sin(θ) cos(θ) ∂u 2 sin(θ) cos(θ) ∂ 2 u sin2 (θ) ∂u sin2 (θ) ∂ 2 u
= cos2 (θ) 2
+ 2
− + +
∂r r ∂θ r ∂r∂θ r ∂r r2 ∂θ2
Applying the chain rule a second time, the second order partial derivative of u with respect to y is given
by
∂2u

∂ ∂u
=
∂y 2 ∂y ∂y

∂ ∂u ∂r ∂ ∂u ∂θ
= +
∂r ∂y ∂y ∂θ ∂y ∂y

∂ ∂u cos(θ) ∂u ∂r ∂ ∂u cos(θ) ∂u ∂θ
= sin(θ) + + sin(θ) +
∂r ∂r r ∂θ ∂y ∂θ ∂r r ∂θ ∂y
2 2

∂ u cos(θ) ∂u cos(θ) ∂ u
= sin(θ) 2 − + (sin(θ))
∂r r2 ∂θ r ∂r∂θ
∂2u sin(θ) ∂u cos(θ) ∂ 2 u

∂u cos(θ)
+ cos(θ) + sin(θ) − +
∂r ∂θ∂r r ∂θ r ∂θ2 r
2 2
∂ u 2 cos(θ) sin(θ) ∂u 2 cos(θ) sin(θ) ∂ u cos (θ) ∂u cos2 (θ) ∂ 2 u
2
= sin2 (θ) 2 − + + +
∂r r2 ∂θ r ∂r∂θ r ∂r r2 ∂θ2
Substituting these two second order partial derivatives into Laplace’s equation gives
∂2u ∂2u
0= + 2
∂x2 ∂y
∂ 2 u 2 sin(θ) cos(θ) ∂u 2 sin(θ) cos(θ) ∂ 2 u sin2 (θ) ∂u sin2 (θ) ∂ 2 u
= cos2 (θ) 2 + − + +
∂r r2 ∂θ r ∂r∂θ r ∂r r2 ∂θ2
2 2 2
∂ u 2 cos(θ) sin(θ) ∂u 2 cos(θ) sin(θ) ∂ u cos (θ) ∂u cos2 (θ) ∂ 2 u
+ sin2 (θ) 2 − 2
+ + +
∂r r ∂θ r ∂r∂θ r ∂r r2 ∂θ2
2 2 2 2 2 2
∂ u cos (θ) + sin (θ) ∂u cos (θ) + sin (θ) ∂ u
= cos2 (θ) + sin2 (θ) 2
+ +
∂r r ∂r r2 ∂θ2
Hence, the Laplace equation in polar coordinates is
∂ 2 u 1 ∂u 1 ∂2u
+ + =0
∂r2 r ∂r r2 ∂θ2
Note that on occasions this can be rewritten as
1 ∂2u

1 ∂ ∂u
r + 2 2 =0
r ∂r ∂r r ∂θ
6.5.2 Laplace’s equation on a disc—setup of problem
The Laplace equation in polar coordinates arises when seeking steady state solutions over geometries
such as discs, half discs and annuli. In this section we will concentrate on finding the general solution
of Laplace’s equation in polar coordinates over a disc of radius a units:
∂ 2 u 1 ∂u 1 ∂2u
2
+ + 2 2 = 0, 0 < r < a, 0 < θ < 2π
∂r r ∂r r ∂θ
Consider a thin disc that is insulated on either face. This means we will have only one side (compared
to the four sides of a rectangular region) which will require a boundary condition, since the condition
is on r = a then the condition will depend upon θ at most, that is,
u(a, θ) = f (θ) for 0 < θ < 2π
Assume u represents a physical quantity such as temperature. Physical intuition suggests that at any
given point (r, θ) on the disc we would expect only one value of u, that is, u is not multi-valued. This
guarantee this condition we have
u(r, θ + 2π) = u(r, θ) for 0 < θ < 2π
217
Furthermore, we expect the value of u at each and every point (r, θ) on the disc will be a finite value,
that is,
u(r, θ) is bounded for all 0 ≤ r ≤ a, 0 ≤ θ < 2π
You may be concerned that there are only three conditions when you would have expected four. For
the moment, let us see what happens if we attempt to find the general solution with only these three
conditions.
6.5.3 Laplace’s equation on a disc—separation of variables

u(r, θ) = R(r) Θ(θ)
then substituting into Laplace’s equation in polar coordinates gives
∂2 1 ∂ 1 ∂2
R(r) Θ(θ) + R(r) Θ(θ) + R(r) Θ(θ) =0
∂r2 r ∂r r2 ∂θ2
d2 R 1 dR 1 d2 Θ
Θ(θ) 2 + Θ(θ) + 2 R(r) 2 = 0
dr r dr r dθ
r2 d2 R r dR 1 d2 Θ
+ = −
R(r) dr2 R(r) dr Θ(θ) dθ2
The left hand side of the last equation is a function of only r while the right hand side is a function
of only θ, thus the equality can only occur if both are equal to the same constant. Similar to previous
PDE sections, introduce the separation constant λ ∈ R:
2
1 d2 Θ

1 2d R dR
r + r = λ = −
R(r) dr2 dr Θ(θ) dθ2
Therefore, we have a second order linear ODE in terms of r and a second order linear ODE in terms of
θ
d2 R dR
r2 2
+r − λR(r) = 0
dr dr
d2 Θ
+ λΘ(θ) = 0
dθ2
6.5.4 Laplace’s equation on a disc—general solution
Note that we do not have homogeneous boundary conditions to work with here. Using the assumed
separation form the condition u(r, θ + 2π) = u(r, θ) becomes
R(r) Θ(θ + 2π) = R(r) Θ(θ)

⇒ Θ(θ + 2π) = Θ(θ)
since we can dismiss the trivial solution R(r) ≡ 0. We will consider
d2 Θ
+ λΘ(θ) = 0 subject to Θ(θ + 2π) = Θ(θ)
dθ2
and then
d2 R dR
r2 +r − λR(r) = 0
dr2 dr
218
I If λ = 0 then the ODE reduces to

d2 Θ
=0
dθ2
which, by integrating twice with respect to θ, has general solution
Θ(θ) = c1 θ + c2
for arbitrary constants c1 and c2 . Applying the condition Θ(θ + 2π) = Θ(θ) gives
c1 (θ + 2π) + c2 = c1 θ + c2
2πc1 = 0
which implies that c1 = 0, and thus the solution for λ = 0 is
Θλ=0 (θ) = c2
for an arbitrary constant c2 .

For λ = 0 the ODE
d2 R dR
r2 2
+r − λR(r) = 0
dr dr
can be written as

1 ∂ ∂R
r =0
r ∂r ∂r

∂ ∂R
r =0
∂r ∂r
Integrating with respect to r gives

∂R
r = k1
∂r
∂R k1
=
∂r r
for arbitrary constant k1 . Integrating with respect to r a second time gives
Rλ=0 (r) = k1 loge (r) + k2
for arbitrary constants k1 and k2 .

Therefore, the general solution for λ = 0 is
uλ=0 (r, θ) = Rλ=0 (r) Θλ=0 (θ)

= (k1 loge (r) + k2 ) (c2 )
The third condition requires that u(r, θ) is bounded for all 0 ≤ r ≤ a, 0 ≤ θ < 2π. However,
loge (r) −→ −∞ as r −→ 0, in which case u(r, θ) would be not bounded as r −→ 0. Therefore, we
let k1 = 0 and the general solution for λ = 0 reduces to
uλ=0 (r, θ) = A0
for arbitrary constant A0 .

I If λ < 0, for simplicity let λ = −ω 2 , then the ODE becomes
d2 Θ
− ω 2 Θ(θ) = 0
dθ2
If we assume a solution of the form Θ(θ) = emθ then the corresponding auxillary equation is
m2 − ω 2 = 0
219
which has solutions m1 = −ω and m2 = ω. Thus the general solution is
Θ(θ) = c1 e−ωθ + c2 eωθ
for arbitrary constants c1 and c2 . The exponential function is not a periodic function, therefore
the condition Θ(θ + 2π) = Θ(θ) is satisfied if and only if c1 = 0 and c2 = 0. Therefore, the
solution for λ < 0 is
Θλ<0 (θ) = 0
Therefore, the general solution for λ < 0 reduces to
uλ<0 (r, θ) = 0
I If λ > 0, for simplicity let λ = +ω 2 , then the ODE becomes
d2 Θ
+ ω 2 Θ(θ) = 0
dθ2
If we assume a solution of the form Θ(θ) = emθ then the corresponding auxillary equation is
m2 + ω 2 = 0
which has solutions m1 = −iω and m2 = iω. Thus the general solution is
Θ(θ) = c1 cos(ωθ) + c2 sin(ωθ)
for arbitrary constants c1 and c2 . The condition Θ(θ + 2π) = Θ(θ) for the sine and cosine functions
holds true if ω = n for n ∈ N, then we have the eigenvalues λn = ωn2 = n2 for n ∈ N. Relabel the
constant c1 for each n ∈ N as an and relabel the constant c2 for each n ∈ N as bn .
Therefore, the solution for λ > 0 is
(Θλ>0 )n (θ) = an cos(nθ) + bn sin(nθ) for n ∈ N
for arbitrary constants an and bn (n ∈ N).

Given that λ = n2 for n ∈ N then the ODE
d2 R dR
r2 +r − λR(r) = 0
dr2 dr
can be written as
dRn2 dRn
r2 +r − n2 Rn (r) = 0 for n ∈ N
dr2 dr
This is a Cauchy-Euler equation, as the coefficients of the derivatives of R are powers of r. If we
assume a solution of the form Rn (r) = rm for n ∈ N, then
r2 m (m − 1) rm−2 + r mrm−1 − n2 rm = 0

m2 − n2 rm = 0

Since rm 6= 0 for all 0 ≤ r ≤ a then

m2 − n2 = 0
which has solution m1 = −n and m2 = n. The solution is
Rn (r) = k1 r−n + k2 rn .
for arbitrary constants k1 and k2 .

Therefore, the general solution for λ > 0 where λ = n2 is
(uλ>0 )n (r, θ) = (Rλ>0 )n (r) (Θλ>0 )n (θ)

= k1 r−n + k2 rn (an cos(nθ) + bn sin(nθ)) for n ∈ N.

220
The third condition requires that u(r, θ) is bounded for all 0 ≤ r ≤ a, 0 ≤ θ < 2π. However,
r−n −→ ∞ as r −→ 0, and then u(r, θ) would not be bounded as r −→ 0. Therefore, we let
k1 = 0 and the general solution for λ > 0 reduces to
(uλ>0 )n (r, θ) = rn (An cos(nθ) + Bn sin(nθ)) for n ∈ N
for arbitrary constants An and Bn (n ∈ N).
Therefore, all possible solutions to Laplace’s equation in polar coordinates is
u0 (r, θ) = A0 and un (r, θ) = rn (An cos(nθ) + Bn sin(nθ)) for n ∈ N
Hence, the general solution to Laplace’s equation in polar coordinates is

∞
X
u(r, θ) = un (r, θ)
n=0
∞
X
= A0 + rn (An cos(nθ) + Bn sin(nθ))
n=1
6.5.5 Laplace’s equation on a half disc - a boundary value problem
Example 6.5.1
Find the solution of Laplace’s equation on a half disc of radius c units
(r, θ) ∈ R2 : 0 ≤ r ≤ c, 0 ≤ θ ≤ π

with the boundary conditions
u(r, 0) = 0 and u(r, π) = 0 for 0 ≤ r ≤ c
and
u(c, θ) = u0 for 0 ≤ θ ≤ π
where u0 > 0 is a constant.
The homogeneous boundary conditions

 BC:
 u(r, 0) = 0
∞
X
 GS:
 u(r, 0) = A0 + An r n
n=1
and 
 BC:
 u(r, π) = 0
∞
X
 GS:
 u(r, π) = A0 + − An r n
n=1
give the two equations

∞
X
A0 + An r n = 0
n=1
∞
X
A0 + − An r n = 0
n=1
Adding the two equations gives A0 = 0.

Subtract the second equation from the first equation to give An = 0 for all n ∈ N.
221
Therefore, the general solution reduces to

∞
X
u(r, θ) = Bn rn sin(nθ)
n=1
Applying the third boundary condition gives


 BC: u(c, θ) = u∞
 0
X
n

 GS: u(c, θ) = B n c sin(nθ)
n=1
that is,
∞
X
u0 = Bn cn sin(nθ)
n=1
If we perform an odd periodic extension of f then we would have

u0 for 0 < θ < π
f (θ) =
−u0 for − π < θ < 0
Since this function is odd then a0 = 0 and an = 0 for all n ∈ N.
For the bn coefficient, we have

Z π
2
bn = u0 sin(nθ) dθ
π 0
2u0 h iπ
=− cos(nθ)
nπ 0
2u0 n
=− ((−1) − 1) .
nπ
Therefore, the Fourier sine series of f is
∞
X 2u0 n
f (θ) ∼ (1 − (−1) ) sin(nθ) .
n=1
nπ
Then the equation for the boundary condition becomes

∞ X∞
X 2u0 n

(1 − (−1) ) sin(nθ) = Bn cn sin(nθ)
n=1
nπ n=1
which gives
2u0 n
(1 − (−1) ) = Bn cn for n ∈ N
nπ
that is,
2u0 n
Bn = (1 − (−1) ) for n ∈ N
cn nπ
Hence, the solution to Laplace’s equation over the half disc with provided boundary conditions is
∞ n
2u0 X (1 − (−1) ) n
u(r, θ) = r sin(nθ)
π n=1 cn n
222
Solution for the Laplace equation on half disc c=1 and u_0 = 1
0.5
u
-1
-0.5
0
x 1
0.5
0.5
1 0 y
223
6.6 The Wave Equation
The three-dimensional wave equation for u(x, y, z, t) has the form

1 ∂2u
∇2 u =
c2 ∂t2
In this section we will concentrate on the one-dimensional wave equation. We first consider an infinitely
long string and then we consider a string of a finite length L, such as a guitar string stretched taut
between two points on the x-axis, say x = 0 and x = L.
6.6.1 One dimensional wave equation - D’Alembert’s solution
Consider the one-dimensional wave equation

∂2u 2∂ u
2
= c for x ∈ (−∞, ∞)
∂t2 ∂x2
where c is the wave speed. Note that here we are considering the one dimensional wave equation over
an infinite domain −∞ < x < ∞ for t > 0. There will be no boundary conditions but we can subject
the one-dimensional wave equation over the infinite domain to the initial conditions

∂u
u(x, 0) = f (x) and = g(x)
∂t t=0

u(x, t) = F (x + ct) + G(x − ct)
2 2
for arbitrary functions F and G. Substituting this assumed solution into ∂∂t2u − c2 ∂∂xu2 gives
∂2
2 ∂
2
F (x + ct) + G(x − ct) − c F (x + ct) + G(x − ct)
∂t2 ∂x2
∂ ∂ ∂

∂
= F (x + ct) + G(x − ct) − c2 F (x + ct) + G(x − ct)
∂t ∂t ∂x ∂x

∂ ∂F ∂ ∂G ∂
= (x + ct) + (x − ct)
∂t ∂ (x + ct) ∂t ∂ (x − ct) ∂t

∂ ∂F ∂ ∂G ∂
− c2 (x + ct) + G(x − ct)
∂x ∂ (x + ct) ∂x ∂ (x − ct) ∂x

∂ ∂F ∂G ∂ ∂F ∂G
= c −c − c2 +
∂t ∂ (x + ct) ∂ (x − ct) ∂x ∂ (x + ct) ∂ (x − ct)
2 2
∂ F ∂ ∂ G ∂
2 ∂2F ∂
2 ∂2G ∂
=c 2 ∂t x + ct − c 2 ∂t x − ct − c 2 ∂x x + ct + c 2 ∂x x − ct
∂ (x + ct) ∂ (x − ct) ∂ (x + ct) ∂ (x − ct)
∂2F ∂2G ∂2F ∂2G
= c2 2 − c2 2 − c2 2 + c2 2
∂ (x + ct) ∂ (x − ct) ∂ (x + ct) ∂ (x − ct)
=0
Hence, the assumed form
u(x, t) = F (x + ct) + G(x − ct)
is a general solution of the one-dimensional equation on an infinite domain.
This general solution to the one-dimensional wave equation on an infinite domain is D’Alembert’s
solution.
To determine the arbitrary functions F and G you need to define the initial conditions.
Example 6.6.1
A string is initially held at rest with a shape of the sine curve sin(x) before being released for t > 0.
Use D’Alembert’s solution to find the solution to the wave equation given these initial conditions.
The first initial condition is that the string is held with the shape of the sine curve sin(x), that is,
u(x, 0) = sin(x).

∂u
The second initial condition is that the string is initially at rest, that is, = 0.
∂t t=0
Applying the first initial condition gives

IC: u(x, 0) = sin(x)
=⇒ F (x) + G(x) = sin(x)
GS: u(x, 0) = F (x) + G(x)
The first order partial derivative of D’Alembert’s solution with respect to t is

∂u
= cF 0 (x + ct) − cG0 (x − ct)
∂t
then applying the second initial condition gives

∂u
IC: =0


∂t t=0

=⇒ cF 0 (x) − cG0 (x) = 0
∂u
GS: = cF 0 (x) − cG0 (x)


∂t t=0

This second condition gives

F 0 (x) = G0 (x)
then integrating with respect to x gives
F (x) = G(x) + k
for arbitrary constant k. Then the first condition F (x) + G(x) = sin(x) becomes
2G(x) + k = sin(x)
that is,
1 k
G(x) = sin(x) −
2 2
and therefore,
1 k
F (x) = sin(x) +
2 2
Hence, the solution is
u(x, t) = F (x + ct) + G(x − ct)

1 k 1 k
= sin(x + ct) + + sin(x − ct) −
2 2 2 2
1 1
= sin(x + ct) + sin(x − ct)
2 2
Physically, as soon as the string is released the initial curve splits into two half-size curves, one moving
to the right and one moving to the left.
225
6.6.2 One dimensional wave equation - separation of variables
Consider a string of a finite length L, such as a guitar string stretched taut between two points on the
x-axis, say x = 0 and x = L. When the string starts vibrating u(x, t) represents the displacement along
the x-axis from the equilibrium position. We assume that the string is perfectly flexible, has constant
density, constant tension which is large compared to the force of gravity and there are
sno other external
T
forces acting on the string. The constant c is the wave speed and is defined as c = where ρ is the
ρ
density of the string and T is the horizontal component of the tension.
We can seek the solutions of the wave equation by the separation of variables method and we will need
four conditions to completely define the problem.
As mentioned in the heat equation section; when solving a PDE using separation of variables it is
important to apply the homogeneous boundary/initial conditions as soon as mathematically possible,
while leaving the non-homogeneous boundary/initial conditions to be applied to the general solution
for the purpose of evaluating arbitrary constants.
6.6.3 One dimensional wave equation - boundary and initial conditions
If u(x, t) represents the displacement of the string from x-axis over time then in Cartesian coordinates,
the wave equation for u(x, t) will be
∂2u ∂2u
2
= c2 2
∂t ∂x
u(0, t) = 0 and u(L, t) = 0 for t > 0
and the initial conditions

∂u
u(x, 0) = f (x) and = g(x) for 0 < x < L
∂t t=0
6.6.4 One dimensional wave equation - an example
Example 6.6.2
Find the solution of the wave equation on a string with a length of π units
∂2u ∂2u
2
= c2 2
∂t ∂x
u(0, t) = 0 and u(π, t) = 0 for t > 0
and the initial conditions

∂u
u(x, 0) = x (π − x) and = 0 for 0 < x < π
∂t t=0
Here we will note the key steps and leave the working as an exercise.
Assuming a solution of the form
u(x, t) = F (x) T (t)
226
gives a boundary value problem in terms of x
d2 F
− λF (x) = 0, F (0) = 0 and F (π) = 0
dx2
for x ∈ (0, π) and the second order ODE in terms of t
d2 T
− c2 λT (t) = 0 for t > 0
dt2
The boundary value problem has eigenvalues λn = −n2 (n ∈ N) and corresponding eigenfunctions
Fn (x) = sin(nx) for x ∈ (0, π). Then the second order ODE in terms of t becomes
d2 Tn
+ c2 n2 Tn (t) = 0 for t > 0
dt2
for n ∈ N or equivalently
d2 Tn
+ σn2 Tn (t) = 0 for t > 0
dt2
where σn = cn for n ∈ N.
Therefore, un (x, t) = sin(nx) (An cos(cnt) + Bn sin(cnt)) for arbitrary constants An and Bn (n ∈ N)
are solutions of the wave equation with the given conditions. By superposition principle, the general
solution is
∞
X
u(x, t) = sin(nx) (An cos(cnt) + Bn sin(cnt))
n=1

The initial condition ∂u
∂t t=0 = 0 gives Bn = 0 for all n ∈ N

The initial condition u(x, 0) = x (π − x) gives

∞
X
x (π − x) = An sin(nx)
n=1
The odd periodic extension of f (x) = x (π − x) gives the Fourier sine series representation of f as
∞
X 4 n

x (π − x) ∼ − ((−1) − 1) sin(nx)
n=1
πn3
therefore
4 n
An = (1 − (−1) ) for n ∈ N
πn3
Hence, the solution to this one dimensional wave equation is

∞
X 4
u(x, t) = 3
((1 − (−1)n )) sin(nx) cos(cnt)
n=1
πn
for x ∈ (0, π) and t > 0.
6.6.5 The one dimensional wave equation - physical interpretation
The solutions nπ cnπ

un (x, t) = sin x (An cos(σn t) + Bn sin(σn t)) for σn =
L L
are called the eigenfunctions and σn are the corresponding eigenvalues of the one-dimensional wave
equation. The set of eigenvalues σ1 , σ2 , . . . is called the spectrum.
227
σn
Each solution un (x, t) represents a harmonic motion with frequency cycles per unit time. They are
2π
called the normal modes of the vibration. The amplitude of the vibration of a normal mode is
q nπ
2 2
amplituden = (An ) + (Bn ) sin x
L
nπ
that is, the amplitude is proportional to sin x and therefore depends upon the position along the
L
string. Wherever the amplitude is zero, a node (distinct from a mode) occurs at that point, that is,
there are nodes at the points
L 2L (n − 1) L
x = 0, , ,..., ,L
n n n
For a fundamental mode, that is n = 1 mode, the only nodes are at the endpoints x = 0 and x = L.
For an overtone n = 2, 3, 4, . . . there are n + 1 nodes including the endpoints.
s
T cnπ
Since c = and σn = then
ρ L
s
Lσn T
=
nπ ρ
Therefore the frequency of the vibration is directly proportional to the square root of the horizontal
component of tension T . It is apparent that the greater the tension in the string, the greater the pitch
of the sound.
228
6.7 Further Examples and Applications of PDEs
We have seen
I the advection equation; which can describe the change in some physical quantity due to transport
of a material.
I the heat equation; which can be used to model diffusion phenomena.
I Laplace’s equation; which can be used to model steady state phenomena, that is, where there is
no time-dependence.
I the wave equation; which describes vibrational phenomena.
In this section we will list a number of examples of second order PDEs which you may come across
later.
6.7.1 Poisson’s equation
Problems describing potential field caused by a charge or mass density distribution give rise to Poisson’s
equation for u(x)
∇2 u = f (x)
A common example is Poisson’s equation for electrostatics

ρ
∇2 u = −
0
where is the permittivity of space constant and ρ is charge density.
6.7.2 Helmholtz equation
Many problems related to steady-state oscillations, such as mechanical, acoustical, thermal and elec-
tromagnetic, lead to either the two-dimensional or three dimensional Helmholtz equation for u(x)
∇2 u + k 2 u = 0
for wave number k.
6.7.3 Euler-Bernoulli beam equation
Given a thin beam of length L assuming modest displacement, negligible rotary inertia and stress that
does not vary across the beam section, the Euler-Bernoulli beam equation is
∂4u ∂2u
EI 4
+ ρ 2 = f (x, t)
∂x ∂t
where E is the Young’s modulus, I is the moment of inertia and ρ is the density of the beam. Then
u(x, t) represents the displacement of the beam from the x-axis at time t and f (x, t) represents any
distributed body forces.
6.7.4 The heat conduction equation
∂T ∂T 1
+V = ∇ · (κ∇T )
∂t ∂x ρc
where V is the speed, ρ is the density, c is the specific heat and κ is thermal conductivity. The dependent
variable T (x, t) represents the heat distribution over space and time.
6.7.5 Schrödinger’s equation
Given an electron of mass m and charge e moving around a proton (a hydrogen atom), the motion of
the electron is described by the wave function Ψ(x, t) which satisfies Schrödinger’s equation
2m
∇2 Ψ + (E − V ) Ψ = 0
~
where ~ is Planck’s constant, V is the potential it moves in and E is an eigenvalue which specifies the
energy level.
230

ENG2005 Lecture Book

Uploaded by

Copyright:

Available Formats

ENG2005 Lecture Book

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

ENG2005 Lecture Book

Uploaded by

Copyright:

Available Formats

School of Mathematics

Australia Malaysia South Africa Italy India monash.edu/science

0.1 Assumed background knowledge and skills for ENG2005 . . . . . . . . . . . . . . . . . . . 7

3.4.3 Conservation of Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

4 Fourier Series 128

5 Ordinary Differential Equations 159

6 Partial Differential Equations 193

6.6.4 One dimensional wave equation - an example . . . . . . . . . . . . . . . . . . . . . 226

Geometry and trigonometry

I An appreciation of the symmetry of geometrical objects.

Functions and graphs

I The concepts of complex numbers.

I The concept of how to find the inverse of an n × n-matrix.

Ordinary Differential Equations

I The formal definition of the Laplace transform.

I How to find the inverse of Laplace transforms.

ENG2005 Advanced Engineering Mathematics

1 Version: Monday 22nd February, 2021

1.1.1 Linear systems

Consider a linear system of three equations with three unknowns

a11 x + a12 y + a13 z = b1

Very broadly speaking, there are three cases:

1. The three planes do not intersect.

Example 1.1.1 Consistent and under-determined system

Example 1.1.2 Inconsistent system

where A is the 3 × 3 matrix  

The matrix equation can written as the augmented matrix

1.1.3 Matrices - Gaussian elimination

1. Use row-operations to eliminate elements below the diagonal element.

2. Use row-operations to eliminate elements above the diagonal element.

Possible row operations are:

I Swap row i with row j: Ri ←→ Rj .

I Addition of row i and row j to give a new row i: Ri = Ri + Rj .

1.1.4 Matrices - definitions and operations

The transpose of a row vector v is

I If a row operation is applied to an augmented matrix [ A| u] to give the augmented matrix [ B| v]

1.1.5 Matrices - special matrices

1.1.6 Matrices - Properties of matrices

1.1.7 Matrix inverse

Note the following:

I Start with the augmented matrix [ A| I].

1.2.1 Systems of m linear equations with n unknowns

a11 x1 + a12 x2 + · · · + a1n xn = b1

where aij and bi are real constants (i = 1, 2, . . . , m and j = 1, 2, . . . , n).

Theorem Equivalent statements for n × n matrices

If A is an n × n matrix then the following statements are equivalent:

3. Ax = 0 has only the trivial solution; x = 0

If Ax = b represents a system of m linear equations with n unknowns x1 , x2 , . . . , xn (where n 6= m) we

Definition: Row echelon form of a matrix

A matrix is in row echelon form when

Definition: Basic and free variables

I a variable corresponding to a pivot column is called a basic variable.

I a variable which is not a basic variable is called a free variable.

I the same number of zero rows, and

I the pivot column always occurs in the same position.

By the sequence of row operations:

By the sequence of row operations:

By the sequence of row operations:

By the sequence of row operations: