Mathematical Methods for Oscillations and Waves

Anchored in simple and familiar physics problems, Joel Franklin provides a focused
introduction to mathematical methods in a narrative-driven and structured manner. Ordi-
nary and partial differential equation solving, linear algebra, vector calculus, complex
variables, and numerical methods are all introduced and bear relevance to a wide range
of physical problems. Expanded and novel applications of these methods highlight their
utility in less familiar areas, and advertise those areas that will become more important
as students continue. This demonstrates the utility of each method in progressing to
problems of increasing complexity, while also allowing students to see how a simplified
problem becomes “recomplexified.” Advanced topics include nonlinear partial differential
equations, and relativistic and quantum mechanical variants of problems like the harmonic
oscillator. Physics, mathematics, and engineering students will find 300 problems treated in
a sophisticated manner. The insights emerging from Franklin’s treatment make it a valuable
teaching resource.

Joel Franklin is a professor in the Physics Department of Reed College. His research focuses
on mathematical and computational methods with applications to classical mechanics,
quantum mechanics, electrodynamics, general relativity, and modifications to general
relativity. He is also the author of Advanced Mechanics and General Relativity (Cambridge
University Press, 2010), Computational Methods for Physics (Cambridge University Press,
2013), and Classical Field Theory (Cambridge University Press, 2017).
Mathematical Methods for
Oscillations and Waves

JOEL FRANKLIN
Reed College
University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India
79 Anson Road, #06–04/06, Singapore 079906

Cambridge University Press is part of the University of Cambridge.


It furthers the University’s mission by disseminating knowledge in the pursuit of
education, learning, and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/9781108488228
DOI: 10.1017/9781108769228
© Joel Franklin 2020
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2020
Printed in the United Kingdom by TJ International Ltd, Padstow Cornwall
A catalogue record for this publication is available from the British Library.
ISBN 978-1-108-48822-8 Hardback
Cambridge University Press has no responsibility for the persistence or accuracy
of URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
For Lancaster, Lewis, Oliver, and Mom
Contents

Preface
Acknowledgments

1 Harmonic Oscillator
1.1 Solution Review
1.2 Taylor Expansion
1.3 Conservative Forces
1.4 Series Expansion, Method of Frobenius
1.5 Complex Numbers
1.6 Properties of Exponentials and Logarithms
1.7 Solving First-Order ODEs

2 Damped Harmonic Oscillator
2.1 Damping
2.2 Driven Harmonic Oscillator
2.3 Fourier Series
2.4 Fourier Series and ODEs
2.5 Damped Driven Harmonic Oscillator
2.6 Fourier Transform
2.7 Fourier Transform and ODEs

3 Coupled Oscillators
3.1 Vectors
3.2 Matrices
3.3 Linear Transformations
3.4 Free Oscillator Chain
3.5 Fixed Oscillator Chain

4 The Wave Equation
4.1 Continuum Limit
4.2 Wave Equation for Strings
4.3 Solving the Wave Equation
4.4 Standing Waves
4.5 Plane Waves
4.6 Delays
4.7 Shocks
4.8 Wave Equation with Varying Propagation Speed

5 Integration
5.1 First-Order ODEs
5.2 Two-Dimensional Oscillator
5.3 Period of Motion
5.4 Techniques of Integration
5.5 Relativistic Oscillator
5.6 Relativistic Lengths

6 Waves in Three Dimensions
6.1 Vectors in Three Dimensions
6.2 Derivatives
6.3 Fundamental Theorem of Calculus for Vectors
6.4 Delta Functions in Three Dimensions
6.5 The Laplacian and Harmonic Functions
6.6 Wave Equation
6.7 Laplace’s Equation

7 Other Wave Equations
7.1 Electromagnetic Waves
7.2 Fluids
7.3 Nonlinear Wave Equation
7.4 Schrödinger’s Wave Equation
7.5 Quantum Mechanical Harmonic Oscillator

8 Numerical Methods
8.1 Root-Finding
8.2 Solving ODEs
8.3 Integration
8.4 Finite Difference
8.5 Eigenvalue Problems
8.6 Discrete Fourier Transform

Appendix A Solving ODEs: A Roadmap

Appendix B Vector Calculus: Curvilinear Coordinates
B.1 Cylindrical Coordinates
B.2 A Better Way
B.3 Spherical Coordinates
B.4 Integral Elements

References
Index
Preface

There are many books on “mathematical methods for physics” [1, 3, 15], including
some with that exact title. Most of these are wide-ranging explorations of the physical
applications of fairly deep analytic and group-theoretic mathematics. They cover topics
that one might encounter anywhere from first-year undergraduate to first-year graduate
physics, and remain on our shelves as well-thumbed references and problem libraries. In
addition to a plethora of techniques, they cover all sorts of important special cases that can
keep the naı̈ve physicist out of trouble in a variety of technical situations.
There is also the Internet, itself a repository for all sorts of human knowledge, including
physical, mathematical, and their intersection. The Internet is even more encyclopedic
than most mathematical methods books, with more special cases and more specialized
examples. Here we can find, in almost equal number, the inspiring, the arcane, and the
incorrect. Students of physics, especially early in their studies, need to be sophisticated
and wary.
What is missing in both cases (especially the latter) is narrative: a clear description
of why we care about these methods, and how they are related to diverse, yet logically
connected problems of interest to physicists. Why is it, for example, that the Fourier
transform shows up in the analysis of networks of springs and also in the analysis of
analog circuits? I suggest the reason is that both involve the characterization of timescales
of oscillation and decay, and in some sense, almost all of physics is interested in such
timescales, so there is a universality here that is not shared with, say, the Laplace transform.
Yet Wikipedia and other “online resources” fail to make this point – or rather point and
counterpoint – effectively, because there is no individual curator deciding what goes in,
what stays out, and how much time/space to dedicate to each topic.
This book has such a curator, and I have made choices based on my own research
experience, broadened by the teaching I have done at Reed College, and feedback from
students I have taught. At a small liberal arts college like Reed, the faculty must teach
and advise students in areas beyond the faculty member’s expertise and experience. The
advantage is a generalist’s view, but with the depth that is required to teach extremely
curious and talented students confidently. After all, much of what we sell in physics
is counterintuitive or even wrong (witness: gravity as one of the fundamental forces of
nature). We should expect, delight in, and encourage our students’ skepticism. The topics
in this book are intended to invite students to ask and answer many of the questions I have
been asked by their peers over the past 15 years.
In my department, there is a long tradition of teaching mathematical methods using oscil-
lations and waves as a motivational topic. And there are many appropriate “oscillations and
waves” texts and monographs [9, 16]. These are typically short supplemental books that
exhaustively treat the topic. Yet few of them attempt to extend their reach fully, to include
the mathematical methods that, for example, might be useful to a student of E&M (an
exception is [10], which does have a broader mathematical base). I was inspired by Sidney
Coleman’s remark, “The career of a young theoretical physicist consists of treating the
harmonic oscillator in ever-increasing levels of abstraction.” I am not sure any physics
that I know is particularly far removed from the harmonic oscillator, and embracing that
sentiment gives one plenty of room to maneuver. There is no reason that the mathematical
methods of oscillations and waves can’t serve as a stand-in for “mathematical methods for
physics.”
I have used chapters of the present volume to teach a one-semester undergraduate course
on mathematical physics to second-year physics students. For that audience, I work through
the following chapters:
Chapter 1 Harmonic Oscillator: A review of the problem of motion for masses attached
to springs. That’s the physics of the chapter, a familiar problem from high
school and introductory college classes, meant to orient and refresh the reader.
The mathematical lesson is about series solutions (the method of Frobenius)
for ordinary differential equations (ODEs), and the definition of trigonometric
special functions in terms of the ODEs that they solve. This is the chapter that
reviews complex numbers and the basic properties of ODEs and their solutions
(superposition, continuity, separation of variables).
Chapter 2 Damped Harmonic Oscillator: Here, we add damping to the harmonic
oscillator, and explore the role of the resulting new timescale in the solutions
to the equations of motion. Specifically, the ratio of damping to oscillatory
timescale can be used to identify very different regimes of motion: under-,
critically-, and over-damped. Driving forces are then added, and we consider the
effect they have on the different flavors of damped motion already in place. The main
physical example (beyond springs attached to masses in dashpots) is electrical:
sinusoidally driven resistor-inductor-capacitor (RLC) circuits provide a nice,
experimentally accessible test case. On the mathematical side, the chapter serves
as a thinly veiled introduction to Fourier series and the Fourier transform.
Chapter 3 Coupled Oscillators: We turn next to the case of additional masses. In one
dimension, we can attach masses by springs to achieve collective motions that
occur at a single frequency, the normal modes. Building general solutions, using
superposition, from this “basis” of solutions is physically relevant and requires
a relatively formal treatment of linear algebra, the mathematical topic of the
chapter.
Chapter 4 The Wave Equation: Taking the continuum limit of the chains of masses
from the previous chapter, we arrive at the wave equation, the physical subject
of this chapter. The connection to approximate string motion is an additional
motivation. Viewed as a manifestation of a conservation law, the wave equation
can be extended to other conservative, but nonlinear cases, like traffic flow.
Mathematically, we are interested in turning partial differential equations
(PDEs) into ODEs, making contact with some familiar examples. Making PDEs
into ODEs occurs in a couple of ways – the method of characteristics, and
additive/multiplicative separation of variables are the primary tools.
Chapter 5 Integration: With many physical applications already on the table, in this
chapter, we return to some of the simplified ones and recomplexify them.
These problems require more sophisticated, and incomplete, solutions. Instead
of finding the position of the bob for the simple pendulum, we find the period of
motion for the “real” pendulum. Instead of the classical harmonic oscillator,
with its familiar solution, we study the period of the relativistic harmonic
oscillator, and find that in the high-energy limit, a mass attached to a spring
behaves very differently from its nonrelativistic counterpart.
The eighth chapter, Numerical Methods, is used as a six-week “lab” component, one
section each week. The chapter is relatively self-contained, and consists of numerical
methods that complement the analytic solutions found in the rest of the book. There
are methods for solving ODE problems (both in initial and boundary value form),
approximating integrals, and finding roots. There is also a discussion of the eigenvalue
problem in the context of approximate solutions in quantum mechanics and a section on
the discrete Fourier transform.
There are two additional chapters that provide content when the book is used in an
upper level setting, for third- or fourth-year students. In the sixth chapter, Waves in Three
Dimensions, we explore the wave equation and its solutions in three dimensions. The
chapter’s mathematical focus is on vector calculus, enough to understand and appreciate
the harmonic functions that make up the static solutions to the wave equation. Finally,
the seventh chapter, Other Wave Equations, extends the discussion of waves beyond the
longitudinal oscillations with which we began. Here, we look at the wave equation as
it arises in electricity and magnetism (the three-dimensional context is set up in the
previous chapter), in Euler’s equation and its shallow water approximation, in “realistic”
(extensible) strings, and in the quantum mechanical setting, culminating in a quantum
mechanical treatment of the book’s defining problem, the harmonic oscillator.
There are two appendices to provide review. The first reviews the basic strategy of ODE
solving in a step-by-step way – what guesses to try, and when, with references to the
motivating solutions in the text. The second appendix is a review of basic vector calculus
expressions, like the gradient, divergence, curl, and Laplacian, in cylindrical, spherical,
and more general coordinate systems.
My hope is that this book provides a focused introduction to many of the mathematical
methods used in theoretical physics, and that the vehicles used to present the material
are clear and compelling. I have kept the book as short as possible, yet tried to cover a
variety of different tools and techniques. That coverage is necessarily incomplete, and for
students going on in physics, a copy of one of the larger [1, 3, 15], and more sophisticated
[2, 4, 17] mathematical methods texts will eventually be a welcome necessity, with this
book providing some motivating guidance. (I encourage students to have one of these texts
on hand as they read, so that when a topic like spherical Bessel functions comes up, they
can look at the relevant section for additional information.)

Mary Boas has a wonderful “To the Student” section at the start of [3], an individual call
to action that cannot be improved, so I will quote a portion of it:
To use mathematics effectively in applications, you need not just knowledge, but
skill. Skill can be obtained only through practice. You can obtain a certain superficial
knowledge of mathematics by listening to lectures, but you cannot obtain skill this way. . . .
The only way to develop the skill necessary to use this material in your later courses is to
practice by solving many problems. Always study with pencil and paper at hand. Don’t
just read through a solved problem – try to do it yourself!

Since I was an undergraduate, I have always followed and benefited from this advice, and
so have included a large number of problems in this text.
Acknowledgments

It is a pleasure to thank the students and my colleagues in the physics department


at Reed College. I have benefited from my interactions with them, and in particular,
from discussions about teaching our second-year general physics course with Professors
Lucas Illing, Johnny Powell, and Darrell Schroeter. A very special thanks to Professor
David Latimer, who carefully read and thoughtfully commented on much of this text; his
suggestions have added value to the document, and have been instructive (and fun) to think
about.
My own research background has informed some of the topics and presentation in this
book, and that background has been influenced by many talented physicists and physics
teachers – thanks to my mentors from undergraduate to postdoctoral, Professors Nicholas
Wheeler, Stanley Deser, Sebastian Doniach, and Scott Hughes.
Finally, David Griffiths has, throughout my career, been an honest sounding board, a
source of clarity and wisdom. He has helped me both think about and present physics far
better than I could on my own. I thank him for sharing his insights on this material and my
presentation of it.

1 Harmonic Oscillator

The motivating problem we consider in this chapter is Newton’s second law applied to a
spring with spring constant k and equilibrium spacing a as shown in Figure 1.1.
The nonrelativistic equation of motion reads

mẍ(t) = −k(x(t) − a) (1.1)

and we must specify initial (or boundary) conditions. The point of Newton’s second law
is the determination of the trajectory of the particle, x(t) in this one-dimensional setting.
The initial conditions render the solution unique. Because this is a second-order ordinary
differential equation (ODE), we expect the general solution to have two constants. Then we need two pieces
of information beyond the equation itself to set those constants, and initial or boundary
conditions can be used.

1.1 Solution Review

To proceed, we can define k/m ≡ ω², so that our equation of motion becomes

ẍ(t) = −ω²(x(t) − a), (1.2)

and finally, we let y(t) ≡ x(t) − a in order to remove reference to a and allow us to identify
the solution to this familiar ODE:

ÿ(t) = −ω²y(t) −→ y(t) = A cos(ωt) + B sin(ωt). (1.3)


Fig. 1.1 A mass m is attached to a spring with spring constant k and equilibrium spacing a. It moves without friction under
the influence of a force F = −k(x(t) − a). We want to find the location of the mass at time t, x(t), by solving
Newton’s second law.

The constants A and B have no a priori physical meaning; they are just the constants we
get from a second-order ODE like (1.3).1 The solution for x(t) is
x(t) = y(t) + a = A cos(ωt) + B sin(ωt) + a. (1.4)
Suppose we take the initial value form of the problem. We see the particle at x0 at time
t = 0, moving with velocity v0 . This allows us to algebraically solve for A and B:
x(0) = A + a = x0 −→ A = x0 − a
ẋ(0) = Bω = v0 −→ B = v0/ω. (1.5)
When we combine an ODE (like Newton’s second law) with constants (the initial position
and velocity), we have a well-posed problem and a unique solution. Putting it all together,
the problem is
mẍ(t) = −k(x(t) − a) x(0) = x0 ẋ(0) = v0 (1.6)
with solution

x(t) = (x0 − a) cos(√(k/m) t) + v0 √(m/k) sin(√(k/m) t) + a. (1.7)
There are many physical observations and definitions associated with this solution.
Suppose that we start the mass from rest, v0 = 0, with an initial extension x0 , and we set the
zero of the x axis at the equilibrium spacing a. Then the solution from (1.7) simplifies to

x(t) = x0 cos(√(k/m) t). (1.8)
We call x0 the “amplitude,” the maximum displacement from equilibrium. The “period”
of the motion is defined to be the time it takes for the mass to return to its starting point.
In this case, we start at t = 0, and want to know when the “cosand” (argument of cosine)
returns to 2π ∼ 0. That is, the period T is defined to be the first time at which

x(T) = x0 cos(√(k/m) T) = x0 −→ √(k/m) T = 2π −→ T = 2π√(m/k). (1.9)

This period is, famously, independent of the initial extension.2 That makes some sense,
physically – the larger the initial extension, the greater the maximum speed of the mass,
so that even though it has to travel a longer distance, it does so at a greater speed. Somehow,
magically, the two effects cancel in this special case.
We can also define the “frequency” of the oscillatory motion, which is just the inverse of
the period, f ≡ 1/T. For the mass on a spring motion,

f = (1/(2π))√(k/m), (1.10)
1 These constants are called “constants of integration” and are reminiscent of the constants that appear when
you can actually integrate the equation of motion twice. That happens when, for example, a force depends
only on time. Then you can literally integrate ẍ(t) = F(t)/m twice to find x(t). There will be two constants of
integration that show up in that process.
2 Don’t believe it! See Section 5.5.

and we define the “angular frequency” of the oscillatory motion to be

ω ≡ 2πf = √(k/m), (1.11)
where ω is the letter commonly used, and we have taken advantage of that in writing (1.2).
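
If you like to check a formula like (1.7) numerically, a few lines of code will do it. The sketch below (Python, for illustration only) integrates Newton’s second law directly and compares with the closed-form solution; the particular values of m, k, a, x0, and v0 are arbitrary choices, not taken from the text.

    import math

    # Illustrative parameters (arbitrary choices, not from the text).
    m, k, a = 1.0, 4.0, 0.5
    x0, v0 = 1.0, 0.0
    omega = math.sqrt(k / m)

    def x_exact(t):
        # Equation (1.7): x(t) = (x0 - a) cos(w t) + (v0/w) sin(w t) + a
        return (x0 - a) * math.cos(omega * t) + (v0 / omega) * math.sin(omega * t) + a

    # Crude numerical integration of m x'' = -k (x - a) for comparison.
    dt, steps = 1e-4, 20000
    x, v = x0, v0
    for _ in range(steps):
        acc = -k * (x - a) / m        # acceleration from Newton's second law
        v += acc * dt                 # semi-implicit Euler step
        x += v * dt
    print(x, x_exact(steps * dt))     # the two positions should agree closely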

Problem 1.1.1 What is the solution to Newton’s second law (1.1) with boundary values
given: x(0) = x0 and x(t∗ ) = x∗ (t∗ refers to some specific time at which we are
given the position, x∗ )?
Problem 1.1.2 What happens to the solution in the previous problem if ωt∗ = nπ for
integer n?
Problem 1.1.3 Solve mẍ(t) = F0 for constant force F0 subject to the boundary conditions:
x(0) = x0 , x(t∗ ) = x∗ with x0 and x∗ given. Solve the same problem for a “mixed”
set of conditions: x(0) = x0 and ẋ(t∗ ) = v∗ with x0 and v∗ given.
Problem 1.1.4 For the oscillatory function x(t) = x0 cos(ωt + φ) with constant φ ∈ [0, 2π)
(the “phase”), find the amplitude and period, and sketch one full cycle of this
function.
Problem 1.1.5 Suppose Newton’s second law read: α d³x(t)/dt³ = F(x(t), t) for force F. What are
the units of α in this case? Solve the modified Newton’s second law if the force is
a constant F0 with initial conditions x(0) = x0 and ẋ(0) = v0 . What is the problem
with this solution? (is it, for example, unique?).
Problem 1.1.6 Solve mẍ(t) = F0 cos(ωt) (F0 is a constant with newtons as its unit, ω is a
constant with unit of inverse seconds) for x(t) given x(0) = x0 and ẋ(0) = v0 .

1.2 Taylor Expansion

To appreciate the role of the harmonic oscillator problem in physics, we need to review the
idea of expanding a function f(x) about a particular value x0 and apply it to minima of a
potential energy. We’ll start in this section with the former, called “Taylor expansion.” The
idea is to estimate f(x0 +Δx) for small Δx given the value of the function and its derivatives
at x0 . Our first guess is that the function is unchanged at x0 + Δx,
f(x0 + Δx) ≈ f(x0 ). (1.12)
That’s a fine approximation, but can we improve upon it? Sure: if we knew f(x0 ) and

f′(x0) ≡ df(x)/dx |x=x0 , (1.13)
then we could add in a correction associated with the slope of the line tangent to f(x) at x0:
f(x0 + Δx) ≈ f(x0) + f′(x0)Δx. (1.14)
The picture of this approximation, with the initial estimate and the linear refinement is
shown in Figure 1.2.


Fig. 1.2 Given a function f (x), if we know the value of the function and its derivative at x0 , then we can estimate the value
at a nearby point x0 + Δx for Δx small.

The process continues: we can take the quadratic correction, f″(x0)Δx², and use it to
refine further,

f(x0 + Δx) ≈ f(x0) + f′(x0)Δx + (1/2) f″(x0)Δx². (1.15)
You can keep going to any desired accuracy, with equality restored when an infinite number
of terms are kept,
f(x0 + Δx) = ∑_{j=0}^∞ (1/j!) (d^j f(x)/dx^j)|x=x0 Δx^j , (1.16)

although we rarely need much past the first three terms. Note that the first term that you
drop gives an estimate of the error you are making in the approximation. For example, if
you took only the first two terms in (1.15), you would know that the leading source of error
goes like f″(x0)Δx².
The hardest part of applying Taylor expansion lies in correctly identifying the function
f(x), the point of interest, x0, and the “small” correction Δx. As an example, suppose we
wanted to evaluate √102; then we have f(x) = √x with x0 = 100 and Δx = 2. Why pick
x0 = 100 instead of 101? Because it is easy to compute f(100) = √100 = 10, while for
√101 we have the same basic problem we started out with (i.e. I don’t know the value of
√101 any more than I know √102). Using (1.15) gives the estimate

f(100 + 2) ≈ f(100) + f′(100) × 2 + (1/2) f″(100) × 4
           = √100 + (1/(2√100)) × 2 − (1/2)(1/(4(100)^(3/2))) × 4   (1.17)
           = 10 + 1/10 − 1/2000 = 10.0995

while the “actual” value is √102 ≈ 10.099505.
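
The same estimate can be checked with a couple of lines of code. This is just a sketch (in Python, with the function and its derivatives typed in by hand), not anything prescribed by the text:

    import math

    # Second-order Taylor estimate of sqrt(102) about x0 = 100, as in (1.15)-(1.17).
    f   = lambda x: math.sqrt(x)
    fp  = lambda x: 0.5 / math.sqrt(x)        # f'(x)  = 1/(2 sqrt(x))
    fpp = lambda x: -0.25 * x ** (-1.5)       # f''(x) = -1/(4 x^(3/2))

    x0, dx = 100.0, 2.0
    estimate = f(x0) + fp(x0) * dx + 0.5 * fpp(x0) * dx ** 2
    print(estimate, math.sqrt(102.0))         # 10.0995 versus 10.099504...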

Problem 1.2.1 Evaluate sin(Δx) and cos(Δx) for the values of Δx given in the following
table using a calculator. Then use the Taylor expansions of these functions (to second
order, from (1.15)) to approximate their value at those same Δx (assuming Δx is

small, an assumption that is violated by some of the values so you can see the error
in the approximations). Write out your results to four places after the decimal.
Δx sin(Δx) cos(Δx) sin(Δx) approx. cos(Δx) approx.
.1
.2
.4
.8
Problem 1.2.2 Use Taylor expansion to find the approximate value of the function f(x) =
(a + x)n for constants a and n with x0 = 0, i.e. what is f(Δx) for Δx small (take only
the “leading-order” approximation, in which you just write out the Taylor expansion
through the Δx term as in (1.14))? Using your result, give the Taylor expansion
approximation near zero (again, you’ll write expressions approximating f(Δx)) for:

f(x) = √(1 + x) ≈
f(x) = 1/√(1 + x) ≈
f(x) = 1/(1 + x)² ≈
Problem 1.2.3 Estimate the value of 1/121 using Taylor expansion (to first order in the
small parameter) and compare with the value you get from a calculator. Hint:
121 = (10 + 1)2 .
Problem 1.2.4 For the function f(θ) = (1 + cos θ)⁻¹, estimate f(π/2 + Δθ) for Δθ ≪ 1
using (1.15). Try it again by Taylor expanding the cos θ function first, then expanding
in the inverse polynomial, a two-step process that should yield the same result (up to
errors that we have ignored in both cases).

1.3 Conservative Forces

The spring force starts off life as rusty bits of metal providing a roughly linear restoring
force. But the model’s utility in physics has little to do with the coiled metal itself. Instead,
the “harmonic” oscillator behavior is really the dominant response of a particle moving
near the equilibrium of a potential energy function. To review, a conservative force F comes
from the derivative of a potential energy U via:
dU(x)
F(x) = − . (1.18)
dx
If we have a potential energy U(x) (from whatever physical configuration), then a point of
equilibrium is defined to be one for which the force vanishes. For xe a point of equilibrium,

F(xe) = 0 = − dU(x)/dx |x=xe ≡ −U′(xe). (1.19)

Now if we expand the potential energy function U(x) about the point xe using Taylor
expansion:
U(x) = U((x − xe) + xe) = U(xe) + Δx U′(xe) + (1/2) Δx² U″(xe) + · · · , (1.20)

(with Δx ≡ x − xe)

then the first term is just a constant, and that will not contribute to the force in the vicinity
of xe (since we take a derivative with respect to x sitting inside Δx to get the force). The
second term vanishes by the assumption that xe is a point of equilibrium, and the first
term that informs the dynamics of a particle moving in the vicinity of xe is the third term
∼ (1/2)U″(xe)(x − xe)², leading to a force, near xe:

F(x) = −dU(x)/dx ≈ −U″(xe)(x − xe) + · · · (1.21)
The effective force in the vicinity of the equilibrium is just a linear restoring force with
“spring constant” k ∼ U″(xe) (assuming U″(xe) > 0 so that the equilibrium represents
a local minimum) and equilibrium spacing xe . A picture of a local minimum in the
potential energy and the associated force is shown in Figure 1.3. Near xe , the potential is
approximately quadratic, and the force is a linear restoring force of the sort we have been
studying. There is also an equilibrium point at the maximum of U(x) in that picture, but
the associated force tends to drive motion away from this second equilibrium. We call such
locations points of “unstable equilibrium”: even a small perturbation from the equilibrium
location drives masses away.
As an example, suppose we have somehow managed to set up a potential energy of the
form U(x) = U0 cos(2πx/ℓ) for a length ℓ and constant U0 > 0. What is the period of
motion for a particle that starts out “near” xe = ℓ/2? In this case, the equilibrium position


Fig. 1.3 A potential energy U(x) (lower plot) with associated force F(x) = −U′(x) (upper). A minimum in the potential
has an approximately linear restoring force associated with it. We can approximate the force, in the vicinity of xe
(bracketed with dotted lines), by F(x) ≈ −U″(xe)(x − xe). The maximum at xm is also a point of equilibrium,
but this one is “unstable”: if a particle starts near xm it tends to get driven away from xm (the slope of the force
is positive).

is at xe = ℓ/2, with U′(xe) = 0 as required. The second derivative sets the effective spring
constant, k = U″(xe) = (2π/ℓ)² U0, and then the period and frequency of the resulting
oscillatory motion come from (1.9) and (1.10):

T = 2π√(m/U″(xe)) = √(mℓ²/U0)        f = √(U0/(mℓ²)). (1.22)
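
As a sanity check, one can extract the effective spring constant numerically: a centered finite difference for U″(xe), and then T = 2π√(m/U″(xe)). The sketch below does this in Python; the values of U0, ℓ, and m are illustrative placeholders, not taken from the text.

    import math

    # Effective spring constant U''(xe) by centered finite difference, for
    # U(x) = U0 cos(2 pi x / ell); U0, ell, m are placeholder values.
    U0, ell, m = 2.0, 1.5, 0.3
    U = lambda x: U0 * math.cos(2.0 * math.pi * x / ell)

    xe = ell / 2.0                                       # the minimum found above
    h = 1e-5
    k_eff = (U(xe + h) - 2.0 * U(xe) + U(xe - h)) / h**2

    T_numeric = 2.0 * math.pi * math.sqrt(m / k_eff)
    T_exact = math.sqrt(m * ell**2 / U0)                 # from (1.22)
    print(T_numeric, T_exact)                            # the two should agree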

1.3.1 Conservation of Energy


It is worth reviewing the utility of conservation of energy. In particular, while we have
a complete solution to our problem, in (1.4), it is not always possible to find such a
complete solution. In those cases, we retreat to a “partial” solution, where we can still
make quantitative predictions (and hence physical progress), but we might not have the
“whole” story.
Let’s go back to Newton’s second law, this time for an arbitrary potential energy U(x),
but with initial values specified. Think of the ODE piece of the “problem”:
dU(x)
mẍ(t) = − . (1.23)
dx
Now, the only ODEs that I can solve in closed form are ones in which both sides are a total
derivative, in this case, a total time-derivative. Then to integrate, you just remove the d/dt
from both sides, and add in a constant – that process returns a function for ẋ(t). Then, when
possible, you integrate again to get x(t) (picking up another constant). This direct approach
will be considered in Section 5.1.
Looking at the left side of (1.23), it is clear that we have a total time derivative:
mẍ(t) = d(mẋ(t))/dt, but what about the right-hand side? Is there a function W(x) such that

dW(x(t))/dt = −dU(x)/dx ? (1.24)
The answer is no. The reason is clear: If we had a function evaluated at x(t), W(x(t)), then
the total time derivative of W would look like
dW(x(t))/dt = (dW(x)/dx)(dx(t)/dt) = (dW(x)/dx) ẋ(t) (1.25)
and there is no ẋ(t) that appears on the right in (1.23). The fix is easy, just put an ẋ(t) on
the right-hand side of (1.23), which requires putting one on the left-hand side as well. Then
Newton’s second law looks like
mẋ(t)ẍ(t) = −(dU(x)/dx) ẋ(t). (1.26)
The situation on the right is now very good, since we can write the right-hand side as a
total time derivative:
−dU(x(t))/dt = −(dU(x)/dx) ẋ(t). (1.27)

There is potential trouble on the left-hand side of Newton’s second law, though. Can
ẋ(t)ẍ(t) be written as a total time derivative? Yes, note that
d(ẋ(t)²)/dt = 2ẋ(t)ẍ(t). (1.28)
Just multiplying Newton’s second law by ẋ(t) on both sides has given us the integrable
equation

d/dt ( (1/2) m ẋ(t)² ) = −dU(x(t))/dt −→ (1/2) m ẋ(t)² = −U(x(t)) + E (1.29)

where E is the constant of integration. We could re-write (1.29) as

(1/2) m ẋ(t)² + U(x(t)) = E. (1.30)
This represents an interesting situation – the combination of the time-dependent terms on
the left yields a time in-dependent term on the right. Thinking about units (or dimensions,
if that’s what you are into) we have, in the first term of (1.30) a “kinetic” energy (energy
because of the units, kinetic because the term is associated with movement through its
dependence on ẋ(t)) and a “potential” energy (again from units, and the dependence on,
this time, position). The sum is a constant of the motion of the particle, E, the “total” energy
of the system. That is the statement of energy conservation expressed by (1.30). Because E
is a constant, we can set its value from the provided initial conditions: x(0) = x0 , ẋ(0) = v0 ,
(1/2) m v0² + U(x0) = E. (1.31)
The integration of Newton’s second law, in the presence of a “conservative” force, given
by (1.30) is notable for its predictive ability – if you tell me where the particle is, its location
at time t, x(t), I can tell you how fast it is moving. Using the constant value of E set by the
initial conditions for the motion (1.31), we can write (1.30) as
(1/2) m ẋ(t)² + U(x(t)) = (1/2) m v0² + U(x0), (1.32)

and then

ẋ(t) = ± [ v0² + (2/m)(U(x0) − U(x(t))) ]^(1/2) (1.33)
gives the speed (taking the positive root), at time t, of the particle at location x(t).
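
Equation (1.33) is easy to turn into a small utility: given U(x), the initial data, and a position, it returns the speed there (or complains if the position is not energetically allowed). The following Python sketch uses the harmonic potential as an example; the function name and the numbers are illustrative, not from the text.

    import math

    def speed(x, x0, v0, U, m):
        """Speed at position x from energy conservation, equation (1.33)."""
        arg = v0**2 + (2.0 / m) * (U(x0) - U(x))
        if arg < 0.0:
            raise ValueError("x is not an energetically allowed position")
        return math.sqrt(arg)

    # Example: harmonic potential U(x) = (1/2) k (x - a)^2, illustrative numbers.
    k, a, m = 4.0, 0.0, 1.0
    U = lambda x: 0.5 * k * (x - a)**2
    print(speed(0.0, 1.0, 0.0, U, m))   # released from rest at x0 = 1: sqrt(k/m) = 2.0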

1.3.2 Harmonic Oscillator


The harmonic oscillator potential energy is just the quadratic U(x) = (1/2)k(x − a)² for
equilibrium location a. In this case, the quadratic expansion of U(x) about the equilibrium
consists of just the one term. We know what happens here: a particle oscillates about the
equilibrium value with frequency governed by √(k/m). If you think of the graph of the
potential energy, we have a convex curve, and if you draw a line representing energy E as
in Figure 1.4, you can tell the “story of the motion”: where the value of E intersects the


Fig. 1.4 A particle with energy E moving in a quadratic potential well. The particle is at rest where E intersects the
potential energy function, and achieves its maximum speed where the difference between E and U(x) is largest.


Fig. 1.5 A particle with energy E moving in a quadratic potential with a maximum. The particle cannot exist “underneath”
the potential energy curve.

potential energy, the particle must be at rest (all the energy is potential, none is kinetic); the
point at which the difference between E and U(x) is largest represents the position at which
the maximum speed of the particle occurs. Locations where U(x) > E are impossible to
achieve physically, since the kinetic energy would have to be negative.
What if the potential energy had the form U(x) = −(1/2)k(x − a)², with a concave
graph? This time, a particle tends to move away from the equilibrium position, without
returning to it. Thinking of motion at a fixed E, as in Figure 1.5, the particle speeds up
as it gets further away from the equilibrium location. This is an example of an “unstable”
equilibrium: particles that start near a are driven away from it. For the usual harmonic
potential, with its + sign, the equilibrium point is “stable”: if you start near equilibrium,
you remain near it. In a more general potential energy landscape, the sign of the second
derivative of the potential energy function, evaluated at a point of equilibrium, determines
whether the equilibrium is stable (positive) or unstable (negative).


Fig. 1.6 A mass m leaves the surface of the Earth with speed vesc and comes to rest infinitely far away.

1.3.3 Escape Speed

Conservation of energy can be used to make quantitative predictions even when the full
position of an object as a function of time is not known. As an example, consider a spherical
mass M of radius R. A test particle3 of mass m at a distance r ≥ R from the center of the
sphere experiences a force with magnitude
F = GMm/r² (1.34)
where G is the gravitational constant. This force comes from a potential energy U(r) =
−GMm/r. Suppose the test mass starts from the surface of the sphere with a speed vesc .
We want to know the minimum value for vesc that allows the test mass to escape the
gravitational pull of the spherical body. That special speed is called the “escape speed.”
Think of the spherical body as the earth, and the test mass is a ball that you throw up into
the air. The ball goes up and comes back down. If you throw the ball up in the air a little
faster, it takes longer to come down. The escape speed is the (minimum) speed at which
you must throw the ball up so that it never comes back down.
Formally, we want the test particle to reach r → ∞ where it will be at rest as shown on
the right in Figure 1.6.4 From (1.30), if we take “x(t) = r → ∞” with “ẋ(t) = ṙ → 0,” and
use the potential energy associated with gravity, we have E = 0. Going back to the initial
values, which must of course have the same energy,
(1/2) m vesc² + U(R) = E = 0, (1.35)

and then the escape speed can be isolated algebraically

vesc = √(−(2/m) U(R)) = √(2GM/R). (1.36)
3 “Test particle” is a technical term that means “a particle that feels the effect of a force without contributing to
it.” When we want to probe the gravitational force associated with some external body, we often imagine a test
particle’s response to that force. The same idea shows up in electricity and magnetism.
4 That’s the “minimum” part of the requirement. You could have the test particle rocketing around at spatial
infinity, but that excess speed is overkill.
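
Putting numbers into (1.36) for the earth is a one-liner. The Python sketch below uses rounded values for G, M, and R (approximate, not quoted from the text) and returns roughly 11 km/s.

    import math

    # Escape speed from (1.36): v_esc = sqrt(2 G M / R), with rounded Earth values.
    G = 6.67e-11        # N m^2 / kg^2
    M = 5.97e24         # kg   (mass of the earth, approximate)
    R = 6.37e6          # m    (radius of the earth, approximate)
    print(math.sqrt(2.0 * G * M / R))   # about 1.1e4 m/s, i.e. roughly 11 km/s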


Fig. 1.7 A mass m is attached to a rope of length L. The rope makes an angle of θ (t) with respect to the vertical axis.

1.3.4 Pendulum
Another case of interest that can be simplified using conservation of energy (1.30) is the
pendulum. A pendulum consists of a mass m (the “bob”) attached to a rope of length L
that is allowed to swing back and forth. We can describe the bob’s motion by finding the
angle θ(t) that the rope makes with respect to vertical, as shown in Figure 1.7. The kinetic
energy of the pendulum mass is (1/2)m(Lθ̇)² and its potential energy is U = mgh with
h = L(1 − cos θ). Conservation of energy can be expressed as

E = (1/2) m L² θ̇² + mgL(1 − cos θ). (1.37)
If the pendulum starts from rest at an angle θ0 , then E = mgL(1− cos θ0 ), and we can solve
for θ̇²:

θ̇² = (2g/L)(cos θ − cos θ0). (1.38)
This is interesting: it already tells us that the value of θ must fall between −θ0 and θ0 to
keep θ̇² ≥ 0, so that the largest |θ| can be, at any given time, is θ0.
If we take the time-derivative of (1.38), to eliminate the constant and remove the sign
ambiguity associated with taking the square root,5 we get
θ̈ = −(g/L) sin θ. (1.39)
Since we know the motion is bounded by θ0 , if the starting angle is small, then θ will be
small for all times. In that case, we can make the small angle approximation sin θ ≈ θ to
write
θ̈ ≈ −(g/L) θ (1.40)
which is of the form of the harmonic oscillator. The solution, with appropriate initial
values, is
5 That process of taking a first-order quadratic equation and making it a second-order linear one is almost
always a better idea than trying to handle the signs associated with the square roots in the original formulation,
see Problem 1.3.11.

Fig. 1.8 For Problem 1.3.5. Two masses are initially separated by a distance d. Find the force and acceleration of each mass,
solve Newton’s second law for x1 (t) and x2 (t) if m2 = −m1 .

 
θ(t) = θ0 cos(√(g/L) t), (1.41)

with period T = 2π√(L/g). This is an exact solution to the approximate problem defined
by (1.40), hence an approximate solution to the full problem in (1.39).
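
To see how good the small angle approximation is, one can integrate (1.39) directly and compare the resulting period with 2π√(L/g). Here is a rough Python sketch; the time step, starting angle, and parameters are illustrative choices only (see also Problem 1.3.9).

    import math

    # Period of the full pendulum (1.39) by direct integration, compared with the
    # small-angle value 2 pi sqrt(L/g) from (1.41). All numbers are illustrative.
    g, L = 9.8, 1.0
    theta0 = 1.0                    # radians; shrink this and the two periods converge

    dt = 1e-6
    theta, omega = theta0, 0.0      # released from rest
    t = 0.0
    while True:
        alpha = -(g / L) * math.sin(theta)   # equation (1.39)
        omega += alpha * dt
        theta += omega * dt
        t += dt
        if omega > 0.0:             # the swing has turned around: half a period
            break
    print(2.0 * t, 2.0 * math.pi * math.sqrt(L / g))   # compare (cf. Problem 1.3.9)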
Problem 1.3.1 A spring with spring constant k and equilibrium spacing a is hung from
the ceiling, and a mass m attached at the bottom. Find the equilibrium position of
the mass, and the period of oscillation if you released the mass from rest at some
nonequilibrium extension.
Problem 1.3.2 For gravity near the surface of the earth, the magnitude of the force is
F = GMm/(R + r)²,
where r is the height above the earth’s surface. Using Taylor expansion (assuming r
is small), find the first nonconstant term in the force expression. Take the radius R
and mass M for the earth and evaluate the constant term for a mass m that is 1 kg
(it should look familiar). How big is the first-order correction for a 1 kg mass at a
distance of 1 m above the surface of the earth?
Problem 1.3.3 One can turn (1.36) around, and ask: “for what radius sphere is a particular
v the escape speed?” At what radius is the escape speed the speed of light? Calculate
that special radius for the earth and sun. If all of the mass is packed into a sphere of
this radius (or less), the resulting object is called a black hole, and the radius is the
“event horizon.”6
Problem 1.3.4 Find the escape speed for a Yukawa potential, U(r) = −U0 e^(−μr)/(4πr) for
constant μ with dimension of inverse-length and constant U0 to set the magnitude.
Problem 1.3.5 Two masses, m1 and m2 sit a distance d apart (working in one dimension, call
it “x”). Find the (Newtonian) gravitational force acting on m1 and m2 . What is the
acceleration of each mass? Suppose we set m2 = −m1 with m1 > 0. What happens
to the forces on each mass? What is the acceleration of each mass? Provide a solution
to the equation of motion (i.e. Newton’s second law) for the masses assuming they
start from rest with x1 (0) = −d/2, x2 (0) = d/2 in this negative mass case. It is
fun to think about the implications for energy and momentum conservation in this
problem.

6 This is a nongeneral-relativistic calculation first carried out separately by Michell (1784) and Laplace (1796).
The result in general relativity is numerically identical.

Problem 1.3.6 Assuming a force of the form F = −k/x³ for constant k: What are the units
of k? What is the potential energy U(x) from which the force comes? What is the
escape speed associated with this force (assume you are “blasting off ” from a sphere
of radius R that exerts this force on your rocket of mass m). Check that your escape
speed expression has the units of speed.
Problem 1.3.7 For the potential energy U(x) = U0 sin(x/ℓ) cos(x/ℓ) with constants U0 (unit
of energy) and ℓ (unit of length), what is the minimum on the domain x/ℓ ∈ [0, π]?
What is the period of oscillation for motion near this minimum (assume mass m for
the moving object)?
Problem 1.3.8 The force on a charge Q due to another charge q has magnitude F =
Qq/(4πε0 r²) where r is the distance between the (point) charges, and ε0 is a
proportionality constant. The force is attractive for charges of opposite sign, and
repulsive if charges have the same sign. Referring to the configuration, find the net
force on Q > 0 (at location x) due to the pair of charges q > 0 (at 0) and −q (at d)
assuming that the individual forces add. For x ≫ d, find the first nonzero term in the
Taylor expansion of the net force on Q.

q (at 0)        −q (at d)        Q (at x)

Problem 1.3.9 Is the period of motion for a “real” pendulum, with motion governed
by (1.39), longer or shorter than the small angle “simple” pendulum, with (1.40)
as its equation of motion? Hint: plot the right-hand sides of each of these equations
of motion and compare magnitudes.
Problem 1.3.10 For the energy curve and particle energy shown in Figure 1.5, identify the
regions of the graph where particle motion is classically allowed. In the allowed
region, where does the maximum/minimum speed occur (mark the points on the
figure)?
Problem 1.3.11 Given a potential energy U(x), conservation of energy in one dimension
implies that we can write

ẋ(t) = ±√( (2/m)(E − U(x(t))) ),
and it would be nice to solve this equation. But we don’t know which sign to pick
for ẋ(t). Show that for either sign, taking the time derivative of this equation yields
the same equation for ẍ(t) indicating that we should solve that instead of this first
derivative form if we want to get x(t) without worrying about changing signs for
velocity.
Problem 1.3.12 In the following plot, we see a potential energy curve U(x).

[Plot: a potential energy curve U(x), with the energy E indicated as a dashed horizontal line and five positions labeled I–V along the x axis.]

a. For a particle that has energy E (shown as the dashed line), which of the
positions, I–V, are allowed locations for the particle? At which allowed location
would the particle be traveling the fastest? How about the location at which the
particle is traveling the slowest?
b. Which locations have oscillatory motion associated with a particle placed in
their vicinity (ignore the E line for this and the next part of the problem)?
c. For the locations that support oscillatory motion, which one has the longest
period?

Problem 1.3.13 Gauss’s law for electricity and magnetism reads: ∮ E · da = Qenc /ε0 where
the integration on the left is over a closed surface, and Qenc is the charge enclosed by
that surface. There is a similar integral form for Newtonian gravity – the gravitational
field7 H, integrated over a closed surface is related to the mass enclosed by the
surface as follows:

∮ H · da = −4πG Menc

where G ≈ 6.67 × 10⁻¹¹ N m²/kg² is the gravitational constant.


a. Use this relation to find the gravitational field inside a sphere of radius R with
uniform mass density ρ0 (mass-per-unit-volume).
b. Assuming the earth is a uniform sphere with constant mass density, if we dug
a hole from the north pole to the south and dropped a mass m down it (starting from
rest), how long, in minutes, would it take to return?

1.4 Series Expansion, Method of Frobenius

We solved (1.3) by “knowing the answer.” In a sense, the oscillatory cosine and sine (and
their parent, the complex exponential) are defined to be the solution to the differential
equation: ÿ(t) = −ω²y(t). But how would we find these given just the ODE itself? One
approach, known as the “Method of Frobenius,” is to assume a solution of the form

7 The gravitational field is related to the force on a mass m by: F = mH, and the field has units of N/kg.



y(t) = t^p ∑_{j=0}^∞ aj t^j (1.42)

where p and {aj}_{j=0}^∞ are constants. This power series solution is motivated by, for example,
the Taylor series expansion. For most functions f(t), we know that
f(t) = ∑_{j=0}^∞ (1/j!) (d^j f(t)/dt^j)|t=0 t^j , (1.43)

(with the derivative coefficient playing the role of “aj”)

which shares the form of (1.42).

1.4.1 Exponential
To demonstrate the process clearly, let’s suppose we were given the first-order ODE:
ẏ(t) = y(t), with y(0) = y0 also provided. We are asking for a function that is equal to
its derivative. There are interesting graphical ways of producing a plot of such a function
(where the tangent to the curve y(t), defining the local slope, is itself equal to y(t)), but we
can use the Method of Frobenius to generate a quantitative solution. Assume y(t) takes the
form given in (1.42), then we can calculate the derivative
ẏ(t) = t^p ∑_{j=0}^∞ aj (j + p) t^(j−1) = t^p [ a0 p t^(−1) + ∑_{j=0}^∞ aj+1 (j + p + 1) t^j ] , (1.44)

where the sum has been reindexed and written out so that the derivative can be expressed
as a sum from j = 0 → ∞, matching the sum limits in y(t) itself.
Now writing the ODE in the form ẏ(t) − y(t) = 0, and inserting the expansions, we get
t^p [ a0 p t^(−1) + ∑_{j=0}^∞ aj+1 (j + p + 1) t^j ] − t^p ∑_{j=0}^∞ aj t^j = 0. (1.45)

Because the coefficients {aj}_{j=0}^∞ do not depend on t, the only way for the sum in (1.45)
to be zero for all values of t is if each power of t vanishes separately. One cannot, for all
times t, kill an a1 t term with an a2 t² term, for example, so each power of t must have a
zero coefficient in front of it. Writing the equation with the coefficient of t^j isolated, we
can explore the implication of requiring that it be zero,
t^p [ a0 p t^(−1) + ∑_{j=0}^∞ (aj+1 (j + p + 1) − aj) t^j ] = 0. (1.46)

The coefficient preceding t^j in the sum provides a “recursion relation.” By requiring that
each term multiplying t^j vanish, we are demanding that

aj+1 = aj / (j + p + 1) (1.47)

which relates the aj+1 coefficient of y(t) to the aj one. That recursion ensures that the
infinite sum in (1.46) vanishes, but we still have the t^(−1) term out front. We could
take a0 = 0 to start things off, but then the recursion relation tells us that all the other
coefficients are zero (the problem here is that we cannot match the initial value, unless
y0 = 0). Instead, we must take p = 0, at which point the recursion becomes
aj+1 = aj / (j + 1). (1.48)
We can “solve” the recursion by writing aj in terms of the starting value, a0 , and this can
be done by inspecting terms. The first few look like,
a1 = a0
a2 = a1/2 = a0/2 (1.49)
a3 = a2/3 = a0/6
from which it is pretty clear that aj = a0 /( j! ). The sum, with these coefficients, is
y(t) = a0 ∑_{j=0}^∞ t^j / j! . (1.50)

Finally, y(0) = y0 , so we pick a0 = y0 to match the provided initial value. The sum
in (1.50) comes up so often that it has its own name:8 the “exponential” of x is defined as

e^x ≡ ∑_{j=0}^∞ x^j / j! , (1.51)

and the familiar properties of exponentials follow from this definition. As an example of
one of these, we’ll sketch the proof that e^(x+y) = e^x e^y. We can expand the right-hand side
from the product of the sums:

e^x e^y = ( ∑_{j=0}^∞ x^j/j! ) ( ∑_{k=0}^∞ y^k/k! )
        = (1 + x + x²/2 + · · · )(1 + y + y²/2 + · · · )
        = 1 + (x + y) + (1/2)(x² + 2xy + y²) + · · ·        (1.52)
        = ∑_{m=0}^∞ (x + y)^m / m!

which is e^(x+y).

8 Certain functions show up so much that we identify them by a common name rather than the more precise set
of coefficients in the infinite sum.
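
The recursion (1.48) can also be checked numerically: generate the coefficients one at a time, sum the series, and compare with a library exponential. Here is a minimal Python sketch (the values of y0 and t, and the number of terms kept, are arbitrary choices):

    import math

    # Summing y(t) = sum_j a_j t^j with a_{j+1} = a_j/(j+1) from (1.48), a_0 = y_0;
    # the result should match y_0 * exp(t). The values below are arbitrary.
    y0, t, nterms = 2.0, 1.3, 30

    a, total = y0, 0.0
    for j in range(nterms):
        total += a * t**j
        a = a / (j + 1)             # recursion relation (1.48)
    print(total, y0 * math.exp(t))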

1.4.2 Harmonic Oscillator


The starting point in (1.42), used in the equation of motion for the harmonic oscillator,
ÿ(t) = −ω²y(t), will allow us to solve for the coefficients {aj}_{j=0}^∞ (and p). This time,
we need to write the second derivative in terms of the unknown coefficients. Working
from (1.44), we can differentiate to get the second derivative of y(t),


ÿ(t) = t^p ∑_{j=0}^∞ aj (j + p)(j + p − 1) t^(j−2)
     = t^p [ a0 p(p − 1) t^(−2) + a1 (p + 1)p t^(−1) + ∑_{j=0}^∞ aj+2 (j + p + 2)(j + p + 1) t^j ] (1.53)

where we have again re-indexed and extracted terms so as to start all sums at j = 0 with t^j
in the summand.
The ODE of interest here can be written as ÿ(t) + ω²y(t) = 0, and then inserting the
sums for ÿ(t) and y(t), and collecting in powers of t, we have

a0 p(p − 1) t^(−2) + a1 (p + 1)p t^(−1) + ∑_{j=0}^∞ [ aj+2 (j + p + 2)(j + p + 1) + aj ω² ] t^j = 0. (1.54)

Looking at the t^(−2) term, we can get this to be zero if p = 0, p = 1, or a0 = 0. Let’s take
p = 0; then (1.54) becomes

∑_{j=0}^∞ [ aj+2 (j + 2)(j + 1) + aj ω² ] t^j = 0, (1.55)

so that in order for each coefficient of t^j to vanish separately, we must have

aj+2 = − aj ω² / ((j + 2)(j + 1)). (1.56)
This is a recursion relation that links the coefficient aj+2 to the coefficient aj . Given a0 , the
first few terms are
a2 = −a0 ω²/2
a4 = −a2 ω²/12 = a0 ω⁴/24 (1.57)
a6 = −a4 ω²/30 = −a0 ω⁶/720.
We can now see the pattern:

a2k = (−1)^k (ω^(2k)/(2k)!) a0 for k = 0, 1, . . . . (1.58)

For the odd coefficients, take a1 as given:


a3 = −a1 ω²/6
a5 = −a3 ω²/20 = a1 ω⁴/120 (1.59)
a7 = −a5 ω²/42 = −a1 ω⁶/5040
from which
a2k+1 = (−1)^k (ω^(2k)/(2k + 1)!) a1 . (1.60)
The full solution, obtained by putting the expressions for the coefficients {aj}_{j=0}^∞
into (1.42), is

y(t) = a0 ∑_{k=0}^∞ (−1)^k (ω^(2k)/(2k)!) t^(2k) + (a1/ω) ∑_{k=0}^∞ (−1)^k (ω^(2k+1)/(2k + 1)!) t^(2k+1)
     = a0 cos(ωt) + (a1/ω) sin(ωt), (1.61)
where the sums themselves define the cosine and sine functions. We have a two-parameter
family of solutions here, with a0 and a1 available to set initial or boundary conditions. This
is to be expected given the starting second-order ODE. If we had taken p = 1 to eliminate
the first term in (1.54), we would have to set a1 = 0 to get rid of the second term, and
we would recover the even powers of t in the sum (which, with the overall factor of t, gives
the sine solution). Similarly, if we took p = −1 to kill the second term, we’d be forced to
take a0 = 0 and would recover the odd powers of t in the sum, this time defining cosine.
From the current point of view, what we have is a pair of infinite sums that represent the
independent solutions to the ODE ÿ(t) + ω²y(t) = 0. These sums are given special names
(because they show up a lot):

cos θ ≡ ∑_{k=0}^∞ (−1)^k θ^(2k)/(2k)!        sin θ ≡ ∑_{k=0}^∞ (−1)^k θ^(2k+1)/(2k + 1)! . (1.62)

All of the properties of cosine and sine are contained in these expressions. For example,
we can take the derivative of the terms in the sum9 to evaluate the derivatives of cosine
and sine:
d cos θ/dθ = ∑_{k=1}^∞ (−1)^k 2k θ^(2k−1)/(2k)!
           = ∑_{ℓ=0}^∞ (−1)^(ℓ+1) θ^(2ℓ+1)/(2ℓ + 1)!   (1.63)
           = − ∑_{k=0}^∞ (−1)^k θ^(2k+1)/(2k + 1)! = − sin θ.

9 Throughout this book, we will take the physicist’s view that “all is well” – sums converge, and we can
interchange summation and differentiation, etc.
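
The same kind of numerical check works here: run the recursion (1.56) with a0 = 1 and a1 = 0 and sum the resulting series, which should reproduce cos(ωt). A short Python sketch, with ω and t chosen arbitrarily:

    import math

    # Summing the cosine series of (1.62) using the recursion (1.56) with p = 0,
    # a_0 = 1, a_1 = 0 (the "cosine" branch); omega and t are arbitrary here.
    omega, t = 2.0, 0.7
    a, total = 1.0, 0.0             # a_0 = 1
    for j in range(0, 40, 2):       # only even coefficients are nonzero
        total += a * t**j
        a = -a * omega**2 / ((j + 2) * (j + 1))   # recursion (1.56)
    print(total, math.cos(omega * t))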

Problem 1.4.1 By differentiating each term in

f(t) = f0 ∑_{j=0}^∞ t^j / j! ,

show that df(t)/dt = f(t) explicitly.
Problem 1.4.2 Use the series approach to find the solution to the ODE:
df(t)/dt = αf(t)
(for constant α) with f(0) = f0 given.
Problem 1.4.3 Take the derivative of sin θ from the defining sum in (1.62).
Problem 1.4.4 Using the Frobenius series solution method, solve:
t² f̈(t) + t ḟ(t) + t² f(t) = 0

starting from

f(t) = t^p ∑_{j=0}^∞ aj t^j .

You will end up with two terms (from the t 0 and t 1 powers) that can be used to set p
and a1 , take a1 = 0. Find the “even” coefficients using the recursion relation you get
from demanding that all terms in the infinite sum (which goes from 2 to ∞) vanish
individually. Use the initial value f(0) = f0 to set a0 and write the infinite sum in
terms of an index k = 0, 1, 2, . . . (because you are finding the “even” coefficients,
only t 2k will appear in your sum). Be on the lookout for functions of factorials of k
(like, for example, (k! )2 ). Treat f(t) and t as dimensionless variables (i.e. don’t worry
about units, if you normally worry about units!). The function you get is called a
“Bessel function” (in particular, the “zeroth” Bessel function).
Problem 1.4.5 Solve the ODE
 
d/dx ( x² df(x)/dx ) + (x² − 2) f(x) = 0
using the Frobenius method: Start from the usual sum, set p = 0, and solve the
recursion relation to find the odd (in powers of x) solution. This function is an
example of a “spherical Bessel function.”

1.5 Complex Numbers

There is a connection between exponentials and the cosine and sine functions. To establish
that connection, we need to review some definitions and properties of complex numbers.
A complex number can be represented by a pair of real numbers: z = a + ib for real a
and b, and where the imaginary i ≡ √−1. The “real” part of z is a, the “imaginary” part
is b. We can think of z as a location in the two-dimensional plane, where a represents the


Fig. 1.9 The complex plane is spanned by a horizontal real (“Re”) axis, and a vertical imaginary (“Im”) axis. We locate
points z in the plane by specifying their real and imaginary pieces. We can do this in Cartesian coordinates, where a
and b represent distances along the horizontal and vertical axes, or in polar coordinates where s is the distance
from the origin to the point z and φ is the angle that z makes with the real axis.

horizontal distance to the origin, b the vertical. Then the angle that a line drawn from the
origin to the point (a, b) makes, with respect to the horizontal axis, is φ = tan⁻¹(b/a). The
distance from the origin to the point (a, b) is s = √(a² + b²). In terms of s and φ, we can
write the “Cartesian” components of the complex number, a and b, in terms of the “polar”
components, s and φ: a = s cos φ, and b = s sin φ. We have

z = a + ib = s(cos φ + i sin φ) , (1.64)

the two-dimensional picture to keep in mind is shown in Figure 1.9.


Using the summation form from (1.51), we can write the exponential of a complex
argument as

e^(ix) = ∑_{j=0}^∞ i^j x^j/j! = ∑_{j=0}^∞ (−1)^j x^(2j)/(2j)! + i ∑_{j=0}^∞ (−1)^j x^(2j+1)/(2j + 1)! = cos(x) + i sin(x), (1.65)

where we have identified the sums that define cosine and sine from (1.62). This result, that
e^(ix) = cos(x) + i sin(x), is known as “Euler’s formula.” We can use it to neatly write the
polar form of the complex number z from (1.64)

z = s(cos φ + i sin φ) = s e^(iφ). (1.66)

Addition for complex numbers is defined in terms of addition for the real and imaginary
parts. For z1 = a + ib and z2 = c + id,

z1 + z2 = (a + c) + i (b + d). (1.67)

Geometrically, this is like vector addition in two dimensions, with its usual “head-to-tail”
visualization as shown in Figure 1.10.
Multiplication proceeds by treating z1 and z2 as polynomials in i, so that

z1 z2 = ac + i (ad + bc) + i² bd = ac − bd + i (ad + bc), (1.68)




Fig. 1.10 We add vectors by components. The light gray vectors are z1 = a + ib and z2 = c + id, then the sum is
z1 + z2 = (a + c) + i (b + d).


Fig. 1.11 Two complex numbers, z1 = s1 e^(iφ1) and z2 = s2 e^(iφ2), are multiplied together as in (1.69). The resulting product
has magnitude s1 s2 and makes an angle φ1 + φ2 with the real axis.

and the real part of the product is ac − bd, with imaginary part ad + bc. The polar form
makes it easy to see the effect of multiplication. Suppose z1 = s1 e iφ1 and z2 = s2 e iφ2 , then
using the multiplicative properties of the exponential, we have

z1 z2 = s1 s2 e i (φ1 +φ2 ) (1.69)

so that the product has magnitude s1 s2 (a stretching) and makes an angle of φ1 + φ2 with
respect to the horizontal axis (a rotation). The stretch and rotation are shown in Figure 1.11.
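
The stretch-and-rotation picture is easy to verify numerically. Here is a minimal sketch in Python (the particular magnitudes and angles s1, φ1, s2, φ2 below are made-up values for the illustration, not taken from the text): build two complex numbers from their polar data, multiply them, and confirm that the product has magnitude s1 s2 and angle φ1 + φ2.

import cmath

# two complex numbers specified in polar form, z = s e^{i phi}
s1, phi1 = 2.0, 0.3
s2, phi2 = 1.5, 1.1
z1 = cmath.rect(s1, phi1)   # s1 (cos phi1 + i sin phi1)
z2 = cmath.rect(s2, phi2)

prod = z1 * z2
print(abs(prod), s1 * s2)               # magnitudes agree: the "stretch"
print(cmath.phase(prod), phi1 + phi2)   # angles agree: the "rotation"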

Finally, there is a new operation that is not inherited directly from the real numbers,
“conjugation,” which is defined, for z = a + ib = se iφ , as
z∗ ≡ a − ib = se −iφ . (1.70)
The two-dimensional plane operation here is reflection about the horizontal axis. As a
practical matter, when we take the “complex conjugate” of expressions involving complex
numbers and/or variables, we just flip the sign of i wherever it appears (see Problem 1.5.5
for partial justification of this procedure). We can use the conjugate to solve for cosine and
sine by algebraically inverting Euler’s formula:
cos φ = (1/2)(e^{iφ} + e^{−iφ})        sin φ = (1/(2i))(e^{iφ} − e^{−iφ}). (1.71)
Let’s return to our model problem, ÿ(t) = −ω 2 y(t). The two independent solutions,
cosine and sine, are linear combinations of exponentials. So for a general solution like
y(t) = A cos(ωt) + B sin(ωt), (1.72)
if we use (1.71), we could write
y(t) = Āe i (ωt) + B̄e −i (ωt) (1.73)
for new (complex) constants Ā and B̄.
This alternate form can be quite useful, and it is important to get comfortable moving
back and forth from the trigonometric form of the solution to the exponential form. One
advantage of the exponential solution is that it makes the original problem of solving for
y(t) easier by turning ÿ(t) + ω2 y(t) = 0 into an algebraic equation. To see this, suppose
we guess y(t) = αe βt for constants α and β, motivated by the fact that the derivatives of
exponentials are proportional to themselves. Inserting this into the ODE gives
 
α(β² + ω²) e^{βt} = 0 (1.74)
from which we learn that β = ±iω. There are two values of β, hence two solutions here, so
the most general case is a linear combination of the two, and we are led immediately to
y(t) = Āe i (ωt) + B̄e −i (ωt) . (1.75)

Problem 1.5.1 Find the relationship between Ā, B̄ from (1.73) and the original A and B
from (1.72).
Problem 1.5.2 Evaluate the products i2 , i3 , i4 , and i5 .
Problem 1.5.3 For a complex number, like p = u + iv the “real” part of p is u and the
“imaginary” part of p is v (both u and v are themselves real numbers). For z = a + ib
(with a and b both real) and p = u + iv, what are the real and imaginary parts of z/p?
Problem 1.5.4 For z = a + ib, p = u + iv (with a, b, u, and v all real), show that:
a. (z + p)∗ = z∗ + p∗
b. (zp)∗ = z∗ p∗
c. (z/p)* = z*/p*
Fig. 1.12 For Problem 1.5.8.

Problem 1.5.5 Use part b. of the previous problem to show that for z = a + ib we have
(z^j)* = (z*)^j. Now consider a function f(z) that we can expand as

f(z) = Σ_{j=0}^{∞} aj z^j,

for real constants {aj}_{j=0}^{∞}. Show that f(z)* = f(z*) so that if you have a function of
a + ib, to take the conjugate, you just evaluate f at a − ib.
Problem 1.5.6 From Euler’s formula: e iθ = cos(θ) + i sin(θ), write cos(θ) and sin(θ) in
terms of exponentials (and any constants you need).
Problem 1.5.7 Convert cos(θ + φ) to complex exponentials and use that representation to
prove that

cos(θ + φ) = cos φ cos θ − sin φ sin θ.

Problem 1.5.8 In Figure 1.12, we have a circle of radius R with three angles shown: θ, φ,
and ψ ≡ θ + φ. Using Figure 1.12, identify lengths and angles to show

sin ψ = sin φ cos θ + cos φ sin θ.

You are providing a pictorial version, for sine, of what you did in Problem 1.5.7 for
cosine, giving a complementary point of view.

1.6 Properties of Exponentials and Logarithms

We have developed the exponential as the function that has derivative equal to itself at
every point, so that the exponential y(t) = e t satisfies ẏ(t) = y(t) for all t by construction.
Suppose we have an exponential that involves a function of t, f(t), then we can use the
chain rule to find the derivative of e f(t) . Let y(p) = e p , then

(d/dt) e^{f(t)} = dy(f(t))/dt = (dy/df)(df(t)/dt) = y(f(t)) ḟ(t) = e^{f(t)} ḟ(t). (1.76)
Any time you have an ODE of the form
ẏ(t) = g(t)y(t) (1.77)
where g(t) = ḟ(t) for some function f(t), the solution is
y(t) = ȳ e^{f(t)} (1.78)
where ȳ is a constant that can be used to set the initial value (if f(0) = 0, then ȳ = y0 , the
provided value at t = 0).
As an example, suppose we have
ẏ(t) = t p y(t) with y(0) = y0 given, (1.79)
and p some arbitrary constant. Here, t^p = g(t), so that f(t) = t^{p+1}/(p + 1), and then
y(t) = y0 e^{t^{p+1}/(p+1)}. (1.80)
The “logarithm” function un-does the exponential. If you have e a = b and you want to
know what a is, you take the “log” of both sides,10
log(e a ) ≡ a = log(b). (1.81)
The log plays the same inverse role as arcsine and arccosine. It inherits properties directly
from properties of the exponential. For example, suppose we have e a = b and e c = d, so
that a = log(b) and c = log(d). We know that
e a e c = bd → e a+c = bd (1.82)
since e a e c = e a+c . Then taking the log of both sides,
a + c = log(bd) = log(b) + log(d), (1.83)
and the sum of the log terms becomes the log of the product.
We can also find the derivative (and anti-derivative) of the logarithm function, again
from its definition. We have log(e t ) = t, and let f(t) = e t , then log(f(t)) = t, and taking
the t-derivative of both sides gives
(d log(f)/df)(df(t)/dt) = 1 (1.84)
using the chain rule. The derivative of f(t) is itself, by the definition of exponential, so that
f (d log(f)/df) = 1 −→ d log(f)/df = 1/f. (1.85)

10 For my purposes, the function “log(x)” refers to the “natural logarithm” (base e). This is sometimes written
ln(x) to remind us of the appropriate base, and to differentiate it from the base 10 log which, in some contexts,
is called log. In physics, we rarely have uses for bases other than e, so I’ll use log(x) to refer exclusively to
the base e form.

You can call f anything you like; we have

d log(t)/dt = 1/t (1.86)
and if you have a generic function p(t), then the chain rule gives
d log(p(t))/dt = ṗ(t)/p(t). (1.87)
Going the other direction, we can integrate both sides of (1.86) to get (omitting the
constant of integration)

log(t) = ∫ (1/t) dt. (1.88)

We can do this to the defining relation for the exponential, too: ẏ(t) = y(t) means y(t) =
∫ y(t) dt so that

e^t = ∫ e^t dt, (1.89)

and the exponential is its own integral. This result can be extended for integrals of the form
e^{αt} for constant α,

∫ e^{αt} dt = (1/α) ∫ e^q dq = e^{αt}/α, (1.90)

where we set q ≡ αt in a change of variables. For definite limits of integration, t0 → tf ,
we have

∫_{t0}^{tf} e^{αt} dt = (1/α)(e^{αtf} − e^{αt0}). (1.91)

Problem 1.6.1 Show that log(x p ) = p log(x) for constant p (not necessarily an integer).
Problem 1.6.2 Using the chain rule, take the t-derivative of h(t) = e^{g(t)} and write your
expression for dh(t)/dt in terms of h(t) itself and the derivative of g(t). Using this result
(or any other technique you like short of looking it up) find the solution to the ODE:
(or any other technique you like short of looking it up) find the solution to the ODE:
df(t)/dt = iωt² f(t),
with f(0) = f0 given. What is the real part of your solution?
Problem 1.6.3 The “hyperbolic” cosine and sine are defined by the sums:

cosh(x) ≡ Σ_{j=0}^{∞} x^{2j}/(2j)!        sinh(x) ≡ Σ_{j=0}^{∞} x^{2j+1}/(2j+1)!.

Write the infinite sum for the exponential in terms of these two sums (i.e. write e x
in terms of cosh(x) and sinh(x)), arriving at “Euler’s formula” for hyperbolic cosine
and sine. “Invert” the relation to find cosh(x) and sinh(x) in terms of exponentials
(as in Problem 1.5.6) and sketch cosh(x) and sinh(x) (include positive and negative
values for x).

Problem 1.6.4 Evaluate sine and sinh with complex arguments: What are sin(iθ) and
sinh(iη) for real θ and η?
Problem 1.6.5 When we solve Newton’s second law for the harmonic oscillator:
m d²x(t)/dt² = −kx(t),
we get cosine and sine solutions. Suppose we “complexify time” (people do this) by
letting t = is for a new “temporal” parameter s. Write Newton’s second law in terms
of s and solve that equation for x(s) using x(0) = x0 and dx(s)/ds|_{s=0} = 0. What do your
solutions look like?

1.7 Solving First-Order ODEs

We will be taking the harmonic oscillator that started us off, with its near universal
applicability to motion near the minima of any potential energy function, and adding
additional forcing, capturing new and interesting physics. In the next chapter, we will think
about the effects of friction and driving (adding an external force to the spring system), and
will be focused on solving more and more general second-order ODEs. As a warmup and
advertisement of some of the techniques we will encounter, we’ll close the chapter by
discussing the solutions to
df(x)/dx = G(x, f(x))        f(0) = f0 (1.92)
for some provided “driving” function G(x, f(x)) and initial value f0 .

1.7.1 Continuity
As a first observation, we can show that solutions to (1.92) are continuous provided G is
finite. Continuity means that for all points x in the domain of the solution, we have
lim_{ε→0} (f(x + ε) − f(x − ε)) = 0. (1.93)

If the function G(x, f(x)) is itself continuous, we can be assured that f(x) is continuous
since integrating a continuous function returns a continuous function. So we’ll focus on the
case where the right-hand side of (1.92) is discontinuous at some point. Then integrating
both sides of the ODE across the discontinuity serves to “smooth” it out, leading to
continuous f(x).
To be concrete, suppose our function G(x, f(x)) takes the following discontinuous but
simple form,11 for constant G0 ,
G(x, f(x)) = { 0    x < a
             { G0   x ≥ a (1.94)

11 We could have two separate constants on the “left” and “right” of the discontinuity, but it is simplest to pick
zero for one of them.
27 Solving First-Order ODEs

with a discontinuity at x = a. Now integrate both sides of (1.92) from a − ε → a + ε:

∫_{a−ε}^{a+ε} (df(x)/dx) dx = ∫_{a−ε}^{a+ε} G(x, f(x)) dx
f(a + ε) − f(a − ε) = (a + ε) G0 − aG0 = εG0 , (1.95)

and taking the limit as ε → 0 gives us continuity of f(x), reproducing (1.93) at x = a.
Continuity of the solution is a property of the ODE in (1.92), unless G(x, f(x)) becomes
infinite at the discontinuity; that case requires a more nuanced limit on the right of (1.95).
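
A quick numerical illustration of this continuity argument (a Python sketch; the step location a = 1, the height G0 = 2, the initial value f(0) = 0, and the crude Euler integration are all assumptions made for the demonstration): integrating df/dx = G(x, f) with the discontinuous right-hand side of (1.94) produces an f(x) with a kink, but no jump, at x = a.

import numpy as np

a, G0 = 1.0, 2.0                 # assumed step location and height
def G(x, f):
    return 0.0 if x < a else G0

dx = 1e-4                        # Euler integration of df/dx = G(x, f), f(0) = 0
xs = np.arange(0.0, 2.0, dx)
f, fs = 0.0, []
for x in xs:
    fs.append(f)
    f += G(x, f) * dx

fs = np.array(fs)
i = np.searchsorted(xs, a)
# the change in f across x = a is of order G0*dx: it vanishes as dx -> 0
print(fs[i + 1] - fs[i - 1])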

1.7.2 Separation of Variables


For a generic G(x, f(x)), it is not necessarily possible to solve (1.92), at least analytically.12
If the function can be “separated” into a function of x and a separate function of f(x), then
we can make progress. We’ll consider a separation of the form G(x, f(x)) = g(x)h(f(x)), a
product of a function g(x) and h(y), the latter of which is evaluated at y = f(x). Now for
the mnemonic, we can take the fraction df/dx seriously (à la Leibniz), and multiply both sides
of df/dx = g(x)h(f) by dx while dividing by h(f) (less problematic)

df/h(f) = g(x) dx. (1.96)
Now we just integrate both sides between the relevant limits, starting from x = 0 on the
right, and f = f0 on the left
∫_{f0}^{f} df̄/h(f̄) = ∫_0^x g(x̄) dx̄. (1.97)

All that’s left is to perform the integrals (which may or may not be difficult), and invert to
find f(x) (which will almost certainly be difficult).
As an example of the process, take g(x) = 1, h(y) = y, so that we are solving
df(x)/dx = f(x), (1.98)
and we multiply by dx and divide by f(x) on both sides,
df/f = dx, (1.99)
then integrating both sides yields

∫_{f0}^{f} df̄/f̄ = ∫_0^x dx̄ −→ log(f/f0) = x. (1.100)

Now for the inversion, we exponentiate both sides and isolate f:


f(x) = f0 e x . (1.101)

12 Meaning, here, without the use of a computer.



Let’s try it again for an ODE whose solution we haven’t been focused on, take g(x) = x
and h(y) = 1/y, we want to solve
df(x)/dx = x/f(x)        f(0) = f0 . (1.102)
Separating the x and f variables as before, we have the integrals

∫_{f0}^{f} f̄ df̄ = ∫_0^x x̄ dx̄ −→ (1/2)(f² − f0²) = (1/2) x² (1.103)

giving the pair of solutions

f(x) = ±√(x² + f0²). (1.104)

The “multiplication” by dx that is the basis of the technique can be justified. For a
function f(x), we know that the differential df is related to dx by df = (df/dx) dx, telling us
how f changes due to a change in x. So really, it is a property of the differential df that is
being exploited here. Since df/dx = G from the start, we always had

df = (df(x)/dx) dx = G(x, f(x)) dx, (1.105)

and in the special separable case G(x, f(x)) = g(x)h(f(x)), we get

df = g(x)h(f) dx −→ df/h(f) = g(x) dx (1.106)
as before.
What sort of ODE would not be solved by this approach (at least, up to integration
and inversion)? We need an example in which the multiplicative separation fails, which
is easy to imagine. Take G(x, f(x)) = sin(xf(x)), for example. The separation technique
will not allow us to make progress here. Fortunately, this right-hand side has no physical
significance, and in practice, many first-order differential equations can be solved using
separation of variables. In the broader context of partial differential equations, there is a
related class of solutions obtained by “separation of variables,” and we shall see these later
on in Section 4.3.2.
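
As a numerical sanity check of the second separable example, here is a short Python sketch (the domain, the tolerance, and the choice f0 = 1 are assumptions for the illustration): integrate df/dx = x/f(x) with an off-the-shelf solver and compare against the “+” branch of (1.104).

import numpy as np
from scipy.integrate import solve_ivp

f0 = 1.0                                          # assumed initial value
sol = solve_ivp(lambda x, f: x / f, (0.0, 3.0), [f0], dense_output=True, rtol=1e-9)

xs = np.linspace(0.0, 3.0, 7)
numeric = sol.sol(xs)[0]
exact = np.sqrt(xs**2 + f0**2)                    # the "+" branch of (1.104)
print(np.max(np.abs(numeric - exact)))            # small: set by the solver tolerance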

1.7.3 Superposition
As a special case of separation of variables, consider the most general linear form of the
model problem:
df(x)/dx = g(x)f(x)        f(0) = f0 (1.107)
where g(x) is any function of x and f0 is given. The problem is linear in the sense that only
f(x) and its derivative show up; there are no terms like f(x)², for example. When an ODE is
linear, we can add together solutions and the result is also a solution. To see this, suppose
29 Solving First-Order ODEs

we have f1 (x) and f2 (x) both satisfying the ODE in (1.107), then let h(x) = Af1 (x) + Bf2 (x)
for constants A and B, and we have
dh(x)/dx = A df1(x)/dx + B df2(x)/dx = A g(x)f1(x) + B g(x)f2(x) = g(x)h(x), (1.108)
so that h(x) is also a solution.
Once the initial value is introduced, we have fewer choices, and in many cases, satisfying
the initial value constraint on the problem renders the solution unique. Given that this
problem is separable, we can also write down an explicit integral form13 of the solution,
∫_{f0}^{f} df̄/f̄ = ∫_0^x g(x̄) dx̄ −→ f(x) = f0 e^{∫_0^x g(x̄) dx̄}, (1.109)

as we saw in Section 1.6.

1.7.4 Homogeneous and Sourced Solutions


Finally, there are linear problems of the form
df(x)/dx = αf(x) + g(x)        f(0) = f0 (1.110)
for some function of x only, the “source,” g(x) (here, α is just a constant). In order to
solve these, the general approach is to separate the solution f(x) into a piece that solves
the source-free (g(x) = 0) “homogeneous” problem, h(x), and a piece associated with the
source, call it f̄(x). Then we can add the two solutions to get a solution to the full problem,
f(x) = h(x) + f̄(x). The homogeneous piece is normally associated with the constant that
will allow us to set the initial value.
The homogeneous solution, h(x), solves
dh(x)/dx = αh(x) −→ h(x) = h0 e^{αx} (1.111)
where h0 is the promised constant. To get the source piece, f̄(x), we can use the “variation
of parameters” approach (see [2], for example). Let f̄(x) = u(x)h(x) where u(x) is an
unknown function, and h(x) is the homogeneous solution. Running this through the ODE
gives

df̄(x)/dx = α f̄(x) + g(x) −→ (du(x)/dx) h(x) + u(x) (dh(x)/dx) = αu(x)h(x) + g(x), (1.112)
and we know the derivative of h(x) is αh(x) from (1.111). We can use that to cancel a term
on the right. The ODE for u(x) is now
du(x)/dx = g(x)/h(x) = (1/h0) e^{−αx} g(x) (1.113)
which can be solved by integration,

u(x) = (1/h0) ∫_0^x e^{−αx̄} g(x̄) dx̄, (1.114)

13 Of course, the integral may or may not exist, depending on g(x) and the relevant domain.

and

f̄(x) = u(x)h(x) = e^{αx} ∫_0^x e^{−αx̄} g(x̄) dx̄ (1.115)

with no constants for setting initial values. The full solution is the sum

f(x) = h(x) + f̄(x) = h0 e^{αx} + e^{αx} ∫_0^x e^{−αx̄} g(x̄) dx̄ (1.116)

and we can even set h0 = f0 to satisfy the initial value.
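
The integral form (1.116) is easy to check numerically. Here is a minimal Python sketch (the values α = −1, f0 = 2, and the source g(x) = sin(x) are assumptions chosen just for the test): evaluate the quadrature in (1.116) and compare to a direct numerical solution of (1.110).

import numpy as np
from scipy.integrate import quad, solve_ivp

alpha, f0 = -1.0, 2.0                 # assumed constants
g = np.sin                            # assumed source g(x)

def f_exact(x):
    # f(x) = f0 e^{alpha x} + e^{alpha x} * integral_0^x e^{-alpha xbar} g(xbar) dxbar, cf. (1.116)
    integral, _ = quad(lambda xb: np.exp(-alpha * xb) * g(xb), 0.0, x)
    return f0 * np.exp(alpha * x) + np.exp(alpha * x) * integral

sol = solve_ivp(lambda x, f: alpha * f + g(x), (0.0, 5.0), [f0],
                dense_output=True, rtol=1e-10)
for x in (1.0, 3.0, 5.0):
    print(f_exact(x), sol.sol(x)[0])  # the two agree to solver tolerance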


Problem 1.7.1 Find f(x) solving the following first-order ODEs:
df(x)/dx = x² f(x)        f(0) = 1
df(x)/dx = cos(x) f(x)        f(0) = −2
df(x)/dx = x f(x)²        f(0) = 1
df(x)/dx = sin(x)/f(x)        f(0) = 0.
Problem 1.7.2 Find the homogeneous (where the “source” function on the right is zero) and
sourced solution to the ODE
df(x)/dx + f(x) = g0 sin(x).
Notice that it is the homogeneous solution that comes with the constant of integra-
tion. Use that to set f(0) = 5.
Problem 1.7.3 Suppose you had the second-order differential equation for a driven har-
monic oscillator
mẍ(t) = −mω2 x(t) + F(t)
for a given driving force F(t). This is a second-order version of (1.111) with “source”
F(t). Find the two homogeneous solutions, h1 (t) and h2 (t) and try the variation of
parameters procedure starting with x̄(t) = u(t)h1 (t) + v(t)h2 (t) to write an integral
solution like (1.116). The functions u(t) and v(t) can be related in a variety of ways
(there aren’t two independent functions here – you only get one solution, not two),
use the constraint u̇(t)h1 (t) + v̇(t)h2 (t) = 0 to relate the two (that will simplify the
expressions you get in the variation of parameters).
2 Damped Harmonic Oscillator

Let’s complicate matters – suppose we take our mass-on-a-spring, and introduce some
friction. We’ll start with the familiar kinetic friction which opposes motion with constant
magnitude. Calling that magnitude α, we can include the friction force in Newton’s
second law,
mẍ(t) = −kx(t) − αsign(ẋ), (2.1)
and we have set the equilibrium position of the spring to be at a = 0, as shown in
Figure 2.1.
We’ll start the mass off from rest with initial extension p0 > 0. The mass moves to the
left, so that the sign of ẋ is −1, and we start by solving
ẍ1(t) = −(k/m) x1(t) + α/m = −(k/m)(x1(t) − α/k). (2.2)
Again letting ω 2 ≡ k/m, and defining the length z ≡ α/k, we have solution
x1 (t) = A cos(ωt) + B sin(ωt) + z, (2.3)
and the initial conditions, x1 (0) = p0 , ẋ1 (0) = 0, give A = (p0 − z) and B = 0. The
solution is
x1 (t) = (p0 − z) cos(ωt) + z. (2.4)
This solution only holds until the mass comes to rest on the other side of its equilibrium
location. That happens when t1 = π/ω, at which point the mass is at x1 (π/ω) = −p0 + 2z.
Now we have a mass that starts from rest at p1 ≡ −p0 + 2z and travels to the right, so that
the sign of ẋ(t) is +1. We must solve
ẍ2 (t) = −ω2 (x2 (t) + z) with x2 (π/ω) = −p0 + 2z and ẋ2 (π/ω) = 0. (2.5)
Since this is the same problem as in (2.2), with z → −z, and p0 → p1 , we have solution


Fig. 2.1 A mass moves under the influence of both a spring and kinetic friction.


Fig. 2.2 The first full “cycle” of the frictionally damped harmonic oscillator.

x2(t) = (p1 + z) cos(ω(t − π/ω)) − z
      = (−p0 + 3z) cos(ω(t − π/ω)) − z. (2.6)
The process continues: each time the mass stops to turn around, we pick up a new value for
“p” and the sign of z changes. Notice that at any point, the position and its derivative are
continuous; that’s a requirement of Newton’s second law (see Section 1.7.1). A solution
like this, which is pieced together from individual solutions, is known as a “piecewise
solution.” For this constant friction case, it is the only reasonable way to express the
solution to the problem. The first two iterations of the solution are shown in Figure 2.2.
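
Piecing the solution together lends itself to a short loop. The Python sketch below (the parameter values ω = 1, z = 0.1, p0 = 1 are assumptions for the illustration, and the loop runs a fixed four half-cycles rather than testing the stopping condition asked about in Problem 2.0.1) builds x(t) half-period by half-period, flipping the sign of the friction offset each time the mass comes to rest.

import numpy as np

omega, z, p0 = 1.0, 0.1, 1.0          # assumed parameter values
half_T = np.pi / omega

t_all, x_all = [], []
p, sign = p0, +1.0                     # current rest position; sign of the friction offset
for n in range(4):                     # four half-cycles
    t = np.linspace(n * half_T, (n + 1) * half_T, 200)
    # x(t) = (p - sign*z) cos(omega (t - n half_T)) + sign*z, cf. (2.4) and (2.6)
    x = (p - sign * z) * np.cos(omega * (t - n * half_T)) + sign * z
    t_all.append(t); x_all.append(x)
    p = x[-1]                          # new rest position at the turnaround
    sign = -sign                       # friction now pushes the other way

t_all, x_all = np.concatenate(t_all), np.concatenate(x_all)   # ready to plot, as in Figure 2.2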
Problem 2.0.1 Finish the job – what is the nth solution to (2.1), xn (t) assuming that at the
points where the mass is at rest, |xn (t)| > z (that way the force reliably switches
direction each time the mass comes to rest).
Problem 2.0.2 Show, from Newton’s second law (in one dimension, for simplicity), that if
a force is discontinuous but finite, the velocity and position of a mass subject to that
force are continuous.
Problem 2.0.3 For the piecewise force:
F(t) = { 0    t < 0
       { F0   t ≥ 0 ,
solve Newton’s second law for a particle of mass m for both t < 0 and t > 0 given
x(0) = 0, ẋ(0) = 0. Use the continuity and derivative continuity you established in
the last problem, applied at the discontinuity in the force, to set any constants in your
solution.

2.1 Damping

Imagine the resistance you feel moving a hand through water – the faster you try to move
your hand, the harder you must push. The response of the water is to generate a force that

Fig. 2.3 A mass is attached to a spring and immersed in a fluid to generate a force that opposes the motion of the mass
with magnitude proportional to the mass’s speed and proportionality constant mb.

always opposes the motion of your hand, as in the previous example, but the magnitude
of the force the water generates is itself dependent on how fast your hand is moving. In
general, a force that is meant to model “drag” has the following general form (for velocity
vector v with magnitude v, and constants {αj}_{j=0}^{∞})

F_v = −(α0 + α1 v + α2 v² + · · ·) v/v. (2.7)
The minus sign out front, and velocity unit vector at the end enforce the notion that the force
always opposes the current motion of the object on which it acts. Inside the parentheses,
we see a (Taylor) series expansion of the magnitude of F in powers of the speed v, and
we can tune the coefficients to account for a wide array of “drag” behaviors. For kinetic
friction, only the constant α0 is nonzero. Suppose we take the first correction to that: let’s
take α1 to be nonzero with all other terms vanishing. Using this form in Newton’s second
law, we have (setting α1 = 2mb for new constant b and letting k/m ≡ ω² as in the last
chapter)
mẍ(t) = −kx(t) − 2mbẋ(t) −→ ẍ(t) + 2bẋ(t) + ω² x(t) = 0. (2.8)
This new drag force can be generated for the spring setup by attaching a “dashpot,”
basically some oil or other viscous medium that slows down the motion of the mass in
a manner that is proportional to its speed. The physical setup is sketched in Figure 2.3.
How should we solve (2.8), together with initial conditions (to make the problem well
posed and uniquely solvable): x(0) = x0 and ẋ(0) = v0 ? We could “guess” (or just
write down a solution and check it), or use the series technique from the previous chapter.
Thinking of the limiting cases, if b = 0 (no damping), we have x(t) ∼ e ±iωt , a complex
exponential. On the other hand, if ω = 0 (no oscillation), we get x(t) ∼ e −2bt , a decaying
exponential. These limiting cases suggest that for the full problem, we try the general form
x(t) = pe qt for constant p and q, then each term in (2.8) will have a factor of pe qt and we
can cancel those, and in addition, we know we can capture the limiting cases.1 Inserting
our ansatz gives

q² + 2bq + ω² = 0 −→ q = −b ± √(b² − ω²). (2.9)
There are two independent values for q here, and so we know the general solution will be
a linear combination of these two
1 Interestingly, the case b = 0 and ω = 0 has x(t) = At + B for constants A and B, and while B = Be 0 , the At
term is not so easily expressed as an exponential. That will prove to be an issue in the case of “critical damping”
as we shall see in Section 2.1.2.

Fig. 2.4 A mass is released from rest with the same ω, but varying b to demonstrate the underdamped (black), critically
damped (gray), and overdamped (light gray) behaviors.

x(t) = A e^{(−b+√(b²−ω²)) t} + B e^{(−b−√(b²−ω²)) t} = e^{−bt} (A e^{√(b²−ω²) t} + B e^{−√(b²−ω²) t}). (2.10)
Fine, but what about the physics of these solutions? It’s pretty clear that we have some
sort of decaying exponential, since e −bt is sitting out front, and for b > 0, this will go
to zero as t → ∞. But the relative values of b and ω are also important in setting the
physical behavior. For example, if b/ω < 1, then the square root in (2.10) introduces a
factor of i in the exponential, and we have oscillatory solutions (sines and cosines), an
“underdamped” motion. If b/ω > 1, the square root is real, so we pick up growing and
decaying exponentials, the motion is “overdamped.” Finally, if b/ω = 1, we just get decay
and the motion is said to be “critically damped.”
We can look at the various cases together to see what sort of behavior to expect in
general. In Figure 2.4, we see the underdamped (black), critically damped (gray), and
overdamped (light gray) motion that occurs for a mass that begins from rest at some x0 . To
make these plots, I used the same ω and varied b to explore the three regimes.
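
A figure like this is simple to regenerate. A minimal Python sketch (the values ω = 1 and b = 0.2, 1, 3 are assumptions picked to land in the three regimes, with release from rest at x0 = 1): evaluate the three closed forms derived in the coming subsections, (2.11), (2.16), and (2.18), on a common time grid.

import numpy as np

omega, x0, v0 = 1.0, 1.0, 0.0          # assumed: release from rest at x0
t = np.linspace(0.0, 10.0, 500)

def x_of_t(b):
    if b < omega:                      # underdamped, cf. (2.11)
        w = np.sqrt(omega**2 - b**2)
        return np.exp(-b*t) * (x0*np.cos(w*t) + (b*x0 + v0)/w * np.sin(w*t))
    elif b == omega:                   # critically damped, cf. (2.16)
        return np.exp(-omega*t) * (x0 + (v0 + omega*x0)*t)
    else:                              # overdamped, cf. (2.18)
        w = np.sqrt(b**2 - omega**2)
        return np.exp(-b*t) * (x0*np.cosh(w*t) + (b*x0 + v0)/w * np.sinh(w*t))

under, critical, over = x_of_t(0.2), x_of_t(1.0), x_of_t(3.0)   # the three curves of Figure 2.4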

2.1.1 Underdamped
Each of these cases is of interest, so we’ll consider them one at a time. For b/ω < 1 the
physical system is “underdamped.” In this case, we write √(b² − ω²) = i√(ω² − b²) and can
introduce sines and cosines explicitly from Euler’s formula. The solution, written in terms
of x0 and v0 , is

x(t) = e^{−bt} [x0 cos(√(ω² − b²) t) + ((b x0 + v0)/√(ω² − b²)) sin(√(ω² − b²) t)]. (2.11)
This clearly reduces to our familiar expression in the b = 0 limit. The oscillation occurs
with modified (as compared with b = 0) period and frequency

T = 2π/√(ω² − b²)        f = 1/T = √(ω² − b²)/(2π). (2.12)

The definition of period here is different when compared with the purely oscillatory b = 0
case. There, we defined the period to be the amount of time it took, starting from rest,
to return to the initial location. Here, the mass never returns to the initial location, so we
revert to a weaker definition, “the period is the time it takes for the mass, starting from rest,
to come to rest twice,” matching our definition in the undamped case (the mass stops at
maximal extension on the other side of the equilibrium, then returns).

2.1.2 Critically Damped


If b/ω = 1, the two constants are equal, the motion is “critically damped.” Referring to the
general (2.10), we get

x(t) = e −bt (A + B). (2.13)

This is a problem, since the combination A + B is just a single constant. We have lost a
constant of integration, and cannot set both x(0) = x0 and ẋ(0) = v0 . . . . The issue is
that just as the quadratic equation loses a root when its discriminant vanishes (the two
distinct roots becoming a single double root), we have lost our ability to describe the two
independent solutions to the ODE. Let’s go back to the ODE, with b = ω,

ẍ(t) + 2ω ẋ(t) + ω2 x(t) = 0. (2.14)

We’ll try an ansatz of the form x(t) = e −ωt y(t) since we know that the decaying
exponential is common to all of our solutions. The goal is to “peel off” the portion of
the ODE that is setting the decaying exponential (which we already know about) leaving
us with a simplified ODE governing the auxiliary function y(t). This works well, running
the ansatz through the ODE gives

e −ωt ÿ(t) = 0 −→ y(t) = A + Bt (2.15)

and now we can set A and B using the initial conditions for x(t)

x(t) = e −ωt (x0 +(v0 + ωx0 ) t) . (2.16)

You may be worried about the linear growth in t found inside the parenthesis here, but you
can show, in Problem 2.1.1, that x(t → ∞) → 0.

2.1.3 Overdamped

Finally, for b/ω > 1, we are in the “overdamped” regime, where b2 − ω 2 is a real,
positive number. The exponentials are no longer oscillatory, but rather growing and
decaying. We can express the solution in terms of the hyperbolic cosh and sinh functions.
Those satisfy a sort of “real” version of Euler’s formula and can be inverted providing a
relation similar to (1.71)
cosh η = (1/2)(e^η + e^{−η})        sinh η = (1/2)(e^η − e^{−η}). (2.17)

Then we have

x(t) = e^{−bt} [x0 cosh(√(b² − ω²) t) + ((b x0 + v0)/√(b² − ω²)) sinh(√(b² − ω²) t)] (2.18)

which can be compared to (2.11).
In the critically damped and overdamped cases, there is no oscillation, so our notion of
periodicity is missing. For exponential functions of the form f(t) = f0 e ±αt , a characteristic
time can be defined as the time it takes for the function to grow (decay) by a factor of e from
its initial value: t = 1/α in this case. For the critically damped solution, the characteristic
time is
τcd = 1/ω = 1/b (2.19)
while for the overdamped case, taking the negative root (since it leads to the larger value)

τod = 1/(b − √(b² − ω²)) > τcd . (2.20)
The critically damped solution approaches its equilibrium value (zero here) faster than the
overdamped solution, as is evident in Figure 2.4.
When we have both damping and oscillation, as in the √ underdamped case, there are two
natural timescales that are defined, the period T = 2π/ ω 2 − b2 from the oscillation, and
τ = 1/b from the decay. The ratio of those two defines a dimensionless constant called the
“Q”-factor of the system (or the “quality” factor). Formally,
 
Q ≡ 2π τ/T = (1/b)√(ω² − b²) = √(ω²/b² − 1), (2.21)
where the 2π in the definition is just for aesthetics (there are other conventions that omit the
2 or the π or both). The value of Q is large when there are many oscillations within the
decay envelope (T ≪ τ), and it is small when the decay envelope is small compared to
the oscillation, τ ≪ T. A sketch of two decaying, oscillatory solutions, together with the
exponential decay envelope is shown in Figure 2.5.
Problem 2.1.1 Show that
lim_{t→∞} e^{−t} t = 0.

A convincing indication is enough; we’re not looking for a rigorous mathematical


proof here.


Fig. 2.5 Two oscillatory solutions with their respective decay envelopes plotted over the same time interval. The left
solution has Q ≈ 10 while the right has Q ≈ 4.

Problem 2.1.2 Solve

ẍ(t) = −ω² x(t) − 2bẋ(t)
with initial values x(0) = 0, ẋ(0) = v0 and b² < ω² (underdamped). Evaluate the
energy of the system:
E(t) = (1/2) m ẋ(t)² + (1/2) m ω² x(t)².
What happens as t → ∞? What happens to E(t) if b = 0?
Problem 2.1.3 A potential energy function U(x) has a minimum at x = 0. A particle of
mass m moves under the influence of the potential while inside a viscous medium
with coefficient b (our usual oil-filled cylinder). If the mass starts near zero, find the
value of b that leads to critical damping – write your expression in terms of the value
of U(x) and its derivatives evaluated at zero.
Problem 2.1.4 For the potential energy curve below, imagine starting a bunch of particles
off with energy E, but with different initial locations along the x axis. Assuming a
realistic physical situation (no perpetual motion machines), indicate where, on the x
axis, the particles end up.
(Figure: sketch of the potential energy curve U(x) referenced in Problem 2.1.4.)

Problem 2.1.5 Given the equation of motion governing a mass m,


mẍ(t) = −kx(t) − 2mbẋ(t) + F0

for constant F0 (and positive constants b, k with k/m > b), find the angular
frequency of oscillation and equilibrium location of the mass.
Problem 2.1.6 The following ODE shows up in a study of the electromagnetic “self force”
(see, for example, [8, 12]):
ẍ(t) = −2bẋ(t) − σ d³x(t)/dt³.
What are the units of σ? Solve this ODE for x(t) – you should have three constants
of integration that you could set using initial values (don’t worry about setting those
constants).
Problem 2.1.7 Suppose you have a charge 2q at −a, and a charge q at a. These are fixed
charges, they are pinned down. In between these charges, there is a charge q with
mass m that is free to move. Write the equation of motion for x(t), the location of
the moving charge. If there is some damping mechanism in place, where will the
central charge end up? If you moved the charge a little from this position, what will
the period of small oscillation be (assume no damping for this piece)?

Fig. 2.6 The charges at −a and a are fixed, the charge at x(t) is free to move.

2.2 Driven Harmonic Oscillator

Now suppose we take our mass attached to a spring, and drive it with a force F(t) (I attach
a motor, or just exert a time-varying force with my hand). Letting f(t) ≡ F(t)/m, Newton’s
second law reads
ẍ(t) = −ω2 x(t) + f(t). (2.22)
There will be two pieces to the solution here, the homogeneous and sourced solu-
tions from Section 1.7.4. Remember the point of this decomposition, as a second-
order ODE (2.22) requires two constants to set initial (or boundary) conditions. The
homogeneous piece, h(t), obtained by setting f(t) = 0 in (2.22), will give us those
constants, while the source piece, x̄(t) allows us to focus on the forcing term in isolation.
The homogeneous h(t) solves
ḧ(t) = −ω2 h(t). (2.23)
The sourced piece of (2.22), x̄(t), solves
x̄¨(t) = −ω 2 x̄(t) + f(t). (2.24)
Then the solution to (2.22) is x(t) = h(t) + x̄(t). We know h(t) = A cos(ωt) + B sin(ωt)
is the homogeneous solution, and expect that the constants A and B will be involved in
setting the initial values for x(t). To find x̄(t), we could go back to Problem 1.7.3, but we
will take a more physically inspired approach.
For the source piece, suppose we take f(t) = f0 e iσt for constants σ and f0 . Why have we
introduced a complex driving force here? How do we produce a complex driving force?
We could have used f0 cos(σt) or f0 sin(σt), but the exponential form is easier to work with.
This means that x̄(t) itself will be complex, so how do we extract the physical (i.e. real)
solution from the x̄(t) we will get upon solving? Take x̄(t) = p(t) + iq(t), for real functions
p(t) and q(t). Then we can write the now complex (2.24) as the real pair:
p̈(t) = −ω 2 p(t) + f0 cos(σt)
(2.25)
q̈(t) = −ω 2 q(t) + f0 sin(σt).
The sum of the first equation with i times the second gives us back (2.24) with complex
f(t). Conversely, if we solve (2.24) with the complex force in place, we can take either the
real or imaginary part of x̄(t) (depending on the phase of our driving force) to recover a
real signal. The other solution, we just throw out.

Issues of complex forcing aside, we might also be concerned that the force we’re using
is too highly specialized. It turns out that it suffices to consider this very simple source,
since we can build any more complicated source out of these (that notion is the subject of
the next section). We have a linear ODE with a (complex) driving “force” – our first guess
at the solution must be x̄(t) = X0 e iαt for constants α and X0 to be determined. Inserting the
ansatz into (2.24), we have

−α2 X0 e iαt = −ω2 X0 e iαt + f0 e iσt (2.26)

and the only way for this to hold, for all values of t, is if α = σ. That choice allows us to
clear out the exponentials in (2.26). Then we can solve for X0 ,
X0 = f0/(ω² − σ²) (2.27)
and we have pinned down x̄(t). Putting x̄(t) together with h(t), our x(t) is
x(t) = A cos(ωt) + B sin(ωt) + (f0/(ω² − σ²)) e^{iσt}. (2.28)
To get a physically relevant solution, we could take either the real or imaginary piece of
this x(t), depending on the desired phase. Taking the real part,
xR(t) = A cos(ωt) + B sin(ωt) + (f0/(ω² − σ²)) cos(σt). (2.29)
The values of A and B can be set once we have been given initial or boundary values.
Take xR (0) = x0 and ẋR (0) = v0 , we start the mass off from rest with some initial extension.
Then we have to solve the pair of algebraic equations
x0 = A + f0/(ω² − σ²)
v0 = ωB, (2.30)

where we can isolate A and B and put them back in to (2.28) to obtain a complete solution
 
xR(t) = ((f0 + x0(σ² − ω²))/(σ² − ω²)) cos(ωt) + (v0/ω) sin(ωt) + (f0/(ω² − σ²)) cos(σt). (2.31)
Notice, in passing, the denominator in the terms in (2.31). What would happen if σ = ω
(see Problem 2.2.1)?
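
Before moving on, a quick numerical consistency check of (2.31) can be reassuring. A Python sketch (the numbers ω = 2, σ = 3, f0 = 1, x0 = 0.5, v0 = 0 are assumptions for the test, with σ ≠ ω): evaluate xR(t) on a grid and verify that ẍ + ω²x − f0 cos(σt) vanishes to finite-difference accuracy, with the correct initial values.

import numpy as np

omega, sigma, f0 = 2.0, 3.0, 1.0                 # assumed parameters (sigma != omega)
x0, v0 = 0.5, 0.0

def xR(t):
    A = (f0 + x0*(sigma**2 - omega**2)) / (sigma**2 - omega**2)
    return (A*np.cos(omega*t) + (v0/omega)*np.sin(omega*t)
            + f0/(omega**2 - sigma**2)*np.cos(sigma*t))      # cf. (2.31)

t = np.linspace(0.0, 10.0, 20001)
dt = t[1] - t[0]
x = xR(t)
acc = np.gradient(np.gradient(x, dt), dt)                    # numerical second derivative
residual = acc + omega**2 * x - f0*np.cos(sigma*t)
print(np.max(np.abs(residual[2:-2])))                        # small: finite-difference error only
print(x[0], (x[1] - x[0])/dt)                                # ≈ x0 and ≈ v0 (to first order in dt)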

Problem 2.2.1 Solve

ẍ(t) = −ω2 x(t) − 2bẋ(t) + f0 e iσt ,

but set the homogeneous portion of the solution that solves ḧ(t) = −ω2 h(t) − 2bḣ(t)
to zero (i.e. don’t worry about the familiar under/over/critically damped exponential
solutions that we have been studying, set all constants of integration to zero). Take
the real part of x(t) to obtain an actual position. What is the real part of x(t) if you
take σ = ω?

2.3 Fourier Series

First, the result: For a function2 p(t) that is periodic with period T, so that p(t + T) = p(t),
there exist complex coefficients {aj}_{j=−∞}^{∞} such that

p(t) = Σ_{j=−∞}^{∞} aj e^{i2πjt/T}. (2.32)

This is, in a sense, the same idea as the one behind Taylor series expansion, but we’re
working in a different “basis” (here, functions in which to expand, exponentials instead of
powers of t). Still, to the extent that the exponentials themselves represent infinite sums,
the shift from powers of t to special (exponential) collections of powers of t is not such a
dramatic one.3 We could expand each of the exponentials in the sum (2.32) using

e^{αt} = Σ_{k=0}^{∞} (αt)^k/k! (2.33)

to get a sum that only involved powers of t,




p(t) = Σ_{j=−∞}^{∞} cj t^j (2.34)

where the coefficients {cj}_{j=−∞}^{∞} include the aj and elements of the exponential decompo-
sition.
Why would we pick one type of decomposition, exponentials, or polynomials, over
another? The Fourier series, which uses exponentials, is well adapted to periodic functions,
while a power series is not. Notice that every term in the sum from (2.32) is periodic with
period T since
e^{i2πj(t+T)/T} = e^{i2πjt/T} e^{i2πj} = e^{i2πjt/T}, (2.35)
since e^{i2πj} = 1 for integer j.

As a property of exponentials, we also note the integral identity, for integers j and k
(from (1.91)):
∫_0^T e^{i2πjt/T} e^{−i2πkt/T} dt = ∫_0^T e^{i2π(j−k)t/T} dt = (T/(i2π(j − k))) (e^{i2π(j−k)} − 1) (2.36)

which is zero provided j ≠ k (because e^{i2π(j−k)} = 1 for j and k integers). If j = k, the
original integration is a little off; we would have:

∫_0^T e^{i2πjt/T} e^{−i2πkt/T} dt |_{j=k} = ∫_0^T dt = T. (2.37)

2 Nice, well-behaved, smooth, continuous, infinitely differentiable, etc.


3 Indeed, there are any number of basis “functions” that one can use to decompose a general function:
polynomials, exponentials, etc.

We can encapsulate these results using the “Kronecker delta” symbol, defined to be
δjk ≡ { 1   j = k
      { 0   j ≠ k . (2.38)
In terms of this symbol, the integrals for j = k and j ≠ k can be handily combined:

(1/T) ∫_0^T e^{i2πjt/T} e^{−i2πkt/T} dt = δjk . (2.39)
Going back to the decomposition in (2.32), we want to know how to extract the
coefficients {aj}_{j=−∞}^{∞} from the sum. Multiply both sides of (2.32) by e^{−i2πkt/T}/T and
integrate over one full period:4

(1/T) ∫_0^T p(t) e^{−i2πkt/T} dt = Σ_{j=−∞}^{∞} aj (1/T) ∫_0^T e^{i2πjt/T} e^{−i2πkt/T} dt
                                 = Σ_{j=−∞}^{∞} aj δjk (2.40)
                                 = ak .
The moral of the story: Given p(t), the coefficients in the decomposition from (2.32) are
obtained by integrating,

aj = (1/T) ∫_0^T p(t) e^{−i2πjt/T} dt. (2.41)
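
The coefficient integral (2.41) can also be approximated numerically, which is a handy check on hand computations. A brief Python sketch (the test signal p(t) = 1 + cos(2πt/T) + (1/2) sin(4πt/T) and the period T = 2 are assumptions chosen because the answer is known in advance):

import numpy as np

T = 2.0
def p(t):                      # assumed test signal with known coefficients
    return 1.0 + np.cos(2*np.pi*t/T) + 0.5*np.sin(4*np.pi*t/T)

def a(j, N=4000):
    # aj = (1/T) * integral_0^T p(t) e^{-i 2 pi j t/T} dt, approximated by a Riemann sum
    t = np.linspace(0.0, T, N, endpoint=False)
    return np.mean(p(t) * np.exp(-1j * 2*np.pi * j * t / T))

# expect a0 = 1, a_{+1} = a_{-1} = 1/2, a_{+2} = -i/4, a_{-2} = +i/4, everything else ~ 0
for j in range(-3, 4):
    print(j, np.round(a(j), 6))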

2.3.1 Example: Square Wave


All of the functions in this section are periodic with period T, so we only need to specify
values for one cycle, t ∈ [0, T). One of the simplest periodic functions is the square wave,
p(t) = { p0     0 ≤ t < T/2
       { −p0    T/2 ≤ t < T . (2.42)
We can find the coefficients in the Fourier series decomposition using (2.41),
aj = (1/T) [∫_0^{T/2} p0 e^{−i2πjt/T} dt + ∫_{T/2}^{T} (−p0) e^{−i2πjt/T} dt]
   = −(p0/(i2πj)) [e^{−i2πjt/T}|_{t=0}^{T/2} − e^{−i2πjt/T}|_{t=T/2}^{T}] (2.43)
   = −(p0/(iπj)) (e^{−iπj} − 1).
For j even, we have aj = 0 because in that case, e −iπj = 1. For j odd, we have aj =
−2p0 i/(πj). The infinite sum representing the decomposition of p(t) into exponentials is
then
4 Assuming, as always, that the integration and summation operations are interchangeable.


p(t) = −(2p0 i/π) Σ_{j=−∞}^{∞} e^{i2π(2j+1)t/T}/(2j + 1). (2.44)

If we use a summation index k that is only odd, we can simplify the decomposition:


p(t) = −(2p0 i/π) Σ_{k=−∞, k odd}^{∞} e^{i2πkt/T}/k
     = −(2p0 i/π) Σ_{k=−∞, k odd}^{∞} (cos(2πkt/T) + i sin(2πkt/T))/k. (2.45)

In the sum, we will have pairs of positive and negative integer terms that look like
ak cos(2πkt/T) + a−k cos(2π(−k)t/T) = 0 since a−k = −ak here. All of the cosine
terms will vanish, leaving only the sine terms, and these show up in pairs as well:
ak i sin(2πkt/T) + a−k i sin(2π(−k)t/T) = 2ak i sin(2πkt/T). The sum in (2.44) can be
written entirely in terms of the sine terms:

p(t) = (4p0/π) Σ_{k=1, k odd}^{∞} sin(2πkt/T)/k = (4p0/π) Σ_{j=0}^{∞} sin(2π(2j + 1)t/T)/(2j + 1) (2.46)

where in the second sum, we revert to j = 0, 1, 2, . . ..

Even vs. Odd Functions


The cosine terms dropped out of the exponential sum in (2.46) because the square wave
we were trying to match was “odd,” meaning that p(−t) = −p(t). Technically, the odd-
ness of the square wave here is manifest in the [0, T) range, so that what we really have
is p(t + T/2) = −p(t) for t ∈ [0, T/2] (although that makes the function odd about t = 0,
as well, p(−t) = −p(t)). Examples of odd and even (with p(−t) = p(t)) functions are
shown in Figure 2.7. When we multiply an odd function by an even function, the resulting
product is an odd function, so for p(t) odd, p(t) cos(2πjt/T) is odd. When we integrate
that over the “symmetric” limits of the aj integral in (2.41) (with the exponential written
in terms of trigonometric functions), we get zero, so that odd functions have Fourier series
decompositions involving only sine.


Fig. 2.7 On the left, we have an odd function, with p(t + T/2) = −p(t) (but then, by periodicity, p(−t) = −p(t)). The
function on the right is even, p(t + T/2) = p(t) (or, about zero, p(−t) = p(t)).


Fig. 2.8 An even square wave, this function has p(t + T/2) = p(t).

Similarly, we expect an even function p(t) to have vanishing sine terms since
p(t) sin(2πjt/T) is now odd. Let’s see how that works out explicitly. To make an even
square wave, take
p(t) = { p0     0 ≤ t < T/4 and 3T/4 < t < T
       { −p0    T/4 < t ≤ 3T/4, (2.47)
shown in Figure 2.8, and using (2.41), we get (for j ≠ 0; the j = 0 case gives a0 = 0, but
must be handled separately)

ak = (1/T) [∫_0^{T/4} p0 e^{−i2πkt/T} dt − ∫_{T/4}^{3T/4} p0 e^{−i2πkt/T} dt + ∫_{3T/4}^{T} p0 e^{−i2πkt/T} dt]
   = (ip0/(πk)) (e^{−ikπ/2} − e^{−i3kπ/2}), (2.48)

and the complex exponentials are e^{−ikπ/2} = (−i)^k and e^{−i3kπ/2} = (−i)^{3k} which means that
for k even, the coefficient vanishes, while for k odd, we get
ak = (2p0/(πk)) (−1)^{(k−1)/2} for k = ±1, ±3, ±5, . . .. (2.49)

Now what is interesting is that here, ak = a−k , so that this time, when we write the
entire sum out, we get ak cos(2πkt/T) + a−k cos(2π(−k)t/T) = 2ak cos(2πkt/T) while
ak sin(2πkt/T) + a−k sin(2π(−k)t/T) = 0. Then

p(t) = (4p0/π) Σ_{k=1,3,...} (−1)^{(k−1)/2} cos(2πkt/T)/k, (2.50)

or in terms of an index j = 0 → ∞,

p(t) = (4p0/π) Σ_{j=0}^{∞} (−1)^j cos(2π(2j + 1)t/T)/(2j + 1). (2.51)

The moral of the story is that even functions can be written entirely in terms of cosine
(with positive j-integers only), while odd functions can be written in terms of sine (again,
for positive integers).
Finally, what if we wanted to decompose the function p(t) = p0 for t ∈ [0, T)? This is
clearly an even function, and we expect to decompose it into a “cosine” series. The problem

is that aj = 0 unless j = 0, in which case a0 = p0 . This suggests that the cosine series
should include j = 0 to complete the story. So, for even p(t), we have
p(t) = Σ_{j=0}^{∞} aj cos(2πjt/T) (2.52)

and for odd p(t),

p(t) = Σ_{j=1}^{∞} aj sin(2πjt/T). (2.53)

For a function p(t) that is neither purely even nor purely odd, you should go back to the
full exponential decomposition (2.32).

Alternative
Many people (see, for example, [3, 15]) will define the Fourier series as


p(t) = Σ_{j=−∞}^{∞} fj e^{iπjt/S} (2.54)

with coefficients {fj}_{j=−∞}^{∞}, where t = 0 → S is the temporal domain of interest. This is
easily related to the form we are using by noting that S = T/2 recovers our decomposition.
The advantage of using the half-domain is that the even or odd-ness of the signal is
unspecified, and this allows us to use either of the sine/cosine series for decomposition.
The sine series is
p(t) = Σ_{j=1}^{∞} sj sin(jπt/S) (2.55)

and the cosine series is

p(t) = Σ_{j=0}^{∞} cj cos(jπt/S). (2.56)

If the extension of the signal from S = T/2 → T is even, you should use the cosine
form, if it is odd, use the sine form. If you don’t know or care, use either one, and the
behavior in the unspecified domain will be fixed by your choice (referring to Figure 2.8, if
you took only up to S = T/2, you wouldn’t know what type of function you have, even or
odd, for t ∈ [T/2, T)). I prefer to specify the signal over the entire interval, and use basis
functions that have the same periodicity as the signal. In addition to this natural adaptation,
it is easier to develop the Fourier transform with the 2π in place, as we shall see in
Section 2.6.

2.3.2 Gibb’s Phenomenon


We started by assuming that


p(t) = Σ_{j=−∞}^{∞} aj e^{i2πjt/T} (2.57)

for a function p(t) with p(t + T) = p(t). Then we identified the coefficients:

aj = (1/T) ∫_0^T p(t̄) e^{−i2πjt̄/T} dt̄. (2.58)
But what can we say about the self-consistency of the procedure? What happens if we try
to evaluate the sum in (2.57) with the coefficients from (2.58) in place? Let that function
be called q(t):
q(t) = Σ_{j=−∞}^{∞} [(1/T) ∫_0^T p(t̄) e^{−i2πjt̄/T} dt̄] e^{i2πjt/T}
     = (1/T) ∫_0^T p(t̄) [Σ_{j=−∞}^{∞} e^{i2πj(t−t̄)/T}] dt̄. (2.59)

This should be equal to p(t) (almost everywhere, at least).


The sum of the exponentials comes from Problem 2.3.15:
  
Σ_{j=−N}^{N} e^{i2πj(t−t̄)/T} = sin((N + 1/2) 2π(t − t̄)/T) / sin(2π(t − t̄)/(2T)). (2.60)

If we identify the sum in (2.59) as the N → ∞ limit of the sum in (2.60), then we can write
     
q(t) = lim_{N→∞} [(1/T) ∫_0^T p(t̄) (sin((N + 1/2) 2π(t − t̄)/T) / sin(π(t − t̄)/T)) dt̄]. (2.61)

The limit must be evaluated carefully, for specific cases, to ensure the convergence of
q(t) to p(t) for all values of t. Limiting integrals of this basic form will be considered
again in Section 2.6.1, but there must be something special about them since somehow we
must recover q(t) = p(t) (almost everywhere). There are subtleties, and as an example of
what can go wrong, let’s go back to our square wave example, with p(t) given by (2.42).
You will see, in Problem 2.3.3, that truncating the sum in (2.44) leads to overshoot at the
boundaries, a sort of “ringing” that can be reduced by including more terms in the sum.
What we will establish here is that while the ringing can be reduced, the overshoot remains
at the discontinuity, even in the limit as N → ∞. That remaining overshoot is known as
the “Gibb’s phenomenon.” There are many ways to establish its existence, and we will
take a truncated numerical experiment approach. For a more formal proof of the Gibb’s
phenomenon that starts with (2.61), see [1], for example.
Go back to the square wave signal from (2.42), with decomposition (using the sine
series) in (2.46) and take p0 = 1. Define the truncated form

pN(t) = (4/π) Σ_{j=0}^{N} sin(2π(2j + 1)t/T)/(2j + 1), (2.62)

and we’ll probe values near t = 0, where the discontinuity in the full p(t) occurs. Take
t = εT where ε ≪ 1, then we have

pN(t = εT) = (4/π) Σ_{j=0}^{N} sin(2πε(2j + 1))/(2j + 1). (2.63)

Fig. 2.9 The maximum value of the truncated Fourier series for a square wave evaluated near zero as defined in (2.64). As
the number of terms in the truncated sum, N, is increased, the maximum converges to ≈1.179.

We are looking for the maximum value of pN (t) for t near zero, so we’ll probe the maximum
value over a small grid in ε. To be concrete, let the maximum value of the truncated signal,
evaluated near zero, be defined by
mN ≡ max_{k∈[1,100]} (pN(kεT)), (2.64)

and then we can probe mN for a variety of N. A plot for N = 1 → 5,000 (in steps of
50) is shown in Figure 2.9. It is clear that the maximum value in this region near zero is
converging to some number greater than one as N gets large. That represents an overshoot
since the maximum value of the original signal is 1, and we would have expected mN → 1
as N → ∞. The actual convergent value for this “data” is ≈1.179, agreeing with the
analytic result (from [1] again) of 1.1789797 . . ..
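
For reference, the numerical experiment behind Figure 2.9 takes only a few lines. Here is one way to do it (a Python sketch, using ε = 10⁻⁶ as in Problem 2.3.16 but a coarser set of N values than the figure, to keep the run short):

import numpy as np

eps, T = 1e-6, 1.0

def pN(t, N):
    j = np.arange(N + 1)       # truncated sine series for the square wave, cf. (2.62)
    return (4/np.pi) * np.sum(np.sin(2*np.pi*(2*j + 1)*t[:, None]/T)/(2*j + 1), axis=1)

def mN(N):
    k = np.arange(1, 101)      # sample at t = k*eps*T, k = 1..100, cf. (2.64)
    return np.max(pN(k*eps*T, N))

for N in (1, 100, 500, 1000, 5000):
    print(N, mN(N))            # climbs toward ~1.179, the overshoot, as N grows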
Problem 2.3.1 Show that
(2/T) ∫_0^T sin(jπt/T) sin(kπt/T) dt = δjk
by expressing sine in terms of exponentials.
Problem 2.3.2 For a function p(t) = p0 sin(ωt + φ), for what value of φ ∈ [0, 2π) is
p(t) = p0 cos(ωt)?
Problem 2.3.3 Truncate the sum in (2.46) at integer values n = 1, 10, and 100. Plot the
resulting approximations to p(t) for p0 = 1, T = 2.
Problem 2.3.4 We have p(t) = p0 t/T for t ∈ [0, T] – assuming p(t) is periodic with period
T, sketch p(t) for t = −T → 3T (this is just to get an idea of what the function
looks like over multiple periods). Find the coefficients {aj}_{j=−∞}^{∞} in the Fourier series
expansion5 of this p(t) using (2.41). Be careful, the j = 0 case must be handled
separately. If you have access to Mathematica, download the “Fourier.nb”

5 The “integration by parts” formula (see Section 5.4.2) may prove useful: for u(t) and v(t) functions of t,
∫_a^b v(t) (du(t)/dt) dt = u(t)v(t)|_{t=a}^{b} − ∫_a^b u(t) (dv(t)/dt) dt.

notebook from the book website, and put in your values for aj in the “Homework
Problem Template” section – taking p0 = 1/2 and T = 2, plot the truncated form
of your series for N = 25 (all of this is set up in the notebook, except for your
coefficients). Does the truncated plot look like your sketch?
Problem 2.3.5 Find the coefficients {aj}_{j=−∞}^{∞} in the decomposition of the “triangle wave”
function, with constant p0 ,
p(t) = { p0 t           0 ≤ t < T/2
       { −p0 (t − T)    T/2 ≤ t < T .
Problem 2.3.6 Find the coefficients {aj}_{j=−∞}^{∞} in the decomposition of the “sawtooth”
function
p(t) = p0 − αt for 0 ≤ t < T,
with constants p0 and α.
Problem 2.3.7 For

p(t) = { p0     0 ≤ t < T/4
       { −p0    T/4 ≤ t < 3T/4
       { p0     3T/4 ≤ t < T
find the cosine series decomposition:

p(t) = Σ_{k=0}^{∞} ck cos(2πkt/T)    with    ck = (2/T) ∫_0^T p(t) cos(2πkt/T) dt

(watch out for k = 0).


Problem 2.3.8 Show that any signal p(t) can be written as a piece that is even in t and
a piece that is odd in t, i.e. show that p(t) = e(t) + o(t) with e(−t) = e(t) and
o(−t) = −o(t) (Hint: construct e(t) and o(t) explicitly using p(t) and p(−t)).
Problem 2.3.9 Show that for even functions e1 (t), e2 (t), and odd o1 (t), o2 (t), the product
e1 (t)o1 (t) is an odd function while o1 (t)o2 (t) and e1 (t)e2 (t) are even.
Problem 2.3.10 Given an even function e(t) and an odd function o(t), show that:
∫_{−T}^{T} e(t) o(t) dt = 0

where T is a constant.
Problem 2.3.11 For p(t) = p0 sin(2πkt/T) cos(2πℓt/T), a signal with period T and constant
integers k and ℓ, is p(t) even or odd? Decompose p(t) into the appropriate (sine if
odd, cosine if even) Fourier series (i.e. find the coefficients in the relevant infinite
sum).
Problem 2.3.12 Given the periodic signal (with period T ):
p(t) = { p0   0 ≤ t ≤ T/4
       { 0    T/4 < t < T
where p0 is a positive constant, and p(t + T) = p(t). Is p(t) even, odd, or neither?
Find the Fourier series decomposition of p(t) (use the cosine series if the function is
even, the sine series if odd, and the exponential if neither).

Problem 2.3.13 Decompose the square wave signal


p(t) = { p0   0 < t < S/2
       { 0    S/2 < t < S
into both the sine (2.55) and cosine (2.56) series (using the orthogonality of sine and
cosine to find the coefficients). In each case, plot the reconstruction of the signal from
t = 0 → 2S to see how the even/odd behavior is enforced (use S = 2 with p0 = 3/2
and truncate the sums at N = 25).
Problem 2.3.14 For the finite sum,

TN = Σ_{j=1}^{N} r^j,

find an expression for the sum that involves just r and N (and any numerical constants
like 1) i.e. “solve” for the sum (Hint: write rTN in terms of TN and powers of r, then
solve for TN as in [3]). Do the same for
SN = Σ_{j=−N}^{−1} r^j.

Put the pieces together to write an expression for



Σ_{j=−N}^{N} r^j.

Problem 2.3.15 Using your result from the previous problem, show that

Σ_{k=−N}^{N} e^{ikθ} = (cos(Nθ) − cos((N + 1)θ))/(1 − cos θ).

The right-hand side can be simplified,


  
(cos(Nθ) − cos((N + 1)θ))/(1 − cos θ) = sin((N + 1/2)θ)/sin(θ/2).
Problem 2.3.16 Produce the plot shown in Figure 2.9 as follows: take the truncated form
of the Fourier series expansion of the square wave, and, using Mathematica (or
other), evaluate (2.64) with ε = .000001 for values of N ranging from 1 → 5,000
(in steps of 50, say). Plot the values you get as a function of N, you should see
convergence to the appropriate number.

2.4 Fourier Series and ODEs

The Fourier series can be used to turn ODEs into algebraic equations, possibly infinite in
number, but easy to solve individually. This is the formal justification for the “guesses” we
have made previously, leading to (1.74) and (2.9).

2.4.1 First-Order Example


We’ll start by applying the Fourier series approach to a first-order ODE. To retain maximal
overlap with our driven harmonic oscillator, take

ẋ(t) + iωx(t) = f(t) x(0) = x0 (2.65)

which has homogeneous solution h(t) = h0 e −iωt . We’ll focus on finding the “sourced”
solution, x̄(t), and add in h(t) at the end. Assuming f(t) is periodic with period T, we can
write it as a Fourier series:

 
f(t) = Σ_{k=−∞}^{∞} ck e^{i2πkt/T}        cj = (1/T) ∫_0^T f(t) e^{−i2πjt/T} dt (2.66)

and assume that x̄(t) can also be expanded




x̄(t) = Σ_{k=−∞}^{∞} ak e^{i2πkt/T}. (2.67)

We want to use (2.65) to find the unknown coefficients {ak}_{k=−∞}^{∞}. Inserting the expan-
sions, from (2.66) and (2.67), into (2.65), we have

Σ_{k=−∞}^{∞} [(i2πk/T + iω) ak − ck] e^{i2πkt/T} = 0. (2.68)

Multiply this equation by e −i2πjt/T /T and integrate from t = 0 → T. The resulting


Kronecker delta can be used to isolate the jth coefficient equation
 
(i2πj/T + iω) aj − cj = 0 −→ aj = cj/(i(ω + 2πj/T)). (2.69)

Then the solution for x̄(t) is



x̄(t) = Σ_{j=−∞}^{∞} (cj/(i(ω + 2πj/T))) e^{i2πjt/T}. (2.70)

This x̄(t) contains the piece that is responsible for f(t) in the derivative dx̄(t)/dt, but we can add
in the homogeneous piece, giving us a constant of integration that we can use to set the
initial value, so the full solution, x(t) = h(t) + x̄(t), is

x(t) = h0 e^{−iωt} + Σ_{j=−∞}^{∞} (cj/(i(ω + 2πj/T))) e^{i2πjt/T}
                                                               (2.71)
cj ≡ (1/T) ∫_0^T f(t) e^{−i2πjt/T} dt,

and we can use h0 to set x(0) = x0 .
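
To see (2.71) in action, here is a small Python sketch (the square-wave drive, the period T = 2, the frequency ω = 1.3, the initial value x0 = 0, and the truncation |j| ≤ 201 are all assumptions for the demonstration): assemble x(t) from the coefficients (2.69) and check that it satisfies the ODE away from the drive’s discontinuities.

import numpy as np

I = 1j
T, omega, f0 = 2.0, 1.3, 1.0          # assumed period, frequency, and drive amplitude
jmax, x0 = 201, 0.0                   # truncation of the sum; assumed initial value

def c(j):                             # square-wave coefficients, odd j only, cf. (2.43)
    return -2 * I * f0 / (np.pi * j)

def xbar(t):                          # the sourced piece, cf. (2.70)
    js = [j for j in range(-jmax, jmax + 1) if j % 2 != 0]
    return sum(c(j) / (I * (omega + 2*np.pi*j/T)) * np.exp(I * 2*np.pi*j*t/T) for j in js)

h0 = x0 - xbar(0.0)                   # homogeneous amplitude fixed by x(0) = x0
x = lambda t: h0 * np.exp(-I * omega * t) + xbar(t)

t, dt = 0.7, 1e-6                     # a point away from the jumps of the square wave
dxdt = (x(t + dt) - x(t - dt)) / (2 * dt)
print(dxdt + I * omega * x(t))        # ≈ f0 = 1, up to truncation and finite-difference error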



2.4.2 Driven Oscillator


Let’s go back to our second-order driven oscillator problem,

ẍ(t) + ω 2 x(t) − f(t) = 0, (2.72)

and repeat the procedure. This time, we know the homogeneous solution is h(t) =
A cos(ωt) + B sin(ωt), and we can add this back in at the end.
We will expand f(t) in its Fourier series as in (2.66) with coefficients {ck}_{k=−∞}^{∞} and then
use (2.72) to find the coefficients {ak}_{k=−∞}^{∞} in the expansion of x̄(t), the sourced solution.
Inserting


f(t) = Σ_{k=−∞}^{∞} ck e^{i2πkt/T}
                                   (2.73)
x̄(t) = Σ_{k=−∞}^{∞} ak e^{i2πkt/T}

into (2.72), and collecting terms, we get:



Σ_{k=−∞}^{∞} [(−(2πk/T)² + ω²) ak − ck] e^{i2πkt/T} = 0. (2.74)

Now multiply both sides by e −i2πjt/T /T and integrate from 0 → T, only the jth term in the
sum contributes, so we have
(−(2πj/T)² + ω²) aj − cj = 0 (2.75)

and we can solve for aj (note the similarity between this coefficient and the one appearing
in (2.28) with σ ≡ 2πj/T),
aj = cj/(ω² − (2πj/T)²). (2.76)
The coefficients cj come from (2.66) in the usual way. We can write out x(t) explicitly,
adding back in the homogeneous solution

x(t) = A cos(ωt) + B sin(ωt) + Σ_{j=−∞}^{∞} [(1/T) ∫_0^T f(t̄) e^{−i2πjt̄/T} dt̄ / (ω² − (2πj/T)²)] e^{i2πjt/T}, (2.77)

and this solution is just a sum of the individual (2.28) solutions with coefficients tuned
appropriately. It is the second-order ODE version of the solution in (2.71).
There is a physical problem with the solution here: What if the natural spring frequency
ω was equal to one of the driving frequencies? If ω = 2πj/T for some j, then the solution
blows up due to the denominator in the sum in (2.77). This extreme “resonance” between
the driving frequency and the spring frequency is not what we observe. We could argue that
it is impossible (in the Platonic sense) to achieve the exact equality of driving and spring
frequency. Still, an arbitrary approach to infinite amplitude is not what we expect from a
driven oscillator. The issue is that no physical system can be driven without some sort of
damping loss, and the damping puts a finite cutoff on the resonance between driving and
natural frequency.

Problem 2.4.1 Find x(t) solving (2.72) for a driving force that is a square wave
f(t) = { f0     0 ≤ t < T/2
       { −f0    T/2 ≤ t < T

using (2.77) with A = B = 0 to focus on the driven piece of the solution.

2.5 Damped Driven Harmonic Oscillator

We’ll now add back in the damping term to consider the full damped, driven harmonic
oscillator problem

ẍ(t) + 2bẋ(t) + ω 2 x(t) = f(t). (2.78)

We already know the pair of solutions to the homogeneous ( f(t) = 0) problem: ḧ(t) +
2bḣ(t) + ω 2 h(t) = 0, those are just

h±(t) = e^{−bt} e^{±√(b²−ω²) t}, (2.79)

and we could use the same Fourier series approach as for the driven harmonic oscillator to
find a solution analogous to (2.77).
Instead we’ll focus on the common situation in which a single driving frequency is used.
This happens in, for example, driven electrical (capacitor-inductor-resistor, CLR) circuits
all the time (a function generator can be enticed to produce a sinusoidal voltage of well-
defined frequency). In this specialized case

ẍ(t) + 2bẋ(t) + ω2 x(t) = f0 e iσt , (2.80)

where we again have a complex driving force, and will eventually take either the real or
imaginary piece of the complex solution x(t).
To solve (2.80), we can use the same approach as in Section 2.2. For the full solution
x(t) = Ah+ (t) + Bh− (t) + x̄(t), where x̄(t) satisfies (2.78), we’ll take it to be of the form
x̄(t) = X0 e iαt . With this guess in place, the ODE reads

−α 2 X0 e iαt + 2ibαX0 e iαt + ω 2 X0 e iαt = f0 e iσt . (2.81)

Once again, we must set α = σ to render the equation true for all times t. Then we can
solve for X0 :
X0 = f0 / ( ω² + 2ibσ − σ² )        (2.82)

and the full solution is


x(t) = A e^{−bt} e^{√(b²−ω²) t} + B e^{−bt} e^{−√(b²−ω²) t} + f0 e^{iσt} / ( ω² + 2ibσ − σ² ).        (2.83)
Regardless of the relative values of b and ω, it is clear that the terms associated with the
homogeneous solutions will eventually go away, so that the new piece of the solution is the
one that survives as t → ∞. For that reason, we can focus on it (and specifically, its real
part in the end, since that will be relevant to an actual physical driving force). The solution
obtained by setting A = B = 0 (alternatively, letting t → ∞) is called the “steady state”
solution, and dominates after the “transients” (the decaying bits) have died down.
Let’s identify the real and imaginary pieces of the steady state solution,
x(t) = f0 e^{iσt} / ( ω² + 2ibσ − σ² ),        (2.84)

by writing the numerator in terms of identifiable real values. Multiply the top and bottom of the fraction by ω² − 2ibσ − σ² (the complex conjugate of the denominator):

x(t) = [ ( ω² − σ² − 2ibσ ) / ( (ω² − σ²)² + 4b²σ² ) ] f0 e^{iσt}.        (2.85)
This looks like the product of a pair of complex numbers, and it is useful to write the first
term in polar notation to go along with the second. Let
 
φ ≡ tan⁻¹( 2bσ / (ω² − σ²) ),        (2.86)

so that in polar form, we have

( ω² − σ² − 2ibσ ) / ( (ω² − σ²)² + 4b²σ² ) = [ 1 / √( (ω² − σ²)² + 4b²σ² ) ] e^{−iφ}.        (2.87)

Using this polar form in (2.85) gives a fully polar representation of the solution:
x(t) = [ f0 / √( (ω² − σ²)² + 4b²σ² ) ] e^{i(σt−φ)}.        (2.88)

There are a few things to note here. First, the solution and the driving force are not in phase; they differ by φ, a constant set by ω and b (the physical parameters governing the oscillation and damping) in addition to σ. If you are given a driving frequency σ, and want to know the frequency ω that will maximize the amplitude of x(t), then we want to minimize the denominator with respect to ω:

d/dω [ (ω² − σ²)² + 4b²σ² ] = 4ω(ω² − σ²) = 0  −→  ω = σ,        (2.89)

and the "resonant" frequency is ω = σ. Turning it around, if we are given ω, the natural frequency of the spring, what driving frequency σ maximizes the amplitude?

d/dσ [ (ω² − σ²)² + 4b²σ² ] = 8b²σ − 4σ(ω² − σ²) = 0  −→  σ = √(ω² − 2b²).        (2.90)
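A short numerical check of this maximization: the sketch below scans the steady-state amplitude from (2.88) over σ and compares the located peak with √(ω² − 2b²); the values of ω, b, and f0 are arbitrary.

```python
import numpy as np

# Sketch: compare the numerically located peak of the steady-state amplitude
# from (2.88) with the prediction sigma = sqrt(omega^2 - 2 b^2) of (2.90).
# omega, b, and f0 are illustrative values (with omega^2 > 2 b^2).
omega, b, f0 = 2.0, 0.3, 1.0

def amplitude(sigma):
    return f0 / np.sqrt((omega**2 - sigma**2)**2 + 4 * b**2 * sigma**2)

sigmas = np.linspace(0.01, 2 * omega, 200001)
sigma_peak = sigmas[np.argmax(amplitude(sigmas))]
print(sigma_peak, np.sqrt(omega**2 - 2 * b**2))  # should agree to grid resolution
```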


The solution (2.83) is for a single driving frequency, but if we knew the Fourier series
decomposition of a more general driving function, f(t) = F(t)/m, with


f(t) = Σ_{j=−∞}^{∞} fj e^{i2πjt/T},        (2.91)

then we can just add up the solutions for the individual driving frequencies "σ ∼ 2jπ/T":

x(t) = A e^{−bt} e^{√(b²−ω²) t} + B e^{−bt} e^{−√(b²−ω²) t} + Σ_{j=−∞}^{∞} fj e^{i2πjt/T} / ( ω² + 4ibjπ/T − (2jπ/T)² ).        (2.92)

Electrical Example
Suppose we have a signal generator outputting V(t) = V0 e iσt (the real part is all that
will matter in the end), and current runs through a capacitor (capacitance C), an inductor
(inductance L) and a resistor (with resistance R) as shown in Figure 2.10. Using Kirchhoff's voltage law with VR = IR, VC = Q/C and VL = −Lİ as the voltage drops across each device, we have

V0 e^{iσt} − Q/C − Lİ(t) − I(t)R = 0.        (2.93)

Positive current is bringing charge to the positive capacitor plate, so here I(t) = Q̇(t),
and we can write everything in terms of the charge on the positive capacitor plate and its
derivatives:
Q̈(t) + (1/(LC)) Q(t) + 2 (R/(2L)) Q̇(t) = (V0/L) e^{iσt}.        (2.94)

We can solve this ODE by comparing it with (2.78), where we identify "ω" ∼ 1/√(LC), "b" ∼ R/(2L), "f0" ∼ V0/L, and, of course, "x(t)" ∼ Q(t). Using these in the solution (2.88) gives (taking, finally, the real part)

Q(t) = [ (V0/L) / √( (1/(LC) − σ²)² + (R²/L²) σ² ) ] cos(σt − φ)
φ = tan⁻¹( (Rσ/L) / ( 1/(LC) − σ² ) ).        (2.95)


Fig. 2.10 An oscillatory driving voltage powers a circuit with a capacitor (C), an inductor (L), and a resistor (R). The output
voltage is taken across the resistor.

The current flowing through the circuit is

I(t) = dQ(t)/dt = − [ (V0 σ/L) / √( (1/(LC) − σ²)² + (R²/L²) σ² ) ] sin(σt − φ).        (2.96)

Finally, the power in the circuit is


P = I²R = [ (V0 σ/L)² R / ( (1/(LC) − σ²)² + (R²/L²) σ² ) ] sin²(σt − φ).        (2.97)

To maximize the power, we take the σ derivative of the term in front of sin² and set it equal to zero to get σ = 1/√(LC); the maximum power is obtained by tuning the driving frequency to the natural frequency of the circuit, ω ≡ 1/√(LC). The maximum power delivered at this resonant frequency is V0²/R.
There are a variety of ways to characterize these “resonant circuits,” and the particular
language depends on the application (see [9] for further examples). On the electrical side,
if we take the voltage drop across the resistor (the final circuit element before ground) to
be the “output” voltage, then
Vout(t) = I(t)R = − [ (σR V0/L) / √( (1/(LC) − σ²)² + (R²/L²) σ² ) ] sin(σt − φ).        (2.98)

The amplitude here is


|Vout(t)| = (σR V0/L) / √( (1/(LC) − σ²)² + (R²/L²) σ² ) = V0 / √( 1 + (1/R²)( 1/(σC) − Lσ )² ).        (2.99)

The amplitude for the input voltage (the driving signal) is V0 , and the “gain” of the circuit
is defined to be the ratio of these amplitudes:
g(σ) ≡ |Vout(t)| / V0 = 1 / √( 1 + (1/R²)( 1/(σC) − Lσ )² ).        (2.100)

You can plot g(σ), the gain of the circuit, as a function of the input driving frequency; such a plot is called a "resonance curve." The curve is peaked about σ = 1/√(LC), the natural frequency of the circuit. How sharply peaked is the curve? One way to characterize the width of the resonance curve is to pick the values of σ for which the gain has dropped from its maximum at 1 (for σ = 1/√(LC)) to 1/√2 ≈ 0.7. There are two values of σ, one on either side of the peak (σ− to the left, σ+ to the right), at which the gain is roughly 70 percent of the maximum. Given those two frequencies, define Δσ ≡ σ+ − σ−, the width at 1/√2 of max. In Figure 2.11 we see an example gain curve, g(σ), with σ± marked.
If we write the gain in terms of the frequency ω = 1/√(LC), and use the Q-factor from (2.21),

Q = √( (ω/b)² − 1 ) = √( ( 2/(ωRC) )² − 1 )        (2.101)


Fig. 2.11 A plot of g(σ) (for a circuit with Q ≈ 12). The peak, with a gain of one, is at the resonant frequency of the circuit, and σ± represent the frequencies at which the gain is 1/√2.

to eliminate C, the capacitance, then the gain can be written as


g = 2 / √( 2(1 − Q²) + ( σ²/ω² + ω²/σ² )(1 + Q²) ).        (2.102)


We want the pair of σ values for which g = 1/√2. The positive roots end up being:

σ± = ω √( ( (3 + Q²) ± 2√(2 + Q²) ) / (1 + Q²) )        (2.103)

so that

Δσ/ω = √( ( (3 + Q²) + 2√(2 + Q²) ) / (1 + Q²) ) − √( ( (3 + Q²) − 2√(2 + Q²) ) / (1 + Q²) ).        (2.104)

Remember that Q is defined entirely in terms of the ratio of decay to oscillation time-scale.
It has nothing to do with driving or anything else, yet it governs the circuit’s response to
driving frequencies. As Q gets large (so that oscillations dominate decay), we have
Δσ/ω ∼ 2/Q        (2.105)
and large Q means small width, a sharply peaked resonance curve (or, in terms of its
definition (2.21), many cycles of oscillation within the decay envelope of the undriven
motion).
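These statements are easy to probe numerically. The sketch below evaluates the gain (2.100) for illustrative R, L, and C values, finds the two 1/√2 crossings, and compares Δσ/ω with the exact expression (2.104) and with the large-Q estimate 2/Q.

```python
import numpy as np

# Sketch: resonance-curve width for the CLR circuit, using arbitrary element values.
R, L, C = 50.0, 1.0e-2, 1.0e-6           # ohms, henries, farads (illustrative)
omega = 1.0 / np.sqrt(L * C)              # natural frequency
b = R / (2.0 * L)
Q = np.sqrt((omega / b)**2 - 1.0)         # Q-factor from (2.101)

def gain(sigma):
    return 1.0 / np.sqrt(1.0 + (1.0 / R**2) * (1.0 / (sigma * C) - L * sigma)**2)

# scan for the two 1/sqrt(2) crossings on either side of the peak
sigmas = np.linspace(0.2 * omega, 3.0 * omega, 200001)
above = gain(sigmas) >= 1.0 / np.sqrt(2.0)
crossing = sigmas[above]
dsigma = crossing[-1] - crossing[0]       # sigma_+ - sigma_-

exact = (np.sqrt((3 + Q**2 + 2 * np.sqrt(2 + Q**2)) / (1 + Q**2))
         - np.sqrt((3 + Q**2 - 2 * np.sqrt(2 + Q**2)) / (1 + Q**2)))
print(dsigma / omega, exact, 2.0 / Q)     # first two agree; 2/Q is close for large Q
```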

Problem 2.5.1 What is the amplitude associated with the optimizations in (2.89) and (2.90)?
Problem 2.5.2 A capacitor is charged up so that it has charge Q0 on its positive plate (−Q0
on its negative plate). At time t = 0, we close the switch in the following circuit and
current flows. At time t, what is the charge Q(t) on the positive capacitor plate? What

Fig. 2.12 Circuit for Problem 2.5.3: an inductor (L) and resistor (R) in series with the constant potential V0.

Fig. 2.13 Circuit for Problem 2.5.4: a capacitor (C), inductor (L), and resistor (R) in series.

is the timescale that governs the decay of charge? Does it have the appropriate unit?
What is the current through the resistor at time t = 0?
(Circuit diagram for Problem 2.5.2: the capacitor C carries charge Q(t); closing the switch lets it discharge through the circuit.)

Problem 2.5.3 We have an inductor and a resistor in series. At time t = 0, we close the
switch, connecting the inductor and resistor to a constant potential V0 . Find the
current, I(t), flowing through the resistor as a function of time, taking I(0) = 0.
Problem 2.5.4 We have a capacitor, inductor, and resistor in series. We charge up the
positive plate of the capacitor to Q0 (the negative plate has −Q0 ) and then close the
switch at t = 0 so that current can flow. Kirchhoff's law, applied around the loop, gives the following ODE:

Q(t)/C + L Q̈(t) + Q̇(t) R = 0
with Q(0) = Q0 and Q̇(0) = 0. Solve this ODE for Q(t), and impose the initial
condition. What relations between the values R, L, and C give an “underdamped”
solution? What is the exponential timescale τ for this solution? What is the
oscillatory period T for the underdamped solution? Using these expressions, write
the Q-factor: Q ≡ 2πτ/T in terms of the R, L, and C values.
Problem 2.5.5 We run a “complex” voltage V(t) = V0 e iσt through a circuit element,
represented by the ? box in the following diagram. Find the current, I(t), flowing
“through” the circuit element if it is a resistor (with resistance R), a capacitor (with
capacitance C), and an inductor (with inductance L). In each case, write the potential
drop across these elements as I(t)Z and determine Z, the “complex impedance”
(generalizing the notion of resistance) for the resistor, capacitor, and inductor.

(Diagram for Problem 2.5.5: the source V0 e^{iσt} connected to the unknown circuit element "?".)

2.6 Fourier Transform

There is an extension of the Fourier series decomposition appropriate for functions with
no definite periodicity. Our various harmonic oscillator problems have had a notion of
periodicity (except the critically damped and overdamped cases), and so the Fourier series
made a certain amount of sense. But how should we think about decomposing a function
that has no natural period?
The fundamental shift from fixed period (for the Fourier series) to "infinite period" (for the Fourier transform) is in going from the discrete to the continuous. For the Fourier series, we decompose a function p(t) with period T, so that p(t + T) = p(t), using periodic "basis" functions e^{i2πjt/T} for integer j:

p(t) = Σ_{j=−∞}^{∞} aj e^{i2πjt/T}        (2.106)

with

ak = (1/T) ∫_0^T p(t) e^{−i2πkt/T} dt.        (2.107)

From (2.106), we can see that the coefficients {ak}_{k=−∞}^{∞} tell us, for a given k, how much6 e^{i2πkt/T} is in p(t). In the Fourier transform, the coefficients themselves become continuous, so that the "k" is no longer just an integer.
The Fourier transform of a function p(t), periodic or not, is defined by

p̃(k) = ∫_{−∞}^{∞} p(t) e^{i2πkt} dt,        (2.108)

which is analogous to (2.107). The function p̃(k) is the coefficient that tells us "how much" e^{−i2πkt} is in p(t), for a given k (now continuous). The only problem is that p̃(k) is complex, but we'll address that later.
To recreate the function p(t), we sum (in the continuous sense, so that Σ → ∫) up the coefficients p̃(k) together with their contributing7 e^{−i2πkt}:

p(t) = ∫_{−∞}^{∞} p̃(k) e^{−i2πkt} dk,        (2.109)

a continuous version of (2.106).


The Fourier pair of t and k correspond to time and frequency. I used the letter k to
remind us of the discrete index k in the Fourier series decomposition coefficients, ak . But
for Fourier transforms, the frequency variable is generally called f, and k is reserved for the

6 This interpretation is difficult, since ak could be complex – as we shall see in a moment, a better measure is
|ak |2 .
7 There is a conventional sign change between the Fourier series and Fourier transform. For the series, the
functions we use to create p(t), are e i2πkt/T , while for the transform, the functions from which we build p(t)
are e −i2πkt . It couldn’t matter less, since we sum over both positive and negative values, but it is a notational
pitfall that spoils the otherwise pristine discrete-to-continuous story.

Fourier transform of spatial variables (like x). So in what follows, I’ll revert to the standard
notation: the Fourier transform of p(t) will be called p̃( f ) with frequency variable f. The
transform and its inverse are, then
p̃(f) = ∫_{−∞}^{∞} p(t) e^{i2πft} dt
p(t) = ∫_{−∞}^{∞} p̃(f) e^{−i2πft} df.        (2.110)

2.6.1 A Surprising “Orthogonality”


If you are given p(t), and you compute its Fourier transform p̃( f ) using the top equation
in (2.110), how do we know that taking that p̃( f ) and inserting it into the bottom equation
of (2.110), we really recover p(t)? Let’s try it out: take the output of p̃( f ) from (2.110) and
use it in the equation for p(t):
p(t) ?= ∫_{−∞}^{∞} ( ∫_{−∞}^{∞} p(s) e^{i2πfs} ds ) e^{−i2πft} df
      = ∫_{−∞}^{∞} p(s) ( ∫_{−∞}^{∞} e^{i2πf(s−t)} df ) ds,        (2.111)

where we have once again performed mathematical surgery (interchanging the integral
orders) with the physicists’ impunity (although there are important times when that
interchange would be invalid).
Now the question: What is the integral in parenthesis in the second line of (2.111)? It
looks innocuous enough, few functions are easier to integrate than exponentials, but the
evaluation of the oscillatory integral at ±∞ is troubling. What is cos(∞)? Let’s take a
more formal view and consider
g(s, t) ≡ ∫_{−∞}^{∞} e^{i2πf(s−t)} df = lim_{R→∞} ∫_{−R}^{R} e^{i2πf(s−t)} df = lim_{R→∞} sin(2πR(s − t)) / ( π(s − t) ).        (2.112)

For any values of s and t, the numerator is bounded by one (in absolute value), while the denominator gets very small for s → t, so the function itself gets very large there (the numerator has the R in it, which is running off to infinity, so even if s → t, the argument of sine is not necessarily zero). The function h(s, t) ≡ sin(2πR(s − t))/(π(s − t)), with g(s, t) = limR→∞ h(s, t), for various values of
R is shown in Figure 2.14. Notice that the peak grows with R, while the off-peak values
oscillate faster and faster. If we integrate g(s, t) in t, letting z ≡ 2πR(s − t), we get the
“sine integral” (defined to be sin(z)/z dz, see Problem 8.3.5 for a numerical approach to
the sine integral),

Fig. 2.14 The function h(0, t) plotted for R = 1 (black), R = 5 (gray), and R = 9 (light gray).

∫_{−∞}^{∞} g(s, t) dt = lim_{R→∞} ∫_{−∞}^{∞} sin(2πR(s − t))/(π(s − t)) dt = lim_{R→∞} (1/π) ∫_{−∞}^{∞} (sin(z)/z) dz = lim_{R→∞} 1 = 1,        (2.113)

where the integral of h(s, t) is R-independent.


We want a function g(s, t) that is infinite when s = t, zero elsewhere (looking at what we
need to get equality in (2.111)) and integrates to 1. That “function” is known as the “Dirac
delta function,” and is denoted δ(s − t). It has the property that for “any” function p(t):
∫_{−∞}^{∞} p(s) δ(s − t) ds = p(t),        (2.114)

just what we wanted to establish the consistency of the Fourier Transform in (2.111) which
now reads
p(t) ?= ∫_{−∞}^{∞} p(s) ( ∫_{−∞}^{∞} e^{i2πf(s−t)} df ) ds = ∫_{−∞}^{∞} p(s) δ(s − t) ds = p(t).        (2.115)
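The same consistency can be checked numerically for a concrete signal. The sketch below takes p(t) = e^{−t²}, computes p̃(f) by direct numerical integration of the top line of (2.110), then applies the bottom line and compares the result with the original p(t); the grids are arbitrary but wide enough for the Gaussian (and its transform) to have died off.

```python
import numpy as np

# Sketch: verify the transform/inverse pair (2.110) for p(t) = exp(-t^2),
# doing both integrals as simple Riemann sums. Grid choices are illustrative.
t = np.linspace(-10.0, 10.0, 8000, endpoint=False)
f = np.linspace(-5.0, 5.0, 4000, endpoint=False)
dt, df = t[1] - t[0], f[1] - f[0]
p = np.exp(-t**2)

# forward transform: ptilde(f) = integral p(t) exp(+i 2 pi f t) dt
ptilde = np.array([np.sum(p * np.exp(1j * 2 * np.pi * fj * t)) * dt for fj in f])

# inverse transform at a few times: p(t0) = integral ptilde(f) exp(-i 2 pi f t0) df
for t0 in (0.0, 0.5, 1.3):
    p_back = np.sum(ptilde * np.exp(-1j * 2 * np.pi * f * t0)) * df
    print(t0, p_back.real, np.exp(-t0**2))    # reconstructed value vs original

# for reference, the analytic transform of exp(-t^2) is sqrt(pi) exp(-pi^2 f^2)
print(ptilde[np.argmin(np.abs(f - 0.3))].real, np.sqrt(np.pi) * np.exp(-np.pi**2 * 0.3**2))
```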

2.6.2 Dirac Delta Function


Motivated by the discussion in the last section, we posit the existence of a function δ(x)
defined by the following properties:
δ(x) = { 0 for x ≠ 0,   ∞ for x = 0 },   with (for positive constants a and b)   ∫_{−a}^{b} δ(x) dx = 1.        (2.116)

That is, a symmetric function that is zero except at x = 0 where it is infinite, but such
that any integral of δ(x) that fully encloses x = 0 is unity. Then we clearly have, for a
well-behaved function f(x),

∫_{−a}^{b} f(x) δ(x) dx = ∫_{−a}^{b} f(0) δ(x) dx = f(0) ∫_{−a}^{b} δ(x) dx = f(0),        (2.117)

and by change of variables, we can shift the zero of the argument of the δ wherever we like:

∫_{−a}^{b} f(s) δ(s − t) ds = { f(t) for −a < t < b,   0 else. }        (2.118)
How could we possibly make such a function (beyond the maddening definition of
g(s, t) in (2.112))? There are a variety of ways to build a function8 that behaves according
to (2.116). As an example, take a Gaussian
d(x, σ) ≡ ( 1/(σ√(2π)) ) e^{−x²/(2σ²)},        (2.119)
normalized so that its integral over all x is one. The Gaussian becomes more sharply peaked
as σ → 0, while retaining its integral normalization. So one way to achieve δ(x) is to take
the limit of the Gaussian as σ → 0,
δ(x) = lim_{σ→0} d(x, σ) = lim_{σ→0} ( 1/(σ√(2π)) ) e^{−x²/(2σ²)}.        (2.120)

In the end, any tunable sharply peaked function with unit integral will do.
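As a numerical illustration, the sketch below integrates d(x, σ) against a smooth test function (an arbitrary choice) and watches the result approach f(0) as σ shrinks.

```python
import numpy as np

# Sketch: the normalized Gaussian d(x, sigma) of (2.119) acts like delta(x) as
# sigma -> 0: its integral against a smooth test function approaches f(0).
def f(x):
    return np.cos(x) + 0.25 * x**2           # test function, f(0) = 1

x = np.linspace(-5.0, 5.0, 200000, endpoint=False)
dx = x[1] - x[0]
for sigma in (1.0, 0.3, 0.1, 0.03, 0.01):
    d = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    print(sigma, np.sum(f(x) * d) * dx)       # should approach f(0) = 1
```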

Connection to Kronecker Delta


The integral form of the delta function, sketched in Section 2.6.1, is

δ(x − y) = ∫_{−∞}^{∞} e^{i2π(x−y)t} dt.        (2.121)

This form was necessary just to get a consistent Fourier transform/inverse transform pair.
The integral was motivated by looking at the limiting behavior of a truncated (and hence
defined) version. We can also argue for the plausibility of the equality in (2.121) by
appealing to the Kronecker delta, which appears in a well-defined orthogonality integral
for exponentials. Referring to (2.39), we have
 
δjk = (1/T) ∫_0^T e^{i2π(j−k)t/T} dt = (1/T) ∫_{−T/2}^{T/2} e^{i2π(j−k)t/T} dt,        (2.122)

which is similar to (2.121) if we identify x ∼ j/T and y ∼ k/T. Then we could even write9

δ(x − y) = lim_{T→∞} ( T δ_{⌊xT⌋⌊yT⌋} ),        (2.123)

from which it is clear that, at least up to normalization, the Dirac delta function is the
continuum limit of the Kronecker delta.

8 Or, more appropriately, a “distribution.”


9 Here, we are using the “floor” operator that takes a number and rounds it down to the integer below it.

2.6.3 Properties of the Fourier Transform


Given a function p(t) with Fourier transform p̃( f ) from (2.110), we are tempted to interpret
p̃( f ) as the “amount” (or relative amount) of e −i2πft in p(t) from (2.109). That’s fine, except
that p̃( f ) is complex. What does it mean to have “5i” of something? Instead, it is really
the magnitude (squared) |p̃( f )|2 that we think of as giving the fractional amount of the
contribution of e −i2πft to p(t). Fractional, because we don’t know what the integral
P ≡ ∫_{−∞}^{∞} |p̃(f)|² df        (2.124)

is. If it were 1, then |p̃(f)|² df tells us the amount of p(t) that depends on the df-vicinity of f. If P in (2.124) is not one, we can just divide:

( |p̃(f)|² / P ) df        (2.125)
is the amount of p(t) that has frequencies in the vicinity of f.
The “power spectrum” associated with p(t) is just the function P( f ) ≡ |p̃( f )|2 , a density
in frequency space. The total “power” in an interval fa to fb is defined to be
P(fa, fb) ≡ ∫_{fa}^{fb} P(f) df = ∫_{fa}^{fb} |p̃(f)|² df        (2.126)

and again, our interpretation is that P(fa , fb ) captures how much of p(t) depends on the
frequencies in the range fa → fb . The “total power” is obtained by sampling across the
entire frequency range, just P from (2.124) again,
P = P(−∞, ∞) = ∫_{−∞}^{∞} |p̃(f)|² df.        (2.127)

If we write the total power in terms of the original function p(t), using (2.109), we find
P = ∫_{−∞}^{∞} |p(t)|² dt,        (2.128)

so that we can calculate the total power without ever computing the Fourier transform. This
result, that
∫_{−∞}^{∞} |p(t)|² dt = ∫_{−∞}^{∞} |p̃(f)|² df        (2.129)

is an example of “Parseval’s relation” (see [1] for the general case).


If the function p(t) is real, then we can show that there is a relation between the positive
and negative frequencies in the Fourier transform (see Problem 2.6.3) that allows us to
interpret the negative frequencies. Those could otherwise be confusing: if “concert A” is
440 Hz, what note is represented by −440 Hz? Another confounding frequency, the “zero,”
f = 0, value of the Fourier transform,
p̃(0) = ∫_{−∞}^{∞} p(t) dt,        (2.130)

is just the total area under p(t).



Problem 2.6.1 For p(t) = p0 sin(2πjt/T) for integer j, find the Fourier series coefficients,
{ak }∞k=−∞ (stick with the full exponential form). Write your expression for ak in
terms of Kronecker deltas and compare with the Fourier transform of this p(t).
Problem 2.6.2 Find the Fourier transform of p(t) = p0 cos(2πqt) for (real) constants p0 and
q. What is the relationship between p̃( f ) and p̃(−f ) in this case?
Problem 2.6.3 For a real signal p(t) (with p(t)∗ = p(t)) what is the relationship between
the positive and negative frequency values in the Fourier transform? (i.e. what is the
relation between p̃( f ) and p̃(−f )).
Problem 2.6.4 Find the Fourier transform of
p(t) = { 0 for t < 0,   p0 e^{−αt} for t ≥ 0 }
for constant real α. Does the relationship between p̃( f ) and p̃(−f) that you found for
the previous problem hold here? Sketch the power spectrum |p̃( f )|2 .
Problem 2.6.5 The Fourier transform, and in particular the power spectrum, tells us "how much" of a particular frequency f is in our signal p(t). The zero-frequency case is interesting: show that for odd functions p(t), p̃(f = 0) = 0.
Problem 2.6.6 Evaluate the integral
∫_{−∞}^{∞} δ(kt) p(t) dt

for constant k (Hint: use change of variables). Make sure your expression holds for
both positive and negative values of k.
Problem 2.6.7 For a function f(t) with roots at the set of values {tj}_{j=1}^{n}: f(tj) = 0 but with f′(tj) ≠ 0, show that

∫_{−∞}^{∞} p(t) δ(f(t)) dt = Σ_{j=1}^{n} p(tj) / |f′(tj)|.        (2.131)

One way to proceed is to isolate the integrals near the roots of f(t); then you can Taylor expand f(tj + t) for each integral and use your result from the previous problem. Because of its behavior under the integral sign, we sometimes write (2.131) more generally as

δ(f(t)) = Σ_{j=1}^{n} δ(t − tj) / |f′(tj)|.        (2.132)

Problem 2.6.8 The “Heaviside step function” is defined as


θ(t) = { 0 for t < 0,   1 for t > 0 }.        (2.133)

Show that dθ(t)/dt = δ(t) by establishing that the derivative of the step function behaves like the delta function under an integral. Usually, we leave the value of θ(0) undefined, but in order for the derivative to be identified with the Dirac delta function, θ(0) must take on a specific value; what is that value?
Problem 2.6.9 Show that
∫_{−∞}^{∞} p(t) ( d δ(t)/dt ) dt = −p′(0).

Problem 2.6.10 What is the Fourier transform of p(t) = p0 δ(t − t0 ) for constants p0 and t0 ?
What is the power spectrum, |p̃( f )|2 ?
Problem 2.6.11 Show that for a real signal p(t), if p(−t) = p(t), then p̃( f ) is real and also
symmetric, p̃(−f) = p̃( f ).
Problem 2.6.12 For the function
p(t) = { p0 e^{αt} for t < 0,   p0 e^{−αt} for t ≥ 0 },
with α > 0 a real constant, sketch p(t) and find its Fourier transform, p̃( f ). Make a
sketch of p̃( f ). For what values (large or small) of α is p(t) sharply peaked about the
origin? For what values of α is p̃( f ) sharply peaked?

2.7 Fourier Transform and ODEs

Just as the Fourier series can be used to simplify the problem of solving ODEs, we can
use the Fourier transform to do so when the target functions are not periodic (think of the
overdamped motion in this chapter). Take an ODE like the damped harmonic oscillator:
ẍ(t) + ω2 x(t) + 2bẋ(t) = 0. (2.134)
Assume that x(t) has a Fourier transform, x̃(f); then we can write x(t), ẋ(t), and ẍ(t) in terms of x̃(f):

x(t) = ∫_{−∞}^{∞} x̃(f̄) e^{−i2πf̄t} df̄
ẋ(t) = ∫_{−∞}^{∞} ( −i2πf̄ ) x̃(f̄) e^{−i2πf̄t} df̄        (2.135)
ẍ(t) = ∫_{−∞}^{∞} ( −(2πf̄)² ) x̃(f̄) e^{−i2πf̄t} df̄,

and we can insert these into the equation of motion (2.134) to get
∫_{−∞}^{∞} [ −(2πf̄)² + ω² − 4iπf̄b ] x̃(f̄) e^{−i2πf̄t} df̄ = 0.        (2.136)

Multiply this equation by e^{i2πft} and integrate from t = −∞ → ∞ to pick up a δ(f − f̄) under the original f̄ integral.10 Using that delta to turn all f̄s into fs with the f̄ integration, we have an algebraic equation

( ω² − 4iπbf − 4π²f² ) x̃(f) = 0,        (2.137)

which is solved by

f± = ( −ib ± √(ω² − b²) ) / (2π),        (2.138)
10 The usual caveats apply.



and x̃( f ) should be zero everywhere except at these two “frequencies,” where it should
be . . . sharply peaked. The natural form for x̃( f ) is
x̃( f ) = Aδ( f − f+ ) + Bδ( f − f− ) (2.139)
for constants A and B. In this case, we can perform the inverse Fourier transform
x(t) = ∫_{−∞}^{∞} x̃(f) e^{−i2πft} df = A e^{−i2πf+ t} + B e^{−i2πf− t} = e^{−bt} ( A e^{−i√(ω²−b²) t} + B e^{i√(ω²−b²) t} ),        (2.140)
just as we got in (2.10). Of course, the procedure we have just outlined is the motivation
for making exponential guesses as we did in Section 2.1.
If you have a driving term, the starting point is
ẍ(t) + ω² x(t) + 2b ẋ(t) = F(t)/m ≡ a(t),        (2.141)

and we assume that a(t), in addition to x(t), has a Fourier transform,

a(t) = ∫_{−∞}^{∞} ã(f̄) e^{−i2πf̄t} df̄.        (2.142)

Then going through the same process as before, we get


( ω² − 4iπbf − 4π²f² ) x̃(f) = ã(f),        (2.143)

and this time, we don't just pick up two coefficients (unless ã(f) has a special form, like a delta function at a particular frequency, for example). We can still solve for x̃(f) algebraically,

x̃(f) = ã(f) / ( ω² − 4iπbf − 4π²f² ),        (2.144)
but now we need to Fourier transform back to obtain x(t), and the difficulty of that
operation is determined by the form of ã( f ).
Problem 2.7.1 For a charged particle of mass m attached to a spring with spring constant
mω2 , Newton’s second law reads (see [8, 12] for the development of this equation)
mẍ(t) = −mω² x(t) + mτ d³x(t)/dt³
for constant τ. Find the algebraic equation governing the Fourier transform of x(t)
and solve it for x̃( f ).
Problem 2.7.2 The “cross-correlation” of two real functions p(t) and q(t) is defined to be
(p ⋆ q)(τ) ≡ ∫_{−∞}^{∞} p(t) q(t + τ) dt.

This is a measure of the "overlap," as a function of time τ, of the "signals" p(t) and q(t). Show that for c(τ) ≡ (p ⋆ q)(τ), the Fourier transform of c(τ) has c̃(f) = p̃(f)* q̃(f), so that the cross-correlation is just a product on the Fourier transform side.
3 Coupled Oscillators

Suppose we now include additional masses in our mass-on-a-spring problem. If we have


two masses, m1 and m2 at respective locations x1 (t) and x2 (t) (in one dimension) attached
by a spring with spring constant k and equilibrium spacing a as in Figure 3.1, the equations
of motion are
m1 ẍ1 (t) = k(x2 (t) − x1 (t) − a)
(3.1)
m2 ẍ2 (t) = −k(x2 (t) − x1 (t) − a).
It is useful, in this situation, to change variables: since m1 ẍ1 + m2 ẍ2 = 0 (by Newton's third law), the coordinate z ∝ m1 x1 + m2 x2 has z̈ = 0. To give z(t) the appropriate dimension of length, we must divide by a mass, and to be democratic, we'll treat m1 and m2 symmetrically; let

z(t) ≡ ( m1 x1(t) + m2 x2(t) ) / ( m1 + m2 ).        (3.2)
If we define the difference coordinate d(t) ≡ x2 (t) − x1 (t) − a, then the equations of motion
can be combined to give

z̈(t) = 0
d̈(t) = −k ( 1/m1 + 1/m2 ) d(t).        (3.3)

The angular frequency ω can be read off from the equation for d(t):

ω² ≡ k ( 1/m1 + 1/m2 ),        (3.4)
and we know the general solution to the problem is

z(t) = At + B,    d(t) = C cos(ωt) + D sin(ωt),        (3.5)

Fig. 3.1 Two masses (m1 and m2, at positions x1(t) and x2(t)) connected by a spring (spring constant k, equilibrium length a).

with four constants of integration, {A, B, C, D}, just right for a pair of second-order ODEs.
We could set the constants given the initial position and velocity for each mass. The “center
of mass" motion, z(t), and the relative motion d(t) have been decoupled: each has its own ODE in (3.3) that makes no reference to the other. Our goal is to perform that same decoupling
for a system of n masses connected by n − 1 springs, but in order to do this, we need to
review some linear algebra.

3.1 Vectors

As we add masses and springs, we add equations to (3.1), and the language of linear algebra
allows us to think about those equations of motion as a whole. A “vector” is a collection
of n numbers that could be real or complex. For a vector with n real entries, we write
v ∈ IRn , “the vector vee is in arr-enn,” and n is called the “dimension.”1 We can represent
the collection as a vertical list. Given the entries {vi }ni=1 , we write
v =̇ [ v1 ]
     [ v2 ]
     [ ⋮  ]
     [ vn−1 ]
     [ vn ]        (3.6)

where the dot over the equals sign means "is represented by."2
For two vectors v and w in IRn, with entries {vi}_{i=1}^{n} and {wi}_{i=1}^{n}, we define vector addition:

v + w =̇ [ v1 + w1 ]
         [ v2 + w2 ]
         [   ⋮     ]
         [ vn−1 + wn−1 ]
         [ vn + wn ],        (3.7)

so that the components add, leaving us with another vector in IRn (addition of vectors with different dimensions is not defined).
Given a vector v with components {vi}_{i=1}^{n} and a number α ∈ IR, we have "scalar–vector" multiplication defined by

αv =̇ [ αv1 ]
      [ αv2 ]
      [  ⋮  ]
      [ αvn−1 ]
      [ αvn ]        (3.8)
where each component of the vector gets multiplied by the constant α.
1 There are many ways to define the notion of dimension. This one is the most familiar, the number of numbers
you need to specify an element v ∈ IRn .
2 Don’t ask. But, if you insist, see [18].

Finally, we can define a special kind of “vector–vector” multiplication. Given two


vectors v and w, both in IRn , the “dot product” is

v · w ≡ Σ_{i=1}^{n} vi wi.        (3.9)

Then the dot product of a vector with itself is

v · v = Σ_{i=1}^{n} vi²,        (3.10)

which is the square of the "magnitude" or "length" of the vector (in the Pythagorean sense). We write the magnitude of the vector without the bold face, so that the length of v is

v ≡ √(v · v).        (3.11)

Given a collection of k vectors each in IRn , which we’ll denote {vj }k≤n j=1 , and a set of
constants {αj }kj=1 , a general “linear combination” of the vectors can be written as


w = Σ_{j=1}^{k} αj v^j        (3.12)

where w ∈ IRn is a weighted sum of the set {vj }kj=1 .


There is a special set of vectors, called the “canonical basis vectors,” {e j }nj=1 with
the form:
e^1 =̇ [ 1 ]      e^2 =̇ [ 0 ]      . . .      e^n =̇ [ 0 ]
       [ 0 ]             [ 1 ]                        [ 0 ]
       [ 0 ]             [ 0 ]                        [ 0 ]
       [ ⋮ ]             [ ⋮ ]                        [ ⋮ ]
       [ 0 ]             [ 0 ]                        [ 0 ]
       [ 0 ]             [ 0 ]                        [ 1 ]        (3.13)

so that the jth vector, e^j, contains all zeroes, except the jth entry, which is 1. These vectors are "normalized," meaning that they have length 1. They are also "orthogonal,"3 with e^i · e^j = 0 if i ≠ j. We can express the orthonormality using the Kronecker delta to write e^i · e^j = δij in general.
The collection {e^j}_{j=1}^{n} is also "complete" in the sense that any vector v ∈ IRn can be written as a linear combination of the set {e^j}_{j=1}^{n}:

v =̇ [ v1 ]
     [ v2 ]
     [ ⋮  ]   = Σ_{j=1}^{n} vj e^j.        (3.14)
     [ vn−1 ]
     [ vn ]

3 Orthogonality here refers to our geometric sense: the dot product of three-dimensional vectors can be written
in terms of the angle between them. For vectors a and b ∈ IR3 , we have a · b = ab cos θ (see Section 6.1).
For orthogonal vectors, θ = π/2, and a · b = 0.

Such a collection of complete, orthogonal, normalized vectors is called a “basis.” There


are many different basis sets for IRn , with the canonical case being the simplest.

Example in IR2
In two dimensions, the canonical basis has elements
   
e^1 =̇ [ 1 ]      e^2 =̇ [ 0 ].        (3.15)
       [ 0 ]             [ 1 ]

A generic vector in IR2 can be written:

v =̇ [ v1 ]   = v1 e^1 + v2 e^2.        (3.16)
     [ v2 ]
Here is another basis, with vectors that are normalized and orthogonal:
   
b^1 =̇ (1/√2) [ 1 ]      b^2 =̇ (1/√2) [  1 ].        (3.17)
              [ 1 ]                    [ −1 ]

A generic vector can also be written in this basis, although figuring out the coefficients is more difficult,

[ v1 ]  =̇  (1/√2)(v1 + v2) b^1 + (1/√2)(v1 − v2) b^2.        (3.18)
[ v2 ]
I obtained the coefficients in (3.18) by tedious algebra, is there a better way? Sure: we know
v can be written as a linear combination of the basis vectors, so we start by assuming
v = αb1 + βb2 (3.19)
where our target is the pair of constants {α, β}. By orthogonality of basis vectors, we know
that b1 ·b2 = 0, and the basis vectors are normalized, so that b1 ·b1 = b2 ·b2 = 1. Suppose
we take the decomposition (3.19) and dot b1 into both sides
b1 · v = α (3.20)
and similarly, b2 ·v = β, so all we need are the dot products of b1 and b2 with the vector v,
b^1 · v = (1/√2)(v1 + v2) = α,    b^2 · v = (1/√2)(v1 − v2) = β,        (3.21)

precisely the coefficients in (3.18). The dot product b^1 · v tells us "how much" b^1 is in v, and represents the "projection" of v onto b^1. Similar language holds for the other basis vector. The whole process is pretty general: for a basis vector B^k ∈ IRn, the projection of any v ∈ IRn onto B^k, namely B^k · v, gives us the coefficient multiplying B^k in the decomposition of v.
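The projection recipe is one line of code per coefficient. The sketch below uses the basis (3.17) and an arbitrary vector v, recovers the coefficients by dotting, and checks them against (3.21).

```python
import numpy as np

# Sketch (illustrative): recover the coefficients of v in an orthonormal basis by
# projection, alpha_k = b^k . v, and rebuild v from them.
b1 = np.array([1.0, 1.0]) / np.sqrt(2.0)      # the basis of (3.17)
b2 = np.array([1.0, -1.0]) / np.sqrt(2.0)
v = np.array([3.0, -1.5])                     # an arbitrary vector

alpha = np.dot(b1, v)                         # coefficient multiplying b1
beta = np.dot(b2, v)                          # coefficient multiplying b2
print(alpha, (v[0] + v[1]) / np.sqrt(2.0))    # matches (1/sqrt(2))(v1 + v2) from (3.21)
print(beta, (v[0] - v[1]) / np.sqrt(2.0))     # matches (1/sqrt(2))(v1 - v2)
print(alpha * b1 + beta * b2, v)              # the decomposition reproduces v
```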
Problem 3.1.1 Show that the following pair,
   
b^1 =̇ [ cos θ ]      b^2 =̇ [ −sin θ ]        (3.22)
       [ sin θ ]             [  cos θ ]
is a basis for IR2 for any θ (i.e. show that these vectors are orthonormal and complete,
meaning that any vector in IR2 can be written as a linear combination of these two).

Problem 3.1.2 For the vector (written with respect to the canonical basis)

v = e 1 + 2e 2 + 3e 3 , (3.23)

express v in terms of the basis vectors


b^1 = (1/√6) [  1 ]      b^2 = (1/√2) [ −1 ]      b^3 = (1/√3) [ 1 ].        (3.24)
             [ −2 ]                   [  0 ]                    [ 1 ]
             [  1 ]                   [  1 ]                    [ 1 ]

Problem 3.1.3 Given two vectors v and w ∈ IR2 , show that v · w = vw cos θ where θ
is the angle between v and w if they are drawn at a common origin (Hint: without
loss of generality, orient your two-dimensional basis vectors so that one points in the
direction of w).
Problem 3.1.4 For two vectors in IR2 , v and w, with v · w = 0, there is a natural way to
generate a basis: Take b1 ≡ v/v a unit vector pointing in the direction of v, then let
a = w − (w · b1 )b1 . Show that a is perpendicular to b1 . Normalize the second basis
vector, b2 = a/a, and write v and w in the b1 , b2 basis (in that decomposition, you
can use the magnitudes of v and w and the dot-product v · w since those building
blocks are independent of basis).

3.2 Matrices

A “matrix” is a collection of vectors, put side by side. For example, if we take a set of
vectors in IRn: {v^j}_{j=1}^{k≤n}, then we can form a matrix as follows:

M =̇ [ v^1  v^2  · · ·  v^k ] =̇ [ v^1_1  v^2_1  · · ·  v^k_1 ]
                                [ v^1_2  v^2_2  · · ·  v^k_2 ]
                                [   ⋮      ⋮     ⋱      ⋮   ]
                                [ v^1_n  v^2_n  · · ·  v^k_n ].        (3.25)

On the far right, we see the usual rectangular array of numbers normally associated with a
matrix. There are n rows and k columns, and we typically write the indices either both up or
both down. In the present case, I want to highlight the “collection of vectors” interpretation,
but the more typical representation is
M =̇ [ M11  M12  · · ·  M1k ]
     [ M21  M22  · · ·  M2k ]
     [  ⋮    ⋮    ⋱     ⋮  ]
     [ Mn1  Mn2  · · ·  Mnk ],        (3.26)

where Mij is the number appearing in the ith row, jth column, and we write (for matrices
with real entries) M ∈ IRn×k for a matrix with n rows and k columns.

Matrices can be multiplied by numbers: given α ∈ IR and M ∈ IRn×k , we have


αM ≡̇ [ αM11  αM12  · · ·  αM1k ]
      [ αM21  αM22  · · ·  αM2k ]
      [   ⋮     ⋮     ⋱     ⋮  ]
      [ αMn1  αMn2  · · ·  αMnk ],        (3.27)

so that α just multiplies each entry of the matrix.


For two matrices with the same number of rows and columns, addition is defined
componentwise: given M, W ∈ IRn×k
M + W ≡̇ [ M11 + W11    M12 + W12    · · ·    M1k + W1k ]
         [ M21 + W21    M22 + W22    · · ·    M2k + W2k ]
         [     ⋮            ⋮          ⋱          ⋮    ]
         [ Mn1 + Wn1    Mn2 + Wn2    · · ·    Mnk + Wnk ].        (3.28)

Of more utility are the definitions of multiplication of matrices by vectors and other
matrices. We can multiply a matrix M ∈ IRn×k by a vector p ∈ IRk on the right to get a
vector in IRn :
Mp ≡̇ [ Σ_{j=1}^{k} M1j pj ]
      [ Σ_{j=1}^{k} M2j pj ]
      [ Σ_{j=1}^{k} M3j pj ]
      [         ⋮          ]
      [ Σ_{j=1}^{k} Mnj pj ].        (3.29)

If we rearrange this definition a bit, we can make contact with our original "collection of vectors" interpretation from (3.25):

Mp =̇ [ M11 p1 + M12 p2 + M13 p3 + · · · + M1k pk ]
      [ M21 p1 + M22 p2 + M23 p3 + · · · + M2k pk ]
      [ M31 p1 + M32 p2 + M33 p3 + · · · + M3k pk ]
      [                   ⋮                       ]
      [ Mn1 p1 + Mn2 p2 + Mn3 p3 + · · · + Mnk pk ]

    = [ M11 ]      [ M12 ]      [ M13 ]              [ M1k ]
      [ M21 ]      [ M22 ]      [ M23 ]              [ M2k ]
      [ M31 ] p1 + [ M32 ] p2 + [ M33 ] p3 + · · · + [ M3k ] pk        (3.30)
      [  ⋮  ]      [  ⋮  ]      [  ⋮  ]              [  ⋮  ]
      [ Mn1 ]      [ Mn2 ]      [ Mn3 ]              [ Mnk ]

    = Σ_{j=1}^{k} pj v^j,

so that matrix–vector multiplication can be used to generate arbitrary linear combina-


tions of vectors, just as in (3.12), with the coefficients given by the entries of the

vector p. As alternate notation,4 if we let the vector w = Mp, then the entries of
w ∈ IRn are:

k 
k
wi = vij pj = Mij pj for i = 1 → n. (3.31)
j=1 j=1

Moving on to matrix–matrix multiplication: If we have M ∈ IRn×k (n rows, k columns) and Q ∈ IRk×ℓ (k rows, ℓ columns), then W = MQ is in IRn×ℓ with entries given by5

Wij = Σ_{z=1}^{k} Miz Qzj    for i = 1 → n, j = 1 → ℓ.        (3.32)

We can view matrix–vector multiplication as a degenerate case in which Q ∈ IRk×1 (k rows


in 1 column, the usual representation of a column vector). Notice that the definition also
gives us a way to think about vector–matrix multiplication, with a vector on the left – then
M ∈ IR1×k , a “matrix” with 1 row and k columns, a “row” vector, which we sometimes
think of as the “transpose”6 of a column vector. To be concrete, if v ∈ IRk ≡ IRk×1 is a
column vector, then the transpose (denoted with a superscript “T ”) vT ∈ IR1×k is a row
vector. For a matrix Q ∈ IRk×ℓ, we can left-multiply by vᵀ,

w = vᵀQ    has    wi = Σ_{j=1}^{k} vj Qji    for i = 1 → ℓ.        (3.33)

We can also use the transpose of a column vector to represent the usual dot product. For
column vectors v, w ∈ IRn ,

v · w ≡ Σ_{j=1}^{n} vj wj = vᵀw = wᵀv.        (3.34)

Finally, we define the inverse of a matrix M ∈ IRn×n to be the matrix that, when
multiplied by M, returns the “identity” matrix I ∈ IRn×n with ones along the diagonal,
zero everywhere else. We denote the inverse as M−1 , with M−1 M = I.
Matrix multiplication (for both vectors and matrices) is the real workhorse of the
notation in expressing (and allowing us to solve) linear algebraic relationships. For
example, suppose we have the set of coupled, linear equations:
ax + by = c
(3.35)
dx + ey = f
for (provided) constants {a, b, d, e} on the left, and {c, f }, on the right. We want to find the
values of {x, y} that make both equations true (or establish that no set of values allows

4 There are a lot of different ways of talking about linear algebra, and you should get used to as many as possible.
5 Matrix–matrix multiplication is basically just matrix–vector multiplication applied to each column vector of
the matrix on the right.
 a matrix M with Mij the
6 The matrix operation “transpose” takes rows to columns and vice-versa. If you  have
entry associated with the ith row, jth column, then the transpose of M, MT , has MT ij = Mji .

both equations to hold). We can write the set of equations in terms of matrix–vector
multiplication:
    
[ a  b ] [ x ]  =  [ c ].        (3.36)
[ d  e ] [ y ]     [ f ]
  ≡ A      ≡ x     ≡ b

Then the formal solution to Ax = b is, in terms of the “inverse” of A, x = A−1 b, provided
the inverse exists.7
The solution for {x, y}, obtained by algebraic manipulation, is
x = ( ce − bf ) / ( ae − bd )
y = ( af − cd ) / ( ae − bd ).        (3.37)
It is clear from this general solution that if ae − bd = 0, the solution in the form of (3.37),
does not exist. Otherwise, we have the most general solution to the pair of equations
in (3.35). Aside from the solution, we would like a test for the solution’s existence, and
the denominator in (3.37) provides a provocative clue.

Determinant
The determinant of a matrix is a number we assign to a (square, although the notion can be
generalized) matrix. For a two-by-two matrix M ∈ IR2×2 , the determinant is defined to be
det M ≡ M11 M22 − M12 M21 . (3.38)
For M ∈ IR3×3, the determinant is defined recursively, according to
det M = M11 det M̃11 − M12 det M̃12 + M13 det M̃13 (3.39)

where M̃ij is the 2 × 2 matrix obtained by striking out the ith row and jth column of M.
We can continue in this manner to higher and higher dimensional matrices. The recursive
structure allows us to write the determinant, for M ∈ IRn×n as

det M = Σ_{j=1}^{n} (−1)^{j+1} M1j det M̃1j.        (3.40)

For the defining case of two dimensions (and from its extension), the determinant gives
us a way of answering the question: “does a solution to (3.35) exist?” For the matrix A
defined in (3.36), the determinant is
det A = ae − bd. (3.41)
That is precisely the quantity that “determines” whether or not a solution to (3.35) exists
(as is evidenced by the denominators in (3.37)). If a matrix (in any dimension) has a

7 Finding the inverse in general is another whole problem, and one which we will ignore (you can find relevant
methods in [7, 11]) – most numerical packages include a routine to produce the inverse if it exists.

determinant that vanishes, then it is not invertible. The determinant is the bellwether of
invertibility. Prior to solving a set of linear algebraic equations, one should calculate the
determinant of the associated matrix to ensure that a solution exists.
An important property of the determinant, as you will establish in Problem 3.2.2, is “the
determinant of the product of two matrices A and B is the product of the determinants (of
A and B),”
det AB = det A det B. (3.42)
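The recursive definition (3.40) translates directly into code. The sketch below is meant only to illustrate the definition (cofactor expansion is a very inefficient way to compute large determinants); the test matrix is arbitrary.

```python
import numpy as np

def det(M):
    """Determinant by cofactor expansion along the first row, as in (3.40).
    Note: with 0-indexed columns the sign factor (-1)^(j+1) becomes (-1)^j."""
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    if n == 1:
        return M[0, 0]
    total = 0.0
    for j in range(n):
        # Mtilde_{1j}: strike out the first row and the jth column
        minor = np.delete(np.delete(M, 0, axis=0), j, axis=1)
        total += (-1)**j * M[0, j] * det(minor)
    return total

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [1.0, 0.0, 6.0]])
print(det(A), np.linalg.det(A))   # the two should agree
```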

Problem 3.2.1 Given two matrices:


M =̇ [ 1   5 ]      W =̇ [ 4  2 ],
     [ 2  −1 ]           [ 0  1 ]
     [ 3   0 ]
what products can you form out of the matrices themselves and their transposes?
For example, is MW a valid matrix product? How about WM? For the ones that are
valid, evaluate the product.
Problem 3.2.2 Show, for 2 × 2 matrices, that det(AB) = det(A) det(B). As a corollary,
show that det(A−1 ) = 1/det(A).
Problem 3.2.3 What is the relation between det(A) and det(AT )? (work with A ∈ IR2×2
for simplicity, the result holds for all dimensions.)
Problem 3.2.4 For a square matrix M with inverse M−1 , we have M−1 M = I. Show that
MM−1 = I (establishing that M is the matrix inverse of M−1 ).
Problem 3.2.5 For matrices A, B ∈ IRn×n , show that(AB)−1 = B−1 A−1 .
Problem 3.2.6 Show that for matrices A and B both in IRn×n ,(AB)T = BT AT .
Problem 3.2.7 For a matrix M ∈ IRn×n , evaluate the determinant of MT M in terms of the
determinant of M.

3.3 Linear Transformations

Given a vector v ∈ IRn , the most general linear “transformation” that can act on v is
represented by a matrix–vector multiplication
w = Mv (3.43)
for some8 M ∈ IRn×n .
From one point of view, this operation takes the columns of M and forms a linear
combination with the weighting given by the elements of v as in (3.30). But if we think
of the action of M on v, we are taking a vector v and mapping it to a new vector w by
multiplying v by M. There are two important ways that v can be changed into w: rotation

8 We could have M ∈ IRn× but then w and v live in different spaces.



and stretching.9 Under a rotation, the length of the vector is unchanged but the direction
changes. Under a stretch, the direction of the vector is unchanged, while its length increases
or decreases.

3.3.1 Rotation
For a pure rotation, turning v into w, we can enforce length preservation, w · w = v · v
by putting a constraint on the matrix M representing the rotation:
w · w = (Mv)ᵀ Mv = vᵀMᵀMv = v · v        (3.44)
so that we must have MT M = I, the identity matrix, in order for the vectors w and v to have
the same length. A matrix that satisfies MT M = I is called an “orthogonal” matrix, and we
see that in this special case, the inverse of M is just the transpose of M, i.e. M−1 = MT .
If you think of the matrix M in terms of its column vectors,
M =̇ [ v^1  v^2  · · ·  v^n ],        (3.45)

then its transpose has these column vectors as its rows

Mᵀ =̇ [ (v^1)ᵀ ]
      [ (v^2)ᵀ ]
      [   ⋮    ]
      [ (v^n)ᵀ ].        (3.46)

The matrix–matrix product can be viewed as a series of dot-products

MᵀM =̇ [ v^1 · v^1    v^1 · v^2    . . .    v^1 · v^n ]
       [ v^2 · v^1    v^2 · v^2    v^2 · v^3    . . . ]
       [     ⋮            ⋮            ⋮             ]
       [ v^n · v^1    v^n · v^2    . . .    v^n · v^n ].        (3.47)
For an orthogonal matrix, this product is the identity, meaning that vi · vj = δij , the column
vectors of M are orthogonal and normalized to one, the properties that give an orthogonal
matrix its name. These vectors also form a natural basis for the vector space IRn .
As a simple example of a rotation matrix, consider the following 2 × 2:
 
M =̇ [  cos θ   sin θ ]
     [ −sin θ   cos θ ].        (3.48)

If you take a point in the two-dimensional plane, with coordinates x and y, and form the vector

v =̇ [ x ]
     [ y ],        (3.49)
then the new vector w ≡ Mv is rotated clockwise through an angle θ as shown in
Figure 3.2.
9 How about axis inversion?

Fig. 3.2 A vector v and the rotated vector w = Mv, where M is given by (3.48). The vector v is rotated clockwise through
an angle θ to get w.

3.3.2 Eigenvectors
A pure stretch relating w to v (again through w = Mv) means that there is a number λ
such that

w = Mv = λv. (3.50)

In this case, the vector w points in the same direction as v (okay, parallel or anti-parallel,
depending on the sign of λ), and the magnitude has changed.
As an example, the matrix
 
M =̇ λ [ 1  0 ]
        [ 0  1 ]        (3.51)

takes any vector v and scales it by λ: Mv = λv. But this matrix is very special in its universality. For most matrices M, there are some vectors that are stretched, and others that are not purely stretched (they may be rotated as well). As an example, take the matrix

M =̇ [ α  0 ]
     [ 0  β ]        (3.52)

for constants α ≠ β. Again thinking of a point in two dimensions with coordinates x and y placed into a vector v as in (3.49), if we multiply by the matrix M, we get

w = Mv =̇ [ αx ]
          [ βy ],        (3.53)

and this w is not parallel to the original v (unless α = β, which we already covered). So we turn the problem around a little, and ask for the vectors that are stretched (and only stretched) under multiplication by M. In this case, the vectors are:

v^1 =̇ [ 1 ]      v^2 =̇ [ 0 ],        (3.54)
       [ 0 ]             [ 1 ]

with Mv1 = αv1 and Mv2 = βv2 . These vectors, v1 and v2 are special, and adapted
to the matrix M, they are called “eigenvectors of M.” Notice that the scaling coefficients,
α and β here, also come from M itself. These coefficients are known as the “eigenvalues
of M.”
The “eigenvalue problem” comes from the observation that not all vectors behave in the
same way under multiplication by M and is defined as follows: given a matrix M ∈ IRn×n ,
find the collection of vectors {vj }nj=1 that are only scaled upon multiplication by M. That is,
we want to find the set of vectors {vj }nj=1 (the “eigenvectors”) and scaling factors {λj }nj=1
(the “eigenvalues”) such that
Mvj = λj vj . (3.55)
The numerical values found in the eigenvectors and eigenvalues are set by the matrix
(linear transformation) M.
Solving for the eigenvalues and eigenvectors is not easy. The idea is to take the defining
equation: Mvj = λj vj and rewrite it with the unknown eigenvalue on the left:
(M − λj I) vj = 0, (3.56)
where I is again the n × n identity matrix. If the matrix M − λj I is invertible, then the
only way this equation can be satisfied is by vj = 0, giving us no eigenvectors (this is the
situation for most values λj could take). We are interested in values of λj that make M−λj I
un-invertible. In that case, there will be nontrivial vectors that satisfy (M − λj I)vj = 0. So
we want the determinant of M − λj I to be zero, and we have an equation involving only
λj (and the entries of M):
det(M − λj I) = 0. (3.57)
Given M, this is a polynomial of degree n in λj and so has n roots, there are n values that
λj can take on to satisfy (3.57). We find each of those, somehow,10 and then plug each back
in to Mvj = λj vj to find the vector vj .
As an example, take
 
M =̇ [ a  b ],        (3.58)
     [ c  d ]

then

det(M − λj I) = λj² − λj(a + d) + (ad − bc) = 0        (3.59)

with two solutions (from the quadratic formula),

λj = (1/2) ( a + d ± √( a² + 4bc − 2ad + d² ) ).        (3.60)

Taking each of these in turn, we solve for x and y in

[ a  b ] [ x ]  =  λj [ x ]        (3.61)
[ c  d ] [ y ]        [ y ]

10 This can be done using a numerical root-finding procedure as described in Section 8.1, but an extension of the
procedure sketched in Section 8.5.3 is more commonly used.

to get v1 and v2 . Notice that the eigenvalue equation in (3.55) puts no constraint on the
length of the vj (multiply both sides of (3.55) by a constant α and let αvj → vj , scaling the
eigenvector does not change the eigenvalue), and it is typical to normalize the eigenvectors
to have unit length.
We can write equation (3.55) as a matrix–vector equation, for j = 1 → n, by making
a matrix V whose columns are the individual vectors vj . That matrix will be11 V ∈ Cn×n
(an n × n matrix with complex entries),
V =̇ [ v^1  v^2  · · ·  v^n ].        (3.62)

Then the product MV is a matrix, and we can write it in terms of V itself by defining the diagonal matrix L ∈ IRn×n, with the eigenvalues along its diagonal, zeroes elsewhere,

L ≡̇ [ λ1   0    0    · · · ]
     [ 0    λ2   0    · · · ]
     [ 0    0    λ3    ⋱   ]
     [ ⋮    ⋮    ⋱     ⋱   ].        (3.63)
We can, finally, write (3.55) as a collection of equations, expressed as a single matrix
equation:
MV = VL. (3.64)
If V is invertible, we can write:
V−1 MV = L (3.65)
and we say that we have “diagonalized” the matrix M (we can also turn the equation around
to write M = VLV−1 ).
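The whole chain, eigenvalues, eigenvectors, and the diagonalization (3.65), is easy to check numerically. The sketch below uses numpy's eigensolver on an arbitrary 2 × 2 matrix and compares with (3.60).

```python
import numpy as np

# Sketch (illustrative): check M V = V L and V^{-1} M V = L for an arbitrary matrix.
M = np.array([[1.0, 2.0],
              [3.0, 2.0]])

lam, V = np.linalg.eig(M)   # eigenvalues, and a matrix whose columns are eigenvectors
L = np.diag(lam)

print(np.allclose(M @ V, V @ L))                  # (3.64)
print(np.allclose(np.linalg.inv(V) @ M @ V, L))   # (3.65)

# compare with the 2 x 2 formula (3.60), a = 1, b = 2, c = 3, d = 2
disc = np.sqrt(1.0**2 + 4 * 2.0 * 3.0 - 2 * 1.0 * 2.0 + 2.0**2)
print(np.sort(lam), np.sort(0.5 * np.array([3.0 + disc, 3.0 - disc])))
```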

3.3.3 Symmetric Matrices


We’ll end our discussion of matrices by proving two special properties of “symmetric”
matrices. A symmetric matrix is one whose transpose is equal to itself: MT = M. What
we will show is that if a matrix is symmetric, then all of its eigenvalues are real, and its
eigenvectors are orthogonal: vj · vk = δjk .
Suppose M is a square symmetric matrix with real entries. Then if we take the eigenvalue equation and dot (v^j)* (the complex conjugate of the jth eigenvector) into both sides, we get:

( (v^j)* )ᵀ M v^j = λj ( (v^j)* · v^j ),        (3.66)

where on the left we are using the fact that (v^j)* · (Mv^j) = ( (v^j)* )ᵀ M v^j by the definition of dot product (see (3.34)). On the right, it is clear that (v^j)* · v^j is real, since this is just the

11 Even for matrices with real entries, the eigenvalues and eigenvectors could be complex (that’s where the
fundamental theorem of algebra, ensuring n solutions to det(M − λI) = 0, lives), so we have to expand our
target space.

length (squared) of a complex vector. When dealing with complex vectors and matrices,
we often end up conjugating and transposing, and the combination is given its own symbol:
(v^j)† ≡ ( (v^j)* )ᵀ,        (3.67)

so that we would write (3.66) as

(v^j)† M v^j = λj (v^j)† v^j.        (3.68)

If we take the transpose and conjugate of (3.55), then

(v^j)† Mᵀ = λj* (v^j)†        (3.69)

and we can multiply by v^j on the right to get:

(v^j)† Mᵀ v^j = λj* (v^j)† v^j.        (3.70)

Subtracting (3.70) from (3.68) gives

(v^j)† ( M − Mᵀ ) v^j = ( λj − λj* ) (v^j)† v^j.        (3.71)

For symmetric matrices with Mᵀ = M, the left-hand side is zero. Assuming v^j ≠ 0, the only way the right-hand side can be zero is if λj* = λj, so that the eigenvalue is a real number.
Moving on to orthogonality, we use a similar approach. This time, we’ll start with two
different eigenvectors, vj and vk , with
Mvj = λj vj
(3.72)
Mvk = λk vk ,
and assume λj ≠ λk. Multiply the top equation by (v^k)† on the left, take the conjugate transpose of the equation for v^k (noting the already established reality of the eigenvalue) and multiply by v^j on the right to get the two equations

(v^k)† M v^j = λj (v^k)† v^j
(v^k)† Mᵀ v^j = λk (v^k)† v^j.        (3.73)

Subtracting bottom from top, we get an equation similar to (3.71),

(v^k)† ( M − Mᵀ ) v^j = ( λj − λk ) (v^k)† v^j,        (3.74)

with the left-hand side vanishing since M = Mᵀ. For the right-hand side to vanish, we either need λj = λk, which we have already excluded, or12

(v^k)† v^j = v^k · v^j = 0,        (3.75)
so that we can conclude that the eigenvectors are orthogonal.13 Since eigenvectors can be
normalized to unity, we have
vj · vk = δjk , (3.76)

12 Since the entries of M are real, and the λj are real, the eigenvectors will have real entries.
13 Even for λ k = λ j , we can pick orthogonal eigenvectors, see Problem 3.3.10.

and we have a collection of n orthogonal unit vectors. These can be used as a “natural” basis
for IRn . Evidently, symmetric matrices have a built-in preferred basis. It is this property that
we will use to solve our coupled oscillator problem.
When we have a symmetric matrix, the matrix form of the eigenvector problem
from (3.64) has V with columns that are orthogonal vectors. Then it is clear that VT V = I
and the transpose of V is the inverse of V (whence V is an orthogonal matrix). That makes
it easy to factor M into a product of orthogonal and diagonal matrices,

M = VLVT . (3.77)

Example: Image Compression


The factorization of a symmetric matrix as in (3.77) motivates a typical type of compres-
sion for matrices. Roughly, the importance of an eigenvector in the decomposition of M is
related to the size of the associated eigenvalue magnitude. The larger the eigenvalue, the
more its eigenvector contributes to M (see Problem 3.3.17). Assume that the eigenvalues
in L have been ordered so that |λ1 | > |λ2 | > · · · > |λn |, and the matrix V respects that
ordering (so that v1 , the first column, goes along with λ1 , etc.).
Let the truncated matrix Vk ∈ IRn×k be defined to contain, as its columns, the first k
eigenvectors of M. Similarly, take Lk ∈ IRk×k to be the square matrix with λ1 → λk as its
diagonal entries, all else are zero. Then we can define the “truncated” Mk ≡ Vk Lk VTk , and
it is this matrix that we hope will approximate M for some value of k. Obviously, if k = n,
we recover M, we can write Mn = M. As it turns out (see, for example, [7]), the error that
is made is bounded by the sum of the unapproximated eigenvalues:

‖M − Mk‖² ≤ Σ_{j=k+1}^{n} |λj|².        (3.78)

How does compression factor in here? If you had to store the entries of M, you’d
need to hold onto ∼ n2 /2 values, or roughly n2 pieces of data. If you just take the first k
eigenvectors, you have to store ∼ n × k values plus the eigenvalues themselves, so you end
up with ∼ n × k + k ∼ n × k. If k  n, then you’ve traded memory for reconstruction effort
(you actually have to perform the multiplications necessary to generate Mk = Vk Lk VTk ).
Finally, the connection to images. A matrix that has values from 0 → 1 can be interpreted
as a grayscale image (with 0 associated with black, 1 with white). You can visually analyze
the error of truncation on a particular image. As an example, in Figure 3.3, we can see the
“exact” image on the right, with approximate truncations at k = 10, 50, and 100 (out of
10,000 total eigenvectors) from left to right.
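Here is a sketch of the truncation itself, using a random symmetric matrix in place of an actual image; it builds Mk = Vk Lk Vkᵀ for several k and compares the error with the bound in (3.78) (for the Frobenius norm used here, the two coincide).

```python
import numpy as np

# Sketch: truncated eigen-decomposition M_k = V_k L_k V_k^T of a symmetric matrix,
# with eigenvalues ordered by decreasing magnitude. A random symmetric matrix
# stands in for the grayscale image of the example.
rng = np.random.default_rng(3)
A = rng.random((100, 100))
M = 0.5 * (A + A.T)                       # symmetrize

lam, V = np.linalg.eigh(M)                # eigh: symmetric eigenproblem, real output
order = np.argsort(-np.abs(lam))          # sort by |lambda|, largest first
lam, V = lam[order], V[:, order]

for k in (5, 20, 50, 100):
    Mk = V[:, :k] @ np.diag(lam[:k]) @ V[:, :k].T
    err = np.linalg.norm(M - Mk)          # Frobenius norm of the leftover
    bound = np.sqrt(np.sum(lam[k:]**2))   # square root of the sum in (3.78)
    print(k, err, bound)
```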

Problem 3.3.1 Find the eigenvalues of the matrix


 
A =̇ [ 1  −10 ].
     [ 1   −1 ]
The lesson here is that just because a matrix has real entries doesn’t mean its
eigenvalues are real.

Fig. 3.3 From left to right, the image views of the grayscale matrices M10 , M50 , M100 , and M10000 (the “exact”image). Image
by Hugh Hochman.

Problem 3.3.2 Using the two-dimensional rotation matrix from (3.48), find the mapping of
the points x = 1, y = 0 and x = 0, y = 1 under the rotation with θ = π/2. Do they
make sense?
Problem 3.3.3 Verify, for the rotation matrix (3.48), that MT M = I. What is the geometrical
significance of the matrix MT when viewed as a transformation (i.e. what is the
relation between v and w = MT v)?
Problem 3.3.4 Find the eigenvectors for the matrix M in (3.58) (don’t worry about
normalization).
Problem 3.3.5 The “trace” of a matrix is the sum of its diagonal entries. Show, for the 2 × 2
matrix in (3.58) that

tr(M) ≡ Σ_{j=1}^{2} Mjj = Σ_{j=1}^{2} λj

where λj is the jth eigenvalue. This property, that the trace of a matrix is equal to the
sum of its eigenvalues, is true in any dimension.
Problem 3.3.6 Suppose you have decomposed a matrix M ∈ IRn×n into its eigenvalues and
eigenvectors as in (3.65) which can also be written as M = VLV−1 . Show that
det M = Π_{j=1}^{n} Ljj,

so that the determinant of a matrix is equal to the product of its eigenvalues.


Problem 3.3.7 Show that if v is an eigenvector of a matrix A ∈ IRn×n , with associated
eigenvalue λ, then w ≡ αv (for constant, real α = 0) is also an eigenvector of A.
What is the eigenvalue of αv?
Problem 3.3.8 Show that if two eigenvectors v and w of the matrix A ∈ IRn×n have the
same eigenvalue λ, then the linear combination z = av + bw (for real constants
a and b) is also an eigenvector with eigenvalue λ. What about z if v and w are
eigenvectors with different eigenvalues, is z an eigenvector in that case?
Problem 3.3.9 Find the eigenvalues and eigenvectors of the matrix
 
A =̇ −κ² [ −1   1 ],
          [  1  −1 ]
normalize the eigenvectors, so that each has unit length.

Problem 3.3.10 Suppose you have two unit vectors v and w which are both eigenvectors
of a symmetric matrix A ∈ IRn×n with the same eigenvalue λ, so that: Av = λv
and Aw = λw, and further assume that v = ±w. Show that I can pick two (unit)
eigenvectors, p and q that both have eigenvalue λ, but that are also orthogonal:
p · q = 0. This observation, together with the notion that all eigenvectors with
eigenvalue not equal to λ are orthogonal to both v and w (and hence to p and q)
ensures that we can form a complete orthonormal basis out of the eigenvectors of A
even if some of the eigenvectors share the same eigenvalue.
Problem 3.3.11 There is a continuous form of the eigenvalue problem, where the linear
operators (represented by matrices for the vector eigenvalue problem) are now
differential operators. The simplest case is:
d²f(x)/dx² = −ω² f(x),

where d²/dx² is the linear operator, f(x) is the "eigenfunction," and −ω² is its associated eigenvalue. Find f(x) and −ω² given the boundary conditions: f(0) = 0 and f(L) = 0

for constant L. Since this is a continuous problem, we expect there to be an infinite


number of eigenvalues and eigenfunctions (i.e. make sure you get solutions other
than f(x) = 0).
Problem 3.3.12 For a matrix A ∈ IRn×n with A = Aᵀ, suppose we construct a matrix with the unit (i.e. length 1) eigenvectors of A as its columns, V ≡̇ [ v^1  v^2  · · ·  v^n ].
What is the determinant (up to sign) of V ∈ IRn×n ?
Problem 3.3.13 For a symmetric matrix A ∈ IRn×n what is the relationship between the
eigenvalues of A and the eigenvalues of αA for constant (real) α?
Problem 3.3.14 For M ∈ IRn×n , show that if M is antisymmetric, MT = −M, its
eigenvalues have no real part. Show that the eigenvectors have the property: vT v =
0 – note the lack of conjugation in this product – these vectors do have v† v = 1, so
they are normalized in the appropriate complex sense.
Problem 3.3.15 Show that any matrix M ∈ IRn×n can be written as a symmetric matrix plus
an antisymmetric matrix.
Problem 3.3.16 Given two matrices A and B ∈ IRn×n , suppose there is a vector v ∈
IRn (with v ≠ 0) that is an eigenvector of both matrices, but with different
eigenvalues: Av = λv, Bv = σv with σ ≠ λ. Show that it must be the case
that det(AB − BA) = 0.
Problem 3.3.17 In this problem, we will establish (3.78) for diagonal matrices. Let D be a
diagonal matrix in IRn with nonzero entries Dii = di for i = 1 → n. Then Dk is the
diagonal matrix with all entries above the kth one set to zero. Let the elements of r
be ri ≡ (D − Dk )ii for i = 1 → n (i.e. the diagonal elements of the matrix D − Dk ),
show that

$$r^2 \le \sum_{j=k+1}^{n} r_j^2 = \sum_{j=k+1}^{n} \lambda_j^2 \qquad (3.79)$$

where λj is the jth eigenvalue of D.



3.4 Free Oscillator Chain

Suppose we have a set of n masses connected by springs all in a row, a few are shown in
Figure 3.4. Each spring has equilibrium length a and spring constant k and we’ll take all
the masses to have the same mass m. If we denote the position of the jth mass by xj (t), then
the equations of motion look like
$$\begin{aligned}
m\ddot{x}_1(t) &= k(x_2(t) - x_1(t) - a) \\
m\ddot{x}_2(t) &= -k(x_2(t) - x_1(t) - a) + k(x_3(t) - x_2(t) - a) = k(x_1(t) - 2x_2(t) + x_3(t)) \\
&\;\;\vdots \\
m\ddot{x}_j(t) &= k(x_{j-1}(t) - 2x_j(t) + x_{j+1}(t)) \\
&\;\;\vdots \\
m\ddot{x}_n(t) &= -k(x_n(t) - x_{n-1}(t) - a). \qquad (3.80)
\end{aligned}$$
It is interesting that only the first and last masses have equations of motion that depend on
the equilibrium spacing. Defining κ2 ≡ k/m, we can write these equations as
$$\begin{aligned}
\ddot{x}_1(t) &= \kappa^2(x_2(t) - x_1(t) - a) \\
\ddot{x}_j(t) &= \kappa^2(x_{j-1}(t) - 2x_j(t) + x_{j+1}(t)) \quad \text{for } j = 2 \to n-1 \qquad (3.81)\\
\ddot{x}_n(t) &= \kappa^2(-x_n(t) + x_{n-1}(t) + a).
\end{aligned}$$

Now we get to the point: The equations of motion can be expressed in terms of matrices
and vectors. Define X(t) ∈ IRn to be the vector with the unknown xj (t) as its jth entry, then
the set (3.81) can be written

Ẍ(t) = −Q(X(t) − A) (3.82)

where the matrix Q ∈ IRn×n has the form


$$Q \,\dot=\, -\kappa^2 \begin{pmatrix}
-1 & 1 & 0 & 0 & 0 & \cdots \\
1 & -2 & 1 & 0 & 0 & \cdots \\
0 & 1 & -2 & 1 & 0 & \cdots \\
0 & & \ddots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & 1 & -2 & 1 \\
0 & \cdots & 0 & 0 & 1 & -1
\end{pmatrix}, \qquad (3.83)$$

Fig. 3.4 A chain of identical masses connected by identical springs. The location of the jth mass at time t is xj (t).

and the vector QA ∈ IRn has two nonzero entries, its first, −κ2 a and last, κ2 a. Notice that
the matrix Q is symmetric, so we know it has real eigenvalues and orthogonal eigenvectors,
and we can decompose it into orthogonal matrices multiplying a diagonal one: Q = VLVT ,
where the columns of V are the eigenvectors of Q and the diagonal entries of L the
eigenvalues, {λj }nj=1 .
Following our single, one-dimensional spring solution from Section 1.1, let Y(t) ≡
X(t) − A, so that the equations of motion are

Ÿ(t) = −QY(t) (3.84)

and using the decomposition of Q, we have


$$\frac{d^2}{dt^2}\, Y(t) = -V L V^T Y(t) \;\longrightarrow\; \frac{d^2}{dt^2}\left(V^T Y(t)\right) = -L\left(V^T Y(t)\right) \qquad (3.85)$$
where we multiplied both sides by VT (the inverse of V) and noted that V is time-
independent (so we could slip it through the temporal derivatives). Now let Z(t) ≡
VT Y(t), and the equation of motion is Z̈(t) = −LZ(t). Remember that L is diagonal,
so that we have effectively “decoupled” the equations of motion, each one has the form

z̈j (t) = −λj zj (t) (3.86)

for j = 1 → n. We can now solve each equation of motion separately:


 
$$z_j(t) = p_j \cos\!\left(\sqrt{\lambda_j}\, t\right) + q_j \sin\!\left(\sqrt{\lambda_j}\, t\right) \qquad (3.87)$$

for constants pj and qj . With Z(t) in hand, we can work our way backwards

Z(t) = VT Y(t) = VT (X(t) − A) −→ X(t) = VZ(t) + A (3.88)

and from here, we can set the constants {pj }nj=1 and {qj }nj=1 using the initial position and
velocity of each mass.
That’s it, in theory – we construct Q, find the eigenvalues and eigenvectors, and use
those to take the de-coupled solution for Z(t) back to X(t) setting constants of integration
as we go. We’ll work out the case for the pair of masses that we solved explicitly at the
start of the chapter using this new formulation to see how it all works. That process will
also provide some insight into the physical meaning of the eigenvectors.
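Before specializing to two masses, here is a minimal numerical sketch of that recipe, assuming NumPy is available; the chain size, value of κ², spacing, and initial displacements below are arbitrary illustrative choices, not notation fixed by the text.

```python
import numpy as np

n, kappa2, a = 5, 1.0, 1.0

# Q from (3.83): kappa^2 times a tridiagonal matrix with -1 off the diagonal,
# 2 on the interior diagonal, and 1 at the two free ends.
Q = kappa2 * (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
Q[0, 0] = Q[-1, -1] = kappa2

A = a * np.arange(1, n + 1)               # equilibrium positions e_j = j a
lam, V = np.linalg.eigh(Q)                # Q = V L V^T, columns of V are the eigenvectors

X0 = A + 0.1 * np.sin(np.arange(n))       # some initial displacements, starting from rest
Xdot0 = np.zeros(n)

p = V.T @ (X0 - A)                        # mode amplitudes Z(0), from (3.88) at t = 0
s = V.T @ Xdot0                           # mode velocities

def X(t):
    """X(t) = V Z(t) + A with each z_j(t) from (3.87); the zero mode drifts linearly."""
    z = np.empty(n)
    for j in range(n):
        if lam[j] > 1e-12:
            w = np.sqrt(lam[j])
            z[j] = p[j] * np.cos(w * t) + (s[j] / w) * np.sin(w * t)
        else:
            z[j] = p[j] + s[j] * t
    return V @ z + A

print(np.allclose(X(0.0), X0))            # True: the decomposition reproduces X(0)
```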

3.4.1 Two Masses


For two masses and one spring, the matrix Q is
 
$$Q = -\kappa^2 \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad (3.89)$$

with eigenvalues λ1 = 2κ2 = 2k/m and λ 2 = 0 from Problem 3.3.9. The first eigenvalue
is precisely the ω2 we had in (3.4) for equal masses, so we associate this eigenvalue with
the oscillatory motion. The second eigenvalue is zero, and taken as a frequency this would
correspond to motion that has an infinite period, code for nonoscillatory motion. Recall
84 Coupled Oscillators

from our previous solution to this problem that the center of mass motion is not periodic,
so we expect it to be associated with this eigenvalue.
The eigenvectors that go along with these eigenvalues are
   
$$v^1 \,\dot=\, \frac{1}{\sqrt{2}} \begin{pmatrix} -1 \\ 1 \end{pmatrix} \qquad v^2 \,\dot=\, \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \qquad (3.90)$$

Suppose we had set up our initial conditions so that the solution for X(t) was parallel to v1 – there would be an oscillatory portion out front that had frequency governed by √(2k/m) and the two masses would be moving in opposite directions relative to one another. For
X(t) parallel to v2 , we see that the motion will always be in the same direction (if the first
mass moves to the right with unit magnitude, so does the second), and this represents the
center of mass motion. These two different types of motion are orthogonal, and the general
solution can be made up of a linear combination of the pair (with appropriate coefficients
out front). In the end, we reproduce the solution from (3.5).

3.4.2 Three Masses


Things get more interesting if we introduce a third mass. Now the matrix Q looks like
$$Q \,\dot=\, -\kappa^2 \begin{pmatrix} -1 & 1 & 0 \\ 1 & -2 & 1 \\ 0 & 1 & -1 \end{pmatrix} \qquad (3.91)$$

with eigenvalues

$$\lambda_1 = 3\kappa^2 \qquad \lambda_2 = \kappa^2 \qquad \lambda_3 = 0 \qquad (3.92)$$

and associated (normalized) eigenvectors


$$v^1 \,\dot=\, \frac{1}{\sqrt{6}} \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} \qquad v^2 \,\dot=\, \frac{1}{\sqrt{2}} \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \qquad v^3 \,\dot=\, \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \qquad (3.93)$$

Once again, the eigenvectors describe the relative motion of the masses, and to each is associated the frequency √λj. For the first eigenvector, we have the end masses moving one way while the middle mass moves in the opposite direction; this occurs with frequency ω = √3 κ. For the second eigenvector, with frequency κ, the outer masses move opposite
one another (similar to the first eigenvector in (3.90) for the two-mass problem). The
third eigenvector is again the center of mass one, with no frequency. Notice that the set
{v1 , v2 , v3 } form a basis for IR3 (this is to be expected since Q is symmetric), so any
relative motion between the masses can be expressed as a linear combination of the basis
motions shown in Figure 3.5.
Setting the initial conditions for the masses sets the admixture of the eigenvectors with
connected frequencies in the solution. The eigenvectors that describe the characteristic
relative motion are called “normal modes,” and finding them amounts to solving the
problem of motion.
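As a quick numerical check of (3.92) and (3.93), the following sketch (assuming NumPy is available; κ is set to 1 here as an arbitrary choice) builds the 3 × 3 matrix Q and asks for its eigenvalues and eigenvectors; the columns returned agree with the v's above up to ordering and overall sign.

```python
import numpy as np

kappa2 = 1.0                                   # kappa^2, an illustrative value
Q = -kappa2 * np.array([[-1.0,  1.0,  0.0],
                        [ 1.0, -2.0,  1.0],
                        [ 0.0,  1.0, -1.0]])   # the matrix (3.91)
lam, V = np.linalg.eigh(Q)                     # symmetric: real eigenvalues, orthonormal eigenvectors
print(np.round(lam, 6))    # [0, 1, 3] * kappa^2, i.e. (3.92) in ascending order
print(np.round(V, 3))      # columns match (3.93) up to ordering and overall sign
```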


Fig. 3.5 The relative motion described by the eigenvectors in (3.93).

3.4.3 Normal Modes


Putting a system into one of its normal modes can be done by appropriate choice of initial
conditions. The advantage of running an oscillator chain in a particular normal mode is
that only one frequency of oscillation will govern the motion. There are a variety of ways
to set up the excitation of a single mode, and we’ll take the simplest, in which the system
starts from rest with tuned initial displacements, a configuration that is easy to establish in
a classroom demonstration.
For a system of n masses, and the usual description of motion given by X(t), we’ll set
Ẋ(0) = 0, the masses start from rest. In the diagonalized space of motions represented by
the vector Z(t) with entries from (3.87), starting from rest means setting qj = 0 for all j.
The real-space positions can be recovered from (3.88), X(t) = VZ(t) + A, which can be
written as the sum of the columns of V weighted by the entries of Z,

$$X(t) = A + \sum_{j=1}^{n} z_j(t)\, v^j = A + \sum_{j=1}^{n} p_j \cos\!\left(\sqrt{\lambda_j}\, t\right) v^j. \qquad (3.94)$$
In terms of the initial values, if we define the vector $P \,\dot=\, \begin{pmatrix} p_1 & p_2 & \cdots & p_n \end{pmatrix}^T$, then
we have

X(0) = A + VP. (3.95)

If our goal is to start in the kth normal mode, then we should set P = αe k where e k ∈ IRn
is the canonical basis vector (with zeroes in all positions except the kth which has a 1) and
α is a constant setting the size of the initial extension. So we can isolate this normal mode
motion by starting with X(0) = X0 = A + Ve k (setting α → 1).
How can we set up such an initial configuration physically (see [9] for some good
examples)? Well, if you just set X(0) = A, the masses start, and remain, in equilibrium.
For X(0) = A + Ve k , the initial extension beyond equilibrium is given by the Ve k = vk
term. You can realize this experimentally by setting the initial displacement according to
vk . Going back to the case of three masses, if you wanted to set up motion in the first mode
(v1 in (3.93)), you’d start with the masses in their equilibrium positions, and move the
outer two masses left (say), and the inner mass twice as far to the right, then release from
rest. This type of solution can be set up using an air track and masses attached by springs.
With some practice, you can achieve a visually clear normal mode motion.


Fig. 3.6 The plot on the left shows the position of three masses that began from rest in the first normal mode. The plot on
the right is the power spectrum of the motion, with a single peak, indicating that the motion occurs with one and
only one frequency.

If started correctly using extension from vk , you end up with a solution that looks like

$$X(t) = A + \alpha \cos\!\left(\sqrt{\lambda_k}\, t\right) v^k. \qquad (3.96)$$

The defining property of the normal mode is its single frequency of oscillation applied to a collective motion of the masses. This means that if you looked at any mass in the chain, you would see the frequency f = √λk/(2π), or no motion at all, since there are normal modes
in which a mass remains stationary, like the second mass in v2 from (3.93). In Figure 3.6,
we see the motion in time of all three masses (left) started off in the first normal mode, and
on the right, the power spectrum associated with the motion of the first mass. The outer
pair of masses are in phase and have the same magnitude, while the central mass is exactly
out of phase with twice the magnitude of the outer pairs as indicated by the eigenvector
v1 in (3.93). Note that the peak in the power spectrum always occurs at the same spot
regardless of which mass we use to make the power spectrum, but the peak height can
change based on which mass we use.
If we combine two normal modes by taking P = αe k + βe j for constants α and β,
and use it to set X0 = A + VP, then we get motion that is a linear combination of the
two modes. This time, there will be two frequencies present in the spectrum of most of
the masses. Using α = β = 1/2 and taking k = 1, j = 2, we can again plot the motion
of the three masses, and the power spectrum associated with the first mass (note that the
second mass has frequency due only to v1 since v2 does not move the second mass at all).
The motion and spectrum for this case are shown in Figure 3.7. In the power spectrum,
the first peak is the new one associated with v2 . The second peak occurs at the same
location as the peak in the power spectrum from Figure 3.6, where it is caused by the v1
motion.
The point is that if you have a time series of the motion of a single mass, you can
actually determine quite a bit about the system by looking at the power spectrum of the
motion. That spectrum gives the characteristic normal mode frequencies from the physical
configuration (although again, the heights of the various peaks depend on the particular
mass that is sampled).
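To see the spectrum-reading idea concretely, here is a sketch assuming NumPy is available; the sampling interval, number of samples, and equal mode amplitudes are arbitrary illustrative choices. It builds the motion of the first mass from a mix of the two oscillatory modes of the three-mass chain and takes the power spectrum with an FFT.

```python
import numpy as np

kappa2 = 1.0
Q = -kappa2 * np.array([[-1., 1., 0.], [1., -2., 1.], [0., 1., -1.]])
lam, V = np.linalg.eigh(Q)                 # eigenvalues ascending: 0, kappa^2, 3 kappa^2

p = np.array([0.0, 0.5, 0.5])              # excite the two oscillatory modes equally, from rest
dt, N = 0.05, 4096                         # arbitrary sampling choices
t = dt * np.arange(N)

# Motion of the first mass: x_1(t) = e_1 + sum_j p_j cos(sqrt(lambda_j) t) V[0, j], with a = 1.
x1 = 1.0 + sum(p[j] * np.cos(np.sqrt(lam[j]) * t) * V[0, j] for j in range(3))

power = np.abs(np.fft.rfft(x1 - x1.mean()))**2
f = np.fft.rfftfreq(N, dt)
print(f[np.argmax(power)], np.sqrt(lam[1]) / (2 * np.pi))
# the tallest peak sits at the lower mode frequency sqrt(kappa^2)/(2 pi);
# plotting power against f also shows the second peak at sqrt(3 kappa^2)/(2 pi)
```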


Fig. 3.7 The plot on the left shows the position of three masses that began from rest in a mixture of the first and second
normal modes. The plot on the right is the power spectrum of the motion, with two peaks this time, the second
peak associated with the first normal mode, the first peak with the second normal mode.


Fig. 3.8 A chain of masses connected to walls separated by a distance L.

Problem 3.4.1 We know how to set up the chain so that it starts from rest with relative
displacements from one of the normal modes. But we can also select a normal
mode by setting the chain to be in equilibrium initially with a tuned initial velocity.
Find the starting velocity Ẋ(0) that excites oscillation in the kth normal mode. This
approach to setting up a particular normal mode oscillation is harder to realize as a
demonstration.

3.5 Fixed Oscillator Chain

Suppose we connect the first and last masses of an oscillator chain to a “wall” by a spring
(with constant k and equilibrium length a) as depicted in Figure 3.8. We’ll imagine that the
left edge of the wall is at the origin, and the right edge is at a distance L. All that changes
in the story in the last section is an extra term for the first and last masses, so that their
equations of motion become
mẍ1 (t) = k(x2 (t) − x1 (t) − a) − k(x1 (t) − a) = k(−2x1 (t) + x2 (t))
mẍn (t) = −k(xn (t) − xn−1 (t) − a) + k(L − xn (t) − a) = k(−2xn (t) + xn−1 (t) + L).
(3.97)
Going through the same steps as before, with κ2 ≡ k/m, only the definitions of Q and A
change. Q has a slightly nicer form,

$$Q \,\dot=\, -\kappa^2 \begin{pmatrix}
-2 & 1 & 0 & 0 & 0 & \cdots \\
1 & -2 & 1 & 0 & 0 & \cdots \\
0 & 1 & -2 & 1 & 0 & \cdots \\
0 & & \ddots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & 1 & -2 & 1 \\
0 & \cdots & 0 & 0 & 1 & -2
\end{pmatrix}, \qquad (3.98)$$
and for A, we have
$$QA \,\dot=\, \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ \kappa^2 L \end{pmatrix}. \qquad (3.99)$$
The matrix Q in (3.98) has eigenvalues and eigenvectors that we can calculate
analytically, as it turns out. We want to solve for the jth eigenvalue/vector pair

Qvj = λj vj . (3.100)

Think of the pth row of this vector equation,


 
$$-\kappa^2\left(v^j_{p-1} - 2 v^j_p + v^j_{p+1}\right) = \lambda_j v^j_p. \qquad (3.101)$$

Suppose we assume that


 
$$v^j_k = c_j \sin\!\left(\frac{\pi j k}{n+1}\right) \qquad (3.102)$$
for constants {cj }nj=1 . Then inserting this form into (3.101) gives
  
$$2\kappa^2\left(1 - \cos\!\left(\frac{\pi j}{n+1}\right)\right) = \lambda_j. \qquad (3.103)$$
For n = 2, the eigenvalues, according to this formula, are λ1 = κ² and λ2 = 3κ² (try using the 2 × 2 matrix to find the eigenvalues directly). If we take n = 3, then the formula gives λ1 = (2 − √2)κ², λ2 = 2κ², and λ3 = (2 + √2)κ². Notice that in both of these
cases, we have lost the λ = 0 solution, meaning that there is no drift of the center of mass.
That makes sense, the masses are all attached to each other or a wall now, so the center of
mass, if it moves, must move in an oscillatory fashion.
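The formula (3.103) is easy to check numerically for a modest chain; here is a small sketch assuming NumPy, with n = 6 and κ = 1 as arbitrary choices, comparing the analytic eigenvalues against a direct numerical diagonalization of Q from (3.98).

```python
import numpy as np

n, kappa2 = 6, 1.0
# (3.98) written out: kappa^2 times a tridiagonal matrix with 2 on the diagonal, -1 off it.
Q = kappa2 * (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))

numeric = np.sort(np.linalg.eigvalsh(Q))
analytic = np.sort(2 * kappa2 * (1 - np.cos(np.pi * np.arange(1, n + 1) / (n + 1))))
print(np.allclose(numeric, analytic))     # True: (3.103) matches the direct computation
```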

Problem 3.5.1 For a matrix A ∈ IRn×n , with nonzero entries in the kth row (for 1 < k < n):
Akk−1 = a, Akk = b, and Akk+1 = a (so that the matrix is symmetric) find the n
eigenvalues {λ j }nj=1 and eigenvectors {vj }nj=1 analytically (Hint: start from a guess
like the one in (3.102)).
Problem 3.5.2 For the following signal p(t), sketch the power spectrum (no explicit cal-
culation allowed), |p̃( f )|2 vs. f using appropriate units and values for the frequency
axis, and get the correct relative heights, but don’t worry about the height values
themselves.

Fig. 3.9 Signal for Problem 3.5.2.

Problem 3.5.3 Suppose you have n masses (each with mass m) coupled by identical springs
(with spring constant k) and attached to walls separated a distance L. The chain is
immersed in a viscous medium with coefficient b, so that the jth mass (where j = 0
or n) has equation of motion (let ω 2 ≡ k/m)
ẍj (t) = −ω 2 (−xj−1 (t) + 2xj (t) − xj+1 (t)) − 2bẋj (t).
Find the decoupled equation of motion for the kth normal mode, i.e. what is z̈k (t) =?
(write in terms of the eigenvalues/vectors of the matrix Q, our usual tri-diagonal one
for masses connected by springs).
4 The Wave Equation

We can take our n balls and springs from the last chapter and ask: What happens if we
take n → ∞? We’ll do that carefully, of course, otherwise the answer is meaningless.
In particular, our model will be n balls of identical mass m connected by springs with spring
constant k and equilibrium spacing a. The total mass of the string of balls is M = mn, the
total length at equilibrium is L = (n − 1)a. These bulk physical properties we will leave
fixed as we send n → ∞. Then it is clear that we will also have to send a → 0, the
equilibrium spacing must shrink as well. In addition, the mass of each individual ball must
go to zero like 1/n to keep M fixed.

Spring Constant
There is one more bulk quantity to consider, the “net” spring constant of our chain of balls
and springs. We can think of the springs connected to one another in series, and try to
find the effective spring constant of the chain. We will require that that be constant as we
increase the number of balls (while shrinking down the mass of the individual balls).
Suppose we have a spring with constant K, and we stretch it out to a distance Δx
(with respect to its equilibrium length). The energy stored in the spring is

$$U = \frac{1}{2} K (\Delta x)^2. \qquad (4.1)$$
Where is the energy stored? The spring is the same everywhere, so it is reasonable to assume the energy is spread uniformly throughout the extended spring. If you think of two halves of the spring, half the energy should be stored on the left, half on the right. So the energy in the left half of the spring is Uℓ = U/2. Call the effective spring constant on the left kℓ (same as on the right), then we know that
$$U_\ell = \frac{1}{2} k_\ell \left(\frac{\Delta x}{2}\right)^2 \qquad (4.2)$$
is the energy stored in the left-half spring. Now we can equate this with half of the total energy, U/2, to find an expression for K in terms of kℓ:
$$\frac{1}{8} k_\ell \Delta x^2 = \frac{1}{4} K \Delta x^2 \;\longrightarrow\; K = \frac{k_\ell}{2}. \qquad (4.3)$$
If we divide up the extended spring into n−1 intervals, we expect energy U/(n−1) to be
stored in each interval. Again let k be the effective spring constant of each of the intervals.

Then
$$\frac{1}{2} k \left(\frac{\Delta x}{n-1}\right)^2 = \frac{\frac{1}{2} K \Delta x^2}{n-1} \;\longrightarrow\; K = \frac{k}{n-1}. \qquad (4.4)$$
Turning it around: the chain of n balls connected by n − 1 springs each with constant k has
an effective bulk spring constant of K = k/(n − 1). As we send n → ∞, then, we must take
k → ∞ so as to keep K fixed.

4.1 Continuum Limit

The equation of motion for the jth mass in a chain of masses-connected-by-springs is,
from (3.80),
mẍj (t) = k(xj−1 (t) − 2xj (t) + xj+1 (t)), (4.5)
and we can express this in terms of the fixed M and K
$$\ddot{x}_j(t) = n(n-1)\frac{K}{M}\left(x_{j-1}(t) - 2x_j(t) + x_{j+1}(t)\right). \qquad (4.6)$$
We will assume that the motion of the individual masses is small. The jth mass has
equilibrium location ej ≡ ja, and we will take xj (t) ≈ ej for all times of interest. This
is part of our continuum assumption – we don’t expect the mass that is at equilibrium
at ej to be found far to the right of the mass at equilibrium at ej+1 unless we allow the
individual masses to pass through each other without consequence (not usually allowed
physically). To exploit this “close-to-equilibrium” assumption, define the new relative
separation variable φ(ej , t) ≡ xj (t) − ej so that xj (t) = φ(ej , t) + ej . Putting this into (4.6)
gives1
$$\begin{aligned}
\frac{\partial^2}{\partial t^2}\phi(e_j, t) &= n(n-1)\frac{K}{M}\left(\phi(e_{j-1}, t) - 2\phi(e_j, t) + \phi(e_{j+1}, t)\right) \\
&= n(n-1)\frac{K}{M}\left(\phi(e_j - a, t) - 2\phi(e_j, t) + \phi(e_j + a, t)\right). \qquad (4.7)
\end{aligned}$$
Now ej = ja will eventually be a continuous variable. As n → ∞, a gets smaller,
and the values of ja come closer and closer to describing any point along the x axis (ja
becomes a finer and finer gradation of points along the axis). Call2 x ≡ ej = ja, the point
of equilibrium for the jth mass. We are being general – in the continuum limit, every point
x along the x-axis has an associated massless ball that is in equilibrium at it. So every point
has one of the infinite number of balls lying at it when in equilibrium. This allows us to
write φ(ej , t) = φ(x, t) and provides a language for the Taylor expansion we are about to
perform. With this notation in place, we can write the equation of motion for the jth ball,
now the “ball at equilibrium at x” as
1 The function φ(ej , t) depends on both ej and t, so we’ll turn the dots appearing in (4.6) into partial time
derivatives.
2 Technically, x ≡ lima→0 ej = ja, of course.

$$\frac{\partial^2}{\partial t^2}\phi(x, t) = n(n-1)\frac{K}{M}\left(\phi(x-a, t) - 2\phi(x, t) + \phi(x+a, t)\right) \qquad (4.8)$$
and we are finally ready to Taylor expand for small a (remember that L = (n − 1)a and L
is fixed). Using O-notation,3
 
$$\begin{aligned}
\frac{\partial^2 \phi(x,t)}{\partial t^2} &= n(n-1)\frac{K}{M}\left(\frac{\partial^2 \phi(x,t)}{\partial x^2}\, a^2 + O(a^4)\right) \\
&= \frac{n}{n-1}\,\frac{K L^2}{M}\,\frac{\partial^2 \phi(x,t)}{\partial x^2} + O\!\left(\frac{1}{(n-1)^2}\right). \qquad (4.9)
\end{aligned}$$
And now, finally, for the limit as n → ∞, the term out front goes to 1, and O((n−1)−2 ) → 0
leaving
$$\frac{\partial^2 \phi(x,t)}{\partial t^2} = \frac{K L^2}{M}\,\frac{\partial^2 \phi(x,t)}{\partial x^2}. \qquad (4.10)$$
We have a continuous “spring” with spring constant K, mass M distributed uniformly
along it, and equilibrium length L. Almost any material can be described in this manner,
at least to good approximation. Examples include The Slinky® (obviously), and rods
of material (in which context the stiffness K is known as “Young’s modulus” and can
be measured by stretching the rod and seeing with what force it responds). But more
diffuse materials also have displacements described by (4.10), the density of air in a one-
dimensional column behaves approximately according to (4.10).
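The Taylor expansion step that produced (4.9) can be checked symbolically. Here is a minimal sketch, assuming SymPy is available, using sin(x) as a stand-in for a smooth profile φ at fixed time (any test function would do): the centered second difference has leading term a²φ′′(x), with the first correction at order a⁴.

```python
import sympy as sp

x, a = sp.symbols('x a')
phi = sp.sin(x)                          # smooth test profile standing in for phi(x, t) at fixed t
diff2 = phi.subs(x, x - a) - 2 * phi + phi.subs(x, x + a)
print(sp.series(diff2, a, 0, 6))
# -a**2*sin(x) + a**4*sin(x)/12 + O(a**6): the leading term is a^2 * phi''(x),
# and the first correction enters at order a^4, as used in (4.9)
```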
The constant KL²/M has the dimension of velocity-squared. Define the speed v by v² ≡ KL²/M so that (4.10) becomes
$$-\frac{\partial^2 \phi(x,t)}{\partial t^2} + v^2 \frac{\partial^2 \phi(x,t)}{\partial x^2} = 0, \qquad (4.11)$$
and in this form, we have the “wave equation.” The wave equation shows up all over
the place in physics, with v being set by physically relevant parameters.4 For materials,
it is the mass, length, and Young’s modulus that determines v. For density in air, it is
the temperature and pressure of the ambient environment that determines v. Whatever the
mechanism, this equation governs, at least in approximation, far more of physics than it
has any right to.

Partial Derivatives
I’ll take this opportunity to review the definition and manipulations associated with partial
derivatives. A partial derivative just refers to the derivative of a multivariable function
with respect to one of its variables. As an example, take a function of two variables, f(x, y).
The partial derivative of this function with respect to x is
3 Without being overly formal, an expression like O(ap ) refers to any function whose leading-order contribution
is ap , with higher-order contributions omitted. As an example, take sin θ for θ near zero, then we say that
sin θ = O(θ) since the first term in the Taylor expansion is θ. For cos θ with θ ≈ 0, we have cos θ =
1 + O(θ 2 ).
4 That is, not all instantiations of the wave equation can be associated with masses connected by springs in
some sort of limit. It arises from totally different physical configurations in, for example, Maxwell’s theory of
electromagnetism, in which case the “speed” v is set by the geometry of space–time. See Chapter 7 for more
examples and extensions.

$$\frac{\partial f(x,y)}{\partial x} \equiv \lim_{\epsilon \to 0} \frac{f(x+\epsilon, y) - f(x,y)}{\epsilon}, \qquad (4.12)$$
and the partial derivative with respect to y is
$$\frac{\partial f(x,y)}{\partial y} \equiv \lim_{\epsilon \to 0} \frac{f(x, y+\epsilon) - f(x,y)}{\epsilon}. \qquad (4.13)$$
For a warmup, take the function f(x, y) = αx + βy for constants α and β, then the two
different partial derivatives are
$$\begin{aligned}
\frac{\partial f(x,y)}{\partial x} &= \lim_{\epsilon\to 0}\frac{(\alpha(x+\epsilon) + \beta y) - (\alpha x + \beta y)}{\epsilon} = \alpha \\
\frac{\partial f(x,y)}{\partial y} &= \lim_{\epsilon\to 0}\frac{(\alpha x + \beta(y+\epsilon)) - (\alpha x + \beta y)}{\epsilon} = \beta. \qquad (4.14)
\end{aligned}$$
These derivatives tell us, for example, that ∂x/∂y = 0 (take α = 1, β = 0 in our test
f(x, y) = αx + βy), a reminder here that x does not explicitly depend on y. You can
imagine a relationship between x and y, that would define a curve in one dimension, and
in that setting, the ordinary derivative is appropriate. We use partials when we assume the
coordinates are independent, and you can really treat the partials as “regular” derivatives
with respect to one or the other variable, with the remaining one(s) as constant(s) for the
purposes of differentiation.
We’ll look at a more complicated case, let f(x, y) = αx sin(βy) (for constant α and β
again). To take the x-partial, we imagine sin(βy) is a fixed constant, and just take the x
derivative of the resulting “ordinary” expression:
$$\frac{\partial f(x,y)}{\partial x} = \alpha \sin(\beta y). \qquad (4.15)$$
Similarly, treating x as a constant for the y-derivative, we get
$$\frac{\partial f(x,y)}{\partial y} = \alpha x \beta \cos(\beta y). \qquad (4.16)$$
Additional derivatives function similarly, so we can define the second partial derivatives
   
$$\frac{\partial^2 f(x,y)}{\partial x^2} \equiv \lim_{\epsilon\to 0}\frac{1}{\epsilon}\left(\left.\frac{\partial f(x,y)}{\partial x}\right|_{x+\epsilon,\, y} - \left.\frac{\partial f(x,y)}{\partial x}\right|_{x,\, y}\right) \qquad (4.17)$$
and similarly for the second derivative with respect to y. Again, you just treat everything
except the derivative variable as constant and proceed to differentiate.
There are also “mixed” partials to consider. We can take the x-derivative of the y-
derivative of f(x, y):
 
$$\frac{\partial^2 f(x,y)}{\partial x \partial y} \equiv \frac{\partial}{\partial x}\left(\frac{\partial f(x,y)}{\partial y}\right), \qquad (4.18)$$
and vice-versa, the y-derivative of the x-derivative is
 
$$\frac{\partial^2 f(x,y)}{\partial y \partial x} \equiv \frac{\partial}{\partial y}\left(\frac{\partial f(x,y)}{\partial x}\right). \qquad (4.19)$$
Fortunately, these two are equal (as you will show in Problem 4.1.3), so we can be sloppy
about which we have in mind; they are numerically identical.
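A quick symbolic check of this equality for the example above, as a sketch assuming SymPy is available:

```python
import sympy as sp

x, y, alpha, beta = sp.symbols('x y alpha beta')
f = alpha * x * sp.sin(beta * y)            # the example f(x, y) used above

print(sp.diff(f, x), sp.diff(f, y))         # alpha*sin(beta*y) and alpha*beta*x*cos(beta*y), matching (4.15)-(4.16)
print(sp.simplify(sp.diff(f, x, y) - sp.diff(f, y, x)))   # 0: the mixed partials agree
```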

Problem 4.1.1 For the function f(x, y) = A(x 2 sin(y/L) − xy) (with constants A and L),
evaluate the following partial derivatives:
$$\frac{\partial f(x,y)}{\partial x} \qquad \frac{\partial^2 f(x,y)}{\partial y^2} \qquad \frac{\partial^3 f(x,y)}{\partial x^3} \qquad \frac{\partial^2 f(x,y)}{\partial x\, \partial y}$$
Problem 4.1.2 In three dimensions, with coordinates x, y, and z, let q1 ≡ x, q2 ≡ y, and
q3 ≡ z. Define a matrix M by its entries,
$$M_{ij} \equiv \frac{\partial q_i}{\partial q_j}$$
for i = 1 → 3 and j = 1 → 3. Write out the entries of this matrix of partial
derivatives.
Problem 4.1.3 Show that mixed partials satisfy “cross-derivative” equality:
$$\frac{\partial^2 f(x,y)}{\partial x \partial y} = \frac{\partial^2 f(x,y)}{\partial y \partial x}$$
so that the order in which you take the derivatives doesn’t matter. Check this result
explicitly using the function f(x, y) = xy sin(x 2 y).
Problem 4.1.4 You can generate the series spring addition formula by considering a mass M
attached by a spring with constant k1 to a mass m which is attached by a spring with
constant k2 to the wall. Taking m → 0 eliminates the intermediate mass, leaving us
with an effective force acting on M. To extract that effective force and the effective
spring constant it implies, write the equations of motion for M and m ≠ 0, then
take m → 0 and see what the equation of motion for M must become – the constant
in front of the distance to M defines the spring constant. Do you recover (4.3) for
k1 = k2 ≡ k?
Problem 4.1.5 The relative separation φ(x, t) tells us, at time t, “the distance, relative to x,
of the mass that is at equilibrium at x.” For a continuous spring system extending
from x = 0 to L, suppose you were given φ(x, 0) = 0, describe the configuration of
masses. Describe the system if φ(x, 0) = α for positive, constant α. If φ(x, 0) = x/2,
where, relative to x = 0, is the mass that is at equilibrium at 1/4 m? (the m here stands
for “meter”).
Problem 4.1.6 Show that the wave equation (4.11) supports “superposition,” so that if
φ1 (x, t) and φ2 (x, t) both satisfy (4.11) so does the sum φ1 (x, t) + φ2 (x, t).
Problem 4.1.7 Which of the following PDEs supports “superposition” (see the previous
problem):
$$\frac{\partial^2 f(x,y)}{\partial x^2} + \frac{\partial^2 f(x,y)}{\partial y^2} = 0$$
$$\frac{\partial^2 f(x,y)}{\partial x^2} + \frac{\partial^2 f(x,y)}{\partial y^2} = \rho(x,y)\, f(x,y)$$
$$i\hbar\,\frac{\partial \Psi(x,t)}{\partial t} + \frac{\hbar^2}{2m}\,\frac{\partial^2 \Psi(x,t)}{\partial x^2} - V(x,t)\,\Psi(x,t) = 0$$
$$\frac{\partial^2 u(x,t)}{\partial x^2} - \frac{u(x,t)}{\alpha}\,\frac{\partial u(x,t)}{\partial t} = 0.$$

Problem 4.1.8 For a discrete chain of masses connected by springs, with xj (t) giving the
location of the jth mass at time t, the total energy of the chain is

$$E = \frac{1}{2} m \sum_{j=1}^{n} \dot{x}_j(t)^2 + \frac{1}{2} k \sum_{j=1}^{n-1} \left(x_{j+1}(t) - x_j(t) - a\right)^2.$$

Find the continuum limit of the energy following the procedure in this section: start
by letting ej ≡ ja and switch to the difference variable φ(ej , t) ≡ xj (t) − ej , replace
the constants k and m using their relation to K and M, and take the limit as the
number of masses goes to infinity (with fixed total length at equilibrium L). The sums will become integrals ($\sum a \to \int dx$), and the integrand is itself the “energy
density” (energy per unit length here) of the continuous system. You should end up
with an energy density that is
$$u(x, t) = \frac{1}{2}\frac{M}{L}\left(\left(\frac{\partial \phi(x,t)}{\partial t}\right)^2 + v^2 \left(\frac{\partial \phi(x,t)}{\partial x}\right)^2\right).$$

4.2 Wave Equation for Strings

Another physical system in which the wave equation appears is the taut string. Suppose
you have a string with constant mass density μ (mass per unit length), and we think about
a segment of string extending from x to x + dx as shown in Figure 4.1. The tension in the
string exerts a force on the left and right ends of the segment. If we demand that the string
be “inextensible,” so that the pieces of the string do not move left or right (opposite our


Fig. 4.1 A portion of string extending from x → x + dx. Each end of the string has a force on it due to the tension in the string. The magnitude of the force on the left is Fℓ and it makes an angle of θℓ with respect to horizontal. On the right, the force magnitude is Fr and the angle is θr .

spring system from the previous section), then the forces in the horizontal direction must
cancel:

$$F_\ell \cos\theta_\ell = F_r \cos\theta_r. \qquad (4.20)$$

We’ll make two further assumptions (in Section 7.3, we will see what happens if you relax these): (1) the tension in the string is constant throughout, so that Fℓ = Fr ≡ T, and (2) the angles θℓ and θr are small. With these two assumptions, the inextensibility requirement (4.20) is automatically satisfied, since cos θℓ ≈ 1 and cos θr ≈ 1 for small angles.
Moving on to the y component of Newton’s second law, let y(x, t) be the height of the
string at location x and time t. The portion of the string extending from x to x + dx has mass
m = μ dx, and the net vertical force is just T(sin θr − sin θℓ) (from Figure 4.1 again), so that we have
$$m\,\frac{\partial^2 y(x,t)}{\partial t^2} = T(\sin\theta_r - \sin\theta_\ell) \;\longrightarrow\; \mu\, dx\,\frac{\partial^2 y(x,t)}{\partial t^2} \approx T(\tan\theta_r - \tan\theta_\ell) \qquad (4.21)$$
where we have used the small angle approximation sin θ ≈ θ ≈ tan θ. But those tangent
terms just represent the slopes at the left and right ends of the string segment. If we let
y′(x, t) ≡ ∂y(x, t)/∂x, then we can write Newton’s second law as

$$\mu\, dx\,\frac{\partial^2 y(x,t)}{\partial t^2} \approx T\left(y'(x+dx, t) - y'(x, t)\right), \qquad (4.22)$$
and Taylor expanding on the right, we get

$$\mu\, dx\,\frac{\partial^2 y(x,t)}{\partial t^2} \approx T\, y''(x, t)\, dx \;\longrightarrow\; -\frac{\partial^2 y(x,t)}{\partial t^2} + \frac{T}{\mu}\,\frac{\partial^2 y(x,t)}{\partial x^2} = 0 \qquad (4.23)$$

which is the wave equation again. This time, the characteristic speed is set by v2 = T/μ.

Problem 4.2.1 A hundred yard ball of string has mass .5 kg. One meter of the string has one
end attached to a wall, and the other attached to a 1 lb weight and hung over a pulley.
Assuming constant tension, what is the speed with which waves will travel between
the wall and the pulley?

4.3 Solving the Wave Equation

We know how to solve ordinary differential equations of various sorts. The wave equation
is our first example of a “partial” differential equation (PDE). How should we solve it?
There are a number of approaches, and we’ll think about three specific ones that reduce the
PDE to an ODE (or set of ODEs) that we can then solve.

4.3.1 First-Order Form: Method of Characteristics


From (4.11), we can factor the derivatives:
  
$$-\frac{\partial^2 \phi(x,t)}{\partial t^2} + v^2\frac{\partial^2 \phi(x,t)}{\partial x^2} = \left(v\frac{\partial}{\partial x} - \frac{\partial}{\partial t}\right)\left(v\frac{\partial}{\partial x} + \frac{\partial}{\partial t}\right)\phi(x,t) = 0 \qquad (4.24)$$
and then it is clear that one way to satisfy the wave equation is to have
 
$$\left(v\frac{\partial}{\partial x} + \frac{\partial}{\partial t}\right)\phi(x,t) = 0. \qquad (4.25)$$
We have reduced a second-order partial differential equation to a first-order one (actually
a pair of first-order ones, as you will see). Suppose we were given the values of φ(x, t) at
t = 0: φ(x, 0) = u(x) for some provided u(x). Then we can ask: Are there curves, x(t),
along which the value of φ(x, t) solving (4.25) remains constant? Then we’d just pick up
our known values of u(x) at t = 0 and move them along the curve x(t). That’s an easy way
to find solutions for φ(x, t). Operationally, we take the curves x(t) and evaluate φ(x(t), t)
along them, then the total time derivative of φ(x(t), t) along the curve is
$$\frac{d\phi(x(t), t)}{dt} = \frac{\partial \phi(x,t)}{\partial x}\frac{dx(t)}{dt} + \frac{\partial \phi(x,t)}{\partial t}. \qquad (4.26)$$

Now if we pick dx(t)/dt = v, then (4.25) applies and dφ(x(t), t)/dt = 0. So for curves of the form
x(t) = x0 + vt, the value of φ is the same all along the curve. These curves are straight lines
with slope v (or slope 1/v if we plot t versus x). Then we know that the solution for φ(x, t)
at x is φ(x, t) = u(x0 ) = u(x − vt). You can easily check that this form solves the original
wave equation, and of course the factored form in (4.25), by taking derivatives. The curves
that we generated are called “characteristic curves,” and this solution was obtained by the
“method of characteristics” where we evaluate the total time derivative of φ(x(t), t) and
use that to define the curves x(t).
This approach has a pictorial interpretation. The initial curve u(x) has values that get
“picked up” and moved along the straight lines of slope 1/v emanating from the x axis.
So the “picture” of u(x) at t = 0 gets shifted to the right in time, with constant speed v.
An example of such curves and an initial function that is moved along them is shown in
Figure 4.2.
All of this right-traveling initial data is for φ(x, t) = u(x − vt), but of course if we
factored the wave equation in the other direction, then we’d have
  
$$\left(v\frac{\partial}{\partial x} + \frac{\partial}{\partial t}\right)\left(v\frac{\partial}{\partial x} - \frac{\partial}{\partial t}\right)\phi(x,t) = 0 \qquad (4.27)$$
and the first-order form that solves this is
 
$$\left(v\frac{\partial}{\partial x} - \frac{\partial}{\partial t}\right)\phi(x,t) = 0. \qquad (4.28)$$
Here, the solutions are related to the initial values u(x) by φ(x, t) = u(x + vt), and the
solution moves to the left with constant speed v.

Fig. 4.2 An example of some characteristic curves, defined by t = (x − x0 )/v for initial position x0 . There is a reference
Gaussian for u(x) plotted at t = 0 and some other times to demonstrate how the constant values of the initial data
along the characteristic curves can be viewed as a right-traveling waveform.

In general, the solution to the full wave equation is given by

φ(x, t) = f(x − vt) + g(x + vt) (4.29)

for single-variable functions f(x), g(x) which would be used to satisfy the initial conditions.
For example, suppose we are given functions u(x) and w(x) such that:

$$\phi(x, 0) = u(x) \qquad \left.\frac{\partial \phi}{\partial t}\right|_{t=0} = w'(x), \qquad (4.30)$$
then we would take
   
$$f(x) = \frac{1}{2}\left(u(x) - \frac{1}{v}\, w(x)\right) \qquad g(x) = \frac{1}{2}\left(u(x) + \frac{1}{v}\, w(x)\right) \qquad (4.31)$$
and the full solution would be
 
$$\phi(x, t) = \frac{1}{2}\left(u(x - vt) - \frac{1}{v}\, w(x - vt) + u(x + vt) + \frac{1}{v}\, w(x + vt)\right). \qquad (4.32)$$
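The traveling-wave form can also be verified directly; here is a small symbolic check, a sketch assuming SymPy is available, that (4.29) satisfies the wave equation for arbitrary smooth f and g.

```python
import sympy as sp

x, t, v = sp.symbols('x t v')
f, g = sp.Function('f'), sp.Function('g')

phi = f(x - v * t) + g(x + v * t)                        # the form (4.29)
residual = -sp.diff(phi, t, 2) + v**2 * sp.diff(phi, x, 2)
print(sp.simplify(residual))                             # 0: any (smooth) f and g will do
```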

4.3.2 Additive Separation of Variables


The other way of turning a PDE into an ODE (possibly many) is to make some assumptions
about the form of the solution. As an example, let’s take φ(x, t) = X(x) + T(t), where X(x)
and T(t) are unknown functions of x and t respectively, so that we are looking for solutions

that depend on x and t separately in additive combination. Running this through the full
wave equation from (4.11) gives

$$\frac{d^2 T(t)}{dt^2} = v^2\, \frac{d^2 X(x)}{dx^2}. \qquad (4.33)$$
Now the logic of separation of variables (SOV) is as follows: The left-hand side of the
equation depends only on t, while the right-hand side depends only on x. The only way that
the equation can hold, for all values of x and t is if both sides are constant. So we demand,
calling the shared constant α,

$$\begin{aligned}
\frac{d^2 T(t)}{dt^2} &= \alpha \;\longrightarrow\; T(t) = \frac{1}{2}\alpha t^2 + b t + c \\
v^2\, \frac{d^2 X(x)}{dx^2} &= \alpha \;\longrightarrow\; X(x) = \frac{1}{2}\frac{\alpha}{v^2}\, x^2 + d x + e \qquad (4.34)
\end{aligned}$$
and “the” solution is
 
$$\phi(x, t) = \frac{1}{2}\alpha\left(t^2 + \frac{x^2}{v^2}\right) + b t + d x + (c + e). \qquad (4.35)$$
There are four parameters here: α, b, d, and c + e ≡ f. That’s the right number for a
PDE that is second order in both time and space. Yet it is clear that we will not be able to
satisfy many different types of boundary conditions with the solution in (4.35). There is
a notion of superposition here (the idea that sums of solutions are solutions), so we could
add together solutions with different values of the four independent constants. But there is
another type of separation of variables that comes up a lot, and we will think about issues
of superposition and boundary values in that setting.

4.3.3 Multiplicative Separation of Variables


Another way to achieve the separation found in, for example, (4.33) is to start with the
“multiplicative” form: φ(x, t) = X(x)T(t). We still have independent functions that depend
on x and t only, but this time they are multiplied together instead of added. Running this
ansatz through (4.11), we have

$$X(x)\,\frac{d^2 T(t)}{dt^2} = v^2\, \frac{d^2 X(x)}{dx^2}\, T(t). \qquad (4.36)$$
This is not obviously in the form of (4.33), since there is x dependence on the left, t
dependence on the right. Yet if we divide both sides by φ itself, we get

$$\frac{1}{T(t)}\frac{d^2 T(t)}{dt^2} = \frac{v^2}{X(x)}\frac{d^2 X(x)}{dx^2}, \qquad (4.37)$$
and now we can make the separation argument: The left-hand side depends only on t and
the right depends only on x, so each must be separately equal to a constant. Call that
constant, for reasons that will become clear later on, −α2 . Then we have to solve the pair
of ODEs

$$\begin{aligned}
\frac{d^2 T(t)}{dt^2} &= -\alpha^2 T(t) \;\longrightarrow\; T(t) = A\cos(\alpha t) + B\sin(\alpha t) \\
\frac{d^2 X(x)}{dx^2} &= -\frac{\alpha^2}{v^2}\, X(x) \;\longrightarrow\; X(x) = C\cos\!\left(\frac{\alpha x}{v}\right) + D\sin\!\left(\frac{\alpha x}{v}\right) \qquad (4.38)
\end{aligned}$$
for constants A, B, C, and D.
Combining the two factors, the solution is
$$\phi(x, t) = \left(C\cos\!\left(\frac{\alpha x}{v}\right) + D\sin\!\left(\frac{\alpha x}{v}\right)\right)\left(A\cos(\alpha t) + B\sin(\alpha t)\right). \qquad (4.39)$$
We can use the constants to satisfy boundary conditions. For example, suppose we require that at x = 0 and x = ℓ, the function φ(x, t) vanish at all times: φ(0, t) = φ(ℓ, t) = 0. Physically, we are requiring that the mass at the endpoints of the chain be at fixed locations, for example. Then we have

φ(0, t) = C(A cos(αt) + B sin(αt)) = 0 −→ C = 0 (4.40)

and we are left with (absorbing the constant D into A and B)


$$\phi(x, t) = \left(A\cos(\alpha t) + B\sin(\alpha t)\right)\sin\!\left(\frac{\alpha x}{v}\right). \qquad (4.41)$$
For the other end, we have
 
$$\phi(\ell, t) = \left(A\cos(\alpha t) + B\sin(\alpha t)\right)\sin\!\left(\frac{\alpha \ell}{v}\right) = 0 \qquad (4.42)$$

and we could satisfy this equation, for all times t, by taking A = B = 0. This choice
leaves us with φ(x, t) = 0, and we have no hope of satisfying initial conditions. To retain
a nontrivial solution, we instead set sin(αℓ/v) = 0, giving us a set of values for α,

$$\frac{\alpha \ell}{v} = n\pi \;\longrightarrow\; \alpha = \frac{n\pi v}{\ell} \quad \text{for integer } n. \qquad (4.43)$$
There is still an infinite family of solutions here, one for each integer n.
The wave equation is linear in φ(x, t), and so superposition holds: If you have two
solutions to the wave equation, φ1 (x, t) and φ2 (x, t), then the sum is also a solution. For
our integer-indexed φ(x, t), we can form an infinite sum that is still a solution and satisfies
the boundary conditions term-by-term,
$$\phi(x, t) = \sum_{n=1}^{\infty}\left(A_n\cos\!\left(\frac{n\pi v}{\ell}\, t\right) + B_n\sin\!\left(\frac{n\pi v}{\ell}\, t\right)\right)\sin\!\left(\frac{n\pi}{\ell}\, x\right). \qquad (4.44)$$

We could write the sine term in position as exponentials to make the connection with
the Fourier series we studied earlier in Section 2.3. Then it is clear that the coefficients
{An , Bn }∞n=1 are associated with the decomposition of the initial conditions. Remember that we must be given φ(x, 0) = u(x) and the time-derivative at t = 0, φ̇(x, 0) = w′(x).
From the solution, we have


$$\begin{aligned}
\phi(x, 0) &= \sum_{n=1}^{\infty} A_n \sin\!\left(\frac{n\pi}{\ell}\, x\right) = u(x) \\
\left.\frac{\partial \phi(x,t)}{\partial t}\right|_{t=0} &= \sum_{n=1}^{\infty} B_n\, \frac{n\pi v}{\ell}\,\sin\!\left(\frac{n\pi}{\ell}\, x\right) = w'(x). \qquad (4.45)
\end{aligned}$$

Using the orthogonality of sine:


    
$$\int_0^{\ell} \sin\!\left(\frac{n\pi x}{\ell}\right)\sin\!\left(\frac{k\pi x}{\ell}\right) dx = \frac{\ell}{2}\,\delta_{nk}, \qquad (4.46)$$
we can multiply both sides of the equations in (4.45) by sin(kπx/ℓ) and integrate to isolate
the coefficients
  
$$\begin{aligned}
A_k &= \frac{2}{\ell}\int_0^{\ell} u(x)\,\sin\!\left(\frac{k\pi x}{\ell}\right) dx \\
B_k &= \frac{2}{k\pi v}\int_0^{\ell} w'(x)\,\sin\!\left(\frac{k\pi x}{\ell}\right) dx. \qquad (4.47)
\end{aligned}$$
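As a numerical sketch of how (4.47) gets used, assuming NumPy and SciPy are available (the profile u(x) = x(ℓ − x) with w = 0, and the truncation at 25 terms, are arbitrary illustrative choices), compute the Ak and evaluate the truncated sum (4.44) at t = 0:

```python
import numpy as np
from scipy.integrate import quad

ell, v, nmax = 1.0, 1.0, 25
u = lambda x: x * (ell - x)               # sample initial profile; w(x) = 0, so all B_k vanish

# A_k from (4.47), computed by numerical quadrature
A = np.array([(2 / ell) * quad(lambda x: u(x) * np.sin(k * np.pi * x / ell), 0, ell)[0]
              for k in range(1, nmax + 1)])

def phi(x, t):
    """Truncated version of the series solution (4.44)."""
    n = np.arange(1, nmax + 1)
    return np.sum(A * np.cos(n * np.pi * v * t / ell) * np.sin(n * np.pi * x / ell))

print(phi(0.3, 0.0), u(0.3))              # the truncated series reproduces u(x) closely
```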

4.3.4 Series Solution


The general solution in (4.44) has the basic form of a (spatial) Fourier sine series like (2.55)
but with time-varying coefficients. We could have started with a solution ansatz of the form
$$\phi(x, t) = \sum_{j=1}^{\infty} a_j(t)\,\sin\!\left(\frac{j\pi x}{\ell}\right). \qquad (4.48)$$

We know, for appropriate choice of {aj (t)}∞ j=1 , that “any” function can be written in
this form, and because the coefficients can change in time, so can the function we are
representing in its sine series decomposition. So a solution to the wave equation, with the
boundary conditions φ(0, t) = φ(ℓ, t) = 0 can be expressed as (4.48) for a particular set
of {aj (t)}∞
j=1 .
This observation leads us to turn around and use (4.48) as the starting point in generating
a solution to the wave equation. If we take φ(x, t) from (4.48) and run it through the wave
equation, we get

$$\sum_{j=1}^{\infty}\left(\ddot{a}_j(t) + v^2\left(\frac{j\pi}{\ell}\right)^2 a_j(t)\right)\sin\!\left(\frac{j\pi x}{\ell}\right) = 0. \qquad (4.49)$$

Now we use the orthogonality of sine to demand that each term in the sum vanish
separately.5 Then we get an infinite family of ordinary differential equations in time,
$$\ddot{a}_j(t) = -v^2\left(\frac{j\pi}{\ell}\right)^2 a_j(t) \qquad (4.50)$$


5 We could do this carefully, as we have before, by multiplying (4.49) by sin(kπx/ℓ) and integrating x : 0 → ℓ, this would eliminate all terms in the sum with j ≠ k.

an equation that we recognize, and whose solutions we know all too well:
   
$$a_j(t) = A_j\cos\!\left(\frac{j\pi v}{\ell}\, t\right) + B_j\sin\!\left(\frac{j\pi v}{\ell}\, t\right) \qquad (4.51)$$
giving precisely the term in front of sin(jπx/ℓ) in (4.44). The utility of the approach, in
this case, comes from applying it to PDEs other than the wave equation where we already
know quite a bit about the solutions. You can explore some examples in the problems.
Problem 4.3.1 Suppose we had a factored (i.e. first-order) wave equation of the form
$$v_0\, \frac{x}{\ell}\,\frac{\partial \phi(x,t)}{\partial x} + \frac{\partial \phi(x,t)}{\partial t} = 0$$
for constant v0 and ℓ and with φ(x, 0) = u(x) given. Find the solution to this equation
using the method of characteristics.
Problem 4.3.2 The Laplace equation in two dimensions is
$$\frac{\partial^2 f(x,y)}{\partial x^2} + \frac{\partial^2 f(x,y)}{\partial y^2} = 0,$$
for a function f(x, y) of the two Cartesian variables x and y, find the solution that has
f(0, y) = 0, f(L, y) = 0 (for constant L) with f(x, 0) = f0 sin(2πx/L) and f(x, L) = 0.
We will study this type of equation and its solutions in Chapter 6.
Problem 4.3.3 The “heat equation” in one spatial dimension is
$$\frac{\partial^2 u(x,t)}{\partial x^2} - \frac{1}{\alpha}\,\frac{\partial u(x,t)}{\partial t} = 0, \qquad (4.52)$$
find u(x, t) given u(0, t) = 0, u(L, t) = 0, and u(x, 0) = u0 sin(πx/L). Sketch the
solution at different times indicating its behavior as t goes from 0 → ∞. What are
the units of α?
Problem 4.3.4 For the partial differential equation:
$$-\frac{\partial^2 u(x,t)}{\partial t^2} + v^2\,\frac{\partial^2 u(x,t)}{\partial x^2} + m^2 u(x,t) = 0,$$
with constant m (some real number6 ), solve for u(x, t) given the boundary conditions
u(0, t) = 0, u(L, t) = 0 with ∂u(x, t)/∂t|t=0 = 0 and u(x, 0) = u0 sin(qπx/L) for integer
q. For what values of q (relative to m, say) is your solution bounded (does not become
infinite) for all time t ≥ 0.
Problem 4.3.5 The wave equation governing the height of a rope is
$$-\frac{\partial^2 y(x,t)}{\partial t^2} + v^2\,\frac{\partial^2 y(x,t)}{\partial x^2} = 0,$$
where y(x, t) is the height of the piece of rope with horizontal location x at time t. We
can write this equation in terms of the Fourier transform of y(x, t) in time (one can
also Fourier transform the spatial coordinate, but we’ll leave x alone here):
$$\tilde{y}(x, f) = \int_{-\infty}^{\infty} y(x, t)\, e^{i 2\pi f t}\, dt.$$

6 This equation is what you would get from Maxwell’s electricity and magnetism if light had mass.

Take the inverse transform version of y(x, t),


$$y(x, t) = \int_{-\infty}^{\infty} \tilde{y}(x, f)\, e^{-i 2\pi f t}\, df,$$

and insert it into the wave equation. Solve the resulting ODE (in x) for ỹ(x, f) (you
should end up with two undetermined constants of integration, which, for once, we’ll
leave as undetermined constants).
Problem 4.3.6 A solution to the wave equation,
p(x, t) = p0 [cos(2π(x − vt)g1 ) + cos(2π(x − vt)g2 )] (4.53)
(for constants g1 and g2 ) is observed by you at your location, x = 0. Show that you
can write that signal as
p(0, t) = F cos(2πf1 t) cos(2πf2 t) (4.54)
for constant frequencies f1 and f2 and magnitude F. What are f1 and f2 in terms of
g1 and g2 ? If the original signal was made up of g1 v = 441 Hz and g2 v = 440
Hz, what frequencies appear in (4.54)? Can you hear both of those frequencies? The
expression (4.54) is an example of the “beats” that the interference of two signals
can make – you perceive a new frequency that has time-varying amplitude (that’s
how (4.54) is interpreted by your ears and brain).
Problem 4.3.7 a. A ball of mass m moving with constant speed v bounces elastically
off of walls separated by a distance a (no gravity here). We want to describe the
probability-per-unit-length of finding the ball in the vicinity of x ∈ [0, a], we’ll call
this quantity ρ(x) and the probability of finding the ball within dx of x is dP = ρ(x)dx
so that the probability of finding the ball between two locations, x1 and x2 > x1 (with
both in [0, a]) is
$$P(x_1, x_2) = \int_{x_1}^{x_2} \rho(x)\, dx.$$

Construct ρ(x) for the ball – think about the following two observations: (1) given
two locations x and y (both in between 0 and a), at which location is the ball more
likely to be found? and (2) What is the total probability of finding the ball between
x = 0 and a?
b. When you study quantum mechanics (a great resource is [13]), you will
learn that the fundamental object is a complex function Ψ(x, t). The quantum
mechanical probability density, ρ(x, t) (defined similarly to a.) is given by: ρ(x, t) =
Ψ∗ (x, t)Ψ(x, t). For a ball of mass m bouncing back and forth between elastic walls,
the equation governing Ψ(x, t), called the “Schrödinger wave equation,” is:
$$-\frac{\hbar^2}{2m}\,\frac{\partial^2 \Psi(x,t)}{\partial x^2} = i\hbar\,\frac{\partial \Psi(x,t)}{\partial t}, \qquad (4.55)$$
for constant ℏ (with what units?). The boundary conditions are: Ψ(0, t) = 0 and
Ψ(a, t) = 0. Solve this equation for Ψ(x, 0) = ψ0 sin(πx/a). Find the value of the
constant ψ0 (Hint: what is the total probability of finding the ball between 0 and a?).

Sketch the resulting probability density ρ(x, t). Do the same for the initial condition
Ψ(x, 0) = ψ0 sin(100πx/a). Which probability density ρ(x, t) looks more like your
result from part a.?
Problem 4.3.8 Use the approach from Section 4.3.4 to write the general, time-dependent
solution to Schrödinger’s equation (4.55) subject to the boundary conditions
Ψ(0, t) = 0 = Ψ(a, t).
Problem 4.3.9 Using a series ansatz, solve the heat equation (4.52) with boundary condi-
tions u(0, t) = 0 = u(L, t).
Problem 4.3.10 Find “a” (by hook or crook, with any boundary/initial conditions that you
like) solution to the wave equation with “driving” term on the right-hand side
$$-\frac{\partial^2 y(x,t)}{\partial t^2} + v^2\,\frac{\partial^2 y(x,t)}{\partial x^2} = F_0\sin(2\pi f t).$$
Problem 4.3.11 Solve the wave equation with φ(0, t) = φ(ℓ, t) = 0 and initial conditions φ(x, 0) = u0 x(x − ℓ)/ℓ² and ∂φ(x, t)/∂t|t=0 = 0.
Problem 4.3.12 Solve the wave equation with φ(0, t) = φ(ℓ, t) = 0 and initial conditions φ(x, 0) = 0 and ∂φ(x, t)/∂t|t=0 = w0 x(x − ℓ)/ℓ².

4.4 Standing Waves

The individual solutions that we got from separation of variables are already interesting on
their own. Think of
   
$$\phi(x, t) = \phi_0\sin\!\left(\frac{j\pi v t}{\ell}\right)\sin\!\left(\frac{j\pi x}{\ell}\right) \qquad (4.56)$$
for integer j. This solution can be thought of as a time-varying amplitude for the spatial
sine function selected by j. That is why these solutions are called “standing” waves. All
that changes in time is the magnitude of the solution, the spatial form always looks the
same (although when the coefficient out front is negative, the spatial function has flipped
upside down, and of course, it also goes through zero as it changes sign). An example, for
j = 4, is shown in Figure 4.3 where the plot of φ(x, t) is shown for a few different times.
The points at which the solution goes through the x axis are fixed in time, and are called
“nodes.” The maxima and minima that occur in between the nodes are called “antinodes.”
The oscillatory pieces of φ(x, t) in space and time determine a characteristic length and
time for the solution. The “wavelength,” usually denoted λ, is the spatial length of a single
cycle of sine. If you took a snapshot of φ(x, t) at time t0 and measured the distance between
peaks as in Figure 4.4, that distance is the wavelength, and it is related to j and ℓ by
$$\frac{j\pi\lambda}{\ell} = 2\pi \;\longrightarrow\; \lambda = \frac{2\ell}{j}. \qquad (4.57)$$

The temporal oscillation defines the “period” of the wave. If you stood at a single
location x0 and measured the wave’s magnitude at that location, the time it takes for one


Fig. 4.3 The standing wave solution from (4.56) with j = 4 shown for six different times (from dark to light). The zero crossings on the x axis are “nodes,” and the max/min that occur in between are “antinodes.”


Fig. 4.4 Given a snapshot of φ(x, t) at time t0 , the distance from one peak to the next defines the wavelength λ.


Fig. 4.5 Plotting the wave φ(x0 , t) in time at a particular location x0 . The period of the wave, T, is the time from one peak
to the next.

full cycle of the wave to pass you (assuming it is not a node) is the period, usually denoted
T – this is shown in Figure 4.5. In terms of the solution parameters, the period is given by

$$\frac{j\pi v T}{\ell} = 2\pi \;\longrightarrow\; T = \frac{2\ell}{j v}. \qquad (4.58)$$

From the period, we also define the “frequency”: f ≡ 1/T = jv/(2ℓ), and the “angular frequency” ω ≡ 2πf = jπv/ℓ.

We can relate the wavelength and period, T = λ/v, so that λ is also interpretable as
the distance travelled by the wave in one full cycle (the wave, remember, travels with
characteristic speed v). That’s strange since these standing waves aren’t really traveling
left and right in the usual way. If we were thinking of waves on a string, the standing
waves would describe pieces of string moving up and down. Is anything moving left or
right? How should we think about standing waves in terms of the general solution to the
wave equation in (4.29)?
As it turns out, standing waves are the sum of carefully tuned left- and right-traveling
solutions. We can write the standing wave solution using angle addition:
         
$$\cos\!\left(\frac{j\pi}{\ell}(x \pm vt)\right) = \cos\!\left(\frac{j\pi v}{\ell}\, t\right)\cos\!\left(\frac{j\pi}{\ell}\, x\right) \mp \sin\!\left(\frac{j\pi v}{\ell}\, t\right)\sin\!\left(\frac{j\pi}{\ell}\, x\right) \qquad (4.59)$$
so that
    
$$\phi(x, t) = \frac{\phi_0}{2}\left(\cos\!\left(\frac{j\pi}{\ell}(x - vt)\right) - \cos\!\left(\frac{j\pi}{\ell}(x + vt)\right)\right), \qquad (4.60)$$
where the first term represents the right-traveling piece, the second term the left-traveling
one. So there are still traveling waves here, it’s just the superposition that makes it look as
if we have a horizontally static waveform with time-varying vertical height.
The standing wave solution can also be used to connect with the oscillator chain from
Section 3.5. Suppose we discretized space into a grid with fixed spacing Δx = ℓ/(n + 1)
for integer n determining the number of grid points. The grid points are at xk = kΔx for
k = 1 → n, and then the evaluation of φ(x, t) at those grid points is
       
$$\phi(x_k, t) = \phi_0\sin\!\left(\frac{j\pi v t}{\ell}\right)\sin\!\left(\frac{j\pi x_k}{\ell}\right) = \phi_0\sin\!\left(\frac{j\pi v t}{\ell}\right)\sin\!\left(\frac{\pi j k}{n+1}\right). \qquad (4.61)$$
The spatial portion of this function gives precisely the entries of $v^j_k$ from (3.102). The oscillatory angular frequency is (for κ² ≡ Kn(n − 1)/M and using ℓ = L to match the previous definition)

$$\omega_j = \frac{j\pi v}{L} = \frac{j\pi}{L}\sqrt{\frac{K L^2}{M}} = j\pi\sqrt{\frac{\kappa^2}{n(n-1)}} \approx \frac{j\pi\kappa}{n} \qquad (4.62)$$
for large n. Meanwhile, if we expand the expression for the eigenvalue λj in (3.103) for
large n, noting that the angular frequency there is ω2 ∼ λj , we get
 
$$\omega \approx \frac{j\pi\kappa}{n}, \qquad (4.63)$$
so that ωj is the same for the continuous solution and its (large n) discrete approximation.
Of course, you must immediately turn around and ask the question: What (continuous?!)
eigenvalue problem does φ(x, t) satisfy? Why does this correspondence exist at all?
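That correspondence is easy to see numerically; here is a short sketch assuming NumPy is available (n = 1000 and κ = 1 are arbitrary choices): for large n, the discrete frequencies √λj from (3.103) approach the continuum values jπκ/n.

```python
import numpy as np

n, kappa = 1000, 1.0
j = np.arange(1, 6)                                                    # the first few modes
discrete = np.sqrt(2 * kappa**2 * (1 - np.cos(j * np.pi / (n + 1))))   # sqrt(lambda_j) from (3.103)
continuum = j * np.pi * kappa / n                                      # the large-n frequency (4.62)
print(np.max(np.abs(discrete - continuum) / continuum))                # ~1e-3, shrinking as n grows
```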
Problem 4.4.1 Find standing wave solutions to the wave equation for a string,
$$-\frac{\partial^2 y(x,t)}{\partial t^2} + v^2\,\frac{\partial^2 y(x,t)}{\partial x^2} = 0,$$

with boundary conditions y(0, t) = 0 and ∂y(x, t)/∂x|x=L = 0.
Problem 4.4.2 Does the heat equation (4.52) support standing wave solutions for u(0, t) =
u(L, t) = 0?

4.5 Plane Waves

Standing waves are made up of a linear combination of left- and right-traveling waves as
we saw explicitly in (4.60). Those individual traveling waves are themselves oscillatory
with a single frequency and wavelength. Take a solution to the wave equation

φ(x, t) = A cos(k(x − vt)) (4.64)

with k a constant playing the role of jπ/ℓ from (4.60). The solution at time t = 0
is φ(x, 0) = A cos(kx) and this waveform moves to the right with speed v, a special
case of (4.29). Because of its purely oscillatory form, we can identify a wavelength and
frequency just as we did in Section 4.4. From the t = 0 solution, the wavelength is
kλ = 2π → λ = 2π/k. Now taking x = 0, we have φ(0, t) = A cos(kvt) and the
period is kvT = 2π → T = 2π/(kv) = λ/v so that the frequency is f = 1/T = v/λ. We can
write the solution to highlight the roles of wavelength and frequency,
$$\phi(x, t) = A\cos\!\left(2\pi\left(\frac{x}{\lambda} - f t\right)\right), \qquad (4.65)$$
and if we had a left-traveling oscillatory wave solution, of the form cos(2π(x/λ + ft)) =
cos(2π(−x/λ − ft)), we could write a sum of left and right travelers,
$$\phi(x, t) = A\cos\!\left(2\pi\left(\frac{x}{\lambda} - f t\right)\right) + B\cos\!\left(2\pi\left(-\frac{x}{\lambda} - f t\right)\right). \qquad (4.66)$$
Finally, we can take the view from Chapter 2, and think of φ(x, t) as a complex function
of exponentials,
 
$$\phi(x, t) = A e^{i 2\pi(-ft + x/\lambda)} + B e^{i 2\pi(-ft - x/\lambda)} = e^{-i 2\pi f t}\left(A e^{i 2\pi x/\lambda} + B e^{-i 2\pi x/\lambda}\right) \qquad f = \frac{v}{\lambda}. \qquad (4.67)$$
If you want to recover (4.66), just take the real part of the exponential. You can also get a
sinusoidal version of the solution by taking the imaginary portion of the complex solution.
Individual solutions of the form (4.67) are called “plane waves.” They have well-defined
frequency and wavelength, and are written to highlight those, but the auxiliary condition
relating them, f = v/λ, must be kept in mind, otherwise the function φ(x, t) in (4.67)
does not satisfy the wave equation (i.e. the values of f and λ are not independent). Plane
waves with different frequencies and wavelengths can be added together with the sum
itself a solution to the wave equation by superposition. Indeed, the spatial exponentials in
the second expression for φ(x, t) in (4.67) look a lot like the terms in a spatial Fourier
series decomposition. Because of the shared temporal pre-factor, we sometimes focus on
the spatial piece of φ(x, t) with the understanding that the temporal piece can be put back
in when necessary.
An example of a cartoon that focuses attention on the spatial piece is shown in
Figure 4.6, where we identify the term e i2πx/λ with a right-traveling sinusoidal wave, and
e −i2πx/λ with a left-traveling one. Those identifications are impossible to justify without
referring to the full solution, complete with temporal dependence, in (4.67), and that
reference is often implicit.

Fig. 4.6 A cartoon of the spatial piece of the solution in (4.67). The real part of the left and right traveling pieces are shown
separately, identified by their spatial dependence.

Fig. 4.7 A string has speed v1 on the left (x < 0) and is attached to another string with speed v2 on the right. An “incident”
wave coming from the left gives rise to a “transmitted” wave on the right and a “reflected” wave on the left.

We can use this picture to set up the physically interesting case of a wave that moves
from one medium to another. The hallmark of that change in medium will be a change in
the speed of the wave. Referring to the wave equation for a string (4.23), for example, we
see that the physical parameters of the string are the tension T and the mass density μ. But
these always appear together in v2 = T/μ, so regardless of which you want to change, the
relevant parameter describing the wave equation in each medium will be the speed v.
Staying with the string example, suppose we took two strings with different mass
densities, leading to different speeds v1 and v2 . We combine the strings at x = 0, and
let sinusoidal waves travel from the left to the right, from the region x < 0 to x > 0,
encountering the change in speed at the join point. If we think of a plane wave coming in
from the left, traveling to the right, then it is natural to have, on the left, a solution with
spatial dependence Ae i2πx/λ1 where λ1 is the wavelength of the wave on the left string.
When the wave encounters the new string at x = 0, we expect a piece of the wave to
propagate onto the new string, so on the right, x > 0, we expect a right-traveling plane
wave with new wavelength λ2 , Fe i2πx/λ2 . Meanwhile, we could also have reflection at the
join point,7 a left-traveling wave occurring on the left, x < 0, which we denote Be −i2πx/λ1 .
A representation of this setup is shown in Figure 4.7. The two strings naturally partition
space into x < 0 and x > 0, and the assumed spatial form of the solutions on either side is
φ1 (x) = Ae i2πx/λ1 + Be −i2πx/λ1 for x < 0
(4.68)
φ2 (x) = Fe i2πx/λ2 for x > 0.

7 Imagine that the second string has an infinite mass density, then the incoming wave will not enter the new
section of string and must return along the first string.

How should we relate these two solutions? Since the wave equation is second order
in space, we expect its solutions to be continuous and derivative-continuous, similar to
the continuity argument we made in Section 1.7.1 (and see Problem 2.0.2). Those two
continuity conditions give
 
φ1(0, t) = φ2(0, t),   ∂φ1(x, t)/∂x |_{x=0} = ∂φ2(x, t)/∂x |_{x=0},   (4.69)

which here read


A + B = F,   (i2π/λ1)(A − B) = (i2π/λ2) F.   (4.70)
Adding these two equations eliminates B, and we get a relation between F and A:
F = 2A / (1 + λ1/λ2).   (4.71)

Now λ1 = v1 /f and λ 2 = v2 /f so that λ1 /λ2 = v1 /v2 . Think about what we just did: we
propagated the change in speed to a change in wavelength while assuming the frequency
is the same on the left and right. Does that make sense? Sure, the frequency is fixed
by whatever mechanism generated the plane wave on the left, some person moving the
left edge of the string up and down rhythmically with frequency f, for example. So the
frequency is fixed by the production of the waves, then the only thing that can change in
the relation v = λf from (4.67) is the wavelength. The upshot is that we can write
F = 2A / (1 + v1/v2)   (4.72)

to directly relate F to A and the characteristic speeds on the left and right. Going back
to (4.70), we can subtract to isolate B:
2B = (1 − λ1/λ2) F = 2A (1 − λ1/λ2)/(1 + λ1/λ2)   −→   B = A (1 − v1/v2)/(1 + v1/v2).   (4.73)

This case of two strings is the simplest example of a wave equation with speed that has
spatial dependence, and we shall see more of this type of spatially-varying wave speed in
Section 4.8 and Chapter 7.
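As a quick sanity check of these results (and a preview of Problem 4.5.3), the short Python sketch below evaluates F and B from (4.72) and (4.73) for assumed speeds v1 and v2 and verifies numerically that R + T = 1; the specific values of A, v1, and v2 are illustrative choices, not taken from the text.

    A = 1.0              # incident amplitude (arbitrary)
    v1, v2 = 1.0, 3.0    # assumed wave speeds on the two strings

    F = 2 * A / (1 + v1 / v2)                 # transmitted amplitude, (4.72)
    B = A * (1 - v1 / v2) / (1 + v1 / v2)     # reflected amplitude, (4.73)

    R = abs(B / A)**2                 # reflection coefficient
    T = (v1 / v2) * abs(F / A)**2     # transmission coefficient (Problem 4.5.3)
    print(F, B, R + T)                # R + T should come out to 1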

Problem 4.5.1 What happens to the F and B coefficients in the case v2 → 0? Does this make
sense?
Problem 4.5.2 For a wavelength λ1 on the left, is the wavelength on the right larger or
smaller than λ1 if v1 < v2 ?
Problem 4.5.3 The “reflection” coefficient is defined to be R ≡ |B/A|^2 and the “transmission” coefficient is T ≡ (v1/v2)|F/A|^2. These coefficients tell us “how much” of the
incident wave is reflected, and how much passes through to the other string. We must
then have R + T = 1; show that this is indeed the case.
Fig. 4.8 The distances involved in the relations (4.75) and (4.76). The lengths themselves are defined in terms of the
location of the masses at different times, and the equations enforce the idea that “something” is traveling from
one location to the other at constant speed.

4.6 Delays

One important feature of the wave equation is the finite speed v exhibited by its solutions.
That speed is set by the physical environment in which the wave equation appears. But
once set, all wave propagation occurs at this speed. That means that there are fundamental
speed limits in place, and information of whatever sort (sound, electromagnetic radiation,
pulses in a game of tug-of-war) arrive with a lag relative to their production.
If we think about a pair of masses connected by a spring, the force exerted by the spring
depends on the relative locations of the masses, but it takes some time for that information
to be transmitted from one particle to the other. We started with (3.1) for a pair of masses,
but in order to capture the idea that signals travel at finite speed, we should have
m1 ẍ1 (t) = k(x2 (t2 ) − x1 (t) − a)
(4.74)
m2 ẍ2 (t) = −k(x2 (t) − x1 (t1 ) − a).
Looking at the equation for x1 , the time t2 refers to the time at which the signal from the
second mass left, arriving at the first mass at time t. The signal from the second mass travels
along the spring with constant speed v (assuming a continuum spring model), so we get an
implicit equation governing t2 :
v(t − t2 ) = |x1 (t) − x2 (t2 )| (4.75)
which we need to solve for t2 . Similarly, t1 in (4.74) is the time at which a signal leaves the
first mass, arriving at the second mass at t, so that
v(t − t1 ) = |x1 (t1 ) − x2 (t)|. (4.76)
These algebraic relations are shown in Figure 4.8 where it is easy to see the geometry of the
delay. The force acting on the first particle depends on the location of the second particle
at time t2 , while the force acting on the second particle depends on the location of the first
particle at time t1 – there is no guarantee that, for example, those forces (at time t) are equal
and opposite.8
There is no obvious analytical way to solve (4.74) together with (4.75) and (4.76), and
even thinking about the solution is difficult. For example, if we specify the position and
8 Indeed, Newton’s third law is problematic in settings where these delays occur, which is almost all.

velocity of each particle at t = 0, we cannot solve the equations of motion since the
“forces” depend on the motion of the masses prior to t = 0. You can sidestep the problem
of the (extreme) increase in “initial” data that is required by imagining the masses at rest
in a fixed position for all times t < 0. Then there will be some time t∗ = (x2 (0) − x1 (0))/v
at which the particles begin interacting, and you still have to figure out, once the masses
are moving, at what time each influences the other.
Let’s simplify the problem to see the issues in a concrete setting. Take a single mass
attached to a wall with a spring, and suppose that the delay is constant. For a time constant
τ, we’ll take the equation of motion to be
mẍ(t) = −kx(t − τ) (4.77)
where we have set the equilibrium position of the spring to be at zero. Before we solve this,
we can look at the limiting behavior. Suppose τ is small (compared to some time scale of
interest), then we can expand the equation of motion
mẍ(t) ≈ −k(x(t) − τ ẋ(t)) = −kx(t) + kτ ẋ(t). (4.78)
This looks like a damped harmonic oscillator, but with the wrong sign for the damping.
The solution to this equation will involve exponential growth, which is problematic since
we do not observe that type of behavior in nature.
Going back to the full equation of motion (4.77), we can insert the usual guess, x(t) ∼ e^{αt}, and try to find α. Taking ω^2 ≡ k/m as always, we get
mα^2 e^{αt} = −k e^{αt} e^{−ατ}   −→   α^2 = −ω^2 e^{−ατ},   (4.79)
a transcendental equation. We’ll look at numerical solutions to these in Section 8.1.
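In the meantime, one can get a feel for solutions of (4.77) by brute-force time-stepping (a minimal sketch, not the Section 8.1 treatment): store the trajectory on a uniform grid so that x(t − τ) is just a lookup, and assume the mass sits at rest at x0 for all t < 0. The parameter values below are arbitrary assumptions.

    import numpy as np

    m, k, tau = 1.0, 1.0, 0.1     # assumed parameter values
    x0, dt, steps = 1.0, 1e-3, 20000
    lag = int(round(tau / dt))    # delay measured in time steps

    x = np.full(steps + 1, x0)    # x[j] approximates x(j*dt); held at x0 for t < 0
    v = np.zeros(steps + 1)
    for j in range(steps):
        x_delayed = x0 if j < lag else x[j - lag]   # the delayed position x(t - tau)
        a = -(k / m) * x_delayed
        v[j + 1] = v[j] + a * dt
        x[j + 1] = x[j] + v[j + 1] * dt             # semi-implicit Euler step
    print(x[-1])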
Problem 4.6.1 A charge moves according to w(t) = vt x̂ + d ŷ. Sketch the trajectory of the
particle (i.e. sketch the path it traces out in time). You are at the origin, x = y = 0.
You measure the electric field of the charge – where was the charge when it emitted
the electric field that you receive at t = 0 (electric field information travels at the
speed of light, c)?
Problem 4.6.2 For the delay differential equation
dx(t)/dt = −x(t − τ),   (4.80)
write the approximate ODE for τ small (using Taylor expansion). Solve this ODE
with x(0) = x0 .
Problem 4.6.3 Find an exact solution to the delay differential equation (4.80) for constant
τ = 1, and with “initial” value x(0) = 1. (Hint: look up the definition of the “product
log”).
Problem 4.6.4 A pair of equal and opposite charges (with the same mass) undergoes
uniform circular motion via their electrostatic interaction. Assuming no delay, draw
the charges at an instant in time and indicate the force directions on each. Now if
there is a delay (which there is, light takes a finite amount of time to get from one
charge to the other), sketch the forces on each charge at an instant in time – from
your force diagram, can you have uniform circular motion in a system with delays?

4.7 Shocks

The time-of-flight required for a “signal” to go from one place to another leads to some
interesting physics, even in the simplified case where the signal travels with constant speed.
If you clap your hands at time t = 0, making a sonic “point source,” the sound propagates
in air at roughly constant speed. As time goes on, the signal propagates further from the
source, making a sphere of influence that has radius R(t) = vt centered on the source.
Now suppose that the source is moving at constant speed vsrc , and the hand claps occur
at equally spaced times, so that the center of the source changes as time goes on as shown
on the left in Figure 4.9. The “x” marks the location of the source at different times, and
the circles centered on the “x”’s show the sphere of influence for the hand clap (meaning,
the furthest point that could hear the clap at time t). The circles get smaller moving from
left to right since less time has elapsed for the points to the right of the starting position.
In Figure 4.9, the source is moving faster than the sound speed, vsrc > v. On the right,
we see the situation in the rest frame of the source (the source is fixed, and I just shifted
the centers of the circles as if they moved with constant speed vsrc − v). Notice that the
rest frame of the source lies outside all of the spheres of influence, so that the source itself
hears nothing. Meanwhile, the spheres “pile up” behind the source forming a “cone” that
has a lot of different fronts superimposed. This is the “sonic boom” that you hear when an
airplane goes faster than the speed of sound.
Depending on the ratio of the source and sound speeds, that cone can open at different
angles. In Figure 4.10, we have a source moving with constant speed that is greater than
the sound speed. At time t = 0 the source emits a signal, then at time t the source’s location
together with the largest sphere of influence can be used to determine the “Mach angle,” θm ,
tan θm = v/vsrc.   (4.81)

Problem 4.7.1 Some jets can go faster than the speed of sound. What is the angle of the
Mach cone for a jet traveling at five times the speed of sound in air?

Fig. 4.9 On the left (“Moving source”), a moving source makes a sound (hand clap) at each “x” location. The circles show how far the sound
has travelled since it was generated. On the right (“In source rest frame”), the same picture but in the rest frame of the source.
Fig. 4.10 A source moves faster than the sound speed. A “Mach cone” forms behind the source, making an angle θm that is
determined by the ratio of the source and sound speeds.

4.8 Wave Equation with Varying Propagation Speed

Going back to the first-order wave equation developed in Section 4.3.1,


∂φ(x, t)/∂t = −v ∂φ(x, t)/∂x,   (4.82)
we assumed that v, a speed, was fixed. But that speed comes to us as part of the continuum
limit of identical balls connected by identical springs. If some of the springs had different
spring constants, or the mass of the balls changed, we could end up with a speed v that itself
depended on position (and potentially time as well). This is not an unreasonable situation
to consider – in air, the speed with which acoustic disturbances propagate can change
based on the local temperature, pressure, etc. How must the wave equation be modified
to incorporate a spatially varying v(x)? (We’ll return to questions like this in more detail
in Chapter 7.)
We could go back to the ball and spring model, but we can also develop (4.82) by
appealing to conservation of mass. Suppose we interpret φ(x, t) as a mass-per-unit-length,9
so that in an interval between x and x + dx, the mass contained at time t is φ(x, t)dx for
infinitesimal dx. Now in order for the mass contained in the interval to change to some new
value at time t + dt, it must be the case that some mass entered the interval from the left
(say), and some mass exited the interval on the right. That’s the statement of conservation:
no mass is created or destroyed in the interval. Let v(x, t) be the velocity of the mass at
location x, time t, then the amount of mass coming in from the left is φ(x, t)v(x, t)dt since
vdt is the length of material that enters from the left over the time interval dt. The amount
of mass exiting on the right over that same interval is φ(x + dx, t)v(x + dx, t)dt. A sketch
of the situation is shown in Figure 4.11.
The difference between the mass in the interval at time t + dt and the mass contained at
time t is accounted for by the contributions entering and leaving:
φ(x, t + dt)dx − φ(x, t)dx = φ(x, t)v(x, t)dt − φ(x + dx, t)v(x + dx, t)dt. (4.83)

9 That’s just a dimensional change from its usual interpretation as “displacement of material that has equilibrium
location x.”
Fig. 4.11 The mass in the region x → x + dx changes because mass flows in from the left with speed v(x, t) and out to the
right with speed v(x + dx, t).

If we Taylor expand in dt on the left, and in dx on the right, we get


(∂φ(x, t)/∂t) dx dt = −(∂/∂x)(φ(x, t)v(x, t)) dt dx   −→   ∂φ(x, t)/∂t = −(∂/∂x)(φ(x, t)v(x, t)).   (4.84)
This becomes (4.82) when v(x, t) is just a constant, and tells us how to account for local
changes in speed.
We can again use the method of characteristics to make some progress here. Assume
that the local speed, v(x), is constant in time, but varies spatially. Then the conservative
wave equation reads
∂φ(x, t)/∂t + v(x) ∂φ(x, t)/∂x = −φ(x, t) ∂v(x)/∂x.   (4.85)
Suppose we consider curves x(t) with ẋ(t) = v(x(t)) and x(0) = x0 . Along those curves,

dφ(x(t), t)/dt = ẋ(t) ∂φ(x, t)/∂x |_{x=x(t)} + ∂φ(x(t), t)/∂t = −φ(x(t), t) dv(x)/dx.   (4.86)

You find the curves x(t) by solving a first-order ODE, ẋ(t) = v(x(t)), then you can find the
value of φ(x, t) along those curves by solving the ODE (4.86) with φ(x(0), 0) given (see
Section 4.8.2). This is not necessarily an easy problem to solve analytically, but it can be
done numerically.
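A minimal numerical sketch of that procedure is below: it integrates ẋ(t) = v(x(t)) and (4.86) together with a forward Euler step. The particular v(x) and initial profile φ(x, 0) are made-up illustrative choices.

    import numpy as np

    def v(x):                    # assumed spatially varying speed
        return 1.0 + 0.5 * np.sin(x)

    def dvdx(x):
        return 0.5 * np.cos(x)

    def phi0(x):                 # assumed initial profile phi(x, 0)
        return np.exp(-x**2)

    x, phi = 0.5, phi0(0.5)      # start the characteristic at x0 = 0.5
    dt, steps = 1e-3, 2000
    for _ in range(steps):       # forward Euler along the characteristic
        x, phi = x + v(x) * dt, phi - phi * dvdx(x) * dt   # dx/dt = v, dphi/dt = -phi v'
    print(x, phi)                # the value phi(x(t), t) at t = steps*dt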

4.8.1 Conservation Laws


An equation of the form (4.84) is known as a “conservation law.” These show up a lot in
physics because there are many things that are not created or destroyed, and conservation
laws can be used to keep track of those quantities, like charge and mass, that change
because of a flow of the relevant quantity through a boundary. In the differential form
of (4.84), it is hard to see this association, so we switch to the “integral” form of the
conservation law.
Consider a portion of the x axis, from a → b with a < b. We can integrate both sides
of (4.84) from a to b to get:

d/dt ∫_a^b φ(x, t) dx = −(φ(b, t)v(b, t) − φ(a, t)v(a, t)).   (4.87)

On the left, we have the temporal variation of the total amount of φ(x, t)dx (the total amount
of “stuff,” mass in the setup from the previous section) contained between a and b. On
the right, we evaluate the product φv at the left and right boundary. The reason the total
amount of “stuff ” in the interval changes is because of stuff entering on the left at a, and
stuff leaving on the right at b (the signs are based on positive v pointing from a to b), as
expressed by the right-hand side. Notice that if nothing is flowing, so that v(x, t) = 0, then
nothing enters or leaves and the total amount of φ(x, t) found between a and b is just a
constant.

4.8.2 Traffic Flow


There is another class of varying v(x, t) that can be used to generate interesting nonlinear
wave equations. Suppose v(x, t) depended on φ(x, t) itself in some manner. This happens
in many places, notably fluid dynamics, where the speed with which sound propagates
depends on things like the density of air, which is itself changing due to wave-like motion
that depends on the local speed. As a fun example (described formally in [14]), if we take
φ(x, t) to be the density of cars along a road, then conservation of “number of cars” gives
∂φ(x, t)/∂t + (∂/∂x)(v(x, t)φ(x, t)) = 0.   (4.88)
To make a model for the local car speed v(x, t), we could assume that the speed is inversely
proportional to the car density itself. If there are no cars on the road, you travel at some
maximum speed (the speed limit) vmax . If cars are bumper-to-bumper with some maximum
density φmax , the speed is zero. The simplest (linear) relationship that captures these
assumptions is
 
v(x, t) = vmax (1 − φ(x, t)/φmax),   (4.89)
but now the wave equation itself has become nonlinear (in φ(x, t))
   
∂φ(x, t)/∂t + (∂/∂x)[ vmax (1 − φ(x, t)/φmax) φ(x, t) ] = 0.   (4.90)
One can still use the method of characteristics to solve this, up to a point. If we expand the
derivatives, this “traffic flow” equation becomes
 
∂φ(x, t)/∂t + vmax (1 − 2φ(x, t)/φmax) ∂φ(x, t)/∂x = 0,   (4.91)
and we can define characteristic curves x(t) that have slope
 
ẋ(t) = vmax (1 − 2φ(x, t)/φmax)   (4.92)
so that (4.91) is satisfied. Think of the initial situation: we are given φ(x, 0), and that
defines a local slope, for the curves governed by (4.92), that varies from one location to
another. For t ≈ 0, the characteristic curve emanating from x-location x0 has slope

Fig. 4.12 Some characteristic curves generated by the initial data, with slope given by the inverse of (4.93) (inverse because
we are plotting t vs. x here). The characteristic curves, along which φ(x(t), t) is constant, cross at some finite time.
But then the solution is double-valued, and not physically relevant.

 
φ(x0 , 0)
vmax 1 − 2 . (4.93)
φmax

The slope depends on both the initial values φ(x, 0) and on where the curve starts. It is
possible to get slopes such that characteristic curves that start close to one another end up
crossing, giving multiple values for φ(x, t) at certain points as shown in Figure 4.12. That
is not allowable (you can’t have both five cars per mile and two cars per mile at the same
location), and the solution is to allow a discontinuity in density to form. This discontinuity
is called a “shock,” and represents the same sort of discontinuous behavior as sound waves
piling up behind a fast-moving jet. Many nonlinear partial differential equations share this
behavior. Shocks typically form when physical variables in the problem travel faster than
the local speed in the wave equation.
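Since the characteristics here are straight lines, it is easy to estimate the crossing (shock-formation) time numerically: lay down a grid of starting points x0, compute each line's slope from (4.93), and find the earliest crossing of neighboring lines. The sketch below does this for an assumed smooth initial density (the tanh profile is an illustrative choice only).

    import numpy as np

    vmax, phimax = 1.0, 1.0                       # assumed constants
    x0 = np.linspace(-5, 5, 1001)                 # starting points of characteristics
    phi_init = 0.5 * phimax * (1 + np.tanh(x0))   # assumed initial density phi(x, 0)

    speed = vmax * (1 - 2 * phi_init / phimax)    # slope dx/dt from (4.93)

    # Neighboring characteristics x(t) = x0 + speed*t cross when a faster one
    # starts behind a slower one; the crossing time is dx0 / (speed_i - speed_{i+1}).
    ds = speed[:-1] - speed[1:]
    dx = x0[1:] - x0[:-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        tcross = np.where(ds > 0, dx / ds, np.inf)
    print("earliest crossing time:", tcross.min())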

4.8.3 Riemann Problem


One way to capture discontinuous shock-like behavior is to build it in from the start. The
“Riemann problem” is defined by an initial condition that has one constant value for x < 0
and another constant value for x > 0. The initial data, then, is
φ(x, 0) = { φℓ for x < 0,  φr for x > 0 } = φℓ + (φr − φℓ) θ(x)   (4.94)
for constants φℓ and φr (and step function, θ(x), defined in (2.133)).


Take the traffic flow equation in the form (4.91) as our example PDE. It is clear that
constant functions satisfy the PDE, and this motivates us to guess the following piecewise
constant “solution,”

φ(x, t) = φℓ + (φr − φℓ) θ(x − D(t)).   (4.95)



This tentative solution takes on constant values to the left and right of some location D(t).
We want to find D(t) such that this φ(x, t) solves (4.91). Clearly the initial condition (4.94)
is satisfied as long as D(0) = 0. What else constrains the function separating the two sides?
Think of the spatial and temporal derivatives here, using the derivative of the step function
from Problem 2.6.8:
∂φ(x, t)/∂x = (φr − φℓ) δ(x − D(t)),   ∂φ(x, t)/∂t = −(φr − φℓ) δ(x − D(t)) Ḋ(t),   (4.96)
and we can relate the two derivatives
∂φ(x, t)/∂t = −Ḋ(t) ∂φ(x, t)/∂x.   (4.97)
Using this relation in (4.91), we have
 
−Ḋ(t) ∂φ(x, t)/∂x + vmax (1 − 2φ(x, t)/φmax) ∂φ(x, t)/∂x = 0,   (4.98)
or written out,
 
Ḋ(t)(φr − φℓ) δ(x − D(t)) = vmax (1 − 2φ(x, t)/φmax) (φr − φℓ) δ(x − D(t)).   (4.99)
If we integrate both sides of this equation in x from x = −∞ → ∞, then the delta functions
go away,
 
Ḋ(t) = vmax (1 − 2φ(D(t), t)/φmax).   (4.100)
We will use this equation to isolate D(t), but it is worth noting that our φ(x, t) solves the
integrated form of (4.91) rather than the PDE itself. Such a solution is called a “weak”
solution, and these are relevant when there is a discontinuity built-in to the ansatz – the
PDE itself has trouble with functions φ(x, t) that have ill-defined derivatives (here, the
derivatives of φ(x, t) involve delta functions).
Back to (4.100), we have to evaluate φ(D(t), t) = φℓ + (1/2)(φr − φℓ), where the 1/2
comes from evaluating the step function at the discontinuity.10 Putting this in, we get a
constant (in time) right-hand side,
Ḋ(t) = vmax (1 − (φℓ + φr)/φmax)   −→   D(t) = vmax (1 − (φℓ + φr)/φmax) t   (4.101)
with D(0) = 0 required by the initial conditions.
The function D(t) is itself a line with constant slope, and it separates the left and
right values of the weak solution φ(x, t). The “shock” discontinuity can travel to the left
(negative slope) or right (positive slope) depending on the relative constants φℓ and φr
(see Ḋ(t) in (4.101)). In Figure 4.13, we have the maximum density on the right, so that
“cars” have to stop as they get to the more dense side, the discontinuity, then, moves left.
The characteristic curves emanating from initial points with density φℓ on the left and φr
on the right have slope given by

10 See Problem 2.6.8.


Fig. 4.13 The density on the right is maximal (here φℓ = φmax/3 and φr = φmax), so the shock discontinuity, D(t), has a negative slope, and proceeds to the left.
The characteristic curves, along which the value of φ(x, t) is constant, are shown for both left and right. These run
right into the discontinuity curve.

Fig. 4.14 Here, the constant densities on the left and right are tuned (φℓ = φmax/4, φr = φmax/3) so that the shock discontinuity moves to the right (D(t)
has a positive slope). Again, the characteristic curves from the left and right run into the discontinuity.

 
vℓ = vmax (1 − 2φℓ/φmax),   vr = vmax (1 − 2φr/φmax)   (4.102)

from (4.93). We can also get a shock that travels to the right, and the characteristic curves
and shock curve for φℓ = φmax/4 with φr = φmax/3 are shown in Figure 4.14.
There is another type of behavior we can get from this weak piecewise solution. Suppose
the characteristic curves coming from densities on the left have negative slope, while
characteristic curves coming from densities on the right have positive slope. Referring
to (4.102), this could happen if φℓ > φmax/2 and φr < φmax/2. In Figure 4.15, we see
that there is no clash of values (as happens when characteristic curves run into each other,
Fig. 4.15 Characteristic curves on the left have negative slope, while those on the right have positive slope (here φℓ = 3φmax/4 and φr = φmax/3). The curves do
not intersect, but they do leave a “void” of values in the center.

suggesting that a given point has two different values for density, precisely the situation
that is resolved by the weak solution with traveling discontinuity) but there is a growing
(in time) gap in which there is no information about the density.
Problem 4.8.1 Draw the solution to the Riemann problem for traffic flow with initial data
φℓ = φmax and φr = φmax/4.
5 Integration

In this chapter, we will study harmonic oscillators in a setting where we need to explicitly
integrate various different functions. We’ll start by developing a general integral solution to
the one-dimensional damped driven harmonic oscillator. Then we’ll move on to problems
for which a closed form trajectory is not immediately attainable, and show how to make
progress characterizing motion by, for example, computing its period.

5.1 First-Order ODEs

We start with a version of integration that takes us back to our work in Section 2.1. Any
ODE can be written as a set of first-order ODEs. As an example, take the harmonic
oscillator problem,
ẍ(t) = −ω2 x(t) with x(0) = x0 and ẋ(0) = v0 given. (5.1)
This equation can be written in terms of the vector
 
X(t) ≐ ( x(t), ẋ(t) )^T   (5.2)
as
d/dt ( x(t), ẋ(t) )^T = [ [0, 1], [−ω^2, 0] ] ( x(t), ẋ(t) )^T,   (5.3)
where the 2 × 2 matrix here is defined to be M,

so that we have
 
dX(t)/dt = M X(t)   with   X(0) = ( x0, v0 )^T.   (5.4)
This is, in a sense, just like ẋ(t) = mx(t) with solution x(t) = e mt x0 . The only difference is
that X(t) is a vector and M is a matrix. Still, it is tempting to take
X(t) = e Mt X(0) (5.5)
and try to appropriately define the exponential of a matrix. The most natural definition
relies on the series expansion from (1.51),
e^{Mt} = Σ_{j=0}^{∞} (t^j / j!) M^j.   (5.6)

The matrix multiplication makes it clear that the exponential of a matrix is itself a matrix of
the same (square) size as M itself. Suppose, further, that the eigenvalues and eigenvectors
of the matrix M are known. If the eigenvectors are the columns of a matrix V, and the
eigenvalues appear on the diagonal of the (diagonal) matrix L, then

M = VLV−1 (5.7)

from Section 3.3. The powers of M can be written in terms of V and L. Let’s start with M2
to see the pattern,

M2 = VLV−1 VLV−1 = VL2 V−1 . (5.8)

The diagonal matrix L has powers Lp that are themselves diagonal matrices with entries
that are the entries of L raised to the pth power,
L^p = diag( λ1^p, λ2^p, …, λn^p ).   (5.9)

Since we always pair a V with a V−1 (except the outermost pair), the pth power of M is

Mp = VLp V−1 (5.10)

and we can use this in (5.6)


e^{Mt} = Σ_{j=0}^{∞} (t^j / j!) V L^j V^{−1} = V [ Σ_{j=0}^{∞} (t^j / j!) L^j ] V^{−1}.   (5.11)

Since the matrix Lj is diagonal, the expression in brackets is just a set of exponentials
arrayed along the diagonal of a matrix e Lt , and we can simplify further,
e^{Mt} = V diag( e^{λ1 t}, e^{λ2 t}, …, e^{λn t} ) V^{−1} ≡ V e^{Lt} V^{−1}.   (5.12)

The solution for X(t) is

X(t) = Ve Lt V−1 X(0). (5.13)

For the matrix M in (5.3), we have eigenvalues λ 1 = −iω, λ 2 = iω with un-normalized


eigenvectors
v1 ≐ ( i/ω, 1 )^T,   v2 ≐ ( −i/ω, 1 )^T.   (5.14)

The matrices of interest for constructing X(t) are
V ≐ [ [i/ω, −i/ω], [1, 1] ],   e^{Lt} ≐ [ [e^{−iωt}, 0], [0, e^{iωt}] ],   (5.15)
and putting these into (5.13),
 
X(t) = ( x0 cos(ωt) + (v0/ω) sin(ωt),   v0 cos(ωt) − ω x0 sin(ωt) )^T   (5.16)
as always.
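The eigenvector route to e^{Mt} is also easy to check numerically. The sketch below builds M, forms V e^{Lt} V^{−1} as in (5.12) using numpy's eigendecomposition, and compares the first component of X(t) with x0 cos(ωt) + (v0/ω) sin(ωt); the parameter values are arbitrary assumptions.

    import numpy as np

    w, x0, v0, t = 2.0, 1.0, 0.5, 0.73       # assumed values
    M = np.array([[0.0, 1.0], [-w**2, 0.0]])

    lam, V = np.linalg.eig(M)                # eigenvalues (L) and eigenvectors (V)
    expLt = np.diag(np.exp(lam * t))         # e^{Lt} is diagonal
    expMt = V @ expLt @ np.linalg.inv(V)     # e^{Mt} = V e^{Lt} V^{-1}, as in (5.12)

    X = expMt @ np.array([x0, v0])
    print(X[0].real, x0 * np.cos(w * t) + (v0 / w) * np.sin(w * t))  # should agree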
To include damping, we would just take the equation of motion
ẍ(t) = −ω2 x(t) − 2bẋ(t) (5.17)
and find the new matrix M that comes from writing (5.17) as a pair of first-order ODEs:
d/dt ( x(t), ẋ(t) )^T = [ [0, 1], [−ω^2, −2b] ] ( x(t), ẋ(t) )^T ≡ M ( x(t), ẋ(t) )^T,   (5.18)

which modifies the eigenvalues and eigenvectors. The two eigenvalues are −b ± √(b^2 − ω^2),
which we recognize from our last pass at the damped harmonic oscillator in Section 2.1.

5.1.1 Driven Harmonic Oscillator

To include a driving term, we start with


ẍ(t) = −ω2 x(t) + f(t), x(0) = x0 ẋ(0) = v0 , (5.19)
for driving force F(t) ≡ mf(t), and again write in vector form to obtain a first-order ODE,
d/dt ( x(t), ẋ(t) )^T = [ [0, 1], [−ω^2, 0] ] ( x(t), ẋ(t) )^T + ( 0, f(t) )^T ≡ M X(t) + A.   (5.20)

This looks like a linear, first-order ODE with an “offset” A. Consider a solution X(t) =
e Mt (X(0) + Y(t)) for some new vector Y(t), then the derivative of X(t) is related to the
derivative of Y(t) by
Ẋ(t) = MX(t) + e Mt Ẏ(t), (5.21)

and if we put this into Ẋ(t) − MX(t) = A,


e Mt Ẏ = A, (5.22)
so that we can set1
Y(t) = ∫_0^t e^{−Mt̄} A(t̄) dt̄   (5.23)

1 Note that e −Mt is the matrix inverse of e Mt . The matrix e Mt = Ve Lt V−1 has inverse e −Mt = Ve −Lt V−1 .

and the solution is
X(t) = e^{Mt} [ X(0) + ∫_0^t e^{−Mt̄} A(t̄) dt̄ ].   (5.24)
Written in terms of the eigenvalue decomposition of M, we have
X(t) = V e^{Lt} V^{−1} X(0) + V e^{Lt} ∫_0^t e^{−Lt̄} V^{−1} A(t̄) dt̄.   (5.25)

Using the M for the harmonic oscillator (without damping) from (5.3), we can write out
the solution in terms of the integral of f(t)
X(t) = ( x0 cos(ωt) + (v0/ω) sin(ωt),   v0 cos(ωt) − ω x0 sin(ωt) )^T
+ ( (i/(2ω)) [ e^{−iωt} ∫_0^t e^{iωt̄} f(t̄) dt̄ − e^{iωt} ∫_0^t e^{−iωt̄} f(t̄) dt̄ ],
(1/2) [ e^{−iωt} ∫_0^t e^{iωt̄} f(t̄) dt̄ + e^{iωt} ∫_0^t e^{−iωt̄} f(t̄) dt̄ ] )^T.   (5.26)
In the language of our previous ODE solutions, the first term here is the homogeneous
piece, h(t), and the term with integrals is the sourced piece of the solution, denoted x̄(t)
previously. Recall that you got the top equation using a variation of parameters approach
in Problem 1.7.3.

5.1.2 Damped Driven Harmonic Oscillator


We can carry out the same setup for a damped, driven harmonic oscillator. Start with the
equation of motion from (5.17) with matrix form in (5.18). You will calculate the matrix
exponential in Problem 5.1.4, and get
e^{Mt} = e^{−bt} [ [ cos(ω̄t) + (b/ω̄) sin(ω̄t),  (1/ω̄) sin(ω̄t) ], [ −(ω^2/ω̄) sin(ω̄t),  cos(ω̄t) − (b/ω̄) sin(ω̄t) ] ],
ω̄ ≡ √(ω^2 − b^2).   (5.27)
Using this matrix exponential in (5.24), we can write out the integral solution to the
damped, driven harmonic oscillator problem for an arbitrary forcing function. To keep
things simple, let’s just record the position as a function of time,
x(t) = e^{−bt} [ x0 ( cos(ω̄t) + (b/ω̄) sin(ω̄t) ) + (v0/ω̄) sin(ω̄t)
+ (sin(ω̄t)/ω̄) ∫_0^t ( cos(ω̄t̄) + (b/ω̄) sin(ω̄t̄) ) e^{bt̄} f(t̄) dt̄
− (1/ω̄)( cos(ω̄t) + (b/ω̄) sin(ω̄t) ) ∫_0^t sin(ω̄t̄) e^{bt̄} f(t̄) dt̄ ].   (5.28)

We can try the familiar case of a single oscillatory driving frequency, as we did back
in Section 2.5. This time, we are getting the solution directly by integrating a bunch
of exponentials (with single-frequency driving, the driving force itself can be written
in exponential form). An example is shown in Figure 5.1 where we can see both x(t)
Fig. 5.1 The position, x(t), for a mass moving under the influence of a spring force, damping, and a driving force that is
sinusoidal with frequency that is not associated with the natural spring frequency. The driving force is shown as
the dashed curve so that you can see the transient solution decay, leaving an oscillatory piece with frequency that
matches the driving frequency.

Fig. 5.2 A square wave driving force (shown as dashed lines) and the position as a function of time of a damped oscillator
moving under the influence of the square driver.

for some set of ω, b and initial conditions, with the driving force overlaid. In that
case, the transient piece, associated with the initial conditions (i.e. the homogeneous
solution), dies out, leaving the driven part of the solution, which shares the driving
frequency.
As another example of interest, we can take a square wave driving force, and apply it
together with the damped oscillator forces to get x(t). An example here would be an RLC
circuit that is driven by a square wave voltage from a function generator. In Figure 5.2
we can see the curve of x(t) together with the square wave driving function. There is,
again, a notion of transient decay. As time goes on, the damped oscillator responds only
to the square wave force that periodically “pings” the oscillator which then decays due to
damping until another square wave pulse hits it.
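For forcing functions where the integrals in (5.28) are awkward to do by hand, they are easy to do numerically. The sketch below evaluates (5.28) for a sinusoidal drive using a simple trapezoidal rule; all parameter values are assumed, and the drive f(t) = f0 cos(σt) is just an illustrative choice.

    import numpy as np

    w, b, x0, v0, f0, sigma = 2.0, 0.3, 1.0, 0.0, 1.0, 3.0   # assumed values
    wbar = np.sqrt(w**2 - b**2)

    def f(t):                       # assumed driving (per unit mass)
        return f0 * np.cos(sigma * t)

    def trap(y, x):                 # simple trapezoidal rule
        return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

    def x_of_t(t, n=4001):
        tb = np.linspace(0.0, t, n)               # grid for the t-bar integrals
        g = np.exp(b * tb) * f(tb)
        c, s = np.cos(wbar * tb), np.sin(wbar * tb)
        I1 = trap((c + (b / wbar) * s) * g, tb)   # first integral in (5.28)
        I2 = trap(s * g, tb)                      # second integral in (5.28)
        Ct, St = np.cos(wbar * t), np.sin(wbar * t)
        homog = x0 * (Ct + (b / wbar) * St) + (v0 / wbar) * St
        driven = (St / wbar) * I1 - (1.0 / wbar) * (Ct + (b / wbar) * St) * I2
        return np.exp(-b * t) * (homog + driven)

    print(x_of_t(10.0))   # at late times only the driven piece survives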

Problem 5.1.1 Show, from its definition, that the matrix exponential of a matrix A ∈
Rn×n has:
d/dt e^{At} = A e^{At} = e^{At} A.
Problem 5.1.2 Given a function h(t̄), define the integral
I(t) = ∫_a^t h(t̄) dt̄
for constant a < t. Evaluate the derivative dI(t)/dt.
Problem 5.1.3 We are most interested in x(t) in a solution like (5.26), and since ẋ(t) is
recoverable from x(t), there is no need to record the second entry in X(t). Show that
ẋ(t) from (5.26) is, indeed, the time derivative of the x(t) term.
Problem 5.1.4 For the matrix
A ≐ [ [0, 1], [−ω^2, −2b] ],

what is e At (write the 2×2 matrix that you get, involving exponents and the constants
ω, b)?
Problem 5.1.5 A charged particle of mass m moves under the influence of an external force
F(t). Newton’s second law reads:
m ẍ(t) = F(t) + m τ d^3x(t)/dt^3   (5.29)
where τ is a constant. Write this third-order differential equation as three first-order
ones.
Problem 5.1.6 For (5.29), find the general integral solution analogous to (5.26) given F(t).
For initial conditions, use x(0) = x0 , ẋ(0) = v0 , and ẍ(0) = a0 .
Problem 5.1.7 Two masses (m) are connected by (identical) springs (spring constant k,
equilibrium length a). They move on a frictionless table under the influence of both
the springs and a constant force F0 pointing to the right as shown in Figure 5.3.

Fig. 5.3 Masses connected by identical springs, a constant force F0 pointing to the right acts on each mass
(for Problem 5.1.7).

Write down the equations of motion for the two masses. Convert this pair of second-
order ODEs into four first-order ODEs. Write your four equations in the form
dZ(t)/dt = A Z(t) + B for Z(t) ∈ R^4, A ∈ R^{4×4}, and B ∈ R^4.

Problem 5.1.8 Suppose we had a particle acted on by the force F(t) = mf0 Tδ(t), a “kick”
delivered at time t = 0. Using Newton’s second law, mẍ(t) = F(t), find x(t) for a
particle that starts from rest at x(0) = 0. Note the discontinuity here, since the force
violates the finiteness assumption we made in Problem 2.0.2.
Problem 5.1.9 Take the force F(t) = mf0 Tδ(t − t′) for constants m, f0, T, and t′ > 0,
in (5.19). Find the position as a function of time, x(t), for this impulsive driving
force with x(0) = 0 and ẋ(0) = 0.
Problem 5.1.10 For the time-dependent force

F(t) = { 0 for t < 0,   mf0 for 0 ≤ t < T,   0 for t ≥ T },

driving a harmonic oscillator, find the position as a function of time, x(t), from (5.26).
Problem 5.1.11 Using the expression in (5.26), find the position as a function of time for a
driving force F(t) = mf0 e iσt , with x(0) = x0 , ẋ(0) = v0 , and compare with the result
from Section 2.2.
Problem 5.1.12 Use the impulsive delta force, F(t) = mf0 Tδ(t) in (5.28) to find the position
as a function of time for a damped harmonic oscillator that gets “kicked” at t = 0.
Problem 5.1.13 Do the same for the force from Problem 5.1.10 to generate the first cycle in
Figure 5.2 (i.e. solve for t = 0 → T).

5.2 Two-Dimensional Oscillator

Suppose we have a mass m that can move in two dimensions, and is connected to a spring
that is attached to the origin, but can pivot in any direction. We’ll take the spring constant
to be k, and set its equilibrium location to the origin. The equations of motion are just two
copies of the usual one-dimensional case, one for each direction

mẍ(t) = −kx(t) mÿ(t) = −ky(t) (5.30)

and letting ω2 ≡ k/m, we can write the solutions

x(t) = A cos(ωt) + B sin(ωt) y(t) = F cos(ωt) + G sin(ωt), (5.31)

with constants A, B, F, and G waiting for initial or boundary values to pin them down. We
can achieve uniform circular motion, of radius R, for the mass by taking B = F = 0 and
setting A = G = R, so that

x(t) = R cos(ωt) y(t) = R sin(ωt). (5.32)



The vector pointing from the origin to the current (at time t) location of the mass is

r(t) = x(t) x̂ + y(t) ŷ = R(cos(ωt) x̂ + sin(ωt) ŷ) . (5.33)

The initial conditions associated with this solution are x(0) = R, ẋ(0) = 0, y(0) = 0, and
ẏ(0) = ωR. We know everything about this solution, the perimeter of the motion is 2πR,
the period of the motion is T = 2π/ω, all as usual.
What happens if we start the mass off at the same position, but give an initial velocity in
the ŷ direction that is not carefully tuned to produce uniform circular motion? Suppose we
take ẏ(0) = αωR for α some dimensionless constant – what type of motion do we have?
What is the perimeter of the motion in this case? What is the period?

5.2.1 Elliptical Motion


If we take the more general solution, associated with ẏ(0) = αωR,

x(t) = R cos(ωt) y(t) = αR sin(ωt), (5.34)

the vector pointing from the origin to a point along the curve at time t is

r(t) = R(cos(ωt) x̂ + α sin(ωt) ŷ) , (5.35)

and along this one-dimensional curve, the relationship between the x and y coordinates is
x^2/R^2 + y^2/(α^2 R^2) = 1.   (5.36)
An ellipse is defined as the set of points satisfying
x^2/a^2 + y^2/b^2 = 1.   (5.37)
The constants a and b set the horizontal and vertical extent of the ellipse. The smaller
of the two is called the “semi-minor” axis, and the larger is the “semi-major” axis.
Comparing (5.36) with this general form, it is clear that we have an ellipse with semi-
minor axis R and semi-major axis αR > R (for α > 1). A sketch is shown in Figure 5.4.

5.2.2 Perimeter
For the elliptical x(t) and y(t) from (5.34), we have r(t) ≡ x(t) x̂ + y(t) ŷ, and the tangent
vector to the curve is dℓ = ṙ(t)dt as shown in Figure 5.5. We want to find the length of the
tangent to the curve as t goes from t to t + dt,
dℓ = √( ṙ(t) · ṙ(t) ) dt = √( (ωR)^2 sin^2(ωt) + (αωR)^2 cos^2(ωt) ) dt.   (5.38)

Then the total distance traveled is obtained by integrating (over one full cycle, here)
L ≡ ∫ dℓ = ωR ∫_0^T √( sin^2(ωt) + α^2 cos^2(ωt) ) dt.   (5.39)
Fig. 5.4 An ellipse, centered at the origin, with semi-major axis b = αR and semi-minor axis a = R. Points on the ellipse have x and y
coordinates that satisfy (5.37), with the values of a and b coming from (5.36).

Fig. 5.5 A particle moves along a path in two dimensions. Between time t and t + dt, the particle moves from r(t) to
r(t + dt) and the infinitesimal displacement over that time interval is dℓ = ṙ(t)dt.

We know the time it takes for the particle to return to its starting location/velocity, that’s
just the usual period T = 2π/ω, so we are interested in the integral
L = ωR ∫_0^{2π/ω} √( sin^2(ωt) + α^2 cos^2(ωt) ) dt.   (5.40)
If α = 1, we should recover uniform circular motion, and we can check the perimeter in
this case:
L = ωR ∫_0^{2π/ω} dt = 2πR   (5.41)
as expected.
Going back, we want to evaluate the integral in (5.40) for arbitrary α. If we change
variables to p ≡ ωt, then the length becomes
L = αR ∫_0^{2π} √( 1 − β sin^2(p) ) dp   (5.42)

with β ≡ 1 − 1/α2 . This equation for the total length along the curve does not make any
reference to the angular frequency. That is sensible, since it’s the distance travelled in one
full cycle that we are probing, not how long it took the particle to traverse that distance.
For a circular orbit, we have β = 0 and recover the usual circumference. The integral itself
is called an “elliptic integral of the second kind,” and is defined by
E(φ, β) ≡ ∫_0^φ √( 1 − β sin^2(p) ) dp,   (5.43)

so that the total length in (5.42) is
L = ∫ dℓ = αR E(2π, β) = αR E( 2π, 1 − 1/α^2 ).   (5.44)

Approximation
If we have β ≪ 1, we can expand the integrand of E(φ, β),
√(1 − β sin^2(p)) ≈ 1 − (1/2) sin^2(p) β − (1/8) sin^4(p) β^2 − (1/16) sin^6(p) β^3 − (5/128) sin^8(p) β^4 − · · ·   (5.45)
and integrate each term by itself to get the approximation,
L = ∫ dℓ ≈ (R/√(1 − β)) ( 2π − (π/2) β − (3π/32) β^2 − (5π/128) β^3 − (175π/8192) β^4 − · · · ),   (5.46)

and we could use β ≡ 1 − 1/α 2 to write the approximation entirely in terms of α.
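It is straightforward to check (5.44) and the expansion (5.46) numerically; the sketch below computes E(2π, β) by a simple trapezoidal quadrature and compares the exact perimeter with the truncated series (the values of R and α are assumed).

    import numpy as np

    R, alpha = 1.0, 1.2                  # assumed semi-minor axis and ratio alpha
    beta = 1.0 - 1.0 / alpha**2

    p = np.linspace(0.0, 2 * np.pi, 200001)
    y = np.sqrt(1.0 - beta * np.sin(p)**2)
    E2pi = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(p))   # E(2*pi, beta) by quadrature

    L_exact = alpha * R * E2pi                           # perimeter, from (5.44)
    L_series = (R / np.sqrt(1 - beta)) * (2 * np.pi - (np.pi / 2) * beta
                - (3 * np.pi / 32) * beta**2 - (5 * np.pi / 128) * beta**3
                - (175 * np.pi / 8192) * beta**4)        # truncated (5.46)
    print(L_exact, L_series)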

Problem 5.2.1 What does the parameter β do to our ellipse? What type of ellipse is
described by β = 0, β = 1? Find the correction, for β = 1/10, to the circumference
for a circle of radius R.
Problem 5.2.2 For the elliptical trajectories in this section, what type of motion occurs when
α = 1? Evaluate the length integral (5.44) in this case and make sure you get what
you expect. What motion occurs when α = 0? Again, evaluate the total distance
travelled to make sure it makes sense (you will have to go back to (5.40) to do this).
Problem 5.2.3 Characterize the curve with x(t) = R cos(t), y(t) = R sin(t + φ) where
φ ∈ [0, π/2) is a constant and the parameter t goes from 0 → 2π.

5.3 Period of Motion

Conservation of energy is a powerful tool for characterizing the motion of a particle under
the influence of some conservative force. Even if the position as a function of time cannot
be obtained directly, we can use energy conservation to find the period of motion, provided
we are willing to carry out some integration. We’ll set up the period calculation in a

spherically symmetric setting where only the distance to the origin, r, is relevant. But
the same manipulations can be used to develop virtually identical expressions in a pure
one-dimensional setting.2
Suppose we have a potential energy function in two or three dimensions that depends
only on the distance to some origin. In that case the potential energy can be written
U(x, y, z) = U(r) with r ≡ (x 2 + y 2 + z 2 )1/2 . We already have an example of this
for our spring, U(r) = 12 kr2 describes the potential energy of a spring with constant k
and equilibrium at the origin. Quite generally, the equations of motion for a “spherically
symmetric” potential energy of this form are
m ẍ(t) = −U′(r) (x/r),   m ÿ(t) = −U′(r) (y/r),   m z̈(t) = −U′(r) (z/r).   (5.47)
The statement of conservation of energy itself gives us a relation between the speed
v ≡ √( ṙ(t) · ṙ(t) ) and the potential. For a particle of mass m and total energy E, we have
(1/2) m v^2 + U(r) = E,   (5.48)
and we can evaluate the speed in spherical coordinates, {r, θ, φ}, where (see Figure B.2)
x = r sin θ cos φ y = r sin θ sin φ z = r cos θ. (5.49)
Taking the time derivatives and forming v2 = ẋ 2 + ẏ 2 + ż 2 in these new coordinates gives
(1/2) m ( ṙ^2 + r^2 θ̇^2 + r^2 sin^2θ φ̇^2 ) + U(r) = E.   (5.50)
From the rotational form of Newton’s second law,
dL/dt = τ ≡ r × F   (5.51)
with F ∥ r̂ (for F coming from U(r)), we see that the torque is zero, so the angular
momentum L ≡ r × p is constant, L̇ = 0. If we set the motion initially in the xy plane by
taking z = 0 and pz = 0, then L(t = 0) = Lz ẑ. From this initial condition, we see that the
motion will remain in the xy plane since L(t) = L(t = 0) for all times. We can, then, set
θ = π/2 and θ̇(t) = 0 (amounting to z(t) = 0, ż(t) = 0 here). In spherical coordinates, the
z-component of angular momentum is
   
Lz = py x − px y = m ṙ sin φ + r cos φ φ̇ r cos φ − m ṙ cos φ − r sin φ φ̇ r sin φ

= mr2 φ̇
(5.52)

giving us an expression for φ̇ in terms of the constant Lz : φ̇ = Lz /(mr2 ).


With these dynamical observations in place, conservation of energy reads
(1/2) m ṙ^2 + Lz^2/(2mr^2) + U(r) = E.   (5.53)
2 Indeed, systems in which the angles of spherical coordinates do not appear, like spherically symmetric ones,
are almost one-dimensional. The difference is that in one dimension, we have a variable x : −∞ → ∞ whereas
the radial coordinate is confined to the half-line, r : 0 → ∞.

Suppose we start at r0 with ṙ = 0. Then the constant value of E is just


E = Lz^2/(2mr0^2) + U(r0),   (5.54)
and we can solve (5.53) for ṙ up to sign:
  
ṙ = ± √( (2/m) [ (Lz^2/(2m)) (1/r0^2 − 1/r^2) + U(r0) − U(r) ] ).   (5.55)
For periodic motion, we expect a return to the starting radius in some amount of time. That
means the sign of ṙ has to change (the mass goes in towards the center, then turns around
and comes back, for example). We can find the time it takes to make it to that change of
sign by integrating (5.55), assuming an initially negative value for ṙ,
√(m/2) ∫_{r0}^{R} [ (Lz^2/(2m)) (1/r0^2 − 1/r^2) + U(r0) − U(r) ]^{−1/2} dr = −∫_0^{T̄} dt = −T̄,   (5.56)

where R is the value for which ṙ is zero as determined by (5.55), and T̄ is how long it
takes to get to R. We are looking at half the total period, then. There are other values of R
we could use, and those select different fractions of the period, but it’s hard to know what
fraction if we don’t already have a good sense of the motion.

5.3.1 Harmonic Oscillator Period


Let’s think about the harmonic oscillator again in this new language. For a one-dimensional
oscillator, we would set Lz = 0, no angular momentum, and take U(r) = kr2 /2, with r
playing the role of x here. The left-hand side of (5.56) is
√(m/2) ∫_{r0}^{R} [ (k/2) (r0^2 − r^2) ]^{−1/2} dr = √(m/k) (1/r0) ∫_{r0}^{R} dr/√(1 − (r/r0)^2).   (5.57)

We can evaluate the integral by letting x ≡ r/r0 ,


√(m/k) (1/r0) ∫_{r0}^{R} dr/√(1 − (r/r0)^2) = √(m/k) ∫_1^{R/r0} dx/√(1 − x^2).   (5.58)

Integrals that involve terms like 1 − x 2 benefit from trigonometric substitution (since
cos2 θ + sin2 θ = 1) as we shall see in Section 5.4. If we take x = sin θ, then
∫_1^{R/r0} dx/√(1 − x^2) = ∫_{sin^{−1}(1)}^{sin^{−1}(R/r0)} dθ = sin^{−1}(R/r0) − sin^{−1}(1),   (5.59)
and putting this back into (5.56), we have
√(m/k) ( sin^{−1}(R/r0) − π/2 ) = −T̄.   (5.60)
Finally, we need to know at what value of R does ṙ = 0? We can solve for R algebraically
by setting ṙ = 0 in (5.55),
0 = (2/m) [ (k/2) (r0^2 − R^2) ],   (5.61)
so that R = −r0 is the first solution that occurs, and then
T̄ = −√(m/k) ( sin^{−1}(−1) − π/2 ) = π √(m/k).   (5.62)
This is, of course, only half of the full period; it takes just as long to get back to r0 from
−r0, so the period is T = 2T̄ = 2π√(m/k) as usual.
It is interesting to see that the period has emerged without reference to the oscillatory
solution itself. We don’t need to know the solution in order to find the period. We also
note the lack of dependence on the initial extension, a famous property of (non-relativistic)
simple harmonic oscillators.

5.3.2 Pendulum Period


Another physical system that reduces to simple harmonic motion in some appropriate
limit is the “real” pendulum. We’ve already treated the pendulum in the small-angle
approximation where the motion is approximately harmonic in Section 1.3. Referring to
Figure 1.7, the energy, from (1.37) is
E = (1/2) m (Lθ̇)^2 + mgL(1 − cos θ).   (5.63)
Initially, we’ll let the pendulum bob go from rest at an angle of θ0 so that E = mgL(1 −
cos θ0 ) is the total energy. Then we can find θ̇ in terms of θ using (5.63),
θ̇^2 = (2g/L)(cos θ − cos θ0).   (5.64)
We’ll again consider just a portion of the motion, from θ = θ0 → 0 (corresponding to a
quarter of the full period), where θ̇ ≤ 0 for the whole integration. Using (5.64) to generate
the analogue of (5.56), we have

√(L/(2g)) ∫_{θ0}^{0} dθ/√(cos θ − cos θ0) = −T̄.   (5.65)

Now for the change of variables. If we could turn cos θ into − sin2 φ, we would be in
more familiar territory. Let3 θ = 2φ, then

T̄ = √(L/(2g)) ∫_0^{θ0/2} 2 dφ / √( 1 − 2 sin^2 φ − cos θ0 )
= √(L/(2g)) (1/√(1 − cos θ0)) ∫_0^{θ0/2} 2 dφ / √( 1 − (2/(1 − cos θ0)) sin^2 φ )   (5.66)
= √( 2L/(g(1 − cos θ0)) ) ∫_0^{θ0/2} dφ / √( 1 − csc^2(θ0/2) sin^2 φ ).

3 So that we can use the identity cos(2φ) = 1 − 2 sin2 φ.


Fig. 5.6 The pendulum period (in units of √(L/g)) from (5.68) for initial angles θ0 = 0 → π/2; the period rises from 2π to ≈ 7.42.

The integral in (5.66) is again a named one; it is called the “elliptic integral of the first
kind,” and defined by
F(φ, β) = ∫_0^φ dp / √( 1 − β sin^2(p) ),   (5.67)

where the integrand is just the inverse of the integrand in (5.43). In terms of the elliptic
integral, the period of the pendulum, T = 4T̄, is
T = 4 √( 2L/(g(1 − cos θ0)) ) F( θ0/2, csc^2(θ0/2) ).   (5.68)

We can plot the period as a function of initial angle θ0; this is shown in Figure 5.6, where
T is in units of √(L/g). The period starts at 2π (associated with the period of the simple
pendulum at all starting angles) in those units, and rises to ≈ 7.42 at θ0 = π/2.
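As a numerical check of Figure 5.6, the sketch below evaluates the period using the standard complete-elliptic-integral form T = 4√(L/g) K(sin^2(θ0/2)), which is equivalent to (5.68), computing K by simple quadrature, and compares with the simple pendulum value 2π√(L/g). The choices θ0 = π/2 and L = g = 1 are assumptions made for the example.

    import numpy as np

    L, g, theta0 = 1.0, 1.0, np.pi / 2          # assumed values
    mpar = np.sin(theta0 / 2) ** 2              # elliptic parameter, sin^2(theta0/2)

    t = np.linspace(0.0, np.pi / 2, 200001)
    y = 1.0 / np.sqrt(1.0 - mpar * np.sin(t) ** 2)
    K = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t))   # K(m) by trapezoid rule

    T = 4.0 * np.sqrt(L / g) * K
    print(T, 2 * np.pi * np.sqrt(L / g))        # ~7.416 vs. the simple-pendulum 2*pi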

Problem 5.3.1 What is the period of a pendulum of length 1 m that starts from rest at an
initial angle θ0 = π/4? What is the “simple” pendulum period in this case? What
percent error does one make in using the simple pendulum period?

5.4 Techniques of Integration

In the last section, we encountered two different types of integration. In the case of the
harmonic oscillator period, the integral was familiar once a trigonometric substitution
was in place, and we could evaluate the integral in terms of simple functions that we
recognized. In the case of the period of a pendulum, the integral itself was defined to
be an “elliptic integral,” and could not be simplified into familiar functions. In this section,
we’ll review some standard integral substitutions that can help in the former case.

Given an integral to evaluate,


I = ∫_a^b f(x) dx,   (5.69)

where the function f(x) and the limits a and b are given, what tools are available to us?
Most one-dimensional simplifications start (and many end) with a “change of variables”:
Take some function of x (inspired by the integrand) g(x) and let y = g(x) relate x to the
new variable y. Then
dy = (dg(x)/dx) dx   −→   dx = dy / (dg(x)/dx).   (5.70)

If we define the values ya ≡ g(a) and yb ≡ g(b), then we can write the integral as
I = ∫_{ya}^{yb} f(x) [ dy / (dg(x)/dx) ]   (5.71)

with the understanding that we must replace any x appearing in the integrand with its
expression in terms of y, obtained by inverting the defining relationship y = g(x). The goal
of this type of approach is to find a function g(x) that makes the integral easy to carry out
in the new variable y. This procedure is what took us from the integral in (5.58) to the
significantly simpler one in (5.59).
As an example, take f(x) = 1/(x + c)^α for some constant c and α ≠ 1. We’ll let
g(x) = x + c, simplifying the denominator of f(x). Then dy = dx and we need the “inverse,”
y = x + c → x = y − c, easy in this case. The limit points become ya = a + c and yb = b + c,
and using these in (5.71) gives
I = ∫_{a+c}^{b+c} y^{−α} dy = [ y^{−α+1}/(−α + 1) ]_{y=a+c}^{y=b+c} = (1/(1 − α)) ( 1/(b + c)^{α−1} − 1/(a + c)^{α−1} ).   (5.72)

For a slightly more involved example, let f(x) = x/(x^2 + c)^α (again with α ≠ 1), so we
want to evaluate
I = ∫_a^b x/(x^2 + c)^α dx.   (5.73)

This time, take y = g(x) ≡ x 2 + c. The choice is motivated by the fact that dy = 2xdx and
xdx appears in the numerator of (5.73). Then working through the same steps gives
I = (1/2) ∫_{a^2+c}^{b^2+c} y^{−α} dy = (1/(2(1 − α))) [ y^{−α+1} ]_{y=a^2+c}^{y=b^2+c} = (1/(2(1 − α))) ( 1/(b^2 + c)^{α−1} − 1/(a^2 + c)^{α−1} ).   (5.74)
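Results like (5.72) and (5.74) are easy to spot-check numerically; the sketch below compares (5.74) against a simple trapezoidal evaluation of the original integral, with assumed values of a, b, c, and α.

    import numpy as np

    a, b, c, alpha = 0.0, 2.0, 1.0, 2.5        # assumed values (alpha != 1)

    x = np.linspace(a, b, 200001)
    y = x / (x**2 + c)**alpha
    I_num = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))          # trapezoid rule
    I_closed = (1.0 / (2 * (1 - alpha))) * (1.0 / (b**2 + c)**(alpha - 1)
                                            - 1.0 / (a**2 + c)**(alpha - 1))
    print(I_num, I_closed)    # should agree to several digits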

5.4.1 Trigonometric Substitution


There is a special class of integrals where the most useful substitution involves trigono-
metric functions. In general, if an integrand depends on the special combination 1 − x 2 ,

then substitutions of the form y ≡ g(x) = sin^{−1}(x) can be useful. We’ve already seen an
example of this type of substitution in Section 5.3.1 with the integrand 1/√(1 − x^2), where
taking x = sin(y) gives
I = ∫_a^b dx/√(1 − x^2) = ∫_{sin^{−1}(a)}^{sin^{−1}(b)} (1/cos(y)) cos(y) dy = sin^{−1}(b) − sin^{−1}(a)   (5.75)

as in (5.59). Of course, we have the potential problem that if a or b is larger than 1, the
arcsine will return a complex value.4
As another example, take f(x) = 1/(1 − x 2 )3/2 , just one additional power in the
denominator. This time, if we take x = sin(y), we get
I = ∫_a^b dx/(1 − x^2)^{3/2} = ∫_{sin^{−1}(a)}^{sin^{−1}(b)} (1/cos^3(y)) cos(y) dy = ∫_{sin^{−1}(a)}^{sin^{−1}(b)} dy/cos^2(y).   (5.76)

The integral of 1/ cos2 (y) appearing on the far right can be simplified by noting that
d tan(y)/dy = (d/dy)( sin(y)/cos(y) ) = 1 + sin^2(y)/cos^2(y) = 1/cos^2(y),   (5.77)
so
I = ∫_{sin^{−1}(a)}^{sin^{−1}(b)} dy/cos^2(y) = tan(y) |_{y=sin^{−1}(a)}^{sin^{−1}(b)}.   (5.78)

Before evaluating the limits, we can write tan(y) in terms of x = sin(y): tan(y) = x/√(1 − x^2), and
I = [ x/√(1 − x^2) ]_{x=a}^{b} = b/√(1 − b^2) − a/√(1 − a^2).   (5.79)
There are some integrals that do not clearly depend on 1 − x 2 , but have integrands that
can be transformed to depend on 1 − x 2 . Take
I = ∫_a^b dx/√(A + Bx^2)   (5.80)

for constants A and B. If we factor out the A and let y = i√(B/A) x, then using Problem 1.6.4
to evaluate sine with complex argument,
I = ∫_{i√(B/A)a}^{i√(B/A)b} (1/√A) (1/√(1 − y^2)) dy/(i√(B/A)) = (1/(i√B)) ( sin^{−1}( i√(B/A) b ) − sin^{−1}( i√(B/A) a ) )
= (1/√B) ( sinh^{−1}( √(B/A) b ) − sinh^{−1}( √(B/A) a ) ).   (5.81)

The inverse trigonometric functions like arcsine and arccosine are related to logarithms,
which is a sort of generalized inverse for both trigonometric and hyperbolic trigonometric

4 Of course, if |a| or |b| is greater than one, the integrand itself will be complex for some values of x, a warning
sign.

functions. It is sometimes useful to write integrals involving, for example, arcsine in terms
of logarithms. To do that, note that log(e^{iθ}) = iθ, so
sin^{−1}(sin θ) = −i log( e^{iθ} ) = −i log( cos θ + i sin θ ),   (5.82)
and if we let x ≡ sin θ, then
sin^{−1} x = −i log( √(1 − x^2) + ix ).   (5.83)
To get the hyperbolic form, you could let θ → −iη in (5.82) or equivalently, start with
sinh−1 (sinh(η)) = log(eη ) = log(cosh η + sinh η) , (5.84)
let x = sinh(η) and use the hyperbolic relation cosh2 η − sinh2 η = 1 to get
 
sinh^{−1} x = log( √(1 + x^2) + x ).   (5.85)
Going back to our integrand of interest from (5.81), this time in indefinite integral form,
∫ dx/√(A + Bx^2) = (1/√B) sinh^{−1}( √(B/A) x ) + C
= (1/√B) log( F [ √(B/A) x + √(1 + (B/A) x^2) ] ),   (5.86)
where C was a constant of integration that we then took to be C = log(F) giving a
multiplicative F inside the argument of log.

5.4.2 Integration by Parts


Another important tool is integration by parts. Given two functions f(x) and g(x),
integration by parts reads
∫_a^b f′(x) g(x) dx = f(x)g(x) |_{x=a}^{b} − ∫_a^b f(x) g′(x) dx,   (5.87)
where primes refer to x-derivatives. The proof of the formula comes directly from the
fundamental theorem of calculus and the product rule. The fundamental theorem of
calculus tells us that
∫_a^b (d/dx)( f(x)g(x) ) dx = f(x)g(x) |_{x=a}^{b},   (5.88)
and at the same time, using the product rule, we have
∫_a^b (d/dx)( f(x)g(x) ) dx = ∫_a^b f′(x)g(x) dx + ∫_a^b f(x)g′(x) dx.   (5.89)
Since the left-hand sides of (5.88) and (5.89) are the same, the right-hand sides must also
be equal, giving
f(x)g(x) |_{x=a}^{b} = ∫_a^b f′(x)g(x) dx + ∫_a^b f(x)g′(x) dx,   (5.90)
whence (5.87) follows.

We’ll apply integration by parts to an example integral,


I = ∫_0^{2π} x sin x dx,   (5.91)

where we take f′(x) = sin x, g(x) = x; then using (5.87),
I = −x cos x |_{x=0}^{2π} − ∫_0^{2π} (−cos x) dx = −2π + ∫_0^{2π} cos x dx = −2π.   (5.92)

In the next example, we’ll use integration by parts to evaluate the integral
J ≡ ∫_a^b x^2/(1 − x^2)^{3/2} dx.   (5.93)

The sneaky trick is to start by considering an integral that we already know how to evaluate,
I = ∫_a^b dx/√(1 − x^2) = sin^{−1}(b) − sin^{−1}(a).   (5.94)
Now reimagine the integrand on the left in the context of integration by parts: take f(x) = x
and g(x) = 1/√(1 − x^2); then f′(x)g(x) is the integrand in I. Applying integration by parts,
I = [ x/√(1 − x^2) ]_{x=a}^{b} − ∫_a^b x^2/(1 − x^2)^{3/2} dx,   (5.95)

and the second term is precisely J. Using our expression for I from (5.94), we have
∫_a^b x^2/(1 − x^2)^{3/2} dx = b/√(1 − b^2) − a/√(1 − a^2) + sin^{−1}(a) − sin^{−1}(b).   (5.96)

Problem 5.4.1 For the example f(x) = 1/(x + c)α , why doesn’t the expression in (5.72)
apply when α = 1? Evaluate the integral in that case.
Problem 5.4.2 It is often the case that we need to evaluate derivatives of inverse trigonomet-
ric functions like sin−1 (x). A nice way to find the expressions for these derivatives
is to let f(x) ≡ sin−1 (x), then take the derivative of both sides of the equation
sin(f(x)) = x with respect to x and use the chain rule to isolate the derivative of f(x)
with respect to x. Carry out this program to find the derivative of sin−1 (x), cos−1 (x),
and tan−1 (x).
Problem 5.4.3 Sometimes an integrand’s dependence on 1 − x 2 can be obscured by
decorative constants. Evaluate the integral
I = ∫_a^b du/√(Au^2 + B)
for constants A and B by first changing variables to get an integrand that clearly
depends on the combination 1 − x 2 (don’t be afraid to use complex numbers in your
change of variables).

Problem 5.4.4 For integrands that depend on the combination 1 − x 2 , we took sin(y) = x
since 1 − x 2 was easy to evaluate in that case. Just as easy is to take cos(y) = x, what
happens to the integral
I = ∫_a^b dx/√(1 − x^2)
if you use this substitution instead?
Problem 5.4.5 Write cos−1 (x) and cosh−1 (y) in terms of the logarithm function.
Problem 5.4.6 Evaluate the indefinite integrals (don’t forget the constant of integration),
write your results in terms of logarithms when inverse trigonometric functions
appear,
  
∫ dx/√(1 − x^2),   ∫ dx/(1 − x^2)^{3/2},   ∫ dx/√(1 + x^2),
∫ dx/(1 + x^2)^{3/2},   ∫ x dx/√(1 − x^2),   ∫ x dx/(1 − x^2)^{3/2};

check your results by differentiating each and verifying that you recover the
integrand.
Problem 5.4.7 Evaluate the integral
I = ∫_a^b x^2/√(1 − x^2) dx
using integration by parts.
Problem 5.4.8 Evaluate the definite integral:
I = ∫_0^1 √(1 − x^2) dx.

Problem 5.4.9 We can use integrals similar to the one-dimensional ones in this section to
find areas for axially symmetric “surfaces of revolution.” Given a surface extending
from z = 0 to z = ℓ, and with a provided function s(z) that tells us the radius of the
surface at a particular height z, the infinitesimal surface area element can be written,
referring to Figure 5.7, as
da = √( dz^2 + (s′(z)dz)^2 ) 2πs(z) = 2π √( 1 + s′(z)^2 ) s(z) dz,
and the area integral for the entire surface is
A = 2π ∫_0^ℓ √( 1 + s′(z)^2 ) s(z) dz.

Work out the function s(z) for a cylinder of radius R and a sphere of radius R, and
check that you get the correct surface areas in both those cases.
Fig. 5.7 A surface obtained by rotating a curve around the z axis. At a height z, the radius is given by s(z). The strip shown
on the upper figure, and blown up in the lower (by cutting it and unrolling), has infinitesimal area given by
da = 2π√(1 + s′(z)^2) s(z)dz.

5.5 Relativistic Oscillator

Returning now to oscillator physics – as another modification that reduces to the simple
harmonic oscillator in some limit, we can think about a mass moving at relativistic speeds
(see [6] for a broader discussion). One immediate problem with simple harmonic oscillator
motion is its maximum speed. Working in one dimension, conservation of energy gives
(for a mass m that starts at x0 from rest)
(1/2) m v^2 + (1/2) k x^2 = (1/2) k x0^2   −→   v^2 = (k/m)( x0^2 − x^2 ),   (5.97)

which is maximized at x = 0. The maximum speed is k/mx0 which could be greater than
the speed of light. That is forbidden by special relativity, and the fundamental shift that
avoids this problem is a redefinition of the total energy. In nonrelativistic mechanics, the
total energy has a kinetic, mv2 /2 piece, and a potential energy U(x) that is given. There,
mv2 /2 + U(x) = E, while in special relativity, energy conservation for the same potential
energy function is:

$$\frac{m c^2}{\sqrt{1 - v^2/c^2}} + U(x) = E. \qquad (5.98)$$

This form has two important properties: (1) for v ≪ c, it reduces to the nonrelativistic expression,
$$\frac{m c^2}{\sqrt{1 - v^2/c^2}} + U(x) \approx m c^2\left(1 + \frac{1}{2}\frac{v^2}{c^2}\right) + U(x) = m c^2 + \frac{1}{2} m v^2 + U(x) \qquad (5.99)$$
with an undetectable constant offset (the mc2 out front on the right) and (2) it prevents
speeds greater than c. We can see this by solving (5.98) for v²,
$$v^2 = c^2\left[1 - \left(\frac{m c^2}{E - U(x)}\right)^2\right] \qquad (5.100)$$

with E − U(x) ≥ mc2 so that the term we subtract from 1 is itself less than or equal to one.
Let’s find the period of an oscillating mass in this relativistic setting. We can use (5.100)
directly as we did in the nonrelativistic case. Suppose the mass starts from rest at an initial
extension of x0, then E = mc² + (k/2)x₀², and initially (just after t = 0) the mass has v < 0, so that
$$\int_{x_0}^{0} \frac{dx}{c\sqrt{1 - \left(\frac{m c^2}{E - U(x)}\right)^2}} = -\int_0^{\bar{T}} dt = -\bar{T} \qquad (5.101)$$

where we know that the time it takes for the mass to go from x0 → 0 is a quarter of the full
period. Now for some substitutions, let α ≡ E/(mc2 ), then

$$\bar{T} = \frac{1}{c}\int_0^{x_0} \frac{dx}{\sqrt{1 - \left(\dfrac{1}{\alpha - k x^2/(2 m c^2)}\right)^2}}, \qquad (5.102)$$


and take q ≡ √(k/(2m)) x/c, with q₀ ≡ √(k/(2m)) x₀/c, to get
$$\bar{T} = \sqrt{\frac{2m}{k}}\int_0^{q_0} \frac{\alpha - q^2}{\sqrt{(\alpha - q^2)^2 - 1}}\, dq. \qquad (5.103)$$

The integral is ready for trigonometric substitution: define θ by q = √(1 + α) sin θ, then
$$\bar{T} = \sqrt{\frac{2m}{k}}\int_0^{\theta_0} \frac{\left(\alpha\cos^2\theta - \sin^2\theta\right)\sqrt{1+\alpha}}{\sqrt{(\alpha^2 - 1)\cos^2\theta - 2(\alpha+1)\sin^2\theta}}\, d\theta = \sqrt{\frac{2m}{k}}\int_0^{\theta_0} \frac{\left(\alpha - (1+\alpha)\sin^2\theta\right)\sqrt{1+\alpha}}{\sqrt{(\alpha-1)(\alpha+1)}\sqrt{1 - \frac{\alpha+1}{\alpha-1}\sin^2\theta}}\, d\theta \qquad (5.104)$$

with θ0 ≡ sin⁻¹(q₀/√(1 + α)). The integrand here is made up of the two pieces in the numerator. The first term is again an elliptic integral familiar from (5.67). We have another term that looks like
$$\int_0^{\phi} \frac{\sin^2(p)}{\sqrt{1 - \beta\sin^2(p)}}\, dp, \qquad (5.105)$$

and writing the integrand's numerator as sin²(p) = (1/β)[1 − (1 − β sin²(p))], we can write the integrand in terms of a sum
$$\int_0^{\phi} \frac{\sin^2(p)}{\sqrt{1 - \beta\sin^2(p)}}\, dp = \frac{1}{\beta}\int_0^{\phi} \frac{dp}{\sqrt{1 - \beta\sin^2(p)}} - \frac{1}{\beta}\int_0^{\phi} \sqrt{1 - \beta\sin^2(p)}\, dp = \frac{1}{\beta}\left(F(\phi, \beta) - E(\phi, \beta)\right). \qquad (5.106)$$
Returning to (5.104) with these in place (in this setting, the generic β takes on the value
β = (α + 1)/(α − 1)),
$$\bar{T} = \sqrt{\frac{2m}{k}}\,\frac{1}{\sqrt{\alpha-1}}\left[\alpha F\!\left(\theta_0, \tfrac{\alpha+1}{\alpha-1}\right) + (\alpha-1)\left(E\!\left(\theta_0, \tfrac{\alpha+1}{\alpha-1}\right) - F\!\left(\theta_0, \tfrac{\alpha+1}{\alpha-1}\right)\right)\right] = \sqrt{\frac{2m}{k}}\,\frac{1}{\sqrt{\alpha-1}}\left[F\!\left(\theta_0, \tfrac{\alpha+1}{\alpha-1}\right) + (\alpha-1)\, E\!\left(\theta_0, \tfrac{\alpha+1}{\alpha-1}\right)\right]. \qquad (5.107)$$

Noting that E = mc² + kx₀²/2 by the initial conditions, we have α = 1 + q₀² with q₀ ≡ √(k/(2m)) x₀/c, and then the period of the motion is
$$T = 4\bar{T} = 4\sqrt{\frac{2m}{k}}\,\frac{1}{q_0}\left[F\!\left(\sin^{-1}\!\left(\frac{q_0}{\sqrt{2+q_0^2}}\right),\; 1 + \frac{2}{q_0^2}\right) + q_0^2\, E\!\left(\sin^{-1}\!\left(\frac{q_0}{\sqrt{2+q_0^2}}\right),\; 1 + \frac{2}{q_0^2}\right)\right]. \qquad (5.108)$$

If we take the q0 → 0 limit, we recover the familiar T = 2π√(m/k). In the other limit, as
q0 → ∞, we can gain insight by thinking about the physics of the motion. If you pull
the mass way way back and release it, there is a huge force initially, and a large initial
acceleration. So the mass speeds up to its maximum (at c) quickly. Once the mass has gone
to −x0 , it slows, stops, and turns around. In the ultra-relativistic limit, then, the mass moves
back and forth between x0 and −x0 at the speed of light. Its period, in that setting, is just
T = 4x0 /c, independent of k and m. In the nonrelativistic limit, the period is independent
of x0 and in the ultra-relativistic limit, it depends only on x0 . In between, the dependence
is on x0, k, and m together in the complicated fashion expressed in (5.108), and displayed in Figure 5.8, where we plot T(q0) in units of √(m/k). Beyond the information the period
provides, the particle’s trajectory itself can be found numerically, and you will explore that
process in Section 8.2.
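Though the closed form (5.108) is what we will use, it is easy to check numerically by integrating (5.101) directly. Here is a minimal Python sketch (assuming NumPy and SciPy are available; the function names and the unit choice m = k = c = 1 are ours, purely for illustration):

```python
import numpy as np
from scipy.integrate import quad

def period(q0, m=1.0, k=1.0, c=1.0):
    """Relativistic period from (5.101); the full period is T = 4*Tbar."""
    x0 = q0 * c * np.sqrt(2.0 * m / k)          # invert q0 = sqrt(k/(2m)) x0/c
    E = m * c**2 + 0.5 * k * x0**2               # released from rest at x0
    def integrand(u):                            # substitute x = x0 sin(u) to tame the turning point
        x = x0 * np.sin(u)
        v = c * np.sqrt(1.0 - (m * c**2 / (E - 0.5 * k * x**2))**2)
        return x0 * np.cos(u) / v
    Tbar, _ = quad(integrand, 0.0, np.pi / 2)
    return 4.0 * Tbar

for q0 in [0.01, 1.0, 100.0]:
    x0 = q0 * np.sqrt(2.0)                       # with m = k = c = 1
    print(q0, period(q0), 2 * np.pi, 4 * x0)     # compare with the two limiting values
```

For small q0 the computed period approaches 2π√(m/k), and for large q0 it approaches 4x0/c, in line with the limits discussed above.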

Fig. 5.8 The relativistic period as a function of initial extension (via q0) from (5.108), in units of √(m/k).
Problem 5.5.1 Show that E − U(x) ≥ mc2 for all x in special relativity. This property
ensures that the expression on the right in (5.100) is never negative (at which point
the particle would have to travel with imaginary speed).

5.6 Relativistic Lengths

In special relativity, there is a different notion of length, one that extends the usual
Pythagorean sum-of-squares to include time, and enforces the constancy of the speed of
light (in inertial frames moving with constant relative velocity). Suppose we have a curve
with parameter λ in one spatial dimension and time, so that we are given t(λ) and x(λ).
Then the infinitesimal “Minkowski” length, squared, is given by

$$ds^2 = -c^2\, dt^2 + dx^2 = \left[-c^2\left(\frac{dt(\lambda)}{d\lambda}\right)^2 + \left(\frac{dx(\lambda)}{d\lambda}\right)^2\right] d\lambda^2. \qquad (5.109)$$

The relative sign of the temporal and spatial pieces is important, and is a defining property
of Minkowski space–time, but we could put minus signs on the spatial piece, and use a
“+” for the temporal piece. The choice makes a difference in certain expressions, but not,
of course, for any physical predictions.
As an example, suppose a particle travels at constant speed from the origin to location
a along the x axis at time T. The motion can be written in temporal-parametrization as
x(t) = at/T. The Pythagorean distance travelled is, of course, a. What is the relativistic
length of this curve? Since we are using time to parametrize the motion, we have λ = t and dt(λ)/dλ = 1. Then the infinitesimal length is
$$ds = \sqrt{-c^2 + \left(\frac{dx(t)}{dt}\right)^2}\, dt = i c\sqrt{1 - \frac{a^2}{c^2 T^2}}\, dt. \qquad (5.110)$$
The imaginary number i shows up out front, but we could eliminate it by taking the other
sign convention in (5.109). To find the total Minkowski length of the path, we integrate
$$S = i c\int_0^T \sqrt{1 - \frac{a^2}{c^2 T^2}}\, dt = \sqrt{a^2 - c^2 T^2} \qquad (5.111)$$
and the temporal motion makes a (big) difference here.
It is, of course, possible to move through time by itself, with no spatial motion at all
(that happens when, for example, you stand still). If x(t) = 0 for t = 0 → T, we would say
that the Pythagorean distance travelled is zero (remaining at the origin for the whole time
of interest), but the Minkowski distance is
$$S = i c\int_0^T dt = i c\, T, \qquad (5.112)$$
which is related to the distance travelled by light in a time T. This special type of
motion, through time and not space, is always achievable for massive particles, and the
corresponding temporal evolution is referred to as “proper time,” usually denoted τ. In
special relativity, then, there are two natural types of time: (1) the coordinate time that
is one of the elements of the new “spacetime” coordinate system (x, y, z, and ct) and (2)
proper time, the time in the rest frame of a moving particle. The rest frame of a moving
particle is an interesting concept. Think of wearing a watch on your wrist. As you move
through spacetime, along whatever trajectory you like, the watch moves with you – an
external observer would see it moving along with you, but relative to you, the watch is at
rest (as is evidenced by the fact that it remains at the same spot on your wrist). You are the
“rest frame” of the watch. This doesn’t mean that you have to be at rest relative to some
other observer, just that the watch is at rest relative to you. From the watch’s point of view,
staring at the same patch of your arm, it is moving only through time.
The relation that defines proper time mathematically is:
$$-c^2\, d\tau^2 = -c^2\, dt^2 + dx^2, \qquad (5.113)$$
enforcing the notion that the total Minkowski distance travelled is the same in the watch’s
rest frame (on the left) or any other (the right-hand side, an external observer would say
that the watch is moving through time and space). The presence of two different times
leads to all sorts of strange behavior. One of the most famous is the “twin paradox.”

The Twin Paradox


You have two people who are the same age standing on the surface of the earth. One of
them blasts off, heads away from the earth, then turns around and comes back while the
other stays on the earth at rest. Question: How much time has elapsed for each twin? To be
concrete, suppose the twin that leaves moves according to
x(t) = R + a(1 − cos(ωt)), (5.114)
starting at the surface of the earth, R, at t = 0, going out to R + 2a, and returning to R at
t∗ = 2π/ω. These times all refer to the coordinate time, and we can find the Minkowski
distance travelled by the moving twin in either the rest frame of the moving twin, which
has ds2 = −c2 dτ 2 or in the frame of the twin on earth, in which case we need to use
ds2 = −c2 dt 2 + dx 2 . The relation (5.113) tells us the two distances must be the same. If
we use the proper time, then the integral is easy to set up:
$$S = i c\int_0^{\tau^*} d\tau = i c\,\tau^* \qquad (5.115)$$
where τ ∗ is the proper time associated with the coordinate time t∗ . But we don’t know τ ∗ ,
that’s not given in the problem, and the relation between t and τ is, in a sense, what we are
trying to identify. So we must instead compute the total distance travelled using the known
t and x(t) in the earth's frame,
$$S = i c\int_0^{t^* = 2\pi/\omega} \sqrt{1 - \frac{1}{c^2}\left(\frac{dx(t)}{dt}\right)^2}\, dt \qquad (5.116)$$
with dx(t)/dt = aω sin(ωt), so that
$$S = i c\int_0^{t^* = 2\pi/\omega} \sqrt{1 - \frac{a^2\omega^2}{c^2}\sin^2(\omega t)}\, dt. \qquad (5.117)$$

In this form, it is clear that an elliptic integral will play a role, and we perform the usual
substitutions to get the integral to look like the right-hand side of (5.43), let ωt ≡ φ, then
$$S = \frac{i c}{\omega}\int_0^{2\pi} \sqrt{1 - \frac{a^2\omega^2}{c^2}\sin^2\phi}\, d\phi. \qquad (5.118)$$

The maximum speed of the moving twin is aω, so let βmax ≡ aω/c,
$$S = \frac{i c}{\omega}\, E(2\pi, \beta_{\max}^2). \qquad (5.119)$$
Now that we have the total (Minkowski) distance travelled, calculated from the earth’s
point of view, we can use the equality from (5.113) to find τ ∗ , since S = icτ ∗ ,
$$\tau^* = \frac{1}{\omega}\, E(2\pi, \beta_{\max}^2) = \frac{t^*}{2\pi}\, E(2\pi, \beta_{\max}^2). \qquad (5.120)$$
A plot of τ ∗ in units of t∗ , as a function of the maximum speed (ratio with c) βmax is
shown in Figure 5.9. For βmax small, τ ∗ ≈ t∗ as should be expected from our low-speed
experience where the proper time and coordinate time are indistinguishable. But as the
maximum speed of the moving twin goes up, the amount of time elapsed in its rest frame,
τ ∗ , goes down. The moving twin returns to earth younger than the one that remained on
the earth.
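The proper-time integral here is also easy to evaluate numerically. The following minimal Python sketch (assuming NumPy and SciPy; scipy.special.ellipeinc computes E(φ, m) = ∫₀^φ √(1 − m sin²t) dt, which appears to match the convention used in (5.120) with m = β²max) compares direct integration of (5.117) with the elliptic-integral form:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import ellipeinc

def proper_time_ratio(beta_max, omega=1.0, c=1.0):
    """tau*/t* for the travelling twin with x(t) = R + a(1 - cos(omega t))."""
    t_star = 2.0 * np.pi / omega
    a = beta_max * c / omega                      # maximum speed is a*omega = beta_max*c
    dxdt = lambda t: a * omega * np.sin(omega * t)
    tau, _ = quad(lambda t: np.sqrt(1.0 - (dxdt(t) / c)**2), 0.0, t_star)
    tau_elliptic = ellipeinc(2.0 * np.pi, beta_max**2) / omega   # equation (5.120)
    return tau / t_star, tau_elliptic / t_star

for beta in [0.1, 0.5, 0.9, 0.99]:
    print(beta, proper_time_ratio(beta))          # ratio drops below 1 as beta grows
```

Both evaluations agree, and the ratio τ*/t* falls below one as βmax grows, reproducing the behavior plotted in Figure 5.9.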

Problem 5.6.1 In space–time, the manner in which we travel through time matters. Suppose
we have motion in one spatial dimension: x(t) = at 2 /T2 for a length a and time T.
What is the Pythagorean distance travelled by a particle moving according to this
x(t) as t goes from 0 → T ? What is the Minkowski distance travelled in the same
time interval?

Fig. 5.9 The total time elapsed in the rest frame of the twin that leaves the earth, τ*, as a function of the maximum speed of the moving twin (βmax ≡ vmax/c).
Problem 5.6.2 For uniform circular motion of radius R, x(t) = R cos(ωt) and y(t) =
R sin(ωt). In this two-dimensional case, the infinitesimal Minkowski length is
ds2 = −c2 dt 2 + dx 2 + dy 2 .
Using this expression find the Minkowski distance travelled in one full cycle.
Compare with the “usual” distance travelled by a particle undergoing circular
motion.
Problem 5.6.3 Take the time derivative of (5.98) and show that you can write the result as
$$m\ddot{x}(t) = F(x(t))\left(1 - \frac{\dot{x}(t)^2}{c^2}\right)^{3/2} \qquad (5.121)$$
given a conservative force F(x) = −dU(x)/dx. The right-hand side defines a relativistic "effective" force; what happens to it when the particle speed approaches c?
Problem 5.6.4 What is the force F(x) that causes the motion (5.114) (for this relativistic
problem, you must use the relativistic form of Newton’s second law (5.121))?
Problem 5.6.5 Solve (5.121) for a constant force F(x) = F0 given the initial conditions:
x(0) = b and ẋ(0) = 0.
6 Waves in Three Dimensions

Most of the work we have done thus far has been in one spatial dimension. It is easy
to imagine a network of masses connected by springs such that oscillation can occur in
multiple directions at once. In this chapter, we will move the discussion of both oscillations
and waves to three dimensions. There are a number of changes that happen in this
expanded setting. First, the equations of motion for springs connecting masses become
nonlinear and thus much harder to solve (we cannot use the linear algebraic solutions
from Chapter 3). Second, and the focus of this chapter, the wave equation has solutions
that are more interesting in three dimensions, even in the static case, than in one dimension.
Those higher-dimensional solutions require vector calculus, and we will develop some of
the fundamental results from vector calculus slowly, taking time to make contact with
problems of interest that come up as we go.

6.1 Vectors in Three Dimensions

The vector calculus we need exists in three dimensions, and to start, we’ll briefly review
the discussion from Section 3.1 in the specific ℝ³ setting. In three dimensions, we have
three orthogonal, independent directions. To each, we assign a basis vector that points in
the increasing-coordinate direction and has unit length. These basis vectors are denoted
e 1 ≡ x̂, e 2 ≡ ŷ and e 3 ≡ ẑ. Any three-dimensional vector can then be expressed as a
linear combination of these three. As an example, v = 3 x̂ + 2 ŷ − ẑ represents a vector
whose tail is at the origin, and whose tip is at x = 3, y = 2 and z = −1.
In general, we will use variables like v1, v2, and v3 (as in Section 3.1¹) to indicate the
components of v = v1 x̂ + v2 ŷ + v3 ẑ (equivalently, we sometimes indicate the direction
in the component name, so you will also see vx , vy, and vz in v = vx x̂ + vy ŷ + vz ẑ).
All of the operations we defined in general hold in three dimensions. Given two vectors
v = v1 x̂ + v2 ŷ + v3 ẑ and w = w1 x̂ + w2 ŷ + w3 ẑ, the dot product is

$$v\cdot w \equiv \sum_{j=1}^{3} v_j w_j. \qquad (6.1)$$

1 Aside from consistency with our previous vector discussions, keeping the indices lowered allows us to
unambiguously raise the components of a vector to powers. There is mathematical significance to the up/down
placement of component indices, and technically, all component indices should be up, but we will have no
reason to distinguish between the up/down placement in any technical sense.
Fig. 6.1 Two vectors v and w define a plane. Align the x̂ axis with the vector ŵ as shown.

The length of a vector v is given by its dot product with itself,


$$v \equiv \sqrt{v\cdot v} = \sqrt{\sum_{j=1}^{3} (v_j)^2}. \qquad (6.2)$$

The dot product can be understood as a projection of one vector onto another. Given two
vectors v and w, define the unit vectors v̂ ≡ v/v, ŵ ≡ w/w. Any two nonparallel vectors
define a plane, and take the plane spanned by v and w to be the xy plane. In addition, align
the x̂ axis with ŵ, so that we have the setup shown in Figure 6.1.
With this orientation and alignment, we have w = w1 x̂ and v = v1 x̂ + v2 ŷ. The dot
product is then
v · w = v1 w1 . (6.3)
Referring to Figure 6.1 again, we can write v1 = v cos θ where θ is the angle between v
and w and v is the length of v, as always. Then since the length of w is just w1 , we have
v · w = vw cos θ (6.4)
with v cos θ telling us the amount of v that points in the ŵ direction.
In addition to the dot product, there is a special vector product found in three dimensions,
the “cross product.” For v and w in the usual Cartesian coordinates, the cross product is
defined to be
$$v\times w \equiv \det\begin{pmatrix} \hat{x} & \hat{y} & \hat{z} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{pmatrix} = (v_2 w_3 - w_2 v_3)\,\hat{x} - (v_1 w_3 - w_1 v_3)\,\hat{y} + (v_1 w_2 - w_1 v_2)\,\hat{z}. \qquad (6.5)$$
That’s the general definition, if we go back to the setup in Figure 6.1, we have
v × w = −w1 v2 ẑ. (6.6)
This time, v2 = v sin θ, so we can write the cross product geometrically
v × w = −vw sin θ ẑ. (6.7)
The magnitude of the cross product gives the projection of v onto the ŷ axis, with direction
given by −ẑ, into the page in Figure 6.1. In general, the direction of the cross product is
obtained using the right-hand rule: Point your fingers in the direction of v, bend them in
the direction of w, then your thumb points in the direction of v × w. From the right-hand
rule it should be clear that w × v = −v × w.
Any two vectors that aren’t parallel (or anti-parallel) can be used to define a set of three
directions that spans the space ℝ³. Given v and w, which span a two-dimensional plane,
we augment with v × w, which is perpendicular to both v and w, to get an orthogonal
third dimension. The dot and cross product magnitudes tell us the projection of the vectors
onto each other.
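As a concrete check of (6.4) and (6.7), here is a short NumPy sketch (the example vectors are arbitrary choices, not from the text):

```python
import numpy as np

v = np.array([3.0, 2.0, -1.0])
w = np.array([1.0, 0.0, 2.0])

dot = np.dot(v, w)
cross = np.cross(v, w)
theta = np.arccos(dot / (np.linalg.norm(v) * np.linalg.norm(w)))   # from v.w = v w cos(theta)

print(dot, cross, np.degrees(theta))
print(np.dot(cross, v), np.dot(cross, w))          # cross product is perpendicular to both
print(np.linalg.norm(cross),
      np.linalg.norm(v) * np.linalg.norm(w) * np.sin(theta))       # |v x w| = v w sin(theta)
```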
Problem 6.1.1 Given v = x̂ − ŷ and w = 2 x̂ + 3 ŷ, what is the angle between v and w?
Problem 6.1.2 Show that A × (B × C) = B(A · C) − C(A · B).
Problem 6.1.3 Given any two nonparallel unit (length one) vectors, â and v̂, what is the
angle between â and b ≡ v̂ − (â · v̂)â?
Problem 6.1.4 Suppose that you made a unit vector out of b from the previous problem,
b̂ = b/b. Show that taking ĉ ≡ â×b̂, you have a three-dimensional basis (orthogonal
unit vectors). What are â × ĉ and b̂ × ĉ?
Problem 6.1.5 Given two (nonparallel) vectors v and w, show that the magnitude of the cross product, v × w, is equal to the area of the parallelogram spanned by the two vectors (shown in Figure 6.2).

Fig. 6.2 A parallelogram spanned by the vectors v and w. In this problem, you will show that the area of the parallelogram is the magnitude of v × w.
6.2 Derivatives

For one-dimensional functions like f(x), there is only one derivative to consider, df(x)/dx,
defined in the usual way. In two or more dimensions, functions can depend on multiple
independent arguments. Then we must be able to probe the derivative in each direction
independently. For a function f(x, y, z) there are three partial derivatives (as defined in
Section 4.1). These can be usefully made into a vector of sorts, the “gradient,” defined by
$$\nabla f(x, y, z) \equiv \frac{\partial f(x, y, z)}{\partial x}\,\hat{x} + \frac{\partial f(x, y, z)}{\partial y}\,\hat{y} + \frac{\partial f(x, y, z)}{\partial z}\,\hat{z}. \qquad (6.8)$$

6.2.1 Gradient
The gradient has a natural interpretation. Suppose we want to compare the value of the
function f at x, y, and z to its value at a nearby point with coordinates x + Δx, y + Δy, z + Δz, for "small" {Δx, Δy, Δz}. We can Taylor expand f(x + Δx, y + Δy, z + Δz) to first
order in the small variables,
$$f(x + \Delta x, y + \Delta y, z + \Delta z) \approx f(x, y, z) + \frac{\partial f(x, y, z)}{\partial x}\,\Delta x + \frac{\partial f(x, y, z)}{\partial y}\,\Delta y + \frac{\partial f(x, y, z)}{\partial z}\,\Delta z. \qquad (6.9)$$
Let Δ ≡ Δx x̂ + Δy ŷ + Δz ẑ, then we can write the expansion as
f(x + Δx, y + Δy, z + Δz) ≈ f(x, y, z) + ∇f(x, y, z) · Δ. (6.10)
Now let’s say you wanted to choose the vector Δ so as to maximize the difference f(x +
Δx, y + Δy, z + Δz) − f(x, y, z). Given the dot product’s geometric interpretation, we have
∇f · Δ = |∇f|Δ cos θ where θ is the angle between the gradient of f and the vector Δ. To
maximize the difference, we take θ = 0, which makes Δ ∥ ∇f. The gradient, then, points in
the direction of maximum increase for the function f(x, y, z).
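A small numerical experiment illustrates this "steepest ascent" property. The Python sketch below (the test function, evaluation point, and step size are arbitrary choices, not from the text) estimates the gradient by central differences and compares the change in f along several unit directions:

```python
import numpy as np

def f(r):
    x, y, z = r
    return np.exp(-(x**2 + 2*y**2 + 3*z**2))       # arbitrary smooth test function

def grad_f(r, h=1e-6):
    """Central-difference estimate of the gradient of f at the point r."""
    g = np.zeros(3)
    for i in range(3):
        dr = np.zeros(3); dr[i] = h
        g[i] = (f(r + dr) - f(r - dr)) / (2 * h)
    return g

r0 = np.array([0.4, -0.2, 0.1])
g = grad_f(r0)
eps = 1e-3
directions = [g / np.linalg.norm(g),               # along the gradient
              np.array([1.0, 0, 0]), np.array([0, 1.0, 0]), np.array([0, 0, 1.0])]
for d in directions:
    print(d, f(r0 + eps * d) - f(r0))              # gradient direction gives the largest increase
```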
If we take the vector operator
$$\nabla \equiv \hat{x}\,\frac{\partial}{\partial x} + \hat{y}\,\frac{\partial}{\partial y} + \hat{z}\,\frac{\partial}{\partial z}, \qquad (6.11)$$
which is waiting to act on functions like f(x, y, z) (hence the placement of the derivatives to
the right of the basis vectors), and treat it as if it were itself a vector, then we can introduce
dot and cross products involving ∇.

Springs in Three Dimensions


The gradient can be used to find the spring force in three dimensions. In one dimension,
the potential energy associated with a spring is U = 1/2k(x − a)2 where k is the spring
constant, and a is the equilibrium length. In three dimensions, the potential energy is U = (1/2)k(√(r · r) − a)² where r ≡ x x̂ + y ŷ + z ẑ. Then the associated force is given by the negative gradient of the potential energy,
$$F = -\nabla U = -k\,(r - a)\,\hat{r} = -k\,(\vec{r} - a\,\hat{r}). \qquad (6.12)$$
Notice that the force here is no longer linear in r, the position of the particle attached to
the spring, with the nonlinear r̂ as the culprit.
For some initial conditions, the behavior of the mass attached to the spring is the familiar
one-dimensional, oscillatory motion. That happens when the initial velocity is zero, or
parallel to the initial position vector. In that case, you may as well define that direction
to be x̂ and rewrite the problem as a one-dimensional one. If the initial velocity vector is
not parallel to the initial position, we can get other behaviors like the elliptical orbits of
Section 5.2.1.
Another place we are forced to take a higher-dimensional view of the particle motion is
when a mass is attached to a spring that is not attached to the origin. Suppose we start with
a spring that has one end fixed at b, the other end attached to a mass m at r. The potential
energy function must now depend on the difference between the magnitude of r − b and
the equilibrium length a. The energy takes the form
$$U = \frac{1}{2}\, k\left(|\vec{r} - \vec{b}| - a\right)^2 \qquad (6.13)$$
Fig. 6.3 Some representative points with the vector directions of r = x x̂ + y ŷ + z ẑ (solid) and φ = −y x̂ + x ŷ
(dotted) shown. In evaluating r, we have set z = 0 to display the vectors in the xy plane.

with force
$$F = -\nabla U = -k\left(|\vec{r} - \vec{b}| - a\right)\frac{\vec{r} - \vec{b}}{|\vec{r} - \vec{b}|}, \qquad (6.14)$$
which is again nonlinear, and has the direction b built in. The nonlinearity in r means that
we cannot just write the equations of motion in matrix-vector form, with integral solution
given by expressions analogous to those developed in Section 5.1, nor are there natural normal modes we can use to build solutions as in Section 3.4.

6.2.2 Divergence and Curl

Given a vector valued function f (x, y, z) ≡ fx (x, y, z) x̂ + fy (x, y, z) ŷ + fz (x, y, z) ẑ (using


coordinate labels for the components of f ), we can take the dot product of ∇ with f :
$$\nabla\cdot f = \frac{\partial f_x}{\partial x} + \frac{\partial f_y}{\partial y} + \frac{\partial f_z}{\partial z}. \qquad (6.15)$$
This derivative, itself a single function of position, is called the “divergence.” To under-
stand the name, and gain some intuition for its operation, consider two vector functions:
r ≡ x x̂+y ŷ+z ẑ and φ ≡ −y x̂+x ŷ. The vector r points from the origin to any point with
coordinates {x, y, z} as shown in Figure 6.3. It “diverges” from the origin with magnitude
that gets larger as the points get further away from the origin. The divergence of r is
$$\nabla\cdot r = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} + \frac{\partial z}{\partial z} = 3 \qquad (6.16)$$
a constant.²
2 Notice its dimensional significance. In four dimensions with coordinates x, y, z, and w, you’d have r = x x̂ +
y ŷ + z ẑ + w ŵ with ∇ · r = 4.
The other vector function, φ, also shown in Figure 6.3, points counter-clockwise along a circle of radius s ≡ √(x² + y²) and increases in magnitude with distance from the origin. The divergence of this function is
$$\nabla\cdot\phi = -\frac{\partial y}{\partial x} + \frac{\partial x}{\partial y} = 0. \qquad (6.17)$$
The divergence operation returns zero for vectors like φ that “curl around” a point, and
nonzero for vectors like r that “diverge” from a point.
If we act on our generic f (x, y, z) with ∇ using the cross product, then we have the “curl”
of f:
$$\nabla\times f \equiv \left(\frac{\partial f_z}{\partial y} - \frac{\partial f_y}{\partial z}\right)\hat{x} - \left(\frac{\partial f_z}{\partial x} - \frac{\partial f_x}{\partial z}\right)\hat{y} + \left(\frac{\partial f_y}{\partial x} - \frac{\partial f_x}{\partial y}\right)\hat{z}. \qquad (6.18)$$
The curl of r is zero, ∇ × r = 0. The curl is insensitive to functions that “diverge” from a
point. Meanwhile,
∇ × φ = 2 ẑ (6.19)
a constant.³ The direction of the curl of φ is given by the right-hand rule: Curl the fingers of your right hand in the direction of the vector function, your thumb points in the direction
of the curl of the vector. Since φ points counter-clockwise in the xy plane, the right-hand
rule predicts a direction out of the page, the ẑ direction. The curl tells us the extent to which
a vector function “curls around” a point (more appropriately, an infinite line going through
the point perpendicular to the plane of the “curling” vector function).
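The divergence and curl of r and φ can also be checked symbolically. Here is a minimal SymPy sketch (the helper functions divergence and curl are ours, written out from (6.15) and (6.18)):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def divergence(F):
    return sp.diff(F[0], x) + sp.diff(F[1], y) + sp.diff(F[2], z)

def curl(F):
    return (sp.diff(F[2], y) - sp.diff(F[1], z),
            sp.diff(F[0], z) - sp.diff(F[2], x),
            sp.diff(F[1], x) - sp.diff(F[0], y))

r_vec   = (x, y, z)          # the "diverging" field r
phi_vec = (-y, x, 0)         # the "curling" field phi

print(divergence(r_vec), curl(r_vec))       # 3 and (0, 0, 0)
print(divergence(phi_vec), curl(phi_vec))   # 0 and (0, 0, 2)
```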
Suppose we had a vector function whose magnitude depends only on the distance from the origin, and whose direction is radially outward, f = f(√(x² + y² + z²)) r̂. We could write this as f = f(r) r̂ since the magnitude of the vector r is precisely r = √(x² + y² + z²). What is the curl of this f? Because f points radially away from the origin, we expect its curl to be zero, but let's check,
$$\nabla\times f = \nabla\times\left(\frac{f(r)}{r}\, r\right) = \nabla\times\left(\frac{f(r)}{r}\,(x\,\hat{x} + y\,\hat{y} + z\,\hat{z})\right) \qquad (6.20)$$
where we have used the fact that r = r r̂ to isolate the components of f . The x-derivative
of r is
$$\frac{\partial r}{\partial x} = \frac{\partial}{\partial x}\left(x^2 + y^2 + z^2\right)^{1/2} = \frac{2x}{2\sqrt{x^2 + y^2 + z^2}} = \frac{x}{r} \qquad (6.21)$$
and similarly for the y and z derivatives. Using these to evaluate the right-hand side
of (6.20) gives zero, as expected.
For the divergence of this specialized f, we have
$$\nabla\cdot f = \nabla\cdot\left(\frac{f(r)}{r}\,(x\,\hat{x} + y\,\hat{y} + z\,\hat{z})\right) = \frac{2 r^2 f(r) + r^3 f'(r)}{r^3}. \qquad (6.22)$$
Suppose we put in the magnitude f(r) = A/r2 , for constant A, so that f = A/r2 r̂. This
example comes up a lot, and represents a highly divergent vector function, with magnitude
3 Given that φ is defined in a plane, the 2 is . . . interesting.
decreasing as the function is evaluated at points further and further from the origin.
Running this through the divergence in the form (6.22) gives ∇·f = 0. This is problematic,
since the function clearly diverges, and the resolution will come
 in the next section.
We can also study the curl of g = g(s) φ̂ where s ≡ √(x² + y²). This function has no divergence, befitting a "curly" function, and the curl can be written in terms of the magnitude g(s). Noting that φ̂ = (−y x̂ + x ŷ)/s, we have
$$\nabla\times g = \nabla\times\left(\frac{g(s)\,(-y\,\hat{x} + x\,\hat{y})}{s}\right) = \frac{s^2 g(s) + s^3 g'(s)}{s^3}\,\hat{z}. \qquad (6.23)$$

Now again take a special case, g(s) = A/s for constant A. Using this form for g(s) in (6.23)
gives a curl of zero, and yet we expect the curl to be nonzero since g is quintessentially
curly. This case complements the special case for the divergence of r̂/r2 . The major
problem here is one of definition and existence, since in neither of these cases is the
function, or its derivative, well-defined at the origin.

Problem 6.2.1 What is ∇r for r ≡ √(x² + y² + z²)? What is the gradient of r^p?

Problem 6.2.2 Given the function f(x, y, z) = x 2 + y 2 , what is the direction of greatest
increase at location x = 1, y = 1 computed “visually”? Does this agree with the
direction you get from the gradient of f ?
Problem 6.2.3 Given three masses, m1 , m2, and m3 with associated position vectors x 1 (t),
x 2 (t), and x 3 (t), write the equations of motion if the masses m1 and m2 , m2, and m3 ,
and m1 and m3 are attached by identical springs with constant k and equilibrium a.
Problem 6.2.4 Think up a vector f that has zero curl, nonzero divergence, and vice-versa.
Problem 6.2.5 Work out the details of the curl in (6.20) to establish that the curl really is
zero.
Problem 6.2.6 Verify that for g = g(s) φ̂, ∇ · g = 0, the vector function is divergenceless.
Problem 6.2.7 What are the divergence and curl of the vector v = A(x, y, z)r + B(x, y, z)φ
where A(x, y, z) and B(x, y, z) are arbitrary functions of position, r ≡ x x̂ + y ŷ + z ẑ,
and φ ≡ −y x̂ + x ŷ?
Problem 6.2.8 What is the curl of the gradient of a function f(x, y, z): ∇ × ∇f =? Try to
describe the result geometrically by thinking about the geometric role of the gradient
(pointing in the direction of greatest increase) and the curl (measuring “curliness”).
Problem 6.2.9 Find the divergence of the curl of a generic vector function h(x, y, z), ∇ ·
(∇ × h) = ?

6.3 Fundamental Theorem of Calculus for Vectors

The integral is the operation that “undoes” derivatives. The fundamental theorem of
calculus provides the formal description. In one dimension, the fundamental theorem tells
us that for a function f(x), whose derivative is integrated between x = a to b,
Fig. 6.4 An example path given by the tip of the vector w(p) evaluated from p = pi → pf. The tangent to the path, dℓ, is shown at a point.

 b
df(x)
dx = f(b) − f(a). (6.24)
a dx
The integral of the derivative of f(x) is related to the evaluation of f(x) at the integration
limits. The fundamental theorem turns an integral over a domain into an evaluation at the
domain’s boundary. Given the proliferation of derivatives found in vector calculus, there
should be a variety of similar statements, for the integration of gradients, divergences, and
curls.

6.3.1 Line Integrals


The work done by a force along a path is an example of a “line integral.” You provide
the force, F, a vector function of position, and a path, described by a vector w(p),
parametrized by4 p, that points from the origin to locations along the path. Then the integral
of interest involves the projection of the component of force along the path, accomplished
by taking the dot-product of F with an infinitesimal tangent (to the path) vector dℓ, the
"line element." The work done by the force is given by
$$W = \int_a^b F\cdot d\ell, \qquad (6.25)$$

where the starting point a = w(pi ) and the ending point b = w(pf ) for some initial pi and
final pf. An example of a path, with defining w(p) vector and dℓ is shown in Figure 6.4.
The infinitesimal tangent vector can be written in terms of the derivative of w(p), since
$$d\ell \propto w(p + dp) - w(p), \qquad (6.26)$$

4 The parameter is typically taken to be time, but that is not necessary.


and Taylor expanding on the right for small dp gives dℓ = (dw(p)/dp) dp. Then we can write the work integral as
$$W = \int_{p_i}^{p_f} F(w(p))\cdot\frac{dw(p)}{dp}\, dp \qquad (6.27)$$
where F(w(p)) means evaluate the force vector at the three-dimensional location given by
w(p).

Example
Take the force vector to be F = Ar for constant A > 0. This is a force that points away
from the origin at all locations. For the path, we’ll take the line from the origin to 1 on the
x axis. The vector describing this path is w(p) = p x̂ for p = 0 → 1. The work done by
the force is then
$$W = \int_0^1 F\cdot\frac{dw(p)}{dp}\, dp = A\int_0^1 dp = A. \qquad (6.28)$$

As another example, take the same force, but the path described by w(p) = A(cos(p) x̂+
sin(p) ŷ + p ẑ), with p = 0 → 2π. This time, the work is
$$W = \int_0^{2\pi} A\left(-\sin(p) + \cos(p) + 1\right) dp = 2\pi A. \qquad (6.29)$$

Notice that the first two terms integrate to zero, those represent circular motion in a plane,
while the ẑ component of w(p) goes straight up the z axis.

Conservative Forces
Given a potential energy function U(x, y, z), a conservative force comes from its gradient:
F = −∇U. For conservative forces, the work integral simplifies considerably. Along the
path, the values of x, y, and z are themselves functions of the parameter p through w(p), so
that the total p-derivative of the potential energy function is
$$\frac{dU(w(p))}{dp} = \frac{\partial U}{\partial x}\frac{dw_x(p)}{dp} + \frac{\partial U}{\partial y}\frac{dw_y(p)}{dp} + \frac{\partial U}{\partial z}\frac{dw_z(p)}{dp} = \nabla U\cdot\frac{dw(p)}{dp}. \qquad (6.30)$$

The right-hand side here can be written as −F · dw(p)/dp, just the negative of the integrand
associated with the work. For conservative forces, then, the one-dimensional fundamental
theorem of calculus applies,
$$W = \int_a^b F\cdot d\ell = -\int_{p_i}^{p_f}\frac{dU}{dp}\, dp = -\left(U(w(p_f)) - U(w(p_i))\right) = U(a) - U(b), \qquad (6.31)$$

and we can see that the work done by a conservative force is path-independent: only the
locations of the end-points matter. As a corollary of that observation, it is clear that if
the path is “closed,” meaning that the starting and ending points are the same, a = b, then
the work done by the force is zero.
The spring force in three dimensions from (6.12) is an example of a conservative force, coming from the potential energy function U = (1/2)k(√(r · r) − a)². The work done by this force as a particle moves from f to g (changing the labels of the initial and final points to avoid clashing with the equilibrium length a) is
$$W = \frac{1}{2} k\left[(f - a)^2 - (g - a)^2\right],$$
and, of course, if f = g, parametrizing a closed curve, W = 0.
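Path independence is easy to test numerically. The sketch below (Python with NumPy and SciPy; the potential, the paths, and the endpoints are arbitrary choices made for illustration) evaluates the work integral (6.27) along two different paths between the same endpoints for a conservative force and compares the result with U(a) − U(b):

```python
import numpy as np
from scipy.integrate import quad

k = 2.0
U = lambda r: 0.5 * k * np.dot(r, r)          # potential energy U = (1/2) k r.r
F = lambda r: -k * r                          # conservative force F = -grad U

def work(w, wprime, p_i=0.0, p_f=1.0):
    """Line integral W = int F(w(p)) . dw/dp dp along the path w(p), as in (6.27)."""
    integrand = lambda p: np.dot(F(w(p)), wprime(p))
    W, _ = quad(integrand, p_i, p_f)
    return W

a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 2.0, 1.0])

# path 1: straight line from a to b
w1      = lambda p: a + p * (b - a)
w1prime = lambda p: b - a
# path 2: a wiggly path with the same endpoints
w2      = lambda p: a + p * (b - a) + np.sin(np.pi * p) * np.array([0.0, 0.0, 3.0])
w2prime = lambda p: (b - a) + np.pi * np.cos(np.pi * p) * np.array([0.0, 0.0, 3.0])

print(work(w1, w1prime), work(w2, w2prime), U(a) - U(b))   # all three agree
```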

6.3.2 Area and Volume Integrals


Conservative forces that come from the gradient of a potential energy function represent
one type of vector function, ∇U. How about the integrals of the divergence and curl applied
to vector functions? The relevant integrands and their “fundamental theorem” variants are
most useful in higher-dimensional integration.
A single function of three variables, f(x, y, z), can be integrated over a three-dimensional
domain, call it Ω. The boundary of the domain is denoted ∂Ω, and this boundary is a
“closed” surface, one with a clear interior and exterior. There is a vector n̂ that points
normal to the surface (and is called the unit normal), and away from the interior. For
surfaces that do not have an interior and exterior, the vector n̂ points perpendicular to
the surface, but there are two options (roughly, “upward” and “downward”).
A generic volume integral is denoted
$$I = \int_\Omega f(x, y, z)\, \underbrace{dx\, dy\, dz}_{\equiv\, d\tau}, \qquad (6.32)$$

where we must be given the integrand, f(x, y, z), and the domain of integration, Ω. The
“volume element” dτ represents the volume of an infinitesimal box, so that we are adding
up tiny box volumes weighted by the function f(x, y, z). As an example, take f(x, y, z) = x²y²z², and for Ω, use a box of side length 2ℓ centered at the origin. Then,
$$I = \int_{-\ell}^{\ell}\int_{-\ell}^{\ell}\int_{-\ell}^{\ell} x^2 y^2 z^2\, dx\, dy\, dz = \frac{8\ell^9}{27}, \qquad (6.33)$$
performing the integrals one at a time.
Area integrals involve an infinitesimal area vector da = da n̂, the “area element.” The
magnitude of the area element is the area of an infinitesimal patch on the surface, but
unlike the volume element, the area element has a direction, pointing perpendicular to the
surface. As an example, if you have a flat surface lying in the xy plane, then the direction
perpendicular to the surface is n̂ = ±ẑ (a flat surface has no clear interior/exterior, so there
is a sign ambiguity in the unit normal). The area element is da = dxdy ẑ, picking the + sign.
There are many different types of area integral. You can integrate a function like f(x, y, z)
over some surface, and that integral will naturally return a vector. This can lead to some
confusing results. For example, take a spherical surface, this has a unit normal vector n̂ = r̂
which we take to point radially outward (a sphere has a clear inside and outside). Since this
is a closed surface, we put a circle around the integral to indicate that we are adding up the
area elements over the entire surface, and we’ll refer to the surface itself as S. Consider the
integral
$$I = \oint_S da \qquad (6.34)$$
for the surface of the sphere. Every little patch of area has an antipodal patch with area
vector pointing in the opposite direction. If you add up all the area elements, you get
I = 0, they cancel in pairs.5 We know the area of a sphere of radius R is 4πR2 , not zero.
But the vector area of a sphere is zero, because of the vector addition of infinitesimal area
elements. If you take the magnitude of da everywhere, you will get the expected result
$$\oint_S da = 4\pi R^2. \qquad (6.35)$$
Just be careful what you are asking for.
Of the many types of area integral, one that is very useful is the integral of a vector
dotted into the area element, a generalization of the line integral. Given a surface S with
area element da and a vector function V, we can construct
$$\int_S V\cdot da \qquad\text{or}\qquad \oint_S V\cdot da. \qquad (6.36)$$
The first expression applies to open surfaces, the second to closed ones. In this type of
integral, we project the vector function V onto the unit normal and add up over the entire
surface, evaluating V at the surface.
To see how the evaluation works, let's take our surface to be a square of side length 2ℓ centered at the origin and sitting in the yz plane. The area element is da = dydz x̂. For our vector function, take V = r ≡ x x̂ + y ŷ + z ẑ. Then the integral over the surface is
$$\int_S r\cdot da = \int_{-\ell}^{\ell}\int_{-\ell}^{\ell} 0\, dy\, dz = 0 \qquad (6.37)$$
since the value of x at the surface is zero. If you instead took V = φ = −y x̂ + x ŷ and used the same surface, you'd get
$$\int_S \phi\cdot da = \int_{-\ell}^{\ell}\int_{-\ell}^{\ell} (-y)\, dy\, dz = -2\ell\int_{-\ell}^{\ell} y\, dy = 0 \qquad (6.38)$$
but this time, the integral is nontrivial even though it ultimately evaluates to zero.
As an example that does not vanish, take V = −y² x̂ + x² ŷ + z² ẑ, for the same surface,
$$\int_S V\cdot da = \int_{-\ell}^{\ell}\int_{-\ell}^{\ell} (-y^2)\, dy\, dz = -2\ell\int_{-\ell}^{\ell} y^2\, dy = -\frac{4\ell^4}{3}. \qquad (6.39)$$

6.3.3 The Divergence Theorem


Remember the structure of the fundamental theorem of calculus: The integral of a
derivative of a function is equal to the function evaluated on the boundary. For the volume
5 There is a “symmetry” argument we can make to establish that the vector area of a sphere is zero: Since a
sphere is the same in all directions, where could such an area vector possibly point?
Fig. 6.5 The domain Ω is a cube of side length 2ℓ centered at the origin. The boundary of the domain, ∂Ω, is the surface of the cube, each face with its own outward normal and infinitesimal area element da. The left, back, and bottom sides mirror the right, front, and top (shown).

integral in (6.32), a natural “derivative” for use as the integrand is the divergence of a vector
function V. If we integrate that over a volume Ω with closed boundary ∂Ω, a reasonable
guess for the form of the fundamental theorem here would be
$$\int_\Omega \nabla\cdot V\, d\tau = \oint_{\partial\Omega} V\cdot da. \qquad (6.40)$$

This expression has the correct structure: on the left, the integrand is a derivative of sorts,
and on the right, we have the function V evaluated over the boundary of the domain.
As a plausibility argument and example, take Ω to be the interior of a cube of side length 2ℓ centered at the origin. The surface of the cube is ∂Ω, and consists of the six faces, each with its own unit normal, as shown in Figure 6.5. Take V = z ẑ so that ∇ · V = 1, then the volume integral on the left of (6.40) is just (2ℓ)³. For the surface integral on the right, we have
$$\oint_{\partial\Omega} V\cdot da = \int_{\text{top}} \ell\, dx\, dy + \int_{\text{bottom}} (-\ell)(-1)\, dx\, dy = \ell(2\ell)^2 + \ell(2\ell)^2 = (2\ell)^3, \qquad (6.41)$$
(the other four sides do not contribute because their surface normals are perpendicular to ẑ) matching the result from the volume integral.
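The same kind of check can be automated for a less symmetric vector function. Here is a Python sketch (the field V is an arbitrary choice, not from the text, and its divergence is computed by hand) comparing the two sides of (6.40) on the cube:

```python
import numpy as np
from scipy import integrate

L = 1.0  # half-side of the cube (the "ell" in the text)

def V(x, y, z):
    return np.array([x**2 * y, y * z**2, np.sin(x) + z])

def divV(x, y, z):                     # divergence of V, computed by hand
    return 2*x*y + z**2 + 1.0

# left-hand side of (6.40): volume integral of the divergence over the cube
lhs, _ = integrate.tplquad(lambda z, y, x: divV(x, y, z),
                           -L, L, lambda x: -L, lambda x: L,
                           lambda x, y: -L, lambda x, y: L)

# right-hand side: flux through the six faces (outward normals along +/- x, y, z)
rhs = 0.0
for i in range(3):
    for n in (+1.0, -1.0):
        def flux(u, v, i=i, n=n):
            r = [u, v]
            r.insert(i, n * L)         # fix the i-th coordinate on this face
            return n * V(*r)[i]        # outward component of V on the face
        face, _ = integrate.dblquad(flux, -L, L, lambda x: -L, lambda x: L)
        rhs += face

print(lhs, rhs)                        # the two sides agree
```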
We won’t prove the theorem rigorously, but an argument can be sketched quickly. Any
vector function can be Taylor-expanded in the vicinity of a point r0 , using (6.9) applied to
each component. Take V(r0 + r) where as usual r ≡ x x̂ + y ŷ + z ẑ, and represents our
probe of the vicinity of the constant r0 , then

V(r0 + r) ≈ V(r0 ) +(r · ∇Vx ) x̂ +(r · ∇Vy ) ŷ +(r · ∇Vz ) ẑ. (6.42)

All of the derivatives in (6.42) are just constants, derivatives of V evaluated at the point
r0 . We can name them to make it clear that they do not change under the integration. The
coordinate dependence that we need to integrate is all carried by r. Let

a ≡ ∇Vx (r0 ) b ≡ ∇Vy (r0 ) c ≡ ∇Vz (r0 ), (6.43)

then we can write

V(r0 + r) ≈ V(r0 ) +(r · a) x̂ +(r · b) ŷ +(r · c) ẑ. (6.44)


158 Waves in Three Dimensions

The divergence of V(r0 + r), taking derivatives with respect to the coordinates in r is just

$$\nabla\cdot V(r_0 + r) \approx a_x + b_y + c_z \qquad (6.45)$$

(referring to the components of a, b, and c).


Imagine making a cube of side length 2ℓ as before, but small enough so that the approximation in (6.44) holds. For simplicity, take r0 = 0 and assume V(0) = 0. Then we can perform the volume and surface integrals,
$$\int_\Omega \nabla\cdot V\, d\tau = (a_x + b_y + c_z)(2\ell)^3 \qquad (6.46)$$
and (performing the top, right, and front integrals, with each doubled to account for the other three sides)
$$\oint_{\partial\Omega} V\cdot da = 2\int_{-\ell}^{\ell}\!\int_{-\ell}^{\ell} r\big|_{z=\ell}\cdot c\; dx\, dy + 2\int_{-\ell}^{\ell}\!\int_{-\ell}^{\ell} r\big|_{y=\ell}\cdot b\; dx\, dz + 2\int_{-\ell}^{\ell}\!\int_{-\ell}^{\ell} r\big|_{x=\ell}\cdot a\; dy\, dz = 8 a_x \ell^3 + 8 b_y \ell^3 + 8 c_z \ell^3 = (a_x + b_y + c_z)(2\ell)^3. \qquad (6.47)$$

The two integrals are equal, establishing the theorem for a tiny domain over which the
vector function is approximated by (6.42) (a physicist’s assumption is that such a tiny
domain exists). To build up larger domains, we just add together the small cubical ones.
Take two cubical domains for which we have established the theorem, call them Ω1 and
Ω2 , and let them touch at a face as shown in Figure 6.6. We have a new domain, Ω, that is
the union of Ω1 and Ω2 . Now, given V, we know that the theorem holds for Ω1 and Ω2 ,
so that
$$\int_{\Omega_1} \nabla\cdot V\, d\tau_1 = \oint_{\partial\Omega_1} V\cdot da_1, \qquad \int_{\Omega_2} \nabla\cdot V\, d\tau_2 = \oint_{\partial\Omega_2} V\cdot da_2. \qquad (6.48)$$

Adding these two equations together,
$$\underbrace{\int_{\Omega_1} \nabla\cdot V\, d\tau_1 + \int_{\Omega_2} \nabla\cdot V\, d\tau_2}_{=\,\int_\Omega \nabla\cdot V\, d\tau} = \oint_{\partial\Omega_1} V\cdot da_1 + \oint_{\partial\Omega_2} V\cdot da_2, \qquad (6.49)$$

the left-hand side is just the integral over the entire domain Ω = Ω1 ∪ Ω2 . On the right, we
do not have the integral over the boundary ∂Ω because the sum of the boundary integrals
includes the interior shared wall (see Figure 6.6 again). But that shared wall has da1 =
−da2 , the area vector points outward in both cases, so that they point in opposite directions
at the shared face. Meanwhile, the value of V is the same at the shared boundary, so we
get equal and opposite contributions, and
$$\oint_{\partial\Omega_1} V\cdot da + \oint_{\partial\Omega_2} V\cdot da = \oint_{\partial\Omega} V\cdot da. \qquad (6.50)$$
Fig. 6.6 A domain made out of two small cubical ones, Ω1 and Ω2, that meet at a shared internal face. The area vectors on the shared face point in opposite directions, so integrating V over each of the cube's surfaces will give an internal cancellation, leaving us with an integral over the external surfaces, i.e. ∂Ω.

Putting this in (6.49), we have established the theorem for the pair of domains,
$$\int_\Omega \nabla\cdot V\, d\tau = \oint_{\partial\Omega} V\cdot da. \qquad (6.51)$$

Using additional small cubes, we can build any domain and add them up as we did here.
The interior surface integrals will cancel in pairs, leaving us with the boundary of the
composite domain Ω, and the theorem holds in general.

6.3.4 Conservation Laws in Three Dimensions


We developed the physics of conservation laws in one dimension in Section 4.8.1. Given
a conserved quantity, like charge, which cannot be created or destroyed, we account for
changes in the amount of charge in a region by looking at the flow of charge through
the boundaries of the region. In three dimensions the same idea holds. Given a charge
density ρ(x, y, z, t), the charge per unit volume at a location x, y, z in space at time t, and a
current density J = ρv, also a function of position and time, conservation is expressed in
differential form by
$$\frac{\partial\rho}{\partial t} = -\nabla\cdot J, \qquad (6.52)$$
generalizing the one-dimensional form from (4.84).
If we consider a domain Ω with boundary ∂Ω, then integrating both sides of (6.52) over
the domain Ω gives
$$\int_\Omega \frac{\partial\rho}{\partial t}\, d\tau = -\int_\Omega \nabla\cdot J\, d\tau. \qquad (6.53)$$

The quantity ρdτ represents the amount of “stuff ” (charge, mass, whatever the conserved
quantity is) in an infinitesimal volume. For ρ a charge density, the total amount of charge
in the domain Ω is
$$Q(t) = \int_\Omega \rho\, d\tau, \qquad (6.54)$$

and this is a function only of t since the spatial dependence has been integrated out.
Then the quantity on the left in (6.53) is really the time-derivative of⁶ Q(t),
$$\int_\Omega \frac{\partial\rho}{\partial t}\, d\tau = \frac{d}{dt}\int_\Omega \rho\, d\tau = \frac{dQ(t)}{dt}. \qquad (6.55)$$
For the right-hand side of (6.53), we can use the divergence theorem (6.40) to rewrite the
volume integral over the surface,
$$\int_\Omega \nabla\cdot J\, d\tau = \oint_{\partial\Omega} J\cdot da, \qquad (6.56)$$

so the integrated form of the conservation law reads
$$\frac{dQ(t)}{dt} = \frac{d}{dt}\int_\Omega \rho\, d\tau = -\oint_{\partial\Omega} J\cdot da. \qquad (6.57)$$

The physical interpretation of this equation is that “the change in the amount of charge
in the domain Ω can be accounted for by the current coming in (or leaving) through the
boundary of the domain.” The minus sign on the right makes physical sense: the area
element points out of Ω, so if J ∥ da (with J · da > 0), current is carrying charge out of
Ω, the amount of charge inside Ω should decrease.

6.3.5 Curl Theorem


Consider the surface integral of the curl of a vector function V over a domain S with
boundary ∂S. For our version of the fundamental theorem of calculus here, we expect that
the integral over the domain should be related to the evaluation of V on the boundary,
$$\int_S (\nabla\times V)\cdot da = \oint_{\partial S} V\cdot d\ell. \qquad (6.58)$$

We'll do a simple example of a curly function, V = φ ≡ −y x̂ + x ŷ, with ∇ × V = 2 ẑ, as we have seen before. For the domain, S, take a square of side length 2ℓ centered at the origin, and lying in the xy plane as in Figure 6.7. This has area vector given by da = dxdy ẑ. Then the area integral on the left of (6.58) is
$$\int_S (\nabla\times V)\cdot da = 2\int_S dx\, dy = 2(2\ell)^2 = 8\ell^2. \qquad (6.59)$$
For the line integral on the right of (6.58), we go around in the direction shown in Figure 6.7,
$$\oint_{\partial S} V\cdot d\ell = \int_{-\ell}^{\ell} V_y\big|_{x=\ell}\, dy - \int_{-\ell}^{\ell} V_x\big|_{y=\ell}\, dx - \int_{-\ell}^{\ell} V_y\big|_{x=-\ell}\, dy + \int_{-\ell}^{\ell} V_x\big|_{y=-\ell}\, dx = \ell(2\ell) - (-\ell)(2\ell) - (-\ell)(2\ell) + \ell(2\ell) = 8\ell^2 \qquad (6.60)$$
and the equation (6.58) is true for this example.
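A numerical check of (6.58) for a slightly less symmetric field can be set up the same way. In the Python sketch below, the field V and the size ℓ = 1 are arbitrary choices, and the z-component of the curl is computed by hand:

```python
import numpy as np
from scipy import integrate

L = 1.0
V      = lambda x, y, z: np.array([-y**3, x, 0.0])
curlVz = lambda x, y: 1.0 + 3.0 * y**2            # z-component of curl of V, by hand

# surface integral of (curl V) . zhat over the square in the xy plane
lhs, _ = integrate.dblquad(lambda y, x: curlVz(x, y), -L, L, lambda x: -L, lambda x: L)

# line integral of V . dl counterclockwise around the boundary (z = 0)
sides = [
    (lambda t: ( L,  t, 0.0), np.array([0.0,  1.0, 0.0])),   # right side, going up
    (lambda t: (-t,  L, 0.0), np.array([-1.0, 0.0, 0.0])),   # top, going left
    (lambda t: (-L, -t, 0.0), np.array([0.0, -1.0, 0.0])),   # left side, going down
    (lambda t: ( t, -L, 0.0), np.array([1.0,  0.0, 0.0])),   # bottom, going right
]
rhs = 0.0
for path, tangent in sides:
    val, _ = integrate.quad(lambda t: np.dot(V(*path(t)), tangent), -L, L)
    rhs += val

print(lhs, rhs)    # the two sides of (6.58) agree
```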
We can again sketch the proof of this theorem using our approximate form (6.42) with S
a small (so that the approximate expression for V holds) square of side length 2ℓ, centered

6 Assuming, as usual, that the temporal derivative can be pulled out of the integral.
Fig. 6.7 The square domain S with boundary ∂S traversed using the direction given by the right-hand rule (point thumb in direction of da, then fingers curl in the direction of integration for the boundary).

at the origin, and lying in the xy plane. We will again assume that r0 = 0, and that V(0) = 0
just to simplify the calculation. The curl of the approximate V is

$$\nabla\times V = (c_y - b_z)\,\hat{x} + (a_z - c_x)\,\hat{y} + (b_x - a_y)\,\hat{z}. \qquad (6.61)$$

The surface integral is just the ẑ component of the curl multiplied by the surface area,
$$\int_S (\nabla\times V)\cdot da = (b_x - a_y)(2\ell)^2. \qquad (6.62)$$

We have chosen ẑ as the unit normal for the integration. That gives a direction for the line
integral around ∂S obtained by the right-hand rule. Point the thumb of your right hand in
the direction of the normal, then your fingers curl in the traversal direction. In this case, we
want to go around the square counter-clockwise as shown in Figure 6.7.
For the boundary integral, we have z = 0, and we'll integrate over the four sides shown in Figure 6.7 (in the order shown)
$$\oint_{\partial S} V\cdot d\ell = \int_{-\ell}^{\ell} r\cdot b\big|_{x=\ell}\, dy + \int_{-\ell}^{\ell} r\cdot a\big|_{y=\ell}\,(-dx) + \int_{-\ell}^{\ell} r\cdot b\big|_{x=-\ell}\,(-dy) + \int_{-\ell}^{\ell} r\cdot a\big|_{y=-\ell}\, dx = 4(b_x - a_y)\ell^2 = (b_x - a_y)(2\ell)^2 \qquad (6.63)$$

so the theorem holds for this approximate, infinitesimal scenario.


We again combine integrals to finish the sketch. Take two platelets, S1 and S2 placed
next to each other so that they share a line segment along their boundary. Call the union of
the domains S = S1 ∪ S2, with boundary ∂S. The theorem holds for each platelet,
$$\int_{S_1} (\nabla\times V)\cdot da_1 = \oint_{\partial S_1} V\cdot d\ell_1 \qquad \int_{S_2} (\nabla\times V)\cdot da_2 = \oint_{\partial S_2} V\cdot d\ell_2. \qquad (6.64)$$
Adding the equations,
$$\underbrace{\int_{S_1} (\nabla\times V)\cdot da_1 + \int_{S_2} (\nabla\times V)\cdot da_2}_{=\,\int_S (\nabla\times V)\cdot da} = \oint_{\partial S_1} V\cdot d\ell_1 + \oint_{\partial S_2} V\cdot d\ell_2, \qquad (6.65)$$

we have an integral over the union of the domains on the left, as desired. On the right, we
have two line integrals around the boundaries of the platelets, including the shared interior
line so that the integral is not over the boundary of S. Along the shared line, however, we
have dℓ1 = −dℓ2 since we use the same counterclockwise direction for each, and the value of the function is the same, so the interior contributions cancel, and we have
$$\oint_{\partial S_1} V\cdot d\ell_1 + \oint_{\partial S_2} V\cdot d\ell_2 = \oint_{\partial S} V\cdot d\ell. \qquad (6.66)$$

Using this relation in (6.65),
$$\int_S (\nabla\times V)\cdot da = \oint_{\partial S} V\cdot d\ell. \qquad (6.67)$$

So the theorem holds for S made up of the pair of platelet surfaces S1 and S2 . You can
continue to form any surface of interest by adding additional platelets with internal lines
cancelling to leave the integral around ∂S, so the theorem holds for an arbitrary surface.

Problem 6.3.1 What path is described by the vector w(p) = cos(p) x̂ + sin(p) ŷ + p ẑ for
p ∈ [0, 2π]?
Problem 6.3.2 Generate the vector w(p) for a parabolic curve in two dimensions, y = Ax 2 ,
for constant A. Construct w(p) so that: w(−1) = −A x̂ + A ŷ. Compute the work
done by the force F = k(x x̂ − y ŷ) in going from A(−x̂ + ŷ) to A(x̂ + ŷ).
Problem 6.3.3 Is the force F = k(x x̂ − y ŷ) (for constant k) conservative? How about
F = k(y 2 x̂ − x ŷ) (again, with k some constant)?
Problem 6.3.4 Draw the picture, analogous to Figure 6.6, that shows the internal line
cancellation between platelets allowing the boundaries of S1 and S2 to combine to
form the boundary of S ≡ S1 ∪ S2 .
Problem 6.3.5 Check the divergence theorem for v = A(zy² x̂ − xz² ŷ + zx² ẑ) (for constant A) with Ω a cube of side length 2ℓ centered at the origin (compute the two sides of (6.40) and check that they are the same).
Problem 6.3.6 One of Maxwell's equations for the electric field is ∇ · E = ρ/ε₀ where ρ is a charge density (charge per unit volume) and ε₀ sets units for the electric field E. Use the divergence theorem applied to an arbitrary closed domain Ω to show that
$$\frac{Q}{\epsilon_0} = \oint_{\partial\Omega} E\cdot da \qquad (6.68)$$
where Q is the total charge contained in Ω. This is the integral form of Gauss's law.
Problem 6.3.7 Check the curl theorem using v = A(zy² x̂ − xz² ŷ + zx² ẑ) and the domain S that is a square of side length 2ℓ lying centered at the origin at a height z = 1 (as with Problem 6.3.5, we are just checking here, this time using (6.58)).
6.4 Delta Functions in Three Dimensions

Recall the definition of the Dirac delta function in (2.116) from Section 2.6.2,
$$\delta(x) = \begin{cases} 0 & x \neq 0 \\ \infty & x = 0 \end{cases} \qquad\text{with (for positive constants } a \text{ and } b\text{)}\qquad \int_{-a}^{b}\delta(x)\, dx = 1. \qquad (6.69)$$

In three dimensions, we have

δ3 (x, y, z) = δ(x)δ(y)δ(z), (6.70)

the product of delta functions in each of the three coordinate variables. Using the notation
f(r) ≡ f(x, y, z) to denote a generic function of the three coordinates, we have

$$\int_\Omega \delta^3(r)\, d\tau = 1 \qquad\text{if } \Omega \text{ contains the origin}, \qquad (6.71)$$

and in general,

$$\int_\Omega \delta^3(r - r_0)\, f(r)\, d\tau = f(r_0) \qquad (6.72)$$

provided the domain Ω contains the point r0 .

6.4.1 Divergence of f = r̂/r 2


Remember one of our dilemmas from Section 6.2.2: For the function f = (1/r²) r̂, a highly
divergent function, we found ∇ · f = 0 which seemed counterintuitive. The problem
is with the value of the function at r = 0, where f blows up. Let’s use the divergence
theorem (6.40) for this f to make some progress.
Take the domain Ω to be a ball of radius R, where R can be as small as you like.
The infinitesimal area element on the surface of the sphere, shown in Figure 6.8, is
da = R2 sin θdθdφ r̂ where θ is the polar angle of the patch with respect to the ẑ axis
(see Appendix B for more on area elements in curvilinear coordinates). The area element
is parallel to f so that f · da = f da, and the right-hand side of (6.40) is
$$\oint_{\partial\Omega} f\cdot da = \int_0^{2\pi}\!\int_0^{\pi} \frac{1}{R^2}\, R^2\sin\theta\, d\theta\, d\phi = 4\pi, \qquad (6.73)$$
independent of the radius R.
For the left-hand side of (6.40), we need to integrate the divergence of f over the entire
volume of the sphere, ending up with 4π to match the surface integral in the divergence
theorem. The origin is included in the integration domain, and since the divergence of f
is actually zero at all points except the origin, it must be the divergence at the origin that
accounts for all of the 4π in the volume integral. That sounds like a job for the three-
dimensional delta function. If we take

∇ · f = 4πδ3 (r), (6.74)


Fig. 6.8 A sphere of radius R has an infinitesimal area element da = R² sin θ dθ dφ r̂. The magnitude gives the area of the patch and the direction is normal to the sphere's surface, pointing outward.

then we have

(∇ · f ) dτ = 4π (6.75)
Ω

for our spherical domain. Away from the origin the divergence is zero, since the delta
function is zero away from the origin. At the origin, the divergence is only as well defined as the delta function itself, some sort of integrable infinity. The delta function is precisely the
function we need to represent the divergence of this very special f , while maintaining the
divergence theorem.
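Numerically, the statement is that the flux of r̂/r² through a sphere is 4π no matter how small the sphere is. Here is a short Python sketch using the area element of Figure 6.8 (the radii sampled are arbitrary):

```python
import numpy as np
from scipy import integrate

def flux_through_sphere(R):
    """Flux of f = rhat/r^2 through a sphere of radius R, with da = R^2 sin(theta) dtheta dphi rhat."""
    integrand = lambda theta, phi: (1.0 / R**2) * R**2 * np.sin(theta)
    val, _ = integrate.dblquad(integrand, 0.0, 2 * np.pi,
                               lambda phi: 0.0, lambda phi: np.pi)
    return val

for R in [0.01, 1.0, 100.0]:
    print(R, flux_through_sphere(R), 4 * np.pi)   # always 4*pi, independent of R
```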

6.4.2 Curl of g ≡ φ̂/s


We had a similar sort of problem with the curl of g = (1/s) φ̂: ∇ × g = 0 for a function
that clearly should have a curl. Once again, the problem is with the value of g at zero. We
can pick a disk of radius R lying in the xy plane as our domain of integration S, and apply
the curl theorem. The integral around the curve has g · dℓ = g dℓ since both g and dℓ point in the same, φ̂, direction. The magnitude of dℓ is R dφ, so that
$$\oint_{\partial S} g\cdot d\ell = \int_0^{2\pi} \frac{1}{R}\, R\, d\phi = 2\pi. \qquad (6.76)$$
The integral of the curl over the disk appears to give zero, violating the curl theorem.
But once again, the function blows up at the origin, and the origin is included in the S
integration. So we guess

∇ × g = 2πδ(x)δ(y) ẑ, (6.77)


where we only need a two-dimensional delta function on the right since it is an area integral
that we will perform. Now the curl is zero away from zero, and integrable infinity at zero,
so we recover

$$\int_S (\nabla\times g)\cdot da = 2\pi \qquad (6.78)$$
noting that da points in the ẑ direction (enforced by our choice to perform the boundary line
integral in the counterclockwise direction). Taking (6.77) seriously saves the curl theorem.
Problem 6.4.1 Evaluate the divergence of r̂/r, which, like r̂/r2 is infinite at the origin. Does
the divergence of r̂/r have a “hidden” delta function in it? 
Problem 6.4.2 Evaluate the divergence of the gradient of 1/r (where r ≡ √(x² + y² + z²)).
Problem 6.4.3 Evaluate the curl of the curl of log(s) ẑ (for s ≡ √(x² + y²)).

6.5 The Laplacian and Harmonic Functions

So far we have been working with first derivatives, the gradient, divergence and curl. But
we can combine these to form second derivative operators. The most important of these is
the “Laplacian,” ∇2 ≡ ∇ · ∇. When acting on a function f(x, y, z), it takes the form:
$$\nabla^2 f(x, y, z) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}. \qquad (6.79)$$
The equation ∇2 f(x, y, z) = 0, known as Laplace’s equation, is the static form of the
wave equation in three dimensions, and itself has interesting solutions. With appropriate
boundary conditions, solutions to Laplace’s equation are called “harmonic functions.”
There are many ways to solve Laplace’s equation, but more important than the solutions
themselves are the general properties that we can prove directly from the defining equation.
For example, we can show that solutions to Laplace’s equation satisfy an “averaging
property.” For a function f(x, y, z), the average value of f over a surface S is defined to be

$$\bar{f} \equiv \frac{1}{A_S}\int_S f(x, y, z)\, da \qquad (6.80)$$
where A_S is the surface area of S,
$$A_S \equiv \int_S da. \qquad (6.81)$$

Take S to be a sphere of radius R centered at the origin. We will show that for ∇2 f = 0, we
have f¯ = f(0, 0, 0): the average of the function f over the surface of the sphere is equal to
the value of the function at the center of the sphere.
Let’s Taylor expand f(x, y, z) about the origin, this time including the quadratic terms:
 
$$f(x, y, z) \approx f(0, 0, 0) + r\cdot\nabla f + \frac{1}{2}\left(\frac{\partial^2 f}{\partial x^2}\, x^2 + \frac{\partial^2 f}{\partial y^2}\, y^2 + \frac{\partial^2 f}{\partial z^2}\, z^2\right) + \frac{\partial^2 f}{\partial x\,\partial y}\, xy + \frac{\partial^2 f}{\partial x\,\partial z}\, xz + \frac{\partial^2 f}{\partial y\,\partial z}\, yz \qquad (6.82)$$
Fig. 6.9 Given r, θ, and φ as shown, we can identify the x, y, and z locations of a point: z = r cos θ, x = r sin θ cos φ, y = r sin θ sin φ.

where all derivatives are evaluated at the center. In the spherical coordinate setup with
{r, θ, φ}, shown in Figure 6.9, we have

x = r sin θ cos φ y = r sin θ sin φ z = r cos θ. (6.83)

The surface area element da, in spherical coordinates, is da = R2 sin θdθdφ from
Figure 6.8 (see Appendix B for the development of the area element in spherical coor-
dinates).
Looking back at the form of f(x, y, z) in (6.82), the average over the sphere will involve
integrals of the coordinates themselves, like
$$\int_S x\, da = \int_0^{2\pi}\!\int_0^{\pi} R\sin\theta\cos\phi\; R^2\sin\theta\, d\theta\, d\phi = 0 \qquad (6.84)$$

and similarly for the y and z integrals. There are also integrals quadratic in the coordinates,
like
$$\int_S xy\, da = \int_0^{2\pi}\!\int_0^{\pi} R\sin\theta\cos\phi\; R\sin\theta\sin\phi\; R^2\sin\theta\, d\theta\, d\phi = 0 \qquad (6.85)$$

and the same happens for the xz and yz integrals; all of them collapse to zero. The terms
quadratic in the same variable, like x², all contribute equally:
$$\int_S x^2\, da = \int_0^{2\pi}\!\int_0^{\pi} R^2\sin^2\theta\cos^2\phi\; R^2\sin\theta\, d\theta\, d\phi = \frac{4}{3}\pi R^4. \qquad (6.86)$$

The average over the sphere is then
$$\bar{f} = \frac{1}{A_S}\int_S f(x, y, z)\, da \approx \frac{1}{4\pi R^2}\left(f(0, 0, 0)\, 4\pi R^2 + \frac{2\pi R^4}{3}\,\nabla^2 f(0, 0, 0)\right) \qquad (6.87)$$
with higher-order terms going like larger powers of R. As R → 0, the only terms that
remain, to leading order in R, are
$$\bar{f} = f(0, 0, 0) + \frac{R^2}{6}\,\nabla^2 f(0, 0, 0). \qquad (6.88)$$
But the second term is zero, for ∇2 f = 0, so the average value of the function f over the
sphere of radius R is just the value of f at the center of the sphere. We have used the fact
that R is small to expand the function f, and then shown that for harmonic f, the averaging
property holds through order R2 , and not just at order R, which is trivially true for all
functions. In fact, the averaging property holds for spheres of any size.
That result can then be used to prove that for a domain Ω with ∇2 f = 0 in Ω, there can be
no minimum or maximum values except on the boundary of the domain, ∂Ω. Suppose you
had a maximum value somewhere in Ω, and center a sphere of radius R at that maximum.
Then by the averaging property, the average value of f over the sphere is equal to the value
at the center, which is the maximum. But it is impossible for the average to equal the
maximum, since the average must consist of values less than the maximum value, there is
no way to take values less than the maximum and average them to get a value that is larger
than the constituents. That impossibility is easiest to see in one dimension for a function
h(x) as in Figure 6.10. Taking the two points on either side of the maximum and averaging
(the analogue of averaging over the surface of a sphere in higher dimension) gives h¯ = h0
which is less than hmax , you cannot have a maximum value together with the averaging
property. The same argument holds for minima.
For harmonic functions, then, the maximum and minimum values must occur on the
boundary of the domain, ∂Ω. Finally, we can use this property to prove that the solution
to ∇2 f = 0 in a domain Ω, with boundary values for f specified on ∂Ω, is unique. We are
given a function g on the boundary of the domain; the problem we want to solve is

∇2 f = 0 in Ω with f |∂Ω = g. (6.89)

Suppose there were two functions, f1 and f2 that solved this problem:

∇2 f1 = 0 in Ω with f1 |∂Ω = g, ∇2 f2 = 0 in Ω with f2 |∂Ω = g. (6.90)

h(x)

hmax

h0

x − dx x x + dx

Fig. 6.10 A function h has a maximum value at x, and takes on value h0 at nearby points x ± dx.

Let u ≡ f1 − f2 , then u satisfies

∇2 u = 0 in Ω with u|∂Ω = 0. (6.91)

Since u is harmonic, it must achieve both its maximum and its minimum values on the boundary ∂Ω, but we know that u = 0 there. With the maximum and minimum both equal to zero, u must in fact be zero everywhere in Ω (including its boundary), and then f1 = f2 everywhere, so the solution to the Laplace
problem (6.89) is unique.

Problem 6.5.1 Compute the Laplacian, ∇2 f(x, y, z), for f(x, y, z) = A(xy − x 2 + z) (A is a
constant).

Problem 6.5.2 What is the Laplacian of r ≡ √(x² + y² + z²)?
Problem 6.5.3 An example of a function that satisfies Laplace’s equation is f(x, y, z) =
A + B(x 2 − y 2 ) + C(x − z) for constants A, B, and C. Compute the average value
of this function over a sphere of radius R centered on the origin. It may help to use
spherical coordinates from Figure 6.9 to carry out the integration.
Problem 6.5.4 Vectors like r̂ depend on position, and you can compute the Laplacian of
them by writing the vector in terms of the Cartesian basis vectors x̂, ŷ, and ẑ which
do not depend on position, so that ∇2 x̂ = 0, for example. Compute ∇2 r̂. Compute
the Laplacian of ŝ where s ≡ x x̂ + y ŷ.
Problem 6.5.5 Evaluate the second derivative ∇ × (∇ × A) for a vector function A(x, y, z)
in terms of the Laplacian of A and the gradient of its divergence.
Problem 6.5.6 The “Poisson equation” is ∇2 f(x, y, z) = ρ(x, y, z) for a source density ρ
(charge per unit volume, for example, in E&M), a given function of position. This
is just the Laplace equation with a nonzero right-hand side. Given a domain Ω in
which the Poisson equation holds, and given a boundary value function g on the
surface of Ω, we can solve the “Poisson problem.” Show that the solution to the
Poisson problem is unique.

6.6 Wave Equation

The wave equation we developed by taking a continuum limit of one-dimensional springs


in Section 4.1 takes the three-dimensional form,

−∂²φ(r, t)/∂t² + v² ∇²φ(r, t) = 0,   (6.92)
where the second derivative with respect to x from (4.11) has turned into a sum of second
derivatives, one in each direction. This form recovers the one-dimensional case. If we take
φ(r, t) = φ(x, t), so that φ has no y or z dependence, then we get (4.11) from (6.92),
and similarly for the other two coordinates (omitting x and y dependence gives back the
one-dimensional wave equation in z, for example).

6.6.1 Plane Waves


We can develop the three-dimensional form of the plane wave solutions from Section 4.5.
Let k be a constant vector with dimension of inverse length. This vector is called the “wave
vector” and its magnitude is the “wave number.” For frequency, we’ll use the angular frequency ω = 2πf; then a natural update for an expression like (4.67) is

φ(r, t) = Ae i(−ωt+k·r) = Ae i(−ωt+kx x+ky y+kz z) (6.93)

for constant A that sets the scale and dimension of φ.


For the derivatives, we have

∇φ = ikAe i(−ωt+k·r) = ikφ, (6.94)

so that

∇2 φ = −k2 φ, (6.95)

and similarly,

∂²φ/∂t² = −ω²φ.   (6.96)
Putting these expressions into the three-dimensional wave equation gives

ω 2 φ − k2 v2 φ = 0 −→ ω = kv. (6.97)

Physically, φ(r, t) travels in the direction k̂, with wavelength λ = 2π/k, frequency
f = ω/(2π) and speed v. The only difference between the solution φ here and the one
in (4.67) is that the direction is more general.
The solution in (6.93) has only one direction of travel, along k̂, but by introducing a
solution with the other sign, we can superimpose waves traveling in the −k̂ direction, a
three-dimensional version of left and right traveling waves in one dimension as in (4.67).
The full plane wave solution associated with a particular wave vector k is

φ(r, t) = Ae i(−ωt+k·r) + Be i(−ωt−k·r) , ω = kv. (6.98)
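The dispersion relation can also be checked symbolically. The following sketch in Python with SymPy (an assumption; any computer algebra system would do) substitutes the plane wave (6.93) into the wave equation (6.92) and isolates the leftover factor, which vanishes precisely when ω² = v²k².

import sympy as sp

x, y, z, t, v = sp.symbols('x y z t v', real=True)
kx, ky, kz, omega, A = sp.symbols('k_x k_y k_z omega A', real=True)

phi = A * sp.exp(sp.I * (-omega * t + kx * x + ky * y + kz * z))
wave_eq = -sp.diff(phi, t, 2) + v**2 * (sp.diff(phi, x, 2) + sp.diff(phi, y, 2) + sp.diff(phi, z, 2))
# the residual is proportional to (omega^2 - v^2 k^2) phi
print(sp.simplify(wave_eq / phi))   # -> omega**2 - v**2*(k_x**2 + k_y**2 + k_z**2)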

6.6.2 Longitudinal and Transverse Waves


In Chapter 4, we encountered two different types of wave motion. For one-dimensional
masses connected by springs, the masses move parallel (or anti-parallel) to the wave’s
traveling direction, producing “longitudinal” waves. In the approximate wave equation
that comes from a string under tension, the string moves up and down while the waves
move left and right, so the string motion is perpendicular to the wave motion, and we call
these “transverse” waves. It’s possible to have a combination of transverse and longitudinal
waves, too. The wave equation (6.92) refers to a single function φ(r, t) in three dimensions,

but we can also apply the wave equation to a vector-valued function. Take H(r, t) =
Hx (r, t) x̂ + Hy (r, t) ŷ + Hz (r, t) ẑ, then we could have a wave equation of the form
−∂²H(r, t)/∂t² + v² ∇²H(r, t) = 0,   (6.99)
which is really three equations, one for each component of the vector H.
Referring to the plane wave solution (6.93), take k = k ẑ so that the wave moves in the
z direction. We can solve the wave equation (with appropriate boundary/initial conditions)
with Hy = 0, then we just have the pair
−∂²H_x(z, t)/∂t² + v² ∂²H_x(z, t)/∂z² = 0,   −∂²H_z(z, t)/∂t² + v² ∂²H_z(z, t)/∂z² = 0.   (6.100)
The solution for Hz is associated with the longitudinal component of H, since Hz points in
the ±ẑ direction, and that is the direction of travel. The Hx solution provides a transverse
component, pointing in the ±x̂ direction, orthogonal to the direction of travel. We could
write a plane wave solution as
H = Ae i(−ωt+kz) x̂ + Fe i(−ωt+kz) ẑ, (6.101)
for constants A and F. Because of the orthogonality of the Cartesian basis vectors, the two equations in (6.100) are independent; they are decoupled from one another. In Chapter 7 we shall see examples of wave equations in which the transverse and longitudinal components affect each other.

6.6.3 Spherical Waves


There are other wave geometries we can explore with our three-dimensional wave equation
for vector H(r, t). As an example, suppose we take H(r, t) = H(r, t) r̂ representing a
vector that points radially away from the origin with magnitude that depends only on r,
the distance from the origin. What happens if we put this assumed form into (6.99)? First
let’s think about the action of the ∇2 operator on H. The issue is that both the magnitude,
H(r, t), and the vector r̂ are position dependent7 and the ∇2 operator acts on both:
   
∇²H = (∇²H(r, t)) r̂ + H(r, t)(∇²r̂),   (6.102)
with
 
∇²r̂ = ∇²[(x x̂ + y ŷ + z ẑ)/√(x² + y² + z²)] = −2 (x x̂ + y ŷ + z ẑ)/(x² + y² + z²)^{3/2} = −(2/r²) r̂   (6.103)
from Problem 6.5.4.
The wave equation can be written in terms of H(r, t) by itself, since all terms (including
the Laplacian of r̂) point in the same direction,
 
−∂²H(r, t)/∂t² + v² (∇²H(r, t) − (2/r²) H(r, t)) = 0.   (6.104)
7 Contrast this situation with a vector like x̂ which is position independent, so its derivatives vanish, ∇2 x̂ = 0
automatically.

The Laplacian in these coordinates is given in (B.50), and the PDE of interest is
   
−∂²H(r, t)/∂t² + (v²/r²) [∂/∂r(r² ∂H(r, t)/∂r) − 2H(r, t)] = 0.   (6.105)
We can proceed using multiplicative separation of variables: take H(r, t) = R(r)T(t), then
the PDE becomes (after dividing by H(r, t))
   
−(1/T(t)) d²T(t)/dt² + (v²/(R(r) r²)) [d/dr(r² dR(r)/dr) − 2R(r)] = 0.   (6.106)
The first term depends only on t, the second only on r, so each must be equal to a constant.
To get oscillatory behavior in time, it makes sense to set the first term equal to a2 for a real
constant a, then
T(t) = F cos(at) + G sin(at) (6.107)
for constants F and G.
The spatial equation becomes, multiplying through by r²R(r)/v²,

(a²r²/v²) R(r) + d/dr(r² dR(r)/dr) − 2R(r) = 0.   (6.108)
We can introduce a new spatial coordinate x ≡ ar/v, to get (see Section 8.2.4 for a
systematic look at this process)
 
d/dx(x² dR(x)/dx) + (x² − 2) R(x) = 0.   (6.109)
This equation is the same one you solved using the method of Frobenius in Problem 1.4.5.
The odd solution is the first spherical Bessel function, j1 (x), and we can combine it with
the temporal oscillation to get a spherically symmetric solution to the wave equation
H(r, t) = (F cos(at) + G sin(at)) j1 (ar/v). (6.110)
This is a standing wave solution where the profile is the first spherical Bessel function,
with magnitude set by the temporal oscillation sitting out front. An example of H(r, t) for
a few different times is shown in Figure 6.11.
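For a concrete picture of (6.110), the profile can be evaluated numerically. A short sketch in Python (SciPy assumed; F, G, a, and v are illustrative constants, not values from the text):

import numpy as np
from scipy.special import spherical_jn

# The standing spherical wave (6.110): H(r, t) = (F cos(at) + G sin(at)) j_1(a r / v).
F, G, a, v = 1.0, 0.0, 2.0, 1.0
r = np.linspace(1e-6, 10.0, 400)

for t in (0.0, 0.5, 1.0):
    H = (F * np.cos(a * t) + G * np.sin(a * t)) * spherical_jn(1, a * r / v)
    print(f"t = {t}: H(r=1, t) = {np.interp(1.0, r, H):.4f}")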
Problem 6.6.1 Show that for the Laplacian in spherical coordinates acting on a function of
r only, f(r), we have
∇²f(r) = (1/r) d²(r f(r))/dr².
Problem 6.6.2 Find a solution to the wave equation for a function f(r, t), a spherically
symmetric function (not a vector this time, though).
Problem 6.6.3 Find a solution to the wave equation for a vector of the form H(r, t) = H(r, t) ẑ with r ≡ √(x² + y² + z²) as usual.

Problem 6.6.4 Find a solution to the wave equation for a vector of the form H(s, t) = H(s, t) ŝ where s ≡ √(x² + y²) and s = x x̂ + y ŷ. This is a “cylindrically symmetric” solution. See Section 6.7.1 if you end up with an ODE you don’t immediately recognize.


Fig. 6.11 The magnitude of the spherically symmetric solution to the wave equation shown at a few different times
(increasing from dark to light).

6.7 Laplace’s Equation

In one (spatial) dimension, the wave equation reads


−∂²φ(x, t)/∂t² + v² ∂²φ(x, t)/∂x² = 0.   (6.111)
If we look for solutions that do not depend on t, φ(x, t) = φ(x), we have the ODE
v² d²φ(x)/dx² = 0 −→ φ(x) = Ax + B,   (6.112)
an uninteresting linear function with little chance of satisfying realistic boundary condi-
tions.
In higher dimensions, the static wave equation takes the form: ∇2 f = 0, Laplace’s
equation. Solutions to this equation are more interesting, as you have already seen
in Problem 4.3.2, with general properties that we developed in Section 6.5. Let’s look at
the solutions in three dimensions for various boundary conditions. To enforce boundary
conditions, it is often useful to employ coordinate systems other than Cartesian. If you are
not used to the expressions for the Laplacian in cylindrical or spherical coordinates, take a
look at Appendix B.

6.7.1 Cylindrical Coordinates


We want to solve ∇2 f = 0 with the value of f specified on an infinite cylinder of radius R.
It is natural to use the cylindrical coordinates defined in Figure B.1, with
s = √(x² + y²),   φ = tan⁻¹(y/x),   z the Cartesian z,   (6.113)
so that the function f takes the form f(s, φ, z) and it is easy to impose the boundary
condition: f(R, φ, z) = g(φ, z) for boundary function g(φ, z). Laplace’s equation in
cylindrical coordinates is (see (B.19))

 
∇²f = (1/s) ∂/∂s(s ∂f/∂s) + (1/s²) ∂²f/∂φ² + ∂²f/∂z² = 0.   (6.114)
How should we solve this partial differential equation? The primary tool is separation of
variables as described in Section 4.3.3. Assume that f(s, φ, z) = S(s)Φ(φ)Z(z) and run it
through the Laplace equation, dividing by f to get
(1/s²) [ s (d/ds(s dS(s)/ds))/S(s) + (d²Φ(φ)/dφ²)/Φ(φ) ] + (d²Z(z)/dz²)/Z(z) = 0.   (6.115)

The first term depends on s and φ while the second depends only on z, so using the logic of separation of variables, we take (d²Z/dz²)/Z = ℓ², a constant. Inside the parentheses, there are functions that depend on s and on φ. Take (d²Φ/dφ²)/Φ = −m², another constant (the minus sign is there to suggest an oscillatory solution, good for angular variables that require periodicity), to combine with the s-dependent piece. Then we have the triplet of equations

d²Z/dz² = ℓ²Z
d²Φ/dφ² = −m²Φ   (6.116)
d²S/ds² + (1/s) dS/ds + (ℓ² − m²/s²) S = 0.
The first two equations are familiar, and we can solve immediately for Z(z) and Φ(φ),

Z(z) = A e^{ℓz} + B e^{−ℓz},   Φ(φ) = F e^{imφ} + G e^{−imφ},   (6.117)

where we require that m be an integer to get Φ(φ) = Φ(φ + 2π), but there is no obvious restriction on ℓ. The third equation is not one we have encountered before. We can clean it up a bit by introducing a scaled s-coordinate, x = ℓs,

x² d²S(x)/dx² + x dS(x)/dx + (x² − m²) S(x) = 0.   (6.118)
This is “Bessel’s equation.”
In order to solve it, we’ll use the Frobenius approach from Section 1.4. Assume that


S(x) = x^p Σ_{j=0}^∞ a_j x^j.   (6.119)

Taking derivatives, putting them in (6.118), and collecting like powers of x^j, we get

a0 (p² − m²) + a1 [(1 + p)² − m²] x + Σ_{j=2}^∞ { a_j [(j + p)² − m²] + a_{j−2} } x^j = 0.   (6.120)

To eliminate the first term, we take p = ±m. Setting a1 = 0 to get rid of the second term,
we are left with the recursion relation
a_{j−2} = a_j [m² − (j ± m)²] = −a_j j(j ± 2m)  −→  a_j = −a_{j−2}/(j(j ± 2m)),   (6.121)

Fig. 6.12 The first three Bessel functions.

or letting j ≡ 2k,
a_{2k} = −a_{2k−2}/(2k(2k ± 2m)) = −a_{2k−2}/(4k(k ± m)).   (6.122)
Writing out the first few terms, we can solve the recursion
a2 = −a0/(4(1 ± m))
a4 = −a2/(8(2 ± m)) = a0/(32(1 ± m)(2 ± m))
a6 = −a0/(384(1 ± m)(2 ± m)(3 ± m))
a8 = a0/(2⁸ 4! (1 ± m)(2 ± m)(3 ± m)(4 ± m))   (6.123)
a10 = −a0/(2¹⁰ 5! (1 ± m)(2 ± m)(3 ± m)(4 ± m)(5 ± m))
⋮
a_{2j} = (−1)^j a0/(2^{2j} j! Π_{k=1}^{j}(k ± m)) = (−1)^j a0 (±m)!/(2^{2j} j! (j ± m)!).
Using these coefficients in (6.119) leads to “Bessel’s function” for both positive and negative values of m:⁸

J_{±m}(x) = a0 (1/(2^{±m} (±m)!)) Σ_{j=0}^∞ (−1)^j (±m)!/(2^{2j} j! (j ± m)!) x^{2j±m},   (6.124)

where the a0 out front is just a constant, and the term in parentheses is a normalization
convention. These functions are well-studied, and the first few are shown in Figure 6.12.
They are oscillatory, and decay away from the origin. For even values of m, the functions
are even, and for odd values of m, they are odd.

8 This definition holds for integer values of m; there is a generalized expression that applies for noninteger values.
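The series (6.124) is easy to compare against a library implementation. The sketch below (Python with SciPy assumed) uses the conventional normalization, in which the prefactor in parentheses is absorbed so that J_m(x) = Σ_j (−1)^j/(j!(j + m)!)(x/2)^{2j+m}, and checks it against scipy.special.jv for a few values of m.

import numpy as np
from scipy.special import jv
from math import factorial

def J_series(m, x, terms=30):
    # partial sum of the conventional Bessel series
    return sum((-1)**j / (factorial(j) * factorial(j + m)) * (x / 2)**(2 * j + m)
               for j in range(terms))

x = 3.7
for m in (0, 1, 2):
    print(m, J_series(m, x), jv(m, x))   # the two columns should agree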


Fig. 6.13 An example of a function that has ∇ 2 f = 0.

Bessel’s equation is linear, so sums of solutions are also solutions, and we can construct
a general solution for S(x) using the individual Bessel functions9 :


S(x) = Σ_{k=0}^∞ c_k J_k(x).   (6.125)

The Bessel functions satisfy an orthogonality and completeness relation, much like sine and cosine, and we can build functions out of them by tuning the coefficients {c_k}_{k=0}^∞.
Returning to our separation of variables solution, for a particular value of ℓ, we have

f(s, φ, z) = (A e^{ℓz} + B e^{−ℓz}) Σ_{m=0}^∞ (f_m e^{imφ} + g_m e^{−imφ}) J_m(ℓs)   (6.126)

for constants A, B, and coefficients {f_m}_{m=0}^∞, {g_m}_{m=0}^∞. We can go further by adding together solutions with different values of ℓ. This general solution to ∇²f = 0 can be used to enforce boundary conditions (in cylindrical coordinates) and obtain unique solutions.
The solutions to ∇²f = 0 in three dimensions are much more interesting than the linear solution (6.112) in one dimension. As an example, take the solution f(s, φ, z) = e^{−z} J0(s) (obtained from (6.126) with m = 0, ℓ = 1) shown in Figure 6.13.
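That example can be checked by brute force. The following sketch (Python with SciPy assumed; the evaluation point and step size are arbitrary choices) applies the cylindrical Laplacian (6.114) to f(s, φ, z) = e^{−z} J0(s) with finite differences; since the function has no φ dependence, the 1/s² term drops out.

import numpy as np
from scipy.special import j0

f = lambda s, z: np.exp(-z) * j0(s)

s, z, h = 1.7, 0.4, 1e-4
d2f_dz2 = (f(s, z + h) - 2 * f(s, z) + f(s, z - h)) / h**2
df_ds   = (f(s + h, z) - f(s - h, z)) / (2 * h)
d2f_ds2 = (f(s + h, z) - 2 * f(s, z) + f(s - h, z)) / h**2
laplacian = d2f_ds2 + df_ds / s + d2f_dz2   # (1/s) d/ds(s df/ds) + d^2f/dz^2
print(laplacian)   # ~0, up to finite-difference error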

6.7.2 Spherical Boundary Conditions: Axial Symmetry


A function of spherical coordinates (see Figure 6.9), f(r, θ, φ), is said to be “axially
symmetric” if it is independent of the variable φ. The Laplacian in spherical coordinates,
from (B.50), applied to an axially symmetric function becomes
   
(1/r²) ∂/∂r(r² ∂f/∂r) + (1/(r² sin θ)) ∂/∂θ(sin θ ∂f/∂θ) = 0.   (6.127)
Taking the separation ansatz, f(r, θ) = R(r)Θ(θ), inserting in this equation, and performing
some algebraic manipulation gives
   
(1/R(r)) d/dr(r² dR(r)/dr) + (1/(Θ(θ) sin θ)) d/dθ(sin θ dΘ(θ)/dθ) = 0.   (6.128)
9 There is an independent second solution to Bessel’s equation, called the Neumann function, but it blows up at
the origin and hence is rejected in most physical applications.

The first term is a function only of r, while the second is a function of θ, so they must both be constant in order to satisfy the equation for all values of r and θ. Call that constant value ℓ(ℓ + 1),¹⁰ then we have the pair:

d/dr(r² dR(r)/dr) = ℓ(ℓ + 1) R(r),   d/dθ(sin θ dΘ(θ)/dθ) = −ℓ(ℓ + 1) sin θ Θ(θ).   (6.129)
The first equation can be solved by taking R(r) ∝ r^q as a “guess” (see Appendix A), then we get

q(q + 1) r^q = ℓ(ℓ + 1) r^q −→ q = ℓ or q = −ℓ − 1   (6.130)

so the radial piece of the solution can be written as the combination of these two

R(r) = A r^ℓ + B/r^{ℓ+1}.   (6.131)
For the angular piece, let x ≡ cos θ (not to be confused with the Cartesian coordinate
x which doesn’t appear in this spherical setting); if we view Θ(θ) as a function of x (i.e.
depending on θ through cos θ), the chain rule gives
dΘ(x)/dθ = (dΘ(x)/dx)(dx/dθ) = −sin θ dΘ(x)/dx = −√(1 − x²) dΘ(x)/dx   (6.132)

and we can write the second ODE in (6.129) as

(1 − x²) d²Θ(x)/dx² − 2x dΘ(x)/dx + ℓ(ℓ + 1) Θ(x) = 0,   (6.133)
which is known as “Legendre’s differential equation.” We will again solve using the series
solution method of Frobenius, taking


Θ(x) = x^p Σ_{j=0}^∞ a_j x^j.   (6.134)

Putting this form into (6.133) and collecting as usual gives

Σ_{j=0}^∞ a_j (j + p)(j + p − 1) x^{j−2} + Σ_{j=0}^∞ [−(j + p)(j + p − 1) − 2(j + p) + ℓ(ℓ + 1)] a_j x^j = 0,   (6.135)

or, re-indexing the first term,

0 = a0 p(p − 1) x^{−2} + a1 (p + 1)p x^{−1} + Σ_{j=0}^∞ { [−(j + p)(j + p − 1) − 2(j + p) + ℓ(ℓ + 1)] a_j + (j + 2 + p)(j + 1 + p) a_{j+2} } x^j.   (6.136)

10 We’ll take ℓ to be an integer, although that is not strictly speaking necessary. This choice does simplify the solutions to Legendre’s equation, and suffices in many cases. Noninteger values of ℓ lead to angular solutions that blow up.

Taking p = 0 so that a0 and a1 are free to take on any values, the recursion relation is

a_{j+2} = a_j [j(j + 1) − ℓ(ℓ + 1)]/[(j + 1)(j + 2)].   (6.137)

This defines both the even (for a1 = 0) and odd (for a0 = 0) solutions. Notice that for a particular integer value of ℓ, the numerator of the recursion will be zero for j = ℓ, so we have a polynomial of degree ℓ, either even or odd depending on the value of ℓ. These polynomials are known as “Legendre polynomials,” and indexed by the integer ℓ: P_ℓ(x).
For ℓ = 0, we get the zeroth Legendre polynomial, P0(x) = a0. The Legendre polynomials are “normalized” as we shall see in a moment, and that normalization sets the value of a0. Moving on to ℓ = 1, working with the odd series, we have a1 to start off, then a3 = 0 as do all other coefficients, so the solution is P1(x) = a1 x with a1 again set by the normalization convention. For ℓ = 2, we take a0 to start off, and then a2 = −3a0 and a4 and all higher coefficients are zero, giving P2(x) = (1 − 3x²) a0.
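The recursion (6.137) can also be used to generate the polynomials directly. As a sketch (Python with NumPy assumed), the code below builds the coefficients for a given ℓ, fixes the overall constant with the common convention P_ℓ(1) = 1 (equivalent to the normalization in (6.146)), and compares with NumPy’s built-in Legendre polynomials.

import numpy as np

def legendre_from_recursion(ell):
    # a_{j+2} = a_j [j(j+1) - ell(ell+1)] / [(j+1)(j+2)], starting the even or odd series
    a = np.zeros(ell + 1)
    a[ell % 2] = 1.0
    for j in range(ell % 2, ell - 1, 2):
        a[j + 2] = a[j] * (j * (j + 1) - ell * (ell + 1)) / ((j + 1) * (j + 2))
    return a / np.polynomial.polynomial.polyval(1.0, a)   # normalize so P_ell(1) = 1

x = 0.3
for ell in range(4):
    c = legendre_from_recursion(ell)
    mine = np.polynomial.polynomial.polyval(x, c)
    ref = np.polynomial.legendre.legval(x, [0] * ell + [1])
    print(ell, mine, ref)   # the two columns should agree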
The general solution to (6.133) is a superposition of the individual multiplicative terms

f(r, θ) = Σ_{ℓ=0}^∞ (A_ℓ r^ℓ + B_ℓ/r^{ℓ+1}) P_ℓ(cos θ).   (6.138)

There is, again, an orthogonality relation for the Legendre polynomials, which can be obtained from the ODE itself. Take two solutions, P_ℓ(x) and P_m(x) with ℓ ≠ m. These each satisfy the Legendre differential equation,

(1 − x²) P_ℓ″(x) − 2x P_ℓ′(x) + ℓ(ℓ + 1) P_ℓ(x) = 0
(1 − x²) P_m″(x) − 2x P_m′(x) + m(m + 1) P_m(x) = 0.   (6.139)

The idea is to multiply the top equation by P_m(x), the bottom by P_ℓ(x), and then integrate each with respect to x. Since the argument x = cos θ, we expect the domain here to be x ∈ [−1, 1], and we’ll integrate over all of it. For the top equation, multiplying and integrating gives

∫_{−1}^{1} (1 − x²) P_ℓ″(x) P_m(x) dx − ∫_{−1}^{1} 2x P_ℓ′(x) P_m(x) dx + ℓ(ℓ + 1) ∫_{−1}^{1} P_ℓ(x) P_m(x) dx = 0.   (6.140)

Using integration by parts on the first term, we have

∫_{−1}^{1} (1 − x²) P_ℓ″(x) P_m(x) dx = [(1 − x²) P_ℓ′(x) P_m(x)]_{x=−1}^{1} − ∫_{−1}^{1} P_ℓ′(x) [(1 − x²) P_m′(x) − 2x P_m(x)] dx.   (6.141)

The boundary term vanishes,¹¹ and (6.140) becomes

−∫_{−1}^{1} P_ℓ′(x) P_m′(x) (1 − x²) dx + ℓ(ℓ + 1) ∫_{−1}^{1} P_ℓ(x) P_m(x) dx = 0.   (6.142)

The equation for P_m(x) in (6.139), when multiplied by P_ℓ(x) and integrated from x = −1 → 1, is just (6.142) with ℓ ↔ m,

−∫_{−1}^{1} P_m′(x) P_ℓ′(x) (1 − x²) dx + m(m + 1) ∫_{−1}^{1} P_m(x) P_ℓ(x) dx = 0,   (6.143)

and subtracting this from (6.142) gives

(ℓ(ℓ + 1) − m(m + 1)) ∫_{−1}^{1} P_ℓ(x) P_m(x) dx = 0,   (6.144)

which means that

∫_{−1}^{1} P_ℓ(x) P_m(x) dx = 0   (6.145)

since ℓ ≠ m by assumption. The case ℓ = m will not give zero, and can be used to set the normalization of the Legendre polynomials.¹² A typical choice is

∫_{−1}^{1} P_ℓ(x)² dx = 2/(2ℓ + 1)   (6.146)
so that the full orthogonality relation reads

∫_{−1}^{1} P_ℓ(x) P_m(x) dx = (2/(2ℓ + 1)) δ_{ℓm}.   (6.147)
If we return to the angular θ using x = cos θ, then the orthonormality condition is

∫_0^π P_ℓ(cos θ) P_m(cos θ) sin θ dθ = (2/(2ℓ + 1)) δ_{ℓm}.   (6.148)
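The relation (6.147) is also easy to verify numerically; a quick sketch (Python with NumPy and SciPy assumed) computes the integral for a few pairs ℓ, m and compares with 2/(2ℓ + 1) δ_{ℓm}.

import numpy as np
from scipy.integrate import quad
from numpy.polynomial.legendre import legval

P = lambda l, x: legval(x, [0] * l + [1])   # the l-th Legendre polynomial
for l in range(3):
    for m in range(3):
        val, _ = quad(lambda x: P(l, x) * P(m, x), -1, 1)
        expected = 2 / (2 * l + 1) if l == m else 0.0
        print(l, m, round(val, 6), round(expected, 6))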

Example
Suppose we solve ∇2 f = 0 in a domain Ω that is the interior of a sphere of radius R, and
we are given a function g(θ) with f(R, θ) = g(θ) on the boundary of the domain (i.e. at the
spherical surface). The boundary condition and domain are axisymmetric, so it is safe to
assume that f is as well.13 If we want the solution f(r, θ) in Ω to be free of infinities, then

11 The Legendre polynomials do not blow up at x = ±1.


12 Many choices of normalization exist.
13 An interesting assumption: why should the function f have the same symmetry as the domain and boundary condition function?

we must set B_ℓ = 0 for all ℓ in the general solution (6.138), otherwise r = 0 will pose a problem. Starting from

f(r, θ) = Σ_{ℓ=0}^∞ A_ℓ r^ℓ P_ℓ(cos θ),   (6.149)

we want to find the set of coefficients {A_ℓ}_{ℓ=0}^∞, and we can use the boundary condition at r = R, together with the orthogonality of the Legendre polynomials. At r = R,

f(R, θ) = Σ_{ℓ=0}^∞ A_ℓ R^ℓ P_ℓ(cos θ) = g(θ).   (6.150)

Multiply both sides by P_m(cos θ) sin θ and integrate as in (6.148),

A_m R^m (2/(2m + 1)) = ∫_0^π P_m(cos θ) g(θ) sin θ dθ  −→  A_m = ((2m + 1)/(2R^m)) ∫_0^π P_m(cos θ) g(θ) sin θ dθ.   (6.151)

That does it, in principle, but the integral is difficult to carry out, especially since all we
have is a recursive formula for the coefficients in the Legendre polynomials. One can
sometimes (especially in textbook problems) get away with “inspection.” For example,
suppose that g(θ) = g0 cos θ. This is just g(θ) = g0 P1 (cos θ) and we know that no other
Legendre polynomials can contribute by orthogonality. In that case, it is easy to read off
the solution. We know ℓ = 1 is the only contributing term, so

f(r, θ) = A1 r P1(cos θ)   (6.152)

and imposing f(R, θ) = A1 R cos θ = g0 cos θ gives us A1 = g0/R, with full solution

f(r, θ) = (g0 r/R) cos θ.   (6.153)
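When inspection fails, the projection integral in (6.151) can always be carried out numerically. A sketch (Python with NumPy and SciPy assumed; g0 and R are illustrative values) applied to the boundary function above confirms that only the ℓ = 1 coefficient survives.

import numpy as np
from scipy.integrate import quad
from numpy.polynomial.legendre import legval

g0, R = 2.0, 1.5
g = lambda th: g0 * np.cos(th)   # boundary data g(theta)

def A(m):
    # coefficient A_m from (6.151)
    P_m = lambda x: legval(x, [0] * m + [1])
    integrand = lambda th: P_m(np.cos(th)) * g(th) * np.sin(th)
    val, _ = quad(integrand, 0, np.pi)
    return (2 * m + 1) / (2 * R**m) * val

print([round(A(m), 6) for m in range(4)])   # ~ [0, g0/R, 0, 0]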

Problem 6.7.1 Use the orthogonality relation (6.148) to set the coefficient for P3 (x).
Problem 6.7.2 Show that the solutions to f_n″(x) + (nπ/L)² f_n(x) = 0 for integer n, with f_n(0) = f_n(L) = 0, satisfy

∫_0^L f_n(x) f_m(x) dx = 0

for n ≠ m.
Problem 6.7.3 The “Rodrigues formula” provides a way to find the nth Legendre polyno-
mial. The formula reads

P_n(x) = (1/(2ⁿ n!)) (d/dx)ⁿ (x² − 1)ⁿ,   (6.154)
so that you take n derivatives of (x 2 − 1)n to get the nth polynomial, normalized
according to the orthonormality convention in (6.147). Work out the first three
Legendre polynomials using this formula and compare with the expressions from
the last section.

Problem 6.7.4 There is an orthogonality relation for the Bessel functions that can be proved
in a manner similar to the orthogonality of the Legendre polynomials we proved in
the last section. Suppose α and β are zeroes of the nth Bessel function, Jn (α) = 0 =
J_n(β). Show that

∫_0^1 x J_n(αx) J_n(βx) dx = 0

if α ≠ β, i.e. if they are not the same zero. Finding the zeroes of Bessel functions is
not easy (unlike, for example, finding the regularly spaced zeroes of sine or cosine).
The simplest approach is to isolate them numerically, and we’ll see how to do that in
Section 8.1.
Problem 6.7.5 In the example from the last section, we found the solution on the interior of
a sphere of radius R. This time, find the solution to ∇2 f = 0 in the domain Ω defined
by the exterior of the sphere of radius R given the boundary condition: f(R, θ) =
g(θ) = g0 cos θ. Just as we excluded one of the radial solutions because r = 0 was
in the domain of the example, this time we must exclude one of the radial solutions
because spatial infinity (r → ∞) is in our domain.
7 Other Wave Equations

What is a wave? What is a wave equation? These are probably questions we should have
started with. The “wave equation” we spent most of Chapter 4 developing and solving
comes from the longitudinal motion of springs (exact), and the transverse motion of
strings under tension (approximate). In the broader context of wave equations, we have
studied conservation laws, and used them to predict some of the behavior that arises
in the nonlinear setting (e.g., traffic flow). But, we have never really defined “waves” or
the equations that govern them. There does not seem to be much in common between
the applications, which become even more exotic in, for example, quantum mechanics.
Physicists tend to have a fairly broad description of what constitutes a wave.
Turning to the professionals, in this case, physical oceanographers, it is interesting that
the situation remains somewhat ambiguous. From [5], “Waves are not easy to define.
Whitham (1974 [19]) defines a wave as ‘a recognizable signal that is transferred from one
part of a medium to another with recognizable velocity of propagation.’” This definition
captures almost any physically relevant phenomenon, and is a parallel to Coleman’s quote
from the preface. What’s worse, a “wave equation” is taken to be any equation that
supports solutions that are waves. That seems tautological, but as we saw in the previous
chapter, those wave equations have interesting solutions even when those solutions aren’t
themselves waves (as in the static cases).
In this chapter, we will look at some of the other places that wave and wave-like
equations arise. In some cases, we can solve the wave equation, or find approximate
solutions, as we have in previous chapters. But for a majority of the topics in this chapter,
it is the journey that is the reward. Nonlinear partial differential equations are notoriously
difficult to solve, and so we will simply work out the physics that develops the equations,
letting further study address the challenges of solution. An exception comes at the end
of the chapter, where Schrödinger’s wave equation is introduced. There, many of the
techniques we have studied so far will apply, and we will solve Schrödinger’s equation
for some familiar cases of interest.

7.1 Electromagnetic Waves

We have seen the wave equation in the form (4.11) (or its three-dimensional version (6.92))
emerge from coupled oscillators in Section 4.1, and as an approximation to the vertical
motion of a string in Section 4.2, and there are many other places where this wave equation

appears, either exactly or in approximation. Perhaps the most famous appearance is in


electricity and magnetism, where the electromagnetic field satisfies the wave equation “in
vacuum” (meaning away from the sources).
Given a charge density ρ(r, t), the charge-per-unit-volume in some domain, and a
current density J(r, t) which tells how that charge is moving around in time and space,
Maxwell’s equations relate the divergence and curl of an electric field E and a magnetic
field B to ρ and J:
∇·E = ρ/ε₀,   ∇×E = −∂B/∂t,   ∇·B = 0,   ∇×B = μ₀ J + (1/c²) ∂E/∂t,   (7.1)
where ε₀ is a constant associated with electric sources, μ₀ with magnetic ones, and c = 1/√(μ₀ε₀) is the speed of light. The electric field acts on a particle carrying charge q with
a force F = qE and the magnetic field acts on a particle of charge q moving with velocity
vector v via the force F = qv × B. The equations can be understood geometrically using
the intuition we developed in Section 6.2.2. For example, from ∇ · B = 0, we learn that
magnetic fields cannot “diverge” from a point, they must be entirely “curly”. Meanwhile,
the divergence of E depends on ρ, so that where there is a large charge-per-unit-volume,
the divergence of E at those points is itself large.
In vacuum, we have ρ = 0 and J = 0. This does not mean that there are no sources for
the electric and magnetic fields, just that they are distant from the domain of interest. Then
Maxwell’s equations read
∇·E = 0,   ∇×E = −∂B/∂t,   ∇·B = 0,   ∇×B = (1/c²) ∂E/∂t.   (7.2)
Taking the curl of the curl of the electric field and using the identity from Problem 6.5.5: ∇×(∇×E) = ∇(∇·E) − ∇²E gives

∇(∇·E) − ∇²E = −∂/∂t (∇×B),   (7.3)

and since the divergence on the left is zero (from (7.2)) while the curl of B on the right can be replaced by the time derivative of E, we are left with

−∂²E/∂t² + c² ∇²E = 0,   (7.4)
the wave equation for the vector function E. If we took the curl of the curl of B, the same
sort of simplification would occur, and we’d have the wave equation for B as well,
−∂²B/∂t² + c² ∇²B = 0.   (7.5)
Maxwell’s equations in vacuum, then, lead to wave equations for E and B, and the
characteristic speed is c, the speed of light. This is surprising, since in our previous
examples, the speed of the waves is set by physical properties of the medium that transmits
the waves. But we have just developed the wave equation for electromagnetic fields in
vacuum, with no obvious physical properties whatsoever. This paradox originally led to
the idea of an “ether,” a medium through which electromagnetic waves move, and which

is responsible for setting the characteristic speed of the waves. That explanation is at odds
with experiments that eventually supported the special relativistic interpretation, that the
vacuum itself has a natural speed. The wave equation that appears here is exact; it does not
come from any approximation (as with, for example, the wave equation governing waves
on a string).
The wave equations (7.4) and (7.5) are not complete. They came from four first-
derivative relations, Maxwell’s equations, and we have lost some information in taking
the derivatives. To see the problem, and its solution, take plane waves that solve the wave
equations:

E = E0 e i(−ωE t+kE ·r) ê B = B0 e i(−ωB t+kB ·r) b̂ (7.6)

where to satisfy the wave equation, we must have ωE = ckE and ωB = ckB , relating
frequencies and wave vector magnitude. The unit vector directions ê and b̂ are constant,
and a priori unrelated. The constants E0 and B0 , setting the size of the fields, are similarly
unrelated (at this point).
Sending these wave equation solutions back in to Maxwell’s equations in vacuum,
we get

∇·E = 0 −→ i k_E · ê E0 e^{i(−ω_E t + k_E·r)} = 0   (7.7)

from which we learn that k_E · ê = 0: the direction of wave travel, the wave vector k̂_E, is perpendicular to the direction of the electric field, ê. The electric field plane wave is thus
transverse. Running the divergence of B through ∇ · B = 0 gives the same orthogonality
for the magnetic field’s wave propagation direction and magnetic field direction, kB · b̂ = 0.
The magnetic field also has a transverse plane wave solution. Remember, these constraints
come from Maxwell’s equations, they are not present in the wave equations themselves.
Moving on to the curls, we have
∇×E = −∂B/∂t −→ i k_E × ê E0 e^{i(−ω_E t + k_E·r)} = i ω_B B0 e^{i(−ω_B t + k_B·r)} b̂.   (7.8)
This equation provides a wealth of information. In order for it to hold, for all time t and
spatial locations r, we must have ωB = ωE and kB = kE allowing us to clear out the
exponentials. In addition, we see that the vector directions of E and B must be related:
b̂ = k̂E × ê. Finally, the constants E0 and B0 are related
B0 ω_B = k_E E0 −→ B0 = k_E E0/ω_E = E0/c.   (7.9)
With all these constraints in place, the plane waves E and B that solve Maxwell’s
equations are
E = E0 e^{i(−ω_E t + k_E·r)} ê,   B = (E0/c) e^{i(−ω_E t + k_E·r)} k̂_E × ê,   with k̂_E · ê = 0 and ω_E = c k_E.   (7.10)
In this setting, the vector ê is called the “polarization” of the wave, and E0 sets the
magnitude of both E and B. The electric and magnetic waves both move in the k̂E

Fig. 7.1 A snapshot of the real part of the electric and magnetic plane waves that solve Maxwell’s equations. The electric
field points in the ê direction (the “polarization”), the magnetic field points in b̂ = k̂E × ê, and the waves
themselves travel in the k̂E direction.

direction, which is perpendicular to both the electric and magnetic field directions. Finally,
the electric and magnetic fields are themselves perpendicular. An example of the real part
of these fields is shown in Figure 7.1, where you can see the triumvirate of directions
and their orthogonal relationships. These single-frequency electromagnetic waves are the
mathematical representation of light, a phenomenon produced by the mutual propagation of
electric and magnetic fields, through the vacuum, at constant speed c.
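It is worth checking that (7.10) really does satisfy all of the vacuum equations. A small symbolic sketch (Python with SymPy assumed) does this for a wave traveling along ẑ and polarized along x̂; the two divergence equations hold trivially here because neither field depends on x or y.

import sympy as sp

x, y, z, t, E0, c, k = sp.symbols('x y z t E_0 c k', positive=True)
w = c * k                                   # the dispersion relation omega = c k
phase = sp.exp(sp.I * (-w * t + k * z))
E = sp.Matrix([E0 * phase, 0, 0])           # polarization along xhat
B = sp.Matrix([0, (E0 / c) * phase, 0])     # bhat = khat x ehat = yhat

def curl(F):
    return sp.Matrix([sp.diff(F[2], y) - sp.diff(F[1], z),
                      sp.diff(F[0], z) - sp.diff(F[2], x),
                      sp.diff(F[1], x) - sp.diff(F[0], y)])

print((curl(E) + sp.diff(B, t)).applyfunc(sp.simplify))         # Faraday: zero vector
print((curl(B) - sp.diff(E, t) / c**2).applyfunc(sp.simplify))  # Ampere (vacuum): zero vector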

Problem 7.1.1 From (7.2), show that B satisfies (7.5).


Problem 7.1.2 A plane wave has frequency f = 5.1 × 10¹⁴ Hz; what is its wavelength?
Problem 7.1.3 Give the form of both E and B for a plane wave of frequency f that is
traveling in the x̂ − ŷ + 2 ẑ direction and polarized in the x̂ + ŷ direction with
magnitude E0 .
Problem 7.1.4 The general solution to the wave equation (in one dimension)

−∂²φ(x, t)/∂t² + v² ∂²φ(x, t)/∂x² = 0
is φ(x, t) = f(x − vt) + g(x + vt) as we saw in Section 4.3.1, with f and g chosen
to satisfy initial conditions. Suppose you “complexify time” as in Problem 1.6.5 by
taking t = is. What does the wave equation look like now? What happens to the
general solution?
Problem 7.1.5 In two dimensions, the Laplace equation reads

∂²u(x, y)/∂x² + ∂²u(x, y)/∂y² = 0.

A solution that is valid for all points except the origin is u(x, y) = u0 log(√(x² + y²)).
Check that this is a solution, and show that it separates into the “f ” and “g” parts you
established in the previous problem.

7.2 Fluids

One of the most familiar places to observe waves is water. The wave equation that governs
water waves is more complicated than the linear wave equation that we get in the study
of electricity and magnetism. Let’s briefly develop a one-dimensional form of the “Euler”
equations for fluids. There are two inputs here, the first is conservation of mass. As water
moves around, the total amount of it remains unchanged. In a local region, the amount of
water can fluctuate, but only because some mass has entered or left the domain, not because
water is created or destroyed. Take ρ to be the density of water (mass per unit length here),
a function of position and time. Then we have a current density J = ρv, another function
of position and time that describes how the water is moving. The conservation law, from
Section 4.8.1, reads
∂ρ/∂t = −∂J/∂x.   (7.11)
The second piece of physics we need is Newton’s second law,¹ m dv/dt = −∂U/∂x for a conservative force with potential energy U. Here, we can replace the mass with a mass density, and the potential energy with an energy density U (energy per unit length), so we have

ρ dv/dt = −∂U/∂x.   (7.12)
This equation applies to a small volume of fluid with density ρ and velocity v that is acted
on by a conservative force density.
The velocity function v depends on position and time, and so the total time derivative
can be written in terms of partial derivatives that take into account the change in v due to
both changing position and changing time. In general, for a function f(x, t), we have
df/dt = (∂f/∂x) ẋ + ∂f/∂t.   (7.13)
The change in position, ẋ, is itself given by v, the function that tells us how the fluid is
moving at different locations, so we can write
df/dt = (∂f/∂x) v + ∂f/∂t.   (7.14)
This is called the “convective derivative” and accounts for the change in f that occurs
because fluid is moving (first term) in addition to the change in f that comes from its
explicit time dependence (second term). Using the convective derivative in (7.12) gives
ρ ∂v/∂t + ρ v ∂v/∂x = −∂U/∂x.   (7.15)
Finally, we can put this equation into conservative form, i.e. into the form of a
conservation law like (7.11), by writing the time-derivative of J = ρv in terms of a spatial
derivative, employing (7.11) along the way:

1 We’re using a partial spatial derivative here as a reminder that in higher dimension, we could have a potential
energy that depends on the other coordinates.

∂(ρv)/∂t = ρ ∂v/∂t + v ∂ρ/∂t = −ρv ∂v/∂x − ∂U/∂x − v ∂(ρv)/∂x = −∂/∂x(ρv² + U).   (7.16)
In this context, what we have is conservation of momentum (density) ρv, with an effective
force governed by both U and ρv². We can write the pair of equations now, both conservation laws, one for ρ, one for J = ρv:

∂ρ/∂t = −∂J/∂x
∂J/∂t = −∂/∂x(J²/ρ + U).   (7.17)
This pair of equations is nonlinear in its variables (with J2 /ρ appearing in the second one),
leading to immediate complications in its solution.

7.2.1 Three Dimensions


We can update the one-dimensional equations from (7.17) to three dimensions, where we
have the mass conservation equation from (6.52). For ρ the mass per unit volume and
J = ρv the current,
∂ρ/∂t = −∇·J   (7.18)
as in Section 6.3.4. The force equation becomes, for U now a potential energy density
(see Problem 4.1.8) in three dimensions (energy per unit volume),
ρ dv/dt = −∇U.   (7.19)
In three dimensions, for a function f(x, y, z, t), we again have the convective derivative,
df/dt = ẋ ∂f/∂x + ẏ ∂f/∂y + ż ∂f/∂z + ∂f/∂t = v · ∇f + ∂f/∂t,   (7.20)
or, for f → v,
dv/dt = (v · ∇) v + ∂v/∂t,   (7.21)
so that (7.19) becomes
ρ ∂v/∂t + ρ (v · ∇) v = −∇U.   (7.22)
It is more difficult to put this equation into conservative form.
The potential energy density U has units of force per unit area, and in addition to any
external potential energies governing the fluid (like gravity), this is where the fluid pressure
comes into the picture. If two fluid elements exert a force on each other at their interface,
that force per unit area acting on the interface is called the pressure p, and we can separate
this from other external forces by writing U = p + Ū where Ū represents other physical interactions in a particular problem. When we take Ū = ρgz for mass density ρ and gravity near the surface of the earth (g ≈ 9.8 m/s²), we get the “Euler” equations (for ρ and v;

there is an additional equation enforcing conservation of energy which we omit here for
simplicity),
∂ρ/∂t = −∇·(ρv)
ρ ∂v/∂t + ρ (v · ∇) v = −∇(p + ρgz).   (7.23)
These equations govern, for example, gravity-driven water waves. The pressure p must be specified, and it is typical to take p ∝ ρ^γ, i.e., pressure is proportional to density raised to
some power.

7.2.2 Shallow Water Wave Equation


In three dimensions, we want to find the density ρ(x, y, z, t) and vector v(x, y, z, t)
using (7.23) with appropriate initial and boundary conditions. This is a difficult task (to
say the least), and there are many simplifying approximate forms for (7.23) that apply in
particular situations. One special case is the “shallow water approximation.” Suppose we
have a volume of fluid with vertical depth (in the z direction, say) that is small compared
to its horizontal extent, a puddle of water, if you like. In this case, we are interested in the
horizontal motion of the water rather than its vertical motion. To impose the approximation,
we assume that the vertical (z-) component of velocity is negligible compared to the
horizontal ones, and that the details of the water below its surface are irrelevant (things
like the geometry of the bottom of the puddle are ignorable, for example, we can consider
a perfectly flat bottom). The approximation allows us to focus on the waves that form
on the surface of the water, and ultimately the shallow water equations refer only to that
surface.
We’ll develop the shallow water equations in one horizontal direction (leaving two
for Problem 7.2.3). Imagine a three-dimensional volume of water with a wave riding along
the top surface as shown in Figure 7.2. The top surface is given by the function h(y, t), and
we’ll assume, as shown in the figure, that the wave’s shape is the same in the x direction
so that we need only consider a slice in the yz plane to describe it. The relevant component
of velocity is vy : we have assumed that vz ≈ 0 is negligible compared to vy , and because of
the assumed symmetry along the x direction, there is no vx . The function vy (y, t) describes
the horizontal velocity of the wave, and we assume it is independent of z (again, because



Fig. 7.2 A wave traveling in the ŷ direction. The height of the top surface of the water is given by h(y, t).

the water is shallow, the surface wave velocity holds at any of the necessarily negligible
depths).
The density of the water is a function of y, z, and t, and can be written (recall the
definition of the step function from (2.133))

ρ(y, z, t) = ρ0 θ(h(y, t) − z) (7.24)

which says that the mass density of the water is only nonzero below its top surface
h(y, t) where it has uniform mass density (and no pressure variation). This is, again, a
manifestation of the shallow water assumption – the depth isn’t great enough to support
changes in density beneath the water’s surface.
From our assumptions about the velocity components and their dependencies, the
conservation of density and ŷ component of (7.22) read

∂ρ/∂t + v_y ∂ρ/∂y + ρ ∂v_y/∂y = 0
ρ ∂v_y/∂t + ρ v_y ∂v_y/∂y = −∂U/∂y,   (7.25)

where the potential energy density U = ρgz has implicit y dependence through ρ (and,
again, we are ignoring the pressure contribution). Using the form for ρ from (7.24),
together with its derivatives,

∂ρ/∂t = ρ0 δ(h(y, t) − z) ∂h(y, t)/∂t
∂ρ/∂y = ρ0 δ(h(y, t) − z) ∂h(y, t)/∂y   (7.26)

in (7.25) gives the pair


 
0 = δ(h(y, t) − z) [∂h(y, t)/∂t + v_y(y, t) ∂h(y, t)/∂y] + θ(h(y, t) − z) ∂v_y(y, t)/∂y   (7.27)
−δ(h(y, t) − z) gz ∂h(y, t)/∂y = θ(h(y, t) − z) [∂v_y(y, t)/∂t + v_y(y, t) ∂v_y(y, t)/∂y].

We are not interested in what happens below the surface of the fluid. In order to eliminate
the “underwater” portion of these equations, we can integrate from z = 0 → ∞, and use
the delta and step functions to simplify the integrals,
∂h(y, t)/∂t + v_y(y, t) ∂h(y, t)/∂y + ∫_0^{h(y,t)} ∂v_y(y, t)/∂y dz = 0
∫_0^{h(y,t)} ∂v_y(y, t)/∂t dz + v_y(y, t) ∫_0^{h(y,t)} ∂v_y(y, t)/∂y dz = −g h(y, t) ∂h(y, t)/∂y.   (7.28)

All of the remaining integrals are z-independent, so we get


∂h(y, t)/∂t + v_y(y, t) ∂h(y, t)/∂y + h(y, t) ∂v_y(y, t)/∂y = 0
h(y, t) ∂v_y(y, t)/∂t + h(y, t) v_y(y, t) ∂v_y(y, t)/∂y = −g h(y, t) ∂h(y, t)/∂y.   (7.29)
Using the top equation, we can write this pair in “conservative” form,
∂h(y, t)/∂t = −∂/∂y (h(y, t) v_y(y, t))
∂(h(y, t) v_y(y, t))/∂t = −∂/∂y [h(y, t) (v_y(y, t))² + (1/2) g h(y, t)²].   (7.30)

These are the “shallow water equations.” The goal is to solve for the surface h(y, t) and
velocity at the surface, vy (y, t) given initial data and boundary conditions.
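The conservative form is well suited to simple finite-difference experiments. The following is a minimal sketch (Python with NumPy assumed; the grid, time step, initial bump, and periodic boundaries are illustrative choices, not part of the text’s development) that advances (7.30) with the Lax–Friedrichs method.

import numpy as np

g, N, L = 9.8, 400, 100.0
dy = L / N
y = np.linspace(0, L, N, endpoint=False)
h = 1.0 + 0.1 * np.exp(-((y - L / 2)**2) / 4.0)   # initial surface: a small bump
q = np.zeros(N)                                   # q = h v_y, fluid initially at rest

def flux(h, q):
    # the fluxes appearing on the right-hand side of (7.30)
    return np.array([q, q**2 / h + 0.5 * g * h**2])

dt = 0.4 * dy / np.sqrt(g * h.max())              # conservative CFL-style time step
for _ in range(500):
    u = np.array([h, q])
    F = flux(h, q)
    # Lax-Friedrichs update with periodic boundaries
    u = 0.5 * (np.roll(u, -1, axis=1) + np.roll(u, 1, axis=1)) \
        - dt / (2 * dy) * (np.roll(F, -1, axis=1) - np.roll(F, 1, axis=1))
    h, q = u
print(h.min(), h.max())   # the bump splits into left and right traveling pieces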

Problem 7.2.1 For a particle traveling along a one-dimensional trajectory, x(t), the associ-
ated mass density is ρ(x, t) = mδ(x−x(t)) and the particle travels with v(x, t) = ẋ(t).
Show that mass conservation, (7.11), is satisfied.
Problem 7.2.2 Write the one-dimensional shallow water equation height function as
h(y, t) = h0 + η(y, t) where h0 is some mean height and η(y, t) rides on top. Take
η(y, t) and the velocity vy (y, t) to be small, and write the shallow water equations
in terms of η and vy “to first order” (meaning that you will eliminate η2 , v2y , and
ηvy terms once everything is expanded). This pair is called the “linearized” shallow
water equations. Assuming a wave-like solution of the form η(y, t) = F(y − vt), find
the wave speed v that solves the linearized shallow water equations (hint: use one of
the equations to express vy in terms of η).
Problem 7.2.3 Work out the two-dimensional shallow water equations starting from (7.23) –
keep vz = 0 (and its derivatives), but assume both vy and vx are nonzero and can
depend on the coordinates x, y, and the time t. Similarly, the height function can
be written as h(x, y, t). The derivation basically follows the one-dimensional case
earlier, but with the additional x-velocity component. Don’t worry about writing the
resulting PDEs in conservative form.
Problem 7.2.4 The Korteweg–de Vries (KdV) equation is related to the shallow water equa-
tions and can be used to describe fluid motion in a long channel. In dimensionless
form (see Section 8.2.4) it reads

∂f(y, t)/∂t + ∂³f(y, t)/∂y³ − 6 f(y, t) ∂f(y, t)/∂y = 0.   (7.31)
There is a special class of solution called a “soliton” that behaves like a traveling
wave. To find it, assume f(y, t) = P(y − vt) a right-traveling waveform. This
assumption turns the PDE into an ODE that can be solved for P(u) – first integrate
the ODE once assuming that P(±∞) → 0 as a boundary condition, then think about
an ansatz of the form P(u) = a/ cosh2 (bu) for constants a and b to be determined.

7.3 Nonlinear Wave Equation

When we developed the wave equation for strings in Section 4.2, we made several
related and unreasonable assumptions, like constant tension throughout the string and the
requirement that the string pieces move vertically with no longitudinal motion. The benefit
was simplicity, resulting in a linear wave equation with known solutions. More realistic
models lead to much more complicated nonlinear partial differential equations. These are
interesting to develop even as they are difficult to solve. Here, we will work out the coupled
longitudinal and transverse equations of motion for a string with spatially varying tension.
We’ll take a “string” that has uniform mass density μ when in equilibrium, where the
string lies straight along what we’ll take to be the x axis. Focusing on a portion of the string
between x and x + dx as on the left in Figure 7.3, the total mass is μdx. Now suppose the
string is not in equilibrium, as on the right in Figure 7.3. The portions of the string have
moved both vertically and horizontally. The piece of string that was in equilibrium at x is
now at vector location r(x, t) at time t, and the piece that was in equilibrium at x + dx is at
r(x + dx, t).
The force of tension acts along the string, pointing in the direction tangent to the string
everywhere. That tangent direction is given by the x-derivative of the position vector, which
we’ll denote with primes. The tangent vector on the left, at time t, is r′(x, t), and the one on the right is r′(x + dx, t). We’ll take the magnitude of the tension to be T(x) at location x, so
we need to make unit tangent vectors in order to express the forces acting on the segment
of string. Define the unit vector
p̂(x, t) ≡ r′(x, t)/√(r′(x, t) · r′(x, t)),   (7.32)
which can be evaluated at x + dx to describe the unit tangent vector on the right. The net
force on the patch of string comes from adding the tension forces on the left and right. Since
positive values of tension correspond to “pulling” (instead of “pushing”), the direction of
the tension on the left is −p̂(x, t), and on the right, p̂(x + dx, t). The total force on the
segment, at time t, is


Fig. 7.3 On the left, a piece of string in equilibrium with mass μdx. On the right, the same piece has been stretched in both
the horizontal and vertical directions. The vectors r(x, t) and r(x + dx, t) point to the new locations of the left and
right end points of the piece of string at time t. The vectors tangent to those endpoints are given by the spatial
derivatives r′(x, t) on the left, r′(x + dx, t) on the right.

F(x) = −T(x) p̂(x, t) + T(x + dx) p̂(x + dx, t) (7.33)

and then Newton’s second law reads (using dots to denote partial time derivatives)

μ dx r̈(x, t) = T(x + dx) p̂(x + dx, t) − T(x) p̂(x, t). (7.34)

For dx small, the term on the right can be Taylor expanded,


μ dx r̈(x, t) ≈ ∂/∂x (T(x) p̂(x, t)) dx.   (7.35)
Now we can cancel the dx from both sides to arrive at the general expression, written
entirely in terms of the vector r(x, t) that, at time t, points from the origin to the piece of
string that would be at x in equilibrium,
 
μ r̈(x, t) = ∂/∂x [T(x) r′(x, t)/√(r′(x, t) · r′(x, t))].   (7.36)

Next, let’s think about how to express the vector r(x, t) in terms of its components. The
vertical part is easy: we define the function v(x, t) (not to be confused with velocity here)
that gives the height above the x axis of the string segment at x. For the horizontal piece,
we want a function u(x, t) that gives the horizontal displacement of the piece of string that
is in equilibrium at x, relative to x, just as we had with φ(x, t) in Section 4.1. Then the
horizontal component of r(x, t) is x + u(x, t) so that when u(x, t) = 0 (equilibrium) the
horizontal location of the string is x. The position vector is

r(x, t) = (x + u(x, t)) x̂ + v(x, t) ŷ. (7.37)

The equation of motion (7.36), written in terms of the functions u(x, t) and v(x, t) becomes
the pair
μ ü(x, t) = ∂/∂x [T(x) (1 + u′(x, t))/√((1 + u′(x, t))² + v′(x, t)²)]
μ v̈(x, t) = ∂/∂x [T(x) v′(x, t)/√((1 + u′(x, t))² + v′(x, t)²)].   (7.38)

Here we have equations governing both longitudinal (top equation) and transverse (bottom)
motion, and the motion is coupled, you can’t have one without the other.
How do we recover the wave equation from Section 4.2? There, we only had transverse
motion, so only the second equation in (7.38) is relevant (we took u(x, t) and its derivatives
to be zero). We assumed that the magnitude of tension was constant, T(x) = T0 , so that the
equation governing v(x, t) is
 
v̈(x, t) = (T0/μ) ∂/∂x [v′(x, t)/√(1 + v′(x, t)²)]   (7.39)

in this approximation. Finally, the small angle assumption (related to the tangent vector, ∼ v′(x, t)) means that we can Taylor expand the term in parentheses:

v′(x, t)/√(1 + v′(x, t)²) ≈ v′(x, t) + O(v′(x, t)³),   (7.40)

giving

−∂²v(x, t)/∂t² + (T0/μ) ∂²v(x, t)/∂x² = 0,   (7.41)
which is (4.23).
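The expansion in (7.40) (and the cubic correction requested in Problem 7.3.3) can be generated automatically. A one-line check with SymPy (Python assumed; the symbol vp stands in for v′(x, t)):

import sympy as sp

vp = sp.symbols('vp', real=True)   # vp plays the role of v'(x, t)
print(sp.series(vp / sp.sqrt(1 + vp**2), vp, 0, 6))   # vp - vp**3/2 + 3*vp**5/8 + O(vp**6)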
Going back to the full equations in (7.38), what form should we use for T(x)? There are
a variety of options, but perhaps the simplest is to take a constant tension and introduce
a “linear” correction, similar to Hooke’s law itself. When the string is in its equilibrium
configuration, lying along the x axis as in Figure 7.3, we have u(x, t) = 0, u (x, t) = 0 and
v (x, t) = 0, and we could have constant tension T0 that wouldn’t cause a net force on the
patch of string (equal and opposite tensions cancel). The length of the piece of string is dx.
Once the string has been stretched, it has a new length approximated by

dℓ = √(r′(x, t) · r′(x, t)) dx,   (7.42)

just the length of the tangent vector r′(x, t) shown on the right in Figure 7.3. We will assume that the correction to the tension is linear in the difference dℓ − dx. Then the tension as a function of position takes the form

T(x) = T0 + K (dℓ − dx)/dx = T0 + K (√(r′(x, t) · r′(x, t)) − 1)   (7.43)
for K a constant with units of tension, since we made a dimensionless displacement
measure. Putting this assumed form into (7.36) gives

μ r̈(x, t) = (T0 − K) ∂/∂x [r′(x, t)/√(r′(x, t) · r′(x, t))] + K r″(x, t).   (7.44)
In this form, we can again see the wave equation emerge for K = T0 , a relation set by the
material-dependent constants K and T0 . In that case, the transverse and longitudinal pieces
of the wave equation decouple, effectively functioning independently.
Problem 7.3.1 The “curvature” of a curve specified by w(x) is defined to be the magnitude
of the curvature vector,
 
k(x) = (1/√(w′(x) · w′(x))) ∂/∂x [w′(x)/√(w′(x) · w′(x))],   (7.45)
proportional to the derivative of the unit tangent vector to the curve. Calculate
the curvature of a circle with w(x) = R(cos(x) x̂ + sin(x) ŷ). The inverse of the
magnitude of the curvature defines the “radius of curvature” of a curve: Does that
make sense for the curvature vector for a circle? Notice that the nonlinear wave
equation in (7.44) can be written in terms of the curvature. Indeed, if you wrote
the “wave” part of that equation on the left, you could even say that the nonlinear

equation was a wave equation that had, as its “source,” a term proportional to its own
curvature.
Problem 7.3.2 Show, from its definition, that the curvature vector k(x) is orthogonal to the
unit tangent vector p̂(x) from (7.32). Check that this is true for the circular example
from the previous problem.
Problem 7.3.3 Work out and keep the v′(x, t)³ term from (7.40) in (7.39) to get the first
corrective update to (7.41).

7.4 Schrödinger’s Wave Equation

Schrödinger’s equation governs the quantum mechanical “motion” of a particle of mass m


that is acted on by a potential energy U. In one dimension, the equation reads
−(ℏ²/(2m)) ∂²Ψ(x, t)/∂x² + U(x)Ψ(x, t) = iℏ ∂Ψ(x, t)/∂t,   (7.46)

where ℏ is “Planck’s constant.” The goal is to solve for Ψ(x, t) (which could be a complex
function; i appears in the equation) given some boundary and initial conditions. This “wave
equation” plays a central quantitative role in quantum mechanics, analogous to Newton’s
second law in classical mechanics. The interpretation of the target function Ψ(x, t) is
what makes quantum mechanics fundamentally different from all of classical physics. The
“wave function” tells us about the probability of finding a particle in the dx vicinity of
x at time t. That probability is given by2 dP = Ψ(x, t)∗ Ψ(x, t)dx. Quantum mechanics
makes predictions about probabilities and averages obtained by performing experiments
over and over from the same starting point. Given an ensemble of particles acted on by
the same potential and starting from some initial probability distribution (in either space or
momentum), the wave function tells us the likelihood of finding a particle near a particular
location. Moving on from infinitesimals, if we want to know the probability of finding a
particle between x = a and x = b, we can integrate
P(a, b) = ∫_a^b Ψ*(x, t) Ψ(x, t) dx.   (7.47)

Because of the central role of probability here, and the statistical interpretation of the
function Ψ(x, t), we pause to review some statistics.

7.4.1 Densities and Averages


A probability density ρ(x, t) in one dimension has dimension of inverse length (or, if
you like, “probability” per unit length). The probability of an event occurring within a
dx window of x is dP = ρ(x, t)dx, and the probability of an event occurring in a region
x ∈ [a, b] is

2 Note that one peculiarity of densities is that the probability of finding a particle at x is zero. You can only have
nonzero probability of finding the particle “in the vicinity,” dx, of x.

P(a, b) = ∫_a^b ρ(x, t) dx.   (7.48)
Probability densities are normalized so that the probability of an event occurring over the
entire x axis must be one at all times t:
P(−∞, ∞) = ∫_{−∞}^{∞} ρ(x, t) dx = 1.   (7.49)

In quantum mechanics, we have ρ(x, t) = Ψ∗ (x, t)Ψ(x, t), and the event of interest is
measuring a particle near location x.
We can use probability densities to compute “average” values of functions of position.
Given some function of position, f(x), the average or “expectation” value of the function
is just the sum of all values of f(x) weighted by the probability of the event occurring at x,
so that
⟨f⟩(t) ≡ ∫_{−∞}^{∞} ρ(x, t) f(x) dx.   (7.50)

The probability density ρ(x, t) may be a function of time, as is the case for densities
that come from a wave function solving (7.46), or it may be static. As an example of an
expectation value, we might be interested in the average location of a particle. The function
f(x) = x, in that case, and we add up all the possible positions of the particle multiplied by
the probability of finding the particle near each:
⟨x⟩(t) = ∫_{−∞}^{∞} ρ(x, t) x dx.   (7.51)

Again, the average value may or may not be time-dependent depending on the form of
ρ(x, t).
With the average value in hand, we can ask how much deviation from the average is
associated with a particular probability density. We don’t particularly care whether the
deviation is to the left or right of the average value, so it makes sense to compute the
average of the new function f(x) = (x − ⟨x⟩)², the “variance”:

σ²(t) ≡ ⟨(x − ⟨x⟩)²⟩(t) = ∫_{−∞}^{∞} ρ(x, t) (x² − 2x⟨x⟩ + ⟨x⟩²) dx.   (7.52)

The average ⟨x⟩ and its square ⟨x⟩² are just numbers (from the point of view of the position integral) and can be pulled outside the integral, leaving us with

σ²(t) = ∫_{−∞}^{∞} ρ(x, t) x² dx − 2⟨x⟩ ∫_{−∞}^{∞} ρ(x, t) x dx + ⟨x⟩² ∫_{−∞}^{∞} ρ(x, t) dx
      = ⟨x²⟩ − 2⟨x⟩² + ⟨x⟩²   (7.53)
      = ⟨x²⟩ − ⟨x⟩².
If you want a measure of the “spread” of the distribution, you can take the square root of the variance to obtain the “standard deviation,” σ.³
3 Note that there are other ways to measure the “spread.” You could, for example, compute ⟨|x − ⟨x⟩|⟩, which is also a measure of the average distance to the mean and is insensitive to direction (above or below the mean). The variance is a nice, continuous, differentiable function of the difference x − ⟨x⟩, and that makes it easier to work with than the absolute value (see Problem 7.4.3).

Example: Gaussian Density


Consider the well-known Gaussian probability density
$$\rho(x) = A\,e^{-B(x-C)^2} \qquad (7.54)$$
for constants A, B, and C – this density is time-independent, so we removed the t in ρ(x, t)
as a reminder. First, we will find out what the normalization requirement (7.49) tells us,
$$\int_{-\infty}^{\infty} \rho(x)\,dx = \int_{-\infty}^{\infty} A\,e^{-B(x-C)^2}\,dx = A\sqrt{\frac{\pi}{B}} = 1 \qquad (7.55)$$
so that A = √(B/π) in order to normalize the density. Here, and for the integrals below, we
have used the definite integral identities (for even and odd powers):
$$\int_0^{\infty} x^{2n} e^{-x^2}\,dx = \sqrt{\pi}\,\frac{(2n)!}{n!}\left(\frac{1}{2}\right)^{2n+1}, \qquad \int_0^{\infty} x^{2n+1} e^{-x^2}\,dx = \frac{n!}{2}, \qquad (7.56)$$
which we will not prove, but you can test experimentally in Problem 8.3.4.
The average value of position is
$$\langle x\rangle = \int_{-\infty}^{\infty} x\,\sqrt{\frac{B}{\pi}}\,e^{-B(x-C)^2}\,dx = C \qquad (7.57)$$
so that C is the average value here; the Gaussian is peaked at C, so let C ≡ μ for “mean.” For
the variance, we have
$$\langle (x-\mu)^2\rangle = \int_{-\infty}^{\infty} (x-\mu)^2\,\sqrt{\frac{B}{\pi}}\,e^{-B(x-\mu)^2}\,dx = \frac{1}{2B} \equiv \sigma^2. \qquad (7.58)$$
Since the variance is denoted σ², we sometimes write B = 1/(2σ²), and then the normalized
Gaussian, with all constants tuned to their statistical meaning, becomes
$$\rho(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}. \qquad (7.59)$$
The standard picture of the density, with mean, μ, and standard deviation, σ, marked on it,
is shown in Figure 7.4.
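As a rough numerical check of (7.55), (7.57), and (7.58), here is a short Python sketch (the particular values of μ and σ are arbitrary test choices, not anything prescribed above):

import numpy as np
from scipy.integrate import quad

# Check normalization, mean, and variance of the Gaussian density (7.59)
# for arbitrary test values of mu and sigma.
mu, sigma = 1.5, 0.7
rho = lambda x: np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

norm, _ = quad(rho, -np.inf, np.inf)                              # should be 1
mean, _ = quad(lambda x: x * rho(x), -np.inf, np.inf)             # should be mu
var, _ = quad(lambda x: (x - mean)**2 * rho(x), -np.inf, np.inf)  # should be sigma^2
print(norm, mean, var)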

Example: Constant Density


In quantum mechanics, we describe particle motion using probability densities. You can
do this classically as well, although it is not as common. As a vehicle, think of a ball
bouncing back and forth elastically between two walls (no gravity) separated by a distance
a (one wall is at x = 0, the other at x = a), a classical “infinite square well.” In one
cycle of the ball’s motion, what is the probability of finding the ball in the vicinity of any
point x ∈ [0, a]? No point is preferred, the ball moves back and forth at constant speed,
so it spends the same amount of time at every location. Then we know that the probability
density is constant between the walls. Take ρ(x) = 0 (no time-dependence here) for x < 0

Fig. 7.4 A plot of the Gaussian density from (7.59). The mean, μ, as well as one standard deviation to the left and right, μ ± σ, are shown.

and x > a, no chance of finding the particle outside of its walled box. Inside the walls,
ρ(x) = C, and we can normalize,
$$\int_{-\infty}^{\infty} \rho(x)\,dx = \int_0^a C\,dx = aC = 1 \qquad (7.60)$$
from which we conclude that C = 1/a and ρ(x) = 1/a. What is the average value of the
ball’s position? We expect it to be at a/2 on average, but let’s calculate:
$$\langle x\rangle = \int_{-\infty}^{\infty} x\,\rho(x)\,dx = \int_0^a \frac{x}{a}\,dx = \frac{a}{2}. \qquad (7.61)$$
The variance is
$$\left\langle\left(x-\frac{a}{2}\right)^2\right\rangle = \int_{-\infty}^{\infty}\left(x-\frac{a}{2}\right)^2\rho(x)\,dx = \int_0^a \frac{1}{a}\left(x-\frac{a}{2}\right)^2 dx = \frac{a^2}{12}. \qquad (7.62)$$
The standard deviation, σ = a/√12, tells us, on average, how far from the mean the ball is.
Since we are thinking about an average over one cycle to get ρ(x), the actual speed doesn’t
matter to the deviation.
The statistical interpretation in the classical case represented by this and the next
example is different from the quantum mechanical one. In classical mechanics, we are
imagining a single particle moving under the influence of a potential, and the probability
density has the interpretation of the probability of finding that particle at a particular
location given the particle’s time-averaged motion. For the ball bouncing back and forth,
that time-averaging is carried out implicitly, we know the particle visits each point twice
in each full cycle, and it spends the same amount of time in the vicinity of every point
because the ball travels with constant speed. But we are making a prediction about the
whereabouts of a single ball as it moves classically, not, as in quantum mechanics, about
the whereabouts of a particle that is part of an ensemble of similarly prepared particles.

Example: Oscillator Density


Let’s develop the classical probability density for our harmonic oscillator motion to
contrast with the quantum mechanical result we will develop in Section 7.5. A particle
of mass m starts from rest at location −a/2, and moves under the influence of the harmonic
potential energy function U(x) = mω 2 x 2 /2. For x(t) the solution to Newton’s second law
with the associated force F = −U′(x) = −mω²x, the time-dependent spatial probability
density is given by

ρ(x, t) = δ(x − x(t)). (7.63)

In classical mechanics, a particle is at a particular location at a particular time, and this


form for ρ(x, t) enforces that extreme localization and is normalized for all values of t,
thanks to the delta function,
$$\int_{-\infty}^{\infty} \rho(x,t)\,dx = 1. \qquad (7.64)$$

To define the time-independent density, we note that x(t) is oscillatory as in our previous
example, so we can average over a half period, during which the oscillating mass will visit
all locations between −a/2 and a/2, so this suffices. Let ρ(x) be the time average,
$$\rho(x) \equiv \frac{1}{T/2}\int_0^{T/2} \rho(x,t)\,dt = \frac{2}{T}\int_0^{T/2} \delta(x - x(t))\,dt. \qquad (7.65)$$
Using (2.132), we can change variables. Let u ≡ x(t), then
$$\rho(x) = \frac{2}{T}\int_0^{T/2} \delta(x - x(t))\,dt = \frac{2}{T}\int_{-a/2}^{a/2} \frac{\delta(x-u)}{|\dot{x}(t)|}\,du. \qquad (7.66)$$
We must write the integrand in terms of x(t), not its time-derivative, in order to use the
delta function. From conservation of energy with x(t) = u for use in the integral,
$$E = \frac{1}{2}m\dot{x}(t)^2 + \frac{1}{2}m\omega^2 u^2 \qquad (7.67)$$
and from the initial condition, E = mω²a²/8, so that
$$\dot{x}(t) = \pm\omega\sqrt{\left(\frac{a}{2}\right)^2 - u^2}. \qquad (7.68)$$
Now we can replace the ẋ(t) in the integral for ρ(x) with its value in terms of x(t) = u
and evaluate the integral using the delta function,
$$\rho(x) = \frac{2}{T}\int_{-a/2}^{a/2} \frac{\delta(x-u)}{\omega\sqrt{\left(\frac{a}{2}\right)^2 - u^2}}\,du = \frac{2}{\omega T}\,\frac{1}{\sqrt{\left(\frac{a}{2}\right)^2 - x^2}}. \qquad (7.69)$$
Finally, we use the expression for the period, T = 2π/ω, to write
$$\rho(x) = \frac{1}{\pi\sqrt{\left(\frac{a}{2}\right)^2 - x^2}}. \qquad (7.70)$$

Fig. 7.5 The density from (7.70) as a function of position. The value at x = 0 is 2/(aπ), and the function goes to infinity as x → ±a/2.

The classical harmonic oscillator is constrained to move between −a/2 and a/2, and cannot
be found outside of that domain. If x is outside of [−a/2, a/2], the density is complex, a
warning sign here. We’ve defined the density to be zero for |x| > a/2, which is sensible.
As you can check, the density is already normalized for x ∈ [−a/2, a/2]. A plot of ρ(x) is
shown in Figure 7.5. Notice that the density goes to infinity at the turning points. Because
the mass stops at those points, it is more likely to be in their vicinity.
What do we expect from the average position and its variance? The average location
should be ⟨x⟩ = 0 from the symmetry of the oscillator. The spread should be relatively wide
since the value of the density is largest at the end points. Performing the integration, we get
$$\langle x\rangle = \int_{-a/2}^{a/2} \frac{x}{\pi\sqrt{\left(\frac{a}{2}\right)^2 - x^2}}\,dx = 0, \qquad (7.71)$$
and since the mean is zero, the variance is easy to compute
$$\sigma^2 = \langle x^2\rangle = \int_{-a/2}^{a/2} \frac{x^2}{\pi\sqrt{\left(\frac{a}{2}\right)^2 - x^2}}\,dx = \frac{1}{2}\left(\frac{a}{2}\right)^2 \qquad (7.72)$$
so that σ = (a/2)/√2.
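As a rough check of (7.70) (a sketch, not part of the development above; the values of a and ω are arbitrary), we can sample x(t) = −(a/2) cos(ωt) at evenly spaced times over half a period and compare a histogram of the positions with the density:

import numpy as np

# Sample the oscillator position uniformly in time over half a period and
# compare the resulting histogram to rho(x) = 1 / (pi sqrt((a/2)^2 - x^2)).
a, omega = 2.0, 3.0
T = 2 * np.pi / omega
t = np.linspace(0.0, T / 2, 200001)
x = -(a / 2) * np.cos(omega * t)

counts, edges = np.histogram(x, bins=50, range=(-a / 2, a / 2), density=True)
centers = 0.5 * (edges[1:] + edges[:-1])
rho_exact = 1.0 / (np.pi * np.sqrt((a / 2)**2 - centers**2))
for c, h, r in zip(centers[::10], counts[::10], rho_exact[::10]):
    print(f"x = {c:+.2f}   histogram = {h:.3f}   exact = {r:.3f}")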

7.4.2 Separation and Schrödinger’s Equation


Let’s go back to Schrödinger’s equation (7.46) and think about how to solve this partial
differential equation. As written, the potential energy function U(x) is time-independent,
and that hints at a separation of variables approach. Let Ψ(x, t) = ψ(x)T(t), the usual
multiplication separation ansatz. Putting this into the Schrödinger equation and dividing
by Ψ(x, t) allows us to write
$$-\frac{\hbar^2}{2m\,\psi(x)}\,\frac{d^2\psi(x)}{dx^2} + U(x) = i\hbar\,\frac{1}{T(t)}\frac{dT(t)}{dt}. \qquad (7.73)$$

The left side depends only on x while the right depends only on t. For the equation to
hold for all x and t, each side must be separately constant. That separation constant is
traditionally called E, and we can use it to immediately solve for the time-dependent
equation
$$i\hbar\,\frac{dT(t)}{dt} = E\,T(t) \;\longrightarrow\; T(t) = A\,e^{-iEt/\hbar} \qquad (7.74)$$
where A is a constant of integration, ultimately used to normalize the wave function.
Next we have to solve the “time-independent wave equation”
$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + U(x)\,\psi(x) = E\,\psi(x) \qquad (7.75)$$
subject to some boundary conditions. This equation cannot be solved until the physical
environment is specified by giving U(x).
If we just take the solution we have so far, with
$$\Psi(x,t) = A\,e^{-iEt/\hbar}\,\psi(x), \qquad (7.76)$$

the associated position probability density is

ρ(x, t) = Ψ(x, t)∗ Ψ(x, t) = |A|2 ψ(x)∗ ψ(x), (7.77)

which is time-independent. For this reason, these individual separable solutions are referred
to as “stationary states.” Schrödinger’s equation is linear in Ψ(x, t), and so superposition
holds. It is only when we take solutions with multiple values for E, and add them together
in a weighted sum to form Ψ(x, t) that we get nontrivial dynamics for the probability
density ρ(x, t). We’ll focus on the stationary solutions for a few different potential energy
functions.

7.4.3 Time-Independent Solutions


The structure of the time-independent Schrödinger equation is that of a continuous
eigenvalue problem (recall Problem 3.3.11) – we must find ψ(x) and E with
$$\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + U(x)\right]\psi(x) = E\,\psi(x), \qquad (7.78)$$

where the term in brackets represents a linear differential operator, a continuous “matrix”
of sorts. We typically require that the probability density, and hence ψ, vanish at spatial
infinity. For any spatially localized potential energy, we don’t expect to find the particle
infinitely far away. Think of performing an experiment in a lab, the electron you’re moving
around with some electrostatic potential should remain in the lab, and not end up at the
other side of the universe. This assumption gives us boundary conditions, ψ(±∞) = 0,
and we’ll use those in what follows.

Infinite Square Well


Our first case will be the infinite square well, just a pair of impenetrable walls at x = 0 and
a with potential energy
$$U(x) = \begin{cases} 0 & 0 < x < a \\ \infty & x < 0 \text{ or } x > a \end{cases} \qquad (7.79)$$
The infinite potential outside the well allows us to set ψ = 0 for all points outside,
automatically satisfying our boundary condition. Since (7.78) is second order, the wave
function is continuous, and then we have ψ(0) = ψ(a) = 0. Our problem has been moved
entirely inside the box, with localized boundary conditions. Inside, U(x) = 0, and we have
to solve
$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} = E\,\psi(x), \qquad (7.80)$$
a familiar equation if there ever was one. Start with the general solution for constants F
and G,
$$\psi(x) = F\cos\!\left(\sqrt{\frac{2mE}{\hbar^2}}\,x\right) + G\sin\!\left(\sqrt{\frac{2mE}{\hbar^2}}\,x\right). \qquad (7.81)$$
Now for the boundaries, from the one at x = 0, ψ(0) = F = 0. Then the second is
$$\psi(a) = G\sin\!\left(\sqrt{\frac{2mE}{\hbar^2}}\,a\right) = 0 \qquad (7.82)$$
and we can’t take G = 0 to satisfy this equation, since then we get the trivial ψ(x) = 0, no
particle anywhere. So instead, we must have
$$\sin\!\left(\sqrt{\frac{2mE}{\hbar^2}}\,a\right) = 0 \;\longrightarrow\; \sqrt{\frac{2mE}{\hbar^2}}\,a = n\pi \qquad (7.83)$$
for integer n. This equation can be satisfied for an integer-indexed infinite family of values
for E,
$$E_n = \frac{n^2\pi^2\hbar^2}{2ma^2}, \qquad (7.84)$$
and then ψ(x) is indexed by n as well,
$$\psi_n(x) = G\sin\!\left(\frac{n\pi x}{a}\right). \qquad (7.85)$$
In terms of the standard interpretation of quantum mechanics, if you measured the energy
of the particle in the infinite square well, you would get one of the $\{E_n\}_{n=1}^{\infty}$ as the result.
These are the “allowed energies” that could be obtained upon energy measurement, and
their integer index represents the quantization of energy. The solution to the full time-
dependent wave for one of these stationary states is (absorbing the constant A from (7.74)
into G in ψn(x))
$$\Psi(x,t) = e^{-iE_n t/\hbar}\,\psi_n(x). \qquad (7.86)$$

Again, there is no time dependence in ρ(x, t) for this pure stationary state, where
Ψ(x, t)∗ Ψ(x, t) = ψn (x)∗ ψn (x). But we can use superposition to write a general solution
for Ψ(x, t),
$$\Psi(x,t) = \sum_{n=1}^{\infty} G_n\,e^{-iE_n t/\hbar}\sin\!\left(\frac{n\pi x}{a}\right). \qquad (7.87)$$

To find the coefficients $\{G_n\}_{n=1}^{\infty}$, we need an initial function f(x) where we require
Ψ(x, 0) = f(x), then
$$\Psi(x,0) = \sum_{n=1}^{\infty} G_n\sin\!\left(\frac{n\pi x}{a}\right) = f(x), \qquad (7.88)$$

and we would use the completeness and orthogonality of the sine functions to isolate the
coefficients as in Section 4.3.3.
Focusing on the stationary states themselves, the building blocks of more general
solutions, we can calculate the average position (expectation value) and variance to see
how they compare with the classical case of a ball bouncing between two walls. First, we
need to normalize the quantum mechanical probability density. Working from (7.85), the
density is
$$\rho_n(x) = \psi_n^*(x)\,\psi_n(x) = |G|^2\sin^2\!\left(\frac{n\pi x}{a}\right) \qquad (7.89)$$
and then normalization requires
$$\int_{-\infty}^{\infty} \rho_n(x)\,dx = 1 = |G|^2\int_0^a \sin^2\!\left(\frac{n\pi x}{a}\right)dx = |G|^2\,\frac{a}{2} \qquad (7.90)$$
so that we could take G = √(2/a). Then the position expectation value is
$$\langle x\rangle = \int_{-\infty}^{\infty} x\,\rho_n(x)\,dx = \frac{2}{a}\int_0^a x\sin^2\!\left(\frac{n\pi x}{a}\right)dx = \frac{a}{2} \qquad (7.91)$$
as expected, and matching the classical result.
For the variance, we have
$$\sigma^2 \equiv \left\langle\left(x-\frac{a}{2}\right)^2\right\rangle = \int_{-\infty}^{\infty}\left(x-\frac{a}{2}\right)^2\rho_n(x)\,dx = \int_0^a \left(x-\frac{a}{2}\right)^2\frac{2}{a}\sin^2\!\left(\frac{n\pi x}{a}\right)dx = \frac{a^2}{12}\left(1 - \frac{6}{n^2\pi^2}\right). \qquad (7.92)$$
Comparing this with (7.62), we recover the classical variance as n → ∞.
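The limit is easy to see numerically; here is a small Python check of (7.91) and (7.92) by direct integration (the well width a = 1 is an arbitrary choice):

import numpy as np
from scipy.integrate import quad

# Mean and variance of rho_n(x) = (2/a) sin^2(n pi x / a) on [0, a],
# compared with the closed form (a^2/12)(1 - 6/(n^2 pi^2)) from (7.92).
a = 1.0
for n in (1, 2, 5, 50):
    rho = lambda x: (2 / a) * np.sin(n * np.pi * x / a)**2
    mean, _ = quad(lambda x: x * rho(x), 0, a, limit=200)
    var, _ = quad(lambda x: (x - mean)**2 * rho(x), 0, a, limit=200)
    formula = a**2 / 12 * (1 - 6 / (n**2 * np.pi**2))
    print(n, mean, var, formula)   # the variance approaches a^2/12 as n grows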
Problem 7.4.1 Show that the Gaussian density in (7.59) is normalized using the provided
identity in (7.56).
Problem 7.4.2 Using (7.56), compute the expectation values ⟨x⟩ and ⟨x²⟩ for the den-
sity (7.59) explicitly.
Problem 7.4.3 What is the derivative of the function f(x) = |x| (absolute value for real x)?
What is the second derivative? This is why it is preferable to compute the spread of
a distribution using the standard deviation.

Problem 7.4.4 Suppose you had a quartic potential of the form: U(x) = αx 4 for constant
α (with what units?). Without worrying about normalization, what is the form of
the classical, time-averaged probability density assuming a particle of mass m starts
from rest at −a? Sketch ρ(x) for x = −a → a.
Problem 7.4.5 Given a potential energy, U(x), that leads to oscillatory behavior for particles
that move under its influence, use conservation of energy to solve for ẋ(t) as a
function of x(t), and evaluate ρ(x) from (7.66) to get the general expression.
Problem 7.4.6 The time-independent form of Schrödinger’s equation is shown in (7.78),
but the complex conjugate of that equation is also equally valid (the “energy” E is
real here). Once the boundary conditions, ψ(±∞) = 0, are in place, we have seen
that the equations can be solved with quantized energy and wave functions. Suppose
you have two solutions, indexed by integers n and m, and with En ≠ Em. Use (7.78),
its complex conjugate, and the probabilistic interpretation of the wave function to
prove the following orthonormality relation:
$$\int_{-\infty}^{\infty} \psi_n^*(x)\,\psi_m(x)\,dx = \delta_{nm} \qquad (7.93)$$

using, for example, the technique from Section 6.7.2.


Problem 7.4.7 Complexify time in the one-dimensional Schrödinger equation with generic
potential U(x) by taking t = −is as in Problem 1.6.5 (although this time with a
minus sign out front). Solve this modified form of Schrödinger’s equation for the
infinite square well and write the general solution, analogous to (7.87). What does
the general solution become as s → ∞ (can you see why we wanted the minus sign
in t = −is?)?

7.5 Quantum Mechanical Harmonic Oscillator

We’ll end by finding the stationary states of the harmonic oscillator in the quantum
mechanical setting (this approach is carried out in many quantum mechanical texts,
see [13], for example). Our goal is to solve Schrödinger’s equation with the potential
energy U(x) = mω²x²/2 for a spring with equilibrium at zero, and fundamental frequency
of oscillation ω. Remember, we have to find both ψ(x) and E in
$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + \frac{1}{2}m\omega^2 x^2\,\psi(x) = E\,\psi(x) \qquad (7.94)$$
subject to the boundary conditions ψ(±∞) = 0. Multiply both sides of Schrödinger’s
equation by 2/(ℏω) to get
$$-\frac{\hbar}{m\omega}\frac{d^2\psi(x)}{dx^2} + \frac{m\omega}{\hbar}\,x^2\,\psi(x) = \frac{2E}{\hbar\omega}\,\psi(x), \qquad (7.95)$$
and we can define the dimensionless variable (see Section 8.2.4 for discussion of the
systematic version of this process) z ≡ √(mω/ℏ) x,

$$-\frac{d^2\psi(z)}{dz^2} + z^2\,\psi(z) = \frac{2E}{\hbar\omega}\,\psi(z). \qquad (7.96)$$
Define the dimensionless energy W ≡ 2E/(ℏω) to get the simplified equation
$$\frac{d^2\psi(z)}{dz^2} = z^2\,\psi(z) - W\,\psi(z). \qquad (7.97)$$
We want to impose the boundary conditions at spatial infinity. For x → ∞, the
dimensionless z also goes to infinity, and for large enough value of z², the ODE we
are trying to solve becomes W-independent. That simplifies our problem: at least in
approximation, we have to solve
$$\frac{d^2\psi(z)}{dz^2} = z^2\,\psi(z). \qquad (7.98)$$
This equation is similar in form to (1.107); consider the first-order equation for p(z),
$$\frac{dp(z)}{dz} = \pm z\,p(z), \qquad (7.99)$$
then taking the derivative of both sides gives
$$\frac{d^2 p(z)}{dz^2} = \pm z\,\frac{dp(z)}{dz} \pm p(z) = z^2 p(z) \pm p(z) \qquad (7.100)$$
where we used (7.99) to get the second equality. For large z, this second-order differential
equation for p(z) has right-hand side that is dominated by z²p(z), so that for p(z)
solving (7.99), we have an approximate solution to p″(z) = z²p(z). But we know the
solution to (7.99) from (1.109); it’s just
$$p(z) = p_0\,e^{\pm z^2/2} \qquad (7.101)$$
(for constant p0) and so we have an approximate solution to (7.98), one that holds for large
z ((7.98) is already a large z approximation to (7.97)),
$$\psi(z) \approx e^{-z^2/2} \qquad (7.102)$$
where we have taken the solution that vanishes at spatial infinity in order to match our
boundary conditions.
All we have done is make some approximate observations about the asymptotic form
of the solution to (7.97) at large z in order to identify the correct boundary behavior. But
we can use these observations to simplify the full problem by incorporating the asymptotic
behavior from the start. Take (peeling off the desired behavior at spatial infinity)
$$\psi(z) = e^{-z^2/2}\,u(z) \qquad (7.103)$$
and insert in (7.97) to get an ODE for u(z),
$$\frac{d^2 u(z)}{dz^2} - 2z\,\frac{du(z)}{dz} + (W-1)\,u(z) = 0. \qquad (7.104)$$
This is the equation we need to solve, and once we have u(z), the solution to (7.97) will be
given by (7.103).

We’ll use the series solution approach to solve (7.104), take
$$u(z) = z^p\sum_{j=0}^{\infty} a_j z^j, \qquad \frac{du(z)}{dz} = z^p\sum_{j=0}^{\infty} a_j (j+p)\,z^{j-1}, \qquad \frac{d^2 u(z)}{dz^2} = z^p\sum_{j=0}^{\infty} a_j (j+p)(j+p-1)\,z^{j-2} \qquad (7.105)$$
in (7.104). Performing the usual collection, we get
$$0 = \sum_{j=0}^{\infty}\left[a_{j+2}(j+p+2)(j+p+1) + a_j\left((W-1) - 2(j+p)\right)\right]z^j + a_0\,p(p-1)\,z^{-2} + a_1\,p(p+1)\,z^{-1}. \qquad (7.106)$$
If we take p = 0, then the recursion relation is
$$a_{j+2} = \frac{2j - (W-1)}{(j+1)(j+2)}\,a_j \qquad (7.107)$$
and there is an even series that starts at a0 and an odd one that starts at a1. As j gets large,
this recursion relation becomes that of a function that goes to infinity for z → ∞, as you
will establish in Problem 7.5.2. Even when combined with the Gaussian in (7.103) this
will give an overall ψ(z) that does not go to zero at spatial infinity, violating our boundary
condition. To preserve our solution, the recursion relation must truncate at some j. Then
we will have a polynomial in z coming from u(z) and that will get killed at spatial infinity
by the Gaussian in ψ(z). So there must be some aj+2 = 0, after which all other coefficients
will be zero. To get aj+2 = 0, we must have
$$2j - (W-1) = 0 \;\longrightarrow\; W_j = 2j + 1. \qquad (7.108)$$
Since W = 2E/(ℏω), our boundary condition requirement has once again quantized the
energies in the problem,
$$E_j = \left(j + \frac{1}{2}\right)\hbar\omega. \qquad (7.109)$$
To finish the job, let’s look at the polynomials we get from the recursion. The equation
for u(z) in (7.104) is called “Hermite’s equation” and its polynomial solutions are the
“Hermite polynomials,” denoted Hn(z) for integer n. If we start with a0 = 1 and a1 = 0,
we would have a2 = 0, corresponding to E0 = ℏω/2, the lowest possible energy here. All
other coefficients vanish, and we have the zeroth-order polynomial, H0(z) = 1. Taking⁴
a0 = 0, a1 = 2, we would have a3 = 0, giving us E1 = (3/2)ℏω, and H1(z) = 2z.

4 The Hermite polynomials have a particular normalization convention, like the Legendre polynomials we
encountered in Section 6.7.2.

For the higher-order polynomials, we have to use the recursion relation. Let’s work out
the second-order polynomial: for W2 = 5, E2 = (5/2)ℏω, we have a2 = −2a0, a4 = 0, and all
higher coefficients are zero, too. Taking a0 = −2 (again, chosen for external normalization
reasons), we get the second-order polynomial
$$H_2(z) = 4z^2 - 2. \qquad (7.110)$$
For the third-order polynomial, associated with W3 = 7, E3 = (7/2)ℏω, we get a3 = −2a1/3,
and taking a1 = −12 (normalization convention), we get
$$H_3(z) = 8z^3 - 12z. \qquad (7.111)$$
You can try getting a few of the higher-order forms in Problem 7.5.3. In these examples,
we see that W − 1 is twice the order of the polynomial, and can rewrite (7.104) in terms of
the polynomial order, n, to get the traditional form of the “Hermite equation”
$$\frac{d^2 u(z)}{dz^2} - 2z\,\frac{du(z)}{dz} + 2n\,u(z) = 0. \qquad (7.112)$$
Putting the pieces back together, the stationary wave functions in dimensionless
variables are
$$\psi_n(z) = e^{-z^2/2}\,H_n(z) \quad\text{with}\quad W_n = 2n + 1, \qquad (7.113)$$
and back in the x coordinate,
$$\psi_n(x) = e^{-\frac{m\omega}{2\hbar}x^2}\,H_n\!\left(\sqrt{\frac{m\omega}{\hbar}}\,x\right) \quad\text{with}\quad E_n = \left(n + \frac{1}{2}\right)\hbar\omega. \qquad (7.114)$$
These wave functions are not normalized, but must be before any probabilistic interpreta-
tion is applied.
We can construct the stationary densities, ρn (x) = ψn (x)∗ ψn (x) (appropriately normal-
ized), and the first three of these are plotted in Figure 7.6. Notice that there is a nonzero
probability of finding the particle far from its classically allowed region (see Figure 7.5). In
quantum mechanics, complete localization is never possible except in artificial cases like
the infinite square well. There is almost always some probability of finding the particle far
from where it is “supposed to be.” You can see from the probability densities in Figure 7.6
that the effect is more pronounced for larger energies, the density is more and more spread
out as n gets large.
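As an aside, the truncating recursion (7.107) with W = 2n + 1 can be used to generate the polynomials directly; the short Python sketch below does this (the rescaling to a leading coefficient of 2ⁿ is just the conventional Hermite normalization, not something required by the recursion):

# Build the coefficients of the degree-n polynomial solution of (7.104)
# from the recursion (7.107) with W = 2n + 1, then rescale so the leading
# coefficient is 2^n (the standard Hermite convention).
def hermite_coefficients(n):
    a = [0.0] * (n + 1)
    a[n % 2] = 1.0                      # seed the even or the odd series
    W = 2 * n + 1
    for j in range(n % 2, n - 1, 2):    # a_{j+2} from a_j; truncates at j = n
        a[j + 2] = (2 * j - (W - 1)) / ((j + 1) * (j + 2)) * a[j]
    scale = 2**n / a[n]
    return [c * scale for c in a]       # coefficients of 1, z, z^2, ..., z^n

for n in range(4):
    print(n, hermite_coefficients(n))   # recovers 1, 2z, 4z^2 - 2, 8z^3 - 12z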
Problem 7.5.1 For the ODE:
$$\frac{df(x)}{dx} = \left(2x + \frac{1}{x}\right)f(x),$$
identify the asymptotic, x → ∞ ODE and solve it. Call that solution g(x), and let
f(x) = g(x)h(x) for unknown h(x). Running your assumed form through the ODE,
write down and solve the ODE for h(x). Finally, write the full solution for f′(0) = 1.
This problem gives a first-derivative version of the procedure we carried out in the
last section where we explicitly incorporated the asymptotic behavior of our solution
to make life simpler.

Fig. 7.6 Position probability densities for the first three stationary states (n = 0, 1, 2) of the quantum mechanical harmonic oscillator.

Problem 7.5.2 Write the Taylor expansion of the function f(x) = e^{x²}. Identify the “coeffi-
cients” in the general form
$$f(x) = \sum_{j=0}^{\infty} a_j x^j$$
and find the recursion relation between successive terms for large j, i.e., what is aj+2
in terms of aj (only even terms will contribute to the sum) when j is large? Compare
with the large j limit of (7.107).
Problem 7.5.3 The Rodrigues formula for the Hermite polynomials is
$$H_n(x) = (-1)^n e^{x^2}\left(\frac{d}{dx}\right)^n e^{-x^2}. \qquad (7.115)$$
Using this, verify that you get the first four polynomials as in the text and find H4 (x).
Problem 7.5.4 Plot the first three stationary wave functions, ψn(x), from (7.114). If you
had a classical oscillator that started from rest at x = √(ℏ/(mω)), what would the
classically allowed region (in which motion could occur) be? Indicate this region on
your plot.
Problem 7.5.5 Show that the Hermite polynomials have
$$\int_{-\infty}^{\infty} e^{-x^2} H_n(x)\,H_m(x)\,dx = 0 \quad\text{for } n \neq m, \qquad (7.116)$$
an orthogonality relation. Use the Hermite differential equation (7.112) together
with the technique from Section 6.7.2 that gave orthogonality for the Legendre
polynomials.
Problem 7.5.6 For a classical particle of mass m attached to a spring with frequency
ω released from rest at x = √(ℏ/(mω)), we know oscillation occurs between
±√(ℏ/(mω)). In quantum mechanics, there is less restriction on where the particle
can be. What is the probability of finding a particle outside the classically allowed
region for the harmonic oscillator in its “ground state” (ψ0 in (7.114))? How about
in the “first excited state,” ψ1? (Don’t forget to normalize the probability densities
before computing these probabilities.)
8 Numerical Methods

In this final chapter, we look at a variety of numerical methods that can be applied to
the problems we have encountered throughout the book. Each section is relatively self-
contained, starting off with a familiar physical problem that cannot be solved analytically
(at least, not completely). These problems are stated, or re-stated, as the case may be, and
then a numerical method to solve them is introduced with some examples. The goal is to
see how numerical techniques can augment the closed-form “analytical” approach taken in
the rest of this book.1
At the end of each section are problems that are meant to be worked with paper and
pencil, and some “lab” problems that require a computer. For some of the problems,
a numerical package like Mathematica or Matlab should be used (for calculating
eigenvalues/vectors, for example). I have described all algorithms in pseudo-code to avoid
picking one package/language over another.

8.1 Root-Finding

The “root finding” problem is: Given a function F(x), find some or all of the set {x̄i }ni=1
such that F(x̄i ) = 0. This type of question shows up in a variety of settings.

8.1.1 E&M Example


For the time-dependent fields in electricity and magnetism, we know that “information”
(the magnitude and direction of the electric and magnetic fields E and B at a particular
point r = x x̂ + y ŷ + z ẑ at time t) travels at a finite and specific speed: c (in vacuum). The
immediate implication is that if we would like to know the electric field at our observation
point (and time) due to a charge that is moving along some prescribed path, w(t), we
need to evaluate the location of the particle not at time t, but at some earlier time tr
(the “retarded” time).
The setup is shown in Figure 8.1 – the particle of charge q is at w(t) at time t, but it is
the earlier location w(tr ) that has information traveling at c to the observation point r at
time t. The time it takes for the information from w(tr ) to reach r is t − tr , and during that
interval, the field information travels at c, so the distance travelled is c(t − tr ). The distance

1 Expanded discussion of these methods, with additional examples, can be found in [7].

Fig. 8.1 A particle with charge q moves along the prescribed curve w(t). For the point r at time t, field information comes from the earlier time tr, when the charge was at w(tr).

between r and w(tr ) is, geometrically, |r − w(tr )|. Putting these together, we have the
defining equation for the retarded time:

$$c(t - t_r) = |r - w(t_r)| \equiv \sqrt{(r - w(t_r))\cdot(r - w(t_r))}. \qquad (8.1)$$

Notice that the retarded time tr is a function of t and r (the observation time and location),
and is defined implicitly in terms of w(tr ). We may or may not be able to solve (8.1) for tr
given w(t) analytically (this cannot be done for any but the simplest trajectories), but we
can define a root-finding problem. The roots of the function

F(T) = c(t − T) − |r − w(T)| (8.2)

give the retarded time.

8.1.2 Bisection
The procedure we will use is called “bisection.” Start with a pair of values, xℓ0 and xr0 with
xℓ0 < xr0 (hence the ℓ and r designations), and F(xℓ0)F(xr0) < 0 (so that a root is in between
the points). Then calculate the midpoint between this pair, xm0 ≡ (xℓ0 + xr0)/2, and evaluate
the product p ≡ F(xℓ0)F(xm0). If p is less than zero, then the root lies between the left and
middle points, so we can move the left and right points over by setting xℓ1 = xℓ0, xr1 = xm0.
If p is greater than zero, the root is between the middle and right points, and we update
xℓ1 = xm0, xr1 = xr0. The iteration is continued until the absolute value of F at the current
midpoint is smaller than some tolerance ϵ,
$$|F(x_m^n)| \leq \epsilon, \qquad (8.3)$$
and then xmn is an approximation to the root. The process is shown pictorially in Figure 8.2,
and is described with pseudocode in Algorithm 8.1.

Fig. 8.2 Two iterations of the bisection procedure. The midpoint of the second bisection, xm2, is a good approximation to the root here.

Algorithm 8.1 Bisection(F, xℓ0, xr0, ϵ)

xℓ ← xℓ0
xr ← xr0
xm ← (xℓ + xr)/2
while |F(xm)| > ϵ do
    if F(xm)F(xr) < 0 then
        xℓ ← xm
    else
        xr ← xm
    end if
    xm ← (xℓ + xr)/2
end while
return xm
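For concreteness, a minimal Python transcription of Algorithm 8.1 might look like this (the test function is an arbitrary example):

# Bisection, following Algorithm 8.1: shrink the bracket [xl, xr] until
# |F(midpoint)| falls below the tolerance eps.
def bisection(F, xl, xr, eps):
    xm = 0.5 * (xl + xr)
    while abs(F(xm)) > eps:
        if F(xm) * F(xr) < 0.0:   # root lies between the middle and right points
            xl = xm
        else:                     # root lies between the left and middle points
            xr = xm
        xm = 0.5 * (xl + xr)
    return xm

# Example: the root of x^2 - 2 between 0 and 2 (should be ~1.41421356).
print(bisection(lambda x: x * x - 2.0, 0.0, 2.0, 1e-10))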

8.1.3 Example: The Electric Potential of a Moving Charge


Given a vector describing the location of a charge q at time t, w(t), we can use root-finding
to find the retarded time associated with a field point r at time t. From that, we can evaluate
the full electric field, or we can focus on the (simpler) potential (see [12]):
$$V(r,t) = \frac{q}{4\pi\epsilon_0}\,\frac{1}{|r - w(t_r)| - (r - w(t_r))\cdot\dot{w}(t_r)/c}, \qquad (8.4)$$
which reduces, for slow source-charge motion ($|\dot{w}(t_r)| \ll c$), to
$$V(r,t) = \frac{q}{4\pi\epsilon_0\,|r - w(t_r)|}, \qquad (8.5)$$
and we’ll use this approximation.
Given a charge’s trajectory through space, w(t), we can use the routine described
in Algorithm 8.2 to generate the (approximate) electric potential at an individual field

point, and by evaluating the potential at a variety of points (in a grid, for example), we
can generate a contour plot of it (that process is described in Problem 8.1.5).

Algorithm 8.2 Vfield(x, y, z, t, w, ϵ)

r ← {x, y, z}
F(X) ← c(t − X) − |r − w(X)|
tr ← Bisection(F, −∞, t, ϵ)
R ← √((r − w(tr)) · (r − w(tr)))
V ← q/(4πϵ0 R)
return V
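A Python sketch of the same procedure is below (the trajectory, the observation point, and the finite stand-in for −∞ are arbitrary choices, and q/(4πϵ0) and c have been set to 1 as in Problem 8.1.5):

import math

# Approximate potential (8.5) of a moving point charge: locate the retarded
# time by bisecting F(T) from (8.2), then evaluate 1/|r - w(tr)|.
def V_field(r, t, w, eps=1e-10, t_min=-10.0):
    F = lambda T: (t - T) - math.dist(r, w(T))   # (8.2) with c = 1
    tl, tr = t_min, t
    tm = 0.5 * (tl + tr)
    while abs(F(tm)) > eps:                      # bisection, as in Algorithm 8.1
        if F(tm) * F(tr) < 0.0:
            tl = tm
        else:
            tr = tm
        tm = 0.5 * (tl + tr)
    return 1.0 / math.dist(r, w(tm))

# Example: a charge oscillating along y, observed at (0.5, 0.5, 0) at t = 1.
w = lambda T: (0.0, 0.05 * math.cos(5.0 * T), 0.0)
print(V_field((0.5, 0.5, 0.0), 1.0, w))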

Problem 8.1.1 The convergence of the bisection method to a root with tolerance ϵ is
independent of the function being bisected. To see this, note that for an interval
sufficiently close to the root, the interval width should be ∼ ϵ: think of the Taylor
expansion of the function F(x) near a root, F(x̄ + Δx) ≈ F(x̄) + F′(x̄)Δx = F′(x̄)Δx,
so that the magnitude of the function in the Δx vicinity of x̄ is itself F′(x̄)Δx, i.e.,
proportional to Δx. Then if we demand that |F(x̄ + Δx)| ≈ ϵ, we have |F′(x̄)|Δx ≈ ϵ,
or Δx ≈ ϵ for finite F′(x̄). Starting with the initial interval, Δ0 ≡ xr0 − xℓ0, how many
times, n, must you bisect the interval to get Δn ≡ xrn − xℓn ≈ ϵ?
Problem 8.1.2 Implement the bisection procedure, and test it by finding the roots of f(x) =
(x − π)(x + 1) and the first two roots of the 0th Bessel function, J0(x). Use ϵ = 10⁻¹⁰
as the tolerance in both cases.
Problem 8.1.3 In order to solve the delay differential equation from Problem 4.6.3, we
needed to be able to solve the transcendental equation

$$x = e^{-\alpha x}.$$
This type of equation also shows up in quantum mechanics. Use bisection to find x
given α = 0 → 10 in steps of 1, using ϵ = 10⁻¹⁰ as the tolerance for bisection.
Make a plot of x versus α and compare with a plot of the formal solution of this
problem: x = ProductLog(α)/α (which basically defines the product log function).
Problem 8.1.4 A charged particle moves along the trajectory described by

w(t) = R cos(ωt) x̂ + dωt ẑ.

Use the bisection routine to find the retarded time for the point r = 0 at t = 0 if
R = 10⁵ m, ω = 2000 1/s, d = 1 m. Use ϵ = 10⁻¹⁰ as your tolerance. What is
w(tr )? How does this compare with w(0), the value we would use if we ignored the
retarded time?
Problem 8.1.5 Given w(t) = d cos(ωt) ŷ with d = .05 m, ω = 5 s⁻¹, and taking
q/(4πϵ0) → 1, c → 1 for simplicity, implement Algorithm 8.2 to find V and make
contour plots of its value for points between −1 → 1 in x and .1 → 1 in y, with
z = 0 and t going from 0 to 8π/5 in steps of 2π/25. Use ϵ = 10⁻¹⁰ again, and take
−∞ → −10 for the bisection inside Algorithm 8.2.

8.2 Solving ODEs

Ordinary differential equations come up in physics all the time. The earliest example is
Newton’s second law, mẍ(t) = F(x), or its relativistic version in (8.7). While the effect of
some simple forces can be determined from these equations, most interesting cases cannot
be solved analytically. We want a numerical method for solving Newton’s second law.
We’ll start with some motivation, then introduce the method.

8.2.1 Pendulum
The nonlinearized pendulum has motion that can be described by an angle θ(t) which
satisfies Newton’s second law as we saw in Section 1.3.4:
$$\ddot{\theta}(t) = -\frac{g}{L}\sin(\theta(t)) \qquad (8.6)$$
where g is the gravitational constant, L is the length of the pendulum, and θ(t) the angle
the bob makes with respect to vertical at time t. We know how to find the period from
Section 5.3.2, but the actual θ(t) can be found numerically by approximating the solution
to (8.6).

8.2.2 Relativistic Spring


For a mass m attached to a wall by a spring with spring constant k, if we start the mass
from rest with initial extension a, the mass moves according to x(t) = a cos(√(k/m) t) from
Newton’s second law with F = −kx. The maximum speed of the particle is then √(k/m) a,
which can be greater than c. In its relativistic form, Newton’s second law, dp/dt = F, is
$$\frac{d}{dt}\left[\frac{m\dot{x}}{\sqrt{1 - \frac{\dot{x}^2}{c^2}}}\right] = F \;\longrightarrow\; m\ddot{x} = \left(1 - \frac{\dot{x}^2}{c^2}\right)^{3/2} F, \qquad (8.7)$$
and this comes from using the relativistic momentum for p (instead of p = mẋ). The
solution, in this setting, does not have speed greater than c. For the spring force, we have
$$m\ddot{x} = \left(1 - \frac{\dot{x}^2}{c^2}\right)^{3/2}(-kx). \qquad (8.8)$$
As with the pendulum, we used integration to find the period in Section 5.3.2, and now we
will use numerical methods to find x(t).

8.2.3 Discretization and Approach


We want a numerical method that will allow us to generate an approximate solution for
θ(t) in (8.6) or x(t) in (8.8), and one way to go about doing that is to introduce a temporal
“grid.” Imagine chopping time up into small chunks of size Δt, then we could refer to the

jth chunk as tj ≡ jΔt. We’ll develop a numerical method for approximating the values of
x(tj ) at these special time points.
Let xj ≡ x(tj); then we can relate xj to xj+1 and xj−1 using you-know-what:
$$x_{j\pm 1} \approx x_j \pm \Delta t\,\frac{dx}{dt}\bigg|_{t=t_j} + \frac{1}{2}\Delta t^2\,\frac{d^2 x}{dt^2}\bigg|_{t=t_j}. \qquad (8.9)$$

The second derivative of x(t) at time tj is just the acceleration at that time. If we are given
a force F(t) (or other form for the second derivative, as in (8.6)), we could write xj±1 as
$$x_{j\pm 1} \approx x_j \pm \Delta t\,\frac{dx}{dt}\bigg|_{t=t_j} + \frac{1}{2}\,\frac{F(t_j)}{m}\,\Delta t^2. \qquad (8.10)$$
From here, we can eliminate velocity by adding xj+1 and xj−1:
$$x_{j+1} + x_{j-1} = 2x_j + \Delta t^2\,\frac{F_j}{m} \qquad (8.11)$$
where Fj is short-hand for F(tj), or F(xj, tj) for a force that depends on position and
(explicitly on) time. This form allows us to define a method: Given xj and xj−1, set xj+1
according to:
$$x_{j+1} = 2x_j - x_{j-1} + \Delta t^2\,\frac{F_j}{m}. \qquad (8.12)$$
The update here defines the “Verlet method.” If you know the previous and current
positions of a particle, you can estimate the next position, and proceed.
In the Newtonian setting, we are typically given x(0) = x0 and v(0) = v0 . How can we
turn that initial data into x(0) = x0 and x(−Δt) = x−1 ? Using Taylor expansion, we can
write:
$$x_{-1} \approx x_0 - \Delta t\,v_0 + \frac{1}{2}\Delta t^2\,\frac{F_0}{m}. \qquad (8.13)$$
Now, given both x0 and v0 , and the force, you can estimate the trajectories. The Verlet
method is shown in Algorithm 8.3: you provide the mass of the particle, the initial position
x0 and velocity v0 , the function that evaluates the force, F (that takes a position and time,
F(x, t)), the time-step Δt and the number of steps to take, N. The Verlet method then sets
the Xc , Xp (current and previous positions) and stores the initial position in “xout.” Then
the method goes from j = 1 to N, calculating Xn (the “next” position) according to (8.12),
storing the new position, and updating the current and previous positions.
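A minimal Python version of this update (the harmonic test force at the bottom is just an arbitrary example, and the bookkeeping mirrors Algorithm 8.3 below) might read:

# Verlet: given x(0) and v(0), build x(-dt) from (8.13), then march (8.12).
def verlet(m, x0, v0, F, dt, N):
    xout = [x0]
    xc = x0
    xp = x0 - dt * v0 + 0.5 * dt**2 * F(x0, 0.0) / m   # x(-dt) from (8.13)
    for j in range(1, N):
        xn = 2 * xc - xp + dt**2 * F(xc, j * dt) / m   # the update (8.12)
        xout.append(xn)
        xp, xc = xc, xn
    return xout

# Test: mass on a spring, F = -k x, released from rest at x = 1.
k, m = 4.0, 1.0
xs = verlet(m, 1.0, 0.0, lambda x, t: -k * x, 0.01, 1000)
print(xs[-1])   # compare with cos(sqrt(k/m) * t) at the final time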

8.2.4 Rendering Equations Dimensionless


When working numerically, we have to be careful with the constants that appear in the
relevant equations. The issue is that there are limits to the numbers we can represent on a
computer – obviously infinity is out, but so, too, is zero. We’d like to stay away from zero
and infinity numerically, keeping our problems safely in the range near 1.2

2 The number one falls halfway between zero and infinity from this point of view.

Algorithm 8.3 Verlet(m, x0, v0, F, Δt, N)

Xc ← x0
Xp ← x0 − Δt v0 + ½ Δt² F(Xc, 0)/m
xout ← table of N zeroes
xout1 ← Xc
for j = 2 → N do
    Xn ← 2Xc − Xp + Δt² m⁻¹ F(Xc, jΔt)
    xoutj ← Xn
    Xp ← Xc
    Xc ← Xn
end for
return xout

In order to do this, we use a rescaling procedure to soak up unwieldy constants. That


procedure is also interesting outside of numerical work, since the number and type of
dimensionless quantities we can produce tells us something about the structure of the
equations we are solving and can reveal the true number of degrees of freedom available to
us. As an example, take the harmonic oscillator equation of motion, mẍ(t) = −k(x(t) − a).
We would say there are three constants here, the mass m, spring constant k and equilibrium
spacing a. But we know from experience that only the ratio k/m shows up in the solution,
so there are really only two independent constants, k/m and a.
The approach is to take every variable in a particular equation and make dimensionless
versions of them. To demonstrate, we’ll use the oscillator equation of motion
$$m\,\frac{d^2 x(t)}{dt^2} = -k\,(x(t) - a) \qquad (8.14)$$
as a model.
The “variables” here are t and x(t), those are the things that can change. Let t = t0 T
where t0 has the dimension of time and the variable T is dimensionless (I like to use
capitalized variable names to refer to the dimensionless form), and take x = x0 X where
X is dimensionless and x0 has dimension of length. We don’t know what t0 and x0 are just
yet, and we’ll set them in order to simplify the final dimensionless equation of motion.
Putting our expressions for x and t in gives
$$\frac{m}{t_0^2}\,\frac{d^2 X}{dT^2} = -k\left(X - \frac{a}{x_0}\right). \qquad (8.15)$$
If we take the dimensionless quantity a/x0 = 1 and similarly m/t0² = k, then we can write
the differential equation governing X(T) in simplified form:
$$\frac{d^2 X(T)}{dT^2} = -(X(T) - 1) \qquad (8.16)$$
where the equation is dimensionless, and governs dimensionless variables, with no
constants in sight. We have fixed x0 = a, t0 = √(m/k), which looks familiar; this is a
characteristic time-scale for the problem.

The simplicity of (8.16) highlights another advantage of the dimensionless approach: we
only need to solve the equation once.³ The solution is, of course,
$$X(T) = F\cos(T) + G\sin(T), \qquad (8.17)$$
for dimensionless constants of integration F and G. How do we restore the physical
constants? Just invert the defining relations between t and T, x and X (i.e. X = x/x0,
T = t/t0),
$$x(t) = a\left[F\cos\!\left(\sqrt{\frac{k}{m}}\,t\right) + G\sin\!\left(\sqrt{\frac{k}{m}}\,t\right)\right] \qquad (8.18)$$

which we recognize.
The harmonic oscillator by itself doesn’t have any tunable parameters left in its
dimensionless form (8.16), there’s nothing left to do but solve it. The damped harmonic
oscillator has two time-scales in it, one set by the spring constant, one set by the damping
parameter, and we can’t pick a single value for t0 that covers both of them. Let’s see what
happens in this case. Starting from
$$m\,\frac{d^2 x(t)}{dt^2} = -m\omega^2(x(t) - a) - 2mb\,\frac{dx(t)}{dt}, \qquad (8.19)$$
let t = t0 T, x = x0 X as before. Putting these in gives
$$\frac{d^2 X(T)}{dT^2} = -\omega^2 t_0^2\left(X(T) - \frac{a}{x_0}\right) - 2bt_0\,\frac{dX(T)}{dT}. \qquad (8.20)$$
The length scale x0 can again be set by demanding that a/x0 = 1. But now we need to
pick either ω²t0² = 1 or bt0 = 1. Which should we choose? Either one is useful, and the
particular choice depends on what you want to do. For example, if you would like to be
able to probe the b = 0 (no damping) limit, then take ω²t0² = 1, allowing us to pick t0
independent of the value of b. The equation of motion becomes
$$\frac{d^2 X(T)}{dT^2} = -(X(T) - 1) - \frac{2b}{\omega}\,\frac{dX(T)}{dT}. \qquad (8.21)$$
The ratio b/ω is itself dimensionless, and clearly represents a parameter that can tune the
influence of the damping from none, b = 0, to overdamped. It is useful to define the
dimensionless ratios in the problem to highlight their role. Let α ≡ b/ω; then the final
form of the dimensionless equation of motion governing the damped harmonic oscillator is
$$\frac{d^2 X(T)}{dT^2} = -(X(T) - 1) - 2\alpha\,\frac{dX(T)}{dT}. \qquad (8.22)$$
The initial values must also be dimensionless: for x(0) = xi and ẋ(0) = vi given, we have
initial values for X(T): X(0) = xi/x0 and dX/dT|_{T=0} = vi t0/x0.

3 That’s no great savings if you solve an ODE symbolically, since all the constants are available in the single
solution, but when working numerically, each new value of a constant like k in (8.14) requires a new numerical
solution. With the dimensionless form, we solve once, and then change the “axis labels” in our plots to indicate
the unit of measure.

Problem 8.2.1 Come up with a one-dimensional force that is familiar to you from your
physics work so far, pose the Newton’s second law problem (i.e. write Newton’s
second law and provide initial conditions). You will solve for the trajectory of a
particle under the influence of this force (relativistically, if you like).
Problem 8.2.2 Pick the “other” value of t0 for use in (8.20) and form the resulting dimen-
sionless equation of motion. With this one, you could probe the “no oscillation” limit
of the problem.
Problem 8.2.3 Make a dimensionless version of the equation of motion for a relativistic
spring (8.8).
Problem 8.2.4 Using xj and xj−1 (the “current” and “previous” positions), come up with an
approximation for vj (the velocity at tj ) using Taylor expansion.
Problem 8.2.5 Find the period for the nonrelativistic problem: mẍ = −kx for x(0) = a (the
initial extension) and ẋ(0) = 0 (starting from rest). What is the maximum speed of
the mass? For what initial extension will the maximum speed be c?

Problem 8.2.6 Implement and run Verlet for a spring with √(k/m) = 2 Hz (the spring
constant is k and the mass is m = 1 kg). Go from t = 0 to t = 4π s with N = 1000
for a mass that starts from rest at x0 = 2 m. Plot the resulting position as a function
of time. Make a table that takes each point in your numerical solution and subtracts
the exact solution, plot this table of “residuals.”
Problem 8.2.7 Run Verlet for a real pendulum (what will play the role of the “force”
in Algorithm 8.3?) of length L = 1 m starting from rest at initial angles θ = 10◦ ,
45◦ , and 80◦ . Use N = 1000 and t = 4π s as in the previous problem. For each of
your solutions, plot the exact solution to the linearized problem, θ̈(t) ≈ −g/Lθ(t) on
top of your numerical solution for the full problem.
Problem 8.2.8 Modify Verlet to solve the relativistic problem (8.7) (you will need your
result from Problem 8.2.4). Run your mass-on-a-spring from Problem 8.2.6 again,
this time starting with a mass at rest at the critical initial extension that leads to
a maximum speed of c from Problem 8.2.5 – use the same total time and N as
in Problem 8.2.6. Try running at ten times the critical initial extension (you will
need to increase the total time, and should also increase the number of steps). Is the
position (as a function of time) what you expect? What is the period of the motion
( just estimate from a plot)?
Problem 8.2.9 Solve your problem from Problem 8.2.1 numerically over some reasonable
domain. Try solving the relativistic version using (8.7).

8.3 Integration

The one-dimensional integration problem is: Given a function F(x), find the value of
$$I = \int_a^b F(x)\,dx \qquad (8.23)$$

for provided integration limits a and b. There are any number of places this type of problem
shows up. It can arise in a relatively direct setting: Given a rod with mass-per-unit-length
λ(x), how much mass is contained between x0 and xf ? But evaluating (8.23) also occurs in
the slightly disguised form of integral solutions to PDEs.

8.3.1 Biot–Savart Law


The Biot–Savart law (see [12]) governing the generation of a magnetic field from a closed
loop of wire carrying a steady current I is
$$B(r) = \frac{\mu_0 I}{4\pi}\oint \frac{d\ell' \times (r - r')}{|r - r'|^3} \qquad (8.24)$$
where r is the vector pointing to the location at which we would like to know the field, r′
points to the portion of the wire that is generating the field (we integrate over the closed
wire loop, so we include all those source pieces), dℓ′ is the vector line element that points
locally along the wire, and there are constants out front. The setup is shown in Figure 8.3.
We can use the Biot–Savart law to find the magnetic field of a circular loop of radius R
carrying steady current I. Set the loop in the xy plane, and use polar coordinates so that
r′ = R cos φ′ x̂ + R sin φ′ ŷ points from the origin to points along the loop. Then the vector
dℓ′ is just:
$$d\ell' = \frac{dr'}{d\phi'}\,d\phi' = \left(-R\sin\phi'\,\hat{x} + R\cos\phi'\,\hat{y}\right)d\phi'. \qquad (8.25)$$
Take r = x x̂ + y ŷ + z ẑ, an arbitrary location, as the point at which we would like to know
the magnetic field. The Biot–Savart law gives (once the dust has settled):
$$B(r) = \frac{\mu_0 I R}{4\pi}\int_0^{2\pi} \frac{z\cos\phi'\,\hat{x} + z\sin\phi'\,\hat{y} + (-x\cos\phi' - y\sin\phi' + R)\,\hat{z}}{\left[(x - R\cos\phi')^2 + (y - R\sin\phi')^2 + z^2\right]^{3/2}}\,d\phi'. \qquad (8.26)$$

All that remains is the evaluation of the integral given a point of interest, r.

Fig. 8.3 A current-carrying wire generates a magnetic field at r according to (8.24).

8.3.2 Quadrature
In order to numerically approximate I in (8.23), we’ll introduce a grid in x as we did with
time in Section 8.2. Let xj = a + jΔx with xn = b for given n (the number of points in the
grid) so that j goes from zero to n. Let Fj ≡ F(xj ). The first approximation to I comes from
replacing the integral sign with a summation and dx with Δx:
$$I \approx \sum_{j=0}^{n-1} F_j\,\Delta x \equiv I_b. \qquad (8.27)$$

You can think of this approximation as follows: Assuming the value of F(x) is constant
over the interval from xj to xj+1, the exact integral over the interval is
$$\int_{x_j}^{x_{j+1}} F_j\,dx = F_j\,(x_{j+1} - x_j) = F_j\,\Delta x, \qquad (8.28)$$

so we are integrating a constant function exactly over each interval, then adding those up.
The approximation comes from the fact that the function F(x) does not take on the constant
value Fj over the interval. The idea is shown geometrically in Figure 8.4.
We can refine the integral approximation (without changing Δx) by using better
approximations to F(x) on the interior of each interval. For example, suppose we make
a linear approximation to F(x) between xj and xj+1 that matches F(x) at xj and xj+1 , then
integrate that exactly. Take
$$F(x) \approx F_j + \frac{F_{j+1} - F_j}{x_{j+1} - x_j}\,(x - x_j) \qquad (8.29)$$
for x = xj to xj+1; then the exact integral of this linear function is
$$\int_{x_j}^{x_{j+1}} \left[F_j + \frac{F_{j+1} - F_j}{x_{j+1} - x_j}\,(x - x_j)\right] dx = \frac{1}{2}(F_j + F_{j+1})(x_{j+1} - x_j) = \frac{1}{2}(F_j + F_{j+1})\,\Delta x, \qquad (8.30)$$

Fig. 8.4 A segment showing the approximation we make in using (8.27). The dashed lines represent the piecewise continuous function that (8.27) integrates exactly.

Fig. 8.5 An exact integration of a piecewise linear approximation to F(x), giving (8.31).

and if we add these up over the entire domain, we get a new approximation to I in (8.23)
$$I \approx \sum_{j=0}^{n-1} \frac{1}{2}(F_j + F_{j+1})\,\Delta x \equiv I_t. \qquad (8.31)$$

This approximation for I is shown in Figure 8.5, and is known as the “trapezoidal approx-
imation.” Notice in Figure 8.5, the dashed lines indicating the piecewise linear function
that we use to approximate F(x) over each interval are a much better approximation to the
function (and lie “on top of it” visually in places).
Continuing, we can replace our piecewise linear approximation with a quadratic one.
This time, we need to include more points in our approximation of F(x). To do linear
interpolation, we only needed the values of F(x) at xj and xj+1 but now to set all the
coefficients in a quadratic interpolation, we must use the values of F(x) at xj , xj+1 , and
xj+2. Over the interval xj to xj+2, then, we’ll approximate F(x) by
$$F(x) \approx \frac{1}{2\Delta x^2}\left[(x - x_{j+1})(x - x_{j+2})\,F_j - 2(x - x_j)(x - x_{j+2})\,F_{j+1} + (x - x_j)(x - x_{j+1})\,F_{j+2}\right], \qquad (8.32)$$

where the quadratic approximation matches F(x) at xj , xj+1 , and xj+2 . The exact integral of
the quadratic approximation function, over the interval of interest, is
$$\int_{x_j}^{x_{j+2}} F(x)\,dx = \frac{1}{3}\Delta x\,(F_j + 4F_{j+1} + F_{j+2}), \qquad (8.33)$$

giving us the “Simpson’s rule” approximation to I,
$$I \approx \sum_{j=0,2,4,\ldots}^{n-2} \frac{1}{3}\Delta x\,(F_j + 4F_{j+1} + F_{j+2}) \equiv I_s. \qquad (8.34)$$

The implementation of the Simpson’s rule approximation is shown in Algorithm 8.4. There,
you provide the function F, the limits of integration a to b, and the number of grid points n
(which must be a multiple of two, as is clear from (8.34)).

Algorithm 8.4 Simpson(F, a, b, n)


Check that n is a multiple of 2, exit if not.
Δx ← (b − a)/n
Is ← 0
for j = 0 → n − 2 in steps of 2 do
    Is ← Is + (1/3)Δx (F(a + jΔx) + 4F(a + (j + 1)Δx) + F(a + (j + 2)Δx))
end for
return Is
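In Python, a direct transcription of Algorithm 8.4 could read (the test integral is (8.35); treating sin(x)/x as 1 at x = 0 is my own choice for the sample integrand):

import math

# Composite Simpson's rule (8.34); n must be a multiple of 2.
def simpson(F, a, b, n):
    if n % 2 != 0:
        raise ValueError("n must be a multiple of 2")
    dx = (b - a) / n
    Is = 0.0
    for j in range(0, n - 1, 2):
        Is += dx / 3.0 * (F(a + j * dx) + 4 * F(a + (j + 1) * dx) + F(a + (j + 2) * dx))
    return Is

# Test on (8.35): integral of sin(x)/x from 0 to 2, correct value ~1.605412977.
print(simpson(lambda x: 1.0 if x == 0.0 else math.sin(x) / x, 0.0, 2.0, 100))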

You can use Simpson’s method to find the magnetic field associated with (8.26) as
shown:

Algorithm 8.5 Bfield(x, y, z, R, n)


F(φ) ← (μ0 I R/4π) [z cos φ x̂ + z sin φ ŷ + (−x cos φ − y sin φ + R) ẑ] / ((x − R cos φ)² + (y − R sin φ)² + z²)^{3/2}
Bout ← Simpson(F, 0, 2π, n)
return Bout

Problem 8.3.1 Evaluate (8.26) for points along the z axis (i.e. for x = y = 0) analytically.
We’ll use this to compare with your numerical result.
Problem 8.3.2 Write (8.31) in terms of (8.27), i.e. relate It to Ib – you should need to
evaluate two extra points, so that It = Ib + X + Y where X and Y involve single
evaluations of F(x).
Problem 8.3.3 Implement Simpson’s method, Algorithm 8.4, and use it to approximate the
integral
$$I = \int_0^2 \frac{\sin(x)}{x}\,dx. \qquad (8.35)$$
The correct value, up to 10 digits, is 1.605412977. How many grid points do you
need to match these ten digits?
Problem 8.3.4 Use Simpson’s method to approximate the integral:
$$I(f) = \int_{-f}^{f} e^{-x^2}\,dx, \qquad (8.36)$$
and make a plot of I(f) as f goes from f = 1 → 5 in integer steps using n = 100 grid
points for the integration. Does the integral converge to √π as claimed in (7.56)? Try
the same experiment for the integrand x²e^{−x²}.
Problem 8.3.5 Use Simpson’s method to approximate the sine integral
$$I(f) = \int_{-f}^{f} \frac{\sin(x)}{x}\,dx,$$

and make a plot of I( f ) for f = 0 → 100 in integer steps, again using n = 100 grid
points in Simpson’s method. To what value does the integral converge?
Problem 8.3.6 A circular loop of wire carries steady current I, and sits in the xy plane. Take
μ0 I/(4π) → 1 and implement Algorithm 8.5. Calculate the magnetic field along the
z axis and compare the magnetic field at the point z = .5 with your exact solution
using R = 1. Use n = 20 grid points for the integration (here, the number of grid
points won’t change the result of the numerical integration, can you see why?).
Problem 8.3.7 Use your method from the previous problem, with n = 20, to generate the
magnetic field at a set of points in three dimensions. Specifically, take x = −2 → 2
in steps of .1, y = −2 → 2 in steps of .1, and z = −.5 → .5 in steps of 1/π. If you
have access to a program that will take this data and plot it as a vector field, make
the plot.

8.4 Finite Difference

One can also “integrate” using a “finite difference” approximation, in which derivatives
are replaced with differences (via Taylor expansion). This type of approach is well-suited
to linear, boundary-value problems in mechanics. For example, solving the damped, driven
harmonic oscillator ODE:
$$\frac{d^2 x(t)}{dt^2} + 2b\,\frac{dx(t)}{dt} + \omega^2 x(t) = \frac{1}{m}F(t), \qquad (8.37)$$
in boundary value form, where the position of a mass is given at times t = 0 and t = T is
difficult using the Verlet method which is tailored to the initial value form of the problem
(where x(0) and ẋ(0) are given). Of course, the damped, driven harmonic oscillator can also
be solved using an explicit integral as in Section 5.1.1 (again, for the initial value problem),
so the solution method here complements the integral solution to the same problem from
the previous section.
In higher dimension, the same finite difference approach can be used to solve Poisson’s
problem, a partial differential equation familiar from E&M (and almost everywhere else).
Because the method can be applied to problems beyond the familiar damped, driven
oscillator, we’ll replace (8.37) with a generic ODE governing a function F(x) with
“driving” source function s(x). So we’ll refer to the model problem (ODE plus boundary
values):
$$\frac{d^2 F(x)}{dx^2} + \alpha\,\frac{dF(x)}{dx} + \beta F(x) = s(x), \qquad F(0) = F_0, \qquad F(X) = F_X, \qquad (8.38)$$
where F0 and FX are the provided boundary values, and α, β are constants (they could be
made into functions of x, an interesting extension that you should try).

8.4.1 One-Dimensional Form


As an example of the procedure, take the simplest case, with α = β = 0. We want F(x)
solving
$$\frac{d^2 F(x)}{dx^2} = s(x), \qquad F(0) = F_0, \qquad F(X) = F_X \qquad (8.39)$$
for given source s(x) and boundary values provided at x = 0 and x = X. Introduce a grid
in x: xj = jΔx, for j = 0 → N + 1 and xN+1 = X, with Fj ≡ F(xj) as usual. The second
derivative of F(x) evaluated at xj can be approximated by noting that
$$F_{j\pm 1} = F_j \pm \Delta x\,\frac{dF}{dx}\bigg|_{x=x_j} + \frac{1}{2}\Delta x^2\,\frac{d^2 F}{dx^2}\bigg|_{x=x_j} \pm \frac{1}{6}\Delta x^3\,\frac{d^3 F}{dx^3}\bigg|_{x=x_j} + \cdots \qquad (8.40)$$
and then we can isolate the second derivative using a linear combination
$$\frac{d^2 F(x)}{dx^2}\bigg|_{x=x_j} \approx \frac{F_{j+1} - 2F_j + F_{j-1}}{\Delta x^2} \qquad (8.41)$$

where the error we make in using this approximation is of order4 Δx 2 .


Putting the approximation (8.41) into (8.39), and writing sj ≡ s(xj ), we get a set
of algebraic equations governing the approximations to F(x) on the grid (Fj is now an
unknown approximate numerical value for F(xj ))
$$\frac{F_{j+1} - 2F_j + F_{j-1}}{\Delta x^2} = s_j \qquad (8.42)$$
for j = 1 → N, and we need to be careful with the cases j = 1 and N, since those will
involve the boundary values F0 and FX. The full set of equations, including those special
cases, is
$$\begin{aligned}
\frac{F_2 - 2F_1}{\Delta x^2} &= s_1 - \frac{F_0}{\Delta x^2} \\
\frac{F_{j+1} - 2F_j + F_{j-1}}{\Delta x^2} &= s_j \quad\text{for } j = 2 \to N-1 \qquad (8.43)\\
\frac{-2F_N + F_{N-1}}{\Delta x^2} &= s_N - \frac{F_X}{\Delta x^2}
\end{aligned}$$
where we have moved the F0 and FX terms over to the right-hand side since those are
known values. We want to solve this set of equations for $\{F_j\}_{j=1}^{N}$.
We can write (8.43) in matrix-vector form. Define the vector $F \in \mathbb{R}^N$ with entries that
are its unknown values:
$$F \,\dot{=}\, \begin{pmatrix} F_1 \\ F_2 \\ \vdots \\ F_N \end{pmatrix} \qquad (8.44)$$

4 Meaning, roughly, that the error is bounded by some constant times Δx2 . The constant that sits out front depends
on the fourth derivative of F(x) evaluated at xj , in the present case.

and the tridiagonal matrix that acts on this to produce the left-hand side of (8.43) is⁵
$$D \,\dot{=}\, \frac{1}{\Delta x^2}\begin{pmatrix} -2 & 1 & 0 & \cdots & 0 \\ 1 & -2 & 1 & 0 & \cdots \\ 0 & 1 & -2 & 1 & 0 \\ \vdots & 0 & \ddots & \ddots & \ddots \\ 0 & \cdots & 0 & 1 & -2 \end{pmatrix}. \qquad (8.45)$$
Finally, define the slightly modified source “vector,”
$$s \,\dot{=}\, \begin{pmatrix} s_1 - \frac{F_0}{\Delta x^2} \\ s_2 \\ \vdots \\ s_{N-1} \\ s_N - \frac{F_X}{\Delta x^2} \end{pmatrix}, \qquad (8.46)$$

and we can write (8.43) as


DF = s (8.47)
which has solution obtained by inverting D. Formally, F = D−1 s is what we want.
How we obtain that matrix inverse depends on the particular problem (there are direct
matrix inversion routines implemented in almost any programming language, and a suite
of approximate inverses that can also be used).
The pseudocode that sets up the matrix D and vector s for this simplified one-
dimensional problem and returns the solution is shown in Algorithm 8.6 – you send in
the driving function s(x), the end-point X, the number of grid points to use, N, and the
initial and final values, F0 and FX , and this function constructs the matrix D and vector
(including boundary values) s, and returns D−1 s. In this segment of pseudocode, as in all
of them, we use the convention that tables start at 1 (not zero).
For the more general model problem in (8.38), we also need an approximation to the
derivative at xj. From (8.40), we see that an appropriate combination is:
$$\frac{dF}{dx}\bigg|_{x=x_j} \approx \frac{F_{j+1} - F_{j-1}}{2\Delta x}. \qquad (8.48)$$
The discretization of (8.38) (for points away from the boundaries) gives:
$$\frac{F_{j+1} - 2F_j + F_{j-1}}{\Delta x^2} + \alpha\,\frac{F_{j+1} - F_{j-1}}{2\Delta x} + \beta F_j = s_j. \qquad (8.49)$$
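One way to assemble and solve this discretization in Python (a sketch paralleling Algorithm 8.6 below, but using numpy's dense linear solver rather than an explicit matrix inverse; the test problem is (8.50)) is:

import numpy as np

# Finite-difference solution of F'' + alpha F' + beta F = s on (0, X) with
# F(0) = F0, F(X) = FX, built from (8.41), (8.48), and (8.49).
def fd_solve(s, X, N, F0, FX, alpha=0.0, beta=0.0):
    dx = X / (N + 1)
    lower = 1 / dx**2 - alpha / (2 * dx)    # coefficient of F_{j-1}
    diag = -2 / dx**2 + beta                # coefficient of F_j
    upper = 1 / dx**2 + alpha / (2 * dx)    # coefficient of F_{j+1}
    D = np.zeros((N, N))
    svec = np.array([s((j + 1) * dx) for j in range(N)], dtype=float)
    for j in range(N):
        D[j, j] = diag
        if j > 0:
            D[j, j - 1] = lower
        if j < N - 1:
            D[j, j + 1] = upper
    svec[0] -= F0 * lower     # move the known boundary values to the right-hand side
    svec[-1] -= FX * upper
    return np.linalg.solve(D, svec)

# Test on (8.50): F'' = 4 sin(16 x) with F(0) = 0, F(1) = 1.
F = fd_solve(lambda x: 4 * np.sin(16 * x), 1.0, 100, 0.0, 1.0)
print(F[49])   # approximation to F near x = 0.5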

Problem 8.4.1 What is the exact solution to
$$\frac{d^2 F(x)}{dx^2} = 4\sin(16x), \qquad (8.50)$$
with F(0) = 0 and F(1) = 1?

5 Note the similarity with the matrix for coupled oscillators, Q in (3.98) from Section 3.5.

Algorithm 8.6 FDint(s, X, N, F0 , FX )


Δx ← X/(N + 1)
Dmat ← N × N table of zeros.
svec ← N table of zeros.
Dmat11 ← −2/Δx 2
Dmat12 ← 1/Δx 2
svec1 ← s(Δx) − F0 /Δx 2
for j = 2 → N − 1 do
Dmatjj−1 ← 1/Δx 2
Dmatjj ← −2/Δx 2
Dmatjj+1 ← 1/Δx 2
svecj ← s( jΔx)
end for
DmatNN−1 ← 1/Δx 2
DmatNN ← −2/Δx 2
svecN ← s(NΔx) − FX /Δx 2
return Dmat−1 svec

Problem 8.4.2 What is the exact solution to the harmonic oscillator problem:
$$\frac{d^2 F(x)}{dx^2} + (2\pi)^2 F(x) = 0 \qquad (8.51)$$
with F(0) = 1, F(1) = 0?
Problem 8.4.3 What is the exact solution to the damped harmonic oscillator problem:
$$\frac{d^2 F(x)}{dx^2} + 4\,\frac{dF(x)}{dx} + 5F(x) = 0, \qquad (8.52)$$
with F(0) = 0 and F(1) = 1?
Problem 8.4.4 Write the entries of the matrix analogous to D in (8.45), that you will make
for the full problem (8.49). What will you have to do to the first and last entries of
the vector s from (8.46) (with F(0) = F0 , F(X) = FX ) in this expanded setting?
Problem 8.4.5 Implement Algorithm 8.6, use it to solve (8.50) with N = 100 grid
points. Plot your numerical solution as points on top of the exact solution. Note
that the output of Algorithm 8.6 is just a list of numbers, corresponding to our
approximations on the grid. To plot the solution with the correct x axis underneath,
you need to provide the grid points themselves.
Problem 8.4.6 Expand the content of Algorithm 8.6 (i.e. use the matrix you developed
in Problem 8.4.4) so that it solves the full problem:
\frac{d^2 F(x)}{dx^2} + \alpha\,\frac{dF(x)}{dx} + \beta F(x) = s(x).
Find the solution for α = 4, β = 5, and s(x) = 0 from (8.52), taking F(0) = 0,
F(1) = 1, and using N = 999 gridpoints. Compare with your exact solution – what
is the difference between the analytic solution and your numerical estimate at x = .4?
Problem 8.4.7 Use your expanded algorithm from the previous problem to solve (8.51) for
N = 100 grid points. Compare with your solution to Problem 8.4.2.
Problem 8.4.8 Try solving
\frac{d^2 F(x)}{dx^2} + 4\,\frac{dF(x)}{dx} + 5F(x) = 16\sin(16x),
with F(0) = 0, F(1) = 1 using your numerical method with N = 999. What is the
value of your numerical solution at x = .76?

8.5 Eigenvalue Problems

The eigenvalue problem for matrices reads (referring to Section 3.3.2): Given a matrix
A ∈ IRn×n , find some/all of the set of vectors {vi }ni=1 and numbers {λ i }ni=1 such that:
Avi = λi vi . (8.53)
In general, for a vector y, the linear operation (matrix-vector multiplication) Ay can be
thought of in terms of rotations and stretches of y. The eigenvectors, vi , are the special
vectors for which Avi is parallel to vi , only the length has changed.
There are continuous versions of the eigenvalue problem as we saw in Problem 3.3.11.
The time-independent form of Schrödinger’s equation, for a particle of mass m moving in
the presence of a potential energy U(x) (in one spatial dimension) is
 
\left(-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + U(x)\right)\psi(x) = E\,\psi(x),   (8.54)
where we have some sort of boundary condition, like ψ(±∞) → 0. We solve this equation
for both ψ(x) and E. The left-hand side of the equation represents a linear differential
operator acting on ψ(x), and if we can satisfy the equation, the effect of that linear
operator is to scale ψ(x) by E. This is a continuous eigenvalue problem, with ψ(x)
the “eigenfunction” and E the eigenvalue. What we will do is replace the differential
operators with finite difference approximations, turning the continuous eigenvalue problem
presented by Schrödinger’s equation into a discretized matrix eigenvalue problem of the
form (8.53).
Introduce a grid xj = jΔx with Δx fixed and j = 0, 1, . . . N + 1 (the left-hand boundary
is at x = 0 with j = 0, the right-hand boundary will be at xf , with j = N + 1 so that
Δx = xf /(N + 1)). Then, letting ψj ≡ ψ(xj ), and recalling the derivative approximation
in (8.41), the projection of Schrödinger’s equation onto the grid is:
-\frac{\hbar^2}{2m}\,\frac{\psi_{j+1} - 2\psi_j + \psi_{j-1}}{\Delta x^2} + U(x_j)\,\psi_j \approx E\,\psi_j,   (8.55)
and this holds for j = 2 → N − 1. We’ll assume that at the boundaries, the wave function
vanishes (so that −∞ → ∞ gets remapped to 0 → xf ), then for j = 1:
-\frac{\hbar^2}{2m}\,\frac{\psi_2 - 2\psi_1}{\Delta x^2} + U(x_1)\,\psi_1 = E\,\psi_1,   (8.56)
and for j = N,
-\frac{\hbar^2}{2m}\,\frac{-2\psi_N + \psi_{N-1}}{\Delta x^2} + U(x_N)\,\psi_N = E\,\psi_N,   (8.57)
and once again, we can box up the approximation in a matrix-vector multiplication. Let ψj
be the approximation to ψ(xj ) that is our target, and define the vector of unknown values:
\psi \doteq \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_N \end{pmatrix}.   (8.58)
The matrix that encapsulates (8.55) and the boundary points is, taking p ≡ \hbar^2/(2m\Delta x^2),

H \doteq \begin{pmatrix} 2p + U(x_1) & -p & 0 & \cdots & 0 \\ -p & 2p + U(x_2) & -p & 0 & \cdots \\ 0 & -p & 2p + U(x_3) & -p & 0 \\ \vdots & 0 & \ddots & \ddots & \ddots \\ 0 & \cdots & 0 & -p & 2p + U(x_N) \end{pmatrix}.   (8.59)
This is a tridiagonal matrix, and can be made relatively easily using your function(s)
from the previous section. Now that we have turned the continuous eigenvalue problem into
a discrete one, we want ψ and E that solve
Hψ = Eψ, (8.60)
a matrix eigenvalue problem. If you can construct the matrix H, then you can use the built-
in command “Eigensystem” in Mathematica, for example, to get the eigenvalues
(the set of energies) and eigenvectors (the associated discrete approximations to the wave
functions) of the matrix.
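In Python, the analogous tool is NumPy's symmetric eigensolver. The following sketch (an illustration, not prescribed by the text) builds H from (8.59) for an arbitrary sample potential and diagonalizes it:

import numpy as np

hbar, m = 1.0, 1.0                       # illustrative units
N, xf = 200, 10.0
dx = xf / (N + 1)
x = np.arange(1, N + 1) * dx             # interior grid points x_1 ... x_N
p = hbar**2 / (2.0 * m * dx**2)
U = 0.5 * (x - xf / 2)**2                # any potential energy function you like
H = np.diag(2.0 * p + U) + np.diag(-p * np.ones(N - 1), 1) + np.diag(-p * np.ones(N - 1), -1)
E, psi = np.linalg.eigh(H)               # eigenvalues (sorted ascending) and eigenvectors (columns)

Column k of psi is then the grid approximation to the kth eigenfunction, playing the role of the output of Eigensystem in Mathematica.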

8.5.1 Infinite Square Well


For a particle in a box of length a, we have:
-\frac{\hbar^2}{2m}\,\frac{d^2\psi(x)}{dx^2} = E\,\psi(x),   (8.61)
with ψ(0) = ψ(a) = 0. Taking x = x0 X, we can render the equation dimensionless
using the approach from Section 8.2.4. Let \tilde{E} \equiv 2m x_0^2 E/\hbar^2 be the dimensionless energy,
then (8.61) becomes
-\frac{d^2\psi(X)}{dX^2} = \tilde{E}\,\psi(X),   (8.62)
with ψ(0) = 0 and ψ(a/x0 ) = 0, suggesting we take x0 = a. The full solution to this
problem is, of course:

\psi(X) = A\,\sin\!\left(\sqrt{\tilde{E}}\,X\right)   (8.63)

for constant A, which will be set by normalization, and with \sqrt{\tilde{E}} = n\pi for integer n (to
get ψ(1) = 0), so that in these units, \tilde{E} = n^2\pi^2, or, going back to the original units in the
problem:

E = \frac{\hbar^2}{2ma^2}\,\tilde{E} = \frac{\hbar^2 n^2\pi^2}{2ma^2}.   (8.64)
If we wanted to solve this problem numerically, we would approximate the derivative
in (8.62) with a finite difference. The dimensionless spatial coordinate X runs from 0 → 1,
so let Xj = jΔX with ΔX = 1/(N + 1) for N the number of grid points. Let ψj = ψ(Xj ) be
the unknown values associated with ψ on the spatial grid. Then using our familiar finite
difference approximation to the second derivative, (8.62) becomes:
-\frac{\psi_{j+1} - 2\psi_j + \psi_{j-1}}{\Delta X^2} = \tilde{E}\,\psi_j   (8.65)
for j = 2 → N − 1. At j = 1 and j = N we need to enforce our boundary conditions:
ψ0 = 0 and ψN+1 = 0, so those two special cases satisfy:
-\frac{\psi_2 - 2\psi_1}{\Delta X^2} = \tilde{E}\,\psi_1, \qquad -\frac{-2\psi_N + \psi_{N-1}}{\Delta X^2} = \tilde{E}\,\psi_N.   (8.66)
Together (8.65) and (8.66) define a matrix H similar to (8.59), and we can again define ψ
as in (8.58). Then the discretized problem can be written as Hψ = Ẽψ. In Algorithm 8.7,
we see the pseudocode that generates the matrix H given Xf (here, one) and the size of the
grid, N.

Algorithm 8.7 Pbox(Xf , N)


ΔX ← Xf /(N + 1)
Hmat ← N × N table of zeros.
Hmat11 ← 2/ΔX 2
Hmat12 ← −1/ΔX 2
for j = 2 → N − 1 do
Hmatjj−1 ← −1/ΔX 2
Hmatjj ← 2/ΔX 2
Hmatjj+1 ← −1/ΔX 2
end for
HmatNN−1 ← −1/ΔX 2
HmatNN ← 2/ΔX 2
return Hmat
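Here is one possible Python rendering of Algorithm 8.7, with a quick check of the lowest eigenvalues against the exact Ẽ_n = n²π² (a sketch only, assuming NumPy):

import numpy as np

def pbox(Xf, N):
    # dimensionless particle-in-a-box matrix encoding (8.65) and (8.66)
    dX = Xf / (N + 1)
    return (np.diag(2.0 * np.ones(N)) - np.diag(np.ones(N - 1), 1)
            - np.diag(np.ones(N - 1), -1)) / dX**2

E = np.sort(np.linalg.eigvalsh(pbox(1.0, 100)))
print(E[:3])                           # lowest three numerical eigenvalues
print((np.arange(1, 4) * np.pi)**2)    # pi^2, 4 pi^2, 9 pi^2 for comparison

The small discrepancy between the two lists is the finite difference error explored in Problem 8.5.3.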

8.5.2 Hydrogen
To get the spectrum of hydrogen (see [13] for details), we just need to put the appropriate
potential energy function in (8.59). For an electron and a proton separated by a distance x
and interacting electrostatically, we have
U(x) = -\frac{e^2}{4\pi\epsilon_0\,x}   (8.67)
where e is the charge of the electron. We can work on the “half ”-line, letting x = 0 → ∞,
and we want to solve
-\frac{\hbar^2}{2m}\,\frac{d^2\psi(x)}{dx^2} - \frac{e^2}{4\pi\epsilon_0\,x}\,\psi(x) = E\,\psi(x)   (8.68)
with ψ(0) = 0 and ψ(∞) = 0.
Let x = x0 X for dimensionless X and where x0 has dimension of length. Then (8.68)
can be written:
-\frac{d^2\psi(X)}{dX^2} - \frac{m x_0 e^2}{4\pi\epsilon_0\hbar^2}\,\frac{2}{X}\,\psi(X) = \frac{2m x_0^2}{\hbar^2}\,E\,\psi(X),   (8.69)
and we’ll clean up the potential term by requiring that
\frac{m x_0 e^2}{4\pi\epsilon_0\hbar^2} = 1,   (8.70)
which serves to define x_0. Letting \tilde{E} = 2m x_0^2 E/\hbar^2 (so that \tilde{E} is dimensionless), the
dimensionless form of (8.68) reads:
-\frac{d^2\psi(X)}{dX^2} - \frac{2}{X}\,\psi(X) = \tilde{E}\,\psi(X).   (8.71)
When we discretize in position, we cannot go all the way out to spatial infinity, so we
pick a "large" value of X (since X is dimensionless, large means X ≫ 1). Let Xf be the
endpoint of our grid, then ΔX = Xf /(N + 1). The discretized form of (8.71) is (for ψj =
ψ(Xj ), an unknown value for ψ evaluated at Xj = jΔX)
-\frac{\psi_{j+1} - 2\psi_j + \psi_{j-1}}{\Delta X^2} - \frac{2}{X_j}\,\psi_j = \tilde{E}\,\psi_j   (8.72)
and this is for j = 2 → N−1. For j = 1, we use the fact that ψ0 = 0 (a boundary condition)
to get
-\frac{\psi_2 - 2\psi_1}{\Delta X^2} - \frac{2}{X_1}\,\psi_1 = \tilde{E}\,\psi_1,   (8.73)
and similarly for j = N, with ψN+1 = 0 (the boundary at “infinity”), we have
-\frac{-2\psi_N + \psi_{N-1}}{\Delta X^2} - \frac{2}{X_N}\,\psi_N = \tilde{E}\,\psi_N.   (8.74)
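As a sketch of how the corresponding matrix might be assembled in Python (the grid parameters here are illustrative and smaller than those in Problem 8.5.5, to keep the dense diagonalization cheap; nothing in this snippet is prescribed by the text):

import numpy as np

N, Xf = 2000, 50.0
dX = Xf / (N + 1)
X = np.arange(1, N + 1) * dX                       # interior grid points X_1 ... X_N
H = ((np.diag(2.0 * np.ones(N)) - np.diag(np.ones(N - 1), 1)
      - np.diag(np.ones(N - 1), -1)) / dX**2 - np.diag(2.0 / X))
E = np.sort(np.linalg.eigvalsh(H))
print(E[:4])                                       # the lowest few should be negative (compare Problem 8.5.5)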

8.5.3 The Power Method


How does one obtain the eigenvalues and eigenvectors associated with a matrix? The
approach we used in Section 3.3.2 is inefficient, and involves multiple root-finding
expeditions just to start it off (and one must know something about the root structure
of the relevant determinant in order to bracket the roots). The crux of almost any
numerical eigenvalue-problem solver is the so-called “power method.” Suppose you have
a symmetric matrix A ∈ IRn×n , so that we know that the eigenvectors span IRn , and the
eigenvalues are real. Assume, further, that the eigenvalues are all distinct and ordered,
with |λ1 | > |λ2 | > . . . > |λn | > 1 and we’ll call the associated eigenvectors v1 , v2 , . . .,
vn . Pick a random, nonzero vector x ∈ IRn , then we know from Section 3.3.3 that there
exist coefficients {βj }nj=1 such that


x = \sum_{j=1}^{n} \beta_j\, v_j.   (8.75)

Now multiply the vector x by A, the multiplication slips through the sum in (8.75)
and because the vectors {vj }nj=1 are eigenvectors, we know that Avj = λj vj , so that the
multiplication yields

A x = \sum_{j=1}^{n} \beta_j \lambda_j\, v_j.   (8.76)

Similarly, if we multiply by Ap , i.e. multiply by A p-times, we get



A^p x = \sum_{j=1}^{n} \beta_j \lambda_j^p\, v_j.   (8.77)

Because the largest eigenvalue is λ_1, we know the first term in the sum (8.77) will
dominate, since |\lambda_1|^p \gg |\lambda_2|^p \gg \cdots for some p. Then for p "large enough," we get

A^p x \approx \beta_1 \lambda_1^p\, v_1,   (8.78)

the product is parallel to v_1. If you let w ≡ A^p x, then \hat{w} \approx \hat{v}_1: the unit vector \hat{w} should
approximate the normalized eigenvector v_1. To get the associated eigenvalue, note that
A\hat{w} \approx \lambda_1\hat{w}, and we can multiply both sides by \hat{w}^T, isolating \lambda_1,

\lambda_1 \approx \hat{w}^T A\,\hat{w}.   (8.79)

Now you have the first eigenvalue and associated eigenvector.


To continue the process, you basically start over with a new random vector, multiply
by A over and over, but after each multiplication, you “project out” the component that
lies along the now known v1 approximate. In this way, you can force the power method to
converge to the eigenvector associated with the second-largest eigenvalue, and the process
can be continued from there. While the method sketched here is specific to symmetric
matrices, generalizations exist for nonsymmetric and complex matrices (see [11]).
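A bare-bones Python sketch of the iteration just described looks like this (normalizing after every multiplication so the powers of A neither overflow nor underflow; this is an illustration under those assumptions, not a library routine):

import numpy as np

def power_method(A, iterations=1000):
    # repeatedly apply A to a random starting vector, renormalizing as we go
    w = np.random.rand(A.shape[0])
    for _ in range(iterations):
        w = A @ w
        w = w / np.linalg.norm(w)
    return w @ A @ w, w            # eigenvalue estimate as in (8.79), and the unit eigenvector

B = np.random.rand(5, 5)
A = B + B.T                        # any symmetric matrix will do for a test
lam, v = power_method(A)
print(lam)
print(np.linalg.eigvalsh(A))       # lam should match the entry of largest magnitude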

Problem 8.5.1 Find the value of x0 from (8.70), this is a characteristic length scale for
hydrogen. For this value of x_0, find the energy scale \hbar^2/(2m x_0^2) that you will use
in converting the dimensionless Ẽ back to E. What is the value of this energy scale
in electron-volts?
Problem 8.5.2 Indicate the nonzero entries of the matrix associated with the dimensionless
hydrogen problem – i.e. take (8.72), (8.73), and (8.74), and write the entries of the
matrix H appearing in the discrete eigenvalue problem: Hψ = Ẽψ.
Problem 8.5.3 The finite difference (8.65) can be solved analytically. Take the ansatz: \psi_j = A e^{i\pi n j\Delta X}, insert in

-\frac{\psi_{j+1} - 2\psi_j + \psi_{j-1}}{\Delta X^2} = \tilde{E}_n\,\psi_j,

and find \tilde{E}_n. Take the limit as ΔX → 0 and show that you recover n^2\pi^2.
Problem 8.5.4 Generate the appropriate matrix H for the particle-in-a-box problem
from Algorithm 8.7 using Xf = 1 and N = 100. Find the eigenvalues using a
linear algebra package. Sort your list so that you get the values from smallest to
largest. What is the difference between the first two smallest eigenvalues and their
analytic values from your solution to Problem 8.5.3?
Problem 8.5.5 Modify your matrix from the previous problem so that its entries come from
your solution to Problem 8.5.2, i.e. make a matrix that reflects the content of (8.72),
(8.73), and (8.74). Using Xf = 50 and N = 10000, build the matrix and find its
eigenvalues (using a linear algebra package). List the first four (sorted) eigenvalues
(they should be negative). Guess the form of these eigenvalues if you associate the
lowest one with the lowest energy, the second with next highest energy, and so on
(i.e. find an expression for Ek with k = 1, 2, 3, and 4). These are dimensionless
eigenvalues, restore dimensions of energy using your result from Problem 8.5.1, and
express the spectrum of hydrogen, Ek =? in eV.
Problem 8.5.6 Use the power method to find the maximum eigenvalue and eigenvector for
the matrix
A \doteq \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 5 & 6 & 7 \\ 3 & 6 & 8 & 9 \\ 4 & 7 & 9 & 10 \end{pmatrix}.

8.6 Discrete Fourier Transform

The Fourier series that we discussed in Section 2.3 decomposed a periodic function p(t)
into exponentials of the same period T:


p(t) = \sum_{j=-\infty}^{\infty} a_j\, e^{i2\pi j t/T},   (8.80)

with coefficients given by the orthogonality, under integration, of the exponentials,



a_j = \frac{1}{T}\int_0^T p(t)\, e^{-i2\pi j t/T}\, dt.   (8.81)
If we discretize in time, letting tj ≡ jΔt for some time-step Δt,6 and let pj ≡ p(tj ), then
we can approximate the integral in (8.81), with T = nΔt,
6 This discretization could be performed on known continuous functions p(t) of course, but is meant to be
reminiscent of the time-series obtained in the laboratory, where the cadence of data-taking is set by the particular
instruments on which the data is taken; we might poll the voltage in a circuit every Δt = .1 millisecond, for example.
a_j \approx \frac{1}{T}\sum_{k=0}^{n-1} p_k\, e^{-i2\pi j(k\Delta t)/T}\,\Delta t = \frac{1}{n}\sum_{k=0}^{n-1} p_k\, e^{-i2\pi jk/n}.   (8.82)

Assume that n is a multiple of 2, then looking at (8.80), it is clear we need both positive
and negative values of j to recover (a truncated form of p(t)), so let j = −n/2 → n/2 in
evaluating aj . That is, seemingly, n + 1 values for aj , and this is one too many given the
original set of n values for pk . But note that the coefficients aj are themselves periodic with
period n, i.e. aj+n = aj ,

a_{j+n} = \frac{1}{n}\sum_{k=0}^{n-1} p_k\, e^{-i2\pi(j+n)k/n} = \frac{1}{n}\sum_{k=0}^{n-1} p_k\, e^{-i2\pi jk/n}\,\underbrace{e^{-i2\pi k}}_{=1} = \frac{1}{n}\sum_{k=0}^{n-1} p_k\, e^{-i2\pi jk/n} = a_j.   (8.83)
Now we can see that a−n/2 = a−n/2+n = an/2 so that there really are only n unique values
for the coefficients aj here.
To get the "inverse," recovering the set \{p_k\}_{k=0}^{n-1}, we need to evaluate (8.80) at the discrete temporal points

p_k = \sum_{j=-\infty}^{\infty} a_j\, e^{i2\pi jk/n},   (8.84)

and again, since we only have n values of aj , we cannot perform the infinite sum. Instead
we truncate as before:

p_k = \sum_{j=-n/2}^{n/2} a_j\, e^{i2\pi jk/n}.   (8.85)

To summarize, given a set of data \{p_j\}_{j=0}^{n-1} taken at a temporal spacing of Δt, the discrete
Fourier transform (DFT) is given by the coefficients

a_j = \frac{1}{n}\sum_{k=0}^{n-1} p_k\, e^{-i2\pi jk/n} \quad \text{for } j = -n/2 \rightarrow n/2.   (8.86)

Given the coefficients \{a_j\}_{j=-n/2}^{n/2}, we can construct the inverse DFT,

p_k = \sum_{j=-n/2}^{n/2} a_j\, e^{i2\pi jk/n} \quad \text{for } k = 0 \rightarrow n-1.   (8.87)

If we imagine that the datum pj came at time tj ≡ jΔt given the fixed temporal spacing
Δt, what frequency should we associate with the coefficient ak ? The frequency spacing
should, like the temporal one, be constant, call it Δf. Then we expect ak to be the coefficient
associated with the frequency fk = kΔf. How do we determine Δf given Δt? Let’s work
out an example to see how to make the identification. Take a continuous signal, p(t) =
sin(2π5t) so that the frequency f here is 5. We’ll sample this function at n = 10 equally
spaced steps of size Δt = 1/25 so that pj has j = 0 → 9. Computing the coefficients
using (8.86), we find that the aj are all zero except for j = −2 and j = 2 with a−2 = .5i
and a2 = −.5i. Notice first that the positive and negative frequency components are related
in the usual way for sine, which is both real and odd. Now we have j = ±2 as the integer
index of the frequency, and we want fj = jΔf, with f2 = 2Δf = 5 the frequency of the
original signal. Then Δf = 5/2 is the frequency spacing.
That’s fine for calibration, but how do we generalize the result for an arbitrary Δt? We
can’t just send in a pure cosine or sine signal to find the spacing in each case, so we ask
the general question: For a pure sinusoidal function, what is the minimum period signal we
can represent with a set of data {pj }n−1
j=0 assumed to come from a temporal discretization
Δt? Well, the simplest signal that is purely oscillatory has pj = 0 for all values. Assuming
we have not sent in “nothing,” we must interpret this as a sine wave with zeroes at every
grid point, meaning that for a signal of frequency \bar{f}, we must have

\sin(2\pi\bar{f}\,\Delta t) = 0 \;\longrightarrow\; 2\pi\bar{f}\,\Delta t = m\pi   (8.88)

for integer m. The smallest m could be is m = 1, and that represents the minimum period
for a signal of frequency f¯ = 1/(2Δt). Minimum period means maximum frequency and
this special maximum is known as the “Nyquist frequency.” Meanwhile, for a fixed Δf, the
maximum frequency we can represent on our frequency grid is (n/2)Δf. Equating this with
the Nyquist frequency allows us to solve for Δf:
\bar{f} = \frac{1}{2\Delta t} = \frac{n}{2}\,\Delta f \;\longrightarrow\; \Delta f = \frac{1}{n\Delta t} = \frac{1}{T}.   (8.89)
Let’s check this spacing using our example, where n = 10, Δt = 1/25 so that Δf = 5/2 just
as we expected.
There is a scale invariance in (8.86) and (8.87): Those equations do not depend on Δt
or Δf, and hence refer to an infinite family of Fourier series, each with a different {Δt,Δf}
pair. Once one is given, the other is fixed, but a priori there is no preferred value. The
algorithm for the DFT is shown in Algorithm 8.8, where the only input is the data itself,
called “indata” there. You are in charge of keeping track of the input Δt and associated
value of Δf which is necessary when plotting the power spectrum, for example. The input
data must have length that is a multiple of 2, so that n/2 is an integer. The output of the DFT
has an extra point in it as compared to the input data. That point is the redundant value of
the DFT for the n/2 entry (identical to the −n/2 entry by periodicity). Note that, as with all
pseudocode in this chapter, we assume that vector indices start at 1, hence the additional
decorative indexing.

Algorithm 8.8 DFT(indata)


n ← length(indata)
oput ← zero table of length n + 1
for j = −n/2 → n/2 do
oput_{n/2+j+1} ← (1/n) \sum_{k=1}^{n} indata_k\, e^{-i2\pi j(k-1)/n}
end for
return oput

The output is a vector of length n + 1 whose first entry corresponds to the frequency
−n/2Δf. The inverse process is shown in Algorithm 8.9 where this time we send in the
Fourier transform data (“indata”) and get back out the temporal signal, with first entry (at
index location 1) associated with time t = 0.

Algorithm 8.9 iDFT(indata)


n ← length(indata) − 1
oput ← zero table of length n
for k = 0 → n − 1 do
oput_{k+1} ← \sum_{j=-n/2}^{n/2} indata_{j+1+n/2}\, e^{i2\pi jk/n}
end for
return oput

The DFT is a great start for signal processing applications. The only real problem with
it is the amount of time it takes. Looking at Algorithm 8.8, each entry in the output vector,
“oput,” requires n additions (to form the sum), and there are ∼n output entries so that the
time it takes to compute the DFT (and the inverse) is proportional to n2 . That scaling with
size can be avoided using a clever recursive structure that exploits the periodicity of the
signal and the exponentials that make up the sum in the algorithm. The resulting recursive
algorithm is called the “fast Fourier transform” abbreviated FFT. It produces the exact
same results as the DFT, but scales with input vector size n like n log(n) which is much
smaller than n2 for large n.
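If you implement Algorithm 8.8 in Python, NumPy's built-in FFT provides a useful cross-check: it returns the same coefficients up to the 1/n factor and a different ordering of the j values. The sketch below is an illustration only; the reshuffling at the end maps j = −n/2 → n/2 onto the FFT's 0 → n − 1 storage, using the sin(2π5t) example from the text:

import numpy as np

def dft(p):
    # direct O(n^2) sum from (8.86), for j = -n/2 ... n/2
    n = len(p)
    k = np.arange(n)
    return np.array([np.sum(p * np.exp(-1j * 2 * np.pi * j * k / n)) / n
                     for j in range(-n // 2, n // 2 + 1)])

t = np.arange(10) / 25.0                 # n = 10 samples with dt = 1/25, starting at t = 0
p = np.sin(2 * np.pi * 5 * t)
a = dft(p)
fft = np.fft.fft(p) / len(p)             # numpy omits the 1/n and stores j = 0 ... n-1
print(np.allclose(a[5:10], fft[:5]))     # j = 0 ... 4
print(np.allclose(a[:5], fft[5:]))       # j = -5 ... -1 live at the end of the FFT output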

Problem 8.6.1 Implement the DFT algorithm and use it to reproduce the power spectrum
we used to find Δf as follows: Sample the signal function sin(2π5t) for n = 10 and
Δt = 1/25 (making sure to start with the value at t = 0), send that in to the DFT, find
Δf and plot the magnitude-squared of the values from the DFT on the y axis, with the
correct frequencies (negative and positive) on the x axis.
Problem 8.6.2 Implement the inverse DFT algorithm and use it to invert the Fourier
transform you got in the previous problem. You should exactly recover the original
discretized signal values.
Problem 8.6.3 Take the signal function
p(t) = e −t ,

a decaying exponential, and discretize it using Δt = 1/50 and n = 256. Compute


the DFT and plot the power spectrum (with correct x axis frequencies). Now take
p(t) = e −10t and do the same thing. What happens to the power spectrum, does it
become more or less sharply peaked (see Problem 2.6.4 and Problem 2.6.12)?
Problem 8.6.4 For the signal
p(t) = e −2t sin(2π5t),

sample using Δt = 1/50, n = 256, compute and plot the power spectrum. Is the peak
in the right place?
Problem 8.6.5 Make a signal with a variety of frequencies:
p(t) = sin(2π(20t)2 ),
and discretize using Δt = 1/8000, n = 4096. If possible, play the signal so you
can hear what it sounds like. Compute the DFT, and attach the correct frequencies
to the vector values. Now “zero out” all entries in the discrete transform that have
absolute value frequency between 200 and 400 Hz, i.e. for any frequency values in
this range, set the corresponding entry in the DFT vector to zero. Use the inverse
DFT to recover a signal and try playing it. This problem gives an example of a
frequency “filter,” where you perform surgery on the Fourier transform to eliminate
or augment frequencies in a certain range, then inverse transform to get a signal with
those frequencies absent (or punched up).
Appendix A  Solving ODEs: A Roadmap

We have covered both general and specialized methods for solving various ordinary
differential equations, including: (1) definition: the solution is defined to be some named
function, and properties of those functions are studied (think of cosine and sine); (2) series
solution, we take the solution to be an infinite sum of powers (like a Taylor expansion)
or exponentials (like Fourier series), and find the set of coefficients that solve the ODE;
(3) separating out the homogeneous and sourced solutions (homogeneous solutions are
used to set initial/boundary values). What should your plan of attack be when confronted
with some new or unfamiliar ordinary differential equation? What I will suggest is a
laundry list of things to try, but surprisingly many problems will yield to one of the
approaches, and those that do not generally require numerical methods in order to make
progress.
For an ODE of the form: D(f(x)) = g(x) where D(f(x)) represents a differential operator
acting on f(x) and g(x) is some provided function, and assuming the appropriate initial
conditions are provided, the first thing to try is a guess of the form f(x) ∼ Ae Bx for
constant A and B. This is a good guess because it represents a single term in a Fourier series
expansion (where we would set B = i2πb for some new constant b), and is particularly
useful when the differential operator is linear. The goal is to find A and B that satisfy
 
D\left(A e^{Bx}\right) = g(x)   (A.1)

for all x. Note that it may be impossible to find such an A and B – does that mean the
problem is unsolvable? No, it just means that this initial guess does not yield a solution,
and you should move on and try something else.
As an example, suppose we take D(f(x)) = f''(x) + pf'(x) − qf(x) (for constants p and
q), and we have g(x) = 0, then
       
D\left(Ae^{Bx}\right) = 0 \;\longrightarrow\; B^2\left(Ae^{Bx}\right) + pB\left(Ae^{Bx}\right) - q\left(Ae^{Bx}\right) = 0.   (A.2)

The equation can be divided by Ae^{Bx}, leaving the algebraic equation B^2 + pB − q = 0.
The values of B that solve this equation are B_\pm = \left(-p \pm \sqrt{p^2 + 4q}\right)/2, but there is no
constraint on A. Since the differential operator is linear in f(x), we know that superposition
holds, and we can take an arbitrary sum of the two solutions:
f(x) = A_1\, e^{x\left(-p+\sqrt{p^2+4q}\right)/2} + A_2\, e^{x\left(-p-\sqrt{p^2+4q}\right)/2}.   (A.3)

In this form, it is clear that the constants A1 and A2 will be set by initial (or boundary)
values.
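If you want to see a computer algebra system reproduce (A.3), sympy's dsolve handles this constant-coefficient case directly (a sketch, assuming sympy; the symbols are declared positive only to keep the square roots tidy):

import sympy as sp

x = sp.symbols('x')
p, q = sp.symbols('p q', positive=True)
f = sp.Function('f')
ode = sp.Eq(f(x).diff(x, 2) + p * f(x).diff(x) - q * f(x), 0)
print(sp.dsolve(ode, f(x)))
# two exponentials with exponents x*(-p +/- sqrt(p**2 + 4*q))/2, i.e. equation (A.3)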
If we had the operator D(f(x)) = f'(x)^2 − pf(x) for constant p, and again g(x) = 0, then
our guess would give
D\left(Ae^{Bx}\right) = 0 \;\longrightarrow\; B^2\left(Ae^{Bx}\right)^2 - p\left(Ae^{Bx}\right) = 0.   (A.4)

This time, the exponentials cannot be cancelled, and we have an equation that cannot be
solved for all x:
 
B^2\left(Ae^{Bx}\right) = p.   (A.5)

Again, the lesson here is that our initial guess is not robust enough. In order for the
exponential ansatz to work, we need to be able to reliably cancel out the exponentials from
the ODE, removing the x-dependence and leaving us with an equation for the constants A
and B. Here, we cannot do that, so we move on.
Your next guess should be a polynomial in x, f(x) = Ax^B. This starting point represents
a single term in an infinite series expansion of f(x). We again run the assumed form
through the ODE to see if we can solve for the constants A and B (possibly appealing
to superposition)
 
D\left(Ax^B\right) = g(x).   (A.6)

For the differential operator D(f(x)) = f'(x)^2 − pf(x), the guess (A.6) gives

D\left(Ax^B\right) = 0 \;\longrightarrow\; A^2 B^2 x^{2(B-1)} - pAx^B = 0   (A.7)

and we can see that for this equation to hold for all x, we must have 2(B − 1) = B or B = 2.
Then we are left with 4A^2 − pA = 0 so that A = p/4, and our solution reads

f(x) = \frac{p x^2}{4},   (A.8)
but we are missing a constant of integration that would allow us to set the initial values.
We have a particular solution here, and now we need to find “the rest” of the solution. In
the present case, a clever observation allows us to make progress. When confronted with
a nonlinear ODE, one should always look for a simple way to get a linear ODE out. Here,
taking the derivative of the ODE itself gives

2f'(x)f''(x) - pf'(x) = 0 \;\longrightarrow\; 2f''(x) - p = 0 \;\longrightarrow\; f(x) = \frac{px^2}{4} + a + bx   (A.9)
where a and b are constants. The problem is that we have too many constants of integration,
we expect to get just one. So take this f(x) and run it through the original ODE:
D\left(\frac{px^2}{4} + a + bx\right) = 0 \;\longrightarrow\; b^2 - ap = 0.   (A.10)

We can set b = \pm\sqrt{ap}, to get

f(x) = \frac{px^2}{4} + a \pm \sqrt{ap}\,x   (A.11)
with a waiting to be set by some provided initial value.
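It is easy to confirm that (A.11) really does satisfy the original equation f'(x)^2 − pf(x) = 0, by hand or with a quick symbolic check (a sketch, assuming sympy):

import sympy as sp

x, a, p = sp.symbols('x a p', positive=True)
f = p * x**2 / 4 + a + sp.sqrt(a * p) * x      # the "+" branch of (A.11)
print(sp.expand(sp.diff(f, x)**2 - p * f))     # prints 0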
If we had the original D(f(x)) = f''(x) + pf'(x) − qf(x), we would not be able to solve
using the polynomial guess:

D\left(Ax^B\right) = 0 \;\longrightarrow\; A(B-1)Bx^{B-2} + ABpx^{B-1} - Aqx^B = 0,   (A.12)

and there is no way to get B − 2 = B − 1 = B so as to cancel the x-dependence, and hence,
no solution of this form with constant A and B.
Finally, when there is a nontrivial g(x), you should try to find the homogeneous (setting
g(x) = 0) and sourced solutions separately, although each solution benefits from the
guesses in (A.1) and (A.6). If you had D(f(x)) = f''(x) + pf'(x) − qf(x) with g(x) = g_0 x, then
you’d start by solving

D(h(x)) = 0 (A.13)

to get the homogeneous piece of the solution, and this would crack under h(x) = Ae Bx (the
solution would be as in (A.3)). For the particular solution, we want
D(\bar{f}(x)) = g_0 x   (A.14)

which we can get using the variation of parameters approach from Section 1.7.4 (and in
particular, see Problem 1.7.3). It ends up being
\bar{f}(x) = -\frac{g_0 x}{q} - \frac{g_0 p}{q^2},   (A.15)
and the full solution is the sum of the homogeneous and sourced solutions
f(x) = h(x) + \bar{f}(x) = A_1\, e^{x\left(-p+\sqrt{p^2+4q}\right)/2} + A_2\, e^{x\left(-p-\sqrt{p^2+4q}\right)/2} - \frac{g_0 x}{q} - \frac{g_0 p}{q^2}.   (A.16)

Problem A.0.1 Try putting x(t) = At^p into the ODE

\ddot{x}(t) + \omega^2 x(t) + 2b\,\dot{x}(t) = 0,

and identify the problem with this attempted solution. (i.e. how can you tell it won’t
work?)
Problem A.0.2 Solve the second-order ODE (that comes up in both electricity and mag-
netism and gravity):
\frac{d^2 f(x)}{dx^2} + \frac{2}{x}\,\frac{df(x)}{dx} = \alpha
for constant α. In addition to writing the general solution, give the one that has
f(0) = 0 and f'(0) = 0.
Problem A.0.3 Solve the second-order ODE (that comes up in both electricity and mag-
netism and gravity in two spatial dimensions):
\frac{d^2 f(x)}{dx^2} + \frac{1}{x}\,\frac{df(x)}{dx} = \alpha
for constant α. This time, find the general solution and the one with f(1) = c1 ,
f'(1) = c_2 for arbitrary constants c_1 and c_2.
Problem A.0.4 Solve the first-order ODE:


\frac{df(x)}{dx} = -\frac{1}{x}\,f(x)
for f(x) – there should be one constant of integration.
Problem A.0.5 A driven harmonic oscillator has equation of motion:
m\ddot{x}(t) = -k\,x(t) + F_0\, e^{-\mu t}
for μ > 0. Find x(t) given x(0) = 0 and ẋ(0) = 0. What is the solution as t → ∞?
Appendix B  Vector Calculus: Curvilinear Coordinates

When we developed the vector calculus operations in Chapter 6, we started with the
gradient operator written in Cartesian coordinates,
\nabla = \hat{x}\,\frac{\partial}{\partial x} + \hat{y}\,\frac{\partial}{\partial y} + \hat{z}\,\frac{\partial}{\partial z},   (B.1)
which acted on functions f(x, y, z). The divergence and curl were then naturally written
in terms of Cartesian coordinates and basis vectors. In this appendix, we’ll look at how
to use the chain rule to express these vector derivative operations in other coordinate
systems. Some of this material appears in the main text, but I’d like to develop it clearly
and completely in one place to serve as a reference.

B.1 Cylindrical Coordinates

In cylindrical coordinates, we have the three coordinate variables s, φ, and z, defined as


shown in Figure B.1.
Referring to the figure, we can write
x = s cos φ y = s sin φ z=z (B.2)
(the cylindrical coordinate z is the same as the Cartesian z). These can be inverted, to give
s = \sqrt{x^2 + y^2} \qquad \phi = \tan^{-1}\!\left(\frac{y}{x}\right) \qquad z = z.   (B.3)
Along with the coordinate definitions, we need to develop the cylindrical basis vectors.
A coordinate basis vector points in the direction of increasing coordinate value and is
normalized to one. The basis vector associated with z is easy, that’s ẑ as always. How about
the basis vector that points in the direction of increasing s? Well, start with the vector that
points from the origin to the point with coordinates x, y: s ≡ x x̂ + y ŷ. At any point, the
direction of s is parallel to the direction of increasing s.1 The unit vector is
\hat{s} = \frac{x\,\hat{x} + y\,\hat{y}}{\sqrt{x^2+y^2}} = \cos\phi\,\hat{x} + \sin\phi\,\hat{y}   (B.4)
1 Besides this geometrical approach, we could identify the direction of increasing s using the gradient as in
Section 6.2.1,
\nabla s = \frac{x}{\sqrt{x^2+y^2}}\,\hat{x} + \frac{y}{\sqrt{x^2+y^2}}\,\hat{y} = \cos\phi\,\hat{x} + \sin\phi\,\hat{y}.

Fig. B.1 The cylindrical coordinates s, φ, and z are related to the Cartesian x, y, and z.

where we can express the components of this basis vector in Cartesian coordinates (middle
equality) or in the new cylindrical coordinates (right-hand equality).
For the φ direction, we have increasing φ pointing tangent to a circle, and going
around in the counter-clockwise direction. We know, from, for example Section 6.2.2, that
φ = −y x̂ + x ŷ does precisely this,2 and we can again normalize to get a unit vector
\hat{\phi} = \frac{-y\,\hat{x} + x\,\hat{y}}{\sqrt{x^2+y^2}} = -\sin\phi\,\hat{x} + \cos\phi\,\hat{y},   (B.5)
writing the components in either Cartesian or cylindrical coordinates. We now have the
relation between the cylindrical basis vectors and the Cartesian ones,
ŝ = cos φ x̂ + sin φ ŷ φ̂ = − sin φ x̂ + cos φ ŷ ẑ = ẑ, (B.6)
and we can algebraically invert these to get the relation between the Cartesian set and the
cylindrical one
x̂ = cos φ ŝ − sin φ φ̂ ŷ = sin φ ŝ + cos φ φ̂ ẑ = ẑ. (B.7)
For derivatives, let’s start with the gradient. What happens if we take ∇f(s, φ, z)? We
know the gradient in Cartesian coordinates and with respect to the Cartesian basis. We’ll
use the chain rule from calculus to work out expressions like \partial f(s,\phi,z)/\partial x, and then we can use
the basis relations in (B.7) to rewrite the gradient in terms of the cylindrical basis vectors.
First, let’s move the derivatives from Cartesian to cylindrical,

2 The gradient of φ = tan−1 (y/x) also gives a vector pointing in the direction of increasing φ,
 

\nabla\tan^{-1}\!\left(\frac{y}{x}\right) = -\frac{y}{x^2+y^2}\,\hat{x} + \frac{x}{x^2+y^2}\,\hat{y} = \frac{1}{s}\left(-\sin\phi\,\hat{x} + \cos\phi\,\hat{y}\right).
\frac{\partial f(s,\phi,z)}{\partial x} = \frac{\partial f}{\partial s}\frac{\partial s}{\partial x} + \frac{\partial f}{\partial\phi}\frac{\partial\phi}{\partial x} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial x} = \frac{\partial f}{\partial s}\,\frac{x}{\sqrt{x^2+y^2}} + \frac{\partial f}{\partial\phi}\,\frac{-y}{x^2+y^2} + \frac{\partial f}{\partial z}\,0 = \cos\phi\,\frac{\partial f}{\partial s} - \frac{\sin\phi}{s}\,\frac{\partial f}{\partial\phi}   (B.8)
where again we can write everything in terms of Cartesian (top right equality) or cylindrical
coordinates (bottom). Similarly, we have
\frac{\partial f(s,\phi,z)}{\partial y} = \frac{\partial f}{\partial s}\frac{\partial s}{\partial y} + \frac{\partial f}{\partial\phi}\frac{\partial\phi}{\partial y} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial y} = \frac{\partial f}{\partial s}\,\frac{y}{\sqrt{x^2+y^2}} + \frac{\partial f}{\partial\phi}\,\frac{x}{x^2+y^2} + \frac{\partial f}{\partial z}\,0 = \sin\phi\,\frac{\partial f}{\partial s} + \frac{\cos\phi}{s}\,\frac{\partial f}{\partial\phi}.   (B.9)
Finally, the z-derivative is unchanged from its Cartesian form. Now putting it all together,
   
\nabla f(s,\phi,z) = \left(\cos\phi\,\frac{\partial f}{\partial s} - \frac{\sin\phi}{s}\,\frac{\partial f}{\partial\phi}\right)\hat{x} + \left(\sin\phi\,\frac{\partial f}{\partial s} + \frac{\cos\phi}{s}\,\frac{\partial f}{\partial\phi}\right)\hat{y} + \frac{\partial f}{\partial z}\,\hat{z},   (B.10)

or, using (B.7) to write the gradient in terms of ŝ, φ̂ and ẑ,
\nabla f(s,\phi,z) = \frac{\partial f}{\partial s}\,\hat{s} + \frac{1}{s}\frac{\partial f}{\partial\phi}\,\hat{\phi} + \frac{\partial f}{\partial z}\,\hat{z}.   (B.11)
We can write this in operator form similar to (B.1),
\nabla = \hat{s}\,\frac{\partial}{\partial s} + \hat{\phi}\,\frac{1}{s}\frac{\partial}{\partial\phi} + \hat{z}\,\frac{\partial}{\partial z}.   (B.12)

The divergence of a vector function V(s, φ, z) = Vs ŝ+Vφ φ̂ +Vz ẑ comes directly from
the application of the gradient operator dotted into V. We have to be careful now that the
basis vectors themselves are position dependent, and so have nontrivial derivatives. Start
by applying ∇ from (B.12) to each term in V employing the product rule:
\nabla\cdot\mathbf{V} = \hat{s}\cdot\Big[\frac{\partial V_s}{\partial s}\,\hat{s} + V_s\,\frac{\partial\hat{s}}{\partial s} + \frac{\partial V_\phi}{\partial s}\,\hat{\phi} + V_\phi\,\frac{\partial\hat{\phi}}{\partial s} + \frac{\partial V_z}{\partial s}\,\hat{z} + V_z\,\frac{\partial\hat{z}}{\partial s}\Big]
+ \frac{1}{s}\,\hat{\phi}\cdot\Big[\frac{\partial V_s}{\partial\phi}\,\hat{s} + V_s\,\frac{\partial\hat{s}}{\partial\phi} + \frac{\partial V_\phi}{\partial\phi}\,\hat{\phi} + V_\phi\,\frac{\partial\hat{\phi}}{\partial\phi} + \frac{\partial V_z}{\partial\phi}\,\hat{z} + V_z\,\frac{\partial\hat{z}}{\partial\phi}\Big]
+ \hat{z}\cdot\Big[\frac{\partial V_s}{\partial z}\,\hat{s} + V_s\,\frac{\partial\hat{s}}{\partial z} + \frac{\partial V_\phi}{\partial z}\,\hat{\phi} + V_\phi\,\frac{\partial\hat{\phi}}{\partial z} + \frac{\partial V_z}{\partial z}\,\hat{z} + V_z\,\frac{\partial\hat{z}}{\partial z}\Big]   (B.13)
= \Big[\frac{\partial V_s}{\partial s} + V_s\,\hat{s}\cdot\frac{\partial\hat{s}}{\partial s} + V_\phi\,\hat{s}\cdot\frac{\partial\hat{\phi}}{\partial s} + V_z\,\hat{s}\cdot\frac{\partial\hat{z}}{\partial s}\Big]
+ \frac{1}{s}\Big[V_s\,\hat{\phi}\cdot\frac{\partial\hat{s}}{\partial\phi} + \frac{\partial V_\phi}{\partial\phi} + V_\phi\,\hat{\phi}\cdot\frac{\partial\hat{\phi}}{\partial\phi} + V_z\,\hat{\phi}\cdot\frac{\partial\hat{z}}{\partial\phi}\Big]
+ \Big[V_s\,\hat{z}\cdot\frac{\partial\hat{s}}{\partial z} + V_\phi\,\hat{z}\cdot\frac{\partial\hat{\phi}}{\partial z} + \frac{\partial V_z}{\partial z} + V_z\,\hat{z}\cdot\frac{\partial\hat{z}}{\partial z}\Big].
We need to evaluate the derivatives of the basis vectors. To do this, we will express the
cylindrical basis vectors in terms of the Cartesian ones, which do not depend on position,
from (B.6):
\frac{\partial\hat{s}}{\partial s} = 0 \qquad \frac{\partial\hat{s}}{\partial\phi} = -\sin\phi\,\hat{x} + \cos\phi\,\hat{y} = \hat{\phi} \qquad \frac{\partial\hat{s}}{\partial z} = 0
\frac{\partial\hat{\phi}}{\partial s} = 0 \qquad \frac{\partial\hat{\phi}}{\partial\phi} = -\cos\phi\,\hat{x} - \sin\phi\,\hat{y} = -\hat{s} \qquad \frac{\partial\hat{\phi}}{\partial z} = 0   (B.14)
\frac{\partial\hat{z}}{\partial s} = 0 \qquad \frac{\partial\hat{z}}{\partial\phi} = 0 \qquad \frac{\partial\hat{z}}{\partial z} = 0.
Using these results in (B.13),
 
\nabla\cdot\mathbf{V} = \frac{\partial V_s}{\partial s} + \frac{1}{s}\left(V_s + \frac{\partial V_\phi}{\partial\phi}\right) + \frac{\partial V_z}{\partial z}   (B.15)

which we can write, using the product rule, as

\nabla\cdot\mathbf{V} = \frac{1}{s}\frac{\partial}{\partial s}(sV_s) + \frac{1}{s}\frac{\partial V_\phi}{\partial\phi} + \frac{\partial V_z}{\partial z}.   (B.16)
The final single derivative operator of interest is the curl. To calculate the curl, we use the
same approach as for the divergence, just replacing the dot product with the cross product.
We can use the derivatives from (B.14) and the right-hand rule to perform the calculation:
\nabla\times\mathbf{V} = \hat{s}\times\Big[\frac{\partial V_s}{\partial s}\,\hat{s} + V_s\,\frac{\partial\hat{s}}{\partial s} + \frac{\partial V_\phi}{\partial s}\,\hat{\phi} + V_\phi\,\frac{\partial\hat{\phi}}{\partial s} + \frac{\partial V_z}{\partial s}\,\hat{z} + V_z\,\frac{\partial\hat{z}}{\partial s}\Big]
+ \frac{1}{s}\,\hat{\phi}\times\Big[\frac{\partial V_s}{\partial\phi}\,\hat{s} + V_s\,\frac{\partial\hat{s}}{\partial\phi} + \frac{\partial V_\phi}{\partial\phi}\,\hat{\phi} + V_\phi\,\frac{\partial\hat{\phi}}{\partial\phi} + \frac{\partial V_z}{\partial\phi}\,\hat{z} + V_z\,\frac{\partial\hat{z}}{\partial\phi}\Big]
+ \hat{z}\times\Big[\frac{\partial V_s}{\partial z}\,\hat{s} + V_s\,\frac{\partial\hat{s}}{\partial z} + \frac{\partial V_\phi}{\partial z}\,\hat{\phi} + V_\phi\,\frac{\partial\hat{\phi}}{\partial z} + \frac{\partial V_z}{\partial z}\,\hat{z} + V_z\,\frac{\partial\hat{z}}{\partial z}\Big]
= \frac{\partial V_\phi}{\partial s}\,\hat{z} - \frac{\partial V_z}{\partial s}\,\hat{\phi} + \frac{1}{s}\left(-\frac{\partial V_s}{\partial\phi}\,\hat{z} + V_\phi\,\hat{z} + \frac{\partial V_z}{\partial\phi}\,\hat{s}\right) + \frac{\partial V_s}{\partial z}\,\hat{\phi} - \frac{\partial V_\phi}{\partial z}\,\hat{s}.   (B.17)

Collecting terms, and using the product rule, we have


     
1 ∂Vz ∂Vφ ∂Vs ∂Vz 1 ∂  ∂Vs
∇×V = − ŝ + − φ̂ + sVφ − ẑ (B.18)
s ∂φ ∂z ∂z ∂s s ∂s ∂φ

For the second derivatives in the form of the Laplacian, \nabla^2 f = \nabla\cdot\nabla f, we just apply
the divergence formula in (B.16) to the gradient using the elements of the gradient vector
V ≡ \nabla f from (B.11),

\nabla^2 f(s,\phi,z) = \frac{1}{s}\frac{\partial}{\partial s}\!\left(s\,\frac{\partial f}{\partial s}\right) + \frac{1}{s^2}\frac{\partial^2 f}{\partial\phi^2} + \frac{\partial^2 f}{\partial z^2}.   (B.19)
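As a quick consistency check of (B.19), you can take any Cartesian function, rewrite it in cylindrical coordinates, and compare the two Laplacians symbolically (a sketch, assuming sympy; the test function is arbitrary):

import sympy as sp

s, phi, z = sp.symbols('s phi z', positive=True)
x, y = s * sp.cos(phi), s * sp.sin(phi)
f = x**2 * y + z**3                              # an arbitrary test function, written in cylindrical variables
lap_cyl = sp.diff(s * sp.diff(f, s), s) / s + sp.diff(f, phi, 2) / s**2 + sp.diff(f, z, 2)
lap_cart = 2 * y + 6 * z                         # del^2 of x^2 y + z^3, computed directly in Cartesian coordinates
print(sp.simplify(lap_cyl - lap_cart))           # prints 0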
B.2 A Better Way

All that chain ruling and basis switching is clear, but not always easy to carry out. Let’s
organize the calculation and broaden its applicability. Let the Cartesian coordinates be the
numbered set x1 = x, x2 = y, x3 = z. Suppose we introduce an arbitrary new set of
coordinates, call them X1 , X2 , and X3 , each a function of x1 , x2 , and x3 as in (B.3) (where
we’d have X1 = s, X2 = φ, X3 = z). We’ll assume that the coordinate transformation is
invertible, so that you could write x1 , x2 , and x3 in terms of the X1 (x1 , x2 , x3 ), X2 (x1 , x2 , x3 ),
and X3 (x1 , x2 , x3 ) – we carried out that process explicitly in, for example (B.2).
With the coordinate transformation and its inverse given, our next job is to get the
basis vectors. We know that the X̂1 basis vector points in the direction of increasing X1
coordinate, and similarly for the other basis vectors. If we take the usual r = x1 x̂ 1 +
x2 x̂ 2 + x3 x̂ 3 , and evaluate it at X1 and X1 + Δ, then subtract, we’ll get a vector parallel to
X̂1 . That is, (r(X1 + Δ, X2 , X3 ) − r(X1 , X2 , X3 ))/Δ points from the original location to the
new one, defining the direction of X1 increase in the original Cartesian basis. If we take the
parameter Δ → 0, we can write the difference in terms of the derivative of r with respect
to X1 :
 
W_1 \equiv \lim_{\Delta\to 0}\frac{\mathbf{r}(X_1+\Delta, X_2, X_3) - \mathbf{r}(X_1, X_2, X_3)}{\Delta} = \frac{\partial\mathbf{r}}{\partial X_1},   (B.20)
and then \hat{X}^1 = W_1/|W_1| is the unit vector. We can similarly generate W_2 \equiv \partial\mathbf{r}/\partial X_2 and
W_3 \equiv \partial\mathbf{r}/\partial X_3, with \hat{X}^2 = W_2/|W_2| and \hat{X}^3 = W_3/|W_3|.

The process here can be written in matrix-vector form; define the matrix³

J \doteq \begin{pmatrix} \frac{\partial x_1}{\partial X_1} & \frac{\partial x_2}{\partial X_1} & \frac{\partial x_3}{\partial X_1} \\ \frac{\partial x_1}{\partial X_2} & \frac{\partial x_2}{\partial X_2} & \frac{\partial x_3}{\partial X_2} \\ \frac{\partial x_1}{\partial X_3} & \frac{\partial x_2}{\partial X_3} & \frac{\partial x_3}{\partial X_3} \end{pmatrix},   (B.21)
with entries J_{ij} = \partial x_j/\partial X_i. Now we can summarize the relation between the new basis vectors
and the old ones
\begin{pmatrix} \hat{X}^1 \\ \hat{X}^2 \\ \hat{X}^3 \end{pmatrix} = \underbrace{\begin{pmatrix} \left|\frac{\partial\mathbf{r}}{\partial X_1}\right|^{-1} & 0 & 0 \\ 0 & \left|\frac{\partial\mathbf{r}}{\partial X_2}\right|^{-1} & 0 \\ 0 & 0 & \left|\frac{\partial\mathbf{r}}{\partial X_3}\right|^{-1} \end{pmatrix}}_{\equiv H}\begin{pmatrix} \frac{\partial x_1}{\partial X_1} & \frac{\partial x_2}{\partial X_1} & \frac{\partial x_3}{\partial X_1} \\ \frac{\partial x_1}{\partial X_2} & \frac{\partial x_2}{\partial X_2} & \frac{\partial x_3}{\partial X_2} \\ \frac{\partial x_1}{\partial X_3} & \frac{\partial x_2}{\partial X_3} & \frac{\partial x_3}{\partial X_3} \end{pmatrix}\begin{pmatrix} \hat{x}^1 \\ \hat{x}^2 \\ \hat{x}^3 \end{pmatrix}   (B.22)

where the matrix out front on the right serves to normalize the basis vectors, call it H. This
matrix-vector structure allows us to easily write the formal inverse to relate the original
basis vectors to the new ones (i.e. we can invert to write x̂1 , x̂2 , and x̂3 in terms of the new
basis vectors).

3 This matrix is called the “Jacobian” of the transformation. In a confusing abuse of notation, sometimes it is the
determinant of this matrix that is called the Jacobian.
Let’s make sure we recover the cylindrical basis vectors from the previous section using
our new approach. We have x1 = x, x2 = y, and x3 = z as usual, with X1 = s, X2 = φ,
and X3 = z for the cylindrical coordinates. The matrices H and J become, in this specific
setting,
H \doteq \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/X_1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad J \doteq \begin{pmatrix} \cos(X_2) & \sin(X_2) & 0 \\ -X_1\sin(X_2) & X_1\cos(X_2) & 0 \\ 0 & 0 & 1 \end{pmatrix}   (B.23)
and using these in (B.22) gives
\begin{pmatrix} \hat{X}^1 \\ \hat{X}^2 \\ \hat{X}^3 \end{pmatrix} = \begin{pmatrix} \cos(X_2)\,\hat{x}^1 + \sin(X_2)\,\hat{x}^2 \\ -\sin(X_2)\,\hat{x}^1 + \cos(X_2)\,\hat{x}^2 \\ \hat{x}^3 \end{pmatrix}   (B.24)

matching, in this notation, our expression from (B.6).
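This bookkeeping is easy to automate; the sketch below (assuming sympy) builds J and H for the cylindrical map and recovers the rows of (B.24):

import sympy as sp

X1, X2, X3 = sp.symbols('X1 X2 X3', positive=True)
r = sp.Matrix([X1 * sp.cos(X2), X1 * sp.sin(X2), X3])    # (x1, x2, x3) as functions of (X1, X2, X3)
J = r.jacobian([X1, X2, X3]).T                           # J_ij = d x_j / d X_i, as in (B.21)
H = sp.diag(*[1 / J.row(i).norm() for i in range(3)])    # normalizing matrix from (B.22)
print(sp.simplify(H * J))
# each row holds the Cartesian components of one of the new basis vectors, matching (B.24)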


In order to write everything in a consistent matrix-vector form, I need an expression for
the “vector of vectors” on the left and right in (B.22). Define
E \doteq \begin{pmatrix} \hat{X}^1 \\ \hat{X}^2 \\ \hat{X}^3 \end{pmatrix} \qquad e \doteq \begin{pmatrix} \hat{x}^1 \\ \hat{x}^2 \\ \hat{x}^3 \end{pmatrix}   (B.25)

These are just convenient collections of the symbols X̂1 , X̂2 , X̂3 , and the lower-case
version, although it looks odd to have a bold face vector whose entries are themselves
basis vectors. It will only happen in this section, and is only to express the fact that as
objects, the basis vectors are related by linear combination. We can now write

E = HJe −→ e = J−1 H−1 E. (B.26)

Moving on to the gradient. Remember there are two steps we have to carry out. The first
is to use the chain rule to take derivatives. For f(X1 , X2 , X3 ), we have
\frac{\partial f}{\partial x_1} = \frac{\partial f}{\partial X_1}\frac{\partial X_1}{\partial x_1} + \frac{\partial f}{\partial X_2}\frac{\partial X_2}{\partial x_1} + \frac{\partial f}{\partial X_3}\frac{\partial X_3}{\partial x_1}
\frac{\partial f}{\partial x_2} = \frac{\partial f}{\partial X_1}\frac{\partial X_1}{\partial x_2} + \frac{\partial f}{\partial X_2}\frac{\partial X_2}{\partial x_2} + \frac{\partial f}{\partial X_3}\frac{\partial X_3}{\partial x_2}   (B.27)
\frac{\partial f}{\partial x_3} = \frac{\partial f}{\partial X_1}\frac{\partial X_1}{\partial x_3} + \frac{\partial f}{\partial X_2}\frac{\partial X_2}{\partial x_3} + \frac{\partial f}{\partial X_3}\frac{\partial X_3}{\partial x_3},
and we can write this in matrix-vector form
\begin{pmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \\ \partial f/\partial x_3 \end{pmatrix} = \begin{pmatrix} \frac{\partial X_1}{\partial x_1} & \frac{\partial X_2}{\partial x_1} & \frac{\partial X_3}{\partial x_1} \\ \frac{\partial X_1}{\partial x_2} & \frac{\partial X_2}{\partial x_2} & \frac{\partial X_3}{\partial x_2} \\ \frac{\partial X_1}{\partial x_3} & \frac{\partial X_2}{\partial x_3} & \frac{\partial X_3}{\partial x_3} \end{pmatrix}\begin{pmatrix} \partial f/\partial X_1 \\ \partial f/\partial X_2 \\ \partial f/\partial X_3 \end{pmatrix}.   (B.28)

The matrix appearing in this equation is structurally identical to J in (B.21), but with the
roles of the two coordinates reversed (so that this matrix is also a Jacobian, but with the
{X1 , X2 , X3 } as the “original” coordinates, {x1 , x2 , x3 } as the “new” set). Call the matrix
in (B.28) K, with entries K_{ij} = \partial X_j/\partial x_i. What is the relationship between K and J from (B.21)?
Consider their product:

3 
3
∂Xk ∂xn 
3
∂xn ∂Xk
(KJ)mn = Kmk Jkn = = , (B.29)
∂xm ∂Xk ∂Xk ∂xm
k=1 k=1 k=1

and think about the derivative \partial x_n/\partial x_m = \delta_{mn} – the derivative is 1 if n = m (an object
like \partial x_1/\partial x_1) and 0 if n \neq m (as with, for example, \partial x_1/\partial x_2). If we view x_n as a function of X_1, X_2, and X_3,
then the chain rule says

\frac{\partial x_n}{\partial x_m} = \frac{\partial x_n}{\partial X_1}\frac{\partial X_1}{\partial x_m} + \frac{\partial x_n}{\partial X_2}\frac{\partial X_2}{\partial x_m} + \frac{\partial x_n}{\partial X_3}\frac{\partial X_3}{\partial x_m} = \sum_{k=1}^{3}\frac{\partial x_n}{\partial X_k}\frac{\partial X_k}{\partial x_m}   (B.30)

which is precisely what appears in (B.29). But this is just δmn , so we have
(KJ)mn = δmn (B.31)
or, in terms of matrices (rather than their entries): KJ = I the identity matrix. We have just
learned that the matrix K is the matrix inverse of J: K = J−1 .
Now for the second piece of the gradient, the basis vectors. In our current vector
notation, the gradient can be expressed as a product of e T with the vector appearing on
the left in (B.28):
\nabla f = \frac{\partial f}{\partial x_1}\,\hat{x}^1 + \frac{\partial f}{\partial x_2}\,\hat{x}^2 + \frac{\partial f}{\partial x_3}\,\hat{x}^3 = \underbrace{\begin{pmatrix} \hat{x}^1 & \hat{x}^2 & \hat{x}^3 \end{pmatrix}}_{= e^T}\cdot\begin{pmatrix} \partial f/\partial x_1 \\ \partial f/\partial x_2 \\ \partial f/\partial x_3 \end{pmatrix}.   (B.32)
If we multiply both sides of (B.28) by e^T from the left, we get the gradient:

\nabla f = e^T J^{-1}\begin{pmatrix} \partial f/\partial X_1 \\ \partial f/\partial X_2 \\ \partial f/\partial X_3 \end{pmatrix}   (B.33)

and now we can use e = J−1 H−1 E to write everything in terms of the new basis set,
\nabla f = E^T H^{-1}\left(J^{-1}\right)^T J^{-1}\begin{pmatrix} \partial f/\partial X_1 \\ \partial f/\partial X_2 \\ \partial f/\partial X_3 \end{pmatrix}.   (B.34)

The gradient operator that we will use in the divergence and curl is then
\nabla = E^T H^{-1}\left(J^{-1}\right)^T J^{-1}\begin{pmatrix} \partial/\partial X_1 \\ \partial/\partial X_2 \\ \partial/\partial X_3 \end{pmatrix}.   (B.35)

Rather than keeping the matrix notation going throughout the divergence and curl
calculations, we will assume that the gradient operator takes the general form, calculated
using (B.35),
\nabla = F(X_1,X_2,X_3)\,\hat{X}^1\,\frac{\partial}{\partial X_1} + G(X_1,X_2,X_3)\,\hat{X}^2\,\frac{\partial}{\partial X_2} + H(X_1,X_2,X_3)\,\hat{X}^3\,\frac{\partial}{\partial X_3}.   (B.36)
For a vector function V = VX1 X̂1 + VX2 X̂2 + VX3 X̂3 , we can use the product rule to
write the divergence
\nabla\cdot\mathbf{V} = F\Big[\frac{\partial V_{X_1}}{\partial X_1} + V_{X_1}\,\hat{X}^1\cdot\frac{\partial\hat{X}^1}{\partial X_1} + V_{X_2}\,\hat{X}^1\cdot\frac{\partial\hat{X}^2}{\partial X_1} + V_{X_3}\,\hat{X}^1\cdot\frac{\partial\hat{X}^3}{\partial X_1}\Big]
+ G\Big[V_{X_1}\,\hat{X}^2\cdot\frac{\partial\hat{X}^1}{\partial X_2} + \frac{\partial V_{X_2}}{\partial X_2} + V_{X_2}\,\hat{X}^2\cdot\frac{\partial\hat{X}^2}{\partial X_2} + V_{X_3}\,\hat{X}^2\cdot\frac{\partial\hat{X}^3}{\partial X_2}\Big]   (B.37)
+ H\Big[V_{X_3}\,\hat{X}^3\cdot\frac{\partial\hat{X}^3}{\partial X_3} + V_{X_2}\,\hat{X}^3\cdot\frac{\partial\hat{X}^2}{\partial X_3} + \frac{\partial V_{X_3}}{\partial X_3} + V_{X_1}\,\hat{X}^3\cdot\frac{\partial\hat{X}^1}{\partial X_3}\Big].
These expressions require us to evaluate the derivatives of the basis functions, easy enough
to do once you have their expressions from (B.26). The curl requires us to know all of the
cross products of the basis vectors, which can again be calculated from (B.26). Then the
curl reads,
\nabla\times\mathbf{V} = F\Big[V_{X_1}\,\hat{X}^1\times\frac{\partial\hat{X}^1}{\partial X_1} + \frac{\partial V_{X_2}}{\partial X_1}\,\hat{X}^1\times\hat{X}^2 + V_{X_2}\,\hat{X}^1\times\frac{\partial\hat{X}^2}{\partial X_1} + \frac{\partial V_{X_3}}{\partial X_1}\,\hat{X}^1\times\hat{X}^3 + V_{X_3}\,\hat{X}^1\times\frac{\partial\hat{X}^3}{\partial X_1}\Big]
+ G\Big[V_{X_2}\,\hat{X}^2\times\frac{\partial\hat{X}^2}{\partial X_2} + \frac{\partial V_{X_1}}{\partial X_2}\,\hat{X}^2\times\hat{X}^1 + V_{X_1}\,\hat{X}^2\times\frac{\partial\hat{X}^1}{\partial X_2} + \frac{\partial V_{X_3}}{\partial X_2}\,\hat{X}^2\times\hat{X}^3 + V_{X_3}\,\hat{X}^2\times\frac{\partial\hat{X}^3}{\partial X_2}\Big]   (B.38)
+ H\Big[V_{X_3}\,\hat{X}^3\times\frac{\partial\hat{X}^3}{\partial X_3} + \frac{\partial V_{X_2}}{\partial X_3}\,\hat{X}^3\times\hat{X}^2 + V_{X_2}\,\hat{X}^3\times\frac{\partial\hat{X}^2}{\partial X_3} + \frac{\partial V_{X_1}}{\partial X_3}\,\hat{X}^3\times\hat{X}^1 + V_{X_1}\,\hat{X}^3\times\frac{\partial\hat{X}^1}{\partial X_3}\Big].
Finally, the general Laplacian acting on f(X1 , X2 , X3 ) comes from applying the diver-
gence to the gradient, as usual:
\nabla^2 f = F\Big[\frac{\partial}{\partial X_1}\!\left(F\frac{\partial f}{\partial X_1}\right) + F\frac{\partial f}{\partial X_1}\,\hat{X}^1\cdot\frac{\partial\hat{X}^1}{\partial X_1} + G\frac{\partial f}{\partial X_2}\,\hat{X}^1\cdot\frac{\partial\hat{X}^2}{\partial X_1} + H\frac{\partial f}{\partial X_3}\,\hat{X}^1\cdot\frac{\partial\hat{X}^3}{\partial X_1}\Big]
+ G\Big[\frac{\partial}{\partial X_2}\!\left(G\frac{\partial f}{\partial X_2}\right) + G\frac{\partial f}{\partial X_2}\,\hat{X}^2\cdot\frac{\partial\hat{X}^2}{\partial X_2} + F\frac{\partial f}{\partial X_1}\,\hat{X}^2\cdot\frac{\partial\hat{X}^1}{\partial X_2} + H\frac{\partial f}{\partial X_3}\,\hat{X}^2\cdot\frac{\partial\hat{X}^3}{\partial X_2}\Big]   (B.39)
+ H\Big[\frac{\partial}{\partial X_3}\!\left(H\frac{\partial f}{\partial X_3}\right) + H\frac{\partial f}{\partial X_3}\,\hat{X}^3\cdot\frac{\partial\hat{X}^3}{\partial X_3} + G\frac{\partial f}{\partial X_2}\,\hat{X}^3\cdot\frac{\partial\hat{X}^2}{\partial X_3} + F\frac{\partial f}{\partial X_1}\,\hat{X}^3\cdot\frac{\partial\hat{X}^1}{\partial X_3}\Big].

B.3 Spherical Coordinates

Spherical coordinates are defined as shown in Figure 6.9, reproduced in this section
as Figure B.2. From the geometry of the definition, we have x = r sin θ cos φ, y =
r sin θ sin φ, z = r cos θ.
Fig. B.2 Spherical coordinates, r, θ , and φ are related to the Cartesian x, y, and z as shown.

To use the formalism from the previous section, take X1 ≡ r, X2 ≡ θ, X3 ≡ φ, with


x1 ≡ x, x2 ≡ y, x3 ≡ z as before. Our first job is to compute J and H from (B.21)
and (B.22)
J \doteq \begin{pmatrix} \sin(X_2)\cos(X_3) & \sin(X_2)\sin(X_3) & \cos(X_2) \\ X_1\cos(X_2)\cos(X_3) & X_1\cos(X_2)\sin(X_3) & -X_1\sin(X_2) \\ -X_1\sin(X_2)\sin(X_3) & X_1\cos(X_3)\sin(X_2) & 0 \end{pmatrix}
H \doteq \begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{X_1} & 0 \\ 0 & 0 & \frac{1}{X_1\sin(X_2)} \end{pmatrix}.   (B.40)

Using E = HJe, we can write the spherical basis vectors, in terms of the Cartesian ones:

\hat{X}^1 = \sin(X_2)\cos(X_3)\,\hat{x}^1 + \sin(X_2)\sin(X_3)\,\hat{x}^2 + \cos(X_2)\,\hat{x}^3
\hat{X}^2 = \cos(X_2)\cos(X_3)\,\hat{x}^1 + \cos(X_2)\sin(X_3)\,\hat{x}^2 - \sin(X_2)\,\hat{x}^3   (B.41)
\hat{X}^3 = -\sin(X_3)\,\hat{x}^1 + \cos(X_3)\,\hat{x}^2.

The gradient, according to (B.35) is


\nabla = \hat{X}^1\,\frac{\partial}{\partial X_1} + \frac{1}{X_1}\,\hat{X}^2\,\frac{\partial}{\partial X_2} + \frac{1}{X_1\sin(X_2)}\,\hat{X}^3\,\frac{\partial}{\partial X_3},   (B.42)

and we identify F ≡ 1, G ≡ 1/X1 , and H ≡ 1/(X1 sin(X2 )) in (B.36).


From here, we need to know the derivatives of the basis vectors. The nonzero
derivatives are:

\frac{\partial\hat{X}^1}{\partial X_2} = \hat{X}^2 \qquad \frac{\partial\hat{X}^2}{\partial X_2} = -\hat{X}^1 \qquad \frac{\partial\hat{X}^1}{\partial X_3} = \sin(X_2)\,\hat{X}^3 \qquad \frac{\partial\hat{X}^2}{\partial X_3} = \cos(X_2)\,\hat{X}^3
\frac{\partial\hat{X}^3}{\partial X_3} = -\sin(X_2)\,\hat{X}^1 - \cos(X_2)\,\hat{X}^2.   (B.43)
With these, we can write the divergence


     
\nabla\cdot\mathbf{V} = \frac{\partial V_{X_1}}{\partial X_1} + \frac{1}{X_1}\left(\frac{\partial V_{X_2}}{\partial X_2} + V_{X_1}\right) + \frac{1}{X_1\sin(X_2)}\left(\frac{\partial V_{X_3}}{\partial X_3} + V_{X_2}\cos(X_2) + V_{X_1}\sin(X_2)\right)
= \frac{1}{(X_1)^2}\frac{\partial}{\partial X_1}\!\left((X_1)^2 V_{X_1}\right) + \frac{1}{X_1\sin(X_2)}\frac{\partial}{\partial X_2}\!\left(\sin(X_2)V_{X_2}\right) + \frac{1}{X_1\sin(X_2)}\frac{\partial V_{X_3}}{\partial X_3}.   (B.44)

For the curl, we must compute the cross products of the new basis vectors

X̂1 × X̂2 = X̂3 X̂1 × X̂3 = −X̂2 X̂2 × X̂3 = X̂1 (B.45)

and then
   
\nabla\times\mathbf{V} = \frac{\partial V_{X_2}}{\partial X_1}\,\hat{X}^3 - \frac{\partial V_{X_3}}{\partial X_1}\,\hat{X}^2 + \frac{1}{X_1}\left(V_{X_2}\,\hat{X}^3 - \frac{\partial V_{X_1}}{\partial X_2}\,\hat{X}^3 + \frac{\partial V_{X_3}}{\partial X_2}\,\hat{X}^1\right)
+ \frac{1}{X_1\sin(X_2)}\left(V_{X_3}\left(-\sin(X_2)\,\hat{X}^2 + \cos(X_2)\,\hat{X}^1\right) - \frac{\partial V_{X_2}}{\partial X_3}\,\hat{X}^1 + \frac{\partial V_{X_1}}{\partial X_3}\,\hat{X}^2\right)
= \frac{1}{X_1\sin(X_2)}\left(\frac{\partial}{\partial X_2}\!\left(\sin(X_2)V_{X_3}\right) - \frac{\partial V_{X_2}}{\partial X_3}\right)\hat{X}^1 + \left(\frac{1}{X_1\sin(X_2)}\frac{\partial V_{X_1}}{\partial X_3} - \frac{1}{X_1}\frac{\partial}{\partial X_1}\!\left(X_1 V_{X_3}\right)\right)\hat{X}^2
+ \frac{1}{X_1}\left(\frac{\partial}{\partial X_1}\!\left(X_1 V_{X_2}\right) - \frac{\partial V_{X_1}}{\partial X_2}\right)\hat{X}^3.   (B.46)

Finally, the Laplacian here is


\nabla^2 f = \frac{\partial^2 f}{\partial(X_1)^2} + \frac{1}{X_1}\left(\frac{\partial}{\partial X_2}\!\left(\frac{1}{X_1}\frac{\partial f}{\partial X_2}\right) + \frac{\partial f}{\partial X_1}\right)
+ \frac{1}{X_1\sin(X_2)}\left(\frac{\partial}{\partial X_3}\!\left(\frac{1}{X_1\sin(X_2)}\frac{\partial f}{\partial X_3}\right) + \cos(X_2)\,\frac{1}{X_1}\frac{\partial f}{\partial X_2} + \sin(X_2)\,\frac{\partial f}{\partial X_1}\right)
= \frac{1}{(X_1)^2}\frac{\partial}{\partial X_1}\!\left((X_1)^2\frac{\partial f}{\partial X_1}\right) + \frac{1}{(X_1)^2\sin(X_2)}\frac{\partial}{\partial X_2}\!\left(\sin(X_2)\frac{\partial f}{\partial X_2}\right) + \frac{1}{(X_1)^2\sin^2(X_2)}\frac{\partial^2 f}{\partial(X_3)^2}.   (B.47)

It is useful to record these results in terms of the more naturally named r, θ, and φ
coordinate labels from Figure B.2. In this notation, the basis vectors are r̂ = X̂1 , θ̂ = X̂2 ,
and φ̂ = X̂3 . The gradient is
\nabla f(r,\theta,\phi) = \frac{\partial f}{\partial r}\,\hat{r} + \frac{1}{r}\frac{\partial f}{\partial\theta}\,\hat{\theta} + \frac{1}{r\sin\theta}\frac{\partial f}{\partial\phi}\,\hat{\phi}.   (B.48)
The divergence and curl are


\nabla\cdot\mathbf{V} = \frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2 V_r\right) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\,V_\theta\right) + \frac{1}{r\sin\theta}\frac{\partial V_\phi}{\partial\phi}
\nabla\times\mathbf{V} = \frac{1}{r\sin\theta}\left(\frac{\partial}{\partial\theta}\!\left(\sin\theta\,V_\phi\right) - \frac{\partial V_\theta}{\partial\phi}\right)\hat{r} + \left(\frac{1}{r\sin\theta}\frac{\partial V_r}{\partial\phi} - \frac{1}{r}\frac{\partial}{\partial r}\!\left(rV_\phi\right)\right)\hat{\theta}   (B.49)
+ \frac{1}{r}\left(\frac{\partial}{\partial r}\!\left(rV_\theta\right) - \frac{\partial V_r}{\partial\theta}\right)\hat{\phi},

with Laplacian

\nabla^2 f = \frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2\frac{\partial f}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\frac{\partial f}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 f}{\partial\phi^2}.   (B.50)

B.4 Integral Elements

At the heart of integrals in vector calculus is the vector “line element.” We’ll start in
Cartesian coordinates, where it is easy to to define the line element, and then focus on
its geometric meaning in order to build the other cases. Imagine sitting at a point in
three dimensions, and you are told to go out a distance dx in the x̂ direction. Then the
infinitesimal vector representing your direction of motion would be dx x̂. If you also went
dy in the ŷ direction and dz in the ẑ direction, then the vector that points from your starting
location to your new one is the line element:

dℓ = dx x̂ + dy ŷ + dz ẑ.   (B.51)

Since the directions are orthogonal, we just add together the “moves” in each direction.
The magnitude of the line element gives the Pythagorean length:
dℓ = √(dℓ · dℓ) = √(dx² + dy² + dz²).   (B.52)

In cylindrical coordinates, we’d like to use the cylindrical basis vectors ŝ, φ̂, and ẑ
together with the infinitesimal changes in coordinates, ds, dφ, and dz to express the line
element dℓ. Imagine making a move ds in the ŝ direction. If you take dφ = 0 and dz = 0,
then dℓ = ds ŝ. If you moved only in the ẑ direction a distance dz, then dℓ = dz ẑ. Since
the ŝ and ẑ directions are orthogonal (at all points), we could combine these to get dℓ =
ds ŝ + dz ẑ. How about a dφ move in the φ̂ direction? If we set ds = 0 = dz, and consider
a move only in the φ̂ direction, what is dℓ? We need to know the length associated with
an angular arc dφ, and that length depends on s. Referring to Figure B.3, if the starting
point is at a height z, and a distance s from the ẑ axis, then the length of the arc spanned
by dφ is s dφ, so that dℓ = s dφ φ̂ for a pure angular move. Putting all of these together,
Fig. B.3 The three orthogonal moves we can make from a point at height z, distance s from the ẑ axis.

Fig. B.4 Infinitesimal moves in the r̂, θ̂, and φ̂ directions from the point with spherical coordinates r, θ , φ.

we take a step in each of the three directions as shown in Figure B.3, the cylindrical line
element is

dℓ = ds ŝ + s dφ φ̂ + dz ẑ.   (B.53)

In spherical coordinates, infinitesimal displacements in the three independent directions


for a point at r, θ, φ are shown in Figure B.4. From the picture, we can read off the line
element

dℓ = dr r̂ + r dθ θ̂ + r sin θ dφ φ̂.   (B.54)

These line elements are easy to pick off when we have a clear geometrical picture
of the infinitesimals and their orthogonal directions at a point. But many coordinate
transformations are much harder to draw, and we don’t want to rely on always being able
to make a picture and identify the line elements “by eye.” The analytical approach is the
obvious one, just take the Cartesian definition (B.51) and rewrite dx, dy and dz in terms of
the new coordinate infinitesimals and the derivatives of the new coordinates with respect
to the Cartesian ones, and express the Cartesian basis vectors in terms of the new ones.
As an example, let’s work out the cylindrical line element directly from the Cartesian
form. From the chain rule, we have
dx = (∂x/∂s) ds + (∂x/∂φ) dφ + (∂x/∂z) dz = cos φ ds − s sin φ dφ
dy = (∂y/∂s) ds + (∂y/∂φ) dφ + (∂y/∂z) dz = sin φ ds + s cos φ dφ   (B.55)
dz = (∂z/∂s) ds + (∂z/∂φ) dφ + (∂z/∂z) dz = dz.
Using these, together with the basis vector relations from (B.7), we have

dℓ = dx x̂ + dy ŷ + dz ẑ = ds ŝ + s dφ φ̂ + dz ẑ   (B.56)

as expected.

B.4.1 Volume Element


The line element defines an infinitesimal box with side lengths given by the infinitesimal
lengths in each of the independent directions. If you want to build a volume element in
preparation for performing a volume integral, you just form the volume of the box by
multiplying the side lengths. For Cartesian coordinates with line element from (B.51), the
infinitesimal cube has volume

dτ = dxdydz. (B.57)

For the cylindrical line element in (B.53), the product of the side lengths gives

dτ = sdsdφdz, (B.58)

and for the spherical line element, (B.54), the product is

dτ = r2 sin θdrdθdφ. (B.59)

B.4.2 Area Elements


An infinitesimal area element has direction that is parallel to the surface normal, and
magnitude that is set by the area of the infinitesimal platelet on the surface. In Cartesian
coordinates, for a surface lying in the xy plane, the area element points in the ±ẑ direction
with magnitude dxdy, the rectangular area spanned by the infinitesimal line element in the
plane. If the surface was in the xz plane, the area element would be da = dxdz ŷ (picking
Fig. B.5 Three natural surface area elements for the cylinder. Each one has direction that is normal to the surface, and
magnitude that is given by the infinitesimal surface patch.

the positive direction arbitrarily for this open surface). For a flat surface in the yz plane, the
area element is da = dydz x̂.
When we move to curvilinear coordinates like cylindrical ones, there are three “natural”
area elements we can make, pointing in each of the three directions. The algorithm is:
Identify the direction of the normal, that gives the direction of da, then multiply the
other two line element magnitudes together to get the infinitesimal area magnitude. For
a cylinder, if you take the top surface, with da ∥ ẑ, then the magnitude is da = s ds dφ. The
curved surface of the cylinder has da = sdφdz ŝ. If you cut the cylinder in half, the interior
face has da = dsdz φ̂.
In spherical coordinates, the most natural surface of interest is the surface of a sphere,
and this has da ∥ r̂. Multiplying the other two line element magnitudes together gives
da = r² sin θ dθ dφ r̂.

Problem B.4.1 Evaluate ∇2 θ̂ (for the spherical unit vector pointing in the direction of
increasing θ) and ∇ · θ̂.
Problem B.4.2 Work out the basis vectors, gradient, divergence, curl, and Laplacian for
“elliptical cylindrical” coordinates {p, q, z}, related to the Cartesian ones via,

x = a cosh p cos q y = a sinh p sin q z=z (B.60)

for constant parameter a.


Problem B.4.3 Work out the basis vectors, gradient, divergence, curl and Laplacian for
“prolate spheroidal” coordinates, {u, v, φ}, related to the Cartesian ones via,

x = a sinh u sin v cos φ y = a sinh u sin v sin φ z = a cosh u cos v (B.61)

for constant parameter a.


Problem B.4.4 Four-dimensional Cartesian coordinates consist of the three usual coordi-
nates, {x, y, z} augmented with a fourth coordinate w. Work out the basis vectors,
gradient, divergence, curl, and Laplacian for four-dimensional spherical coordinates
where we introduce a new angle ψ to go along with θ and φ, and the four Cartesian
coordinates are related to the spherical ones {r, θ, φ, ψ} by
x = r sin ψ sin θ cos φ y = r sin ψ sin θ sin φ z = r sin ψ cos θ w = r cos ψ.
(B.62)
References

[1] G. B. Arfken, H. J. Weber, and F. E. Harris, “Mathematical Methods for Physicists:


A Comprehensive Guide,” Academic Press, 7th ed., 2012.
[2] C. M. Bender and S. A. Orszag, “Advanced Mathematical Methods for Scientists and
Engineers: Asymptotic Methods and Perturbation Theory,” Springer, 1999.
[3] M. L. Boas, “Mathematical Methods in the Physical Sciences,” Wiley, 3rd ed., 2005.
[4] F. W. Byron, Jr. and R. W. Fuller, “Mathematics of Classical and Quantum Physics,”
Dover Publications, revised ed., 1992.
[5] D. C. Chapman and P. M. Rizzoli, “Wave Motions in the Ocean: Myrl’s View.”
Technical Report, MIT/WHOI Joint Program, Woods Hole, MA, 1989.
[6] D. Clark, J. Franklin, and N. Mann, “Relativistic Linear Restoring Force,” European
Journal of Physics, 33, 1041–1051, 2012.
[7] J. Franklin, “ Computational Methods for Physics,” Cambridge University Press,
2013.
[8] J. Franklin, “Classical Field Theory,” Cambridge University Press, 2017.
[9] A. P. French, “Vibrations and Waves,” CBS Publishers & Distributors, 2003.
[10] H. Georgi, “The Physics of Waves,” Prentice Hall, Inc., 1993.
[11] G. H. Golub and C. F. Van Loan, “Matrix Computations,” Johns Hopkins University
Press, 4th ed., 2013.
[12] D. J. Griffiths, “Introduction to Electrodynamics,” Cambridge University Press, 4th
ed., 2017.
[13] D. J. Griffiths and D. F. Schroeter, “Introduction to Quantum Mechanics,” Cambridge
University Press, 3rd ed., 2018.
[14] R. J. LeVeque, “Numerical Methods for Conservation Laws,” Lectures in Mathemat-
ics, Birkhäuser, 1992.
[15] K. F. Riley, M. P. Hobson, and S. J. Bence, “Mathematical Methods for Physics and
Engineering,” Cambridge University Press, 3rd ed., 2006.
[16] I. G. Main, “Vibrations and Waves in Physics,” Cambridge University Press, 3rd ed.,
1993.
[17] P. McCord Morse and H. Feschbach, “Methods of Theoretical Physics,” McGraw-
Hill, 1953.
[18] J. J. Sakurai and J. Napolitano, “Modern Quantum Mechanics,” Cambridge Univer-
sity Press, 2nd ed., 2017.
[19] G. B. Whitham, “Linear and Nonlinear Waves,” John Wiley & Sons, Inc., 1999.

Index

amplitude, 2 complex driving force, 38


angular frequency, 3 complex impedance, 56
harmonic oscillator, 3 complex numbers, 19–23
plane waves, 169 addition, 20
standing waves, 105 Cartesian components, 20
antinodes, 104 complex conjugate, 22
antisymmetric matrix, 81 imaginary axis, 20
arcsine imaginary part, 19
derivative, 137 multiplication, 20
relation to log, 136 polar components, 20
arcsinh (relation to log), 136 real axis, 20
area element, 155, 250–251 real part, 19
cylindrical coordinates, 251 complex time, 26, 184, 202
spherical coordinates, 163, 251 conjugate transpose, 78
area integral, 155–156 conservation law, 114–115, 185, 189
average, 194 integral form, 114–115, 159–160
averaging property, 165–167 three-dimensional, 159–160
axially symmetric, 175 conservation of energy, 7–14, 130
relativistic, 139
basis, 68, 79 conservation of mass, 113–114
canonical, 67 conservative force, 5–8
beats, 103 closed curve, 154
Bessel function, 19, 174, 210 path independence, 154
orthogonality, 180 three-dimensional, 154–155
recursion relation, 174 continuity, 32
spherical, 19, 171 violation, 126
Bessel’s equation, 173 convective derivative, 185, 186
Frobenius method, 173–175 cosine, 18
Biot–Savart law, 216 angle addition, 23
bisection, 208 hyperbolic, 25
convergence, 210 cosine series, 44, 48
black hole, 12 Coulomb force, 13
boundary values, 3 critically damped, 34–35
numerical, 220 cross product, 147
of basis vectors, 247
canonical basis, 67 right-hand rule, 148
capacitor, 53 cross-correlation, 64
Cartesian coordinates, 242 cross-derivative equality, 94
four-dimensional, 252 curl, 150–152, 164–165
center of mass, 66 cylindrical coordinates, 241
change of variables, 134 general coordinates, 245
characteristic curves, 97 right-hand rule, 151
charge density, 159, 182 spherical coordinates, 247
closed path, 154 curl theorem, 160–162
closed surface, 155 current density, 182
complex conjugate, 22 curvature, 192

curvature vector, 192 elliptical cylindrical coordinates, 251


cylindrical basis vectors, 239 elliptical motion, 127–129
cylindrical coordinates, 172, 238–241 perimeter, 127–129
area element, 251 semi-major axis, 127
basis vectors, 239 semi-minor axis, 127
curl, 241 energy density, 95, 185, 186
divergence, 241 energy quantization, 200, 204
gradient, 240 equilibrium, 5
Laplacian, 241 minimum, 6
line element, 249 stable, 9
volume element, 250 unstable, 6, 9
escape speed, 10–11
damped driven harmonic oscillator, see harmonic ether, 182
oscillator Euler equations, 185–187
damped harmonic oscillator, see harmonic oscillator approximate, 187–189
damping, 32–38 three-dimensional, 186–187
dashpot, 33 Euler’s formula, 20, 22
determinant, 72–73, 76, 80 hyperbolic, 25
diagonalized matrix, 77 even function, 42–44, 47–48, 177, 204
differential, 28 event horizon, 12
dimension, 66 expectation value, 194
dimensionless equation, 212–214, 225, 227 exponential, 15–16, 23–26
restoring dimension, 214 characteristic time, 36
Dirac delta function, 59–60, 62–63 complex argument, 20
and Heaviside step function, 62 derivatives, 23
and Kronecker delta, 60 Euler’s formula, 20
three-dimensional, 163–165 infinite sum, 15–16
discrete Fourier transform, 229–233 integrals, 25, 40
divergence, 150–152, 163–164 matrix, 121
of basis vectors, 251 exponential ansatz, 234
cylindrical coordinates, 241
general coordinates, 245 fast Fourier transform, 232
spherical coordinates, 247 filter, 233
divergence theorem, 156–159 finite difference, 220–224
dot product, 67, 69, 71, 146 matrix form, 222
drag, 33 fluids, 185–189
driven harmonic oscillator, see harmonic oscillator Fourier series, 40–51, 229
alternative definition, 44
eigenfunction, 81, 224 cosine series, 44
eigenvalue problem, 76–79, 224 discretized, 229
continuous, 81, 199, 224 even and odd functions, 42–44
numerical solution, 227–228 Gibb’s phenomenon, 44–46
eigenvalues, 76–81, 224 sine series, 44, 101
eigenvectors, 75–81 square wave, 41–44
electric field, 182, 207 triangle wave, 47
electrical circuit, 51, 53–57 Fourier transform, 57–64
complex impedance, 56 cross-correlation, 64
gain, 54 Dirac delta function, 59–60
power, 54 discrete, 229–233
Q factor, 54–55 and Fourier series, 57–58
resonance, 54 Parseval’s relation, 61
electromagnetic waves, 181–184 power spectrum, 61
electrostatic force, 13 real signal, 62
elliptic integral wave equation, 102–103
approximation, 129 frequency, 2
first kind (F), 133 angular, 3
second kind (E), 129 and eigenvalues, 84
frequency (cont.) Fourier transform, 63–64
beats, 103 frequency, 2
filter, 233 initial values, 2
harmonic oscillator, 2 integral solution, 123
negative, 61 kinetic friction, 31–32
Nyquist, 231 numerical solution, 215, 223–224
plane waves, 107, 169 overdamped, 34–36
standing waves, 105 period, 2, 35, 131–132, 140–142
Frobenius method, 14–19 potential energy, 130, 149
Bessel’s equation, 173–175 probability density, 197–198
cosine and sine, 17–18 Q factor, 36
exponential, 15–16 quantum mechanical, 202–205
Hermite’s equation, 204–205 relativistic, 139–142, 215
Legendre’s equation, 176–177 resonance, 50, 52
recursion relation, 15 series solution, 17–18
fundamental theorem of calculus, 136, 152, 156, 160 three-dimensional, 149–150
two-dimensional, 126–129
gain, 54 underdamped, 34–35
Gauss’s law, 14, 162 work, 155
Gaussian, 60 heat equation, 102
density, 195 Heaviside step function, 62
general basis vectors, 242–243 Hermite polynomials, 204
general coordinates, 242 orthogonality, 206
basis vectors, 242–243 Rodrigues formula, 206
curl, 245 Hermite’s equation, 203–205
divergence, 245 Frobenius method, 204–205
gradient, 244–245 homogeneous problem, 29
Laplacian, 245 hydrogen, 226–227
Gibb’s phenomenon, 44–46 numerical solution, 229
gradient, 148–149, 238 hyperbolic cosine, 25
cylindrical coordinates, 240 hyperbolic sine, 25
general coordinates, 244–245
spherical coordinates, 246 identity matrix, 71
gravitational field, 14 image compression, 79
gravity impedance, 56
force, 10 incident wave, 108
near surface of the earth, 186 inductor, 53
potential energy, 10 infinite square well
grid, 211, 221, 224 classical, 195–196
ground state, 206 numerical solution, 229
quantum mechanical, 200–202, 225–226
harmonic function, 165–168 initial values, 2
averaging property, 165–167 numerical, 212
no local max/min, 167 integration
uniqueness, 167–168 area element, 155, 250–251
harmonic oscillator, 1–3 area integral, 155–156
amplitude, 2 change of variables, 134
angular frequency, 3 curl theorem, 160–162
boundary values, 3 divergence theorem, 156–159
critically damped, 34–35 fundamental theorem of calculus, 136
damped driven, 51–55, 64, 123–126 line element, 153, 248, 249
damping, 32–38, 63–64, 122 line integral, 153–154
delays, 110–111 numerical, 215–220
dimensionless, 213–214 by parts, 136–137
driven, 38–39, 50–51, 122–123 product rule, 136
first-order form, 120–126 Simpson’s rule, 218
Fourier series, 50–51 trapezoidal approximation, 218
trigonometric substitution, 134–136 eigenvalue problem, 76–79, 224
volume element, 155, 250 eigenvalues, 76–81, 224
volume integral, 155 eigenvectors, 75–81
integration by parts, 136–137 exponential, 121
inverse trigonometric functions, 135 identity, 71
derivatives, 137 inverse, 71
Jacobian, 242–244, 246
Jacobian, 242–244, 246 matrix multiplication, 71
inverse, 244 orthogonal, 74
scalar multiplication, 70
kinetic energy, 8 symmetric, 77–79
kinetic friction, 31–33 trace, 80
Kirchhoff’s voltage law, 53, 56 Korteweg–de Vries (KdV) equation, 189
Korteweg–de Vries (KdV) equation, 189 and vectors, 69–70, 74
Kronecker delta, 41 vector multiplication, 70–71
and Dirac delta function, 60 Maxwell’s equations, 162, 182
in vacuum, 182
Laplace equation, 102 method of characteristics, 97–98
Laplace’s equation, 165, 172–179, 184 traffic flow, 115–116
cylindrical coordinates, 172–173 wave equation, 114
separation of variables, 173–179 method of Frobenius, 14–19
Laplacian, 165–168 Bessel’s equation, 173–175
of basis vectors, 168, 170, 251 cosine and sine, 17–18
cylindrical coordinates, 172–173, 241 exponential, 15–16
general coordinates, 245 Hermite’s equation, 204–205
spherical coordinates, 175, 247 Legendre’s equation, 176–177
Legendre polynomials, 177 recursion relation, 15
orthogonality, 177–178 Minkowski length, 142–143
Rodrigues formula, 179
Legendre’s equation, 176 negative mass motion, 12
Frobenius method, 176–177 Neumann function, 175
light, 184 Newton’s second law, 211
line element, 153, 248 relativistic, 145
cylindrical coordinates, 249 nodes, 104
spherical coordinates, 249 nonlinear wave equation, 190–192
line integral, 153–154 inextensible limit, 191–192
linear combination, 67 normal modes, 84–87
linear transformation, 73 achieving, 85
ln, 24 Nyquist frequency, 231
local maximum, 6, 9
local minimum, 6 O notation, 92
logarithm, 23–26 odd function, 42–44, 47–48, 177, 204
derivatives, 24 ODE
integrals, 25 asymptotic solution, 203
natural, 24 continuity, 26–27, 32
longitudinal, 191 exponential ansatz, 234
longitudinal waves, 169 Fourier series, 48–51
Fourier transform, 63–64
Mach angle, 112 homogeneous solution, 29–30, 38
Mach cone, 113 numerical solution, 211–215, 220–224
magnetic field, 182, 207, 216 orthogonality, 177–178, 202, 206
numerical approximation, 220 peel off, 35, 203, 205
matrix, 69–73 plan of attack, 234–236
addition, 70 polynomial ansatz, 235
antisymmetric, 81 separation of variables, 27–28
determinant, 72–73, 80 sourced solution, 29–30, 38
diagonalized, 77 superposition, 28–29
ODE (cont.) equilibrium, 5
variation of parameters, 29, 30 gravitational, 10
vector form, 120–126 maximum, 9
orthogonal matrix, 74 spherically symmetric, 130
overdamped, 34–36 Yukawa, 12
power method, 227–229
Parseval’s relation, 61 power spectrum, 61, 86
partial derivatives, 92–95 pressure, 186
of basis vectors, 246 probability density, 193–198
cross-derivative equality, 94 constant, 195–196
path, 153, 207 Gaussian, 195
closed, 154 harmonic oscillator, 197–198
PDE, 96 infinite square well, 195–196
method of characteristics, 97–98, 114 normalization, 194
Riemann problem, 116–119 quantum mechanical harmonic oscillator, 205
separation of variables, 98–101, 171, 173–179, quantum mechanical infinite square well, 201
198–199 statistical interpretation, 196
series solution, 101–102 time average, 197–198
soliton, 189 time independent, 199
weak solution, 117 product log, 111, 210
pendulum, 11–12 product rule, 136
approximation, 11 prolate spheroidal coordinates, 251
comparison, 13 proper time, 143
numerical solution, 215
period, 132–133 Q factor, 36, 54–55
simple, 13 quadratic interpolation, 218
period, 2, 35
harmonic oscillator, 2, 131–132 radius of curvature, 192
pendulum, 132–133 recursion relation, 15, 17
plane waves, 107 asymptotic, 204, 206
relativistic oscillator, 140–142 reflected wave, 108
standing waves, 104 reflection coefficient, 109
periodic function, 40 relativistic length, 142–143
phase, 3 relativistic oscillator, 139–142
piecewise solution, 32 numerical solution, 215
plane waves, 107–109 period, 140–142
angular frequency, 169 residuals, 215
electromagnetic, 183–184 resistor, 53
frequency, 107, 169 resonance, 50, 52, 54
incident, 108 resonance curve, 54
period, 107 rest frame, 143
polarization, 183, 184 retarded time, 207
reflected, 108 implicit equation, 208
three-dimensional, 169 Riemann problem, 116–119
transmitted, 108 right-hand rule
transverse, 183 cross product, 148
wave number, 169 curl, 151
wave vector, 169, 183 Rodrigues formula
wavelength, 107, 169 Hermite polynomials, 206
Poisson equation, 168 Legendre polynomials, 179
uniqueness, 168 root finding problem, 207
polarization, 183, 184 rotation, 74, 80
polynomial ansatz, 235
potential, 209 Schrödinger’s equation, 103–104, 193, 198–205
of moving charge, 209 allowed energies, 200
potential energy, 5 boundary conditions, 199
Coulomb, 226 dimensionless, 227
for hydrogen, 227 volume element, 250
harmonic oscillator, 202–205 springs in series, 91, 94
infinite square well, 200–202 square wave, 41–44, 48, 124
numerical solution, 224 Gibb’s phenomenon, 45
orthogonality, 202 stable equilibrium, 9
separation of variables, 198–199 standard deviation, 194
superposition, 201 standing waves, 104–106
time independent, 199 angular frequency, 105
self force, 37 antinode, 104
semi-major axis, 127 frequency, 105
semi-minor axis, 127 nodes, 104
separation of variables, 27–28, 98–101 period, 104
additive, 98–99 wavelength, 104
Laplace’s equation, 173–179 stationary states, 199
logic, 99 steady state solution, 52
multiplicative, 99–101 superposition, 28–29, 94–95, 201
Schrödinger’s equation, 198–199 surface area, 138
series expansion, 14–19 surface of revolution, 138
Bessel’s equation, 173–175 symmetric matrix, 77–79
cosine and sine, 17–18
exponential, 15–16 Taylor expansion, 3–5, 40
Hermite’s equation, 204–205 vector, 157
Legendre’s equation, 176–177 time independent Schrödinger’s equation, 199
recursion relation, 15 trace, 80
shallow water equations, 187–189 traffic flow, 115–119
linearized, 189 Riemann problem, 116–119
one-dimensional, 187–189 shocks, 116
two-dimensional, 189 transient solution, 52, 124
shocks transmission coefficient, 109
Riemann problem, 116–119 transmitted wave, 108
traffic flow, 116 transpose, 71
wave equation, 112 transverse, 183, 191
simple pendulum, 13 transverse waves, 169
Simpson’s rule, 218 trapezoidal approximation, 218
sine, 18 triangle wave, 47
angle addition, 23 trigonometric substitution, 134–136
hyperbolic, 25 twin paradox, 143–144
integral, 46
sine integral, 58 underdamped, 34–35
numerical approximation, 219 uniform circular motion, 111, 126
sine series, 44, 48, 101 uniqueness, 167–168
Slinky®, 92
small angle approximation, 96 unit tangent vector, 190
soliton, 189 unstable equilibrium, 6, 9
sonic boom, 112
spacetime coordinates, 143 variance, 194
speed of light (c), 182 variation of parameters, 29, 30
spherical basis vectors, 246 vector, 66–69, 146–148
spherical Bessel function, 19, 171 addition, 66
spherical coordinates, 130, 166, 245–248 area element, 155
area element, 251 area for a sphere, 156
basis vectors, 246 basis, 68, 79
curl, 247 canonical basis, 67
divergence, 247 column, 71
gradient, 246 complete set, 67
Laplacian, 247 conjugate transpose, 78
line element, 249 cross product, 147
vector (cont.) electromagnetic, 181–182
curl, 150–152, 241, 245, 247 energy density, 95
curl theorem, 160–162 Euler equations, 185–187
dimension, 66 extensible string, 190–192
divergence, 150–152, 241, 245, 247 finite propagation speed, 110
divergence theorem, 156–159 Fourier transform, 102–103
dot product, 67, 69, 71, 146 general solution, 97–98, 184
eigenvectors, 75–81 inextensible string, 95–96
gradient, 148–149, 238, 240, from mass conservation, 113–114
244–246 method of characteristics, 97–98
length, 67, 147 plane waves, 107–109
matrix multiplication, 70–71 Schrödinger’s equation, 193, 198–199
normalized, 67 separation of variables, 98–101, 171
orthogonal, 67 series solution, 101–102
path, 153, 207 shallow water, 187–189
projection, 68 shocks, 112
row, 71 spherically symmetric, 170–171
scalar multiplication, 66 standing waves, 104–106
tangent, 153 superposition, 94–95, 107
Taylor expansion, 157 three-dimensional, 168–171
transpose, 71 varying speed, 113–114
unit normal, 155 vector, 170
wave equation, 170 wave number, 169
Verlet method, 212 wave vector, 169, 183
volume element, 155, 250 wavelength
cylindrical coordinates, 250 plane waves, 107, 169
spherical coordinates, 250 standing waves, 104
volume integral, 155 weak solution, 117
work, 153
wave equation closed curve, 154
from balls and springs, 91–92
changing medium, 108 Young’s modulus, 92
continuity, 109 Yukawa potential, 12