A Student’s Guide to Special Relativity
This compact yet informative Guide presents an accessible route through Special
Relativity, taking a modern axiomatic and geometrical approach. It begins by
explaining key concepts and introducing Einstein’s postulates. The consequences of
the postulates – length contraction and time dilation – are unravelled qualitatively and
then quantitatively. These strands are then tied together using the mathematical
framework of the Lorentz transformation, before applying these ideas to kinematics
and dynamics. This volume demonstrates the essential simplicity of the core ideas of
Special Relativity, while acknowledging the challenges of developing new intuitions
and dealing with the apparent paradoxes that arise. A valuable supplementary resource
for intermediate undergraduates, as well as independent learners with some technical
background, the Guide includes numerous exercises with hints and notes provided
online. It lays the foundations for further study in General Relativity, which is
introduced briefly in an appendix.
NORMAN GRAY
University of Glasgow
University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre,
New Delhi – 110025, India
103 Penang Road, #05–06/07, Visioncrest Commercial, Singapore 238467
www.cambridge.org
Information on this title: www.cambridge.org/highereducation/isbn/9781108834094
DOI: 10.1017/9781108999588
© Norman Gray 2022
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 2022
A catalogue record for this publication is available from the British Library.
ISBN 978-1-108-83409-4 Hardback
ISBN 978-1-108-99563-4 Paperback
Additional resources for this publication at cambridge.org/gray-sgsr
Cambridge University Press has no responsibility for the persistence or accuracy of
URLs for external or third-party internet websites referred to in this publication
and does not guarantee that any content on such websites is, or will remain,
accurate or appropriate.
I do not know what I may appear to the world, but to
myself I seem to have been only like a boy playing on the
sea-shore, and diverting myself in now and then finding a
smoother pebble or a prettier shell than ordinary, whilst
the great ocean of truth lay all undiscovered before me.
Isaac Newton, as quoted in Brewster, Memoirs of
the Life, Writings, and Discoveries of Sir Isaac Newton,
Vol. 2, p. 407.
Preface
Acknowledgements
Aims
1 Introduction
1.1 The Basic Ideas
1.2 Events
1.3 Inertial Reference Frames
1.4 Simultaneity: Measuring Times
1.5 Simultaneity: Measuring Lengths
1.6 The Clock Hypothesis
1.7 Standard Configuration
1.8 Further Reading
Exercises
2 The Axioms
2.1 The First Postulate: the Principle of Relativity
2.2 The Second Postulate: the Constancy of the Speed of Light
Exercises
3 Length Contraction and Time Dilation
3.1 Simultaneity
3.2 Length Contraction and Time Dilation, Qualitatively
3.3 The Light Clock
3.4 The Horizontal Light Clock: Length Contraction
3.5 Is There Anything I Can Hold on To?
Exercises
4 Spacetime and Geometry
4.1 Natural Units
4.2 The Minkowski Diagram
References
Index
Preface
the geometrical structure of our universe, and also see that structure as
something with concrete physical consequences.
I developed the material here over a number of years, in the course of
teaching an early-undergraduate course in SR at the University of Glasgow.
It has therefore been read, puzzled over, commented on, and corrected by
multiple cohorts of students, and the distribution of exercises in each chapter
reflects both their questions and difficulties, and the inevitable requirement
that I formally examine their understanding. That is, this book is not an
armchair exercise in finding ‘interesting topics in SR’, but an attempt to
bring students to the understanding of SR that they will need in their future
work, and which will expand their intellectual horizons. And of course
those lectures were not just a discussion group, but a course with an exam at
the end of it; the presence of that exam influenced the selection and shape
of material in a way which may also be of use to you.
The text here is therefore suitable to support a course: I covered the
content, without appendices, in ten fairly busy lectures (and did not declare
all of it examinable). But in converting the text to a book I have also had
in mind both the independent reader working their way into the subject,
and the student of another course who wants light shone on the landscape
from a different direction. I resisted adding new core material in this process,
but have expanded explanations here and there, and added various
supplementary comments, either historical, mostly in discursive footnotes,
or technical, in ‘dangerous-bend’ sections.
Although I believe my students’ experience validates this text as one good
route to SR, I insist that it is not the only route, nor even the only good
one. Amongst the unusual things about relativity is that there exists no
single royal road to an understanding of it, nor even a single obvious way of
mapping the territory. Relativity always makes more sense the second time
you read about it (and makes still more sense the first time you explain it to
someone else), and so it is always useful to expect to read more than one
introduction; to that end, I point to a number of other books in Chapter 1.
My goal for this book is not to disperse, but to join, this swarm of alternatives.
Throughout, I have freely referred you to other textbooks, to review articles,
and to original research articles. It is not necessary to follow each of these
outward-pointing links to get an understanding of the subject, but I hope
they indicate the richness of the subject’s connections with the rest of physics,
and with the history of physics and astronomy.
It’s easy for me to talk about the simplicity and minimalism of SR: you
may find the claim a little surprising, if you look ahead and see a lot of solid
text, including a dense undergrowth of primes and subscripts. A lot of this
Each chapter starts with a few high-level aims. These are collected on
page xvi, and I think of them as the ‘intellectual table of contents’ of the
book.
The exercises at the end of each chapter are, I think, important in solidifying
your understanding of SR. A number of them are annotated with 𝑑+, 𝑑−
or 𝑢+ to indicate ones which are more or less difficult, or more particularly
useful, than those around them.
I have included an appendix on the link to General Relativity and gravitation.
Although it is of course auxiliary to a book on Special Relativity, I
believe it is useful to show that the gap between the two areas is narrower
than it may at first appear, and that we can build a bridge which lands a
small but significant way into the new territory. The approach I have taken
to SR is designed to give this bridge as firm a foundation as possible, on the
SR side.
Similarly, the appendix on relativity’s contact with experiment is auxiliary
but, I hope, both useful and interesting. It is not concerned with the question
‘is relativity right?’ – the answer to which it takes as obviously ‘yes’ – but
instead steps back and discusses the nature of the relationship between
relativity, both special and general, and experimental corroboration, and
indeed has something to say about the relationship between corroboration,
science in general, and scientists as a community.
Throughout the book, I have included a number of historical asides. Special
Relativity does not have an unusually intricate history, but these asides
are present partly because they add an extra dimension to our knowledge
of the topic, but also because, by hinting at an alternative intellectual path
not followed, they can colour in our understanding of the ideas as they have
developed in fact. This also seems a good point at which to mention my
deliberate habit of styling scientists’ names, when they become adjectives,
in lowercase: thus ‘Newton’s laws’, but ‘newtonian physics’. This is partly
because, by the time a topic acquires an adjective like this, it has absorbed
the work of a multitude of people beyond any original creator (see also
‘Stigler’s law’), and evading the ownership question with a lowercase letter
seems both fairer and less cumbersome than a hyphenated list of all an
idea’s retrospectively discovered co-discoverers. Even Newton would have
to go back to school if presented with a book on contemporary classical
mechanics.
These notes have benefitted from very thoughtful comments, criticism and
error-checking, received from both colleagues and students, over the years
this course has been presented, for which I am very grateful. In particular I
would like to thank Richard Barrett, Andrew Conway, and Susan Stuart for
detailed critical comments when turning the notes into a book.
Thank you also to Cleon Teunissen for help with the history of the term
‘Invariantz-Theorie’ (p. 98, note 9); to the contributors to Wikimedia Commons
for the image in Figure 1.3 (and so many resources elsewhere); and to
the Historical Naval Ships Association for the scan of Figure 1.4. And thank
you, finally, to the editorial staff at CUP, for their encouragement, precision,
and patience.
Aims
1
Introduction
1.1. understand the importance of events within Special Relativity, and the
distinction between events and their coordinates in a particular frame;
and
1.2. appreciate why we have to define very carefully the process of measuring
distances and times, and how we go about this.
1. All inertial reference frames are equivalent for the performance of all
physical experiments;
2. The speed of light has the same constant value when measured in any
inertial frame.
understanding ideas we thought were already clear, and try to think precisely
about processes we thought were intuitive: what do we mean when we talk
about ‘the length of a stick’?
Our first step is to understand the axioms, and we’ll start on that in the
next chapter. What does it mean to talk about a ‘transformation between
frames’, and why should the speed of light have such a significant place in
this story?
Chapters 3 and 4 are about the reasonably direct consequences of the
axioms. It’s in this pair of chapters that we discover the most surprising
features of SR – length contraction and time dilation – first qualitatively then
quantitatively. This pair of chapters is where the most profound conceptual
challenges are.
The various strands here are tied together by the main calculational tool
of SR, the ‘Lorentz transformation’, which we derive and study in Chapter 5.
This chapter is quite a long one, and detailed, so that it would be fairly easy to
get lost in it. However it is really only Section 5.1 that has the new material,
and the rest of the chapter is, again, an exploration of the consequences.
Those consequences are often surprising, and will frequently, I think, cause
you to re-read and re-think Chapters 3 or 4.
In Chapters 6 and 7 we look outwards to the rest of physics, and learn
how the principles of SR oblige us to recast familiar ideas such as velocity,
acceleration, frequency, momentum, mass, and energy.
Finally, in Appendix A, we look at how we can apply these ideas more
generally, and discover Einstein’s theory of gravity: General Relativity (GR).
We can’t dive too far beneath the surface here, without more advanced
mathematics, but our understanding of SR will allow us to at least see the
connection to the main structures of GR, and how they link to gravity.
Prior to all that, however, there are a couple of bits of terminology that
it’s useful to clarify in this first chapter, to avoid breaking the flow of the
argument in Chapter 2. In particular:
I expand on each of these points below. The following sections are rather
short (and possibly a little dry), but we will use the ideas in them again and
again and again.
It’s probably a good idea to re-read these sections repeatedly as you work
through the rest of the text. It possibly follows that these remarks might
not make perfect sense first time; they may even at first seem absurdly over-
precise, since they are making distinctions which may appear unnecessary
until you have understood some of the rest of the material.
1.2 Events
An event in SR is something that happens at a particular place, at a particular
instant of time. The standard examples of events are a flashbulb going off,
or a banger or firecracker exploding, or two things colliding.
There is nothing relative about an event: if two cars crash and metal
is bent, there is no ‘point of view’ from which the crash did not happen.
Although this may seem at this point to be too obvious to be worth stating,
we will discover that more things than we may expect are relative to our
‘point of view’, and we will use events as our way of navigating through the
puzzles this produces.
We will soon discover that, although we can all agree that a particular
event did happen, we might well have different answers to the questions
‘where?’ and ‘when?’ These are questions that we can answer using a
reference frame. [Exercise 1.1]
1 We’re going to hear an awful lot more about this train. Although I will occasionally vary the
examples by talking about rockets or boats, a train going past a station platform presents
such an immediate picture of two reference frames, in constant relative motion, that it will
be hard to avoid. I see no reason for wanton innovation with respect to this particular aspect
of the subject.
1.3 Inertial Reference Frames
is an inertial frame, the spinning roundabout is not. In both cases, you can
tell whether you’re the one in the non-inertial frame: in the first case you
feel yourself pushed back into the train seat, and in the second case, it’s only
your grip on the roundabout that stops you flying off, pulled towards the
outside by centrifugal ‘force’.
Acceleration and force are intimately connected with the notion of inertial
frames – an inertial frame is one which isn’t accelerated in any way. From
that, you would be correct to conclude that once the train has stopped
accelerating, and is speeding smoothly on its way, it becomes an inertial
frame again; if you closed your eyes, you wouldn’t be able to tell if you were
on a moving train or at rest in the station. Anything you can do whilst
standing on a station platform (such as juggling, perhaps), you can also do
whilst racing through that station on a train, even though, to the person
watching the performance from the platform, the balls you’re juggling with
are moving at a hundred kilometres an hour, or so.
What we have concluded here is that, although different observers may
reasonably ascribe different coordinates to events, and different speeds,
there is no ambiguity about who is accelerating or not. If you are on a train
picking up speed as it leaves a station, you can feel the pressure of the seat
on your back, and be under no illusion that you are not moving, and there is
no point of view from which the drink on the table in front of you does not
look likely to spill. We will have more to say about this point in Section 2.1.
Newton’s second law is more quantitative, since it relates the amount
of force applied to an object, the amount it is accelerated, and the body’s
inertial mass, through the well-known relation 𝐹 = 𝑚𝑎.
We can therefore define an ‘inertial frame’ as follows:
Definition of Inertial Frames: An inertial reference frame is a
reference frame, with respect to which Newton’s first law holds.
All this being said: don’t over-think this. An inertial frame is one that isn’t
accelerating.
Note that, in the context of SR, inertial frames are infinite in extent; also,
since all inertial frames move with constant velocity, it follows that no pair
of such frames mutually accelerate. [Exercise 1.2]
both masses are gravitational masses. It turns out, however, that whatever
the composition or construction of an object, its gravitational and inertial
masses, though logically completely distinct, are always measured to be
equal. This fact is more surprising than it may at first appear; we will
examine some of its consequences in Appendix A.
If, however, the event happens some distance away (answering a question
such as ‘what time did the train pass the next signal box?’), or if we want to
know what time was measured by someone in a moving frame (answering,
for example, ‘what was the time on the train-driver’s watch as the train
passed the signal box?’), things are not so simple, as most of the rest of this
text makes clear. Special Relativity is very clear about what we mean by
‘the time of an event’: when we talk about the time of an event, we always
mean the time of the event as measured on a clock carried by a local observer,
that is, an observer at the same spatial position as the event (which is rather
unfortunate if the event in question is an explosion of some type – but what
are friends for?), who is stationary with respect to the frame they represent.
We will typically imagine more than one observer at an event; indeed we
imagine one local observer per frame of interest, stationary in that frame,
and responsible for reporting the space and time coordinates of the event as
Figure 1.1 Our observers, equipped with their clocks and surveyors’ wheels.
2 Note that it makes no sense to talk of being ‘stationary with respect to an event’ or to talk of
‘the rest frame of an event’: since an event is instantaneous, it cannot be said to be moving in
any frame. This also means that there is no observer who is ‘special’ with respect to an event.
1.5 Simultaneity: Measuring Lengths
Figure 1.2 Measuring the length of a train, using a ruler painted on the edge of
the station platform. The observers are standing on the platform. You should
imagine observers all along the platform; I’ve shown only the two who happen
to be next to the ends of the train at the observation time.
tance: as well as being stationary in the frame, we presume that they have
arranged things so that the clocks they carry are, and remain, mutually
synchronised. I have a little more to say about this in Section 2.2.1, but I
mention it here only to reassure you that this is not on the list of unexpectedly
complicated things.
Although an event might have a number of adjacent observers, in different
frames, this doesn’t imply that all observers report the same time. The
observers’ watches may differ for trivial reasons – perhaps their watches
are set to different time zones, or designed to run at different rates. Or, less
trivially, they may be running at different rates for relativistic reasons that
we’ll come to later. We assume that, despite these complications, all of the
clocks tick out a time – produce a number – which is linearly related to the
passage of time. [Exercise 1.3]
A further way of putting this, which will mean more to you after you
have looked at Chapter 5, is to say that proper time is the measure of
the length of a worldline, and that this length is a property of the worldline, as
3 In a famous experiment, Bailey et al. (1977) measured the decay rates (which mark the
passage of time, and thus count as a clock for this purpose) of muons in a storage ring at
CERN; because the particles were moving in a circle, they were subject to very high
centripetal accelerations. The muons decayed at the same rate as they would if they were
moving at the same speed in a straight line.
Figure 1.4 A ‘taffrail log’: the propellor is towed through the water, and the
number of turns is recorded on the dials (from Plate 6 of S. B. Luce, Textbook
of Seamanship (1891); image courtesy Historical Naval Ships Association).
a path through spacetime, rather than being a property of any physical system
within spacetime, such as a clock. In these terms, the clock hypothesis is the idea
that a ‘good clock’ is a clock which, though it is a physical thing within spacetime,
faithfully records this worldline length.
1. they are aligned so that the (𝑥, 𝑦, 𝑧) and (𝑥 ′ , 𝑦 ′ , 𝑧′ ) axes are parallel;
2. the frame 𝑆 ′ is moving along the 𝑥-axis with velocity 𝑣; and
3. we set the zero of the time coordinates so that the origins coincide at
𝑡 = 𝑡 ′ = 0; this means that the origin of the 𝑆 ′ frame (which of course
remains at position 𝑥 ′ = 0 by definition) is always at position 𝑥𝑆′ = 𝑣𝑡 in
frame 𝑆.
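[Figure 1.5: frames 𝑆 and 𝑆′ in standard configuration, shown at 𝑡 = 0 and at 𝑡 > 0, with axes 𝑥, 𝑦 and 𝑥′, 𝑦′; the origin of 𝑆′ is at 𝑥𝑆′ = 𝑣𝑡 in frame 𝑆.]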
In Figure 1.5, both of the observers are at positions which can be given
coordinates in both frames, (𝑡, 𝑥, 𝑦, 𝑧) and (𝑡′, 𝑥′, 𝑦′, 𝑧′). The left-hand observer
in each figure is at rest in frame 𝑆, meaning that they stay stationary at
position (𝑥, 𝑦, 𝑧), but of course move through time 𝑡, as shown on their clock;
their position in terms of coordinates (𝑥′ , 𝑦 ′ , 𝑧′ ) changes, with d𝑥′∕d𝑡′ = −𝑣.
The other observer, at rest in frame 𝑆 ′ , stays at coordinates (𝑥′ , 𝑦 ′ , 𝑧′ ), but
with changing 𝑥-coordinate: d𝑥∕d𝑡 = 𝑣.
When we refer to ‘frame 𝑆’ and ‘frame 𝑆′’, we will interchangeably be referring
either to the frames themselves, or to the sets of coordinates (𝑡, 𝑥, 𝑦, 𝑧)
or (𝑡′, 𝑥′, 𝑦′, 𝑧′). It’s also worth mentioning in passing that, in standard configuration,
we can presume that the corresponding coordinates in the two
frames, such as 𝑡 and 𝑡′ , or 𝑥 and 𝑥′ , will always be in the same units,
whether these be nanoseconds or parsecs.
In standard configuration, the motion of the frames is always along the
𝑥- and 𝑥′-axes, and in the examples we use we can always choose the
axes so that this is so. This is not a fundamental restriction, and we can relax it at
the cost of mere algebra. There is a little more to say about this in Section 5.2.1.
There are also some books which might provide some historical insight
and context. It’s worthwhile taking at least a look at Einstein’s famous 1905
paper introducing the ideas which were later termed ‘Special Relativity’
(Einstein 1905). The first few sections are worth reading for their clarity and
directness (and I’m going to quote from this paper in the next chapter). That
said, this paper is of historical interest rather than being a great introduction
because, amongst other things, the notation is somewhat different from
what we would use today. It is also unusual, amongst foundational papers
in physics, for being at least intelligible; most analogous papers are for
historical specialists. Einstein’s own popular account of relativity (Einstein
1920) is very readable, though it’s naturally a little old-fashioned in places.
You can see the influence of this book, and its examples, in many later SR
textbooks. The Principle of Relativity (Lorentz et al. 1952) is a collection
of (translations of) original papers on the Special and General theories,
including Einstein’s 1905 paper, but also some earlier papers by Lorentz
Exercises
Exercise 1.1 (§1.2)
Which of the following are events?
1. A supernova explosion.
2. A concert.
3. The whole country clapping hands at once.
4. A collision between two particles in the LHC. [ 𝑑− ]
2.1 The First Postulate: the Principle of Relativity
feet together, you pass equal spaces in every direction. When you have
observed all these things carefully (though there is no doubt that when the
ship is standing still everything must happen in this way), have the ship
proceed with any speed you like, so long as the motion is uniform and not
fluctuating this way and that. You will discover not the least change in all
the effects named, nor could you tell from any of them whether the ship
was moving or standing still. In jumping, you will pass on the floor the
same spaces as before, nor will you make larger jumps toward the stern
than toward the prow even though the ship is moving quite rapidly, despite
the fact that during the time that you are in the air the floor under you will
be going in a direction opposite to your jump. In throwing something to
your companion, you will need no more force to get it to him whether he
is in the direction of the bow or the stern, with yourself situated opposite.
The droplets will fall as before into the vessel beneath without dropping
toward the stern, although while the drops are in the air the ship runs
many spans. The fish in their water will swim toward the front of their
bowl with no more effort than toward the back, and will go with equal
ease to bait placed anywhere around the edges of the bowl. Finally the
butterflies and flies will continue their flights indifferently toward every
side, nor will it ever happen that they are concentrated toward the stern,
as if tired out from keeping up with the course of the ship, from which
they will have been separated during long intervals by keeping themselves
in the air.
Galileo Galilei (1632), quoted in Taylor & Wheeler (1992, §3.1)
This is a very vivid account of the Relativity Principle (RP), which I shall
state more precisely at the end of this section. It’s also an illustration of the
idea of an inertial frame, which we discussed in Section 1.3. Another way
of phrasing the principle is that ‘you can’t tell if you’re moving’ – there’s
no experiment you can do which would allow you to distinguish between a
moving and a stationary frame.
𝑥′ = 𝑥 − 𝑉𝑡
numbers might change from frame to frame – if you walk along a moving
train, you are moving faster with respect to the nearby platform than you
are moving with respect to the other passengers – but the physics doesn’t
change. If you can juggle on the platform, you can do so on the (smoothly)
moving train as well. If you throw a ball whilst on a moving train, the usual
constant-acceleration equations tell us that it will follow a parabola, with
certain parameters of maximum height, angle, and so on; someone watching
this ball from a station platform also sees a parabola: the ball is moving at a
different speed and a different angle, and it moves a greater distance – it is
a different parabola – but it remains a parabola nonetheless. The physics
has led us to solutions for the path of the ball which are different in the two
cases, but only to the extent of them being merely two variants of the same
shape.
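As a rough numerical sketch of this, we can throw a ball on a train and describe it from both frames. The launch speeds, the train speed of 28 m/s (roughly a hundred kilometres an hour), and all the function names below are illustrative assumptions rather than anything taken from the text; the platform coordinates are obtained from the train coordinates by the galilean shift 𝑥 = 𝑥′ + 𝑉𝑡.

```python
# Illustrative sketch: a ball thrown on a train, seen from train and platform.
# All numbers and names here are assumptions chosen for the example.

G = 9.81  # gravitational acceleration, m/s^2


def flight_time(uy):
    """Time for the ball to return to its launch height."""
    return 2.0 * uy / G


def position_train(ux, uy, t):
    """Ball position (x', y) in the train frame at time t."""
    return ux * t, uy * t - 0.5 * G * t**2


def position_platform(ux, uy, V, t):
    """Ball position (x, y) in the platform frame: galilean shift x = x' + V t."""
    x_train, y = position_train(ux, uy, t)
    return x_train + V * t, y


def parabola_height(x, vx, vy):
    """Height of the parabola y(x) for launch velocity components (vx, vy)."""
    return (vy / vx) * x - 0.5 * G * (x / vx) ** 2


def summarise(ux, uy, V):
    """Return (range, maximum height) in the train frame and the platform frame."""
    T = flight_time(uy)
    range_train, _ = position_train(ux, uy, T)
    range_platform, _ = position_platform(ux, uy, V, T)
    _, h_train = position_train(ux, uy, T / 2)
    _, h_platform = position_platform(ux, uy, V, T / 2)
    return (range_train, h_train), (range_platform, h_platform)


train, platform = summarise(ux=3.0, uy=4.0, V=28.0)
print("train frame:    range %.2f m, max height %.2f m" % train)
print("platform frame: range %.2f m, max height %.2f m" % platform)
```

The two frames agree on the maximum height and disagree on the horizontal range, and in each frame the points lie on a curve of the form given by `parabola_height`, with horizontal launch speed 𝑢ₓ on the train and 𝑢ₓ + 𝑉 on the platform.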
How do we phrase this last paragraph in equation form? Consider Newton’s
second law, in the alternative form 𝐹 = d𝑝∕d𝑡 – that is, we conceive of
force as a thing which changes the momentum of an object it acts on, which
only secondarily results in acceleration. Specifically, consider motion under
gravity, so that 𝐹 = 𝑚𝑔. Writing 𝑝 = 𝑚𝑣 = 𝑚d𝑥∕d𝑡, and 𝑝′ = 𝑚𝑣 ′ in the
primed frame,
\begin{align*}
p' = m v' &= m \frac{\mathrm{d}x'}{\mathrm{d}t'} \\
          &= m \frac{\mathrm{d}}{\mathrm{d}t}(x - Vt) && \text{using Eq. (2.1a)} \\
          &= p - mV
\end{align*}
(that is, we are differentiating the 𝑥 ′ -coordinate with respect to the time-
coordinate in that frame, and not merely sticking primes on the expression).
Thus
𝑚𝑔 = d𝑝∕d𝑡 ⇔ 𝑚𝑔 = d𝑝′∕d𝑡′ .
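The step from 𝑝′ = 𝑝 − 𝑚𝑉 to this equivalence can be spelled out as follows; this is a sketch which assumes that 𝑡′ = 𝑡, as in the galilean transformation, and that both 𝑚 and 𝑉 are constant:

```latex
\[
  \frac{\mathrm{d}p'}{\mathrm{d}t'}
    = \frac{\mathrm{d}}{\mathrm{d}t}\,(p - mV)
    = \frac{\mathrm{d}p}{\mathrm{d}t} - m\,\frac{\mathrm{d}V}{\mathrm{d}t}
    = \frac{\mathrm{d}p}{\mathrm{d}t} .
\]
```

The momenta in the two frames differ by the constant 𝑚𝑉, but their rates of change are the same, so the force law has the same form in both frames.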
\[ x = v_0 t + \tfrac{1}{2} a t^2 . \tag{2.2} \]
\[ x' = v_0' t' + \tfrac{1}{2} a' t'^2 . \tag{2.3} \]
That is, we find exactly the same relation, as if we had simply put primes
on each of the quantities in Eq. (2.2). This is known as ‘form invariance’,
or sometimes covariance, and indicates that the expression Eq. (2.3) has
exactly the same form as Eq. (2.2), with the only difference being that we
have different numerical values for the coefficients and coordinates (in
general, though, 𝑎′ = 𝑎 and 𝑡′ = 𝑡 according to the GT). Barton (1999, §2.3.3)
discusses this usefully; see also Exercise 2.4.
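The form invariance can be checked directly. As a sketch, take the galilean transformation in the form 𝑥′ = 𝑥 − 𝑉𝑡 with 𝑡′ = 𝑡, and substitute Eq. (2.2):

```latex
\begin{align*}
  x' &= x - Vt \\
     &= v_0 t + \tfrac{1}{2} a t^2 - Vt \\
     &= (v_0 - V)\, t' + \tfrac{1}{2} a\, t'^2 ,
\end{align*}
```

which is Eq. (2.3) with 𝑣0′ = 𝑣0 − 𝑉 and 𝑎′ = 𝑎: the same form, with a different numerical value for the initial velocity.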
Going further, if all frames are equivalent, in the sense of the RP, then
there is no frame that is special, and in particular this means that we cannot
identify any frame corresponding to a state of absolute rest. But that in turn
means that the very idea of such a frame is redundant.
We can generalise this, and say that the RP, in classical mechanics, demands
that all laws of mechanics, and by obvious extension all other physical
laws, be covariant under the galilean transformation. Though you probably
wouldn’t naturally phrase things like this, this is entirely in accordance with
your (and my) physical intuition, and it seems amply corroborated by the
majority of our experience. Writing down Eq. (2.1) seems little more than
an exercise in notation.
Note carefully that, although we are talking in this section about ‘the
Principle of Relativity’, we are not yet talking about ‘Special Relativity’. The
RP is consistent with both newtonian physics and SR: it is the second axiom,
in Section 2.2 below, which distinguishes between them (I have more to say
about this in Section 5.8.1, which is a ‘dangerous bend’ section).
This first axiom is consistent with our intuition, with our mathematical
tastes, and also, it seems, with experiment.
Everything, therefore, seems to be rosy. [Exercises 2.1–2.3]
2.1.1 Electromagnetism
Everything, in fact, was rosy, until the end of the nineteenth century. Around
then, physicists were investigating Maxwell’s equations, one of the highpoints
of nineteenth-century physics, which unified all of the phenomena of electricity
and magnetism into a single formalism of tremendously insightful
power and overwhelmingly successful application. For concreteness, let’s
This is fairly easy to show for the wave equation, slightly more involved
for Maxwell’s equations. We have more to say about this in Section 5.8.2.
More advanced textbooks on electromagnetic theory also tend to have sections
on SR, which make this point more or less emphatically.
It appeared that Maxwell’s equations had their simplest form – that is,
Eq. (2.4) – only in a frame which was not moving. The fact that the equations
of electromagnetism are not invariant under the GT appeared to indicate
that, whenever you watched an electromagnetic experiment (such as an
ammeter, or a microwave oven) in a moving frame, it should work differently
from that same experiment in a stationary frame. Specifically, it suggests
that there actually exists such a unique absolutely stationary frame, which
is otherwise rendered unnecessary by the RP. [Exercise 2.4]
1 If you haven’t encountered Maxwell’s equations yet, or if this notation is unfamiliar to you,
don’t worry – the point here is to show that, excepting some notational complexities (OK,
quite a few complexities), they are as elegant and simple as Newton’s laws. The ∇ symbol on
the left hand side is a differential operator, so that this is a set of differential equations
relating spatial gradients in the electric and magnetic fields to distributions of charge and
current, and temporal changes in the magnetic and electric fields.
The puzzle that Einstein is drawing attention to is that, although there are
two significantly different explanations of what is happening, when either
the magnet or the conductor is in motion, the observable current is identical.
The thing that is special about Einstein’s approach here is that he sees this,
not as a curiosity, but as a massive problem: why is there this unexplained
symmetry? What is it telling us?
Another, linked, problem was that of the aether. Since light is an electromagnetic
wave, it seems obvious that, like water waves or sound waves,
there must be something that light waves propagate in. This ‘light medium’
was named the aether, and had the apparently contradictory properties of
being both very rigid (so that it could sustain the very high frequencies of
light) and very tenuous (so that objects such as planets could move through
it freely). The aether is an obvious candidate for the frame of absolute rest.
The Earth moves around the Sun in its orbit, with a constantly changing
velocity. It followed, therefore, that there was some point in its orbit at
which it had a maximum, and another point at which it had its minimum,
speed with respect to the putative aether. Although this speed is rather slow
compared to the speed of light, it should have been possible to measure
the change in the velocity of the Earth with respect to the aether or the
absolute rest frame. There was therefore a series of experiments in the late
nineteenth and early twentieth centuries which attempted to measure this
phenomenon: the Michelson–Morley aether-wind experiment attempted to
measure the different light-travel times for beams directed along and across
the flow of the aether; the Fizeau experiment and Lodge’s experiments
attempted to detect the extent to which the aether could be dragged along by
fast-moving objects on Earth. All of them failed: no-one was able to detect
the Earth’s movement through the aether, or the movement relative to the
absolute rest frame, which the galilean transformation and the apparently
necessary properties of an apparently necessary aether demanded.
2 For more about Lorentz, FitzGerald, and this transformation, see Section 5.8.2.
community were then left with the problem of explaining why electromag-
netism was apparently uniquely subject to a different transformation law
from everything else. The Lorentz transformation appeared to indicate that
objects would change their lengths, and time be distorted, when moving
head-on into the aether, without there being any clear physical mechanism
for this. It would have been clear that something was very wrong.3
This is a fascinating episode in the history of science, but the resolution
was that (i) Maxwell’s equations are right, (ii) the Relativity Principle is
right, (iii) the galilean transformation is only approximate, and (iv) Special
Relativity is the new physics to come out of this.
By saying that the GT is ‘approximate’ I mean that, although the transfor-
mation clearly works in our normal experience, we can find circumstances
(namely when we are moving quickly, at ‘relativistic speed’) where it pro-
duces wrong predictions.
Einstein explains this as clearly as anyone. In his 1905 paper which
introduced SR (‘On the Electrodynamics of Moving Bodies’), he opens with
the paragraph above commenting that only relative motion is important in
Maxwell’s equations, and then goes on to say, with magisterial finality:
Examples of this sort, together with the unsuccessful attempts to discover
any motion of the Earth relatively to the ‘light medium’, suggest that
the phenomena of electrodynamics as well as of mechanics possess no
properties corresponding to the idea of absolute rest. They suggest rather
that, as has already been shown to the first order of small quantities, the
same laws of electrodynamics and optics will be valid for all frames of
reference for which the equations of mechanics hold good. We will raise
this conjecture (the purport of which will hereafter be called the ‘Principle
of Relativity’) to the status of a postulate. . . (Einstein 1905)
3 There are some interesting historical details in Einstein’s ‘Autobiographical Notes’ (1991)
and Pais’s scientific biography of Einstein (2005, ch. 6). The apparent non-covariance of
Maxwell’s equations, and the problems with the aether theory, were well-recognised
problems by the end of the nineteenth century, and the historically interesting thing here is
the extent to which Einstein was aware of the prior work but didn’t feel he needed to build
on it, since his motivations for the 1905 paper were largely philosophical. Lorentz’s
discussion of the Michelson–Morley experiment describes how one might reconcile it with
the aether theory, by assuming inter-molecular forces are modified by the aether in a
particular way ‘though to be sure, there is no reason for doing so’ (Lorentz 1895, §4); and
Bell (2004, ch. 9) has described a potential way of teaching relativity, by considering the
electrostatic field of a point charge, which would now be regarded as extremely eccentric,
but which would have made a lot of sense to Einstein’s contemporaries, and which is
illuminating therefore (and see Section 5.8.2 below). It was clear in 1905 that the physics of
mechanics and of electromagnetism were intimately related to one another, so that the fact
they seemed to observe incompatible transformation laws was a major anomaly.
2.2 The Second Postulate: the Constancy of the Speed of Light
all of physics.
We can recast the Principle of Relativity (RP), the first postulate of Special
Relativity, as follows:
Option 2 is ruled out by the first postulate: if this were true then the frame
in which light had this special value would be picked out as special; the RP
also incidentally excludes the notion of the aether.
Option 1 also turns out not to be the case, if the statement here is taken to
mean that light behaves just like a classical projectile. While light is always
emitted at the speed 𝑐, option 1 suggests that it may potentially arrive at a
different speed at a moving detector. This is what we would intuit from the
velocity addition part of the GT, Eq. (2.1b), which says that velocities add in
a straightforward way; this turns out not to be true for light, or indeed any
object moving at a significant fraction of the speed of light.
No, option 3 is the case, so that, no matter what sort of experiment you are
doing, whether you are directly observing the travel-time of a flash of light,
or doing some interferometric experiment, the speed of light relative to your
apparatus will always have the same numerical value. This is perfectly inde-
pendent of how fast you are moving relative to the source: it is independent
of whichever inertial frame you are in, so that another observer, measuring
the same flash of light from their moving laboratory, will measure the speed
of light relative to their detectors to have exactly the same value.
Constancy of 𝑐: There exists a finite constant speed
𝑐 = 299 792 458 m s⁻¹,
such that anything which moves at this speed in one inertial frame is
measured to move at that speed in all other inertial frames.
Barton (1999, §3.1) gives a wonderfully careful expression of this. There is
no real way of justifying this postulate: it is simply a truth of our universe,
and we can do nothing more than simply demonstrate its truth through
experiment.
This experimental corroboration might take the form of a measurement
of the speed of light emitted from an orbiting body, at the phases in its orbit
when it is moving directly towards or away from us. The orbiting body can
be a particle in an accelerator, or a binary star orbiting its companion, but
in either case the measured light speed is determined to be independent of
the speed of the emitter, to impressively high accuracy.
For further discussion of this experimental support, and references to
further reading, see Appendix B, French (1968, chs. 2 and 3), and Barton
(1999, §3.4). [Exercise 2.6]
There are a few further subtleties to this procedure which are not impor-
tant for our purposes; both Rindler (2006, §2.6) and Taylor & Wheeler
(1992, §2.6), for example, give details. One subtlety is that you may notice that
the above procedure assumes that the speed of light is the same in both directions:
although I doubt anyone seriously thinks that the speed of light is different, out
and back, it is surprisingly difficult (indeed, impossible to date) to devise a
procedure which avoids this assumption, or which is capable of measuring the ‘one-way’
speed of light.
Exercises
Exercise 2.1 (§2.1)
Consider a rocket at rest (𝑣0 = 0) at the origin of a frame 𝑆. At time 𝑡 = 0, it
starts to fire its rockets so that it moves along the 𝑥-axis, and at time 𝑡 = 𝑡1 we
find the rocket moving at speed 𝑣 = 𝑣1 . Consider a second frame 𝑆 ′ , moving
at speed 𝑉 along the 𝑥-axis, such that frames 𝑆 and 𝑆 ′ are in standard
configuration (so that 𝑥 = 𝑥 ′ = 0 when 𝑡 = 𝑡 ′ = 0; the rocket of course has
speeds 𝑣₀′ and 𝑣₁′ in frame 𝑆′).
Presume that both the rocket and the primed frame are moving slowly
enough that we can reasonably use the galilean transformation.
Work out the momentum of the rocket at 𝑡 = 0 and 𝑡 = 𝑡1 in the two
frames (that is, work out 𝑝0 = 𝑚𝑣0 , 𝑝1 = 𝑚𝑣1 , 𝑝0′ = 𝑚𝑣0′ and 𝑝1′ = 𝑚𝑣1′ ). Is
momentum frame-invariant?
Work out the change in momentum in the two frames: is this frame-
invariant?
Work out the kinetic energy and the change in kinetic energy in the two
frames: are these frame-invariant?
3.1 Simultaneity
Imagine standing in the centre of a train carriage,1 with suitably agile friends
at either end: Fred (at the Front) and Barbara (at the Back). At a prearranged
time, say time ‘0’ on your carefully synchronised watches, you fire off a
flashbulb and your friends note down the time showing on their watches
when the flash reaches them (Figure 3.1). Since you are standing in the
1 The argument below ultimately originates from Einstein’s popular book about
relativity (Einstein 1920), first published in English in 1920. It is clearly ancestral to the
multiple versions, involving planes, trains, automobiles and rockets, in both popular and
professional books on relativity. The particular variant described here is most immediately
descended from Rindler’s version (2006).
middle of the carriage, Fred’s and Barbara’s times must be the same as each
other. Comparing notes afterwards, you all find that it took some time for
the flash to travel from the middle of the carriage to the end, and that your
friends have noted down the same arrival time on their watches, time ‘3’,
say. In other words, Fred’s watch reading ‘3’, and Barbara’s watch reading ‘3’,
are simultaneous events in the frame of the carriage.
These watches are obviously not calibrated in seconds, but these could
be sensible values if the watches are telling time in the natural relativistic
time unit of light-metres. See Section 4.1 below.
Suppose now that this train is moving through a station as all this goes
on, and you look from the platform into the carriage – what would you see
from this point of view? You would see the light from the flash move both
forwards towards Fred and backwards towards Barbara, but remember that
you would not see the light moving forwards faster than the speed of light –
its speed would not be enhanced by the motion of the train – nor would you
see the light moving backwards at less than 𝑐. Since the back of the train is
rushing towards where the light was emitted, the flash would naturally get
to Barbara first, as illustrated in Figure 3.2. At that point Barbara’s watch
must read ‘3’, since the flash meeting her and her watch reading ‘3’ are
simultaneous at the same point in space at exactly the same time, and so
must be simultaneous for observers in any frame. But at this point, the
light moving towards Fred cannot yet have caught up with him: since the
light reaches Fred when his watch reads ‘3’, his watch must still be reading
something less than that, ‘1’, say. In other words, Barbara’s watch reading ‘3’
and Fred’s watch reading ‘1’ are simultaneous events in your inertial frame
on the platform.
Figure 3.3 Two trains passing each other, with the observers’ locations indicated.
What is going on here? Are these events simultaneous or not? What this
tells us is that our notion of simultaneity is initially rather naïve, and that we
have to be very careful exactly what we mean when we talk of events as being
simultaneous. The only case where two events are quite unambiguously
simultaneous is if they take place at exactly the same point in space.
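The platform-frame argument about when the flash reaches Barbara and Fred can be checked with a little arithmetic. The sketch below relies only on the second postulate: seen from the platform, the flash travels at 𝑐 in both directions while the ends of the carriage move at speed 𝑣. The function name, the carriage length as measured on the platform, and the speed are illustrative assumptions, and times are in natural units (𝑐 = 1).

```python
def arrival_times(length: float, v: float, c: float = 1.0) -> tuple[float, float]:
    """Platform-frame times for a flash from mid-carriage to reach each end.

    `length` is the carriage length as measured on the platform and `v` is the
    carriage speed; both are illustrative inputs, not values from the text.
    """
    if not 0 <= v < c:
        raise ValueError("need 0 <= v < c")
    half = length / 2
    # The back of the carriage moves towards the emission point,
    # so the gap between it and the flash closes at rate c + v.
    t_back = half / (c + v)
    # The front of the carriage recedes from the emission point,
    # so the gap between it and the flash closes at rate c - v.
    t_front = half / (c - v)
    return t_back, t_front


t_back, t_front = arrival_times(length=6.0, v=0.5)
print(f"flash reaches Barbara (back) at t = {t_back:.2f}")
print(f"flash reaches Fred (front) at t = {t_front:.2f}")
```

These are times on the platform observer's clock, not the readings on Fred's and Barbara's watches. The only point is that the two arrivals, simultaneous in the carriage, are separated in time on the platform whenever 𝑣 is not zero.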
location, and thus can make ‘local’ observations of each other’s clocks. In
Figure 3.3, for example, Barbara can at this instant legitimately read the
time on her watch and the time on Yvette’s. She cannot read the time on
Fred’s watch, nor Yvette’s watch at any other time, because those would not
be ‘local’ observations.
Now pause a moment, and take another set of observations, shown in
Figure 3.4 (we’ll come back to this image in a moment). And then take a
final set of observations when the two rear observers are beside each other,
this time getting Figure 3.5.
After this, the various observers calm down, amble together, and compare
notes. Barbara (standing at the back of the top carriage) could remark ‘I
saw the front of the other carriage pass me when my clock was reading “3” ’
of view of the observers in the train, it is the rails which would be moving;
therefore, if there were a perpendicular length contraction, the measured
distance between the rails would become shorter, and at some speed the
train would be derailed with its wheels lying outside the rails. These state-
ments must be either both true or both false; they contradict one another,
so they cannot both be true, and they must therefore both be false; they
are both consequences of the supposition that there exists a perpendicular
length contraction; so that statement must in turn be false, demonstrating
that there can be no such contraction.
If the mirror and the flashbulb are a distance 𝐿 apart, and I, standing by
the light clock in Figure 3.6, time the round trip as Δ𝑡 ′ seconds, then, since
the speed of light is the constant 𝑐,
\[ 2L = c\,\Delta t'. \tag{3.1} \]
Note that Δ𝑡 ′ here is the time interval on my watch, standing alongside and
moving with the clock.
Now observe the light clock, stationary on the train going through the
station, as you watch it from the platform (see Figure 3.7). The clock is in
motion, at a speed 𝑣, so that the flash of light is reflected by the opposite
mirror when it is a little way down the platform, and detected when it is still
further on. Here, one tick is timed as Δ𝑡 seconds, during which time the
clock will have moved a distance 𝑣Δ𝑡 down the platform.
How far has the light travelled? We know the light travelled at a speed 𝑐,
from the second axiom, and we timed its round trip at Δ𝑡 seconds, so the
light beam must have travelled a distance 𝑐Δ𝑡 in the time that the clock itself
travelled a distance 𝑣Δ𝑡. But from the figure,
\[ \left(\frac{c\,\Delta t}{2}\right)^2 = L^2 + \left(\frac{v\,\Delta t}{2}\right)^2. \tag{3.2} \]
Figure 3.7 Light clock: the observer is standing on the station platform, and times the round-trip time as Δ𝑡 on their clock. The diagram labels the light path 𝑐Δ𝑡∕2, the mirror separation 𝐿, and the clock’s displacement 𝑣Δ𝑡∕2.
Now, the important thing about Eq. (3.3) is that it involves Δ𝑡 ′ , the time for
the clock to ‘tick’ as measured by me, standing next to it on the train, and it
involves Δ𝑡, the time as measured by you on the platform, and they are not
the same.
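As a numerical check on this disagreement, the sketch below solves Eq. (3.2) for Δ𝑡 and compares it with Δ𝑡′ from Eq. (3.1). The function names, the mirror separation and the train speed are illustrative assumptions, not values from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact, by definition of the metre)


def tick_on_train(L: float) -> float:
    """Round-trip time for an observer beside the clock, from Eq. (3.1)."""
    return 2 * L / C


def tick_from_platform(L: float, v: float) -> float:
    """Round-trip time seen from the platform, solving Eq. (3.2) for the time.

    (c dt / 2)^2 = L^2 + (v dt / 2)^2   =>   dt = 2 L / sqrt(c^2 - v^2)
    """
    if not 0 <= v < C:
        raise ValueError("need 0 <= v < c")
    return 2 * L / math.sqrt(C**2 - v**2)


L = 1.5       # mirror separation in metres (illustrative value)
v = 0.6 * C   # train speed (illustrative value)
dt_train = tick_on_train(L)
dt_platform = tick_from_platform(L, v)
print(f"tick beside the clock:  {dt_train:.3e} s")
print(f"tick from the platform: {dt_platform:.3e} s")
print(f"ratio: {dt_platform / dt_train:.6f}")
```

For 𝑣 = 0.6𝑐 the ratio comes out as 1.25, which is 1∕√(1 − 𝑣²∕𝑐²): the platform observer times a longer tick than the observer standing beside the clock.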
How can this possibly be? Why is this different from the perfectly reason-
able behaviour of the ball thrown down the carriage, in the non-relativistic
example at the beginning of this section? The difference is that when you
watched the ball from the platform, you saw it move with the speed it was
given plus the speed of the train – in other words, the person on the platform
and the person on the train had a perfectly reasonable disagreement about
the speed of the ball, which resulted in them agreeing on the time the ball
was in flight. In the relativistic example, however, both of them agree on
the speed of the light in the light clock, as the second axiom says they must.
Something has to give, and the result is that the two observers disagree on
how long the light takes for a circuit.
So at least one of the clocks is broken? They’re both in perfect working
order (this is again the ‘clock hypothesis’ which was briefly mentioned in
Section 1.6). They only work properly when they’re stationary? No, the
Figure 3.8 The gamma function, or Lorentz factor, $\gamma = (1 - v^2/c^2)^{-1/2}$, plotted as a function of $v/c$.
Think back to the ‘taffrail log’ of Section 1.6: the clocks carried by the
moving and by the stationary observer have been dragged through different
distances in time.
The factor $\gamma = (1 - v^2/c^2)^{-1/2}$ is known by a couple of different names,
including the ‘Lorentz factor’; you will become very familiar with this expres-
sion. As you can see from Eq. (3.3) or Eq. (3.5), it indicates how significant
the relativistic effects are, for a given velocity 𝑣. At 𝑣 = 0, it has the value
𝛾(0) = 1, showing that there is no time dilation for a stationary frame. As
you can see in Figure 3.8, the factor stays very close to one for much of its
range, and even at nearly 90% of the speed of light ($v/c = \sqrt{3}/2 \approx 0.87$) it
is still only 𝛾 = 2; as 𝑣 increases beyond this, however, the Lorentz factor
grows very rapidly, becoming infinite at 𝑣 = 𝑐. Thus there is no point at
which relativistic effects suddenly switch on – we are always in a relativistic
universe – but they are ignorable at lower speeds. [Exercises 3.3 & 3.4]
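To get a feel for these numbers, the following sketch tabulates the Lorentz factor at a few speeds. The function name `lorentz_factor` and the sample speeds are illustrative choices.

```python
import math


def lorentz_factor(beta: float) -> float:
    """Return gamma = (1 - beta^2)^(-1/2), where beta = v/c."""
    if not 0 <= beta < 1:
        raise ValueError("beta = v/c must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)


for beta in (0.0, 0.1, 0.5, math.sqrt(3) / 2, 0.99, 0.999):
    print(f"v/c = {beta:.3f}  gamma = {lorentz_factor(beta):.3f}")
```

The function rejects 𝑣∕𝑐 = 1 rather than returning infinity, since no frame moves at that speed.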
of this light clock, which lets us derive the length-contraction formula (we
obtain this result again, in a rather more compact way, in Section 5.5.1, and
that version may be a little easier to follow).
Consider a horizontal light clock, such as a train carriage of length 𝐿₀
with a mirror at the front. Take the carriage to be at rest in frame 𝑆′, with
the rear of the carriage at the origin of 𝑆′. The carriage is measured to be
length 𝐿 in the frame of the station; we know from Section 3.2 that the
distance 𝐿 will be less than 𝐿₀, as a result of length contraction, but we do
not know, at this point, just how much shorter it will be. This length, 𝐿₀
(which we could write 𝐿′ if we preferred), is sometimes referred to as the
proper length of the carriage, meaning its length in a frame in which it is
stationary.
Imagine a light flash at 𝑥′ = 0 at time 𝑡′ = 0, which is reflected from the
mirror at the front, which we will call event ①, at times denoted 𝑡₁ and 𝑡₁′ in
the two frames, and detected at the back of the carriage again, in event ②,
at times 𝑡₂ and 𝑡₂′. By considering the round-trip time in the two frames, we
can find expressions for 𝐿 and 𝐿₀ in terms of 𝑣 and 𝑐 (you might want to try
deriving this yourself, before reading on).
In frame 𝑆′, the analysis is simple. The light travels a distance 2𝐿₀ in
time 𝑡₂′, so
\[ 2L_0 = c\,t_2'. \]
Because the original flash and event ② happen at the same location in 𝑆′ –
we are reading the same clock twice – we can use the time-dilation formula,
Eq. (3.3), to relate 𝑡₂ and 𝑡₂′, and discover that 𝑡₂ = 𝛾𝑡₂′ (the time-dilation
formula relates time intervals, not time coordinates, so we are here relating
the intervals between time 𝑡 = 𝑡′ = 0, when the light flashed, and times 𝑡₂
and 𝑡₂′ when the reflection returned).
In frame 𝑆, we examine the light’s travel from the origin to event ①,
and separately its travel between events ① and ②. The light travels at 𝑐, so
𝑥₁ = 𝑐𝑡₁. In doing so, it has travelled the length of the carriage (as measured
in frame 𝑆), plus the distance the carriage has travelled in that time, so that
𝑥₁ = 𝐿 + 𝑣𝑡₁. Thus
\[ t_1 = \frac{L}{c - v}. \]
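The rest of the algebra is not reproduced here, but a numerical sketch shows where the argument leads. It assumes that the return leg takes a time 𝐿∕(𝑐 + 𝑣), by the same reasoning as for 𝑡₁ with the rear of the carriage moving towards the reflected flash, and then asks which station-frame length 𝐿 makes the round trip agree with 𝑡₂ = 𝛾𝑡₂′. The function names and sample values are illustrative, and times are in natural units (𝑐 = 1).

```python
import math


def round_trip_station(L: float, v: float, c: float = 1.0) -> float:
    """Round-trip time t2 in frame S for a carriage of station-frame length L."""
    t_out = L / (c - v)   # flash chases the receding front mirror (t1 in the text)
    t_back = L / (c + v)  # reflected flash meets the approaching rear (assumed, by symmetry)
    return t_out + t_back


def contracted_length(L0: float, v: float, c: float = 1.0) -> float:
    """Find the L for which the station-frame round trip equals gamma * t2'."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    t2_prime = 2.0 * L0 / c   # round trip in the carriage frame
    t2 = gamma * t2_prime     # time dilation applied to the interval
    # round_trip_station is linear in L, so scale the unit-length result.
    return t2 / round_trip_station(1.0, v, c)


L0, v = 10.0, 0.6
L = contracted_length(L0, v)
print(f"L = {L:.3f}, L0/gamma = {L0 * math.sqrt(1 - v**2):.3f}")
```

With these sample values both numbers are 8.000, consistent with the carriage being measured as shorter by a factor of 𝛾 in the station frame.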
– that is, on what they would see happening at their own point in space and
time – rather than on what we intuitively expect them to see, then when we
find ourselves puzzled, saying ‘that couldn’t possibly happen, because. . . ’,
we have someplace to start.2
Exercises
Exercise 3.1 (§3.2)
Which of the following statements are true, referring to the observers at the
front and back of the train carriage, in the discussion of Section 3.2?
1. Fred and Barbara measure the other carriage to get shorter when it’s
moving relative to them.
2. Fred and Barbara measure the speed of light in the other carriage to be
less than 𝑐.
3. By measuring the Doppler shift of a light signal sent from the back of
the carriage to the front, the two observers can determine the carriage’s
velocity to any desired accuracy.
2 In this focus on (local) measurement, we can see the influence of the philosophical
‘positivism’ which influenced Einstein in his development of SR. This is the claim that it is
observations of the external world that give us knowledge, rather than a priori suppositions.
There is some apparent tension with the rather axiomatic approach to SR which Einstein
then develops, but while the approach is rather abstract, the contact between the theory and
the world is via tangible events or measurements, rather than abstractions such as ‘absolute
space’.
But in the dynamic space of the living Rocket, the double integral has a differ-
ent meaning. To integrate here is to operate on a rate of change so that time
falls away: change is stilled. . . . ‘Meters per second’ will integrate to ‘meters.’
The moving vehicle is frozen, in space, to become architecture, and timeless.
It was never launched. It will never fall.
Thomas Pynchon, Gravity’s Rainbow
4 Spacetime and Geometry
are relayed from observers local, in space and time, to those events. Instead,
we have discovered that not only will different observers disagree about the
order in which separated events are observed to occur (Section 3.1), but
(partly as a consequence, and as we saw in Section 3.2), observers in different
frames will not agree about the separation in time, or the separation in space,
between two events. Such separations are things we would think are solidly
established (once we have put aside the trivial complications we have just
discussed); it is disturbing to find that they are not.
The relationship between the different measurements is not random
– the coordinates obtained by one observer are systematically related to
those obtained by another. You are already familiar with this general idea:
the space between objects in your environment is not changed when we
change the units we use to measure it (such as kilometres versus inches), and
there are all sorts of regularities in the network of such distances, such as
Pythagoras’s theorem, or that the internal angles of a triangle add up to 180°.
That is, we are familiar with the idea of geometry. We also learn to become
comfortable with the idea that the rules of geometry in our immediate
environment (Pythagoras, and all that) have to be adjusted when we consider
Earth-sized distances, or distances on the sky: the geometry of the surface
of a sphere has different rules from those that Euclid wrote down. We are
about to learn a further set of geometrical rules, and learn that we can make
sense of the things we encountered in the last chapter by seeing them as the
consequences of a set of geometrical rules that we have not known about
before now.
This chapter takes a first look at these ideas, before I pull them together
in a geometrical approach to the Lorentz transformation (LT, Eq. (5.6)), in
the next chapter. This way of approaching Special Relativity is, I think,
tremendously powerful, and (a separate advantage) creates a natural bridge
to General Relativity, which we will explore a little more in Appendix A.
Before we can properly embark on this, however, we must look more
carefully at the way in which time and space are interrelated.
And before we do that, it is convenient to get our units of measurement
straight.
talking about moving near the speed of light? The fact that the speed of light
is 3 × 10⁸ m s⁻¹ tells us that if we persist in using metres and seconds, we’ll
get a lot of very large or very small numbers, and lose all hope of developing
much in the way of intuition.
If the problem were merely one of large numbers, we could settle on the
gigametre as our usual unit, give it a handy name, and be done with the
question. However we will soon discover that space and time are not as
distinct as might at first appear, and that having different names for the
separations in these directions can obscure this. It’s as if we had decided
to measure distances east–west in kilometres and distances north–south in
inches. Much better is to measure the two in the same units.
One possibility is to use time as a measure of distance. We do this naturally
when we talk of the Earth being about 8 light-minutes from the Sun, or the
nearest star being a little more than 4 light-years away. We can also talk
of the light-second, of 1 s = 3 × 10⁸ m (the Sun has a diameter of 4.6 s), or
the light-nanosecond of 30 cm. In these units, light moves at a speed of one
light-second per second, 𝑐 = 1 s s⁻¹, or one light-year per year, 𝑐 = 1 yr yr⁻¹;
that is, 𝑐 = 1, a unitless number.1
There’s nothing wrong with this in principle, but a more common con-
vention in this context is to instead use space as a measure of time, and use
the light-metre as our time unit, with a light-metre being the time it takes
for light to travel one metre; that is, a little more than 3.3 ns. In these units,
light travels a distance of one metre in a time of one light-metre, so again
𝑐 = 1 m m⁻¹ as a unitless number (we did this, in fact, in Section 3.1). These
are referred to as natural units. Talking of ‘metres of time’ feels initially
unnatural, but it’s not any odder than talking about (light-)years as a dis-
tance, and we’re generally quite comfortable with that. When we need to
distinguish them, we refer to SI units – that is, metres and seconds – as
physical units.
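The conversions between these conventions are simple multiplications or divisions by 𝑐. A minimal sketch follows; the function names are hypothetical, and the Sun’s diameter of 1.39 × 10⁹ m is an approximate value assumed for illustration.

```python
C = 299_792_458.0  # metres per second, exact by definition


def seconds_to_light_metres(t_seconds: float) -> float:
    """Convert a time in seconds to light-metres (metres of time)."""
    return t_seconds * C


def light_metres_to_seconds(t_metres: float) -> float:
    """Convert a time in light-metres back to seconds."""
    return t_metres / C


def metres_to_light_seconds(d_metres: float) -> float:
    """Convert a distance in metres to light-seconds."""
    return d_metres / C


print(f"1 light-metre = {light_metres_to_seconds(1.0) * 1e9:.2f} ns")
# 1.39e9 m is an assumed, approximate solar diameter.
print(f"Sun's diameter = {metres_to_light_seconds(1.39e9):.1f} light-seconds")
```

The first line reproduces the ‘little more than 3.3 ns’ quoted above, and the second the Sun’s diameter of about 4.6 light-seconds.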
In fact, since 1983, the International Standard definition of the metre is
that it is the distance light travels in 1/299 792 458 seconds; that is, the speed
of light is 299 792 458 m s⁻¹ by definition, without measurement uncertainty,
and so 𝑐 is therefore demoted to being merely a conversion factor between
two different units of time. The ‘second’ being used here is the SI second,
which is defined so that a particular atomic transition in caesium has a
specific defined frequency in hertz (for further documentation of the various
resolutions here, see the ‘SI Brochure’ (BIPM 2019)). International Atomic
1 It is not dimensionless, since it still has the dimensions LT⁻¹, but since both dimensions use
the same units, they cancel.
2 If there really were some convention that east–west distances were in kilometres and
north–south ones in inches, then land surveyors would all be familiar with the conversion
factor 𝑘 = 2.54 × 10⁻⁵ km in⁻¹, and the equations in their textbooks would be littered with
factors of 𝑘 and 𝑘² and so on. Instead, surveyors would surely write their textbook equations
in ‘natural units’ where 𝑘 = 1, and when they want to calculate ‘real-world’ numbers, they
would have to learn the techniques for either converting those measurements to natural
units, or re-inserting the ‘missing’ factors of 𝑘 into their equations, to re-obtain ‘physical
units’. But of course land surveyors would amongst themselves avoid this km/in conversion;
and of course we as spacetime-surveyors want to avoid the analogous convention of
heterogeneous units.
3 Natural units are not a modern innovation, cruelly designed to confuse students. Eddington
used natural units throughout his ‘Report’ (1920), which was the first book-length
description in English of Special and General Relativity.
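\begin{align*}
10\,\mathrm{J} &= 10\,\mathrm{kg\,m^2\,s^{-2}} \times (1)^2 \\
  &= 10\,\mathrm{kg\,m^2\,s^{-2}} \times (3 \times 10^{8})^{-2}\,\mathrm{s^2\,m^{-2}} \\
  &= 1.1 \times 10^{-16}\,\mathrm{kg}.
\end{align*}
\begin{align*}
E &= 1.1 \times 10^{-16}\,\mathrm{kg} \\
  &= 1.1 \times 10^{-16}\,\mathrm{kg\,m^2\,m^{-2}}, && \text{recalling our time units are m} \\
  &= 1.1 \times 10^{-16}\,\mathrm{kg\,m^2} \times (3 \times 10^{8}\,\mathrm{s^{-1}})^2, && \text{unit conversion} \\
  &= 10\,\mathrm{kg\,m^2\,s^{-2}} = 10\,\mathrm{J}.
\end{align*}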
It helps to recall that, with 1 m = (3 × 10⁸)⁻¹ s, a metre is a very small unit
of time, or that with 1 m⁻¹ = 3 × 10⁸ Hz something that repeats once per
metre-of-time is happening at a very high frequency. Similarly, it might be
useful to think that when we write an energy as 𝐸 = 1 kg, what we mean
is 𝐸 = 1 kg m² m⁻², to be dimensionally correct, but we’ve skipped writing
the cancelling units.
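The same two conversions can be written as a pair of small functions. This is only a sketch: the function names are hypothetical, and 𝑐 is rounded to 3 × 10⁸ m s⁻¹ to match the worked numbers.

```python
C = 3.0e8  # m/s, rounded as in the worked example


def joules_to_kg(energy_joules: float) -> float:
    """Express an energy in natural units (kg) by dividing by c^2."""
    return energy_joules / C**2


def kg_to_joules(energy_kg: float) -> float:
    """Restore physical units (J) by multiplying by c^2."""
    return energy_kg * C**2


e_natural = joules_to_kg(10.0)
print(f"10 J = {e_natural:.1e} kg")
print(f"and back: {kg_to_joules(e_natural):.1f} J")
```

Dividing by 𝑐² is the step ‘multiply by (1)²’ made explicit, and multiplying by 𝑐² is the step of re-inserting the missing factors.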
It’s also possible to write equations with and without explicit factors
of 𝑐. We have seen the Lorentz factor $\gamma = (1 - v^2/c^2)^{-1/2}$. In units where
𝑐 = 1, this simplifies to $\gamma = (1 - v^2)^{-1/2}$. This expression is dimensionally
consistent because, in these units, 𝑣 has no units (note: it still has the
dimensions of LT⁻¹, but our choice of time units means that these cancel).
To convert this expression back to units where 𝑣 and 𝑐 have physical units,
we simply have to add enough factors of 𝑐 inside the expression to make
it dimensionally consistent. Another way of thinking about this is to take
the various factors of 𝑐 to be still present in the equations, but ‘invisible’
because they have numerical value 𝑐 = 1 m m⁻¹. Resurfacing these factors,
by deducing what power of 𝑐 must be present, then allows us to put in the
physical-units value of 𝑐.
Figure 4.1 A Minkowski diagram: this shows the sequence of events corresponding to a flashing bulb which is (a) at rest at 𝑥 = 0, (b) accelerating from rest, and (c) moving at (nearly) the speed of light.
as the radius of the black hole that would result if the object were sufficiently
compressed (in these terms, the Sun has a mass of 3 km). In relativistic quantum
mechanics and particle physics, likewise, units are chosen so that Planck’s con-
stant ℏ = 𝑐 = 1, and everything is quoted in energy units (usually electron-volts;
see Section 7.5). As of 2018, all of the base units of SI are defined via conversion
factors, in the same way as the metre – for example, the kilogramme is defined so
that Planck’s constant has a specific exact value (BIPM 2019).
[Exercises 4.1–4.3]
frame 𝑆. Since we will almost always restrict our attention to frames in stan-
dard configuration, and since there is no transverse contraction, the 𝑦 and 𝑧
coordinates are uninteresting. We can plot the remaining two coordinates
(𝑡₁, 𝑥₁) on a diagram, to obtain something like Figure 4.1.
Imagine a flashing bulb which is stationary in frame 𝑆, at the origin. Each
flash is a separate event: each event will have 𝑥 = 0 and successively increas-
ing 𝑡. We can plot these points to obtain the line marked ‘a’ in Figure 4.1;
the line connecting these is the worldline of the bulb. If instead the bulb
accelerates along the 𝑥-axis, then we obtain the line ‘b’. As the bulb moves
faster, the distance in space between two successive flashes, Δ𝑥, and their
separation in time, Δ𝑡, are related as Δ𝑥 = 𝑣Δ𝑡, giving the worldline’s gradient
on the diagram as Δ𝑡∕Δ𝑥 = 1∕𝑣. Since 𝑣 < 1 for any physical object, an
object’s worldline is always inclined at more than 45° to the 𝑥-axis.
If an object moves at speed 𝑐, then in 1 m of time it will travel 1 m along
the 𝑥-axis, so its worldline will be a line at 45° to the axes, giving line ‘c’.
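The link between speed and the slope of a worldline can be made concrete with a short calculation. The sketch below assumes 𝑡 is plotted vertically and 𝑥 horizontally, in natural units with 𝑐 = 1; the function name is hypothetical.

```python
import math


def worldline_angle_deg(v: float) -> float:
    """Angle between a worldline and the x-axis, in degrees, for speed v (c = 1).

    With t plotted vertically and x horizontally, the gradient is dt/dx = 1/v.
    """
    if not 0 <= v <= 1:
        raise ValueError("speed must satisfy 0 <= v <= 1 in natural units")
    # atan2 handles v = 0 (a vertical worldline) without dividing by zero.
    return math.degrees(math.atan2(1.0, v))


for v in (0.0, 0.25, 0.5, 0.9, 1.0):
    print(f"v = {v:.2f}: {worldline_angle_deg(v):.1f} degrees from the x-axis")
```

A bulb at rest gives 90°, light gives 45°, and every slower-than-light speed falls between the two.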
The worldline of the flashbulb is simple, and is a straight line lying along
the 𝑡′ -axis, like worldline ‘a’ in Figure 4.1. We can plot these events, the
worldline of the flashbulb, and the worldlines of the light flashes, on a
Minkowski diagram, to obtain Figure 4.3(a). What does this sequence of
events look like on a Minkowski diagram for frame 𝑆?
• Firstly, the worldline of the moving flashbulb is a slanted line in this
  frame; we know from the previous paragraph that this worldline lies
  along the 𝑡′-axis, so we can draw that axis in immediately. From 𝑥 = 𝑣𝑡,
  or rather 𝑡 = 𝑥∕𝑣, we can see that this line has gradient 1∕𝑣.
4 Hermann Minkowski (1864–1909) introduced the idea in a talk in 1908 (see Section 5.3),
where he emphasised the importance and fertility of this geometrical approach. The
diagram is less exotic than it may at first appear: if it were drawn with 𝑥 vertical and 𝑡 as the
horizontal axis, there would be nothing odd about it at all.
they are located on the worldline of the flashbulb an equal distance either
side of the origin. We’ll shortly be able to work out just where on this
worldline the events appear.
• So where are the events ○2 and ○3? The light which travels from event ○1
to events ○2 and ○3 must have a worldline which is angled at 45° (the fact
that this is true in all frames is another statement of the second axiom);
and the reflected light which travels from events ○2 and ○3 back to event ○4
events ○2 and ○3, which are simultaneous in the frame 𝑆′ (they have the
same 𝑡′-coordinate), are not simultaneous in the frame 𝑆; this is yet another
illustration of the relativity of simultaneity discussed in Section 3.1).
• The 𝑡′-axis joins the events ○1 and ○4 in Figure 4.3, and the 𝑥′-axis joins
events ○2 and ○3. This must also be true for the newly discovered locations
This means that we can show, in Figure 4.3, the positions of the four events ○1
axes of the 𝑆 ′ frame (the 𝑡′ -axis is the worldline of the moving flashbulb,
and the 𝑥′ -axis is the line of simultaneity in the moving frame).
In Figure 4.3(a), events which happen at the same time in frame 𝑆 ′ – that
is, which have the same 𝑡′ -coordinate – are connected by a line parallel to
the 𝑥 ′ -axis. This remains true in Figure 4.3(b). Therefore in any Minkowski
diagram, we can indicate the coordinates of a marked event, in each of
Figure 4.4 The coordinates of an event projected onto the (𝑡, 𝑥) and (𝑡 ′ , 𝑥 ′ )
axes of a Minkowski diagram.
Figure 4.5 The Minkowski diagram of the flashes in Figure 3.1, shown in the
frame of the carriage.
the frames displayed in the diagram. We can see this in Figure 4.4, where
the event ○1 has coordinates (𝑡1 , 𝑥1 ) in frame 𝑆, and the same event has
and ○2, ‘the flash reaches the front of the carriage’, as shown in Figure 4.5.
How does this appear if we draw the Minkowski diagram of frame 𝑆, the
station platform that the carriage is moving past? In this frame, the front and
back are moving from left to right, so their worldlines slope upwards to the
right (𝑐 = 1 in both frames, remember). The light flashes move diagonally as
before. The events are the same as before: event ○1 is on both worldline 𝐵 and
that of the forwards-moving flash. We thus obtain Figure 4.6, and we can
immediately see that, although events ○1 and ○2 are simultaneous in 𝑆′ (in
Figure 4.5), they are not simultaneous in frame 𝑆, as we saw in Section 3.1.
We know that events ○1 and ○2 are simultaneous in 𝑆′, so that the 𝑥′-axis
must be parallel to the line joining them, and that the worldlines 𝐵 and 𝐹 are
parallel to the 𝑡′ -axis, so we can go on to complete the Minkowski diagram
in Figure 4.7 (it’s unfortunately the case that Minkowski diagrams end up
looking a lot more messy and confusing when complete, than they look
when you’re building them up by thinking through the description of a
problem).
It may by now be clear that the geometry of the Minkowski diagram – the
relationships between lengths and angles and what is and isn’t perpendicular
– is very different from the geometry we’re used to. We need to find out more
about that.
4.3 Plane Rotations 55
Figure 4.8 The dots represent the events where the flashes from an offshore
lighthouse beam are observed on a linear shoreline.
Figure 4.9 Rotation in the plane: the point 𝑃 has different coordinates in two
frames.
In Figure 4.9 we see frames 𝐶 and 𝐶 ′ , where the latter has been rotated
by an angle 𝜃 with respect to the former. The point 𝑃 has coordinates (𝑥, 𝑦)
in the frame 𝐶 (here, as elsewhere, I am taking ‘frame’ and ‘coordinate
system’ to be synonyms). The same point 𝑃 has coordinates (𝑥 ′ , 𝑦 ′ ) in the
frame 𝐶 ′ ; what is the relationship between the two sets of coordinates (𝑥, 𝑦)
and (𝑥′ , 𝑦 ′ )?
A little geometry (observe that the distance 𝑟 is the same in both frames,
express 𝑥, 𝑦, 𝑥 ′ and 𝑦 ′ in terms of 𝛼, 𝜃 and 𝑟, and eliminate 𝑟 and 𝛼) gives
𝑥 2 + 𝑦 2 = 𝑟2 = 𝑥 ′2 + 𝑦 ′2 . (4.2)
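The invariance in Eq. (4.2) can be checked numerically. The sketch below assumes the standard form of a rotation of the axes by 𝜃, namely 𝑥′ = 𝑥 cos 𝜃 + 𝑦 sin 𝜃 and 𝑦′ = −𝑥 sin 𝜃 + 𝑦 cos 𝜃; the function name `rotate` and the sample point are illustrative choices, not taken from the text.

```python
import math

def rotate(x, y, theta):
    """Coordinates (x', y') of the same point in axes rotated by theta (radians)."""
    xp = x * math.cos(theta) + y * math.sin(theta)
    yp = -x * math.sin(theta) + y * math.cos(theta)
    return xp, yp

x, y = 3.0, 4.0  # an arbitrary point P, with r^2 = 25
for theta in (0.0, 0.3, 1.0, 2.5):
    xp, yp = rotate(x, y, theta)
    # The coordinates change with theta, but x'^2 + y'^2 does not.
    print(f"theta = {theta:3.1f}: (x', y') = ({xp:7.4f}, {yp:7.4f}), "
          f"r^2 = {xp**2 + yp**2:.10f}")
```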
received? In between these two events, the light has travelled a total dis-
tance 2𝐿, so the temporal separation between events ○ 1 and ○ 2 is Δ𝑡 ′ = 2𝐿
(in units where the speed is 𝑐 = 1), even though the spatial separation
between the two events, as measured in this frame, is Δ𝑥′ = 0. Inspired
by the pythagorean distance above, let’s write that separation-squared as
𝑠′2 = Δ𝑡 ′2 = (2𝐿)2 . That interval involves only the time between the two
events, but the two events happened at the same location (Δ𝑥′ = 0), so
any contribution from that spatial separation will not be present in this
expression.
Now look at Figure 3.7, showing the light clock as viewed from the frame
it is moving through, and again ask how far apart are the same two events,
as measured in this frame. This time, the separation between the events
includes both a temporal separation Δ𝑡, and a spatial separation Δ𝑥 = 𝑣Δ𝑡.
Taking further pythagorean inspiration, let’s suppose that there is a quasi-
pythagorean expression which represents the distance here, and write
𝑠2 = Δ𝑡 2 + 𝑎Δ𝑥2 ,
for some constant 𝑎 we don’t yet know (we are at this point taking a very
non-axiomatic approach to thinking about this, in contrast to the approach
we took in Chapter 2). We rewrite Eq. (3.2) using our new convention of
𝑐 = 1, and find
Δ𝑡 2 = (2𝐿)2 + (𝑣Δ𝑡)2 = Δ𝑡 ′2 + Δ𝑥 2 .
Looking at this, we can see that we will have 𝑠2 = 𝑠′2 – that is, it will be an
invariant of the transformation – if 𝑎 = −1, or
𝑠2 = Δ𝑡 2 − Δ𝑥 2 . (4.3)
This is our ‘distance function’, which has the same property of frame invari-
ance in Minkowski space, that 𝑑2 = Δ𝑥 2 + Δ𝑦 2 has in euclidean space. It
is a difference of squares rather than a sum of squares, and this is the first
clue that the geometry of Minkowski space – that is, the rules of distance
there – is systematically different from euclidean space. The argument in
this paragraph, leading up to Eq. (4.3), is founded on the requirement, in
Section 3.3, as re-examined here, that the speed of light is measured to be
the same in both frames.
This quantity 𝑠2 is a frame-invariant separation between two events,
where Δ𝑡 and Δ𝑥 are the differences in coordinates of the two events. Since
the orientation of our axes is arbitrary, we can immediately generalise
Eq. (4.3) into a (3 + 1)-dimensional version, where
$s^2 = \Delta t^2 - \Delta x^2 - \Delta y^2 - \Delta z^2.$ (4.4)
Let’s make this a little more concrete, and link it to what we learned in
Chapter 3. Specifically, is this consistent with Eq. (3.3)? Let event ○ 1 be the
light being emitted from the flashbulb at the bottom of the light clock, and
event ○ 2 the light being received there again, after its round trip. As before,
we take the primed frame, 𝑆 ′ , to be fixed to the clock, and we can decide
to put its space origin at the location of the bulb and its time origin at the
instant the bulb flashes. This means that the coordinates of the two events
are, respectively, (𝑡1′ , 𝑥1′ ) = (0, 0) (by the definition of our frame’s origin) and
(𝑡2′ , 𝑥2′ ) = (2𝐿, 0) (this is a restatement of Eq. (3.1) with 𝑐 = 1, combined with
the statement that ○ 2 happens at the spatial origin). The interval between
which frame 𝑆 ′ is moving? Because we assume that the two frames are in
standard configuration (Section 1.7), we know immediately that an event
which happened at the (spatial and temporal) origin of frame 𝑆 ′ happened
at the origin of frame 𝑆 also, so the coordinates in frame 𝑆 of event ○ 1 are
(𝑡1 , 𝑥1 ) = (0, 0). For definiteness, let us suppose that the light clock has
size 𝐿 = 1 m (and so $s'^2_{12} = 4\,\mathrm{m}^2$) and is moving at speed 𝑣 = 3∕5, so that 𝛾 =
5∕4. Thus the interval between the two events is, according to Eq. (4.5),
$s'^2_{12} = 4\,\mathrm{m}^2$ (we shouldn’t attempt to interpret this as an ‘area’, of course, but we
will be well-behaved and include the units in numerical expressions below).
Equation (3.3) tells us that 𝑡2 = 2.5 m (since 𝑡2′ = 2 m and 𝛾 = 5∕4), and
Figure 3.7 reminds us that 𝑥2 = 𝑣𝑡2 , or 𝑥2 = 1.5 m. Putting these together,
we find
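$$\begin{aligned}
s^2_{12} &= \Delta t^2 - \Delta x^2 \\
&= (t_2 - t_1)^2 - (x_2 - x_1)^2 \\
&= (2.5\,\mathrm{m})^2 - (1.5\,\mathrm{m})^2 \\
&= 4\,\mathrm{m}^2,
\end{aligned}$$
equal to $s'^2_{12}$. Thus we see explicitly that the interval $s^2_{12}$ has the same value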
when worked out using the 𝑆 frame’s coordinates, as it has when worked
out using the 𝑆 ′ frame’s coordinates.
More generally, we can see that, in this scenario, Δ𝑡 = 𝛾Δ𝑡′ = 𝛾2𝐿 and
Δ𝑥 = 𝑣Δ𝑡 (compare Section 3.3). Thus
Δ𝑡′2 − Δ𝑥 ′2 = (2𝐿)2
Δ𝑡 2 − Δ𝑥 2 = 𝛾2 (2𝐿)2 − 𝑣 2 𝛾2 (2𝐿)2
= 𝛾2 (1 − 𝑣 2 )(2𝐿)2
= (2𝐿)2 .
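The arithmetic of this light-clock example is easy to reproduce. The following is a minimal sketch, with hypothetical function names, using only the relations quoted above (Δ𝑡′ = 2𝐿, Δ𝑥′ = 0, Δ𝑡 = 𝛾Δ𝑡′ and Δ𝑥 = 𝑣Δ𝑡, with 𝑐 = 1).

```python
import math

def gamma(v):
    """Lorentz factor for speed v, in units where c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def interval2(dt, dx):
    """Squared interval s^2 = dt^2 - dx^2, as in Eq. (4.3)."""
    return dt * dt - dx * dx

def light_clock_separations(L, v):
    """Return (dt', dx') in the clock frame and (dt, dx) in the frame it moves through."""
    dt_p, dx_p = 2.0 * L, 0.0   # round trip of length 2L, at one place
    dt = gamma(v) * dt_p        # time dilation
    dx = v * dt                 # distance the clock moves meanwhile
    return (dt_p, dx_p), (dt, dx)

primed, unprimed = light_clock_separations(L=1.0, v=3 / 5)
print("S' :", primed, "s^2 =", interval2(*primed))
print("S  :", unprimed, "s^2 =", interval2(*unprimed))
```

Changing `v` changes Δ𝑡 and Δ𝑥 individually, but the two printed values of 𝑠² should continue to agree up to rounding.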
Note: In the discussion in this section, we could with equal justice have
started off by defining 𝑠′2 = −Δ𝑡′2 , and we would have ended up with an
interval 𝑠2 = Δ𝑥 2 − Δ𝑡2 or, more generally, 𝑠2 = −Δ𝑡 2 + Δ𝑥 2 + Δ𝑦 2 + Δ𝑧2 .
Choosing one or the other is purely a matter of convention. I slightly prefer
the convention 𝑠2 = Δ𝑡 2 − Δ𝑥 2 , because it means that the interval is the
same as the ‘proper time’, which we come to later; but there’s very little in
it. Arbitrary though it may be, the convention determines the signs of a
number of important equations in the text, and different texts choose different
conventions here, so you should be aware of this if you read around the
subject. Most significantly, the definitions of ‘timelike’ and ‘spacelike’ in
Section 4.7 swap signs. The sum of the signs of the terms is known as the
signature, so that Eq. (4.4), with signs (+, −, −, −), has signature −2, and
the above alternative definition, with signs (−, +, +, +), has signature +2.
[Exercises 4.5 & 4.6]
4.5 Changes of Frame, and Perspective 61
[Panel (a) is annotated Δ𝑦 = 1 m; panel (b) is annotated (1 m)² = Δ𝑥² + Δ𝑦².]
Figure 4.11 A metre stick in front of you: (a) perpendicular to the line of sight;
and (b) at an angle.
Figure 4.12 Two events in Minkowski space, and their separations in two ref-
erence frames. Compare Figure 4.4.
Figure 4.12. In one frame, the events are separated by Δ𝑥 in space and Δ𝑡 in
time, and thus by an interval of $s^2_{12} = \Delta t^2 - \Delta x^2$. Viewed from a different
frame, however, these two events are separated by different amounts of space
and time, purely because of the change of frame, but the interval between
them, $s^2_{12} = \Delta t'^2 - \Delta x'^2$, is the same.
Exactly as in Figure 4.11(b), the separations along any one coordinate
axis are frame-dependent and physically meaningless. It is only the interval
Δ𝑥2 + Δ𝑦 2 , or Δ𝑡2 − Δ𝑥 2 , that is significant, depending on whether we are
talking about the geometry of, respectively, euclidean space (our usual
intuition) or Minkowski space (the intuition we must develop for SR).
The correspondence between Figures 4.10(a) and 4.10(b) is not intended
to be some vague handwaving analogy. On the contrary, it is a very precise
analogy: the same thing is happening in both cases. The only thing that is
different between the two cases is the geometry of the space in question. In
the first case, the transformation from one set of coordinates to the other
(Eq. (4.1)) is such that it preserves euclidean distance (Eq. (4.2)). In the sec-
ond, the coordinate transformation is the one which preserves ‘Minkowski
distance’, Eq. (4.3), with a specific transformation – the expression corre-
sponding to Eq. (4.1) – which we are going to discover in the next chapter. If
you think of moving from one inertial frame to another as a ‘rotation’ (keep
the scare-quotes), you will not go far wrong. If you can find a place in your
head, where Figure 4.12 looks as natural as Figure 4.11(b), you will have
reached some relativistic nirvana.5
[Diagram: the hyperbolae 𝑠² = 1, passing through 𝑡 = 1, and 𝑠² = −1, passing through 𝑥 = 1.]
Figure 4.13 All the points on the two hyperbolae are the same Minkowski-
distance from the origin.
Figure 4.14 Two clock-ticks on clocks at rest in frames 𝑆 and 𝑆 ′ . In each case,
the previous tick was at time 𝑡 = 𝑡′ = 0. The null line is shown dotted.
Figure 4.15 A 10 m long train (at rest in 𝑆 ′ ) moving through a station (at rest
in 𝑆) with a 10 m long platform, also showing (dot-dashed) the worldline of
the front of the train.
at ○2 (how do we know that? Because that event is on the 𝑡′-axis, and the
hyperbola marks out the locus of points which are an interval of $s^2 = (10\,\mathrm{m})^2$
from the origin). But the 𝑡-coordinate of this event ○2 is larger than 10 m –
Figure 4.16 The same events as shown in Figure 4.14, but this time shown in
the frame 𝑆 ′ .
hyperbola again marks out the set of points which are the same distance –
10 m in this case – from the origin, and simultaneity in 𝑆 ′ implies that the
event ○ 3 is on the 𝑥 ′ -axis. How long does the platform party measure this
train carriage to be? The way they measure that is to identify where the
front of the carriage was at some instant (choosing for example event ○ 3 , at
the position and time of the firework explosion), and find another event at
the back of the carriage at the same time in 𝑆, namely event ○ 4 . Directly from
the diagram, you can see that the spatial distance, in 𝑆, between events ○ 3
for reference (just as in Figure 4.15, the line joining ○ 4 and ○ 5 is parallel to
the 𝑥-axis). I have also shown a new event ○ 5 , which is at the 10 m on the
platform (as we can see from the calibration hyperbola), and an event ○ 6 ,
which is at the zero-end of the station platform (it’s on the worldline of the
end of the platform, overlaid on the 𝑡-axis), which happens simultaneously
with ○5 in the train frame (i.e., $t'_5 = t'_6$); it is clear that the separation between
events ○5 and ○6 – that is, between the zero and 10 m marks on the platform
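[Figure 4.18: Minkowski diagram in which the null lines 𝑠² = 0 divide the plane into
‘future’ (𝑠² > 0), ‘past’ (𝑠² > 0) and ‘elsewhere’ (𝑠² < 0), with events ○1 to ○5
marked around the event ○0 at the origin.]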
like separations from an event ○0, at the origin, events ○4 and ○2 have null
can immediately categorise the five events in terms of the type of separation
between the event and an event at the origin.
Event ○5 timelike separation from ○0: a slower-than-light signal could
from ○0 to ○4.
to ○0.
from ○1 to ○0.
Although we are familiar with time being simply divided into a future and
a past, we can see from this Minkowski diagram that spacetime is divided
into three regions, the familiar ‘future’ and ‘past’, plus a region indicated by
‘elsewhere’, consisting of events which cannot interact causally with events
at the origin.
Figure 4.18 shows only the 𝑥- and 𝑡-dimensions. If we showed the 𝑦-
dimension as well, and rotated the figure around the 𝑡-axis, the dashed
lines would turn into a cone defined by 𝑠2 = 𝑡 2 − (𝑥2 + 𝑦 2 ) = 0. This null
cone demarcates areas of spacetime in the same way as the diagonals in the
figure. Every event has a null cone, pointing both forwards and backwards
along the time axis: events ‘inside’ the null cone are timelike separated from
it, with 𝑠2 > 0, and are in the possible future of the event or its possible
past, depending on whether 𝑡 > 0 or 𝑡 < 0; events ‘outside’ the null cone
are spacelike separated from it, and are ‘elsewhere’; events on the null cone
have a null separation from it, and are reachable only by a light signal.
Although it is harder to visualise, the full null cone is the surface 𝑠2 =
𝑡2 − (𝑥 2 + 𝑦 2 + 𝑧2 ) = 0 in spacetime. [Exercises 4.8 & 4.9]
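The classification described here can be written as a short function. This is a sketch only: the function name and the sample separations are hypothetical, the sign convention is that of Eq. (4.3), and the comparison with zero is exact, so floating-point inputs that are only approximately null will not be reported as null.

```python
def classify_separation(dt, dx, dy=0.0, dz=0.0):
    """Classify the separation of an event from the origin event (c = 1).

    Uses s^2 = dt^2 - (dx^2 + dy^2 + dz^2).
    """
    s2 = dt**2 - (dx**2 + dy**2 + dz**2)
    if s2 > 0:
        # Inside the null cone: possible future or possible past.
        return "timelike (future)" if dt > 0 else "timelike (past)"
    if s2 < 0:
        # Outside the null cone: no causal connection is possible.
        return "spacelike (elsewhere)"
    return "null"  # on the null cone: reachable only by a light signal

# Illustrative separations (dt, dx), not the events of Figure 4.18.
for dt, dx in [(5, 3), (-5, 3), (3, 5), (4, -4)]:
    print((dt, dx), "->", classify_separation(dt, dx))
```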
Exercises
Exercise 4.1 (§4.1)
Convert the following to units in which 𝑐 = 1:
1. 1 J
2. 1 ns
3. light-bulb power, 100 W
4. Planck’s constant, $h = 6.626 \times 10^{-34}$ J s
5. jogging pace, 3 m s$^{-1}$
6. momentum of a jogger, 300 kg m s$^{-1}$
7. power station output, 1 GW
Convert the following to physical units:
1. velocity $10^{-2}$
2. time $9.46 \times 10^{15}$ m
3. acceleration $1.11 \times 10^{-16}$ m$^{-1}$
4. 𝑡′ = 𝛾(𝑡 − 𝑣𝑥) (this is Eq. (5.6), from the next chapter)
5. the ‘mass-shell’ equation $E^2 = p^2 + m^2$
If you’re having trouble working out whether to multiply or divide, I find
it helpful to remember that, just as an inch is a small number of kilometres, a
light-metre is a small number of seconds, or a light-second is a large number
of metres.
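The multiply-or-divide question can be made mechanical. The helpers below are a hypothetical sketch, deliberately applied to values that are not among the exercise items: times in seconds are multiplied by 𝑐 to give metres, and speeds in m s⁻¹ are divided by 𝑐 to give pure numbers.

```python
C = 299_792_458.0  # speed of light in m/s (exact, by definition of the metre)

def seconds_to_metres(t_seconds):
    """A time in seconds expressed as a length: a second is a large number of metres."""
    return t_seconds * C

def metres_to_seconds(t_metres):
    """A time in light-metres expressed in seconds: a light-metre is a tiny time."""
    return t_metres / C

def speed_to_natural(v_si):
    """A speed in m/s expressed as a dimensionless fraction of c."""
    return v_si / C

print(seconds_to_metres(1e-6))   # one microsecond, in metres of time
print(metres_to_seconds(1.0))    # one light-metre, in seconds
print(speed_to_natural(340.0))   # roughly the speed of sound in air, as a fraction of c
```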
3. $10^{-16}$ kg m s$^{-2}$
You can also do this without doing any arithmetic – think of the dimensions. [𝑑−]
1. Event ○3 happens before event ○2
3. Event ○3 happens after ○2
And in frame 𝑆′? [𝑑−]
(𝑡3 , 𝑥3 ) = (4, 6) (take 𝑐 = 1). Calculate the intervals $s^2_{12}$, $s^2_{13}$ and $s^2_{23}$ between
the three pairs of events, state whether each is timelike, spacelike or null,
and in each case whether the earlier event could influence the later event
through a signal travelling no faster than light.
5
The Lorentz Transformation
5.1 The Derivation of the Lorentz Transformation 73
𝑡 2 − 𝑥 2 = 𝑠2 = 𝑡 ′2 − 𝑥 ′2 . (5.1)
Thus the relationship between (𝑡, 𝑥) and (𝑡 ′ , 𝑥 ′ ) must be one for which
Eq. (5.1) is true.
Equation (5.1) is strongly reminiscent of Eq. (4.2), and we can make it
more so by writing 𝑙 = i𝑡 and 𝑙 ′ = i𝑡 ′ , so that Eq. (5.1) becomes1
𝑙 2 + 𝑥 2 = −𝑠2 = 𝑙 ′2 + 𝑥 ′2 . (5.2)
1 You may wish to refer to Appendix C, to refresh your memory on complex numbers, and
trigonometric functions using them.
2 If 𝜃 is pure imaginary then, from the Taylor expansions, cos 𝜃 is real and sin 𝜃 is pure
imaginary, so that 𝑥′ and 𝑙′ are real and pure imaginary, respectively, as they should be.
74 5 The Lorentz Transformation
(I have swapped the order of the two equations, relative to Eq. (5.3), and
swapped 𝑥 and 𝑡 in each row, for later notational convenience; you might
want to take a look at Exercise 6.5 for an alternative way of arriving at these
equations.)
Now consider an event at the spatial origin of the moving frame, that is,
at 𝑥 ′ = 0 for some unknown 𝑡 ′ . What are the coordinates of this event in the
unprimed frame? That’s easy: if it happens at time 𝑡 in the unprimed frame
then it happens at position 𝑥 = 𝑣𝑡 in that frame (because the two frames are
in standard configuration), in which case Eq. (5.4b) can be rewritten as
The quantity 𝜙(𝑣) is sometimes called the ‘rapidity’, and the application of
an LT to a frame is sometimes called a ‘boost’ of that frame.
Since we now have 𝜙 as a function of 𝑣, we also have, in Eq. (5.4), the full
transformation between the two frames; but combining Eq. (5.4) and Eq. (5.5)
with a little hyperbolic trigonometry (remember $\cosh^2\phi - \sinh^2\phi = 1$), we
can rewrite Eq. (5.4) in the more usual form
𝑡′ = 𝛾(𝑡 − 𝑣𝑥) (5.6a)
𝑥′ = 𝛾(𝑥 − 𝑣𝑡), (5.6b)
where $\gamma = (1 - v^2)^{-1/2}$. The inverse transformation is
𝑡 = 𝛾(𝑡′ + 𝑣𝑥′) (5.8a)
𝑥 = 𝛾(𝑥′ + 𝑣𝑡′), (5.8b)
5.2 Addition of Velocities 75
which can be verified by direct solution of Eq. (5.6) for the unprimed coordi-
nates. [Exercises 5.1 & 5.2]
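The claim that Eq. (5.8) inverts Eq. (5.6) can also be checked numerically. The sketch below takes Eq. (5.6) in the form quoted in Exercise 4.1, 𝑡′ = 𝛾(𝑡 − 𝑣𝑥), together with its companion 𝑥′ = 𝛾(𝑥 − 𝑣𝑡); the function names and the sample event are illustrative assumptions.

```python
import math

def gamma(v):
    """Lorentz factor, in units where c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def lorentz(t, x, v):
    """Eq. (5.6): coordinates (t', x') in S' of an event at (t, x) in S."""
    g = gamma(v)
    return g * (t - v * x), g * (x - v * t)

def lorentz_inverse(tp, xp, v):
    """Eq. (5.8): coordinates (t, x) in S of an event at (t', x') in S'."""
    g = gamma(v)
    return g * (tp + v * xp), g * (xp + v * tp)

t, x, v = 2.0, 0.5, 0.6
tp, xp = lorentz(t, x, v)
print("primed:    ", (tp, xp))
print("round trip:", lorentz_inverse(tp, xp, v))            # should recover (t, x)
print("intervals: ", t**2 - x**2, tp**2 - xp**2)            # should agree
```

Note that the inverse is simply the forward transformation with 𝑣 replaced by −𝑣, as one would expect from the symmetry between the frames.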
any two transformations performed one after the other, there exists a third
with the same net effect (i.e., the LT is ‘transitive’); (ii) there exists a trans-
formation (with 𝜙 = 0) which maps (𝑡, 𝑥) to themselves (i.e., there exists
an identity transformation); and (iii) for every transformation (with 𝜙 = 𝜙1 ,
say) there exists another transformation (with 𝜙 = −𝜙1 ) which results in
the identity transformation (i.e., there exists an inverse). If we add to these
three properties the observation that, for three Lorentz transformations 𝐿1 ,
𝐿2 and 𝐿3 , the associative property holds, 𝐿3 ◦(𝐿2 ◦𝐿1 ) = (𝐿3 ◦𝐿2 )◦𝐿1 , then
these are enough to indicate that the LT is an example of a mathematical
‘group’, known as the ‘Lorentz group’.
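These group properties are easy to see numerically if a boost is written as a 2 × 2 matrix acting on (𝑡, 𝑥). The sketch below assumes the standard relation 𝑣 = tanh 𝜙 between speed and rapidity, so that 𝛾 = cosh 𝜙 and 𝛾𝑣 = sinh 𝜙; the helper names are hypothetical.

```python
import math

def boost(phi):
    """2x2 matrix taking (t, x) to (t', x') for rapidity phi, with v = tanh(phi)."""
    c, s = math.cosh(phi), math.sinh(phi)
    return ((c, -s), (-s, c))

def compose(a, b):
    """Matrix product a.b: apply b first, then a."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def close(a, b, tol=1e-12):
    """True if two 2x2 matrices agree element by element, within tol."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

phi1, phi2 = math.atanh(0.5), math.atanh(0.8)
identity = ((1.0, 0.0), (0.0, 1.0))

print(close(compose(boost(phi2), boost(phi1)), boost(phi1 + phi2)))  # (i) a third boost
print(close(boost(0.0), identity))                                   # (ii) identity
print(close(compose(boost(-phi1), boost(phi1)), identity))           # (iii) inverse
print(math.tanh(phi1 + phi2))   # combined speed, equal to (0.5 + 0.8)/(1 + 0.5*0.8)
```

Rapidities simply add under composition, which is why the combined speed never exceeds 1.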
The Lorentz group consists of all those transformations which leave
invariant the interval in Eq. (4.4) (or, as we shall see in the next chapter,
which leave the lengths of 4-vectors invariant). The LT of Eq. (5.6) describes
transformations where the motion is along the 𝑥-axis (i.e., standard con-
figuration). It is possible to generalise this transformation to boosts in any
spatial direction, and to include spatial rotations (it’s usual to use one of a few
different notations when discussing this, to avoid the discussion becoming
unbearably messy); this is referred to as the group of restricted (homogeneous)
Lorentz transformations. The ‘restriction’ is that this excludes reflections in
space or time, and the full Lorentz group consists of the restricted group
extended with transformations which flip the signs of the coordinates. If
we further add translations – transformations which change the origin of
the coordinate system – we obtain the ‘inhomogeneous Lorentz transfor-
mations’, also known as the Poincaré transformations, which represent the
Poincaré group.
Groups are very closely related to the mathematical study of symmetry (to
say a sphere is ‘symmetric’ is to say that there is a mathematically analysable
set of rotations you can perform, which leave the sphere looking unchanged).
Einstein was repeatedly motivated by ideas of symmetry, and invariance,
and they are of profound importance in modern physics.
Poincaré has a lot to do with the mathematical history of relativity. You
might also be interested in a discussion (Adlam 2011) of how much Poincaré
understood of what we now know as relativity, and how his approach to
relativity is linked to a particular philosophical position on science (this is
firmly in the category of dangerous-bend remarks).
5.3 The Invariant Interval and the Geometry of Spacetime 77
coordinates of these two events in the two frames (this is the definition of
the length of the rod, as measured in the ‘stationary’ frame – see Section 1.4).
In frame 𝑆 ′ , there is no complication: any events which happen at the
ends of the rod will be a distance 𝐿0 apart, so that 𝑥2′ = 𝐿0 .
In 𝑆, the two events happen at 𝑡-coordinates 𝑡1 = 𝑡2 = 0, and 𝑥-
coordinates 𝑥1 = 0 and 𝑥2 = 𝐿. Since the origins were coincident at 𝑡 =
𝑡′ = 0, we know that 𝑥1 = 𝑥1′ = 0 (standard configuration again). From
Eq. (5.8), we can write down that
$x_2 = \gamma\,(x'_2 + v t'_2)$ (5.13a)
$0 = t_2 = \gamma\,(t'_2 + v x'_2).$ (5.13b)
Equation (5.13b) tells us that 𝑡2′ = −𝑣𝑥2′ ; substituting this into Eq. (5.13a)
and writing 𝑥2 = 𝐿 and 𝑥2′ = 𝐿0 , we have
$L = \gamma\,(1 - v^2)\,L_0 = \dfrac{L_0}{\gamma},$ (5.14)
showing that the moving rod is measured, by observers in the stationary
frame, to be shorter than its ‘rest length’ 𝐿0 .
Compare Section 3.4. The only difference between this section and that
one is that here, we were able to short-circuit the rather elaborate argument
of Section 3.4, since the LT in a sense provides the machinery for that
argument in a ready-to-use form. [Exercise 5.5]
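The two steps of this derivation translate directly into code. The following is a minimal sketch (the function name `measured_length` is hypothetical) which follows Eqs. (5.13b) and (5.13a) in turn and compares the result with 𝐿0∕𝛾.

```python
import math

def gamma(v):
    """Lorentz factor, in units where c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def measured_length(L0, v):
    """Length L of a rod of rest length L0, measured in the frame it moves through."""
    x2_p = L0                 # far end of the rod, in its rest frame S'
    t2_p = -v * x2_p          # Eq. (5.13b) with t2 = 0
    return gamma(v) * (x2_p + v * t2_p)   # Eq. (5.13a), with x2 = L

for v in (0.0, 0.5, 0.6, 0.9):
    print(f"v = {v:3.1f}: L = {measured_length(1.0, v):.4f}, "
          f"L0/gamma = {1.0 / gamma(v):.4f}")
```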
that worldline and the worldline of the light flash from the origin, and thus
at the intersection of those two worldlines.
In the platform frame, we draw the worldlines of the ends of the carriage,
in motion to the right, and the worldlines of the same light flashes, again at
slope 1; see Figure 5.2. Exactly as before, event ○ 1 is on the intersection of
which happens just here, which is incidentally telling us that this is a rather
artificial question to ask, but the point is that we must synthesise such an
event in our heads, in order to have something to attach coordinates to).
Call this event ○ 2 . Thus we can draw it onto Figure 5.2 on the forwards-
going light flash worldline, and simultaneous with event ○ 1 (i.e., at the same
want $(t'_2, x'_2)$ given that we now have $(t_2, x_2)$. From Eq. (5.6), $x'_2 = \gamma(x - vt) = 1$ m
and $t'_2 = \gamma(t - vx) = 1$ m (since the flash is moving at speed 𝑐 = 1,
we must have $x'_2 = t'_2$, so this result is reassuring).
(v) Saying ‘at this instant (measured in the platform’s frame)’ tells us that
we are looking for a 𝑡 and not a 𝑡 ′ . We’re looking for the coordinates of an
event ○ 3 which is such that 𝑡3 = 𝑡1 , and the position of which is at the front
of the carriage. Marking this in Figure 5.2 confirms to us that all three events
have been described as being simultaneous in the platform frame. With 𝐿′
as defined above, we can reason that the front of the carriage is at (constant)
coordinate 𝑥 ′ = 𝐿′ = 3 m in the 𝑆 ′ frame and so at coordinate 𝑥 = 𝐿 = 𝐿′ ∕𝛾
at time 𝑡 = 0 (by length contraction or from Eq. (5.6b)). Therefore the front
of the carriage, moving at speed 𝑣, is afterwards at coordinate 𝑥 = 𝐿 + 𝑣𝑡.
At time $t_3 = t_2 = \sqrt{3}$, the coordinate of the front will therefore be $x_3 =
(3\,\mathrm{m})/\gamma + \sqrt{3}/2 = 2\sqrt{3}$ (alternatively, we can spot that we know 𝑥3′ and 𝑡3 ,
and want 𝑥3 , so we can use the LT 𝑥3′ = 𝛾(𝑥3 − 𝑣𝑡3 ) and rearrange for 𝑥3 ;
this is more direct, but you will often have to do simple speed-times-time
reasoning of this sort).
(vi) Using the result of part (v), we immediately obtain 𝑡3′ = 𝛾(𝑡3 − 𝑣𝑥3 ) = 0,
from Eq. (5.6a). Alternatively, we have $t_3 = \sqrt{3}$ m and 𝑥3′ = 3 m and want
𝑡3′ ; we can therefore rearrange Eq. (5.8a) to find 𝑡′ = 𝑡∕𝛾 − 𝑣𝑥 ′ and get the
same result without going via step (v).
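The numbers in parts (iv) to (vi) can be checked in a few lines. This sketch assumes 𝑣 = 1∕2 (so 𝛾 = 2∕√3), a value inferred here from the quoted results rather than stated in this passage, together with the rest length 𝐿′ = 3 m and the platform time 𝑡 = √3 m given above.

```python
import math

v = 0.5                            # assumed speed, inferred from the quoted numbers
g = 1.0 / math.sqrt(1.0 - v * v)   # gamma = 2/sqrt(3)
L_rest = 3.0                       # carriage rest length L', in metres

def lorentz(t, x):
    """Eq. (5.6): platform coordinates (t, x) -> carriage coordinates (t', x')."""
    return g * (t - v * x), g * (x - v * t)

# Event 2: on the forward-going light flash (x = t), at platform time sqrt(3) m.
t2 = x2 = math.sqrt(3.0)
t2_p, x2_p = lorentz(t2, x2)

# Event 3: the front of the carriage at the same platform time, x = L'/gamma + v t.
t3 = t2
x3 = L_rest / g + v * t3
t3_p, x3_p = lorentz(t3, x3)

print(f"event 2: t' = {t2_p:.6f} m, x' = {x2_p:.6f} m")
print(f"event 3: x = {x3:.6f} m, t' = {t3_p:.6f} m, x' = {x3_p:.6f} m")
```

That 𝑥3′ comes out as the rest length of the carriage is a useful consistency check on the speed-times-time reasoning of part (v).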
I’ve drawn both frames in the same diagram, in Figure 5.3. Notice that
event ○ 3 is on the 𝑥 ′ -axis, reassuringly consistent with having coordinate
5.6 The Equations of Special Relativity in Physical Units 83
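$s^2 = \Delta t^2 - \Delta x^2 \;\mapsto\; s^2 = c^2\Delta t^2 - \Delta x^2.$
$t' = \gamma(t - vx) \;\mapsto\; ct' = \gamma\left(ct - \dfrac{vx}{c}\right)$
$x' = \gamma(x - vt) \;\mapsto\; x' = \gamma(x - vt)$
$\gamma = \left(1 - v^2\right)^{-1/2} \;\mapsto\; \gamma = \left(1 - \dfrac{v^2}{c^2}\right)^{-1/2}.$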
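In physical units the same transformation looks as follows. This is a sketch under the assumption of SI units throughout (seconds, metres, metres per second); the function names and the sample event are hypothetical.

```python
import math

C = 299_792_458.0   # speed of light in m/s

def gamma_si(v):
    """Lorentz factor for a speed v in m/s."""
    return (1.0 - (v / C) ** 2) ** -0.5

def lorentz_si(t, x, v):
    """Event (t in s, x in m) in S -> (t', x') in S', for relative speed v in m/s.

    Dividing c t' = gamma (c t - v x / c) by c gives t' = gamma (t - v x / c^2).
    """
    g = gamma_si(v)
    return g * (t - v * x / C**2), g * (x - v * t)

def interval2_si(dt, dx):
    """Squared interval c^2 dt^2 - dx^2, in m^2."""
    return (C * dt) ** 2 - dx**2

t, x, v = 1.0e-6, 100.0, 0.6 * C
tp, xp = lorentz_si(t, x, v)
print("primed coordinates:", tp, xp)
print("intervals:", interval2_si(t, x), interval2_si(tp, xp))   # should agree
```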
84 5 The Lorentz Transformation
5.7 Paradoxes
The teaching of SR conventionally and usefully includes various paradoxes.
In this context, ‘paradox’ means something that seems wrong at first sight,
but isn’t. These so-called paradoxes are thought experiments where we
arrive at conclusions which are probably unexpected, but which are simply
correct deductions from ideas we at least partly understand. We look at
them in order to deepen our understanding of, and familiarity with, the
ideas of SR.
Figure 5.4 Travellers’ paths: (a) Odysseus can go from Troy to Ithaca by two
different routes; (b) Penelope travels outward at speed 𝑣 = 0.5 to a distance
of 25 ly, and then returns, arriving back at 𝑡 = 100 yr.
less, according to Eq. (3.3) above: for 𝑣 = 0.5, we have 𝛾 = 1.15, and only 87
years will pass for Penelope, who will consequently be substantially younger
than Odysseus when she returns to Earth.
That seems a little odd, but we’re used to peculiarity in SR, now. However,
some bright spark then points out that, relative to Penelope, it is Odysseus
who has been moving, so shouldn’t the whole situation be symmetrical, just
as with the trains in Section 3.1 above, forcing us to conclude that Odysseus
will be younger than Penelope? This is nonsensical – although two trains
can be mutually measured each to be shorter than the other, two clocks
(Penelope and Odysseus) can’t logically each be showing less time than the
other.3
The paradox is dissolved as soon as we point out that Penelope cannot
conclude that she is not moving (and thus that Odysseus is moving), since of
the two only Penelope experiences the change of velocity at the distant star,
with an absolutely observable acceleration – SR discusses the relationships
between inertial frames, and Penelope does not (indeed cannot) remain in a
single inertial frame for the entire round trip. This becomes clearer when
we consider the Minkowski diagram.
In Figure 5.4(b), we see the path which Penelope takes through spacetime,
on the way to the star and back. It appears that Penelope’s route is the longer,
3 The two situations are not as directly comparable as may at first appear. A measurement of
distance necessarily involves events separated in space, so we must deal with questions of
simultaneity. In contrast, observations of clocks represent a sequence of observations of a
single object as it moves through time and space.
parallel to the 𝑥′ -axis. The dotted lines are the paths of light flashes moving in
the positive and negative 𝑥 direction.
but remember that the plane of the Minkowski diagram is not a euclidean
surface, so that our intuitions about lengths and angles are not reliable. It is
possible to show, in fact, that a straight line is the longest distance between
two points in Minkowski space (see Exercise 5.8), so that Odysseus, taking
the straight route, has travelled a greater distance through spacetime than
has Penelope. Since, for a pair of timelike-separated events, the distance
through spacetime, $\sqrt{s^2}$, is the same as the proper time 𝜏 between the events
(Section 5.4), we can say that the proper time along Odysseus’s path is greater
than the proper time along Penelope’s, which is to say that a clock carried
with him will show a greater time elapsed than one carried by Penelope,
which is to say that Odysseus is older than Penelope when they re-meet.
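The comparison of the two paths can be made quantitative by adding up the proper time along each straight segment. The sketch below assumes the values suggested by Figure 5.4(b): a round trip lasting 100 yr in Odysseus’s frame, at 𝑣 = 0.5, which puts the turnaround at 𝑡 = 50 yr and 𝑥 = 25 ly. The function name `proper_time` is hypothetical.

```python
import math

def proper_time(events):
    """Total proper time along a piecewise-straight timelike path.

    events is a list of (t, x) pairs in one inertial frame, with c = 1;
    each segment contributes sqrt(dt^2 - dx^2).
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        s2 = (t1 - t0) ** 2 - (x1 - x0) ** 2
        if s2 < 0:
            raise ValueError("segment is spacelike: no clock can follow it")
        total += math.sqrt(s2)
    return total

odysseus = [(0.0, 0.0), (100.0, 0.0)]                 # stays at x = 0
penelope = [(0.0, 0.0), (50.0, 25.0), (100.0, 0.0)]   # out at v = 0.5, then back

print("Odysseus:", proper_time(odysseus), "yr")
print("Penelope:", proper_time(penelope), "yr")
```

The straight path has the larger proper time, in line with the statement that a straight line is the longest distance between two timelike-separated events.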
It is at this point that I, for one, get the most immediate benefit from
the analogy of the clock as a spacetime version of the taffrail log. When
Odysseus sailed, in a boat, from Troy to Ithaca via Circe, the taffrail log
showed the amount of Mediterranean he moved through; it would have
turned fewer times if he’d gone home by the direct route. When Odysseus
and Penelope move from take-off to event ○ 4 , their clocks show how much
she is at rest on the return journey, moving along the 𝑡 ′′ -axis; note that
frame 𝑆 ′′ is unusually not in standard configuration with respect to the
other two frames). This diagram makes it very clear that Penelope changes
inertial frame at the turnaround event, ○1. We also see that, just before the
turnaround, Penelope sees4 event ○2, which is at Odysseus’s location, 𝑥 = 0,
with event ○1; in this frame, event ○2 happened in the past, with respect
to ○1. If Penelope has skipped her relativity lectures, and is not aware of the
significance of the change of frame, she will simply ‘miss’ the time interval
between events ○ 2 and ○ 3 . During the two ‘cruise’ phases, before and after
the turnaround, the two observers’ clocks are indeed symmetrically slower
than each other, but the frame-independent difference between the two
observers’ elapsed time is attributable to the gap between ○ 2 and ○ 3 . This
gap, one might say, is where Odysseus’s extra ageing comes from.
There is an alluring blind alley here, prompted by the presence of the
acceleration at the turning-point, even prompting some folk to insist that
the ‘Twins Paradox’ needs GR to resolve it. It does not: there is no need to
talk about any actual acceleration: rather than actually turning round at
the star, Penelope could simply set the clock of another traveller she meets
there, who is already travelling at the right speed in the return direction, or
hand over her log-book to them. Our conclusion would then be to do with
the total elapsed time on the two legs of the journey, as calculated from the
log-books, rather than counting the grey hairs on one miraculously unaged
traveller; but this is exactly the same conclusion as above, merely in a less
vivid form.
As a final remark, a real Penelope would of course accelerate towards
the destination, and then slow to a halt before accelerating in the opposite
direction for the return journey. Gourgoulhon (2013, §2.6, where he refers
to the scenario by its alternative name of Langevin’s traveller) works through
this more realistic case in detail. This doesn’t change the logic of the argu-
ment, but firstly it illustrates the way that SR can comfortably deal with
acceleration (we touch on this also in Exercise 6.11); and secondly it may
reassure you that, when we introduce the simplification of an instantaneous
turnaround, above, we are not also introducing a subtle flaw in the argu-
ment. [Exercise 5.8]
4 More precisely, and to avoid the word ‘sees’, Penelope can arrange a network of friends at
rest in frame 𝑆′, one of whom ends up local to event ○2, and logs it as happening at time $t'_2$,
the time of event ○2 in frame 𝑆′, which is numerically equal to $t'_1$.
the barn and the back of the pole are coincident (the place and time of the
slammed-shut door), and event ○ 2 is an event which happens at the place
and time where the back of the barn and the front of the pole are coincident.
We can see that events ○ 1 and ○ 2 are simultaneous in 𝑆 – they have the same
𝑡-coordinate – this is the situation where the pole is moving at a speed where
5 This is also known as the ‘ladder paradox’, and was first described, in a slightly different
form, by Rindler (1961). You can also think of it as the car-in-the-garage paradox, but do
wear a seat-belt.
Figure 5.6 The pole moving through the barn. Frame 𝑆 is the barn, frame 𝑆 ′
is attached to the pole. The worldlines FP, FB, BP, BB, are the front/back of
the pole/barn. For events, see the text.
it is length contracted just the right amount to fit exactly inside the barn, in
the sense of the previous paragraph. From this same diagram, you can see
that event ○ 4 , which is an event which happens at the front of the pole at
the same 𝑡 ′ as event ○ 1 , is beyond the back of the barn: at the instant of 𝑡 ′
when the back of the pole enters the barn in the farmer’s frame (at time
𝑡′ = 𝑡1′ = 𝑡4′ ), the front of the pole is well clear of the back of the barn, and
the event when the front of the pole hits the back of the barn (event ○ 2 ) is
simultaneous in 𝑆 ′ with event ○ 3 , which lies outside the front
of the barn – the back of the pole hasn’t entered the barn at this point. The
constant spatial distance between events ○ 1 and ○ 4 , or between events ○ 3
and ○ 2 , is the rest length of the pole, 20 m, and the distance between ○ 1 and ○ 2
is the rest length of the barn, 10 m. You can plainly see from this figure that
it is impossible to find a pair of events which have the same 𝑡′ -coordinate,
which are both between the FB and BB worldlines: the pole is never entirely
within the barn in the farmer’s frame.
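The numbers in this argument can be checked with a few lines of arithmetic. The following Python sketch assumes units in which 𝑐 = 1 and takes the two rest lengths from the text; the helper names `gamma` and `speed_for_gamma` are illustrative only.

```python
import math

def gamma(v):
    """Lorentz factor for speed v, in units where c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def speed_for_gamma(g):
    """Speed (c = 1) that gives Lorentz factor g."""
    return math.sqrt(1.0 - 1.0 / (g * g))

POLE_REST_LENGTH = 20.0  # metres, from the text
BARN_REST_LENGTH = 10.0  # metres, from the text

# The pole just fits in the barn's frame S when its contracted
# length equals the barn's rest length.
g = POLE_REST_LENGTH / BARN_REST_LENGTH
v = speed_for_gamma(g)

pole_in_barn_frame = POLE_REST_LENGTH / g   # pole length measured in S
barn_in_pole_frame = BARN_REST_LENGTH / g   # barn length measured in S'

print(f"gamma = {g}, v = {v:.4f} c")
print(f"pole length in S  = {pole_in_barn_frame} m")
print(f"barn length in S' = {barn_in_pole_frame} m")
```

With these values the barn is only 5 m long in the pole's frame, which is why no pair of events at the same 𝑡′ can both lie between the FB and BB worldlines.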
Figure 5.7 is the same set of events, but drawn in the farmer’s frame, 𝑆 ′ .
Here, the worldlines of the barn show it to be moving in the negative 𝑥 ′ -
direction.
We can look at the problem another way, by asking the question ‘sup-
posing that the back of the barn were made of super-strong concrete, how
would the trailing end of the pole know when to stop?’ We cannot assume
that the pole is completely rigid; thus when the front of the pole hits the back
wall of the barn, the shock wave – the information that this has happened –
takes a finite time to make it to the back of the pole, so that the back of the
Figure 5.7 The barn moving past the pole (this view is in the pole’s frame).
The events and worldlines are the same as in Figure 5.6.
pole keeps moving forwards into the barn even after the front of the pole has
halted at the back wall. The information about the front of the pole stopping
moves rearwards at the sound-speed in the pole: for the pole to be rigid
would require an infinite sound-speed. Very shortly after the barn door has
been slammed shut, the pole’s recoil will smash through the door (or more
likely be vaporised, but let’s not worry ourselves with physical practicalities
at this point).
What this example shows is that any conclusion which you correctly
reach in one frame must be reachable in any other frame, even though the
detailed mechanisms might be different.
Exercise 5.9, though it is framed in different language, works through
this problem in illuminating detail. [Exercise 5.9]
6 This paradox is discussed in Bell (1976), and has become known as ‘Bell’s spaceship paradox’,
although he quotes it as originating somewhat earlier (Dewan & Beran 1959, Dewan 1963).
Figure 5.8 Bell’s rockets: two spaceships moving along the 𝑥-axis at increas-
ingly relativistic speeds. The two rockets have worldlines 𝑟1 and 𝑟2 .
other, the string can no longer stretch from one rocket to the other, and
therefore must break at some point.
There is a contrary argument, however, which has it that, in the rockets’
frame, the two rockets are stationary, so the string between them is stationary,
so there’s no length contraction, so the string will still stretch between the
rockets, so it won’t break.
Which of these arguments holds up – does the string break or not? Since
the string does or does not break, one of these arguments is mistaken.
You may wish to think the arguments through before reading on. It’s
useful to consider how the two rockets’ accelerations will be observed by
the pilots of the ‘other’ rocket. When doing so, imagine the accelerations
happening as a sequence of instantaneous increments in speed (when do
these jumps happen?), rather than being continuous; also, drawing a well-
chosen Minkowski diagram makes the problem very simple.
In Figure 5.8 we can see the Minkowski diagram of the rockets’ motion,
plus the axes of a reference frame, 𝑆 ′ , which is instantaneously co-moving
with the rear rocket (that is, an inertial frame which, at a specific instant,
is briefly moving at the same velocity as the accelerating rocket, so that its
𝑡′ -axis is parallel to the 𝑟2 worldline at that point; 𝑆 and 𝑆 ′ are not in standard
configuration). The key to resolving the paradox is the insight that, once the
rockets are moving at speed, their clocks are no longer synchronised from
each other’s point of view.
Imagine one-year anniversary parties, events 𝑝1 and 𝑝2 , on board the
two spaceships, when a year of proper time has passed on board. Since the
rockets have identical acceleration schedules, these will be simultaneous in
the launch-pad frame (though not, of course, at 𝑡 = 1 yr).
Figure 5.9 Bell’s rockets: the same as Figure 5.8, but with the axes of the front
rocket’s frame drawn in.
In the frame of the rear rocket, however, events 𝑝1 and 𝑝2 are not simul-
taneous; instead it is the events 𝑝1 and 𝑝3 which are simultaneous there.
But event 𝑝3 is after the front rocket’s one-year party, and thus later in the
rockets’ acceleration schedule, so the rear rocket’s pilot (presuming they’ve
forgotten their relativity lectures) will conclude that the front pilot is
‘cheating’ on the acceleration schedule, both by organising their party ahead
of time (𝑝2 ) and by applying more thrust than agreed, with the result that the
front rocket pulls away, causing the string to break.
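The ordering of the two parties in the rear rocket's frame can be illustrated numerically. This is a minimal sketch, assuming 𝑐 = 1 and a frame 𝑆′ in standard configuration with the launch-pad frame (the text's 𝑆′ has a shifted origin, but only differences of 𝑡′ matter here); the speed and the event coordinates are made-up values, not taken from the text.

```python
import math

def lorentz(t, x, v):
    """Transform event (t, x) from S to a frame S' moving at speed v
    along +x (standard configuration, units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t - v * x), g * (x - v * t)

# Illustrative values: both parties happen at t = 2 in the launch-pad
# frame, with the front rocket 1 unit ahead of the rear rocket.
v = 0.6
p1 = (2.0, 0.0)   # rear rocket's party
p2 = (2.0, 1.0)   # front rocket's party

t1_prime, _ = lorentz(*p1, v)
t2_prime, _ = lorentz(*p2, v)

# Simultaneous in S, but not in S': the front party has the smaller t'.
print(t1_prime, t2_prime)
```

Because the front rocket's party comes earlier in 𝑆′, the event on the front rocket's worldline that is simultaneous with 𝑝1 in that frame must lie later along that worldline, as the text describes for 𝑝3.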
The piece of string, which has never attended a relativity lecture in its
short and sorry life, and which is obliged to have one end at 𝑝1 and the other
simultaneously (in its frame7 ) at 𝑝3 , simply detects that the front rocket is
further from the rear rocket than the string can cope with, and breaks.
If you sketch in the axes of a frame 𝑆 ′′ which is instantaneously co-moving
with the front rocket at 𝑝3 (Figure 5.9), then you will see that the event 𝑝3 is
simultaneous in frame 𝑆 ′′ with an event on 𝑟2 with a smaller 𝑡 than 𝑝1 , that
is, at an earlier point in the acceleration schedule: the front rocket’s pilot
(equally negligent of their relativity lectures) believes that the rear rocket
has fallen behind schedule, so it is lagging behind, causing the string to
break (as usual, different frames’ observers have different explanations for
what happens, but they must agree on the results).
The error in the contrary argument above is the phrase ‘in the rockets’
frame. . . ’ – there is no single ‘rockets’ frame’ once they’ve started accelerat-
ing. Here, as in Figure 5.5, the key part of the resolution is to identify the
7 There is no complication about ‘the string’s frame’: even a modest continuous acceleration
produces a relativistic speed before long – see Exercise 6.11 – so we can take the string to be
always in equilibrium. And yes; there is such a thing as ‘string theory’; no, it’s not this.
frames in which things are or are not simultaneous. In Figure 5.5 Penelope
‘misses’ a chunk of time when she switches from frame 𝑆 ′ to 𝑆 ′′ ; in Figure 5.8,
the relativity-naïve pilots accuse each other of bad navigation, because they
have not realised that both rockets have changed frame between the launch
and the one-year party. [Exercise 5.10]
8 This expression was first established by Heaviside in 1888 or 1889 (in different notation; see
Heaviside (1889)). Although the result looks reasonably simple, the calculation is famously
hard: there is a discussion of the various routes to the result in Jefimenko (1994), which
refers to the derivation as ‘one of the most complicated procedures in classical
electromagnetic theory’. The discussion in this section follows that in Bell (1976), which is
also the source for ‘Bell’s spaceship paradox’ of Section 5.7.3. Although this paper is now
most commonly cited (slightly erroneously) merely as the source of the paradox, Bell’s goal
in the paper was to claim that the argument in this section is a more intuitive way of arriving
at SR than the axiomatic approach. I think this is true only for those with a substantial
pre-existing familiarity with advanced EM theory, but that it is largely unintelligible for
those without this familiarity, as well as missing the fundamental insights mentioned at the
end of this section.
test particle in orbit around it, reduce to the same form as the (spherically
symmetrical) expressions for a stationary charge, and the particle again
orbits with period 𝑇; and further, that Maxwell’s equations in these changed
variables have the same form as Maxwell’s equations in the original variables
(compare Exercise 2.4).
Equations (5.16) are of course now familiar. These and similar expres-
sions were well known to physicists for at least a decade prior to Einstein’s
1905 paper. The LT is also sometimes referred to as the FitzGerald–Lorentz
transformation, since it appears to have been George Francis FitzGerald who
first suggested (1889) that ‘the length of material bodies changes, according
as they are moving through the ether or across it’, which was elaborated as
an idea by Hendrik Antoon Lorentz (1895, 1904). It may be J. J. Thomson
who first put the change of variables Eq. (5.16a) into print (Thomson 1889),
as a way of recovering the spherical symmetry of Maxwell’s equations, in the
case of a particle in motion in an electric field, but it does seem that it was
Lorentz who came closest to writing down what it now seems fair to call
the Lorentz transformation. The papers I have mentioned here form part of
a network of dauntingly complicated attempts to discuss the consequences
of Maxwell’s equations in a frame moving with respect to the aether and –
although this is certainly not how these authors would have described their
work – to find a transformation which left form-invariant the equations
of electromagnetism. It’s easy to think, now, of SR as part of mechanics
(think of all these trains moving about at high speed and, in chapters to
come, particles colliding), and to be slightly puzzled at why it would occur
to Einstein to think of specifically light as having some special status. But
Einstein’s 1905 paper is, even looking only at its title, located within the
study of electromagnetism; and almost half of it, subtitled ‘Electrodynamical
part’ and beginning with ‘Transformation of the Maxwell–Hertz equations
for empty space’, is manifestly connected to the same network of concerns
as above. Despite thus advertising its connections with previous work, the
1905 paper is remarkable for the completeness and concision of its reboot of
physics as the study of symmetry in nature, which Minkowski could be said
to declare as complete in 1908.
Recall that, at this point in this section, we have not yet talked about
Special Relativity; so far, this is all (advanced) classical electromagnetism.
If the ‘moving charge’ described above is the nucleus of an atom, and the
orbiting charges are the atomic electrons, then the above analysis is telling
us that those atoms will be ‘squashed’ along the direction of motion by a
factor of 𝛾, and thus that a rod which is, for example, Avogadro’s number
of atoms long will be contracted in length by the same factor. If the ‘rod’
is instead your brain, in motion while sitting in a travelling train, then not
only will it be contracted along the direction of your motion, but all of
the electrons around all of the atoms in it will be orbiting with a period
which is longer than the period at rest, by another factor of 𝛾. The result,
presumably, would be that you would think slower by a factor of 𝛾, and
the watch in your hand – with its components also made of atoms – would
tick slower by a factor of 𝛾, ticking out 𝑡′ rather than 𝑡. Thus all of your
observations within the carriage would be in terms of (𝑡 ′ , 𝑥 ′ ) rather than the
(𝑡, 𝑥) appropriate to measurements on the stationary platform. But that in
turn means that all of the electromagnetic measurements you make, in the
carriage, would be indistinguishable from the corresponding measurements
made on the platform, even though they would be measured to go more
slowly by observers standing on that platform. It would be impossible,
based on experiments and observations made within the carriage, for you to
discover whether you were in motion.
It is possible to imagine, at this point, a counterfactual history of rela-
tivity without Einstein, in which all of the equations of SR are obtained as
consequences of Eq. (5.16), and we discover, as a consequence of these, the
invariance of the interval $s^2$. The measurable results would be largely the
same as what we have now, but the route and the foundations would be very
different.
We might end up concluding that an absolute rest frame was a physi-
cally real thing, even though, by a peculiarity of Maxwell’s equations, it was
impossible to detect (this appears to have been Lorentz’s position, just as,
looking further back, it may have been Newton’s position even though his
theory, similarly, contained nothing which would allow the rest frame to
be identified). If you believe in atoms, and in particular atoms composed
of electrons orbiting a nucleus (a belief which was not universal in the first
decades of the twentieth century), then this feature of Maxwell’s equations
transfers itself to material bodies, and thus to the rest of physics. The ar-
gument above, about the atoms in your brain, makes it plausible that you
would think slower whilst moving, but it doesn’t prove it. All together, this
argument from Maxwell’s equations explains why we don’t see an aether
and why ‘moving clocks run slow’, but leaves a large number of questions
still open, and in addition doesn’t address the question of why it should
be that Maxwell’s equations have this odd Lorentz-group symmetry, rather
than anything more intuitive.
Einstein’s axiomatic starting point, in contrast, focuses on the underlying
geometry of, and properties of, spacetime, rather than one particular physical
theory, and it therefore immediately has universal applicability.
Exercises
Exercise 5.1 (§5.1)
Look back at Figure 3.3 on p. 34, and consider the following two frames.
Frame 𝑆 is attached to the right-moving carriage and has its spatial origin 𝑥 =
0 at the centre of the top carriage; frame 𝑆 ′ is attached to the left-moving
carriage and has its spatial origin 𝑥 ′ = 0 at the centre of that carriage. Sketch
the position of these carriages/frames at time 𝑡 = 0. Are these frames in
standard configuration? Can we use the Lorentz transformation to relate
the coordinates of these frames? [ 𝑢+ ]
9 It’s a fairly common observation that Einstein disliked the term, but it’s unexpectedly hard
to find a specific source for it. The nearest I have been able to find is a 1921 letter to
Eberhard Zschimmer (Einstein Papers, volume 12, document 250,
https://einsteinpapers.press.princeton.edu/vol12-doc/372), in which he says that the name
‘relativity theory’ can lead to philosophical misunderstandings, and agrees that ‘invariance
theory’ (Invariantz-Theorie) would better describe the method; he acknowledges, however,
that it was even then too late to change the term.
What is the form of the LT in this slow-speed limit? Do you recognise this?
From Eq. (5.11), how do velocities add in this limit, where either $v_1$ or $v_2$
is small compared with 𝑐 = 1? And what happens if one of the velocities is
already the speed of light?
\[
\frac{\gamma(v)}{\gamma(v_1)\,\gamma(v_2)} = 1 + v_1 v_2 .
\]
an event ○3 at the point where the front clock emerges from the tunnel.
namely a direct route, and one via a second event ○ 2 . Take the interval
invariant intervals between these three events, work out the total invariant
interval along the two paths, and thus conclude that in Minkowski space a
straight line is the longest timelike interval between two points.
It is reasonable to assume that the total interval along a path is the sum
of the intervals along its segments. [ 𝑑+ ]
frame;
• ○ 4 , the second car reaching the checkpoint; and
• ○ 5 , the first car reaching the policeman.
[It will probably help if you choose the frames such that event ○ 1 has coor-
of 𝑣, and calculate numerical values for this in the cases (i) 𝑣 = 1∕2, (ii) 𝑣 =
3∕5 and (iii) 𝑣 = 4∕5. In each of the three cases, state, with an explanation,
whether it is possible for the traffic policeman to signal to the checkpoint to
6.1 Three-Vectors
You are familiar with 3-vectors – the vectors of ordinary three-dimensional
euclidean space. To an extent, 3-vectors are merely an ordered triple of
numbers, but they are interesting to us as physicists because they represent a
more fundamental geometrical object: the three numbers are not just picked
at random, but are the vector’s components – the projections of the vector
onto three orthogonal axes (that the axes are orthogonal is not essential to
the definition of a vector, but it is almost always simpler than the alternative).
That is, the components of a vector are functions of both the vector and our
choice of axes, and if we change the axes, then the components will change
in a systematic way.
For example, consider a prototype displacement vector (Δ𝑥, Δ𝑦, Δ𝑧).
These are the components of a vector with respect to the usual axes 𝐞𝑥 ,
𝐞𝑦 and 𝐞𝑧 . If we rotate these axes, say by an angle 𝜃 about the 𝑧-axis, to
obtain axes 𝐞′𝑥 , 𝐞′𝑦 and 𝐞′𝑧 , we obtain a new set of coordinates (Δ𝑥 ′ , Δ𝑦 ′ , Δ𝑧′ ),
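related to the original coordinates by
\[
\begin{pmatrix} \Delta x' \\ \Delta y' \\ \Delta z' \end{pmatrix}
=
\begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \Delta x \\ \Delta y \\ \Delta z \end{pmatrix} .
\]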
Ideally, you should think of the vectors here as being defined as a length
plus direction, defined at a point, rather than an arrow spread out over
a finite amount of space (you might think of this as some sort of infinitesimal
displacement vector). If you think of an electric or magnetic field, which has a
size and a direction at every point, you’ll have a very good picture of a vector in
the sense we’re using it here.
There is a swift review of linear algebra in Section C.3; make sure you
are comfortable with the terms and definitions mentioned there.
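As a concrete illustration of components changing ‘in a systematic way’, here is a minimal Python sketch of a rotation of the axes about the 𝑧-axis. It assumes the usual convention in which the axes are rotated anticlockwise by 𝜃 and the vector itself is left alone; the function names and the sample displacement are illustrative only.

```python
import math

def rotate_axes(vec, theta):
    """Components of the 3-vector `vec` relative to axes rotated by
    angle theta (radians) about the z-axis."""
    x, y, z = vec
    c, s = math.cos(theta), math.sin(theta)
    return (c * x + s * y, -s * x + c * y, z)

def length_squared(vec):
    """Euclidean length-squared of a 3-vector."""
    return sum(comp * comp for comp in vec)

d = (3.0, 4.0, 0.0)                      # illustrative displacement
d_rot = rotate_axes(d, math.radians(30))

# The components differ, but the length-squared is the same in both sets of axes.
print(d_rot)
print(length_squared(d), length_squared(d_rot))
```

The components depend on the choice of axes, while the length of the vector does not; the same pattern reappears for 4-vectors, with the invariant interval in place of the euclidean length.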
6.2 Four-Vectors
As we saw in Section 5.3, we can regard the events of SR taking place in a
4-dimensional space termed spacetime. Here, the prototype displacement
4-vector is (Δ𝑡, Δ𝑥, Δ𝑦, Δ𝑧), relative to the space axes and wristwatch of a
specific observer, and the transformation which takes one 4-vector into
another is the familiar LT of Eq. (5.6), or
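\[
\begin{pmatrix} \Delta t' \\ \Delta x' \\ \Delta y' \\ \Delta z' \end{pmatrix}
=
\begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \Delta t \\ \Delta x \\ \Delta y \\ \Delta z \end{pmatrix}
\tag{6.2a}
\]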
for the ‘forward transformation’ and
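\[
\begin{pmatrix} \Delta t \\ \Delta x \\ \Delta y \\ \Delta z \end{pmatrix}
=
\begin{pmatrix} \gamma & +\gamma v & 0 & 0 \\ +\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \Delta t' \\ \Delta x' \\ \Delta y' \\ \Delta z' \end{pmatrix}
\tag{6.2b}
\]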
for the inverse transformation (that the matrices are inverses of each other
can be verified by direct multiplication). These give the coordinates of
the same displacement as viewed by a second observer whose frame is in
standard configuration with respect to the first.
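The direct multiplication mentioned above is easy to do numerically. The following Python sketch keeps only the (𝑡, 𝑥) block of the matrices, since the 𝑦 and 𝑧 entries are trivial, and assumes 𝑐 = 1; the function names and the speed 0.8 are illustrative choices.

```python
import math

def boost(v):
    """2x2 (t, x) block of the Lorentz transformation matrix for speed v
    (c = 1); the y and z components are unchanged and omitted here."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v], [-g * v, g]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

v = 0.8                       # illustrative speed
forward = boost(v)            # the (t, x) block of Eq. (6.2a)
inverse = boost(-v)           # the (t, x) block of Eq. (6.2b): v -> -v
product = matmul(inverse, forward)

# Up to rounding error this is the 2x2 identity matrix.
print(product)
```

The diagonal entries of the product are 𝛾²(1 − 𝑣²) = 1 and the off-diagonal entries cancel, so the product is the identity for any speed below 1.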
This displacement 4-vector Δ𝐑 = (Δ𝑡, Δ𝑥, Δ𝑦, Δ𝑧) we can take as the
prototype 4-vector, and recognise as a 4-vector anything which transforms
in the same way under the coordinate transformation of Eq. (6.2a) (this may
seem a rather abstract way of defining vectors, but we will see a concrete
example in Section 6.5).
We write the components of a general vector as $\mathbf{A} = (A^0, A^1, A^2, A^3)$
(do note that the superscripts are indexes, not powers),¹ or collectively $A^\mu$,
where the greek index $\mu$ runs from 0 to 3. We will also occasionally use latin
superscripts like $i$ or $j$: these should be taken to run over the ‘space’ indexes,
from 1 to 3.
An arbitrary vector 𝐀 has components $A^\mu$ in a frame 𝑆, as illustrated in
Figure 6.2. The components $A^1$, $A^2$ and $A^3$ (that is, $A^i$) are just as you would
expect, namely the projections of the vector 𝐀 onto the 𝑥-, 𝑦- and 𝑧-axes.
The component $A^0$ is the projection of the vector onto the 𝑡-axis – it’s the
timelike component. In a frame 𝑆 ′ , the space and time axes will be different,
1 This is now, I think, the most common notation, but you can still find books which observe
the slightly more old-fashioned convention of labelling these as $(A^1, A^2, A^3, A^4)$.
and so the projections of the vector 𝐀 onto these axes will be different. Just
as in Figure 6.1, we find the projection of a point by moving it parallel to
one of the axes. For example, we find the projection onto the 𝑥-axis of the
end of the vector 𝐀, by moving that point parallel to the 𝑡-axis until it hits
the 𝑥-axis, and we find the projection of the same point onto the 𝑥 ′ -axis by
moving it parallel to the 𝑡 ′ -axis until it hits the 𝑥′ -axis.
Given the components of the vector in one frame, we want to be able
to work out the components of the same vector in another frame. We can
obtain this transformation by direct analogy with the transformation of the
displacement 4-vector, and take an arbitrary vector 𝐀 to transform in the
same way as the prototype 4-vector Δ𝐑.
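That is, given an arbitrary vector 𝐀, the transformation of its components
$A^\mu$ in 𝑆 into its components $A'^\mu$ in 𝑆 ′ is exactly as given in Eq. (6.2a):
\[
\begin{pmatrix} A'^0 \\ A'^1 \\ A'^2 \\ A'^3 \end{pmatrix}
=
\begin{pmatrix} \gamma & -\gamma v & 0 & 0 \\ -\gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} A^0 \\ A^1 \\ A^2 \\ A^3 \end{pmatrix} .
\tag{6.3a}
\]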
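Since this is a matrix equation, the inverse transformation is straightforward:
it is just the matrix inverse of this:
\[
\begin{pmatrix} A^0 \\ A^1 \\ A^2 \\ A^3 \end{pmatrix}
=
\begin{pmatrix} \gamma & \gamma v & 0 & 0 \\ \gamma v & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} A'^0 \\ A'^1 \\ A'^2 \\ A'^3 \end{pmatrix} .
\tag{6.3b}
\]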
You may also see this written out in matrix form as
Figure 6.3 The displacement vector of Figure 6.2 in the ‘other’ frame.
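\begin{align}
\mathbf{A}\cdot\mathbf{B} &= A^0 B^0 - A^1 B^1 - A^2 B^2 - A^3 B^3 \tag{6.6a}\\
&= \sum_{\mu,\nu} \eta_{\mu\nu} A^\mu B^\nu \tag{6.6b}\\
&= \mathbf{A}^T \eta\, \mathbf{B}, \tag{6.6c}
\end{align}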
(denoting the matrix with components which are all zero except for
(1, −1, −1, −1) on its diagonal), and in the last line I have written the equiva-
lent matrix expression involving column vectors 𝐀 and 𝐁. The scalar product
of a vector with itself, 𝐀 ⋅ 𝐀, is its magnitude (or length-squared, written
$|\mathbf{A}|^2$ or $A^2$), and from this definition we can see that the magnitude of the
displacement vector is $\Delta\mathbf{R}\cdot\Delta\mathbf{R} = \Delta t^2 - \Delta x^2 - \Delta y^2 - \Delta z^2 = s^2$.
This matrix 𝜂𝜇𝜈 is known as the metric, and through its use in defining
the magnitude of the vector, it defines what ‘distance’ means in Minkowski
space; that is, we could take the above definition of Δ𝐑 ⋅ Δ𝐑 as the definition
of the invariant interval $s^2$ (we will see a lot more of this when we look at
General Relativity). You can verify by explicit calculation that the matrix 𝜂
has the property that, when it is acted upon by the transformation matrix of
Eq. (6.3a),
\[
\Lambda^T \eta \Lambda = \eta \tag{6.8}
\]
(where $\Lambda^T$ is the transpose of the matrix Λ). That is, it transforms into itself,
telling us that the definition of distance in one coordinate system is the same
in every transformed coordinate system; this is another statement of the
frame-independence of the invariant interval, $s^2$.
The metric of three-dimensional euclidean space is diag(1, 1, 1), which
(for 3-vector 𝐚, and by analogy with Eq. (6.6)) gives us $\mathbf{a}\cdot\mathbf{a} = a_x^2 + a_y^2 + a_z^2$,
or Pythagoras’s theorem, as the definition of distance in euclidean space.
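The ‘explicit calculation’ behind Eq. (6.8) can be sketched in a few lines of Python. This is only a numerical check at one illustrative speed, assuming 𝑐 = 1 and the (+, −, −, −) metric of Eq. (6.6); the helper names and the sample displacement are not from the text.

```python
import math

# Minkowski metric with signature (+, -, -, -), as in Eq. (6.6).
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def boost(v):
    """Transformation matrix of Eq. (6.3a) for speed v along x (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0, 0],
            [-g * v, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    """Transpose of a matrix stored as nested lists."""
    return [list(row) for row in zip(*a)]

def apply(m, vec):
    """Matrix m acting on a column vector."""
    return [sum(m[i][k] * vec[k] for k in range(4)) for i in range(4)]

def interval(vec):
    """Magnitude vec . vec computed with the metric ETA."""
    return sum(ETA[i][j] * vec[i] * vec[j] for i in range(4) for j in range(4))

lam = boost(0.6)                                   # illustrative speed
transformed_metric = matmul(transpose(lam), matmul(ETA, lam))

dR = [5.0, 3.0, 1.0, -2.0]                         # illustrative displacement
dR_prime = apply(lam, dR)

print(transformed_metric)                # agrees with ETA up to rounding
print(interval(dR), interval(dR_prime))  # the same value in both frames
```

The second line of output is the frame-independence of the interval in miniature: the components of the displacement change under the transformation, but the magnitude computed with 𝜂 does not.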
Just as we discussed in Section 4.7 for intervals, 4-vectors can be timelike,
spacelike or null, depending on whether their magnitude is positive, negative
or zero; note that, since the magnitude is not positive-definite (i.e., it can be
negative), even a non-zero vector can be null. Just as in Section 6.1, we say
that two vectors are orthogonal if their scalar product vanishes. For example,
the two vectors 𝐴 = (1, 2, 0, 0) and 𝐵 = (2, 1, 0, 0) have scalar product 1 × 2 − 2 × 1 − 0 × 0 − 0 × 0 = 0, and so are orthogonal.
[Exercise 6.5]
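\[
\mathbf{A} = \left( \frac{\mathrm{d}^2 x^0}{\mathrm{d}\tau^2}, \frac{\mathrm{d}^2 x^1}{\mathrm{d}\tau^2}, \frac{\mathrm{d}^2 x^2}{\mathrm{d}\tau^2}, \frac{\mathrm{d}^2 x^3}{\mathrm{d}\tau^2} \right)
\tag{6.10}
\]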
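\[
\mathrm{d}\tau^2 = \mathrm{d}t^2 - |\mathrm{d}\mathbf{r}|^2 ,
\]
so that
\[
\left( \frac{\mathrm{d}\tau}{\mathrm{d}t} \right)^2 = \frac{(\mathrm{d}\tau)^2}{(\mathrm{d}t)^2} = 1 - \frac{|\mathrm{d}\mathbf{r}|^2}{(\mathrm{d}t)^2} = 1 - v^2 = \frac{1}{\gamma^2} .
\]
Taking the square root and inverting, we immediately find that
\[
\frac{\mathrm{d}t}{\mathrm{d}\tau} = \gamma , \tag{6.11}
\]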
2 By ‘scalar’ I merely mean ‘a number’, in a context where we want to distinguish the quantity
from a vector. Note that a number is necessarily frame-invariant.
and so
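\begin{align}
U^0 &= \frac{\mathrm{d}x^0}{\mathrm{d}\tau} = \frac{\mathrm{d}t}{\mathrm{d}\tau} = \gamma \tag{6.12a}\\
U^i &= \frac{\mathrm{d}x^i}{\mathrm{d}\tau} = \frac{\mathrm{d}t}{\mathrm{d}\tau} \frac{\mathrm{d}x^i}{\mathrm{d}t} = \gamma v^i . \tag{6.12b}
\end{align}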
You can view Eq. (6.11) as yet another manifestation of time dilation. Thus
we can write
\[
\mathbf{U} \equiv (U^0, U^1, U^2, U^3) = (\gamma, \gamma v^x, \gamma v^y, \gamma v^z) = \gamma (1, v^x, v^y, v^z) . \tag{6.13a}
\]
We will generally write this, below, as
𝐔 = 𝛾(1, 𝐯), (6.13b)
using 𝐯 to represent the three (space) components of the (spatial) velocity
vector, but as notation this is perhaps a little ‘slangy’.
Note that I will consistently use upper-case letters for 4-vectors (either
as a vector 𝐀, written with bold-face, or referring to their components $A^\mu$),
and lower-case letters for 3-vectors (either 𝐚 or $a^i$).
In a frame which is co-moving with a particle, the particle’s velocity is
𝐔 = (1, 0, 0, 0), so that, from Eq. (6.6), 𝐔 ⋅ 𝐔 = 1; since the scalar product is
frame-invariant, it must have this same value in all frames, so that, quite
generally, we have the relation
𝐔 ⋅ 𝐔 = 1. (6.14)
You can confirm that this is indeed true by applying Eq. (6.6) to Eq. (6.13).
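That confirmation can also be done numerically. The sketch below builds 𝐔 = 𝛾(1, 𝐯) for a few illustrative 3-velocities (chosen here, not taken from the text) and evaluates the scalar product of Eq. (6.6), with 𝑐 = 1 throughout.

```python
import math

def four_velocity(v3):
    """U = gamma * (1, v) for a 3-velocity v3 = (vx, vy, vz), with c = 1."""
    speed_sq = sum(c * c for c in v3)
    g = 1.0 / math.sqrt(1.0 - speed_sq)
    return (g, g * v3[0], g * v3[1], g * v3[2])

def minkowski_dot(a, b):
    """Scalar product of two 4-vectors, as in Eq. (6.6a)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

# Each line printed shows U . U equal to 1, up to rounding error.
for v3 in [(0.0, 0.0, 0.0), (0.6, 0.0, 0.0), (0.3, -0.4, 0.5)]:
    U = four_velocity(v3)
    print(v3, minkowski_dot(U, U))
```

Algebraically, 𝐔 ⋅ 𝐔 = 𝛾²(1 − 𝑣²), which is 1 whatever the speed and direction of 𝐯.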
Here, we defined the 4-velocity by differentiating the displacement 4-
vector, and deduced its value in a frame co-moving with a particle. We can
now turn this on its head, and define the 4-velocity as a vector which has
magnitude 1 and which points along the 𝑡-axis of a co-moving frame (this
is known as a ‘tangent vector’, and is effectively a vector ‘pointing along’
the worldline). We have thus defined the 4-velocity of a particle as the
vector which has components (1, 𝟎) in the particle’s rest frame. Note that
the magnitude of the vector is always the same; the particle’s speed relative
to a frame 𝑆 is indicated not by the ‘length’ of the velocity – its magnitude,
which is always 1 – but by the direction of the vector in Minkowski space,
in the frame 𝑆. We can then deduce the form in Eq. (6.13) as the Lorentz-
transformed version of (1, 𝟎). Compare Section 6.4 and Section 7.1.1.
Equations (6.12) can lead us to some intuition about what the velocity
vector is telling us. When we say that the velocity vector in the particle’s
rest frame is (1, 𝟎), we are saying that, for each unit proper time 𝜏, the particle
moves the same amount through coordinate time 𝑡, and not at all through space 𝑥;
the particle ‘moves into the future’ directly along the 𝑡-axis. When we are talking
instead about a particle which is moving with respect to some frame, the equation
$U^0 = \mathrm{d}t/\mathrm{d}\tau = \gamma$ tells us that the particle moves through a greater amount of this
frame’s coordinate time, 𝑡, per unit proper time (where, again, the ‘proper time’
is the time showing on a clock attached to the particle) – ‘time dilation’ yet again.
\[
\mathbf{A} = \gamma \left( \dot{\gamma},\ \dot{\gamma}\mathbf{v} + \gamma\mathbf{a} \right) . \tag{6.16}
\]
𝐔⋅𝐀=0
in this co-moving frame, and therefore in all frames. From the result in this
co-moving frame we can deduce the magnitude of the 4-acceleration
\[
\mathbf{A}\cdot\mathbf{A} = -a^2 ,
\]
𝐔 ⋅ 𝐕 = 𝛾(𝑣),
3 This is also referred to as the momentarily co-moving reference frame (MCRF) by some
authors.
contains information about the ‘speed’ of the particle, and which we can
define to be the velocity 4-vector. By the same argument that led up to
Eq. (6.14), we discover that this vector has magnitude 𝐔 ⋅ 𝐔 = 1, and that
its direction corresponds precisely to the 𝑡-axis of a frame co-moving with
the particle.
The point of this approach is that the idea of a path, and the idea of a
tangent vector to that path, are both geometrical ideas, existing at a level
beneath coordinates (which are more-or-less algebraic things), and so can
be defined and discussed without using coordinates, and so without having
any dependence on reference frames. They are therefore manifestly frame-
independent.
Is this not inevitable? Not quite: imagine if we had naïvely defined the
frequency 4-vector as a vector whose space components were 𝐧∕𝜆 and whose
time component was defined to be zero. On transformation by either of
the routes in the previous paragraph, the vector would acquire a non-zero
time component, so that the transformed vector would have a different form
from the untransformed one. The components of such a ‘vector’ would not
transform in the same way as Δ𝐑, so it would not be a proper 4-vector, so
that we would not be able to identify an underlying geometrical object of
which these were the components.
Can we use the frequency 4-vector for anything? Yes. Imagine that the
wavetrain is moving in the (𝑥′ , 𝑦 ′ ) plane, so that its direction
is 𝐧 = (cos 𝜃′ , sin 𝜃 ′ , 0) for some angle 𝜃 ′ in frame 𝑆 ′ (Figure 6.6). In that
case we have
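\[
\mathbf{L}' = \left[ f', \frac{\cos\theta'}{\lambda'}, \frac{\sin\theta'}{\lambda'}, 0 \right] \tag{6.20}
\]
\[
\mathbf{L} = \left[ \gamma \left( f' + v \frac{\cos\theta'}{\lambda'} \right), \gamma \left( \frac{\cos\theta'}{\lambda'} + v f' \right), \frac{\sin\theta'}{\lambda'}, 0 \right] . \tag{6.21}
\]
\[
\mathbf{L} = \left[ f, \frac{\cos\theta}{\lambda}, \frac{\sin\theta}{\lambda}, 0 \right] , \tag{6.22}
\]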
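and compare Eq. (6.21) and (6.22) component by component. After a bit of
rearrangement, we find
\begin{align}
f &= f' \gamma \left( 1 + \frac{v}{u'} \cos\theta' \right) \tag{6.23}\\
\cos\theta &= \frac{\cos\theta' + v u'}{1 + v u' \cos\theta'} . \tag{6.24}
\end{align}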
Or, for the case of light, where 𝑢′ = 1, we have the simpler and well-known
versions
\begin{align}
f &= f' \gamma (1 + v \cos\theta') \tag{6.25}\\
\cos\theta &= \frac{\cos\theta' + v}{1 + v \cos\theta'} . \tag{6.26}
\end{align}
Equation (6.23) is the relativistic Doppler effect, and describes the change
in frequency of a wave, as measured in a frame moving with respect to the
frame in which it was emitted. This applies for everything from water waves
(for which the effect would be exceedingly small) all the way up to light, for
which 𝑢 = 1.
Equation (6.24) shows that a wave travelling at an angle 𝜃 ′ in the moving
frame 𝑆 ′ is measured to be moving at a different angle 𝜃 in a frame 𝑆, with
respect to which 𝑆 ′ is moving with speed 𝑣. To calculate the change in the
speed of the wave, we could laboriously eliminate variables from Eq. (6.21)
and (6.22), but much more directly, we can make use of the fact that the
magnitudes of vectors are conserved under Lorentz transformation; thus
𝐋 ⋅ 𝐋 = 𝐋′ ⋅ 𝐋′ or, again using Eq. (6.5) and 𝑓 = 𝑢∕𝜆,
\[
f^2 \left( 1 - \frac{1}{u^2} \right) = f'^2 \left( 1 - \frac{1}{u'^2} \right) .
\]
We could rewrite this to obtain an expression for 𝑢′ , but simply from this
form we can see that if 𝑢 = 1, the fact that neither 𝑓 nor 𝑓 ′ is zero implies
that 𝑢′ = 1 also (as the second postulate says).
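For light, the Doppler and aberration formulae can be checked against a direct transformation of the frequency 4-vector. The Python sketch below assumes 𝑐 = 1, so that 𝜆 = 1∕𝑓, and uses the inverse transformation of Eq. (6.3b) to go from 𝑆′ to 𝑆; the function names and the sample values (𝑣 = 0.5, 𝜃′ = 60°, 𝑓′ = 1) are illustrative only.

```python
import math

def doppler_light(f_prime, theta_prime, v):
    """Frequency and direction in S of light with frequency f_prime and
    angle theta_prime in S', using Eqs. (6.25) and (6.26) with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    f = f_prime * g * (1.0 + v * math.cos(theta_prime))
    cos_theta = (math.cos(theta_prime) + v) / (1.0 + v * math.cos(theta_prime))
    return f, math.acos(cos_theta)

def frequency_vector(f, theta):
    """Frequency 4-vector [f, cos/lambda, sin/lambda, 0] for light (lambda = 1/f)."""
    return (f, f * math.cos(theta), f * math.sin(theta), 0.0)

def to_unprimed(vec, v):
    """Inverse transformation of Eq. (6.3b): components in S from those in S'."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return (g * (vec[0] + v * vec[1]), g * (vec[1] + v * vec[0]), vec[2], vec[3])

def minkowski_dot(a, b):
    """Scalar product of two 4-vectors, as in Eq. (6.6a)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

v = 0.5
f_prime, theta_prime = 1.0, math.radians(60)

f, theta = doppler_light(f_prime, theta_prime, v)
L_prime = frequency_vector(f_prime, theta_prime)
L_direct = frequency_vector(f, theta)     # built from Eqs. (6.25) and (6.26)
L_boosted = to_unprimed(L_prime, v)       # built by transforming L'

print(f, math.degrees(theta))
print(L_direct)
print(L_boosted)
print(minkowski_dot(L_prime, L_prime), minkowski_dot(L_boosted, L_boosted))
```

The two routes to 𝐋 agree component by component, and the magnitude of the frequency 4-vector is zero in both frames, which is the statement that light moves at 𝑢 = 𝑢′ = 1.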
Note that there are multiple possible ways to set up this problem, from
a notational point of view, so that if you are comparing the discussion
here with another text, make sure the various symbols mean what you expect.
[Exercises 6.7–6.17]
Exercises
Exercise 6.1 (§6.2)
Verify that the transformation matrices of Eq. (6.3a) and Eq. (6.3b) are in-
verses. You will probably find it convenient to omit the (trivial) 𝑦 and 𝑧
components, and instead write
\[
\Lambda = \gamma \begin{pmatrix} 1 & -v \\ -v & 1 \end{pmatrix} .
\]
\[
\eta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} .
\]
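\[
\begin{pmatrix} \Delta t' \\ \Delta x' \\ \Delta y' \\ \Delta z' \end{pmatrix}
=
\begin{pmatrix} a & b & 0 & 0 \\ c & d & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \Delta t \\ \Delta x \\ \Delta y \\ \Delta z \end{pmatrix}
\tag{i}
\]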
which is the simplest transformation which ‘mixes’ the 𝑥- and 𝑡-coordinates.
By requiring that $\Delta s'^2 = \Delta s^2 = \Delta t^2 - \Delta x^2$ after this transformation, find
constraints on the parameters 𝑎, 𝑏, 𝑐, 𝑑 (you can freely add the constraint 𝑏 =
𝑐; why?), and by setting 𝑏 proportional to 𝑎 deduce the matrix Eq. (6.2a).
Bonus: by instead parameterising 𝑎, 𝑏, 𝑐, 𝑑 with suitable hyperbolic
functions, recover Eq. (5.3). [ 𝑑+ ]
• (2, 2, 0, 0)
• (2, 0.2, 0, 0)
• (10, 5, 0, 0) [ 𝑑− ]
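\[ \frac{\mathrm{d}\mathbf{U}}{\mathrm{d}\tau} = \gamma \left( \frac{\mathrm{d}\gamma}{\mathrm{d}t}, \frac{\mathrm{d}}{\mathrm{d}t}(\gamma v), 0, 0 \right). \]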
\[ \frac{1}{v^2} = \frac{1}{\alpha^2 t^2} + 1. \tag{i} \]
(b) If, at time 𝑡, the rocket sets off a flashbulb which has frequency 𝑓 ′ in
its frame, use the Doppler formula to show that the light is observed, at the
space station, to have frequency
\[ f = \left( \sqrt{1 + \alpha^2 t^2} - \alpha t \right) f'. \]
\[ \left( x + \frac{1}{\alpha} \right)^2 - t^2 = \frac{1}{\alpha^2} \]
(you may or may not find the substitution 𝛼𝑡 = sinh 𝜃 useful or obvious
here). This is the equation for a hyperbola. By sketching this on a Minkowski
diagram and considering the point at which the asymptote intersects the
𝑡-axis, demonstrate that it is impossible for the space station to signal to the
retreating rocket after a time 1∕𝛼.
(d) How long does it take, from launch at 𝑣 = 0, for a rocket accelerating
at 𝑔 = 10 m s⁻² to be moving at 𝛾 = 2?
[This question is a little algebra-heavy, but instructive.] [ 𝑑+ 𝑢+ ]
This is not the same as Eq. (6.25), even for 𝜃′ = 0. Why not? Derive this
expression from Eq. (6.25).
Δ𝜃 = 𝜅 sin 𝜃
where Δ𝜃 is the aberration in the star’s altitude 𝜃, and 𝜅 = 20″.496 is the
constant of aberration.
Consider pointing a laser at a star which has altitude 𝜃 in a frame in
which the Earth is at rest, and altitude 𝜃 ′ in a frame which is moving at the
Earth’s orbital velocity of approximately 𝑣 = 29.8 km s⁻¹ (the two frames are
equivalent to making an observation where the Earth’s motion is orthogonal
to the line-of-sight to the star, and a frame, three months later, which is
moving towards the star; also, it’s easier to solve this variant of the problem
rather than the equivalent problem of the light coming from the star, since
it avoids lots of minus signs in the algebra). Show that the two altitudes are
related by:
\[ \cos\theta = \frac{\cos\theta' + v}{1 + v\cos\theta'}. \tag{i} \]
Given this, it is merely tedious to work out that sin 𝜃 = sin 𝜃′ ∕(𝛾(1 +
𝑣 cos 𝜃′ )).
Deduce that, for small 𝑣,
Δ𝜃 = 𝜃 ′ − 𝜃 ≈ 𝑣 sin 𝜃 ′ ,
126 7 Dynamics
Note that here, and throughout, the symbol 𝑚 denotes the mass as
measured in a particle’s rest frame. The reason I mention this is that
some treatments of relativity, particularly older ones, introduce the concept of
the ‘relativistic mass’ 𝑚(𝑣) = 𝛾(𝑣)𝑚0 , distinct from the ‘rest mass’, 𝑚0 . The only
(dubious) benefit of this is that it makes a factor of 𝛾 disappear from a few equa-
tions, making them look a little more like their newtonian counterparts; the cost
is that of introducing one more new concept to worry about, which doesn’t help
much in the long term, and which can obscure aspects of the energy-momentum
vector. Rindler (2006, §6.2), for example, introduces the relativistic mass, but
his subsequent discussion of relativistic force is sufficiently confusing, from our
notational point of view (it obliges one to introduce the notions of ‘longitudi-
nal’ and ‘transverse relativistic mass’!) that I feel it quite amply illustrates the
unhelpfulness of the whole idea.
is conserved:
𝐏1 + 𝐏 2 = 𝐏 3 + 𝐏 4 . (7.2a)
This is an equation between 4-vectors. Equating the time and space coordi-
nates separately, recalling Eq. (7.1), and writing 𝐩 ≡ 𝛾𝑚𝐯, we have
𝑚1 𝛾(𝑣1 ) + 𝑚2 𝛾(𝑣2 ) = 𝑚3 𝛾(𝑣3 ) + 𝑚4 𝛾(𝑣4 ) (7.2b)
𝐩1 + 𝐩2 = 𝐩3 + 𝐩4 . (7.2c)
Now recall that, as 𝑣 → 0, we have 𝛾(𝑣) → 1, so that the low-speed limit of
the spatial part of the vector 𝐏, Eq. (7.1), is just 𝑚𝐯, so that the spatial part
of the conservation equation, Eq. (7.2c), reduces to the statement that 𝑚𝐯
is conserved. Both of these prompt us to identify the spatial part of the
vector 𝐏 as the linear momentum, and to retrospectively justify both giving
the 4-vector 𝐏 the name 4-momentum and supposing that it is conserved.
What, then, of the time component of Eq. (7.1)? Let us (with, admittedly,
a little fore-knowledge) write this as 𝑃0 = 𝐸, so that
𝐸 = 𝛾𝑚. (7.3)
What is the low-speed limit of this? Taylor’s theorem tells us that
\[ \gamma = \left( 1 - v^2 \right)^{-1/2} = 1 + \frac{v^2}{2} + O(v^4), \tag{7.4} \]
so that, when 𝑣 is small, Eq. (7.3) becomes
\[ E = m + \frac{1}{2} m v^2 + O(v^4). \tag{7.5} \]
At this point we can (a) spot that 𝑚𝑣²∕2 is the expression for the kinetic
energy in newtonian mechanics, and (b) recall that Eq. (7.2b) tells us that
this quantity 𝐸 is conserved in collisions, so that we have persuasive support
for identifying the quantity 𝐸 in Eq. (7.3) as the relativistic energy of a particle
with mass 𝑚 and velocity 𝑣.
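To see how good the low-speed approximation is, one can compare the exact energy 𝛾𝑚 with the truncated series 𝑚 + 𝑚𝑣²∕2 for a few speeds. This is only an illustrative sketch in units where 𝑐 = 1; the function names are hypothetical.

```python
import math

def gamma(v):
    """Lorentz factor for speed v, in units where c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def energy_exact(m, v):
    """Relativistic energy E = gamma m."""
    return gamma(v) * m

def energy_low_speed(m, v):
    """Rest energy plus newtonian kinetic energy: m + m v^2 / 2."""
    return m + 0.5 * m * v * v

for v in (0.001, 0.01, 0.1, 0.5):
    exact = energy_exact(1.0, v)
    approx = energy_low_speed(1.0, v)
    # The discrepancy should shrink roughly like v^4 as v decreases.
    print(f"v = {v}: exact = {exact:.10f}, approx = {approx:.10f}, "
          f"difference = {exact - approx:.3e}")
```

The difference is dominated by the next term of the series, 3𝑚𝑣⁴∕8, which is why the newtonian kinetic energy is such a good approximation at everyday speeds.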
If we rewrite Eq. (7.3) in physical units, we find
𝐸 = 𝛾𝑚𝑐², (7.6)
the low-speed limit of which (remember 𝛾(0) = 1) recovers what has been
called the most famous equation of the twentieth century.
Note that the units in Eq. (7.3) are kg on both sides, but after rewriting in
units where 𝑐 ≠ 1, Eq. (7.6), the units on both sides are kg m² s⁻², or joules,
as expected.
The argument presented after Eq. (7.2a) has been concerned with giving
names to quantities, and, reassuringly for us, linking those newly named
In case you are worried that we are pulling some sort of fast one, that
we never had to do in newtonian mechanics, note that we do have to
do a similar thing in newtonian mechanics. There, we postulate Newton’s third
law (action equals reaction), and from this we can deduce the conservation of
momentum; in this case, we work in the opposite direction, so that we postulate
the conservation of 4-momentum, and would then be able to deduce a relativistic
analogue of Newton’s third law. In each case, we are adding the same amount
of physics to the mathematics. I don’t discuss relativistic force here, but merely
mention it in passing in Section 7.6; this question is discussed in a little more
detail in Rindler (2006, §6.10).
We can see from Eq. (7.5) that, even when a particle is stationary and 𝑣 =
0, the energy 𝐸 is non-zero. In other words, a particle of mass 𝑚 has an
energy 𝑚 associated with it simply by virtue of its mass.
The low-speed limit of Eq. (7.2b) simply expresses the conservation of
mass, but we see from Eq. (7.3) that it is actually expressing the conservation
of energy. In SR there is no fundamental distinction between mass and
energy – mass is, like kinetic, thermal and strain energy, merely another
form into which energy can be transmuted – albeit a particularly dense store
of energy, as can be seen by calculating the energy equivalent, in joules,
of a mass of 1 kg. It turns out from GR that it is not mass that gravitates,
but energy-momentum (most typically, however, in the particularly dense
form of mass), so that thermal and electromagnetic energy, for example,
and even the energy in the gravitational field itself, all gravitate (it is the
non-linearity implicit in the last remark that is part of the explanation for
the mathematical difficulty of GR). In each of these cases, the amount of
(thermal, electromagnetic, strain) energy would be a quantity with the units
of kilogrammes.
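The conversion between the two sets of units is just a factor of 𝑐². As a rough sketch (the function names are hypothetical, and the 1 kWh figure is merely an illustrative choice), the energy equivalent of 1 kg, and the mass equivalent of an everyday amount of energy, can be computed as follows.

```python
C = 299_792_458.0  # speed of light in m/s (exact, by definition of the metre)

def mass_to_joules(mass_kg):
    """Energy equivalent E = m c^2 of a mass, in joules."""
    return mass_kg * C * C

def joules_to_mass(energy_j):
    """Mass equivalent m = E / c^2 of an energy, in kilogrammes."""
    return energy_j / (C * C)

print(f"1 kg    <-> {mass_to_joules(1.0):.4e} J")
# One kilowatt-hour (3.6e6 J) of thermal energy, expressed as a mass.
print(f"1 kWh   <-> {joules_to_mass(3.6e6):.4e} kg")
```

The first number is about 9 × 10¹⁶ J, which is what is meant by calling mass a particularly dense store of energy.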
Although the quantities 𝐩 = 𝛾𝑚𝐯 and 𝐸 are frame-dependent, and thus
not physically meaningful by themselves, the quantity 𝐏 defined by Eq. (7.1)
has a physical significance.
Let us now consider the magnitude of the 4-momentum vector. Like
any such magnitude, it will be frame-invariant, and so will express something
fundamental about the vector, analogous to its length. Since this is
the momentum vector we are talking about, this magnitude will be some
important invariant of the motion, indicating something like the ‘quantity of
motion’. From the definition of the momentum, Eq. (7.1), and its magnitude,
Eq. (6.14), we have
𝐏 ⋅ 𝐏 = 𝑚² 𝐔 ⋅ 𝐔 = 𝑚², (7.7)
and we find that this important invariant is the mass of the moving particle.
Now using the definition of energy, Eq. (7.3), we can write 𝐏 = (𝐸, 𝐩),
and find
𝐏 ⋅ 𝐏 = 𝐸² − 𝐩 ⋅ 𝐩. (7.8)
Comparing this with Eq. (7.7), and writing 𝑝² ≡ 𝐩 ⋅ 𝐩, we find
𝑚² = 𝐸² − 𝑝². (7.9)
[Exercises 7.1–7.3]
We have not added anything in this section that we didn’t know from
the previous one. But simply by demanding that a quantity of interest is a
4-vector, as we did here and in Section 6.5, we have imposed a significant
structure on that quantity, and discovered that two quantities that newtonian
physics sees as related but distinct – frequency and wavelength, or energy
and momentum – are in SR different components of a single geometrical
quantity. We did the same thing in Section 6.3 when we remarked that
we could define the 4-velocity of a particle by simply declaring it to be the
vector which had components (1, 𝟎) in the particle’s rest frame, and letting
Eq. (6.3a) do the rest of the work. This shows the fruitfulness of an approach
which focuses on the geometry of the spacetime we are examining; such
an approach may feel a little abstract, but it pays off when we look at the
description of gravity in general relativity.
7.2 Photons
For a photon, the interval represented by d𝐑 ⋅ d𝐑 is always zero (d𝐑 ⋅ d𝐑 =
d𝑡² − d𝑥² − d𝑦² − d𝑧² = 0 for photons). But this means that the proper
time d𝜏² is also zero for photons. This means, in turn, that we cannot define
a 4-velocity vector for a photon by the same route that led us to Eq. (6.9),
and therefore cannot define a 4-momentum in the way we did in Eq. (7.1).
We can do so, however, by a different route. Recall that we defined (in
the paragraph below Eq. (6.14)) the 4-velocity as a vector pointing along the
worldline, which resulted in the 4-momentum being in the same direction.
From the discussion of the momentum of massive particles above, we see
that the 𝑃0 component is related to the energy, so we can use this to define
a 4-momentum for a massless particle, and again write
𝐏𝛾 = (𝐸, 𝐩𝛾 ).
Since the photon’s velocity 4-vector is null, the photon’s 4-momentum must
be also (since it is defined above to be pointing in the same direction). Thus
we must have 𝐏𝛾 ⋅ 𝐏𝛾 = 0, thus 𝐩𝛾 ⋅ 𝐩𝛾 = 𝐸², recovering the 𝑚 = 0 version
of Eq. (7.9),
Thinking back to the approach of Section 7.1.1, we realise that Eq. (7.11)
is the conclusion we must come to if we speculate that the zeroth com-
ponent of a photon’s 4-momentum is its energy, 𝐸 = ℎ𝑓, and then demand that
that 4-vector is null.
𝐏𝑖 = 𝛾𝑖 𝑚𝑖 (1, 𝐯𝑖 ), (7.12)
where the three particles have velocities 𝐯𝑖 , and 𝛾𝑖 ≡ 𝛾(𝑣𝑖 ). From momentum
conservation, we also know that
𝐏1 + 𝐏2 = 𝐏3 . (7.13)
7.3 Relativistic Collisions and the Centre-of-Momentum Frame 133
Table 7.1 The momenta of the particles in Figure 7.2, in the ‘lab frame’
Particle   𝐏𝑖 = (𝐸𝑖, 𝑝𝑖)                    𝑣𝑖       𝛾𝑖      𝑚𝑖
1          (17, 15)                        15∕17    17∕8    √(17² − 15²) = 8
2          (8, 0)                          0        1       8
3          (17 + 8, 15 + 0) = (25, 15)     3∕5      5∕4     √(25² − 15²) = 20
Of course, this also indicates that each component of the vectors is separately
conserved.
It’s useful to add some numbers to the discussion at this point.
Let’s simplify this, and imagine the collision taking place in one dimen-
sion, with an incoming particle moving along the 𝑥-axis to strike a stationary
second particle. The resulting particle will also move along the 𝑥-axis. In
this subsection we will, to avoid clutter, write only the 𝑥-component of the
spatial part of vectors, missing out the 𝑦 and 𝑧 components, which are zero
in this one-dimensional setup. Thus we’ll write (𝑡, 𝑥) rather than (𝑡, 𝑥, 𝑦, 𝑧)
or (𝑡, 𝐫), and (𝑃0 , 𝑃1 ) rather than (𝑃0 , 𝑃1 , 𝑃2 , 𝑃3 ).
Let the ‘incoming’ particles have masses 𝑚1 = 𝑚2 = 8 units; let the first
be travelling with speed 𝑣1 = 15∕17 along the 𝑥-axis in the ‘lab frame’ (so
that 𝛾1 = 𝛾(15∕17) = 17∕8); and let the second be stationary, 𝑣2 = 0 (so
𝛾2 = 1). As discussed above, the appropriate measure of the ‘length’ of
the 𝐏 vectors is the magnitude 𝑚² of Eq. (7.7): 𝑚² = 𝐏 ⋅ 𝐏. Recall that
𝐏𝑖 = (𝐸𝑖 , 𝑝𝑖 ) = (𝛾𝑖 𝑚𝑖 , 𝛾𝑖 𝑚𝑖 𝑣𝑖 ). Thus we can make a table of the various kine-
matical parameters in Table 7.1, where the first two rows are simply tran-
scribed from this paragraph.
To obtain the values in row 3, we add the 𝐏𝑖 of the first column to get
𝐏3 = (25, 15). We can get the speed by noticing that 𝑣𝑖 = 𝑃𝑖¹∕𝑃𝑖⁰, from
Eq. (7.12), and get the mass via the magnitude of the momentum 4-vector,
𝑚² = (𝑃⁰)² − (𝑃¹)² (that’s the most straightforward route; alternatively we
could note that if 𝑣3 = 3∕5, then 𝛾3 = 5∕4, so that if 𝑃3⁰ = 𝛾3 𝑚3 = 25, then
𝑚3 = 20).
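The arithmetic behind Table 7.1 can be reproduced in a few lines. This is a sketch only, working in one space dimension with 𝑐 = 1 and using exact integer and rational arithmetic; the helper name `summarise` is hypothetical.

```python
from fractions import Fraction
from math import isqrt

def summarise(E, p):
    """Return (v, gamma, m) for a particle with integer energy E and momentum p.

    Uses m^2 = E^2 - p^2, v = p/E and gamma = E/m. The integer square root is
    exact only because the numbers in this example are chosen to be perfect squares.
    """
    m = isqrt(E * E - p * p)
    return Fraction(p, E), Fraction(E, m), m

P1 = (17, 15)                        # incoming particle, moving at v = 15/17
P2 = (8, 0)                          # stationary target
P3 = (P1[0] + P2[0], P1[1] + P2[1])  # 4-momentum conservation, component by component

for label, P in (("1", P1), ("2", P2), ("3", P3)):
    v, g, m = summarise(*P)
    print(f"particle {label}: P = {P}, v = {v}, gamma = {g}, m = {m}")
```

Note that the masses of the incoming particles sum to 16, whereas the outgoing particle has mass 20: it is the components of 𝐏 that add, not the masses.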
Notice the following points.
The speed of light is one of the very few things which has the full-house of
being conserved, invariant, and constant. [Exercises 7.4 & 7.5]
𝐏3 = (𝛾1 𝑚1 + 𝛾2 𝑚2 , 𝛾1 𝑚1 𝑣1 + 𝛾2 𝑚2 𝑣2 ).
Now use the LT, Eq. (6.3), to transform this to a frame which is moving with
speed 𝑉 with respect to the lab frame. We therefore have
We can choose the speed 𝑉 to be such that this spatial momentum is zero,
𝑃′3¹ = 0, giving
\[ V = \frac{\gamma_1 v_1 + \gamma_2 v_2}{\gamma_1 + \gamma_2} = \frac{3}{5} \]
1. In this frame also, 𝐏′1 + 𝐏′2 = 𝐏′3 : the conservation of a vector quantity in
one frame means that it is conserved in all frames.
2. Also 𝑚1′ = √(𝐏′1 ⋅ 𝐏′1) = 8, 𝑚2′ = 8 and 𝑚3′ = 20: these are
frame-invariant quantities.
3. Both the 3-momenta and energy of the vectors in this frame are different
from the lab frame, even though their magnitudes are the same. We are
talking about the same 4-vector here, 𝐏𝑖 , but since we have decided to
change reference frame, we (unsurprisingly) change the vector’s frame-
dependent components.
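These three points can be checked numerically for the example with lab-frame momenta 𝐏1 = (17, 15) and 𝐏2 = (8, 0). The following is a minimal sketch, assuming the standard one-dimensional Lorentz transformation with 𝑐 = 1, and finding 𝑉 as the ratio of total momentum to total energy; the function names are hypothetical.

```python
import math

def boost(P, V):
    """Transform the (E, p) components into a frame moving at speed V (c = 1)."""
    E, p = P
    gamma = 1.0 / math.sqrt(1.0 - V * V)
    return gamma * (E - V * p), gamma * (p - V * E)

def mass(P):
    """Invariant mass from m^2 = E^2 - p^2."""
    E, p = P
    return math.sqrt(E * E - p * p)

P1, P2 = (17.0, 15.0), (8.0, 0.0)
P3 = (P1[0] + P2[0], P1[1] + P2[1])

# The frame in which the total spatial momentum vanishes moves at V = p_total / E_total.
V = P3[1] / P3[0]

for label, P in (("1", P1), ("2", P2), ("3", P3)):
    E_cm, p_cm = boost(P, V)
    print(f"particle {label}: P' = ({E_cm:.3f}, {p_cm:.3f}), "
          f"m = {mass((E_cm, p_cm)):.3f}")
```

In this frame the incoming particles have equal and opposite momenta and the outgoing particle is at rest, while each mass is the same as in the lab frame.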
This frame, in which the total spatial momentum is zero, so that the
incoming particles have equal and opposite spatial momenta, is known as
the centre-of-momentum (CM) frame, and the energy available for particle
production in this frame (𝑃⁰cm) is known as the centre-of-mass energy.
[Exercise 7.6]
1 To be very precise, movement of masses like this would in principle generate gravitational
waves, but that’s a complication we should not indulge in at this particular point.
7.5 More Unit Fun: an Aside on Electron-volts 137
Figure 7.3 Compton scattering: a photon being scattered from a charged particle.
[Figure labels: 𝑄1, 𝑄2, 𝐸, 𝑝, 𝜃, 𝜙]
Of course this does increase the particle’s energy and momentum sep-
arately. If the particle starts off at rest in some frame, then after the
force has been applied it will have 𝑃⁰ = 𝛾𝑚, and 𝐩 = 𝛾𝑚𝐯. However, the
time and space components of the momentum 𝐏 are related by Eq. (7.8),
𝐸² − 𝐩 ⋅ 𝐩 = 𝐏 ⋅ 𝐏 = 𝑚², where 𝑚 is the mass of the particle, which is unchanged by
the application of the force. Because of the minus sign in this expression, the
time and space components of the momentum vector can increase without
the magnitude of the vector increasing, and this is closely analogous to the
role of the minus sign in our analysis of the twins paradox, in Section 5.7.1:
there, the (travelling) twin who took the indirect route into the future trav-
elled through less spacetime than the stay-at-home who took the direct route;
here, the accelerated particle has a momentum with larger (frame-dependent)
components, which nonetheless has the same length as it started with.
In both cases, the unexpected conclusions follow ultimately from the
observation that Minkowski space is not euclidean space, and that in
Minkowski space, instead of Pythagoras’s theorem, we have Eq. (6.6).
𝑄1 + 𝑚 = 𝑄 2 + 𝐸 (7.14a)
𝑄1 = 𝑝 cos 𝜃 + 𝑄2 cos 𝜙 (7.14b)
0 = 𝑝 sin 𝜃 + 𝑄2 sin 𝜙. (7.14c)
Writing Eq. (7.14b) as 𝑄1 − 𝑄2 cos 𝜙 = 𝑝 cos 𝜃, and squaring and adding this
and Eq. (7.14c), we obtain
Exercises
Exercise 7.1 (§7.1)
Recall the definition of the momentum 4-vector:
𝐏 = 𝑚𝐔 = 𝑚𝛾(1, 𝐯).
What are the time- and 𝑥-components of the momentum of a particle of
mass 2 kg moving with speed 𝑣 = 3∕5 (so 𝛾(𝑣) = 5∕4) along the 𝑥-axis?
1. 𝑃0 = 5∕2, 𝑃1 = 3∕2
2. 𝑃0 = 5∕2, 𝑃1 = 6∕5
3. 𝑃0 = 1, 𝑃1 = 3∕5
4. 𝑃0 = 5∕4, 𝑃1 = 3∕2 [ 𝑑− ]
Give the answers in both natural units and physical units (SI) (remember
Exercise 4.1).
After you have read Section 7.5, additionally give the answers, including
the two masses, in physical units in terms of eV.
Appendix A
An Overview of General Relativity
Although, as I mentioned at the end of the last chapter, we have now covered
a substantial fraction of what there is to say about Special Relativity, this is
very far from all we can say about relativity in general. Special Relativity
discusses the special case of motion within, and transformation between,
inertial frames moving with constant velocity relative to each other. If we re-
lax this restriction, and ask about transformations between arbitrary frames
(i.e., between arbitrary coordinate systems), which may be accelerating with
respect to each other or moving under the influence of gravity, then we are
asking the questions of General Relativity – GR.
General Relativity is mathematically much more challenging than SR,
and so the account below does give in to ‘it can be shown that. . . ’ on more
than one occasion, and it makes reference to mathematical technology
such as differential geometry which we can talk about only schematically.
However, the way we have covered SR in this course, with its emphasis
on an axiomatic approach and on geometry, gives us a starting point from
which we are able to do more than merely sketch out the landscape here.
You can find alternative explanations of some of the fundamental ideas –
such as the Equivalence Principle, curvature, and cosmological expansion
– online and in popular books. You can also learn something from the
introductory chapters of GR textbooks, before they get properly started on
the tensor calculus. I mention one or two such books in Section 1.8. In
particular, I’ll draw attention to Schutz (2009) and Rindler (2006); and also
Misner, Thorne & Wheeler (1973), which is justly famous, though it has a
slightly idiosyncratic take on the subject. I emphasise, however, that these
are all graduate texts, so that the level of maths required will quickly outstrip
what I can prepare you for in this short chapter.
conceptual convenience – which is real only in the same way that centrifugal
force is real, as a way of explaining the physical results of a change of frame.
In this section, we work through a number of thought experiments, which
one by one introduce some of the key ideas, and some of the key conclusions,
of GR.
1 Newton does mention that ‘a bit of fine down and a piece of solid gold descend with equal
velocity’ in the General Scholium to the second edition (1713) of his Principia (see for
example https://isaac-newton.org/general-scholium/), but that text is better known for
Newton’s acknowledgement that, although he has described gravity, he cannot explain its
source: ‘Hitherto we have explaind the phaenomena of the heavens and of our sea, by the
power of Gravity, but have not yet assignd the cause of this power. [. . . But] I frame no
hypotheses.’ Part of the problem was that the expectations of the time presumed that all
forces were transmitted by contact, so that the idea of a pervasive gravitational field was
contrary to all intuition, and a much more radical conceptual innovation than we might
expect. Indeed it was possibly only Newton’s occult and alchemical interests – research into
which he pursued in parallel with his scientific work – that allowed him to think in such
terms.
2 By ‘point of view’ I mean ‘as measured with respect to a reference frame fixed to the box’, but
such circumlocution can distract from the point that this is an observation we’re talking
about – we can see this happening.
3 These experiments use a torsion balance to directly confirm that masses of different
materials experience the same gravitational force; for a review, see Adelberger et al. (2009).
A.1 Some Thought Experiments on Gravitation 149
away from any gravitational forces, and so we may identify that as a local
inertial frame (LIF; we will see the significance of the word ‘local’ below).
Another way of removing gravitational forces, less extreme than going into
deep space, is to put ourselves in free fall. Einstein asserted that these two
situations are indeed fully equivalent. He, firstly, defined an inertial frame
as one in free fall and, secondly, declared that just as all inertial frames in
SR are equivalent, all (redefined) inertial frames in GR are equivalent, too.
Imagine being in a box floating freely in space, and shining a torch hori-
zontally across it from one wall to the other (Figure A.3). Where will the
beam end up? Obviously, the beam will end up at a point on the wall directly
opposite the torch. There’s nothing exotic about this. The EP, however, tells
us that the same must happen for a box in free fall, Figure A.4, left. That is,
a person inside a falling lift would observe the torch beam to end up level
with the point at which it was emitted, in the (inertial) frame of the lift. This
is a straightforward and unsurprising use of the EP.
How would this appear to someone watching the lift fall? Since the light
takes a finite time to cross the lift cabin, the spot on the wall where it strikes
will have dropped some finite (though small) distance, and so will be lower
than the point of emission, in the frame of someone watching this from a
position of safety (Figure A.4, right). That is, this non-free-fall observer will
measure the light’s path as being curved in the gravitational field. The EP
forces us to conclude that even massless light is affected by gravity.
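It is instructive to estimate how small this deflection is. The sketch below assumes a lift cabin 2 m wide (a purely illustrative figure, not taken from the text), takes 𝑔 = 9.81 m s⁻², and uses the newtonian expression 𝑔𝑡²∕2 for the distance the cabin falls, from rest, while the light crosses it.

```python
C = 299_792_458.0   # speed of light, m/s
G_SURFACE = 9.81    # acceleration of free fall, m/s^2 (assumed value)

def light_drop(width_m, g=G_SURFACE):
    """Distance the falling cabin drops while light crosses it.

    Assumes the cabin starts from rest as the light is emitted, and uses the
    newtonian estimate g t^2 / 2 with crossing time t = width / c.
    """
    crossing_time = width_m / C
    return 0.5 * g * crossing_time ** 2

drop = light_drop(2.0)  # hypothetical 2 m wide lift cabin
print(f"crossing time: {2.0 / C:.3e} s, drop: {drop:.3e} m")
```

The result is of order 10⁻¹⁶ m, far smaller than an atom, which is why the bending of light in the Earth’s gravitational field is not part of everyday experience.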
Note that it is useful to be clear exactly where the EP enters this argu-
ment. It lets us go from a scenario we are confident we can fully analyse
– namely the box far away from gravitating matter – to a scenario we might be
more hesitant about – namely motion under the influence of gravity. We will use
a version of the EP again in Section A.3, to move from a context we understand –
physics in SR – to one we initially don’t – physics in GR.
5 This is also sometimes referred to as ‘gravitational Doppler shift’, but inaccurately, since it is
not a consequence of relative motion, and so has nothing to do with the Doppler shift you
are familiar with. Also, if the photon is descending, it is blue-shifted, so that ‘gravitational
frequency shift’ would be a better term. But ‘redshift’ is conventional.
6 Multiple research groups, including Pound’s in Harvard, USA, and John Paul Schiffer’s in
Harwell, UK, realised in 1959 that they could use the Mössbauer effect (discovered in 1958)
to make this measurement, and published data-less announcements of their plans to do so.
Schiffer’s group were the first to publish experimental results, in February 1960, followed
only a few weeks later by the Harvard group’s first publication; both groups announced a
detection of the redshift, within experimental error. The Harwell group’s results were,
however, arguably the result of a systematic chemical effect which supplied or amplified the
redshift result, and only Pound’s group seems to have gone on to refine the measurement in
a sequence of publications in the succeeding years. It is not unreasonable, therefore, that
changes of gamma rays falling a mere 22.5 m, in which the factor 𝑔𝑧∕𝑐² in
Eq. (A.3)⁷ is only 2.5 × 10⁻¹⁵.
Light, it seems, can tell us about the gravitational field it moves through.
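The size of the factor quoted above is easy to confirm. This sketch simply evaluates 𝑔𝑧∕𝑐² in physical units, assuming 𝑔 = 9.81 m s⁻² (the text does not state which value of 𝑔 was used); the function name is hypothetical.

```python
C = 299_792_458.0   # speed of light, m/s
G_SURFACE = 9.81    # acceleration of free fall, m/s^2 (assumed value)

def redshift_factor(height_m, g=G_SURFACE):
    """Dimensionless factor g z / c^2 for a height difference z."""
    return g * height_m / (C * C)

factor = redshift_factor(22.5)
print(f"g z / c^2 for z = 22.5 m: {factor:.3e}")
```

The value is about 2.5 × 10⁻¹⁵, so the experiment had to resolve a fractional frequency change of a few parts in 10¹⁵.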
A.2 Geometry
To get from here to GR we need to answer three questions: (i) what do
we mean by ‘the geometry of a space’? (ii) how do we describe geometry
mathematically? and (iii) how do we make the link from geometry to gravity?
Geometry is the mathematical discussion of shapes. We tend to think of
this as a fixed thing, which we learned about in school, and haven’t had to
think much more about since. Euclid famously established an axiomatic
basis for geometry, setting out a small number of axioms, or postulates, on
the basis of which a large number of geometrical theorems can be proved.
The space he described, with the fairly straightforward extension from the
plane to higher dimensions, is referred to as euclidean space. In this space,
the internal angles of a triangle add up to 180°, parallel lines never meet, the
circumference of a circle is 𝜋 times its diameter, and Pythagoras’s theorem
gives the distance between points.
That was thought to be the end of the matter for a very long time.
One of Euclid’s postulates is that there exist parallel lines – lines which
never cross each other. This postulate can be expressed in a variety of ways,
but it also seems either to be self-evident, or to be such a trivial addition to
the other postulates that it must surely be provable from them; and for more
than 2000 years after Euclid, mathematicians tried to do exactly that. It was
only in the nineteenth century that the mathematical community discovered
that one can consistently discuss geometry without the parallel postulate, or
with alternative postulates such as the demand that an arbitrary line in the
space has zero parallel lines, or that it has more than one. These alternative
postulates lead to different conclusions about, for example, the sum of the
these are memorialised as the ‘Pound–Rebka’ measurements, but the episode usefully
illustrates the contingency of experimental races. For a history of the episode, including
references to the various original papers, see Hentschel (1996). The race, the
pre-announcements, and the science-by-press-release, are also discussed rather
unflatteringly in Reif (1961). Further, see Vessot et al. (1980) for a description of the 1976
Gravity Probe A experiment, which measured the effect between the ground and a
space-based instrument at an altitude of 10 000 km.
7 Where did the 𝑐² come from? Equation (A.3) is written in units where 𝑐 = 1, as usual; in
order to get a numerical value from ‘22.5 m’, we must examine the dimensions of the
expression to discover that the denominator, in physical units, must be 1 + 𝑔𝑧∕𝑐².
A.2.1 Coordinates
Where are you? What time is it?
At this stage in your physics education, you are comfortable with the
idea of describing the positions of things, and their movement, by giving
them coordinates, and talking about how those coordinates change. Those
coordinates are a mathematical fiction, of course, but it’s useful to stress
how completely arbitrary those coordinates are.
In addressing one or other physics problem, we are taught to make things
easy for ourselves by choosing cartesian coordinates, or polar coordinates, or
one of the various other more exotic alternatives. Our choice here will have
an effect on, for example, how we write down velocities, and the form of the
gradient operator that we must use. Two key points are (i) the coordinates
we use are a choice, and (ii) they are not physically significant.
The way we describe geometry mathematically (or at least the way which
is of most relevance to GR) is using differential geometry. Unfortunately, this
subject is too mathematically challenging to go into at the level of this text.
What I can do instead is to give you a sense of the key ideas, in just enough
depth that we can understand how they are used to make the link to gravity.
A coordinate system is a way of systematically drawing grid-lines through-
out a region of spacetime, which is curved or flat, and thus attaching coordi-
nates to each event. These coordinates don’t have any intrinsic meaning –
they’re just labels attached to points in space and time, and the difference in
coordinates between two events doesn’t tell you anything immediate about
how far apart they are. Also, we discover that, in our universe, we will al-
ways need three space and one time dimension to fully and non-redundantly
locate a point.
This isn’t too hard to think about for space dimensions: we can of course
imagine grid-lines drawn on the ground, and perhaps imagine some ar-
rangement with balloons or helicopter drones, with numeric coordinates
painted on the sides of them, and perhaps observers sitting in some of them.
It’s a little harder to think about for time. For Newton, this was easy: the
newtonian picture is of a sequence of three-dimensional ‘snapshots’, each of
which has a time-value attached to it. There’s some clear arbitrariness to
the timescale, in terms of how fast the clock ticks, or where the zero of time
is, but there’s equally well a clear association between each snapshot and
‘the time’. A brief reflection on Chapter 4 will tell us that things aren’t going
to be so simple in a spacetime built on Minkowski space.
Probably the most straightforward way to define a time coordinate is to
decide that the time coordinate of an event is the number showing on the
watch of an observer located there. Or, more generally, that each one of our
balloons or drones has a clock attached to it, which ticks in some systematic
way, and ‘the time’ is the number on that clock face. It might be that these
clocks run differently on different drones, so that it’s not a given that they
all tick ‘together’.
A.2.4 Geodesics
When we draw a triangle in euclidean space, we obviously connect the three
vertices with straight lines. The lines joining the vertices in the triangles
of Section A.2.3 are not straight lines in 3-d space, but they are the nearest
thing to a ‘straight line’ on the 2-d surface of the sphere (they are ‘great
circles’). This is telling us that we might have to generalise our notion of
‘straight line’ when talking about non-euclidean geometries. The result of
this generalisation is the geodesic.
There are a couple of ways we can think of geodesics. One is to see them
as the shortest distance between two points in the space. This works for
euclidean space, and for the great circles on the surface of a sphere, but if
you think back to Section 5.7.1, you might recall that I mentioned in passing
that the straight line followed by the non-travelling twin was in fact the
A.3 Gravity
We now have the set of ideas, and the language, in which we can describe a
relativistic theory of gravity – Einstein’s General Relativity.
This is why we have been so concerned with geometry all the way through
this text.
Imagine doing some physics experiments in a free-falling frame, within
a spacetime which is not necessarily the simple one we have come to un-
derstand as Minkowski space. This free-falling frame can be regarded as an
inertial frame, within which SR tells us the rules. From Chapter 7 we know
that 𝐅 = d𝐏∕d𝜏 in this inertial frame. What this final version of the EP is
telling us is that, even if the spacetime we are falling within is significantly
more complicated (i.e., curved) than Minkowski space, this curvature does
not change the physical laws; there is no ‘curvature coupling’. We could
imagine the law 𝐅 = d𝐏∕d𝜏 acquiring some extra terms due to the curvature
– there’s no mathematical reason why it shouldn’t – but it doesn’t. That’s a
profound physical statement.
This tells us that we already understand the physics of how things move
in the context of GR – particles move along the 𝑡-axis of their local inertial
frame, and according to the force laws of Chapter 7. But at this point we
cannot translate that understanding into the other frames that we care about,
such as ones that are accelerating, or which are otherwise not free-falling
(for example because they are stationary on the surface of the gravitating
Earth). Since we saw above that our normal motion within an SR inertial
This discussion is about free-falling under gravity. How does this tell us
something new about our familiar gravitational experience of standing on
the ground? Whether we are standing on the ground or sitting in a chair, or
holding desperately onto a trapeze, what we are not doing is falling towards
the centre of the Earth. But if we were to compare ourselves to a ghostly
alternative self which is indeed falling, we would find the gap between us
increasing at a constant 9.81 m s⁻², and see ourselves being accelerated away
from that falling figure by the force transmitted through our feet or arms,
ultimately from the ground. That is, the ‘force of gravity’ that we feel is
merely the acceleration away from our otherwise free-fall trajectory.
the ‘straight line’ geodesic traced out by a particle in free fall is different
from the geodesic worldline that the particle would trace out if the mass
weren’t there. We interpret that deflection of the worldline as the result of
the ‘gravity’ of the central mass, so that we see the straight line in spacetime
as the movement in space and time of a ball thrown through the air.
Newton explains gravity by saying that a mass creates a field around it,
which results in a force on test particles, which is proportional to their mass
and directed towards the centre. Einstein explains the same behaviour by
saying that the effect of a mass is to change (the metric of) the spacetime
around it. Nearby free-falling test particles ‘know’ nothing of the central
mass, but simply move straight ahead along geodesics in their local inertial
frame (Figure A.9). The shape of the spacetime, however, has the effect of
causing the particle to move along a path which we describe as an ‘orbit’.
An orbit is simply a straight line in a curved space.
It is astonishing that Newton’s and Einstein’s models, which start from
such different places, with such different motivations, nonetheless produce
predictions for the behaviour of free-falling particles which so very precisely
match each other, and match reality.
This relationship allows us to complete the other half of the famous
slogan, originally framed by John Wheeler,
Spacetime tells matter how to move;
matter tells spacetime how to curve.
I can go into very little mathematical detail here, but I hope that I can
give you at least a flavour of how we would use the mathematical structures
of SR, in the generalisation to gravity.
In writing this down, we have also assumed that we are dealing with
‘slow’ movement, in the sense that d𝜎² ≪ d𝑡². That means that we have not
included a factor (1 + 𝐵)d𝜎², where the ‘small × small’ term 𝐵d𝜎² would
be promptly discarded.
We have decided that the clocks can have their rates adjusted, but to
what value? In Section 2.2.1 we saw how we can synchronise clocks in
flat Minkowski space. Imagine doing a similar thing, but in the context
of a spacetime with a non-flat metric. How do I, stationed at radius 𝑟ₘₑ,
instruct an observer at radius 𝑟 how to set their clock? Let’s decide that my
clock is the reference standard, and ask ‘what would we have to do to make
the coordinate time difference between events at radius 𝑟 the same as that
at 𝑟ₘₑ?’ We have to slow down the clock’s displayed time with respect to
its proper time, by a factor which depends on the gravitational potential, 𝜙,
at that point, as we saw in Eq. (A.5). Specifically, and arbitrarily taking
𝜙(𝐫ₘₑ) = 0, we can write
$$d\tau^2 = e^{2\phi(r)}\,dt^2, \tag{A.12}$$
to define the interval in coordinate time between two events, co-located at
radius 𝑟, which are separated by a proper time d𝜏. At this point we can
rewrite Eq. (A.11) to get
$$ds^2 = e^{2\phi}\,dt^2 - d\sigma^2. \tag{A.13}$$
The newtonian potential 𝜙 = −𝐺𝑀∕𝑟 has dimensions L² T⁻², so that in physical
units we would expect instead 𝜙∕𝑐² here: putting in standard physical-units
values⁸ for 𝐺𝑀⊙ and 𝑟⊕ gives |𝜙|∕𝑐² ≈ 10⁻⁸, so that 𝜙, in Eq. (A.13),
is indeed small, this leading term is very close to 1, and we can write this
alternatively as
$$ds^2 = (1 + 2\phi)\,dt^2 - d\sigma^2, \tag{A.14}$$
omitting terms in 𝜙2 . This is a weak-field metric: a metric for the spacetime
around a ‘small’ central mass. For small 𝜙, this is close to the metric of
Eq. (4.4) or Eq. (6.7), but it is not quite the same. Our argument here has
told us that the presence of the mass has changed the definition of distance
in the spacetime around it.
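As a rough numerical check of the sizes involved, the following Python sketch evaluates 𝜙∕𝑐² for the Sun's potential at the radius of the Earth's orbit, and compares the leading coefficient of Eq. (A.13) with its weak-field form in Eq. (A.14). The numerical constants are rounded values assumed here for illustration; they are not quoted in the text.

```python
import math

# Assumed, rounded physical constants (SI units); not taken from the text.
GM_SUN = 1.327e20       # G times the solar mass, m^3 s^-2
R_ORBIT = 1.496e11      # radius of the Earth's orbit, m
C = 2.998e8             # speed of light, m s^-1

# Dimensionless newtonian potential phi/c^2 at the Earth's orbit.
phi = -GM_SUN / (R_ORBIT * C**2)

# Coefficient of dt^2 in Eq. (A.13), minus 1: e^{2 phi} - 1.
# math.expm1 avoids the rounding error of computing exp(x) - 1 directly.
exact_minus_one = math.expm1(2 * phi)

# The same quantity in the weak-field form of Eq. (A.14): (1 + 2 phi) - 1.
linear_minus_one = 2 * phi

# What the weak-field form discards: terms of order phi^2 and higher.
discarded = exact_minus_one - linear_minus_one

print(f"phi/c^2         = {phi:.3e}")
print(f"discarded terms = {discarded:.3e}")
```

The discarded terms come out many orders of magnitude smaller than 𝜙 itself, which is the sense in which omitting terms in 𝜙² is harmless here.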
How do particles move in this spacetime? We learned in Section A.3.1
that particles move along geodesics, and in Section A.2.4 that those geodesics
are the lines of extremal length.
If you have done some advanced classical mechanics (Hamill 2013) you
will have learned that it is possible to obtain the equations of motion of an
object in (for example) a gravitational field by extremising a quantity known
as the ‘lagrangian’, formed from the difference between the kinetic and
gravitational potential energy, or 𝐿 = 𝑚𝑣²∕2 − 𝑚𝜙. It is possible to show
that the demand that our geodesic extremises Eq. (A.14) produces the same
constraint as this lagrangian formalism. In other words, the solution to our
geometrical problem produces the same path 𝐱(𝑡) as we would otherwise
classically predict on the basis of motion under gravity.⁹
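A compressed sketch of why this works, in the units of Eq. (A.14) (so 𝑐 = 1) and keeping only leading-order terms in 𝜙 and 𝑣², might run as follows. It is an outline under those assumptions, and it skips the subtleties of the fuller treatment.

```latex
% Proper time along a worldline x(t) in the weak-field metric (A.14),
% writing v^2 = (d\sigma/dt)^2 and assuming \phi \ll 1 and v^2 \ll 1:
\begin{align*}
\tau &= \int \sqrt{(1 + 2\phi)\,dt^2 - d\sigma^2}
      = \int \sqrt{1 + 2\phi - v^2}\;dt \\
     &\approx \int \left(1 + \phi - \tfrac{1}{2}v^2\right) dt
      = \int dt \;-\; \frac{1}{m}\int \left(\tfrac{1}{2}mv^2 - m\phi\right) dt .
\end{align*}
% The first integral is fixed by the end points, so extremising \tau is the
% same as extremising \int L\,dt, with L = mv^2/2 - m\phi.
```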
In other other words, we have rediscovered Newton’s theory of gravity,
as a ‘weak-field’ approximation to the behaviour of objects in free-falling
inertial frames, as constrained by the Equivalence Principle. This is the first
basic test of GR.
Notice that, in this argument, we have not used Einstein’s equation, or
any part of the argument of Section A.3.2 (though we briefly touched on the
physics of Section A.3.1); that is, to be precise, this isn’t really a weak-field
‘solution’. Instead, we brought gravity into the argument by using what is
essentially Newton’s gravity: the argument was built directly on the idea
of gravitational redshift (Section A.1.3), which in turn depends on the idea
of a mass losing energy 𝑚𝑔ℎ when it drops through a given distance. It is
because we have imported gravity by this route that this argument is limited
to low-speed (i.e., newtonian) motion within the field.
9 The argument here is elegant, but unfortunately very compressed. It is ultimately derived
from Rindler (2006, §§9.1–9.4), where you can find the steps I have skipped, and a number
of subtleties which I have elided.
A.4 Solutions of Einstein’s Equation 167
Minkowski space. Put mathematically, that means that the metric of this
spacetime will be approximately the metric of Eq. (4.4), plus some small
perturbation.
Skipping the details, we discover that our invariant interval in this space
– our metric – becomes
𝐟 = −𝑚∇𝜙.
That is, as we would hope, the small mass limit of Einstein’s theory of gravity
is Newton’s law of gravity – the law of gravity that we are familiar with. This
demonstrates explicitly that Newton’s theory of gravity is the non-relativistic
limit of Einstein’s.
10 The combination 𝐺𝑀⊙ is what governs the motions of objects in the solar system, and thus
planetary observations can determine the value of this to great accuracy, to one part in 10¹⁰.
The mass of the Sun in kilogrammes is obtained by dividing this number by 𝐺, but the
gravitational constant is known only to around one part in 10⁵, from terrestrial
measurements; so the mass of the Sun in kilogrammes has the same uncertainty. This
means that the mass of the Sun is more precisely known in metres than it is in kilogrammes.
See the IAU’s list of ‘current best estimates’ at https://iau-a3.gitlab.io/NSFA/NSFA_cbe.html.
For reference, the gravitational radius of the Earth is 𝑅⊕ = 8.86 × 10⁻³ m.
[Figure: spacetime diagram with axes 𝑟 and 𝑡, marking the radii 𝑟₁ and 𝑟₂ and the events ①, ②, ③ and ④.]
these events produces a flash of light which travels outwards along a null
geodesic to the events ③ and ④, which are also co-located, at radius 𝑟 = 𝑟₂.
so d𝑟 = 0, and the interval between them in the figure is the proper time
separating them (that is, the time showing on a clock which is present at
both events by virtue of staying still at the radius 𝑟₁), and this is obtainable
directly from the metric Eq. (A.16):
$$\begin{aligned}
\tau_{12}^2 &= (1 - R/r_1)\,\Delta t_{12}^2 \\
\tau_{34}^2 &= (1 - R/r_2)\,\Delta t_{34}^2.
\end{aligned} \tag{A.17}$$
Since the metric is not changing in time, the pair of events ② and ④ must
stand in the same relation to each other as the pair of events ① and ③,
and thus the time coordinate at ④ must bear the same relationship to that
But this means that 𝜏₁₂ is slightly less than 𝜏₃₄. That is, more time elapses
between events ③ and ④, on the watch of an observer at 𝑟 = 𝑟₂, than elapses
𝜏₁₂ is the period of an EM wave, then this implies that the wave will change
its frequency between 𝑟₁ and 𝑟₂. Rewriting Eq. (A.17), we find
$$\frac{\tau_{12}}{\tau_{34}} = \left(\frac{1 - R/r_1}{1 - R/r_2}\right)^{1/2}. \tag{A.18}$$
both on Earth and in space. When national standards bodies compare their
atomic clocks, in order to establish the consensus International Atomic
Time (TAI), they must take into account the altitude, on Earth, of their
observatories. And when systems such as GPS or Galileo broadcast their
time signal for navigation purposes, they must include both a correction for
the altitude of the spacecraft (orbital radius of 26 600 km, for GPS), which
causes spacecraft time to advance with respect to the Earth at 45.7 μs per
day, and a correction from SR due to the orbital velocity, which causes the
spacecraft to lose 7.2 μs per day, for a total GPS advance of 38.5 μs per day.
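The figures quoted above can be roughly reproduced from the ratio of proper times that follows from Eq. (A.17), using the gravitational radius of the Earth and the GPS orbital radius given in this appendix. The sketch below is an estimate under simplifying assumptions that are mine rather than the text's: a ground clock at the mean radius of the Earth (6371 km), a circular orbit, and no allowance for the rotation of the Earth.

```python
import math

# Values quoted in the text.
R_EARTH_GRAV = 8.86e-3      # gravitational radius of the Earth, R = 2GM/c^2, in m
R_GPS = 2.66e7              # GPS orbital radius, in m

# Assumption (not from the text): mean radius of the Earth for the ground clock.
R_GROUND = 6.371e6          # m
SECONDS_PER_DAY = 86_400.0


def proper_time_ratio_minus_one(R, r1, r2):
    """Return tau_12/tau_34 - 1 for clocks at rest at radii r1 and r2.

    Uses tau_12/tau_34 = sqrt((1 - R/r1) / (1 - R/r2)), which follows from
    Eq. (A.17) with equal coordinate-time intervals. log1p and expm1 keep
    the tiny difference from 1 free of rounding error.
    """
    half_log = 0.5 * (math.log1p(-R / r1) - math.log1p(-R / r2))
    return math.expm1(half_log)


# Gravitational term: the ground clock runs slow relative to the orbiting one,
# so (to first order) the orbiting clock gains by the same fraction.
ground_vs_orbit = proper_time_ratio_minus_one(R_EARTH_GRAV, R_GROUND, R_GPS)
gr_gain_us = -ground_vs_orbit * SECONDS_PER_DAY * 1e6

# SR term for a circular orbit: v^2/c^2 = GM/(r c^2) = R/(2r), and a slowly
# moving clock loses a fraction v^2/(2c^2) = R/(4r) of the elapsed time.
sr_loss_us = R_EARTH_GRAV / (4 * R_GPS) * SECONDS_PER_DAY * 1e6

net_gain_us = gr_gain_us - sr_loss_us
print(f"GR gain {gr_gain_us:.1f}, SR loss {sr_loss_us:.1f}, "
      f"net {net_gain_us:.1f} (microseconds per day)")
```

Working with `log1p` and `expm1` matters here because the quantities differ from 1 by parts in 10¹⁰, which a naive square root of the ratio would resolve only poorly.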
[Figure: geometry of the deflection of light, with labels A, B, 𝑑, 𝛼 and 𝑟 = 2𝐺𝑀.]
the deflection. They found a deflection which matched the predicted one,
which was widely taken as a robust confirmation of the validity of Einstein’s
theory. There is a little more to say about this in Section B.3.
11 To be honest, this is where they come in and kick over the furniture and scare the cat.
174 Appendix A An Overview of General Relativity
[Figure: four panels with 𝑥 and 𝑦 axes, and masses labelled M1, M2, M3 and M4.]
A.4.5 Cosmology
In the last few sections, we have looked at the spacetime around a small mass,
around a large mass, and waves in a spacetime produced by an accelerating
mass. The next step: the universe!
To find a metric for the universe as a whole, we start with the observa-
tion that, on the largest scales, the universe appears to be homogeneous
and isotropic – that is, it is the same at all spatial points, and in all spatial
directions. This is known as the copernican principle. Apart from encour-
aging very proper modesty about our place in the universe, this permits a
considerable mathematical simplification. It is possible to write down, from
purely mathematical considerations, the most general metric which has
these two properties, namely the Friedman–Lemaître–Robertson–Walker
metric.¹² In the most convenient choice of coordinates, it looks like
$$ds^2 = dt^2 - a^2(t)\left[\frac{dr^2}{1 - \kappa r^2} + r^2\,d\Omega^2\right], \tag{A.21}$$
12 Usually referred to as ‘FLRW’, or by whatever subset and ordering of the four independent
discoverers’ names is demanded by your national prejudices.
scaled so that the parameter 𝜅 has one of the values {−1, 0, +1}, and 𝑎(𝑡) is
a dimensionless scaling factor which depends only on the time coordinate.
The scaling factor 𝑎(𝑡) acts, in effect, to change the size of the
universe. It turns out that, when we apply Einstein’s equation, Eq. (A.10),
to this metric and a reasonable energy-momentum tensor, we deduce 𝑎̈ ≠ 0,
and thus a universe which is changing in size.
That Einstein’s equations admit of a non-stationary solution – an expand-
ing universe – was initially thought to be a physically implausible defect
in the theory. To amend this, Einstein observed that the argument leading
up to Eq. (A.10) can be slightly extended to allow an additional term on the
left-hand side, Λ𝗀, including a scaling factor now known as the cosmolog-
ical constant; this modified version of Einstein’s equation does admit of a
stationary solution, albeit an unstable one. The discovery of the Hubble–
Lemaître expansion of the universe meant that the need for this fell away,
but the last couple of decades have seen the term reintroduced, responding
to otherwise-inexplicable observational evidence, and corresponding to a
further exotic source of energy-momentum in the universe, not identifiable
as either matter or radiation.
Moving from maths to physics, the next step is to consider various dispo-
sitions of energy-momentum, and then use Eq. (A.10) to constrain the form
of 𝑎(𝑡), and thus its first and second time derivatives 𝑎̇ and 𝑎̈, with different
solutions depending on the value of 𝜅. At that point, we cross over into the
study of cosmology.
One extreme case of this metric is where 𝑎(𝑡) = 0. The solutions to the
FLRW metric generally include an in-principle initial state where this is so,
and they have a singularity at this point. This is the big bang. Here
the curvature of the metric becomes infinite, which implies an infinite energy
density, which is enough to disrupt some of the mathematical foundations of
GR. Under such conditions, quantum effects become important, obliging us to attempt
to develop a theory of quantum gravity. That is the next challenge.
Appendix B
Relativity’s Contact with Experimental Fact
For, if I had believed that we could ignore these eight minutes, I would have
patched up my hypothesis accordingly. But since it was not permissible to
ignore them, those eight minutes point the road to a complete reformation
of astronomy: they have become the building material for a large part of this
work. . . .
Johannes Kepler, on the discrepancy of 8′ of arc between the predicted and
actual positions of Mars, in Astronomia Nova, II, Cap.19, quoted in
Arthur Koestler, The Sleepwalkers (1959)
they show the extent to which science is a social as well as a logical process:
to the extent there is a ‘scientific method’ which demarcates science from
non-science, it’s a social one of asking certain types of questions which are
answerable in a particular way, with these questions being asked within a
community which aims and is able to arrive at provisional consensuses by a
more-or-less systematic route.
‘What is science?’ is an interesting question, but it is not a scientific
one. However, this question, ‘what does science know?’, ‘how
does science justify its statements?’, and ‘how do scientists choose between
theories?’ are all importantly different questions, however similar they may
sound at first. The last of these questions is the most interesting one, because
it is perhaps the one least often asked. I don’t plan to offer any confident
opinions about these questions – they are too large for that – but I hope to
illustrate that they are distinct questions, and to persuade you that the last
one is one that should be asked more often.
A very conventional starting point, when talking about science, is to ob-
serve (this is Karl Popper’s formulation) that science proceeds by ‘conjecture
and refutation’, meaning that the ‘scientific method’, such as it is, consists
of theories being proposed, and then those theories being either refuted by
experiment, or not refuted yet. Though this badly oversimplifies the practice
of science – there are epistemological, methodological and even sociologi-
cal complications that have to be taken into account – it does capture the
essential asymmetry of scientific logic: experiments can only prove theories
wrong, but never really prove them right, except from exhaustion. So rela-
tivity’s contact with experiment consists of searching for an experimental
failure.
1 This is illustrated in Rindler (2006, §7.3). It is illuminating, but the maths is a little beyond
the level of this text.
2 See FitzGerald (1889) and Lorentz (1895); Lorentz reports that he had first suggested this
idea in 1892, and also notes that Oliver Lodge referred to the idea in 1893.
that the effective light path changed length when the arms were moving or
not moving with respect to the aether, resulting in observable interference
effects. If they had observed this, they would have refuted the statement
that all inertial frames are equivalent – they would have shown that the
aether frame had a special status, and that Maxwell’s equations only work
properly in that frame. Famously, they did not see this, and nor did the
various people who attempted to re-do the experiment in one or other vari-
ant, encouraged at various times by both Lorentz and Einstein. Attempts
were made with instruments of different materials, at altitude, housed in
buildings with thick or thin walls, at different seasons of the year, and using
sunlight or starlight instead of artificial light, in case any of these made a
difference. In no case, as far as I am aware, was there any principled reason
why such a difference would be expected; for all that they were experiment-
ing with light, these observers were exploring in darkness. As late as 1933,
Dayton Miller claimed to have detected an absolute motion with respect
to the aether, which was smaller than expected but still inconsistent with
zero, requiring (he asserted) adjustments to either the aether theory or the
FitzGerald–Lorentz contraction, but still in contradiction to the precisely-
null result required by relativity theory (Miller 1933). Miller’s one-time
research assistant reanalysed the data in 1955, however, and concluded that,
despite all efforts, the measurements were still consistent with a null result
(Shankland et al. 1955), and thus that there was still no refutation.3
I’ve described this history at a little length, partly because it’s a famous and
important experiment, but also because it is illuminating about the way that
the status of a measurement changes within the scientific community. As
usefully discussed in Collins & Pinch (2012, ch. 2), Michelson and Morley’s
null result was initially understood as an anomaly for aether theory; later,
it was re-perceived as consistent with relativity, and a confirmation of it;
finally, Miller’s claimed non-null result changed, and from being seen as an
anomaly in relativity theory, it itself became an experimental anomaly, to
be explained away by Shankland et al. as a systematic observational error.
3 Both Miller’s and Shankland et al.’s papers provide fascinating and detailed accounts of the
history of this measurement. Reading these papers one is repeatedly reminded of the
technical challenges of highly sensitive equipment, challenges which were and are also
faced by the gravitational wave communities, who also experienced frustrations and
disruptions from distant traffic, microseismic events, weather, and gruelling observational
campaigns. Rather desperately in retrospect, Miller ends his paper with a list of other
observations which seem to hint at an absolute Earth motion, the last one of which,
ironically, is Karl G. Jansky’s report of ‘a peculiar hissing sound in short wave radio
reception, which comes from a definite cosmic direction’.
B.1 Special Relativity 181
4 The Sagnac effect is also used for practical navigation in a class of high-precision gyroscopes
called ‘ring-laser gyroscopes’. The Sagnac experiment was devised by Georges Sagnac in
order to demonstrate the presence of the aether; it appeared to, at the time, but turns out to
corroborate relativity on closer examination. The Sagnac effect is discussed in Ashby (2003),
and there is a history in Pascoli (2017).
In the presence of these powerful implicit tests of SR, I don’t feel there’s
really a great deal extra that’s useful to say about the various explicit ones.
There is an extensive annotated bibliography of experimental tests by Roberts
& Schleif (2007) and a summary in Will (2014, §2.1.2): it seems sufficient
to say that SR has been tested with great ingenuity, at length, and to great
precision, and it has not failed any test so far. At the end of Roberts &
Schleif’s bibliography is a section on papers which claim but do not sustain
results inconsistent with SR, which is worth reading as a series of (unwitting)
demonstrations of how hard experimental work is, in this area.
Although the underlying theory is not in doubt, there nonetheless are
significant current efforts to make measurements which look for deviations
from SR’s predictions, using observations which stretch from sub-nuclear
physics to astrophysics. Various exotic physical theories – and there is no
current shortage of these, mostly to do with quantum gravity, theories of
the quantum mechanical structure of spacetime at the microscopic level,
or exotic cosmologies, but also including theories of particle physics at very
high energies – include predictions of some violation of Lorentz invariance.
That is, a proposed theory includes some preferred frame (in the sense
that the aether was a ‘preferred frame’, of absolute rest) or, what is much
the same thing, that equations of motion or measured physical properties
would have different values in different inertial frames. The goal here is not
to ‘prove Einstein wrong’, but instead to discover some way in which the
axioms of SR are inadequate or incomplete, and so to uncover new physics
beyond the Standard Model. Finding a preferred frame would contradict
the first postulate, above, turning it from a fundamental organisational
principle of our universe into a special case or low-energy approximation.
No matter how special the special case, or how precise the approximation,
if any deviation from perfect Lorentz invariance were found, that would
be of major significance, and provoke sudden huge interest in any theory
which predicted it; if no deviation is found, then that usefully rules out those
theories which propose it or depend upon it; for a review, see Mattingly
(2005). All of these experiments are technically challenging. None so far
has found anything.
B.2 General Relativity: Classical and Post-classical Tests 183
These tests’ ‘classic’ status derives in part from Einstein’s description of them
in the final section of his 1916 paper, and in part from the fact that the successful tests
are famous, and between them seem generally regarded as demonstrating
GR to be correct.
It is remarkable how quickly and completely the scientific community
5 A similar process is currently taking place in experimental particle physics, where ‘normal
science’ is the increasingly desperate attempt to find some measurements which are
demonstrably at odds with the Standard Model of particle physics.
6 The joint meeting is briefly summarised in Thomson (1919), which includes a record of the
discussion afterwards. This was accompanied by a much more detailed account (Dyson
et al. 1920), which was ‘read’ to the Royal Society on the same date in 1919. You can find
fuller accounts of the expedition and experimental analysis in Earman & Glymour (1980b),
and a brief but lucid account in Kennefick (2009), which is related to a more detailed
version in Kennefick (2012), which also talks about the ‘myths’ surrounding the observation.
7 It was usual at the time to quote errors in terms of ‘probable error’ rather than standard
deviations: the probable error is half the interquartile range, and for a normal distribution is
a little more than 2/3 of the standard deviation. Here and below I have quoted the errors
rescaled to standard deviations.
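The conversion factor between the two conventions can be checked directly. The short Python sketch below computes half the interquartile range of a standard normal distribution; the helper `to_standard_deviation` is a hypothetical name introduced here for illustration, not something taken from the papers cited.

```python
from statistics import NormalDist

# For a standard normal distribution (sigma = 1) the quartiles sit at the
# 25% and 75% points of the cumulative distribution.
standard_normal = NormalDist(mu=0.0, sigma=1.0)
lower_quartile = standard_normal.inv_cdf(0.25)
upper_quartile = standard_normal.inv_cdf(0.75)

# Probable error = half the interquartile range, in units of sigma.
probable_error = (upper_quartile - lower_quartile) / 2


def to_standard_deviation(error_as_probable_error):
    """Rescale an uncertainty quoted as a probable error to a standard deviation."""
    return error_as_probable_error / probable_error


print(f"probable error = {probable_error:.4f} sigma")
```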
8 The Sobral plates were remeasured at the RGO in 1979 (Harvey 1979) with the result that
the error on the 4-inch observations was tightened, and the result from the (Sobral)
quoted here, but he argued that there were no systematic errors which would
otherwise undermine this result (indeed that the cloud cover at Principe
had accidentally but usefully avoided the overexposures which had been
part of the problem at Sobral). Specifically, he stated that ‘the accuracy
seems sufficient to give a fairly trustworthy confirmation of Einstein’s the-
ory, and to render the half-deflection at least very improbable’, but that this
Principe result ‘has much less weight than [the 4-inch one].’ The logic of
the report, in its conclusion, is that the 4-inch observation is the main result,
inconsistent with the ‘newtonian’ prediction, that this is weakly corroborated
by the Principe data, and that the Sobral astrographic observation was too
unreliable to rule out either possibility.
I have described this measurement in much more detail than you will
usually find in a text of this type – relativity textbooks tend to deal with
the 1919 measurements in a sentence or a short paragraph – but there is
little that is really unusual about it. It was a challenging measurement at
the limit of what was possible; the observers constructed and published an
argument that some of the data was unreliable and should be discarded, and
that some other data, though difficult to analyse, should be retained; they
came to a qualified but still confident conclusion about its consistency with
the ‘einsteinian’ result; and the community promptly and overwhelmingly
accepted this conclusion, and seems to have regarded the matter as largely
settled. It is still regarded as a famously conclusive measurement.
After discussing the history of the measurement, Earman & Glymour
(1980b) drew attention to a number of features of this measurement which
are not uncommon in scientific experimentation, but which were particu-
larly clearly exhibited here; their account has been much discussed since.
(i) The deflection measurements were motivated by a particular theoret-
ical prediction, and so were designed to rule in or rule out three distinct
possibilities, rather than making an unconstrained estimate of the value. In
other words, this was a hypothesis-testing experiment, and not the measure-
ment of a physical parameter.
(ii) Although they were in the service of a particular prediction, there
was no widespread understanding of the details of the theory. Eddington
was one of the few people able to do the calculations confidently, but even
there, it is clear that Eddington’s, Einstein’s and others’ understanding of the
theory, and of how to do calculations, was not as profound as it would later
become. That is, it would have been reasonable to criticise the theoretical
astrographic plates improved to 1″.55 ± 0″.34, both ruling out the ‘newtonian’ prediction and
suggesting that Dyson had been pessimistic in his analysis.
B.3 A Closer Look at the 1919 Eclipse Observations 189
9 I use the word ‘falsify’ because that fits in with the popperian logic of science, but
‘falsification’ is a complicated business, and the more evasive phrasing of ‘inconsistent with’
is probably more descriptive.
10 ‘In the present’ is important, because current controversies, perhaps scientific controversies
which interact with crucial public policy decisions, necessarily happen without the benefit
of any hindsight, even though ‘how do we know?’ and ‘what do we do now?’ may be
approach originated in the work of David Bloor in 1976 (i.e., Bloor (1991),
and see Laudan (1981) and Bloor (1981)), and is well summarised in Collins
& Pinch (2012). It represented, I suppose, one ‘side’ in the so-called ‘Science
Wars’ of the 1990s, an unedifying episode which has a useful synthesis in
Labinger & Collins (2001).
It’s also worth pointing out that Earman and Glymour were not the first
to talk of this case, nor the first to raise the suggestion (to the extent that they
did) that there was something amiss in Dyson and Eddington’s measurement.
Indeed, it seems to have become something of an urban myth in gravitation
physics: it was mentioned in a rather garbled aside by Everitt et al. (1979),
in an article proposing an alternative test of GR, and it reached possibly its
most extreme published form in Hawking’s A Brief History of Time (1988),
where he suggests, without pointing to a source, that ‘the errors were as great
as the effect they were trying to measure.’ There are at least two interesting
things about these two (historically inaccurate) accounts: first, neither
author thinks it important, from a scholarly point of view, to give a precise
account of the history; secondly, neither author seems scandalised by the
incident, and indeed Hawking goes on to remark that this was ‘a case of
knowing the result they wanted to get, not an uncommon occurrence in
science’. When I have heard this story retold, in one variant or another, or
retold it myself, it has never been received with either surprise or outrage,
but at most wry amusement or a facetious tut-tut-tut.
From one point of view, this doesn’t matter. Science tends to be cheerfully
ahistorical in practice, and most scientists would probably agree that, as a
question of how we know what we know, it doesn’t much matter just why
the community arrived at one conclusion or the other, as long as whatever
conclusion we have on the matter, now, can reasonably be agreed to be
correct (this touches on the distinctions between the various questions on
p. 178). Science gains its justifications from current consistency between
theory and experiment, and when we hear ‘Newton said. . . ’ or ‘Galileo
observed. . . ’ it’s generally a pedagogical point being made, rather than a
historical one, or any attempt to persuade. The history of science is much
more intricate (and indeed more interesting) than you would generally
learn in a science lecture, but that history is not part of the professed logic
scientific questions which demand immediate answers. I am writing this during the
COVID-19 pandemic, when demands that politicians ‘follow the science!’, though
reasonable, were met with some unease by the scientists in question, and when the contrasts
between scientific, medical, and political theory-choice – ‘how do we know?’, ‘what are the
chances?’, and ‘what do we do now?’ – were discussed in newspaper leading articles rather
than journals of the sociology of science.
of science.
The reason I am describing this at such length is that I think that the
episode illustrates some important features of the relationship between the-
ory and experiment, in actual scientific practice. In the swift acceptance of
the 1919 result, we (broadly speaking) see the point at which a theory moves
from prediction to (broadly speaking) accepted fact, and in the Fraunhofer
redshift observations we see the effect this has on a related field, turning
relativistic effects into a background detail for solar spectroscopic measure-
ments. The fact that Dyson and Eddington are so scrupulously exhaustive
in describing their data analysis illustrates the continuing force of the Royal
Society’s motto ‘Nullius in verba’, conventionally glossed as ‘take nobody’s
word for it’ – the idea that scientific results should be described in a way
that lets the reader repeat the observation, or re-do the analysis, themself,
and which has logical, methodological and even ethical dimensions; the
fact that no reader seems to have thought this re-observation worth doing
illustrates the continuing force of scientists’ communal trust in each other’s
craft skills, and in their professional honesty, in reporting and analysing
data as fairly as possible.
So if, as I assert, this incident is more intricate than most people would
expect, but not more intricate than usual practice, we must ask why it has
become so famous, and so frequently and combatively discussed. Why, for
that matter, is this usually described as Eddington’s observation, though it
seems to have been a joint effort, with Dyson, in every important respect?
One answer is that it is a good story, in the practical sense that it incorporates
the multiple features I mentioned above, in a way which illustrates the
strands of motivation and persuasion which are present in actual scientific
practice. Another answer is that it incorporates human involvement with
the process – the result was personally important to Eddington, for more
than merely professional reasons. I think it is those human aspects of the
story – aspects which are essential to the story – which make some people
uncomfortable, as if they suggest that scientific conclusions are arbitrary, or
personal, or are somehow mere consensus (in a deprecatory sense, which is
distinct from the pragmatic sense in which I have used the word above), and
as if they undermine the thoughtful confidence we should have in scientific
conclusions here and elsewhere. But that confidence is emphatically not
undermined by the observation that scientists are people, that the logic of
science is not only the simplistic one described in popular science books (or
press releases), and that uncertainty and doubt are integral to the process.
So after ‘a closer look’ at the 1919 observations, what conclusions can
we come to? The ‘closer look’ appears to have simply made things more
C.1 Complex Numbers
The above points give us enough information to describe the full range
of arithmetic operations.
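Addition: Given two complex numbers $z_1 = a + \mathrm{i}b$ and
$z_2 = c + \mathrm{i}d$, the sum of the two numbers is obtained by adding
the real and imaginary parts separately:
\[ z_1 + z_2 = (a + c) + \mathrm{i}(b + d). \]
Multiplication: The product is obtained by multiplying out the brackets,
using $\mathrm{i}^2 = -1$, and collecting and simplifying as much as possible:
\[ z_1 z_2 = (a + \mathrm{i}b)(c + \mathrm{i}d) \]
\[ \phantom{z_1 z_2} = ac + \mathrm{i}bc + \mathrm{i}ad + \mathrm{i}^2 bd \]
\[ \phantom{z_1 z_2} = (ac - bd) + \mathrm{i}(bc + ad). \]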
Either using this, or by inspection, you can see that 1∕i = −i.
The complex conjugate: There is a new arithmetic operation we can
perform on complex numbers, the complex conjugate: this is the operation of
negating the imaginary part of the number. Thus, given a complex number
𝑧 = 𝑎 + i𝑏, the conjugate, 𝑧∗ , is
𝑧∗ = 𝑎 − i𝑏.
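The arithmetic rules above are easy to check numerically. The following is a minimal sketch using Python's built-in complex type, which writes i as `1j`; the values chosen for 𝑎, 𝑏, 𝑐 and 𝑑 are arbitrary illustrations, not taken from the text.

```python
# Check the product rule, the identity 1/i = -i, and the conjugate,
# using Python's built-in complex type (which writes i as 1j).
a, b, c, d = 2.0, 3.0, -1.0, 4.0   # arbitrary illustrative values
z1 = complex(a, b)
z2 = complex(c, d)

# Product as given by the rule (ac - bd) + i(bc + ad)
product_by_rule = complex(a * c - b * d, b * c + a * d)
print(z1 * z2, product_by_rule)    # the two should agree

# Reciprocal of i
print(1 / 1j)

# Conjugation negates the imaginary part and leaves the real part alone
print(z1, z1.conjugate())
```

The same check works for any other choice of the four real numbers, since the built-in multiplication implements exactly the collected form of the product.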
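\[ \sinh z = \frac{1}{2}\left(\mathrm{e}^{z} - \mathrm{e}^{-z}\right) \tag{C.1a} \]
\[ \cosh z = \frac{1}{2}\left(\mathrm{e}^{z} + \mathrm{e}^{-z}\right), \tag{C.1b} \]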
and illustrated in Figure C.1. There are corresponding definitions
\[ \tanh z = \frac{\sinh z}{\cosh z} \]
\[ \operatorname{sech} z = \frac{1}{\cosh z}, \]
[Figure C.1: plots of the hyperbolic functions cosh and sinh.]
\[ \cosh^2 z - \sinh^2 z = 1 \]
\[ 1 - \tanh^2 z = \operatorname{sech}^2 z \]
\[ \sinh(u \pm v) = \sinh u \cosh v \pm \cosh u \sinh v \]
\[ \cosh(u \pm v) = \cosh u \cosh v \pm \sinh u \sinh v \]
\[ \tanh(u \pm v) = \frac{\tanh u \pm \tanh v}{1 \pm \tanh u \tanh v} \]
See your favourite source of mathematical tables for the rest, and for expres-
sions for the derivatives, and so on. You can see that these are quite similar
to the corresponding trigonometric identities.
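If you would rather convince yourself of these identities numerically than look them up, a sketch along the following lines will do. It uses only Python's standard `math` module; the function name `check_identities` and the tolerance are my own choices for illustration.

```python
import math

def check_identities(u, v, tol=1e-12):
    """Return True if the hyperbolic identities hold numerically at (u, v)."""
    def sech(x):
        return 1.0 / math.cosh(x)

    checks = [
        (math.cosh(u) ** 2 - math.sinh(u) ** 2, 1.0),
        (1.0 - math.tanh(u) ** 2, sech(u) ** 2),
    ]
    # The addition formulae, for both the upper (+) and lower (-) signs
    for s in (+1, -1):
        checks += [
            (math.sinh(u + s * v),
             math.sinh(u) * math.cosh(v) + s * math.cosh(u) * math.sinh(v)),
            (math.cosh(u + s * v),
             math.cosh(u) * math.cosh(v) + s * math.sinh(u) * math.sinh(v)),
            (math.tanh(u + s * v),
             (math.tanh(u) + s * math.tanh(v))
             / (1.0 + s * math.tanh(u) * math.tanh(v))),
        ]
    return all(math.isclose(lhs, rhs, rel_tol=tol, abs_tol=tol)
               for lhs, rhs in checks)

print(check_identities(0.7, 0.3))
```

A numerical check at a handful of points is of course not a proof, but it is a quick way of catching a mis-remembered sign.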
We can find other expressions for these functions by looking at the series
expansion of the exponential function:
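\[ \mathrm{e}^{z} = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \frac{z^4}{4!} + \cdots. \tag{C.2} \]
Substituting this into Eqs. (C.1a) and (C.1b), we find
\[ \sinh z = z + \frac{z^3}{3!} + \frac{z^5}{5!} + \cdots \]
\[ \cosh z = 1 + \frac{z^2}{2!} + \frac{z^4}{4!} + \cdots. \]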
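The series for sinh and cosh can be compared directly with the library functions. The sketch below sums the first few terms of each series; the function names and the default number of terms are assumptions made for illustration, and the comparison is only meaningful for moderate values of the argument, where a handful of terms is enough.

```python
import math

def sinh_series(z, n_terms=10):
    """Partial sum z + z^3/3! + z^5/5! + ... with n_terms terms."""
    return sum(z ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(n_terms))

def cosh_series(z, n_terms=10):
    """Partial sum 1 + z^2/2! + z^4/4! + ... with n_terms terms."""
    return sum(z ** (2 * k) / math.factorial(2 * k)
               for k in range(n_terms))

z = 1.5
print(sinh_series(z), math.sinh(z))
print(cosh_series(z), math.cosh(z))
```

Note that sinh picks up only the odd powers of the exponential series and cosh only the even ones, which is why their sum recovers the full series for the exponential.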
We may also now ask what is the exponential of a pure imaginary number, i𝑧?
This seems an odd thing to do, if you think of e𝑧 only as the number e raised
to a given power; but if you think of Eq. (C.2) as defining the exponential
$a^2 = a_x^2 + a_y^2 + a_z^2$ (we can also see this from Pythagoras’s theorem,
or indeed from Eq. (6.1)), and we know that this is an invariant of a rotation –
that is, that it takes the same value irrespective of the coordinate system. It
is easy to see that $|\mathbf{a} + \mathbf{b}|^2 = a^2 + b^2 + 2\mathbf{a}\cdot\mathbf{b}$
and so, since $a^2$, $b^2$ and $|\mathbf{a} + \mathbf{b}|^2$ are all squared lengths,
and therefore frame-invariant, the scalar product $\mathbf{a}\cdot\mathbf{b}$ must be
frame-invariant also, even though the individual coordinates $a_i$ and $b_i$ are not.
Finally, if the scalar product of two vectors vanishes, $\mathbf{a}\cdot\mathbf{b} = 0$,
we say that the two vectors are orthogonal. If a euclidean vector is orthogonal to
itself ($\mathbf{a}\cdot\mathbf{a} = 0$) then we can deduce that $a_i = 0$. In linear
algebra, the scalar product is more
generally termed the inner product, and the length of a vector is termed its
norm.
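To see this invariance at work, we can write down the components of two vectors in one set of axes, find their components in a second set of axes rotated with respect to the first, and compare the scalar products. The following sketch does this for a rotation about the 𝑧-axis; the function names, the rotation angle and the vectors are arbitrary choices for illustration.

```python
import math

def rotate_z(vec, angle):
    """Components of the same vector in axes rotated by `angle` about z."""
    x, y, z = vec
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * y, -s * x + c * y, z)

def dot(a, b):
    """Euclidean scalar product of two 3-vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

a = (1.0, 2.0, 3.0)
b = (-2.0, 0.5, 1.0)
a_rot, b_rot = rotate_z(a, 0.4), rotate_z(b, 0.4)

print(a, a_rot)                       # the components change...
print(dot(a, b), dot(a_rot, b_rot))   # ...but the scalar product does not
print(dot(a, a), dot(a_rot, a_rot))   # nor does the squared length
```

The individual components are different in the two coordinate systems, but the scalar product and the squared length agree to within rounding error.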
In Minkowski space, in contrast, the scalar product of Section 6.2 can
be negative (and a mathematician would therefore insist that it cannot,
strictly, be called an ‘inner product’), which means that, if the scalar product
of a Minkowski vector with itself is zero, 𝐀 ⋅ 𝐀 = 0, we cannot deduce
that 𝐀 = 0. Also, in this context, we can continue to talk of the magnitude
of a vector, 𝐀 ⋅ 𝐀, but can no longer (and indeed need no longer) talk of the
norm as the square root of this.
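A short numerical illustration may make the contrast with the euclidean case concrete. The sketch below assumes a scalar product with the time component entering positively and the space components negatively, in units where 𝑐 = 1; that sign convention is an assumption here, and the opposite convention would flip the signs of the results without changing which of them is zero.

```python
def minkowski_dot(A, B):
    """Minkowski scalar product, assuming signature (+, -, -, -) and c = 1."""
    return A[0] * B[0] - A[1] * B[1] - A[2] * B[2] - A[3] * B[3]

# Components are (t, x, y, z); the three vectors are illustrative choices
timelike = (1.0, 0.0, 0.0, 0.0)
spacelike = (0.0, 1.0, 0.0, 0.0)
null = (1.0, 1.0, 0.0, 0.0)

print(minkowski_dot(timelike, timelike))    # positive
print(minkowski_dot(spacelike, spacelike))  # negative
print(minkowski_dot(null, null))            # zero, although the vector is not
```

The third vector has non-zero components, but its scalar product with itself vanishes, which is exactly the situation that cannot arise in euclidean space.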
Appendix D
How to Do Calculations: a Recipe
1. they are aligned so that the (𝑥, 𝑦, 𝑧) and (𝑥′, 𝑦′, 𝑧′) axes are parallel;
2. the frame 𝑆′ is moving along the 𝑥-axis with velocity 𝑣;
3. we set the zero of the time coordinates so that the origins coincide at 𝑡 =
𝑡′ = 0 (which means that the origin of the 𝑆′ frame is always at position
𝑥 = 𝑣𝑡).
All three conditions must be satisfied in order for the Lorentz transformation
equations to be valid.
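As a sketch of what the transformation looks like in practice, the following function applies it to the (𝑡, 𝑥) coordinates of a single event, for frames in standard configuration. It assumes units in which 𝑐 = 1, so that 𝑣 is a fraction of the speed of light; the function name is my own, and the 𝑦 and 𝑧 coordinates are omitted because they are unchanged in this configuration.

```python
import math

def lorentz_transform(t, x, v):
    """Coordinates (t', x') in S' of the event with coordinates (t, x) in S.

    Assumes standard configuration and units with c = 1, so |v| < 1.
    """
    if not -1.0 < v < 1.0:
        raise ValueError("speed must satisfy |v| < 1 in units where c = 1")
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)

v = 0.6

# Condition 3: the origins coincide at t = t' = 0
print(lorentz_transform(0.0, 0.0, v))

# The origin of S' is at x = v t in S, so it should have x' = 0 at all times
for t in (1.0, 5.0, 10.0):
    print(t, lorentz_transform(t, v * t, v))
```

Checking that the point 𝑥 = 𝑣𝑡 is always mapped to 𝑥′ = 0 is a useful sanity test that the sign of 𝑣 has been chosen consistently with the configuration described above.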
The values 𝑥, 𝑡, and so on, are coordinates of events in the unprimed
frame, 𝑆 (they’re not displacements, or lengths, or the intervals between
events); the values 𝑥′, 𝑡′, and so on, are coordinates of the same events, as
measured in 𝑆′.
Remember: an event is not ‘in’ one frame or another. An event is a frame-
independent thing which is ‘in’ every frame. The difference between frames
is that an event will have different coordinates in different frames.
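One way to make this concrete is to take a pair of events, work out their coordinates in several frames, and compare. In the sketch below the coordinates of each event change from frame to frame, while the squared interval between the two events does not. The events and speeds are arbitrary illustrations, units with 𝑐 = 1 are assumed, and the squared interval is taken with the sign convention Δ𝑡² − Δ𝑥², which is also an assumption here.

```python
import math

def boost(event, v):
    """(t, x) in S -> (t', x') in S', standard configuration, c = 1 assumed."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (t - v * x), gamma * (x - v * t))

def interval_squared(e1, e2):
    """Squared interval between two events, convention dt^2 - dx^2 (assumed)."""
    dt, dx = e2[0] - e1[0], e2[1] - e1[1]
    return dt * dt - dx * dx

# Two events, specified by their (t, x) coordinates in frame S
e1, e2 = (1.0, 0.5), (4.0, 2.0)

for v in (0.0, 0.3, 0.8):
    f1, f2 = boost(e1, v), boost(e2, v)
    print(v, f1, f2, interval_squared(f1, f2))
```

The two events are the same two events throughout; only the numbers used to label them differ between frames.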
D.2 A Checklist for Relativity Problems
certainly need to complete all of the ‘identify’ steps before it makes sense to
do either of the ‘write down’ steps.
The first steps are the hard ones. After about half-way, you’re just turning
the handle.
The big wrinkle is that you might not need to use the Lorentz transfor-
mation, since there are a few other bits of dynamics you’re supposed to
know.
Adelberger, E., Gundlach, J., Heckel, B. et al. (2009), ‘Torsion balance experi-
ments: A low-energy frontier of particle physics’, Progress in Particle and
Nuclear Physics 62(1), 102–134. https://doi.org/10.1016/j.ppnp.2008.08.
002.
Adlam, E. (2011), Poincaré and special relativity. Preprint, https://arxiv.org/
abs/1112.3175.
Ashby, N. (2003), ‘Relativity in the global positioning system’, Living Reviews
in Relativity 6(1), 1. https://doi.org/10.12942/lrr-2003-1.
Bailey, J., Borer, K., Combley, F. et al. (1977), ‘Measurements of relativistic
time dilatation for positive and negative muons in a circular orbit’,
Nature 268(5618), 301–305. https://doi.org/10.1038/268301a0.
Barton, G. (1999), Introduction to the Relativity Principle, John Wiley and
Sons. ISBN 9780471998969.
Bell, J. S. (1976), ‘How to teach special relativity’, Progress in Scientific Culture
1(2), 135–148. Reprinted with minor changes in Bell (2004, ch. 9).
Bell, J. S. (2004), Speakable and Unspeakable in Quantum Mechanics: Col-
lected Papers on Quantum Philosophy, 2nd ed., Cambridge University
Press. ISBN 9780521523387.
BIPM (2019), ‘Le système international d’unités / the international sys-
tem of units (‘the SI brochure’)’, Online. https://www.bipm.org/en/
publications/si-brochure.
Bloor, D. (1981), ‘II.2 The strengths of the strong programme’, Philos-
ophy of the Social Sciences 11(2), 199–213. https://doi.org/10.1177/
004839318101100206.
Bloor, D. (1991), Knowledge and Social Imagery, 2nd ed., University of
Chicago Press. First edition 1976, ISBN 9780226060972.
Collins, H. M. & Pinch, T. (2012), The Golem: What Everyone Should
205
206 References
doi.org/10.1002/andp.19163540702.
Einstein, A. (1920), Relativity: The Special and the General Theory, Methuen.
Originally published in book form in German, in 1917; first published
in English in 1920, in an authorised translation by Robert W. Lawson;
available in multiple editions and formats.
Einstein, A. (1936a), ‘Physics and reality’, Journal of the Franklin Institute
221(3), 349–382. Translation, by Jean Piccard, of Einstein (1936b), https:
//doi.org/10.1016/S0016-0032(36)91047-5.
Einstein, A. (1936b), ‘Physik und Realität’, Journal of the Franklin Insti-
tute 221(3), 313–347. Translated in Einstein (1936a), https://doi.org/10.
1016/S0016-0032(36)91045-1.
Einstein, A. (1991), Autobiographical Notes, Open Court. First published in
a separate edition 1979; various printings, ISBN 9780812691795.
Everitt, C., Lipa, J. & Siddall, G. (1979), ‘Precision engineering and Einstein:
The relativity gyroscope experiment’, Precision Engineering 1(1), 5–11.
https://doi.org/10.1016/0141-6359(79)90070-9.
FitzGerald, G. F. (1889), ‘The ether and the Earth’s atmosphere’, Science
13(328), 390. https://doi.org/10.1126/science.ns-13.328.390.
French, A. P. (1968), Special Relativity, CRC Press. ISBN 9780748764228.
Galileo Galilei (1632), Dialogue Concerning the Two Chief World Systems,
Batista Landini. Original title (in effect): Dialogo sopra i due massimi
sistemi del mondo.
Gourgoulhon, E. (2013), Special Relativity in General Frames: From Particles
to Astrophysics, Springer-Verlag. ISBN 9783662520833.
Gray, N. (2019), A Student’s Guide to General Relativity, Cambridge Univer-
sity Press. ISBN 9781107183469.
Hamill, P. (2013), A Student’s Guide to Lagrangians and Hamiltonians, Cam-
bridge University Press. ISBN 9781107617520.
Harvey, G. M. (1979), ‘Gravitational deflection of light’, Observatory
99, 195–198. https://ui.adsabs.harvard.edu/abs/1979Obs....99..195H.
Hawking, S. W. (1988), A Brief History of Time, Bantam.
Heaviside, O. (1889), ‘On the electromagnetic effects due to the motion of
electrification through a dielectric’, Philosophical Magazine (fifth series)
27(167), 324–339. https://doi.org/10.1080/14786448908628362.
Hentschel, K. (1996), ‘Measurements of gravitational redshift between 1959
and 1971’, Annals of Science 53(3), 269–295. https://doi.org/10.1080/
00033799600200211.
Jefimenko, O. D. (1994), ‘Direct calculation of the electric and magnetic fields
of an electric point charge moving with constant velocity’, American
Journal of Physics 62(1), 79–85. https://doi.org/10.1119/1.17716.